Skip to content
This repository has been archived by the owner on Aug 13, 2019. It is now read-only.

clean up after running repair tests #372

Merged

Conversation

cstyan
Copy link
Contributor

@cstyan cstyan commented Sep 4, 2018

I noticed that some files are changed and others are created when running tsdb tests
Signed-off-by: Callum Styan callumstyan@gmail.com

@krasi-georgiev
Copy link
Contributor

Yeah I noticed that too, but I think the problem is with the test itself. It should revert the state after completing. Otherwise it doesn't allow running it twice and confuses git as well

I also prefer to use .git/info/exclude for such edge use-cases and other user specific workflows.

@cstyan
Copy link
Contributor Author

cstyan commented Sep 14, 2018

@krasi-georgiev updated so that we're cleaning up after the repair tests, reverted the gitignore changes. It's a bit ugly atm but I need to sleep. I'll clean it up it in the morning.

@krasi-georgiev
Copy link
Contributor

thanks , ping me when ready and I will review again.

@cstyan cstyan changed the title ignore testdata dir since the contents change when running tests clean up after running repair tests Sep 14, 2018
@cstyan
Copy link
Contributor Author

cstyan commented Sep 14, 2018

@krasi-georgiev ping :)

@krasi-georgiev
Copy link
Contributor

krasi-georgiev commented Sep 17, 2018

I just had a look at the test I did for another PR and here is a lot easyer way to achieve the same:

db, err := Open("testdata/repair_index_version/", nil, nil, DefaultOptions)
testutil.Ok(t, err)
defer db.Close()

// Create and work on the db copy so we can run the test many times.
tmpdir := "testdata/repair_index_chunks/snapshot"
defer os.RemoveAll(tmpdir)
testutil.Ok(t, db.Snapshot(tmpdir, false))
dbCopy, err := Open(tmpdir, nil, nil, DefaultOptions)
testutil.Ok(t, err)

@cstyan
Copy link
Contributor Author

cstyan commented Sep 17, 2018

@krasi-georgiev yeah I tried that initially, we need tombstones in order to create a snapshot. I don't we want to deal with generating tombstones for this test.

@krasi-georgiev
Copy link
Contributor

even if you need to put an empty tombstones file it would be a much cleaner solution without requiring so many lines of additional code.

@cstyan
Copy link
Contributor Author

cstyan commented Sep 17, 2018

if all we needed was an empty tombstones file I would have just done that :) let me have another look

@cstyan
Copy link
Contributor Author

cstyan commented Sep 17, 2018

@krasi-georgiev we can't use snapshot here anyways, it's the call to db, err := Open("testdata/repair_index_version/", nil, nil, DefaultOptions) that actually modifies the existing files.

@krasi-georgiev
Copy link
Contributor

I made it work using:

bl := Block{
   meta: BlockMeta{
	ULID: ulid.ULID{},
   },
   dir: "testdata/repair_index_version/01BZJ9WJQPWHGNC2W4J9TA62KC",
}
copy(bl.meta.ULID[:], bl.dir)

tmpdir := "testdata/repair_index_chunks/snapshot"
testutil.Ok(t, bl.Snapshot(tmpdir))

I think we would need to do this quite often form now on so why don't we make some more generic func which takes filepaths as a param and recursively makes hard links using os.Link and put it in:
fileutil/fileutil.go

Something like:

func CopyFiles( files ...string, destDir string){
  //  iterate and os.Link all files
}

than we can have: CopyFile(ReadDir(dirpath),destDir)

if you don't have time for the extra effort let me know and I can take over.

@cstyan
Copy link
Contributor Author

cstyan commented Sep 18, 2018

Thanks @krasi-georgiev, I got it working via the block snapshotting.

Unfortunately I don't understand this generic copy files helper, how do hardlinks help us here?

@krasi-georgiev
Copy link
Contributor

look at the snapshot implementation it uses hard links instead of copying files.

@krasi-georgiev
Copy link
Contributor

hardlinks is essentially like copying , but without even opening the files.

@cstyan
Copy link
Contributor Author

cstyan commented Sep 18, 2018

Right, but it's linked to the same underlying data on disk. So modifying the file at the hardlink name modifies the file at the original name, because they're essentially the same file.

Looking at the snapshot code, I don't understand how using the hardlinks gets us a copy of the DB files that we can modify without making modifications to the originals.

@cstyan
Copy link
Contributor Author

cstyan commented Sep 18, 2018

Okay I see what's happening, we wanted to not modify the index and meta files so that the test could be rerun without modifying the existing db files. Snapshot creates hardlinks, but open calls functions to repair the index file and fix the meta file in temporary files, and then calls os.Rename to move the fixed files into place.

So if you want we can add a help function to do the hardlinks, but I think the name CopyFiles is misleading if we're calling os.Link

@krasi-georgiev
Copy link
Contributor

CloneFiles is another name that comes to mind.

@krasi-georgiev
Copy link
Contributor

krasi-georgiev commented Sep 18, 2018

ooh yeah I misunderstood the hard-links a bit.
Lets use the original idea with os.Open/io.Copy, but just make the func something like: CopyFile(ReadDir(dirpath),destDir) as I have another test that would use it right away.

@cstyan
Copy link
Contributor Author

cstyan commented Sep 19, 2018

@krasi-georgiev okay, done. If it's 👍 I will squash my commits.

Copy link
Contributor

@krasi-georgiev krasi-georgiev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what was the blocker to use something like

CopyFiles(fileutil.ReadDir(srcDir),destDir)

func CopyFiles( files ...string, destDir string)

repair_test.go Outdated

"github.com/prometheus/tsdb/index"
"github.com/prometheus/tsdb/labels"
)

func copyFiles(tmpIndex, tmpMeta *os.File, dir string) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't need this anymore.

@cstyan
Copy link
Contributor Author

cstyan commented Sep 19, 2018

fileutil.ReadDir just returns the names as strings, we'd need the path to the files. Also we need to handle the case where some contents of srcDir are directories even if we don't also want to recursively copy from sub-directories.

If func CopyFiles( files ...string, destDir string) works better for your use case I can make that change, but I don't think passing fileutil.ReadDir(srcDir) is useful unless srcdir happens to also be os.Getwd().

@krasi-georgiev
Copy link
Contributor

// Reads files in the src directory and all sub directories recursively. 
func ReadDirs(dir) map[path]name, error{
  recursively walk all folders including the root dir and build the  `map[path]name` using the `func ReadDir(...)`
}

CopyFiles(ReadDirs(srcDir),dstDir)

something like this should do the trick.

Than I will try if it will work for my usecase as well.

@cstyan
Copy link
Contributor Author

cstyan commented Sep 19, 2018

@krasi-georgiev imo it's easier if ReadDirs returns []string, but how do you want it to handle empty directories? For example for this repair index test, we need the chunks dir to exist in the copy of the tsdb, but we just create an empty db in the real dir before copying it. I can just add another mkdir in the copy, or make ReadDirs and CopyFiles handle empty dirs as well.

@krasi-georgiev
Copy link
Contributor

I would say lets create dirs regardless of whether they will include any files.
This solves our use case so it is a good start. We can always update later if needed.

@krasi-georgiev
Copy link
Contributor

for []string vs map[path]name just do whichever would require less lines of code.

}

// ReadDirs returns a slice of the full path to all files, including empty directories, in src dir and it's sub directories.
func ReadDirs(src string) ([]string, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be able to simplify this using filepath.Walk and than reuse fileutil.ReadDir()

https://gist.github.com/francoishill/a5aca2a7bd598ef5b563

return err
}
return nil
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not something more simple like:

 data, err := ioutil.ReadFile(src)
 if err != nil {
     return err
  }

 err = ioutil.WriteFile(dst, data, 0644)
 if err != nil {
      return err
 }

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we seem to use os.Rename() everywhere else so I stuck with that

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see a problem here without rename.

@cstyan
Copy link
Contributor Author

cstyan commented Sep 19, 2018

see the latest commit, we also needed to grab extra portions of the path for files we're creating copies of so that the end up in the right destination location

return err
}
return nil
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see a problem here without rename.

return chunks[1]
}

func copyFile(dest, src string) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Usually the order is src, dst

@krasi-georgiev
Copy link
Contributor

see the latest commit, we also needed to grab extra portions of the path for files we're creating copies of so that the end up in the right destination location

this looks hacky, can you try to simplify.

@cstyan
Copy link
Contributor Author

cstyan commented Sep 20, 2018

yes

@cstyan
Copy link
Contributor Author

cstyan commented Sep 20, 2018

@krasi-georgiev simplified a bit. We still need a way to remove a prefix from the files we find when using filepath.Walk, but we can use strings.TrimPrefix rather than the function I had written.

@krasi-georgiev
Copy link
Contributor

can you also try the simplified version I suggested for copyFile.

@cstyan
Copy link
Contributor Author

cstyan commented Sep 20, 2018

@krasi-georgiev yep, just slipped my mind earlier.

Want me to squash commits now?

@krasi-georgiev
Copy link
Contributor

krasi-georgiev commented Sep 20, 2018

no need I will squash when I merge it.

moved few bits around and updated few comment.

// CopyDirs copies all directories, subdirectories and files recursively including the empty folders.
// Source and destination must be full paths.
func CopyDirs(src, dest string) error {
	if err := os.MkdirAll(dest, 0777); err != nil {
		return err
	}
	files, err := readDirs(src)
	if err != nil {
		return err
	}

	for _, f := range files {
		dp := filepath.Join(dest, f)
		sp := filepath.Join(src, f)

		stat, err := os.Stat(sp)
		if err != nil {
			return err
		}

		// Empty directories are also created.
		if stat.IsDir() {
			if err := os.MkdirAll(dest, 0777); err != nil {
				return err
			}
			continue
		}

		if err := copyFile(sp, dp); err != nil {
			return err
		}
	}
	return nil
}

func copyFile(src, dest string) error {
	data, err := ioutil.ReadFile(src)
	if err != nil {
		return err
	}

	err = ioutil.WriteFile(dest, data, 0644)
	if err != nil {
		return err
	}
	return nil
}

// readDirs reads the source directory recursively and
// returns relative paths to all files and empty directories.
func readDirs(src string) ([]string, error) {
	var files []string
	var err error

	err = filepath.Walk(src, func(path string, f os.FileInfo, err error) error {
		relativePath := strings.TrimPrefix(path, src)
		if len(relativePath) > 0 {
			files = append(files, relativePath)
		}
		return nil
	})
	if err != nil {
		return nil, err
	}
	return files, nil
}

@cstyan
Copy link
Contributor Author

cstyan commented Sep 20, 2018

👍 changes lgtm, pushed them

@krasi-georgiev
Copy link
Contributor

resolve the conflict and LGTM

Signed-off-by: Callum Stytan <callumstyan@gmail.com>
Signed-off-by: Callum Stytan <callumstyan@gmail.com>
Signed-off-by: Callum Stytan <callumstyan@gmail.com>
directories

Signed-off-by: Callum Stytan <callumstyan@gmail.com>
subdirectories properly

Signed-off-by: Callum Stytan <callumstyan@gmail.com>
Signed-off-by: Callum Stytan <callumstyan@gmail.com>
since we don't care about this being atomic

Signed-off-by: Callum Stytan <callumstyan@gmail.com>
Signed-off-by: Callum Stytan <callumstyan@gmail.com>
@cstyan
Copy link
Contributor Author

cstyan commented Sep 21, 2018

done :) thanks for all the reviews @krasi-georgiev

@krasi-georgiev krasi-georgiev merged commit a971f52 into prometheus-junkyard:master Sep 21, 2018
@krasi-georgiev
Copy link
Contributor

Thanks for spending the time on this.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants