New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
TOOLS-2667: Support list of files for put and get subcommands in mongofiles #283
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good job! I have some questions mainly about the tests and the $in
mongofiles/mongofiles_test.go
Outdated
newFile := util.ToUniversalPath("testdata/" + fmt.Sprintf(copyFormat, i)) | ||
|
||
// Makes new copies of lorem ipsum file | ||
err = func(src, dst string) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't need to create the files on the fly, it's easier for the users if those files were pre-generated. And we might want to test with files with different content.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you suggest that I create/download some new lorem ipsum files for this purpose?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea, you can just generate these files locally and push them with this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I can generate two more lorem ipsum files of the same format and use stat(2)
to keep track of the number of bytes.
Convey("and files should exist in gridfs", func() { | ||
bytesGotten, err := getFilesAndBytesListFromGridFS() | ||
So(err, ShouldBeNil) | ||
So(len(bytesGotten), ShouldEqual, len(testFiles)+1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be checked as well for total bytes gotten
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand what you mean, since I check for the number of bytes gotten in lines 556-557. Could you be more specific about what exactly I'm missing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Check number of files in bytesGotten
-- GridFS should contain the test files and only the test files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just thought a way to improve the current test, for get
test, instead of trying to output all the files written to db, maybe you can write 4 files and have 3 filenames in the list?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good effort! There is still some more refactoring to be done, though!
mongofiles/mongofiles.go
Outdated
// (see TOOLS-2667). Otherwise, if mongofiles --put_id is called, then | ||
// preserve existing behavior. | ||
case Put: | ||
for _, file := range mf.FileNameList { |
This comment was marked as outdated.
This comment was marked as outdated.
Sorry, something went wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This may have been confusing because there was a mistake in the ticket title, but these aren't options --put
and --get
, they're commands put
and get
. So it's mongofiles get <filename>
not mongofiles --get <filename>
.
mongofiles/mongofiles.go
Outdated
@@ -216,40 +234,52 @@ func (mf *MongoFiles) findGFSFiles(query bson.M) (files []*gfsFile, err error) { | |||
} | |||
|
|||
// Gets the GridFS file the options specify. Use this for the get family of commands. | |||
func (mf *MongoFiles) getTargetGFSFile() (*gfsFile, error) { | |||
func (mf *MongoFiles) getTargetGFSFile() ([]*gfsFile, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is returning more than one file, you need to rename it to getTargetGFSFiles()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider it done.
…er needed" Restore existing test for mongofiles --get ... --local ...
Thanks for the clarification on this! I will edit the comments I wrote with |
…(*MongoFiles).FileName
mongofiles/mongofiles_test.go
Outdated
Convey("Testing the 'get' command with a file that is in GridFS should", func() { | ||
mf, err := simpleMongoFilesInstanceWithFilename("get", "testfile1") | ||
Convey("Testing the 'get' command with files that are in GridFS should", func() { | ||
testFiles := []string{"testfile1", "testfile2", "testfile3"} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of creating three test files, i.e. of the form testfile(\d)
, create four -- then, test that mongofiles ... get ...
operates only on three, but not four, test files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good stuff! I've just got a couple of small recommendations.
mongofiles/mongofiles.go
Outdated
case Put: | ||
// If mongofiles --put ... is called, i.e. with multiple supporting | ||
// arguments, then add gridFiles specified in mf.FileNameList | ||
for _, filename := range mf.FileNameList { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should still just be err = mf.handlePut()
. The loop should be inside handlePut()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I had it implemented that way for consistency the first time around, but @mattChiaravalloti recommended that I do it this way instead. I don't think it'd be a big deal one way or another, in any case.
@mattChiaravalloti Any opinions on this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I disagreed that the loop should be inside handlePut
since it had this code:
n, err := mf.put(id, mf.FileName)
if err != nil { ... }
log("copied n bytes")
log("added mf.FileName")
duplicated in the body of the method. It appeared in a loop if mf.FileNameList
was non-empty and outside the loop if it was empty.
Personally, I still think the abstraction as-is makes more sense (handlePut
handles putting a single file), but if you want to combine the FileNameList
and FileName
cases into one that's ok. I advise doing it this way instead of how you initially had it:
func (mf) handlePut() (...) {
id, err := mf.parseOrCreateID()
if err != nil { ... }
if len(mf.FileNameList) == 0 {
mf.FileNameList = []string{mf.FileName}
}
for _, filename := range mf.FileNameList {
n, err := mf.put(id, filename)
if err != nil { ... }
log("...")
}
...
}
This way, instead of having the same code in two branches of a conditional, you make the simplest conditional possible so the rest of the code follows a single path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I'm okay with the abstraction being for either a single or multiple files. But whatever it is I think it should be consistent with handleGet()
. Right now it isn't consistent and that feels a bit messy to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wouldn't be against say adding a handleGetID and handlePutID if this helps clean things up. I'm fine with any approach as long as it's consistent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. Making it consistent with handleGet
sounds good. No need to start introducing new methods yet. @hariamoor just try to avoid duplicate code inside the body of handlePut
!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mattChiaravalloti Followed your suggestion in be873f0.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, thank you. It looks good 👍 Good thing we caught the issue of having the id
logic in the loop (since my suggestion above is incorrect).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: Matt and I came to the understanding off-band that each file should have a unique ID. That's why I deviate from his suggestion and put parseOrCreateID
in the loop here.
mongofiles/mongofiles.go
Outdated
|
||
case PutID: | ||
err = mf.handlePut(mf.FileName) | ||
if err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't need to do the error check here. The error gets returned at the end of the function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm! Well done!
This PR implements logic to support a list of supporting arguments, i.e. file names, for
mongofiles put
andmongofiles get
. The intended behavior is as follows:The following (1) should put the specified files in the remote GridFS filesystem:
mongofiles put first.bson second.bson third.bson
The following (2) should put all files of the form
*.bson
in the current directory or any of its subdirectories:mongofiles put **/*.bson
mongofiles
should not evaluate the glob expression logic -- the underlying shell should evaluate the glob expression and callmongofiles
with a command of the form specified in (1).Furthermore, the following command should get the specified files from the remote GridFS filesystem:
mongofiles get first.bson second.bson third.bson
Notice that glob expressions do not work for
mongofiles get ...
-- this is because globs are generally used to index files within a Unix filesystem, which does not make sense relative to GridFS. However, TOOLS-2668 provides (upon completion) amongofiles get_regex ...
subcommand, which get's all the files in GridFS that match a given PCRE.Test Plan: We introduce two new integration tests for the multivariadic cases for
get
andput
. It is OK to remove the old test cases, because they're "homomorphic" to the new ones (the new ones test the subcommands for n files, and the old ones test for n=1 files).Other than that, there are some new unittests regarding argument parsing.