TOOLS-2667: Support list of files for put and get subcommands in mongofiles #283

hariamoor-zz · 2020-07-22T21:10:43Z

This PR implements logic to support a list of supporting arguments, i.e. file names, for mongofiles put and mongofiles get. The intended behavior is as follows:

The following (1) should put the specified files in the remote GridFS filesystem:

mongofiles put first.bson second.bson third.bson

The following (2) should put all files of the form *.bson in the current directory or any of its subdirectories:

mongofiles put **/*.bson

mongofiles should not evaluate the glob expression logic -- the underlying shell should evaluate the glob expression and call mongofiles with a command of the form specified in (1).

Furthermore, the following command should get the specified files from the remote GridFS filesystem:

mongofiles get first.bson second.bson third.bson

Note that glob expressions do not work for mongofiles get ... because globs match files within a Unix filesystem, which does not make sense for GridFS. However, TOOLS-2668 will provide (upon completion) a mongofiles get_regex ... subcommand, which gets all the files in GridFS that match a given PCRE.
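As an illustration of the intended behavior, here is a minimal Go sketch of splitting positional arguments into a subcommand and a list of file names. The helper name parseArgs and its error messages are hypothetical and are not taken from the mongofiles source; the point is only that the program receives an already-expanded list of names, since the shell evaluates any glob.

```go
package main

import (
	"errors"
	"fmt"
)

// parseArgs splits positional arguments into a subcommand and its file names.
// Hypothetical helper for illustration; not the actual mongofiles parser.
func parseArgs(args []string) (command string, fileNames []string, err error) {
	if len(args) == 0 {
		return "", nil, errors.New("no command specified")
	}
	command, fileNames = args[0], args[1:]
	switch command {
	case "put", "get":
		if len(fileNames) == 0 {
			return "", nil, fmt.Errorf("%q requires at least one file name", command)
		}
		return command, fileNames, nil
	default:
		return "", nil, fmt.Errorf("unsupported command %q", command)
	}
}

func main() {
	// The shell has already expanded any glob, so only plain names arrive here.
	command, files, err := parseArgs([]string{"put", "first.bson", "second.bson", "third.bson"})
	fmt.Println(command, files, err)
}
```

Only put and get are handled in this sketch; the other mongofiles commands are left out for brevity.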

Test Plan: We introduce two new integration tests for the multivariadic cases for get and put. It is OK to remove the old test cases, because they're "homomorphic" to the new ones (the new ones test the subcommands for n files, and the old ones test for n=1 files).

Other than that, there are some new unit tests for argument parsing.

huan-Mongo

Good job! I have some questions mainly about the tests and the $in

huan-Mongo · 2020-07-24T20:35:29Z

mongofiles/mongofiles_test.go

+				newFile := util.ToUniversalPath("testdata/" + fmt.Sprintf(copyFormat, i))
+
+				// Makes new copies of lorem ipsum file
+				err = func(src, dst string) error {


You don't need to create the files on the fly; it's easier for users if those files are pre-generated. We might also want to test with files that have different content.

Do you suggest that I create/download some new lorem ipsum files for this purpose?

Yea, you can just generate these files locally and push them with this PR.

Sure, I can generate two more lorem ipsum files of the same format and use stat(2) to keep track of the number of bytes.

huan-Mongo · 2020-07-24T20:38:23Z

mongofiles/mongofiles_test.go

-				Convey("and files should exist in gridfs", func() {
-					bytesGotten, err := getFilesAndBytesListFromGridFS()
-					So(err, ShouldBeNil)
-					So(len(bytesGotten), ShouldEqual, len(testFiles)+1)


This should be checked as well for total bytes gotten

I don't understand what you mean, since I check for the number of bytes gotten in lines 556-557. Could you be more specific about what exactly I'm missing?

Check number of files in bytesGotten -- GridFS should contain the test files and only the test files.

I just thought of a way to improve the current get test: instead of trying to output all the files written to the db, maybe you can write 4 files and have 3 filenames in the list?

mattChiaravalloti

Good effort! There is still some more refactoring to be done, though!

hariamoor-zz · 2020-07-27T18:03:55Z

mongofiles/mongofiles.go

+		// (see TOOLS-2667). Otherwise, if mongofiles --put_id is called, then
+		// preserve existing behavior.
+	case Put:
+		for _, file := range mf.FileNameList {


tfogo

This may have been confusing because there was a mistake in the ticket title, but these aren't options --put and --get, they're commands put and get. So it's mongofiles get <filename> not mongofiles --get <filename>.

tfogo · 2020-07-27T13:29:17Z

mongofiles/mongofiles.go

@@ -216,40 +234,52 @@ func (mf *MongoFiles) findGFSFiles(query bson.M) (files []*gfsFile, err error) {
 }

 // Gets the GridFS file the options specify. Use this for the get family of commands.
-func (mf *MongoFiles) getTargetGFSFile() (*gfsFile, error) {
+func (mf *MongoFiles) getTargetGFSFile() ([]*gfsFile, error) {


If this is returning more than one file, you need to rename it to getTargetGFSFiles()

Consider it done.
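For context, the review mentions a $in query for looking up several files at once. Below is a minimal sketch of what such a filter might look like, assuming the lookup is by the filename field of the GridFS files collection. Plain maps stand in for bson.M here so that the example has no external dependencies, and the helper name filenameInQuery is hypothetical.

```go
package main

import "fmt"

// filenameInQuery builds a filter document matching any of the given names.
// Hypothetical helper; a plain map stands in for bson.M in this sketch.
func filenameInQuery(names []string) map[string]interface{} {
	return map[string]interface{}{
		"filename": map[string]interface{}{"$in": names},
	}
}

func main() {
	query := filenameInQuery([]string{"testfile1", "testfile2", "testfile3"})
	fmt.Println(query)
}
```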

hariamoor-zz · 2020-07-27T17:03:18Z

This may have been confusing because there was a mistake in the ticket title, but these aren't options --put and --get, they're commands put and get. So it's mongofiles get <filename> not mongofiles --get <filename>.

Thanks for the clarification on this! I will edit the comments I wrote with --put and --get as specified.

hariamoor-zz · 2020-07-28T21:35:03Z

mongofiles/mongofiles_test.go

-		Convey("Testing the 'get' command with a file that is in GridFS should", func() {
-			mf, err := simpleMongoFilesInstanceWithFilename("get", "testfile1")
+		Convey("Testing the 'get' command with files that are in GridFS should", func() {
+			testFiles := []string{"testfile1", "testfile2", "testfile3"}


Instead of creating three test files, i.e. of the form testfile(\d), create four -- then, test that mongofiles ... get ... operates only on three, but not four, test files.
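The shape of that check can be sketched as follows, assuming an in-memory map as a stand-in for GridFS. The fakeStore type and its get method are hypothetical and exist only to show the idea: four files are stored, three are requested, and the fourth must not come back.

```go
package main

import "fmt"

// fakeStore is a hypothetical in-memory stand-in for GridFS.
type fakeStore map[string][]byte

// get returns the contents of the requested files that exist in the store.
func (s fakeStore) get(names []string) map[string][]byte {
	out := make(map[string][]byte)
	for _, name := range names {
		if data, ok := s[name]; ok {
			out[name] = data
		}
	}
	return out
}

func main() {
	// Four files are stored, but only three are requested.
	store := fakeStore{
		"testfile1": []byte("one"),
		"testfile2": []byte("two"),
		"testfile3": []byte("three"),
		"testfile4": []byte("four"),
	}
	requested := []string{"testfile1", "testfile2", "testfile3"}
	got := store.get(requested)

	_, gotExtra := got["testfile4"]
	fmt.Println("fetched:", len(got), "unrequested file fetched:", gotExtra)
}
```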

tfogo

Good stuff! I've just got a couple of small recommendations.

tfogo · 2020-07-29T12:58:59Z

mongofiles/mongofiles.go

+	case Put:
+		// If mongofiles --put ... is called, i.e. with multiple supporting
+		// arguments, then add gridFiles specified in mf.FileNameList
+		for _, filename := range mf.FileNameList {


I think this should still just be err = mf.handlePut(). The loop should be inside handlePut().

Yeah, I had it implemented that way for consistency the first time around, but @mattChiaravalloti recommended that I do it this way instead. I don't think it'd be a big deal one way or another, in any case.

@mattChiaravalloti Any opinions on this?

I disagreed that the loop should be inside handlePut since it had this code:

n, err := mf.put(id, mf.FileName)
if err != nil { ... }
log("copied n bytes")
log("added mf.FileName")

duplicated in the body of the method. It appeared in a loop if mf.FileNameList was non-empty and outside the loop if it was empty.

Personally, I still think the abstraction as-is makes more sense (handlePut handles putting a single file), but if you want to combine the FileNameList and FileName cases into one that's ok. I advise doing it this way instead of how you initially had it:

func (mf) handlePut() (...) {
	id, err := mf.parseOrCreateID()
	if err != nil { ... }
	if len(mf.FileNameList) == 0 {
		mf.FileNameList = []string{mf.FileName}
	}
	for _, filename := range mf.FileNameList {
		n, err := mf.put(id, filename)
		if err != nil { ... }
		log("...")
	}
	...
}

This way, instead of having the same code in two branches of a conditional, you make the simplest conditional possible so the rest of the code follows a single path.

Yeah I'm okay with the abstraction being for either a single or multiple files. But whatever it is I think it should be consistent with handleGet(). Right now it isn't consistent and that feels a bit messy to me.

I wouldn't be against say adding a handleGetID and handlePutID if this helps clean things up. I'm fine with any approach as long as it's consistent.

I see. Making it consistent with handleGet sounds good. No need to start introducing new methods yet. @hariamoor just try to avoid duplicate code inside the body of handlePut!

@mattChiaravalloti Followed your suggestion in be873f0.

Yes, thank you. It looks good 👍 Good thing we caught the issue of having the id logic in the loop (since my suggestion above is incorrect).

Note: Matt and I agreed out of band that each file should have a unique ID. That's why I deviate from his suggestion and put parseOrCreateID in the loop here.
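A minimal sketch of the resulting shape, under stated assumptions: the uploader type, its integer IDs, and the newID and put helpers are hypothetical stand-ins, not the MongoFiles implementation. It shows the two points agreed on in this thread: the single-file case is folded into the list so there is one code path, and a fresh ID is generated inside the loop for each file.

```go
package main

import (
	"errors"
	"fmt"
)

// uploader is a hypothetical stand-in for the MongoFiles type.
type uploader struct {
	FileName     string
	FileNameList []string
	nextID       int
	stored       map[int]string // ID -> file name
}

// newID returns a fresh ID on every call.
func (u *uploader) newID() int {
	u.nextID++
	return u.nextID
}

// put records a file under the given ID.
func (u *uploader) put(id int, name string) error {
	if name == "" {
		return errors.New("empty file name")
	}
	u.stored[id] = name
	return nil
}

// handlePut uploads every file, generating one ID per file inside the loop.
func (u *uploader) handlePut() error {
	names := u.FileNameList
	if len(names) == 0 {
		// Fold the single-file case into the list so only one path follows.
		names = []string{u.FileName}
	}
	for _, name := range names {
		id := u.newID()
		if err := u.put(id, name); err != nil {
			return fmt.Errorf("putting %q: %w", name, err)
		}
	}
	return nil
}

func main() {
	u := &uploader{
		FileNameList: []string{"first.bson", "second.bson", "third.bson"},
		stored:       make(map[int]string),
	}
	fmt.Println(u.handlePut(), len(u.stored))
}
```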

tfogo · 2020-07-29T12:59:26Z

mongofiles/mongofiles.go

+
+	case PutID:
+		err = mf.handlePut(mf.FileName)
+		if err != nil {


You don't need to do the error check here. The error gets returned at the end of the function.
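To illustrate the point, here is a small sketch in which each case of a switch only assigns err and the function returns it once at the end, so no per-case check is needed. The run, doPut, and doGet names are hypothetical and do not come from the mongofiles source.

```go
package main

import (
	"errors"
	"fmt"
)

// doPut and doGet are hypothetical stand-ins for the real handlers.
func doPut() error { return nil }
func doGet() error { return errors.New("file not found") }

// run dispatches on the command; each case only assigns err, and the single
// return at the end reports it, so no case needs its own error check.
func run(command string) error {
	var err error
	switch command {
	case "put":
		err = doPut()
	case "get":
		err = doGet()
	default:
		err = fmt.Errorf("unknown command %q", command)
	}
	return err
}

func main() {
	fmt.Println(run("put"))
	fmt.Println(run("get"))
}
```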

mattChiaravalloti

looks good 👍

huan-Mongo

LGTM! 👍

tfogo

lgtm! Well done!

Hari Amoor added 10 commits July 22, 2020 16:59

Implemented new --put and --get subcommands and modified existing tests

31f5118

Formatting with go fmt and golangci-lint

cd3f1a7

Create tests for new mongofiles --get and mongofiles --put implementa…

58ead50

…tions

Linting changes

e47421e

Change new test name

bd616f1

Merge branch 'master' of https://github.com/mongodb/mongo-tools into …

29cd9fa

…tools-2667

Remove tests for old --get and --put functionality -- no longer needed

20597d6

Fix formatting for lint-go task

93b2722

go fmt -s ...

40d33a2

Parametrize (*MongoFiles).handlePut(...) and separate cases for --put

7e4b968

and --put_id

hariamoor-zz marked this pull request as ready for review July 23, 2020 19:45

hariamoor-zz requested review from tfogo, huan-Mongo and mattChiaravalloti July 23, 2020 19:46

hariamoor-zz self-assigned this Jul 24, 2020

huan-Mongo requested changes Jul 24, 2020

mattChiaravalloti requested changes Jul 24, 2020

tfogo requested changes Jul 27, 2020

Hari Amoor added 3 commits July 27, 2020 12:29

Code review changes to mongofiles.go

accdd4b

Code review changes to options_test.go

29c026f

Revert "Remove tests for old --get and --put functionality -- no long…

2e8bc4e

…er needed" Restore existing test for mongofiles --get ... --local ...

Change comments to show that put and get are subcommands, not opt…

c9b9a90

…ions

hariamoor-zz changed the title ~~TOOLS-2667: Support list of files for --put and --get in mongofiles~~ TOOLS-2667: Support list of files for put and get subcommands in mongofiles Jul 27, 2020

Change test structure to check (*MongoFiles).FileNameList as well as …

b7380a3

…(*MongoFiles).FileName

hariamoor-zz mentioned this pull request Jul 28, 2020

TOOLS-2667: Support list of files for put and get subcommands in mongofiles (verify behavior of $in query) #286

Closed

Add new test files for polyadic mongofiles put

55a7352

hariamoor-zz commented Jul 28, 2020

Hari Amoor added 2 commits July 28, 2020 17:44

Test harness for new lorem ipsum files

e5da67f

Make sure that files not included in the query are not copied in get

3deadb4

Hari Amoor added 3 commits July 28, 2020 18:01

Rename local variable so that it doesn't shadow global variable

893a590

Move check for unincluded test file to parent Convey block

ff10ff1

Check that only the required test files are in GridFS

d8699dd

hariamoor-zz requested review from tfogo, huan-Mongo and mattChiaravalloti July 28, 2020 23:24

tfogo requested changes Jul 29, 2020

Hari Amoor added 4 commits July 29, 2020 10:39

Remove unnecessary error check

c9805ec

Reconcile symmetry between handleGet and handlePut

e007c29

Revert "Reconcile symmetry between handleGet and handlePut"

615dac2

This reverts commit e007c29.

Reconcile handleGet abstraction with handlePut

be873f0

mattChiaravalloti approved these changes Jul 29, 2020

hariamoor-zz requested a review from tfogo July 29, 2020 18:28

huan-Mongo reviewed Jul 29, 2020

huan-Mongo approved these changes Jul 29, 2020

Fix help message

49f6913

tfogo approved these changes Jul 31, 2020

hariamoor-zz merged commit de2083b into mongodb:master Jul 31, 2020

hariamoor-zz deleted the tools-2667 branch July 31, 2020 19:38
