
add "limit" option to "ls" and "la" commands to return the specified number of objects instead of returning all objects. #757

Merged
merged 1 commit on Sep 1, 2016

Conversation

mozawa (Contributor) commented Jun 27, 2016

Currently the "ls" command returns all objects within a bucket under a given prefix (using the delimiter '/'), and there seems to be no option to limit the number of returned objects. If the contents of the main folder or a sub-folder are huge, it keeps listing until every object has been returned. This can take a long time and can put unnecessary load on the S3 server side.

This change adds a "max-ls-keys" option to the "ls" command so it returns the specified number of objects instead of all objects.

fviard (Contributor) commented Jun 27, 2016

@mozawa Thank you for your contribution, I think the idea is good.
One point: it would be better to rename the option to something more generic, like "limit", so it can also be reused in other contexts.

mozawa (Contributor, Author) commented Jun 27, 2016

@fviard Thank you for the comment. You mean it's better to name the option "--limit" rather than "--max-ls-keys"? If you have a preferred option name, please let me know and I will change it.

fviard (Contributor) commented Jun 27, 2016

Yes, I think "--limit" instead of "--max-ls-keys" will be more understandable and reusable.
"limit" seems good to me, but it is just the first name that came to mind, so it's no problem if you come up with a better generic name.

@mozawa mozawa changed the title from "add "max-ls-keys" option to ls command to return the specified number of objects instead of returning all objects." to "add "limit" option to ls command to return the specified number of objects instead of returning all objects." on Jun 28, 2016
mozawa (Contributor, Author) commented Jun 28, 2016

@fviard Thanks. I made changes.

@mozawa mozawa changed the title from "add "limit" option to ls command to return the specified number of objects instead of returning all objects." to "add "limit" option to "ls" and "la" commands to return the specified number of objects instead of returning all objects." on Jun 30, 2016
mozawa (Contributor, Author) commented Aug 4, 2016

@fviard I changed my mind a little. With my current changes, the "ls" and "la" commands will still return all objects in the bucket by default, even if the bucket is huge (e.g. millions of objects). This could also overload the S3 server. So I'm going to change the behavior as follows:

  • Add an internal default limit, e.g. 100000.
  • The "ls" or "la" command returns objects up to the internal default. If the list is truncated, it says so, to let the user know.
  • The "ls" or "la" command with "--limit=10" returns up to 10 objects. If the list is truncated, it says so, to let the user know.
  • To get all objects, use "--limit=-1" or an "--unlimit" option.
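The behavior proposed above can be sketched roughly as follows. This is a minimal illustration of the semantics, not s3cmd's actual code: `DEFAULT_LIMIT`, `list_with_limit`, and the `warning()` helper are all hypothetical names standing in for whatever the patch uses.

```python
import itertools
import sys

DEFAULT_LIMIT = 100000  # hypothetical internal default from the proposal


def warning(msg):
    # Warn on stderr so stdout stays clean for machine parsing.
    sys.stderr.write("WARNING: %s\n" % msg)


def list_with_limit(objects, limit=DEFAULT_LIMIT):
    """Yield at most `limit` objects; limit=-1 means unlimited."""
    if limit == -1:
        yield from objects
        return
    it = iter(objects)
    for obj in itertools.islice(it, limit):
        yield obj
    # If anything is left after `limit` items, the listing was truncated.
    if next(it, None) is not None:
        warning("Listing truncated because the limit of %d was reached" % limit)


# Usage:
#   list(list_with_limit(range(5), limit=3))  -> [0, 1, 2] (plus a warning)
#   list(list_with_limit(range(5), limit=-1)) -> [0, 1, 2, 3, 4]
```

The point of the design is that the warning goes to stderr while the (possibly truncated) listing stays on stdout, matching the later suggestion to use warning-style output rather than normal output.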

What do you think?

fviard (Contributor) commented Aug 31, 2016

Hi @mozawa: I'm sorry for the long delay in replying (my holidays are probably to blame).
I like the concept of --limit.
Here are my remarks:

  • Regarding your previous message, I think it is better to keep "unlimited" as the default.
  • It would be good to list the commands that support "--limit" in the option's help string, since support is quite limited for the moment.
  • The truncation message is good, but it could probably be reduced to a single line. There is no need to explain that the user has to remove the limit, but for the first message you could add something like "because the settings limit was reached".
  • For this message, I would advise using "warning()" instead of "output()", so the user is warned without corrupting the "output" in case it is machine-parsed.
  • I think your change has to be a little more complicated. Amazon S3 doesn't return more than 1000 keys in one response, and "max-keys" is only the maximum for a single request, not the total number of entries; it is supposed to be less than or equal to 1000.
    So what you have to do is add some accounting when a limit is set: sum the number of entries returned by each request, and set "max-keys" on the last request when there are fewer than 1000 keys left to return. Alternatively, you can set "max-keys" on every request and decrease it by the number of entries already returned, but take care that Amazon may raise an error if this number exceeds 1000.
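The pagination math described in the last point can be sketched like this. It is a simplified illustration, not s3cmd's real request code: `list_objects(prefix, marker, max_keys)` is a hypothetical callable standing in for one S3 ListObjects request, returning the page of keys and an is-truncated flag.

```python
S3_MAX_KEYS = 1000  # S3 returns at most 1000 keys per ListObjects request


def list_up_to_limit(list_objects, prefix, limit):
    """Accumulate listing pages until `limit` entries are collected.

    `list_objects(prefix, marker, max_keys)` is a hypothetical callable
    returning (entries, truncated_flag). limit=-1 means unlimited.
    Returns (entries, truncated_by_limit).
    """
    collected = []
    marker = None
    while True:
        if limit == -1:
            max_keys = S3_MAX_KEYS
        else:
            remaining = limit - len(collected)
            if remaining <= 0:
                return collected, True  # cut off by the user's limit
            # Never ask S3 for more than it can return in one request,
            # and on the last request ask only for what is still needed.
            max_keys = min(remaining, S3_MAX_KEYS)
        entries, truncated = list_objects(prefix, marker, max_keys)
        collected.extend(entries)
        if not truncated:
            return collected, False  # bucket exhausted before the limit
        marker = entries[-1]  # resume the next request after the last key
```

This implements exactly the suggested accounting: `max-keys` is capped at 1000 on every request, and only the final request shrinks it to the number of entries still needed.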

mozawa (Contributor, Author) commented Sep 1, 2016

@fviard Thank you for the advice. I made some changes based on your suggestions. Please review again. Thanks!

fviard merged commit bb2a5af into s3tools:master on Sep 1, 2016
fviard (Contributor) commented Sep 1, 2016

Merged, thank you very much!

mozawa (Contributor, Author) commented Sep 2, 2016

@fviard Thank you. BTW, I'd like to know when the next official release including this change will be. The latest is 1.6.1, which was released on 2016-01-20, I believe.

fviard (Contributor) commented Sep 2, 2016

@mozawa Time passes fast ;-) There could be a new release soon, but I would still like to have one or two issues resolved first.
