--dryrun to include download estimate as well as check for free space at destination #6863

Open
1 of 2 tasks
hossimo opened this issue Apr 8, 2022 · 3 comments
Labels
feature-request (A feature should be added or improved) · p3 (This is a minor priority issue) · s3sync

Comments


hossimo commented Apr 8, 2022

Describe the feature

When using the --dryrun flag, it would be great to get an idea of how much data will be downloaded and whether there would be enough space at the destination for the files being downloaded.

For example:

aws s3 sync s3://test-bucket . --dryrun --human
(dryrun) download: s3://test-bucket/file1 to ./file1 (500 GiB)
(dryrun) download: s3://test-bucket/file2 to ./file2 (1 TiB)
Total: 1.5 TiB
Destination Free: 1 TiB
WARNING: Destination does not have enough space for all files.

Use Case

I'm often moving large (multi-hundred-GB) files in large (multi-TB) batches, and while I usually have a handle on how much space remains on my destination drive or share, there have been times when I did not realize that a sync might be larger than the remaining space.

Having some indication of the amount of data being transferred while doing a --dryrun would make this clear.

Proposed Solution

Either add the functionality to --dryrun or, if additional API calls are a concern, add a --dryrun-with-size flag (not an inspired name). This would display the size in bytes; adding the --human flag would change it to a human-readable size, as ls currently does.
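
As far as I understand, the per-object size is already part of the S3 listing response that sync relies on, so the extra cost of reporting a total should be small. As a rough illustration, the total for the example bucket above can already be computed today with something along these lines (an untested sketch; the bucket name is just the placeholder from the example):

# Sum of all object sizes in the bucket, in bytes; the CLI paginates the listing automatically.
aws s3api list-objects-v2 --bucket test-bucket --query "sum(Contents[].Size)" --output text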

Other Information

Some issues or edge cases:

  • The system would need to know about underlying file shares (NFS, SMB)
  • Would cause additional API calls
  • What happens when there is not enough space at the destination (just warn?)
  • Changing --dryrun might cause other applications to break?

Acknowledgements

  • I may be able to implement this feature request
  • This feature might incur a breaking change

CLI version used

2.4.15

Environment details (OS name and version, etc.)

Amazon Linux, macOS, Windows

hossimo added the feature-request and needs-triage labels on Apr 8, 2022
kdaily changed the title from "--dryrun to include download estimate as well as check for free space at destication" to "--dryrun to include download estimate as well as check for free space at destination" on Apr 8, 2022
aaoswal added the needs-review and s3sync labels and removed the needs-triage label on Apr 8, 2022

aaoswal commented Apr 8, 2022

Hi @hossimo,
Thank you for submitting the feature request. We will review and prioritize this request.

I would also like to ask the community to 👍🏼 this feature request to help us better understand community needs.

aaoswal self-assigned this on Apr 8, 2022
aaoswal removed the needs-review label on Apr 13, 2022

aaoswal commented Apr 19, 2022

Hi @hossimo,
After reviewing this feature request with the team, I can confirm that showing the overall size of the transfer when using S3 dry run commands would be viable.

With regard to checking the remaining disk space, the team is concerned that it may be difficult to reliably determine the space remaining at the destination for a transfer. It also raises consistency concerns when the target is an S3 bucket.

A workaround here would be to find the size of the bucket using:

aws s3 ls my-bucket-name --summarize --recursive --human-readable | tail -1

and then use the du command to get the size of the directory.
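
Put together, a rough manual check before a download sync might look something like this (a sketch only; the bucket name and local path are placeholders):

# Total size of the objects in the source bucket.
aws s3 ls s3://my-bucket-name --summarize --recursive --human-readable | tail -1
# Current size of the local destination directory.
du -sh ./local-destination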

Also, I will be updating the title of the issue to "Indicate the amount of data being transferred during S3 dry run commands (e.g. cp, sync)" to clarify the use case as well as help others find the issue.

Our team recently put out a proposal in #6828 detailing improvements to the contribution process, and we are currently reviewing all open PRs and issues. This issue is now in the intake stage. Once we determine that there is more user impact, as measured by 👍🏻 reactions on this issue, we can move forward with an implementation. You can read more about this stage and the rest of the process here.

Thank you so much for your patience.


hossimo commented Apr 19, 2022

Thanks for the reply. Good point on --summarize, I had forgotten about that; it's a fair workaround for the time being.

As for the remaining space, personally I use the df -h . command, which always seems to work even with a file share.
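
For anyone wanting a single machine-readable number to compare against the bucket total, something like this should also work (a sketch; it assumes the POSIX output format of df):

# Available space on the destination filesystem, in 1024-byte blocks.
df -Pk . | awk 'NR==2 {print $4}'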

tim-finnigan added the p3 label on Nov 9, 2022