Ah/chore/update #8

Merged
merged 52 commits into from Mar 4, 2024


@Ahuge Ahuge commented Mar 4, 2024

No description provided.

igungor and others added 30 commits February 2, 2023 22:16
It uses an external sort instead of an in-memory sort, which dramatically reduces memory usage at the expense of speed.
It stores intermediate data on disk in encoding/gob format and restores it from there.

Fixes peak#441
Fixes peak#447
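
To illustrate the idea, here is a minimal sketch of an external sort that spills sorted chunks to disk with encoding/gob and then merges them. It is not the code from this commit: the names `spill`, `externalSort`, `run` and `runHeap` are hypothetical, and the input is a slice only to keep the example short (a real listing would be streamed).

```go
package main

import (
	"container/heap"
	"encoding/gob"
	"fmt"
	"io"
	"os"
	"sort"
)

// run is one sorted chunk on disk plus the next value decoded from it.
type run struct {
	dec  *gob.Decoder
	head string
}

// runHeap is a min-heap of runs ordered by their current head value.
type runHeap []*run

func (h runHeap) Len() int            { return len(h) }
func (h runHeap) Less(i, j int) bool  { return h[i].head < h[j].head }
func (h runHeap) Swap(i, j int)       { h[i], h[j] = h[j], h[i] }
func (h *runHeap) Push(x interface{}) { *h = append(*h, x.(*run)) }
func (h *runHeap) Pop() interface{} {
	old := *h
	last := old[len(old)-1]
	*h = old[:len(old)-1]
	return last
}

// spill sorts one chunk in memory and writes it to a temp file as a gob stream.
func spill(chunk []string) (*os.File, error) {
	sort.Strings(chunk)
	f, err := os.CreateTemp("", "extsort-*.gob")
	if err != nil {
		return nil, err
	}
	enc := gob.NewEncoder(f)
	for _, s := range chunk {
		if err = enc.Encode(s); err != nil {
			break
		}
	}
	if err == nil {
		_, err = f.Seek(0, io.SeekStart) // rewind so the chunk can be read back
	}
	if err != nil {
		f.Close()
		os.Remove(f.Name())
		return nil, err
	}
	return f, nil
}

// externalSort keeps at most chunkSize items sorted in memory at a time and
// passes all items to emit in ascending order.
func externalSort(items []string, chunkSize int, emit func(string)) error {
	if chunkSize < 1 {
		return fmt.Errorf("chunkSize must be positive, got %d", chunkSize)
	}
	var files []*os.File
	defer func() { // temp files are removed whether or not the sort succeeds
		for _, f := range files {
			f.Close()
			os.Remove(f.Name())
		}
	}()
	h := &runHeap{}
	for start := 0; start < len(items); start += chunkSize {
		end := start + chunkSize
		if end > len(items) {
			end = len(items)
		}
		f, err := spill(append([]string(nil), items[start:end]...))
		if err != nil {
			return err
		}
		files = append(files, f)
		r := &run{dec: gob.NewDecoder(f)}
		if err := r.dec.Decode(&r.head); err != nil {
			return err
		}
		heap.Push(h, r)
	}
	// k-way merge: always emit the smallest head, then advance that run.
	for h.Len() > 0 {
		r := (*h)[0]
		emit(r.head)
		err := r.dec.Decode(&r.head)
		if err == io.EOF {
			heap.Pop(h) // this run is exhausted
			continue
		}
		if err != nil {
			return err
		}
		heap.Fix(h, 0)
	}
	return nil
}

func main() {
	items := []string{"c", "a", "e", "b", "d"}
	err := externalSort(items, 2, func(s string) { fmt.Println(s) })
	if err != nil {
		fmt.Println("error:", err)
	}
}
```

The trade-off matches the description above: memory is bounded by the chunk size, and the extra disk round trip is what costs speed.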
This commit adds versioning support to s5cmd.

Added --all-versions flag to ls, rm, du and select subcommands
to apply the operation to all versions of the objects.
Added --version-id flag to cat, cp/mv, rm, du and select
subcommands to apply the operation to a specific version of the object.
Added bucket-version command to configure bucket versioning. Given a bucket name
alone, it returns the versioning status of the bucket. Bucket versioning can
be configured with set flag which only accepts.
Added --raw flag to cat and select subcommands. It disables wildcard operations.
Note: Google Cloud Storage uses a different approach for versioning, so with the current implementation s5cmd cannot use or retrieve generation numbers. However, the bucket-version command and the du command with the all-versions flag work as expected since they do not use version IDs.

Fixes: peak#218
Fixes: peak#386
Fixes: peak#539
Co-authored-by: Onur Sönmez <onursonmez@peak.com>
This PR updates the help messages for the commands that take arguments containing wildcards. Since wildcards are expanded by the shell before s5cmd runs, there is no way for the app to prevent the expansion. We urge users to escape globbing by wrapping the related argument in quotes. The escaping method can differ between shell implementations.

Fixes peak#555
Resolves peak#479
Local files used to be overwritten even if downloads failed.
Solved it by creating a temporary file and renaming it with the original filename after completing the download successfully.
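
A minimal sketch of that temp-file-then-rename approach follows. The names `writeAtomically`, `failingReader` and `demo` are hypothetical and not taken from the s5cmd source; the temporary file is created next to the destination so the rename stays on one filesystem.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"os"
	"path/filepath"
	"strings"
)

// writeAtomically copies src into a temporary file next to dst and renames it
// over dst only if the whole copy succeeded. On failure dst is left untouched.
func writeAtomically(dst string, src io.Reader) (err error) {
	tmp, err := os.CreateTemp(filepath.Dir(dst), ".download-*")
	if err != nil {
		return err
	}
	defer func() {
		if err != nil { // clean up the partial file on any failure
			tmp.Close()
			os.Remove(tmp.Name())
		}
	}()
	if _, err = io.Copy(tmp, src); err != nil {
		return err
	}
	if err = tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmp.Name(), dst)
}

// failingReader simulates a download that fails.
type failingReader struct{}

func (failingReader) Read([]byte) (int, error) {
	return 0, errors.New("connection reset")
}

// demo returns the content of an existing file after a failed download and
// then after a successful one.
func demo() (afterFailure, afterSuccess string, err error) {
	dir, err := os.MkdirTemp("", "atomic-demo")
	if err != nil {
		return "", "", err
	}
	defer os.RemoveAll(dir)
	dst := filepath.Join(dir, "object.txt")
	if err := os.WriteFile(dst, []byte("old"), 0o644); err != nil {
		return "", "", err
	}
	if err := writeAtomically(dst, failingReader{}); err == nil {
		return "", "", errors.New("expected the download to fail")
	}
	data, err := os.ReadFile(dst)
	if err != nil {
		return "", "", err
	}
	afterFailure = string(data)
	if err := writeAtomically(dst, strings.NewReader("new")); err != nil {
		return "", "", err
	}
	data, err = os.ReadFile(dst)
	return afterFailure, string(data), err
}

func main() {
	failed, succeeded, err := demo()
	fmt.Println(failed, succeeded, err)
}
```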
The flag now shows the date, ETag etc. of the objects.

Updates peak#596 .
Co-authored-by: İbrahim Güngör <igungor@gmail.com>
Co-authored-by: Selman Kayrancioglu <seruman@users.noreply.github.com>
This PR adds a new io.WriterAt adapter for non-seekable writers. It uses an internal linked list to order the incoming chunks. The implementation is independent of the download manager of aws-sdk-go, and because of that it currently cannot bound the memory usage. To limit the memory usage, we would have had to write a custom manager to replace the aws-sdk-go implementation, which seemed infeasible.

The new implementation is about 25% faster than the older implementation for a 9.4 GB file with partSize=50MB and concurrency=20, at the cost of significantly higher memory usage: on average it uses 0.9 GB of memory, and a peak of 2.1 GB was observed. The memory usage and performance depend on the partSize-concurrency configuration and the link.

Resolves peak#245
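
As a rough illustration of the adapter idea, the sketch below buffers out-of-order chunks in a linked list and flushes them to a sequential writer once they become contiguous. It is not the implementation from this PR: `orderedWriterAt`, `chunk` and `demo` are hypothetical names, and it assumes chunks never overlap and each offset is written once. Like the PR, it does not bound the amount of buffered data.

```go
package main

import (
	"bytes"
	"container/list"
	"fmt"
	"io"
	"sync"
)

// chunk is a buffered write that arrived before its turn.
type chunk struct {
	offset int64
	data   []byte
}

// orderedWriterAt adapts a sequential io.Writer to io.WriterAt.
type orderedWriterAt struct {
	mu      sync.Mutex
	w       io.Writer
	written int64      // next offset the underlying writer expects
	pending *list.List // buffered chunks, kept sorted by offset
}

func newOrderedWriterAt(w io.Writer) *orderedWriterAt {
	return &orderedWriterAt{w: w, pending: list.New()}
}

func (o *orderedWriterAt) WriteAt(p []byte, off int64) (int, error) {
	o.mu.Lock()
	defer o.mu.Unlock()

	if off != o.written {
		// Out of order: keep a copy, inserted so the list stays sorted.
		c := &chunk{offset: off, data: append([]byte(nil), p...)}
		for e := o.pending.Front(); e != nil; e = e.Next() {
			if e.Value.(*chunk).offset > off {
				o.pending.InsertBefore(c, e)
				return len(p), nil
			}
		}
		o.pending.PushBack(c)
		return len(p), nil
	}

	// In order: write through, then drain chunks that are now contiguous.
	n, err := o.w.Write(p)
	o.written += int64(n)
	if err != nil {
		return n, err
	}
	for e := o.pending.Front(); e != nil && e.Value.(*chunk).offset == o.written; e = o.pending.Front() {
		c := o.pending.Remove(e).(*chunk)
		m, err := o.w.Write(c.data)
		o.written += int64(m)
		if err != nil {
			return n, err
		}
	}
	return n, nil
}

// demo writes three chunks in reverse order and returns what reached the
// underlying sequential writer.
func demo() (string, error) {
	var buf bytes.Buffer
	w := newOrderedWriterAt(&buf)
	writes := []struct {
		data string
		off  int64
	}{{"!", 11}, {"world", 6}, {"hello ", 0}}
	for _, wr := range writes {
		if _, err := w.WriteAt([]byte(wr.data), wr.off); err != nil {
			return "", err
		}
	}
	return buf.String(), nil
}

func main() {
	out, err := demo()
	fmt.Printf("%q %v\n", out, err)
}
```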

Co-authored-by: İbrahim Güngör <igungor@gmail.com>
Co-authored-by: İbrahim Güngör <igungor@gmail.com>
Resolves peak#51

Enables the show-progress flag of the cp command, which shows a progress bar for a copy task.

Example progress bar:
78.00%  ━━━━━━━━━━━━━━━━━━━━────────  780 MB / 1.00 GB (189.37 MB/s) 5.9s (28/29)
Co-authored-by: Selman Kayrancioglu <seruman@users.noreply.github.com>
Co-authored-by: İbrahim Güngör <igungor@gmail.com>
Co-authored-by: Selman Kayrancioglu <seruman@users.noreply.github.com>
Co-authored-by: İbrahim Güngör <igungor@gmail.com>
GCS now supports ListObjectsV2 XML API according to their changelog.

cloud.google.com/storage/docs/release-notes#July_12_2021

I choose to believe the doc because I haven't tested it :)

Fixes peak#513
A file could be deleted/renamed/moved after s5cmd calls List(). This
might happen when multiple processes do file operations on the same
folder.

Resolves peak#619
Resolves peak#559
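
One way to tolerate this race, sketched under the assumption that a vanished file should simply be skipped (the commit message does not say how s5cmd handles it). The names `sizesOf` and `demo` are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// sizesOf stats previously listed paths and skips any that no longer exist,
// instead of failing the whole operation.
func sizesOf(paths []string) (map[string]int64, error) {
	sizes := make(map[string]int64)
	for _, p := range paths {
		info, err := os.Stat(p)
		if errors.Is(err, fs.ErrNotExist) {
			continue // removed or renamed after it was listed
		}
		if err != nil {
			return nil, err
		}
		sizes[p] = info.Size()
	}
	return sizes, nil
}

// demo lists two files, removes one of them as another process might, and
// returns how many files were still processed.
func demo() (int, error) {
	dir, err := os.MkdirTemp("", "list-race")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(dir)
	var listed []string
	for _, name := range []string{"a.txt", "b.txt"} {
		p := filepath.Join(dir, name)
		if err := os.WriteFile(p, []byte("x"), 0o644); err != nil {
			return 0, err
		}
		listed = append(listed, p)
	}
	if err := os.Remove(listed[0]); err != nil {
		return 0, err
	}
	sizes, err := sizesOf(listed)
	return len(sizes), err
}

func main() {
	n, err := demo()
	fmt.Println(n, err)
}
```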
Rajpratik71 and others added 22 commits July 31, 2023 15:44
Managing dependencies manually is fine, but with a growing number of distributed upstream dependencies it becomes hard to keep up, so dependency updates should be automated. The CI pipeline is already there to test those changes. The current build has many old dependencies, and many vulnerabilities were found because of them, which is why workflow automation will help here.
This will not update the dependencies automatically; instead, a PR will be opened with the changes, which can be reviewed and tested with CI.

Signed-off-by: Pratik Raj <rajpratik71@gmail.com>
Resolves peak#564 

Changes made:

- Added `exit-on-error` flag. Its value is `false` by default.
- Added `shouldStopSync` function. It determines whether a sync process
should be stopped or not. It does not ignore the errors `AccessDenied`
and `NoSuchBucket` regardless of the value of `exit-on-error` flag.
- `sync` command stops if an error is received when listing objects from
source or destination when the `exit-on-error` flag is `true`. But it
always ignores the `ErrNoObjectFound` error.
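
The decision rules in the list above can be sketched as follows. Only the function name `shouldStopSync` and the error names come from the commit message; the signature, the `codedError` type and the `errNoObjectFound` variable are hypothetical stand-ins for the real s5cmd and AWS SDK types.

```go
package main

import (
	"errors"
	"fmt"
)

// errNoObjectFound is a stand-in for the ErrNoObjectFound error.
var errNoObjectFound = errors.New("no object found")

// codedError is a stand-in for a storage error that carries an error code.
type codedError struct{ code string }

func (e *codedError) Error() string { return e.code }

// shouldStopSync reports whether a listing error should stop the sync.
func shouldStopSync(err error, exitOnError bool) bool {
	if err == nil || errors.Is(err, errNoObjectFound) {
		return false // always ignored, regardless of the flag
	}
	var ce *codedError
	if errors.As(err, &ce) {
		switch ce.code {
		case "AccessDenied", "NoSuchBucket":
			return true // never ignored, regardless of the flag
		}
	}
	return exitOnError // every other error follows the flag
}

func main() {
	fmt.Println(shouldStopSync(errNoObjectFound, true))
	fmt.Println(shouldStopSync(&codedError{code: "AccessDenied"}, false))
	fmt.Println(shouldStopSync(&codedError{code: "SlowDown"}, false))
	fmt.Println(shouldStopSync(&codedError{code: "SlowDown"}, true))
}
```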

---------

Co-authored-by: İlkin Balkanay <ilkinulas@gmail.com>
Co-authored-by: Onur Sönmez <onursonmez@peak.com>
Resolves peak#498 

Changes made:

- `cp` command checks file mode bits before trying to copy. If the file
is a special file, `cp` does not copy that file. It neither returns an
error nor prints a message. The copy process continues as usual.
- As `sync` uses `cp`  in the background, it behaves like `cp`, so it
does not sync special files.
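
A minimal sketch of such a mode-bit check. Which bits s5cmd treats as "special" is not stated in the commit message, so the mask below (devices, named pipes, sockets) is an assumption, and `isSpecial` and `specialBits` are hypothetical names.

```go
package main

import (
	"fmt"
	"io/fs"
)

// specialBits is the assumed set of mode bits that mark a special file.
const specialBits = fs.ModeDevice | fs.ModeCharDevice | fs.ModeNamedPipe | fs.ModeSocket

// isSpecial reports whether a file with the given mode should be skipped.
func isSpecial(mode fs.FileMode) bool {
	return mode&specialBits != 0
}

func main() {
	fmt.Println(isSpecial(0o644))                    // regular file permissions only
	fmt.Println(isSpecial(fs.ModeNamedPipe | 0o644)) // named pipe
}
```

In practice the mode would come from `os.Lstat(path)` via `info.Mode()`, and a caller would skip the file silently when `isSpecial` returns true.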
…ak#636)

Resolves peak#541

While running, the sync command generates cp commands in the
generateCommand function and writes them to a pipe, from which they are read by the
Run function of the Run command. In the Run function, the line

`fields, err := shellquote.Split(line)`

splits the generated cp command into its fields.



For the command

```
s5cmd sync --cache-control 'public, max-age=31536000, immutable' /Users/ataberk/desktop/test s3://s5cmd-test2
```

We would expect the generateCommand function to write the following command
into the pipe:

```
cp --cache-control='public, max-age=31536000, immutable' --raw='true' \"/Users/ataberk/desktop/test/hello.txt\" \"s3://s5cmd-test2/test/hello.txt\"
```

But instead, it writes the command

```
cp --cache-control=public, max-age=31536000, immutable --raw=true \"/Users/ataberk/desktop/test/hello.txt\" \"s3://s5cmd-test2/test/hello.txt\"
```

![Screenshot 2023-08-05 at 13 45
52](https://github.com/peak/s5cmd/assets/30214288/da4db89a-49c2-41ca-9020-89d12484e72c)

This causes `shellquote.Split(line)` to split the fields incorrectly,
which in turn causes `flagset.Parse(fields)` to populate the flagset incorrectly,
and as a result we end up with an error.

![Screenshot 2023-08-05 at 05 53
43](https://github.com/peak/s5cmd/assets/30214288/6777a474-a220-4b95-b158-16e2848402b5)

The main problem is that the generateCommand function appends flags
without quoting their values. If we add quotes around the flagValue while
generating a command, `shellquote.Split(line)` acts as intended and the
sync succeeds with the given cache-control metadata in S3.

![Screenshot 2023-08-05 at 13 58
28](https://github.com/peak/s5cmd/assets/30214288/534aaa13-31e9-46bc-959f-85506d66917d)

![Screenshot 2023-08-05 at 13 59
31](https://github.com/peak/s5cmd/assets/30214288/64817fa7-7467-4046-85d3-f3e694426f63)
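
The effect of the missing quotes can be reproduced with a small self-contained sketch. `splitFields` is a hypothetical, much simplified stand-in for `shellquote.Split` (it only understands spaces and single quotes), and `quoteFlag` is a hypothetical helper that does not escape single quotes inside the value.

```go
package main

import (
	"fmt"
	"strings"
)

// quoteFlag renders a flag with its value wrapped in single quotes.
// Assumption: the value itself contains no single quote.
func quoteFlag(name, value string) string {
	return fmt.Sprintf("--%s='%s'", name, value)
}

// splitFields splits a line on spaces, treating single-quoted text as part
// of one field.
func splitFields(line string) []string {
	var fields []string
	var cur strings.Builder
	inQuote, have := false, false
	for _, r := range line {
		switch {
		case r == '\'':
			inQuote = !inQuote
			have = true
		case r == ' ' && !inQuote:
			if have {
				fields = append(fields, cur.String())
				cur.Reset()
				have = false
			}
		default:
			cur.WriteRune(r)
			have = true
		}
	}
	if have {
		fields = append(fields, cur.String())
	}
	return fields
}

func main() {
	value := "public, max-age=31536000, immutable"
	unquoted := "cp --cache-control=" + value + " --raw=true src dst"
	quoted := "cp " + quoteFlag("cache-control", value) + " " + quoteFlag("raw", "true") + " src dst"

	// Without quotes the flag value is torn into several fields.
	fmt.Printf("%d fields: %q\n", len(splitFields(unquoted)), splitFields(unquoted))
	// With quotes the value stays in a single field.
	fmt.Printf("%d fields: %q\n", len(splitFields(quoted)), splitFields(quoted))
}
```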

Co-authored-by: Ataberk Gürel <ataberk@Ataberks-Mini.home>
Co-authored-by: İbrahim Güngör <igungor@gmail.com>
Co-authored-by: İbrahim Güngör <igungor@gmail.com>
Co-authored-by: İbrahim Güngör <igungor@gmail.com>
This PR extends the support for AWS S3's Select API.

Resolves peak#494, resolves peak#357 .
peak#657 mentioned that the `acl` flag was broken in `v2.2.1`. After looking at
the issue, it was discovered that not only the `acl` flag was broken:
a couple of other flags were also being omitted from the
mentioned commands, all caused by this faulty
[PR](peak#621).

Resolves peak#657.
Go report shows report for the latest version associated with the
relevant module (name), hence the major version.
@Ahuge Ahuge merged commit a0fc23a into master Mar 4, 2024
13 checks passed