Skip to content

Conversation

@kyleknap
Copy link
Contributor

@kyleknap kyleknap commented Sep 1, 2016

So there is actually not much happening in this PR other than flipping sync to use the new architecture. Most of the code being added is actually from this PR: #2147.

So I was actually surprised that had to make this little of a change to get this to work because I do not at all take into account source clients when I do sync across buckets of different regions with a --delete. This is because we have auto-redirection now by default in botocore for sigv4. The question now becomes is the auto-redirection good enough to handle that use case? Also is there any other edge cases that I miss by not providing a source client for deletes. The only other edge case I could think of is kms, but for deletes of kms sigv4 is not needed.

The only reason I did not add support for providing support for a source client is that client's are attached to their TransferManager. There is currently no way to swap the two out. I thought about what I would do and these were the two best options I can think of:

  1. Add future.meta.provide_client() to allow you to provide a different client for a particular transfer
  2. Use two different transfer managers: one for the regular client and one for the source. Note I do not think that one is a good idea because we may be having separate executor thread pools which could break our configuration contract.

Option 1 or something along the lines of option 1 is the most realistic option. Option 2 is not really realistic at all.

Let me know what you think. Given that we cache the region at the client level and the integration test pass, I am fine with relying on redirection as opposed to having to add a new abstraction.

cc @jamesls @JordonPhillips

@kyleknap kyleknap added the pr:needs-review This PR needs a review from a Member. label Sep 1, 2016
@jamesls
Copy link
Member

jamesls commented Sep 1, 2016

Yeah, I'm surprised how little code was needed to get sync working. IIRC, the bucket region redirector in botocore is not thread safe, so I'd assume in the worst case you'd get N retried/redirected requests initially (N being number of threads). Does that match the behavior you're seeing?

As for not having a source client, the only downside I can think of is that you may not be able to do cross account syncs (i.e source/dest buckets in different AWS accounts). This isn't something you could do previously, but without source clients I don't think we'd be able to implement this feature. So my guess is that option 1 is something we may need to eventually add.

@kyleknap
Copy link
Contributor Author

kyleknap commented Sep 1, 2016

That's a good point. I think some perf testing on this is worth it. I do not even have to use the this branch of the CLI commands. I could just run rm --recursive with sigv4 and the wrong region set to see what the difference is when the correct region is used.

As to supporting a source client, we actually do support use it for copies: https://github.com/aws/aws-cli/blob/integrate-s3transfer/awscli/customizations/s3/s3handler.py#L710-L714. It is just for the cross-region sigv4 enabled sync --delete case that we run into the need to do the region redirector.

@jamesls
Copy link
Member

jamesls commented Sep 2, 2016

Ah that's right. I think if there's not really any noticeable difference (even in the case of a small number of files) in perf, then this approach is simpler and more robust.

@jamesls jamesls force-pushed the integrate-s3transfer branch 2 times, most recently from 8447330 to 50c80e6 Compare September 20, 2016 17:16
@jamesls jamesls force-pushed the integrate-s3transfer branch 3 times, most recently from 6df2ce2 to 98e78f3 Compare September 26, 2016 20:43
@kyleknap kyleknap force-pushed the integrate-s3transfer branch from 98e78f3 to 9fb09a3 Compare September 28, 2016 18:01
@kyleknap kyleknap changed the base branch from integrate-s3transfer to develop September 29, 2016 23:19
@kyleknap kyleknap force-pushed the integrate-sync branch 2 times, most recently from f59e933 to 99698e6 Compare October 12, 2016 16:32
All existing unit, functional, and integration tests pass
@kyleknap
Copy link
Contributor Author

I re-ran the tests after rebasing. They pass now. Merging.

@kyleknap kyleknap merged commit 4dbc179 into aws:develop Oct 13, 2016
@kyleknap kyleknap deleted the integrate-sync branch October 13, 2016 19:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr:needs-review This PR needs a review from a Member.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants