Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ADD] Order groups when deploying to S3 #37

Closed
wants to merge 1 commit into from

Conversation

mpucholblasco
Copy link

Reason

I wanted to use s3deploy to deploy a Single Page Application to S3. On SPAs, the order in which things are deployed is important because you can have inconsistencies. On these cases index.html must be the latest file on being uploaded.

Imagine an scenario in which you have a CloudFront distribution pointing to an S3 bucket. As SPA, index.html must have cache=0 to avoid problems with new versions. If index.html is not uploaded the latest file, you can have these problematic scenarios:

  • Latency uploading to S3: if index.html is the first file on being uploaded and other files (dependencies such as CSS or JS) take a while on being updated (maybe because you are also uploading big files), CloudFront can cache resources not present on a request performed during the deploy.
  • Errors uploading: if you upload index.html first and then an error happens (OS error reading a file or network connectivity), all requests will be pointing to this new file which points to resources that may be are not present.

What is the goal?

Adding support to order file uploads.

Approach

A task of obtaining the order groups for all local files is executed (see groupLocalFiles) before plan process. This method stores on memory the local files (only indispensable information) that must be processed later. Then, each group is processed sequentially, uploading files present on each group in parallel (in the same way than before).

Refactoring

In order to make more easy the testing and allowing new local stores on the future (maybe a tgz file or similar), a new abstraction of local store has been added to the deploy process.

However, as I don't know if this approach fits with the project's needs, I left previous tests intact. Maybe in the future could be changed to adapt this new approach.

Testing

Testing has been added for new processes.

@codecov-io
Copy link

codecov-io commented Nov 21, 2018

Codecov Report

Merging #37 into master will increase coverage by 2.12%.
The diff coverage is 89.07%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #37      +/-   ##
==========================================
+ Coverage   67.59%   69.71%   +2.12%     
==========================================
  Files           8        8              
  Lines         577      624      +47     
==========================================
+ Hits          390      435      +45     
- Misses        157      158       +1     
- Partials       30       31       +1
Impacted Files Coverage Δ
lib/store.go 74.22% <ø> (ø) ⬆️
lib/deployer.go 81.21% <87.09%> (-0.12%) ⬇️
lib/files.go 86.2% <91.22%> (+4.76%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 48a7d5f...f7b6fe5. Read the comment docs.

@bep bep mentioned this pull request Nov 21, 2018
@bep
Copy link
Owner

bep commented Dec 5, 2018

Sorry for the delay,

I have looked at this. I understand the use case, though I wouldn't expect this to be a real issue if you use a CDN (the cache is invalidated last).

So, with that, this patch is too big for me to merge. It looks correct, but I don't have the time needed to maintain it.

We could consider an alternative approach by adding some "ordered queue" logic (multiple channels) in enqueueUpload, which would make this much simpler.

@mpucholblasco
Copy link
Author

No problem for the delay.

I guess that it can be a problem even using a CDN because invalidation can take up to 15 minutes.

The real problem is present during a deploy and the use case can be problematic only under circumstances, but it can produce 404 on clients. As example, imagine uploading an html that has a link to a big file and then start to upload the big file. If a user accesses to that link, it's not present yet, and a 404 is produced. These are the problematic cases that this PR tries to avoid.

I also thinked about using multiple channels, but one requirement was based on completing certain uploads before uploading others (in previous case, if you wait until the big file is uploaded, the problem is not longer present, but if you upload them in parallel, as html is smaller, it would be uploaded before). Due to that I took the same approach you had and only added sequential process for each group. If you have just one group, the same approach that you had is followed (except that plan now equeues file names on memory and before were passed directly to the channel).

Most part of the PR is based on adding a LocalStore layer to make testing more easy.

@vangent
Copy link

vangent commented Jun 5, 2019

FYI, hugo now has a deploy command that includes support for ordering.

@mpucholblasco
Copy link
Author

Sorry, I forgot this PR, closing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants