Should support pushing diffs only and not require upload of entire project #119

Open
phoneboy opened this Issue Jun 29, 2015 · 33 comments

@phoneboy

My projects are several hundred megabytes. Uploading the entire thing each time I make a change seems utterly wasteful.

@brandonb927
brandonb927 commented Jun 29, 2015 edited

Edit: After a long time of surge use, I want to rescind my +1 as I understand what Surge should and shouldn't do more clearly.


+1 👍 Would love this! I have a blog that is constantly growing in size due to images, coupled with my internet connection at home for uploads being utter shit.

@cfjedimaster

+1

@silentrob
Collaborator

So just a little sidebar. The upload process right now compresses and streams the entire contents without writing any files to disk. Once the files are uploaded to our initial staging server, they are sent to a CDN fleet. That step uses an rsync-like process.

We looked into rsync'ing the first step, but it added significant complexity and we didn't want to delay the launch at the time. We also acknowledge that the primary persona would not have been affected dramatically by this body of work.

Having said that, I think it is something that should still get done.

@silentrob silentrob closed this Jun 29, 2015
@silentrob silentrob reopened this Jun 29, 2015
@jokeyrhyme

Is there a way to do a dummy-run, to see which resources would be different if a real publish action were executed? Is there an API that returns all the hashes / Etags of all currently-published resources?

@Nolanes
Nolanes commented Dec 8, 2015

+1

@dtinth
dtinth commented Jan 4, 2016

@jokeyrhyme The auto.appcache file (https://davidwalsh.name/dont-wait-serviceworker-adding-offline-support-oneline) contains the hashes (and sizes) of all the files.


I solved this problem by rsyncing to a $5 DigitalOcean server and then running surge there. Here’s my drop-in replacement shell script:

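#!/bin/bash
# Usage: surgeup [project path] [domain]
# Expects REMOTE_SERVER_HOST to be set to the SSH host of the relay server.
if [ -n "$1" ] && [ -n "$2" ]
then
  # Sync only the changed files to the relay server, then publish from there.
  # The publish step runs only if the sync succeeded.
  rsync -avz --delete "$1/" "$REMOTE_SERVER_HOST:/surge/$2" &&
    ssh "$REMOTE_SERVER_HOST" surge "/surge/$2" "$2"
else
  echo 'usage: surgeup [project path] [domain]' >&2
  exit 1
fi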
@phoneboy
phoneboy commented Jan 4, 2016

You're moving the problem somewhere else.
It doesn't solve the underlying problem.

@dtinth
dtinth commented Jan 4, 2016

@phoneboy I agree that the underlying problem still isn’t solved and this issue should remain open, but DigitalOcean has pretty decent bandwidth to surge.sh.

My internet connection here isn’t so good, so using DigitalOcean as a middle man sped up my deployment greatly. What took me a minute now takes seconds.

@gemedet
gemedet commented Feb 24, 2016

+1

@edwardabraham

We are using Firebase for deploying a static site, and are looking for a static file host that solves this specific problem (Divshot used to support incremental upload, which was awesome, but it was deprecated when they were acquired by Firebase). It currently takes 8 minutes to deploy our website, which is a long time when you have only fixed a few typos. If Surge made incremental updates available, we would switch right away.

@brandonb927

@edwardabraham are you uploading images or anything to the site? Surge is mostly for just hosting static JS/CSS/HTML assets and not particularly suited for hosting tonnes of images.

@edwardabraham

That is the problem: we have a big pile of PDFs on our site. A solution
would be to re-engineer it so that those were hosted on Amazon or somewhere
else, but that would make things more complicated ...

@brandonb927

That's what I've suggested to others running into this problem. Personally, I host my blog's HTML/CSS on Surge and push the images to an S3 bucket from my gulp build script on deploy.

@cfjedimaster

Ditto - my Surges were taking 15 minutes. I moved all my images to an S3
bucket, did a search and replace (a careful one), and now my Surge is down
to about 4 minutes.

To be clear, I still want to see an incremental push. :)

@aresta
aresta commented May 5, 2016

+1 Waiting for this one 💃

@hickford
hickford commented May 11, 2016 edited

My site is 0.2 MB of html and 9.8 MB of images. I have a slow internet connection. Whenever I change the html, I suffer Surge reuploading all the images. Frustrating!

Please think how to conserve bandwidth. Not all the world is blessed with high speed internet.

You could:

  1. Skip re-uploading files that haven't changed (compare sha256sum)
  2. For files that have changed, be clever and calculate diffs to upload

I believe Divshot did option 1. It was a delight to use. If Surge did the same, it would make my experience much more pleasant.

Option 2 is probably overkill: typically size(unchanged files) ≫ size(changed files), so skipping unchanged files already saves most of the upload.
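
A minimal sketch of option 1, assuming GNU coreutils (`sha256sum`, `comm`) on the client and assuming the hashes from the previous deploy are available as a manifest file. Surge does not document such a manifest, so the function names and the manifest format below are hypothetical.

```shell
#!/bin/sh
# Sketch only: decide which files need uploading by comparing SHA-256
# manifests. The manifest format (plain sha256sum output) is an assumption.

# make_manifest DIR: print "<sha256>  ./<path>" for every file in DIR, sorted.
make_manifest() {
  (cd "$1" && find . -type f -exec sha256sum {} + | LC_ALL=C sort)
}

# changed_files OLD_MANIFEST NEW_MANIFEST: print the paths whose
# "<sha256>  <path>" line appears in NEW_MANIFEST but not in OLD_MANIFEST,
# i.e. files that are new or whose content changed.
changed_files() {
  LC_ALL=C comm -13 "$1" "$2" | sed 's/^[0-9a-f]*  //'
}
```

Running `make_manifest ./site > new.txt` and then `changed_files old.txt new.txt` would list only the files worth uploading. A renamed file still counts as changed here, because the path is part of each manifest line.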

@brandonb927

@hickford what sets Surge apart from Divshot is that Surge is more targeted at hosting the HTML/CSS/JS portion of the site and not so much the whole package (images, videos, etc.). These media pieces should probably be stored elsewhere on something like S3. How often are you updating said images/media? I have all of my site media hosted on S3 and my monthly charge is about $0.23 CAD, so the cost is negligible.

@hickford
hickford commented May 12, 2016 edited

@brandonb927 I use Surge because it's free and simple. I do not care to register, learn or pay for S3 to publish my small website.

Thank you for the suggestion of a workaround, but I shall wait patiently for Surge to fix the problem. To reupload files that haven't changed is wasteful. I imagine it would be simple enough to compare checksums and skip them.

From the statistics on https://surge.sh/ we can calculate the average size of a Surge project: 2 TB / 30000 projects ≈ 67 MB. Obviously, slow uploads are a real problem for people on this thread.

@bclinkinbeard

@hickford If you refuse to pay or learn anything you might want to look at using GitHub Pages or something. If you're correct and "it would be simple to compare checksums and skip them", then I look forward to your PR implementing it. :)

@hickford

@bclinkinbeard

"If you refuse to pay or learn anything..."

That's not helpful.

"I look forward to your PR implementing it"

It would require a change to Surge's server as well as this client.

@phoneboy

Just to add some context, my website is about 300 MB and has a crapton of
files.
Either one would cause slow uploads, but both together are a problem.
Granted, I have other solutions I can leverage, but it seems like very
basic functionality that practically every other comparable solution has.
Its absence is astounding.

@hickford
hickford commented May 12, 2016 edited

You can read the Divshot client's upload algorithm at https://github.com/divshot/divshot-push/blob/master/lib/sync-tree.js . It uploads to an S3 bucket, keying files by sha256 sum (rather than filename). Thus it can skip unchanged files, even if they've been renamed. Then to the Divshot API, it uploads a file map (path to sha256 sum), presumably so the Divshot webserver can sync the right files from S3, and reassemble the original directory structure.

The file map idea is clever, because if you keep them, then you have the complete history of the website with minimal storage cost.
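
To make the idea concrete, here is a rough sketch based only on the description above, not on Divshot's actual code. A local directory stands in for the S3 bucket, and the function name `store_site` and the file map format (plain `sha256sum` output) are hypothetical.

```shell
#!/bin/sh
# Sketch of content-addressed uploading. STORE_DIR plays the role of the
# bucket; files are stored under their SHA-256 hash instead of their name.

# store_site SITE_DIR STORE_DIR MAP_FILE
# Writes "<sha256>  ./<path>" lines to MAP_FILE, then copies each file into
# STORE_DIR under its hash, skipping hashes that are already present.
store_site() {
  site=$1 store=$2 map=$3
  mkdir -p "$store"
  (cd "$site" && find . -type f -exec sha256sum {} + | LC_ALL=C sort -k 2) > "$map"
  while read -r hash path; do
    if [ ! -e "$store/$hash" ]; then
      cp "$site/$path" "$store/$hash"
      echo "uploaded $path"
    fi
  done < "$map"
}
```

Because the store is keyed by content, a renamed file produces a new line in the file map but no new upload, and keeping every file map gives a cheap history of past deploys.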

It looks like the Surge client simply uploads a tarball of the whole project directly to Surge: https://github.com/sintaxi/surge/blob/master/lib/middleware/deploy.js

@hickford
hickford commented May 12, 2016 edited

Off the top of my head, a possible algorithm for the Surge client communicating directly with Surge (no intermediate S3 bucket):

  1. Calculate file map (path to sha256 sum) of local project
  2. Download previous file map from Surge server
  3. Create tarball from local project, skipping files that match a hash in server file map
  4. Upload tarball and new file map
  5. Surge server reassembles project, extracting tarball and copying missing files from previous upload
  6. Surge server deletes previous upload
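
Steps 3 and 5 of the list above could look roughly like this. It is a sketch with local directories standing in for the client and the Surge server, not Surge's actual protocol. All function names are hypothetical, file maps are assumed to be plain `sha256sum` output, and `tar` is assumed to support `-T -` (GNU tar and bsdtar do).

```shell
#!/bin/sh
# file_map DIR: step 1, print "<sha256>  ./<path>" for every file in DIR.
file_map() {
  (cd "$1" && find . -type f -exec sha256sum {} + | LC_ALL=C sort -k 2)
}

# pack_changed SITE_DIR SERVER_MAP TARBALL: step 3, archive only the files
# whose hash does not appear anywhere in SERVER_MAP.
pack_changed() {
  file_map "$1" | while read -r hash path; do
    grep -q "^$hash " "$2" || printf '%s\n' "$path"
  done | (cd "$1" && tar -czf - -T -) > "$3"
}

# reassemble PREV_DIR PREV_MAP TARBALL NEW_MAP OUT_DIR: step 5, extract the
# uploaded files, then copy every file that is still missing from the
# previous upload by looking its hash up in PREV_MAP.
reassemble() {
  mkdir -p "$5"
  tar -xzf "$3" -C "$5"
  while read -r hash path; do
    [ -e "$5/$path" ] && continue
    old=$(grep "^$hash " "$2" | head -n 1 | sed 's/^[0-9a-f]*  //')
    [ -n "$old" ] || { echo "no source for $path" >&2; return 1; }
    mkdir -p "$(dirname "$5/$path")"
    cp "$1/$old" "$5/$path"
  done < "$4"
}
```

Matching on the hash alone means a file that was only renamed or moved is copied from the previous upload rather than sent again.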

That wouldn't keep history like Divshot did, but that's okay because it's not a feature of Surge. (Of course, a rollback feature would be welcome.)

@brandonb927

@hickford it still stands that Surge was not built to host all of a site's media assets, and I don't think it ever will be. I'm not saying the idea is bad; it's definitely useful. But Surge is a CDN backed by an intuitive command line utility, and that's where development effort is being placed right now. Based on messaging in the Slack team and what I've seen around the internet, that isn't likely to change anytime soon either.

@hickford hickford referenced this issue in firebase/firebase-tools May 17, 2016
Open

Firebase deploy should not reupload unchanged files #133

@PabloDinella

+1 👍

@ariporad
ariporad commented Jun 9, 2016

👍 This would be an awesome feature for Surge!

@prakaashkpk

+1 that would be a good feature to have.

@michaelnagy

+1

@rickyok
rickyok commented Oct 12, 2016

+1

@RathanKumar

👍

@NurdTurd
NurdTurd commented Dec 5, 2016

+1 👍

@owendall

👍

@sintaxi sintaxi locked and limited conversation to collaborators Dec 12, 2016