
Implement the 'Interruptable' feature #32

Open
k0pernicus opened this Issue Jan 21, 2017 · 4 comments


k0pernicus commented Jan 21, 2017

No description provided.


lambdaupb commented Jan 25, 2017

axel keeps a state file which gets updated periodically while downloading. Upon restart, axel can continue from the last saved state.

It would be enough to save the total file size and the remaining holes in the download to the state file.
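The minimal state described above (total size plus remaining holes) can be sketched as a tiny line-oriented file format. The layout below is an assumption for illustration, not axel's actual on-disk format:

```rust
/// Sketch of a resumable-download state file: the total file size
/// and the byte ranges ("holes") still to be downloaded.
/// The format (first line: total size; then one "start end" pair
/// per line) is hypothetical, not axel's real layout.
#[derive(Debug, PartialEq)]
struct State {
    total_size: u64,
    holes: Vec<(u64, u64)>, // (start, end) pairs, end exclusive
}

impl State {
    /// Serialize to the line-oriented text format.
    fn save(&self) -> String {
        let mut out = format!("{}\n", self.total_size);
        for (start, end) in &self.holes {
            out.push_str(&format!("{} {}\n", start, end));
        }
        out
    }

    /// Parse the text format back; None on any malformed line.
    fn load(text: &str) -> Option<State> {
        let mut lines = text.lines();
        let total_size = lines.next()?.trim().parse().ok()?;
        let mut holes = Vec::new();
        for line in lines {
            let mut parts = line.split_whitespace();
            let start = parts.next()?.parse().ok()?;
            let end = parts.next()?.parse().ok()?;
            holes.push((start, end));
        }
        Some(State { total_size, holes })
    }
}
```

On restart, the downloader would only need to re-queue the ranges in `holes`; an empty `holes` list means the download is complete.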


daveallie commented Feb 1, 2017

As an alternative to the state file approach, you could store the information in the first bytes of the downloading file, and give that file a custom file extension. The extension and download metadata would be stripped when the download completes. Here's a quick proposal for how it would work.

  • The first 32 bits would be a u32 holding the number of 64 KiB chunks in the file (n).
  • The next n bits would be flags indicating whether each chunk has been downloaded.
  • The rest of the file (the actual payload) would follow.

Limitations:

  • This would limit the max size of the download to 2^32 × 64 KiB = 281,474,976,710,656 bytes (~256 TiB)
  • Each thread's share of the download should be a multiple of 64 KiB, so that restarting with a different thread count doesn't break everything
  • Threads would need to be aware of the offset the metadata header introduces, so that their chunks are downloaded to the right place
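For concreteness, the header arithmetic the proposal implies can be sketched like this (64 KiB chunks, u32 count at the front; the function names are illustrative, not from any existing code):

```rust
const CHUNK_SIZE: u64 = 64 * 1024; // 64 KiB, per the proposal

/// Number of 64 KiB chunks for a payload of `payload_len` bytes.
fn chunk_count(payload_len: u64) -> u64 {
    (payload_len + CHUNK_SIZE - 1) / CHUNK_SIZE
}

/// Header size: a u32 chunk count plus one flag bit per chunk,
/// rounded up to whole bytes.
fn header_len(payload_len: u64) -> u64 {
    4 + (chunk_count(payload_len) + 7) / 8
}

/// Where chunk `i` starts inside the temporary file: every write
/// is shifted by the metadata header (the third limitation above).
fn chunk_offset(payload_len: u64, i: u64) -> u64 {
    header_len(payload_len) + i * CHUNK_SIZE
}
```

For a 1,000,000-byte download this gives 16 chunks, a 6-byte header (4-byte count + 2 flag bytes), so chunk 0 lands at offset 6 rather than 0.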

k0pernicus commented Feb 1, 2017

I think we should try different approaches, run some benchmarks, and pick the one(s) that offer the best trade-off between a fast "retrying the download" step and low memory usage for storing the current file information (and what to download next).
Currently, we plan to improve and stabilize the "download" part before implementing this feature. For example: switching to a single-threaded download when the server cannot give any information about the remote content size, or trying a strategy (for example, a divide-and-conquer algorithm) to "guess" the remote content size by sending header requests (byte ranges or something similar) while keeping a close eye on the server's responses... :-)
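The size-guessing idea can be sketched independently of any HTTP library: treat "does byte `offset` exist?" (e.g. a `Range: bytes=offset-offset` request answered with 206 vs 416) as a yes/no probe and search for the last byte. The `probe` closure below is a hypothetical stand-in for the real request; this is a sketch of the divide-and-conquer idea, not the project's code:

```rust
/// Guess the remote content length using only yes/no probes of the
/// form "does byte `offset` exist?". Exponential growth finds an
/// upper bound, then binary search pins down the last byte, so the
/// number of requests is O(log n).
fn guess_content_length<F: FnMut(u64) -> bool>(mut probe: F) -> u64 {
    if !probe(0) {
        return 0; // empty body
    }
    // Exponential phase: double until we probe past the end.
    let mut hi: u64 = 1;
    while probe(hi) {
        hi *= 2;
    }
    // Binary phase: the last existing byte is in [hi/2, hi).
    let mut lo = hi / 2; // known to exist
    while hi - lo > 1 {
        let mid = lo + (hi - lo) / 2;
        if probe(mid) { lo = mid; } else { hi = mid; }
    }
    lo + 1 // length = index of the last byte + 1
}
```

In a real client each probe would be one small ranged request, and a server that rejects `Range` entirely would trigger the single-threaded fallback mentioned above.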


daveallie commented Mar 26, 2017

I played around with the idea I had above and built https://github.com/daveallie/grapple.

I made some changes to the proposal I had earlier:

  • I used a u64 to hold the chunk count.
    • This costs 4 more bytes but effectively overcomes the max filesize issue.
  • I moved the chunk metadata to the end of the file.
    • When the download is complete, the file can just be truncated to its required length.
  • I used a 128 KiB chunk size to reduce the amount of writing to the file.
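Under this revised layout the bookkeeping reduces to arithmetic on the payload length; a hedged sketch (the names are illustrative, not grapple's actual code):

```rust
const CHUNK_SIZE: u64 = 128 * 1024; // 128 KiB, per the revised proposal

/// Number of 128 KiB chunks for a payload of `payload_len` bytes.
fn chunk_count(payload_len: u64) -> u64 {
    (payload_len + CHUNK_SIZE - 1) / CHUNK_SIZE
}

/// Size of the end-of-file trailer: a u64 chunk count plus one
/// flag bit per chunk, rounded up to whole bytes.
fn trailer_len(payload_len: u64) -> u64 {
    8 + (chunk_count(payload_len) + 7) / 8
}

/// Total on-disk size while the download is in progress. Because the
/// metadata sits after the payload, finishing the download is just a
/// truncate back to `payload_len` (e.g. `File::set_len`), and chunk
/// writes need no offset correction.
fn in_progress_len(payload_len: u64) -> u64 {
    payload_len + trailer_len(payload_len)
}
```

For a 1,000,000-byte payload that is 8 chunks and a 9-byte trailer, so the temporary file is 1,000,009 bytes until completion.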

I'd be happy to give a more detailed explanation of how the 'restarting' phase works if you plan to go with the method I described above.
