Skip to content
This repository has been archived by the owner on Feb 19, 2019. It is now read-only.

[Enhancement] Do not download if already cached #428

Closed
ferventcoder opened this issue Mar 2, 2014 · 7 comments
Closed

[Enhancement] Do not download if already cached #428

ferventcoder opened this issue Mar 2, 2014 · 7 comments
Assignees

Comments

@ferventcoder
Copy link
Contributor

Check the cache and if the file that you are to download already exists and has the same length as the Content-Length of the expected file, do not re-download the file.

This is super similar to GH-109 but will be implemented in a slightly different way.

@ferventcoder ferventcoder self-assigned this Mar 2, 2014
ferventcoder added a commit to ferventcoder/chocolatey that referenced this issue Mar 2, 2014
If the file exists and it is the size we expect it to be, don't download the
file again.
@jberezanski
Copy link
Contributor

How about verifying the checksum of the cached file as well (if the checksum is provided)? It would make it possible to automatically recover from corrupted downloads, when length matches but content does not. Currently such file will not be downloaded, but will fail the checksum test later.

@ferventcoder
Copy link
Contributor Author

That make sense. I think we may be doing that already though.

@ferventcoder
Copy link
Contributor Author

@jberezanski Yup, check out #427.

@ferventcoder
Copy link
Contributor Author

Note that we verify the file downloaded with the checksum whether or not we using the cache - https://github.com/chocolatey/chocolatey/blob/master/src/helpers/functions/Get-ChocolateyWebFile.ps1#L126

@jberezanski
Copy link
Contributor

Yes, I've seen that line (that's what I meant by "the checksum test later"). However, in the scenario I'm thinking about, it would be too late. Please consider the following sequence of events:

  1. the user invokes cinst foo
  2. chocolateyInstall.ps1 of the foo package invokes Get-ChocolateyWebFile to download http://unreliablehost.com/foo.zip
  3. foo.zip does not exist in the cache, so the condition at L85 fails at $fi.Exists
  4. foo.zip is downloaded, but due to unreliable server/network connection/whatever ends up corrupted
  5. the checksum test at L126 catches the corruption and raises an error
  6. installation fails
  7. the user invokes cinst foo again, hoping this time the file will be downloaded correctly
  8. chocolateyInstall.ps1 of the foo package invokes Get-ChocolateyWebFile to download http://unreliablehost.com/foo.zip
  9. foo.zip exists in the cache and has the expected size, so the condition at L85 passes and $needsDownload is set to $false
  10. foo.zip is not downloaded again and remains corrupted
  11. the checksum test at L126 catches the corruption and raises an error
  12. installation fails

I propose modifying L85 to:

if ($fi.Exists -and ($fi.Length -eq $headers["Content-Length"]) -and (Get-CheckSumValid -file $fi -checkSum $checksum -checksumType $checksumType))

That change would result in failing the condition at step 9 above and redownloading the file, hopefully uncorrupted this time.

Without that change, the user would need to delete the cached file by hand (be familiar with cache directory location and layout etc) to fix the broken download.

@ferventcoder
Copy link
Contributor Author

Ah yeah, perfect use case. Let's get that in there. File an enhancement?

On Monday, March 3, 2014, Jakub Berezanski notifications@github.com wrote:

Yes, I've seen that line (that's what I meant by "the checksum test
later"). However, in the scenario I'm thinking about, it would be too late.
Please consider the following sequence of events:

  1. the user invokes cinst foo
  2. chocolateyInstall.ps1 of the foo package invokes
    Get-ChocolateyWebFile to download http://unreliablehost.com/foo.zip
  3. foo.zip does not exist in the cache, so the condition at L85https://github.com/chocolatey/chocolatey/blob/master/src/helpers/functions/Get-ChocolateyWebFile.ps1#L85fails at
    $fi.Exists
  4. foo.zip is downloaded, but due to unreliable server/network
    connection/whatever ends up corrupted
  5. the checksum test at L126https://github.com/chocolatey/chocolatey/blob/master/src/helpers/functions/Get-ChocolateyWebFile.ps1#L126catches the corruption and raises an error
  6. installation fails
  7. the user invokes cinst foo again, hoping this time the file will be
    downloaded correctly
  8. chocolateyInstall.ps1 of the foo package invokes
    Get-ChocolateyWebFile to download http://unreliablehost.com/foo.zip
  9. foo.zip exists in the cache and has the expected size, so the condition
    at L85https://github.com/chocolatey/chocolatey/blob/master/src/helpers/functions/Get-ChocolateyWebFile.ps1#L85passes and $needsDownload is set to $false
  10. foo.zip is not downloaded again and remains corrupted
  11. the checksum test at L126https://github.com/chocolatey/chocolatey/blob/master/src/helpers/functions/Get-ChocolateyWebFile.ps1#L126catches the corruption and raises an error
  12. installation fails

I propose modifying L85https://github.com/chocolatey/chocolatey/blob/master/src/helpers/functions/Get-ChocolateyWebFile.ps1#L85to:

if ($fi.Exists -and ($fi.Length -eq $headers["Content-Length"]) -and (Get-CheckSumValid -file $fi -checkSum $checksum -checksumType $checksumType))

That change would result in failing the condition at step 9 above and
redownloading the file, hopefully uncorrupted this time.

Without that change, the user would need to delete the cached file by hand
(be familiar with cache directory location and layout etc) to fix the
broken download.

Reply to this email directly or view it on GitHubhttps://github.com//issues/428#issuecomment-36554614
.

Rob
"Be passionate in all you do"

http://devlicio.us/blogs/rob_reynolds
http://ferventcoder.com
http://twitter.com/ferventcoder

@gep13
Copy link
Member

gep13 commented Mar 4, 2014

@ferventcoder I am just spitballing here, but does the implementation of this feature give us the "complete offline install"? Assuming you have already run all the installations once on one machine, and you can transport the cache folder to another machine, you would then be in a position to run all the installs again on that other machine without the need to go off to the internet, right? Or was this part of your cunning plan all along? :-)

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

3 participants