New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor pluto package to make it usable as a library and to simplify main.go #4
Conversation
I've continue working by spliting pluto and worker. I've also added a context to allow handling interrupt (although there is code missing to do so). Tests need to be updated, but I would like to know if it this PR is ok for you to continue working or to let it go. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please see the comments in the review.
main.go
Outdated
log.Fatalf("nothing to do. Please pass some url to fetch") | ||
} | ||
|
||
if options.Connections == 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using a default value of 16/24/32 instead of printing an error when user doesn't gives a value for number of connections would be better idea
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added the default in the options struct:
- Connections uint
short:"n" long:"connections" default:"1" description:"Number of concurrent connections"
this check is trying to catch an user that passes a 0 connections argument. I set the default to 1, we can change it to the value you think is more appropiate
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay. I hadn't thought of that.
pluto/pluto.go
Outdated
downloaded uint64 | ||
metaData fileMetaData | ||
startTime time.Time | ||
saveFile string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
saveFile
should not be a string and pluto
must not create a file itself as it restricts pluto's use as a library. Please keep it like it was before because that was more flexible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
makes sense, I'll change that
pluto/worker.go
Outdated
return nil, fmt.Errorf("status code: %d", resp.StatusCode) | ||
} | ||
|
||
/*if uint64(resp.ContentLength) != (w.end - w.begin) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This check is necessary as several caches like Varnish don't obey the RFC correctly and send the complete file instead of the requested data. Without this check It'll corrupt the file that is being downloaded.
To test this, Run a simple python http server and try downloading a file from that server using pluto.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was wondering why there was this check. In the tests I did I found that some servers answer sometimes 1byte more. I wanted to check a little more why this happens (this is why I commented out the code and not removed it). I will let is as it was
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@josledp 1 or 2 versions of Varnish have this bug.
In a scenario when you are downloading a file of say 10 bytes with 3 parts. Most servers send the last remaining byte in the request which requested bytes from 6-9. This is slightly annoying because there are more servers which send 2-3 bytes of remaining data with the last request.
This is probably done to prevent the HTTP overhead in sending 2-3 bytes of data but I am not sure where to draw the line? When should it start treating extra data sent by server as corrupted data?
BTW, Are you still working on this? Apolgoies for not replying sooner.
ping @josledp |
…hough we are not there yet). Lacking test update
I’ve updated the commit to fix your concerns. I also removed the os.exit from the signal catcher so now the workers are finished by the context, so it should be easy to store the state to recover it later. |
Looks good to me. |
I've updated the tests (and make them work ;)). In order to not make this PR bigger, what about merging it and opening a new one for #2? |
@josledp Okay, I'll merge this one. |
Initial refactor to implement #2
The idea now would be convert worker to its own type, to be able to save/restore its state. Please review if the refactor is ok for you and I'll continue with it.