Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor pluto package to make it usable as a library and to simplify main.go #4

Merged
merged 3 commits into from Oct 31, 2017

Conversation

josledp
Copy link
Contributor

@josledp josledp commented Oct 14, 2017

Initial refactor to implement #2
The idea now would be convert worker to its own type, to be able to save/restore its state. Please review if the refactor is ok for you and I'll continue with it.

@josledp
Copy link
Contributor Author

josledp commented Oct 15, 2017

I've continue working by spliting pluto and worker. I've also added a context to allow handling interrupt (although there is code missing to do so). Tests need to be updated, but I would like to know if it this PR is ok for you to continue working or to let it go.

Copy link
Owner

@ishanjain28 ishanjain28 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please see the comments in the review.

main.go Outdated
log.Fatalf("nothing to do. Please pass some url to fetch")
}

if options.Connections == 0 {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using a default value of 16/24/32 instead of printing an error when user doesn't gives a value for number of connections would be better idea

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added the default in the options struct:

  • Connections uint short:"n" long:"connections" default:"1" description:"Number of concurrent connections"

this check is trying to catch an user that passes a 0 connections argument. I set the default to 1, we can change it to the value you think is more appropiate

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay. I hadn't thought of that.

pluto/pluto.go Outdated
downloaded uint64
metaData fileMetaData
startTime time.Time
saveFile string
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

saveFile should not be a string and pluto must not create a file itself as it restricts pluto's use as a library. Please keep it like it was before because that was more flexible.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes sense, I'll change that

pluto/worker.go Outdated
return nil, fmt.Errorf("status code: %d", resp.StatusCode)
}

/*if uint64(resp.ContentLength) != (w.end - w.begin) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check is necessary as several caches like Varnish don't obey the RFC correctly and send the complete file instead of the requested data. Without this check It'll corrupt the file that is being downloaded.
To test this, Run a simple python http server and try downloading a file from that server using pluto.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was wondering why there was this check. In the tests I did I found that some servers answer sometimes 1byte more. I wanted to check a little more why this happens (this is why I commented out the code and not removed it). I will let is as it was

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@josledp 1 or 2 versions of Varnish have this bug.

In a scenario when you are downloading a file of say 10 bytes with 3 parts. Most servers send the last remaining byte in the request which requested bytes from 6-9. This is slightly annoying because there are more servers which send 2-3 bytes of remaining data with the last request.

This is probably done to prevent the HTTP overhead in sending 2-3 bytes of data but I am not sure where to draw the line? When should it start treating extra data sent by server as corrupted data?

BTW, Are you still working on this? Apolgoies for not replying sooner.

@ishanjain28
Copy link
Owner

ping @josledp

…hough we are not there yet). Lacking test update
@josledp
Copy link
Contributor Author

josledp commented Oct 30, 2017

I’ve updated the commit to fix your concerns. I also removed the os.exit from the signal catcher so now the workers are finished by the context, so it should be easy to store the state to recover it later.
Let me know what you think.

@ishanjain28
Copy link
Owner

Looks good to me.

@josledp
Copy link
Contributor Author

josledp commented Oct 31, 2017

I've updated the tests (and make them work ;)). In order to not make this PR bigger, what about merging it and opening a new one for #2?

@ishanjain28
Copy link
Owner

@josledp Okay, I'll merge this one.

@ishanjain28 ishanjain28 merged commit a109045 into ishanjain28:master Oct 31, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants