Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add hook to handle Content-Encoding #45

Closed
tmccombs opened this issue Sep 4, 2014 · 15 comments
Closed

Add hook to handle Content-Encoding #45

tmccombs opened this issue Sep 4, 2014 · 15 comments

Comments

@tmccombs
Copy link
Contributor

tmccombs commented Sep 4, 2014

drakma should provide a hook that would make it easier to handle Content-Encoding's like gzip and deflate.

I would imagine such a hook would be a function that takes the headers, or content-encoding header, and the stream for the body, and returns a stream (either the same stream, or a decoded stream). This function would be called before setting the external format.

It would also be nice if drakma automatically handled a Content-Encoding of gzip or deflate, but that would add at least one additional dependency.

I will probably write a patch for this when I get a chance.

@hanshuebner
Copy link
Member

Supporting other encodings would be a great addition, and additional dependencies are no big deal. As the dependency would probably require some external shared library, it would be good if it could be switched off at compile time for non-quicklisp users.

The hook would probably be installed in the wrap-stream (https://github.com/edicl/drakma/blob/master/request.lisp#L581), but I'm unsure whether a hook can be general enough, given that there may be chunking and encryption as well (i.e. does there need to be any flexibility with respect to where in the stack the hook gets to place its wrapper?).

Is a generic, public hook really useful, i.e. are there uses for it other than implementing decompression?

Would it be useful to be able to compress the request body as well? Is that even supported as per the HTTP standard?

@tmccombs
Copy link
Contributor Author

tmccombs commented Sep 7, 2014

The Content-Encoding and Transfer-Encoding headers only applies to responses.

Also ideally, it should be able to handle both the Content-Encoding and Transfer-Encoding headers.

As far as I can tell, decoding should always be done after encryption and chunking.

The only use case I can think of for having a general, public hook is that it allows users to add support for additional encoding types that are either non-standard or not implemented in drakma. That means that drakma could have support for gzip and deflate, but a user could add additional support for the less common pac200-gzip, or the un-official bzip2 if needed.

Another possible implementation would be to use a generic function that dispatches on the encoding type, then have methods for identity, gizp, defalte, etc. I think I prefer this method.

@hanshuebner
Copy link
Member

A generic function that dispatches on the encoding type sounds compelling.
Can you come up with a patch for that?

@tmccombs
Copy link
Contributor Author

Yes, I've started working on it.

@avodonosov
Copy link
Contributor

It is an unfortunate fix, because it breaks backward compatibility.

I've just spent 3 hours debugging my application, to locate this problem.

I am stroing data on Amazon S3 and gzip it to save space. I retrieve my data with zs3:get-vector and ungzip it. Suddenly, ungzip started to signal an error. Turns out, with new drakma zs3:get-vector returns not the data I posted earlier, but ungzipped version of it.

I suggest to rollback this fix, until it's refactored in backward compatible fashion.

@hanshuebner
Copy link
Member

Thayne, would you be able to take care of this?

2015-03-13 12:08 GMT+01:00 Anton Vodonosov notifications@github.com:

It is an unfortunate fix, because it breaks backward compatibility.

I've just spent 3 hours debugging my application, to locate this problem.

I am stroing data on Amazon S3 and gzip it to save space. I retrieve my
data with zs3:get-vector and ungzip it. Suddenly, ungzip started to
signal an error. Turns out, with new drakma zs3:get-vector returns not
the data I posted earlier, but ungzipped version of it.

I suggest to rollback this fix, until it's refactored in backward
compatible fashion.

Reply to this email directly or view it on GitHub
#45 (comment).

@tmccombs
Copy link
Contributor Author

I can look at it tonight

@hanshuebner
Copy link
Member

Thanks!

2015-03-13 15:59 GMT+01:00 Thayne McCombs notifications@github.com:

I can look at it tonight

Reply to this email directly or view it on GitHub
#45 (comment).

@stassats
Copy link
Member

I don't see an issue here, if S3 sets the Content-Encoding to deflate, then it's fair game. An option to disable this could be added, but no backwards compatibility is necessary. Incidentally, there should be an option for easily specifying Accept-Encoding, probably defaulting to gzip,deflate

@avodonosov
Copy link
Contributor

Fuzzy concept of "fair game" can't justify breaking someone's existing code.

Support of gzip and deflate is convenient, but doing so in backward incompatible way ads to degradation of CL ecosystem, which has lot of bugs and instabilities as it is. I regularly monitor QL libraries and every month something breaks (https://common-lisp.net/project/cl-test-grid/ql/).

Who is affected - any code already dealing with gzip or deflate encoding. It's not a theoretical speculation, my case is a concrete example of this.

@stassats
Copy link
Member

Then "tough luck" is a better justification. Sometimes things have to be broken. It's hard to get API right the first time and foresee any future needs. Not breaking anything will result in a shit API full of crutches.

@avodonosov
Copy link
Contributor

Generally speaking, if you need to improve an API, you introduce new API, instead of breaking the existing one.

If I wanted to improve drakma API, I would introduce new method drakma:http-request2 with better default behaviour and simpler parameters. Or, if the changes are very significant, I would created new library, drakma2. So the old tested code remains working using drakma:http-request, and the new code uses new function / library.

That way API evolves without and destructive network effect or efforts / time expenses. Almost always maintaining backward compatibility is easy and cheap (both for library author and the users).

About this particular case. Anyway, we need a parameter to disable / enable automatic gzip handling - I may want to download the gzipped version in order to back up it, or move to another location; in this case I don't want redundant ungzip / gzip processing.

And in my opinion the default should be backwards compatible. In the open source CL we don't have many resources, it takes years to fix even simple bugs, so deliberately breaking things is unacceptable.

Incidentally, there should be an option for easily specifying Accept-Encoding

It's easy to specify using :additional-headers

@hanshuebner
Copy link
Member

hanshuebner commented Mar 24, 2015 via email

@avodonosov
Copy link
Contributor

OK.

For the record, new API for new functionality, instead of breaking the old API.

@hanshuebner
Copy link
Member

hanshuebner commented Mar 24, 2015 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants