Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

change CrlClientOnline to ignore content-type #42

Closed
wants to merge 1 commit into from

Conversation

janflora
Copy link

Using CrlClientOnline to fetch CRLs fails if the webserver publishing the CRL does not provide a content-type header in the response. Since the response is basically just converted to an input stream and is used in its raw format, the content type is not really needed. The case where I've seen this break is while fetching the CRL from Adobe CDS: http://crl.adobe.com/cds.crl

Using CrlClientOnline to fetch CRLs fails if the webserver publishing the CRL does not provide a content-type header in the response. Since the response is basically just converted to an input stream and is used in its raw format, the content type is not really needed. The case where I've seen this break is while fetching the CRL from Adobe CDS: http://crl.adobe.com/cds.crl
@iText-CI
Copy link
Contributor

Can one of the admins verify this patch?

@blagae
Copy link
Contributor

blagae commented Jun 18, 2018

While I'm not an expert on this piece of code, it makes more sense to me to at least try and get the regular content type, and only fall back on failure. So either try { getContent() } catch (SomeException se) { getInputStream() } or by using a null check on the return value of getContent.

@janflora
Copy link
Author

@blagae The original code would make use of getContent() which essentially tries to parse the content using the content-type of the response, only to cast this into an input stream. The way I see it is that the only benefit you get here is that the content type might be validated against the actual data being returned (I'm not sure if any validation takes place).

If you encapsulate the the getContent() call in a try-catch block and ultimately just get the input stream content if it fails, then you'd accept the returned content whether it matches the content-type or not. Isn't this essentially the same as just reading the response as an input stream to begin with?

@blagae
Copy link
Contributor

blagae commented Jun 18, 2018

I'm just looking at the code as I see it, and it seems to be not guaranteed that UrlConnection.getContent() and UrlConnection.getInputStream() will return the same object. In fact, it isn't even guaranteed that getContent() will return an InputStream, hence the cast (which may also need to be caught ?). From the source code in java.net, this is the implementation of getContent():

    public Object getContent() throws IOException {
        getInputStream(); // this is not the returned object !
        return getContentHandler().getContent(this);
    }

As far as I can see, we can't make inferences of the ContentHandler's behavior because the object's type depends on the mimetype of the content-type header.

Again, I'm not the expert here and I'll defer to one of my colleagues who have more knowledge of dig-sig, but I think if we're trying to improve behavior, we should take into account the uncertainties that are not solved by the java.net package.

@avlemos
Copy link

avlemos commented Apr 30, 2019

Hi @janflora,

Right now itext/itextpdf has been deprecated in favor of itext/itext7, although we will continue to incorporate security fixes in itext/itextpdf (iText5).

Could you check if the issue exists in itext/itext7 so we can incorporate your fix?

Thank you for contribution, it is appreciated.

@avlemos avlemos closed this Apr 30, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants