HTTP: I want a common way to handle gzipped and normal web pages: auto-unzip if gzipped, no unzip if normal. #323
Comments
Simply use the method
and unzip the content. After the
Yes. It will work even if the page is not gzipped.
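To illustrate why one code path can serve both kinds of pages, here is a minimal sketch that uses only the JDK. It is not this library's API: `GzipAware`, `looksGzipped`, `unzipIfNeeded` and `gzip` are hypothetical names made up for the example. It assumes you already have the raw response body as bytes, and it detects gzip by its two magic bytes (0x1f 0x8b) rather than by the `Content-Encoding` header, so it also copes with servers that do not set the header.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper for illustration only; not part of any library.
final class GzipAware {

    // A gzip stream always starts with the magic bytes 0x1f 0x8b.
    static boolean looksGzipped(byte[] body) {
        return body != null && body.length >= 2
                && (body[0] & 0xff) == 0x1f
                && (body[1] & 0xff) == 0x8b;
    }

    // Returns the body unchanged unless it is gzip-compressed.
    static byte[] unzipIfNeeded(byte[] body) throws IOException {
        if (!looksGzipped(body)) {
            return body;
        }
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(body));
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            byte[] buffer = new byte[8192];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        }
    }

    // Demo helper: gzip-compress a byte array so the example needs no network.
    static byte[] gzip(byte[] plain) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(plain);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] page = "<html>hello</html>".getBytes(StandardCharsets.UTF_8);
        // The plain page and the gzipped page both come out as the same text.
        System.out.println(new String(unzipIfNeeded(page), StandardCharsets.UTF_8));
        System.out.println(new String(unzipIfNeeded(gzip(page)), StandardCharsets.UTF_8));
    }
}
```

In a crawler, checking the `Content-Encoding` response header first is the more standard approach; sniffing the magic bytes is a fallback for sites that omit it.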
OK, thanks.
Is it working for you, @pzn4jc?
OK, thanks for your answer. It works fine.
When requesting HTTPS pages: I wrote some tests for HTTPS and got this exception: Exception in thread "main" jodd.http.HttpException: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target; <--- sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
Hm, that is strange. When you say "sometimes is incorrect", do you mean on the same page, or on different pages? Some web sites may not play by the HTTP rules and may not set the headers, for example. Would it be possible to give me a URL that does not work? I guess it is something trivial.
For the second issue, you need to add the missing certificate to Java as a trusted certificate. You can read more about this issue here
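As a rough illustration of what "trusted by Java" means, the sketch below builds an `SSLContext` from a key store that holds one extra certificate, using only standard JDK classes. The class and method names (`TrustExtraCert`, `keyStoreWith`, `contextTrusting`) and the alias `extra-site` are hypothetical, and how to hand such a context to the HTTP client is not shown because it is not covered in this thread. Note that a key store built this way trusts only the certificate you put in it, not the JDK's default authorities; importing the certificate into the JDK trust store with `keytool` is the more common fix.

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.KeyStore;
import java.security.cert.Certificate;
import java.security.cert.CertificateFactory;
import javax.net.ssl.SSLContext;
import javax.net.ssl.TrustManagerFactory;

// Hypothetical helper for illustration only; not part of any library.
final class TrustExtraCert {

    // Builds an in-memory key store containing a single trusted certificate.
    static KeyStore keyStoreWith(Certificate cert, String alias) throws Exception {
        KeyStore ks = KeyStore.getInstance(KeyStore.getDefaultType());
        ks.load(null, null); // initialise an empty key store
        ks.setCertificateEntry(alias, cert);
        return ks;
    }

    // Creates a TLS context whose trust decisions come from the given key store.
    static SSLContext contextTrusting(KeyStore ks) throws Exception {
        TrustManagerFactory tmf =
                TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
        tmf.init(ks);
        SSLContext ctx = SSLContext.getInstance("TLS");
        ctx.init(null, tmf.getTrustManagers(), null);
        return ctx;
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.out.println("usage: java TrustExtraCert <path-to-certificate-file>");
            return;
        }
        CertificateFactory cf = CertificateFactory.getInstance("X.509");
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            Certificate cert = cf.generateCertificate(in);
            SSLContext ctx = contextTrusting(keyStoreWith(cert, "extra-site"));
            System.out.println("Created context for protocol " + ctx.getProtocol());
        }
    }
}
```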
Yes, I mean on different pages, not the same page. I also found that when the "connectionTimeout" param is not set, the HTTPS page works fine; when I set connectionTimeout, the HTTPS page throws the exception. @IgorSpasic
Working on this :) I was able to reproduce it. Please send me URLs of the sites where you had the gzip issue!
I just made a fix for this HTTPS problem. Please, if you find any other problematic URL, just open a new issue!
OK, thanks.
I want to use it in a web crawler, but web pages sometimes come back gzipped and sometimes as normal pages. I do not know in advance whether a page is compressed, so I want a common way: when it meets gzip, can it automatically unzip the web page?