Unable to retrieve readable content http://habrahabr.ru/ #1541

Closed
Aligator77 opened this issue Dec 27, 2015 · 5 comments

Comments

@Aligator77

I found a problem with URLs like http://habrahabr.ru/post/$someNumber/ and http://habrahabr.ru/company/$someCompany/blog/$someNumber/: no readable content is retrieved. With the RSS link, however, part of the content is grabbed, for example http://habrahabr.ru/rss/post/$someNumber/.
At first I thought this was a problem with the ru locale, but I was wrong. I now think the problem is in 'default_parser' => 'libxml'.

@j0k3r
Member

j0k3r commented Dec 29, 2015

Could you give us a real link instead of $blablabla? It'll ease the debugging :)

Yep, I don't speak Russian.

@Aligator77
Author

I did a few tests and concluded that part of the site can produce this error because it takes the client for a bot and bans it. Maybe add a setting for advanced users so they can choose to use a browser?
Also, wallabag does not support the windows-1251 charset.
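
One reading of the suggestion above is to let the user pick the User-Agent header sent with each request, so the site sees something that looks like a browser. The following is only a minimal Python sketch of that idea, not wallabag's code (wallabag is written in PHP). The names `DEFAULT_USER_AGENT` and `build_request` and the User-Agent string itself are hypothetical, and whether this would be enough to avoid the ban is untested.

```python
from urllib.request import Request

# Hypothetical default: a browser-like User-Agent string chosen for illustration.
DEFAULT_USER_AGENT = (
    "Mozilla/5.0 (X11; Linux x86_64; rv:43.0) Gecko/20100101 Firefox/43.0"
)


def build_request(url: str, user_agent: str = DEFAULT_USER_AGENT) -> Request:
    """Build (but do not send) a request carrying a user-chosen User-Agent."""
    return Request(url, headers={"User-Agent": user_agent})


# Nothing is fetched here; pass the result to urllib.request.urlopen to send it.
request = build_request("http://habrahabr.ru/")
```

An "advanced" setting could then simply supply a different `user_agent` value.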

@Aligator77
Author

Thank you, everything works fine!

@ANAT01

ANAT01 commented May 24, 2016

> Also, wallabag does not support the windows-1251 charset.

How can support for content in the windows-1251 charset (or other charsets) be added? Is it possible?
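
As a general illustration of the question above, and not a description of how wallabag handles it, a fetcher can read the declared charset and decode the bytes to Unicode before parsing. The Python sketch below assumes the charset is declared either in the Content-Type header or in a `<meta>` tag. The function names `detect_charset` and `decode_html` are hypothetical.

```python
import re


def detect_charset(raw: bytes, content_type: str = "") -> str:
    """Return the declared charset: HTTP header first, then <meta>, else utf-8."""
    header = re.search(r"charset=[\"']?([\w-]+)", content_type, re.I)
    if header:
        return header.group(1).lower()
    # Only scan the start of the document, where <meta> tags normally live.
    meta = re.search(rb"<meta[^>]+charset=[\"']?([\w-]+)", raw[:2048], re.I)
    if meta:
        return meta.group(1).decode("ascii").lower()
    return "utf-8"


def decode_html(raw: bytes, content_type: str = "") -> str:
    """Decode raw page bytes to str, falling back to lenient UTF-8."""
    charset = detect_charset(raw, content_type)
    try:
        return raw.decode(charset)
    except (LookupError, UnicodeDecodeError):
        # Unknown or wrong charset label: keep going with replacement characters.
        return raw.decode("utf-8", errors="replace")


page = '<html><head><meta charset="windows-1251"></head><body>Привет</body></html>'
text = decode_html(page.encode("windows-1251"))
```

Once the text is a Unicode string, it can be re-encoded as UTF-8 for whatever parser comes next.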
