Wallabag is fetching 'Cookie Approval Text' instead of article content #2428

Closed
DonUber opened this issue Oct 11, 2016 · 7 comments · Fixed by fivefilters/ftr-site-config#211

DonUber commented Oct 11, 2016

When I save an article from a Dutch newspaper to my wallabag, it fetches the cookie approval text instead of the article content. Pocket does not have this problem.

Example of an article giving this problem: http://www.nrc.nl/nieuws/2016/10/02/de-nederlandse-school-wanorde-onrust-en-lawaai-4566603-a1524441

The fetched text is (translated from Dutch):

NRC Media Holding BV uses cookies and similar technologies. www.nrc.nl uses functional and analytical cookies to offer you an optimal visitor experience. In addition, third parties place tracking cookies to show you personalized advertisements and to make relevant offers outside the NRC website. Tracking cookies are also placed by social media networks. Your internet behaviour can be followed by these third parties by means of these tracking cookies. By clicking agree alongside, or by continuing to use this website, you agree to this.

Member

j0k3r commented Oct 12, 2016

Should be fixed in the next release

j0k3r closed this as completed Oct 12, 2016

guyspr commented Jan 12, 2017

Rather than open a new issue for the same problem: I see a similar issue with https://tweakers.net/nieuws/119975/google-brengt-android-wear-20-begin-februari-uit.html
(or any other article from tweakers.net). They have a cookie wall, and its text is what gets displayed.

Member

j0k3r commented Jan 12, 2017

@guyspronck the article looks great on my side, no cookie content:

image

To make sure the cookie panel isn't retrieved, I've pushed a PR on siteconfig to remove it.
It might be available in the next release. Or you can update the file yourself in vendor/vendor/j0k3r/graby-site-config/tweakers.net.txt
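
For illustration, a rule that strips a cookie panel in a site config file could look like the sketch below. `strip_id_or_class` is a directive of the ftr-site-config format that removes elements whose id or class contains the given substring. The value `cookie` is a hypothetical placeholder, since this thread does not show the actual change made in the PR.

```
# Hypothetical addition to tweakers.net.txt; existing directives stay as they are.
# Assumption: the cookie panel's id or class contains the substring "cookie".
strip_id_or_class: cookie
```

The right value depends on the page markup, so it is worth inspecting the HTML that wallabag actually receives before settling on one.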


guyspr commented Jan 12, 2017

This is the result I get:
image

I'll try the tweakers.net.txt!


guyspr commented Jan 12, 2017

I tried this and it didn't resolve the issue for me. It seems like it's redirecting to a separate cookie wall page. I'm running in Docker, if that makes a difference.

Member

j0k3r commented Jan 12, 2017

Could you try adding http_header(user-agent): PHP/5.3 ?
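
In a site config file, that header goes on its own line. A minimal sketch, assuming it is added to the tweakers.net.txt file mentioned earlier; the `test_url` line is optional and simply reuses the article URL from this thread:

```
# Sketch of the lines to add to tweakers.net.txt; existing directives stay as they are.

# Send this User-Agent when fetching pages from the site
http_header(user-agent): PHP/5.3

# Optional: a URL to check the config against (taken from this thread)
test_url: https://tweakers.net/nieuws/119975/google-brengt-android-wear-20-begin-februari-uit.html
```

Whether a different User-Agent gets past the cookie wall depends on how the site decides when to redirect, so this is something to try rather than a guaranteed fix.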


guyspr commented Jan 12, 2017

Doesn't seem to work for me. Still getting the cookie wall text as result.
