Bypass cookie wall? #22

Open
kylescousin opened this issue Jul 20, 2018 · 2 comments

Comments

@kylescousin

For example: https://www.hln.be/nieuws/binnenland/prins-laurent-in-beroep-tegen-dotatiesanctie-opgelegd-door-regering~a81f63c8/

The site shows a pre-screen asking you to accept cookies, so the parser tries to parse that page rather than the actual article.

Can anything be done about this?

@crscheid
Owner

Unfortunately, I haven't found a good way to do this yet. I'll keep this issue open in case I run across one. We also have trouble with redirects that detect a lack of cookies; see #20.

@bogdangrab

bogdangrab commented Mar 26, 2019

I'm not sure whether this is related to the problem I found. On some websites, when checking for redirects, the URL passed from checkForRedirects() is actually a JSON string. Apparently there is a "Location" somewhere in that JSON. I used this regex to avoid matching it: preg_match('/\b[Ll]ocation: (.*)/', $a, $r). Hope this helps.
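
To illustrate the regex above, here is a minimal sketch of how it tells a real Location header apart from a JSON body that merely contains a "location" key. The helper name findLocationHeader() and both sample inputs are made up for this example; this is not the library's checkForRedirects() implementation, only a guess at how the pattern could be applied.

```php
<?php
// Hypothetical helper for illustration; not the library's checkForRedirects().
function findLocationHeader(string $a): ?string
{
    // Regex from the comment above: it requires "Location: " (colon, then a
    // space), which a JSON key such as "location": does not contain.
    if (preg_match('/\b[Ll]ocation: (.*)/', $a, $r)) {
        return trim($r[1]); // strip whitespace such as a trailing "\r" from CRLF
    }
    return null;
}

// Made-up sample inputs.
$headers = "HTTP/1.1 302 Found\r\nLocation: https://example.com/article\r\nContent-Length: 0\r\n";
$json    = '{"status": "ok", "location": {"lat": 51.0, "lng": 4.4}}';

var_dump(findLocationHeader($headers)); // the URL https://example.com/article
var_dump(findLocationHeader($json));    // NULL
```

One caveat with this pattern: \b also matches after a hyphen, so a header such as Content-Location: would be picked up as well. Anchoring the pattern to the start of a line may be worth considering if that matters.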

3 participants