feat: add lemonde.fr ruleset #4

Open
wants to merge 3 commits into
base: main

Conversation

ndom91

@ndom91 ndom91 commented Nov 28, 2023

@ndom91
Author

ndom91 commented Nov 28, 2023

Ah so even when cleaned up like this, I still find that only ~75% of the article is apparently loaded. The rest of the content isn't even rendered anywhere on the page.

Once the "Please sign up.." banner and whatnot are removed/circumvented, you find this message, which used to be covered up 😅

image

Translates to approximately: "You have 75% of this article left to read. The rest is reserved for subscribers."

Do you guys have any additional tips / tricks for dealing with publishers / sites like this? 🤔

@ndom91 ndom91 changed the title from "feat: add fr/lemonde ruleset" to "feat: add lemonde.fr ruleset" Nov 28, 2023
@mms-gianni
Contributor

I can't see a legal way around it if it is not published for crawling bots.

@ndom91
Author

ndom91 commented Nov 29, 2023

> I can't see a legal way around it if it is not published for crawling bots.

Okay, gotcha, makes sense. Thanks for confirming.

@ndom91
Author

ndom91 commented Nov 29, 2023

In that case, although not perfect, I'd like to get this merged if y'all are happy with it too 👍

This seems to be as far as we can get around Le Monde's paywall with the current methodology.

@ndom91 ndom91 marked this pull request as ready for review March 7, 2024 19:13
2 participants