linuxjournal.com multi-page fetches only first page #81

Closed
Strubbl opened this issue Mar 3, 2017 · 2 comments

Comments

Strubbl commented Mar 3, 2017

I am posting this issue here because I think it is a bug in graby. The site config appears to be valid.

When adding http://www.linuxjournal.com/content/papas-got-brand-new-nas to wallabag, content fetching takes a very long time. Adding this one page alone makes wallabag's prod.log grow by 2.4 MB.

The problem only occurs with multi-page articles on linuxjournal.com.

What I observe:

  • wallabag needs several minutes to fetch this article
  • after fetching has finished, only the first page of the article is in wallabag
  • the article in wallabag ends with the sentence: "This article appears to continue on subsequent pages which we could not extract"

I have attached the log: prod.txt

From a first, superficial look at the log, graby seems to have a problem with the URL for the next page, because it contains an unusual string (maybe the comma?).
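
One way to test the comma guess outside graby is to parse a next-page URL and compare the raw and percent-encoded forms of its page parameter. This is only a rough sketch: the URL below is hypothetical (the real one is only in the attached log), and the `page` parameter name and the helper `inspect_page_param` are assumptions made for illustration, not anything taken from graby.

```python
from urllib.parse import parse_qs, quote, urlsplit

# HYPOTHETICAL next-page URL; the real one is only in the attached log.
NEXT_PAGE_URL = "http://www.linuxjournal.com/content/papas-got-brand-new-nas?page=0,1"


def inspect_page_param(url, name="page"):
    """Return the raw value of one query parameter and its percent-encoded form."""
    query = parse_qs(urlsplit(url).query)
    raw = query[name][0]
    # safe="" forces every reserved character, including ",", to be encoded.
    return raw, quote(raw, safe="")


if __name__ == "__main__":
    raw, encoded = inspect_page_param(NEXT_PAGE_URL)
    print("raw:", raw)
    print("encoded:", encoded)
```

If the fetcher only ever requests one of the two forms, or compares them as different URLs, that could be a place to look, but this is a guess and not something the log confirms.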

Owner

j0k3r commented Nov 12, 2017

I've just tested using f43.me and got all 3 pages without any problem.
Do you still have a problem with that link?

Author

Strubbl commented Nov 12, 2017

It's fixed. Thank you.

Strubbl closed this as completed Nov 12, 2017