Obscure parsing error on cheezburger.com #331

Closed
kylealwyn opened this issue Mar 7, 2023 · 5 comments

Comments

@kylealwyn

Hello!

I'm having an issue parsing https://cheezburger.com/19495941/33-purrfect-cat-memes-for-all-the-grumpy-cats-on-monday-morning-february-27-2023 (or really any article on that site), both locally and on https://extractor-demos.pages.dev/article-extractor. The extractor fails with: Cannot read properties of null (reading 'length').

@ndaidong
Collaborator

ndaidong commented Mar 7, 2023

@kylealwyn thank you. I've checked and will fix that now.

The problem is that the website uses the wrong attribute name in its Twitter Card meta tags:

[Image: Screenshot from 2023-03-07 11-22-42]

They should use "content" instead of "value".

Either way, we have to handle this case more gracefully.
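For readers hitting the same error, here is a minimal sketch of the kind of defensive handling described above. It is not the library's actual fix, which this thread does not show. The helper names `parseAttributes` and `readMetaTags` and the sample markup are hypothetical, and the regex-based parsing is only a self-contained stand-in for a real DOM parser. The idea is to accept "value" as a fallback when "content" is missing, and to skip tags that carry neither instead of reading a property of a null value.

```typescript
// Hypothetical helpers for illustration only; not code from the library.
type MetaMap = Record<string, string>;

// Parse the quoted attributes of a single tag into a name -> value map.
function parseAttributes(tag: string): MetaMap {
  const attrs: MetaMap = {};
  const attrPattern = /([\w:-]+)\s*=\s*(?:"([^"]*)"|'([^']*)')/g;
  let match: RegExpExecArray | null;
  while ((match = attrPattern.exec(tag)) !== null) {
    const [, rawName = "", doubleQuoted, singleQuoted] = match;
    attrs[rawName.toLowerCase()] = doubleQuoted ?? singleQuoted ?? "";
  }
  return attrs;
}

// Collect <meta> values keyed by their name/property attribute.
// Falls back to the non-standard "value" attribute when "content" is absent,
// and skips tags that have neither, so nothing null is dereferenced later.
function readMetaTags(html: string): MetaMap {
  const result: MetaMap = {};
  const tags = html.match(/<meta\b[^>]*>/gi) ?? [];
  for (const tag of tags) {
    const attrs = parseAttributes(tag);
    const key = attrs["name"] ?? attrs["property"];
    const value = attrs["content"] ?? attrs["value"];
    if (key === undefined || value === undefined) continue;
    result[key.toLowerCase()] = value;
  }
  return result;
}

// Made-up markup imitating the situation in the screenshot.
const sample = `
  <meta name="twitter:title" value="33 purrfect cat memes">
  <meta name="twitter:card" content="summary_large_image">
  <meta name="twitter:image">
`;
const meta = readMetaTags(sample);
console.log(meta);
```

With this approach the tag that uses "value" still yields a title, the well-formed tag is read as usual, and the tag with no value at all is ignored rather than causing a crash.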

ndaidong added a commit that referenced this issue Mar 7, 2023
- Fix issue #331
- Update dependencies
- Remove unnecessary watermark
@ndaidong
Collaborator

ndaidong commented Mar 7, 2023

@kylealwyn done. However, that website's HTML structure does not lend itself well to article extraction. You may need to add transformations to improve your extraction result.

@kylealwyn
Author

Appreciate it @ndaidong! I was coming here to share one more site with the same issue: https://themerkle.com/exploring-the-potential-of-collateral-network-colt-ethereum-eth-and-quant-qnt/

Will see if 7.2.10 solves both!

@kylealwyn
Author

kylealwyn commented Mar 7, 2023

Ope - looks like 7.2.10 isn't published yet. Will wait for that.

Edit: installed from the main branch and it worked great 👌

@ndaidong
Collaborator

ndaidong commented Mar 7, 2023

@kylealwyn yes, v7.2.10 has just been published. And it works for themerkle.com too ✔️
