Update tine.no scraper #486

Merged: 5 commits, Jan 25, 2022


@mathiazom (Contributor) commented Jan 23, 2022

Closes #484

The scraper implementation for Tine.no has been updated to resolve issues with missing data in the scrape result.
Since tine.no provides recipe data as JSON-LD, I have replaced most of the existing scraper with the general Schema.org scraper. The exceptions are the recipe image (overridden to retrieve a higher resolution) and a cosmetic detail in the instructions.
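
For readers unfamiliar with JSON-LD, here is a minimal sketch of how Schema.org recipe data embedded in a page can be read. It uses only the Python standard library and is not the project's Schema.org scraper; the class name `JsonLdCollector`, the function `find_recipe`, and the sample HTML are all hypothetical.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collects parsed contents of <script type="application/ld+json"> tags."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.documents = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        # Only keep text that sits inside a JSON-LD script tag.
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            self.documents.append(json.loads("".join(self._buffer)))


def find_recipe(html):
    """Return the first top-level JSON-LD object of type Recipe, or None."""
    parser = JsonLdCollector()
    parser.feed(html)
    parser.close()
    for doc in parser.documents:
        if isinstance(doc, dict) and doc.get("@type") == "Recipe":
            return doc
    return None


# Hypothetical page content, not taken from tine.no.
SAMPLE_HTML = """
<html><head>
<script type="application/ld+json">
{"@type": "Recipe", "name": "Vafler", "recipeIngredient": ["3 egg", "4 dl melk"]}
</script>
</head><body></body></html>
"""

recipe = find_recipe(SAMPLE_HTML)
print(recipe["name"], recipe["recipeIngredient"])
```

Because the structured data already carries the title, ingredients, and similar fields, a site-specific scraper only needs custom code where the JSON-LD is missing or less suitable, such as the image and instructions mentioned above.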

I also had to work around an apparently incorrect property name in the data from tine.no. I chose a dictionary key replacement solution from Stack Overflow (user 'baldr') and have attempted to properly credit the original author.
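
As a rough illustration of the key replacement idea, the sketch below recursively renames a key throughout nested dictionaries and lists. It is not the Stack Overflow solution used in this pull request; the function name `replace_keys`, the misnamed property `recipeInstruction`, and the sample payload are assumptions made for the example.

```python
def replace_keys(data, old_key, new_key):
    """Return a copy of `data` with every `old_key` renamed to `new_key`."""
    if isinstance(data, dict):
        return {
            (new_key if key == old_key else key): replace_keys(value, old_key, new_key)
            for key, value in data.items()
        }
    if isinstance(data, list):
        return [replace_keys(item, old_key, new_key) for item in data]
    # Scalars (strings, numbers, None) are returned unchanged.
    return data


# Hypothetical payload: "recipeInstruction" stands in for the misnamed property.
raw = {
    "@type": "Recipe",
    "name": "Vafler",
    "recipeInstruction": [{"@type": "HowToStep", "text": "Bland alt."}],
}

fixed = replace_keys(raw, "recipeInstruction", "recipeInstructions")
print(sorted(fixed))
```

The function builds new containers instead of mutating its input, so the original data stays available for comparison or debugging.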

Finally, I expanded the scraper templates, the abstract scraper, and the scraper factory, but I am not completely sure how best to organize things. Suggestions are very welcome 😃

@jayaddison (Collaborator) commented:

This looks great, thank you @mathiazom - and thanks too for being meticulous about the details in the description and commit messages.

I've left two optional comments and will take another look after you've had time to respond to those.

Also regenerated the scraper from templates to add some missing methods
@mathiazom (Contributor, Author) commented:

Thanks for the comments, @jayaddison. I have now updated the branch with the suggested changes.

@coveralls commented:

Coverage decreased (-0.03%) to 95.355% when pulling 4862713 on mathiazom:update-tineno-scraper into 7052bb5 on hhursev:main.

@jayaddison (Collaborator) left a comment:

All looks good to me, thank you again @mathiazom!

@jayaddison jayaddison merged commit 836a664 into hhursev:main Jan 25, 2022

Linked issue: Tine.no scraper issues