Fix import with recipe-scrapers #1788
@gloriousDan the site is behind a paywall, so it can’t be implemented/tested as part of the main library. A test is implemented for that site, so as long as the PR passes all of the URL import tests it should be fine. |
@smilerz Ah okay, I already suspected that something like this is the reason. I didn't run the tests locally (yet), are they run by CI? |
The broken scraper for https://www.reddit.com/r/selfhosted/comments/ui6cny/tandoor_recipe_v14_released_shopping_importing/i7kkz9a/?context=3 now also has a fix in an upstream PR (hhursev/recipe-scrapers#538) |
Only on the primary branches. |
I just ran the test suite locally.
Do I have to run |
The fix was merged and published in https://github.com/hhursev/recipe-scrapers/releases/tag/13.32.1 just now. |
Yes, and |
After running |
@smilerz I had to disable the tests for your importer as they were failing for me, but I suspect it's an easy fix; I just did not have the time to understand what's going wrong. @gloriousDan thank you very much for this, I was just about to sit down and implement basically exactly what you did here. I will review and get this merged asap. |
That's great to hear as this problem is holding me back from upgrading to 1.2.x |
I just tried them and they pass for me. (SpruceEats is still broken) |
ok awesome, thanks for the info
This PR fixes several import problems which stem from using some internal recipe-scrapers classes.
Overview of changes:
If a URL is provided, scrape_me is called in wild mode.
If this is unsuccessful or no URL was provided (which happens when data is entered in the source field), the old way of wrapping the input into
<script type='application/ld+json'> ... </script>
and then sending this into recipe-scrapers is used. (This is now the only way text_scraper in cookbook/helper/scrapers/scrapers.py gets called.)
Additionally, we always try to get data from the property classes first. If that fails, we try to get it from the schema attributes.
This doesn't change much of the old behaviour but makes better use of the scraper classes of recipe-scrapers. This way it shouldn't break anything for any site.
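The flow described above can be sketched roughly as follows. This is only an illustration and not the code from this PR: `import_recipe`, `wrap_ld_json` and `get_field` are hypothetical names, and the two scraper callables are passed in so that the sketch does not depend on any particular recipe-scrapers version (in the PR, the URL path would be scrape_me in wild mode and the fallback would be text_scraper). It also assumes that the schema.org data is reachable as a dict at `scraper.schema.data`.

```python
from typing import Any, Callable, Optional


def wrap_ld_json(data: str) -> str:
    """Wrap pasted JSON-LD in a script tag so an HTML-based scraper can read it."""
    return "<script type='application/ld+json'>" + data + "</script>"


def import_recipe(
    url: str,
    data: str,
    scrape_url: Callable[[str], Any],        # hypothetical: e.g. scrape_me in wild mode
    scrape_html: Callable[[str, str], Any],  # hypothetical: e.g. a text_scraper wrapper
) -> Any:
    """Try the URL first; fall back to wrapping the source-field input."""
    if url:
        try:
            return scrape_url(url)
        except Exception:
            # Any scraper failure falls through to the old text-based path.
            pass
    return scrape_html(wrap_ld_json(data), url)


def get_field(scraper: Any, method: str, schema_key: str) -> Optional[Any]:
    """Prefer the scraper's own method, then the raw schema attribute."""
    try:
        value = getattr(scraper, method)()
        if value:
            return value
    except Exception:
        # A missing or failing method is not fatal; try the schema next.
        pass
    # Assumption: schema.org data is exposed as a dict at scraper.schema.data.
    schema = getattr(getattr(scraper, "schema", None), "data", None) or {}
    return schema.get(schema_key)
```

With this shape, a lookup such as `get_field(scraper, "title", "name")` only touches the schema dict when the scraper's own `title()` fails or returns nothing.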
Fixes:
I'm not sure what to do about the CooksIllustrated custom scrapers, or whether they still work.
@smilerz Why was this implemented within Tandoor and not within recipe-scrapers, and is it still working?