Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix import with recipe-scrapers #1788

Merged
merged 1 commit into from
May 11, 2022

Conversation

gloriousDan
Copy link
Contributor

@gloriousDan gloriousDan commented May 9, 2022

This PR fixes several import problems which stem from using some internal recipe-scrapers classes.

Overview over changes:

  • If an url is provided, scrape_me is called in wild mode

    • If a scraper for the url exists, this scraper is called by recipe-scrapers
    • If no scraper exists wild_mode comes into effect and tries to search the page for schema.org fields.
  • If this is unsuccesfull or no url was provided (which happens when data is entered in the source field) the old way of wrapping the input into <script type='application/ld+json'> ... </script> and then sending this into recipe-scrapers is used. (This is now the only way how text_scraper in cookbook/helper/scrapers/scrapers.py gets called)

  • Additionally, we always try to get data from the property classes first. If that fails we try to get it from the schema attributes.

This doesn't change a lot of the old behaviour but uses the scraper classes of recipe-scrapers better. This way it shouldn't break anything for any sites.

Fixes:

I'm not sure what to do about the CooksIllustrated custom scrapers and if they still work.
@smilerz Why was this implemented within tandoor and not within recipe-scrapers and is this still working?

@gloriousDan gloriousDan changed the title Call scrape_me first when scraping from url Fix import with recipe-scrapers May 9, 2022
@smilerz
Copy link
Collaborator

smilerz commented May 9, 2022

@gloriousDan the site is behind a paywall so can’t be implemented/tested as part of the main library.

A test is implemented for that site, so as long as the PR passes all of the url import tests it should be fine.

@gloriousDan
Copy link
Contributor Author

@smilerz Ah okay, I already suspected that something like this is the reason. I didn't run the tests locally (yet), are they run by CI?

@gloriousDan
Copy link
Contributor Author

@smilerz
Copy link
Collaborator

smilerz commented May 9, 2022

@smilerz Ah okay, I already suspected that something like this is the reason. I didn't run the tests locally (yet), are they run by CI?

Only on the primary branches.

@gloriousDan
Copy link
Contributor Author

I just ran the test suite locally.
The only issues seem to be because of the same error:

=========================================================================================== 
short test summary info 
============================================================================================
FAILED cookbook/tests/api/test_api_recipe.py::test_share_permission - ValueError: Missing staticfiles manifest entry for 'assets/favicon.svg'
FAILED cookbook/tests/api/test_api_userpreference.py::test_add - ValueError: Missing staticfiles manifest entry for 'assets/favicon.svg'
FAILED cookbook/tests/edits/test_edits_recipe.py::test_switch_recipe - ValueError: Missing staticfiles manifest entry for 'assets/favicon.svg'
FAILED cookbook/tests/edits/test_edits_recipe.py::test_external_recipe_update - ValueError: Missing staticfiles manifest entry for 'assets/favicon.svg'
FAILED cookbook/tests/edits/test_edits_storage.py::test_edit_storage - ValueError: Missing staticfiles manifest entry for 'assets/favicon.svg'
FAILED cookbook/tests/edits/test_edits_storage.py::test_view_permission[arg3] - ValueError: Missing staticfiles manifest entry for 'assets/favicon.svg'
FAILED cookbook/tests/edits/test_edits_storage.py::test_view_permission[arg6] - ValueError: Missing staticfiles manifest entry for 'assets/favicon.svg'
FAILED cookbook/tests/other/test_export.py::test_export_file_cache[arg2] - ValueError: Missing staticfiles manifest entry for 'assets/favicon.svg'
FAILED cookbook/tests/other/test_export.py::test_export_file_cache[arg3] - ValueError: Missing staticfiles manifest entry for 'assets/favicon.svg'
FAILED cookbook/tests/other/test_export.py::test_export_file_cache[arg4] - ValueError: Missing staticfiles manifest entry for 'assets/favicon.svg'
FAILED cookbook/tests/other/test_export.py::test_export_file_cache[arg5] - ValueError: Missing staticfiles manifest entry for 'assets/favicon.svg'
FAILED cookbook/tests/views/test_views_api.py::test_external_link_permission[arg4] - ValueError: Missing staticfiles manifest entry for 'assets/favicon.svg'
FAILED cookbook/tests/views/test_views_api.py::test_external_link_permission[arg5] - ValueError: Missing staticfiles manifest entry for 'assets/favicon.svg'
FAILED cookbook/tests/views/test_views_api.py::test_external_link_permission[arg6] - ValueError: Missing staticfiles manifest entry for 'assets/favicon.svg'
FAILED cookbook/tests/views/test_views_general.py::test_books[arg2] - ValueError: Missing staticfiles manifest entry for 'assets/favicon.svg'
FAILED cookbook/tests/views/test_views_general.py::test_books[arg3] - ValueError: Missing staticfiles manifest entry for 'assets/favicon.svg'
FAILED cookbook/tests/views/test_views_general.py::test_plan[arg2] - ValueError: Missing staticfiles manifest entry for 'assets/favicon.svg'
FAILED cookbook/tests/views/test_views_general.py::test_plan[arg3] - ValueError: Missing staticfiles manifest entry for 'assets/favicon.svg'
FAILED cookbook/tests/views/test_views_general.py::test_shopping[arg2] - ValueError: Missing staticfiles manifest entry for 'assets/favicon.svg'
FAILED cookbook/tests/views/test_views_general.py::test_shopping[arg3] - ValueError: Missing staticfiles manifest entry for 'assets/favicon.svg'
FAILED cookbook/tests/views/test_views_general.py::test_settings[arg1] - ValueError: Missing staticfiles manifest entry for 'assets/favicon.svg'
FAILED cookbook/tests/views/test_views_general.py::test_settings[arg2] - ValueError: Missing staticfiles manifest entry for 'assets/favicon.svg'
FAILED cookbook/tests/views/test_views_general.py::test_settings[arg3] - ValueError: Missing staticfiles manifest entry for 'assets/favicon.svg'
FAILED cookbook/tests/views/test_views_general.py::test_history[arg1] - ValueError: Missing staticfiles manifest entry for 'assets/favicon.svg'
FAILED cookbook/tests/views/test_views_general.py::test_history[arg2] - ValueError: Missing staticfiles manifest entry for 'assets/favicon.svg'
FAILED cookbook/tests/views/test_views_general.py::test_history[arg3] - ValueError: Missing staticfiles manifest entry for 'assets/favicon.svg'
FAILED cookbook/tests/views/test_views_general.py::test_markdown_doc[arg0] - ValueError: Missing staticfiles manifest entry for 'assets/favicon.svg'
FAILED cookbook/tests/views/test_views_general.py::test_markdown_doc[arg1] - ValueError: Missing staticfiles manifest entry for 'assets/favicon.svg'
FAILED cookbook/tests/views/test_views_general.py::test_markdown_doc[arg2] - ValueError: Missing staticfiles manifest entry for 'assets/favicon.svg'
FAILED cookbook/tests/views/test_views_general.py::test_markdown_doc[arg3] - ValueError: Missing staticfiles manifest entry for 'assets/favicon.svg'
FAILED cookbook/tests/views/test_views_general.py::test_api_info[arg1] - ValueError: Missing staticfiles manifest entry for 'assets/favicon.svg'
FAILED cookbook/tests/views/test_views_general.py::test_api_info[arg2] - ValueError: Missing staticfiles manifest entry for 'assets/favicon.svg'
FAILED cookbook/tests/views/test_views_general.py::test_api_info[arg3] - ValueError: Missing staticfiles manifest entry for 'assets/favicon.svg'
FAILED cookbook/tests/views/test_views_recipe_share.py::test_share - ValueError: Missing staticfiles manifest entry for 'assets/favicon.svg'
==================================================================== 
34 failed, 559 passed, 24 skipped, 1327 warnings in 237.80s (0:03:57) 
=====================================================================

Do I have to run yarn build before running the tests?

@gloriousDan
Copy link
Contributor Author

@smilerz
Copy link
Collaborator

smilerz commented May 10, 2022

Do I have to run yarn build before running the tests?

Yes, and python manage.py collectstatic

@gloriousDan
Copy link
Contributor Author

Yes, and python manage.py collectstatic

After running yarn build and python manage.py collectstatic all tests pass.

@vabene1111
Copy link
Collaborator

@smilerz i had to disable the tests for your inmporter as they were failing for me but i suspect its an easy fix, i just did not have the time to understand whats going wrong.

@gloriousDan thank you very much for this, i was just about to sit down and implement basically exactly what you did here. I will review and get this merged asap.

@gloriousDan
Copy link
Contributor Author

That's great to hear as this problem is holding me back from upgrading to 1.2.x

@vabene1111 vabene1111 merged commit cb59a63 into TandoorRecipes:develop May 11, 2022
@smilerz
Copy link
Collaborator

smilerz commented May 11, 2022

@smilerz i had to disable the tests for your inmporter as they were failing for me but i suspect its an easy fix, i just did not have the time to understand whats going wrong.

I just tried them and they pass for me. (SpruceEats is still broken)

@vabene1111
Copy link
Collaborator

ok aweseome, thanks for the info

@gloriousDan gloriousDan deleted the fix-import branch May 12, 2022 12:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Fallback to wildmode if parser is broken
3 participants