feat: quality test for nutriscore on olive oils #8360

Merged
merged 18 commits into from
Jul 6, 2023

stephanegigandet
Contributor

Added a test to verify that, for the Nutri-Score computation, we estimate the fruits/vegetables content of olive oils without ingredients at 100%.

@stephanegigandet stephanegigandet requested a review from a team as a code owner April 25, 2023 14:14
@github-actions github-actions bot added 🚦Nutri-Score https://world.openfoodfacts.org/nutriscore 🧪 tests labels Apr 25, 2023
@benbenben2
Collaborator

benbenben2 commented Apr 25, 2023

What is the difference between this new test (en-olive-oil-no-ingredients) and the already existing test named "olive-oil"?

Maybe we could try with virgin olive oils and/or extra-virgin olive oils instead.

EDIT: just tested by replacing categories by "extra-virgin olive oil". It also results in:
"nutrition_score_warning_fruits_vegetables_nuts_from_category" : "en:olive-oils",
"nutrition_score_warning_fruits_vegetables_nuts_from_category_value" : 100,
This confirms that this estimation is also working for children of "en:olive-oils".
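
The inheritance confirmed above, where a child category picks up the estimate defined on `en:olive-oils`, can be illustrated with a small sketch. It is written in Python rather than the project's Perl, and the `PARENTS` and `ESTIMATES` tables, the category tags in them, and `estimate_from_category` are hypothetical stand-ins, not Product Opener code.

```python
# Hypothetical, hand-written taxonomy fragment: category -> direct parents.
PARENTS = {
    "en:extra-virgin-olive-oils": ["en:virgin-olive-oils"],
    "en:virgin-olive-oils": ["en:olive-oils"],
    "en:olive-oils": ["en:vegetable-oils"],
    "en:vegetable-oils": [],
}

# Hypothetical estimate of fruits/vegetables/nuts content (%) per category.
ESTIMATES = {"en:olive-oils": 100}


def estimate_from_category(category):
    """Return (source_category, value) from the category itself or its
    nearest ancestor that carries an estimate, or None if there is none."""
    queue = [category]
    seen = set()
    while queue:
        current = queue.pop(0)  # breadth-first: nearest ancestors first
        if current in seen:
            continue
        seen.add(current)
        if current in ESTIMATES:
            return current, ESTIMATES[current]
        queue.extend(PARENTS.get(current, []))
    return None
```

With this toy data, looking up `en:extra-virgin-olive-oils` reports `en:olive-oils` as the source category, which mirrors the two warning fields quoted in the comment.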

@github-actions github-actions bot added the 💥 Merge Conflicts 💥 Merge Conflicts label May 3, 2023
@github-actions github-actions bot added categories 🧽 Data quality https://wiki.openfoodfacts.org/Quality Tags 🧬 Taxonomies https://wiki.openfoodfacts.org/Global_taxonomies labels May 8, 2023
# the line should contain a single ingredient
my $expected_ingredients = $'; # everything after the matched string

if ($expected_ingredients =~ /,/i) {
Collaborator

@stephanegigandet, is there a function that could be used there to make sure that the ingredient provided is in the ingredients taxonomy?

Contributor Author

There's an exist_taxonomy_tag() function, but using it here would require the ingredients taxonomy to be loaded and built before the categories taxonomy.

Collaborator

Then I will leave it as is for now.
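
For readers following this thread, the check under review (take everything after the matched property prefix and reject it if it contains a comma) can be sketched as below. This is Python, not the Perl in Tags.pm; the property prefix `expected_ingredients:en:` is an assumption modeled on `expected_nutriscore_grade:en:`, and `parse_expected_ingredient` is a hypothetical helper. As discussed, the sketch does not verify that the ingredient exists in the ingredients taxonomy.

```python
import re

# Assumed property prefix (hypothetical), modeled on expected_nutriscore_grade:en:
PREFIX = re.compile(r"^expected_ingredients:en:\s*")


def parse_expected_ingredient(line):
    """Return the single ingredient that follows the prefix, or None when the
    line does not match, is empty, or lists more than one ingredient."""
    match = PREFIX.match(line)
    if match is None:
        return None
    rest = line[match.end():].strip()  # everything after the matched prefix
    if not rest or "," in rest:
        return None  # the line should contain a single ingredient
    return rest
```

Returning `None` for a rejected line keeps the caller free to log the problem and carry on, which is how the thread describes the handling of such errors.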

@benbenben2
Collaborator

Fixes #8353

1/ Remark: Nutri-Score grades are the letters a, b, c, d and e, whereas Nutri-Score scores are numbers. Hence, I renamed expected_nutriscore:en:c (suggested in the issue) to expected_nutriscore_grade:en:c

2/ Remark 2: the following ingredients exist: en:extra-virgin olive oil, en:virgin olive oil and en:olive oil.
Do we really want to expect en:olive oil (rather than extra-virgin olive oil) as the ingredient of the en:Extra-virgin olive oils category?

3/ Tried to prevent incorrect tags in Tags.pm (although these errors are ignored)

Screenshot from 2023-05-07 00-31-23


Screenshot from 2023-05-07 00-18-01


Screenshot from 2023-05-07 00-03-43

4/ Tried to differentiate between cases (missing Nutri-Score grade, or which grade was found instead of the expected one) in the quality error names to allow more "fine-grained" monitoring. @CharlesNepote will that be a burden for the data_quality taxonomy? (There could be variants for every category that has these 2 tags, with different Nutri-Score grade values.) See DataQualityFood.pm.


Screenshot from 2023-05-06 16-26-35


Screenshot from 2023-05-06 16-25-49


@codecov-commenter

codecov-commenter commented May 8, 2023

Codecov Report

Merging #8360 (1b036da) into main (5326436) will increase coverage by 0.07%.
The diff coverage is 82.97%.

@@            Coverage Diff             @@
##             main    #8360      +/-   ##
==========================================
+ Coverage   48.61%   48.68%   +0.07%     
==========================================
  Files         117      117              
  Lines       21751    21798      +47     
  Branches     4852     4859       +7     
==========================================
+ Hits        10574    10613      +39     
- Misses       9880     9888       +8     
  Partials     1297     1297              
| Impacted Files | Coverage Δ |
|---|---|
| lib/ProductOpener/Tags.pm | 40.78% <0.00%> (-0.19%) ⬇️ |
| lib/ProductOpener/DataQualityFood.pm | 64.08% <100.00%> (+1.05%) ⬆️ |
| tests/unit/dataqualityfood.t | 87.70% <100.00%> (+1.59%) ⬆️ |

@github-actions github-actions bot removed the 💥 Merge Conflicts 💥 Merge Conflicts label May 9, 2023
@stephanegigandet
Contributor Author

Great! Thanks a lot for the PR! :)

> 2/ Remark 2: following ingredients exist: en:extra-virgin olive oil, en:virgin olive oil and en:olive oil. Do we really want to expect en:olive oil as ingredient of en:Extra-virgin olive oils category (not extra-virgin olive oil)?

I would check that the actual ingredient is either the expected ingredient, or a child of it. There's a function for it:

is_a("ingredients","en:virgin-olive-oil","en:olive-oil")
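
To make the intended semantics concrete, here is a minimal Python sketch of an `is_a`-style check. It is not the Perl function from Tags.pm: the `TAXONOMIES` table and its contents are hypothetical, and the sketch assumes that a tag counts as matching itself, as "either the expected ingredient, or a child of it" suggests.

```python
# Hypothetical taxonomy data: tag type -> {tag: direct parents}.
TAXONOMIES = {
    "ingredients": {
        "en:extra-virgin-olive-oil": ["en:virgin-olive-oil"],
        "en:virgin-olive-oil": ["en:olive-oil"],
        "en:olive-oil": ["en:vegetable-oil"],
        "en:vegetable-oil": [],
        "en:salt": [],
    }
}


def is_a(tagtype, child, parent):
    """True if child equals parent or has parent among its ancestors."""
    if child == parent:
        return True
    direct_parents = TAXONOMIES.get(tagtype, {}).get(child, [])
    return any(is_a(tagtype, p, parent) for p in direct_parents)
```

Under these assumptions, `is_a("ingredients", "en:virgin-olive-oil", "en:olive-oil")` holds, while the reverse direction and unrelated ingredients such as `en:salt` do not.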

> 4/ Tried to differentiate between cases (missing nutriscore grade, which grade instead of which one) in the quality error names to allow more "fine-grained" monitoring. @CharlesNepote will that be a burden for data_quality taxonomy? (we can have variants for all categories that will have these 2 tags, different nutriscore grade values) see DataQualityFood.pm.

I would suggest having only one generic error that does not specify the category, something like "Nutri-Score does not match the Nutri-Score expected for the category". For monitoring, we can always add a /categories facet. If we put too many things in the tag, it becomes impossible to translate.

One thing to note is that we show data errors very prominently on the producers platform, which is why we try to translate them.

@benbenben2
Collaborator

@stephanegigandet

Reduced to 1 error for nutrition grade and 1 error for ingredient

Included is_a:

test 1: olive oil -> no error
test 2: extra virgin olive oil -> no error
test 3: (no ingredients) -> error
test 4: salt -> error
test 5: olive oil, olive oil -> error
test 6: language switched to FR, huile d'olive -> no error

nutriscore E -> error
nutriscore D -> error
nutriscore C -> no error
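
The cases listed above can be reproduced with a short Python sketch. It is an illustration only, not the code in DataQualityFood.pm: the `PARENTS` table, `ingredient_mismatch` and `grade_mismatch` are hypothetical, and the sketch works on already-canonicalized tags, so the French case (test 6) is out of scope here.

```python
# Hypothetical fragment of the ingredients taxonomy: tag -> direct parents.
PARENTS = {
    "en:extra-virgin-olive-oil": ["en:virgin-olive-oil"],
    "en:virgin-olive-oil": ["en:olive-oil"],
    "en:olive-oil": [],
    "en:salt": [],
}


def is_a(child, parent):
    """True if child equals parent or descends from it."""
    return child == parent or any(is_a(p, parent) for p in PARENTS.get(child, []))


def ingredient_mismatch(actual_ingredients, expected):
    """True when the product should be flagged: it does not have exactly one
    ingredient, or that ingredient is neither the expected one nor a child."""
    if len(actual_ingredients) != 1:
        return True
    return not is_a(actual_ingredients[0], expected)


def grade_mismatch(actual_grade, expected_grade):
    """True when the computed grade differs from the category's expected grade."""
    return actual_grade != expected_grade


# (actual ingredients, should an error be raised?) for tests 1 to 5 above.
CASES = [
    (["en:olive-oil"], False),
    (["en:extra-virgin-olive-oil"], False),
    ([], True),
    (["en:salt"], True),
    (["en:olive-oil", "en:olive-oil"], True),
]
RESULTS = [ingredient_mismatch(actual, "en:olive-oil") for actual, _ in CASES]
```

Keeping one boolean per check matches the decision in this thread to raise a single generic error for the ingredient and a single one for the grade, rather than one tag per category or per grade pair.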

@benbenben2
Collaborator

benbenben2 commented May 19, 2023

image

@github-actions github-actions bot added 💥 Merge Conflicts 💥 Merge Conflicts and removed 🚦Nutri-Score https://world.openfoodfacts.org/nutriscore labels May 20, 2023
@github-actions github-actions bot removed the 💥 Merge Conflicts 💥 Merge Conflicts label May 26, 2023
Member

@CharlesNepote CharlesNepote left a comment

If @stephanegigandet agrees, I think we can go on. Thanks A LOT @benbenben2, this is a very important improvement IMHO.

# add following tag for category having always same nutriscore grade
# only 1 letter is allowed
# expected_nutriscore_grade:en:c
# add following tag for category having always same ingredient
Member

Suggested change
# add following tag for category having always same ingredient
# add following tag for category having always same ingredient

@stephanegigandet
Contributor Author

Updated the test results to make the tests pass.

@alexgarel alexgarel changed the title chore: test for nutriscore of olive oils without ingredients feat: quality test for nutriscore on olive oils Jun 16, 2023
@alexgarel alexgarel enabled auto-merge (squash) June 16, 2023 09:47
@github-actions github-actions bot added the 💥 Merge Conflicts 💥 Merge Conflicts label Jun 19, 2023
@teolemon
Member

@stephanegigandet merge conflict

@benbenben2
Collaborator

Thank you @stephanegigandet! I was stuck

@benbenben2
Collaborator

  • Merging with main locally, there are no conflicts but "make cover" leads to many errors:

tests/unit/attributes.t (Wstat: 2048 Tests: 11 Failed: 8)
Failed tests: 1-2, 4-9
Non-zero exit status: 8
tests/unit/ecoscore.t (Wstat: 3840 Tests: 69 Failed: 15)
Failed tests: 29-36, 41, 47-48, 53-54, 56, 63
Non-zero exit status: 15
tests/unit/forest_footprint.t (Wstat: 1024 Tests: 9 Failed: 4)
Failed tests: 4, 7-9
Non-zero exit status: 4
tests/unit/ingredients.t (Wstat: 4096 Tests: 53 Failed: 16)
Failed tests: 1, 3, 15, 17, 24, 32, 38-42, 44, 50-53
Non-zero exit status: 16
tests/unit/ingredients_processing.t (Wstat: 256 Tests: 61 Failed: 1)
Failed test: 61
Non-zero exit status: 1
tests/unit/nutriscore.t (Wstat: 1024 Tests: 27 Failed: 4)
Failed tests: 9, 15-16, 21
Non-zero exit status: 4
tests/unit/packaging.t (Wstat: 5120 Tests: 69 Failed: 20)
Failed tests: 12-15, 26, 29, 34, 41-42, 45-46, 50-56, 62, 66
Non-zero exit status: 20
tests/unit/recipes.t (Wstat: 1280 Tests: 8 Failed: 5)
Failed tests: 1-2, 4, 6-7
Non-zero exit status: 5

  • "make update_tests_results" runs until:

Waited too much for backend at /opt/product-opener/lib/ProductOpener/APITest.pm line 128.
ProductOpener::APITest::wait_server() called at /opt/product-opener/lib/ProductOpener/APITest.pm line 141
ProductOpener::APITest::wait_application_ready() called at integration/search_v1.t line 18
make: *** [Makefile:291: update_tests_results] Error 22

  • Now, "make cover" works fine
  • Commit and push
  • Now fails with this:

tests/integration/api_v2_product_read.t (Wstat: 1280 Tests: 43 Failed: 5)
Failed tests: 5, 29, 32, 35, 38
Non-zero exit status: 5
tests/integration/api_v2_product_write.t (Wstat: 512 Tests: 16 Failed: 2)
Failed tests: 5, 11
Non-zero exit status: 2
tests/integration/api_v3_product_read.t (Wstat: 1280 Tests: 48 Failed: 5)
Failed tests: 5, 32, 35, 38, 41
Non-zero exit status: 5
tests/integration/data_quality_knowledge_panel.t (Wstat: 512 Tests: 6 Failed: 2)
Failed tests: 2, 5
Non-zero exit status: 2
tests/integration/protected_product.t (Wstat: 1536 Tests: 34 Failed: 6)
Failed tests: 9, 15, 19, 23, 29, 33
Non-zero exit status: 6

Need help on this @stephanegigandet @alexgarel @teolemon

@github-actions github-actions bot removed the 💥 Merge Conflicts 💥 Merge Conflicts label Jul 4, 2023
@alexgarel
Member

@benbenben2 I pushed 5084698, which:

  • fixes the taxonomy name
  • updates the test results.

In your previous version, lots of expected_tests_results had "en:ingredients-single-ingredient-from-category-does-not-match-actual-ingredients" added, without any reason for that (like in cookies for example).

@sonarcloud

sonarcloud bot commented Jul 6, 2023

Kudos, SonarCloud Quality Gate passed!

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 0 Code Smells

No Coverage information
No Duplication information

@alexgarel alexgarel merged commit 415d68c into main Jul 6, 2023
14 checks passed
@alexgarel alexgarel deleted the olive-oil-test branch July 6, 2023 13:47
@benbenben2
Collaborator

Thanks a lot @alexgarel 🥇

Labels
categories 🧽 Data quality https://wiki.openfoodfacts.org/Quality Tags 🧬 Taxonomies https://wiki.openfoodfacts.org/Global_taxonomies 🧪 tests
Projects
Status: Done
6 participants