[POC] Return parsed ingredients (name, unit, quantity) #733

Open: lizozom wants to merge 3 commits into main

@lizozom lizozom commented Feb 12, 2023

Currently ingredients are returned as unprocessed strings.
I'm proposing a change to the existing ingredients API so that it returns an array of objects with the following structure (using quantulum3):

[{
    'name': 'white sugar', 
    'quantity': 2.0, 
    'unit': 'tablespoon'
}]

We could also make this a non-breaking change by adding an optional input parameter or by separating this into two APIs.
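
The optional-parameter variant could look roughly like the sketch below. Everything here is an assumption for illustration: `ExampleScraper`, the `parsed` flag, and the toy regex in `parse_ingredient` are hypothetical and are not part of recipe-scrapers or quantulum3. The toy parser only handles strings of the form "<number> <unit> <name>" and does not normalize units.

```python
import re
from typing import Dict, List, Union

# Hypothetical toy pattern: "<number> <unit> <name>", e.g. "2 tablespoons white sugar".
_PATTERN = re.compile(
    r"^\s*(?P<quantity>\d+(?:\.\d+)?)\s+(?P<unit>[A-Za-z]+)\s+(?P<name>.+?)\s*$"
)

ParsedIngredient = Dict[str, Union[str, float, None]]


def parse_ingredient(text: str) -> ParsedIngredient:
    """Split one ingredient string into name, quantity, and unit (toy logic)."""
    match = _PATTERN.match(text)
    if match is None:
        # Nothing recognizable: keep the whole string as the name.
        return {"name": text.strip(), "quantity": None, "unit": None}
    return {
        "name": match["name"],
        "quantity": float(match["quantity"]),
        "unit": match["unit"],
    }


class ExampleScraper:
    """Hypothetical stand-in for a scraper class."""

    def __init__(self, raw_ingredients: List[str]) -> None:
        self._raw = list(raw_ingredients)

    def ingredients(self, parsed: bool = False) -> Union[List[str], List[ParsedIngredient]]:
        # Default behaviour is unchanged: return the unprocessed strings.
        if not parsed:
            return list(self._raw)
        # Opt-in behaviour: return structured objects.
        return [parse_ingredient(line) for line in self._raw]
```

Existing callers that use `ingredients()` with no arguments would keep getting strings; only callers passing `parsed=True` would see the new structure.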


This PR only updates a few tests to demonstrate the change. If the community agrees, I can update all the other tests/parsers and the documentation.

Curious to hear what you think!

lizozom (Author) commented Feb 13, 2023

@hhursev would you be interested in this type of improvement for the project?
If so, I could work on it.

hhursev (Owner) commented Feb 24, 2023

Hey!

I'm interested in what a quantulum3-like package can do out of the box on top of our .ingredients() method!
Does it need numpy, scipy, or sklearn to be installed?

I feel like if you are happy with the results you should continue on this idea! I'm thinking the proper approach for us is:

  1. If the user installs the package with pip install recipe-scrapers:
<scraper>.ingredients()   # returns our current output: the unprocessed strings
  2. If the user installs with pip install recipe-scrapers[extras]:
<scraper>.ingredients()   # returns the parsed structure you are suggesting

So, in a sense, what you are proposing won't be in the core package, but it may override the default .ingredients() method depending on how the package was installed.
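
One way the install-dependent behaviour described above might be wired up is to check at runtime whether the optional dependency is importable. This is a minimal sketch under stated assumptions: `extras_installed`, `ExampleScraper`, and `build_scraper` are hypothetical names, not recipe-scrapers APIs, and the parser is passed in as a plain callable so the sketch does not depend on any third-party package.

```python
import importlib.util
from typing import Callable, List, Optional


def extras_installed(module_name: str = "quantulum3") -> bool:
    """Return True if the optional top-level module can be found."""
    return importlib.util.find_spec(module_name) is not None


class ExampleScraper:
    """Hypothetical stand-in for a scraper class."""

    def __init__(
        self,
        raw_ingredients: List[str],
        parser: Optional[Callable[[str], dict]] = None,
    ) -> None:
        self._raw = list(raw_ingredients)
        self._parser = parser

    def ingredients(self) -> list:
        # Core install: no parser available, so return the raw strings.
        if self._parser is None:
            return list(self._raw)
        # Extras install: return one parsed object per ingredient.
        return [self._parser(line) for line in self._raw]


def build_scraper(
    raw_ingredients: List[str],
    parser: Callable[[str], dict],
    module_name: str = "quantulum3",
) -> ExampleScraper:
    """Enable the parser only when the optional dependency is present."""
    active = parser if extras_installed(module_name) else None
    return ExampleScraper(raw_ingredients, active)
```

A drawback of this design is that the return type of `.ingredients()` depends on the environment, so code written against one install mode may break under the other.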

@dragonpop76

Personally I'd love something like this!

@anguswg-ucsb

@hhursev @lizozom
For what it's worth, the ingredient_slicer package provides this functionality using only base Python and Python's standard library. @hhursev If you wanted to implement some sort of quantity/unit extraction in the ingredients() method WITHOUT taking on any new dependencies (besides ingredient_slicer itself), this would be a good fit. The package is thoroughly tested and works really well for extracting units and quantities from ingredient strings.

An example from the README:

import ingredient_slicer

slicer = ingredient_slicer.IngredientSlicer("2 (15-ounces) cans chickpeas, rinsed and drained")

slicer.to_json()

{   
    'ingredient': '2 (15-ounces) cans chickpeas, rinsed and drained', 
    'standardized_ingredient': '2 cans chickpeas, rinsed and drained', 
    'food': 'chickpeas', 

    # primary quantity and units
    'quantity': '30', 
    'unit': 'ounces', 
    'standardized_unit': 'ounce', 

    # any other secondary quantity and units found in the string
    'secondary_quantity': '2', 
    'secondary_unit': 'cans', 
    'standardized_secondary_unit': 'can', 

    'gram_weight': '850.49', 
    'prep': ['drained', 'rinsed'], 
    'size_modifiers': [], 
    'dimensions': [], 
    'is_required': True, 
    'parenthesis_content': ['15 ounce']
}
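
To connect this output to the structure proposed at the top of the PR, a dictionary like the one above could be reduced to the three fields `name`, `quantity`, and `unit`. The sketch below is an assumption for illustration only: `to_parsed_ingredient` is a hypothetical helper, it works on a plain dict with the keys shown in the README example, and it assumes the `quantity` value is a string that `float()` can read.

```python
from typing import Optional, Union, Dict


def to_parsed_ingredient(sliced: dict) -> Dict[str, Union[str, float, None]]:
    """Map a slicer-style dict onto the {name, quantity, unit} shape."""
    quantity = sliced.get("quantity")
    parsed_quantity: Optional[float] = (
        float(quantity) if quantity not in (None, "") else None
    )
    return {
        "name": sliced.get("food"),
        "quantity": parsed_quantity,
        # Prefer the singular, standardized unit when it is present.
        "unit": sliced.get("standardized_unit") or sliced.get("unit"),
    }


# Subset of the README output shown above.
example = {
    "food": "chickpeas",
    "quantity": "30",
    "unit": "ounces",
    "standardized_unit": "ounce",
}
print(to_parsed_ingredient(example))
```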

Note: I am the author of this package and I'm shilling it because it works better than any other open source solution I could find (there are other good ingredient parsers out there, but most of them require large additional dependencies) and it filled a big hole in my work.

4 participants