# Getting all Ingredients in a Text File List

## Author: James Christopher Wolfe

This Python Notebook reads `basic-pantry-101.txt`, extracts the ingredients, and writes them to text files, one file per category of pantry items.

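*WARNING*: Use Python 3.6 or later. `pathlib` was added in Python 3.4, but passing a `Path` object to `open()`, as this notebook does, requires 3.6.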

If you don't have this file in this directory, go through the `basic-pantry-grab_info.ipynb` notebook and run it to generate the file.

### Check and see if `basic-pantry-101.txt` exists

For this Python Notebook to work you need `basic-pantry-101.txt` generated by the `basic-pantry-grab_info.ipynb` file.

In [178]:
from pathlib import Path

my_file = Path("basic-pantry-101.txt")

if my_file.exists():
    print("File exists, carry on!")
else:
    print("'basic-pantry-101.txt' does not exist")
    print("Go through the 'basic-pantry-grab_info.ipynb' to generate the text file")
    

File exists, carry on!


### Import the regular expression module

Before opening the `basic-pantry-101.txt` file we'll need to import the module for regular expressions. If you want to learn and practice regular expressions, go to [Regex One](https://regexone.com/). There is also a SQL version of the site called [SQL Bolt](https://sqlbolt.com/), should that interest you!

[Regular Expression 101](https://regex101.com/) really helps with figuring out what expressions to write in this code!

In [179]:
import re
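
As a quick illustration of the kind of patterns used later in this notebook, here is a minimal sketch that tries two expressions against lines copied from `basic-pantry-101.txt`. The `sample_lines` and `results` names are made up for this example and are not used elsewhere in the notebook.

```python
import re

# Sample lines copied from the pantry file (hypothetical list for this demo).
sample_lines = [
    "~ Oils, Vinegars and Condiments",
    "-- Oils: canola oil, extra-virgin olive oil, toasted sesame",
    "-- Ketchup",
]

results = []
for line in sample_lines:
    is_entry = bool(re.search(r"^--\s", line))  # Does the line start with "-- "?
    label = re.search(r"\w+:", line)            # Is there a "Word:" label in the line?
    results.append((is_entry, label.group() if label else None))

print(results)
```

Only the second line is both an entry and carries a label, which is the case the parsing function further down has to split into separate ingredients.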

### Start reading the file

We will now open `basic-pantry-101.txt` and read it. We'll also print out the file just to check it.

In [180]:
file_handler = open(my_file, "r")
print(file_handler.read())
file_handler.close()

~ Oils, Vinegars and Condiments

-- Oils: canola oil, extra-virgin olive oil, toasted sesame
-- Vinegars: balsamic, distilled white, red wine, rice
-- Ketchup
-- Mayonnaise
-- Dijon mustard
-- Soy sauce
-- Chili paste
-- Hot sauce
-- Worcestershire
 
~ Seasonings

-- Kosher salt

-- Black peppercorns

-- Dried herbs and spices: bay leaves, cayenne pepper, crushed red pepper, cumin, ground coriander, oregano, paprika, rosemary, thyme leaves, cinnamon, cloves, allspice, ginger, nutmeg

-- Spice blends: chili powder, curry powder, Italian seasoning

-- Vanilla extract
~ 
Canned Goods and Bottled Items

-- Canned beans: black, cannellini, chickpeas, kidney

-- Capers

-- Olives

-- Peanut butter

-- Preserves or jelly

-- Low-sodium stock or broth

-- Canned tomatoes

-- Tomatoes, canned and paste

-- Salsa

-- Tuna fish


~ Grains and Legumes

-- Breadcrumbs: regular, panko

-- Couscous

-- Dried lentils

-- Pasta: regular, whole wheat

-- Rice

-- Rolled oats

-- One other dried grain: t

### Open file and parse each section and print to check.

We'll create a function called `section()` that reads the file into a string, splits it into a list of sections with a regular expression, and returns the section at the given index.

**References**: (1) [How do I read a specific line of a .txt file with Python 3? I only want to read one line and save it as a variable and not all the lines.](https://www.quora.com/How-do-I-read-a-specific-line-of-a-txt-file-with-Python-3-I-only-want-to-read-one-line-and-save-it-as-a-variable-and-not-all-the-lines?share=1) (2) [Regex One](https://regexone.com/) (3) A Python Server in Discord (4) [Regular expression operations](https://docs.python.org/3/library/re.html)

In [181]:
# r"^(~\WOil.*)$" matches "~ Oils, Vinegars and Condiments"
# r"=*=" matches "====================================" and is what the code uses;
#  the anchored form r"^=*=$" only matches a line inside the file with re.MULTILINE
# r"^(--\WOil.):" matches "-- Oils"
# r"^(--\WV.*):" matches "-- Vinegars"

def section(regex=r"=*=", filename=my_file, index=0):
    # The "with" block closes the file for us, so no explicit close() is needed.
    with open(filename, "r") as file_handler:
        part = file_handler.read()
        part = re.split(regex, part)[index]
    return part

oil_vin_cond = section(index=0)
# print(type(oil_vin_cond))
season = section(index=1)
canned_bottled_items = section(index=2)
grains_legumes = section(index=3)
baking_prod = section(index=4)
refrig = section(index=5)
freezer = section(index=6)
stor_prod = section(index=7)

# print(oil_vin_cond, season, canned_bottled_items, grains_legumes, baking_prod, refrig, freezer, stor_prod)
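
`section()` re-opens and re-splits the file for every index. As a minimal sketch of an alternative, the text can be split once and the pieces picked out of the resulting list. This assumes the sections are separated by a run of `=` characters, as the comment in the cell above describes. `split_sections` and `sample_text` are hypothetical names, and unlike `section()` this version also strips the whitespace around each piece.

```python
import re

def split_sections(text, regex=r"=*="):
    """Split text on the separator pattern and strip surrounding whitespace."""
    return [part.strip() for part in re.split(regex, text)]

# A tiny stand-in for the real file contents (assumed layout).
sample_text = "-- Ketchup\n-- Mayonnaise\n====\n-- Kosher salt\n====\n-- Capers\n"
sections = split_sections(sample_text)
print(sections)
```

With the real file, the eight variables above could then be taken from one list instead of eight separate reads.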

### Write a function that writes or rewrites a file and optionally reads it back
We'll write a function called `read_write` that creates or overwrites a text file, and that can also print the file back if you set the `print_file` boolean parameter to `True`, just to check. **Warning**: Can't be used in a loop.

In [182]:
def read_write(str_to_file, filename, mode="w", print_file=False):
    if mode in ("w", "a"):
        with open(filename, mode) as file_hand: # Works for both mode "a" or "w"
            file_hand.write(str_to_file)
        if print_file:
            with open(filename, "r") as file_hand:
                print(file_hand.read())
    elif mode == "r":
        try:
            with open(filename, "r") as file_hand:
                print(file_hand.read())
        except FileNotFoundError:
            print("File does not exist, make it!")
    else:
        print("Do not accept any modes other than 'r' or 'w' or 'a'")
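
For the plain "overwrite, then check" case, the same round trip can be sketched with `pathlib`, which this notebook already imports. This is only an illustration under the assumption that the whole text fits in memory; `write_then_read` and the `demo-pantry.txt` file name are hypothetical, and the file is created in a temporary directory so nothing in the notebook folder is touched.

```python
import tempfile
from pathlib import Path

def write_then_read(text, path):
    """Write text to path (overwriting it) and return what was written."""
    path = Path(path)
    path.write_text(text)      # Opens, writes, and closes the file.
    return path.read_text()    # Opens, reads, and closes the file.

with tempfile.TemporaryDirectory() as tmp_dir:
    demo_file = Path(tmp_dir) / "demo-pantry.txt"  # Hypothetical file name.
    round_trip = write_then_read("Ketchup\nMayonnaise\n", demo_file)
    print(round_trip)
```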

### Make a function that parses a file into a string

*Disclaimer*: I created this after the section that creates `oils-vinegars-condiments.txt`, once I had figured out how to write the parsing function, but placed it here for organization purposes.

In [183]:
# Disclaimer, no need to worry about "~ Oils,..." or the like.
def print_parsed_file(filename, option_name=(None,)): # option_name is e.g: Vinegar, Frozen, Red, Breadcrumbs, etc.
    name_type = "" # Holds the element of "option_name" that matches the current pantry food type.
    # Note: name_type keeps its previous value when a later label matches nothing in "option_name".

    new_list = list() # We are going to store each ingredient here, then combine them into one string at the end.

    with open(filename) as hand: # We get filename from the function's param; "with" closes the file for us.
        for line in hand: # For every line read by the file handler.
            line = line.rstrip() # Get rid of the whitespace to the right of the line.
            if re.search(r'^--\s', line): # Search for this "-- " at start of the line.
                regex = re.search(r'\w+:', line) # Search for "[one or more word characters]:" <- and the ":"
                if regex: # The match is kept in "regex" so its text can be checked below.
                    for name in option_name: # Any name in option_name?
                        if option_name[0] is not None and name in regex.group().lower():
                            # regex.group() is the matched text, e.g. "Oils:".
                            name_type = name # Will be used in the for loop below.
                    line = line[line.find(":") + 2:] # Get rid of the label and the ": " in the line.
                    line = re.split(r",\s", line) # Split e.g: "regular, whole wheat" by the ", " .
                    for element in line: # This is the e.g: "balsamic, distilled white" split in the line list.
                        if name_type in element: # E.g: If "canola oil" has "oil" in it or not.
                            new_list.append(element + "\n") # Just add to new_list.
                        else:
                            # E.g: No "oil" in "canola", add it! Then put the full name in the list.
                            new_list.append(element + " " + name_type + "\n")
                else:
                    new_list.append(line[3:] + "\n") # Just add entry after "-- " portion of the line.
    return ''.join(new_list) # Make a string out of the new_list.
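
To see the parsing idea on a few lines without touching any files, here is a minimal sketch that works on a list of strings. `parse_entries` and `sample` are hypothetical names. Unlike the function above, it picks the suffix fresh for every line, so a label that matches nothing in `option_names` does not inherit the suffix from an earlier line.

```python
import re

def parse_entries(lines, option_names=()):
    """Expand "-- Label: a, b" lines into one ingredient per item."""
    entries = []
    for line in lines:
        line = line.rstrip()
        if not re.search(r"^--\s", line):
            continue                      # Skip headings and blank lines.
        label = re.search(r"\w+:", line)
        if not label:
            entries.append(line[3:])      # Plain entry such as "-- Ketchup".
            continue
        # First option name found in the label, or "" when none matches.
        suffix = next((n for n in option_names if n in label.group().lower()), "")
        for item in re.split(r",\s", line[line.find(":") + 2:]):
            # "" is contained in every string, so items stay as-is without a suffix.
            entries.append(item if suffix in item else item + " " + suffix)
    return entries

sample = [
    "~ Oils, Vinegars and Condiments",
    "-- Oils: canola oil, toasted sesame",
    "-- Vinegars: balsamic, rice",
    "-- Ketchup",
]
parsed = parse_entries(sample, ["oil", "vinegar"])
print(parsed)
```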

### Start with `oil_vin_cond` and make an `oils-vinegars-condiments.txt` file
We'll make an `oils-vinegars-condiments.txt` file out of the `oil_vin_cond` string variable, though we also have to clean it up a bit. We are going to take the value of `oil_vin_cond`, process it, and assign the result back to `oil_vin_cond`. Then we rewrite `oils-vinegars-condiments.txt` using the new `oil_vin_cond`.

In [184]:
food_file = "oils-vinegars-condiments.txt"

read_write(oil_vin_cond, food_file)

In [185]:
food_type = ["oil", "vinegar"] # Oils and Vinegars are in the names of pantry ingredients.
# Rewriting the old "oil_vin_cond" with a new value from the
#  "print_parsed_file()" function
oil_vin_cond = print_parsed_file(food_file, food_type)

# Bear with me! We are rewriting the old "oils-vinegars-condiments.txt"
#  with the re-written "oil_vin_cond" variable!
# We need to use the "read_write()" function (again) to do this.
read_write(oil_vin_cond, food_file, print_file=True)
# Run this cell once, or you have to start at the
#  "Open file and parse each section and print to check." section and down again.

canola oil
extra-virgin olive oil
toasted sesame oil
balsamic vinegar
distilled white vinegar
red wine vinegar
rice vinegar
Ketchup
Mayonnaise
Dijon mustard
Soy sauce
Chili paste
Hot sauce
Worcestershire



### Working with the `season` variable and the `seasonings.txt` file next

*Hint*: When starting this Python Notebook again, run the cells from the ``Check and see if `basic-pantry-101.txt` exists`` section downwards, but skip the cells in the section that creates `oils-vinegars-condiments.txt`. You don't need to rewrite `oils-vinegars-condiments.txt` if you have already created and parsed it.

We'll make a `seasonings.txt` file out of the `season` string variable, following the same logic as for `oils-vinegars-condiments.txt`.

In [186]:
food_file = "seasonings.txt"

read_write(season, food_file)

In [187]:
food_type = [None]
season = print_parsed_file(food_file, food_type)

read_write(season, food_file, print_file=True)

Kosher salt
Black peppercorns
bay leaves
cayenne pepper
crushed red pepper
cumin
ground coriander
oregano
paprika
rosemary
thyme leaves
cinnamon
cloves
allspice
ginger
nutmeg
chili powder
curry powder
Italian seasoning
Vanilla extract



### Working on `canned_bottled_items` variable and the `canned-and-bottled-goods.txt` file

In [188]:
food_file = "canned-and-bottled-goods.txt"
read_write(canned_bottled_items, food_file)

In [189]:
food_type = ["beans"]
canned_bottled_items = print_parsed_file(food_file, food_type)

read_write(canned_bottled_items, food_file, print_file=True)

black beans
cannellini beans
chickpeas beans
kidney beans
Capers
Olives
Peanut butter
Preserves or jelly
Low-sodium stock or broth
Canned tomatoes
Tomatoes, canned and paste
Salsa
Tuna fish



We need to customize `canned_bottled_items` to deal with "Tomatoes, canned and paste" and with any line containing " or ". Then we simply rewrite the `canned-and-bottled-goods.txt` file.

In [190]:
new_list = list()
with open(food_file, "r") as food_hand: # "with" closes the file when the block ends.
    for line in food_hand:
        line = line.rstrip()
        if "Tomatoes," in line:
            new_list.append("Tomatoes\n")
            new_list.append("Tomato paste\n")
            continue
        if " or " in line:
            line = line.split(" or ")
            for item in line:
                new_list.append(item + "\n")
        else:
            new_list.append(line + "\n")
canned_bottled_items = "".join(new_list)
print(canned_bottled_items)

black beans
cannellini beans
chickpeas beans
kidney beans
Capers
Olives
Peanut butter
Preserves
jelly
Low-sodium stock
broth
Canned tomatoes
Tomatoes
Tomato paste
Salsa
Tuna fish
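
The " or " handling in the cell above comes back in later sections with other separators. A minimal sketch of a reusable helper is shown here. `split_entries` is a hypothetical name, it works on a list of strings instead of a file, and it does not cover special cases such as the "Tomatoes," line.

```python
def split_entries(lines, separator=" or "):
    """Return a new list where entries containing separator become separate entries."""
    result = []
    for line in lines:
        result.extend(part.strip() for part in line.split(separator))
    return result

sample = ["Peanut butter", "Preserves or jelly", "Low-sodium stock or broth"]
split_sample = split_entries(sample)
print(split_sample)

# The same helper with a different separator.
freezer_sample = split_entries(["broccoli", "bell pepper and onion mix"], " and ")
print(freezer_sample)
```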



In [191]:
read_write(canned_bottled_items, food_file, print_file=True)

black beans
cannellini beans
chickpeas beans
kidney beans
Capers
Olives
Peanut butter
Preserves
jelly
Low-sodium stock
broth
Canned tomatoes
Tomatoes
Tomato paste
Salsa
Tuna fish



### Work on the `grains_legumes` variable and the `grains-legumes.txt` file

In [192]:
food_file = "grains-legumes.txt"
read_write(grains_legumes, food_file)

In [193]:
food_type = ["breadcrumbs", "pasta", "grains"]
grains_legumes = print_parsed_file(food_file, food_type)

read_write(grains_legumes, food_file, print_file=True)

regular breadcrumbs
panko breadcrumbs
Couscous
Dried lentils
regular pasta
whole wheat pasta
Rice
Rolled oats
try barley pasta
millet pasta
quinoa or wheatberries pasta



Customize `grains-legumes.txt` to remove "try", split entries joined by "or", and shorten "Rolled oats" to "oats".

In [194]:
new_list = list()
with open(food_file, "r") as food_hand: # "with" closes the file when the block ends.
    for line in food_hand:
        line = line.rstrip()
        if "oats" in line:
            new_list.append("oats\n")
            continue
        if "try" in line.lower():
            # Search the lower-cased line so "Try" is found too, then skip past "try ".
            line = line[line.lower().find("try") + 4:]
            new_list.append(line + "\n")
            continue
        if " or " in line:
            line = line.split(" or ")
            for item in line:
                new_list.append(item + "\n")
        else:
            new_list.append(line + "\n")
grains_legumes = "".join(new_list)
print(grains_legumes)

regular breadcrumbs
panko breadcrumbs
Couscous
Dried lentils
regular pasta
whole wheat pasta
Rice
Rolled oats
barley pasta
millet pasta
quinoa
wheatberries pasta



In [195]:
read_write(grains_legumes, food_file, print_file=True)

regular breadcrumbs
panko breadcrumbs
Couscous
Dried lentils
regular pasta
whole wheat pasta
Rice
Rolled oats
barley pasta
millet pasta
quinoa
wheatberries pasta



### Work on the `baking_prod` variable and the `baking-products.txt` file

In [196]:
food_file = "baking-products.txt"
read_write(baking_prod, food_file)

In [197]:
food_type = [None]
baking_prod = print_parsed_file(food_file, food_type)

read_write(baking_prod, food_file, print_file=True)

Baking powder
Baking soda
Brown sugar
Cornstarch
All-purpose flour
Granulated sugar
Honey



### Work on the `refrig` variable and the `refrigerator-basics.txt` file

In [198]:
food_file = "refrigerator-basics.txt"
read_write(refrig, food_file)

In [199]:
food_type = ["cheese"]
refrig = print_parsed_file(food_file, food_type)

read_write(refrig, food_file, print_file=True)

Butter
sharp cheddar cheese
feta cheese
Parmesan cheese
mozzarella cheese
Large eggs
Milk
Plain yogurt
Corn tortillas



### Work on the `freezer` variable and `freezer-basics.txt` file

In [200]:
food_file = "freezer-basics.txt"
read_write(freezer, food_file)

In [201]:
food_type = [None]
freezer = print_parsed_file(food_file, food_type)

read_write(freezer, food_file, print_file=True)

blackberries
blueberries
peaches
strawberries
broccoli
bell pepper and onion mix
corn
edamame
peas
spinach



Modify `freezer-basics.txt` to separate the "bell pepper and onion mix" entry.

In [202]:
new_list = list()
with open(food_file, "r") as food_hand: # "with" closes the file when the block ends.
    for line in food_hand:
        line = line.rstrip()
        if "bell pepper and onion mix" in line:
            line = line.split(" and ")
            for item in line:
                new_list.append(item + "\n")
        else:
            new_list.append(line + "\n")
freezer = "".join(new_list)
print(freezer)

blackberries
blueberries
peaches
strawberries
broccoli
bell pepper
onion mix
corn
edamame
peas
spinach



In [203]:
read_write(freezer, food_file, print_file=True)

blackberries
blueberries
peaches
strawberries
broccoli
bell pepper
onion mix
corn
edamame
peas
spinach



### And Finally, work on the `stor_prod` variable and the `storage-produce.txt` file

In [204]:
food_file = "storage-produce.txt"
read_write(stor_prod, food_file)

In [205]:
food_type = [None]
stor_prod = print_parsed_file(food_file, food_type)

read_write(stor_prod, food_file, print_file=True)

Garlic
Onions (red, yellow)
Potatoes
raisins
apples
apricots
almonds
peanuts
sunflower



Modify `storage-produce.txt` to turn "Onions (red, yellow)" into "onion", "red onion", and "yellow onion". "sunflower" needs the word "seeds" appended.

In [206]:
new_list = list()
with open(food_file, "r") as food_hand: # "with" closes the file when the block ends.
    for line in food_hand:
        line = line.rstrip()
        if "sunflower" in line:
            new_list.append(line + " seeds\n")
            continue
        if re.search(r"Onions\s\(", line): # Raw string, so "\s" and "\(" reach the regex engine as written.
            onion = "onion\n"
            new_list.append(onion)
            new_list.append("red "+ onion)
            new_list.append("yellow "+ onion)
        else:
            new_list.append(line + "\n")
stor_prod = "".join(new_list)
print(stor_prod)

Garlic
onion
red onion
yellow onion
Potatoes
raisins
apples
apricots
almonds
peanuts
sunflower seeds
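
The onion case above is hard-coded. As a minimal sketch of a more general approach, a capturing regular expression can pull the base word and the variants out of any "Word (a, b)" entry. `expand_variants` is a hypothetical helper. Note that it keeps the base word as written (lower-cased "onions"), whereas the cell above writes the singular "onion".

```python
import re

def expand_variants(entry):
    """Turn "Onions (red, yellow)" into ["onions", "red onions", "yellow onions"]."""
    match = re.fullmatch(r"(\w+)\s\((.+)\)", entry)
    if not match:
        return [entry]                        # Not a "Word (a, b)" entry, keep as-is.
    base = match.group(1).lower()             # The word before the parentheses.
    variants = re.split(r",\s*", match.group(2))
    return [base] + [variant + " " + base for variant in variants]

onion_entries = expand_variants("Onions (red, yellow)")
print(onion_entries)
print(expand_variants("Garlic"))
```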



In [207]:
read_write(stor_prod, food_file, print_file=True)

Garlic
onion
red onion
yellow onion
Potatoes
raisins
apples
apricots
almonds
peanuts
sunflower seeds



### Now let us create a master `ingredients.txt` file

We are going to take all the text files except `basic-pantry-101.txt` and make a master list out of them!

In [1]:
import glob

We need to create a list of `txt` files that will be appended (or written) to the master file.

*References*: (1) [Find all files in a directory with extension .txt in Python](https://stackoverflow.com/questions/3964681/find-all-files-in-a-directory-with-extension-txt-in-python#3964691) (2) [Python 3 "glob" documentation](https://docs.python.org/3/library/glob.html)

In [2]:
txt_list = list()
for file in glob.glob("*.txt"):
    if file == "basic-pantry-101.txt":
        continue
    txt_list.append(file)
print(txt_list)

['canned-and-bottled-goods.txt', 'grains-legumes.txt', 'storage-produce.txt', 'seasonings.txt', 'baking-products.txt', 'refrigerator-basics.txt', 'oils-vinegars-condiments.txt', 'ingredients.txt', 'freezer-basics.txt', 'seasoning.txt']
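
The list above includes `ingredients.txt` itself, left over from an earlier run, so the master file would end up among its own inputs. A minimal sketch of a merge that leaves out both the master file and the source file is shown below. `merge_lists` and its parameters are hypothetical, and the demo runs in a temporary directory with made-up file contents.

```python
import tempfile
from pathlib import Path

def merge_lists(folder, master_name="ingredients.txt", skip=("basic-pantry-101.txt",)):
    """Merge every .txt file in folder into master_name, lower-cased."""
    folder = Path(folder)
    # Sorted so the merge order is the same on every run.
    sources = sorted(
        path for path in folder.glob("*.txt")
        if path.name != master_name and path.name not in skip
    )
    merged = "".join(path.read_text().lower() for path in sources)
    (folder / master_name).write_text(merged)
    return [path.name for path in sources]

with tempfile.TemporaryDirectory() as tmp_dir:
    tmp = Path(tmp_dir)
    (tmp / "seasonings.txt").write_text("Kosher salt\n")
    (tmp / "baking-products.txt").write_text("Honey\n")
    (tmp / "basic-pantry-101.txt").write_text("-- Honey\n")
    merged_from = merge_lists(tmp)
    master_text = (tmp / "ingredients.txt").read_text()
    print(merged_from)
    print(master_text)
```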


Finally, let's create the master `ingredients.txt` file and append the contents of the other files into the master file.

*Reference*: [How to merge multiple files into a new file using Python?](https://www.tutorialspoint.com/How-to-merge-multiple-files-into-a-new-file-using-Python)

In [3]:
with open("ingredients.txt", "w") as master_file_handler:
    for fname in txt_list:
        if fname == "ingredients.txt":
            continue # Don't read the master file while we are writing it.
        with open(fname) as infile:
            master_file_handler.write(infile.read().lower())

Let's read the file and see if it worked.

In [4]:
file = open("ingredients.txt")
for line in file:
    print(line)
file.close()

black beans

cannellini beans

chickpeas beans

kidney beans

capers

olives

peanut butter

preserves

jelly

low-sodium stock

broth

canned tomatoes

tomatoes

tomato paste

salsa

tuna fish

regular breadcrumbs

panko breadcrumbs

couscous

dried lentils

regular pasta

whole wheat pasta

rice

oats

barley pasta

millet pasta

quinoa

wheatberries pasta

garlic

onion

red onion

yellow onion

potatoes

raisins

apples

apricots

almonds

peanuts

sunflower seeds

kosher salt

black peppercorns

bay leaves

cayenne pepper

crushed red pepper

cumin

ground coriander

oregano

paprika

rosemary

thyme leaves

cinnamon

cloves

allspice

ginger

nutmeg

chili powder

curry powder

italian seasoning

vanilla extract

baking powder

baking soda

brown sugar

cornstarch

all-purpose flour

granulated sugar

honey

butter

sharp cheddar cheese

feta cheese

parmesan cheese

mozzarella cheese

large eggs

milk

plain yogurt

corn tortillas

canola oil

extra-virgin olive oil

toasted ses

## Yay!
Master file has been created. In a Python Notebook called `common_ingred_in_recipe.ipynb` we will scan websites like the Food Network to find the common ingredients in Healthy or Well-balanced meals.