# Data Cleaning Notebook

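This notebook loads recipe data for five cuisines (Japanese, Chinese, Indian, Thai and Korean), combines it into a single DataFrame and cleans it. It includes:
- Data loading and initial inspection
- Cleaning of the `cuisine` and `course` columns
- Saving the processed data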

In [None]:
import pandas as pd
import sqlite3
from utils import find_unique_vals, map_to_main_category, clean_column

----

## Data Loading and Initial Inspection

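Load each cuisine's dataset from its own SQLite database and perform a basic inspection. The source headers differ between databases (some carry a trailing colon, e.g. `Total Time:`), so each cell renames them to a shared set of column names.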

In [34]:
# Connect to the SQLite database and retrieve data
conn = sqlite3.connect("./data/japanese_recipes.db")
query = "SELECT * FROM recipes"
df_japanese = pd.read_sql_query(query, conn)
conn.close()

# Columns to keep
df_japanese = df_japanese[
    [
        "title",
        "link",
        "image_url",
        "description",
        "Total Time:",
        "Course:",
        "Cuisine:",
        "ingredients",
        "Calories:",
    ]
]


# Rename columns
df_japanese = df_japanese.rename(
    columns={
        "Total Time:": "total_time",
        "Course:": "course",
        "Cuisine:": "cuisine",
        "Calories:": "calories",
    }
)

# Display basic info about the dataset
print("Dataset Info:")
df_japanese.info()

print("\nSample Data:")
df_japanese.head()

Dataset Info:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 948 entries, 0 to 947
Data columns (total 9 columns):
 #   Column       Non-Null Count  Dtype 
---  ------       --------------  ----- 
 0   title        948 non-null    object
 1   link         948 non-null    object
 2   image_url    948 non-null    object
 3   description  948 non-null    object
 4   total_time   930 non-null    object
 5   course       940 non-null    object
 6   cuisine      940 non-null    object
 7   ingredients  948 non-null    object
 8   calories     926 non-null    object
dtypes: object(9)
memory usage: 66.8+ KB

Sample Data:


Unnamed: 0,title,link,image_url,description,total_time,course,cuisine,ingredients,calories
0,Yuzu Cha (Citron Tea),https://www.justonecookbook.com/yuzu-cha/,https://www.justonecookbook.com/wp-content/upl...,Try my easy homemade recipe for Yuzu Cha (Citr...,1 hour,How to,Japanese,yuzu; white rock sugar; shochu,
1,Japanese Milk Bread (Shokupan),https://www.justonecookbook.com/japanese-milk-...,https://www.justonecookbook.com/wp-content/upl...,Japanese Milk Bread is possibly the best versi...,3 hours,Breakfast,Japanese,warm water; sugar; Diamond Crystal kosher salt...,1645 kcal
2,Chicken Chashu,https://www.justonecookbook.com/chicken-chashu/,https://www.justonecookbook.com/wp-content/upl...,Juicy and tender Chicken Chashu is a lighter v...,1 hour,Main Course,Japanese,"boneless, skin-on chicken thighs; green onions...",
3,Gyudon (Japanese Beef Rice Bowl),https://www.justonecookbook.com/gyudon/,https://www.justonecookbook.com/wp-content/upl...,With thinly sliced beef and tender onions simm...,20 minutes,Main Course,Japanese,onion; green onion/scallion; thinly sliced bee...,657 kcal
4,Japanese Beef Curry,https://www.justonecookbook.com/japanese-beef-...,https://www.justonecookbook.com/wp-content/upl...,"With tender chunks of beef, potatoes, carrots,...",3 hours,Main Course,Japanese,onions; unsalted butter; neutral oil; russet p...,426 kcal


In [35]:
# Connect to the SQLite database and retrieve data
conn = sqlite3.connect("./data/chinese_recipes.db")
query = "SELECT * FROM recipes"
df_chinese = pd.read_sql_query(query, conn)
conn.close()

# Columns to keep
df_chinese = df_chinese[
    [
        "title",
        "link",
        "image_url",
        "description",
        "Total Time:",
        "Course:",
        "Cuisine:",
        "ingredients",
        "Calories:",
    ]
]


# Rename columns
df_chinese = df_chinese.rename(
    columns={
        "Total Time:": "total_time",
        "Course:": "course",
        "Cuisine:": "cuisine",
        "Calories:": "calories",
    }
)

# Display basic info about the dataset
print("Dataset Info:")
df_chinese.info()

print("\nSample Data:")
df_chinese.head()

Dataset Info:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 905 entries, 0 to 904
Data columns (total 9 columns):
 #   Column       Non-Null Count  Dtype 
---  ------       --------------  ----- 
 0   title        905 non-null    object
 1   link         905 non-null    object
 2   image_url    905 non-null    object
 3   description  886 non-null    object
 4   total_time   845 non-null    object
 5   course       901 non-null    object
 6   cuisine      902 non-null    object
 7   ingredients  905 non-null    object
 8   calories     837 non-null    object
dtypes: object(9)
memory usage: 63.8+ KB

Sample Data:


Unnamed: 0,title,link,image_url,description,total_time,course,cuisine,ingredients,calories
0,Easy Oyster Mushroom Stir Fry,https://omnivorescookbook.com/easy-oyster-mush...,https://omnivorescookbook.com/wp-content/uploa...,A super quick and easy oyster mushroom stir fr...,15 minutes,Side Dish,Chinese,oyster mushrooms; peanut oil; garlic; sugar; s...,85 kcal
1,Honey Glazed Salmon,https://omnivorescookbook.com/honey-soy-sauce-...,https://omnivorescookbook.com/wp-content/uploa...,A simple yet rich tasting honey glazed salmon ...,55 minutes,Main,Chinese Fusion,salmon filets; salt; sugar; honey; Shaoxing wi...,445 kcal
2,Shrimp Toast,https://omnivorescookbook.com/shrimp-toast/,https://omnivorescookbook.com/wp-content/uploa...,Make these crispy savory shrimp toasts as an a...,40 minutes,Appetizer,Chinese,shrimp; egg white; ginger; garlic; light soy s...,234 kcal
3,Garlic Fried Rice,https://omnivorescookbook.com/garlic-fried-rice/,https://omnivorescookbook.com/wp-content/uploa...,A Chinese style garlic fried rice featuring cr...,25 minutes,Side,Chinese,of leftover cooked jasmine rice; soy sauce; oy...,239 kcal
4,Chicken with Garlic Sauce,https://omnivorescookbook.com/chicken-with-gar...,https://omnivorescookbook.com/wp-content/uploa...,Chicken with garlic sauce is a super easy take...,30 minutes,Main,Chinese,chicken breasts or thighs; Shaoxing wine; salt...,248 kcal


In [36]:
# Connect to the SQLite database and retrieve data
conn = sqlite3.connect("./data/indian_recipes.db")
query = "SELECT * FROM recipes"
df_indian = pd.read_sql_query(query, conn)
conn.close()

# Columns to keep
df_indian = df_indian[
    [
        "title",
        "link",
        "image_url",
        "description",
        "Total Time",
        "Course",
        "Cuisine",
        "ingredients",
        "Calories:",
    ]
]


# Rename columns
df_indian = df_indian.rename(
    columns={
        "Total Time": "total_time",
        "Course": "course",
        "Cuisine": "cuisine",
        "Calories:": "calories",
    }
)

# Display basic info about the dataset
print("Dataset Info:")
df_indian.info()

print("\nSample Data:")
df_indian.head()

Dataset Info:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 430 entries, 0 to 429
Data columns (total 9 columns):
 #   Column       Non-Null Count  Dtype 
---  ------       --------------  ----- 
 0   title        430 non-null    object
 1   link         430 non-null    object
 2   image_url    430 non-null    object
 3   description  425 non-null    object
 4   total_time   410 non-null    object
 5   course       429 non-null    object
 6   cuisine      429 non-null    object
 7   ingredients  430 non-null    object
 8   calories     420 non-null    object
dtypes: object(9)
memory usage: 30.4+ KB

Sample Data:


Unnamed: 0,title,link,image_url,description,total_time,course,cuisine,ingredients,calories
0,Kala Chana Curry {Gujarati Rasawala Kala Chana...,https://ministryofcurry.com/kala-chana-curry/,https://ministryofcurry.com/wp-content/uploads...,Hearty Kala Chana Curry made with black chickp...,9 hours,dinner,Indian,dry black chana; water; oil; mustard seeds; hi...,246 kcal
1,"Chilli Tofu {Bold Flavors, Light Twist}",https://ministryofcurry.com/chilli-tofu/,https://ministryofcurry.com/wp-content/uploads...,A light spin on Chilli Paneer by using tofu an...,30 minutes,"dinner, Lunch",Indo-Chinese,extra firm tofu; Kashmiri red chili powder; ko...,208 kcal
2,Quick & Easy Khichdi: Perfect for Cozy Evening...,https://ministryofcurry.com/khichdi/,https://ministryofcurry.com/wp-content/uploads...,This simple khichdi recipe makes for a nourish...,30 minutes,dinner,Indian,short-grain rice; moong dal; water; kosher sal...,270 kcal
3,Pomegranate Mojito Recipe,https://ministryofcurry.com/pomegranate-mojito/,https://ministryofcurry.com/wp-content/uploads...,"Twist to the classic mojito, this Pomegranate ...",,Beverage,American,pomegrante juice; club soda; ice cubes; fresh ...,110 kcal
4,Easy Malai Laddo,https://ministryofcurry.com/malai-laddu/,https://ministryofcurry.com/wp-content/uploads...,"Easy 5-ingredient Malai Laddu for a quick, del...",35 minutes,Dessert,Indian,ricotta cheese; heavy cream; powdered sugar; c...,99 kcal


In [37]:
# Connect to the SQLite database and retrieve data
conn = sqlite3.connect("./data/thai_recipes.db")
query = "SELECT * FROM recipes"
df_thai = pd.read_sql_query(query, conn)
conn.close()

# Columns to keep
df_thai = df_thai[
    [
        "title",
        "link",
        "image_url",
        "description",
        "Total Time",
        "Course",
        "Cuisine",
        "ingredients",
        "Calories:",
    ]
]


# Rename columns
df_thai = df_thai.rename(
    columns={
        "Total Time": "total_time",
        "Course": "course",
        "Cuisine": "cuisine",
        "Calories:": "calories",
    }
)

# Display basic info about the dataset
print("Dataset Info:")
df_thai.info()

print("\nSample Data:")
df_thai.head()

Dataset Info:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 309 entries, 0 to 308
Data columns (total 9 columns):
 #   Column       Non-Null Count  Dtype 
---  ------       --------------  ----- 
 0   title        309 non-null    object
 1   link         309 non-null    object
 2   image_url    309 non-null    object
 3   description  309 non-null    object
 4   total_time   298 non-null    object
 5   course       309 non-null    object
 6   cuisine      309 non-null    object
 7   ingredients  309 non-null    object
 8   calories     309 non-null    object
dtypes: object(9)
memory usage: 21.9+ KB

Sample Data:


Unnamed: 0,title,link,image_url,description,total_time,course,cuisine,ingredients,calories
0,Authentic Thai Beef Satay Recipe With Peanut S...,https://hungryinthailand.com/thai-beef-satay-r...,https://hungryinthailand.com/wp-content/upload...,Enjoy my family’s authentic Thai beef satay wi...,4 hours,"Appetizer, Main Course, Snack",Thai,beef; of garlic; lemongrass; coriander seeds; ...,83 kcal
1,Easy Thai Fish Sauce Wings Recipe,https://hungryinthailand.com/fish-sauce-wings/,https://hungryinthailand.com/wp-content/upload...,Enjoy perfectly crispy chicken every time with...,35 minutes,"Appetizer, Side Dish, Snack","Asian, Thai",chicken wings; fish sauce; rosdee seasoning po...,670 kcal
2,Sweet Thai Chili Wings Recipe,https://hungryinthailand.com/sweet-thai-chili-...,https://hungryinthailand.com/wp-content/upload...,"Sweet Thai chili wings recipe with a sticky, s...",40 minutes,"Appetizer, Snack",Thai,tempura flour; rosdee seasoning powder; ice-co...,239 kcal
3,Shrimp Satay Recipe With Thai Peanut Sauce,https://hungryinthailand.com/shrimp-satay-with...,https://hungryinthailand.com/wp-content/upload...,Enjoy this easy shrimp satay recipe with Thai ...,50 minutes,"Appetizer, Snack",Thai,shrimp; coconut milk; yellow curry powder; Ros...,392 kcal
4,Pork Gyoza Recipe (Pork Dumplings),https://hungryinthailand.com/pork-gyoza-recipe/,https://hungryinthailand.com/wp-content/upload...,"Make this pork gyoza recipe for easy, homemade...",1 hour,"Appetizer, Snack",Thai,ground pork; white pepper; sesame oil; shoyu s...,126 kcal


In [38]:
# Connect to the SQLite database and retrieve data
conn = sqlite3.connect("./data/korean_recipes.db")
query = "SELECT * FROM recipes"
df_korean = pd.read_sql_query(query, conn)
conn.close()

# Columns to keep
df_korean = df_korean[
    [
        "title",
        "link",
        "image_url",
        "description",
        "Total Time:",
        "Course",
        "Cuisine",
        "ingredients",
        "Calories:",
    ]
]


# Rename columns
df_korean = df_korean.rename(
    columns={
        "Total Time:": "total_time",
        "Course": "course",
        "Cuisine": "cuisine",
        "Calories:": "calories",
    }
)

# Display basic info about the dataset
print("Dataset Info:")
df_korean.info()

print("\nSample Data:")
df_korean.head()

Dataset Info:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 262 entries, 0 to 261
Data columns (total 9 columns):
 #   Column       Non-Null Count  Dtype 
---  ------       --------------  ----- 
 0   title        262 non-null    object
 1   link         262 non-null    object
 2   image_url    262 non-null    object
 3   description  260 non-null    object
 4   total_time   258 non-null    object
 5   course       262 non-null    object
 6   cuisine      262 non-null    object
 7   ingredients  262 non-null    object
 8   calories     254 non-null    object
dtypes: object(9)
memory usage: 18.5+ KB

Sample Data:


Unnamed: 0,title,link,image_url,description,total_time,course,cuisine,ingredients,calories
0,Seolleongtang (Korean Beef Bone Broth),https://kimchimari.com/seolleongtang-korean-be...,https://kimchimari.com/wp-content/uploads/2024...,Seolleontang is a bone broth made from boiling...,10 hours,Soup,Korean,beef bones; water; green onions; beef brisket;...,74 kcal
1,Sujebi (Hand-Pulled Noodle Soup),https://kimchimari.com/sujebi-hand-pulled-nood...,https://kimchimari.com/wp-content/uploads/2023...,Sujebi is a fun hand-pulled or hand-torn noodl...,40 minutes,"Lunch, noodles",Korean,all purpose flour; sea salt; water; water; dri...,530 kcal
2,Instant Pot Gamjatang,https://kimchimari.com/instant-pot-gamjatang-k...,https://kimchimari.com/wp-content/uploads/2018...,Instant Pot Gamjatang recipe was a perfect rec...,50 minutes,"Main Course, Pork",Korean,pork neck bones; cooking sake; potatoes; fresh...,363 kcal
3,Instant Pot Tteok Guk (Rice Cake Soup),https://kimchimari.com/instant-pot-tteokguk-ri...,https://kimchimari.com/wp-content/uploads/2018...,Anchovy Broth Tteokguk is a very elegant versi...,28 minutes,Soup,Korean,tteokguk tteok; beef stew meat; water; sesame ...,407 kcal
4,Tteok guk (떡국) – Korean rice cake soup,https://kimchimari.com/rice-cake-soup-tteokguk...,https://kimchimari.com/wp-content/uploads/2011...,"Every New Year’s day, Koreans make Dduk Guk/T...",45 minutes,"rice, Soup",Korean,rice cake slices/ovalettes for soup; anchovy s...,264 kcal


In [66]:
# Combine all the separate dataframes
df_combined = pd.concat(
    [df_japanese, df_korean, df_chinese, df_indian, df_thai], ignore_index=True
)

df_combined.head()

Unnamed: 0,title,link,image_url,description,total_time,course,cuisine,ingredients,calories
0,Yuzu Cha (Citron Tea),https://www.justonecookbook.com/yuzu-cha/,https://www.justonecookbook.com/wp-content/upl...,Try my easy homemade recipe for Yuzu Cha (Citr...,1 hour,How to,Japanese,yuzu; white rock sugar; shochu,
1,Japanese Milk Bread (Shokupan),https://www.justonecookbook.com/japanese-milk-...,https://www.justonecookbook.com/wp-content/upl...,Japanese Milk Bread is possibly the best versi...,3 hours,Breakfast,Japanese,warm water; sugar; Diamond Crystal kosher salt...,1645 kcal
2,Chicken Chashu,https://www.justonecookbook.com/chicken-chashu/,https://www.justonecookbook.com/wp-content/upl...,Juicy and tender Chicken Chashu is a lighter v...,1 hour,Main Course,Japanese,"boneless, skin-on chicken thighs; green onions...",
3,Gyudon (Japanese Beef Rice Bowl),https://www.justonecookbook.com/gyudon/,https://www.justonecookbook.com/wp-content/upl...,With thinly sliced beef and tender onions simm...,20 minutes,Main Course,Japanese,onion; green onion/scallion; thinly sliced bee...,657 kcal
4,Japanese Beef Curry,https://www.justonecookbook.com/japanese-beef-...,https://www.justonecookbook.com/wp-content/upl...,"With tender chunks of beef, potatoes, carrots,...",3 hours,Main Course,Japanese,onions; unsalted butter; neutral oil; russet p...,426 kcal
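
The five loading cells above repeat the same connect, select and rename steps. As a rough sketch only, they could be folded into one helper. `load_recipes`, `STANDARD_COLUMNS`, `COLON_HEADERS` and `PLAIN_HEADERS` are hypothetical names introduced for illustration; the header variants are copied from the cells above.

```python
import sqlite3
from contextlib import closing

import pandas as pd

# The nine standardised columns kept by every loading cell above.
STANDARD_COLUMNS = [
    "title", "link", "image_url", "description",
    "total_time", "course", "cuisine", "ingredients", "calories",
]

# Header variants copied from the cells above (hypothetical constant names).
COLON_HEADERS = {
    "Total Time:": "total_time", "Course:": "course",
    "Cuisine:": "cuisine", "Calories:": "calories",
}
PLAIN_HEADERS = {
    "Total Time": "total_time", "Course": "course",
    "Cuisine": "cuisine", "Calories:": "calories",
}


def load_recipes(db_path, column_map):
    """Hypothetical helper: load the `recipes` table with standard column names."""
    # closing() makes sure the connection is closed even if the query fails.
    with closing(sqlite3.connect(db_path)) as conn:
        df = pd.read_sql_query("SELECT * FROM recipes", conn)
    # Rename the source-specific headers, then keep only the standard columns.
    return df.rename(columns=column_map)[STANDARD_COLUMNS]


# Possible usage, assuming the same file layout as the cells above:
# df_japanese = load_recipes("./data/japanese_recipes.db", COLON_HEADERS)
# df_indian = load_recipes("./data/indian_recipes.db", PLAIN_HEADERS)
```

The Korean database mixes both header styles (`Total Time:` alongside `Course` and `Cuisine`), so it would need its own map built the same way.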


----

## Data Cleaning

##### Cleaning of 'cuisine' column

In [40]:
# Show the types of cuisines present in df
print(df_combined["cuisine"].value_counts().head(50))

cuisine
Japanese                          835
Chinese                           742
Thai                              304
 Indian                           285
Korean                            253
American                           76
 American                          72
hawaii                             37
 American, Indian                  19
Asian                              14
Koreanfusion                       13
 Italian                           12
Fusion                             11
Lao                                10
 Mexican                           10
 Thai                              10
Korean, Koreanfusion                9
Chinese Fusion                      8
Asian Fusion                        6
Asian, Thai                         6
American, Japanese                  6
French                              6
Japanese, Korean                    5
Taiwanese                           5
Indian                              4
Italian                             4
Fili

In [41]:
# Keep only the five Asian cuisines this notebook works with

# Find all cuisines mentioned across all recipes
all_cuisines = find_unique_vals(df_combined, "cuisine")

# Define cuisines to keep
cuisines_to_keep = ["Japanese", "Chinese", "Thai", "Indian", "Korean"]

# Find cuisines to remove from the df
cuisines_to_remove = [item for item in all_cuisines if item not in cuisines_to_keep]

# Drop rows whose cuisine is not one of the key cuisines
df_combined["cuisine"] = df_combined["cuisine"].apply(
    lambda x: clean_column(x, cuisines_to_remove)
)
df_combined = df_combined.dropna(subset=["cuisine"]).reset_index(drop=True)

In [42]:
df_combined["cuisine"].value_counts()

cuisine
Japanese    837
Chinese     744
Thai        314
Indian      289
Korean      254
Name: count, dtype: int64
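
The `utils` helpers used above are not shown in this notebook. The sketch below is one guess at how helpers of this kind might behave, not the actual implementation: `find_unique_vals_sketch` and `clean_column_sketch` are hypothetical names, the comma separator is an assumption, and so is the rule that a row survives only when its value is a single kept cuisine. Stripping whitespace matters here because the counts above list ` Indian` and `Indian` as separate values.

```python
def find_unique_vals_sketch(df, column, sep=","):
    """Collect the distinct, whitespace-stripped tokens found in `column`."""
    unique = set()
    for cell in df[column]:
        if not isinstance(cell, str):  # skips None and NaN
            continue
        unique.update(tok.strip() for tok in cell.split(sep) if tok.strip())
    return unique


def clean_column_sketch(value, to_remove, sep=","):
    """Return the cleaned value, or None so that dropna() can remove the row.

    Assumed rule: keep a row only if it names exactly one cuisine and that
    cuisine is not in `to_remove`. Multi-cuisine rows are dropped.
    """
    if not isinstance(value, str):
        return None
    tokens = [tok.strip() for tok in value.split(sep) if tok.strip()]
    if len(tokens) == 1 and tokens[0] not in to_remove:
        return tokens[0]
    return None
```

Because both functions only index and iterate, they accept a DataFrame or a plain dict of lists, which keeps them easy to check in isolation.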

##### Cleaning of 'course' column

In [43]:
# Show the types of courses present in df
print(df_combined["course"].value_counts().head(50))

course
Main Course                      488
Main                             358
Side Dish                        154
Appetizer                        147
Dessert                          142
Soup                              81
Main, Side                        78
Salad                             73
Side                              71
 Entree                           49
Condiments                        39
 Dessert                          36
Drinks                            30
Appetizer, Main                   29
Appetizer, Side Dish              26
Appetizer, Snack                  23
Salad, Side Dish                  22
Main Course, Soup                 22
Appetizer, Main Course            22
Dessert, Snack                    22
Main Course, Salad                21
Breakfast                         20
Snack                             16
Main Course, Side Dish            16
 dinner, Lunch                    16
 Appetizer                        12
Bakery                         

In [44]:
# Mappings to convert similar categories into one
mapping = {
    "Appetizer": ["Appetizer", "Entree"],
    "Breakfast": ["Breakfast", "Brunch", "Porridge"],
    "Dessert": ["Dessert", "Sweets"],
    "Main Course": ["Main Course", "Main", "Main dish", "Dinner", "main dish"],
    "Side Dish": ["Side Dish", "Side"],
    "Soup": ["Soup", "Stew"],
}

In [45]:
# Remove rows where column 'course' has NaN
df_combined = df_combined.dropna(subset=["course"])

# Apply the mapping function to the 'course' column
df_combined["course"] = df_combined["course"].apply(
    lambda x: map_to_main_category(x, mapping)
)

# Drop rows where 'course' is None
df_combined = df_combined.dropna(subset=["course"]).reset_index(drop=True)

# Show the types of courses present in df
print(df_combined["course"].value_counts())

course
Main Course                          913
Side Dish                            289
Appetizer                            260
Dessert                              219
Soup                                 104
Main Course, Side Dish                95
Appetizer, Main Course                67
Breakfast                             56
Appetizer, Side Dish                  49
Main Course, Soup                     26
Breakfast, Main Course                11
Breakfast, Dessert                     7
Appetizer, Main Course, Side Dish      5
Breakfast, Main Course, Side Dish      4
Appetizer, Breakfast                   2
Appetizer, Soup                        2
Appetizer, Main Course, Soup           2
Breakfast, Side Dish                   2
Breakfast, Appetizer, Main Course      2
Appetizer, Side Dish, Soup             1
Dessert, Side Dish                     1
Appetizer, Dessert                     1
Side Dish, Soup                        1
Breakfast, Appetizer                   1
Dessert, 
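
`map_to_main_category` is also imported from `utils` and not shown here. The following is a minimal sketch of how such a mapping step could work, under stated assumptions: the name `map_to_main_category_sketch` is hypothetical, values are assumed to be comma-separated, matching is assumed to be case-insensitive, and unmatched tokens are discarded. It is not the author's implementation.

```python
def map_to_main_category_sketch(value, mapping, sep=","):
    """Map each token in `value` to its main category; None if nothing matches."""
    if not isinstance(value, str):
        return None
    # Invert the mapping: lowercased variant -> main category.
    lookup = {
        variant.lower(): main
        for main, variants in mapping.items()
        for variant in variants
    }
    mapped = []
    for token in value.split(sep):
        main = lookup.get(token.strip().lower())
        # Skip unknown tokens and avoid listing a category twice.
        if main is not None and main not in mapped:
            mapped.append(main)
    # Sorting gives each combination a single canonical label.
    return ", ".join(sorted(mapped)) if mapped else None
```

Sorting is a deliberate choice in this sketch: the output above lists `Appetizer, Breakfast` and `Breakfast, Appetizer` as separate values, and a canonical order would merge them into one.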

In [46]:
df_combined.head()

Unnamed: 0,title,link,image_url,description,total_time,course,cuisine,ingredients,calories
0,Japanese Milk Bread (Shokupan),https://www.justonecookbook.com/japanese-milk-...,https://www.justonecookbook.com/wp-content/upl...,Japanese Milk Bread is possibly the best versi...,3 hours,Breakfast,Japanese,warm water; sugar; Diamond Crystal kosher salt...,1645 kcal
1,Chicken Chashu,https://www.justonecookbook.com/chicken-chashu/,https://www.justonecookbook.com/wp-content/upl...,Juicy and tender Chicken Chashu is a lighter v...,1 hour,Main Course,Japanese,"boneless, skin-on chicken thighs; green onions...",
2,Gyudon (Japanese Beef Rice Bowl),https://www.justonecookbook.com/gyudon/,https://www.justonecookbook.com/wp-content/upl...,With thinly sliced beef and tender onions simm...,20 minutes,Main Course,Japanese,onion; green onion/scallion; thinly sliced bee...,657 kcal
3,Japanese Beef Curry,https://www.justonecookbook.com/japanese-beef-...,https://www.justonecookbook.com/wp-content/upl...,"With tender chunks of beef, potatoes, carrots,...",3 hours,Main Course,Japanese,onions; unsalted butter; neutral oil; russet p...,426 kcal
4,Japanese Cheesecake,https://www.justonecookbook.com/souffle-japane...,https://www.justonecookbook.com/wp-content/upl...,Jiggly and fluffy Japanese Cheesecake is a cro...,1 hour,Dessert,Japanese,unsalted butter; large eggs (50 g each w/o she...,3560 kcal


In [47]:
# Find all ingredients mentioned across all recipes
find_unique_vals(df_combined, "ingredients")

{'oranges',
 'Homemade Oxtail Broth',
 'rosemary',
 'shishamo (smelt)',
 'boneless skinless breast (or chicken thigh)',
 'unbleached sugar',
 'lemon zest',
 'dried glass/cellophane noodles (harusame)',
 'canned crushed tomatoes',
 'coconut sugar',
 'Sweet bean paste',
 'la-yu (Japanese chili oil)',
 'orange juice',
 'skin-on Japanese-style salmon fillets',
 'lily bulb',
 'arugula',
 'extra soft tofu (순두부 Soondubu)',
 'Korean Curry Sauce Mix',
 'low-sodium beef broth',
 'black pepper powder',
 'beef broth',
 'Frozen Moist Short Grain Rice Flour',
 'small batch cilantro',
 'rice vinegar (unseasoned)',
 'cardamoms',
 'Mexican-style shredded cheese',
 'turmeric',
 'fresh milk',
 'long-grain basmati rice',
 'Moo shu pancakes',
 'fresh lo mein noodles',
 'Vegetable oil for deep-frying',
 'shredded unsweetened coconut',
 'crushed red pepper (red pepper flakes)',
 'potato',
 'dried squid strips (ojingeochae)',
 'dried woodear mushrooms',
 'cooking spray',
 'whole milk',
 'mango chutney',
 'sug


----

## Save Processed Data

Export the cleaned dataset for further use.


In [48]:
# Connect to SQLite database (or create it if it doesn't exist)
conn = sqlite3.connect("./data/all_recipes.db")

# Save DataFrame to SQL database
df_combined.to_sql("recipes", conn, if_exists="replace", index=False)

# Close the connection
conn.close()

print("Data saved to database!")

Data saved to database!
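
The cell above prints a success message without checking what was written. One option, sketched here with a hypothetical `save_and_verify` helper, is to read the row count back from the database and compare it with the DataFrame. The table name and `if_exists="replace"` mirror the cell above; everything else is an assumption.

```python
import sqlite3
from contextlib import closing

import pandas as pd


def save_and_verify(df, db_path, table="recipes"):
    """Hypothetical helper: write `df` to `table`, then confirm the row count."""
    # closing() makes sure the connection is closed even if the write fails.
    with closing(sqlite3.connect(db_path)) as conn:
        df.to_sql(table, conn, if_exists="replace", index=False)
        conn.commit()
        stored = conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
    if stored != len(df):
        raise RuntimeError(f"expected {len(df)} rows, found {stored}")
    return stored


# Possible usage with the DataFrame from this notebook:
# save_and_verify(df_combined, "./data/all_recipes.db")
```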
