### Scraping more healthy indian recipes

- Recipe name
- Ingredients list
- Make time
- Instructions
- Nutrient info 
- Tags (e.g., vegan, keto, veg/non veg)

website: https://www.tarladalal.com/
-nutrient breakdown avaliable too

#### Import dependencies

In [3]:
import pandas as pd
import requests
from bs4 import BeautifulSoup

In [4]:
webpage = requests.get("https://www.tarladalal.com/category/Healthy-Indian-Recipes/").text

In [5]:
soup = BeautifulSoup(webpage, 'lxml')

In [6]:
# Getting all recipe links from the main page 
# thet are stored in div tags of class text-center
soup.find_all('div',class_="text-center")[:5]

[<div class="text-center">
 <a class="btn btn-main primary-bg" href="/recipes/category/Vitamin-B12-Cobalamin-Rich-Foods/">View All</a>
 </div>,
 <div class="text-center">
 <a class="btn btn-main primary-bg" href="/recipes/category/Healthy-Low-Calorie-Weight-Loss/">View All</a>
 </div>,
 <div class="text-center">
 <a class="btn btn-main primary-bg" href="/recipes/category/Insoluble-Fiber-Diet/">View All</a>
 </div>,
 <div class="text-center">
 <a class="btn btn-main primary-bg" href="/recipes/category/Low-Cholesterol-/">View All</a>
 </div>,
 <div class="text-center">
 <a class="btn btn-main primary-bg" href="/recipes/category/Soluble-Fibre-Diet/">View All</a>
 </div>]

In [7]:
# getting the tag links from the href attrbute of anchor tag
tags_links=[]
for div in soup.find_all('div',class_="text-center")[:-1]:
    tags_links.append(div.find('a').get("href"))

In [8]:
tags_links[:5]

['/recipes/category/Vitamin-B12-Cobalamin-Rich-Foods/',
 '/recipes/category/Healthy-Low-Calorie-Weight-Loss/',
 '/recipes/category/Insoluble-Fiber-Diet/',
 '/recipes/category/Low-Cholesterol-/',
 '/recipes/category/Soluble-Fibre-Diet/']

In [9]:
tags_links[-5:]

['/recipes/category/Lactose-Free-Dairy-Free-Cake-/',
 '/recipes/category/Chronic-Kidney-Disease/',
 '/recipes/category/indian-recipes-for-relief-from-pregnancy-constipation/',
 '/recipes/category/selenium1/',
 '/recipes/category/healthy-indian-soups-under-100-calories/']

- They are in the form of relative address to the main page, so let's prefix them with the address "https://www.tarladalal.com/"

In [10]:
cleaned_tags_links = ["https://www.tarladalal.com"+link for link in tags_links ]

In [11]:
cleaned_tags_links[:5]

['https://www.tarladalal.com/recipes/category/Vitamin-B12-Cobalamin-Rich-Foods/',
 'https://www.tarladalal.com/recipes/category/Healthy-Low-Calorie-Weight-Loss/',
 'https://www.tarladalal.com/recipes/category/Insoluble-Fiber-Diet/',
 'https://www.tarladalal.com/recipes/category/Low-Cholesterol-/',
 'https://www.tarladalal.com/recipes/category/Soluble-Fibre-Diet/']

In [12]:
cleaned_tags_links[-5:]

['https://www.tarladalal.com/recipes/category/Lactose-Free-Dairy-Free-Cake-/',
 'https://www.tarladalal.com/recipes/category/Chronic-Kidney-Disease/',
 'https://www.tarladalal.com/recipes/category/indian-recipes-for-relief-from-pregnancy-constipation/',
 'https://www.tarladalal.com/recipes/category/selenium1/',
 'https://www.tarladalal.com/recipes/category/healthy-indian-soups-under-100-calories/']

In [13]:
# total filters or tags
len(cleaned_tags_links)

437

- therefore, now we have 437 different filters or tags 
- let's go through each link and extract recipes under that category

In [14]:
# get recipie links from categories
def get_recipie_links(category_link):
    website = requests.get(category_link).text
    soup = BeautifulSoup(website, "lxml")
    tags_links=[]

    # getting links
    for div in soup.find_all('div',class_="img-block"):
        tags_links.append(div.find('a').get("href"))
    
    #filtering links
    category = category_link.rstrip('/').split('/')[-1] 
    recipies = [("https://www.tarladalal.com"+link, category) for link in tags_links ]

    return recipies 

- now we can use this function to extract recipe links from under each category.
- Next, we scrape through each recipe page to get our required data

In [15]:
recipies_cat = []
for category in cleaned_tags_links:
    recipies_cat.extend(get_recipie_links(category))

In [16]:
recipies=[]
for rec, cat in recipies_cat:
    recipies.append(rec)

In [17]:
from collections import defaultdict

recipe_tags = defaultdict(list)

for link, tag in recipies_cat:
    recipe_tags[link].append(tag)


In [18]:
len(set(recipies))

2132

- therefore, here we have a list of 2129 unique recipes

#### Scraping through the first recipe page


In [24]:
recipies = list(set(recipies))

In [29]:
recipies[1]

'https://www.tarladalal.com/vaal-ki-usal-maharashtrian-dalimbi-usal-3964r'

- Features we need:
- name, serving size, Time to make, tags, ingredients, nutrition, instructions

In [30]:
website = requests.get(recipies[1]).text
soup = BeautifulSoup(website,'lxml')

##### Getting the name


In [31]:
soup

<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8"/>
<meta content="width=device-width, initial-scale=1" name="viewport"/>
<title>vaal ki usal recipe | dalimbi usal | Maharashtrian birda usal |</title>
<meta content="vaal ki usal recipe | dalimbi usal | Maharashtrian birda usal | with step by step photos. A popular, easy to make Maharashtrian dish. The combination of jaggery and kokum gives a sweet and tangy taste to the usal. Vaal provides you the much needed protein and iron. A thoughtful addition of vitamin C rich coriander improves the absorption of iron. Serve this usal with hot with roti of your choice." name="description"/>
<meta content="vaala cha bhirda, vaal usal, vaala chi ambti, spicy bhirda, dalimbi chi usal, bhirda" name="keywords"/>
<meta content="Tarla Dalal" name="author"/>
<meta content="index, follow" name="robots"/>
<meta content="website" property="og:type"/>
<link href="/static/frontend/img/favicon.ico" rel="icon" type="image/x-icon"/>
<!-- Canonical UR

In [32]:
soup.find("h4", class_ = "rec-heading").text

'vaal ki usal recipe | dalimbi usal | Maharashtrian birda usal |'

In [33]:
soup.find("h4", class_ = "rec-heading").text.split("|")

['vaal ki usal recipe ', ' dalimbi usal ', ' Maharashtrian birda usal ', '']

In [34]:
# just taking the first name:
soup.find("h4", class_ = "rec-heading").text.split("|")[0].strip()

'vaal ki usal recipe'

In [35]:
def find_name(rec_soup):
    # it will most probably have only one h4 tag so use .find()
    container= rec_soup.find("h4", class_ = "rec-heading")

    if container is None:
        return None
    return rec_soup.find("h4", class_ = "rec-heading").text.split("|")[0].strip()

In [36]:
find_name(soup)

'vaal ki usal recipe'

##### Getting the Make time

In [37]:
soup.find_all('p', class_="mb-0 font-size-13")

[<p class="mb-0 font-size-13"><strong>15 Mins</strong></p>,
 <p class="mb-0 font-size-13"><strong>19 Mins</strong></p>,
 <p class="mb-0 font-size-13"><strong>34 Mins</strong></p>]

In [38]:
soup.find_all('p', class_="mb-0 font-size-13")[2].text

'34 Mins'

In [84]:
def get_make_time(res_soup):
    elements = res_soup.find_all('p', class_="mb-0 font-size-13")
    if len(elements) >= 3:
            return elements[2].get_text(strip=True)
    else:
        return None  

In [40]:
get_make_time(soup)

'34 Mins'

##### Getting serving size

In [41]:
soup.find_all('p',class_="mb-0 font-size-13 font-size-13")

[<p class="mb-0 font-size-13 font-size-13"><strong>4 servings</strong></p>]

In [42]:
#using indexing, resolve to .find() if creates a problem later
soup.find_all('p',class_="mb-0 font-size-13 font-size-13")[0].text

'4 servings'

In [43]:
def get_serving_size(res_soup):
    return soup.find_all('p',class_="mb-0 font-size-13 font-size-13")[0].text

In [44]:
get_serving_size(soup)

'4 servings'

##### Getting the tags

In [45]:
soup.find('ul', class_ = 'tags-list')

<ul class="tags-list">
<li><a href="/recipes-for-equipment-non-stick-pan-321">Non-stick Pan</a></li>
<li><a href="/recipes-for-Boiled--577">Boiled Indian recipes</a></li>
<li><a href="/recipes-for-cooking-basics-saute-274">Saute</a></li>
<li><a href="/recipes-for-Indian-Veg-Recipes-2">Indian Veg Recipes</a></li>
<li><a href="/recipes-for-Maharashtrian-52">Maharashtrian recipes</a></li>
<li><a href="/recipes-for-Maharashtrian-Dal-56">Maharashtrian Dal, Varan / Amti / Kalvan</a></li>
</ul>

In [46]:
li_soup = soup.find('ul', class_ = 'tags-list').find_all('li')
li_soup

[<li><a href="/recipes-for-equipment-non-stick-pan-321">Non-stick Pan</a></li>,
 <li><a href="/recipes-for-Boiled--577">Boiled Indian recipes</a></li>,
 <li><a href="/recipes-for-cooking-basics-saute-274">Saute</a></li>,
 <li><a href="/recipes-for-Indian-Veg-Recipes-2">Indian Veg Recipes</a></li>,
 <li><a href="/recipes-for-Maharashtrian-52">Maharashtrian recipes</a></li>,
 <li><a href="/recipes-for-Maharashtrian-Dal-56">Maharashtrian Dal, Varan / Amti / Kalvan</a></li>]

In [47]:
tags = [li.text for li in li_soup]
tags


['Non-stick Pan',
 'Boiled Indian recipes',
 'Saute',
 'Indian Veg Recipes',
 'Maharashtrian recipes',
 'Maharashtrian Dal, Varan / Amti / Kalvan']

In [48]:
# also need to add the category as tag

tags.extend(recipe_tags[recipies[0]])
tags

['Non-stick Pan',
 'Boiled Indian recipes',
 'Saute',
 'Indian Veg Recipes',
 'Maharashtrian recipes',
 'Maharashtrian Dal, Varan / Amti / Kalvan',
 'Healthy-Heart-International',
 'Low-Calorie-International']

In [71]:
def get_tags(res_soup,recipy):
    container = res_soup.find('ul', class_ = 'tags-list')
    if container is None:
        return None
    li_soup = container.find_all('li')
    tags = [li.text for li in li_soup]
    tags.extend(recipe_tags[recipy])  # recipy will be the iterator in main function
    return tags

#### Getting the ingredients

In [50]:
ingredient_div = soup.find('div', class_="ingredients")

In [51]:

for p in ingredient_div.find_all('p'):
    text = p.get_text(strip=True,separator=" ")
    print(text)

2 cups sprouted vaal (field beans/ butter beans)
1 tbsp oil
1 tsp cumin seeds (jeera)
a pinch of asafoetida (hing)
5 to 6 curry leaves (kadi patta)
1/2 cup finely chopped onion
2 tsp ginger-garlic (adrak-lehsun) paste
1/4 cup finely chopped tomato
1/2 tsp turmeric powder (haldi)
2 tsp malvani masala
5 kokum (optional)
1/2 tsp chopped jaggery (gur) , (optional)
salt to taste
1/4 cup finely chopped coriander (dhania)


In [72]:
def get_ingredients(res_soup):
    ingredient_div = res_soup.find('div', class_="ingredients")

    if ingredient_div is None:
        return None
    ingredients=[]
    for p in ingredient_div.find_all('p'):
        text = p.get_text(strip=True,separator=" ")
        ingredients.append(text)
    return ingredients

In [53]:
get_ingredients(soup)

['2 cups sprouted vaal (field beans/ butter beans)',
 '1 tbsp oil',
 '1 tsp cumin seeds (jeera)',
 'a pinch of asafoetida (hing)',
 '5 to 6 curry leaves (kadi patta)',
 '1/2 cup finely chopped onion',
 '2 tsp ginger-garlic (adrak-lehsun) paste',
 '1/4 cup finely chopped tomato',
 '1/2 tsp turmeric powder (haldi)',
 '2 tsp malvani masala',
 '5 kokum (optional)',
 '1/2 tsp chopped jaggery (gur) , (optional)',
 'salt to taste',
 '1/4 cup finely chopped coriander (dhania)']

#### Getting the nutrients

In [54]:
soup.find("table", id="rcpnutrients")

<table id="rcpnutrients"><tr><td style="padding:0px 2px;">Energy</td><td style="padding:0px 4px;"><span itemprop="calories">195 cal</span></td></tr><tr><td style="padding:0px 2px;">Protein</td><td style="padding:0px 4px;"><span itemprop="proteinContent">10.3 g</span></td></tr><tr><td style="padding:0px 2px;">Carbohydrates</td><td style="padding:0px 4px;"><span itemprop="carbohydrateContent">30.5 g</span></td></tr><tr><td style="padding:0px 2px;">Fiber</td><td style="padding:0px 4px;"><span itemprop="fiberContent">7.9 g</span></td></tr><tr><td style="padding:0px 2px;">Fat</td><td style="padding:0px 4px;"><span itemprop="fatContent">4.1 g</span></td></tr><tr><td style="padding:0px 2px;">Cholesterol</td><td style="padding:0px 4px;"><span itemprop="cholesterolContent">0 mg</span></td></tr><tr><td style="padding:0px 2px;">Sodium</td><td style="padding:0px 4px;"><span itemprop="sodiumContent">8.8 mg</span></td></tr></table>

In [55]:
soup.find("table", id="rcpnutrients").find_all('tr')

[<tr><td style="padding:0px 2px;">Energy</td><td style="padding:0px 4px;"><span itemprop="calories">195 cal</span></td></tr>,
 <tr><td style="padding:0px 2px;">Protein</td><td style="padding:0px 4px;"><span itemprop="proteinContent">10.3 g</span></td></tr>,
 <tr><td style="padding:0px 2px;">Carbohydrates</td><td style="padding:0px 4px;"><span itemprop="carbohydrateContent">30.5 g</span></td></tr>,
 <tr><td style="padding:0px 2px;">Fiber</td><td style="padding:0px 4px;"><span itemprop="fiberContent">7.9 g</span></td></tr>,
 <tr><td style="padding:0px 2px;">Fat</td><td style="padding:0px 4px;"><span itemprop="fatContent">4.1 g</span></td></tr>,
 <tr><td style="padding:0px 2px;">Cholesterol</td><td style="padding:0px 4px;"><span itemprop="cholesterolContent">0 mg</span></td></tr>,
 <tr><td style="padding:0px 2px;">Sodium</td><td style="padding:0px 4px;"><span itemprop="sodiumContent">8.8 mg</span></td></tr>]

In [56]:
for row in soup.find("table", id="rcpnutrients").find_all('tr'):
    print(row.find_all("td")[0].text, row.find_all("td")[1].text)

Energy 195 cal
Protein 10.3 g
Carbohydrates 30.5 g
Fiber 7.9 g
Fat 4.1 g
Cholesterol 0 mg
Sodium 8.8 mg


In [57]:
nutrients = {}
for row in soup.find("table", id="rcpnutrients").find_all('tr'):
    key, value = row.find_all("td")[0].text, row.find_all("td")[1].text
    nutrients[key] = value
nutrients

{'Energy': '195 cal',
 'Protein': '10.3 g',
 'Carbohydrates': '30.5 g',
 'Fiber': '7.9 g',
 'Fat': '4.1 g',
 'Cholesterol': '0 mg',
 'Sodium': '8.8 mg'}

In [73]:
def get_nutrients(res_soup):
    nutrients = {}
    container = res_soup.find("table", id="rcpnutrients")
    if container is None:
        return None
    for row in container.find_all('tr'):
        key, value = row.find_all("td")[0].text, row.find_all("td")[1].text
        nutrients[key] = value
    return nutrients

In [74]:
get_nutrients(soup)

{'Energy': '195 cal',
 'Protein': '10.3 g',
 'Carbohydrates': '30.5 g',
 'Fiber': '7.9 g',
 'Fat': '4.1 g',
 'Cholesterol': '0 mg',
 'Sodium': '8.8 mg'}

#### Get the instructions

In [75]:
soup.find("div", class_="rsepc").text.strip()   ## there are no steps here

'For vaal ki usal\xa0To make vaal ki usal recipe, heat the oil in a non-stick kadhai and add the cumin seeds.When the seeds crackle, add the asafoetida, curry leaves and onions and sauté on a medium flame for 2 minutes.Add ginger garlic paste and mix well. Add tomatoes and cook on a medium flame for 2 to 3 minutes.Add the sprouted vaal, turmeric powder, malvani masala, coriander cumin seeds powder and salt to taste.Mix well and sauté on a medium flame for 2 to 3 minutes.Add 1 cup hot water, kokum, jaggery and mix well.Cover and cook on a medium flame for 12 to 15 minutes or till the dal is cooked, while stirring occasionally.Add coriander and mix well. Serve the vaal ki usal recipe hot.'

In [76]:
def get_instructions(res_soup):
    container = res_soup.find("div", class_="rsepc")
    if container is None:
        return None
    return container.text.strip()

In [77]:
get_instructions(soup)

'For vaal ki usal\xa0To make vaal ki usal recipe, heat the oil in a non-stick kadhai and add the cumin seeds.When the seeds crackle, add the asafoetida, curry leaves and onions and sauté on a medium flame for 2 minutes.Add ginger garlic paste and mix well. Add tomatoes and cook on a medium flame for 2 to 3 minutes.Add the sprouted vaal, turmeric powder, malvani masala, coriander cumin seeds powder and salt to taste.Mix well and sauté on a medium flame for 2 to 3 minutes.Add 1 cup hot water, kokum, jaggery and mix well.Cover and cook on a medium flame for 12 to 15 minutes or till the dal is cooked, while stirring occasionally.Add coriander and mix well. Serve the vaal ki usal recipe hot.'

### Final method

In [78]:
def get_data(link):
    res_webpage = requests.get(link,timeout=10).text
    res_soup = BeautifulSoup(res_webpage,'lxml')

    #name
    name = find_name(res_soup)
    tags = get_tags(res_soup,link)
    serving_size = get_serving_size(res_soup)
    make_time = get_make_time(res_soup)
    ingredients = get_ingredients(res_soup)
    nutrition = get_nutrients(res_soup)
    instructions = get_instructions(res_soup)

    return {"name":name, "tags":tags, "ingredients":ingredients,"serving_size":serving_size,"cook_time":make_time, "nutrition":nutrition, "instructions":instructions}

### Checking to see if all the links can be scraped through

- trying for the first 500

In [79]:
recipies[500]

'https://www.tarladalal.com/palak-pulao-palak-rice-spinach-rice-40890r'

In [80]:
# recipies_first_500 = [get_data(href) for href in recipies[:500]]

- AttributeError: 'NoneType' object has no attribute 'find_all' in the nutrients definition
- This happened because `.find()` returned None (element not found).
- Fix it by checking if result of `.find()` is not None before calling `.find_all()` and returns that link in that case which we can use to check later


In [81]:
def get_nutrients(res_soup):
    nutrients = {}
    table = res_soup.find("table", id="rcpnutrients")
    if table is None:
        return None

    for row in table.find_all('tr'):
        tds = row.find_all("td")
        if len(tds) >= 2:
            key, value = tds[0].text.strip(), tds[1].text.strip()
            nutrients[key] = value
            
    return nutrients


In [85]:
def get_data(link):
    res_webpage = requests.get(link).text
    res_soup = BeautifulSoup(res_webpage,'lxml')

    #name
    name = find_name(res_soup)
    tags = get_tags(res_soup,link)
    serving_size = get_serving_size(res_soup)
    make_time = get_make_time(res_soup)
    ingredients = get_ingredients(res_soup)
    nutrition = get_nutrients(res_soup)
    instructions = get_instructions(res_soup)
    
    return {"name":name, "tags":tags, "ingredients":ingredients,"serving_size":serving_size,"cook_time":make_time, "nutrition":nutrition, "instructions":instructions,"link":link}

In [86]:
recipies_first_500 = [get_data(href) for href in recipies[:500]]

In [87]:
df = pd.DataFrame(recipies_first_500)

In [88]:
df.head()

Unnamed: 0,name,tags,ingredients,serving_size,cook_time,nutrition,instructions,link
0,,,,4 servings,,,,https://www.tarladalal.com/oatmeal-and-spinach...
1,vaal ki usal recipe,"[Non-stick Pan, Boiled Indian recipes, Saute, ...",[2 cups sprouted vaal (field beans/ butter bea...,4 servings,34 Mins,"{'Energy': '195 cal', 'Protein': '10.3 g', 'Ca...","For vaal ki usal To make vaal ki usal recipe, ...",https://www.tarladalal.com/vaal-ki-usal-mahara...
2,capsicum paneer sabzi recipe,"[Non Stick Kadai Veg, Antioxidant Rich Indian,...","[2 1/2 cups capsicum cubes, 1/2 cup low-fat pa...",4 servings,26 Mins,"{'Energy': '74 cal', 'Protein': '2.6 g', 'Carb...",For capsicum paneer sabziTo make capsicum pane...,https://www.tarladalal.com/coloured-capsicum-a...
3,"Coconut Cream recipe, How to make Coconut Crea...","[Indian Desserts , Sweets, Basic Indian Desser...","[1 fresh coconut, 1 cup boiling water]",4 servings,20 Mins,"{'Energy': '444 cal', 'Protein': '4.5 g', 'Car...",,https://www.tarladalal.com/coconut-cream-427r
4,restaurant style palak paneer recipe,"[Non-stick Pan, Indian Dinner, Indian Lunch, L...","[1 cup sliced onions, 3 tbsp roughly chopped c...",4 servings,46 Mins,"{'Energy': '374 cal', 'Protein': '13.3 g', 'Ca...",For the onion-cashew paste Combine all the ing...,https://www.tarladalal.com/restaurant-style-pa...


In [89]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 500 entries, 0 to 499
Data columns (total 8 columns):
 #   Column        Non-Null Count  Dtype 
---  ------        --------------  ----- 
 0   name          499 non-null    object
 1   tags          499 non-null    object
 2   ingredients   499 non-null    object
 3   serving_size  500 non-null    object
 4   cook_time     499 non-null    object
 5   nutrition     469 non-null    object
 6   instructions  497 non-null    object
 7   link          500 non-null    object
dtypes: object(8)
memory usage: 31.4+ KB


- so we do not have nutrition information of 23 recipies and instructions of 4 recipies out of 500

- Let's drop those entries without instruction, the nutrition information can be mapped using the ingredients later

In [90]:
df.dropna(subset="instructions", inplace=True)

In [91]:
df.info()

<class 'pandas.core.frame.DataFrame'>
Index: 497 entries, 1 to 499
Data columns (total 8 columns):
 #   Column        Non-Null Count  Dtype 
---  ------        --------------  ----- 
 0   name          497 non-null    object
 1   tags          497 non-null    object
 2   ingredients   497 non-null    object
 3   serving_size  497 non-null    object
 4   cook_time     497 non-null    object
 5   nutrition     468 non-null    object
 6   instructions  497 non-null    object
 7   link          497 non-null    object
dtypes: object(8)
memory usage: 34.9+ KB


In [92]:
df.to_csv("indian_recipies_1.csv")

#### Running for the rest

In [95]:
recipies_rest = [get_data(href) for href in recipies[500:]]

In [99]:
df = pd.DataFrame(recipies_rest)

In [100]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1632 entries, 0 to 1631
Data columns (total 8 columns):
 #   Column        Non-Null Count  Dtype 
---  ------        --------------  ----- 
 0   name          1632 non-null   object
 1   tags          1632 non-null   object
 2   ingredients   1632 non-null   object
 3   serving_size  1632 non-null   object
 4   cook_time     1632 non-null   object
 5   nutrition     1544 non-null   object
 6   instructions  1617 non-null   object
 7   link          1632 non-null   object
dtypes: object(8)
memory usage: 102.1+ KB


In [101]:
df.dropna(subset="instructions", inplace=True)

In [102]:
df.info()

<class 'pandas.core.frame.DataFrame'>
Index: 1617 entries, 0 to 1631
Data columns (total 8 columns):
 #   Column        Non-Null Count  Dtype 
---  ------        --------------  ----- 
 0   name          1617 non-null   object
 1   tags          1617 non-null   object
 2   ingredients   1617 non-null   object
 3   serving_size  1617 non-null   object
 4   cook_time     1617 non-null   object
 5   nutrition     1529 non-null   object
 6   instructions  1617 non-null   object
 7   link          1617 non-null   object
dtypes: object(8)
memory usage: 113.7+ KB


In [103]:
df.head()

Unnamed: 0,name,tags,ingredients,serving_size,cook_time,nutrition,instructions,link
0,"Palak Pulao, Palak Rice, Spinach Rice","[Pressure Cooker, Indian Pressure Cooker, Indi...","[2 cups finely chopped spinach (palak), 1/4 cu...",4 servings,35 Mins,"{'Energy': '235 cal', 'Protein': '6.2 g', 'Car...","Heat the oil in a pressure cooker, add the oni...",https://www.tarladalal.com/palak-pulao-palak-r...
1,how methi seeds stops diarrhoea recipe,"[Home Remedies, Indian Home Remedies Diarrhoea...","[1 tsp fenugreek (methi) seeds, 1 cup water]",4 servings,1 Mins,"{'Energy': '0 cal', 'Protein': '0 g', 'Carbohy...",MethodHave the fenugreek seeds along with wate...,https://www.tarladalal.com/how-methi-seeds-sto...
2,sprouts tikki recipe,"[Non-stick Pan, Easy, Simple Indian Starters, ...","[1 cup sprouted moong (whole green gram), 3 tb...",4 servings,25 Mins,"{'Energy': '54 cal', 'Protein': '3.2 g', 'Carb...","For sprouts tikki To make sprouts tikki, blend...",https://www.tarladalal.com/sprouts-tikki-healt...
3,Blueberry Strawberry Smoothie Recipe,"[Mixer, Indian Beverages, Indian Drinks, India...","[1/2 cup fresh blueberry, 1/2 cup roughly chop...",4 servings,5 Mins,"{'Energy': '181 cal', 'Protein': '3.9 g', 'Car...",Combine all the ingredients in a mixer and ble...,https://www.tarladalal.com/blueberry-strawberr...
4,jowar palak appe recipe,"[Appe Mould, Indian Breakfast Recipes, Breakfa...","[3/4 cup jowar (white millet) flour, 1/2 cup f...",4 servings,20 Mins,"{'Energy': '24 cal', 'Protein': '0.6 g', 'Carb...","For jowar palak appe To make jowar palak appe,...",https://www.tarladalal.com/jowar-palak-appe-42...


In [104]:
df.to_csv("indian_recipies_2.csv")