
# Indian cuisine visualization


This notebook explores the dishes of Indian cuisine through a few basic visualizations made with the seaborn library.

## A brief on what the data contains

* name : name of the dish

* ingredients : what main ingredients are present

* diet : type of diet, either vegetarian or non-vegetarian

* prep_time : total time taken for preparation

* cook_time : total time taken for cooking

* flavor_profile : flavor category (spicy, sweet, bitter, etc.)

* course : course category (starter, main course, dessert, etc.)

* state : state of origin

* region : region of origin


In [None]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

# Input data files are available in the read-only "../input/" directory
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# You can write up to 5GB to the current directory (/kaggle/working/) that gets preserved as output when you create a version using "Save & Run All" 
# You can also write temporary files to /kaggle/temp/, but they won't be saved outside of the current session

In [None]:
#Importing the required libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

In [None]:
#Reading the data
food_data=pd.read_csv('/kaggle/input/indian-food-101/indian_food.csv')

In [None]:
food_data.head()

In [None]:
food_data.info()

In [None]:
food_data.isnull().sum().sum()

In [None]:
food_data['state'].unique()

The `state` column contains the value '-1', which probably marks a missing value, so we replace it with NaN.


In [None]:
food_data=food_data.replace('-1',np.nan)
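Note that the replacement above only matches the string '-1'. If the numeric columns (for example `prep_time` and `cook_time`) also use the integer -1 as a missing-value marker, which is an assumption worth checking against the data, those entries would be left unchanged. A minimal sketch on a small made-up frame (the `toy` rows are hypothetical and not taken from the dataset):

```python
import numpy as np
import pandas as pd

# Hypothetical rows that mimic the column layout described earlier
toy = pd.DataFrame({
    "name": ["dish_a", "dish_b", "dish_c"],
    "state": ["Punjab", "-1", "Kerala"],
    "prep_time": [10, -1, 20],
})

# Replace the string form first, then the integer form of the sentinel
cleaned = toy.replace("-1", np.nan).replace(-1, np.nan)

# Count missing values per column after the replacement
missing_per_column = cleaned.isnull().sum()
print(missing_per_column)
```

With the real data, `food_data` would take the place of `toy`. Leaving numeric -1 values in place would pull down any mean computed on those columns.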

In [None]:
sns.countplot(x='diet',data=food_data)

In [None]:
sns.countplot(x='flavor_profile',hue='diet',data=food_data)

In [None]:
sns.countplot(x='course',hue='diet',data=food_data)

In this dataset, non-vegetarian dishes appear only among the starters and main courses.
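One way to check an observation like this numerically is a cross-tabulation of `course` against `diet`. The sketch below uses a few hypothetical rows rather than the real file, and it assumes the diet labels are spelled `vegetarian` and `non vegetarian`:

```python
import pandas as pd

# Hypothetical rows; the real notebook would use food_data instead
toy = pd.DataFrame({
    "course": ["starter", "main course", "dessert", "snack", "main course"],
    "diet": ["non vegetarian", "non vegetarian", "vegetarian",
             "vegetarian", "vegetarian"],
})

# Rows are courses, columns are diets, cells are dish counts
counts = pd.crosstab(toy["course"], toy["diet"])
print(counts)

# Courses that contain at least one non-vegetarian dish
courses_with_non_veg = sorted(counts.index[counts["non vegetarian"] > 0])
print(courses_with_non_veg)
```

A zero in the `non vegetarian` column for a course means no such dish was recorded for it.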

In [None]:
#Statewise breakdown of diets
fig=plt.figure(figsize=(10,6))
sns.countplot(x='state',hue='diet',data=food_data)
plt.xticks(
    rotation=45, 
    horizontalalignment='right',
    fontweight='light',
    fontsize='x-large'  
)

In [None]:
#regionwise count
sns.countplot(x='region',data=food_data)

In [None]:
# Top 10 states by number of recorded dishes
fig=plt.figure(figsize=(10,6))
sns.countplot(x='state', order=food_data['state'].value_counts().index[0:10], data=food_data)
# Style the tick labels after plotting so the settings apply to the drawn labels
plt.xticks(
    rotation=90, 
    horizontalalignment='right',
    fontweight='light',
    fontsize='x-large'  
)

In [None]:
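# Mean cook time per course (errorbar=None needs seaborn >= 0.12; older versions use ci=None)
sns.barplot(x='course',y='cook_time',data=food_data,errorbar=None)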

Desserts have the longest average cook time.

In [None]:
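# Mean prep time per course (errorbar=None needs seaborn >= 0.12; older versions use ci=None)
sns.barplot(x='course',y='prep_time',data=food_data,errorbar=None)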

Starters have the longest average preparation time.

In [None]:
# Mean cook time per diet (errorbar=None needs seaborn >= 0.12; older versions use ci=None)
sns.barplot(x='diet',y='cook_time',data=food_data,errorbar=None)

In [None]:
# Mean prep time per diet (errorbar=None needs seaborn >= 0.12; older versions use ci=None)
sns.barplot(x='diet',y='prep_time',data=food_data,errorbar=None)

On average, vegetarian dishes take longer to prepare and cook than non-vegetarian dishes.
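By default, the bar plots above show the mean of each group, so the same comparison can be read off a `groupby`. A small sketch with made-up numbers (the values are illustrative only and do not come from the dataset):

```python
import pandas as pd

# Hypothetical times; the real notebook would use food_data instead
toy = pd.DataFrame({
    "diet": ["vegetarian", "vegetarian", "non vegetarian", "non vegetarian"],
    "prep_time": [30, 10, 15, 5],
    "cook_time": [40, 20, 25, 15],
})

# Mean prep and cook time for each diet, matching what sns.barplot draws
mean_times = toy.groupby("diet")[["prep_time", "cook_time"]].mean()
print(mean_times)
```

Grouping by `course` instead of `diet` would give the numbers behind the per-course plots.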

In [None]:
food_data['prep_time'].hist(grid=False)

In [None]:
food_data['cook_time'].hist(grid=False)

In [None]:
# Top 10 Dishes with the highest cook time
top10=food_data.sort_values('cook_time',ascending=False).head(10).set_index('name')
sns.barplot(x=top10['cook_time'],y=top10.index)

In [None]:
#Top 10 dishes with the highest prep time
top10=food_data.sort_values('prep_time',ascending=False).head(10).set_index('name')
sns.barplot(x=top10['prep_time'],y=top10.index)
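As an alternative to sorting the whole frame and taking the head, `DataFrame.nlargest` selects the top rows directly. A sketch on hypothetical rows (names and times are invented for illustration):

```python
import pandas as pd

# Hypothetical dishes; the real notebook would use food_data instead
toy = pd.DataFrame({
    "name": ["dish_a", "dish_b", "dish_c", "dish_d"],
    "cook_time": [30, 120, 45, 60],
})

# Two dishes with the highest cook time, indexed by name for plotting
top2 = toy.nlargest(2, "cook_time").set_index("name")
print(top2)
```

The result can be passed to `sns.barplot` in the same way as `top10` above.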