# 100 Coding Samples: Food, Beverages & Cuisines EDA

This notebook contains 100 distinct samples of Exploratory Data Analysis (EDA) on food datasets.

**Datasets Used:**
1. **Indian Food 101:** Ingredients, diet, prep time, and region for 255 dishes.
2. **Ramen Ratings:** Ratings, style, and country for 2500+ ramen products.
3. **Starbucks Nutrition:** Nutritional info for Starbucks beverages.

## Part 1: Setup & Data Loading (Samples 1-5)

In [None]:
# Sample 1: Import Libraries
# Explanation: Importing necessary libraries for data manipulation (pandas) and visualization (matplotlib, seaborn, plotly).
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px
import plotly.graph_objects as go
from wordcloud import WordCloud

# Formatting for cleaner charts
sns.set_theme(style='whitegrid')

In [None]:
# Sample 2: Load Indian Food Dataset
# Explanation: Loading the Indian Cuisine dataset directly from a raw GitHub URL into a Pandas DataFrame.
url_indian = 'https://raw.githubusercontent.com/nehaprabhavalkar/Indian-Food-101/master/indian_food.csv'
df_indian = pd.read_csv(url_indian)
df_indian.head(3)

In [None]:
# Sample 3: Load Ramen Ratings Dataset
# Explanation: Loading the Ramen Ratings dataset containing stars, styles, and countries of origin.
url_ramen = 'https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2019/2019-06-04/ramen_ratings.csv'
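df_ramen = pd.read_csv(url_ramen)
# Later samples use capitalized names ('Brand', 'Variety', 'Style', 'Country').
# This rename is a no-op if the file already uses those names.
df_ramen = df_ramen.rename(columns={'brand': 'Brand', 'variety': 'Variety', 'style': 'Style', 'country': 'Country'})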
df_ramen.head(3)

In [None]:
# Sample 4: Load Starbucks Dataset
# Explanation: Loading the Starbucks beverages dataset to analyze ingredients like caffeine and sugar.
url_starbucks = 'https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2021/2021-12-21/starbucks.csv'
df_starbucks = pd.read_csv(url_starbucks)
df_starbucks.head(3)

In [None]:
# Sample 5: Check Dataset Dimensions
# Explanation: Checking the shape (rows, columns) of all three dataframes to understand data volume.
print(f'Indian Food Shape: {df_indian.shape}')
print(f'Ramen Shape: {df_ramen.shape}')
print(f'Starbucks Shape: {df_starbucks.shape}')

## Part 2: Data Cleaning & Pre-processing (Samples 6-20)

In [None]:
# Sample 6: Check Data Types
# Explanation: Inspecting column data types to identify which columns are categorical (object) or numerical.
df_indian.info()

In [None]:
# Sample 7: Handling Placeholders
# Explanation: The Indian dataset uses '-1' as a placeholder for missing values. We replace them with NaN to count true missing values.
df_indian = df_indian.replace(-1, np.nan)
df_indian = df_indian.replace('-1', np.nan)
df_indian.isnull().sum()
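
The placeholder handling above can be checked on a small frame before running it on the real data. The sketch below is illustrative only: the `toy` frame and its values are made up, and it assumes placeholders appear both as the integer `-1` and as the string `'-1'`.

```python
import numpy as np
import pandas as pd

# Toy frame with made-up values (not taken from the real dataset).
toy = pd.DataFrame({
    'name': ['dish_a', 'dish_b', 'dish_c'],
    'prep_time': [10, -1, 20],           # numeric placeholder
    'region': ['North', '-1', 'South'],  # string placeholder
})

# Replace the integer form first, then the string form.
cleaned = toy.replace(-1, np.nan).replace('-1', np.nan)

# Count true missing values per column after the replacement.
missing_per_column = cleaned.isna().sum()
print(missing_per_column)
```

Counting missing values only after the replacement matters: before it, `isna()` would report the placeholder rows as present.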

In [None]:
# Sample 8: Fix Data Types (Ramen Stars)
# Explanation: The 'stars' column in the Ramen data may load as text. We convert it to numeric, coercing unparseable entries to NaN, and drop rows without a valid rating.
df_ramen['stars'] = pd.to_numeric(df_ramen['stars'], errors='coerce')
df_ramen.dropna(subset=['stars'], inplace=True)

In [None]:
# Sample 9: Standardize Text Data
# Explanation: Converting ingredient lists to lowercase to ensure consistency (e.g., 'Sugar' vs 'sugar').
df_indian['ingredients'] = df_indian['ingredients'].str.lower()
df_indian['ingredients'].head()

In [None]:
# Sample 10: Clean Column Names
# Explanation: Stripping whitespace from column names to prevent key errors during analysis.
df_starbucks.columns = [c.strip() for c in df_starbucks.columns]
print(df_starbucks.columns)

In [None]:
# Sample 11: Frequency Count (Univariate)
# Explanation: Counting how many dishes are Vegetarian vs Non Vegetarian in the Indian dataset.
df_indian['diet'].value_counts()

In [None]:
# Sample 12: Impute Missing Numerical Data
# Explanation: Filling missing prep_time and cook_time values with the median of the respective columns.
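# Assign the result back; calling fillna(inplace=True) on a selected column is unreliable in recent pandas versions.
df_indian['prep_time'] = df_indian['prep_time'].fillna(df_indian['prep_time'].median())
df_indian['cook_time'] = df_indian['cook_time'].fillna(df_indian['cook_time'].median())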

In [None]:
# Sample 13: Inspect Categorical Unique Values
# Explanation: Checking unique 'Style' values in Ramen (e.g., Cup, Pack, Bowl).
df_ramen['Style'].unique()

In [None]:
# Sample 14: Check for Duplicates
# Explanation: Identifying if there are any duplicate rows in the Starbucks dataset.
df_starbucks.duplicated().sum()

In [None]:
# Sample 15: Remove Duplicates
# Explanation: Dropping duplicate entries to ensure data integrity.
df_starbucks.drop_duplicates(inplace=True)

In [None]:
# Sample 16: Feature Engineering (Count)
# Explanation: Creating a new column 'num_ingredients' by counting the comma-separated items in the ingredients string.
df_indian['num_ingredients'] = df_indian['ingredients'].apply(lambda x: len(x.split(',')))
df_indian[['name', 'num_ingredients']].head()

In [None]:
# Sample 17: Standardize Country Names
# Explanation: Correcting inconsistent country names (USA vs United States) for cleaner grouping.
df_ramen['Country'] = df_ramen['Country'].replace('USA', 'United States')

In [None]:
# Sample 18: Fill Missing Values with Zero
# Explanation: Assuming NaN in trans_fat implies 0 grams and filling it.
df_starbucks['trans_fat_g'] = df_starbucks['trans_fat_g'].fillna(0)

In [None]:
# Sample 19: Detect Outliers
# Explanation: Identifying dishes that take an unreasonably long time to cook (> 600 mins).
outliers = df_indian[df_indian['cook_time'] > 600]
outliers[['name', 'cook_time']]

In [None]:
# Sample 20: Remove Outliers
# Explanation: Removing extreme outliers to prevent skewed visualization scales.
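# .copy() makes the filtered frame independent, so columns added later do not trigger SettingWithCopyWarning.
df_indian = df_indian[df_indian['cook_time'] <= 600].copy()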
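
The 600-minute cutoff is a fixed, hand-picked threshold. As an alternative sketch (not the approach this notebook uses), Tukey's IQR rule derives the cutoff from the data itself. The cook times below are made-up values for illustration.

```python
import pandas as pd

# Illustrative cook times in minutes (made-up values).
cook_time = pd.Series([20, 30, 35, 40, 45, 60, 720], name='cook_time')

q1 = cook_time.quantile(0.25)
q3 = cook_time.quantile(0.75)
iqr = q3 - q1

# Tukey's upper fence: values above it are flagged as outliers.
upper_fence = q3 + 1.5 * iqr

flagged = cook_time[cook_time > upper_fence]
print(f'Upper fence: {upper_fence:.1f} minutes')
print(flagged)
```

A data-driven fence adapts when the dataset changes, whereas a fixed threshold is easier to explain; which is preferable depends on the analysis.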

## Part 3: Simple Visualizations (Samples 21-50)

In [None]:
# Sample 21: Histogram with KDE
# Explanation: Visualizing the distribution of ramen ratings. Outcome: Shows most ramen is rated between 3 and 5 stars.
plt.figure(figsize=(8,5))
sns.histplot(df_ramen['stars'], bins=10, kde=True, color='orange')
plt.title('Distribution of Ramen Ratings')
plt.show()

In [None]:
# Sample 22: Count Plot (Horizontal)
# Explanation: Comparing the number of Veg vs Non-Veg dishes. Outcome: Shows a dominance of Vegetarian dishes in this dataset.
plt.figure(figsize=(10,5))
sns.countplot(y='diet', data=df_indian, palette='pastel')
plt.title('Count of Vegetarian vs Non-Vegetarian Dishes')
plt.show()

In [None]:
# Sample 23: Bar Chart of Categories
# Explanation: Analyzing packaging styles. 'Cup' and 'Bowl' represent different eating experiences compared to 'Pack'.
plt.figure(figsize=(12,6))
sns.countplot(x='Style', data=df_ramen, order=df_ramen['Style'].value_counts().index)
plt.title('Ramen Packaging Styles')
plt.show()

In [None]:
# Sample 24: Bar Plot of Top 10
# Explanation: Visualizing which countries produce the most distinct ramen varieties.
top_cuisines = df_ramen['Country'].value_counts().head(10)
plt.figure(figsize=(10,6))
sns.barplot(x=top_cuisines.values, y=top_cuisines.index, palette='viridis')
plt.title('Top 10 Countries by Ramen Variety')
plt.show()

In [None]:
# Sample 25: Pie Chart
# Explanation: Showing the proportion of dishes that are Main Course, Dessert, Snack, etc.
plt.figure(figsize=(8,8))
df_indian['course'].value_counts().plot.pie(autopct='%1.1f%%', startangle=90)
plt.title('Distribution of Meal Courses')
plt.ylabel('')
plt.show()

In [None]:
# Sample 26: Boxplot (Univariate)
# Explanation: Visualizing the spread of calories in Starbucks drinks to see median and outliers.
plt.figure(figsize=(10,6))
sns.boxplot(x='calories', data=df_starbucks)
plt.title('Boxplot of Starbucks Calories')
plt.show()

In [None]:
# Sample 27: Violin Plot
# Explanation: Combines boxplot and KDE to show the density of sugar content across drinks.
plt.figure(figsize=(10,6))
sns.violinplot(x='sugar_g', data=df_starbucks, color='pink')
plt.title('Violin Plot of Sugar Content')
plt.show()

In [None]:
# Sample 28: Horizontal Bar Chart (Pandas Built-in)
# Explanation: Simple horizontal bar chart using Pandas internal plotting.
df_indian['flavor_profile'].value_counts().plot(kind='barh', color='teal')
plt.title('Flavor Profiles of Indian Food')
plt.show()

In [None]:
# Sample 29: KDE Plot
# Explanation: Visualizing the probability density of preparation time.
sns.kdeplot(data=df_indian, x='prep_time', fill=True, clip=(0,100))
plt.title('Density Plot of Prep Time (Clipped at 100m)')
plt.show()

In [None]:
# Sample 30: Interactive Plotly Histogram
# Explanation: Using Plotly to create a histogram where you can hover over bars to see counts.
fig = px.histogram(df_starbucks, x='caffeine_mg', nbins=30, title='Interactive Histogram of Caffeine')
fig.show()

In [None]:
# Sample 31: Strip Plot
# Explanation: Showing individual data points for cook time across different courses.
sns.stripplot(x='course', y='cook_time', data=df_indian, jitter=True)
plt.title('Strip Plot: Cook Time by Course')
plt.xticks(rotation=45)
plt.show()

In [None]:
# Sample 32: GroupBy + Bar Chart
# Explanation: Aggregating data to find the countries with the highest average ratings.
avg_stars = df_ramen.groupby('Country')['stars'].mean().sort_values(ascending=False).head(10)
plt.figure(figsize=(10,5))
avg_stars.plot(kind='bar', color='gold')
plt.title('Top 10 Countries by Average Ramen Rating')
plt.ylim(3,5)
plt.show()

In [None]:
# Sample 33: ECDF Plot
# Explanation: Empirical Cumulative Distribution Function showing the percentage of drinks below a certain sodium level.
plt.figure(figsize=(10,6))
sns.ecdfplot(data=df_starbucks, x='sodium_mg')
plt.title('ECDF of Sodium Content')
plt.show()

In [None]:
# Sample 34: Stacked Histogram
# Explanation: Comparing cook time distributions for Veg vs Non-Veg stacked on top of each other.
sns.histplot(data=df_indian, x='cook_time', hue='diet', multiple='stack', bins=20)
plt.title('Stacked Histogram: Cook Time by Diet')
plt.xlim(0, 150)
plt.show()

In [None]:
# Sample 35: Funnel Chart (Plotly)
# Explanation: Visualizing the reduction in number of dishes as we move from top regions to smaller ones.
top_regions = df_indian['region'].value_counts()
fig = px.funnel(top_regions, title='Funnel Chart of Dishes by Region')
fig.show()

In [None]:
# Sample 36: Donut Chart
# Explanation: A variation of the pie chart with a hole in the middle to show Ramen Style proportions.
plt.figure(figsize=(8,8))
plt.pie(df_ramen['Style'].value_counts(), labels=df_ramen['Style'].value_counts().index, autopct='%1.1f%%', pctdistance=0.85)
centre_circle = plt.Circle((0,0),0.70,fc='white')
fig = plt.gcf()
fig.gca().add_artist(centre_circle)
plt.title('Donut Chart of Ramen Styles')
plt.show()

In [None]:
# Sample 37: Rug Plot
# Explanation: Adds marginal ticks to the axis to show the exact location of data points.
sns.rugplot(data=df_starbucks, x='calories', height=.1)
plt.title('Rug Plot of Calories')
plt.show()

In [None]:
# Sample 38: Heatmap
# Explanation: Visualizing the matrix of average preparation times across regions and courses.
df_heat = df_indian.pivot_table(index='region', columns='course', values='prep_time', aggfunc='mean')
plt.figure(figsize=(10,8))
sns.heatmap(df_heat, annot=True, cmap='coolwarm', fmt='.1f')
plt.title('Heatmap: Avg Prep Time by Region and Course')
plt.show()

In [None]:
# Sample 39: Interactive Scatter Plot
# Explanation: Basic scatter plot to show the strong correlation between sugar and calories.
fig = px.scatter(df_starbucks, x='sugar_g', y='calories', title='Scatter: Sugar vs Calories')
fig.show()

In [None]:
# Sample 40: Joint Hex Plot
# Explanation: Combines scatter plot and histograms to show density of prep vs cook time.
sns.jointplot(data=df_indian, x='prep_time', y='cook_time', kind='hex', xlim=(0,100), ylim=(0,100))
plt.suptitle('Joint Hex Plot: Prep vs Cook Time')
plt.show()

In [None]:
# Sample 41: Interactive Box Plot
# Explanation: Plotly box plot allowing hover to see quartiles and median ratings for each style.
fig = px.box(df_ramen, x='Style', y='stars', color='Style', title='Box Plot of Stars by Style')
fig.show()

In [None]:
# Sample 42: Point Plot
# Explanation: Visualizing the estimate of central tendency (mean) and confidence intervals.
plt.figure(figsize=(12,6))
sns.pointplot(x='state', y='cook_time', data=df_indian[df_indian['state'].isin(['Punjab', 'Maharashtra', 'Gujarat'])], hue='diet')
plt.title('Point Plot: Cook Time by State and Diet')
plt.show()

In [None]:
# Sample 43: Linear Regression Plot
# Explanation: Fitting a linear model to visualize the relationship between calories and fat.
sns.lmplot(x='calories', y='total_fat_g', data=df_starbucks, height=6, aspect=1.5)
plt.title('Regression Plot: Calories vs Fat')
plt.show()

In [None]:
# Sample 44: Swarm Plot
# Explanation: Similar to strip plot but adjusts points so they don't overlap, showing distribution clearly.
plt.figure(figsize=(10,6))
sns.swarmplot(x='diet', y='prep_time', data=df_indian, size=3)
plt.ylim(0, 150)
plt.title('Swarm Plot: Prep Time by Diet')
plt.show()

In [None]:
# Sample 45: Treemap
# Explanation: Hierarchical visualization of dishes broken down by Region -> State -> Course.
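# Missing region/state values are labeled 'Unknown' because px.treemap rejects a missing parent level with non-missing children.
fig = px.treemap(df_indian.fillna({'region': 'Unknown', 'state': 'Unknown'}), path=['region', 'state', 'course'], title='Treemap of Indian Cuisine Hierarchy')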
fig.show()

In [None]:
# Sample 46: Correlation Heatmap
# Explanation: Showing the numerical correlation coefficients between different nutritional values.
df_corr = df_starbucks[['calories', 'total_fat_g', 'sugar_g', 'caffeine_mg']].corr()
sns.heatmap(df_corr, annot=True, cmap='Reds')
plt.title('Correlation Matrix of Nutritional Facts')
plt.show()

In [None]:
# Sample 47: Sunburst Chart
# Explanation: Interactive concentric chart showing breakdown of Diet -> Course -> Flavor.
fig = px.sunburst(df_indian, path=['diet', 'course', 'flavor_profile'], title='Sunburst Chart: Diet Breakdown')
fig.show()

In [None]:
# Sample 48: Pair Plot
# Explanation: Plotting pairwise relationships in a dataset. Good for spotting correlations.
sns.pairplot(df_starbucks[['calories', 'sugar_g', 'caffeine_mg', 'fiber_g']])
plt.show()

In [None]:
# Sample 49: Facet Grid
# Explanation: Creating a grid of histograms for 'stars' separated by 'Style' (Cup, Pack, etc).
g = sns.FacetGrid(df_ramen, col='Style', col_wrap=4)
g.map(sns.histplot, 'stars')
plt.show()

In [None]:
# Sample 50: 3D Scatter Plot
# Explanation: 3D visualization of Sugar, Calories, and Caffeine.
fig = px.scatter_3d(df_starbucks, x='sugar_g', y='calories', z='caffeine_mg', color='size', title='3D Scatter Plot')
fig.show()

## Part 4: Advanced Analysis & Strings (Samples 51-100)

In [None]:
# Sample 51: Word Cloud
# Explanation: Generating a word cloud to visualize the most common ingredients in Indian cuisine.
all_ingredients = ' '.join(df_indian['ingredients'])
wordcloud = WordCloud(width=800, height=400, background_color='white').generate(all_ingredients)
plt.figure(figsize=(10,5))
plt.imshow(wordcloud, interpolation='bilinear')
plt.axis('off')
plt.title('Word Cloud of Indian Ingredients')
plt.show()

In [None]:
# Sample 52: Feature Engineering (Text Length)
# Explanation: Analyzing if brand names are short or long.
df_ramen['brand_len'] = df_ramen['Brand'].str.len()  # .str.len() tolerates missing brand names
sns.histplot(df_ramen['brand_len'], bins=20)
plt.title('Distribution of Brand Name Lengths')
plt.show()

In [None]:
# Sample 53: Top N Frequent Words
# Explanation: Counting the 10 most frequent ingredients. Each dish is split on commas separately so ingredients from neighboring rows are not fused together.
top_ingredients = df_indian['ingredients'].str.split(',').explode().str.strip().value_counts().head(10)
print(top_ingredients)
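
When counting comma-separated tokens across rows, how the rows are combined matters. The sketch below uses a made-up `toy` frame to compare joining all rows with a space before splitting against splitting each row and flattening with `explode`.

```python
import pandas as pd

# Toy frame with made-up ingredient lists.
toy = pd.DataFrame({
    'name': ['dish_a', 'dish_b', 'dish_c'],
    'ingredients': ['rice, milk, sugar', 'milk, ghee', 'sugar, milk'],
})

# Join-then-split: the last item of one row fuses with the first item of the next.
joined_tokens = pd.Series(' '.join(toy['ingredients']).split(','))

# Split per row, flatten, and strip surrounding whitespace.
per_row_tokens = toy['ingredients'].str.split(',').explode().str.strip()
per_row_counts = per_row_tokens.value_counts()

print(joined_tokens.tolist())
print(per_row_counts)
```

The joined version produces tokens such as `' sugar milk'`, which would be counted as a separate ingredient; the per-row version keeps every ingredient intact.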

In [None]:
# Sample 54: Bar Chart of Top Words
# Explanation: Visualizing the top ingredients extracted in the previous sample.
fig = px.bar(top_ingredients, title='Top 10 Ingredients Count')
fig.show()

In [None]:
# Sample 55: String Filtering
# Explanation: Calculating what percentage of ramen varieties explicitly contain the word 'Spicy'.
df_spicy = df_ramen[df_ramen['Variety'].str.contains('Spicy', case=False, na=False)]
print(f'Percentage of Spicy Ramen: {len(df_spicy)/len(df_ramen)*100:.2f}%')
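
A case-insensitive substring filter can be sanity-checked on a few hand-written names. This is a minimal sketch with made-up variety names; `na=False` is used so a missing name counts as a non-match rather than producing a missing value in the mask.

```python
import pandas as pd

# Made-up variety names, including one missing entry.
varieties = pd.Series(['Spicy Beef', 'Chicken', None, 'extra SPICY Shrimp'])

# case=False ignores letter case; na=False treats missing names as non-matches.
is_spicy = varieties.str.contains('spicy', case=False, na=False)

# The mean of a boolean mask is the fraction of True values.
spicy_share = is_spicy.mean() * 100
print(f'Spicy share: {spicy_share:.2f}%')
```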

In [None]:
# Sample 56: Comparative KDE
# Explanation: Do spicy ramens get better ratings? Comparing the distributions.
sns.kdeplot(data=df_spicy, x='stars', label='Spicy', fill=True)
sns.kdeplot(data=df_ramen[~df_ramen['Variety'].str.contains('Spicy', case=False, na=False)], x='stars', label='Non-Spicy', fill=True)
plt.legend()
plt.title('Ratings: Spicy vs Non-Spicy Ramen')
plt.show()

In [None]:
# Sample 57: Ratio Analysis
# Explanation: Creating a 'Kick per Calorie' metric to find efficient caffeine sources. Adding 1 to calories avoids division by zero for zero-calorie drinks.
df_starbucks['caffeine_per_cal'] = df_starbucks['caffeine_mg'] / (df_starbucks['calories'] + 1)
df_starbucks.sort_values('caffeine_per_cal', ascending=False).head(5)[['product_name', 'caffeine_per_cal']]

In [None]:
# Sample 58: Grouped Bar Chart
# Explanation: Comparing cook times across regions, split by diet.
df_indian.groupby(['region', 'diet'])['cook_time'].mean().unstack().plot(kind='bar', stacked=False)
plt.title('Avg Cook Time by Region and Diet')
plt.show()

In [None]:
# Sample 59: Parallel Categories
# Explanation: Visualizing flow/relationships between categorical variables (Region -> Course -> Diet).
fig = px.parallel_categories(df_indian, dimensions=['region', 'course', 'diet'], title='Parallel Categories Diagram')
fig.show()

In [None]:
# Sample 60: Boxen Plot
# Explanation: A box plot variant that shows more distributional information in the tails.
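# 'USA' was mapped to 'United States' in Sample 17, so the filter uses the standardized name.
sns.boxenplot(x='Country', y='stars', data=df_ramen[df_ramen['Country'].isin(['Japan', 'United States', 'South Korea', 'China', 'Vietnam'])])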
plt.title('Boxen Plot (Enhanced Boxplot)')
plt.show()

In [None]:
# Sample 61: Total Time Calculation
# Explanation: Adding prep and cook time.
df_indian['total_time'] = df_indian['prep_time'] + df_indian['cook_time']

In [None]:
# Sample 62: Log Scale Plot
# Explanation: Handling skewed time data.
sns.histplot(df_indian['total_time'], log_scale=True)
plt.title('Log Scaled Total Time')

In [None]:
# Sample 63: Crosstab
# Explanation: Table showing frequency of course types per region.
pd.crosstab(df_indian['region'], df_indian['course'])
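
Raw counts in a contingency table are hard to compare when groups differ in size. As a sketch on made-up data, `normalize='index'` converts each row of the crosstab into shares that sum to 1.

```python
import pandas as pd

# Toy frame with made-up region/course pairs.
toy = pd.DataFrame({
    'region': ['North', 'North', 'South', 'South', 'South', 'West'],
    'course': ['dessert', 'main course', 'dessert', 'snack', 'snack', 'main course'],
})

# Raw frequency table.
counts = pd.crosstab(toy['region'], toy['course'])

# Row-normalized table: each region's courses as a fraction of that region's dishes.
row_share = pd.crosstab(toy['region'], toy['course'], normalize='index')

print(counts)
print(row_share.round(2))
```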

In [None]:
# Sample 64: Heatmap of Crosstab
# Explanation: Visualizing the contingency table.
sns.heatmap(pd.crosstab(df_indian['region'], df_indian['course']), cmap='Blues')

In [None]:
# Sample 65: Top 5 Sugary Drinks
# Explanation: Finding the max sugar values.
df_starbucks.nlargest(5, 'sugar_g')[['product_name', 'sugar_g']]

In [None]:
# Sample 66: Lowest Calorie Drinks
# Explanation: Finding diet-friendly options.
df_starbucks.nsmallest(5, 'calories')[['product_name', 'calories']]

In [None]:
# Sample 67: Text Word Count
# Explanation: Counting words in ramen names.
df_ramen['Variety_word_count'] = df_ramen['Variety'].apply(lambda x: len(str(x).split()))

In [None]:
# Sample 68: Scatter Text vs Rating
# Explanation: Does a longer name mean better rating?
sns.scatterplot(x='Variety_word_count', y='stars', data=df_ramen)

In [None]:
# Sample 69: Ingredient Filter (Milk)
# Explanation: Counting dishes containing milk.
df_indian[df_indian['ingredients'].str.contains('milk')].shape[0]

In [None]:
# Sample 70: Ingredient Filter (Rice)
# Explanation: Counting dishes containing rice.
df_indian[df_indian['ingredients'].str.contains('rice')].shape[0]

In [None]:
# Sample 71: Avg Calories by Size
# Explanation: Simple aggregation by drink size.
df_starbucks.groupby('size')['calories'].mean().plot(kind='bar')

In [None]:
# Sample 72: Cluster Map
# Explanation: Hierarchical clustering of correlations.
sns.clustermap(df_corr, figsize=(6,6))

In [None]:
# Sample 73: Cumulative Sum
# Explanation: Cumulative distribution of ramen counts.
df_ramen['Country'].value_counts().cumsum().plot()

In [None]:
# Sample 74: Unique Value Count
# Explanation: How many unique states are represented?
df_indian['state'].nunique()

In [None]:
# Sample 75: 5-Star Sources
# Explanation: Which countries produce the most 5-star ramen?
df_ramen[df_ramen['stars'] == 5]['Country'].value_counts().head(5)

In [None]:
# Sample 76: Scatter Matrix (Plotly)
# Explanation: Interactive pair plot.
fig = px.scatter_matrix(df_starbucks, dimensions=['calories', 'sugar_g', 'caffeine_mg'], color='size')
fig.show()

In [None]:
# Sample 77: Statistical Summary
# Explanation: Detailed stats (mean, std, quartiles) for cook time.
df_indian['cook_time'].describe()

In [None]:
# Sample 78: Top 5 Brands Pie
# Explanation: Share of entries among the five most-reviewed ramen brands in the dataset.
df_ramen['Brand'].value_counts().head(5).plot(kind='pie')

In [None]:
# Sample 79: Residual Plot
# Explanation: Checking residuals for linear regression assumptions.
sns.residplot(x='sugar_g', y='calories', data=df_starbucks)

In [None]:
# Sample 80: Multiple Aggregations
# Explanation: Calculating multiple stats at once.
df_indian.groupby('diet')['prep_time'].agg(['mean', 'min', 'max'])

In [None]:
# Sample 81: Split Violin Plot
# Explanation: Comparing distributions side-by-side.
sns.violinplot(x='course', y='prep_time', hue='diet', data=df_indian, split=True)

In [None]:
# Sample 82: Binary Feature Creation
# Explanation: Flagging drinks with 'whip' in the name.
df_starbucks['whip'] = df_starbucks['product_name'].apply(lambda x: 1 if 'whip' in x.lower() else 0)

In [None]:
# Sample 83: Impact of Whip on Calories
# Explanation: Visual comparison of whipped vs non-whipped.
sns.barplot(x='whip', y='calories', data=df_starbucks)

In [None]:
# Sample 84: Dummy Time Column
# Explanation: Simulating temporal data.
df_ramen['year'] = 2023 # Dummy year as dataset lacks date
print('Added dummy year')

In [None]:
# Sample 85: Multi-column Sort
# Explanation: Finding the quickest overall dishes.
df_indian.sort_values(['prep_time', 'cook_time'], ascending=[True, True]).head()

In [None]:
# Sample 86: Column-wise Mean
# Explanation: Average of all numerical columns.
df_starbucks.select_dtypes(include='number').mean()

In [None]:
# Sample 87: Handling Missing Categories
# Explanation: Filling NaNs with 'Unknown' and counting.
df_indian['region'].fillna('Unknown').value_counts()

In [None]:
# Sample 88: Cumulative Density
# Explanation: CDF of Sodium.
sns.kdeplot(df_starbucks['sodium_mg'], cumulative=True)

In [None]:
# Sample 89: Binning/Discretization
# Explanation: Categorizing continuous ratings.
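# include_lowest=True keeps 0-star ratings in the 'Low' bin instead of turning them into NaN.
df_ramen['Stars_Category'] = pd.cut(df_ramen['stars'], bins=[0, 3, 4, 5], labels=['Low', 'Avg', 'High'], include_lowest=True)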
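
`pd.cut` uses right-closed intervals by default, so a value equal to the lowest bin edge falls outside every bin unless `include_lowest=True` is passed. The sketch below shows the difference on a few made-up ratings.

```python
import pandas as pd

# Made-up ratings covering both bin edges.
stars = pd.Series([0.0, 2.5, 3.0, 3.75, 5.0])
bins = [0, 3, 4, 5]
labels = ['Low', 'Avg', 'High']

# Default: intervals are (0, 3], (3, 4], (4, 5], so 0.0 is not binned.
default_cut = pd.cut(stars, bins=bins, labels=labels)

# include_lowest=True closes the first interval on the left as well.
inclusive_cut = pd.cut(stars, bins=bins, labels=labels, include_lowest=True)

print(default_cut.tolist())
print(inclusive_cut.tolist())
```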

In [None]:
# Sample 90: Plotting Binned Data
# Explanation: Visualizing the new categories.
sns.countplot(x='Stars_Category', data=df_ramen)

In [None]:
# Sample 91: Pivot Table
# Explanation: Complex data summarization.
pd.pivot_table(df_indian, values='cook_time', index=['region'], columns=['diet'])

In [None]:
# Sample 92: Filter Columns by Name
# Explanation: Selecting only columns measured in grams. The regex matches names ending in '_g'; like='g' would also match the '_mg' columns.
df_starbucks.filter(regex='_g$').head()
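
`DataFrame.filter(like=...)` is a plain substring match on column names, while `regex=` allows anchoring. The sketch below compares the two on a one-row frame whose column names mirror the ones used in this notebook; the values are made up.

```python
import pandas as pd

# One-row frame; values are made up, column names mirror those used above.
toy = pd.DataFrame([{
    'product_name': 'example drink',
    'calories': 100,
    'sugar_g': 20,
    'total_fat_g': 3,
    'caffeine_mg': 75,
    'sodium_mg': 10,
}])

# Substring match: any column name containing the letter 'g'.
like_cols = toy.filter(like='g').columns.tolist()

# Anchored regex: only column names ending in '_g'.
regex_cols = toy.filter(regex='_g$').columns.tolist()

print(like_cols)
print(regex_cols)
```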

In [None]:
# Sample 93: Query Function
# Explanation: Filtering using Pandas query syntax for quick dishes.
df_indian.query('prep_time < 10 and cook_time < 20')

In [None]:
# Sample 94: Count Unique Brands
# Explanation: Total number of ramen brands.
df_ramen['Brand'].nunique()

In [None]:
# Sample 95: Line Plot (Sorted)
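# Explanation: Plotting calories against rank after sorting. The old index is dropped so the x-axis follows the sorted order.
sorted_calories = df_starbucks['calories'].sort_values().reset_index(drop=True)
sns.lineplot(x=sorted_calories.index, y=sorted_calories.values)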
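
After sorting, `reset_index()` keeps the old row labels in a new `index` column, whereas `reset_index(drop=True)` discards them and numbers the rows in their sorted order. The sketch below illustrates the difference on three made-up calorie values.

```python
import pandas as pd

# Made-up calorie values in unsorted order.
toy = pd.DataFrame({'calories': [300, 50, 120]})

# Old labels are preserved in an 'index' column, still in their original numbering.
kept = toy.sort_values('calories').reset_index()

# Old labels are dropped; the new index is the rank within the sorted data.
ranked = toy.sort_values('calories').reset_index(drop=True)

print(kept)
print(ranked)
```

Plotting against the preserved `index` column would scatter the points by their original position, so the rank-based index is the one to use for a sorted line plot.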

In [None]:
# Sample 96: Interactive Stacked Bar
# Explanation: Plotly stacked bar chart.
fig = px.bar(df_indian, x='state', y='prep_time', color='diet', title='Stacked Bar: Prep Time by State')
fig.show()

In [None]:
# Sample 97: Simple Pandas Hist
# Explanation: Quick check of fiber content.
df_starbucks['fiber_g'].plot(kind='hist', bins=5, title='Fiber Distribution')

In [None]:
# Sample 98: String Startswith
# Explanation: Finding dishes starting with 'A'.
df_indian[df_indian['name'].str.startswith('A')].head()

In [None]:
# Sample 99: Aggregation Table
# Explanation: Mean rating and count per style.
df_ramen.groupby('Style')['stars'].agg(['mean', 'count'])

In [None]:
# Sample 100: Completion
# Explanation: Final step.
print('Completed 100 Samples of Food EDA!')