In theory, the older you are, the more opportunities you have had to try uncommon foods.

Start with some imports, get the data, and drop the unnecessary columns.

In [1]:
import pandas as pd
import seaborn as sns

sns.set_style("dark")
sns.set_palette("bright")  # make "bright" the default palette for later plots

df = pd.read_csv("../input/have-you-ever-eaten-these-foods/responses.csv")

df = df.drop(["Race", "Home location"], axis="columns")

Get the number of foods each respondent has eaten.

In [1]:
foods = list(df)[2:]

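def count_foods(row):
    """Count the food columns with a truthy, non-blank answer."""
    total = 0

    for column in foods:
        value = row[column]
        # NaN is truthy in Python, so skip blank answers explicitly.
        if pd.notna(value) and value:
            total += 1

    return total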

df["Foods eaten"] = df.apply(count_foods, axis=1)
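A truthiness check treats blank (NaN) answers as "eaten", which can inflate the counts. The sketch below illustrates this on a made-up frame; the column names and the 1/0 encoding are assumptions for illustration only, not the encoding of the actual survey file.

```python
import numpy as np
import pandas as pd

# Hypothetical toy responses: 1 = eaten, 0 = not eaten, NaN = left blank.
toy = pd.DataFrame({
    "Food A": [1, 0, np.nan],
    "Food B": [1, np.nan, np.nan],
})

# NaN is truthy in Python, so a plain `if value:` check counts blanks as eaten.
truthy_counts = toy.apply(lambda row: sum(1 for v in row if v), axis=1)

# Comparing against 1 treats blanks as "not eaten" (NaN == 1 is False).
explicit_counts = toy.eq(1).sum(axis=1)

print(truthy_counts.tolist())    # [2, 1, 2]
print(explicit_counts.tolist())  # [2, 0, 0]
```

If the real answers are stored as strings or booleans, the comparison value would need to change accordingly.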

Put it all together in a figure, adding in a trendline as well.

In [1]:
sns.regplot(data=df, x="Age in years", y="Foods eaten", scatter_kws={'alpha':0.1}).set_title("Foods eaten by age in years")

The regression line slopes gently upward, which suggests that older respondents tend to report having eaten more of the listed foods. The scatter is wide, so the relationship is loose at best.
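To put a number on how steep the trendline is, one option is `scipy.stats.linregress`, which reports the slope along with the correlation coefficient and p-value. This is a minimal sketch on made-up values, not the survey responses; with the real data you would pass `df["Age in years"]` and `df["Foods eaten"]` instead, assuming neither column contains blanks.

```python
import pandas as pd
from scipy import stats

# Hypothetical toy data for illustration only, not the survey responses.
toy = pd.DataFrame({
    "Age in years": [18, 25, 32, 40, 55, 63],
    "Foods eaten": [10, 14, 12, 18, 17, 22],
})

# Ordinary least-squares fit of foods eaten against age.
fit = stats.linregress(toy["Age in years"], toy["Foods eaten"])

print(f"slope: {fit.slope:.3f} foods per year of age")
print(f"r: {fit.rvalue:.3f}, p-value: {fit.pvalue:.3f}")
```

A small positive slope with a low `r` would match what the figure shows: an upward trend with a lot of spread around it.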

**Note: the following figures exclude some responses, so they do not represent the full dataset.**

In [1]:
import numpy as np
from scipy import stats

of_df = df[np.abs(stats.zscore(df["Foods eaten"])) < 3]

sns.regplot(data=of_df, x="Age in years", y="Foods eaten", scatter_kws={'alpha':0.1}).set_title("Foods eaten by age in years (excl. outliers)")

The regression line still slopes upward, but less steeply once outliers are excluded.
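The outlier filter keeps only rows whose "Foods eaten" value lies within three standard deviations of the mean. A small sketch on made-up counts shows the effect; the values are hypothetical and chosen so that exactly one row is extreme.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Hypothetical counts: twenty typical respondents and one extreme value.
toy = pd.DataFrame({"Foods eaten": [10] * 10 + [12] * 10 + [60]})

# z-score = (value - mean) / standard deviation, computed per row.
z = stats.zscore(toy["Foods eaten"])

# Keep rows within three standard deviations of the mean.
kept = toy[np.abs(z) < 3]

print(len(toy), len(kept))  # 21 20
```

Note that the extreme value itself pulls the mean and standard deviation toward it, so in very small samples a z-score threshold of 3 may never be reached.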

Now let's see what the graph looks like when we remove every respondent who left an answer blank.

In [1]:
# axis=0 (the default) drops rows; axis=1 would drop whole columns instead.
nf_df = df.dropna()

sns.regplot(data=nf_df, x="Age in years", y="Foods eaten", scatter_kws={'alpha':0.1}).set_title("Foods eaten by age in years (excl. blank answers)")
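`dropna` removes either rows or columns depending on `axis`, which changes what "excluding blank answers" means. A minimal sketch on a made-up frame (the column names and values are hypothetical) shows the difference:

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one blank answer in the second row.
toy = pd.DataFrame({
    "Age in years": [20, 30, 40],
    "Food A": [1, np.nan, 0],
})

rows_dropped = toy.dropna(axis=0)  # drops respondents with any blank answer
cols_dropped = toy.dropna(axis=1)  # drops questions with any blank answer

print(rows_dropped.shape)  # (2, 2)
print(cols_dropped.shape)  # (3, 1)
```

Dropping rows keeps every question but fewer respondents; dropping columns keeps every respondent but discards any question that someone skipped.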

And now without null answers or outliers:

In [1]:
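# Drop respondents with blank answers first, then compute z-scores on the
# remaining rows so the mask and the frame have the same length.
nfof_df = df.dropna()
nfof_df = nfof_df[np.abs(stats.zscore(nfof_df["Foods eaten"])) < 3]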

sns.regplot(data=nfof_df, x="Age in years", y="Foods eaten", scatter_kws={'alpha':0.1}).set_title("Foods eaten by age in years (excl. outliers/blank answers)")