# 8 Jul 2024

In [2]:
import pandas as pd 
import plotly.express as px 

df = pd.read_csv("../data/OnlineShoppersbyAgeGroup.csv")
print(f"Shape of online shopper dataframe: {df.shape}")

Shape of online shopper dataframe: (70, 3)


In [4]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 70 entries, 0 to 69
Data columns (total 3 columns):
 #   Column      Non-Null Count  Dtype 
---  ------      --------------  ----- 
 0   year        70 non-null     int64 
 1   age_group   70 non-null     object
 2   percentage  70 non-null     int64 
dtypes: int64(2), object(1)
memory usage: 1.8+ KB


The info above shows 70 non-null values in every column, so the dataframe has no missing data. The next step is to explore the trend with a line graph.
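The same check can be made explicit by counting missing values per column. Below is a minimal sketch on a small made-up frame; `sample` and its values are hypothetical stand-ins for the CSV, and one gap is included on purpose to show what a non-zero count looks like.

```python
import pandas as pd

# Hypothetical stand-in for the CSV; values are invented for illustration.
sample = pd.DataFrame({
    "year": [2014, 2014, 2015, 2015],
    "age_group": ["15 to 24 years", "60 and above",
                  "15 to 24 years", "60 and above"],
    "percentage": [40, 5, 45, None],  # one deliberate gap
})

# Number of missing values in each column; all zeros means no gaps.
missing_per_column = sample.isna().sum()
print(missing_per_column)

# True if at least one column contains a missing value.
has_missing = bool(missing_per_column.any())
print(has_missing)
```

On the real dataframe, `df.isna().sum()` should report zero for all three columns, matching the non-null counts from `df.info()`.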

In [5]:
df['age_group'].value_counts()

age_group
15 to 24 years    12
25 to 34 years    12
35 to 49 years    12
50 to 59 years    12
60 and above      12
less than 7        5
7 to 14 years      5
Name: count, dtype: int64

In [6]:
df = df[~df['age_group'].isin(['less than 7', '7 to 14 years'])]

The two groups of minors ('less than 7' and '7 to 14 years') are removed because their percentages are only available from 2014 and are mostly close to 0%.
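One way to confirm which years each age group covers is to aggregate the `year` column per group. This is only a sketch on made-up rows; `sample` and its numbers are hypothetical and do not come from the CSV.

```python
import pandas as pd

# Hypothetical rows for illustration; not taken from the real dataset.
sample = pd.DataFrame({
    "year": [2012, 2013, 2014, 2014, 2014],
    "age_group": ["25 to 34 years", "25 to 34 years", "25 to 34 years",
                  "less than 7", "7 to 14 years"],
    "percentage": [50, 55, 60, 0, 1],
})

# First year, last year and number of records for each age group.
coverage = sample.groupby("age_group")["year"].agg(["min", "max", "count"])
print(coverage)
```

Groups whose `min` year is later than the others, or whose `count` is lower, have a shorter history and are candidates for exclusion from a trend comparison.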

In [8]:
fig = px.line(df, x='year', y='percentage', color='age_group', markers=True,
            title="% Online Shoppers by Age Group", labels={"age_group": "Age Group"})
fig.show()

Observations:
1. The age group with the highest proportion of online shoppers is young adults aged 25-34 years. Most are probably working and have the financial means to spend.
2. Close behind are the 15-24 and 35-49 age groups.
   - Potential reasons:
     - Youths who are still schooling look for better deals online to save money and stay within their budget, which can be harder to do in-store.
     - Increased technology adoption and savviness among the middle-aged group.
3. An uptick can be observed for the 50-59 age group from 2016 onwards, as more of them turned to online shopping. This could reflect increased familiarity with the technology used for online shopping among this group.
4. The worrying trend is among those aged 60 and above, who have the lowest proportion of online shoppers of all the age groups. They might find it difficult to learn and use technology, possibly because socioeconomic factors restrict their access.
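The observations above are read off the chart. They could be backed with numbers by pivoting the data to one column per age group and comparing years. The sketch below uses invented values; `sample`, `wide`, `change` and `ranking` are hypothetical names, and the figures are not from the dataset.

```python
import pandas as pd

# Hypothetical values for illustration only.
sample = pd.DataFrame({
    "year": [2016, 2016, 2018, 2018],
    "age_group": ["50 to 59 years", "60 and above",
                  "50 to 59 years", "60 and above"],
    "percentage": [20, 5, 35, 8],
})

# One row per year, one column per age group.
wide = sample.pivot(index="year", columns="age_group", values="percentage")

# Change in percentage points between the first and last year.
change = wide.iloc[-1] - wide.iloc[0]
print(change)

# Age groups ranked by their share in the latest year.
ranking = wide.iloc[-1].sort_values(ascending=False)
print(ranking)
```

Applied to the filtered `df`, the same pivot would show which group leads in the latest year and how large the post-2016 rise for the 50-59 group actually is.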

In [10]:
import ipywidgets as widgets
from IPython.display import display

# Dropdown listing the remaining age groups in sorted order
age = widgets.Dropdown(options=sorted(df['age_group'].unique()), description='Age Group')

def plotchartbyagegroup(selection):
    # Keep only the rows for the selected age group
    tmp = df[df['age_group'] == selection]
    # Line chart with each point labelled by its percentage value
    fig = px.line(tmp, x='year', y='percentage', markers=True,
                  title=f'% Online Shoppers for {selection}', text='percentage')
    fig.update_traces(textposition='top center')
    fig.show()

widgets.interact(plotchartbyagegroup, selection=age)

interactive(children=(Dropdown(description='Age Group', options=('15 to 24 years', '25 to 34 years', '35 to 49…

<function __main__.plotchartbyagegroup(selection)>