# Introduction

In this section, we answer the question "**What are the top 3 age groups that use the Netflix platform?**" The answer will help determine which age groups to focus on when adding new shows or movies.

# Import pandas and prepare data

In [2]:
import pandas as pd

In [3]:
# Import and take a look at the dataset
data = pd.read_csv("Netflix Userbase.csv")
data.head()

Unnamed: 0,User ID,Subscription Type,Monthly Revenue,Join Date,Last Payment Date,Country,Age,Gender,Device,Plan Duration
0,1,Basic,10,15-01-22,10-06-23,United States,28,Male,Smartphone,1 Month
1,2,Premium,15,05-09-21,22-06-23,Canada,35,Female,Tablet,1 Month
2,3,Standard,12,28-02-23,27-06-23,United Kingdom,42,Male,Smart TV,1 Month
3,4,Standard,12,10-07-22,26-06-23,Australia,51,Female,Laptop,1 Month
4,5,Basic,10,01-05-23,28-06-23,Germany,33,Male,Smartphone,1 Month


We also need to check for null values.

In [4]:
data.isnull().value_counts()

User ID  Subscription Type  Monthly Revenue  Join Date  Last Payment Date  Country  Age    Gender  Device  Plan Duration
False    False              False            False      False              False    False  False   False   False            2500
Name: count, dtype: int64
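
As a complementary check, a per-column count of missing values can be easier to read than the combined `value_counts()` output. The sketch below runs on a small illustrative frame built from a few of the rows shown above (the name `sample` and the reduced column set are assumptions for the example, not the full dataset).

```python
import pandas as pd

# Illustrative frame based on the first rows shown above, not the full dataset
sample = pd.DataFrame({
    "User ID": [1, 2, 3],
    "Country": ["United States", "Canada", "United Kingdom"],
    "Age": [28, 35, 42],
})

# Number of missing values in each column
null_counts = sample.isnull().sum()
print(null_counts)

# Single flag: is any value missing anywhere in the frame?
has_nulls = bool(sample.isnull().any().any())
print(has_nulls)
```

On the real dataset, the same two calls could be applied to `data` directly.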

The dataset contains no null values, so we can skip cleaning. However, we still need to add the ```age_group``` column, which will hold the age group of each user.

In [None]:
print(f"Minimum age: {data.Age.min()}") # Find the minimum age
print(f"Maximum age: {data.Age.max()}") # Find the maximum age

Minimum age: 26
Maximum age: 51


In [None]:
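bins = [20, 30, 40, 50, 60]  # Bin edges; they cover the observed ages (26 to 51)
labels = ['20-30', '30-40', '40-50', '50+']  # One label per interval between edges

# Add age_group column
# right=False makes each interval left-closed: [20, 30), [30, 40), [40, 50), [50, 60)
data['age_group'] = pd.cut(data['Age'], bins=bins, labels=labels, right=False)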

In [None]:
data.head() # Take a look at the new dataset

Unnamed: 0,User ID,Subscription Type,Monthly Revenue,Join Date,Last Payment Date,Country,Age,Gender,Device,Plan Duration,age_group
0,1,Basic,10,15-01-22,10-06-23,United States,28,Male,Smartphone,1 Month,20-30
1,2,Premium,15,05-09-21,22-06-23,Canada,35,Female,Tablet,1 Month,30-40
2,3,Standard,12,28-02-23,27-06-23,United Kingdom,42,Male,Smart TV,1 Month,40-50
3,4,Standard,12,10-07-22,26-06-23,Australia,51,Female,Laptop,1 Month,50+
4,5,Basic,10,01-05-23,28-06-23,Germany,33,Male,Smartphone,1 Month,30-40
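
Because the labels share boundary values (30 appears in both `20-30` and `30-40`), it helps to confirm where an age on an edge ends up. The following is a minimal sketch using the same bins and labels as above on a handful of hypothetical ages (`ages` is made up for illustration).

```python
import pandas as pd

bins = [20, 30, 40, 50, 60]
labels = ['20-30', '30-40', '40-50', '50+']

# Hypothetical ages chosen to sit on or next to the bin edges
ages = pd.Series([29, 30, 39, 40, 51])

# right=False: the left edge is included, the right edge is excluded
groups = pd.cut(ages, bins=bins, labels=labels, right=False)
print(groups.tolist())

# An age equal to the last edge falls outside every interval and becomes NaN
edge_case = pd.cut(pd.Series([60]), bins=bins, labels=labels, right=False)
print(edge_case.isna().iloc[0])
```

With these settings, a user aged exactly 30 is counted in `30-40`, and any age of 60 or more would be left without a group. That does not affect this dataset, where the maximum age is 51.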


# Find the top 3 age groups 

In [8]:
top = data["age_group"].value_counts()  # value_counts() already sorts by count, descending

In [9]:
for i in range(3):
    print(f"#{i+1} most common age group: {top.index[i]}")

#1 most common age group: 30-40
#2 most common age group: 40-50
#3 most common age group: 20-30
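
The ranking above shows only the order of the groups. To see how large each group is, the counts and their share of all users can be printed alongside the labels. This is a rough sketch on the age groups of the five sample rows displayed earlier (`sample_groups` is an illustrative stand-in, so the numbers do not describe the full dataset).

```python
import pandas as pd

# Age groups of the five sample rows shown earlier, not the full dataset
sample_groups = pd.Series(['20-30', '30-40', '40-50', '50+', '30-40'])

# Counts per group, sorted from most to least common
counts = sample_groups.value_counts()

# Keep the three most common groups and compute their share of all rows
top3 = counts.head(3)
shares = (top3 / len(sample_groups)).round(2)

for group in top3.index:
    print(f"{group}: {top3[group]} users ({shares[group]:.0%})")
```

On the real data, `data["age_group"]` would take the place of `sample_groups`.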


# Conclusion
In this section, we used pandas to find the top 3 age groups of Netflix users. In order, they are 30-40, 40-50, and 20-30.

# Next up
In the next section, we will explore the dataset to find the devices most used by Netflix users. You can check it by clicking <a>here</a>.