<a href="https://colab.research.google.com/github/gauravvxv/Netflix-Users-Database/blob/main/Notebook/users.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 🎥**Netflix Users Project**

##  📌**1. Introduction**

In this project, I work with a Netflix users dataset. The main objective is to analyze and visualize user data to uncover patterns and insights, including user demographics, subscription types, content preferences, and viewing habits. The workflow covers data cleaning, exploration, and visualization.

##  📥**2. Import libraries**

We import the libraries needed for data handling and visualization:

*   `pandas` for data manipulation
*   `matplotlib` and `seaborn` for data visualization

In [1]:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

## 📁 **3. Load Dataset**

We load the dataset with pandas and display the first few rows to understand its structure.

In [4]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [6]:
path = '/content/drive/MyDrive/Netflix-users'

df = pd.read_csv(path+'/netflix_users.csv')

In [7]:
df.head()

| | User_ID | Name | Age | Country | Subscription_Type | Watch_Time_Hours | Favorite_Genre | Last_Login |
|---|---|---|---|---|---|---|---|---|
| 0 | 1 | James Martinez | 18 | France | Premium | 80.26 | Drama | 2024-05-12 |
| 1 | 2 | John Miller | 23 | USA | Premium | 321.75 | Sci-Fi | 2025-02-05 |
| 2 | 3 | Emma Davis | 60 | UK | Basic | 35.89 | Comedy | 2025-01-24 |
| 3 | 4 | Emma Miller | 44 | USA | Premium | 261.56 | Documentary | 2024-03-25 |
| 4 | 5 | Jane Smith | 68 | USA | Standard | 909.3 | Drama | 2025-01-14 |
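
For readers without access to the Drive folder, the loading step can be tried on an in-memory sample. The following is a minimal sketch, assuming the CSV has the column layout shown in the `df.head()` output; the `csv_text` string and the `sample_df` name are illustrative and not part of the original notebook.

```python
import io

import pandas as pd

# Three rows copied from the df.head() output; csv_text stands in for the real file
csv_text = """User_ID,Name,Age,Country,Subscription_Type,Watch_Time_Hours,Favorite_Genre,Last_Login
1,James Martinez,18,France,Premium,80.26,Drama,2024-05-12
2,John Miller,23,USA,Premium,321.75,Sci-Fi,2025-02-05
3,Emma Davis,60,UK,Basic,35.89,Comedy,2025-01-24
"""

# read_csv accepts any file-like object, so StringIO works in place of a path
sample_df = pd.read_csv(io.StringIO(csv_text))

# head(n) returns the first n rows (5 by default)
print(sample_df.head(2))
```

Swapping `io.StringIO(csv_text)` for a file path gives the same call used on the full dataset.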


## 🧐 4. **Initial Data Exploration**

We explore the dataset with basic tools such as `.info()`, `.describe()`, and `.shape`, and check for missing values.

In [8]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 25000 entries, 0 to 24999
Data columns (total 8 columns):
 #   Column             Non-Null Count  Dtype  
---  ------             --------------  -----  
 0   User_ID            25000 non-null  int64  
 1   Name               25000 non-null  object 
 2   Age                25000 non-null  int64  
 3   Country            25000 non-null  object 
 4   Subscription_Type  25000 non-null  object 
 5   Watch_Time_Hours   25000 non-null  float64
 6   Favorite_Genre     25000 non-null  object 
 7   Last_Login         25000 non-null  object 
dtypes: float64(1), int64(2), object(5)
memory usage: 1.5+ MB


In [9]:
df.describe()

| | User_ID | Age | Watch_Time_Hours |
|---|---|---|---|
| count | 25000.0 | 25000.0 | 25000.0 |
| mean | 12500.5 | 46.48288 | 500.468858 |
| std | 7217.022701 | 19.594861 | 286.381815 |
| min | 1.0 | 13.0 | 0.12 |
| 25% | 6250.75 | 29.0 | 256.5675 |
| 50% | 12500.5 | 46.0 | 501.505 |
| 75% | 18750.25 | 63.0 | 745.7325 |
| max | 25000.0 | 80.0 | 999.99 |


In [10]:
df.shape

(25000, 8)

In [11]:
df.isnull().sum()

| | 0 |
|---|---|
| User_ID | 0 |
| Name | 0 |
| Age | 0 |
| Country | 0 |
| Subscription_Type | 0 |
| Watch_Time_Hours | 0 |
| Favorite_Genre | 0 |
| Last_Login | 0 |


**The dataset has 25,000 rows and 8 columns.**

**There are no null values in the dataset.**
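
The checks in this section can be bundled into one helper. This is a minimal sketch on a small made-up sample rather than the real dataset; `summarize_quality` is a hypothetical helper name, and the duplicate-row check is an addition that the notebook itself does not perform.

```python
import pandas as pd


def summarize_quality(frame: pd.DataFrame) -> dict:
    """Return basic data-quality counts for a DataFrame (hypothetical helper)."""
    return {
        "rows": frame.shape[0],
        "columns": frame.shape[1],
        # Total number of missing cells across all columns
        "missing_values": int(frame.isnull().sum().sum()),
        # Rows that exactly repeat an earlier row
        "duplicate_rows": int(frame.duplicated().sum()),
    }


# Made-up sample: one missing Country and one repeated row
sample = pd.DataFrame(
    {
        "User_ID": [1, 2, 3, 3],
        "Age": [18, 23, 60, 60],
        "Country": ["France", None, "UK", "UK"],
    }
)

report = summarize_quality(sample)
print(report)
```

Calling `summarize_quality(df)` on the full DataFrame would give the row, column, and missing-value counts in a single dictionary.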


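## ⏲ 5. **Convert Last_Login to datetime**

`Last_Login` is stored as `object` (strings), so we convert it to `datetime64[ns]` with `pd.to_datetime` and confirm the change with `.info()`.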

In [20]:
df['Last_Login'] = pd.to_datetime(df['Last_Login'])

In [21]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 25000 entries, 0 to 24999
Data columns (total 8 columns):
 #   Column             Non-Null Count  Dtype         
---  ------             --------------  -----         
 0   User_ID            25000 non-null  int64         
 1   Name               25000 non-null  object        
 2   Age                25000 non-null  int64         
 3   Country            25000 non-null  object        
 4   Subscription_Type  25000 non-null  object        
 5   Watch_Time_Hours   25000 non-null  float64       
 6   Favorite_Genre     25000 non-null  object        
 7   Last_Login         25000 non-null  datetime64[ns]
dtypes: datetime64[ns](1), float64(1), int64(2), object(4)
memory usage: 1.5+ MB
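
The conversion succeeded here because every value parsed cleanly. For data that may contain malformed dates, a more defensive variant could look like the sketch below. It assumes dates in `YYYY-MM-DD` form, as seen in the `df.head()` output; the `raw` series and its malformed entry are made up for illustration.

```python
import pandas as pd

# Made-up values; the last one is deliberately malformed
raw = pd.Series(["2024-05-12", "2025-02-05", "not a date"], name="Last_Login")

# errors="coerce" turns unparseable strings into NaT instead of raising
parsed = pd.to_datetime(raw, format="%Y-%m-%d", errors="coerce")

# Once the column is datetime, the .dt accessor exposes date parts
login_year = parsed.dt.year

# Count how many values failed to parse
bad_rows = int(parsed.isna().sum())
print(parsed.dtype, bad_rows)
```

Counting the `NaT` values afterwards shows how many rows would need attention before any date-based analysis.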


## 💾 6. **Exploratory Data Analysis (EDA)**

### Number of users per Subscription_Type

In [25]:
subscription_counts = df['Subscription_Type'].value_counts()
subscription_counts

| Subscription_Type | count |
|---|---|
| Premium | 8402 |
| Basic | 8356 |
| Standard | 8242 |
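
The counts are easier to compare as percentage shares and as a chart. The following is one possible sketch, not the notebook's own plotting code: it rebuilds `subscription_counts` from the numbers above so it runs on its own, and the `subscription_share` name and the chart labels are illustrative choices.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Counts copied from the value_counts() output above
subscription_counts = pd.Series(
    {"Premium": 8402, "Basic": 8356, "Standard": 8242}, name="count"
)

# Convert counts to percentage shares of all users
subscription_share = (subscription_counts / subscription_counts.sum() * 100).round(2)
print(subscription_share)

# Simple bar chart of the raw counts
fig, ax = plt.subplots(figsize=(6, 4))
ax.bar(list(subscription_counts.index), subscription_counts.values)
ax.set_xlabel("Subscription_Type")
ax.set_ylabel("Number of users")
ax.set_title("Users per subscription type")
plt.show()
```

The three plans are close to evenly split: about 33.6% Premium, 33.4% Basic, and 33.0% Standard.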
