Exploratory Analysis of Netflix User Base Data

Using data analysis tools such as Pandas, Missingno, Matplotlib, and Seaborn, this project conducts an in-depth visual exploration of a Netflix user base dataset to uncover key insights about age and gender distribution, subscription types, and related attributes. The notebook may serve as hands-on practice for beginners in the field of data science.

◘ Introduction

The dataset provides a sample Netflix user base covering monthly revenue, subscriptions, activity, and account details. Each row represents a unique user, identified by a user ID, and includes the subscription type, categorized as Basic, Standard, or Premium. It also records the monthly revenue generated by the subscription, the date the user joined Netflix (“Join Date”), the date of their last payment (“Last Payment Date”), and the country in which they reside.

Additional columns provide insight into user behavior and preferences, such as the device type (for example, Smart TV, mobile phone, desktop, or tablet). The total watch time (in minutes) and the account status, indicating whether the account is active, are also provided. The dataset can be used to analyze and model user trends, preferences, and revenue generation within a hypothetical Netflix user base.

◘ Objective

The primary objectives of this research are to:

Process the dataset by checking its integrity, missing values, duplicated values, and so forth.
Perform clean-ups, if required, and improve accessibility for more convenient exploratory analysis.
Conduct exploratory analysis using a variety of graphing tools to reach a conclusion.
Decide which model to apply to the processed dataset in a future project, with the aim of better hyperparameter tuning and, ideally, better model generalization.
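
The first two objectives could be sketched with pandas roughly as follows. This is a minimal illustration on a small made-up frame, not the notebook's actual code: the sample rows, the "User ID" and "Subscription Type" column names, the day-month-year date format, and the helpers `integrity_report` and `clean` are all assumptions introduced for the example.

```python
import pandas as pd

# Hypothetical sample rows; the real dataset would be loaded from the raw data folder.
sample = pd.DataFrame(
    {
        "User ID": [1, 2, 2, 3],
        "Subscription Type": ["Basic", "Premium", "Premium", None],
        "Join Date": ["15-01-22", "05-09-21", "05-09-21", "28-02-23"],
        "Last Payment Date": ["10-06-23", "22-06-23", "22-06-23", "27-06-23"],
    }
)


def integrity_report(df: pd.DataFrame) -> dict:
    """Summarize missing and duplicated values (hypothetical helper)."""
    return {
        "rows": len(df),
        "missing_per_column": df.isna().sum().to_dict(),
        "duplicate_rows": int(df.duplicated().sum()),
    }


def clean(df: pd.DataFrame) -> pd.DataFrame:
    """Drop exact duplicate rows and parse the date columns.

    The day-month-year format is an assumption about the raw file.
    """
    out = df.drop_duplicates().copy()
    for col in ("Join Date", "Last Payment Date"):
        out[col] = pd.to_datetime(out[col], format="%d-%m-%y")
    return out


report = integrity_report(sample)
cleaned = clean(sample)
print(report)
print(cleaned.dtypes)
```

Running the report before any clean-up keeps a record of what was wrong with the raw data, which makes the later cleaning steps easier to justify.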

◘ Approach

This research is divided into two steps:

Data Wrangling: the dataset is extracted, tested, cleaned, processed, and stored in memory.
Feature Analysis: the processed data is explored thoroughly to gain viable insights.

◘ Methodologies & Technologies applied

Diagnose and fix structural errors
Check and Clean data
Address duplicates & outliers
Logical combination of features to construct a new variable
Univariate inspection
Bivariate inspection
Feature correlations
Seaborn & Matplotlib visualizations
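
Several of the techniques listed above can be illustrated with a short pandas sketch. It is only an example under stated assumptions: the rows are invented, and the column names ("Subscription Type", "Monthly Revenue", "Age") and the derived "Tenure Days" feature are hypothetical rather than taken from the notebook.

```python
import pandas as pd

# Hypothetical cleaned rows; all column names and values are illustrative.
users = pd.DataFrame(
    {
        "Subscription Type": ["Basic", "Premium", "Standard", "Basic"],
        "Monthly Revenue": [10, 15, 12, 10],
        "Age": [28, 35, 42, 51],
        "Join Date": pd.to_datetime(
            ["2022-01-15", "2021-09-05", "2023-02-28", "2022-07-01"]
        ),
        "Last Payment Date": pd.to_datetime(
            ["2023-06-10", "2023-06-22", "2023-06-27", "2023-06-25"]
        ),
    }
)

# Feature combination: derive one variable from two existing date columns.
users["Tenure Days"] = (users["Last Payment Date"] - users["Join Date"]).dt.days

# Univariate inspection: distribution of a single categorical feature.
subscription_counts = users["Subscription Type"].value_counts()

# Bivariate inspection: a numeric feature summarized per category.
revenue_by_plan = users.groupby("Subscription Type")["Monthly Revenue"].mean()

# Feature correlations: pairwise Pearson correlation of the numeric columns.
correlations = users[["Monthly Revenue", "Age", "Tenure Days"]].corr()

print(subscription_counts)
print(revenue_by_plan)
print(correlations)
```

The same summaries can then be passed to plotting functions, for instance `sns.countplot` for the univariate view or `sns.heatmap` for the correlation matrix.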

◘ Project Flowchart

◘ Required Modules

pandas 2.0.3
missingno 0.5.2
matplotlib 3.7.0
seaborn 0.12.2

◘ Jupyter core packages

IPython : 8.10.0
ipykernel : 6.19.2
ipywidgets : 7.6.5
jupyter_client : 7.3.4
jupyter_core : 5.2.0
jupyter_server : 1.23.4
jupyterlab : 3.5.3

◘ Project Organization

├── LICENSE
│
├── README.md          <- The top-level README for developers using this project.
│
├── data
│   ├── processed      <- The final, canonical data sets for modeling.
│   └── raw            <- The original, immutable data dump.
│
├── notebooks          <- Jupyter notebooks for EDA
│
├── figures            <- Generated graphics and figures to be used in reporting using Jupyter Notebooks
│
├── img                <- Project related files
│
├── requirements.txt   <- The requirements file for reproducing the analysis environment, e.g.
│                         generated with `pip freeze > requirements.txt`
│
└── tox.ini            <- tox file with settings for running tox; see tox.readthedocs.io

◘ Installation (using pip)

In Jupyter, console commands can be executed from a cell by prefixing the command with the ‘!’ sign; the command then runs in the system shell (for example, CMD on Windows). To make sure modules are installed into the same Python environment that the notebook kernel uses, invoke pip through sys.executable from the sys module, as follows:

import sys
!{sys.executable} -m pip install [package_name]

For Pandas, run:

!{sys.executable} -m pip install pandas

To install missingno:

!{sys.executable} -m pip install missingno

Matplotlib can be installed by running the following command:

!{sys.executable} -m pip install matplotlib

Lastly, for seaborn:

!{sys.executable} -m pip install seaborn
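
After installation, the installed versions can be compared against those listed under Required Modules. The following is a minimal sketch using only the standard library; the helper names `installed_versions` and `version_mismatches` are hypothetical and not part of the project.

```python
from importlib import metadata

# Versions listed under "Required Modules"; keys are PyPI distribution names.
PINNED = {
    "pandas": "2.0.3",
    "missingno": "0.5.2",
    "matplotlib": "3.7.0",
    "seaborn": "0.12.2",
}


def installed_versions(packages):
    """Return {name: installed version, or None if the package is absent}."""
    found = {}
    for name in packages:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


def version_mismatches(pinned, found):
    """List packages whose installed version differs from the pinned one."""
    return sorted(name for name, want in pinned.items() if found.get(name) != want)


current = installed_versions(PINNED)
for name, version in current.items():
    print(f"{name}: installed={version}, pinned={PINNED[name]}")
print("Mismatched or missing:", version_mismatches(PINNED, current))
```

A mismatch does not necessarily break the notebook, but matching the listed versions is the safest way to reproduce the original results.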

◘ Import Packages

To import the dependencies, open the preferred IDE or notebook and run the following statements:

For Pandas, run the following command:

import pandas as pd

To use missingno, run:

import missingno as msn

Import matplotlib using:

import matplotlib.pyplot as plt

Seaborn can be accessed by:

import seaborn as sns

◘ Supplementary Resources

◘ License

This is free and unencumbered software released into the public domain. Anyone is free to copy, modify, publish, use, compile, sell, or distribute this software, either in source code form or as a compiled binary, for any purpose, commercial or non-commercial, and by any means.

shahriar-rahman/Exploratory-Analysis-of-Netflix-Userbase
