Data_Analysis_project (pandas)

📘 Amazon Prime Titles – Data Analysis Using Pandas

1. Project Overview

This project demonstrates basic data analysis operations using Pandas on the Amazon Prime Titles Dataset. It focuses on performing data loading, inspection, cleaning, and filtering to understand and manipulate tabular data efficiently using Python.

The goal is to gain hands-on experience with fundamental data-handling techniques essential for any data analytics or data science workflow.

2. Features

📂 Dataset Loading – Reads and displays data using Pandas.
🧾 Data Inspection – Views dataset shape, column names, and data types.
🧹 Data Cleaning – Handles missing values, trims spaces, and converts date formats.
🔍 Filtering and Indexing – Extracts subsets of data based on specific conditions (e.g., release year, country).
⚙️ Data Transformation – Converts duration strings into numeric values for easier processing.
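
As a rough illustration of the cleaning feature, the sketch below trims whitespace, parses dates, and drops rows with a missing value. The sample rows are invented for the example, and the column names (`title`, `country`, `date_added`) are assumed from the dataset description; the notebook itself may clean different columns.

```python
import pandas as pd

# Hypothetical sample rows; the real data comes from amazon_prime_titles.csv.
df = pd.DataFrame({
    "title": ["  The Grand Seduction ", "Take Care Good Night", "Zoombies"],
    "country": ["Canada", "India", None],
    "date_added": ["March 30, 2021", None, "April 1, 2021"],
})

# Trim stray leading/trailing spaces from a text column.
df["title"] = df["title"].str.strip()

# Parse date strings; missing or unparseable values become NaT.
df["date_added"] = pd.to_datetime(df["date_added"], errors="coerce")

# Drop rows without a country, but keep rows where only date_added is missing.
cleaned = df.dropna(subset=["country"])

print(cleaned)
```

Passing `subset=` to `dropna()` limits the check to the listed columns, which avoids discarding rows that are incomplete only in fields the analysis does not use.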

3. Concepts Covered

Pandas Operations: read_csv(), head(), tail(), info(), shape, columns
Data Cleaning: Handling missing values with dropna(), type conversion with to_datetime()
Filtering: Conditional selection using Boolean indexing (df[df['column'] == value])
Feature Extraction: Creating new columns (e.g., numeric duration from string)
Basic Analysis: Viewing subsets and summaries of data
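
Boolean indexing can be sketched as follows. The rows are made up for illustration, and the column names (`type`, `country`, `release_year`) are assumed to match the dataset; the thresholds are examples, not necessarily the ones used in the notebook.

```python
import pandas as pd

# Hypothetical sample rows standing in for the real dataset.
df = pd.DataFrame({
    "title": ["A", "B", "C", "D"],
    "type": ["Movie", "TV Show", "Movie", "Movie"],
    "country": ["India", "United States", "India", None],
    "release_year": [2014, 2018, 2020, 2021],
})

# Single condition: titles released after 2015.
recent = df[df["release_year"] > 2015]

# Combined conditions: wrap each comparison in parentheses and join with &.
indian_movies = df[(df["type"] == "Movie") & (df["country"] == "India")]

print(recent["title"].tolist())
print(indian_movies["title"].tolist())
```

Each comparison produces a Boolean Series aligned with the rows of `df`; rows where the country is missing compare as `False` and are therefore excluded.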

4. Installation

Make sure Python and the required libraries are installed:

pip install pandas numpy

You can run the notebook using:

Jupyter Notebook, or
VS Code with the Jupyter extension enabled.

5. How to Use

Download or clone the project folder.

Open the notebook file:

AMAZON_PRIME_Data _Analysis.ipynb

Place the dataset file amazon_prime_titles.csv in the same folder.
Run the cells in order to:
- Load and display the dataset
- Clean the data (remove null values, convert column types)
- Filter the data by year, country, or type
- Transform columns (e.g., duration → minutes)
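
The loading and inspection step might look roughly like the sketch below. To keep it runnable without the data file, it reads a small in-memory CSV (`sample_csv`, a hypothetical stand-in with only a few of the dataset's columns); with the real file in place, pass the file name to `read_csv()` instead.

```python
import io

import pandas as pd

# Stand-in for amazon_prime_titles.csv so the sketch runs on its own.
sample_csv = io.StringIO(
    "show_id,type,title,release_year,duration\n"
    "s1,Movie,Example One,2014,113 min\n"
    "s2,TV Show,Example Two,2018,1 Season\n"
)

# With the real file: df = pd.read_csv("amazon_prime_titles.csv")
df = pd.read_csv(sample_csv)

print(df.shape)             # (rows, columns)
print(df.columns.tolist())  # column names
print(df.dtypes)            # inferred type of each column
print(df.head())            # first rows of the table
```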

6. Dataset Source

Dataset: Amazon Prime Titles
Source: Kaggle – Amazon Prime Movies and TV Shows
Description: Contains detailed information about Amazon Prime Video titles, including show ID, title, director, cast, country, release year, rating, and duration.

7. Insights Summary

Some columns, such as cast and date_added, had missing values, which were cleaned.
The majority of entries are movies, with fewer TV shows.
The dataset includes movies and shows from various countries, including India and the USA.
Filtering by release year helped isolate recent titles (e.g., post-2015).
Duration strings (e.g., “90 min”) were converted into numeric form for further use.
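
One possible way to do the duration conversion is shown below. It assumes movie durations look like “90 min” and that TV shows store a season count instead (e.g., “2 Seasons”); the new column names `duration_value` and `duration_min` are illustrative and may differ from those in the notebook.

```python
import pandas as pd

# Hypothetical sample rows; the real values come from the duration column.
df = pd.DataFrame({
    "type": ["Movie", "Movie", "TV Show"],
    "duration": ["90 min", "113 min", "2 Seasons"],
})

# Pull the leading digits out of each string, then convert them to numbers.
digits = df["duration"].str.extract(r"(\d+)", expand=False)
df["duration_value"] = pd.to_numeric(digits, errors="coerce")

# Minutes only make sense for movies; other rows are left as NaN.
df["duration_min"] = df["duration_value"].where(df["type"] == "Movie")

print(df)
```

Keeping a separate minutes column avoids mixing season counts with running times when computing averages.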

8. Future Enhancements

📊 Add visualization using Matplotlib/Seaborn for better understanding.
🧠 Include EDA (Exploratory Data Analysis) to discover content trends.
🧩 Create dashboards using Power BI or Streamlit.
🕵️ Add summary statistics such as most frequent countries, top release years, etc.
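
The summary statistics mentioned above could be sketched with `value_counts()`. The rows here are invented for illustration, and this is only one way such an enhancement might be approached.

```python
import pandas as pd

# Hypothetical sample rows standing in for the real dataset.
df = pd.DataFrame({
    "country": ["India", "United States", "India", None, "United States", "India"],
    "release_year": [2019, 2020, 2020, 2021, 2020, 2018],
})

# Most frequent countries; missing values are ignored by default.
top_countries = df["country"].value_counts().head(2)

# Release year with the most titles.
top_years = df["release_year"].value_counts().head(1)

print(top_countries)
print(top_years)
```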

9. Author

👤 Name: Prasad Goud
🎓 Role: Engineering Student
💻 Skills Used: Python, Pandas, NumPy, Data Cleaning, Filtering
📅 Year: 2025
