Python Projects

Ecommerce Dataset Exploratory Data Analysis (EDA) Link

Project Overview

Exploratory data analysis on an ecommerce dataset to gain insights, identify patterns, and visualize the findings with Seaborn, Matplotlib, and Plotly.

Technologies Used

Python
Pandas
NumPy
Seaborn
Matplotlib
Plotly

Key Findings

Analyzed customer demographics, purchase history, and product details
Identified correlations and relationships between variables
Ran hypothesis tests and computed confidence intervals for significant findings
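
As a rough illustration of the confidence-interval step, here is a minimal sketch using only NumPy. It assumes a normal approximation (z = 1.96 for roughly 95%); the function name and the sample values are made up for the example and are not taken from the project's dataset or notebook.

```python
import numpy as np


def mean_confidence_interval(values, z=1.96):
    """Return (mean, lower, upper) for a normal-approximation confidence interval."""
    data = np.asarray(values, dtype=float)
    mean = data.mean()
    # Standard error of the mean, using the sample standard deviation (ddof=1).
    sem = data.std(ddof=1) / np.sqrt(data.size)
    return mean, mean - z * sem, mean + z * sem


# Illustrative order values only; not taken from the project's dataset.
order_values = [120.0, 80.0, 100.0, 95.0, 105.0]
mean, lower, upper = mean_confidence_interval(order_values)
print(f"mean={mean:.2f}, 95% CI=({lower:.2f}, {upper:.2f})")
```

For small samples, a t-based interval would be the more careful choice; the z-based version is shown here only because it needs nothing beyond NumPy.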

Insights

Consumer age group analysis
Country-wise analysis
Gender classification
Income distribution analysis
Customer segmentation
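
The age-group and country-wise breakdowns listed above could be produced along these lines with Pandas. This is only a sketch: the column names (`age`, `country`, `income`), the bin edges, and the rows are assumptions for illustration, not the dataset's actual schema.

```python
import pandas as pd

# Illustrative rows only; column names are assumptions, not the dataset's schema.
customers = pd.DataFrame({
    "age": [22, 35, 47, 58, 63, 29],
    "country": ["IN", "US", "IN", "UK", "US", "IN"],
    "income": [30000, 52000, 61000, 75000, 48000, 39000],
})

# Bucket ages into right-closed bins: (17, 30], (30, 45], (45, 60], (60, 100].
customers["age_group"] = pd.cut(
    customers["age"],
    bins=[17, 30, 45, 60, 100],
    labels=["18-30", "31-45", "46-60", "61+"],
)

# Customer count and mean income per age group.
age_summary = (
    customers.groupby("age_group", observed=True)["income"]
    .agg(n_customers="count", mean_income="mean")
)

# Customer count per country.
country_counts = customers["country"].value_counts()

print(age_summary)
print(country_counts)
```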

Snaps

IMDb Movie Data Scraper Link

Project Description

Web scraping of movie data from IMDb using Selenium and Beautiful Soup, followed by data cleaning and storage in a CSV file.

Tools Used

Selenium
Beautiful Soup
Requests
Python
Jupyter Notebook
NumPy and Pandas

Process

Data collection using Selenium
Data extraction using Beautiful Soup
Error handling and data cleaning in Jupyter Notebook with NumPy and Pandas

Output

A CSV file containing the cleaned and processed movie data.
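
The cleaning and CSV export steps might look roughly like the sketch below. The field names (`title`, `year`, `rating`) and the sample rows are assumptions made for the example; the real scraper's columns and cleaning rules may differ.

```python
import io

import pandas as pd

# Illustrative scraped rows; field names and values are assumptions.
raw = pd.DataFrame({
    "title": ["Movie A", "Movie A", "Movie B", None, "Movie C", "Movie D"],
    "year": ["(1999)", "(1999)", "2005", "2010", "N/A", "2012"],
    "rating": ["8.1", "8.1", "", "7.0", "6.4", "7.3"],
})

cleaned = raw.copy()
# Pull a four-digit year out of strings such as "(1999)"; unmatched values become NaN.
cleaned["year"] = pd.to_numeric(
    cleaned["year"].str.extract(r"(\d{4})", expand=False), errors="coerce"
)
# Empty or non-numeric ratings become NaN instead of raising.
cleaned["rating"] = pd.to_numeric(cleaned["rating"], errors="coerce")
# Drop incomplete rows and exact duplicates, then restore an integer year.
cleaned = cleaned.dropna().drop_duplicates().reset_index(drop=True)
cleaned["year"] = cleaned["year"].astype(int)

# Write to an in-memory buffer here; passing a file path works the same way.
buffer = io.StringIO()
cleaned.to_csv(buffer, index=False)
print(buffer.getvalue())
```

Coercing bad values to NaN and dropping them afterwards keeps the notebook from failing on a single malformed row, at the cost of losing that row.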

Project Stats

Extracted data from 1900+ movies
1300+ data points obtained after cleaning and preprocessing

Scalability

Can be scaled to extract data for 100,000+ movies by adjusting parameters and running the script for an extended period.

Snaps

Raw Data

Cleaned Data

Magic Bricks Data Scraper Link

Project Overview

This project involves web scraping real estate data from Magic Bricks, a popular Indian real estate portal. The scraper extracts valuable information such as property details, prices, locations, and more. I undertook this project to demonstrate my web scraping and data cleaning skills.

Technologies Used

Web Scraping

Python
Selenium
Requests
Beautiful Soup (used initially, then replaced by Selenium because the site uses infinite scroll)

Data Cleaning

Jupyter Notebook
Pandas
NumPy
re (regular expressions)

Features

Extracts property details from Magic Bricks
Handles pagination and scraping multiple pages
Saves data to a CSV file
Uses error handling to keep the script robust
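
One way to combine pagination with error handling is to wrap each page fetch in a `try`/`except` so that a single failing page does not end the run. The sketch below uses only the standard library; `scrape_pages`, `fetch_page`, and `fake_fetch` are hypothetical names standing in for the real scraping logic.

```python
import logging
from typing import Callable, Iterable

logger = logging.getLogger(__name__)


def scrape_pages(fetch_page: Callable[[int], list], pages: Iterable[int]) -> tuple:
    """Collect records page by page, skipping pages that raise.

    `fetch_page` is a hypothetical callable that returns a list of records
    for one page number; it stands in for the real scraping logic.
    """
    records, failed = [], []
    for page in pages:
        try:
            records.extend(fetch_page(page))
        except Exception as exc:  # broad on purpose: one bad page should not stop the run
            logger.warning("page %s failed: %s", page, exc)
            failed.append(page)
    return records, failed


def fake_fetch(page: int) -> list:
    # Stand-in for a real request: page 2 simulates a failure.
    if page == 2:
        raise ValueError("simulated timeout")
    return [{"page": page, "listing": i} for i in range(2)]


records, failed = scrape_pages(fake_fetch, range(1, 4))
print(len(records), failed)
```

Returning the list of failed pages makes it possible to retry just those pages later instead of restarting the whole scrape.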

Reason for choosing Selenium over Beautiful Soup

The website loads listings through infinite scroll, so the HTML returned by a plain request contains only the first batch of listings and Beautiful Soup alone cannot reach the rest. Therefore, I used Selenium WebDriver to scroll the page and extract all the details.
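
A common pattern for infinite-scroll pages is to keep scrolling until the page height stops growing. The helper below is a minimal sketch of that idea, not the project's actual code: the function name, the pause length, and the round limit are assumptions. It relies only on the WebDriver's `execute_script` method.

```python
import time


def scroll_to_bottom(driver, pause=2.0, max_rounds=50):
    """Scroll until the page height stops growing or max_rounds is reached.

    `driver` is expected to be a Selenium WebDriver; only its
    `execute_script` method is used. Returns the number of scrolls performed.
    """
    last_height = driver.execute_script("return document.body.scrollHeight")
    for round_number in range(1, max_rounds + 1):
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(pause)  # give the site time to load the next batch
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            return round_number  # nothing new loaded, assume the end was reached
        last_height = new_height
    return max_rounds
```

With Selenium installed, a typical call would create a driver (for example `webdriver.Chrome()`), open the listing page with `driver.get(...)`, call `scroll_to_bottom(driver)`, and then read `driver.page_source`. The `max_rounds` cap is there so a page that never stops loading cannot keep the script running forever.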

Usage

You can use this script to scrape Magic Bricks listing details for any city!

Challenges Faced

Extracting the property ID to get each listing's summary
Error handling
Data cleaning took significant time because insights had to be extracted from each listing's summary, description, and title
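
To give a sense of the title-parsing part of the cleaning, here is a small sketch using `re`. The title shape it expects ("3 BHK Apartment for Sale in Andheri West, Mumbai") is an assumption made for illustration, as are the pattern and field names; real listing titles vary and would need more patterns.

```python
import re

# Assumed title shape, e.g. "3 BHK Apartment for Sale in Andheri West, Mumbai".
TITLE_PATTERN = re.compile(
    r"(?P<bedrooms>\d+)\s*BHK\s+(?P<property_type>.+?)\s+for\s+(?P<listing_type>Sale|Rent)"
    r"\s+in\s+(?P<locality>.+)",
    re.IGNORECASE,
)


def parse_title(title):
    """Return a dict of fields found in a listing title, or None if it does not match."""
    match = TITLE_PATTERN.search(title)
    if match is None:
        return None
    fields = match.groupdict()
    fields["bedrooms"] = int(fields["bedrooms"])
    fields["locality"] = fields["locality"].strip()
    return fields


print(parse_title("3 BHK Apartment for Sale in Andheri West, Mumbai"))
print(parse_title("Commercial plot near highway"))
```

Returning `None` for titles that do not match lets the caller count and inspect the unparsed rows rather than silently filling in wrong values.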

Snaps

Raw Data

Cleaned Data
