IMDB-Scrapper

A scraper for the IMDB top 250 movies page.

A spider built with the Scrapy scraping library crawls the IMDB top 250 movies page (https://www.imdb.com/chart/top) and returns each movie's name, rating, directors, genre, and cast in JSON format.

The scraped data was then used for exploratory data analysis (EDA) and visualization, for example to count the movies in each genre and to find cast members who appear in more than one of the top 250 films.
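
The two questions mentioned above can be sketched with pandas as follows. This is an illustration rather than the notebook's actual code: the records are made up, and the column names `genre` and `cast` (each holding a list per movie) are an assumption about the shape of the scraped JSON.

```python
import pandas as pd

# Made-up records in the assumed shape of the spider's JSON output.
movies = pd.DataFrame([
    {"name": "Movie A", "genre": ["Drama", "Crime"], "cast": ["Actor X", "Actor Y"]},
    {"name": "Movie B", "genre": ["Drama"], "cast": ["Actor X", "Actor Z"]},
    {"name": "Movie C", "genre": ["Action", "Crime"], "cast": ["Actor Y", "Actor X"]},
])

# One row per (movie, genre) pair, then count the movies in each genre.
genre_counts = movies.explode("genre")["genre"].value_counts()

# Same idea for the cast; keep only people who appear in more than one film.
cast_counts = movies.explode("cast")["cast"].value_counts()
recurring_cast = cast_counts[cast_counts > 1]

print(genre_counts)
print(recurring_cast)
```

With real data, the frame could instead be loaded with `pd.read_json` from the spider's output file, and `genre_counts.plot(kind="bar")` would draw the per-genre counts using matplotlib.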

Screenshots

Visualization done as part of EDA.

Dependencies

Python 3, Scrapy, Pandas, Matplotlib, Seaborn

My Original Contribution & Learnings

Contribution => Implemented a spider that scrapes data from the IMDB website, using the Scrapy documentation as a reference. Implemented a script that performs EDA on the scraped data using pandas, matplotlib and seaborn.

Major Learnings => Learned how to write spiders with Scrapy. Learned how to preprocess scraped data and perform EDA on it.

aryansi225/IMDB-Scrapper
