
aryansi225/IMDB-Scrapper


IMDB-Scrapper

Scraper for the IMDb Top 250 movies page.

A spider built with the Scrapy scraping library crawls the IMDb Top 250 movies page (https://www.imdb.com/chart/top) and returns each movie's name, rating, directors, genres, and cast in JSON format.

The scraped data was then used for exploratory data analysis (EDA) and visualization, for example to count the movies in each genre and to find cast members who appear in more than one of the top 250 films.
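The counting step might look like the sketch below. It assumes the scraped JSON holds one record per movie, with `genre` and `cast` stored as lists. The rows are placeholders rather than real scraped data, and `count_values` is a hypothetical helper.

```python
import pandas as pd

# Placeholder rows standing in for the scraped JSON. With a real export this
# could be something like: movies = pd.read_json("movies.json")
movies = pd.DataFrame(
    [
        {"name": "Movie A", "genre": ["Drama", "Crime"], "cast": ["Actor X", "Actor Y"]},
        {"name": "Movie B", "genre": ["Drama"], "cast": ["Actor X", "Actor Z"]},
        {"name": "Movie C", "genre": ["Action", "Crime"], "cast": ["Actor W"]},
    ]
)


def count_values(frame, column):
    """Count how often each entry of a list-valued column occurs."""
    # explode() turns each list element into its own row before counting.
    return frame[column].explode().value_counts()


genre_counts = count_values(movies, "genre")
cast_counts = count_values(movies, "cast")

# Cast members who appear in more than one film.
common_cast = cast_counts[cast_counts > 1]

print(genre_counts)
print(common_cast)
```

Either Series can be plotted directly, for instance with `genre_counts.plot(kind="barh")`, which uses Matplotlib.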

Screenshots

Visualizations produced as part of the EDA.

image

image

Dependencies

Python 3, Scrapy, Pandas, Matplotlib, Seaborn

My Original Contribution & Learnings

Contribution => Implemented a spider that scrapes data from the IMDb website, using the Scrapy documentation as a reference. Implemented a script that performs EDA on the scraped data using pandas, Matplotlib and seaborn.

Major Learnings => Learnt how to write spiders using Scrapy. Learnt how to preprocess scraped data and perform EDA on it.
