The Wiki Movies Scraper is a Scrapy project designed to collect information on movies from Wikipedia, including their Title, Genre, Director, Country & Year, and IMDb Rating.

RomiconEZ/Scrapy-Wiki-IMDb-Movie-Info

Movie Information Scraper

Project Overview

The Wiki Movies Scraper is a Scrapy project that collects information on movies from Wikipedia: Title, Genre, Director, Country & Year, and IMDb Rating. The scraped data is stored in CSV format.
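
As a rough illustration of the output described above, the sketch below writes one record with the listed fields to CSV using only the Python standard library. The column names, the split of "Country & Year" into two columns, and the sample values are assumptions made for this example; they are not taken from the project's items.py or from movies_example.csv.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class MovieRecord:
    # Hypothetical column names mirroring the fields listed in the overview.
    title: str
    genre: str
    director: str
    country: str
    year: str
    imdb_rating: str


def records_to_csv(records):
    """Serialize MovieRecord objects to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(MovieRecord)])
    writer.writeheader()
    for record in records:
        writer.writerow(asdict(record))
    return buffer.getvalue()


# Placeholder values, not real scraped data.
sample = MovieRecord("Example Title", "Drama", "Jane Doe", "Exampleland", "2000", "7.5")
csv_text = records_to_csv([sample])
```

Values are kept as strings here because text scraped from a page usually needs cleaning before it can be parsed as a number.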

To use it with a Wikipedia edition in another language, change the scraping parameters that select elements of the HTML page by their words (for example, the labels the spider matches on).
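
One way to keep the language-specific words in a single place is a lookup table keyed by language code. The sketch below is an assumption about how such parameters could be organised, not the spider's actual code: the `INFOBOX_LABELS` table, the `pick_field` helper, and the label strings are hypothetical and should be checked against the pages you scrape.

```python
# Hypothetical table of infobox row labels per Wikipedia language edition.
INFOBOX_LABELS = {
    "en": {"director": "Directed by", "country": "Country"},
    "ru": {"director": "Режиссёр", "country": "Страна"},
}


def pick_field(infobox_rows, field, language):
    """Return the value of the infobox row whose label matches the field's
    label in the given language, or None if the row is absent."""
    label = INFOBOX_LABELS[language][field]
    return infobox_rows.get(label)


# infobox_rows stands in for label/value pairs already extracted from the page.
rows_en = {"Directed by": "Jane Doe", "Country": "Exampleland"}
director = pick_field(rows_en, "director", "en")
```

With this layout, supporting another language means adding one entry to the table rather than editing the selectors themselves.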

Features

  • Scrape movie details from Wikipedia.
  • Output the data in CSV format.

Requirements

  • Python 3.10
  • Scrapy

Installation and Setup

Clone the repository and navigate to the project directory:

git clone https://github.com/RomiconEZ/Scrapy-Wiki-IMDb-Movie-Info.git
cd Scrapy-Wiki-IMDb-Movie-Info

Usage

To start scraping movies, run the following command from the directory that contains scrapy.cfg:

scrapy crawl movies_spider
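
If the spider does not already write its own CSV file, Scrapy's feed exports can do it. The fragment below is a sketch of a `FEEDS` entry for settings.py, not the project's actual configuration; the output file name `movies.csv` is an assumption, and the `overwrite` option needs Scrapy 2.4 or later.

```python
# settings.py fragment (sketch): export scraped items to a CSV file.
FEEDS = {
    "movies.csv": {          # hypothetical output path
        "format": "csv",     # use Scrapy's built-in CSV exporter
        "encoding": "utf8",
        "overwrite": True,   # replace the file on each run instead of appending
    },
}
```

The same effect is available for a single run, without touching the settings, through the command line: scrapy crawl movies_spider -O movies.csv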

Project Structure

The directory structure for this Scrapy project is as follows:

  • ScrapyParsers/
    • wiki_movies_scraper/
      • wiki_movies_scraper/
        • spiders/
          • __init__.py
          • movies_spider.py
        • __init__.py
        • items.py
        • middlewares.py
        • pipelines.py
        • settings.py
        • scrapy.cfg
        • movies_example.csv
      • poetry.lock
      • pyproject.toml
      • README.md
