Skip to content

A package written in TypeScript to scrape film data from Letterboxd lists.

Notifications You must be signed in to change notification settings

JamesGJ5/letterboxd-list-scraper

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

17 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

npm package: https://www.npmjs.com/package/letterboxd-list-scraper

Tool built using TypeScript and Puppeteer to scrape data from films in a Letterboxd lists. Can extract data for each film in a Letterboxd list at a rate of ~1s per film.

Letterboxd lists and film pages load data dynamically, hence the use of Puppeteer. Letterboxd lists require scrolling in order to load all list content, which the package puppeteer-autoscroll-down is used for https://www.npmjs.com/package/puppeteer-autoscroll-down.

Installation:

npm install letterboxd-list-scraper

Quick Start:

  1. Import processFilmsInList from the package. For now, only ES6+ imports are enabled:

    import processFilmsInList from 'letterboxd-list-scraper'
    
  2. When using the function processFilmsInList, make sure firstListPageURL links to a Letterboxd list page in grid view.

    processFilmsInList(
        firstListPageURL: string,
        processor: Function
    );
    
  3. Format of the film object passed to processor for each film:

    {
        filmTitle: string,
        releaseYearString: string,
        directorNameArray: string[],
        averageRatingString: string,
        filmPosterURL: string,
        filmBackdropImageURL: string,
    }
    

About

A package written in TypeScript to scrape film data from Letterboxd lists.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages