GitHub - JamesGJ5/letterboxd-list-scraper: A package written in TypeScript to scrape film data from Letterboxd lists.

npm package: https://www.npmjs.com/package/letterboxd-list-scraper

A tool built with TypeScript and Puppeteer to scrape data for the films in a Letterboxd list. It can extract data for each film in a list at a rate of ~1s per film.

Letterboxd lists and film pages load data dynamically, hence the use of Puppeteer. A list must also be scrolled to load all of its content, which is handled by the package puppeteer-autoscroll-down (https://www.npmjs.com/package/puppeteer-autoscroll-down).

Installation:

npm install letterboxd-list-scraper

Quick Start:

Import processFilmsInList from the package. For now, only ES6+ imports are enabled:
```
import processFilmsInList from 'letterboxd-list-scraper'
```
When using the function processFilmsInList, make sure firstListPageURL links to a Letterboxd list page in grid view.
```
processFilmsInList(
    firstListPageURL: string,
    processor: Function
);
```

Format of the film object passed to processor for each film:

{
    filmTitle: string,
    releaseYearString: string,
    directorNameArray: string[],
    averageRatingString: string,
    filmPosterURL: string,
    filmBackdropImageURL: string,
}
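
The film object format above can be written as a TypeScript interface, which makes a processor easier to type-check. The following is a minimal sketch, not part of the package: the names Film, formatFilm, collectSummary and summaries are hypothetical, and it assumes processor is called once per film with an object of the shape listed above.
```typescript
// Shape of the film object, copied from the format listed above.
// The interface name "Film" is an assumption; the package may not export such a type.
interface Film {
    filmTitle: string;
    releaseYearString: string;
    directorNameArray: string[];
    averageRatingString: string;
    filmPosterURL: string;
    filmBackdropImageURL: string;
}

// Hypothetical helper: turn one film object into a single summary line.
function formatFilm(film: Film): string {
    const directors = film.directorNameArray.join(', ');
    return `${film.filmTitle} (${film.releaseYearString}), dir. ${directors}, avg. rating ${film.averageRatingString}`;
}

// Hypothetical processor: stores a summary line for each film it receives.
const summaries: string[] = [];
function collectSummary(film: Film): void {
    summaries.push(formatFilm(film));
}

// Assumed usage, with firstListPageURL pointing at a list page in grid view:
// import processFilmsInList from 'letterboxd-list-scraper'
// processFilmsInList(firstListPageURL, collectSummary);
```
Every field is kept as the string (or string array) the scraper provides, so any conversion, such as parsing the year or rating into a number, is left to the processor.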
