algo7/TripAdvisor-Review-Scraper

TripAdvisor-Review-Scraper

A simple scraper for TripAdvisor (Hotel, Restaurant, Airline) reviews.

Requirements

  1. Go v1.21+
  2. Make [Optional]
  3. Docker [Optional]
  4. Docker Compose [Optional]
  5. Node.js 18+ [Optional. Only required for the deprecated Node.js scraper.]

How to Install Docker:

  1. Windows
  2. Mac
  3. Linux

Project Layout

Scraper

There are 2 scrapers available:

  1. Scraper written in Go
  2. Scraper written in Node.js [Deprecated]

The scraper written in Go is preferred: it calls the API directly and is much faster than the Node.js scraper, which takes the traditional route of parsing HTML. Usage instructions for each scraper are in its own folder.

Container Provisioner

Automates the process of provisioning containers for the scraper.

Please read more about the container provisioner here

Proxy Pool

Provides a pool of proxies for the scraper to use.

Please read more about the proxy pool here
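
To illustrate the general idea of a proxy pool, here is a minimal sketch of a round-robin pool plugged into Go's standard HTTP client. It is not the implementation used in this repository: the `ProxyPool` type, the `NewProxyPool` and `Pick` names, and the proxy addresses are all hypothetical and exist only for this example.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"sync/atomic"
)

// ProxyPool is a hypothetical round-robin pool of proxy URLs.
// It is an illustration only, not this repository's proxy pool.
type ProxyPool struct {
	proxies []*url.URL
	next    atomic.Uint64 // counter used to rotate through proxies
}

// NewProxyPool parses the given proxy addresses into a pool.
func NewProxyPool(addrs []string) (*ProxyPool, error) {
	if len(addrs) == 0 {
		return nil, fmt.Errorf("proxy pool needs at least one address")
	}
	p := &ProxyPool{}
	for _, a := range addrs {
		u, err := url.Parse(a)
		if err != nil {
			return nil, fmt.Errorf("parse proxy %q: %w", a, err)
		}
		p.proxies = append(p.proxies, u)
	}
	return p, nil
}

// Pick returns the next proxy in rotation. Its signature matches the
// Proxy field of http.Transport, so it can be used there directly.
func (p *ProxyPool) Pick(_ *http.Request) (*url.URL, error) {
	i := p.next.Add(1) - 1
	return p.proxies[i%uint64(len(p.proxies))], nil
}

func main() {
	// Placeholder addresses; replace with real proxies.
	pool, err := NewProxyPool([]string{
		"http://127.0.0.1:8080",
		"http://127.0.0.1:8081",
	})
	if err != nil {
		panic(err)
	}

	// Each request made through this client asks the pool for a proxy.
	client := &http.Client{Transport: &http.Transport{Proxy: pool.Pick}}
	_ = client

	// Show the rotation order without making any network calls.
	for i := 0; i < 3; i++ {
		u, _ := pool.Pick(nil)
		fmt.Println(u.Host)
	}
}
```

The atomic counter keeps `Pick` safe to call from multiple goroutines without a mutex. A production pool would likely also need health checks and removal of failing proxies, which this sketch omits.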