gvisco/trippo_scrappo

trippo_scrappo

A simple command-line scraper for TripAdvisor reviews, written in Go.

Info

This project is heavily based on the code from TripAdvisor-Review-Scraper (which works like a charm, by the way). I have reworked it into a simple command-line tool, removing all the Docker-related features and adding a bit of flexibility where needed.

Main changes include

  • Input arguments are passed on the command line instead of through environment variables
  • Reviews can be scraped in different languages
  • Output is appended to the CSV file if it already exists
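
The append behaviour in the last point can be sketched as follows. This is a minimal illustration using only the standard library, not the project's actual code: the function names `appendRows` and `countRecords`, the file name, and the column names are all hypothetical.

```go
package main

import (
	"encoding/csv"
	"errors"
	"fmt"
	"log"
	"os"
)

// appendRows appends rows to the CSV file at path. The header is written
// only when the file does not exist yet, so repeated runs extend one file.
// (Hypothetical helper, not taken from the project.)
func appendRows(path string, header []string, rows [][]string) error {
	_, statErr := os.Stat(path)
	isNew := errors.Is(statErr, os.ErrNotExist)

	f, err := os.OpenFile(path, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0o644)
	if err != nil {
		return err
	}
	defer f.Close()

	w := csv.NewWriter(f)
	if isNew {
		if err := w.Write(header); err != nil {
			return err
		}
	}
	// WriteAll flushes the writer and reports any write error.
	return w.WriteAll(rows)
}

// countRecords returns the number of CSV records (header included) in path.
func countRecords(path string) (int, error) {
	f, err := os.Open(path)
	if err != nil {
		return 0, err
	}
	defer f.Close()
	records, err := csv.NewReader(f).ReadAll()
	return len(records), err
}

func main() {
	path := "reviews_sketch.csv"                  // hypothetical file name
	header := []string{"title", "rating", "text"} // hypothetical columns
	rows := [][]string{{"Great stay", "5", "Lovely view of the lake."}}

	if err := appendRows(path, header, rows); err != nil {
		log.Fatal(err)
	}
	n, err := countRecords(path)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("records in file:", n)
}
```

Running the sketch twice leaves a single header row followed by two data rows, which is the behaviour the list item describes.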

How to use it

It requires Go 1.21 or later.

Run main.go to see the help:

$ go run main.go 
  -lang string
        The language of the review like: en,  it, etc. (default "en")
  -out string
        The output CSV file (default "reviews.csv")
  -url string
        The URL of the target Hotel
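
For readers curious how such flags are typically declared, here is a small sketch with Go's standard `flag` package. The flag names and defaults mirror the help output above; the `options` struct, the `parseArgs` function, and the mandatory-URL check are assumptions made for illustration and may differ from what main.go does.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
)

// options holds the three command-line settings (hypothetical type).
type options struct {
	url  string
	lang string
	out  string
}

// parseArgs parses args with the same flag names and defaults as the help
// text, and rejects a missing -url.
func parseArgs(args []string) (options, error) {
	var o options
	fs := flag.NewFlagSet("trippo_scrappo", flag.ContinueOnError)
	fs.StringVar(&o.lang, "lang", "en", "The language of the review like: en, it, etc.")
	fs.StringVar(&o.out, "out", "reviews.csv", "The output CSV file")
	fs.StringVar(&o.url, "url", "", "The URL of the target Hotel")
	if err := fs.Parse(args); err != nil {
		return o, err
	}
	if o.url == "" {
		return o, errors.New("the -url flag is mandatory")
	}
	return o, nil
}

func main() {
	// A real program would pass os.Args[1:]; a fixed slice keeps the
	// sketch runnable as-is.
	opts, err := parseArgs([]string{
		"-url=https://www.tripadvisor.com/Airline_Review-d8729113-Reviews-Lufthansa",
		"-lang=fr",
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("url=%s lang=%s out=%s\n", opts.url, opts.lang, opts.out)
}
```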

The -url flag is mandatory. It is the URL of the TripAdvisor page of the hotel, restaurant, or airline you want to scrape, taken from https://www.tripadvisor.com (domains other than .com are not supported). The URL should be in one of the following formats:

  1. Airline: https://www.tripadvisor.com/Airline_Review-d8729113-Reviews-Lufthansa
  2. Hotel: https://www.tripadvisor.com/Hotel_Review-g188107-d231860-Reviews-Beau_Rivage_Palace-Lausanne_Canton_of_Vaud.html
  3. Restaurant: https://www.tripadvisor.com/Restaurant_Review-g187265-d11827759-Reviews-La_Terrasse-Lyon_Rhone_Auvergne_Rhone_Alpes.html
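
The three formats above share a common shape, so the location type, ID, and name can be pulled out with a single pattern. The sketch below is one possible way to do it, assuming the ID is the number after `-d` and the name is the segment right after `-Reviews-`; the `location` type, `parseLocation` function, and the regular expression are hypothetical and not taken from the project.

```go
package main

import (
	"fmt"
	"net/url"
	"regexp"
	"strings"
)

// location describes a scrape target (hypothetical type).
type location struct {
	kind string // HOTEL, RESTAURANT or AIRLINE
	id   string
	name string
}

// pathPattern matches the three URL formats listed above. The optional
// "g<digits>-" part is present for hotels and restaurants only. The name
// stops at the next '-' or '.', so names containing hyphens are truncated.
var pathPattern = regexp.MustCompile(
	`^/(Hotel|Restaurant|Airline)_Review-(?:g\d+-)?d(\d+)-Reviews-([^-./]+)`)

// parseLocation validates the host and extracts type, ID and name.
func parseLocation(raw string) (location, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return location{}, err
	}
	if u.Host != "www.tripadvisor.com" {
		return location{}, fmt.Errorf("unsupported host %q", u.Host)
	}
	m := pathPattern.FindStringSubmatch(u.Path)
	if m == nil {
		return location{}, fmt.Errorf("unrecognized path %q", u.Path)
	}
	return location{kind: strings.ToUpper(m[1]), id: m[2], name: m[3]}, nil
}

func main() {
	loc, err := parseLocation("https://www.tripadvisor.com/Airline_Review-d8729113-Reviews-Lufthansa")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("type=%s id=%s name=%s\n", loc.kind, loc.id, loc.name)
}
```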

Example

This example shows how to download all the French-language reviews of the Beau-Rivage Palace hotel. The output is saved in the BeauRivage.csv file.

$ go run main.go -url=https://www.tripadvisor.com/Hotel_Review-g188107-d231860-Reviews-Beau_Rivage_Palace-Lausanne_Canton_of_Vaud.html -lang=fr -out="BeauRivage.csv"
2024/02/04 17:43:34 Location URL: https://www.tripadvisor.com/Hotel_Review-g188107-d231860-Reviews-Beau_Rivage_Palace-Lausanne_Canton_of_Vaud.html
2024/02/04 17:43:34 Language: fr
2024/02/04 17:43:34 Location Type: HOTEL
2024/02/04 17:43:34 Location ID: 231860
2024/02/04 17:43:34 Location Name: Beau_Rivage_Palace
2024/02/04 17:43:34 Get HTTP Client...
2024/02/04 17:43:34 Done (HTTP Client)
2024/02/04 17:43:34 Get reviews count...
2024/02/04 17:43:34 Sending request...
2024/02/04 17:43:35 Done (Response received)
2024/02/04 17:43:35 Done (Review count): 694
2024/02/04 17:43:35 Creating file: BeauRivage
2024/02/04 17:43:35 Total Iterations: 35
2024/02/04 17:43:35 Iteration: 0. Delaying for 4 seconds
2024/02/04 17:43:39 Sending request... 
2024/02/04 17:43:39 Done (Response received)
2024/02/04 17:43:39 Iteration: 1. Delaying for 5 seconds
2024/02/04 17:43:44 Sending request... 
2024/02/04 17:43:44 Done (Response received)
2024/02/04 17:43:44 Iteration: 2. Delaying for 3 seconds
2024/02/04 17:43:47 Sending request... 
2024/02/04 17:43:48 Done (Response received)
[...]
2024/02/04 17:45:38 Iteration: 34. Delaying for 3 seconds
2024/02/04 17:45:41 Sending request... 
2024/02/04 17:45:41 Done (Response received)
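
The log above reports 694 reviews fetched in 35 iterations, each preceded by a pause of a few seconds. The sketch below shows one way such a plan could be computed. The page size of 20 is inferred from those two numbers, and the 3 to 5 second delay range is inferred from the delays visible in the log; neither is confirmed by the source, and the function names are hypothetical.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// iterations returns how many requests are needed to fetch total reviews
// when each request returns at most pageSize of them (ceiling division).
func iterations(total, pageSize int) int {
	if total <= 0 || pageSize <= 0 {
		return 0
	}
	return (total + pageSize - 1) / pageSize
}

// randomDelay picks a whole number of seconds in [minSec, maxSec].
// It assumes minSec <= maxSec.
func randomDelay(minSec, maxSec int) time.Duration {
	return time.Duration(minSec+rand.Intn(maxSec-minSec+1)) * time.Second
}

func main() {
	const pageSize = 20 // assumption inferred from 694 reviews / 35 iterations
	n := iterations(694, pageSize)
	fmt.Println("Total Iterations:", n)

	// Show the plan for the first few iterations only; a real loop would
	// call time.Sleep(d) and then send the request.
	for i := 0; i < 3; i++ {
		d := randomDelay(3, 5)
		fmt.Printf("Iteration: %d. Delaying for %d seconds\n", i, int(d.Seconds()))
	}
}
```

With 694 reviews and a page size of 20, the ceiling division gives 35, which matches the "Total Iterations" line in the log.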

License and Credits

This software is based on TripAdvisor-Review-Scraper and is accordingly released under a GNU license. If you want to credit the authors, please refer to the CITATION instructions of the original project.
