The script HotelScrape.py scrapes hotel data from the popular hotel reservation site: https://www.booking.com/.

The demonstrated example uses Hong Kong as the destination and 21 October to 9 November 2019 as the stay period.

When run, the script fills out the search form automatically for the user, as follows:

It then moves through the result pages automatically, clicking the next-page button as it scrapes hotel data, as follows:

The scraped hotel data is saved in a CSV file (see hotel_prices.csv for a sample) as a list of hotels with their key info: rating, star rating, and total price for the stay period in the user's local currency. The local currency is determined by the user's IP address at the time of the scrape.
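
As a rough illustration of this output format, the sketch below writes a list of hotel records to a CSV file with Python's standard `csv` module. The column names (`name`, `rating`, `star`, `price`), the `save_hotels` helper, and the sample rows are assumptions made for illustration; the actual header in hotel_prices.csv and the code in HotelScrape.py may differ.

```python
import csv

# Hypothetical column names; the real hotel_prices.csv header may differ.
FIELDS = ["name", "rating", "star", "price"]


def save_hotels(hotels, path):
    """Write a list of hotel dicts to a CSV file with a header row."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(hotels)


# Made-up sample rows, not real scraped data.
sample = [
    {"name": "Example Hotel A", "rating": 8.4, "star": 4, "price": 12500},
    {"name": "Example Hotel B", "rating": 7.9, "star": 3, "price": 8300},
]
```

Calling `save_hotels(sample, "hotel_prices_demo.csv")` would produce a file with one header row followed by one row per hotel.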

The default scrape runs from the first to the last page of all available listings. Alternatively, one can comment out parts of the code to scrape only up to a maximum number of pages; instructions for doing so are given in the script.
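
The two modes (all pages versus a capped number of pages) can be sketched as a single loop with an optional limit. This is not how HotelScrape.py does it (the script asks you to comment code in or out); it is a minimal alternative sketch. `scrape_pages`, `fetch_page`, and `fake_fetch` are hypothetical names, and `fake_fetch` stands in for the real browser-driven page scrape.

```python
def scrape_pages(fetch_page, max_pages=None):
    """Collect rows page by page until pages run out or max_pages is reached.

    fetch_page(n) is a hypothetical callable that returns the list of rows
    on page n (1-based), or an empty list when page n does not exist.
    """
    rows = []
    page = 1
    # With max_pages=None the loop only stops when a page comes back empty.
    while max_pages is None or page <= max_pages:
        page_rows = fetch_page(page)
        if not page_rows:
            break
        rows.extend(page_rows)
        page += 1
    return rows


def fake_fetch(page):
    """Stand-in for a real page scrape: three fake pages of two rows each."""
    if page > 3:
        return []
    return [f"hotel-{page}-{i}" for i in (1, 2)]
```

Here `scrape_pages(fake_fetch)` walks all three fake pages, while `scrape_pages(fake_fetch, max_pages=2)` stops after the second.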

Using Task Scheduler on Windows or cron on Unix/Linux, one can schedule the scrape to run daily and accumulate a large amount of data over many days. For example, a tourist planning a trip could collect hotel prices for the destination in the days leading up to it.
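
If the scrape is scheduled daily, each run needs to add to earlier results rather than overwrite them. The sketch below shows one possible way to do that: append each day's rows to a cumulative CSV file, tagged with the scrape date. The `append_daily` helper, the `HEADER` columns, and the history file are assumptions for illustration and are not taken from HotelScrape.py.

```python
import csv
import os
from datetime import date

# Hypothetical columns for a cumulative price-history file.
HEADER = ["scrape_date", "name", "price"]


def append_daily(rows, path, scrape_date=None):
    """Append (name, price) rows to a history CSV, tagged with the scrape date."""
    scrape_date = scrape_date or date.today()
    # Write the header only when the file is missing or empty.
    is_new = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        if is_new:
            writer.writerow(HEADER)
        for name, price in rows:
            writer.writerow([scrape_date.isoformat(), name, price])
```

Because the file is opened in append mode, running this once per day builds up one row per hotel per day, which makes it straightforward to compare prices across scrape dates later.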

The print output after a run of the script is shown below:
