booking_scraper

I wrote this Python 3+ project to prove that it is possible to scrape complex, data-heavy websites such as booking.com, to study the BeautifulSoup module, and to have fun programming.

For this project I started from @ZoranPandovski's BookingScraper.

How to install

$ pip install booking-scraper

How to use

From bash

Launch the script with:

$ python3 -m booking_scraper.bkscraper --city "New York" --limit 0

Usable Parameters

Parameter	Description	Mandatory
`--people`	Number of people for the booking request. Defaults to 1.	No
`--country`	Country for the scraping request.	Yes, if `--city` is not used
`--city`	City for the scraping request.	Yes, if `--country` is not used
`--datein`	Check-in date in YYYY-mm-dd format (e.g. 2021-01-21). Defaults to today.	No
`--dateout`	Check-out date in YYYY-mm-dd format (e.g. 2021-02-21). Defaults to tomorrow.	No
`-o --outdir`	Output directory and filename. If not set, the file is written to the script's execution directory and named CountryCity_date_time.json.	No
`-d --detail`	Adds more details to the output. The output takes longer to produce (not recommended for long analyses or when `--limit` is not set to 0).	No
`-v --verbose`	Prints more logs during the process.	No
`-l --limit`	Number of pages to fetch. 0 fetches all pages. A value lower than the number of pages returned by the search processes only that many pages. A value below 0, or greater than the number of pages returned, is ignored.	No
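The `--limit` rules in the table can be summarised as a small function. This is only a sketch written from the description above, not code taken from the package: `pages_to_fetch` and its arguments are hypothetical names, and it assumes that an "ignored" limit means every available page is fetched.

```python
def pages_to_fetch(limit: int, total_pages: int) -> int:
    """Return how many result pages would be processed for a given limit.

    Hypothetical helper based only on the README description; it is not
    part of booking_scraper.
    """
    # 0, negative values, and values above the real page count are all
    # assumed to mean "process every page the search returned".
    if limit <= 0 or limit > total_pages:
        return total_pages
    # Otherwise only the first `limit` pages are processed.
    return limit


if __name__ == "__main__":
    # Show the outcome for a search that returned 10 pages.
    for limit in (-1, 0, 3, 50):
        print(limit, pages_to_fetch(limit, total_pages=10))
```

Under this reading, `--limit 0` and leaving the limit above the real page count behave the same way, so a small positive value is the only way to shorten a run.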

Bash Example

$ python3 -m booking_scraper.bkscraper --city "New York" --limit 0 -d -v -o ./test.json

From python Code

You can import the module with:

from booking_scraper import bkscraper

The only available method is:

bkscraper.get_result(**kwargs)

It uses the same logic as the bash execution mode. The available parameters are:

Parameter	Description	Mandatory
`people`	Number of people for the booking request. Defaults to 1.	No
`country`	Country for the scraping request.	Yes, if `city` is not used
`city`	City for the scraping request.	Yes, if `country` is not used
`datein`	Check-in date in YYYY-mm-dd format (e.g. 2021-01-21). Defaults to today.	No
`dateout`	Check-out date in YYYY-mm-dd format (e.g. 2021-02-21). Defaults to tomorrow.	No
`detail`	Adds more details to the output. The output takes longer to produce (not recommended for long analyses or when `limit` is not set to 0).	No
`limit`	Number of pages to fetch. 0 fetches all pages. A value lower than the number of pages returned by the search processes only that many pages. A value below 0, or greater than the number of pages returned, is ignored.	No

If neither the city nor the country parameter is set, the script raises an Exception.
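The documented defaults and the city/country requirement can be sketched as a pre-check that runs before calling the scraper. This is an illustration only: `normalise_params` is a hypothetical helper, not part of booking_scraper, and the use of `ValueError` is an assumption, since the README does not say which exception type is raised.

```python
from datetime import date, timedelta


def normalise_params(city=None, country=None, datein=None, dateout=None):
    """Apply the documented defaults to a set of search parameters.

    Hypothetical helper; it mirrors the README rules and does not call
    booking_scraper itself.
    """
    # At least one of city or country must be given.
    if not city and not country:
        raise ValueError("set at least one of 'city' or 'country'")

    today = date.today()
    return {
        "city": city,
        "country": country,
        # Dates use the YYYY-mm-dd format; check-in defaults to today,
        # check-out defaults to tomorrow.
        "datein": datein or today.strftime("%Y-%m-%d"),
        "dateout": dateout or (today + timedelta(days=1)).strftime("%Y-%m-%d"),
    }


if __name__ == "__main__":
    print(normalise_params(city="New York"))
```

Checking the arguments up front like this fails fast on a missing location instead of waiting for a network request to start.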

Code Example

import json
from booking_scraper import bkscraper

# Fetch only the first page of results for New York, with details
result = bkscraper.get_result(city="New York", limit=1, detail=True)

# Write the result as JSON; the with block closes the file automatically
with open("output.json", "w", encoding="utf-8") as f:
    json.dump(result, f, ensure_ascii=False, indent=4)

Public

This script was first featured on Reddit at /r/learnpython - see here for the thread. I’ve finally decided to release the script.

ToDo List

Scrape the full facility checklist offered by each property, divided into groups as in the website's detail view
Make the script usable as a module
Apply filters to the search
Review the code to make it lighter

Disclaimer

This project is for educational and research purposes only. Any actions and/or activities related to the material contained in this GitHub Repository are solely your responsibility. The misuse of the information in this GitHub Repository can result in criminal charges brought against the persons in question. The author will not be held responsible in the event any criminal charges are brought against any individuals misusing the information in this GitHub Repository to break the law.

You are not allowed to copy and paste content from Booking.com on to your own or third party pages (including social media pages such as Facebook, Twitter, Instagram etc.).

This applies to all types of content that can be found on Booking.com pages, including but not limited to hotel descriptions, reviews, hotel and room photos, hotel facility information, and prices. Moreover, this restriction also applies to content from Booking.com partner hotel websites and Booking Holdings Group company brands: such as Agoda, Priceline, Kayak, OpenTable, Rentalcars.

Clause 4.1.5 from Booking.com Affiliate Agreement: The Affiliate shall not programmatically evaluate and extract information (including guest reviews) from any part of the Booking.com Website (e.g. screen scrape)
