Cloudflare worker for scraping listings from vrbo.com.
- Makes use of the undocumented vrbo.com API on GraphQL.
- Extracts a given number of listings for a specific location.
- Stores the listing details in a Cloudflare D1 database.
- Returns a simplified JSON response with the relevant data.
- Click on Workers
- Create a subdomain (choose a unique name; let's say it's 'my-subdomain')
- Create a service (enter 'wbee' as service name)
- Test the service by clicking on Preview.
It should go to https://wbee.my-subdomain.workers.dev,
which shows a simple 'Hello world' message.
npm install -g wrangler
cd to any directory where you want to put the worker.
git clone https://github.com/quark1482/wbee
cd wbee
npm install
wrangler login
wrangler d1 create wbee
Copy the database_id. Let's say it's '2ba86d35-c3e1-6a21-db83-e963b4789720'.
echo 'name = "wbee"' > wrangler.toml
echo 'main = "src/index.js"' >> wrangler.toml
echo 'compatibility_date = "2023-03-07"' >> wrangler.toml
echo '[[ d1_databases ]]' >> wrangler.toml
echo 'binding = "DB"' >> wrangler.toml
echo 'database_name = "wbee"' >> wrangler.toml
echo 'database_id = "2ba86d35-c3e1-6a21-db83-e963b4789720"' >> wrangler.toml
Use the database_id value from the previous wrangler command.
wrangler publish
Browsing to https://wbee.my-subdomain.workers.dev at this point should show
something like {"error":"Missing parameter 'location'"}, which is expected.
wrangler d1 execute wbee --file=./schema.sql
- Browse to
https://wbee.my-subdomain.workers.dev?location=boston&count=10.
Location can be a full name, like 'boston, massachusetts, united states'.
Invalid locations should show {"error":"Unexpected content: suggestions array came empty"}.
The parameter 'count' is optional and its default value is 50.
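The parameter handling described here can be sketched roughly as follows. This is a minimal illustration, not the code in ./src/index.js: the function name parseParams and the fallback for non-numeric 'count' values are assumptions. Only the parameter names, the default of 50 and the error message come from this README.

```javascript
// Hypothetical helper: extracts 'location' and 'count' from a request URL.
// The actual worker code may be organized differently.
const DEFAULT_COUNT = 50; // default stated in this README

function parseParams(requestUrl) {
  const params = new URL(requestUrl).searchParams;
  const location = (params.get('location') || '').trim();
  if (!location) {
    // Same message the worker shows when 'location' is absent.
    return { error: "Missing parameter 'location'" };
  }
  // Assumption: anything that is not a positive integer falls back to the default.
  const parsed = Number.parseInt(params.get('count') ?? '', 10);
  const count = Number.isInteger(parsed) && parsed > 0 ? parsed : DEFAULT_COUNT;
  return { location, count };
}
```

Calling parseParams('https://wbee.my-subdomain.workers.dev?location=boston&count=10') would yield the location 'boston' and a count of 10. Leaving out 'count' yields 50.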
The count is a maximum; there could be fewer results for small cities.
- At dash.cloudflare.com, click on Workers, and then on D1.
- The database 'wbee' is now visible. Click on its name.
- 'Listings' appears in the list of tables. Click on it.
The table is cleared on every worker request to save resources.
To avoid this behavior, remove the 'Delete From' instruction in the function 'saveResults()',
located in './src/index.js', and publish the worker again.
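To make the clearing step concrete, here is a rough sketch of how a saveResults()-style helper could be laid out against a D1 binding. It is not the code from ./src/index.js: the helper buildSaveStatements, the clearFirst flag and the use of INSERT OR REPLACE are assumptions made for illustration. The prepare(), bind() and batch() calls are part of the D1 client API, and the column names follow ./schema.sql.

```javascript
// Hypothetical sketch of a saveResults()-style helper for a D1 binding.
// Column names follow ./schema.sql; the statement layout is an assumption.
const COLUMNS = [
  'ListingId', 'URL', 'Name', 'Description', 'Type', 'Beds', 'Bedrooms',
  'Bathrooms', 'Guests', 'Price', 'Rating', 'Amenities', 'Photos', 'Location',
];

// Builds the prepared statements for one request.
// 'rows' is an array of arrays whose values are ordered like COLUMNS.
function buildSaveStatements(db, rows, clearFirst = true) {
  const placeholders = COLUMNS.map(() => '?').join(', ');
  const insert = db.prepare(
    `INSERT OR REPLACE INTO Listings (${COLUMNS.join(', ')}) VALUES (${placeholders})`
  );
  const statements = rows.map((row) => insert.bind(...row));
  if (clearFirst) {
    // This is the step to drop in order to keep rows between requests.
    statements.unshift(db.prepare('DELETE FROM Listings'));
  }
  return statements;
}

// Runs all statements in a single D1 batch call.
async function saveResults(db, rows, clearFirst = true) {
  return db.batch(buildSaveStatements(db, rows, clearFirst));
}
```

In this sketch, INSERT OR REPLACE keeps the primary key on ListingId from rejecting a listing that is fetched again once the table is no longer cleared.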
The response JSON includes a simplified array of listing details, which is pretty fast to gather,
compared to the vrbo.com API's internal GraphQL query result.
Pay attention to the file ./schema.sql to see how the Listings table is created:
CREATE TABLE Listings (
ListingId INT,
URL TEXT,
Name TEXT,
Description TEXT,
Type TEXT,
Beds INT,
Bedrooms INT,
Bathrooms INT,
Guests INT,
Price TEXT,
Rating REAL,
Amenities TEXT,
Photos TEXT,
Location TEXT,
PRIMARY KEY (ListingId)
);
Given their composite nature, the fields Price, Amenities, Photos and Location are stored
as JSON content, to work around the limitations of SQLite (D1's underlying database engine).
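As an illustration of the JSON-in-TEXT approach, the following sketch serializes the four composite fields before storing a row and parses them back after reading it. The helper names toRow and fromRow are hypothetical, and the shapes of the example values are made up for illustration. They are not taken from the vrbo.com API or from ./src/index.js.

```javascript
// Hypothetical helpers: only the four field names come from ./schema.sql.
const JSON_FIELDS = ['Price', 'Amenities', 'Photos', 'Location'];

// Converts a listing object into a row whose composite fields are JSON text.
function toRow(listing) {
  const row = { ...listing };
  for (const field of JSON_FIELDS) {
    // Missing fields are stored as the JSON literal 'null'.
    row[field] = JSON.stringify(listing[field] ?? null);
  }
  return row;
}

// Reverses toRow() for a row read back from the Listings table.
function fromRow(row) {
  const listing = { ...row };
  for (const field of JSON_FIELDS) {
    listing[field] = row[field] == null ? null : JSON.parse(row[field]);
  }
  return listing;
}

// Example with made-up values:
const example = toRow({
  ListingId: 1,
  Name: 'Sample listing',
  Price: { amount: 120, currency: 'USD' },
  Amenities: ['wifi', 'kitchen'],
  Photos: [],
  Location: { city: 'Boston' },
});
console.log(typeof example.Amenities); // string, ready for a TEXT column
```

The scalar columns pass through unchanged, so the same row object can be bound directly to an INSERT statement.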
This README file is under construction.