My scraper collects three sets of data from the LCBO website: store information, product inventory, and product descriptions.
Designed by pch.vector / Freepik
There are approximately 9,400 wine and 4,100 non-wine registered products. "Non-wine" products include beer, liquor, and reusable bags.
When a GitHub Actions workflow (see .github/workflows/) is triggered, it executes a bash script. The script contains a cURL command that returns JSON with the desired data. That's it!
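As a rough illustration of that flow, here is a minimal sketch of what such a script could look like. It is not the script from this repository: the function name fetch_json, the retry settings, and any URL or output path you pass to it are assumptions made for the example.

```shell
#!/usr/bin/env bash
# Minimal sketch (hypothetical, not the repository's actual script).
set -eu

# fetch_json URL OUTFILE
# Downloads the response from URL and saves it to OUTFILE.
fetch_json() {
  local url="$1"
  local out="$2"

  # Create the output directory if it does not exist yet.
  mkdir -p "$(dirname "$out")"

  # --fail        exit non-zero on HTTP errors instead of saving an error page
  # --silent      hide the progress meter; --show-error still prints failures
  # --location    follow redirects
  # --retry 3     retry transient failures a few times, 5 seconds apart
  curl --fail --silent --show-error --location \
       --retry 3 --retry-delay 5 \
       --header "Accept: application/json" \
       --output "$out" \
       "$url"
}
```

A workflow step could then call something like fetch_json "$API_URL" data/products.json, where API_URL and the output path are placeholders, and commit the resulting file.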
For an in-depth guide, check out these blog posts:
- Scraping LCBO Data (Part 1: Store Information)
- Scraping LCBO Data (Part 2: Product Inventory)
- Scraping LCBO Data (Part 3: Product Descriptions)
- Making my own scraper bot!
- DIY Wine Database with postgreSQL
- Fork this repository!
- In Settings > Actions > General > Workflow permissions, select "Read and write permissions".
- Modify the frequency (cron) of scraping in the workflow files in .github/workflows.
- Please scrape gently. I purposely do not run simultaneous scraping jobs because (a) I am in no rush, (b) I don't want LCBO to be mad and change their setup, and (c) it is a waste of free CPU minutes.
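For the cron step above, a schedule block in a workflow file generally looks like the following. This is a hypothetical excerpt, not copied from this repository, and the daily 06:00 UTC timing is only an example value.

```yaml
# Hypothetical excerpt of a workflow file under .github/workflows/
on:
  schedule:
    # Fields: minute hour day-of-month month day-of-week (times are in UTC).
    # This example runs once a day at 06:00 UTC.
    - cron: "0 6 * * *"
  # Also allow manual runs from the Actions tab.
  workflow_dispatch:
```

Choosing a less frequent schedule is the simplest way to keep the scraping gentle.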
Stephen Ro
This project is licensed under the BSD 3-Clause License; see the LICENSE.md file for details.