This is a Julia script to scrape a user's Brewtoad account. It currently downloads:
- HTML for each recipe page
- BeerXML for each recipe
- Brew logs for each recipe (under `brewlogs/` in the recipe dir)

Requirements:
- Your Brewtoad user id (the number after "users/" in your profile URL)
- Julia 1.0 or later
- Required packages are specified in the `Project.toml` file. Install them by instantiating the environment (as shown below)

    $ cd brewtoad-scrape
    # install dependencies:
    $ julia -e 'using Pkg; pkg"activate .; instantiate"'
    $ julia --project=. scrape.jl <brewtoad-user-id>

I wrote this before I found Cascadia.jl, which would seriously streamline the selector/querying parts.
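
For a sense of what that could look like, here is a minimal sketch of selector-based querying with Gumbo.jl and Cascadia.jl. It is not taken from `scrape.jl`: the HTML snippet and the `recipe-link` class are made up for illustration, the real Brewtoad markup may differ, and both packages would need to be available in the environment.

```julia
using Gumbo      # HTML parser
using Cascadia   # CSS selectors over Gumbo trees

# Hypothetical markup; the real Brewtoad pages may be structured differently.
html = """
<ul>
  <li><a class="recipe-link" href="/recipes/1">Pale Ale</a></li>
  <li><a class="recipe-link" href="/recipes/2">Stout</a></li>
</ul>
"""

doc = parsehtml(html)

# A single CSS selector stands in for a hand-written walk over the element tree.
links = eachmatch(sel"a.recipe-link", doc.root)

# Collect the href attribute of every matched <a> element.
hrefs = [getattr(a, "href") for a in links]
```
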
At some point Brewtoad started returning `403: Forbidden` responses to requests
from HTTP.jl (even with `User-Agent` set to something sensible), while still
serving pages normally for both a private browser window and `curl`. So I
replaced all the `HTTP.request("GET", uri)` calls with calls out to `curl`,
because I was too lazy to figure out what was really going wrong.
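
As a rough illustration of this kind of workaround (not necessarily the exact code in `scrape.jl`), a page can be fetched from Julia by running `curl` as an external command and reading its output. The helper names `curl_cmd` and `fetch_page` and the choice of curl flags are assumptions made for this sketch.

```julia
# Hypothetical helper: build the curl command for a given URI.
# --silent/--show-error keep output clean but still report failures,
# --fail makes curl exit non-zero on HTTP errors such as 403,
# --location follows redirects.
curl_cmd(uri::AbstractString) =
    `curl --silent --show-error --fail --location $uri`

# Hypothetical helper: run curl and return the response body as a String.
# `read` throws an error if curl exits with a non-zero status.
fetch_page(uri::AbstractString) = read(curl_cmd(uri), String)

# Usage (requires curl on the PATH and network access):
# body = fetch_page("https://example.com")
```

Interpolating `$uri` into a backtick command passes it to `curl` as a single argument without going through a shell, so special characters in the URL do not need quoting.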
This repository also contains my own scraped data in