Scrapes America's Test Kitchen website for recipes and saves PNG screenshots and/or JSON for import to a recipe manager (e.g. https://mealie.io).
- Chrome v111. If you have a different version of Chrome, replace the chromedriver with the one that matches your version.
- Python 3.6 with an environment built from `requirements.txt`.
- America's Test Kitchen/Cook's Country/Cook's Illustrated web subscription (or trial).
Options for grabbing individual recipes:
- `-h, --help`: show this help message and exit
- `-e EMAIL, --email EMAIL`: ATK email for login.
- `-p PASSWORD, --password PASSWORD`: Single quoted password for login. For example, `'my_password!*'`
- `-r RECIPES, --recipes RECIPES`: Text file containing a list of individual recipes to grab.
- `-j JSON, --json JSON`: Get recipes as JSON for mealie (default True)
- `-i IMAGE, --image IMAGE`: Get recipes as images (default False)
- `-o OUT_PATH, --out_path OUT_PATH`: Location to save images/JSON (default `./recipes/`)
- `--driver DRIVER`: Path to the chromedriver. (default `./chromedriver`)
- `--verbose`: verbose output
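
The option list above maps naturally onto `argparse`. The following is a minimal sketch of how such a parser could be declared, not the project's actual code: the flag names and defaults come from the help text, while `build_parser` and `str2bool` are hypothetical names introduced here. How the real script interprets the `JSON` and `IMAGE` values is not stated, so the boolean parsing is an assumption.

```python
import argparse


def str2bool(value: str) -> bool:
    """Hypothetical helper: turn a command-line string into a bool.

    Plain `type=bool` would treat every non-empty string (even "False")
    as True, so the accepted spellings are listed explicitly.
    """
    lowered = value.strip().lower()
    if lowered in {"true", "t", "yes", "y", "1"}:
        return True
    if lowered in {"false", "f", "no", "n", "0"}:
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser() -> argparse.ArgumentParser:
    """Declare the options listed in the help text (sketch only)."""
    parser = argparse.ArgumentParser(
        description="Scrape ATK recipes to PNG screenshots and/or JSON."
    )
    parser.add_argument("-e", "--email", help="ATK email for login.")
    parser.add_argument(
        "-p", "--password", help="Single quoted password for login."
    )
    parser.add_argument(
        "-r", "--recipes",
        help="Text file containing a list of individual recipes to grab.",
    )
    # ASSUMPTION: JSON and IMAGE take an explicit true/false value.
    parser.add_argument(
        "-j", "--json", type=str2bool, default=True,
        help="Get recipes as json for mealie (default True)",
    )
    parser.add_argument(
        "-i", "--image", type=str2bool, default=False,
        help="Get recipes as images (default False)",
    )
    parser.add_argument(
        "-o", "--out_path", default="./recipes/",
        help="Location to save images/json (default './recipes/')",
    )
    parser.add_argument(
        "--driver", default="./chromedriver",
        help="Path to the chromedriver. (default './chromedriver')",
    )
    parser.add_argument("--verbose", action="store_true", help="verbose output")
    return parser
```

Calling `build_parser().parse_args()` in a script would then read the options from the command line. Passing an explicit list, such as `parse_args(["-i", "true"])`, is convenient for experimenting.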
Options for crawling search result pages:
- `-h, --help`: show this help message and exit
- `-e EMAIL, --email EMAIL`: ATK email for login.
- `-p PASSWORD, --password PASSWORD`: Single quoted ATK password for login. For example, `'my_password!*'`
- `-r RECIPES, --recipes RECIPES`: Text file containing a list of search result pages to recursively descend and grab all recipes from. See recipes.txt for an example. The "All Recipes" page will not work: despite its name, the site stops loading recipes after 900 are reached, so split the crawl into smaller search sets.
- `-j JSON, --json JSON`: Get recipes as JSON for mealie (default True)
- `-i IMAGE, --image IMAGE`: Get recipes as images (default False)
- `-o OUT_PATH, --out_path OUT_PATH`: Location to save images/JSON (default `./recipes/`)
- `--driver DRIVER`: Path to the chromedriver. (default `./chromedriver`)
- `--verbose`: verbose output
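
The `-r` file is described only as a text file containing a list of pages, so its exact layout is not confirmed here. As a rough sketch, assuming one URL per line, a reader for such a file might look like the following. The function name `read_url_list` and the handling of blank lines, `#` comments, and duplicates are assumptions added for illustration.

```python
from pathlib import Path
from typing import List


def read_url_list(path: str) -> List[str]:
    """Return the URLs listed in a text file, in order and without repeats.

    ASSUMPTION: one URL per line. Blank lines and lines starting with '#'
    are skipped; the real file format may differ.
    """
    urls: List[str] = []
    seen = set()
    for raw_line in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        if line not in seen:
            seen.add(line)
            urls.append(line)
    return urls
```

Dropping repeated lines keeps a page that appears in several search sets from being crawled twice.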
- Selenium opens Chrome driver in headless mode.
- Logs into ATK using credentials provided.
- Iterates through the list of pages, whether individual recipes or full search pages.
- Each page source is passed to BeautifulSoup, which extracts all recipe links.
- Each recipe link is loaded with Selenium. Page dimensions are determined using page divs.
- If `-i` is specified, the Chrome window is resized to fit these dimensions and a screenshot is saved to the specified path. Screenshots are cleaned using Pillow and saved as `<image>.trimmed.png`.
- If `-j` is specified, recipe information is scraped and loaded into JSON for later import to a recipe manager (e.g. mealie). The highlight image is also saved as a `.jp2` image (this is the format used by ATK).
- The program will load the next page and repeat.
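
The link-extraction step in the workflow above can be sketched as follows. The project uses BeautifulSoup for this; the sketch swaps in the standard library's `html.parser` so that it runs without extra dependencies. The names `LinkCollector` and `extract_recipe_links`, and the rule that recipe pages live under a `/recipes/` path, are assumptions for illustration rather than the project's actual logic.

```python
from html.parser import HTMLParser
from typing import List
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                self.hrefs.append(value)


def extract_recipe_links(page_source: str, base_url: str) -> List[str]:
    """Return absolute recipe URLs found in a page, in order, without repeats."""
    collector = LinkCollector()
    collector.feed(page_source)
    links: List[str] = []
    for href in collector.hrefs:
        absolute = urljoin(base_url, href)  # resolve relative links
        # ASSUMPTION: recipe pages are the ones whose path starts with /recipes/.
        if urlparse(absolute).path.startswith("/recipes/") and absolute not in links:
            links.append(absolute)
    return links
```

In the real workflow, `page_source` would be the source of a page loaded by Selenium, and each returned URL would then be loaded in turn.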
This project is for educational, read-only purposes.
The use of this project is done at your own discretion and risk.
You are solely responsible for liability and consequences.