Commit

Merge branch 'main' into add-aldi-au
mlduff committed Jun 6, 2024
2 parents 1bc7b4b + 188c5fd commit 7911034
Showing 490 changed files with 73,300 additions and 9,420 deletions.
17 changes: 17 additions & 0 deletions README.rst
@@ -89,15 +89,18 @@ Some Python HTTP clients that you can use to retrieve HTML include `requests <ht
Scrapers available for:
-----------------------

- `https://15gram.be <https://15gram.be>`_
- `https://aberlehome.com/ <https://aberlehome.com>`_
- `https://claudia.abril.com.br/ <https://claudia.abril.com.br>`_
- `https://abuelascounter.com/ <https://abuelascounter.com>`_
- `https://www.acouplecooks.com <https://acouplecooks.com/>`_
- `https://aflavorjournal.com/ <https://aflavorjournal.com/>`_
- `https://addapinch.com/ <https://addapinch.com/>`_
- `http://www.afghankitchenrecipes.com/ <http://www.afghankitchenrecipes.com/>`_
- `https://akispetretzikis.com/ <https://akispetretzikis.com/>`_
- `https://ah.nl/ <https://ah.nl/>`_
- `https://alittlebityummy.com/ <https://alittlebityummy.com/>`_
- `https://alexandracooks.com/ <https://alexandracooks.com/>`_
- `https://allrecipes.com/ <https://allrecipes.com/>`_
- `https://allthehealthythings.com/ <https://allthehealthythings.com/>`_
- `https://alltommat.se/ <https://alltommat.se/>`_
@@ -109,6 +112,7 @@ Scrapers available for:
- `https://www.arla.se/ <https://www.arla.se/>`_
- `https://www.atelierdeschefs.fr/ <https://www.atelierdeschefs.fr/>`_
- `https://averiecooks.com/ <https://www.averiecooks.com/>`_
- `https://barefeetinthekitchen.com/ <https://barefeetinthekitchen.com/>`_
- `https://barefootcontessa.com/ <https://barefootcontessa.com>`_
- `https://www.bakels.com.au/ <https://www.bakels.com.au/>`_
- `https://baking-sense.com/ <https://baking-sense.com/>`_
@@ -130,6 +134,7 @@ Scrapers available for:
- `https://breadtopia.com/ <https://breadtopia.com/>`_
- `https://briceletbaklava.ch/ <https://briceletbaklava.ch/>`_
- `https://budgetbytes.com/ <https://budgetbytes.com>`_
- `https://cafedelites.com/ <https://cafedelites.com/>`_
- `https://carlsbadcravings.com/ <https://carlsbadcravings.com/>`_
- `https://castironketo.net/ <https://castironketo.net/>`_
- `https://cdkitchen.com/ <https://cdkitchen.com/>`_
@@ -153,9 +158,11 @@ Scrapers available for:
- `https://cucchiaio.it/ <https://cucchiaio.it>`_
- `https://cuisineaz.com/ <https://cuisineaz.com>`_
- `https://cybercook.com.br/ <https://cybercook.com.br/>`_
- `https://damndelicious.net/ <https://damndelicious.net/>`_
- `https://www.davidlebovitz.com/ <https://www.davidlebovitz.com/>`_
- `https://delish.com/ <https://delish.com>`_
- `https://dinneratthezoo.com/ <https://dinneratthezoo.com>`_
- `https://dinnerthendessert.com/ <https://dinnerthendessert.com/>`_
- `https://dish.co.nz/ <https://dish.co.nz>`_
- `https://domesticate-me.com/ <https://domesticate-me.com/>`_
- `https://downshiftology.com/ <https://downshiftology.com/>`_
@@ -172,6 +179,7 @@ Scrapers available for:
- `https://ethanchlebowski.com/ <https://ethanchlebowski.com>`_
- `https://epicurious.com/ <https://epicurious.com>`_
- `https://www.evolvingtable.com/ <https://www.evolvingtable.com/>`_
- `https://www.familyfoodonthetable.com/ <https://www.familyfoodonthetable.com/>`_
- `https://www.errenskitchen.com/ <https://www.errenskitchen.com/>`_
- `https://recipes.farmhousedelivery.com/ <https://recipes.farmhousedelivery.com/>`_
- `https://www.farmhouseonboone.com/ <https://www.farmhouseonboone.com/>`_
@@ -230,6 +238,7 @@ Scrapers available for:
- `https://izzycooking.com/ <https://izzycooking.com/>`_
- `https://jamieoliver.com/ <https://jamieoliver.com>`_
- `https://jimcooksfoodgood.com/ <https://jimcooksfoodgood.com/>`_
- `https://www.jocooks.com/ <https://www.jocooks.com>`_
- `https://joshuaweissman.com/ <https://joshuaweissman.com/>`_
- `https://joyfoodsunshine.com/ <https://joyfoodsunshine.com>`_
- `https://joythebaker.com/ <https://joythebaker.com>`_
@@ -275,6 +284,7 @@ Scrapers available for:
- `https://www.marthastewart.com/ <https://www.marthastewart.com/>`_
- `https://matprat.no/ <https://matprat.no/>`_
- `https://www.mccormick.com/ <https://www.mccormick.com/>`_
- `https://www.modernhoney.com/ <https://www.modernhoney.com/>`_
- `https://meljoulwan.com/ <https://meljoulwan.com/>`_
- `https://www.melskitchencafe.com/ <https://www.melskitchencafe.com/>`_
- `http://mindmegette.hu/ <http://mindmegette.hu/>`_
@@ -285,6 +295,7 @@ Scrapers available for:
- `https://momswithcrockpots.com/ <https://momswithcrockpots.com>`_
- `https://monsieur-cuisine.com/ <https://monsieur-cuisine.com>`_
- `http://motherthyme.com/ <http://motherthyme.com/>`_
- `https://www.momontimeout.com/ <https://www.momontimeout.com/>`_
- `https://www.moulinex.fr/ <https://www.moulinex.fr/>`_
- `https://www.mundodereceitasbimby.com.pt/ <https://www.mundodereceitasbimby.com.pt/>`_
- `https://mybakingaddiction.com/ <https://mybakingaddiction.com>`_
@@ -298,6 +309,7 @@ Scrapers available for:
- `https://nibbledish.com/ <https://nibbledish.com>`_
- `https://www.nhs.uk/healthier-families/ <https://www.nhs.uk/healthier-families/>`_
- `https://www.nosalty.hu/ <https://www.nosalty.hu>`_
- `https://www.notenoughcinnamon.com/ <https://www.notenoughcinnamon.com/>`_
- `https://nourishedbynutrition.com/ <https://nourishedbynutrition.com/>`_
- `https://www.nrk.no/ <https://www.nrk.no/>`_
- `https://www.number-2-pencil.com/ <https://www.number-2-pencil.com/>`_
@@ -335,6 +347,7 @@ Scrapers available for:
- `https://realsimple.com/ <https://www.realsimple.com>`_
- `https://recept.se/ <https://recept.se/>`_
- `https://www.receitasnestle.com.br <https://www.receitasnestle.com.br>`_
- `https://www.recipegirl.com/ <https://www.recipegirl.com/>`_
- `https://reciperunner.com/ <https://www.reciperunner.com>`_
- `https://recipetineats.com/ <https://www.recipetineats.com/>`_
- `https://redhousespice.com/ <https://redhousespice.com/>`_
@@ -349,6 +362,7 @@ Scrapers available for:
- `https://sallys-blog.de <https://sallys-blog.de/>`_
- `https://saltpepperskillet.com/ <https://saltpepperskillet.com/>`_
- `https://www.saveur.com/ <https://www.saveur.com/>`_
- `https://www.savorynothings.com/ <https://www.savorynothings.com/>`_
- `https://seriouseats.com/ <https://seriouseats.com>`_
- `https://simple-veganista.com/ <https://simple-veganista.com/>`_
- `https://simplyquinoa.com/ <https://simplyquinoa.com>`_
@@ -371,6 +385,7 @@ Scrapers available for:
- `https://sweetcsdesigns.com/ <https://www.sweetcsdesigns.com/>`_
- `https://sweetpeasandsaffron.com/ <https://sweetpeasandsaffron.com/>`_
- `https://www.tasteatlas.com/ <https://www.tasteatlas.com/>`_
- `https://www.thecookierookie.com/ <https://www.thecookierookie.com/>`_
- `https://www.taste.com.au/ <https://www.taste.com.au/>`_
- `https://tasteofhome.com <https://tasteofhome.com>`_
- `https://tastesbetterfromscratch.com <https://tastesbetterfromscratch.com>`_
@@ -387,6 +402,7 @@ Scrapers available for:
- `https://www.themagicalslowcooker.com/ <https://www.themagicalslowcooker.com/>`_
- `https://themodernproper.com/ <https://themodernproper.com/>`_
- `https://www.thepalatablelife.com <https://www.thepalatablelife.com/>`_
- `https://thesaltymarshmallow.com/ <https://thesaltymarshmallow.com/>`_
- `https://thepioneerwoman.com/ <https://thepioneerwoman.com>`_
- `https://therecipecritic.com/ <https://therecipecritic.com>`_
- `https://thespruceeats.com/ <https://thespruceeats.com/>`_
@@ -399,6 +415,7 @@ Scrapers available for:
- `https://tudogostoso.com.br/ <https://www.tudogostoso.com.br/>`_
- `https://twopeasandtheirpod.com/ <http://twopeasandtheirpod.com>`_
- `https://uitpaulineskeuken.nl/ <https://uitpaulineskeuken.nl>`_
- `https://unsophisticook.com/ <https://unsophisticook.com/>`_
- `https://usapears.org/ <https://usapears.org>`_
- `https://www.valdemarsro.dk/ <https://www.valdemarsro.dk/>`_
- `https://vanillaandbean.com/ <https://vanillaandbean.com>`_
28 changes: 14 additions & 14 deletions docs/how-to-develop-scraper.md
@@ -171,20 +171,20 @@ The generated json file will look something like this, with only the host field

```json
{
"author": "",
"canonical_url": "",
"host": "<host>",
"description": "",
"image": "",
"ingredients": "",
"ingredient_groups": "",
"instructions": "",
"instructions_list": "",
"language": "",
"site_name": "",
"title": "",
"total_time": "",
"yields": ""
"host": "<host>",
"canonical_url": "",
"site_name": "",
"author": "",
"language": "",
"title": "",
"ingredients": "",
"ingredient_groups": "",
"instructions": "",
"instructions_list": "",
"total_time": "",
"yields": "",
"image": "",
"description": ""
}
```
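
As a rough illustration of how such a template could be produced, here is a minimal sketch. It assumes nothing beyond the dictionary shown in this commit; the helper name `build_test_template` and the host `example.com` are hypothetical and are not part of `generate.py`. Because Python dictionaries keep insertion order, the key order in the written file follows the order in the code.

```python
import json


def build_test_template(host_name: str) -> dict:
    # Hypothetical helper: builds the empty expectation template with
    # only "host" filled in. Key order follows the order in this commit.
    return {
        "host": host_name,
        "canonical_url": "",
        "site_name": "",
        "author": "",
        "language": "",
        "title": "",
        "ingredients": "",
        "ingredient_groups": "",
        "instructions": "",
        "instructions_list": "",
        "total_time": "",
        "yields": "",
        "image": "",
        "description": "",
    }


template = build_test_template("example.com")

# json.dumps keeps the dict's insertion order and always emits valid JSON
# (no trailing comma after the last key).
serialized = json.dumps(template, indent=2)
print(serialized)
```

The trailing comma after the last entry is fine in the Python literal, but strict JSON does not allow it, so letting `json.dumps` produce the file avoids that class of mistake.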

14 changes: 7 additions & 7 deletions generate.py
@@ -35,20 +35,20 @@ def generate_scraper_test(class_name, host_name):
os.mkdir(f"tests/test_data/{host_name}")

testjson = {
"author": "",
"canonical_url": "",
"host": host_name,
"description": "",
"image": "",
"canonical_url": "",
"site_name": "",
"author": "",
"language": "",
"title": "",
"ingredients": "",
"ingredient_groups": "",
"instructions": "",
"instructions_list": "",
"language": "",
"site_name": "",
"title": "",
"total_time": "",
"yields": "",
"image": "",
"description": "",
}

output = f"tests/test_data/{host_name}/{class_name.lower()}.json"
34 changes: 34 additions & 0 deletions recipe_scrapers/__init__.py
@@ -14,9 +14,11 @@
from .acouplecooks import ACoupleCooks
from .addapinch import AddAPinch
from .afghankitchenrecipes import AfghanKitchenRecipes
from .aflavorjournal import AFlavorJournal
from .akispetretzikis import AkisPetretzikis
from .albertheijn import AlbertHeijn
from .aldi import Aldi
from .alexandracooks import AlexandraCooks
from .alittlebityummy import ALittleBitYummy
from .allrecipes import AllRecipes
from .allthehealthythings import AllTheHealthyThings
@@ -32,6 +34,7 @@
from .bakels import Bakels
from .bakingmischief import BakingMischief
from .bakingsense import BakingSense
from .barefeetinthekitchen import BarefeetInTheKitchen
from .barefootcontessa import BareFootContessa
from .bbcfood import BBCFood
from .bbcgoodfood import BBCGoodFood
@@ -49,6 +52,7 @@
from .breadtopia import Breadtopia
from .briceletbaklava import BricelEtBaklava
from .budgetbytes import BudgetBytes
from .cafedelites import CafeDelites
from .carlsbadcravings import CarlsBadCravings
from .castironketo import CastIronKeto
from .cdkitchen import CdKitchen
@@ -71,9 +75,11 @@
from .cucchiaio import Cucchiaio
from .cuisineaz import CuisineAZ
from .cybercook import Cybercook
from .damndelicious import DamnDelicious
from .davidlebovitz import DavidLebovitz
from .delish import Delish
from .dinneratthezoo import DinnerAtTheZoo
from .dinnerthendessert import DinnerThenDessert
from .dishnz import Dishnz
from .domesticateme import DomesticateMe
from .downshiftology import Downshiftology
@@ -91,10 +97,12 @@
from .errenskitchen import ErrensKitchen
from .ethanchlebowski import EthanChlebowski
from .evolvingtable import EvolvingTable
from .familyfoodonthetable import FamilyfoodOnTheTable
from .farmhousedelivery import FarmhouseDelivery
from .farmhouseonboone import FarmhouseOnBoone
from .fattoincasadabenedetta import FattoInCasaDaBenedetta
from .felixkitchen import FelixKitchen
from .fifteengram import FifteenGram
from .fifteenspatulas import FifteenSpatulas
from .finedininglovers import FineDiningLovers
from .fitmencook import FitMenCook
@@ -144,6 +152,7 @@
from .izzycooking import IzzyCooking
from .jamieoliver import JamieOliver
from .jimcooksfoodgood import JimCooksFoodGood
from .jocooks import JoCooks
from .joshuaweissman import JoshuaWeissman
from .joyfoodsunshine import Joyfoodsunshine
from .joythebaker import JoyTheBaker
@@ -191,6 +200,8 @@
from .ministryofcurry import MinistryOfCurry
from .misya import Misya
from .mob import Mob
from .modernhoney import ModernHoney
from .momontimeout import MomOnTimeout
from .momswithcrockpots import MomsWithCrockPots
from .monsieurcuisine import MonsieurCuisine
from .motherthyme import MotherThyme
@@ -206,6 +217,7 @@
from .nibbledish import NibbleDish
from .nihhealthyeating import NIHHealthyEating
from .norecipes import NoRecipes
from .notenoughcinnamon import NotEnoughCinnamon
from .nourishedbynutrition import NourishedByNutrition
from .nrkmat import NRKMat
from .number2pencil import Number2Pencil
@@ -243,6 +255,7 @@
from .realsimple import RealSimple
from .receitasnestlebr import ReceitasNestleBR
from .recept import Recept
from .recipegirl import RecipeGirl
from .reciperunner import RecipeRunner
from .recipetineats import RecipeTinEats
from .redhousespice import RedHouseSpice
@@ -257,6 +270,7 @@
from .sallysblog import SallysBlog
from .saltpepperskillet import SaltPepperSkillet
from .saveur import Saveur
from .savorynothings import SavoryNothings
from .seriouseats import SeriousEats
from .simpleveganista import SimpleVeganista
from .simplycookit import SimplyCookit
@@ -287,6 +301,7 @@
from .tasty import Tasty
from .tastykitchen import TastyKitchen
from .theclevercarrot import TheCleverCarrot
from .thecookierookie import TheCookieRookie
from .thecookingguy import TheCookingGuy
from .theexpertguides import TheExpertGuides
from .thehappyfoodie import TheHappyFoodie
@@ -298,6 +313,7 @@
from .thepalatablelife import ThePalatableLife
from .thepioneerwoman import ThePioneerWoman
from .therecipecritic import Therecipecritic
from .thesaltymarshmallow import TheSaltyMarshmallow
from .thespruceeats import TheSpruceEats
from .thevintagemixer import TheVintageMixer
from .thewoksoflife import Thewoksoflife
@@ -309,6 +325,7 @@
from .tudogostoso import TudoGostoso
from .twopeasandtheirpod import TwoPeasAndTheirPod
from .uitpaulineskeukennl import UitPaulinesKeukenNL
from .unsophisticook import Unsophisticook
from .usapears import USAPears
from .usdamyplate import USDAMyPlate
from .valdemarsro import Valdemarsro
@@ -338,6 +355,7 @@

SCRAPERS = {
ACoupleCooks.host(): ACoupleCooks,
AFlavorJournal.host(): AFlavorJournal,
ALittleBitYummy.host(): ALittleBitYummy,
AberleHome.host(): AberleHome,
Abril.host(): Abril,
@@ -347,6 +365,7 @@
AkisPetretzikis.host(): AkisPetretzikis,
AlbertHeijn.host(): AlbertHeijn,
Aldi.host(): Aldi,
AlexandraCooks.host(): AlexandraCooks,
AllRecipes.host(): AllRecipes,
AllTheHealthyThings.host(): AllTheHealthyThings,
AllTomat.host(): AllTomat,
@@ -365,6 +384,7 @@
BakingSense.host(): BakingSense,
BakingMischief.host(): BakingMischief,
BareFootContessa.host(): BareFootContessa,
BarefeetInTheKitchen.host(): BarefeetInTheKitchen,
BestRecipes.host(): BestRecipes,
BettyBossi.host(): BettyBossi,
BettyCrocker.host(): BettyCrocker,
@@ -379,6 +399,7 @@
Breadtopia.host(): Breadtopia,
BricelEtBaklava.host(): BricelEtBaklava,
BudgetBytes.host(): BudgetBytes,
CafeDelites.host(): CafeDelites,
CarlsBadCravings.host(): CarlsBadCravings,
CastIronKeto.host(): CastIronKeto,
CdKitchen.host(): CdKitchen,
@@ -401,38 +422,50 @@
Cucchiaio.host(): Cucchiaio,
CuisineAZ.host(): CuisineAZ,
Cybercook.host(): Cybercook,
DamnDelicious.host(): DamnDelicious,
DavidLebovitz.host(): DavidLebovitz,
Delish.host(): Delish,
DinnerAtTheZoo.host(): DinnerAtTheZoo,
DinnerThenDessert.host(): DinnerThenDessert,
Dishnz.host(): Dishnz,
EatLiveRun.host(): EatLiveRun,
ElaVegan.host(): ElaVegan,
EvolvingTable.host(): EvolvingTable,
FamilyfoodOnTheTable.host(): FamilyfoodOnTheTable,
FifteenGram.host(): FifteenGram,
FitSlowCookerQueen.host(): FitSlowCookerQueen,
GourmetTraveller.host(): GourmetTraveller,
GrandFrais.host(): GrandFrais,
HeatherChristo.host(): HeatherChristo,
InBloomBakery.host(): InBloomBakery,
JoCooks.host(): JoCooks,
JoshuaWeissman.host(): JoshuaWeissman,
JoyTheBaker.host(): JoyTheBaker,
KitchenAidAustralia.host(): KitchenAidAustralia,
KristinesKitchenBlog.host(): KristinesKitchenBlog,
KuchynaLidla.host(): KuchynaLidla,
McCormick.host(): McCormick,
ModernHoney.host(): ModernHoney,
MomOnTimeout.host(): MomOnTimeout,
Moulinex.host(): Moulinex,
MundoDeReceitasBimby.host(): MundoDeReceitasBimby,
MyJewishLearning.host(): MyJewishLearning,
MyKoreanKitchen.host(): MyKoreanKitchen,
NotEnoughCinnamon.host(): NotEnoughCinnamon,
NutritionFacts.host(): NutritionFacts,
OneSweetAppetite.host(): OneSweetAppetite,
PinchOfYum.host(): PinchOfYum,
PotatoRolls.host(): PotatoRolls,
Recept.host(): Recept,
RecipeGirl.host(): RecipeGirl,
RicettePerBimby.host(): RicettePerBimby,
SavoryNothings.host(): SavoryNothings,
StrongrFastr.host(): StrongrFastr,
TasteAtlas.host(): TasteAtlas,
TheCookieRookie.host(): TheCookieRookie,
TheCookingGuy.host(): TheCookingGuy,
ThePalatableLife.host(): ThePalatableLife,
TheSaltyMarshmallow.host(): TheSaltyMarshmallow,
Thinlicious.host(): Thinlicious,
DomesticateMe.host(): DomesticateMe,
Downshiftology.host(): Downshiftology,
@@ -667,6 +700,7 @@
TwoPeasAndTheirPod.host(): TwoPeasAndTheirPod,
USAPears.host(): USAPears,
USDAMyPlate.host(): USDAMyPlate,
Unsophisticook.host(): Unsophisticook,
Valdemarsro.host(): Valdemarsro,
VanillaAndBean.host(): VanillaAndBean,
VegRecipesOfIndia.host(): VegRecipesOfIndia,
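
The `SCRAPERS` dictionary in the hunks above maps each scraper's `host()` value to its class. The following is a minimal sketch of that registry pattern only. The classes `ExampleScraper` and `OtherScraper`, their hosts, and the `find_scraper` helper are hypothetical stand-ins, and stripping a leading `www.` before the lookup is an assumption of this sketch rather than something the diff shows.

```python
from typing import Optional
from urllib.parse import urlparse


class ExampleScraper:
    # Hypothetical scraper class; real scrapers live in their own modules.
    @classmethod
    def host(cls) -> str:
        return "example.com"


class OtherScraper:
    # Second hypothetical scraper, to show several registry entries.
    @classmethod
    def host(cls) -> str:
        return "other.example.org"


# Same shape as the registry in the diff: host string -> scraper class.
SCRAPERS = {
    ExampleScraper.host(): ExampleScraper,
    OtherScraper.host(): OtherScraper,
}


def find_scraper(url: str) -> Optional[type]:
    # Hypothetical lookup helper: extract the hostname, drop a leading
    # "www." (an assumption of this sketch), and consult the registry.
    hostname = urlparse(url).hostname or ""
    if hostname.startswith("www."):
        hostname = hostname[len("www."):]
    return SCRAPERS.get(hostname)


scraper_class = find_scraper("https://www.example.com/recipes/1")
print(scraper_class.__name__ if scraper_class else "no scraper registered")
```

Keying the dictionary on `host()` means adding a site takes one import and one registry entry, which is the pattern each added line in this commit follows.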
