CookBook is a CRM to manage restaurants and recipes. Users can:
- List available recipes.
- See a detailed view of a recipe.
- Add a new recipe.
- Run the initial setup: `make docker_setup`
- Create the migrations: `make docker_makemigrations`
- Run the migrations: `make docker_migrate`
- Scrape and populate the database: `make docker_populate`
- Run the project: `make docker_up`
- Access http://localhost:8000 in your browser; the project should be running there.
- To access the logs for each service, run `make docker_logs <service name>` (either `backend`, `frontend`, etc.).
- To stop the project, run `make docker_down`.
- Add tests for scraping recipes.
- Add API tests for CRUD operations.
- Add frontend tests.
- Set up cron jobs to scrape recipes.
- Set up a CI/CD pipeline.
The web scraper crawls https://www.allrecipes.com/ for the recipe pages it exposes. Although the website contains over 10,000 recipes, we can decide how many we want to scrape. The scraper starts at the main page and looks through all the links listed there, filtering them into Recipe List URLs and Recipe Detail URLs. It uses depth-first search to crawl the whole website, following Recipe List URLs to discover further Recipe Detail URLs. At the end it parses every Recipe Detail URL, captures the data, and stores it in the database.
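A minimal sketch of this depth-first crawl, assuming `requests` and `BeautifulSoup`; the URL patterns, limit, and function names here are illustrative, not the project's actual code:

```python
import re
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup

BASE_URL = "https://www.allrecipes.com/"
# Assumed URL patterns; the site's real routes may differ.
LIST_URL = re.compile(r"allrecipes\.com/recipes/")
DETAIL_URL = re.compile(r"allrecipes\.com/recipe/\d+")


def crawl(url: str, visited: set, details: set, limit: int = 50) -> None:
    """Depth-first crawl: recurse into Recipe List pages, collect Recipe Detail URLs."""
    if url in visited or len(details) >= limit:
        return
    visited.add(url)
    try:
        response = requests.get(url, timeout=10)
        response.raise_for_status()
    except requests.RequestException:
        return  # skip pages that fail to load
    soup = BeautifulSoup(response.text, "html.parser")
    for anchor in soup.find_all("a", href=True):
        link = urljoin(url, anchor["href"])
        if DETAIL_URL.search(link):
            details.add(link)  # a recipe page to parse later
        elif LIST_URL.search(link):
            crawl(link, visited, details, limit)  # go deeper first (DFS)


detail_urls: set = set()
crawl(BASE_URL, visited=set(), details=detail_urls)
print(f"Found {len(detail_urls)} recipe detail URLs")
```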
- The CookBook scraper extracts the text from each Recipe Detail web page and stores it in the database. We can apply Natural Language Processing techniques to break the text down into finer-grained details, which can help eliminate redundant data storage. Examples (a parsing sketch follows this list):
  - `½ cup diced onion` → `{ "ingredient": "Onion", "quantity": 0.5, "unit": "cup", "cut": "diced" }`
  - `2 ½ cups egg noodles` → `{ "ingredient": "Egg Noodles", "quantity": 2.5, "unit": "cup" }`
  - `2 tablespoons vodka` → `{ "ingredient": "Vodka", "quantity": 2, "unit": "tablespoon" }`
  - `4 (4 ounce) salmon fillets` → `{ "ingredient": "Salmon", "quantity": 4, "unit": "ounce", "cut": "fillet" }`
- CookBook uses SQL data storage, but this limits the structure of the data and the size of the text it can store. A NoSQL database would make it easier to store data without worrying about data loss, and would help extend CookBook to scrape and store data from different websites and sources without worrying too much about the schema.
- We can use Airflow or similar tools to schedule scraping of data from different sources at regular intervals (a DAG sketch follows below).
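As a rough illustration of the ingredient-parsing idea above, here is a minimal rule-based sketch in Python. The token tables and function name are assumptions, not project code, and cases like the parenthesized size in `4 (4 ounce) salmon fillets` would need extra handling:

```python
# Unicode fraction glyphs the scraper is likely to encounter (illustrative subset).
FRACTIONS = {"½": 0.5, "⅓": 1 / 3, "¼": 0.25, "¾": 0.75}
# Normalize plural units to the singular forms used in the stored JSON.
UNITS = {
    "cup": "cup", "cups": "cup",
    "tablespoon": "tablespoon", "tablespoons": "tablespoon",
    "teaspoon": "teaspoon", "teaspoons": "teaspoon",
    "ounce": "ounce", "ounces": "ounce",
}
CUTS = {"diced", "chopped", "sliced", "minced", "fillet", "fillets"}


def parse_ingredient(text: str) -> dict:
    """Rule-based parse of lines like '2 ½ cups egg noodles' into structured fields."""
    tokens = text.split()
    quantity = 0.0
    while tokens:  # consume leading numbers and fraction glyphs
        token = tokens[0]
        if token in FRACTIONS:
            quantity += FRACTIONS[token]
        elif token.replace(".", "", 1).isdigit():
            quantity += float(token)
        else:
            break
        tokens.pop(0)
    result = {"quantity": quantity or None}
    if tokens and tokens[0].lower() in UNITS:
        result["unit"] = UNITS[tokens.pop(0).lower()]
    cuts = [t for t in tokens if t.lower() in CUTS]
    if cuts:
        result["cut"] = cuts[0].lower().rstrip("s")  # 'fillets' -> 'fillet'
    result["ingredient"] = " ".join(t for t in tokens if t.lower() not in CUTS).title()
    return result


print(parse_ingredient("2 ½ cups egg noodles"))
# {'quantity': 2.5, 'unit': 'cup', 'ingredient': 'Egg Noodles'}
```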
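Likewise, a minimal sketch of what scheduled scraping could look like as an Airflow 2.x DAG; the DAG id, schedule, and the choice to shell out to the existing `make docker_populate` target are illustrative:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

# Hypothetical DAG: re-runs the project's populate target once a day.
with DAG(
    dag_id="scrape_recipes",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    BashOperator(
        task_id="scrape_allrecipes",
        bash_command="make docker_populate",  # reuse the repo's populate target
    )
```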
- Django for the backend.
- Django Rest Framework to create APIs.
- PostgreSQL to store data.
- React for the frontend.
- Ant Design for UI components.
- Ant Design Icons for icons.
Using a SQL database requires defining a schema for the models, and thus setting a maximum length for each CharField in Django. The scraped and parsed data must be trimmed to fit, so the displayed data might be incomplete or might not make sense. The following are the maximum numbers of characters that can be stored for the different models and fields:
- Recipe
  - title: 100
  - description: 200
  - directions: 400
- Ingredient
  - name: 20
- Restaurant
  - name: 20
  - address: 40
  - city: 10
- Dish
  - name: 10
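For reference, a minimal sketch of how these limits might appear in the Django models; the actual models in the project may carry additional fields and options:

```python
from django.db import models


class Recipe(models.Model):
    title = models.CharField(max_length=100)
    description = models.CharField(max_length=200)
    directions = models.CharField(max_length=400)


class Ingredient(models.Model):
    name = models.CharField(max_length=20)


class Restaurant(models.Model):
    name = models.CharField(max_length=20)
    address = models.CharField(max_length=40)
    city = models.CharField(max_length=10)


class Dish(models.Model):
    name = models.CharField(max_length=10)
```

Switching long text such as `directions` to a TextField would avoid these caps entirely, at the cost of losing the fixed-width guarantee.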