CookBook is a CRM to manage restaurants and recipes. Users can:
- List available recipes.
- See a detailed view of a recipe.
- Add a new recipe.
- Run the initial setup: `make docker_setup`
- Create the migrations: `make docker_makemigrations`
- Run the migrations: `make docker_migrate`
- Scrape and populate the database: `make docker_populate`
- Run the project: `make docker_up`
- Access http://localhost:8000 in your browser; the project should be running there.
- To access the logs for each service, run `make docker_logs <service name>` (either `backend`, `frontend`, etc.).
- To stop the project, run `make docker_down`.
- Add tests for scraping recipes.
- Add API tests for CRUD operations.
- Add frontend tests.
- Set up cron jobs to scrape recipes.
- Set up a CI/CD pipeline.
The web scraper crawls https://www.allrecipes.com/ for the recipe pages it exposes. Although the website contains over 10,000 recipes, we can decide how many we want to scrape. The scraper starts at the main page and looks through all the links listed there, filtering them into Recipe List URLs and Recipe Detail URLs. It uses depth-first search to crawl the whole website, following Recipe List URLs to discover further Recipe Detail URLs. At the end it parses every Recipe Detail URL, captures the data, and stores it in the database.
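A minimal sketch of this depth-first crawl, assuming `requests` and `BeautifulSoup`; the URL patterns, limit, and function names here are illustrative, not the project's actual code:

```python
import re
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup

BASE_URL = "https://www.allrecipes.com/"
# Assumed URL patterns; the site's real routes may differ.
LIST_URL = re.compile(r"allrecipes\.com/recipes/")
DETAIL_URL = re.compile(r"allrecipes\.com/recipe/\d+")


def crawl(url: str, visited: set, details: set, limit: int = 50) -> None:
    """Depth-first crawl: recurse into Recipe List pages, collect Recipe Detail URLs."""
    if url in visited or len(details) >= limit:
        return
    visited.add(url)
    try:
        response = requests.get(url, timeout=10)
        response.raise_for_status()
    except requests.RequestException:
        return  # skip pages that fail to load
    soup = BeautifulSoup(response.text, "html.parser")
    for anchor in soup.find_all("a", href=True):
        link = urljoin(url, anchor["href"])
        if DETAIL_URL.search(link):
            details.add(link)  # a recipe page to parse later
        elif LIST_URL.search(link):
            crawl(link, visited, details, limit)  # go deeper first (DFS)


detail_urls: set = set()
crawl(BASE_URL, visited=set(), details=detail_urls)
print(f"Found {len(detail_urls)} recipe detail URLs")
```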
- The CookBook scraper extracts the text from each Recipe Detail web page and stores it in the database. We can apply Natural Language Processing techniques to break the text down into finer-grained details, which can help eliminate redundant data storage. Examples (a parsing sketch follows this list):
  - `½ cup diced onion` → `{ "ingredient": "Onion", "quantity": 0.5, "unit": "cup", "cut": "diced" }`
  - `2 ½ cups egg noodles` → `{ "ingredient": "Egg Noodles", "quantity": 2.5, "unit": "cup" }`
  - `2 tablespoons vodka` → `{ "ingredient": "Vodka", "quantity": 2, "unit": "tablespoon" }`
  - `4 (4 ounce) salmon fillets` → `{ "ingredient": "Salmon", "quantity": 4, "unit": "ounce", "cut": "fillet" }`
- CookBook uses SQL data storage, but this limits the structure of the data and the size of the text it can store. A NoSQL database would make it easier to store data without worrying about data loss, and would help extend CookBook to scrape and store data from different websites and sources without worrying too much about the schema.
- We can use Airflow or similar tools to schedule scraping of data from different sources at regular intervals (a DAG sketch follows below).
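As a rough illustration of the ingredient-parsing idea above, here is a minimal rule-based sketch in Python. The token tables and function name are assumptions, not project code, and cases like the parenthesized size in `4 (4 ounce) salmon fillets` would need extra handling:

```python
# Unicode fraction glyphs the scraper is likely to encounter (illustrative subset).
FRACTIONS = {"½": 0.5, "⅓": 1 / 3, "¼": 0.25, "¾": 0.75}
# Normalize plural units to the singular forms used in the stored JSON.
UNITS = {
    "cup": "cup", "cups": "cup",
    "tablespoon": "tablespoon", "tablespoons": "tablespoon",
    "teaspoon": "teaspoon", "teaspoons": "teaspoon",
    "ounce": "ounce", "ounces": "ounce",
}
CUTS = {"diced", "chopped", "sliced", "minced", "fillet", "fillets"}


def parse_ingredient(text: str) -> dict:
    """Rule-based parse of lines like '2 ½ cups egg noodles' into structured fields."""
    tokens = text.split()
    quantity = 0.0
    while tokens:  # consume leading numbers and fraction glyphs
        token = tokens[0]
        if token in FRACTIONS:
            quantity += FRACTIONS[token]
        elif token.replace(".", "", 1).isdigit():
            quantity += float(token)
        else:
            break
        tokens.pop(0)
    result = {"quantity": quantity or None}
    if tokens and tokens[0].lower() in UNITS:
        result["unit"] = UNITS[tokens.pop(0).lower()]
    cuts = [t for t in tokens if t.lower() in CUTS]
    if cuts:
        result["cut"] = cuts[0].lower().rstrip("s")  # 'fillets' -> 'fillet'
    result["ingredient"] = " ".join(t for t in tokens if t.lower() not in CUTS).title()
    return result


print(parse_ingredient("2 ½ cups egg noodles"))
# {'quantity': 2.5, 'unit': 'cup', 'ingredient': 'Egg Noodles'}
```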
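Likewise, a minimal sketch of what scheduled scraping could look like as an Airflow 2.x DAG; the DAG id, schedule, and the choice to shell out to the existing `make docker_populate` target are illustrative:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

# Hypothetical DAG: re-runs the project's populate target once a day.
with DAG(
    dag_id="scrape_recipes",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    BashOperator(
        task_id="scrape_allrecipes",
        bash_command="make docker_populate",  # reuse the repo's populate target
    )
```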
- Django for the backend.
- Django Rest Framework to create APIs.
- PostgreSQL to store data.
- React for the frontend.
- Ant Design for UI components.
- Ant Design Icons for icons.
Using a SQL database requires defining a schema for the models, and thus setting a maximum length for each CharField in Django. The scraped and parsed data must be trimmed to fit, so the displayed data might be incomplete or might not make sense. The following are the maximum numbers of characters that can be stored for the different models and fields:
- Recipe
  - title: 100
  - description: 200
  - directions: 400
- Ingredient
  - name: 20
- Restaurant
  - name: 20
  - address: 40
  - city: 10
- Dish
  - name: 10
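For reference, a minimal sketch of how these limits might appear in the Django models; the actual models in the project may carry additional fields and options:

```python
from django.db import models


class Recipe(models.Model):
    title = models.CharField(max_length=100)
    description = models.CharField(max_length=200)
    directions = models.CharField(max_length=400)


class Ingredient(models.Model):
    name = models.CharField(max_length=20)


class Restaurant(models.Model):
    name = models.CharField(max_length=20)
    address = models.CharField(max_length=40)
    city = models.CharField(max_length=10)


class Dish(models.Model):
    name = models.CharField(max_length=10)
```

Switching long text such as `directions` to a TextField would avoid these caps entirely, at the cost of losing the fixed-width guarantee.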