NYCDSA Web Scraping Project

This is a web scraping project in which I scraped the recipe website xiachufang using the Scrapy package.
The Scrapy code is in the xiachufang folder.
The Data_Analysis folder contains the csv files and the Jupyter notebook used to process and analyze the data.

I mainly studied the following questions.

1. How do exclusive status, picture steps, master cook status and the author influence a recipe's rating, its number of attempts and the number of ingredients it uses?
2. How popular is each of the major ingredients: pork, chicken, beef, lamb, fish, shrimp, egg and tofu?
3. How does the number of ingredients affect a recipe's rating and its number of attempts?
4. Which combinations of ingredients are the most common?
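
As a rough illustration of questions 2 and 4, the sketch below counts how many recipes contain each major ingredient and which ingredient pairs appear together most often. It is not the code from the notebook: the `recipes` rows, the field names and the helper functions `ingredient_popularity` and `common_pairs` are all hypothetical, and the ingredient names are written in English for readability. It uses only the Python standard library.

```python
from collections import Counter
from itertools import combinations

# Hypothetical sample rows, invented for illustration only.
# The real scraped data lives in the csv files in the Data_Analysis folder.
recipes = [
    {"name": "recipe_a", "ingredients": ["pork", "tofu", "egg"]},
    {"name": "recipe_b", "ingredients": ["pork", "egg"]},
    {"name": "recipe_c", "ingredients": ["chicken", "egg"]},
    {"name": "recipe_d", "ingredients": ["pork", "tofu"]},
]

# The major ingredients listed in question 2.
MAJOR_INGREDIENTS = ["pork", "chicken", "beef", "lamb", "fish", "shrimp", "egg", "tofu"]


def ingredient_popularity(rows, majors=MAJOR_INGREDIENTS):
    """Count how many recipes contain each major ingredient."""
    counts = Counter({m: 0 for m in majors})
    for row in rows:
        present = set(row["ingredients"])  # ignore duplicates within a recipe
        for m in majors:
            if m in present:
                counts[m] += 1
    return counts


def common_pairs(rows, top_n=3):
    """Return the top_n unordered ingredient pairs that co-occur in a recipe."""
    pair_counts = Counter()
    for row in rows:
        # Sort so that (egg, pork) and (pork, egg) are counted as the same pair.
        unique = sorted(set(row["ingredients"]))
        pair_counts.update(combinations(unique, 2))
    return pair_counts.most_common(top_n)


popularity = ingredient_popularity(recipes)
pairs = common_pairs(recipes)
print(popularity.most_common())
print(pairs)
```

Counting recipes rather than raw mentions keeps a recipe that lists the same ingredient twice from inflating the result; whether that matches the notebook's definition of popularity is an assumption.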

Possible future work includes the following.
1. Perform a statistical analysis of exclusiveness, picture steps and master cooks.
2. Classify the ingredients in more detail.
3. If possible, scrape another similar website and check whether the conclusions about common ingredient combinations hold there as well.
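
One possible starting point for the statistical analysis in item 1 is a permutation test on the difference in mean rating between two groups of recipes, for example exclusive and non-exclusive ones. This is only a sketch of one option, not a method used in this project: the function `permutation_test` and the rating values are hypothetical, and a different test may suit the real data better.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_permutations=2000, seed=0):
    """Two-sided permutation test on the difference in group means.

    Returns an estimated p-value: the share of random relabelings whose
    absolute mean difference is at least as large as the observed one.
    """
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # Add one to both counts so the estimate is never exactly zero.
    return (hits + 1) / (n_permutations + 1)


# Invented ratings, for illustration only (not scraped values).
exclusive_ratings = [8.5, 8.7, 8.9, 9.0, 8.8, 8.6]
other_ratings = [7.0, 7.2, 7.1, 6.9, 7.3, 7.4]

p_value = permutation_test(exclusive_ratings, other_ratings)
print(p_value)
```

A permutation test makes no normality assumption about the ratings, which is why it is used in this sketch; the same function could be applied to the number of attempts.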
