This is a web scraping project in which I scraped the recipe website xiachufang using the Scrapy package.
The Scrapy code is contained in the xiachufang folder.
The Data analysis folder contains the CSV files and a Jupyter notebook for processing and analyzing the data.
I mainly studied the following questions.
1. How do exclusive status, picture steps, master-cook status and the author influence a recipe's rating, its number of attempts and the number of ingredients it uses?
2. How popular is each of the major ingredients?
   Pork, chicken, beef, lamb, fish, shrimp, egg, tofu
3. How does the number of ingredients affect a recipe's rating and number of attempts?
4. Which combinations of ingredients are most common?
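
Questions 2 and 4 come down to counting. The following is a minimal sketch of one way to do that counting, not the notebook's actual code. The sample recipes, the MAJOR set and both function names are hypothetical and only illustrate the idea; the real data lives in the CSV files.

```python
from collections import Counter
from itertools import combinations

# Hypothetical sample data: each recipe is a list of ingredient names.
recipes = [
    ["pork", "egg", "tofu"],
    ["chicken", "egg"],
    ["pork", "tofu"],
    ["beef", "egg", "pork"],
]

# The major ingredients listed in question 2.
MAJOR = {"pork", "chicken", "beef", "lamb", "fish", "shrimp", "egg", "tofu"}


def ingredient_counts(recipes):
    """Count how many recipes use each major ingredient."""
    counts = Counter()
    for ingredients in recipes:
        # set() makes sure a recipe counts an ingredient only once.
        counts.update(set(ingredients) & MAJOR)
    return counts


def pair_counts(recipes):
    """Count how many recipes contain each unordered pair of ingredients."""
    counts = Counter()
    for ingredients in recipes:
        # Sorting gives every pair a single canonical order.
        for pair in combinations(sorted(set(ingredients)), 2):
            counts[pair] += 1
    return counts


popularity = ingredient_counts(recipes)
pairs = pair_counts(recipes)
print(popularity.most_common(3))
print(pairs.most_common(3))
```

Counting pairs is the simplest form of a combination analysis. Larger combinations can be counted the same way by changing the 2 in combinations(), at the cost of many more candidates.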
Possible future work includes the following.
1. Statistical analysis of exclusiveness, picture steps and master cooks.
2. A more detailed classification of the ingredients.
3. If possible, scraping another similar website to see whether the conclusions about common ingredient combinations hold there as well.
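
As an illustration of what the statistical analysis in item 1 could look like, here is a sketch of a two-sided permutation test on the difference in mean rating between two groups of recipes. This is only one possible approach and has not been run on the scraped data. The ratings below are made up, and the function name and its parameters are hypothetical.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_permutations=5000, seed=0):
    """Two-sided permutation test on the difference of group means.

    Returns the observed difference and an estimated p-value.
    """
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        # Reassign group labels at random and recompute the difference.
        rng.shuffle(pooled)
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimated p-value above zero.
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value


# Made-up ratings, for illustration only.
exclusive_ratings = [8.1, 7.9, 8.4, 8.0, 8.3]
other_ratings = [7.2, 7.5, 7.0, 7.4, 7.1]

observed, p_value = permutation_test(exclusive_ratings, other_ratings)
print(f"difference in means: {observed:.2f}, p-value: {p_value:.4f}")
```

A permutation test is used here because it makes no assumption about how the ratings are distributed. The same function could be applied to picture steps or master cooks by splitting the recipes on that attribute instead.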