MongoDB is used to store data, which is modified in Python and analyzed with Pandas in Jupyter Notebook.

hfattor/nosql-challenge

nosql-challenge

The UK Food Standards Agency evaluates various establishments across the United Kingdom and gives them a food hygiene rating.

Database and Jupyter Notebook Set Up

The NoSQL_setup.ipynb Jupyter Notebook contains the command for importing the JSON data in the Resources folder into MongoDB; it is run from Command Prompt/Terminal. The data can also be uploaded through MongoDB Compass. The database is called uk_food and the collection is called establishments.
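As a rough illustration of the import step, the sketch below builds a mongoimport command line in Python. The database and collection names come from this README; the file name establishments.json and the --drop and --jsonArray flags are assumptions about how the data is stored and loaded, so adjust them to match the actual file.

```python
import subprocess


def build_mongoimport_args(json_path: str) -> list[str]:
    """Return the argument list for importing json_path into uk_food.establishments."""
    return [
        "mongoimport",
        "--type", "json",
        "-d", "uk_food",          # database name from the README
        "-c", "establishments",   # collection name from the README
        "--drop",                 # assumption: replace any existing collection
        "--jsonArray",            # assumption: the file holds one JSON array
        "--file", json_path,
    ]


def run_import(json_path: str) -> None:
    """Run mongoimport; requires the MongoDB Database Tools on the PATH."""
    subprocess.run(build_mongoimport_args(json_path), check=True)


# Hypothetical usage (the file name is an assumption):
# run_import("Resources/establishments.json")
```

Building the arguments as a list avoids shell quoting problems and makes the command easy to inspect before it is run.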

The notebook checks that the data was imported correctly and can be accessed from Jupyter Notebook. It then uses pymongo to add a new restaurant, Penang Flavours, to the database and updates its 'BusinessType' field with the code this dataset uses for 'Restaurant/Cafe/Canteen'. All documents for the Dover Local Authority are removed from the database, and the latitude and longitude values are converted from strings to doubles.
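The four edits described above might look roughly like the following with pymongo. This is a minimal sketch, not the notebook's actual code: the field names BusinessName, BusinessTypeID, LocalAuthorityName, geocode.latitude and geocode.longitude are assumptions about the dataset's schema, and the new document contains only the details mentioned in this README.

```python
from typing import Any

# Assumed field names; check them against a real document first.
LAT, LON = "geocode.latitude", "geocode.longitude"
RESTAURANT_TYPE = "Restaurant/Cafe/Canteen"


def apply_setup_changes(establishments: Any) -> None:
    """Apply the setup edits to a pymongo Collection.

    Obtain the collection with, for example:
    MongoClient(port=27017)["uk_food"]["establishments"]
    """
    # 1. Add the new restaurant (a real record would carry more fields).
    establishments.insert_one(
        {"BusinessName": "Penang Flavours", "BusinessType": RESTAURANT_TYPE}
    )

    # 2. Find the code an existing document uses for this business type
    #    and copy it onto the new restaurant.
    sample = establishments.find_one(
        {"BusinessType": RESTAURANT_TYPE, "BusinessTypeID": {"$exists": True}},
        {"BusinessTypeID": 1, "_id": 0},
    )
    if sample:
        establishments.update_one(
            {"BusinessName": "Penang Flavours"},
            {"$set": {"BusinessTypeID": sample["BusinessTypeID"]}},
        )

    # 3. Remove every document for the Dover Local Authority.
    establishments.delete_many({"LocalAuthorityName": "Dover"})

    # 4. Convert string coordinates to doubles using a pipeline-style
    #    update, which needs MongoDB 4.2 or later.
    for field in (LAT, LON):
        establishments.update_many(
            {field: {"$type": "string"}},
            [{"$set": {field: {"$toDouble": "$" + field}}}],
        )
```

Filtering on `$type: "string"` keeps the conversion from touching documents that have no coordinates or that were already converted.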

Exploratory Analysis

In the NoSQL_analysis.ipynb Jupyter Notebook, pymongo queries and aggregation pipelines are used to answer the following questions:

  1. Which establishments have a hygiene score equal to 20?
  2. Which establishments in London have a RatingValue greater than or equal to 4?
  3. Of the establishments with a RatingValue of '5' near the newly added restaurant, Penang Flavours, which are the top 5 when sorted by lowest hygiene score?
  4. How many establishments in each Local Authority area have a hygiene score of 0?
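The four questions above could be expressed as pymongo filters and an aggregation pipeline along these lines. This is a sketch under stated assumptions rather than the notebook's code: the field names scores.Hygiene, LocalAuthorityName and geocode.* are assumed, RatingValue is assumed to be stored as a number for the comparison in question 2, and "nearest" is approximated with a small latitude/longitude box whose half-width (delta) is a hypothetical parameter.

```python
# Question 1: hygiene score equal to 20.
hygiene_20 = {"scores.Hygiene": 20}

# Question 2: London establishments rated 4 or higher.
# Assumes RatingValue is numeric; string values would compare lexically.
london_4_plus = {
    "LocalAuthorityName": {"$regex": "London"},
    "RatingValue": {"$gte": 4},
}


def nearby_query(lat: float, lon: float, rating=5, delta: float = 0.01) -> dict:
    """Question 3: establishments with the given rating inside a
    (2 * delta)-degree box centred on (lat, lon).

    Pass rating in whatever type the collection stores (5 or '5').
    """
    return {
        "RatingValue": rating,
        "geocode.latitude": {"$gte": lat - delta, "$lte": lat + delta},
        "geocode.longitude": {"$gte": lon - delta, "$lte": lon + delta},
    }


# Question 4: count of hygiene-score-0 establishments per Local Authority,
# largest count first.
hygiene_0_by_area = [
    {"$match": {"scores.Hygiene": 0}},
    {"$group": {"_id": "$LocalAuthorityName", "count": {"$sum": 1}}},
    {"$sort": {"count": -1}},
]

# Hypothetical usage with a pymongo Collection named establishments:
# establishments.find(hygiene_20)
# establishments.find(nearby_query(lat, lon)).sort("scores.Hygiene", 1).limit(5)
# establishments.aggregate(hygiene_0_by_area)
```

Each result can be loaded into Pandas by passing the list of returned documents to `pd.DataFrame`.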

Data Source

UK Food Standards Agency (2022). UK food hygiene rating data API. https://ratings.food.gov.uk/open-data/en-GB. Contains public sector information licensed under the Open Government Licence v3.0. Accessed Sept 9, 2022 and Sept 12, 2022 with the establishment settings as follows: longitude=51.5072, latitude=-0.1276, maxdistancelimit=4567, pagesize=10000, sortoptionkey=distance, pagenumber=(1,2,3,4,5,6,7,8).
