# Eat Safe, Love

## Part 1: Database and Jupyter Notebook Set Up

Import the data provided in the `establishments.json` file from your Terminal. Name the database `uk_food` and the collection `establishments`.

Within this markdown cell, copy the line of text you used to import the data from your Terminal. This way, future analysts will be able to repeat your process.

Import the dataset with `$ mongoimport --type json -d uk_food -c establishments --drop --jsonArray establishments.json`
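For analysts who prefer to keep the import step inside the notebook, the same command can be assembled in Python. This is a minimal sketch: the helper name `build_mongoimport_command` is hypothetical, and actually running the command assumes `mongoimport` is installed and on the PATH.

```python
import shlex


def build_mongoimport_command(db_name, collection_name, json_path):
    """Return the mongoimport argument list used for this project (hypothetical helper)."""
    return [
        "mongoimport",
        "--type", "json",
        "-d", db_name,
        "-c", collection_name,
        "--drop",        # drop the collection first so re-imports do not duplicate data
        "--jsonArray",   # the file holds a single JSON array of documents
        json_path,
    ]


command = build_mongoimport_command("uk_food", "establishments", "establishments.json")

# Print a copy-pasteable shell line. To execute it from Python instead, pass the
# list to subprocess.run(command, check=True).
print(shlex.join(command))
```

Building the command as a list avoids shell-quoting problems if the file path ever contains spaces.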

In [1]:
# Import dependencies
from pymongo import MongoClient
from pprint import pprint

In [2]:
# Create an instance of MongoClient
mongo = MongoClient(port=27017)

In [3]:
# confirm that our new database was created
print(mongo.list_database_names())

['admin', 'class_db', 'config', 'epa', 'garden_db', 'local', 'met', 'petsitly_marketing', 'uk_food']


In [4]:
# assign the uk_food database to a variable name
db = mongo['uk_food']

In [5]:
# review the collections in our new database
print(db.list_collection_names())

['establishments']


In [6]:
# view a document in the establishments collection
one_establishment = db.establishments.find_one()

pprint(one_establishment)

{'AddressLine1': 'The Bay',
 'AddressLine2': 'St Margarets Bay',
 'AddressLine3': 'Kent',
 'AddressLine4': '',
 'BusinessName': 'Refreshment Kiosk',
 'BusinessType': 'Restaurant/Cafe/Canteen',
 'BusinessTypeID': 1,
 'ChangesByServerID': 0,
 'Distance': 4587.347174863443,
 'FHRSID': 254719,
 'LocalAuthorityBusinessID': 'PI/000069980',
 'LocalAuthorityCode': '182',
 'LocalAuthorityEmailAddress': 'publicprotection@dover.gov.uk',
 'LocalAuthorityName': 'Dover',
 'LocalAuthorityWebSite': 'http://www.dover.gov.uk/',
 'NewRatingPending': False,
 'Phone': '',
 'PostCode': 'CT15 6DY',
 'RatingDate': '2022-03-24T00:00:00',
 'RatingKey': 'fhrs_5_en-gb',
 'RatingValue': '5',
 'RightToReply': '',
 'SchemeType': 'FHRS',
 '_id': ObjectId('64e1abcb73c9924e75b8eb74'),
 'geocode': {'latitude': '51.152225', 'longitude': '1.387974'},
 'links': [{'href': 'https://api.ratings.food.gov.uk/establishments/254719',
            'rel': 'self'}],
 'meta': {'dataSource': None,
          'extractDate': '0001-01-01T0

In [7]:
# assign the collection to a variable
establishments = db['establishments']

## Part 2: Update the Database

1. An exciting new halal restaurant just opened in Greenwich, but hasn't been rated yet. The magazine has asked you to include it in your analysis. Add the following restaurant "Penang Flavours" to the database.

In [8]:
# Create a dictionary for the new restaurant data
new_restaurant = {
    "BusinessName":"Penang Flavours",
    "BusinessType":"Restaurant/Cafe/Canteen",
    "BusinessTypeID":"",
    "AddressLine1":"Penang Flavours",
    "AddressLine2":"146A Plumstead Rd",
    "AddressLine3":"London",
    "AddressLine4":"",
    "PostCode":"SE18 7DY",
    "Phone":"",
    "LocalAuthorityCode":"511",
    "LocalAuthorityName":"Greenwich",
    "LocalAuthorityWebSite":"http://www.royalgreenwich.gov.uk",
    "LocalAuthorityEmailAddress":"health@royalgreenwich.gov.uk",
    "scores":{
        "Hygiene":"",
        "Structural":"",
        "ConfidenceInManagement":""
    },
    "SchemeType":"FHRS",
    "geocode":{
        "longitude":"0.08384000",
        "latitude":"51.49014200"
    },
    "RightToReply":"",
    "Distance":4623.9723280747176,
    "NewRatingPending":True  
}

In [9]:
# Insert the new restaurant into the collection
insert_result = establishments.insert_one(new_restaurant)

In [10]:
# Check that the new restaurant was inserted
query = {'BusinessName': 'Penang Flavours'}
results = establishments.find(query)
for result in results:
    print(result)

{'_id': ObjectId('64e1ac112960880a426b75f0'), 'BusinessName': 'Penang Flavours', 'BusinessType': 'Restaurant/Cafe/Canteen', 'BusinessTypeID': '', 'AddressLine1': 'Penang Flavours', 'AddressLine2': '146A Plumstead Rd', 'AddressLine3': 'London', 'AddressLine4': '', 'PostCode': 'SE18 7DY', 'Phone': '', 'LocalAuthorityCode': '511', 'LocalAuthorityName': 'Greenwich', 'LocalAuthorityWebSite': 'http://www.royalgreenwich.gov.uk', 'LocalAuthorityEmailAddress': 'health@royalgreenwich.gov.uk', 'scores': {'Hygiene': '', 'Structural': '', 'ConfidenceInManagement': ''}, 'SchemeType': 'FHRS', 'geocode': {'longitude': '0.08384000', 'latitude': '51.49014200'}, 'RightToReply': '', 'Distance': 4623.972328074718, 'NewRatingPending': True}
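Re-running the insert cell adds a second copy of the restaurant each time. One way to make the step repeatable is an upsert with `$setOnInsert`, sketched here. The helper name `insert_if_absent` is hypothetical, and treating `BusinessName` as the uniqueness key is an assumption made for illustration.

```python
def insert_if_absent(collection, document, key="BusinessName"):
    """Insert `document` only when no existing document shares its `key` value.

    Returns True if a new document was created, False if one already existed.
    """
    # The key field comes from the filter on insert, so it is left out of $setOnInsert.
    other_fields = {name: value for name, value in document.items() if name != key}
    result = collection.update_one(
        {key: document[key]},
        {"$setOnInsert": other_fields},  # applied only when the upsert creates a document
        upsert=True,
    )
    return result.upserted_id is not None


# Usage in the notebook (assumes `establishments` and `new_restaurant` from the cells above):
# created = insert_if_absent(establishments, new_restaurant)
```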


2. Find the BusinessTypeID for "Restaurant/Cafe/Canteen" and return only the BusinessTypeID and BusinessType fields.

In [11]:
# Query for every document whose BusinessType is "Restaurant/Cafe/Canteen"
# and project only the BusinessTypeID and BusinessType fields (_id is returned by default)
query = {'BusinessType': "Restaurant/Cafe/Canteen"}
fields = {'BusinessType': 1, 'BusinessTypeID': 1}

# Retrieve the results using find()
results = establishments.find(query, fields)

# Loop through the cursor and print each matching document
for result in results:
    pprint(result)

{'BusinessType': 'Restaurant/Cafe/Canteen',
 'BusinessTypeID': 1,
 '_id': ObjectId('64e1abcb73c9924e75b8eb74')}
{'BusinessType': 'Restaurant/Cafe/Canteen',
 'BusinessTypeID': 1,
 '_id': ObjectId('64e1abcb73c9924e75b8eb77')}
{'BusinessType': 'Restaurant/Cafe/Canteen',
 'BusinessTypeID': 1,
 '_id': ObjectId('64e1abcb73c9924e75b8eb78')}
{'BusinessType': 'Restaurant/Cafe/Canteen',
 'BusinessTypeID': 1,
 '_id': ObjectId('64e1abcb73c9924e75b8eb79')}
{'BusinessType': 'Restaurant/Cafe/Canteen',
 'BusinessTypeID': 1,
 '_id': ObjectId('64e1abcb73c9924e75b8eb7a')}
{'BusinessType': 'Restaurant/Cafe/Canteen',
 'BusinessTypeID': 1,
 '_id': ObjectId('64e1abcb73c9924e75b8eb84')}
{'BusinessType': 'Restaurant/Cafe/Canteen',
 'BusinessTypeID': 1,
 '_id': ObjectId('64e1abcb73c9924e75b8eb89')}
{'BusinessType': 'Restaurant/Cafe/Canteen',
 'BusinessTypeID': 1,
 '_id': ObjectId('64e1abcb73c9924e75b8eb8b')}
{'BusinessType': 'Restaurant/Cafe/Canteen',
 'BusinessTypeID': 1,
 '_id': ObjectId('64e1abcb73c9924e75b8

 '_id': ObjectId('64e1abcc73c9924e75b95878')}
{'BusinessType': 'Restaurant/Cafe/Canteen',
 'BusinessTypeID': 1,
 '_id': ObjectId('64e1abcc73c9924e75b9587c')}
{'BusinessType': 'Restaurant/Cafe/Canteen',
 'BusinessTypeID': 1,
 '_id': ObjectId('64e1abcc73c9924e75b9587d')}
{'BusinessType': 'Restaurant/Cafe/Canteen',
 'BusinessTypeID': 1,
 '_id': ObjectId('64e1abcc73c9924e75b9587f')}
{'BusinessType': 'Restaurant/Cafe/Canteen',
 'BusinessTypeID': 1,
 '_id': ObjectId('64e1abcc73c9924e75b95881')}
{'BusinessType': 'Restaurant/Cafe/Canteen',
 'BusinessTypeID': 1,
 '_id': ObjectId('64e1abcc73c9924e75b95882')}
{'BusinessType': 'Restaurant/Cafe/Canteen',
 'BusinessTypeID': 1,
 '_id': ObjectId('64e1abcc73c9924e75b9588b')}
{'BusinessType': 'Restaurant/Cafe/Canteen',
 'BusinessTypeID': 1,
 '_id': ObjectId('64e1abcc73c9924e75b9588c')}
{'BusinessType': 'Restaurant/Cafe/Canteen',
 'BusinessTypeID': 1,
 '_id': ObjectId('64e1abcc73c9924e75b9588d')}
{'BusinessType': 'Restaurant/Cafe/Canteen',
 'BusinessType
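Printing every matching document produces a very long listing when only the ID value is needed. A shorter alternative is `distinct`, which returns the unique values of one field. This is a sketch; the helper name `distinct_business_type_ids` is hypothetical.

```python
def distinct_business_type_ids(collection, business_type):
    """Return the unique BusinessTypeID values stored for one BusinessType."""
    values = collection.distinct("BusinessTypeID", {"BusinessType": business_type})
    # Sort by string form so mixed types (e.g. 1 and "") can be ordered together.
    return sorted(values, key=str)


# Usage in the notebook (assumes `establishments` from the cells above):
# print(distinct_business_type_ids(establishments, "Restaurant/Cafe/Canteen"))
```

Any value other than the expected ID in the result (such as the empty string on the newly inserted restaurant) points to a document that still needs updating.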

3. Update the new restaurant with the `BusinessTypeID` you found.

In [12]:
# Update the new restaurant with the correct BusinessTypeID
# Define the filter (query) to identify the document you want to update
new_restaurant_update = {'BusinessName': 'Penang Flavours'}

# Define the update using the $set operator
# The ID is stored as an integer to match the existing documents
update_operation = {'$set': {'BusinessTypeID': 1}}

# Perform the update on a single document
update_result = establishments.update_one(new_restaurant_update, update_operation)

In [13]:
# Confirm that the new restaurant was updated
query = {'BusinessName': 'Penang Flavours'}
fields = {'BusinessName': 1, 'BusinessType': 1, 'BusinessTypeID': 1}
results = establishments.find(query, fields)

for result in results:
    print(result)

{'_id': ObjectId('64e1ac112960880a426b75f0'), 'BusinessName': 'Penang Flavours', 'BusinessType': 'Restaurant/Cafe/Canteen', 'BusinessTypeID': 1}


4. The magazine is not interested in any establishments in Dover, so check how many documents contain the Dover Local Authority. Then, remove any establishments within the Dover Local Authority from the database, and check the number of documents to ensure they were deleted.

In [14]:
# Find how many documents have LocalAuthorityName as "Dover"
dover_local_authority_count = establishments.count_documents({'LocalAuthorityName': 'Dover'})
print(f"Number of establishments with Dover Local Authority Name: {dover_local_authority_count}")

Number of establishments with Dover Local Authority Name: 994


In [15]:
# Delete all documents where LocalAuthorityName is "Dover"
delete_result = establishments.delete_many({'LocalAuthorityName': 'Dover'})

In [16]:
# Check if any remaining documents include Dover
remaining_count = establishments.count_documents({'LocalAuthorityName': 'Dover'})
print(f"Number of establishments remaining with Dover Local Authority Name: {remaining_count}")

Number of establishments remaining with Dover Local Authority Name: 0
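The count, delete, and re-count steps can be wrapped into one function that raises an error if the numbers do not agree. This is a sketch under that assumption; the helper name `delete_and_verify` is hypothetical, while `deleted_count` is the attribute PyMongo exposes on the result of `delete_many`.

```python
def delete_and_verify(collection, query):
    """Delete every document matching `query` and confirm none remain.

    Returns the number of deleted documents.
    """
    expected = collection.count_documents(query)
    result = collection.delete_many(query)
    remaining = collection.count_documents(query)
    if result.deleted_count != expected or remaining != 0:
        raise RuntimeError(
            f"Expected to delete {expected}, deleted {result.deleted_count}, "
            f"{remaining} still match"
        )
    return result.deleted_count


# Usage in the notebook (assumes `establishments` from the cells above):
# deleted = delete_and_verify(establishments, {'LocalAuthorityName': 'Dover'})
```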


In [17]:
# Check that other documents remain with 'find_one'
one_establishment = db.establishments.find_one()

pprint(one_establishment)

{'AddressLine1': 'East Cliff Pavilion',
 'AddressLine2': 'Wear Bay Road',
 'AddressLine3': 'Folkestone',
 'AddressLine4': 'Kent',
 'BusinessName': 'The Pavilion',
 'BusinessType': 'Restaurant/Cafe/Canteen',
 'BusinessTypeID': 1,
 'ChangesByServerID': 0,
 'Distance': 4591.765489457773,
 'FHRSID': 1043695,
 'LocalAuthorityBusinessID': 'PI/000073616',
 'LocalAuthorityCode': '188',
 'LocalAuthorityEmailAddress': 'foodteam@folkestone-hythe.gov.uk',
 'LocalAuthorityName': 'Folkestone and Hythe',
 'LocalAuthorityWebSite': 'http://www.folkestone-hythe.gov.uk',
 'NewRatingPending': False,
 'Phone': '',
 'PostCode': 'CT19 6BL',
 'RatingDate': '2018-04-04T00:00:00',
 'RatingKey': 'fhrs_5_en-gb',
 'RatingValue': '5',
 'RightToReply': '',
 'SchemeType': 'FHRS',
 '_id': ObjectId('64e1abcb73c9924e75b8ee4e'),
 'geocode': {'latitude': '51.083812', 'longitude': '1.195625'},
 'links': [{'href': 'https://api.ratings.food.gov.uk/establishments/1043695',
            'rel': 'self'}],
 'meta': {'dataSource': 

5. Some of the number values are stored as strings, when they should be stored as numbers. Use `update_many` to convert `latitude` and `longitude` to decimal numbers.

In [18]:
# Change the data type from string to double (floating point) for longitude
update_lon = establishments.update_many({}, [{'$set': {'geocode.longitude': {'$toDouble': '$geocode.longitude'}}}])

In [19]:
# Change the data type from string to double (floating point) for latitude
update_lat = establishments.update_many({}, [{'$set': {'geocode.latitude': {'$toDouble': '$geocode.latitude'}}}])
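Both coordinates can also be converted in a single `update_many` call. The sketch below uses `$convert` instead of `$toDouble` so that an unparseable value becomes null rather than aborting the whole update. The helper name `geocode_to_double_pipeline` is hypothetical, and falling back to null is an assumption about what the analysis wants.

```python
def geocode_to_double_pipeline():
    """Build an update pipeline that converts both geocode fields to doubles."""

    def to_double(field_path):
        # onError/onNull return None (stored as null) instead of raising an error.
        return {
            "$convert": {
                "input": field_path,
                "to": "double",
                "onError": None,
                "onNull": None,
            }
        }

    return [
        {
            "$set": {
                "geocode.longitude": to_double("$geocode.longitude"),
                "geocode.latitude": to_double("$geocode.latitude"),
            }
        }
    ]


# Usage in the notebook (assumes `establishments` from the cells above):
# establishments.update_many({}, geocode_to_double_pipeline())
```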

In [20]:
# Check that the coordinates are now numbers
# Initialize counters
float_count = 0
non_float_count = 0

# Iterate through documents and count "Longitude" and "Latitude" types
for document in establishments.find({}):
    if 'geocode' in document:
        geocode = document['geocode']
        if 'longitude' in geocode and 'latitude' in geocode:
            longitude_value = geocode['longitude']
            latitude_value = geocode['latitude']
            if isinstance(longitude_value, (int, float)) and isinstance(latitude_value, (int, float)):
                float_count += 1
            else:
                non_float_count += 1
        else:
            non_float_count += 1
    else:
        non_float_count += 1

# Print the results
print(f"Count of documents with both Longitude and Latitude as float: {float_count}")
print(f"Count of documents where Longitude or Latitude are not float: {non_float_count}")

Count of documents with both Longitude and Latitude as float: 38786
Count of documents where Longitude or Latitude are not float: 0
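The same check can be pushed to the server with the `$type` query operator, which avoids pulling every document into Python. This is a sketch; the helper name `not_double_filter` is hypothetical. Note that it is stricter than the loop above: it accepts only doubles, whereas the loop also accepts integers.

```python
def not_double_filter(fields=("geocode.longitude", "geocode.latitude")):
    """Build a filter matching documents where any listed field is not a double.

    Because of $not, documents that lack the field altogether also match.
    """
    return {"$or": [{field: {"$not": {"$type": "double"}}} for field in fields]}


# Usage in the notebook (assumes `establishments` from the cells above);
# a result of 0 means every document has both coordinates stored as doubles:
# print(establishments.count_documents(not_double_filter()))
```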


In [21]:
# Set RatingValue entries that are not a 1-5 rating to null,
# so that the integer conversion in the next cell does not fail on them
non_ratings = ["AwaitingInspection", "Awaiting Inspection", "AwaitingPublication", "Pass", "Exempt"]
establishments.update_many({"RatingValue": {"$in": non_ratings}}, [{'$set': {"RatingValue": None}}])


<pymongo.results.UpdateResult at 0x268107aacc0>

In [22]:
# Change the data type from string to integer for RatingValue
establishments.update_many({}, [{'$set': {"RatingValue": {'$toInt': "$RatingValue"}}}])


<pymongo.results.UpdateResult at 0x2681050dbc0>

In [23]:
# Check that the rating values are now numbers
# Iterate through documents and print the type of "RatingValue" field
for document in establishments.find({}):
    if 'RatingValue' in document:
        rating_value = document['RatingValue']
        rating_type = type(rating_value).__name__
        print(f"Document with _id {document['_id']} has RatingValue type: {rating_type}")
    else:
        print(f"Document with _id {document['_id']} does not have 'RatingValue' field.")

Document with _id 64e1abcb73c9924e75b8ee4e has RatingValue type: int
Document with _id 64e1abcb73c9924e75b8ee54 has RatingValue type: int
Document with _id 64e1abcb73c9924e75b8ee55 has RatingValue type: int
Document with _id 64e1abcb73c9924e75b8ee56 has RatingValue type: int
Document with _id 64e1abcb73c9924e75b8ee57 has RatingValue type: int
Document with _id 64e1abcb73c9924e75b8ee58 has RatingValue type: int
Document with _id 64e1abcb73c9924e75b8ee59 has RatingValue type: int
Document with _id 64e1abcb73c9924e75b8ee5a has RatingValue type: int
Document with _id 64e1abcb73c9924e75b8ee5b has RatingValue type: int
Document with _id 64e1abcb73c9924e75b8ee5c has RatingValue type: int
Document with _id 64e1abcb73c9924e75b8ee5d has RatingValue type: int
Document with _id 64e1abcb73c9924e75b8ee5e has RatingValue type: int
Document with _id 64e1abcb73c9924e75b8ee5f has RatingValue type: int
Document with _id 64e1abcb73c9924e75b8ee60 has RatingValue type: int
Document with _id 64e1abcb73c9924e

Document with _id 64e1abcb73c9924e75b90c38 has RatingValue type: int
Document with _id 64e1abcb73c9924e75b90c39 has RatingValue type: int
Document with _id 64e1abcb73c9924e75b90c3a has RatingValue type: int
Document with _id 64e1abcb73c9924e75b90c3b has RatingValue type: int
Document with _id 64e1abcb73c9924e75b90c3c has RatingValue type: int
Document with _id 64e1abcb73c9924e75b90c3d has RatingValue type: int
Document with _id 64e1abcb73c9924e75b90c3e has RatingValue type: int
Document with _id 64e1abcb73c9924e75b90c3f has RatingValue type: NoneType
Document with _id 64e1abcb73c9924e75b90c40 has RatingValue type: int
Document with _id 64e1abcb73c9924e75b90c41 has RatingValue type: int
Document with _id 64e1abcb73c9924e75b90c42 has RatingValue type: int
Document with _id 64e1abcb73c9924e75b90c43 has RatingValue type: int
Document with _id 64e1abcb73c9924e75b90c44 has RatingValue type: int
Document with _id 64e1abcb73c9924e75b90c45 has RatingValue type: int
Document with _id 64e1abcb73c

Document with _id 64e1abcb73c9924e75b92dbf has RatingValue type: int
Document with _id 64e1abcb73c9924e75b92dc0 has RatingValue type: NoneType
Document with _id 64e1abcb73c9924e75b92dc1 has RatingValue type: int
Document with _id 64e1abcb73c9924e75b92dc2 has RatingValue type: int
Document with _id 64e1abcb73c9924e75b92dc3 has RatingValue type: int
Document with _id 64e1abcb73c9924e75b92dc4 has RatingValue type: int
Document with _id 64e1abcb73c9924e75b92dc5 has RatingValue type: NoneType
Document with _id 64e1abcb73c9924e75b92dc6 has RatingValue type: int
Document with _id 64e1abcb73c9924e75b92dc7 has RatingValue type: int
Document with _id 64e1abcb73c9924e75b92dc8 has RatingValue type: int
Document with _id 64e1abcb73c9924e75b92dc9 has RatingValue type: int
Document with _id 64e1abcb73c9924e75b92dca has RatingValue type: int
Document with _id 64e1abcb73c9924e75b92dcb has RatingValue type: int
Document with _id 64e1abcb73c9924e75b92dcc has RatingValue type: int
Document with _id 64e1ab
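Printing one line per document makes the result hard to read at this collection size. A tally of the Python types is usually enough to confirm the conversion. This is a minimal sketch; the helper name `count_value_types` is hypothetical, and it counts a missing field the same as a null value.

```python
from collections import Counter


def count_value_types(documents, field):
    """Tally the Python type names of `field` across an iterable of documents.

    A missing field is counted as NoneType, the same as an explicit null.
    """
    return Counter(type(doc.get(field)).__name__ for doc in documents)


# Usage in the notebook (assumes `establishments` from the cells above);
# the projection fetches only the field being checked:
# print(count_value_types(establishments.find({}, {"RatingValue": 1}), "RatingValue"))
```

After a successful conversion the tally should contain only `int` and `NoneType` keys.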
