# Eat Safe, Love

## Part 1: Database and Jupyter Notebook Set Up

Import the data provided in the `establishments.json` file from your Terminal. Name the database `uk_food` and the collection `establishments`.

Within this markdown cell, copy the line of text you used to import the data from your Terminal. This way, future analysts will be able to repeat your process.

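The data was imported from within this notebook rather than from the Terminal: the cells below read `Resources/establishments.json` with `json.load` and insert the documents into the `establishments` collection of `uk_food` with `insert_many`.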
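
Because the import in this notebook runs from Python, it may help to validate the file before inserting it. The following is a minimal sketch under that assumption; `load_establishments` is a hypothetical helper name, not part of the assignment.

```python
import json
from pathlib import Path


def load_establishments(file_path):
    """Load a JSON array of establishment documents from disk.

    Hypothetical helper: raises ValueError unless the file holds a
    non-empty list of objects, which is the shape insert_many expects.
    """
    with Path(file_path).open(encoding="utf-8") as f:
        data = json.load(f)
    if not isinstance(data, list) or not data:
        raise ValueError("expected a non-empty JSON array of documents")
    if not all(isinstance(doc, dict) for doc in data):
        raise ValueError("every array element must be a JSON object")
    return data
```

With this in place, `data = load_establishments('Resources/establishments.json')` could stand in for the plain `json.load` call.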

In [1]:
# Import dependencies
from pymongo import MongoClient
from pprint import pprint
import json

In [2]:
# Create an instance of MongoClient
mongo = MongoClient(port=27017)

In [3]:
# load the establishments data from the JSON file
file_path = 'Resources/establishments.json'
with open(file_path) as f:
    data = json.load(f)

In [4]:
# assign the uk_food database to a variable name
db = mongo['uk_food']

In [5]:
# insert the data into the 'establishments' collection
db['establishments'].insert_many(data)
print("Data imported successfully.")

# confirm that our new database was created by listing the databases
databases = mongo.list_database_names()
print("All databases:", databases)

# Confirm that uk_food is listed
if 'uk_food' in databases:
    print("Database 'uk_food' exists.")
else:
    print("Database 'uk_food' does not exist.")

Data imported successfully.
All databases: ['admin', 'classDB', 'config', 'local', 'travel_db', 'uk_food']
Database 'uk_food' exists.


In [6]:
# review the collections in our new database
collections = db.list_collection_names()
print("Collections in 'uk_food' database:", collections)

Collections in 'uk_food' database: ['establishments']


In [7]:
# review a document in the establishments collection
document = db['establishments'].find_one()

print("One document from the 'establishments' collection:")
print(document)


One document from the 'establishments' collection:
{'_id': ObjectId('664e32ae1fe90211939f752f'), 'FHRSID': 254719, 'ChangesByServerID': 0, 'LocalAuthorityBusinessID': 'PI/000069980', 'BusinessName': 'Refreshment Kiosk', 'BusinessType': 'Restaurant/Cafe/Canteen', 'BusinessTypeID': 1, 'AddressLine1': 'The Bay', 'AddressLine2': 'St Margarets Bay', 'AddressLine3': 'Kent', 'AddressLine4': '', 'PostCode': 'CT15 6DY', 'Phone': '', 'RatingValue': '5', 'RatingKey': 'fhrs_5_en-gb', 'RatingDate': '2022-03-24T00:00:00', 'LocalAuthorityCode': '182', 'LocalAuthorityName': 'Dover', 'LocalAuthorityWebSite': 'http://www.dover.gov.uk/', 'LocalAuthorityEmailAddress': 'publicprotection@dover.gov.uk', 'scores': {'Hygiene': 0, 'Structural': 5, 'ConfidenceInManagement': 5}, 'SchemeType': 'FHRS', 'geocode': {'longitude': '1.387974', 'latitude': '51.152225'}, 'RightToReply': '', 'Distance': 4587.347174863443, 'NewRatingPending': False, 'meta': {'dataSource': None, 'extractDate': '0001-01-01T00:00:00', 'itemCou

In [8]:
# assign the collection to a variable
establishments = db['establishments']

## Part 2: Update the Database

1. An exciting new halal restaurant just opened in Greenwich, but hasn't been rated yet. The magazine has asked you to include it in your analysis. Add the following restaurant "Penang Flavours" to the database.

In [9]:
# Create a dictionary for the new restaurant data
new_restaurant = {
    "BusinessName": "Penang Flavours",
    "BusinessType": "Restaurant/Cafe/Canteen",
    "BusinessTypeID": "",
    "AddressLine1": "Penang Flavours",
    "AddressLine2": "146A Plumstead Rd",
    "AddressLine3": "London",
    "AddressLine4": "",
    "PostCode": "SE18 7DY",
    "Phone": "",
    "LocalAuthorityCode": "511",
    "LocalAuthorityName": "Greenwich",
    "LocalAuthorityWebSite": "http://www.royalgreenwich.gov.uk",
    "LocalAuthorityEmailAddress": "health@royalgreenwich.gov.uk",
    "scores": {
        "Hygiene": "",
        "Structural": "",
        "ConfidenceInManagement": ""
    },
    "SchemeType": "FHRS",
    "geocode": {
        "longitude": "0.08384000",
        "latitude": "51.49014200"
    },
    "RightToReply": "",
    "Distance": 4623.9723280747176,
    "NewRatingPending": True
}

In [10]:
# Insert the new restaurant into the collection
result = establishments.insert_one(new_restaurant)

In [11]:
# Check that the new restaurant was inserted
print("New restaurant inserted with ID:", result.inserted_id)

# Optionally, retrieve and print the inserted document to verify
inserted_document = establishments.find_one({"_id": result.inserted_id})
print("Inserted document:")
pprint(inserted_document)

New restaurant inserted with ID: 664e32e11fe9021193a01092
Inserted document:
{'AddressLine1': 'Penang Flavours',
 'AddressLine2': '146A Plumstead Rd',
 'AddressLine3': 'London',
 'AddressLine4': '',
 'BusinessName': 'Penang Flavours',
 'BusinessType': 'Restaurant/Cafe/Canteen',
 'BusinessTypeID': '',
 'Distance': 4623.972328074718,
 'LocalAuthorityCode': '511',
 'LocalAuthorityEmailAddress': 'health@royalgreenwich.gov.uk',
 'LocalAuthorityName': 'Greenwich',
 'LocalAuthorityWebSite': 'http://www.royalgreenwich.gov.uk',
 'NewRatingPending': True,
 'Phone': '',
 'PostCode': 'SE18 7DY',
 'RightToReply': '',
 'SchemeType': 'FHRS',
 '_id': ObjectId('664e32e11fe9021193a01092'),
 'geocode': {'latitude': '51.49014200', 'longitude': '0.08384000'},
 'scores': {'ConfidenceInManagement': '', 'Hygiene': '', 'Structural': ''}}
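
`insert_one` does not check whether the restaurant is already present, so running the insert twice would store it twice. One possible guard is sketched below. The helper name `insert_if_missing` and the choice of `BusinessName` plus `PostCode` as the identifying fields are assumptions made for illustration.

```python
def insert_if_missing(collection, document, key_fields=("BusinessName", "PostCode")):
    """Insert `document` unless one with the same key fields already exists.

    `collection` is assumed to be a PyMongo collection (any object with
    find_one and insert_one works). Returns the _id of the existing or
    newly inserted document.
    """
    query = {field: document[field] for field in key_fields}
    existing = collection.find_one(query, {"_id": 1})
    if existing is not None:
        return existing["_id"]
    return collection.insert_one(document).inserted_id
```

Calling `insert_if_missing(establishments, new_restaurant)` a second time would return the original `_id` instead of adding a duplicate.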


2. Find the BusinessTypeID for "Restaurant/Cafe/Canteen" and return only the `BusinessTypeID` and `BusinessType` fields.

In [12]:
# Find the BusinessTypeID for "Restaurant/Cafe/Canteen" and return only the BusinessTypeID and BusinessType fields
# Connect to the 'establishments' collection
collection = db.establishments

# Query to find the BusinessTypeID for "Restaurant/Cafe/Canteen"
result = collection.find_one(
    { "BusinessType": "Restaurant/Cafe/Canteen" },
    { "_id": 0, "BusinessType": 1, "BusinessTypeID": 1 }
)

# Print the result
if result:
    print("BusinessTypeID and BusinessType for 'Restaurant/Cafe/Canteen':")
    pprint(result)
else:
    print("No results found for BusinessType 'Restaurant/Cafe/Canteen'")

BusinessTypeID and BusinessType for 'Restaurant/Cafe/Canteen':
{'BusinessType': 'Restaurant/Cafe/Canteen', 'BusinessTypeID': 1}
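
`find_one` returns only the first matching document, so it cannot show whether every "Restaurant/Cafe/Canteen" record uses the same ID. A possible cross-check with PyMongo's `distinct` is sketched below. `business_type_ids` is a hypothetical helper, and it drops non-integer values on the assumption that placeholders such as the empty string on the new restaurant should be ignored.

```python
def business_type_ids(collection, business_type):
    """Return the sorted distinct integer BusinessTypeID values for a BusinessType.

    `collection` is assumed to be a PyMongo collection.
    """
    values = collection.distinct("BusinessTypeID", {"BusinessType": business_type})
    # Skip placeholders such as "" that are not real IDs
    return sorted(v for v in values if isinstance(v, int))
```

If `business_type_ids(establishments, "Restaurant/Cafe/Canteen")` returns a single-element list, that element is the ID to use in the next step.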


3. Update the new restaurant with the `BusinessTypeID` you found.

In [13]:
# Update the new restaurant with the correct BusinessTypeID
# Find the most recently inserted document with BusinessName 'Penang Flavours'
document = establishments.find_one({"BusinessName": "Penang Flavours"}, {"_id": 1}, sort=[("_id", -1)])

# Update only if the document was found, so a missing restaurant cannot raise a NameError
if document:
    penang_flavours_id = document["_id"]
    print(f"Latest ObjectID for 'Penang Flavours': {penang_flavours_id}")

    # Store BusinessTypeID as an integer to match the existing documents
    result = establishments.update_one(
        {"_id": penang_flavours_id},
        {"$set": {"BusinessTypeID": 1}}
    )
else:
    print("Document with BusinessName 'Penang Flavours' not found")

Latest ObjectID for 'Penang Flavours': 664e32e11fe9021193a01092


In [14]:
# Confirm that the new restaurant was updated

# Find and display the updated document for the new restaurant
updated_restaurant = establishments.find_one({"BusinessName": "Penang Flavours"})

if updated_restaurant:
    print("New restaurant details after update:")
    for key, value in updated_restaurant.items():
        print(f"{key}: {value}")
else:
    print("New restaurant not found. Check if the update was successful.")


New restaurant details after update:
_id: 664e32e11fe9021193a01092
BusinessName: Penang Flavours
BusinessType: Restaurant/Cafe/Canteen
BusinessTypeID: 1
AddressLine1: Penang Flavours
AddressLine2: 146A Plumstead Rd
AddressLine3: London
AddressLine4: 
PostCode: SE18 7DY
Phone: 
LocalAuthorityCode: 511
LocalAuthorityName: Greenwich
LocalAuthorityWebSite: http://www.royalgreenwich.gov.uk
LocalAuthorityEmailAddress: health@royalgreenwich.gov.uk
scores: {'Hygiene': '', 'Structural': '', 'ConfidenceInManagement': ''}
SchemeType: FHRS
geocode: {'longitude': '0.08384000', 'latitude': '51.49014200'}
RightToReply: 
Distance: 4623.972328074718
NewRatingPending: True
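
The printed document shows `BusinessTypeID: 1`, but the printout cannot distinguish the integer `1` from the string `"1"`, and the query in step 2 returned an integer. The sketch below is one way to enforce the type and confirm that the update matched a document. `set_business_type_id` is a hypothetical helper; `matched_count` and `modified_count` are attributes of PyMongo's `UpdateResult`.

```python
def set_business_type_id(collection, document_id, business_type_id):
    """Set BusinessTypeID on one document and confirm that it was matched.

    `collection` is assumed to be a PyMongo collection. Returns the
    number of documents actually modified (0 if the value was already set).
    """
    if not isinstance(business_type_id, int):
        raise TypeError("BusinessTypeID should be an integer, e.g. 1 rather than '1'")
    result = collection.update_one(
        {"_id": document_id},
        {"$set": {"BusinessTypeID": business_type_id}},
    )
    if result.matched_count != 1:
        raise LookupError(f"no document with _id {document_id!r}")
    return result.modified_count
```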


4. The magazine is not interested in any establishments in Dover, so check how many documents contain the Dover Local Authority. Then, remove any establishments within the Dover Local Authority from the database, and check the number of documents to ensure they were deleted.

In [15]:
# Find how many documents have LocalAuthorityName as "Dover"
dover_documents_count = collection.count_documents({"LocalAuthorityName": "Dover"})
print(f"Number of documents in Dover Local Authority: {dover_documents_count}")

Number of documents in Dover Local Authority: 994


In [16]:
# Delete all documents where LocalAuthorityName is "Dover"
delete_result = collection.delete_many({"LocalAuthorityName": "Dover"})
print(f"Number of documents deleted: {delete_result.deleted_count}")

Number of documents deleted: 994


In [17]:
# Check if any remaining documents include Dover
new_dover_documents_count = collection.count_documents({"LocalAuthorityName": "Dover"})
print(f"Number of documents in Dover Local Authority after deletion: {new_dover_documents_count}")

Number of documents in Dover Local Authority after deletion: 0
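
The count, delete, and re-count steps above can also be combined into one call that fails loudly if the numbers disagree. This is only a sketch: `delete_and_verify` is a hypothetical helper, and `collection` is assumed to be a PyMongo collection.

```python
def delete_and_verify(collection, query):
    """Delete every document matching `query` and report the counts.

    Raises RuntimeError if the number deleted differs from the number
    counted beforehand, or if any matching documents remain.
    """
    before = collection.count_documents(query)
    deleted = collection.delete_many(query).deleted_count
    after = collection.count_documents(query)
    if deleted != before or after != 0:
        raise RuntimeError(
            f"expected to delete {before}, deleted {deleted}, {after} remain"
        )
    return {"before": before, "deleted": deleted, "after": after}
```

For the Dover case the call would be `delete_and_verify(establishments, {"LocalAuthorityName": "Dover"})`.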


In [18]:
# Check that other documents remain with 'find_one'
remaining_document = collection.find_one()

if remaining_document:
    print("At least one document remains in the collection after deletion:")
    for key, value in remaining_document.items():
        print(f"{key}: {value}")
else:
    print("No documents remain in the collection after deletion.")

At least one document remains in the collection after deletion:
_id: 664e32ae1fe90211939f7816
FHRSID: 1043695
ChangesByServerID: 0
LocalAuthorityBusinessID: PI/000073616
BusinessName: The Pavilion
BusinessType: Restaurant/Cafe/Canteen
BusinessTypeID: 1
AddressLine1: East Cliff Pavilion
AddressLine2: Wear Bay Road
AddressLine3: Folkestone
AddressLine4: Kent
PostCode: CT19 6BL
Phone: 
RatingValue: 5
RatingKey: fhrs_5_en-gb
RatingDate: 2018-04-04T00:00:00
LocalAuthorityCode: 188
LocalAuthorityName: Folkestone and Hythe
LocalAuthorityWebSite: http://www.folkestone-hythe.gov.uk
LocalAuthorityEmailAddress: foodteam@folkestone-hythe.gov.uk
scores: {'Hygiene': 5, 'Structural': 5, 'ConfidenceInManagement': 5}
SchemeType: FHRS
geocode: {'longitude': '1.195625', 'latitude': '51.083812'}
RightToReply: 
Distance: 4591.765489457773
NewRatingPending: False
meta: {'dataSource': None, 'extractDate': '0001-01-01T00:00:00', 'itemCount': 0, 'returncode': None, 'totalCount': 0, 'totalPages': 0, 'pageSize':

5. Some numeric values are stored as strings when they should be stored as numbers.

Use `update_many` to convert `latitude` and `longitude` to decimal numbers.

In [19]:

# Update documents in the collection to convert 'longitude' and 'latitude' fields to doubles
establishments.update_many(
    {"geocode.longitude": {"$type": "string"}, "geocode.latitude": {"$type": "string"}},
    [{"$set": {"geocode.longitude": {"$toDouble": "$geocode.longitude"}, "geocode.latitude": {"$toDouble": "$geocode.latitude"}}}]
)

# Print a message indicating the update is complete
print("Coordinates updated successfully.")

Coordinates updated successfully.
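
The filter and pipeline used above follow a regular pattern, so they could be generated for any list of string-typed fields. A minimal sketch, in which `to_double_update` is a hypothetical helper name:

```python
def to_double_update(fields):
    """Build the (filter, pipeline) pair for converting string fields to doubles.

    The filter selects documents in which every listed field is a string;
    the pipeline rewrites each field with MongoDB's $toDouble operator.
    """
    query = {field: {"$type": "string"} for field in fields}
    pipeline = [{"$set": {field: {"$toDouble": f"${field}"} for field in fields}}]
    return query, pipeline
```

The result could be passed straight through, for example `establishments.update_many(*to_double_update(["geocode.longitude", "geocode.latitude"]))`.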


Use `update_many` to convert `RatingValue` to integer numbers.

In [23]:
# Set non-numeric RatingValue entries to None (null)
non_ratings = ["AwaitingInspection", "Awaiting Inspection", "AwaitingPublication", "Pass", "Exempt"]
establishments.update_many({"RatingValue": {"$in": non_ratings}}, {"$set": {"RatingValue": None}})

UpdateResult({'n': 0, 'nModified': 0, 'ok': 1.0, 'updatedExisting': False}, acknowledged=True)

In [21]:
# Convert RatingValue from string to integer; non-numeric strings become None
collection.update_many({}, [
    {"$set": {
        "RatingValue": {
            "$cond": {
                "if": {"$eq": [{"$type": "$RatingValue"}, "string"]},
                "then": {
                    "$cond": {
                        "if": {"$regexMatch": {"input": "$RatingValue", "regex": "^[0-9]+$"}},
                        "then": {"$toInt": "$RatingValue"},
                        "else": None
                    }
                },
                "else": "$RatingValue"
            }
        }
    }}
])

UpdateResult({'n': 38786, 'nModified': 34694, 'ok': 1.0, 'updatedExisting': True}, acknowledged=True)
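
The nested `$cond` expression can be hard to read. The plain-Python function below is an illustrative equivalent of the same per-document rule. `convert_rating` is a hypothetical name; it is not run against the database.

```python
import re


def convert_rating(value):
    """Mirror the RatingValue conversion rule for a single value.

    Strings made up only of digits become integers, any other string
    becomes None, and non-string values are returned unchanged.
    """
    if isinstance(value, str):
        return int(value) if re.fullmatch(r"[0-9]+", value) else None
    return value
```

For example, `convert_rating("5")` gives `5`, `convert_rating("Pass")` gives `None`, and an already converted `4` stays `4`.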

In [22]:
# Check that the coordinates and rating value are now numbers
# Retrieve a document from the collection
document = collection.find_one()

if document:
    if 'geocode' in document:
        latitude = document['geocode'].get('latitude')
        longitude = document['geocode'].get('longitude')
        print("Data type of latitude:", type(latitude))
        print("Data type of longitude:", type(longitude))
    else:
        print("'geocode' field not found in the document.")

    rating_value = document.get('RatingValue')
    print("Data type of rating value:", type(rating_value))
else:
    print("No documents found in the collection.")


Data type of latitude: <class 'float'>
Data type of longitude: <class 'float'>
Data type of rating value: <class 'int'>
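
The check above inspects a single document. To cover the whole collection, one option is to count how many documents still hold a string in each field. This is a sketch: `count_string_values` is a hypothetical helper and `collection` is assumed to be a PyMongo collection.

```python
def count_string_values(collection, fields):
    """Return {field: number of documents where that field is still a string}."""
    return {
        field: collection.count_documents({field: {"$type": "string"}})
        for field in fields
    }
```

A result of zero for every field in `count_string_values(establishments, ["geocode.latitude", "geocode.longitude", "RatingValue"])` would indicate that no string values remain.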
