# Eat Safe, Love

## Part 1: Database and Jupyter Notebook Set Up

Import the data provided in the `establishments.json` file from your Terminal. Name the database `uk_food` and the collection `establishments`.

Within this markdown cell, copy the line of text you used to import the data from your Terminal. This way, future analysts will be able to repeat your process.

For example, import the dataset with: `mongoimport --type json -d uk_food -c establishments --drop --jsonArray establishments.json`
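If you would rather keep the import step reproducible from Python, the same command can be assembled as an argument list. This is only a sketch: the helper name `build_mongoimport_command` is hypothetical, and actually running it assumes `mongoimport` is on your PATH and a local `mongod` is listening.

```python
import shlex


def build_mongoimport_command(json_path="establishments.json",
                              database="uk_food",
                              collection="establishments"):
    """Return the mongoimport argument list used for this notebook."""
    return [
        "mongoimport",
        "--type", "json",
        "-d", database,      # target database
        "-c", collection,    # target collection
        "--drop",            # drop the collection first so re-imports don't duplicate data
        "--jsonArray",       # the file holds one JSON array of documents
        json_path,
    ]


command = build_mongoimport_command()
print(shlex.join(command))

# To run the import from Python instead of the Terminal (assumption: mongoimport is installed):
# import subprocess
# subprocess.run(command, check=True)
```

Printing the joined command gives a line you can paste straight into the Terminal.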

In [14]:
# Import dependencies
from pymongo import MongoClient
from bson import ObjectId
from pprint import pprint

import os

import sys
sys.path.append("../")

In [5]:
# Create an instance of MongoClient

mongo = MongoClient(port=27017)

# client = pymongo.MongoClient('mongodb://localhost:27017/')



In [6]:
# assign the uk_food database to a variable name

db = mongo['uk_food']

In [8]:
mongo.list_database_names()

['admin', 'config', 'local', 'uk_food']

In [9]:
print(db.name)

uk_food


In [10]:
# Get the list of collection names in the uk_food database

collection_names = db.list_collection_names()

print(collection_names)


['establishments']


In [11]:
# review the collections in our new database

# assign the collection to a variable
establishments = db['establishments']

establishments

Collection(Database(MongoClient(host=['localhost:27017'], document_class=dict, tz_aware=False, connect=True), 'uk_food'), 'establishments')

In [15]:
# review a document in the establishments collection

# Capture the results to a variable

document = db.establishments.find_one()

# Get the field names from the document
field_names = list(document.keys())

print("==== Fields: ====")
print(field_names)

print("================\n")

# Pretty print the document

pprint(document)

==== Fields: ====
['_id', 'FHRSID', 'ChangesByServerID', 'LocalAuthorityBusinessID', 'BusinessName', 'BusinessType', 'BusinessTypeID', 'AddressLine1', 'AddressLine2', 'AddressLine3', 'AddressLine4', 'PostCode', 'Phone', 'RatingValue', 'RatingKey', 'RatingDate', 'LocalAuthorityCode', 'LocalAuthorityName', 'LocalAuthorityWebSite', 'LocalAuthorityEmailAddress', 'scores', 'SchemeType', 'geocode', 'RightToReply', 'Distance', 'NewRatingPending', 'meta', 'links']

{'AddressLine1': 'The Bay',
 'AddressLine2': 'St Margarets Bay',
 'AddressLine3': 'Kent',
 'AddressLine4': '',
 'BusinessName': 'The Coastguard Inn',
 'BusinessType': 'Pub/bar/nightclub',
 'BusinessTypeID': 7843,
 'ChangesByServerID': 0,
 'Distance': 4587.347174863443,
 'FHRSID': 1034540,
 'LocalAuthorityBusinessID': 'PI/000078691',
 'LocalAuthorityCode': '182',
 'LocalAuthorityEmailAddress': 'publicprotection@dover.gov.uk',
 'LocalAuthorityName': 'Dover',
 'LocalAuthorityWebSite': 'http://www.dover.gov.uk/',
 'NewRatingPending': Fa

## Part 2: Update the Database

1. An exciting new halal restaurant just opened in Greenwich, but hasn't been rated yet. The magazine has asked you to include it in your analysis. Add the following restaurant "Penang Flavours" to the database.

In [16]:
# get shared local authority values from another entity in Greenwich 

greenwich_entity = db.establishments.find_one({'LocalAuthorityName' : 'Greenwich'})

greenwich_entity

{'_id': ObjectId('64931478a8c36d50fdd250df'),
 'FHRSID': 1451407,
 'ChangesByServerID': 0,
 'LocalAuthorityBusinessID': '14959',
 'BusinessName': 'Oaks Nursing Home',
 'BusinessType': 'Caring Premises',
 'BusinessTypeID': 5,
 'AddressLine1': 'The Oaks 904 Sidcup Road',
 'AddressLine2': '',
 'AddressLine3': 'Eltham',
 'AddressLine4': 'Greenwich',
 'PostCode': 'SE9 3PW',
 'Phone': '',
 'RatingValue': '5',
 'RatingKey': 'fhrs_5_en-gb',
 'RatingDate': '2022-01-12T00:00:00',
 'LocalAuthorityCode': '511',
 'LocalAuthorityName': 'Greenwich',
 'LocalAuthorityWebSite': 'http://www.royalgreenwich.gov.uk',
 'LocalAuthorityEmailAddress': 'health@royalgreenwich.gov.uk',
 'scores': {'Hygiene': 5, 'Structural': 5, 'ConfidenceInManagement': 5},
 'SchemeType': 'FHRS',
 'geocode': {'longitude': '0.0740289', 'latitude': '51.4320613'},
 'RightToReply': '',
 'Distance': 4645.598535750726,
 'NewRatingPending': False,
 'meta': {'dataSource': None,
  'extractDate': '0001-01-01T00:00:00',
  'itemCount': 0,
  '

In [104]:
# Create a dictionary for the new restaurant data for "Penang Flavours"

# A real latitude and longitude are needed for this new restaurant later
# in the analysis, so I borrowed the address of a new, little-known Turkish
# restaurant that was recommended as one of the top 15 restaurants in
# Greenwich, in an area with other restaurants. Turkish food seemed like
# something a customer looking for halal food might like.
# The latitude and longitude for that address come from
# https://www.latlong.net/place/blackheath-london-uk-32669.html

dict = {'AddressLine1': '',
 'AddressLine2': '',
 'AddressLine3': '',
 'AddressLine4': '',
 'BusinessName': 'Penang Flavours',
 'BusinessType': '',
 'BusinessTypeID': 0,
 'ChangesByServerID': 0,
 'Distance': 0.00,
 'FHRSID': 0,
 'LocalAuthorityBusinessID': '',
 'LocalAuthorityCode': greenwich_entity['LocalAuthorityCode'],
 'LocalAuthorityEmailAddress': greenwich_entity['LocalAuthorityEmailAddress'],
 'LocalAuthorityName': greenwich_entity['LocalAuthorityName'],
 'LocalAuthorityWebSite': greenwich_entity['LocalAuthorityWebSite'],
 'NewRatingPending': True,
 'Phone': '',
 'PostCode': '',
 'RatingDate': '',
 'RatingKey': '',
 'RatingValue': None,
 'SchemeType': '',
 'geocode': {'latitude': '51.465691', 'longitude': '0.005687'},
 'links': [{'href': greenwich_entity['links'][0]['rel'],
            'rel': 'self'}],
 'meta': {'dataSource': None,
          'extractDate': '',
          'itemCount': 0,
          'pageNumber': 0,
          'pageSize': 0,
          'returncode': None,
          'totalCount': 0,
          'totalPages': 0},
 'scores': {'ConfidenceInManagement': 0, 'Hygiene': 0, 'Structural': 0}}


dict

{'AddressLine1': '',
 'AddressLine2': '',
 'AddressLine3': '',
 'AddressLine4': '',
 'BusinessName': 'Penang Flavours',
 'BusinessType': '',
 'BusinessTypeID': 0,
 'ChangesByServerID': 0,
 'Distance': 0.0,
 'FHRSID': 0,
 'LocalAuthorityBusinessID': '',
 'LocalAuthorityCode': '511',
 'LocalAuthorityEmailAddress': 'health@royalgreenwich.gov.uk',
 'LocalAuthorityName': 'Greenwich',
 'LocalAuthorityWebSite': 'http://www.royalgreenwich.gov.uk',
 'NewRatingPending': True,
 'Phone': '',
 'PostCode': '',
 'RatingDate': '',
 'RatingKey': '',
 'RatingValue': None,
 'SchemeType': '',
 'geocode': {'latitude': '51.465691', 'longitude': '0.005687'},
 'links': [{'href': 'self', 'rel': 'self'}],
 'meta': {'dataSource': None,
  'extractDate': '',
  'itemCount': 0,
  'pageNumber': 0,
  'pageSize': 0,
  'returncode': None,
  'totalCount': 0,
  'totalPages': 0},
 'scores': {'ConfidenceInManagement': 0, 'Hygiene': 0, 'Structural': 0}}

In [18]:
# Insert the new restaurant into the collection

db.establishments.insert_one(dict)

<pymongo.results.InsertOneResult at 0x279cb4b4100>

In [19]:
# Check that the new restaurant was inserted
penang_entity = db.establishments.find_one({'BusinessName' : 'Penang Flavours'})

penang_entity

{'_id': ObjectId('64931602e4246859347e97c2'),
 'AddressLine1': '',
 'AddressLine2': '',
 'AddressLine3': '',
 'AddressLine4': '',
 'BusinessName': 'Penang Flavours',
 'BusinessType': '',
 'BusinessTypeID': 0,
 'ChangesByServerID': 0,
 'Distance': 0.0,
 'FHRSID': 0,
 'LocalAuthorityBusinessID': '',
 'LocalAuthorityCode': '511',
 'LocalAuthorityEmailAddress': 'health@royalgreenwich.gov.uk',
 'LocalAuthorityName': 'Greenwich',
 'LocalAuthorityWebSite': 'http://www.royalgreenwich.gov.uk',
 'NewRatingPending': True,
 'Phone': '',
 'PostCode': '',
 'RatingDate': '',
 'RatingKey': '',
 'RatingValue': '',
 'RightToReply': '',
 'SchemeType': '',
 'geocode': {'latitude': '', 'longitude': ''},
 'links': [{'href': 'self', 'rel': 'self'}],
 'meta': {'dataSource': None,
  'extractDate': '',
  'itemCount': 0,
  'pageNumber': 0,
  'pageSize': 0,
  'returncode': None,
  'totalCount': 0,
  'totalPages': 0},
 'scores': {'ConfidenceInManagement': 0, 'Hygiene': 0, 'Structural': 0}}

In [20]:
# Find how many other restaurants have Penang in their names

penang_entities = db.establishments.find(
    {'BusinessName': {'$regex': 'Penang'}},
    {'BusinessName': 1, 'BusinessType': 1, 'BusinessTypeID': 1})

for entry in penang_entities:
    print(entry)

{'_id': ObjectId('64931602e4246859347e97c2'), 'BusinessName': 'Penang Flavours', 'BusinessType': '', 'BusinessTypeID': 0}


In [21]:
# Get total number of entities

total_ents = db.establishments.count_documents({})

print(f"After adding the Penang Flavours restaurant, there are {total_ents} entities.")

After adding the Penang Flavours restaurant, there are 39780 entities.
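Because `insert_one` adds a new document every time the cell is re-run, a re-runnable notebook may prefer an upsert. The following is a minimal sketch, not the method used above: `upsert_establishment` is a hypothetical helper, and it assumes `BusinessName` is a good enough key to identify the restaurant.

```python
def upsert_establishment(collection, document, key="BusinessName"):
    """Insert `document` only if no document with the same `key` value exists.

    `collection` is expected to behave like a pymongo Collection.
    """
    # Everything except the key goes into $setOnInsert; the key itself
    # is copied into the new document from the equality filter.
    fields_on_insert = {k: v for k, v in document.items() if k != key}
    return collection.update_one(
        {key: document[key]},
        {"$setOnInsert": fields_on_insert},
        upsert=True,
    )


# Usage (assumption: `db` is the uk_food database and `new_restaurant`
# is the dictionary built for Penang Flavours):
# result = upsert_establishment(db.establishments, new_restaurant)
# print(result.upserted_id)  # None when the restaurant already existed
```

With this pattern the total document count stays the same no matter how many times the cell is executed.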


2. Find the BusinessTypeID for "Restaurant/Cafe/Canteen" and return only the `BusinessTypeID` and `BusinessType` fields.

In [22]:
# Find the BusinessTypeID for "Restaurant/Cafe/Canteen" and return only the BusinessTypeID and BusinessType fields

restaurant_type = db.establishments.find_one(
    {'BusinessType': 'Restaurant/Cafe/Canteen'},
    {'BusinessTypeID': 1, 'BusinessType': 1}
)

print(f"BusinessType: {restaurant_type['BusinessType']}, BusinessTypeID: {restaurant_type['BusinessTypeID']}" )


BusinessType: Restaurant/Cafe/Canteen, BusinessTypeID: 1


3. Update the new restaurant with the `BusinessTypeID` you found.

In [23]:
# Update the new restaurant with the correct BusinessTypeID


db.establishments.update_one(
    {'BusinessName': 'Penang Flavours'},
    {'$set': {
        'BusinessType': restaurant_type['BusinessType'],
        'BusinessTypeID': restaurant_type['BusinessTypeID']
    }}
)


<pymongo.results.UpdateResult at 0x279cb5d93c0>

In [24]:
# Confirm that the new restaurant was updated

penang_ent = db.establishments.find_one({'BusinessName' : 'Penang Flavours'})

penang_ent

{'_id': ObjectId('64931602e4246859347e97c2'),
 'AddressLine1': '',
 'AddressLine2': '',
 'AddressLine3': '',
 'AddressLine4': '',
 'BusinessName': 'Penang Flavours',
 'BusinessType': 'Restaurant/Cafe/Canteen',
 'BusinessTypeID': 1,
 'ChangesByServerID': 0,
 'Distance': 0.0,
 'FHRSID': 0,
 'LocalAuthorityBusinessID': '',
 'LocalAuthorityCode': '511',
 'LocalAuthorityEmailAddress': 'health@royalgreenwich.gov.uk',
 'LocalAuthorityName': 'Greenwich',
 'LocalAuthorityWebSite': 'http://www.royalgreenwich.gov.uk',
 'NewRatingPending': True,
 'Phone': '',
 'PostCode': '',
 'RatingDate': '',
 'RatingKey': '',
 'RatingValue': '',
 'RightToReply': '',
 'SchemeType': '',
 'geocode': {'latitude': '', 'longitude': ''},
 'links': [{'href': 'self', 'rel': 'self'}],
 'meta': {'dataSource': None,
  'extractDate': '',
  'itemCount': 0,
  'pageNumber': 0,
  'pageSize': 0,
  'returncode': None,
  'totalCount': 0,
  'totalPages': 0},
 'scores': {'ConfidenceInManagement': 0, 'Hygiene': 0, 'Structural': 0}}

4. The magazine is not interested in any establishments in Dover, so check how many documents contain the Dover Local Authority. Then, remove any establishments within the Dover Local Authority from the database, and check the number of documents to ensure they were deleted.

In [25]:
# Find how many documents have LocalAuthorityName as "Dover"

dover_entities = list(db.establishments.find({'LocalAuthorityName' : 'Dover'}))

num_dover_ents = len(dover_entities)

print(f"There are {num_dover_ents} business entities in Dover:")
for entity in dover_entities:
    print(entity['BusinessName'])

There are 994 business entities in Dover:
The Coastguard Inn
Boodles
FirstLight Bar & Café
The Barn
The Pines Calyx
The Halfway Hut
The Tea Room
Mrs Knotts Tea Room
Refreshment Kiosk
Walmer and Kingsdown Golf Club
Lovetocater
St Margarets At Cliffe C P School
Lenox House
St Margarets At Cliffe Nursery And After School Club
The White Cliffs
St Margaret's Bowls and Social Club
Portal House School
The Smugglers
The Village Shop
Boat House & Langdon
The Lounge Bar
Rising Sun
Seahaven & Kingsdown Lodge
Goodwins Suite (Reception)
Oxtale
Kingsdown Pre School
Kingsdown Newsagents
Zetland Arms
Kingsdown And Ringwould Cofe Primary School
Costa Coffee
Kings Head
W H Smiths
Kingsdown & Ringwould Breakfast Club
Pride of Canterbury
Burger King
Spirit of France
Hogbox
Spirit of Britain
Pride of Burgundy
National Trust White Cliffs
Adventure Backpackers Hostel
Pride of Kent
Loddington House Hotel
Delft Seaways
Dunkerque Seaways
Glendale Lodge
Cinque Port Arms
Dover Cargo Terminal
Food Outlets
GCP Cate

In [26]:
# Delete all documents where LocalAuthorityName is "Dover"

db.establishments.delete_many({'LocalAuthorityName': 'Dover'})


<pymongo.results.DeleteResult at 0x279cb482440>

In [27]:
# Check if any remaining documents include Dover
updated_dover_entities = list(db.establishments.find({'LocalAuthorityName' : 'Dover'}))

count_dover_ents = len(updated_dover_entities)

print(f"There were {num_dover_ents} business entities in Dover and now there are {count_dover_ents}.")

# check overall totals
new_tots = db.establishments.count_documents({})

print(f"There were {total_ents} total entities, and now there are {new_tots}.")

There were 994 business entities in Dover and now there are 0.
There were 39780 total entities, and now there are 38786.
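The `DeleteResult` returned by `delete_many` also reports how many documents were removed, which gives a second way to confirm the numbers above. A small sketch, assuming a pymongo-style collection; the helper name `delete_by_local_authority` is made up for illustration.

```python
def delete_by_local_authority(collection, authority_name):
    """Delete all documents for one local authority.

    Returns a tuple: (matching before, deleted, matching after).
    """
    query = {"LocalAuthorityName": authority_name}
    before = collection.count_documents(query)
    result = collection.delete_many(query)
    after = collection.count_documents(query)
    # deleted_count is the number of documents the server removed
    return before, result.deleted_count, after


# Usage (assumption: `db` is the uk_food database):
# before, deleted, after = delete_by_local_authority(db.establishments, "Dover")
# print(f"Matched {before}, deleted {deleted}, remaining {after}")
```

Using `count_documents` here also avoids pulling every Dover document into a Python list just to count them.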


In [28]:
# Check that other documents remain with 'find_one'
# review a document in the establishments collection

doc = db.establishments.find_one()

pprint(doc)

{'AddressLine1': 'East Cliff Pavilion',
 'AddressLine2': 'Wear Bay Road',
 'AddressLine3': 'Folkestone',
 'AddressLine4': 'Kent',
 'BusinessName': 'The Pavilion',
 'BusinessType': 'Restaurant/Cafe/Canteen',
 'BusinessTypeID': 1,
 'ChangesByServerID': 0,
 'Distance': 4591.765489457773,
 'FHRSID': 1043695,
 'LocalAuthorityBusinessID': 'PI/000073616',
 'LocalAuthorityCode': '188',
 'LocalAuthorityEmailAddress': 'foodteam@folkestone-hythe.gov.uk',
 'LocalAuthorityName': 'Folkestone and Hythe',
 'LocalAuthorityWebSite': 'http://www.folkestone-hythe.gov.uk',
 'NewRatingPending': False,
 'Phone': '',
 'PostCode': 'CT19 6BL',
 'RatingDate': '2018-04-04T00:00:00',
 'RatingKey': 'fhrs_5_en-gb',
 'RatingValue': '5',
 'RightToReply': '',
 'SchemeType': 'FHRS',
 '_id': ObjectId('64931477a8c36d50fdd1e3f5'),
 'geocode': {'latitude': '51.083812', 'longitude': '1.195625'},
 'links': [{'href': 'https://api.ratings.food.gov.uk/establishments/1043695',
            'rel': 'self'}],
 'meta': {'dataSource': 

5. Some of the number values are stored as strings, when they should be stored as numbers.

Use `update_many` to convert `latitude` and `longitude` to decimal numbers.

In [None]:
# first set any blank latitude and longitude fields to None
# before trying to do the conversion from string to float

blank_values = ['', ' ']

# Update latitude
query_blank = {"geocode.latitude": {"$in": blank_values}}
update_query = {"$set": {"geocode.latitude": None}}

result = establishments.update_many(query_blank, update_query)
print("Modified documents (latitude):", result.modified_count)

# Update longitude
query_blank = {"geocode.longitude": {"$in": blank_values}}
update_query = {"$set": {"geocode.longitude": None}}

result = establishments.update_many(query_blank, update_query)
print("Modified documents (longitude):", result.modified_count)


In [67]:
# now convert the strings to floats

# Get all the documents
docs = establishments.find()

# loop through updating each individually and replacing the existing 
# doc with the updated version.  

for doc in docs:
    # Convert the latitude field to float if it's a string, not a null
    if isinstance(doc['geocode']['latitude'], str):
        doc['geocode']['latitude'] = float(doc['geocode']['latitude'])
        
    # Convert the longitude field to float if it's a string, not a null   
    if isinstance(doc['geocode']['longitude'], str):
        doc['geocode']['longitude'] = float(doc['geocode']['longitude'])

    # Update the document in the establishments collection
    establishments.replace_one({'_id': doc['_id']}, doc)
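The loop above works, but the instructions ask for `update_many`. One way to do the conversion in a single server-side call is an update with an aggregation pipeline and `$convert`. This is a sketch under assumptions: it needs MongoDB 4.2 or later (pipeline updates), and the names `to_double_or_null`, `GEOCODE_PIPELINE`, and `convert_geocode_to_double` are hypothetical.

```python
def to_double_or_null(field_path):
    """Build a $convert expression that turns a field into a double.

    Values that cannot be converted (such as '' or ' ') and missing
    or null values become None instead of raising an error.
    """
    return {
        "$convert": {
            "input": f"${field_path}",
            "to": "double",
            "onError": None,
            "onNull": None,
        }
    }


# Update pipeline: overwrite both geocode fields with their converted values
GEOCODE_PIPELINE = [
    {"$set": {
        "geocode.latitude": to_double_or_null("geocode.latitude"),
        "geocode.longitude": to_double_or_null("geocode.longitude"),
    }}
]


def convert_geocode_to_double(collection):
    """Apply the pipeline to every document in a pymongo-style collection."""
    return collection.update_many({}, GEOCODE_PIPELINE)


# Usage (assumption: `establishments` is the collection from Part 1):
# result = convert_geocode_to_double(establishments)
# print("Modified documents:", result.modified_count)
```

Because `onError` maps blank strings to `None`, this variant would not need the separate blank-value cleanup cell.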


Use `update_many` to convert `RatingValue` to integer numbers.

In [68]:
# Check the values of the RatingValue field before trying to convert them

print(establishments.distinct('RatingValue'))

['', '0', '1', '2', '3', '4', '5', 'Awaiting Inspection', 'AwaitingInspection', 'AwaitingPublication', 'Exempt', 'Pass']


In [69]:
# First find everything that won't convert to an integer

non_ratings = ["AwaitingInspection", "Awaiting Inspection", "AwaitingPublication", "Pass", "Exempt", ""]

query_none = {"RatingValue": {"$in": non_ratings}}

results = establishments.count_documents(query_none)
print(results)

4092


In [70]:
# Set all of the non-numeric rating values to None.
# This is per the requirements of the challenge. In practice, it would
# seem much better to keep the different reasons for not having a
# rating value, especially for exemptions, and to convert to int only
# where the RatingValue is in ['0', '1', '2', '3', '4', '5'].
# That would preserve more information and require less updating. But the
# starter code indicated that all of these values should be set to None.

update_query = {"$set": {"RatingValue": None}}

result = establishments.update_many(query_none, update_query)

print("Modified documents:", result.modified_count)

Modified documents: 4092


In [91]:
# check the ratings values after the null value cleanup

print(f"After null conversion: {establishments.distinct('RatingValue')}")  

After null conversion: [None, 0, 1, 2, 3, 4, 5]


In [92]:
match_query = {"RatingValue": {"$ne": None}}
none_query = {"RatingValue": {"$eq": None}}

total_ratings = establishments.count_documents({})
print(f"Total documents: {total_ratings}")

valid_ratings = establishments.count_documents(match_query)
print(f"Convertible ratings: {valid_ratings}")

result_none = establishments.count_documents(none_query)
print(f"Null ratings: {result_none}")


Total documents: 38786
Convertible ratings: 34694
Null ratings: 4092


In [93]:
# now convert the valid Ratings Values to integers

# Get all the documents
docs = establishments.find()

# Loop through updating each individually and replacing the existing doc
# with the updated version. 

for doc in docs:
    
    # Convert the RatingValue field to integer if it is a string
    if isinstance(doc['RatingValue'], str):
        doc['RatingValue'] = int(doc['RatingValue'])
        
        # Update the document in the establishments collection
        establishments.replace_one({'_id': doc['_id']}, doc)

# check the results
print(f"Converted ratings values: {establishments.distinct('RatingValue')}")                               

Converted ratings values: [None, 0, 1, 2, 3, 4, 5]
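For `RatingValue`, a single `update_many` restricted to string values could do the null cleanup and the integer conversion together. Again a sketch rather than what was run above: it assumes MongoDB 4.2 or later, and `RATING_FILTER`, `RATING_PIPELINE`, and `convert_rating_to_int` are illustrative names.

```python
# Only touch documents whose RatingValue is still stored as a string
RATING_FILTER = {"RatingValue": {"$type": "string"}}

# '0'..'5' become integers; strings such as 'Pass', 'Exempt' or ''
# cannot be converted, so onError turns them into None.
RATING_PIPELINE = [
    {"$set": {
        "RatingValue": {
            "$convert": {
                "input": "$RatingValue",
                "to": "int",
                "onError": None,
                "onNull": None,
            }
        }
    }}
]


def convert_rating_to_int(collection):
    """Convert string ratings in a pymongo-style collection in one call."""
    return collection.update_many(RATING_FILTER, RATING_PIPELINE)


# Usage (assumption: `establishments` is the collection from Part 1):
# result = convert_rating_to_int(establishments)
# print("Modified documents:", result.modified_count)
```

Note that this discards the reason a rating is missing, which is the same trade-off described in the comments on the null cleanup cell.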


In [100]:
# Check that the coordinates and rating value are now numbers.

# This also was already checked in other ways, in cells above.
# Disregarding all "None"/Null cells as that appears to be 
# acceptable for this application and these fields.

# Get all the documents
updated_docs = establishments.find()

for doc in updated_docs:

    if doc['geocode']['latitude'] is not None:

        if not isinstance(doc['geocode']['latitude'], float):
            print(f"The latitude {doc['geocode']['latitude']} is not a float")

    if doc['geocode']['longitude'] is not None:

        if not isinstance(doc['geocode']['longitude'], float):
            print(f"The longitude {doc['geocode']['longitude']} is not a float")

    if doc['RatingValue'] is not None:

        if not isinstance(doc['RatingValue'], int):
            print(f"The Rating Value {doc['RatingValue']} is not an integer")

print("conversion type checking complete.")

conversion type checking complete.
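The same check can also be pushed to the server with the `$type` query operator, so no documents need to be pulled into Python. A minimal sketch; `EXPECTED_TYPES`, `build_type_check_queries`, and `count_type_violations` are hypothetical names, and the list of accepted BSON types is my assumption about what counts as valid here.

```python
# BSON type aliases accepted for each field (null is allowed, as above)
EXPECTED_TYPES = {
    "geocode.latitude": ["double", "null"],
    "geocode.longitude": ["double", "null"],
    "RatingValue": ["int", "long", "null"],
}


def build_type_check_queries(expected=EXPECTED_TYPES):
    """Return one query per field matching documents with an unexpected type.

    Note: $not also matches documents where the field is missing entirely.
    """
    return {
        field: {field: {"$not": {"$type": types}}}
        for field, types in expected.items()
    }


def count_type_violations(collection):
    """Count offending documents per field in a pymongo-style collection."""
    return {
        field: collection.count_documents(query)
        for field, query in build_type_check_queries().items()
    }


# Usage (assumption: `establishments` is the collection from Part 1):
# print(count_type_violations(establishments))
```

A count of zero for every field would indicate that the conversions covered the whole collection.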
