<a href="https://colab.research.google.com/github/FrK06/Visualisation/blob/main/london_food_hygiene_rating_interactive_map.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Imports

In [None]:
import pandas as pd
import plotly.express as px

# Loading Dataset

In [None]:
# Load the data
data_path = '/content/food_hygiene_rating_data.csv'
data = pd.read_csv(data_path)

# Display the first few rows of the dataframe
data.head()

Unnamed: 0.1,Unnamed: 0,FHRSID,LocalAuthorityBusinessID,BusinessName,BusinessType,PostCode,RatingValue,RatingKey,RatingDate,LocalAuthorityCode,LocalAuthorityName,Longitude,Latitude
0,0,1438654,21/00856/FOOD,1st Base Catering,Mobile caterer,E20 2ST,AwaitingInspection,fhrs_awaitinginspection_en-GB,,525,Newham,-0.018066,51.538799
1,1,1132140,19/00459/FOOD,53.5 Degrees,Restaurant/Cafe/Canteen,E16 2RD,5,fhrs_5_en-GB,2019-05-10,525,Newham,0.064757,51.507405
2,2,1132134,19/00447/FOOD,53.5 Degrees,Restaurant/Cafe/Canteen,E15 4LZ,5,fhrs_5_en-GB,2019-05-14,525,Newham,0.009809,51.543395
3,3,1260384,20/00288/FOOD,55 Square Limited,Restaurant/Cafe/Canteen,E16 1EN,2,fhrs_2_en-GB,2020-12-09,525,Newham,0.012417,51.517514
4,4,1389145,21/00354/FOOD,7 Mamas Ltd,Takeaway/sandwich shop,E6 3HN,5,fhrs_5_en-GB,2022-01-05,525,Newham,0.055372,51.527803




---


The dataset contains the following fields for each food establishment:

**Unnamed: 0**: Appears to be a row identifier.

**FHRSID**: The unique identifier for each food establishment.

**LocalAuthorityBusinessID**: Another identifier, possibly specific to the local authority.

**BusinessName**: The name of the food establishment.

**BusinessType**: The type of food establishment (e.g., restaurant, cafe, mobile caterer).

**PostCode**: The postcode of the establishment.

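**RatingValue**: The hygiene rating assigned to the establishment: a score from 0 to 5, or a status such as `AwaitingInspection`, `AwaitingPublication`, or `Exempt`.

**RatingKey**: A string key derived from the rating value (e.g., `fhrs_5_en-GB`).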

**RatingDate**: The date on which the rating was given.

**LocalAuthorityCode**: A code for the local authority.

**LocalAuthorityName**: The name of the local authority.

**Longitude**: The longitude coordinate of the establishment.

**Latitude**: The latitude coordinate of the establishment.
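
Since `Unnamed: 0` looks like a leftover row identifier and `RatingDate` is read in as text, both could be handled at load time. The following is a minimal sketch, assuming a small inline sample in place of the real CSV: the rows are copied from the preview above and trimmed to a few columns, and `sample_csv` and `sample` are illustrative names that do not appear in the notebook.

```python
import io

import pandas as pd

# Inline stand-in for the real file: a few columns from the rows previewed above.
sample_csv = io.StringIO(
    "Unnamed: 0,FHRSID,BusinessName,RatingValue,RatingDate,Longitude,Latitude\n"
    "0,1438654,1st Base Catering,AwaitingInspection,,-0.018066,51.538799\n"
    "1,1132140,53.5 Degrees,5,2019-05-10,0.064757,51.507405\n"
    "2,1260384,55 Square Limited,2,2020-12-09,0.012417,51.517514\n"
)

# index_col=0 uses the leftover row-identifier column as the index instead of
# keeping it as data; parse_dates converts RatingDate to datetimes (blank -> NaT).
sample = pd.read_csv(sample_csv, index_col=0, parse_dates=["RatingDate"])
sample.index.name = None

print(sample.dtypes)
```

Whether to apply this to the full file is a judgment call; the rest of the notebook works with the columns as loaded.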


---



# Exploratory Data Analysis (EDA)

In [None]:
data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 24352 entries, 0 to 24351
Data columns (total 13 columns):
 #   Column                    Non-Null Count  Dtype  
---  ------                    --------------  -----  
 0   Unnamed: 0                24352 non-null  int64  
 1   FHRSID                    24352 non-null  int64  
 2   LocalAuthorityBusinessID  24352 non-null  object 
 3   BusinessName              24351 non-null  object 
 4   BusinessType              24352 non-null  object 
 5   PostCode                  24352 non-null  object 
 6   RatingValue               24352 non-null  object 
 7   RatingKey                 24352 non-null  object 
 8   RatingDate                21901 non-null  object 
 9   LocalAuthorityCode        24352 non-null  int64  
 10  LocalAuthorityName        24352 non-null  object 
 11  Longitude                 24352 non-null  float64
 12  Latitude                  24352 non-null  float64
dtypes: float64(2), int64(3), object(8)
memory usage: 2.4+ MB


In [None]:
# Count the number of businesses with each rating value
numeric_ratings = data[data['RatingValue'].apply(lambda x: str(x).isdigit())].copy()

# Convert the 'RatingValue' column to integers for proper counting
numeric_ratings['RatingValue'] = numeric_ratings['RatingValue'].astype(int)

# Count the number of businesses for each rating value
rating_counts = numeric_ratings['RatingValue'].value_counts().sort_index()

print(rating_counts)

0      140
1      550
2      675
3     2450
4     4483
5    12544
Name: RatingValue, dtype: int64


In [None]:
# Count the number of occurrences of each business type
business_type_counts = data['BusinessType'].value_counts()

print(business_type_counts)

Restaurant/Cafe/Canteen                  7307
Retailers - other                        5553
Takeaway/sandwich shop                   3102
Other catering premises                  2917
Hospitals/Childcare/Caring Premises      1338
School/college/university                1101
Pub/bar/nightclub                        1040
Mobile caterer                            608
Retailers - supermarkets/hypermarkets     536
Manufacturers/packers                     373
Hotel/bed & breakfast/guest house         308
Distributors/Transporters                 110
Importers/Exporters                        51
Farmers/growers                             8
Name: BusinessType, dtype: int64


# Visualisations

## Distribution of Hygiene Ratings across Business Types

In [None]:
# Distribution of RatingValue across different BusinessType categories
fig_rating_business_type = px.histogram(data, x='BusinessType', color='RatingValue',
                                        title='Distribution of Hygiene Ratings across Business Types',
                                        labels={'RatingValue':'Hygiene Rating', 'count':'Number of Establishments'},
                                        category_orders={"RatingValue": ["5", "4", "3", "2", "1", "0", "AwaitingInspection", "AwaitingPublication", "Exempt"]},
                                        color_discrete_sequence=px.colors.qualitative.Set2,
                                        height=600)

fig_rating_business_type.update_layout(barmode='group', xaxis={'categoryorder':'total descending'}, xaxis_title='Business Type', yaxis_title='Count')
fig_rating_business_type.update_xaxes(tickangle=-45)
fig_rating_business_type.show()
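
The grouped histogram plots raw counts, so the largest business types tend to dominate it. One possible complement, not part of the original notebook, is to normalise within each business type so that categories of different sizes can be compared. The sketch below uses a small made-up sample; the rows in `sample` are illustrative and not taken from the dataset.

```python
import pandas as pd

# Made-up sample for illustration only.
sample = pd.DataFrame({
    "BusinessType": ["Restaurant/Cafe/Canteen"] * 4 + ["Mobile caterer"] * 2,
    "RatingValue": ["5", "5", "4", "2", "5", "AwaitingInspection"],
})

# normalize="index" divides each row by its total, giving the share of each
# rating within a business type.
rating_share = pd.crosstab(
    sample["BusinessType"], sample["RatingValue"], normalize="index"
)

print(rating_share.round(2))
```

Each row of `rating_share` sums to 1, so a value of 0.5 in the `5` column means half of that business type holds the top rating.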



---

*Next, I filter the data to keep only businesses with a published rating.*

---



In [None]:
# Exclude businesses that are exempt, not yet inspected, or awaiting publication of their rating.
filtered_data = data[~data['RatingValue'].isin(['AwaitingInspection', 'AwaitingPublication', 'Exempt'])]
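
The filter above lists the status labels by name, so any label missing from the list would pass through. A possible alternative is to keep only the values that parse as numbers, which also yields an integer column. This is a minimal sketch on a made-up sample; `sample`, `numeric`, and `rated` are illustrative names.

```python
import pandas as pd

# Made-up sample for illustration only.
sample = pd.DataFrame({
    "BusinessName": ["A", "B", "C", "D", "E"],
    "RatingValue": ["5", "AwaitingInspection", "2", "Exempt", "0"],
})

# With errors="coerce", non-numeric labels become NaN instead of raising.
numeric = pd.to_numeric(sample["RatingValue"], errors="coerce")

# Keep the rows that parsed and store the rating as an integer.
rated = sample[numeric.notna()].copy()
rated["RatingValue"] = numeric[numeric.notna()].astype(int)

print(rated)
```

One consequence to be aware of: with an integer column, Plotly Express should treat `color="RatingValue"` as a continuous scale, whereas the string values used in this notebook are coloured as discrete categories.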


## Geo-Interactive Rating Discovery Map - Full Rating

*The map displays businesses rated on a hygiene scale from 0 to 5.*

*Every point on the map represents a business. Hovering over a point shows its name, hygiene rating, local authority, and postcode.*

In [None]:
fig = px.scatter_mapbox(filtered_data, lat="Latitude", lon="Longitude",
                        hover_name="BusinessName",
                        hover_data=["RatingValue", "LocalAuthorityName", "PostCode"],
                        color="RatingValue", size_max=15, zoom=10,
                        mapbox_style="carto-positron",
                        title="Food Hygiene Ratings by Location")

# Remove the margins so the map fills the figure.
fig.update_layout(margin={"r": 0, "t": 0, "l": 0, "b": 0})

fig.show()


## Dynamic Distribution Map

*Press the play button to observe how businesses are distributed according to their hygiene ratings.*







In [None]:
fig = px.scatter_mapbox(filtered_data, lat="Latitude", lon="Longitude",
                        hover_name="BusinessName",
                        hover_data=["RatingValue", "LocalAuthorityName", "PostCode"],
                        color="RatingValue",
                        color_continuous_scale=px.colors.sequential.Bluered,
                        height=700, size_max=15, zoom=10,
                        animation_frame="RatingValue")

# Set the base map style, remove the margins, and use a 2-second transition.
fig.update_layout(mapbox_style="carto-positron",
                  margin={"r": 0, "t": 0, "l": 0, "b": 0},
                  transition={"duration": 2000})
fig.show()

## Hygiene Rating Timeline Map

*This map steps through the hygiene ratings in date order, using the `animation_frame` feature with one frame per rating date.*



---



*Disclaimer!*

*Admittedly, it might not be the most exciting viewing experience, but given the presence of time-related data in the dataset, I was eager to put it to use!*

In [None]:
filtered_data = filtered_data.copy()

# Convert 'RatingDate' to datetime, sort, and then to string format for Plotly
filtered_data['RatingDate'] = pd.to_datetime(filtered_data['RatingDate'])
filtered_data = filtered_data.sort_values('RatingDate')
filtered_data['RatingDateStr'] = filtered_data['RatingDate'].dt.strftime('%Y-%m-%d')

fig = px.scatter_mapbox(filtered_data, lat="Latitude", lon="Longitude",
                        color="RatingValue", size_max=15, zoom=10,
                        animation_frame="RatingDateStr", animation_group="BusinessName",
                        color_continuous_scale=px.colors.sequential.Bluered,
                        height=700)

fig.update_layout(mapbox_style="carto-positron",
                  margin={"r":0, "t":0, "l":0, "b":0},
                  transition={'duration': 2000})

fig.show()
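
Animating on the exact date produces one frame per distinct day, which can make for a long animation. A possible variant, not used in this notebook, groups the ratings by month before plotting. The sketch below runs on a made-up sample; `sample` and the `RatingMonth` column are illustrative names.

```python
import pandas as pd

# Made-up sample for illustration only.
sample = pd.DataFrame({
    "BusinessName": ["A", "B", "C", "D"],
    "RatingDate": ["2019-05-14", "2019-05-10", "2020-12-09", None],
})

sample["RatingDate"] = pd.to_datetime(sample["RatingDate"])

# Drop rows without a date, sort chronologically, and label each row with its
# year-month so that every month becomes a single animation frame.
sample = sample.dropna(subset=["RatingDate"]).sort_values("RatingDate")
sample["RatingMonth"] = sample["RatingDate"].dt.strftime("%Y-%m")

print(sample["RatingMonth"].unique())
```

Passing `animation_frame="RatingMonth"` to `px.scatter_mapbox` would then step through the data month by month rather than day by day.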