# Analysis of NSW Food Authority's Name & Shame Register

The NSW Food Authority publishes lists of businesses that have breached or are alleged to have breached NSW food safety laws. Publishing the lists gives consumers more information to make decisions about where they eat or buy food. Individuals and businesses may either receive a penalty notice for their alleged offence or be prosecuted before a court.

In [1]:
import sys
print(sys.version)

3.8.8 (default, Apr 13 2021, 12:59:45) 
[Clang 10.0.0 ]


In [2]:
# Libraries
import sys
sys.path.append('../utils')  # For notebooks
import utils 
import pandas as pd
import numpy as np
#import boto3
import os
import io
from dotenv import load_dotenv #for loading env variables
from github import Github #for pushing data to Github

## 1. Get Existing Data from GitHub

The Food Authority's Name & Shame website only displays the last 12 months of data. Since starting this repository in June 2024, I have appended each new batch of records to the bottom of a dataset stored on GitHub. Step 1 of the overall process is therefore to fetch that dataset.
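The append step itself is not shown in this notebook. A minimal sketch of how it could work is below, assuming `notice_number` uniquely identifies a notice; the function name `append_new_notices` and the sample rows are hypothetical.

```python
import pandas as pd

def append_new_notices(existing: pd.DataFrame, scraped: pd.DataFrame) -> pd.DataFrame:
    """Append rows from `scraped` whose notice_number is not yet in `existing`.

    Assumption: notice_number is a unique key for a penalty notice.
    """
    # Compare as strings so 1002 and "1002" are treated as the same notice
    scraped = scraped.assign(notice_number=scraped["notice_number"].astype(str))
    known = set(existing["notice_number"].astype(str))
    new_rows = scraped[~scraped["notice_number"].isin(known)]
    return pd.concat([existing, new_rows], ignore_index=True)

# Hypothetical data: one notice already stored, one genuinely new
existing = pd.DataFrame({"notice_number": ["1001", "1002"], "suburb": ["BURWOOD", "EDEN"]})
scraped = pd.DataFrame({"notice_number": [1002, 1003], "suburb": ["EDEN", "CAMPSIE"]})

combined = append_new_notices(existing, scraped)
print(combined)
```

Filtering on the key before concatenating keeps the stored rows untouched, so earlier values such as `scrape_timestamp_utc` are not overwritten by a later scrape.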

#### Get access keys and read from GitHub

In [4]:
# Load the environment variables from .env
load_dotenv()

# GitHub Authentication (Replace placeholders with your information)
access_token = os.environ.get("GITHUB_PERSONAL_ACCESS_TOKEN")
g = Github(access_token)

# Repository and File Information
repo_owner = "liampearson" 
repo_name = "nsw-food-authority-name-and-shame"
file_path = "data/dataset.csv"

# Get Repository
repo = g.get_user(repo_owner).get_repo(repo_name)

# Get File Contents
try:
    file_content = repo.get_contents(file_path)
    
    decoded_content = file_content.decoded_content.decode() # Decode if necessary    

    df = pd.read_csv(io.StringIO(decoded_content))

    df['notice_number'] = df['notice_number'].astype(str) #convert to string for comparison
    print("   Dataset has been downloaded. Shape: {}\n".format(df.shape))

except Exception as e:
    print(f"Error downloading file: {e}")

df.head()

   Dataset has been downloaded. Shape: (729, 20)



Unnamed: 0,published_date,notice_number,council,trade_name,suburb,address,postcode,date_alleged_offence,offence_code,offence_description,offence_circumstances,party_served_company,party_served_given_name,party_served_surname,penalty_amount,penalty_issued_by,penalty_date_served,updated_date,scrape_timestamp_utc,date_removed_from_website
0,2024-07-16,3264041716,BAYSIDE,KING FRUIT MARKET,KINGSGROVE,268 KINGSGROVE ROAD,2208,2024-01-12,11367,Fail to notify appropriate enforcement agency ...,Fail to notify appropriate enforcement agency ...,,BASSAM,JOMAA,440,Bayside Council,2024-01-12,2024-07-16,2024-07-16 06:30:30,
1,2024-07-16,3225724682,BEGA VALLEY,SOUTH COAST MEATS,EDEN,181-183 IMLAY STREET,2551,2023-08-02,11369,Handle sell food so as to contravene the food ...,Fail to comply with the requirements of a food...,SOUTH COAST MEATS PTY LTD,,,1320,NSW Food Authority,2023-09-28,2024-07-16,2024-07-16 06:30:30,
2,2024-07-16,3225724691,BEGA VALLEY,SOUTH COAST MEATS,EDEN,181-183 IMLAY STREET,2551,2023-06-06,11369,Handle sell food so as to contravene the food ...,Fail to comply with the requirements of a food...,SOUTH COAST MEATS PTY LTD,,,1320,NSW Food Authority,2023-09-28,2024-07-16,2024-07-16 06:30:30,
3,2024-07-16,3120329832,BLACKTOWN,DE HAWKERS,BLACKTOWN,17 PATRICK STREET,2148,2024-01-23,11339,Fail to comply with Food Standards Code - Corp...,"Fail to maintain the premises, and all fixture...",SNOWMAN GROUP PTY LTD,,,880,Blacktown City Council,2024-03-14,2024-07-16,2024-07-16 06:30:30,
4,2024-07-16,3261177361,BURWOOD,FISH BARREL,BURWOOD,"SHOP 4, 39 BELMORE STREET",2134,2024-01-09,11339,Fail to comply with Food Standards Code - Corp...,Fail to maintain the food premises to the requ...,LIN BROTHERS GROUP PTY LTD,,,880,Burwood Council,2024-01-22,2024-07-16,2024-07-16 06:30:30,


## Begin Analysis

In [10]:
date_columns = ['published_date', 'date_alleged_offence', 'penalty_date_served', 'date_removed_from_website']
datetime_columns = ['scrape_timestamp_utc']

for d in date_columns:
    df[d] = pd.to_datetime(df[d], errors='coerce')

for dt in datetime_columns:
    df[dt] = pd.to_datetime(df[dt]) 

df[date_columns + datetime_columns]

Unnamed: 0,published_date,date_alleged_offence,penalty_date_served,date_removed_from_website,scrape_timestamp_utc
0,2024-07-16,2024-01-12,2024-01-12,NaT,2024-07-16 06:30:30
1,2024-07-16,2023-08-02,2023-09-28,NaT,2024-07-16 06:30:30
2,2024-07-16,2023-06-06,2023-09-28,NaT,2024-07-16 06:30:30
3,2024-07-16,2024-01-23,2024-03-14,NaT,2024-07-16 06:30:30
4,2024-07-16,2024-01-09,2024-01-22,NaT,2024-07-16 06:30:30
...,...,...,...,...,...
724,2023-07-25,2023-04-23,2023-04-23,2024-06-25,2024-06-22 00:12:46
725,2023-07-18,2023-04-25,2023-05-07,2024-06-25,2024-06-22 00:12:46
726,2023-07-18,2023-04-25,2023-05-07,2024-06-25,2024-06-22 00:12:46
727,2023-07-18,2023-05-03,2023-05-11,2024-06-25,2024-06-22 00:12:46
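The `NaT` values in `date_removed_from_website` come from `errors='coerce'`, which turns anything unparseable (including missing values) into `NaT` instead of raising. The sketch below shows that behaviour on a small hand-built frame and then uses the parsed columns to measure the gap between offence and penalty; the `days_to_serve` column is an illustrative addition, not part of the dataset.

```python
import pandas as pd

# Two rows using dates from the preview above, plus values that cannot be parsed
sample = pd.DataFrame({
    "date_alleged_offence": ["2024-01-09", "2023-08-02"],
    "penalty_date_served": ["2024-01-22", "2023-09-28"],
    "date_removed_from_website": [None, "not a date"],
})

for col in sample.columns:
    # errors="coerce" converts missing or invalid entries to NaT
    sample[col] = pd.to_datetime(sample[col], errors="coerce")

# Once the columns are datetimes, subtraction yields a Timedelta
sample["days_to_serve"] = (
    sample["penalty_date_served"] - sample["date_alleged_offence"]
).dt.days

print(sample)
```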


### Occurrences of offence codes

In [29]:
code_counts = df['offence_code'].value_counts()  # notices per offence code, most common first
top_share = round(code_counts.max() / len(df) * 100)  # share of all notices, in percent
print("Since {} there have only been {} offence_codes ever used.".format(df['date_alleged_offence'].min(), len(code_counts)))
print("The most common is: {} which has been used {} (or {}%) of the time".format(code_counts.index[0], code_counts.max(), top_share))

Since 2022-05-16 00:00:00 there have only been 22 offence_codes ever used.
The most common is: 11339 which has been used 521 (or 71%) of the time
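To see the whole distribution rather than only the top code, `value_counts(normalize=True)` gives each code's share directly. This is a sketch on a small made-up series, not the real dataset; the counts are placeholders.

```python
import pandas as pd

# Placeholder series standing in for df['offence_code']
codes = pd.Series([11339, 11339, 11339, 11369, 11367], name="offence_code")

counts = codes.value_counts()
# normalize=True returns fractions that sum to 1; scale to percentages
shares = codes.value_counts(normalize=True).mul(100).round(1)

# The two series share the same index, so they align by offence code
summary = pd.DataFrame({"notices": counts, "percent": shares})
print(summary)
```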


#### So what is Offence Code 11339?

*Fail to comply with Food Standards Code - Corporation*

which is awfully vague. Let's look at the different `offence_circumstances` values to get an idea of the actual offences.



In [38]:
df[df['offence_code']==11339]['offence_circumstances'].value_counts().head(10)

Fail to maintain the food premises to the required standard of cleanliness                                                             73
Fail to store food in such a way that it is protected from the likelihood of contamination                                             45
Fail to maintain all fixtures, fittings and equipment to the required standard of cleanliness                                          38
Fail to take all practicable measures to eradicate and prevent the harbourage of pests                                                 21
Fail to ensure food contact surfaces of equipment is in a clean and sanitary condition                                                 19
Fail to maintain premises, fixtures, fittings, and equipment in a good state of repair and working order having regard to their use    16
Fail to store potentially hazardous food under temperature control                                                                     13
Fail to display potentially hazard

### Which suburbs/cities are being served with the most penalty notices?

Before reading the counts below, keep in mind two factors that can skew them:

#### Socioeconomic Disparities:

**Lower-income areas:** Restaurants in lower-income areas may have less access to resources for staff training, facility upgrades, or high-quality ingredients, making it more challenging to consistently meet food safety standards.

**Language barriers:** In areas with diverse populations, language barriers can make it difficult for some restaurant owners or staff to fully understand and comply with regulations.

#### Enforcement Bias:

**Targeted inspections:** If inspectors are more likely to focus on certain areas or types of establishments, this could lead to an over-representation of those areas on the register, even if the underlying rate of non-compliance isn't actually higher.

In [60]:
df['suburb'].value_counts().head(10)

BURWOOD      84
BANKSTOWN    33
CAMPSIE      26
CHATSWOOD    25
LAKEMBA      24
AUBURN       21
RHODES       17
BLACKTOWN    17
CAMDEN       13
LIVERPOOL    13
Name: suburb, dtype: int64
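Raw notice counts can also be inflated by one business receiving several notices. A rough way to check is to count distinct businesses per suburb alongside the notices. The sketch below uses the five rows from the preview earlier and assumes that `trade_name` plus `address` identifies a business, which may not hold if names are spelled inconsistently.

```python
import pandas as pd

# The five preview rows from the dataset head shown earlier
notices = pd.DataFrame({
    "suburb": ["KINGSGROVE", "EDEN", "EDEN", "BLACKTOWN", "BURWOOD"],
    "trade_name": ["KING FRUIT MARKET", "SOUTH COAST MEATS", "SOUTH COAST MEATS",
                   "DE HAWKERS", "FISH BARREL"],
    "address": ["268 KINGSGROVE ROAD", "181-183 IMLAY STREET", "181-183 IMLAY STREET",
                "17 PATRICK STREET", "SHOP 4, 39 BELMORE STREET"],
})

# Assumption: trade_name + address is a good enough business identifier
notices["business"] = notices["trade_name"] + " | " + notices["address"]

per_suburb = (
    notices.groupby("suburb")
    .agg(notices=("business", "size"), businesses=("business", "nunique"))
    .sort_values("notices", ascending=False)
)
print(per_suburb)
```

A suburb where `notices` is much larger than `businesses` is being driven by repeat notices rather than by many separate premises.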

### Are there any restaurants which appear multiple times?

(excluding multiple penalties on the same day)
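One possible way to answer this is to count the number of distinct offence dates per business, so that several notices issued on the same day count once. The sketch below is not the author's method; it assumes `trade_name` plus `address` identifies a business, and the second FISH BARREL row is invented to show same-day notices collapsing.

```python
import pandas as pd

notices = pd.DataFrame({
    "trade_name": ["SOUTH COAST MEATS", "SOUTH COAST MEATS", "FISH BARREL",
                   "FISH BARREL", "KING FRUIT MARKET"],
    "address": ["181-183 IMLAY STREET", "181-183 IMLAY STREET",
                "SHOP 4, 39 BELMORE STREET", "SHOP 4, 39 BELMORE STREET",
                "268 KINGSGROVE ROAD"],
    # The duplicated 2024-01-09 date is an invented same-day notice
    "date_alleged_offence": pd.to_datetime(
        ["2023-08-02", "2023-06-06", "2024-01-09", "2024-01-09", "2024-01-12"]
    ),
})

# Distinct offence dates per business; same-day notices count once
offence_days = notices.groupby(["trade_name", "address"])["date_alleged_offence"].nunique()

# Keep only businesses with offences on more than one day
repeat_offenders = offence_days[offence_days > 1].sort_values(ascending=False)
print(repeat_offenders)
```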

### What offences carry the biggest fines?

In [52]:
df.sort_values(by='penalty_amount', ascending=False)[['offence_circumstances', 'penalty_amount']].head(20).value_counts()

offence_circumstances                                                                                                                            penalty_amount
Fail to comply with a prohibition order                                                                                                          1540              3
Sale of food that is unsuitable                                                                                                                  1320              3
Fail to comply with a prohibition order - fail to store potentially hazardous food in accordance with the Food Standards Code                    1540              2
Fail to take all practicable measures to process only safe and suitable food                                                                     1540              2
Fail to comply with the requirements of a food safety scheme - did not implement corrective action in accordance with their food safety program  1320              1
Fail to comply 
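
The cell above only counts combinations within the 20 largest rows. An alternative sketch is to group by `offence_circumstances` and summarise the fines per group. The frame below is a small hand-built sample using amounts visible in the outputs above, not the full dataset.

```python
import pandas as pd

notices = pd.DataFrame({
    "offence_circumstances": [
        "Fail to comply with a prohibition order",
        "Fail to comply with a prohibition order",
        "Sale of food that is unsuitable",
        "Fail to maintain the food premises to the required standard of cleanliness",
    ],
    "penalty_amount": [1540, 1540, 1320, 880],
})

# Largest fine, typical fine, and number of notices for each circumstance
fines = (
    notices.groupby("offence_circumstances")["penalty_amount"]
    .agg(["max", "median", "count"])
    .sort_values("max", ascending=False)
)
print(fines)
```

Sorting by `max` answers "biggest fine"; sorting by `median` instead would show which circumstances are typically the most expensive.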