# **USE CASE QUERIES**
---
This notebook runs the queries that support our use case.

### **USE CASE** 
**Which establishments, in D, have the most food inspection violations?**

* Only **failed** inspections matter: a violation is counted against an establishment only when the inspection that recorded it was failed.

* Each row is a specific establishment at a unique location.

* The three columns displayed in the queries are:

    * **Name** - The establishment's name.

    * **Violations** - The number of violations an establishment has accumulated from failed inspections.

    * **Failure Rate** - The percentage of the establishment's inspections that failed (failed inspections divided by total inspections), truncated to a whole percent.
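
The counting rules above can be tried on a small in-memory database before running them against the real data. This is a minimal sketch, not the notebook's actual data: the table and column names mirror the queries in this notebook, but the rows are invented, and it assumes `violations` is a column on `Inspections` that is NULL when an inspection recorded nothing.

```python
import sqlite3

# Hypothetical toy data: names mirror the notebook's queries, rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Establishments (estID INTEGER PRIMARY KEY, estName TEXT);
CREATE TABLE Inspections (inspecID INTEGER PRIMARY KEY, result TEXT, violations TEXT);
CREATE TABLE EstablishmentInspections (estID INTEGER, inspecID INTEGER);
INSERT INTO Establishments VALUES (1, 'Cafe A'), (2, 'Diner B');
INSERT INTO Inspections VALUES
    (10, 'Fail', 'v1'), (11, 'Pass', 'v2'), (12, 'Fail', 'v3'),
    (20, 'Pass', NULL), (21, 'Fail', 'v4'), (22, 'Pass', 'v5'), (23, 'Pass', NULL);
INSERT INTO EstablishmentInspections VALUES
    (1, 10), (1, 11), (1, 12), (2, 20), (2, 21), (2, 22), (2, 23);
""")

query = """
SELECT e.estName AS Name,
       -- COUNT skips NULLs, so only violations on failed inspections are counted.
       COUNT(CASE WHEN i.result = 'Fail' THEN i.violations END) AS Violations,
       -- 100.0 forces floating-point division; CAST truncates to a whole percent.
       CAST(COUNT(CASE WHEN i.result = 'Fail' THEN 1 END) * 100.0
            / COUNT(i.result) AS INT) || '%' AS "Failure Rate"
FROM Inspections AS i
JOIN EstablishmentInspections AS ei ON i.inspecID = ei.inspecID
JOIN Establishments AS e ON e.estID = ei.estID
GROUP BY e.estID
ORDER BY Violations DESC
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
conn.close()
```

In this toy data, Cafe A fails 2 of 3 inspections, so its rate of 66.67% is truncated to `66%`, and the violation attached to its passed inspection is not counted.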

## **BEFORE CLEANING**
---

In [None]:
import sqlite3
import pandas as pd
db_dirty = sqlite3.connect('../data/dirty_data/dirty_food_inspections.db')
cur = db_dirty.cursor()

In [None]:
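# Count violations and compute the failure rate per establishment (uncleaned data).
Q = '''
SELECT
    j.Name,
    j.Violations,
    -- Truncate the percentage to a whole number and append a percent sign.
    CAST(j.failed_inspections/j.total_inspections*100 AS INT) || '%' AS "Failure Rate"
FROM (SELECT
        e.estName AS Name,
        -- Only violations recorded on failed inspections are counted.
        COUNT(CASE WHEN i.result = 'Fail' THEN violations END) AS Violations,
        -- Multiply by 1.0 so the division above is not integer division.
        COUNT(CASE WHEN i.result = 'Fail' THEN 1 END)*1.0 AS failed_inspections,
        COUNT(i.result)*1.0 AS total_inspections
      FROM Inspections AS i
      INNER JOIN EstablishmentInspections AS ei
      ON i.inspecID = ei.inspecID
      INNER JOIN Establishments AS e
      ON e.estID = ei.estID
      GROUP BY e.estID) AS j
ORDER BY Violations DESC
'''
# read_sql_query executes the query itself, so no separate cur.execute(Q) is needed.
df = pd.read_sql_query(Q, db_dirty)
df.head(20)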

## **AFTER CLEANING**
---

In [None]:
import sqlite3
import pandas as pd
db_clean = sqlite3.connect('../data/clean_data/food_inspections.db')
cur = db_clean.cursor()

In [None]:
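# Same query against the cleaned data, where the violation column is named `number`.
Q = '''
SELECT
    j.Name,
    j.Violations,
    -- Truncate the percentage to a whole number and append a percent sign.
    CAST(j.failed_inspections/j.total_inspections*100 AS INT) || '%' AS "Failure Rate"
FROM (SELECT
        e.estName AS Name,
        -- Only violations recorded on failed inspections are counted.
        COUNT(CASE WHEN i.result = 'Fail' THEN number END) AS Violations,
        -- Multiply by 1.0 so the division above is not integer division.
        COUNT(CASE WHEN i.result = 'Fail' THEN 1 END)*1.0 AS failed_inspections,
        COUNT(i.result)*1.0 AS total_inspections
      FROM Inspections AS i
      INNER JOIN EstablishmentInspections AS ei
      ON i.inspecID = ei.inspecID
      INNER JOIN Establishments AS e
      ON e.estID = ei.estID
      GROUP BY e.estID) AS j
ORDER BY Violations DESC
'''
# read_sql_query executes the query itself, so no separate cur.execute(Q) is needed.
df = pd.read_sql_query(Q, db_clean)
df.head(20)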

In [None]:
db_dirty.close()
db_clean.close()