# Congestion Charges - Hard

You may need to create views to complete these questions - but you do not have permission to create tables or views in the default schema. Your SQL commands are executed by user scott in schema gisq - you may create or drop views and tables in schema scott but not in gisq.

In [1]:
import getpass
import psycopg2
from sqlalchemy import create_engine
import pandas as pd
import numpy as np
pwd = getpass.getpass()
engine = create_engine(
    'postgresql+psycopg2://postgres:%s@192.168.31.31:15432/sqlzoo' % (pwd))
pd.set_option('display.max_rows', 60)


 ········


In [2]:
camera = pd.read_sql_table('camera', engine)
keeper = pd.read_sql_table('keeper', engine)
vehicle = pd.read_sql_table('vehicle', engine)
image = pd.read_sql_table('image', engine)
permit = pd.read_sql_table('permit', engine)

## 1.
When creating a view in scott you must specify the schema name of the sources and the destination.

In [3]:
scott=pd.DataFrame()

## 2.
There are four types of permit. The most popular type means that this type has been issued the highest number of times. Find out the most popular type, together with the total number of permits issued.

In [4]:
(permit.groupby('chargetype').agg(cnt=pd.NamedAgg(column='reg', aggfunc='count'))
 .reset_index()
 .sort_values('cnt', ascending=False).iloc[:1])

Unnamed: 0,chargetype,cnt
1,Daily,27


## 3.
For each of the vehicles caught by camera 19 - show the registration, the earliest time at camera 19 and the time and camera at which it left the zone.

In [5]:
#  registrations showing at camera 19, the earliest time
t = (image.loc[image['camera']==19]
     .merge(vehicle, left_on='reg', right_on='id')
     .groupby('reg')
     .agg(earliest=pd.NamedAgg(column='whn', aggfunc='min'))
     .reset_index()[['reg', 'earliest']])
a = (t.merge(image, on='reg', how='left')
     .query('earliest < whn')
     .groupby(['reg', 'earliest'])
     .agg(next=pd.NamedAgg(column='whn', aggfunc='min'))
     .reset_index()[['reg', 'earliest', 'next']])
(a.merge(image, left_on=['reg', 'next'], right_on=['reg', 'whn'])
 [['reg', 'earliest', 'next', 'camera']])

Unnamed: 0,reg,earliest,next,camera
0,SO 02 CSP,2007-02-25 07:51:10,2007-02-25 07:55:11,18
1,SO 02 DSP,2007-02-25 16:31:01,2007-02-25 17:42:41,19
2,SO 02 JSP,2007-02-25 17:14:11,2007-02-25 17:17:03,3
3,SO 02 TSP,2007-02-25 07:23:00,2007-02-25 07:26:31,19


## 4.
For all 19 cameras - show the position as IN, OUT or INTERNAL and the busiest hour for that camera.

In [6]:
t = (camera.assign(type=camera['perim'].fillna('INTERNAL'))
     .merge(image.assign(hr=image['whn'].dt.hour), 
            left_on='id', right_on='camera')
     .groupby(['camera', 'type', 'hr'])
     .agg(n=pd.NamedAgg(column='id', aggfunc='count'))
     .reset_index())
(t.groupby(['camera', 'type', 'hr'])['n'].max()
 .reset_index().sort_values('camera'))

Unnamed: 0,camera,type,hr,n
0,1,IN,6,1
1,2,IN,7,1
2,3,IN,17,3
3,3,IN,18,2
4,5,IN,7,1
5,8,IN,7,2
6,9,OUT,6,1
7,9,OUT,16,6
8,9,OUT,18,1
9,10,OUT,5,1


## 5.
Anomalous daily permits. Daily permits should not be issued for non-charging days. Find a way to represent charging days. Identify the anomalous daily permits.

In [7]:
permit.loc[(permit['sdate'].dt.weekday.isin([5, 6])) &
           (permit['chargetype']=='Daily')]

Unnamed: 0,reg,sdate,chargetype
1,SO 02 ATP,2007-01-21,Daily
6,SO 02 BTP,2007-02-03,Daily
7,SO 02 BTP,2007-02-04,Daily
12,SO 02 CTP,2007-01-21,Daily
21,SO 02 FTP,2007-02-25,Daily
27,SO 02 HTP,2006-01-21,Daily
28,SO 02 HTP,2006-01-22,Daily
33,SO 02 JTP,2007-01-21,Daily


## 6.
Issuing fines: Vehicles using the zone during the charge period, on charging days must be issued with fine notices unless they have a permit covering that day. List the name and address of such culprits, give the camera and the date and time of the first offence.

In [8]:
from pandas.tseries.offsets import DateOffset

# vehicles with permits
t = (vehicle.merge(permit, left_on='id', right_on='reg', how='left')
     [['reg', 'sdate', 'keeper', 'chargetype']]
    .assign(edate=permit['sdate']))
t.loc[t['chargetype']=='Daily', 'edate'] = t.loc[
    t['chargetype']=='Daily', 'sdate'] + DateOffset(days=1)
t.loc[t['chargetype']=='Weekly', 'edate'] = t.loc[
    t['chargetype']=='Weekly', 'sdate'] + DateOffset(weeks=1)
t.loc[t['chargetype']=='Monthly', 'edate'] = t.loc[
    t['chargetype']=='Monthly', 'sdate'] + DateOffset(months=1)
t.loc[t['chargetype']=='Annual', 'edate'] = t.loc[
    t['chargetype']=='Annual', 'sdate'] + DateOffset(years=1)

f = (t.merge(image, on='reg', how='right')
     .merge(keeper.rename(columns={'id': 'keeper'}), on='keeper'))
f = f.loc[(f['whn']<f['sdate']) | (f['whn']>f['edate'])]

a = (f.groupby(['reg', 'name', 'address'])
     .agg(first_offence=pd.NamedAgg(column='whn', aggfunc='min'))
     .reset_index())
(a.merge(f[['reg', 'whn', 'camera']], 
         left_on=['reg', 'first_offence'], right_on=['reg', 'whn'])
 [['reg', 'name', 'address', 'first_offence', 'camera']])

Unnamed: 0,reg,name,address,first_offence,camera
0,SO 02 ASP,"Ambiguous, Arthur",Absorption Ave.,2007-02-25 06:10:13,1
1,SO 02 CSP,"Ambiguous, Arthur",Absorption Ave.,2007-02-25 06:57:31,17
2,SO 02 DSP,"Strenuous, Sam",Surjection Street,2007-02-25 16:29:11,18
3,SO 02 GSP,"Incongruous, Ingrid",Irresolution Pl.,2007-02-25 07:10:00,5
4,SO 02 HSP,"Assiduous, Annie",Attribution Alley,2007-02-25 16:45:04,9
5,SO 02 ISP,"Incongruous, Ingrid",Irresolution Pl.,2007-02-25 16:58:01,9
6,SO 02 JSP,"Inconspicuous, Iain",Interception Rd.,2007-02-25 17:07:00,3
