Previous work by the team, published in JAC ([here](https://academic.oup.com/jac/article/74/1/242/5098536)), found that higher antibiotic prescribing was associated with greater practice size, a higher proportion of patients aged >65 years or <18 years, rurality and deprivation. Our more recent work found that financial incentives can affect dispensing doctors' prescribing and that electronic health records can also affect prescribing decisions. This notebook sets out to examine whether these two factors have any effect on antibiotic prescribing.

Current Measures of Antibiotic Prescribing
- Items per 1000 patients (used in previous paper)
- Antibiotic stewardship: co-amoxiclav, cephalosporins & quinolones (KTT9)
- Antibiotic stewardship: co-amoxiclav, cephalosporins & quinolones (KTT9) prescribing volume
- Antibiotic stewardship: prescribing of trimethoprim vs nitrofurantoin
- Antibiotic stewardship: three-day courses for uncomplicated UTIs
- Antibiotic stewardship: volume of antibiotic prescribing (KTT9)
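
As a worked illustration of the first two kinds of measure, the sketch below computes an items-per-1000-patients rate and a broad-spectrum percentage (co-amoxiclav, cephalosporins and quinolones as a share of antibiotic items). The column names `antibiotic_items` and `broad_spectrum_items`, the helper `add_rate_measures` and the numbers are assumptions made for this example only; they are not the published measure definitions.

```python
import pandas as pd

# Illustrative values only; column names are assumptions for this sketch.
practice_months = pd.DataFrame(
    {
        "practice": ["A", "B"],
        "antibiotic_items": [260, 120],
        "broad_spectrum_items": [14, 18],
        "list_size": [4233, 2000],
    }
)


def add_rate_measures(df):
    """Return a copy of df with two derived measures (hypothetical helper)."""
    out = df.copy()
    # Antibiotic items per 1000 registered patients.
    out["items_per_1000"] = 1000 * out["antibiotic_items"] / out["list_size"]
    # Broad-spectrum items as a percentage of all antibiotic items.
    out["pct_broad_spectrum"] = (
        100 * out["broad_spectrum_items"] / out["antibiotic_items"]
    )
    return out


measures = add_rate_measures(practice_months)
print(measures[["practice", "items_per_1000", "pct_broad_spectrum"]])
```

Keeping the numerator and denominator as separate columns, rather than storing only the ratio, makes it possible to re-aggregate to CCG or national level later without averaging ratios.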

Items have traditionally been used to assess antibiotic prescribing: because antibiotic courses are self-limiting, differences in the "repeat prescribing cycle" (e.g. 28 days v 56 days v 84 days) do not need to be accounted for. However, recent guidance has highlighted the importance of course length, and national guidance has reduced the course length for certain common infections. This [paper in the BMJ](https://www.bmj.com/content/364/bmj.l440) found that a substantial proportion of prescriptions have durations exceeding guideline recommendations, although it only looked at prescribing in one GP EHR (which we think is unrepresentative, despite claims to the contrary). Therefore we should propose new measures:

- Total quantity of antibiotics (for this we could use DDD or ADQ)
- Proportion of Rx by course duration (3 days versus 5 days versus 7 days versus 10 days)
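
One possible way to approach the duration measure is sketched below. It assumes course length can be approximated as quantity per item divided by an assumed number of dose units per day, and then reports the share of items in each duration band. The data, the `units_per_day` column and the `label_duration` helper are all hypothetical, and real prescribing data would need a dosing assumption for each presentation.

```python
import pandas as pd

# Hypothetical rows: one per (presentation, quantity per item) combination.
rx = pd.DataFrame(
    {
        "quantity_per_item": [6, 10, 14, 28, 20, 9],
        "units_per_day": [2, 2, 2, 4, 2, 2],  # assumed dosing, not from the data
        "items": [50, 20, 30, 10, 5, 5],
    }
)

TARGET_DAYS = [3, 5, 7, 10]


def label_duration(days):
    """Map a course length in days to a band label (hypothetical helper)."""
    return f"{int(days)} days" if days in TARGET_DAYS else "other"


# Approximate course length under the dosing assumption above.
rx["course_days"] = rx["quantity_per_item"] / rx["units_per_day"]
rx["duration_band"] = rx["course_days"].map(label_duration)

# Share of all items falling in each duration band.
share = rx.groupby("duration_band")["items"].sum() / rx["items"].sum()
print(share)
```

Courses that do not match one of the target lengths are kept in an "other" band rather than dropped, so the shares always sum to one.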




#### Library Import

In [2]:
## import libraries that will probably be needed
import pandas as pd
import numpy as np
from ebmdatalab import bq, maps, charts
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec
import seaborn as sns

In [3]:
## ensuring the format is consistent for pounds and pence
pd.set_option('display.float_format', lambda x: '%.2f' % x)

## Import Data

### 1. Overall Prescribing as defined by previous paper

In [4]:
sql = '''
SELECT
  presc.month,
  pct,
  presc.practice,
  TRIM(Principal_Supplier) AS supplier,
  dd.dispensing_patients AS dispensing_patients,
  dd.prescribing_patients AS prescribing_patients,
  SUM(IF(SUBSTR(presc.bnf_code,1,6)='050113',items,0)) AS uti_items,
  SUM(IF(SUBSTR(presc.bnf_code,1,6)='050103',items,0)) AS tetracyclines,
  SUM(IF(SUBSTR(presc.bnf_code,1,6)='050108',items,0)) AS sulphonamides_trimethoprim,
  SUM(IF(SUBSTR(presc.bnf_code,1,9)='0501013K0',items,0)) AS coamoxiclav,
  SUM(IF(SUBSTR(presc.bnf_code,1,6)='050102',items,0)) AS cephalosporins, ## uses '050102' here but '0501021' in all_broad_spectrum; discrepancy, ask Helen
  SUM(IF(SUBSTR(presc.bnf_code,1,6)='050112',items,0)) AS quinolones,
  SUM(IF(SUBSTR(presc.bnf_code,1,6)='050105',items,0)) AS macrolides,
  SUM(IF(SUBSTR(presc.bnf_code,1,6)='050111',items,0)) AS metroni_tini_ornidazole,
  SUM(IF(SUBSTR(presc.bnf_code,1,6)='050101',items,0)) AS penicillins,
  SUM(IF(SUBSTR(presc.bnf_code,1,9) IN ('0501012G0','0501012H0'),items,0)) AS flucloxacillin,
  SUM(IF(SUBSTR(presc.bnf_code,1,9)='0501013K0' OR
         SUBSTR(presc.bnf_code,1,7)='0501021' OR
         SUBSTR(presc.bnf_code,1,6)='050112',items,0)) AS all_broad_spectrum,
  SUM(IF(SUBSTR(presc.bnf_code,1,9)='0501013K0' OR
         SUBSTR(presc.bnf_code,1,7)='0501021' OR
         SUBSTR(presc.bnf_code,1,6) IN ('050112','050113','050103','050105','050108','050111','050101'),items,0)) AS denom_broad_spectrum,
  SUM(IF(SUBSTR(presc.bnf_code,1,6)='050110',items,0)) AS antileprotic,
  SUM(IF(SUBSTR(presc.bnf_code,1,6)='050109',items,0)) AS antituberculosis,
  SUM(IF(SUBSTR(presc.bnf_code,1,6)='050107',items,0)) AS some_other_antibacterials,
  SUM(IF(SUBSTR(presc.bnf_code,1,6)='050104',items,0)) AS aminogylcosides,
  SUM(IF(SUBSTR(presc.bnf_code,1,9) IN ('0501013C0','0501013F0','0501013E0','0501013B0'),items,0)) AS amoxicillin,
  SUM(items) AS items,
  ## This will probably be dropped: SUM(IF((presc.bnf_code like'0501130R0%AG' OR presc.bnf_code like '0501130R0%AA' OR presc.bnf_code like '0501130R0%AD'
  ##  OR presc.bnf_code LIKE '0501015P0%AB' OR presc.bnf_code LIKE '0501080W0%AE'), presc.quantity,0) 
  ##  * r.percent_of_adq) AS numerator_uti_course,
  ## SUM(IF((presc.bnf_code like '0501130R0%AG' OR presc.bnf_code like '0501130R0%AA' OR presc.bnf_code like '0501130R0%AD'
  ##  OR presc.bnf_code like '0501015P0%AB' OR presc.bnf_code LIKE '0501080W0%AE'), presc.items,0)) AS denominator_uti_course,
  AVG(total_list_size) AS list_size,
  CAST(JSON_EXTRACT(MAX(star_pu), '$.oral_antibacterials_item') AS FLOAT64) AS star_pu_items
FROM
  ebmdatalab.hscic.normalised_prescribing_standard AS presc ##this is our core dataset
INNER JOIN
  ebmdatalab.hscic.practices AS prac ## this has info on practices
ON
  presc.practice = prac.code
  AND (prac.setting = 4) ## this limits it to "normal GP practices"
INNER JOIN
  ebmdatalab.bsa.dispensing_practices_nov_2018 AS dd
ON
  presc.practice = dd.practice_code
JOIN
  ebmdatalab.alex.vendors AS software #this is where the up to date vendors table is held
ON
  software.ODS = presc.practice
  AND Date = presc.month
LEFT JOIN
  ebmdatalab.hscic.practice_statistics_all_years AS stat
ON
  presc.practice = stat.practice
  AND presc.month = stat.month
## As above, this will probably be dropped: LEFT JOIN
##  ebmdatalab.hscic.presentation r
## ON
##  presc.bnf_code = r.bnf_code
GROUP BY
  practice,
  pct,
  setting,
  month,
  supplier,
  dispensing_patients,
  prescribing_patients
ORDER BY
  practice,
  month
'''

df_all_abx = bq.cached_read(sql, csv_path='all_abx.csv', use_cache=True)
df_all_abx.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 277336 entries, 0 to 277335
Data columns (total 26 columns):
month                         277336 non-null datetime64[ns]
pct                           277336 non-null object
practice                      277336 non-null object
supplier                      277336 non-null object
dispensing_patients           277336 non-null int64
prescribing_patients          277336 non-null int64
uti_items                     277336 non-null int64
tetracyclines                 277336 non-null int64
sulphonamides_trimethoprim    277336 non-null int64
coamoxiclav                   277336 non-null int64
cephalosporins                277336 non-null int64
quinolones                    277336 non-null int64
macrolides                    277336 non-null int64
metroni_tini_ornidazole       277336 non-null int64
penicillins                   277336 non-null int64
flucloxacillin                277336 non-null int64
all_broad_spectrum            277336 non-null

In [5]:
df_all_abx.head()

Unnamed: 0,month,pct,practice,supplier,dispensing_patients,prescribing_patients,uti_items,tetracyclines,sulphonamides_trimethoprim,coamoxiclav,...,all_broad_spectrum,denom_broad_spectrum,antileprotic,antituberculosis,some_other_antibacterials,aminogylcosides,amoxicillin,items,list_size,star_pu_items
0,2016-01-01,00K,A81001,TPP,0,4083,11,28,24,5,...,14,260,0,0,0,0,92,7212,4233.0,2543.77
1,2016-02-01,00K,A81001,TPP,0,4083,13,32,25,7,...,12,251,0,0,1,0,85,7326,4233.0,2543.77
2,2016-03-01,00K,A81001,TPP,0,4083,13,27,27,11,...,18,270,0,0,0,0,105,7735,4233.0,2543.77
3,2016-04-01,00K,A81001,TPP,0,4083,7,26,19,3,...,9,197,0,0,0,0,66,7361,4245.0,2539.25
4,2016-05-01,00K,A81001,TPP,0,4083,6,29,15,0,...,4,170,0,0,0,0,48,6882,4245.0,2539.25
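
A minimal sketch of how the two factors of interest (dispensing status and EHR supplier) might be compared once the data are loaded. It uses a small made-up frame with the same column names as `df_all_abx`. Treating `dispensing_patients > 0` as the marker of a dispensing practice, and `denom_broad_spectrum` as the count of antibiotic items, are assumptions of this example, as are the `OTHER` supplier label and the `summarise_by_factor` helper.

```python
import pandas as pd

# Made-up rows that reuse column names from df_all_abx; values are illustrative.
toy = pd.DataFrame(
    {
        "practice": ["P1", "P2", "P3", "P4"],
        "supplier": ["TPP", "TPP", "OTHER", "OTHER"],
        "dispensing_patients": [0, 1500, 0, 800],
        "denom_broad_spectrum": [260, 300, 200, 150],
        "list_size": [4000.0, 5000.0, 4000.0, 2500.0],
    }
)


def summarise_by_factor(df):
    """Median antibiotic rate by dispensing status and supplier (hypothetical)."""
    out = df.copy()
    # Assumption: any dispensing patients marks the practice as dispensing.
    out["dispensing_status"] = (out["dispensing_patients"] > 0).map(
        {True: "dispensing", False: "non-dispensing"}
    )
    # Assumption: denom_broad_spectrum stands in for total antibiotic items.
    out["abx_items_per_1000"] = (
        1000 * out["denom_broad_spectrum"] / out["list_size"]
    )
    grouped = out.groupby(["dispensing_status", "supplier"])["abx_items_per_1000"]
    return grouped.agg(["count", "median"])


summary = summarise_by_factor(toy)
print(summary)
```

The real frame has one row per practice per month, so in practice the rate would need to be aggregated over a chosen period before grouping; this sketch ignores that step.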


### 2. OpenPrescribing Measures of Antibiotic Prescribing

### 3. Raw Quantity Data