# Mapping BNF codes to dm+d

We have had a request from NHS England:

>Our initial need is to have a reference file that can be used to map data in BNF code form (from NHS BSA) to drug information in dm+d (SNOMED) form (at VMP/AMP level but also with VTM information).

We hold this information in BigQuery, and should be able to write a query that meets this need.

In [1]:
import os
import pandas as pd
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
from matplotlib.dates import DateFormatter
import matplotlib.ticker as ticker
import matplotlib.dates as mdates
import seaborn as sns
%matplotlib inline
from ebmdatalab import bq
from ebmdatalab import charts
from ebmdatalab import maps
import datetime

## Create data from BigQuery

In [18]:
sql = """
  SELECT
  "vmp" AS type, # create type column, shows whether VMP or AMP
  vmp.id AS id, # VMP code
  vmp.nm AS nm, # VMP name
  vmp.vtm AS vtm, # VTM code
  vtm.nm AS vtm_nm, # VTM name
  bnf_code, # BNF code
  vpidprev AS vmp_previous, # Previous VMP code
  vpiddt AS vmp_previous_date # Date that previous VMP code changed
FROM
  ebmdatalab.dmd.vmp_full AS vmp
LEFT OUTER JOIN
  dmd.vtm AS vtm
ON
  vmp.vtm = vtm.id

UNION ALL # join VMP and AMP tables together to form single table
SELECT
  "amp" AS type,
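  amp.id, # AMP code
  amp.descr, # AMP description (the output column takes the name nm from the first SELECT)
  vmp.vtm AS vtm, # VTM code, obtained via the parent VMP
  vtm.nm AS vtm_nm, # VTM name
  amp.bnf_code AS bnf_code, # BNF code
  NULL AS amp_previous, # placeholder so both SELECTs return the same number of columns
  NULL AS amp_previous_date # placeholder, as above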
FROM
  ebmdatalab.dmd.amp_full AS amp
INNER JOIN
  dmd.vmp AS vmp # join VMP to AMP table to get VMP codes to obtain VTM information
ON
  amp.vmp = vmp.id
LEFT OUTER JOIN
  dmd.vtm AS vtm
ON
  vmp.vtm = vtm.id
  """

exportfile = os.path.join("..","data","dmd_df.csv")
dmd_df = bq.cached_read(sql, csv_path=exportfile, use_cache=True)

In [19]:
dmd_df.head()

Unnamed: 0,type,id,nm,vtm,vtm_nm,bnf_code,vmp_previous,vmp_previous_date
0,vmp,6259002,Hydrogen peroxide 3% solution,31231007.0,Hydrogen peroxide,1311060I0AAABAB,4224911000000000.0,2005-08-15
1,vmp,68461003,Lubricant gels,,,,3485311000000000.0,2006-01-04
2,vmp,134460003,Irbesartan 300mg / Hydrochlorothiazide 12.5mg ...,398914000.0,Irbesartan + Hydrochlorothiazide,0205052A0AAABAB,,
3,vmp,134461004,Irbesartan 150mg / Hydrochlorothiazide 12.5mg ...,398914000.0,Irbesartan + Hydrochlorothiazide,0205052A0AAAAAA,,
4,vmp,134463001,Telmisartan 20mg tablets,129487008.0,Telmisartan,0205052Q0AAACAC,,


As we can see from the rows above, the query returns both `VMPs` and `AMPs`. However, some products have no `VTM`, no `bnf_code`, or neither. We will explore this further below.
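To put numbers on those gaps, a minimal sketch along the following lines could count the rows with no `VTM` or no `bnf_code`, split by `type`. The small frame is illustrative stand-in data loosely based on the rows shown above (the AMP row and its id are hypothetical), and `missing_summary` is a helper name introduced here, not part of the notebook. In the notebook itself the function would be applied to `dmd_df`.

```python
import pandas as pd

# Illustrative stand-in for dmd_df; the "amp" row and its id are hypothetical
sample = pd.DataFrame({
    "type": ["vmp", "vmp", "vmp", "amp"],
    "id": ["6259002", "68461003", "134463001", "hypothetical-amp"],
    "vtm": ["31231007", None, "129487008", None],
    "bnf_code": ["1311060I0AAABAB", None, "0205052Q0AAACAC", "1311060I0AAABAB"],
})

def missing_summary(df):
    """Count rows per type that lack a VTM or a BNF code."""
    flags = pd.DataFrame({
        "type": df["type"],
        "no_vtm": df["vtm"].isnull(),
        "no_bnf_code": df["bnf_code"].isnull(),
    })
    # Summing booleans counts the True values in each group
    return flags.groupby("type")[["no_vtm", "no_bnf_code"]].sum()

summary = missing_summary(sample)
print(summary)
```

Calling `missing_summary(dmd_df)` would give the same two counts for the full table.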

## Check data against 12 months of primary care prescribing data

We import prescribing data from BigQuery to check the impact of the "missing" data.

In [27]:
sql = """
SELECT
  bnf_code,
  bnf_name,
  SUM(items) AS items
FROM
  ebmdatalab.hscic.normalised_prescribing AS rx
WHERE
  month BETWEEN '2021-08-01'
  AND '2022-07-01'
GROUP BY
  bnf_name,
  bnf_code
  """

exportfile = os.path.join("..","data","rx_df.csv")
rx_df = bq.cached_read(sql, csv_path=exportfile, use_cache=True)

In [28]:
rx_df.head()

Unnamed: 0,bnf_code,bnf_name,items
0,0906040G0BRABBW,Calcichew D3 Forte chewable tablets,840864
1,0906040G0AABYBY,Colecalciferol 400unit / Calcium carbonate 1.5...,1571449
2,21010900607,BD Viva hypodermic insulin needles for pre-fil...,446581
3,0407020Q0AAABAB,Morphine sulfate 10mg/1ml solution for injecti...,215700
4,21220000215,Zerocream,373162


We can now merge the two dataframes on BNF code, using a right join so that every prescribed BNF code is kept even when it has no dm+d match.

In [29]:
test_df = pd.merge(dmd_df, rx_df, on='bnf_code', how='right')
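
One consequence of this merge is worth illustrating: a BNF code that maps to several dm+d rows is repeated once per row, with its item count copied each time. The sketch below uses toy frames (the ids are hypothetical, the item counts come from the outputs in this notebook) and adds `indicator=True`, which the original merge does not use, to flag prescribing rows with no dm+d match.

```python
import pandas as pd

# Toy dm+d table: one BNF code shared by a VMP and two AMPs (ids are hypothetical)
dmd = pd.DataFrame({
    "type": ["vmp", "amp", "amp"],
    "id": ["1", "2", "3"],
    "bnf_code": ["1311060I0AAABAB"] * 3,
})
# Toy prescribing table: one code with a dm+d match, one without
rx = pd.DataFrame({
    "bnf_code": ["1311060I0AAABAB", "190201000AABLBL"],
    "items": [263, 122793],
})

merged = pd.merge(dmd, rx, on="bnf_code", how="right", indicator=True)

# Each BNF code appears once per matching dm+d row
rows_per_code = merged.groupby("bnf_code").size()

# Summing the merged column overcounts; de-duplicate on bnf_code first
naive_total = merged["items"].sum()
true_total = merged.drop_duplicates("bnf_code")["items"].sum()

# Rows flagged right_only are prescribed codes with no dm+d match
unmatched = merged[merged["_merge"] == "right_only"]
```

Any total computed from `test_df` should therefore be taken after de-duplicating on `bnf_code`, or from `rx_df` directly.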

In [30]:
test_df.head()

Unnamed: 0,type,id,nm,vtm,vtm_nm,bnf_code,vmp_previous,vmp_previous_date,bnf_name,items
0,vmp,6259002.0,Hydrogen peroxide 3% solution,31231007.0,Hydrogen peroxide,1311060I0AAABAB,4224911000000000.0,2005-08-15,Hydrogen peroxide 3% solution,263
1,amp,4221611000000000.0,Hydrogen peroxide 3% solution (A A H Pharmaceu...,31231007.0,Hydrogen peroxide,1311060I0AAABAB,,,Hydrogen peroxide 3% solution,263
2,amp,4222111000000000.0,Hydrogen peroxide 3% solution (Thornton & Ross...,31231007.0,Hydrogen peroxide,1311060I0AAABAB,,,Hydrogen peroxide 3% solution,263
3,amp,4222611000000000.0,Hydrogen peroxide 3% solution (Alliance Health...,31231007.0,Hydrogen peroxide,1311060I0AAABAB,,,Hydrogen peroxide 3% solution,263
4,vmp,134460000.0,Irbesartan 300mg / Hydrochlorothiazide 12.5mg ...,398914000.0,Irbesartan + Hydrochlorothiazide,0205052A0AAABAB,,,Irbesartan 300mg / Hydrochlorothiazide 12.5mg ...,19101
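
The `id` values above are displayed as floats, which suggests the identifiers were parsed as floating point at some stage. Long dm+d identifiers can exceed what a 64-bit float represents exactly, so a reference file built this way could contain altered codes. The sketch below shows the effect and one possible remedy, assuming the cached file is a CSV read back with pandas defaults. The CSV text and the 17-digit identifier are made up for illustration, and how `bq.cached_read` handles dtypes is not confirmed here.

```python
import io
import pandas as pd

# Hypothetical cached CSV: one 17-digit identifier and one missing value
csv_text = "id,nm\n28770311000001109,Example product\n,No identifier\n"

as_float = pd.read_csv(io.StringIO(csv_text))
as_text = pd.read_csv(io.StringIO(csv_text), dtype={"id": "string"})

# The missing value forces float64, which cannot hold every 17-digit integer
float_dtype = str(as_float["id"].dtype)
round_trip = int(as_float.loc[0, "id"])

# Reading the column as text keeps every digit
text_id = as_text.loc[0, "id"]

print(float_dtype, round_trip, text_id)
```

If this applies to the notebook, casting the identifier columns to strings in the SQL or when reading the cache would be one way to keep the codes intact.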


We can now check which prescribed items do not have a corresponding `dm+d` code.

In [31]:
test_df_no_dmd = test_df[test_df['id'].isnull()].sort_values(by='items', ascending=False) # filter only prescribing which has a null VMP or AMP
test_df_no_dmd.head()

Unnamed: 0,type,id,nm,vtm,vtm_nm,bnf_code,vmp_previous,vmp_previous_date,bnf_name,items
108089,,,,,,190201000AABLBL,,,Exception Handler Unspecified Item,122793
108090,,,,,,190201000AABNBN,,,Exception Handler Discount Not Deducted Item,17962
108088,,,,,,0904010H0BDACAA,,,Juvela gluten free bread rolls,6623
108092,,,,,,130201000BBAHAV,,,Ultrabase cream,3912
108091,,,,,,0904010H0BYAMAA,,,Barkat gluten free wholemeal bread sliced,285


We can see from the above that very few prescribed items lack a dm+d match. The largest are the "Exception Handler" entries (such as "Unspecified Item"), which by definition cannot correspond to a dm+d product.
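
To quantify "very few", one option is to express the unmatched items as a share of all items, counting each BNF code once. The sketch below is illustrative only: the frame is a toy stand-in for `test_df` (item counts are taken from the outputs above, the AMP id is a placeholder), and `unmatched_share` is a hypothetical helper name. The share it prints for the toy data is not meaningful in itself.

```python
import pandas as pd

# Toy stand-in for test_df; "placeholder-amp" is not a real identifier
sample = pd.DataFrame({
    "id": ["6259002", "placeholder-amp", None, None],
    "bnf_code": ["1311060I0AAABAB", "1311060I0AAABAB",
                 "190201000AABLBL", "0904010H0BDACAA"],
    "items": [263, 263, 122793, 6623],
})

def unmatched_share(df):
    """Share of items whose BNF code has no dm+d row, counting each code once."""
    per_code = (
        df.assign(matched=df["id"].notnull())
          .groupby("bnf_code")
          .agg(items=("items", "first"), matched=("matched", "any"))
    )
    unmatched_items = per_code.loc[~per_code["matched"], "items"].sum()
    return unmatched_items / per_code["items"].sum()

share = unmatched_share(sample)
print(f"{share:.4f}")
```

Applied to `test_df`, the same function would give the proportion of prescribing volume that the mapping cannot cover.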

The other part of the request was to link `VTM` codes.  We can also check which drugs do not link to a `VTM`.

In [33]:
test_vtm_no_dmd  = test_df[test_df['vtm'].isnull()].sort_values(by='items', ascending=False)

In [35]:
test_vtm_no_dmd.head(30)

Unnamed: 0,type,id,nm,vtm,vtm_nm,bnf_code,vmp_previous,vmp_previous_date,bnf_name,items
64859,amp,1.742061e+16,Laxido Orange oral powder sachets sugar free (...,,,0106040M0BCACAA,,,Laxido Orange oral powder sachets sugar free,2703900
17166,amp,2.877031e+16,Macrogol compound oral powder sachets sugar fr...,,,0106040M0AAAAAA,,,Macrogol compound oral powder sachets NPF suga...,1696902
17167,amp,3.355801e+16,Macrogol compound oral powder sachets sugar fr...,,,0106040M0AAAAAA,,,Macrogol compound oral powder sachets NPF suga...,1696902
17165,amp,2.840711e+16,Macrogol compound oral powder sachets sugar fr...,,,0106040M0AAAAAA,,,Macrogol compound oral powder sachets NPF suga...,1696902
17156,vmp,3591211000000000.0,Macrogol compound oral powder sachets NPF suga...,,,0106040M0AAAAAA,,,Macrogol compound oral powder sachets NPF suga...,1696902
17157,amp,1.550391e+16,Macrogol compound oral powder sachets sugar fr...,,,0106040M0AAAAAA,,,Macrogol compound oral powder sachets NPF suga...,1696902
17158,amp,1.644631e+16,Macrogol compound oral powder sachets sugar fr...,,,0106040M0AAAAAA,,,Macrogol compound oral powder sachets NPF suga...,1696902
17159,amp,1.928291e+16,Macrogol compound oral powder sachets sugar fr...,,,0106040M0AAAAAA,,,Macrogol compound oral powder sachets NPF suga...,1696902
17160,amp,2.179041e+16,Macrogol compound oral powder sachets sugar fr...,,,0106040M0AAAAAA,,,Macrogol compound oral powder sachets NPF suga...,1696902
17161,amp,2.239391e+16,Macrogol compound oral powder sachets sugar fr...,,,0106040M0AAAAAA,,,Macrogol compound oral powder sachets NPF suga...,1696902


In [40]:
# After the merge, each BNF code's item count is repeated on every matching dm+d row,
# so take the mean per BNF name rather than the sum to avoid overcounting
group_vtm_no_dmd = test_vtm_no_dmd.groupby(['bnf_name'])[['items']].mean().sort_values(by='items', ascending=False)

In [42]:
group_vtm_no_dmd.head(30)

bnf_name,items
Laxido Orange oral powder sachets sugar free,2703900
Macrogol compound oral powder sachets NPF sugar free,1696902
Dermol 500 lotion,1678850
Otomize ear spray,1306758
FreeStyle Libre 2 Sensor,1210030
Vitamin B compound strong tablets,1199121
GlucoRx Nexus testing strips,1100323
Epimax original cream,1055408
WaveSense JAZZ testing strips,678138
Medi Derma-S barrier cream,624223


The most heavily prescribed items with a `NULL` `VTM` fall into two groups: (a) products that are not drugs but appliances or devices (such as FreeStyle Libre), and (b) drugs with more than 3 ingredients (such as Laxido), where dm+d does not assign a VTM to the formulation. It therefore appears that the table accurately reflects the content of dm+d.
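
A rough way to separate the two groups is to look at the shape of the BNF code. The sketch below rests on an assumption that is not stated in this notebook: that appliance and device codes are 11 characters long and sit in BNF chapters 20 to 23, which is consistent with codes such as `21010900607` in the prescribing output above. `looks_like_appliance` is a hypothetical helper, and the two rows are illustrative.

```python
import pandas as pd

APPLIANCE_CHAPTERS = {"20", "21", "22", "23"}  # assumption, see the lead-in

def looks_like_appliance(bnf_code):
    """Heuristic: 11-character code in chapters 20 to 23."""
    return len(bnf_code) == 11 and bnf_code[:2] in APPLIANCE_CHAPTERS

# Illustrative rows using names and codes that appear earlier in the notebook
sample = pd.DataFrame({
    "bnf_name": ["Laxido Orange oral powder sachets sugar free",
                 "BD Viva hypodermic insulin needles"],
    "bnf_code": ["0106040M0BCACAA", "21010900607"],
})

sample["category"] = sample["bnf_code"].map(
    lambda code: "appliance or device" if looks_like_appliance(code) else "drug"
)
print(sample[["bnf_name", "category"]])
```

Grouping `test_vtm_no_dmd` by such a category and summing de-duplicated items would show how much of the `NULL` `VTM` volume each group accounts for.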