# Usage - python-pandas-datareader-NHSDigital

In [1]:
#!pip3 install --upgrade --force-reinstall --no-deps python-pandas-datareader-NHSDigital/ 

import pd_datareader_nhs.nhs_digital_ods as ods

## Find Datasets From NHS Digital Organisational Data Service

Running any command from the `python-pandas-datareader-NHSDigital` package checks whether a "directory" of datasets published by the NHS Organisational Data Service (ODS) is available locally. This information is scraped from the NHS Digital website, cached locally, and may also be persisted in a local SQLite3 database. For this reason, the first command run may take some time if no persistent database is available, because the NHS Digital data has to be downloaded.
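The check-then-fetch behaviour described above can be sketched with the standard library alone. This is a minimal illustration, not the package's implementation: the `directory` table, its columns, and the `load_directory` and `fake_fetch` names are all assumptions made for the example, and the scraping step is replaced by a stand-in function.

```python
import sqlite3

def load_directory(conn, fetch_directory):
    """Return the dataset directory, fetching it only if none is stored locally.

    `fetch_directory` is a hypothetical callable standing in for the slow
    scrape of the NHS Digital website.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS directory "
        "(Dataset TEXT PRIMARY KEY, Date TEXT, Label TEXT)"
    )
    rows = conn.execute("SELECT Dataset, Date, Label FROM directory").fetchall()
    if rows:
        return rows  # fast path: a local copy already exists
    rows = fetch_directory()  # slow path: only taken on the first call
    conn.executemany("INSERT INTO directory VALUES (?, ?, ?)", rows)
    conn.commit()
    return rows

calls = []

def fake_fetch():
    # Stand-in for the real download; records how often it is invoked.
    calls.append(1)
    return [("epraccur", "26 May 2017", "GP Practices")]

conn = sqlite3.connect(":memory:")
first = load_directory(conn, fake_fetch)   # triggers the "download"
second = load_directory(conn, fake_fetch)  # served from the local table
```

With a file-backed database in place of `:memory:`, the slow path would be skipped in later sessions as well, which is why only the first command tends to be slow.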

In [2]:
#Find all published datasets (filtered through a whitelist defined in the package)
ss=ods.search(string='')
ss

Unnamed: 0,Dataset,Date,Label,Period,Type,URL
0,epraccur,26 May 2017,GP Practices,Quarterly,gp-data,https://digital.nhs.uk/media/372/epraccur/zip/...
1,egpcur,26 May 2017,GP Practitioners,Quarterly,gp-data,https://digital.nhs.uk/media/370/egpcur/zip/eg...
2,epracmem,26 May 2017,GPs by GP Practices,Quarterly,gp-data,https://digital.nhs.uk/media/379/epracmem/zip/...
3,epcmem,26 May 2017,GP Practices linked to CCG/LHG,Quarterly,gp-data,https://digital.nhs.uk/media/378/epcmem/zip/ep...
4,epracarc,26 May 2017,Archived GP Practices,Quarterly,gp-data,https://digital.nhs.uk/media/376/epracarc/zip/...
5,egparc,26 May 2017,Archived GP Practitioners,Quarterly,gp-data,https://digital.nhs.uk/media/374/egparc/zip/eg...
6,ebranchs,26 May 2017,Branch Surgeries,Quarterly,gp-data,https://digital.nhs.uk/media/393/ebranchs/zip/...
7,epharmacyhq,26 May 2017,Pharmacy Headquarters Organisations,Quarterly,gp-data,https://digital.nhs.uk/media/391/epharmacyhq/z...
8,edispensary,26 May 2017,Dispensaries,Quarterly,gp-data,https://digital.nhs.uk/media/390/edispensary/z...
9,enurse,26 May 2017,Nurse Presribers in England,Quarterly,gp-data,https://digital.nhs.uk/media/388/enurse/zip/en...


In [3]:
#View whitelist of datasets that have dataframe mappings defined
ods.dataset_codes

['epraccur',
 'etrust',
 'eccg',
 'eccgsite',
 'epcmem',
 'epracmem',
 'egdpprac',
 'egpcur',
 'egparc',
 'epracarc',
 'ehospice',
 'epharmacyhq',
 'edispensary',
 'enurse',
 'epcdp',
 'eabeydispgp',
 'ecarehomehq',
 'ecarehomesite',
 'ecarehomesucc',
 'ephp',
 'ephpsite',
 'enonnhs',
 'eprison',
 'eschools',
 'ejustice',
 'ecare']

In [4]:
#The mappings are provided as JSON data that define things such as column headings and code mappings
ods.jdata['egpcur']

{'codes': {'Organisation Sub-Type Code': {'O': 'Other GP in practice (not Principal/Senior GP)',
   'P': 'Principal/Senior GP at practice'},
  'Status Code': {'A': 'Active',
   'B': 'Retired',
   'C': 'Closed',
   'P': 'Proposed'}},
 'cols': ['Organisation Code',
  'Name',
  'National Grouping',
  'High Level Health Geography',
  'Address Line 1',
  'Address Line 2',
  'Address Line 3',
  'Address Line 4',
  'Address Line 5',
  'Postcode',
  'Open Date',
  'Close Date',
  'Status Code',
  'Organisation Sub-Type Code',
  'Parent Organisation Code',
  'Join Parent Date',
  'Left Parent Date',
  'Contact Telephone Number',
  'Null',
  'Null',
  'Null',
  'Amended Record Indicator',
  'Null',
  'Current Care Organisation',
  'Null',
  'Null',
  'Null'],
 'dates': 'auto',
 'index': 'auto'}
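To make the role of the `codes` entry concrete, the sketch below shows one way such a mapping could be used to add a decoded `... Value` field alongside each coded field, similar to the `Status Code Value` column that appears in downloaded dataframes. It works on plain dictionaries rather than dataframes, the `add_code_values` helper is hypothetical, and the sample row is made up for illustration.

```python
def add_code_values(rows, codes):
    """Add a '<column> Value' entry to each row for every coded column.

    `rows` is a list of dicts; `codes` has the same shape as the 'codes'
    entry in the mapping shown above.
    """
    decoded = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows are left untouched
        for column, lookup in codes.items():
            if column in row:
                # Codes missing from the lookup decode to None rather than a guess.
                new_row[column + " Value"] = lookup.get(row[column])
        decoded.append(new_row)
    return decoded

codes = {
    "Status Code": {"A": "Active", "B": "Retired", "C": "Closed", "P": "Proposed"},
}
# Made-up sample row, not taken from a real ODS file.
rows = [{"Organisation Code": "G0000001", "Status Code": "A"}]
decoded = add_code_values(rows, codes)
```

Keeping the original code column and adding a separate value column means the raw data stays available for joins against other ODS datasets.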

In [5]:
#Search by ODS dataset label - searches are partial matches
ods.search(string='Prison', field='Label')

Unnamed: 0,Dataset,Date,Label,Period,Type,URL
45,eprison,26 May 2017,Prisons in England and Wales,Quarterly,non-nhs,https://digital.nhs.uk/media/401/eprison/zip/e...


In [6]:
#Case sensitive search of datasets by field
len(ods.search('gp-data', field='Type')), len(ods.search(string='gp', field='Label', case=True)) 

(18, 0)
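The second count is zero because the labels use upper-case "GP", so a case-sensitive search for lower-case "gp" finds nothing. The sketch below illustrates that behaviour with a simple substring filter. It is an assumption about how such a search might work, not the package's code: `search_directory` is a hypothetical function, the default of ignoring case is assumed, and the two-entry directory is a cut-down sample of the listing shown earlier.

```python
def search_directory(directory, string="", field="Label", case=False):
    """Return entries whose `field` contains `string` as a substring.

    With case=False (assumed default) the comparison ignores case.
    """
    needle = string if case else string.lower()
    matches = []
    for entry in directory:
        haystack = entry[field] if case else entry[field].lower()
        if needle in haystack:
            matches.append(entry)
    return matches

directory = [
    {"Dataset": "epraccur", "Label": "GP Practices", "Type": "gp-data"},
    {"Dataset": "eprison", "Label": "Prisons in England and Wales", "Type": "non-nhs"},
]

insensitive = search_directory(directory, "gp", field="Label")
sensitive = search_directory(directory, "gp", field="Label", case=True)
everything = search_directory(directory, "")
```

An empty search string is a substring of every label, which is consistent with `ods.search(string='')` listing all datasets.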

In [7]:
#Search datasets by field
ods.search(string='health-authorities', field='Type')

Unnamed: 0,Dataset,Date,Label,Period,Type,URL
29,eauth,26 May 2017,NHS England Commissioning and Government Offic...,Quarterly,health-authorities,https://digital.nhs.uk/media/332/eauth/zip/eauth
30,espha,26 May 2017,Special Health Authorities,Quarterly,health-authorities,https://digital.nhs.uk/media/343/espha/zip/espha
31,ecsu,26 May 2017,Commissioning Support Units,Quarterly,health-authorities,https://digital.nhs.uk/media/342/ecsu/zip/ecsu
32,ecsusite,26 May 2017,Commissioning Support Units sites,Quarterly,health-authorities,https://digital.nhs.uk/media/341/ecsusite/zip/...
33,eother,26 May 2017,Executive Agency Programme,Quarterly,health-authorities,https://digital.nhs.uk/media/340/eother/zip/eo...
34,ensa,26 May 2017,NHS Support Agencies and Shared Services,Quarterly,health-authorities,https://digital.nhs.uk/media/339/ensa/zip/ensa


## Download Datasets

*Available datasets* are ones that have been downloaded and cached locally.

In [8]:
ods.availableDatasets()

In [9]:
epraccur_df = ods.download('epraccur')
epraccur_df.head()

Unnamed: 0,Organisation Code,Name,National Grouping,High Level Health Geography,Address Line 1,Address Line 2,Address Line 3,Address Line 4,Address Line 5,Postcode,...,Commissioner,Join Provider/Purchaser Date,Left Provider/Purchaser Date,Contact Telephone Number,Amended Record Indicator,Provider/Purchaser,Prescribing Setting,Status Code Value,Organisation Sub-Type Code Value,Prescribing Setting Value
0,A81001,THE DENSHAM SURGERY,Y54,Q74,THE HEALTH CENTRE,LAWSON STREET,STOCKTON-ON-TEES,CLEVELAND,,TS18 1HU,...,00K,2013-04-01,NaT,01642 672351,0,00K,4,Active,Allocated to a Provider/Purchaser Organisation,GP Practice
1,A81002,QUEENS PARK MEDICAL CENTRE,Y54,Q74,QUEENS PARK MEDICAL CTR,FARRER STREET,STOCKTON ON TEES,CLEVELAND,,TS18 2AW,...,00K,2013-04-01,NaT,01642 679681,0,00K,4,Active,Allocated to a Provider/Purchaser Organisation,GP Practice
2,A81003,VICTORIA MEDICAL PRACTICE,Y54,Q74,THE HEALTH CENTRE,VICTORIA ROAD,HARTLEPOOL,CLEVELAND,,TS26 8DB,...,00K,2013-04-01,NaT,01429 272945,0,00K,4,Dormant,Allocated to a Provider/Purchaser Organisation,GP Practice
3,A81004,WOODLANDS ROAD SURGERY,Y54,Q74,6 WOODLANDS ROAD,,MIDDLESBROUGH,CLEVELAND,,TS1 3BE,...,00M,2013-04-01,NaT,01642 247982,0,00M,4,Active,Allocated to a Provider/Purchaser Organisation,GP Practice
4,A81005,SPRINGWOOD SURGERY,Y54,Q74,SPRINGWOOD SURGERY,RECTORY LANE,GUISBOROUGH,,,TS14 7DJ,...,00M,2013-04-01,NaT,01287 619611,0,00M,4,Active,Allocated to a Provider/Purchaser Organisation,GP Practice


In [10]:
#We can check to see what datasets have been downloaded and persisted in SQLite storage
ods.availableDatasets()

In [11]:
#Or check to see what files are available in the local, in-memory cache
ods.availableDatasets(typ='cached')

Unnamed: 0,Dataset,Date,Label,Period,Type,URL
0,epraccur,26 May 2017,GP Practices,Quarterly,gp-data,https://digital.nhs.uk/media/372/epraccur/zip/...


In [12]:
#We can download all the datasets in an ODS group
dd=ods.download(datatype='other-nhs')
ods.availableDatasets(typ='cached')

Unnamed: 0,Dataset,Date,Label,Period,Type,URL
0,epraccur,26 May 2017,GP Practices,Quarterly,gp-data,https://digital.nhs.uk/media/372/epraccur/zip/...
18,eccg,26 May 2017,Clinical Commissioning Groups,Quarterly,other-nhs,https://digital.nhs.uk/media/354/eccg/zip/eccg
19,eccgsite,26 May 2017,Clinical Commissioning Group sites,Quarterly,other-nhs,https://digital.nhs.uk/media/353/eccgsite/zip/...
22,etrust,26 May 2017,NHS Trusts and sites,Quarterly,other-nhs,https://digital.nhs.uk/media/350/etrust/zip/et...
25,ecare,26 May 2017,Care Trusts and sites,Quarterly,other-nhs,https://digital.nhs.uk/media/347/ecare/zip/ecare


In [13]:
#We can also download a list of named datasets
dd=ods.download(['eprison','eschools'])
ods.availableDatasets(typ='cached')

Unnamed: 0,Dataset,Date,Label,Period,Type,URL
0,epraccur,26 May 2017,GP Practices,Quarterly,gp-data,https://digital.nhs.uk/media/372/epraccur/zip/...
18,eccg,26 May 2017,Clinical Commissioning Groups,Quarterly,other-nhs,https://digital.nhs.uk/media/354/eccg/zip/eccg
19,eccgsite,26 May 2017,Clinical Commissioning Group sites,Quarterly,other-nhs,https://digital.nhs.uk/media/353/eccgsite/zip/...
22,etrust,26 May 2017,NHS Trusts and sites,Quarterly,other-nhs,https://digital.nhs.uk/media/350/etrust/zip/et...
25,ecare,26 May 2017,Care Trusts and sites,Quarterly,other-nhs,https://digital.nhs.uk/media/347/ecare/zip/ecare
41,eschools,26 May 2017,Schools in England,Quarterly,non-nhs,https://digital.nhs.uk/media/406/eschools/zip/...
45,eprison,26 May 2017,Prisons in England and Wales,Quarterly,non-nhs,https://digital.nhs.uk/media/401/eprison/zip/e...


In [14]:
#If we try to download files for which there is no mapping, we get a warning
dd=ods.download(['eother','eauth'])
ods.availableDatasets(typ='cached')



Unnamed: 0,Dataset,Date,Label,Period,Type,URL
0,epraccur,26 May 2017,GP Practices,Quarterly,gp-data,https://digital.nhs.uk/media/372/epraccur/zip/...
18,eccg,26 May 2017,Clinical Commissioning Groups,Quarterly,other-nhs,https://digital.nhs.uk/media/354/eccg/zip/eccg
19,eccgsite,26 May 2017,Clinical Commissioning Group sites,Quarterly,other-nhs,https://digital.nhs.uk/media/353/eccgsite/zip/...
22,etrust,26 May 2017,NHS Trusts and sites,Quarterly,other-nhs,https://digital.nhs.uk/media/350/etrust/zip/et...
25,ecare,26 May 2017,Care Trusts and sites,Quarterly,other-nhs,https://digital.nhs.uk/media/347/ecare/zip/ecare
41,eschools,26 May 2017,Schools in England,Quarterly,non-nhs,https://digital.nhs.uk/media/406/eschools/zip/...
45,eprison,26 May 2017,Prisons in England and Wales,Quarterly,non-nhs,https://digital.nhs.uk/media/401/eprison/zip/e...


In [15]:
#Display a downloaded dataset - we will use cached data if available
ods.download('eprison').head()

Unnamed: 0,Organisation Code,Name,Address Line 1,Address Line 2,Address Line 3,Address Line 4,Address Line 5,Postcode,Open Date,Close Date,Contact Telephone Number,Amended Record Indicator
0,YDE01,HMP NORTHUMBERLAND,ACKLINGTON,,,MORPETH,NORTHUMBERLAND,NE65 9XG,2003-04-01,NaT,,0
1,YDE02,HMP ISLE OF WIGHT,CLISSOLD ROAD,,,NEWPORT,ISLE OF WIGHT,PO30 5RS,2003-04-01,NaT,,0
2,YDE03,HMP ALTCOURSE,HIGHER LANE,FAZAKERLEY,,LIVERPOOL,MERSEYSIDE,L9 7LH,2003-04-01,NaT,,0
3,YDE05,HMP BEDFORD,ST. LOYES STREET,,,BEDFORD,BEDFORDSHIRE,MK40 1HG,2003-04-01,NaT,,0
4,YDE06,HMP BELMARSH,1 BELMARSH ROAD,WESTERN WAY,THAMESMEAD,LONDON,GREATER LONDON,SE28 0EB,2003-04-01,NaT,,0


In [16]:
#If we download multiple datasets in one go, we get a dict of dataframes
dd=ods.download(['eprison','eschools'])
dd['eschools'].head()

Unnamed: 0,Organisation Code,Name,National Grouping,High Level Health Geography,Address Line 1,Address Line 2,Address Line 3,Address Line 4,Address Line 5,Postcode,Open Date,Close Date,Local Authority,Contact Telephone Number,Amended Record Indicator,Current Care Organisation,Type of Establishment,Type of Establishment Value
0,EE100000,SIR JOHN CASS'S FOUNDATION PRIMARY SCHOOL,Y56,Q71,ST JAMES'S PASSAGE,DUKE'S PLACE,,LONDON,,EC3A 5DE,1900-01-01,NaT,714,2072831000.0,0,07T,2,VOLUNTARY AIDED SCHOOL
1,EE100001,CITY OF LONDON SCHOOL FOR GIRLS,Y56,Q71,ST GILES' TERRACE,BARBICAN,,LONDON,,EC2Y 8BB,1920-01-01,NaT,714,2078476000.0,0,07T,11,OTHER INDEPENDENT SCHOOL
2,EE100002,ST PAUL'S CATHEDRAL SCHOOL,Y56,Q71,2 NEW CHANGE,,,LONDON,,EC4M 9AD,1939-01-01,NaT,714,2072485000.0,0,07T,11,OTHER INDEPENDENT SCHOOL
3,EE100003,CITY OF LONDON SCHOOL,Y56,Q71,QUEEN VICTORIA STREET,,,LONDON,,EC4V 3AL,1919-01-01,NaT,714,2074890000.0,0,07T,11,OTHER INDEPENDENT SCHOOL
4,EE100005,THOMAS CORAM CENTRE,Y56,Q71,49 MECKLENBURGH SQUARE,,,LONDON,,WC1N 2NY,1900-01-01,NaT,702,2075200000.0,0,07R,15,LA NURSERY SCHOOL


## Persistent Database

We can initialise the package to use a persistent, named database.

If there is any cached data, it is added to the database. If there is no cached data and no network connection, but a database is specified, the database is used.

If there is a network connection and the date shown for a dataset in the directory listing differs from the date of the copy persisted in the database, the database copy is replaced with a freshly downloaded version when that dataset is next "downloaded".
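The date comparison can be sketched as a single query against a table of dataset dates. This is an illustration under stated assumptions: the `dataset_date` table name matches the message printed when a new database is initialised, but its `Dataset` and `Date` columns are inferred from the `availableDatasets()` output, and `needs_refresh` is a hypothetical helper rather than part of the package.

```python
import sqlite3

def needs_refresh(conn, dataset, published_date):
    """Return True if `dataset` is not stored, or is stored with a different date."""
    row = conn.execute(
        "SELECT Date FROM dataset_date WHERE Dataset = ?", (dataset,)
    ).fetchone()
    # Dates are compared as the strings shown in the directory listing.
    return row is None or row[0] != published_date

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dataset_date (Dataset TEXT PRIMARY KEY, Date TEXT)")
conn.execute("INSERT INTO dataset_date VALUES (?, ?)", ("egdpprac", "25 Nov 2016"))

same_date = needs_refresh(conn, "egdpprac", "25 Nov 2016")   # stored copy is current
newer_date = needs_refresh(conn, "egdpprac", "26 May 2017")  # listing shows another date
not_stored = needs_refresh(conn, "eprison", "26 May 2017")   # nothing persisted yet
```

Testing for a different date rather than a later one avoids parsing the date strings, at the cost of also refreshing if a listing date were ever rolled back.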

In [17]:
!rm nhs_ods_test1.sqlite
ods.init(sqlite3db='nhs_ods_test1.sqlite')

rm: nhs_ods_test1.sqlite: No such file or directory
Setting up a new dataset_date table...




In [18]:
ods.availableDatasets()

Unnamed: 0,Dataset,Date
0,epraccur,26 May 2017
1,etrust,26 May 2017
2,eccg,26 May 2017
3,eccgsite,26 May 2017
4,eprison,26 May 2017
5,eschools,26 May 2017
6,ecare,26 May 2017


In [20]:
#We can also update the database to grab all the whitelisted datasets
ods.updatedb()
ods.availableDatasets()



Unnamed: 0,Dataset,Date
0,epraccur,26 May 2017
1,etrust,26 May 2017
2,eccg,26 May 2017
3,eccgsite,26 May 2017
4,epcmem,26 May 2017
5,epracmem,26 May 2017
6,egdpprac,25 Nov 2016
7,egpcur,26 May 2017
8,egparc,26 May 2017
9,epracarc,26 May 2017
