# NHS and GP Administrative Data

This notebook shows how to import administrative data for GP practices using pandas and then store it in a SQLite database. *(SQLite is a simple file-based SQL database that "just works".)*

Data file download URLs were identified via:

- GP Practices: http://systems.digital.nhs.uk/data/ods/datadownloads/gppractice
- High Level Admin: http://systems.digital.nhs.uk/data/ods/datadownloads/haandsa
- Trusts: http://systems.digital.nhs.uk/data/ods/datadownloads/othernhs

In [1]:
#!pip install pandas
#pandas is a python library for working with tabular datasets
#It can be used to import data from CSV files and Excel spreadsheets
import pandas as pd

In [2]:
#SQLite is a file based SQL database included in the Python distribution
import sqlite3
#If you want to build the database from scratch, delete any existing copy first
#(the -f flag stops rm complaining if the file does not exist yet)
!rm -f nhsadmin.sqlite

In [3]:
#Create a connection to the database
con = sqlite3.connect("nhsadmin.sqlite")

In [4]:
#This function helps download and unpack data files
def downloader(typ,url=None):
    ''' Download and unzip data file '''
    !rm -f downloads/{typ}.zip
    if url is None:
        url='http://systems.digital.nhs.uk/data/ods/datadownloads/data-files/{typ}.zip'.format(typ=typ)
    #Download the data from the HSCIC website
    !wget -P downloads/ {url}
    !rm -rf data/{typ}/
    #Unzip the downloaded files into a subdirectory of the data folder, making sure the data dir exists first
    !mkdir -p data
    #The -o flag is overkill - if we hadn't deleted the original folder it would overwrite any similar files
    !unzip -o -d data/{typ} downloads/{typ}.zip

In [98]:
def getData(typ,dates=False,encoding=None):
    ''' Read CSV file in from downloaded and unzipped file '''
    downloader(typ)
    df = pd.read_csv('data/{typ}/{typ}.csv'.format(typ=typ),header=None,parse_dates=dates,encoding=encoding)
    return df
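
The data files have no header row, so `getData` reads them with `header=None` and identifies date columns by position. The following is a minimal sketch of that idea using a small in-memory extract rather than a downloaded file; the rows, the `sample_csv` and `sample` names, and the ISO date format are assumptions made for illustration.

```python
import io
import pandas as pd

# Hypothetical three-row extract in a headerless, six-column layout.
# The ISO date format used here is an assumption made for this sketch.
sample_csv = io.StringIO(
    "A81001,4QP36,W,1999-04-01,2001-03-31,0\n"
    "A81001,5E1,W,2001-04-01,2013-03-31,0\n"
    "A81001,00K,W,2013-04-01,,0\n"
)

# header=None gives integer column labels; parse_dates takes column positions
sample = pd.read_csv(sample_csv, header=None, parse_dates=[3, 4])

# Empty date fields are read as NaT (the datetime equivalent of NaN)
print(sample.dtypes)
print(sample)
```

Passing `dates=False` (the default in `getData`) skips date parsing altogether, which is why each call lists the date column positions explicitly.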

In [122]:
def normaliser(typ,cols,dates=False,index=None,encoding=None,db_con=None):
    ''' Download, read and process data file, adding it to a SQLite database '''
    df=getData(typ,dates=dates,encoding=encoding)
    df.columns=cols
    if 'Null' in df.columns: df.drop('Null', axis=1, inplace=True)
    if index is not None: df=df.set_index(index)
    if db_con is not None: df.to_sql(con=db_con, name=typ,if_exists='replace')
    return df
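
The steps inside `normaliser` can be tried out without downloading anything. This sketch, under the assumption of a tiny made-up frame (`raw`) and an in-memory database (`demo_con`), shows that naming several padding columns `Null` lets a single `drop` call remove them all before the table is written to SQLite. The table name `demo` and the example rows are hypothetical.

```python
import sqlite3
import pandas as pd

# Hypothetical headerless frame standing in for a freshly read data file
raw = pd.DataFrame([
    ["X00001", "EXAMPLE SURGERY", None, "A", None],
    ["X00002", "SAMPLE PRACTICE", None, "C", None],
])

# Padding columns all share the placeholder name 'Null'
raw.columns = ["Organisation Code", "Name", "Null", "Status Code", "Null"]

# A single drop call removes every column carrying that label
tidy = raw.drop("Null", axis=1).set_index("Organisation Code")

# An in-memory database keeps the sketch free of side effects
demo_con = sqlite3.connect(":memory:")
tidy.to_sql(name="demo", con=demo_con, if_exists="replace")

# Read the table back to confirm what was stored
stored = pd.read_sql_query("SELECT * FROM demo", demo_con)
print(stored)
```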

## epraccur - Current Medical Practices and Prescribing Cost Centres

In [None]:
#via http://systems.digital.nhs.uk/data/ods/datadownloads/gppractice
#epraccur is administrative info about GP practices - practice codes, address, etc etc

EPRACCUR='epraccur'
epraccur= getData(EPRACCUR,dates=[10,11,15,16])

In [7]:
epraccur.columns

Int64Index([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
            17, 18, 19, 20, 21, 22, 23, 24, 25, 26],
           dtype='int64')

In [8]:
#Update the column names
#Really, we should do this by loading in the Excel version of the file
#and then extracting the metadata from the spreadsheet to identify the column names
#The following information is extracted from the metadata PDF
cols=['Organisation Code','Name','National Grouping','High Level Health Geography',
      'Address Line 1','Address Line 2','Address Line 3','Address Line 4','Address Line 5','Postcode',
      'Open Date','Close Date','Status Code','Organisation Sub-Type code',
      'Commissioner','Join Provider/Purchaser Date','Left Provider/Purchaser Date','Contact Telephone Number',
      'Null','Null','Null',
      'Amended Record Indicator','Null',
      'Provider/Purchaser','Null','Prescribing Setting','Null']

In [9]:
#Set the column names
epraccur.columns=cols
#Drop the "Available for future use" columns
epraccur.drop('Null', axis=1, inplace=True)
#preview the data
epraccur.head(3)

Unnamed: 0,Organisation Code,Name,National Grouping,High Level Health Geography,Address Line 1,Address Line 2,Address Line 3,Address Line 4,Address Line 5,Postcode,...,Close Date,Status Code,Organisation Sub-Type code,Commissioner,Join Provider/Purchaser Date,Left Provider/Purchaser Date,Contact Telephone Number,Amended Record Indicator,Provider/Purchaser,Prescribing Setting
0,A81001,THE DENSHAM SURGERY,Y54,Q74,THE HEALTH CENTRE,LAWSON STREET,STOCKTON-ON-TEES,CLEVELAND,,TS18 1HU,...,NaT,A,B,00K,2013-04-01,NaT,01642 672351,0,00K,4
1,A81002,QUEENS PARK MEDICAL CENTRE,Y54,Q74,QUEENS PARK MEDICAL CTR,FARRER STREET,STOCKTON ON TEES,CLEVELAND,,TS18 2AW,...,NaT,A,B,00K,2013-04-01,NaT,01642 679681,0,00K,4
2,A81003,VICTORIA MEDICAL PRACTICE,Y54,Q74,THE HEALTH CENTRE,VICTORIA ROAD,HARTLEPOOL,CLEVELAND,,TS26 8DB,...,NaT,A,B,00K,2013-04-01,NaT,01429 272945,0,00K,4


In [10]:
#Example showing how to filter on the Commissioner code (the practice's parent organisation)
epraccur[epraccur['Commissioner']=='10L'].head(3)

Unnamed: 0,Organisation Code,Name,National Grouping,High Level Health Geography,Address Line 1,Address Line 2,Address Line 3,Address Line 4,Address Line 5,Postcode,...,Close Date,Status Code,Organisation Sub-Type code,Commissioner,Join Provider/Purchaser Date,Left Provider/Purchaser Date,Contact Telephone Number,Amended Record Indicator,Provider/Purchaser,Prescribing Setting
5255,J84003,VENTNOR MEDICAL CENTRE,Y57,Q70,VENTNOR MEDICAL CENTRE,3 ALBERT STREET,VENTNOR,ISLE OF WIGHT,,PO38 1EZ,...,NaT,A,B,10L,2013-04-01,NaT,01983 857288,0,10L,4
5256,J84004,EAST COWES MEDICAL CENTRE,Y57,Q70,EAST COWES MEDICAL CENTRE,CHURCH PATH,EAST COWES,ISLE OF WIGHT,,PO32 6RP,...,NaT,A,B,10L,2013-04-01,NaT,01983 284333,0,10L,4
5257,J84005,ESPLANADE SURGERY,Y57,Q70,THE ESPLANADE SURGERY,19 THE ESPLANADE,RYDE,ISLE OF WIGHT,,PO33 2EH,...,NaT,A,B,10L,2013-04-01,NaT,01983 618388,0,10L,4


### Storing the Data in a SQLite3 Database
If we store several administrative files in the same database, we can run linked queries over them using SQL.

In [11]:
tmp=epraccur.set_index(['Organisation Code'])
#If the table exists, replace it, under the assumption we are using a more recent version of the data
tmp.to_sql(con=con, name=EPRACCUR,if_exists='replace')


In [12]:
#We can now run a SQL query over the data
orgcode='J84007'
pd.read_sql_query('SELECT * FROM {typ} WHERE "Organisation Code"="{orgcode}"'.format(typ=EPRACCUR,orgcode=orgcode), con)

Unnamed: 0,Organisation Code,Name,National Grouping,High Level Health Geography,Address Line 1,Address Line 2,Address Line 3,Address Line 4,Address Line 5,Postcode,...,Close Date,Status Code,Organisation Sub-Type code,Commissioner,Join Provider/Purchaser Date,Left Provider/Purchaser Date,Contact Telephone Number,Amended Record Indicator,Provider/Purchaser,Prescribing Setting
0,J84007,ST.HELENS MEDICAL CENTRE,Y57,Q70,ST.HELENS MEDICAL CENTRE,UPPER GREEN ROAD,ST.HELENS,ISLE OF WIGHT,,PO33 1UG,...,,A,B,10L,2013-04-01 00:00:00,,01983 871828,0,10L,4
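
The query above builds the SQL string with `.format()` and relies on SQLite treating a double-quoted value as a string literal. An alternative, sketched here on a made-up `practices` table in an in-memory database, is to keep double quotes for column names only and pass the value through a `?` placeholder via the `params` argument of `pd.read_sql_query`. The table, rows and variable names are assumptions for illustration.

```python
import sqlite3
import pandas as pd

# Hypothetical table in an in-memory database
demo_con = sqlite3.connect(":memory:")
pd.DataFrame({
    "Organisation Code": ["X00001", "X00002"],
    "Name": ["EXAMPLE SURGERY", "SAMPLE PRACTICE"],
}).to_sql(name="practices", con=demo_con, index=False)

# Double quotes mark the column identifier; the ? placeholder carries the value
query = 'SELECT * FROM practices WHERE "Organisation Code" = ?'
match = pd.read_sql_query(query, demo_con, params=("X00002",))
print(match)
```

Placeholders only work for values, so a table name held in a variable such as `EPRACCUR` would still need to be formatted into the query string.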


## etrust - NHS Trusts and Trust Sites

In [13]:
ETRUST='etrust'

#via http://systems.digital.nhs.uk/data/ods/datadownloads/othernhs
etrust= getData(ETRUST,dates=[10,11])
etrust.head(2)

--2016-09-27 01:30:18--  http://systems.digital.nhs.uk/data/ods/datadownloads/data-files/etrust.zip
Resolving systems.digital.nhs.uk... 194.189.27.101
Connecting to systems.digital.nhs.uk|194.189.27.101|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 717137 (700K) [application/zip]
Saving to: 'downloads/etrust.zip'


2016-09-27 01:30:19 (1.84 MB/s) - 'downloads/etrust.zip' saved [717137/717137]

Archive:  downloads/etrust.zip
  inflating: data/etrust/etrust.csv  
  inflating: data/etrust/etrust.pdf  


Unnamed: 0,0,1,2,3,4,5,6,7,8,9,...,17,18,19,20,21,22,23,24,25,26
0,R1A,WORCESTERSHIRE HEALTH AND CARE NHS TRUST,Y55,Q77,ISAAC MADDOX HOUSE,SHRUB HILL INDUSTRIAL ESTATE,,WORCESTER,WORCESTERSHIRE,WR4 9RW,...,,,,,0,,F,,,
1,R1A01,PATHWAYS SUPPORT SERVICES,Y55,Q77,30A TENBY STREET,,,BIRMINGHAM,WEST MIDLANDS,B1 3EE,...,,,,,0,,F,,,


In [14]:
cols=['Organisation Code','Name','National Grouping','High Level Health Geography',
      'Address Line 1','Address Line 2','Address Line 3','Address Line 4','Address Line 5','Postcode',
      'Open Date','Close Date','Null','Null',
      'Null','Null','Null','Contact Telephone Number',
      'Null','Null','Null',
      'Amended Record Indicator','Null',
      'GOR Code','Null','Null','Null']

In [15]:
etrust.columns=cols
etrust.drop('Null', axis=1, inplace=True)

etrust.head(2)

Unnamed: 0,Organisation Code,Name,National Grouping,High Level Health Geography,Address Line 1,Address Line 2,Address Line 3,Address Line 4,Address Line 5,Postcode,Open Date,Close Date,Contact Telephone Number,Amended Record Indicator,GOR Code
0,R1A,WORCESTERSHIRE HEALTH AND CARE NHS TRUST,Y55,Q77,ISAAC MADDOX HOUSE,SHRUB HILL INDUSTRIAL ESTATE,,WORCESTER,WORCESTERSHIRE,WR4 9RW,2011-07-01,NaT,,0,F
1,R1A01,PATHWAYS SUPPORT SERVICES,Y55,Q77,30A TENBY STREET,,,BIRMINGHAM,WEST MIDLANDS,B1 3EE,2011-07-01,NaT,,0,F


In [16]:
etrust[etrust['Name'].str.lower().str.contains('Wight'.lower())].head(2)

Unnamed: 0,Organisation Code,Name,National Grouping,High Level Health Geography,Address Line 1,Address Line 2,Address Line 3,Address Line 4,Address Line 5,Postcode,Open Date,Close Date,Contact Telephone Number,Amended Record Indicator,GOR Code
1226,R1F,ISLE OF WIGHT NHS TRUST,Y57,Q70,ST MARY'S HOSPITAL,PARKHURST ROAD,,NEWPORT,ISLE OF WIGHT,PO30 5TG,2012-04-01,NaT,,0,J
1284,R1FHQ,ISLE OF WIGHT NHS - HQ,Y57,Q70,ST MARY'S HOSPITAL,PARKHURST ROAD,,NEWPORT,ISLE OF WIGHT,PO30 5TG,2012-04-01,NaT,,0,J
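
The search above lower-cases both the column and the search term. A possible shorthand, sketched here on a made-up frame, is the `case=False` argument of `str.contains`; adding `na=False` treats missing names as non-matches and `regex=False` makes the term a plain substring. The `demo` frame and its rows are hypothetical.

```python
import pandas as pd

# Hypothetical names, including a missing value
demo = pd.DataFrame({
    "Name": [
        "ISLE OF WIGHT EXAMPLE TRUST",
        "Wightwick Sample Site",
        "OTHER EXAMPLE TRUST",
        None,
    ]
})

# Case-insensitive substring match; missing names count as non-matches
mask = demo["Name"].str.contains("wight", case=False, na=False, regex=False)
hits = demo[mask]
print(hits)
```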


### Storing the Data in a SQLite3 Database

In [17]:
tmp=etrust.set_index(['Organisation Code'])
#If the table exists, replace it, under the assumption we are using a more recent version of the data
tmp.to_sql(con=con, name=ETRUST,if_exists='replace')


In [18]:
orgcode='R1F'
pd.read_sql_query('SELECT * FROM {typ} WHERE "Organisation Code"="{orgcode}"'.format(typ=ETRUST,orgcode=orgcode), con)

Unnamed: 0,Organisation Code,Name,National Grouping,High Level Health Geography,Address Line 1,Address Line 2,Address Line 3,Address Line 4,Address Line 5,Postcode,Open Date,Close Date,Contact Telephone Number,Amended Record Indicator,GOR Code
0,R1F,ISLE OF WIGHT NHS TRUST,Y57,Q70,ST MARY'S HOSPITAL,PARKHURST ROAD,,NEWPORT,ISLE OF WIGHT,PO30 5TG,2012-04-01 00:00:00,,,0,J


## eccg - Clinical Commissioning Groups

In [19]:
#via http://systems.digital.nhs.uk/data/ods/datadownloads/othernhs
ECCG='eccg'
eccg= getData(ECCG,dates=[10,11])
eccg.head(2)

--2016-09-27 01:30:20--  http://systems.digital.nhs.uk/data/ods/datadownloads/data-files/eccg.zip
Resolving systems.digital.nhs.uk... 194.189.27.101
Connecting to systems.digital.nhs.uk|194.189.27.101|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 19824 (19K) [application/zip]
Saving to: 'downloads/eccg.zip'


2016-09-27 01:30:20 (464 KB/s) - 'downloads/eccg.zip' saved [19824/19824]

Archive:  downloads/eccg.zip
  inflating: data/eccg/eccg.csv      
  inflating: data/eccg/eccg.pdf      


Unnamed: 0,0,1,2,3,4,5,6,7,8,9,...,17,18,19,20,21,22,23,24,25,26
0,00C,NHS DARLINGTON CCG,Y54,Q74,DR PIPER HOUSE,KING STREET,,DARLINGTON,COUNTY DURHAM,DL3 6JL,...,,,,,0,,,,,
1,00D,"NHS DURHAM DALES, EASINGTON AND SEDGEFIELD CCG",Y54,Q74,SEDGEFIELD COMMUNITY HOSPITAL,SALTERS LANE,SEDGEFIELD,STOCKTON-ON-TEES,CLEVELAND,TS21 3EE,...,,,,,0,,,,,


In [20]:
cols=['Organisation Code','Name','National Grouping','High Level Health Geography',
      'Address Line 1','Address Line 2','Address Line 3','Address Line 4','Address Line 5','Postcode',
      'Open Date','Close Date','Null','Organisation Sub-Type Code',
      'Null','Null','Null','Null',
      'Null','Null','Null',
      'Amended Record Indicator','Null',
      'Null','Null','Null','Null']

In [21]:
eccg.columns=cols
eccg.drop('Null', axis=1, inplace=True)
eccg.head(2)

Unnamed: 0,Organisation Code,Name,National Grouping,High Level Health Geography,Address Line 1,Address Line 2,Address Line 3,Address Line 4,Address Line 5,Postcode,Open Date,Close Date,Organisation Sub-Type Code,Amended Record Indicator
0,00C,NHS DARLINGTON CCG,Y54,Q74,DR PIPER HOUSE,KING STREET,,DARLINGTON,COUNTY DURHAM,DL3 6JL,2013-04-01,NaT,C,0
1,00D,"NHS DURHAM DALES, EASINGTON AND SEDGEFIELD CCG",Y54,Q74,SEDGEFIELD COMMUNITY HOSPITAL,SALTERS LANE,SEDGEFIELD,STOCKTON-ON-TEES,CLEVELAND,TS21 3EE,2013-04-01,NaT,C,0


In [22]:
eccg[eccg['Name'].str.lower().str.contains('Wight'.lower())]

Unnamed: 0,Organisation Code,Name,National Grouping,High Level Health Geography,Address Line 1,Address Line 2,Address Line 3,Address Line 4,Address Line 5,Postcode,Open Date,Close Date,Organisation Sub-Type Code,Amended Record Indicator
171,10L,NHS ISLE OF WIGHT CCG,Y57,Q70,SOUTH BLOCK,ST MARY'S HOSPITAL,PARKHURST ROAD,NEWPORT,ISLE OF WIGHT,PO30 5TG,2013-04-01,NaT,C,0


### Storing the Data in a SQLite3 Database

In [23]:
tmp=eccg.set_index(['Organisation Code'])
#If the table exists, replace it, under the assumption we are using a more recent version of the data
tmp.to_sql(con=con, name=ECCG,if_exists='replace')


In [24]:
orgcode='10L'
pd.read_sql_query('SELECT * FROM {typ} WHERE "Organisation Code"="{orgcode}"'.format(typ=ECCG,orgcode=orgcode), con)

Unnamed: 0,Organisation Code,Name,National Grouping,High Level Health Geography,Address Line 1,Address Line 2,Address Line 3,Address Line 4,Address Line 5,Postcode,Open Date,Close Date,Organisation Sub-Type Code,Amended Record Indicator
0,10L,NHS ISLE OF WIGHT CCG,Y57,Q70,SOUTH BLOCK,ST MARY'S HOSPITAL,PARKHURST ROAD,NEWPORT,ISLE OF WIGHT,PO30 5TG,2013-04-01 00:00:00,,C,0


In [25]:
#We can now see the benefit of having data from multiple source data files in the same database
#For example, we can run queries across joined tables, such as finding GP practices by CCG
ccg='NHS ISLE OF WIGHT CCG'
q='''
SELECT epraccur."Organisation Code" AS code, epraccur.Name AS Name 
FROM eccg, epraccur 
WHERE eccg.Name="{}" AND eccg."Organisation Code"=epraccur.Commissioner'''

pd.read_sql_query(q.format(ccg), con)

Unnamed: 0,code,Name
0,J84003,VENTNOR MEDICAL CENTRE
1,J84004,EAST COWES MEDICAL CENTRE
2,J84005,ESPLANADE SURGERY
3,J84007,ST.HELENS MEDICAL CENTRE
4,J84008,ARGYLL HOUSE
5,J84010,SHANKLIN MEDICAL CENTRE
6,J84011,CARISBROOKE HEALTH CENTRE
7,J84012,TOWER HOUSE SURGERY
8,J84013,SANDOWN HEALTH CENTRE
9,J84014,THE DOWER HOUSE
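
The cross-table query above lists both tables in the `FROM` clause and links them in the `WHERE` clause. The same kind of lookup can also be written with an explicit `JOIN ... ON`, as in this sketch. It uses two small made-up tables (`ccgs` and `practices`) in an in-memory database, so the table names, codes and rows are assumptions rather than real data.

```python
import sqlite3
import pandas as pd

demo_con = sqlite3.connect(":memory:")

# Hypothetical commissioning groups
pd.DataFrame({
    "Organisation Code": ["Z01", "Z02"],
    "Name": ["EXAMPLE CCG", "OTHER CCG"],
}).to_sql(name="ccgs", con=demo_con, index=False)

# Hypothetical practices, each pointing at a commissioner code
pd.DataFrame({
    "Organisation Code": ["X00001", "X00002", "X00003"],
    "Name": ["EXAMPLE SURGERY", "SAMPLE PRACTICE", "DEMO HEALTH CENTRE"],
    "Commissioner": ["Z01", "Z02", "Z01"],
}).to_sql(name="practices", con=demo_con, index=False)

# The join condition sits in ON; the filter value is passed as a parameter
query = '''
SELECT p."Organisation Code" AS code, p.Name AS name
FROM practices AS p
JOIN ccgs AS c ON c."Organisation Code" = p.Commissioner
WHERE c.Name = ?
ORDER BY code
'''
by_ccg = pd.read_sql_query(query, demo_con, params=("EXAMPLE CCG",))
print(by_ccg)
```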


## eccgsite - CCG Sites

In [26]:
#http://systems.digital.nhs.uk/data/ods/datadownloads/othernhs
ECCGSITE='eccgsite'
eccgsite=getData(ECCGSITE,dates=[10,11,15,16])
eccgsite.head(2)

--2016-09-27 01:30:21--  http://systems.digital.nhs.uk/data/ods/datadownloads/data-files/eccgsite.zip
Resolving systems.digital.nhs.uk... 194.189.27.101
Connecting to systems.digital.nhs.uk|194.189.27.101|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 67185 (66K) [application/zip]
Saving to: 'downloads/eccgsite.zip'


2016-09-27 01:30:21 (874 KB/s) - 'downloads/eccgsite.zip' saved [67185/67185]

Archive:  downloads/eccgsite.zip
  inflating: data/eccgsite/eccgsite.csv  
  inflating: data/eccgsite/eccgsite.pdf  


Unnamed: 0,0,1,2,3,4,5,6,7,8,9,...,17,18,19,20,21,22,23,24,25,26
0,00CAA,NHS DARLINGTON CCG HQ,Y54,Q74,DR PIPER HOUSE,KING STREET,,DARLINGTON,COUNTY DURHAM,DL3 6JL,...,,,,,0,,,,,
1,00DAA,"NHS DURHAM DALES, EASINGTON AND SEDGEFIELD HQ",Y54,Q74,SEDGEFIELD COMMUNITY HOSPITAL,SALTERS LANE,SEDGEFIELD,STOCKTON-ON-TEES,CLEVELAND,TS21 3EE,...,,,,,0,,,,,


In [27]:
cols=['Organisation Code','Name','National Grouping','High Level Health Geography',
      'Address Line 1','Address Line 2','Address Line 3','Address Line 4','Address Line 5','Postcode',
      'Open Date','Close Date','Null','Null',
      'Parent Organisation Code','Join Parent Date','Left Parent Date','Null',
      'Null','Null','Null',
      'Amended Record Indicator','Null',
      'Null','Null','Null','Null']

In [28]:
eccgsite.columns=cols
eccgsite.drop('Null', axis=1, inplace=True)
eccgsite.head(2)

Unnamed: 0,Organisation Code,Name,National Grouping,High Level Health Geography,Address Line 1,Address Line 2,Address Line 3,Address Line 4,Address Line 5,Postcode,Open Date,Close Date,Parent Organisation Code,Join Parent Date,Left Parent Date,Amended Record Indicator
0,00CAA,NHS DARLINGTON CCG HQ,Y54,Q74,DR PIPER HOUSE,KING STREET,,DARLINGTON,COUNTY DURHAM,DL3 6JL,2013-04-01,NaT,00C,2013-04-01,NaT,0
1,00DAA,"NHS DURHAM DALES, EASINGTON AND SEDGEFIELD HQ",Y54,Q74,SEDGEFIELD COMMUNITY HOSPITAL,SALTERS LANE,SEDGEFIELD,STOCKTON-ON-TEES,CLEVELAND,TS21 3EE,2013-04-01,NaT,00D,2013-04-01,NaT,0


### Storing the Data in a SQLite3 Database

In [29]:
tmp=eccgsite.set_index(['Organisation Code'])
#If the table exists, replace it, under the assumption we are using a more recent version of the data
tmp.to_sql(con=con, name=ECCGSITE,if_exists='replace')


## epcmem - Current and historical records of membership of CCGs, Primary Care Trusts, Primary Care Groups by General Medical Practice

In [30]:
#via http://systems.digital.nhs.uk/data/ods/datadownloads/gppractice
EPCMEM='epcmem'
epcmem=getData(EPCMEM,dates=[3,4])
epcmem.head(2)

--2016-09-27 01:30:22--  http://systems.digital.nhs.uk/data/ods/datadownloads/data-files/epcmem.zip
Resolving systems.digital.nhs.uk... 194.189.27.101
Connecting to systems.digital.nhs.uk|194.189.27.101|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 205985 (201K) [application/zip]
Saving to: 'downloads/epcmem.zip'


2016-09-27 01:30:22 (1.09 MB/s) - 'downloads/epcmem.zip' saved [205985/205985]

Archive:  downloads/epcmem.zip
  inflating: data/epcmem/epcmem.csv  
  inflating: data/epcmem/epcmem.pdf  


Unnamed: 0,0,1,2,3,4,5
0,A81001,4QP36,W,1999-04-01,2001-03-31,0
1,A81001,5E1,W,2001-04-01,2013-03-31,0


In [31]:
cols=['Organisation Code','Parent Organisation Code',
'Parent Organisation Type','Join Parent Date','Left Parent Date','Amended Record Indicator']

In [32]:
epcmem.columns=cols
epcmem.head(2)

Unnamed: 0,Organisation Code,Parent Organisation Code,Parent Organisation Type,Join Parent Date,Left Parent Date,Amended Record Indicator
0,A81001,4QP36,W,1999-04-01,2001-03-31,0
1,A81001,5E1,W,2001-04-01,2013-03-31,0


### Storing the Data in a SQLite3 Database

In [33]:
tmp=epcmem.set_index(['Organisation Code','Parent Organisation Code'])
#If the table exists, replace it, under the assumption we are using a more recent version of the data
tmp.to_sql(con=con, name=EPCMEM,if_exists='replace')


In [34]:
orgcode='A81001'
pd.read_sql_query('SELECT * FROM {typ} WHERE "Organisation Code"="{orgcode}"'.format(typ=EPCMEM,orgcode=orgcode), con)

Unnamed: 0,Organisation Code,Parent Organisation Code,Parent Organisation Type,Join Parent Date,Left Parent Date,Amended Record Indicator
0,A81001,00K,W,2013-04-01 00:00:00,,0
1,A81001,4QP36,W,1999-04-01 00:00:00,2001-03-31 00:00:00,0
2,A81001,5E1,W,2001-04-01 00:00:00,2013-03-31 00:00:00,0


In [35]:
#Example:
#Look up the history of parent organisations for a particular practice
gp='VENTNOR MEDICAL CENTRE'

q='''
SELECT epraccur."Organisation Code" AS code, epraccur.Name AS Name, epcmem."Parent Organisation Code"
FROM epcmem, epraccur 
WHERE epraccur.Name="{}" AND epcmem."Organisation Code"=epraccur."Organisation Code"'''

pd.read_sql_query(q.format(gp), con)

#More work needs to be done here,
#e.g. checking the Parent Organisation Type and then using it to look up the appropriate Parent Organisation Code

Unnamed: 0,code,Name,Parent Organisation Code
0,J84003,VENTNOR MEDICAL CENTRE,10L
1,J84003,VENTNOR MEDICAL CENTRE,4NG74
2,J84003,VENTNOR MEDICAL CENTRE,5DG
3,J84003,VENTNOR MEDICAL CENTRE,5QT
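
One possible next step with a membership history like this is to put it in date order and pick out the current parent, taken here to be the row with no leaving date. The sketch below does this in pandas on a made-up `history` frame; the codes, dates and the "no leaving date means current" rule are assumptions for illustration, not something confirmed by the file documentation.

```python
import pandas as pd

# Hypothetical membership history for one practice
history = pd.DataFrame({
    "Organisation Code": ["X00001", "X00001", "X00001"],
    "Parent Organisation Code": ["P3", "P1", "P2"],
    "Parent Organisation Type": ["W", "W", "W"],
    "Join Parent Date": pd.to_datetime(["2013-04-01", "1999-04-01", "2001-04-01"]),
    "Left Parent Date": pd.to_datetime([None, "2001-03-31", "2013-03-31"]),
})

# Chronological view of the parent organisations
ordered = history.sort_values("Join Parent Date")

# Assumption: a missing leaving date marks the current membership
current = ordered[ordered["Left Parent Date"].isna()]

print(ordered[["Parent Organisation Code", "Join Parent Date", "Left Parent Date"]])
print(current["Parent Organisation Code"].tolist())
```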


## epracmem - Current and historical records of membership of practices by GPs

In [36]:
EPRACMEM='epracmem'
epracmem=getData(EPRACMEM,dates=[3,4])
epracmem.head(2)

--2016-09-27 01:30:23--  http://systems.digital.nhs.uk/data/ods/datadownloads/data-files/epracmem.zip
Resolving systems.digital.nhs.uk... 194.189.27.101
Connecting to systems.digital.nhs.uk|194.189.27.101|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1210110 (1.2M) [application/zip]
Saving to: 'downloads/epracmem.zip'


2016-09-27 01:30:24 (1.62 MB/s) - 'downloads/epracmem.zip' saved [1210110/1210110]

Archive:  downloads/epracmem.zip
  inflating: data/epracmem/epracmem.csv  
  inflating: data/epracmem/epracmem.pdf  


Unnamed: 0,0,1,2,3,4,5
0,G0102005,H81600,P,1974-04-01,1991-04-01,0
1,G0102926,D81001,P,1974-04-01,1991-12-31,0


In [37]:
cols=['Practitioner Code','Parent Organisation Code','Parent Organisation Type','Join Parent Date',
      'Left Parent Date','Amended Record Indicator']

In [38]:
epracmem.columns=cols
epracmem.head(2)

Unnamed: 0,Practitioner Code,Parent Organisation Code,Parent Organisation Type,Join Parent Date,Left Parent Date,Amended Record Indicator
0,G0102005,H81600,P,1974-04-01,1991-04-01,0
1,G0102926,D81001,P,1974-04-01,1991-12-31,0


In [39]:
epracmem[epracmem['Parent Organisation Code']=='J84020']

Unnamed: 0,Practitioner Code,Parent Organisation Code,Parent Organisation Type,Join Parent Date,Left Parent Date,Amended Record Indicator
10181,G3335046,J84020,P,1974-04-01,2006-05-17,0
12450,G3370324,J84020,P,1974-04-01,2006-04-01,0
48301,G8337043,J84020,P,2003-07-07,2008-09-30,0
59043,G8549718,J84020,P,2006-03-20,NaT,0
62086,G8637358,J84020,P,2006-05-02,NaT,0
108472,G9508552,J84020,P,1995-08-14,NaT,0
115777,G9710832,J84020,P,1997-11-17,2005-04-30,0


### Storing the Data in a SQLite3 Database

In [40]:
tmp=epracmem.set_index(['Practitioner Code','Parent Organisation Code'])
#If the table exists, replace it, under the assumption we are using a more recent version of the data
tmp.to_sql(con=con, name=EPRACMEM,if_exists='replace')


In [41]:
#Example - current GP codes by practice
gp='VENTNOR MEDICAL CENTRE'

q='''
SELECT epraccur."Organisation Code" AS code, epraccur.Name AS Name, epracmem."Practitioner Code",
        epracmem."Join Parent Date",epracmem."Left Parent Date"
FROM epracmem, epraccur 
WHERE epraccur.Name="{}" AND epracmem."Parent Organisation Code"=epraccur."Organisation Code"
      AND epracmem."Left Parent Date" is NULL '''

pd.read_sql_query(q.format(gp), con)


Unnamed: 0,code,Name,Practitioner Code,Join Parent Date,Left Parent Date
0,J84003,VENTNOR MEDICAL CENTRE,G7105823,2010-04-05 00:00:00,
1,J84003,VENTNOR MEDICAL CENTRE,G8613161,1986-10-05 00:00:00,
2,J84003,VENTNOR MEDICAL CENTRE,G9142387,2014-04-01 00:00:00,
3,J84003,VENTNOR MEDICAL CENTRE,G9500499,1995-01-03 00:00:00,
4,J84003,VENTNOR MEDICAL CENTRE,G9544343,2015-10-05 00:00:00,


## egdpprac - Dental Surgeries

In [42]:
#http://systems.digital.nhs.uk/data/ods/datadownloads/misc
EGDPPRAC='egdpprac'
egdpprac=getData(EGDPPRAC,dates=[10,11,15,16])
egdpprac.head(2)

--2016-09-27 01:30:26--  http://systems.digital.nhs.uk/data/ods/datadownloads/data-files/egdpprac.zip
Resolving systems.digital.nhs.uk... 194.189.27.101
Connecting to systems.digital.nhs.uk|194.189.27.101|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 375667 (367K) [application/zip]
Saving to: 'downloads/egdpprac.zip'


2016-09-27 01:30:27 (1.29 MB/s) - 'downloads/egdpprac.zip' saved [375667/375667]

Archive:  downloads/egdpprac.zip
  inflating: data/egdpprac/egdpprac.csv  
  inflating: data/egdpprac/egdpprac.pdf  


Unnamed: 0,0,1,2,3,4,5,6,7,8,9,...,17,18,19,20,21,22,23,24,25,26
0,V00002,DENTAL SURGERY,Y52,Q37,DENTAL SURGERY,22 MARTYRS AVENUE,CRAWLEY,WEST SUSSEX,,RH11 7RZ,...,,,,,0,,,,,
1,V00003,CRABTREE ROAD DENTAL PRACTICE,Y57,Q81,CRABTREE ROAD DENTAL PRACTICE,25 CRABTREE ROAD,CRAWLEY,WEST SUSSEX,,RH11 7HL,...,,,,,0,,,,,


In [43]:
cols=['Organisation Code','Name','National Grouping','High Level Health Geography',
      'Address Line 1','Address Line 2','Address Line 3','Address Line 4','Address Line 5','Postcode',
      'Open Date','Close Date','Status Code','Organisation Sub-Type Code',
      'Parent Organisation Code','Join Parent Date','Left Parent Date','Null',
      'Null','Null','Null',
      'Amended Record Indicator','Null',
      'Null','Null','Null','Null']

In [44]:
egdpprac.columns=cols
egdpprac.drop('Null', axis=1, inplace=True)
egdpprac.head(2)

Unnamed: 0,Organisation Code,Name,National Grouping,High Level Health Geography,Address Line 1,Address Line 2,Address Line 3,Address Line 4,Address Line 5,Postcode,Open Date,Close Date,Status Code,Organisation Sub-Type Code,Parent Organisation Code,Join Parent Date,Left Parent Date,Amended Record Indicator
0,V00002,DENTAL SURGERY,Y52,Q37,DENTAL SURGERY,22 MARTYRS AVENUE,CRAWLEY,WEST SUSSEX,,RH11 7RZ,2008-04-01,2009-03-31,C,D,5P6,2008-04-01,2009-03-31,0
1,V00003,CRABTREE ROAD DENTAL PRACTICE,Y57,Q81,CRABTREE ROAD DENTAL PRACTICE,25 CRABTREE ROAD,CRAWLEY,WEST SUSSEX,,RH11 7HL,2006-04-01,NaT,A,D,14G,2015-04-01,NaT,0


### Storing the Data in a SQLite3 Database

In [45]:
tmp=egdpprac.set_index(['Organisation Code'])
#If the table exists, replace it, under the assumption we are using a more recent version of the data
tmp.to_sql(con=con, name=EGDPPRAC,if_exists='replace')


In [46]:
area='VENTNOR'
pd.read_sql_query('SELECT * FROM {typ} WHERE "Address Line 3"="{area}"'.format(typ=EGDPPRAC,area=area), con)

Unnamed: 0,Organisation Code,Name,National Grouping,High Level Health Geography,Address Line 1,Address Line 2,Address Line 3,Address Line 4,Address Line 5,Postcode,Open Date,Close Date,Status Code,Organisation Sub-Type Code,Parent Organisation Code,Join Parent Date,Left Parent Date,Amended Record Indicator
0,V06499,DENTAL SURGERY,Y57,Q70,DENTAL SURGERY,4 CHURCH STREET,VENTNOR,ISLE OF WIGHT,,PO38 1SW,2006-04-01 00:00:00,,A,D,13N,2013-04-01 00:00:00,,0
1,V06685,DENTAL SURGERY,Y57,Q70,DENTAL SURGERY,42 HIGH STREET,VENTNOR,ISLE OF WIGHT,,PO38 1RZ,2006-04-01 00:00:00,,A,D,13N,2013-04-01 00:00:00,,0


## egpcur - Current General Medical Practitioners (GPs) 

In [47]:
EGPCUR='egpcur'
egpcur=getData(EGPCUR,dates=[10,11,15,16])
egpcur.head(2)

--2016-09-27 01:30:28--  http://systems.digital.nhs.uk/data/ods/datadownloads/data-files/egpcur.zip
Resolving systems.digital.nhs.uk... 194.189.27.101
Connecting to systems.digital.nhs.uk|194.189.27.101|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 4799631 (4.6M) [application/zip]
Saving to: 'downloads/egpcur.zip'


2016-09-27 01:30:31 (1.86 MB/s) - 'downloads/egpcur.zip' saved [4799631/4799631]

Archive:  downloads/egpcur.zip
  inflating: data/egpcur/egpcur.csv  
  inflating: data/egpcur/egpcur.pdf  


Unnamed: 0,0,1,2,3,4,5,6,7,8,9,...,17,18,19,20,21,22,23,24,25,26
0,G0102005,ALLEN EB,Y11,QAL,"FIRCROFT, LONDON ROAD",ENGLEFIELD GREEN,EGHAM,SURREY,,TW20 0BS,...,,,,,0,,,,,
1,G0102926,ANDERSON MG,Y55,Q79,LENSFIELD MEDICAL PRAC.,48 LENSFIELD ROAD,CAMBRIDGE,,,CB2 1EH,...,01223 651020,,,,1,,06H,,,


In [48]:
cols=['Organisation Code','Name','National Grouping','High Level Health Geography',
      'Address Line 1','Address Line 2','Address Line 3','Address Line 4','Address Line 5','Postcode',
      'Open Date','Close Date','Status Code','Organisation Sub-Type Code',
      'Parent Organisation Code','Join Parent Date','Left Parent Date','Contact Telephone Number',
      'Null','Null','Null',
      'Amended Record Indicator','Null',
      'Current Care Organisation','Null','Null','Null']

In [49]:
egpcur.columns=cols
egpcur.drop('Null', axis=1, inplace=True)
egpcur.head(2)

Unnamed: 0,Organisation Code,Name,National Grouping,High Level Health Geography,Address Line 1,Address Line 2,Address Line 3,Address Line 4,Address Line 5,Postcode,Open Date,Close Date,Status Code,Organisation Sub-Type Code,Parent Organisation Code,Join Parent Date,Left Parent Date,Contact Telephone Number,Amended Record Indicator,Current Care Organisation
0,G0102005,ALLEN EB,Y11,QAL,"FIRCROFT, LONDON ROAD",ENGLEFIELD GREEN,EGHAM,SURREY,,TW20 0BS,1974-04-01,NaT,A,P,H81600,1974-04-01,1991-04-01,,0,
1,G0102926,ANDERSON MG,Y55,Q79,LENSFIELD MEDICAL PRAC.,48 LENSFIELD ROAD,CAMBRIDGE,,,CB2 1EH,1974-04-01,NaT,A,O,D81001,1974-04-01,1991-12-31,01223 651020,1,06H


### Storing the Data in a SQLite3 Database

In [50]:
tmp=egpcur.set_index(['Organisation Code'])
#If the table exists, replace it, under the assumption we are using a more recent version of the data
tmp.to_sql(con=con, name=EGPCUR,if_exists='replace')


In [51]:
gp='VENTNOR MEDICAL CENTRE'

q='''
SELECT epraccur."Organisation Code" AS code, epraccur.Name AS Name, egpcur."Organisation Code",
        egpcur."Name",egpcur."Join Parent Date",egpcur."Left Parent Date"
FROM egpcur, epraccur 
WHERE epraccur.Name="{}" AND egpcur."Parent Organisation Code"=epraccur."Organisation Code" '''

pd.read_sql_query(q.format(gp), con)

Unnamed: 0,code,Name,Organisation Code,Name.1,Join Parent Date,Left Parent Date
0,J84003,VENTNOR MEDICAL CENTRE,G7105823,DR PA COLEMAN & PARTNERS,2010-04-05 00:00:00,
1,J84003,VENTNOR MEDICAL CENTRE,G8200523,TURNER DP,1974-04-01 00:00:00,2010-04-04 00:00:00
2,J84003,VENTNOR MEDICAL CENTRE,G8613161,COLEMAN PA,1986-10-05 00:00:00,
3,J84003,VENTNOR MEDICAL CENTRE,G9142387,JOHN O,2014-04-01 00:00:00,
4,J84003,VENTNOR MEDICAL CENTRE,G9500499,LOCK MW,1995-01-03 00:00:00,
5,J84003,VENTNOR MEDICAL CENTRE,G9544343,STEVENSON DJ,2015-10-05 00:00:00,


## egparc - Archived GPs

In [52]:
#http://systems.digital.nhs.uk/data/ods/datadownloads/gppractice
EGPARC='egparc'
egparc=getData(EGPARC,dates=[10,11,15,16])
egparc.head(2)

--2016-09-27 01:30:34--  http://systems.digital.nhs.uk/data/ods/datadownloads/data-files/egparc.zip
Resolving systems.digital.nhs.uk... 194.189.27.101
Connecting to systems.digital.nhs.uk|194.189.27.101|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 474533 (463K) [application/zip]
Saving to: 'downloads/egparc.zip'


2016-09-27 01:30:34 (1.39 MB/s) - 'downloads/egparc.zip' saved [474533/474533]

Archive:  downloads/egparc.zip
  inflating: data/egparc/egparc.csv  
  inflating: data/egparc/egparc.pdf  


Unnamed: 0,0,1,2,3,4,5,6,7,8,9,...,17,18,19,20,21,22,23,24,25,26
0,G0107275,ANGEL AM,Y10,QAJ,NIGHTINGALE HOUSE,105 NIGHTINGALE LANE,BALHAM,LONDON,,SW12 8NB,...,0181 6733495,,,,0,,,,,
1,G0108180,ANDERSEN HJ,Y10,QAH,123 EVELINA ROAD,DULWICH,LONDON,,,SE15 3HD,...,071 6393126,,,,0,,,,,


In [53]:
cols=['Organisation Code','Name','National Grouping','High Level Health Geography',
      'Address Line 1','Address Line 2','Address Line 3','Address Line 4','Address Line 5','Postcode',
      'Open Date','Close Date','Status Code','Organisation Sub-Type Code',
      'Parent Organisation Code','Join Parent Date','Left Parent Date','Contact Telephone Number',
      'Null','Null','Null',
      'Amended Record Indicator','Null',
      'Null','Null','Null','Null']

In [54]:
egparc.columns=cols
egparc.drop('Null', axis=1, inplace=True)
egparc.head(2)

Unnamed: 0,Organisation Code,Name,National Grouping,High Level Health Geography,Address Line 1,Address Line 2,Address Line 3,Address Line 4,Address Line 5,Postcode,Open Date,Close Date,Status Code,Organisation Sub-Type Code,Parent Organisation Code,Join Parent Date,Left Parent Date,Contact Telephone Number,Amended Record Indicator
0,G0107275,ANGEL AM,Y10,QAJ,NIGHTINGALE HOUSE,105 NIGHTINGALE LANE,BALHAM,LONDON,,SW12 8NB,1974-04-01,1996-04-17,C,P,H85681,1991-04-03,1996-04-17,0181 6733495,0
1,G0108180,ANDERSEN HJ,Y10,QAH,123 EVELINA ROAD,DULWICH,LONDON,,,SE15 3HD,1974-04-01,1995-06-11,C,P,G85601,1974-04-01,1995-06-11,071 6393126,0


### Storing the Data in a SQLite3 Database

In [55]:
tmp=egparc.set_index(['Organisation Code'])
#If the table exists, replace it, under the assumption we are using a more recent version of the data
tmp.to_sql(con=con, name=EGPARC,if_exists='replace')


## epracarc - Archived GP Practices

In [56]:
EPRACARC='epracarc'
epracarc=getData(EPRACARC,dates=[10,11,15,16])

--2016-09-27 01:30:35--  http://systems.digital.nhs.uk/data/ods/datadownloads/data-files/epracarc.zip
Resolving systems.digital.nhs.uk... 194.189.27.101
Connecting to systems.digital.nhs.uk|194.189.27.101|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 187761 (183K) [application/zip]
Saving to: 'downloads/epracarc.zip'


2016-09-27 01:30:35 (1.10 MB/s) - 'downloads/epracarc.zip' saved [187761/187761]

Archive:  downloads/epracarc.zip
  inflating: data/epracarc/epracarc.csv  
  inflating: data/epracarc/epracarc.pdf  


In [57]:
cols=['Organisation Code','Name','National Grouping','High Level Health Geography',
      'Address Line 1','Address Line 2','Address Line 3','Address Line 4','Address Line 5','Postcode',
      'Open Date','Close Date','Status Code','Organisation Sub-Type Code',
      'Parent Organisation Code','Join Parent Date','Left Parent Date','Contact Telephone Number',
      'Null','Null','Null',
      'Amended Record Indicator','Null',
      'Null','Null','Practice Type','Null']

In [58]:
epracarc.columns=cols
epracarc.drop('Null', axis=1, inplace=True)
epracarc.head(2)

Unnamed: 0,Organisation Code,Name,National Grouping,High Level Health Geography,Address Line 1,Address Line 2,Address Line 3,Address Line 4,Address Line 5,Postcode,Open Date,Close Date,Status Code,Organisation Sub-Type Code,Parent Organisation Code,Join Parent Date,Left Parent Date,Contact Telephone Number,Amended Record Indicator,Practice Type
0,603698,NO NAME HELD,W00,QW1,GWENT FHSA,,,,,CF1 3PY,1991-04-01,1995-03-31,C,Z,,NaT,NaT,,0,0
1,608698,NO NAME HELD,W00,QW5,WEST GLAM FHSA,,,,,CF1 3PY,1991-04-01,1995-03-31,C,Z,,NaT,NaT,,0,0


### Storing the Data in a SQLite3 Database

In [59]:
tmp=epracarc.set_index(['Organisation Code'])
#If the table exists, replace it, under the assumption we are using a more recent version of the data
tmp.to_sql(con=con, name=EPRACARC,if_exists='replace')



## ehospice - Hospices

In [60]:
#http://systems.digital.nhs.uk/data/ods/datadownloads/misc
EHOSPICE='ehospice'
ehospice=getData(EHOSPICE,dates=[10,11,15,16])

--2016-09-27 01:30:36--  http://systems.digital.nhs.uk/data/ods/datadownloads/data-files/ehospice.zip
Resolving systems.digital.nhs.uk... 194.189.27.101
Connecting to systems.digital.nhs.uk|194.189.27.101|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 22884 (22K) [application/zip]
Saving to: 'downloads/ehospice.zip'


2016-09-27 01:30:36 (425 KB/s) - 'downloads/ehospice.zip' saved [22884/22884]

Archive:  downloads/ehospice.zip
  inflating: data/ehospice/ehospice.csv  
  inflating: data/ehospice/ehospice.pdf  


In [61]:
cols=['Organisation Code','Name','National Grouping','High Level Health Geography',
      'Address Line 1','Address Line 2','Address Line 3','Address Line 4','Address Line 5','Postcode',
      'Open Date','Close Date','Null','Organisation Sub-Type Code',
      'Null','Null','Null','Contact Telephone Number',
      'Null','Null','Null',
      'Amended Record Indicator','Null',
      'Null','Null','Null','Null']

In [62]:
ehospice.columns=cols
ehospice.drop('Null', axis=1, inplace=True)
ehospice.head(2)

Unnamed: 0,Organisation Code,Name,National Grouping,High Level Health Geography,Address Line 1,Address Line 2,Address Line 3,Address Line 4,Address Line 5,Postcode,Open Date,Close Date,Organisation Sub-Type Code,Contact Telephone Number,Amended Record Indicator
0,8A101,ST LUKE'S HOSPICE (SHEFFIELD),Y54,Q72,LITTLE COMMON LANE,,,SHEFFIELD,SOUTH YORKSHIRE,S11 9NE,1996-04-01,NaT,H,,0
1,8A260,ST BARNABAS LINCOLNSHIRE HOSPICE (IPU),Y55,Q78,INPATIENT UNIT,36 NETTLEHAM ROAD,,LINCOLN,LINCOLNSHIRE,LN2 1RE,1996-04-01,NaT,H,,0


### Storing the Data in a SQLite3 Database

In [63]:
tmp=ehospice.set_index(['Organisation Code'])
#If the table exists, replace it, under the assumption we are using a more recent version of the data
tmp.to_sql(con=con, name=EHOSPICE,if_exists='replace')



In [64]:
area='HUDDERSFIELD'
#Pass the value as a bound parameter rather than formatting it into the SQL text
pd.read_sql_query('SELECT * FROM {typ} WHERE "Address Line 4"=?'.format(typ=EHOSPICE), con, params=(area,))

Unnamed: 0,Organisation Code,Name,National Grouping,High Level Health Geography,Address Line 1,Address Line 2,Address Line 3,Address Line 4,Address Line 5,Postcode,Open Date,Close Date,Organisation Sub-Type Code,Contact Telephone Number,Amended Record Indicator
0,8AT59,KIRKWOOD HOSPICE,Y54,Q72,21 ALBANY ROAD,DALTON,,HUDDERSFIELD,WEST YORKSHIRE,HD5 9UY,1996-04-01 00:00:00,,H,,0
1,8HX16,FORGET ME NOT CHILDREN'S HOSPICE,Y54,Q72,RUSSELL HOUSE,FELL GREAVE ROAD,,HUDDERSFIELD,WEST YORKSHIRE,HD2 1NH,2012-03-23 00:00:00,,H,,0
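
Lookups by value, like the one above, are safest when the value is bound through the `params` argument of `pd.read_sql_query`. In SQLite, double quotes normally mark identifiers, and a name containing an apostrophe can break a hand-built string. A minimal sketch, assuming a throwaway table called `demo_hospice` with made-up rows:

```python
import sqlite3
import pandas as pd

demo_con = sqlite3.connect(":memory:")
pd.DataFrame({
    "Organisation Code": ["X1", "X2"],
    "Name": ["HOSPICE ONE", "O'NEILL HOSPICE"],
    "Address Line 4": ["HUDDERSFIELD", "LEEDS"],
}).to_sql("demo_hospice", demo_con, index=False)

# The table name is still formatted in (identifiers cannot be bound),
# but the value travels separately through params
q = 'SELECT * FROM {typ} WHERE "Address Line 4" = ?'.format(typ="demo_hospice")
hits = pd.read_sql_query(q, demo_con, params=("HUDDERSFIELD",))

# An apostrophe in the value needs no escaping when it is bound
quoted = pd.read_sql_query('SELECT * FROM demo_hospice WHERE "Name" = ?',
                           demo_con, params=("O'NEILL HOSPICE",))
print(hits)
print(quoted)
```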


## epharmacyhq - Pharmacy Headquarters

In [65]:
#http://systems.digital.nhs.uk/data/ods/datadownloads/gppractice
EPHARMACYHQ='epharmacyhq'
epharmacyhq=getData(EPHARMACYHQ,dates=[10,11,15,16])

--2016-09-27 01:30:37--  http://systems.digital.nhs.uk/data/ods/datadownloads/data-files/epharmacyhq.zip
Resolving systems.digital.nhs.uk... 194.189.27.101
Connecting to systems.digital.nhs.uk|194.189.27.101|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 229164 (224K) [application/zip]
Saving to: 'downloads/epharmacyhq.zip'


2016-09-27 01:30:37 (1.19 MB/s) - 'downloads/epharmacyhq.zip' saved [229164/229164]

Archive:  downloads/epharmacyhq.zip
  inflating: data/epharmacyhq/epharmacyhq.csv  
  inflating: data/epharmacyhq/epharmacyhq.pdf  


In [66]:
cols=['Organisation Code','Name','National Grouping','High Level Health Geography',
      'Address Line 1','Address Line 2','Address Line 3','Address Line 4','Address Line 5','Postcode',
      'Open Date','Close Date','Null','Null',
      'Null','Null','Null','Contact Telephone Number',
      'Null','Null','Null',
      'Amended Record Indicator','Null',
      'Null','Null','Null','Null']

In [67]:
epharmacyhq.columns=cols
epharmacyhq.drop('Null', axis=1, inplace=True)
epharmacyhq.head(2)

Unnamed: 0,Organisation Code,Name,National Grouping,High Level Health Geography,Address Line 1,Address Line 2,Address Line 3,Address Line 4,Address Line 5,Postcode,Open Date,Close Date,Contact Telephone Number,Amended Record Indicator
0,P001,KAYS CHEMIST,,,24 ROSS ROAD,,,MAIDENHEAD,BERKSHIRE,SL6 2SZ,2004-04-01,2007-04-06,,0
1,P002,VE LETTSOM CHEMIST,,,84 VESTRY ROAD,CAMBERWELL,,LONDON,GREATER LONDON,SE5 8PQ,2004-04-01,NaT,,0


### Storing the Data in a SQLite3 Database

In [68]:
tmp=epharmacyhq.set_index(['Organisation Code'])
#If the table exists, replace it, under the assumption we are using a more recent version of the data
tmp.to_sql(con=con, name=EPHARMACYHQ,if_exists='replace')



In [69]:
name='BOOTS'
#Bind the LIKE pattern as a parameter; the % wildcards are part of the bound value
pd.read_sql_query('SELECT * FROM {typ} WHERE "Name" LIKE ?'.format(typ=EPHARMACYHQ), con, params=('%'+name+'%',))

Unnamed: 0,Organisation Code,Name,National Grouping,High Level Health Geography,Address Line 1,Address Line 2,Address Line 3,Address Line 4,Address Line 5,Postcode,Open Date,Close Date,Contact Telephone Number,Amended Record Indicator
0,P08F,BOOTS GROUP PLC,,,1 THANE ROAD,,,NOTTINGHAM,NOTTINGHAMSHIRE,NG90 1BS,1990-04-01 00:00:00,,,0


## edispensary - Dispensaries

In [70]:
#http://systems.digital.nhs.uk/data/ods/datadownloads/gppractice
EDISPENSARY='edispensary'
edispensary=getData(EDISPENSARY,dates=[10,11,15,16])

--2016-09-27 01:30:38--  http://systems.digital.nhs.uk/data/ods/datadownloads/data-files/edispensary.zip
Resolving systems.digital.nhs.uk... 194.189.27.101
Connecting to systems.digital.nhs.uk|194.189.27.101|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 937178 (915K) [application/zip]
Saving to: 'downloads/edispensary.zip'


2016-09-27 01:30:39 (1.57 MB/s) - 'downloads/edispensary.zip' saved [937178/937178]

Archive:  downloads/edispensary.zip
  inflating: data/edispensary/edispensary.csv  
  inflating: data/edispensary/edispensary.pdf  


In [71]:
cols=['Organisation Code','Name','National Grouping','High Level Health Geography',
      'Address Line 1','Address Line 2','Address Line 3','Address Line 4','Address Line 5','Postcode',
      'Open Date','Close Date','Status Code','Organisation Sub-Type Code',
      'Parent Organisation Code','Join Parent Date','Left Parent Date','Contact Telephone Number',
      'Null','Null','Null',
      'Amended Record Indicator','Null',
      'Current Care Organisation','Null','Null','Null']

In [72]:
edispensary.columns=cols
edispensary.drop('Null', axis=1, inplace=True)
edispensary.head(2)

Unnamed: 0,Organisation Code,Name,National Grouping,High Level Health Geography,Address Line 1,Address Line 2,Address Line 3,Address Line 4,Address Line 5,Postcode,Open Date,Close Date,Status Code,Organisation Sub-Type Code,Parent Organisation Code,Join Parent Date,Left Parent Date,Contact Telephone Number,Amended Record Indicator,Current Care Organisation
0,FA002,ROWLANDS PHARMACY,Y54,Q83,61 ARUNDEL AVENUE,HAZEL GROVE,STOCKPORT,,,SK7 5LD,2006-06-01,NaT,A,1,P79D,2006-06-01,NaT,0161 4838729,1,Q83
1,FA007,ROWLANDS PHARMACY,Y55,Q79,"10,10A & 10B BENTALLS CTR",COLCHESTER ROAD,"HEYBRIDGE, MALDON",,,CM9 4GD,2008-03-01,NaT,A,1,P79D,2008-03-01,NaT,01621 850559,0,Q79


### Storing the Data in a SQLite3 Database

In [73]:
tmp=edispensary.set_index(['Organisation Code'])
#If the table exists, replace it, under the assumption we are using a more recent version of the data
tmp.to_sql(con=con, name=EDISPENSARY,if_exists='replace')



In [74]:
area='ISLE OF WIGHT'
q='''
SELECT  edispensary.Name AS dispensaryName, edispensary."Address Line 3", epharmacyhq.Name AS parentName
FROM edispensary,epharmacyhq 
WHERE edispensary."Address Line 4"=?
AND edispensary."Parent Organisation Code" = epharmacyhq."Organisation Code" LIMIT 5
'''
pd.read_sql_query(q, con, params=(area,))

Unnamed: 0,dispensaryName,Address Line 3,parentName
0,NITON PHARMACY,VENTNOR,DAY LEWIS PLC
1,DAY LEWIS PHARMACY,NEWPORT,H CARSON LTD
2,DAY LEWIS PLC,LAKE,DAY LEWIS PLC
3,TESCO (IN STORE) PHARMACY,RYDE,TESCO PLC
4,BOOTS THE CHEMIST LTD,COWES,BOOTS GROUP PLC
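
The query above links each dispensary to its headquarters by listing both tables in `FROM` and matching codes in `WHERE`. The same link can be written with an explicit `JOIN ... ON`, which keeps the join condition apart from the row filter. A sketch on two hypothetical toy tables (`demo_dispensary` and `demo_hq`, with invented rows):

```python
import sqlite3
import pandas as pd

demo_con = sqlite3.connect(":memory:")
pd.DataFrame({
    "Organisation Code": ["F1", "F2", "F3"],
    "Name": ["SHOP ONE", "SHOP TWO", "SHOP THREE"],
    "Address Line 4": ["ISLE OF WIGHT", "ISLE OF WIGHT", "KENT"],
    "Parent Organisation Code": ["P1", "P2", "P1"],
}).to_sql("demo_dispensary", demo_con, index=False)
pd.DataFrame({
    "Organisation Code": ["P1", "P2"],
    "Name": ["PARENT ONE LTD", "PARENT TWO PLC"],
}).to_sql("demo_hq", demo_con, index=False)

q = '''
SELECT d.Name AS dispensaryName, h.Name AS parentName
FROM demo_dispensary AS d
JOIN demo_hq AS h ON d."Parent Organisation Code" = h."Organisation Code"
WHERE d."Address Line 4" = ?
ORDER BY d."Organisation Code"
'''
joined = pd.read_sql_query(q, demo_con, params=("ISLE OF WIGHT",))
print(joined)
```

The `ORDER BY` is there because, without it, SQL makes no promise about which rows a `LIMIT` would keep.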


## enurse - Nurse Prescribers

In [75]:
#http://systems.digital.nhs.uk/data/ods/datadownloads/gppractice
ENURSE='enurse'
enurse=getData(ENURSE,dates=[3,4])

--2016-09-27 01:30:40--  http://systems.digital.nhs.uk/data/ods/datadownloads/data-files/enurse.zip
Resolving systems.digital.nhs.uk... 194.189.27.101
Connecting to systems.digital.nhs.uk|194.189.27.101|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1053344 (1.0M) [application/zip]
Saving to: 'downloads/enurse.zip'


2016-09-27 01:30:41 (1.52 MB/s) - 'downloads/enurse.zip' saved [1053344/1053344]

Archive:  downloads/enurse.zip
  inflating: data/enurse/enurse.csv  
  inflating: data/enurse/enurse.pdf  


In [76]:
cols=['Nurse Type','Parent Organisation Code','Nurse PIN',
      'Open Date','Close Date','Title','Initials','Surname',
      'Address1','Address2','Address3','Address4','Address5','Postcode',
      'Telephone Number','Senior Partner Name',
      'Current Care Organisation Code','Name','Name manipulation indicator',
      'Qualification indicator']

In [77]:
enurse.columns=cols
enurse.head(2)

Unnamed: 0,Nurse Type,Parent Organisation Code,Nurse PIN,Open Date,Close Date,Title,Initials,Surname,Address1,Address2,Address3,Address4,Address5,Postcode,Telephone Number,Senior Partner Name,Current Care Organisation Code,Name,Name manipulation indicator,Qualification indicator
0,PN,A81002,76A1370E,1999-11-10,NaT,Ms,K,GALLOWAY,QUEENS PARK MEDICAL CTR,FARRER STREET,STOCKTON ON TEES,CLEVELAND,,TS18 2AW,01642 679681,HALL KG,00K,HARTLEPOOL AND STOCKTON-ON-TEES CCG,1,2
1,PN,A81002,77J2933E,2004-04-08,NaT,Mrs,G,SIBLEY,QUEENS PARK MEDICAL CTR,FARRER STREET,STOCKTON ON TEES,CLEVELAND,,TS18 2AW,01642 679681,HALL KG,00K,HARTLEPOOL AND STOCKTON-ON-TEES CCG,1,2
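
The `dates=[3,4]` argument is passed through to `parse_dates` in `pd.read_csv`, which, with `header=None`, identifies date columns by position. A minimal sketch on an in-memory CSV, assuming the raw file stores dates as `YYYYMMDD` strings and leaves open-ended dates blank (the two rows are cut-down illustrations, not the full enurse record):

```python
import io
import pandas as pd

# Columns 3 and 4 stand in for Open Date and Close Date
raw = io.StringIO(
    "PN,A81002,76A1370E,19991110,\n"
    "PN,A81002,77J2933E,20040408,20100131\n"
)

# With no header row, parse_dates takes zero-based column positions
df = pd.read_csv(raw, header=None, parse_dates=[3, 4])
print(df.dtypes)
```

Blank cells come through as `NaT`, which is why still-open records show `NaT` in the Close Date column above.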


### Storing the Data in a SQLite3 Database

In [78]:
tmp=enurse.set_index(['Nurse PIN'])
#If the table exists, replace it, under the assumption we are using a more recent version of the data
tmp.to_sql(con=con, name=ENURSE,if_exists='replace')



In [79]:
gp='VENTNOR MEDICAL CENTRE'

q='''
SELECT epraccur."Organisation Code" AS code, epraccur.Name AS Name, enurse."Surname",
        enurse."Name",enurse."Open Date",enurse."Close Date"
FROM enurse, epraccur 
WHERE epraccur.Name=? AND enurse."Parent Organisation Code"=epraccur."Organisation Code" '''

pd.read_sql_query(q, con, params=(gp,))

Unnamed: 0,code,Name,Surname,Name.1,Open Date,Close Date
0,J84003,VENTNOR MEDICAL CENTRE,WEBB,ISLE OF WIGHT CCG,2013-05-01 00:00:00,
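
The result above carries two columns called `Name` (the second shows up as `Name.1`): one is the practice name and the other is the care organisation name from the nurse table. Giving each an explicit alias makes the result self-describing. A sketch on hypothetical toy tables; the alias names `practiceName` and `ccgName` are illustrative choices:

```python
import sqlite3
import pandas as pd

demo_con = sqlite3.connect(":memory:")
pd.DataFrame({
    "Organisation Code": ["J1"],
    "Name": ["DEMO MEDICAL CENTRE"],
}).to_sql("demo_practice", demo_con, index=False)
pd.DataFrame({
    "Nurse PIN": ["00A0000E"],
    "Surname": ["SMITH"],
    "Parent Organisation Code": ["J1"],
    "Name": ["DEMO CCG"],
}).to_sql("demo_nurse", demo_con, index=False)

q = '''
SELECT p."Organisation Code" AS code, p.Name AS practiceName,
       n.Surname, n.Name AS ccgName
FROM demo_nurse AS n
JOIN demo_practice AS p ON n."Parent Organisation Code" = p."Organisation Code"
WHERE p.Name = ?
'''
res = pd.read_sql_query(q, demo_con, params=("DEMO MEDICAL CENTRE",))
print(res)
```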


## epcdp - Private Controlled Drug Prescribers

In [80]:
#http://systems.digital.nhs.uk/data/ods/datadownloads/gppractice
EPCDP='epcdp'
epcdp=getData(EPCDP,dates=[10,11,15,16])

--2016-09-27 01:30:43--  http://systems.digital.nhs.uk/data/ods/datadownloads/data-files/epcdp.zip
Resolving systems.digital.nhs.uk... 194.189.27.101
Connecting to systems.digital.nhs.uk|194.189.27.101|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 258489 (252K) [application/zip]
Saving to: 'downloads/epcdp.zip'


2016-09-27 01:30:43 (1.23 MB/s) - 'downloads/epcdp.zip' saved [258489/258489]

Archive:  downloads/epcdp.zip
  inflating: data/epcdp/epcdp.csv    
  inflating: data/epcdp/epcdp.pdf    


In [81]:
cols=['Organisation Code','Name','National Grouping','High Level Health Geography',
      'Address Line 1','Address Line 2','Address Line 3','Address Line 4','Address Line 5','Postcode',
      'Open Date','Close Date','Null','Organisation Sub-Type Code',
      'Parent Organisation Code','Join Parent Date','Left Parent Date','Contact Telephone Number',
      'Null','Null','Null',
      'Amended Record Indicator','Null',
      'Null','Null','Null','Null']

In [82]:
epcdp.columns=cols
epcdp.drop('Null', axis=1, inplace=True)
epcdp.head(2)

Unnamed: 0,Organisation Code,Name,National Grouping,High Level Health Geography,Address Line 1,Address Line 2,Address Line 3,Address Line 4,Address Line 5,Postcode,Open Date,Close Date,Organisation Sub-Type Code,Parent Organisation Code,Join Parent Date,Left Parent Date,Contact Telephone Number,Amended Record Indicator
0,Q6100009,LIEBERMAN S,Y57,Q70,WARBY HOSPITAL,ODIHAM ROAD,HARTLEY WINTNEY,HAMPSHIRE,,RG27 8BS,2006-04-01,NaT,1,Q70,2013-04-01,NaT,01252 845826,0
1,Q6100016,VAN NIEROP A,Y56,Q71,APARTMENT 35,5 FERRY LANE,BRENTFORD,,,TW8 0AT,2006-04-01,NaT,1,Q71,2013-03-20,NaT,078 10873506,0


### Storing the Data in a SQLite3 Database

In [83]:
tmp=epcdp.set_index(['Organisation Code'])
#If the table exists, replace it, under the assumption we are using a more recent version of the data
tmp.to_sql(con=con, name=EPCDP,if_exists='replace')



In [84]:
area='RYDE'
pd.read_sql_query('SELECT * FROM {typ} WHERE "Address Line 3"=?'.format(typ=EPCDP), con, params=(area,))

Unnamed: 0,Organisation Code,Name,National Grouping,High Level Health Geography,Address Line 1,Address Line 2,Address Line 3,Address Line 4,Address Line 5,Postcode,Open Date,Close Date,Organisation Sub-Type Code,Parent Organisation Code,Join Parent Date,Left Parent Date,Contact Telephone Number,Amended Record Indicator
0,Q6114581,HUDSON R,Y57,Q70,TOWER HOUSE SURGERY,RINK ROAD,RYDE,ISLE OF WIGHT,,PO33 1LP,2006-07-11 00:00:00,,1,Q70,2013-04-01 00:00:00,,01983 811431,0
1,Q6114598,REES BSJ,Y57,Q70,TOWER HOUSE SURGERY,RINK ROAD,RYDE,ISLE OF WIGHT,,PO33 1LP,2006-07-11 00:00:00,,1,Q70,2013-04-01 00:00:00,,01983 811431,0
2,Q6114608,O'CALLAGHAN CF,Y57,Q70,TOWER HOUSE SURGERY,RINK ROAD,RYDE,ISLE OF WIGHT,,PO33 1LP,2006-07-11 00:00:00,,1,Q70,2013-04-01 00:00:00,,01983 811431,0
3,Q6114615,WILLIAMS RC,Y57,Q70,TOWER HOUSE SURGERY,RINK ROAD,RYDE,ISLE OF WIGHT,,PO33 1LP,2006-07-11 00:00:00,,1,Q70,2013-04-01 00:00:00,,01983 811431,0
4,Q6114622,MANNING CJF,Y57,Q70,TOWER HOUSE SURGERY,RINK ROAD,RYDE,ISLE OF WIGHT,,PO33 1LP,2006-07-11 00:00:00,,1,Q70,2013-04-01 00:00:00,,01983 811431,0
5,Q6114639,BURTON GEW,Y57,Q70,TOWER HOUSE SURGERY,RINK ROAD,RYDE,ISLE OF WIGHT,,PO33 1LP,2006-07-11 00:00:00,,1,Q70,2013-04-01 00:00:00,,01983 811431,0
6,Q6114646,GROVES MCP,Y57,Q70,TOWER HOUSE SURGERY,RINK ROAD,RYDE,ISLE OF WIGHT,,PO33 1LP,2006-07-11 00:00:00,,1,Q70,2013-04-01 00:00:00,,01983 811431,0


## eabeydispgp - Abeyance and Dispersal GP

In [85]:
#http://systems.digital.nhs.uk/data/ods/datadownloads/gppractice
EABEYDISPGP='eabeydispgp'
eabeydispgp=getData(EABEYDISPGP,dates=[10,11,15,16])

--2016-09-27 01:30:44--  http://systems.digital.nhs.uk/data/ods/datadownloads/data-files/eabeydispgp.zip
Resolving systems.digital.nhs.uk... 194.189.27.101
Connecting to systems.digital.nhs.uk|194.189.27.101|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 98223 (96K) [application/zip]
Saving to: 'downloads/eabeydispgp.zip'


2016-09-27 01:30:44 (916 KB/s) - 'downloads/eabeydispgp.zip' saved [98223/98223]

Archive:  downloads/eabeydispgp.zip
  inflating: data/eabeydispgp/eabeydispgp.csv  
  inflating: data/eabeydispgp/eabeydispgp.pdf  


In [86]:
cols=['Organisation Code','Name','Null','Null',
      'Address Line 1','Address Line 2','Address Line 3','Address Line 4','Address Line 5','Postcode',
      'Open Date','Close Date','Null','Organisation Sub-Type Code',
      'Parent Organisation Code','Join Parent Date','Left Parent Date','Contact Telephone Number',
      'Null','Null','Null',
      'Amended Record Indicator','Null',
      'Current Care Organisation','Null','Null','Null']

In [87]:
eabeydispgp.columns=cols
eabeydispgp.drop('Null', axis=1, inplace=True)
eabeydispgp.head(2)

Unnamed: 0,Organisation Code,Name,Address Line 1,Address Line 2,Address Line 3,Address Line 4,Address Line 5,Postcode,Open Date,Close Date,Organisation Sub-Type Code,Parent Organisation Code,Join Parent Date,Left Parent Date,Contact Telephone Number,Amended Record Indicator,Current Care Organisation
0,G7800018,ST JAMES PRACTICE,GAINS LANE,,,DEVIZES,WILTSHIRE,SN10 1QU,2006-10-01,NaT,A,J83053,2006-10-01,NaT,,0,99N
1,G7800032,TOWER HOUSE SURGERY,169 WEST WYCOMBE ROAD,,,HIGH WYCOMBE,BUCKINGHAMSHIRE,HP12 3AF,2006-10-01,NaT,A,K82010,2006-10-01,NaT,,0,10H


### Storing the Data in a SQLite3 Database

In [88]:
tmp=eabeydispgp.set_index(['Organisation Code'])
#If the table exists, replace it, under the assumption we are using a more recent version of the data
tmp.to_sql(con=con, name=EABEYDISPGP,if_exists='replace')



In [89]:
gp='VENTNOR MEDICAL CENTRE'

q='''
SELECT epraccur."Organisation Code" AS code, epraccur.Name AS Name, 
        eabeydispgp."Name",eabeydispgp."Open Date",eabeydispgp."Close Date"
FROM eabeydispgp, epraccur 
WHERE epraccur.Name=? AND eabeydispgp."Parent Organisation Code"=epraccur."Organisation Code" '''

pd.read_sql_query(q.strip(), con, params=(gp,))

Unnamed: 0,code,Name,Name.1,Open Date,Close Date
0,J84003,VENTNOR MEDICAL CENTRE,DR D P TURNER,2009-11-19 00:00:00,


## ecarehomehq - Care Home Headquarters 



In [116]:
#via http://systems.digital.nhs.uk/data/ods/datadownloads/nonnhs
cols=['Organisation Code','Name','Null','Null',
      'Address Line 1','Address Line 2','Address Line 3','Address Line 4','Address Line 5','Postcode',
      'Open Date','Close Date','Null','Null',
      'Null','Null','Null','Null',
      'Null','Null','Null',
      'Amended Record Indicator','Null',
      'Null','Null','Null','Country']

tmp=normaliser('ecarehomehq',cols,[10,11],'Organisation Code',encoding='latin-1',db_con=con)
tmp.head(3)

--2016-09-27 10:07:18--  http://systems.digital.nhs.uk/data/ods/datadownloads/data-files/ecarehomehq.zip
Resolving systems.digital.nhs.uk... 194.189.27.101
Connecting to systems.digital.nhs.uk|194.189.27.101|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 426863 (417K) [application/zip]
Saving to: 'downloads/ecarehomehq.zip'


2016-09-27 10:07:19 (769 KB/s) - 'downloads/ecarehomehq.zip' saved [426863/426863]

Archive:  downloads/ecarehomehq.zip
  inflating: data/ecarehomehq/ecarehomehq.csv  
  inflating: data/ecarehomehq/ecarehomehq.pdf  




Unnamed: 0_level_0,Name,Address Line 1,Address Line 2,Address Line 3,Address Line 4,Address Line 5,Postcode,Open Date,Close Date,Amended Record Indicator,Country
Organisation Code,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1
A000,229 Mitcham Lane Ltd,99 Sunny Hill Road,,,London,Greater London,SW16 2UW,2008-11-25,NaT,0,1
A001,Abbey Healthcare Homes Ltd,82-84 Calcutta Road,,,Tilbury,Essex,RM18 7QJ,2008-11-25,NaT,0,1
A002,Access for Living,Catford,,,London,Greater London,SE6 2LW,2008-11-25,2010-09-01,0,1


In [107]:
pd.read_sql_query('SELECT * FROM {typ} LIMIT 3'.format(typ='ecarehomehq'), con)

Unnamed: 0,Organisation Code,Name,Address Line 1,Address Line 2,Address Line 3,Address Line 4,Address Line 5,Postcode,Open Date,Close Date,Amended Record Indicator,Country
0,A000,229 Mitcham Lane Ltd,99 Sunny Hill Road,,,London,Greater London,SW16 2UW,2008-11-25 00:00:00,,0,1
1,A001,Abbey Healthcare Homes Ltd,82-84 Calcutta Road,,,Tilbury,Essex,RM18 7QJ,2008-11-25 00:00:00,,0,1
2,A002,Access for Living,Catford,,,London,Greater London,SE6 2LW,2008-11-25 00:00:00,2010-09-01 00:00:00,0,1


## ecarehomesite - Care Home Sites

In [117]:
#via http://systems.digital.nhs.uk/data/ods/datadownloads/nonnhs
cols=['Organisation Code','Name','National Grouping','High Level Health Geography',
      'Address Line 1','Address Line 2','Address Line 3','Address Line 4','Address Line 5','Postcode',
      'Open Date','Close Date','Null','Null',
      'Parent Organisation Code','Join Parent Date','Left Parent Date','Contact Telephone Number',
      'Null','Null','Null',
      'Amended Record Indicator','Null',
      'Current Care Organisation','Null','Null','Country']

typ='ecarehomesite'
tmp=normaliser(typ,cols,[10,11,15,16],'Organisation Code',encoding='Latin-1',db_con=con)
tmp.head(3)

rm: downloads/ecarehomesite.zip: No such file or directory
--2016-09-27 10:07:33--  http://systems.digital.nhs.uk/data/ods/datadownloads/data-files/ecarehomesite.zip
Resolving systems.digital.nhs.uk... 194.189.27.101
Connecting to systems.digital.nhs.uk|194.189.27.101|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1348038 (1.3M) [application/zip]
Saving to: 'downloads/ecarehomesite.zip'


2016-09-27 10:07:35 (594 KB/s) - 'downloads/ecarehomesite.zip' saved [1348038/1348038]

rm: data/ecarehomesite/: No such file or directory
Archive:  downloads/ecarehomesite.zip
  inflating: data/ecarehomesite/ecarehomesite.csv  
  inflating: data/ecarehomesite/ecarehomesite.pdf  




Unnamed: 0_level_0,Name,National Grouping,High Level Health Geography,Address Line 1,Address Line 2,Address Line 3,Address Line 4,Address Line 5,Postcode,Open Date,Close Date,Parent Organisation Code,Join Parent Date,Left Parent Date,Contact Telephone Number,Amended Record Indicator,Current Care Organisation,Country
Organisation Code,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1
VL000,Admiral House,Y53,Q36,Cliff Road,Wandsworth,,London,Greater London,SW16 1PA,2008-11-25,2011-11-16,A006,2008-11-25,2011-11-16,8081661000.0,0,5LG,1
VL001,Focus Project,Y56,Q71,29 Akerman Road,,,London,Greater London,SW9 6SN,2008-11-25,NaT,A04V,2008-11-25,NaT,2075020000.0,0,08K,1
VL002,Alan Morkill House,Y53,Q36,88 St Mark's Road,,,London,Greater London,W10 6BY,2008-11-25,2013-01-07,A061,2008-11-25,2013-01-07,2089641000.0,0,5LA,1


In [124]:
pd.read_sql_query('SELECT * FROM {typ} LIMIT 3'.format(typ='ecarehomesite'), con)

Unnamed: 0,Organisation Code,Name,National Grouping,High Level Health Geography,Address Line 1,Address Line 2,Address Line 3,Address Line 4,Address Line 5,Postcode,Open Date,Close Date,Parent Organisation Code,Join Parent Date,Left Parent Date,Contact Telephone Number,Amended Record Indicator,Current Care Organisation,Country
0,VL000,Admiral House,Y53,Q36,Cliff Road,Wandsworth,,London,Greater London,SW16 1PA,2008-11-25 00:00:00,2011-11-16 00:00:00,A006,2008-11-25 00:00:00,2011-11-16 00:00:00,8081661000.0,0,5LG,1
1,VL001,Focus Project,Y56,Q71,29 Akerman Road,,,London,Greater London,SW9 6SN,2008-11-25 00:00:00,,A04V,2008-11-25 00:00:00,,2075020000.0,0,08K,1
2,VL002,Alan Morkill House,Y53,Q36,88 St Mark's Road,,,London,Greater London,W10 6BY,2008-11-25 00:00:00,2013-01-07 00:00:00,A061,2008-11-25 00:00:00,2013-01-07 00:00:00,2089641000.0,0,5LA,1
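
Note that Contact Telephone Number has come through as a float (for example `8081661000.0`): pandas inferred a numeric type, which loses any leading zero and appends `.0`. One way round this is to force the column to text at load time. The sketch below uses an in-memory CSV with an invented phone number; the `getData` and `normaliser` helpers would need an extra argument (an assumption, since they do not currently accept one) to pass `dtype` through to `pd.read_csv`.

```python
import io
import pandas as pd

# Made-up rows: code, name, telephone number (second one missing)
raw = io.StringIO("VL000,Admiral House,02081234567\nVL001,Focus Project,\n")

# Default inference turns the digits into a float column (missing value -> NaN)
inferred = pd.read_csv(raw, header=None)

raw.seek(0)
# Forcing column 2 to str keeps the digits exactly as written, leading zero included
as_text = pd.read_csv(raw, header=None, dtype={2: str})

print(inferred[2].tolist())
print(as_text[2].tolist())
```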


## ecarehomesucc - Care Home Successors

In [123]:
#via http://systems.digital.nhs.uk/data/ods/datadownloads/nonnhs
cols=['Organisation Code','Successor Organisation Code',
      'Successor Reason Code','Succession Effective Date','Succession Indicator']

typ='ecarehomesucc'
tmp=normaliser(typ,cols,index=['Organisation Code','Successor Organisation Code'],encoding='Latin-1',db_con=con)
tmp.head(3)

--2016-09-27 10:09:33--  http://systems.digital.nhs.uk/data/ods/datadownloads/data-files/ecarehomesucc.zip
Resolving systems.digital.nhs.uk... 194.189.27.101
Connecting to systems.digital.nhs.uk|194.189.27.101|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 74094 (72K) [application/zip]
Saving to: 'downloads/ecarehomesucc.zip'


2016-09-27 10:09:34 (447 KB/s) - 'downloads/ecarehomesucc.zip' saved [74094/74094]

Archive:  downloads/ecarehomesucc.zip
  inflating: data/ecarehomesucc/ecarehomesucc.csv  
  inflating: data/ecarehomesucc/ecarehomesucc.pdf  




Unnamed: 0_level_0,Unnamed: 1_level_0,Successor Reason Code,Succession Effective Date,Succession Indicator
Organisation Code,Successor Organisation Code,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1
VL002,VM2T3,R,20130108,
VL003,VM682,R,20150727,
VL00F,VM581,R,20141031,F


In [125]:
pd.read_sql_query('SELECT * FROM {typ} LIMIT 3'.format(typ='ecarehomesucc'), con)

Unnamed: 0,Organisation Code,Successor Organisation Code,Successor Reason Code,Succession Effective Date,Succession Indicator
0,VL002,VM2T3,R,20130108,
1,VL003,VM682,R,20150727,
2,VL00F,VM581,R,20141031,F
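
Succession Effective Date is still a plain `YYYYMMDD` integer here because no `dates` argument was given for this file. If real dates are wanted, the column can be converted after loading. A minimal sketch using two of the values shown above; passing `dates=[3]` to the loader might achieve the same at read time, but that is an untested assumption.

```python
import pandas as pd

succ = pd.DataFrame({
    "Organisation Code": ["VL002", "VL003"],
    "Succession Effective Date": [20130108, 20150727],
})

# Go via str so the integers are read as YYYYMMDD rather than as epoch offsets
succ["Succession Effective Date"] = pd.to_datetime(
    succ["Succession Effective Date"].astype(str), format="%Y%m%d")

print(succ)
```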


## ephp - Independent Sector Healthcare Providers 

In [126]:
#via http://systems.digital.nhs.uk/data/ods/datadownloads/nonnhs
cols=['Organisation Code','Name','National Grouping','High Level Health Geography',
      'Address Line 1','Address Line 2','Address Line 3','Address Line 4','Address Line 5','Postcode',
      'Open Date','Close Date','Null','Null',
      'Null','Null','Null','Null',
      'Null','Null','Null',
      'Amended Record Indicator','Null',
      'Null','Null','Null','Null']

typ='ephp'
tmp=normaliser(typ,cols,dates=[10,11],index=['Organisation Code'],db_con=con)
tmp.head(3)

rm: downloads/ephp.zip: No such file or directory
--2016-09-27 14:01:47--  http://systems.digital.nhs.uk/data/ods/datadownloads/data-files/ephp.zip
Resolving systems.digital.nhs.uk... 194.189.27.101
Connecting to systems.digital.nhs.uk|194.189.27.101|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 60304 (59K) [application/zip]
Saving to: 'downloads/ephp.zip'


2016-09-27 14:01:50 (42.8 KB/s) - 'downloads/ephp.zip' saved [60304/60304]

rm: data/ephp/: No such file or directory
Archive:  downloads/ephp.zip
  inflating: data/ephp/ephp.csv      
  inflating: data/ephp/ephp.pdf      




Unnamed: 0_level_0,Name,National Grouping,High Level Health Geography,Address Line 1,Address Line 2,Address Line 3,Address Line 4,Address Line 5,Postcode,Open Date,Close Date,Amended Record Indicator
Organisation Code,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1
AA4,INTRAHEALTH LTD,Y54,Q74,"1ST FLOOR, WILLIAM BROWN CENTRE",MANOR WAY,,PETERLEE,COUNTY DURHAM,SR8 5TW,2013-04-01,NaT,0
AA5,COMPASS WELLBEING COMMUNITY INTEREST COMPANY,Y56,Q71,STEELS LANE HEALTH CENTRE,384-388 COMMERCIAL ROAD,,LONDON,GREATER LONDON,E1 0LR,2013-04-01,NaT,0
AA6,ASSISTED CONCEPTION UNIT LTD,Y56,Q71,LEYTONSTONE HOUSE,LEYTONSTONE,,LONDON,GREATER LONDON,E11 1GA,2013-04-01,NaT,0


In [131]:
pd.read_sql_query('SELECT * FROM {typ} WHERE Name LIKE ?'.format(typ='ephp'), con, params=('%Virgin%',))

Unnamed: 0,Organisation Code,Name,National Grouping,High Level Health Geography,Address Line 1,Address Line 2,Address Line 3,Address Line 4,Address Line 5,Postcode,Open Date,Close Date,Amended Record Indicator
0,NDA,VIRGIN CARE SERVICES LTD,Y56,Q71,LYNTON HOUSE,7-12 TAVISTOCK SQUARE,,LONDON,GREATER LONDON,WC1H 9LT,2011-10-01 00:00:00,,0
1,NDR,VIRGIN CARE PROVIDER SERVICES LTD,Y55,Q78,THE PRIORY,HIGH STREET,,WARE,HERTFORDSHIRE,SG12 9AL,2011-10-01 00:00:00,,0
2,NQT,VIRGIN CARE LTD,Y54,Q75,6400 DARESBURY PARK,DARESBURY,,WARRINGTON,CHESHIRE,WA4 4GE,2010-10-01 00:00:00,,0


## ephpsite - Independent Sector Healthcare Provider Sites

In [138]:
#via http://systems.digital.nhs.uk/data/ods/datadownloads/nonnhs
cols=['Organisation Code','Name','National Grouping','High Level Health Geography',
      'Address Line 1','Address Line 2','Address Line 3','Address Line 4','Address Line 5','Postcode',
      'Open Date','Close Date','Null','Organisation SubType Code',
      'Parent Organisation Code','Null','Null','Null',
      'Null','Null','Null',
      'Amended Record Indicator','Null',
      'Null','Null','Null','Null']

typ='ephpsite'
tmp=normaliser(typ,cols,dates=[10,11],index=['Organisation Code'],encoding='Latin-1',db_con=con)
tmp.head(3)

--2016-09-27 14:10:09--  http://systems.digital.nhs.uk/data/ods/datadownloads/data-files/ephpsite.zip
Resolving systems.digital.nhs.uk... 194.189.27.101
Connecting to systems.digital.nhs.uk|194.189.27.101|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 357999 (350K) [application/zip]
Saving to: 'downloads/ephpsite.zip'


2016-09-27 14:10:11 (358 KB/s) - 'downloads/ephpsite.zip' saved [357999/357999]

Archive:  downloads/ephpsite.zip
  inflating: data/ephpsite/ephpsite.csv  
  inflating: data/ephpsite/ephpsite.pdf  




Unnamed: 0_level_0,Name,National Grouping,High Level Health Geography,Address Line 1,Address Line 2,Address Line 3,Address Line 4,Address Line 5,Postcode,Open Date,Close Date,Organisation SubType Code,Parent Organisation Code,Amended Record Indicator
Organisation Code,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1
AA401,INTRAHEALTH LTD - PETERLEE,Y54,Q74,"1ST FLOOR, WILLIAM BROWN CENTRE",MANOR WAY,,PETERLEE,COUNTY DURHAM,SR8 5TW,2013-04-01,NaT,E,AA4,0
AA402,ASHFURLONG HEALTH CENTRE,Y55,Q77,233 TAMWORTH ROAD,,,SUTTON COLDFIELD,WEST MIDLANDS,B75 6DX,2015-10-01,NaT,E,AA4,0
AA403,SUTTON PARK SURGERY,Y55,Q77,34 CHESTER ROAD NORTH,,,SUTTON COLDFIELD,WEST MIDLANDS,B73 6SP,2015-10-01,NaT,E,AA4,0


In [139]:
pd.read_sql_query('SELECT * FROM {typ} WHERE "Parent Organisation Code" IN (SELECT "Organisation Code" FROM ephp WHERE Name LIKE ?) LIMIT 3'.format(typ='ephpsite'), con, params=('%Virgin%',))

Unnamed: 0,Organisation Code,Name,National Grouping,High Level Health Geography,Address Line 1,Address Line 2,Address Line 3,Address Line 4,Address Line 5,Postcode,Open Date,Close Date,Organisation SubType Code,Parent Organisation Code,Amended Record Indicator
0,NDA01,VIRGIN CARE SERVICES LTD (BROOK GREEN),Y57,Q81,BOURNEWOOD HOUSE,GUILDFORD ROAD,,CHERTSEY,SURREY,KT16 0QA,2011-10-01 00:00:00,,E,NDA,0
1,NDA02,JARVIS BREAST SCREENING CENTRE,Y57,Q81,60 STOUGHTON ROAD,,,GUILDFORD,SURREY,GU1 1LJ,2011-10-01 00:00:00,,E,NDA,0
2,NDA03,HASLEMERE AND DISTRICT HOSPITAL OPD,Y57,Q81,CHURCH LANE,,,HASLEMERE,SURREY,GU27 2BJ,2011-10-01 00:00:00,,E,NDA,0
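
The subquery lookup above can also be written as a join, which makes it easy to show the parent's name next to each site. The following is a minimal sketch rather than part of the original notebook: it builds small stand-in `ephp` and `ephpsite` tables in an in-memory database (the connection name `con_join` and the parent names are made up for illustration; only the site rows are taken from the outputs above).

```python
import sqlite3

# Stand-in tables in a throwaway in-memory database (not nhsadmin.sqlite).
con_join = sqlite3.connect(":memory:")
con_join.execute('CREATE TABLE ephp ("Organisation Code" TEXT, "Name" TEXT)')
con_join.execute('CREATE TABLE ephpsite ("Organisation Code" TEXT, "Name" TEXT, '
                 '"Parent Organisation Code" TEXT)')

# Parent names are placeholders invented for this sketch.
con_join.executemany('INSERT INTO ephp VALUES (?, ?)', [
    ('NDA', 'EXAMPLE VIRGIN PROVIDER'),
    ('AA4', 'EXAMPLE OTHER PROVIDER'),
])
# Site rows copied from the outputs shown earlier.
con_join.executemany('INSERT INTO ephpsite VALUES (?, ?, ?)', [
    ('NDA01', 'VIRGIN CARE SERVICES LTD (BROOK GREEN)', 'NDA'),
    ('NDA02', 'JARVIS BREAST SCREENING CENTRE', 'NDA'),
    ('AA402', 'ASHFURLONG HEALTH CENTRE', 'AA4'),
])

# Join each site to its parent and filter on the parent's name.
# SQLite's LIKE is case-insensitive for ASCII, so '%Virgin%' matches 'VIRGIN'.
sql = """
SELECT s."Organisation Code", s."Name", p."Name" AS "Parent Name"
FROM ephpsite AS s
JOIN ephp AS p ON s."Parent Organisation Code" = p."Organisation Code"
WHERE p."Name" LIKE '%Virgin%'
ORDER BY s."Organisation Code"
"""
join_rows = con_join.execute(sql).fetchall()
for row in join_rows:
    print(row)
```

The same SQL string should work with `pd.read_sql_query(sql, con)` against the real database, assuming the `ephp` table has been loaded.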


In [141]:
#Find sites on the Isle of Wight by matching on the county-level address line
pd.read_sql_query("""SELECT * FROM {typ} WHERE "Address Line 5" LIKE '%WIGHT%'""".format(typ='ephpsite'), con)
#Alternatively, filter by postcode?
#How do these Organisation Codes reconcile with the other kinds of Organisation Code used for the same establishment?

,Organisation Code,Name,National Grouping,High Level Health Geography,Address Line 1,Address Line 2,Address Line 3,Address Line 4,Address Line 5,Postcode,Open Date,Close Date,Organisation SubType Code,Parent Organisation Code,Amended Record Indicator
0,NL50C,COWES MEDICAL CENTRE,Y57,Q70,200 NEWPORT ROAD,,,COWES,ISLE OF WIGHT,PO31 7ER,2016-04-01 00:00:00,,E,NL5,0
1,NL50D,GODSHILL SURGERY (SOUTH WIGHT MEDICAL PRACTICE),Y57,Q70,2 YARBOROUGH CLOSE,GODSHILL,,VENTNOR,ISLE OF WIGHT,PO38 3HS,2016-04-01 00:00:00,,E,NL5,0
2,NL50E,MEDINA HEALTHCARE,Y57,Q70,BRANNON WAY,WOOTTON BRIDGE,,RYDE,ISLE OF WIGHT,PO33 4NW,2016-04-01 00:00:00,,E,NL5,0
3,NL50F,MEDINA LEISURE CENTRE,Y57,Q70,FAIRLEE ROAD,,,NEWPORT,ISLE OF WIGHT,PO30 2DX,2016-04-01 00:00:00,,E,NL5,0
4,NL50G,THE HEIGHTS LEISURE CENTRE,Y57,Q70,BROADWAY,,,SANDOWN,ISLE OF WIGHT,PO36 9ET,2016-04-01 00:00:00,,E,NL5,0
5,NL50H,TOWER HOUSE SURGERY,Y57,Q70,RINK ROAD,,,RYDE,ISLE OF WIGHT,PO33 1LP,2016-04-01 00:00:00,,E,NL5,0
6,NL50J,VENTNOR MEDICAL CENTRE,Y57,Q70,3 ALBERT STREET,,,VENTNOR,ISLE OF WIGHT,PO38 1EZ,2016-04-01 00:00:00,,E,NL5,0
7,NL50K,WEST WIGHT SPORTS CENTRE,Y57,Q70,MOA PLACE,,,FRESHWATER,ISLE OF WIGHT,PO40 9XH,2016-04-01 00:00:00,,E,NL5,0
8,NL51C,CARISBROOKE HEALTH CENTRE,Y57,Q70,22 CARISBROOKE HIGH STREET,,,NEWPORT,ISLE OF WIGHT,PO30 1NR,2015-10-01 00:00:00,,E,NL5,0
9,NL51D,BEECH GROVE SURGERY,Y57,Q70,THE MALL,BRADING,,SANDOWN,ISLE OF WIGHT,PO36 0DE,2015-10-01 00:00:00,,E,NL5,0
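
On the postcode question: one possible approach, sketched here rather than taken from the original notebook, is to match on the outward part of the postcode (everything before the space). It assumes the Isle of Wight is covered by districts PO30 to PO41 and that every postcode contains a single space; check both against the real data. The connection name `con_pc` and the Portsmouth-style row are made up for the demonstration.

```python
import sqlite3

# Small stand-in for the ephpsite table in a throwaway in-memory database.
con_pc = sqlite3.connect(":memory:")
con_pc.execute('CREATE TABLE ephpsite ("Organisation Code" TEXT, "Name" TEXT, '
               '"Address Line 5" TEXT, "Postcode" TEXT)')
con_pc.executemany('INSERT INTO ephpsite VALUES (?, ?, ?, ?)', [
    ('NL50C', 'COWES MEDICAL CENTRE', 'ISLE OF WIGHT', 'PO31 7ER'),
    ('NL50K', 'WEST WIGHT SPORTS CENTRE', 'ISLE OF WIGHT', 'PO40 9XH'),
    ('NDA02', 'JARVIS BREAST SCREENING CENTRE', 'SURREY', 'GU1 1LJ'),
    # Invented row: a mainland PO3 postcode that a naive LIKE 'PO3%' would also match.
    ('ZZ999', 'HYPOTHETICAL MAINLAND SITE', 'HAMPSHIRE', 'PO3 5AB'),
])

# Assumed list of Isle of Wight postcode districts: PO30 .. PO41.
iow_districts = ['PO{}'.format(n) for n in range(30, 42)]
placeholders = ','.join('?' for _ in iow_districts)

# Compare only the outward code, i.e. the text before the first space.
sql = ('SELECT "Organisation Code", "Postcode" FROM ephpsite '
       'WHERE substr("Postcode", 1, instr("Postcode", \' \') - 1) IN ({}) '
       'ORDER BY "Organisation Code"').format(placeholders)
pc_rows = con_pc.execute(sql, iow_districts).fetchall()
for row in pc_rows:
    print(row)
```

Matching the whole outward code avoids picking up mainland districts such as PO3, which a simple prefix match on `PO3` would include.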


## enonnhs - Non-NHS Organisations

In [137]:
#via http://systems.digital.nhs.uk/data/ods/datadownloads/nonnhs
cols=['Organisation Code','Name','National Grouping','High Level Health Geography',
      'Address Line 1','Address Line 2','Address Line 3','Address Line 4','Address Line 5','Postcode',
      'Open Date','Close Date','Null','Organisation SubType Code',
      'Null','Null','Null','Contact Telephone Number',
      'Null','Null','Null',
      'Amended Record Indicator','Null',
      'Null','Null','Null','Null']

typ='enonnhs'
tmp=normaliser(typ,cols,dates=[10,11],index=['Organisation Code'],encoding='Latin-1',db_con=con)
tmp.head(3)

rm: downloads/enonnhs.zip: No such file or directory
--2016-09-27 14:09:52--  http://systems.digital.nhs.uk/data/ods/datadownloads/data-files/enonnhs.zip
Resolving systems.digital.nhs.uk... 194.189.27.101
Connecting to systems.digital.nhs.uk|194.189.27.101|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 510692 (499K) [application/zip]
Saving to: 'downloads/enonnhs.zip'


2016-09-27 14:09:54 (342 KB/s) - 'downloads/enonnhs.zip' saved [510692/510692]

rm: data/enonnhs/: No such file or directory
Archive:  downloads/enonnhs.zip
  inflating: data/enonnhs/enonnhs.csv  
  inflating: data/enonnhs/enonnhs.pdf  




Organisation Code,Name,National Grouping,High Level Health Geography,Address Line 1,Address Line 2,Address Line 3,Address Line 4,Address Line 5,Postcode,Open Date,Close Date,Organisation SubType Code,Contact Telephone Number,Amended Record Indicator
8A001,JORDAN HILL NH,Y54,Q72,37 GAWBER ROAD,,,BARNSLEY,SOUTH YORKSHIRE,S75 2AN,1996-04-01,NaT,R,,0
8A003,THE GROVE NH,Y54,Q72,THURNSCOE BRIDGE LANE,THURNSCOE,,ROTHERHAM,SOUTH YORKSHIRE,S63 0SN,1996-04-01,NaT,R,,0
8A005,THURNSCOE HALL NH,Y54,Q72,HIGH STREET,THURNSCOE,,ROTHERHAM,SOUTH YORKSHIRE,S63 0ST,1996-04-01,NaT,R,,0
