## NHS Master Data Management (in England)

It is quite common to load master data files from [ODS Data Search and Export](https://digital.nhs.uk/services/organisation-data-service/data-search-and-export/csv-downloads) into a computer system. The identifiers used for GPs and organisations help identify these entities across different systems and health providers.

NHS England provides several APIs for doing this:

- [Organisation Data Terminology - FHIR API](https://digital.nhs.uk/developer/api-catalogue/organisation-data-terminology) which allows you to search for organisations
- [Spine Directory Service - LDAP API](https://digital.nhs.uk/developer/api-catalogue/spine-directory-service-ldap) which allows search on a wide set of MDM entities and includes most of the entities from ODS.

The structure of these entities in FHIR, ODS and SDS is very similar. This diagram is from [HL7 FHIR Administration Module](https://hl7.org/fhir/R4/administration-module.html)

![Alt text](https://hl7.org/fhir/R4/administration-module-prov-dir.png)

### Care Directory Service

In this guide we are aiming to produce a FHIR API following [IHE Mobile Care Services Discovery (mCSD)](https://profiles.ihe.net/ITI/mCSD/index.html). We won't get to a complete implementation, as the health services are available in a variety of `directory of services` APIs, such as:

- [Directory of Healthcare Services (Service Search) API](https://digital.nhs.uk/developer/api-catalogue/directory-of-healthcare-services)
- [Electronic Transmission of Prescriptions Web Services - SOAP API](https://digital.nhs.uk/developer/api-catalogue/electronic-transmission-of-prescriptions-web-services-soap)

### Plan/Design

This notebook explores the ETL process in the diagram below.

![Alt text](images/ETL+Airflow.drawio.png)



### Load GP Practitioners (egpcur)

The general idea behind this is that we want to be able to do some basic queries on ODS data. For example, we may want a list of GPs who work at

In [134]:
import requests
from zipfile import ZipFile
from io import BytesIO
import pandas as pd
import numpy as np

headers = {'User-Agent': 'Mozilla/5.0 (X11; Windows; Windows x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.5060.114 Safari/537.36'}

url = 'https://files.digital.nhs.uk/assets/ods/current/egpcur.zip'
response = requests.get(url, headers=headers, timeout=120)
response.raise_for_status()  # Raise an exception for bad status codes

myzip = ZipFile(BytesIO(response.content))
myzip.namelist()
myzip.extractall('ZIP')

egpcur = pd.read_csv('ZIP/egpcur.csv', header=None, index_col=False, names=["GMP","Practitioner_Name",3,4,"AddressLine_1","AddressLine_2","AddressLine_3","AddressLine_4","AddressLine_5","PostCode",10,11,12,13,"ODS",15,16,"PhoneNumber",18,19,20,21,22,23,24,25,26], dtype={'AddressLine_5': 'S20'})

egpcur

Unnamed: 0,GMP,Practitioner_Name,3,4,AddressLine_1,AddressLine_2,AddressLine_3,AddressLine_4,AddressLine_5,PostCode,...,PhoneNumber,18,19,20,21,22,23,24,25,26
0,G0102005,ALLEN EB,Y11,QAL,"FIRCROFT, LONDON ROAD",ENGLEFIELD GREEN,EGHAM,SURREY,b'',TW20 0BS,...,,,,,1,,,,,
1,G0102926,ANDERSON MG,Y61,QUE,LENSFIELD MEDICAL PRAC.,48 LENSFIELD ROAD,CAMBRIDGE,CAMBRIDGESHIRE,b'',CB2 1EH,...,01223 651020,,,,1,,06H,,,
2,G0105912,ADLER S,Y56,QMJ,682 FINCHLEY ROAD,GOLDERS GREEN,LONDON,,b'',NW11 7NP,...,020 84559994,,,,1,,93C,,,
3,G0107031,ATTWOOD DC,Y62,QOP,GREAT LEVER HEALTH CENTRE,"RUPERT STREET,GREAT LEVER",BOLTON,LANCASHIRE,b'',BL3 6RN,...,01204 462141,,,,1,,00T,,,
4,G0107725,ALEXANDER PJ,Y01,QDF,10 WEST END,SWANLAND,HUMBERSIDE,,b'',HU14 3PE,...,0482 633570,,,,1,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
123768,G9996043,UNIDENTIFIED GPS,W00,Q99,NORTH WALES HA,PRESWYLFA,HENDY ROAD,MOLD FLINTSHIRE,b'',CH7 1PZ,...,,,,,1,,,,,
123769,G9996050,UNIDENTIFIED GPS,W00,Q99,MORGANNWG HA,41 HIGH STREET,SWANSEA,WEST GLAMORGAN,b'',SA1 1LT,...,,,,,1,,,,,
123770,G9996067,COMMITTEES LOCUM,W00,QW3,DEPUTISING SERVICES,POWYS,,,b'',,...,,,,,1,,,,,
123771,G9996074,COMMITTEES LOCUM,W00,QW2,DEPUTISING SERVICES,SOUTH-GLAMORGAN,,,b'',,...,,,,,1,,,,,
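As a rough illustration of the kind of query this data supports, the sketch below filters a small hand-made frame by practice ODS code. Two of the rows reuse GMP codes that appear in this notebook's outputs; the third row (`G0000001`) is invented, and `gps_at_practice` is a hypothetical helper rather than anything provided by pandas or ODS.

```python
import pandas as pd

# Toy stand-in for the egpcur frame, cut down to three columns.
# The G0000001 row is invented so the filter returns more than one GP.
toy_egpcur = pd.DataFrame(
    {
        "GMP": ["G0108324", "G3298457", "G0000001"],
        "Practitioner_Name": ["ANDERSON CF", "KOYA MR", "EXAMPLE A"],
        "ODS": ["A81001", "F83004", "A81001"],
    }
)


def gps_at_practice(gps: pd.DataFrame, ods_code: str) -> pd.DataFrame:
    """Return the GMP code and name of every GP linked to the given practice."""
    return gps.loc[gps["ODS"] == ods_code, ["GMP", "Practitioner_Name"]]


result = gps_at_practice(toy_egpcur, "A81001")
print(result)
```

The same function should work unchanged on the full `egpcur` frame, since it only relies on the `GMP`, `Practitioner_Name` and `ODS` columns named when the file was read.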


### Load GP Practices (epraccur)

In [135]:
url = 'https://files.digital.nhs.uk/assets/ods/current/epraccur.zip'
response = requests.get(url, headers=headers, timeout=120)
response.raise_for_status()  # Raise an exception for bad status codes

myzip = ZipFile(BytesIO(response.content))
#myzip.namelist()
myzip.extractall('ZIP')

epraccur = pd.read_csv('ZIP/epraccur.csv', header=None, index_col=False, names=["ODS","Organisation_Name","NationalGrouping",4,"AddressLine_1","AddressLine_2","AddressLine_3","AddressLine_4","AddressLine_5","PostCode","Opened","Closed",13,14,"PRAC_ODS",16,17,"PhoneNumber",19,20,21,22,23,24,25,26])

epraccur = epraccur.set_index(['ODS'])

epraccur

ODS,Organisation_Name,NationalGrouping,4,AddressLine_1,AddressLine_2,AddressLine_3,AddressLine_4,AddressLine_5,PostCode,Opened,...,17,PhoneNumber,19,20,21,22,23,24,25,26
A81001,THE DENSHAM SURGERY,Y63,QHM,THE HEALTH CENTRE,LAWSON STREET,STOCKTON ON TEES,CLEVELAND,,TS18 1HU,19740401,...,,01642 672351,,,,0,,16C,,4
A81002,QUEENS PARK MEDICAL CENTRE,Y63,QHM,QUEENS PARK MEDICAL CTR,FARRER STREET,STOCKTON ON TEES,CLEVELAND,,TS18 2AW,19740401,...,,01642 618170,,,,0,,16C,,4
A81003,VICTORIA MEDICAL PRACTICE,Y54,Q74,THE HEALTH CENTRE,VICTORIA ROAD,HARTLEPOOL,CLEVELAND,,TS26 8DB,19740401,...,20171031.0,01429 272945,,,,0,,00K,,4
A81004,ACKLAM MEDICAL CENTRE,Y63,QHM,TRIMDON AVENUE,ACKLAM,MIDDLESBROUGH,CLEVELAND,,TS5 8SB,19740401,...,,01642 827697,,,,0,,16C,,4
A81005,SPRINGWOOD SURGERY,Y63,QHM,SPRINGWOOD SURGERY,RECTORY LANE,GUISBOROUGH,,,TS14 7DJ,19740401,...,,01287 619611,,,,0,,16C,,4
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
Y08757,COMMUNITY HOSPITAL ALCOHOL TEAM,Y60,QNC,EDWARD MYERS UNIT,HARPLANDS HOSPITAL,STOKE-ON-TRENT,STAFFORDSHIRE,,ST4 6TH,20250501,...,,01782 441715,,,,1,,RLY,,10
Y08758,LARC SERVICE,Y60,QJM,COUNTY OFFICES,NEWLAND,LINCOLN,LINCOLNSHIRE,,LN1 1YL,20250401,...,,01522 554980,,,,1,,503,,8
Y08759,WELL LIFE CLINIC,Y59,QXU,THE HOUSE PARTNERSHIP,99 STATION ROAD,REDHILL,SURREY,,RH1 1EB,20250601,...,,01737 761201,,,,1,,92A,,0
Y08760,OSPREY UNIT - PODIATRY,Y58,QOX,GREAT WESTERN HOSPITAL,MARLBOROUGH ROAD,SWINDON,WILTSHIRE,,SN3 6BB,20250422,...,,01793 604300,,,,1,,92G,,9


This next section of code:
- Adds the practice name to the GP data frame
- Splits the practitioner name into surname and initials

In [136]:

egpcur = pd.merge(egpcur, epraccur['Organisation_Name'], left_on='ODS', right_on='ODS')

egpcur['Practitioner_Surname'] = egpcur['Practitioner_Name'].str.split(' ', expand=True)[0]
egpcur['Practitioner_Initials'] = egpcur['Practitioner_Name'].str.split(' ', expand=True)[1]
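Splitting on every space takes only the first word as the surname, which may not suit multi-word surnames. A possible alternative, sketched here on a small series, is to split once from the right with `str.rsplit`. The third name is invented for illustration, and whether the real file contains such names is an assumption.

```python
import pandas as pd

# Two names from the output above plus one invented multi-word surname.
names = pd.Series(["ANDERSON MG", "ADLER S", "VAN DER BERG A"])

# Split once, from the right: everything before the final space is the surname,
# everything after it is treated as the initials.
parts = names.str.rsplit(" ", n=1, expand=True)
split_names = pd.DataFrame({"surname": parts[0], "initials": parts[1]})
print(split_names)
```

For single-word surnames this gives the same result as the cell above, so it could be swapped in without changing the later lookups.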

Updated GP data frame

In [137]:
egpcur

Unnamed: 0,GMP,Practitioner_Name,3,4,AddressLine_1,AddressLine_2,AddressLine_3,AddressLine_4,AddressLine_5,PostCode,...,20,21,22,23,24,25,26,Organisation_Name,Practitioner_Surname,Practitioner_Initials
0,G0102926,ANDERSON MG,Y61,QUE,LENSFIELD MEDICAL PRAC.,48 LENSFIELD ROAD,CAMBRIDGE,CAMBRIDGESHIRE,b'',CB2 1EH,...,,1,,06H,,,,LENSFIELD MEDICAL PRACTICE,ANDERSON,MG
1,G0105912,ADLER S,Y56,QMJ,682 FINCHLEY ROAD,GOLDERS GREEN,LONDON,,b'',NW11 7NP,...,,1,,93C,,,,ADLER JS-THE SURGERY,ADLER,S
2,G0107031,ATTWOOD DC,Y62,QOP,GREAT LEVER HEALTH CENTRE,"RUPERT STREET,GREAT LEVER",BOLTON,LANCASHIRE,b'',BL3 6RN,...,,1,,00T,,,,LEVER CHAMBERS 2,ATTWOOD,DC
3,G0108018,ALLDRIDGE DGE,Y59,QXU,OAKFIELD,158 STATION ROAD,REDHILL,SURREY,b'',RH1 1HF,...,,1,,,,,,MOAT HOUSE SURGERY,ALLDRIDGE,DGE
4,G0108324,ANDERSON CF,Y63,QHM,THE HEALTH CENTRE,LAWSON STREET,STOCKTON ON TEES,CLEVELAND,b'',TS18 1HU,...,,1,,16C,,,,THE DENSHAM SURGERY,ANDERSON,CF
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
119854,G9996012,UNIDENTIFIED GPS,W00,Q99,GWENT HA,MAMHILAD HOUSE,MAMHHILAD PARK ESTATE,PONTYPOOL GWENT,b'',NP4 0YP,...,,1,,,,,,UNIDENTIFIED GPS,UNIDENTIFIED,GPS
119855,G9996029,UNIDENTIFIED GPS,W00,Q99,BRO TAF HA,CHURCHILL HOUSE,CHURCHILL WAY,CARDIFF,b'',CF10 2TW,...,,1,,,,,,UNIDENTIFIED GPS,UNIDENTIFIED,GPS
119856,G9996036,UNIDENTIFIED GPS,W00,Q99,DYFED POWYS HA,ST. DAVID'S HOSPITAL,CARMARTHEN,DYFED,b'',SA31 3HB,...,,1,,,,,,UNIDENTIFIED GPS,UNIDENTIFIED,GPS
119857,G9996043,UNIDENTIFIED GPS,W00,Q99,NORTH WALES HA,PRESWYLFA,HENDY ROAD,MOLD FLINTSHIRE,b'',CH7 1PZ,...,,1,,,,,,UNIDENTIFIED GPS,UNIDENTIFIED,GPS


In [138]:
practitionerDF = egpcur.loc[(egpcur['Practitioner_Surname'] == "KOYA") & (egpcur['Practitioner_Initials'] == "MR")]

row = practitionerDF.iloc[0]

### Practitioner

In [139]:
from fhir.resources.R4B.practitioner import Practitioner
import json

active = True

practitionerJSON = {
    "resourceType": "Practitioner",
    "identifier": [
        {
            "system": "https://fhir.hl7.org.uk/Id/gmp-number",
            "value": row['GMP']
        }
    ],
    "active": active,
    "name": [
        {
            "family": row['Practitioner_Surname'],
            "given": [
                row["Practitioner_Initials"]
            ],
            "prefix": [
                "Dr"
            ]
        }
    ],
    "telecom": [
        {
            "system": "phone",
            "value": row['PhoneNumber'],
            "use": "work"
        }
    ],
    "address": [
        {
            "use": "work",
            "postalCode": row['PostCode']
        }
    ]
}

practitioner = Practitioner(**practitionerJSON)

print(json.dumps(practitionerJSON, indent=2, ensure_ascii=False))

{
  "resourceType": "Practitioner",
  "identifier": [
    {
      "system": "https://fhir.hl7.org.uk/Id/gmp-number",
      "value": "G3298457"
    }
  ],
  "active": true,
  "name": [
    {
      "family": "KOYA",
      "given": [
        "MR"
      ],
      "prefix": [
        "Dr"
      ]
    }
  ],
  "telecom": [
    {
      "system": "phone",
      "value": "020 72720111",
      "use": "work"
    }
  ],
  "address": [
    {
      "use": "work",
      "postalCode": "N19 3NX"
    }
  ]
}
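Several rows in egpcur have no phone number, and pandas represents an empty cell as `NaN`, which is not valid in a FHIR string element. The sketch below wraps the mapping in a function that leaves out `telecom` and `address` when the source value is missing. `is_missing` and `build_practitioner` are hypothetical helper names; the function takes a plain dict, although a pandas row should also work because it supports `.get`.

```python
import json
import math


def is_missing(value) -> bool:
    """True for None, blank strings and float NaN (how pandas marks empty cells)."""
    if value is None:
        return True
    if isinstance(value, float) and math.isnan(value):
        return True
    return isinstance(value, str) and value.strip() == ""


def build_practitioner(row) -> dict:
    """Build a Practitioner dict, omitting telecom/address when the source is empty."""
    resource = {
        "resourceType": "Practitioner",
        "identifier": [
            {"system": "https://fhir.hl7.org.uk/Id/gmp-number", "value": row["GMP"]}
        ],
        "active": True,
        "name": [
            {
                "family": row["Practitioner_Surname"],
                "given": [row["Practitioner_Initials"]],
                "prefix": ["Dr"],
            }
        ],
    }
    if not is_missing(row.get("PhoneNumber")):
        resource["telecom"] = [
            {"system": "phone", "value": row["PhoneNumber"].strip(), "use": "work"}
        ]
    if not is_missing(row.get("PostCode")):
        resource["address"] = [{"use": "work", "postalCode": row["PostCode"]}]
    return resource


# A row without a phone number, modelled on the first row of the egpcur output.
example = build_practitioner(
    {
        "GMP": "G0102005",
        "Practitioner_Surname": "ALLEN",
        "Practitioner_Initials": "EB",
        "PhoneNumber": float("nan"),
        "PostCode": "TW20 0BS",
    }
)
print(json.dumps(example, indent=2))
```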


### PractitionerRole

A practitioner can work at multiple organisations, so we need a link entity (table).

The elements `code` and `specialty` are optional, but we can improve our search capabilities by adding data we can infer from the source file (egpcur): namely, that the practitioner is a GP and works in General Practice.

Note how we have incorporated identifiers and display names. This provides some common data elements in the resource, so the user does not need to perform another search to retrieve these details; we can clearly see this role is for Dr Koya at the Archway Medical Centre.

In [140]:
from fhir.resources.R4B.practitionerrole import PractitionerRole

practitionerRoleJSON = {
    "resourceType": "PractitionerRole",
    "active": True,
    "practitioner": {
        "identifier": {
            "system": "https://fhir.hl7.org.uk/Id/gmp-number",
            "value": row['GMP']
        },
        "display": row['Practitioner_Name']
    },
    "organization": {
        "identifier": {
            "system": "https://fhir.nhs.uk/Id/ods-organization-code",
            "value": row['ODS']
        },
        "display": row['Organisation_Name']
    },
    "code": [
        {
            "coding": [
                {
                    "system": "http://snomed.info/sct",
                    "code": "62247001",
                    "display": "General practitioner"
                }
            ]
        }
    ],
    "specialty": [
        {
            "coding": [
                {
                    "system": "http://snomed.info/sct",
                    "code": "394814009",
                    "display": "General practice (specialty) (qualifier value)"
                }
            ]
        }
    ]
}

practitionerRole = PractitionerRole(**practitionerRoleJSON)

print(json.dumps(practitionerRoleJSON, indent=2, ensure_ascii=False))

{
  "resourceType": "PractitionerRole",
  "active": true,
  "practitioner": {
    "identifier": {
      "system": "https://fhir.hl7.org.uk/Id/gmp-number",
      "value": "G3298457"
    },
    "display": "KOYA MR"
  },
  "organization": {
    "identifier": {
      "system": "https://fhir.nhs.uk/Id/ods-organization-code",
      "value": "F83004"
    },
    "display": "ARCHWAY MEDICAL CENTRE"
  },
  "code": [
    {
      "coding": [
        {
          "system": "http://snomed.info/sct",
          "code": "62247001",
          "display": "General practitioner"
        }
      ]
    }
  ],
  "specialty": [
    {
      "coding": [
        {
          "system": "http://snomed.info/sct",
          "code": "394814009",
          "display": "General practice (specialty) (qualifier value)"
        }
      ]
    }
  ]
}
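Because PractitionerRole is the link entity, a practitioner who works at two practices would need two PractitionerRole resources. The sketch below shows one way this could look. `build_practitioner_role` is a hypothetical helper, and the practitioner `G0000001` and its two placements are invented; only the organisation codes and names are taken from this notebook.

```python
GMP_SYSTEM = "https://fhir.hl7.org.uk/Id/gmp-number"
ODS_SYSTEM = "https://fhir.nhs.uk/Id/ods-organization-code"


def build_practitioner_role(gmp, practitioner_name, ods, organisation_name) -> dict:
    """Build one PractitionerRole linking a practitioner to one organisation."""
    return {
        "resourceType": "PractitionerRole",
        "active": True,
        "practitioner": {
            "identifier": {"system": GMP_SYSTEM, "value": gmp},
            "display": practitioner_name,
        },
        "organization": {
            "identifier": {"system": ODS_SYSTEM, "value": ods},
            "display": organisation_name,
        },
    }


# Invented placements: the same (made-up) practitioner at two real practices.
placements = [
    ("G0000001", "EXAMPLE A", "A81001", "THE DENSHAM SURGERY"),
    ("G0000001", "EXAMPLE A", "F83004", "ARCHWAY MEDICAL CENTRE"),
]

# One PractitionerRole per (practitioner, organisation) pair.
roles = [build_practitioner_role(*placement) for placement in placements]
for role in roles:
    print(role["practitioner"]["display"], "->", role["organization"]["display"])
```

The `code` and `specialty` elements from the cell above are left out here to keep the sketch short; they could be added to the returned dict in the same way.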


## Testing FHIR (Validation)

So far we have just created FHIR resources as JSON. We have performed basic schema validation using [fhir.resources](https://github.com/nazrulworld/fhir.resources). Note that this package uses FHIR R4B, not R4, and we are using R4. If that sounds confusing, the resources used here (Organization, Practitioner and PractitionerRole) are the same in R4 and R4B, so this is fine.

You can also validate FHIR using command line tools such as [FHIR CLI Validator](https://confluence.hl7.org/spaces/HAFWG/pages/248876078/Using+the+FHIR+Validator+Locally+Quick+Guide) or online applications such as [validator.fhir.org](https://validator.fhir.org/).

Note these tools will generate warnings around England content; you can reduce these warnings by using the [NHS England UK Core](https://digital.nhs.uk/services/fhir-uk-core) package. We use our own package [Virtual Healthcare Testing](https://virtually-healthcare.github.io/R4/testing.html), which incorporates UK Core and extra NHS England data requirements. Documentation on the Virtually Healthcare data requirements, which are called FHIR Profiles, can be found below:

- [Organization](https://virtually-healthcare.github.io/R4/StructureDefinition-Organization.html)
- [Practitioner](https://virtually-healthcare.github.io/R4/StructureDefinition-Practitioner.html)
- [PractitionerRole](https://virtually-healthcare.github.io/R4/StructureDefinition-PractitionerRole.html)

The profiles are stricter than UK Core because they need to be followed in several products. They are generally conformant to wider NHS England data requirements (not just FHIR).

### Working with a FHIR Test Server

Instructions for loading the resources we built earlier into a FHIR server are widely available on the internet, so we won't repeat them here.

If you wish to experiment with this, I would suggest using the [HAPI FHIR Test Server](https://hapi.fhir.org/), e.g.

`POST http://hapi.fhir.org/baseR4/Organization`

`POST http://hapi.fhir.org/baseR4/Practitioner`

`POST http://hapi.fhir.org/baseR4/PractitionerRole`

Once you have added the resources to HAPI FHIR, you should be able to search for them, e.g.

`GET http://hapi.fhir.org/baseR4/Organization?identifier=https://fhir.nhs.uk/Id/ods-organization-code|F83004`

`GET http://hapi.fhir.org/baseR4/Practitioner?identifier=https://fhir.hl7.org.uk/Id/gmp-number|G3298457`
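When these searches are issued from code, the `system|value` pair is usually percent-encoded. The following is a minimal sketch using only the standard library; `identifier_search_url` is a hypothetical helper name, and the base URL is the HAPI test server mentioned above.

```python
from urllib.parse import urlencode


def identifier_search_url(base_url: str, resource_type: str, system: str, value: str) -> str:
    """Build a FHIR identifier search URL, percent-encoding the system|value token."""
    query = urlencode({"identifier": f"{system}|{value}"})
    return f"{base_url.rstrip('/')}/{resource_type}?{query}"


search_url = identifier_search_url(
    "http://hapi.fhir.org/baseR4",
    "Organization",
    "https://fhir.nhs.uk/Id/ods-organization-code",
    "F83004",
)
print(search_url)
```

The resulting URL could then be passed to `requests.get(search_url, headers={"Accept": "application/fhir+json"})`; a FHIR server answers a search with a `Bundle` resource whose `entry` array holds the matches.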


## Practical Implementation

So far we have a relatively simple model for our GPs and practices: both are strongly identified using national identifiers. In practice, we will have several other identifiers, and existing use of the national identifiers may not be robust and may have data issues. This can occur in all EPR systems, including secondary care.

The main issue is that, although GMP is defined in [GENERAL MEDICAL PRACTITIONER PPD CODE](https://www.datadictionary.nhs.uk/attributes/general_medical_practitioner_ppd_code.html), this and the other practitioner identifiers are quite frequently mixed up.

How to handle this is beyond the scope of this walkthrough. A list of all the different practitioner identifiers can be found on [NHS North West GMSA](https://nw-gmsa.github.io/R4/StructureDefinition-EnglandPractitionerIdentifier.html).

Many systems will have their own strong identifier; for example, EMIS uses UUIDs to identify practitioners across all its APIs. Our use case is master data management, so it makes sense for us to have a record of that in our MDM solution. Because suppliers support the operational delivery of care, and ODS is only updated monthly or quarterly, our Practitioner record may well hold more detail than ODS or be more up to date.

This means we need to cope with existing data: our data load needs to be repeatable (so we can schedule quarterly or monthly runs) and able to merge with existing data.
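One option for a repeatable load is the conditional update interaction defined by FHIR (`PUT [base]/[type]?[search parameters]`), which lets the server match on a business identifier instead of its own id. The sketch below only builds the request description and does not send it. `conditional_update_request` is a hypothetical helper, the base URL is the local server used later in this notebook, and whether a particular server supports conditional update is an assumption that would need checking against its CapabilityStatement.

```python
import json
from urllib.parse import urlencode


def conditional_update_request(base_url: str, resource: dict) -> dict:
    """Describe a conditional update keyed on the resource's first identifier."""
    ident = resource["identifier"][0]
    query = urlencode({"identifier": f"{ident['system']}|{ident['value']}"})
    return {
        "method": "PUT",
        "url": f"{base_url.rstrip('/')}/{resource['resourceType']}?{query}",
        "headers": {"Content-Type": "application/fhir+json"},
        "data": json.dumps(resource),
    }


organisation = {
    "resourceType": "Organization",
    "identifier": [
        {"system": "https://fhir.nhs.uk/Id/ods-organization-code", "value": "F83004"}
    ],
    "name": "ARCHWAY MEDICAL CENTRE",
}

req = conditional_update_request("http://localhost:32783/fhir/r4", organisation)
print(req["method"], req["url"])
```

The dict uses the keyword names accepted by `requests.request`, so `requests.request(**req)` would send it. Running the same load twice would then update the matching resource rather than create a duplicate, provided the server honours the interaction.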

### Demonstration FHIR Server and Database

The examples that follow use an InterSystems FHIR Repository. Instructions for running this on a local machine can be found in [iris-fhir-template](https://github.com/intersystems-community/iris-fhir-template/blob/master/README.md).

Once installed, you can browse to the [SQL Explorer](http://localhost:32783/csp/sys/exp/%25CSP.UI.Portal.SQL.Home.zen?$NAMESPACE=FHIRSERVER). The username is `_SYSTEM` and the password is `SYS`.

Then execute the following SQL.

`select * from HSFHIR_X0001_S.Organization where addressCountry <> 'US'`

Note the IRIS demo comes with some preloaded test data; the where clause excludes this. Python version is below:

In [141]:
import iris
import pandas as pd


host = "localhost"
# this is the superserver port
port = 32782
namespace = "FHIRSERVER"
user = "_SYSTEM"
password = "SYS"

conn = iris.connect(
    hostname=host,
    port=port,
    namespace=namespace,
    username=user,
    password=password
)

# create a cursor
cursor = conn.cursor()

sql = """
      select org.ID1, org.Key, org.Identifier, org._lastUpdated, resource.ResourceString
      from HSFHIR_X0001_S.Organization org
      join HSFHIR_X0001_R.Rsrc resource on resource.Key = org.Key
      where IsNull(org.addressCountry,'') <> 'US'
        and org.type [ 'https://fhir.nhs.uk/CodeSystem/organisation-role|76'
      """

cursor.execute(sql)
data = cursor.fetchall()
column_names = [desc[0] for desc in cursor.description]
df = pd.DataFrame(data, columns=column_names)
pd.set_option('future.no_silent_downcasting', True)
df

Unnamed: 0,ID1,Key,identifier,_lastUpdated,ResourceString
0,8151,Organization/1250048,"A81003,https://fhir.nhs.uk/Id/ods-organization...",2025-06-15T15:52:44Z,"{""resourceType"":""Organization"",""identifier"":[{..."
1,8152,Organization/1250049,"C82094,https://fhir.nhs.uk/Id/ods-organization...",2025-06-15T15:52:44Z,"{""resourceType"":""Organization"",""identifier"":[{..."
2,8153,Organization/1250050,"F83004,https://fhir.nhs.uk/Id/ods-organization...",2025-06-15T15:56:19Z,"{""resourceType"":""Organization"",""identifier"":[{..."
3,8154,Organization/1250051,"A81011,https://fhir.nhs.uk/Id/ods-organization...",2025-06-15T15:59:52Z,"{""resourceType"":""Organization"",""identifier"":[{..."
4,8155,Organization/1250052,"A81001,https://fhir.nhs.uk/Id/ods-organization...",2025-06-15T16:05:41Z,"{""resourceType"":""Organization"",""identifier"":[{..."
...,...,...,...,...,...
11995,20408,Organization/1262305,"Y04107,https://fhir.nhs.uk/Id/ods-organization...",2025-06-15T16:38:04Z,"{""resourceType"":""Organization"",""identifier"":[{..."
11996,20409,Organization/1262306,"Y04108,https://fhir.nhs.uk/Id/ods-organization...",2025-06-15T16:38:04Z,"{""resourceType"":""Organization"",""identifier"":[{..."
11997,20410,Organization/1262307,"Y04109,https://fhir.nhs.uk/Id/ods-organization...",2025-06-15T16:38:04Z,"{""resourceType"":""Organization"",""identifier"":[{..."
11998,20411,Organization/1262308,"Y04110,https://fhir.nhs.uk/Id/ods-organization...",2025-06-15T16:38:04Z,"{""resourceType"":""Organization"",""identifier"":[{..."


### Process Organisation

The results from the above will vary.

What we need to do is merge the organisations from ODS with the organisations already in our database.

The outline logic will be:

- Organisation exists in both:
  - ODS will be assumed to be the master record for active and address fields, if ODS has different values then update the database.
  - If our telephone field is empty, then update with ODS entry, otherwise do not process.
- Organisation does not exist in our database.
  - Add the ODS organisation
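The outline above can be sketched as a small decision function. This is only an illustration of the rules as listed: `plan_organisation_action` is a hypothetical name, the flat keys `active`, `postalCode` and `phone` are simplifications of the FHIR structure, and the phone number `01642 000000` is invented.

```python
from typing import Optional


def plan_organisation_action(ods: dict, existing: Optional[dict]) -> dict:
    """Decide what to do with one ODS organisation, following the outline above."""
    if existing is None:
        # Not in our database yet: add the ODS organisation as it is.
        return {"action": "create", "changes": dict(ods)}

    changes = {}
    # ODS is treated as the master for the active flag and the address.
    for field in ("active", "postalCode"):
        if ods.get(field) != existing.get(field):
            changes[field] = ods.get(field)
    # Only fill the telephone number when we do not already hold one.
    if not existing.get("phone") and ods.get("phone"):
        changes["phone"] = ods["phone"]

    return {"action": "update" if changes else "skip", "changes": changes}


ods_record = {"active": True, "postalCode": "TS18 1HU", "phone": "01642 672351"}

new_org = plan_organisation_action(ods_record, None)
same_org = plan_organisation_action(
    ods_record, {"active": True, "postalCode": "TS18 1HU", "phone": "01642 000000"}
)
stale_org = plan_organisation_action(
    ods_record, {"active": False, "postalCode": "TS18 1HU", "phone": None}
)
print(new_org["action"], same_org["action"], stale_org["action"])
```

Keeping the decision separate from the HTTP calls makes the rules easy to test without a running FHIR server.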

Firstly, we need to merge the dataframe (df) we have retrieved from the FHIR repository with the epraccur data frame.

We do this on the ODS code, which in the `df` dataframe is held in a comma-separated `identifier` string. The ODS code has a system of `https://fhir.nhs.uk/Id/ods-organization-code`, so we need the entry that has this system and also has a value.


In [142]:
identifier = df.loc[0,'identifier']

identifiers = identifier.split(',')
identifiers

['A81003',
 'https://fhir.nhs.uk/Id/ods-organization-code|A81003',
 'https://fhir.nhs.uk/Id/ods-organization-code|']
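The rule for picking the right entry can be sketched as a small standard-library function. `extract_ods_code` is a hypothetical helper, and the sample string is the three entries shown above joined back together with commas.

```python
from typing import Optional

ODS_SYSTEM = "https://fhir.nhs.uk/Id/ods-organization-code"


def extract_ods_code(identifier: str, system: str = ODS_SYSTEM) -> Optional[str]:
    """Return the value of the first system|value entry for the given system."""
    for token in identifier.split(","):
        token_system, separator, value = token.partition("|")
        # Skip bare values (no "|"), other systems, and entries with an empty value.
        if separator and token_system == system and value:
            return value
    return None


sample = (
    "A81003,"
    "https://fhir.nhs.uk/Id/ods-organization-code|A81003,"
    "https://fhir.nhs.uk/Id/ods-organization-code|"
)
print(extract_ods_code(sample))  # A81003
```

Applied to the whole column, this could be written as `df['ODS'] = df['identifier'].map(extract_ods_code)`, which avoids an explicit loop over the rows.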

In [143]:
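import re
# Copy the ODS code out of each identifier string into its own 'ODS' column
for orgId in range(len(df)):
    print(orgId)
    identifier = df.loc[orgId,'identifier']
    identifiers = identifier.split(',')
    for token in identifiers:
        # keep only entries of the form system|value where the value is not empty
        if re.match('^http.*[|][A-Za-z0-9].*$', token):
            print('True')
            df.loc[orgId,'ODS'] = token.split('|')[1]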


organisations = pd.merge(epraccur, df, how="left", on=["ODS"])
organisations = organisations.set_index(['ODS'])
organisations['ID1'] = organisations['ID1'].fillna(-1).astype(int)
organisations

0
True
1
True
2
True
...
122
True
123

ODS,Organisation_Name,NationalGrouping,4,AddressLine_1,AddressLine_2,AddressLine_3,AddressLine_4,AddressLine_5,PostCode,Opened,...,22,23,24,25,26,ID1,Key,identifier,_lastUpdated,ResourceString
A81001,THE DENSHAM SURGERY,Y63,QHM,THE HEALTH CENTRE,LAWSON STREET,STOCKTON ON TEES,CLEVELAND,,TS18 1HU,19740401,...,0,,16C,,4,8155,Organization/1250052,"A81001,https://fhir.nhs.uk/Id/ods-organization...",2025-06-15T16:05:41Z,"{""resourceType"":""Organization"",""identifier"":[{..."
A81002,QUEENS PARK MEDICAL CENTRE,Y63,QHM,QUEENS PARK MEDICAL CTR,FARRER STREET,STOCKTON ON TEES,CLEVELAND,,TS18 2AW,19740401,...,0,,16C,,4,8156,Organization/1250053,"A81002,https://fhir.nhs.uk/Id/ods-organization...",2025-06-15T16:05:41Z,"{""resourceType"":""Organization"",""identifier"":[{..."
A81003,VICTORIA MEDICAL PRACTICE,Y54,Q74,THE HEALTH CENTRE,VICTORIA ROAD,HARTLEPOOL,CLEVELAND,,TS26 8DB,19740401,...,0,,00K,,4,8151,Organization/1250048,"A81003,https://fhir.nhs.uk/Id/ods-organization...",2025-06-15T15:52:44Z,"{""resourceType"":""Organization"",""identifier"":[{..."
A81004,ACKLAM MEDICAL CENTRE,Y63,QHM,TRIMDON AVENUE,ACKLAM,MIDDLESBROUGH,CLEVELAND,,TS5 8SB,19740401,...,0,,16C,,4,8157,Organization/1250054,"A81004,https://fhir.nhs.uk/Id/ods-organization...",2025-06-15T16:05:41Z,"{""resourceType"":""Organization"",""identifier"":[{..."
A81005,SPRINGWOOD SURGERY,Y63,QHM,SPRINGWOOD SURGERY,RECTORY LANE,GUISBOROUGH,,,TS14 7DJ,19740401,...,0,,16C,,4,8158,Organization/1250055,"A81005,https://fhir.nhs.uk/Id/ods-organization...",2025-06-15T16:05:41Z,"{""resourceType"":""Organization"",""identifier"":[{..."
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
Y08757,COMMUNITY HOSPITAL ALCOHOL TEAM,Y60,QNC,EDWARD MYERS UNIT,HARPLANDS HOSPITAL,STOKE-ON-TRENT,STAFFORDSHIRE,,ST4 6TH,20250501,...,1,,RLY,,10,-1,,,,
Y08758,LARC SERVICE,Y60,QJM,COUNTY OFFICES,NEWLAND,LINCOLN,LINCOLNSHIRE,,LN1 1YL,20250401,...,1,,503,,8,-1,,,,
Y08759,WELL LIFE CLINIC,Y59,QXU,THE HOUSE PARTNERSHIP,99 STATION ROAD,REDHILL,SURREY,,RH1 1EB,20250601,...,1,,92A,,0,-1,,,,
Y08760,OSPREY UNIT - PODIATRY,Y58,QOX,GREAT WESTERN HOSPITAL,MARLBOROUGH ROAD,SWINDON,WILTSHIRE,,SN3 6BB,20250422,...,1,,92G,,9,-1,,,,
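Here `ID1 == -1` marks organisations that are not yet in the repository. As an alternative sketch, pandas can label the origin of each row directly through the `indicator` option of `merge`. The two toy frames below reuse codes and ids from the output above and stand in for `epraccur` and `df`.

```python
import pandas as pd

# Toy stand-ins: two ODS practices, only one of which exists in the repository.
ods = pd.DataFrame(
    {
        "ODS": ["A81001", "Y08757"],
        "Organisation_Name": ["THE DENSHAM SURGERY", "COMMUNITY HOSPITAL ALCOHOL TEAM"],
    }
)
repo = pd.DataFrame({"ODS": ["A81001"], "ID1": [8155]})

# indicator=True adds a _merge column: 'both' or 'left_only' for a left join.
merged = pd.merge(ods, repo, how="left", on="ODS", indicator=True)

to_create = merged.loc[merged["_merge"] == "left_only", "ODS"].tolist()
to_review = merged.loc[merged["_merge"] == "both", "ODS"].tolist()
print("create:", to_create)
print("compare with existing:", to_review)
```

This keeps `ID1` free of sentinel values, at the cost of one extra column to carry through the later steps.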


The code below iterates over the merged organisations dataframe. For discussion purposes we have limited the number of organisations we process.

In [144]:
from fhir.resources.R4B.organization import Organization

headers = {"Content-Type": "application/fhir+json"}
url = "http://localhost:32783/fhir/r4"

def convertOrganisationFHIR(org):
    organisationJSON = {
        "resourceType": "Organization",
        "identifier": [
            {
                "system": "https://fhir.nhs.uk/Id/ods-organization-code",
                "value": org
            }
        ],
        "active": True,
        "type": [
            {
                "coding": [
                    {
                        "system": "https://fhir.nhs.uk/CodeSystem/organisation-role",
                        "code": "76",
                        "display": "GP PRACTICE"
                    }
                ]
            }
        ],
        "name": organisations.loc[org,'Organisation_Name']
    }
    # 'active' is set to False further down when ODS holds a closed date.
    # Add the address only when ODS supplies a postcode.
    if (organisations.loc[org,'PostCode'] != '' and not pd.isnull(organisations.loc[org,'PostCode'])):
        organisationJSON["address"] = [
            {
                "use": "work",
                "postalCode": organisations.loc[org,'PostCode']
            }
        ]
    if (organisations.loc[org,'NationalGrouping'] != '' and not pd.isnull(organisations.loc[org,'NationalGrouping'])):
        organisationJSON["partOf"] = {
            "identifier": {
                "system": "https://fhir.nhs.uk/Id/ods-organization-code",
                "value": organisations.loc[org,'NationalGrouping']
            }
        }
    if (organisations.loc[org,'PhoneNumber'] != '' and not pd.isnull(organisations.loc[org,'PhoneNumber'])):
       # print('-',organisations.loc[org,'PhoneNumber'].strip(),'-',"1")
        organisationJSON['telecom'] = [
            {
                "system": "phone",
                "value": organisations.loc[org,'PhoneNumber'].strip(),
                "use": "work"
            }]

    if organisations.loc[org,'Closed'] != '' and not pd.isnull(organisations.loc[org,'Closed']) :
        organisationJSON['active'] = False
    if organisations.loc[org,'ID1'] != -1:
        organisationJSON['id'] = str(organisations.loc[org,'ID1'])
    #print(json.dumps(organisationJSON, indent=2, ensure_ascii=False))
    # validate organisation against schema
    Organization(**organisationJSON)
    return organisationJSON

new = organisations[['ID1']].head(12000).copy()
for org in new.index:

    organisationJSON = convertOrganisationFHIR(org)
    #print(json.dumps(organisationJSON, indent=2, ensure_ascii=False))

    if (organisations.loc[org,'ID1'] == -1):
        r = requests.post(url+'/Organization', data=json.dumps(organisationJSON), headers=headers)

        if 'Location' in r.headers:
            print('Created ' + org)
            location = r.headers['Location'].split('Organization/')[1].split('/')[0]
            organisations.loc[org,'ID1'] = str(location)
            print(location)
        else:
            print("No Location header in response: ",r.status_code)
            print("Response headers:", r.headers)
            print(json.dumps(organisationJSON, indent=2, ensure_ascii=False))
            print(r.text)
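The loop above only creates organisations that are missing. For organisations that already exist, the outline logic calls for an update, which in FHIR is a `PUT` to the resource's own URL. The sketch below shows one way the choice could be made. `plan_request` is a hypothetical helper and does not send anything; it relies on `convertOrganisationFHIR` having set `id` for existing records.

```python
def plan_request(base_url: str, organisation: dict) -> tuple:
    """Return (method, url): POST to create, PUT to update when an id is present."""
    resource_id = organisation.get("id")
    if resource_id is None:
        return "POST", f"{base_url}/Organization"
    return "PUT", f"{base_url}/Organization/{resource_id}"


base = "http://localhost:32783/fhir/r4"

# A new organisation (no id yet) and an existing one (id taken from ID1).
new_method, new_url = plan_request(base, {"resourceType": "Organization"})
old_method, old_url = plan_request(base, {"resourceType": "Organization", "id": "8155"})

print(new_method, new_url)
print(old_method, old_url)
```

The pair could then be passed to `requests.request(method, target, data=json.dumps(organisationJSON), headers=headers)`. Sending the `PUT` only when the merge rules report a difference would avoid creating a new resource version on every run.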



