# Add Definitions to Chunks

The following notebook shows an example of how to create more connections from a document. In this case, an appendix with a list of definitions is selected. These definitions are extracted and linked to the chunks that mention them.

To extract these definitions, an LLM is used. This time we use a model from [OpenAI](https://platform.openai.com/docs/models/gpt-3-5-turbo): gpt-3.5-turbo-0125.

In [1]:
import pandas as pd
import os
from neo4j import GraphDatabase
from langchain_openai import ChatOpenAI
from IPython.display import clear_output
from dotenv import load_dotenv
import json
import numpy as np

## Get Credentials

In [2]:
if os.path.exists('credentials.env'):
    load_dotenv('credentials.env', override=True)

    # Neo4j
    uri = os.getenv('NEO4J_URI')
    username = os.getenv('NEO4J_USERNAME')
    password = os.getenv('NEO4J_PASSWORD')
    database = os.getenv('NEO4J_DATABASE')

    # AI
    OPENAI_API_KEY = os.getenv('OPENAI_API_KEY')
    os.environ['OPENAI_API_KEY']=OPENAI_API_KEY
else:
    print("File 'credentials.env' not found.")
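The credentials cell assumes that `credentials.env` defines every variable it reads. If `OPENAI_API_KEY` is absent, assigning `None` into `os.environ` raises a `TypeError`, and a missing file leaves `uri`, `username` and `password` undefined for later cells. A minimal sketch of an early check follows; `missing_credentials` is a hypothetical helper that is not part of the notebook, and treating `NEO4J_DATABASE` as required is an assumption.

```python
import os

# Variable names taken from the credentials cell; which ones are truly
# mandatory is an assumption (NEO4J_DATABASE may be optional in practice).
REQUIRED_VARS = [
    "NEO4J_URI",
    "NEO4J_USERNAME",
    "NEO4J_PASSWORD",
    "NEO4J_DATABASE",
    "OPENAI_API_KEY",
]


def missing_credentials(env=None, required=REQUIRED_VARS):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    missing = missing_credentials()
    if missing:
        print("Missing settings:", ", ".join(missing))
```

Calling `missing_credentials()` right after `load_dotenv` reports all gaps at once instead of failing later on the first one.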

## Set Up Connection to Database

Set up the connection to the database with the Neo4j Python driver.

In [3]:
class App:
    def __init__(self, uri, user, password, database=None):
        self.driver = GraphDatabase.driver(uri, auth=(user, password), database=database)
        self.database = database

    def close(self):
        self.driver.close()

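    def query(self, query):
        # Target the configured database explicitly (None falls back to the default database).
        return self.driver.execute_query(query, database_=self.database)

    def query_params(self, query, parameters):
        # Same as query(), but with a dictionary of Cypher parameters.
        return self.driver.execute_query(query, parameters_=parameters, database_=self.database)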

    def count_nodes_in_db(self):
        query = "MATCH (n) RETURN COUNT(n)"
        result = self.query(query)
        (key, value) = result.records[0].items()[0]
        return value

    def remove_nodes_relationships(self):
        query ="""
            CALL apoc.periodic.iterate(
                "MATCH (c) RETURN c",
                "WITH c DETACH DELETE c",
                {batchSize: 1000}
            )
        """
        result = self.query(query)

    def remove_all_constraints(self):
        query ="""
            CALL apoc.schema.assert({}, {})
        """
        result = self.query(query)

In [4]:
app = App(uri, username, password, database)

In [5]:
app.count_nodes_in_db()

2899

## Find definitions from the text

Load an LLM

In [6]:
model = 'gpt-3.5-turbo-0125'

In [7]:
llm = ChatOpenAI(model=model, temperature=0)

The text with definitions is copied from the document (NN_zorg_vrij_basic_2024.pdf). 
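Text pasted from a PDF usually carries page headers and footers along with the definitions, and in this appendix they sit between entries. A minimal sketch of a filter that could be applied to `definition_text` before it is sent to the model follows. The patterns are assumptions derived from this one document, and `strip_page_furniture` is a hypothetical helper, not something the notebook uses.

```python
import re

# Patterns for repeated page headers/footers; assumptions based on the
# layout of this particular policy document.
BOILERPLATE_PATTERNS = [
    re.compile(r"Nationale-Nederlanden Zorg Vrij.*valid from"),
    re.compile(r"^Appendix Definitions$"),
    re.compile(r"^Reimbursements and terms and conditions for \d{4}$"),
    re.compile(r"^Page \d+ of \d+$"),
]


def strip_page_furniture(text):
    """Drop lines that match a known header/footer pattern; keep all others."""
    kept = []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped and any(p.search(stripped) for p in BOILERPLATE_PATTERNS):
            continue
        kept.append(line)
    return "\n".join(kept)
```

Whether this changes the model's answer has not been measured here; it mainly shortens the prompt and removes lines that are not definitions.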

In [8]:
definition_text = """
Appendix Definitions
Additional insurance package
An agreement that you can take out in addition to your general insurance policy for the reimbursement of healthcare and healthcare costs. The content and scope of your additional insurance package is set by us. We have it laid down in your terms and conditions of insurance.
Agreed rate
The (average) rate we agree in contracts with healthcare providers for certain types of healthcare. These rates are available on our website.
AGB code
This code is a unique administrative code assigned to healthcare providers in the Netherlands, identifying each one individually in Vektis. Vektis is a national register containing all information necessary to submit claims for the healthcare, to purchase and contract the healthcare and to help guide insured persons to the right healthcare.
Treatment
Contact, physical or online, with one or more healthcare providers, involving the provision of healthcare and/or advice. Treatment does not include courses or training.
Treatment proposal (or prescription)
This proposal states which healthcare (examination, treatment or therapy) you need. You are given a prescription for medicine.
Abroad
Any country other than the country where you live.
CAK
The Dutch Central Administration Office (‘Centraal Administratie Kantoor’, CAK), as defined in Article 6.1.1, first paragraph, of the Dutch Long-Term Care Act (‘Wet langdurige zorg’, Wlz).
Consultation
Contact with a healthcare provider. This can involve advice, a referral, a discussion of a patient’s medical history, a physical examination, diagnosis and/or additional tests where such is deemed medically necessary.
Day treatment
Healthcare in a department set up for day nursing in a facility for specialist medical healthcare (such as a hospital or independent treatment centre). This may also involve a medical examination or treatment in a rehabilitation facility. The healthcare is generally foreseeable and lasts for a number of hours. The patient is not admitted.
 ‘Nationale-Nederlanden Zorg Vrij’ (‘Combinatie health insurance policy) valid from 01-01-2024 to Page 175 of 205 31-12-2024 (inclusive)
Appendix Definitions
Reimbursements and terms and conditions for 2024
 DBC healthcare product
A Diagnosis-Treatment Combination (‘Diagnose Behandel Combinatie’, DBC healthcare product or DBC) is a code that describes the entire process of treatment under specialist medical healthcare. A DBC includes all the costs incurred by the healthcare provider to give you the right healthcare. So it also includes costs not directly related to your treatment. The rate for a DBC is based on an average of the costs incurred for a particular course of treatment. The start date of a DBC is the date of first contact with the healthcare provider and determines the reimbursement. The bill is settled on the DBC start date. If the commencement date for a DBC is outside of the term of your insurance, none of the costs associated with that DBC are covered. In addition to a DBC, a hospital may charge for treatments categorised as other healthcare products (‘overige zorgproducten’, OZP). These are often single treatments that are not associated with a course of treatment. For example, diagnostics requested by the general practitioner, such as an ultrasound or X-ray, or diagnostics for dental surgery. Specific expensive healthcare is also claimed under other healthcare products. Examples here include intensive care, expensive medicines and blood products.
Diagnostics
Determination of the medical cause of the patient’s problem, illness or condition.
EU/EEA member state
The EU (European Union) member states are: Austria, Belgium, Bulgaria, Croatia, Cyprus (Greek part), Czech Republic, Denmark, Estonia, Finland, France (including French Guyana, Guadeloupe, Martinique, Réunion, Saint Barthélemy and Saint Martin), Germany, Greece, Hungary, Ireland, Italy, Latvia, Lithuania, Luxembourg, Malta, Netherlands, Poland, Portugal (including the Azores and Madeira), Romania, Slovakia, Slovenia, Spain (including the Canary Islands, Ceuta and Melilla) and Sweden. Under international treaties, Switzerland is considered to be on a par with the above. The following are not part of the EU (this list is not exhaustive): Andorra, the Channel Islands, the Isle of Man, Monaco, San Marino and Vatican City. The EEA (European Economic Area) states are: the aforementioned EU states, Iceland, Liechtenstein and Norway. Explanation: On 31 January 2020, the United Kingdom, including Gibraltar, left the European Union.
Claimed rate
The amount stated on the invoice. Reimbursement will never exceed the costs of healthcare that you have actually incurred, and that you were invoiced for.
Medical aids on loan
These are medical aids that you may use as long as you are insured for them with us. We or the healthcare provider will enter into a loan agreement with you for this purpose. This agreement specifies your rights and obligations in respect of the medical aid you have on loan. You must return the medical aid upon termination of your insurance policy. We pay the reimbursement directly to the healthcare provider if you receive the medical aid on loan from a contracted healthcare provider. If you purchase a medical aid from a non-contracted healthcare provider and that aid would usually be provided on loan, you will not automatically be reimbursed for the full purchase value. We will reimburse you the costs involved in using the medical aid for an entire year in the same way as we reimburse these costs with a contracted healthcare provider. You do not need to pay any costs for medical aids on loan, so you do not pay a deductible for them. The deductible does apply, however, to the costs of consumables and usage associated with the medical aid that we lend you.
Owned medical aids
These are medical aids that transfer to your possession under your terms and conditions of insurance. You will acquire ownership of them. The purchase costs will be set off against your deductible.
If a medical aid transfers to your possession, it is strictly for your own use. You may not sell it to anyone.
Year
A calendar year. However, when referring to someone’s age, we do not mean a calendar year. We simply mean a year in the person’s life.
‘Nationale-Nederlanden Zorg Vrij’ (‘Combinatie health insurance policy) valid from 01-01-2024 to Page 176 of 205 31-12-2024 (inclusive)
Appendix Definitions
 
Reimbursements and terms and conditions for 2024
 Month
A calendar month.
Market rate applicable in the Netherlands
This is the rate that is reasonable and appropriate in the Dutch market for a given treatment. To determine this rate, we look at what amounts healthcare providers charge on average for that treatment. This means that we will not reimburse unreasonably high costs of treatment in full. See also Article 2.2., clause 2, paragraph b, of the Dutch Health Insurance Decree (‘Besluit zorgverzekering’).
(Medical) adviser
The doctor, pharmacist, dentist, physiotherapist or other expert who advises us. This includes advice on medical, pharmacotherapy-related, dental or physiotherapy-related healthcare or any other field of healthcare expertise.
Medical indication/grounds
The medical condition or illness that a doctor suspects or has diagnosed so that you can access certain healthcare.
Accident
A sudden, unexpected, involuntary and external event. This event results directly in bodily injury that can be detected objectively by a medical professional. This applies even if you did not and could not reasonably foresee the event. We consider an acute, serious illness to be equivalent to an accident when: - medical care is required immediately on medical grounds and cannot be postponed, or an illness or condition is life-threatening; and - the healthcare required is covered by the general insurance policy; and - based on objective medical standards, no recovery can be expected within the next six months.
Example of an accident.
- an infected wound or blood poisoning; - sprains, dislocations and tears of the muscles and ligaments; - involuntary ingestion of or poisoning with gases, vapours, liquid or solid substances or objects, unless this is through the conscious use of alcohol, medicine or drugs; - infection by exposure to pathogens or due to poisoning during an involuntary fall into water or any other substance (liquid or otherwise), or if you enter it yourself to save a person, animal or object; - drowning, suffocation, frostbite, hypothermia, sunstroke, burning (except as the result of sunbathing), lightning strike or other electrical discharge, or coming into contact with a corrosive substance; - natural violence such as an earthquake, flood, tsunami (tidal wave), hurricane, or volcanic eruption; - starvation, dehydration and exhaustion; - complications or aggravation of injuries as the result of medically required treatment after an accident; - becoming infected with HIV through a blood transfusion or injection with a contaminated needle while being treated in a hospital.
Admission
A period of nursing and treatment with an overnight stay in a department set up for nursing in a specialist medical care facility (such as a hospital). The admission must be a medical necessity in terms of medical healthcare. However, this does not include a stay in an outpatient clinic, nor day care or urgent medical care, nor a stay in a facility for rehabilitation. Your general insurance policy covers admissions of up to 1095 (3 x 365) consecutive days. The following rules apply here: - if your admission is interrupted for less than 31 days, the number of days of the interruption do not count, but we will continue to count after the interruption to determine the total; - if your admission is interrupted for a period of more than 30 days, we start counting again from the beginning to determine the total; - if your admission is interrupted for weekend/holiday leave, the number of days of interruption counts towards the total number of days.
Policy (document)
Proof of insurance.
‘Nationale-Nederlanden Zorg Vrij’ (‘Combinatie health insurance policy) valid from 01-01-2024 to 31-12-2024 (inclusive)
Appendix Definitions
Page 177 of 205
 
Reimbursements and terms and conditions for 2024
 Written
A physical or electronic means of conveying information, whereby the information can be understood, stored and reproduced. An electronic means of conveying information includes the internet and emails. Written communication includes by letter, email and through the ‘Mijn’ environment on our website.
Urgent medical care
Healthcare that is a medical necessity and that cannot reasonably be postponed. The healthcare can reasonably be described as urgent in the general opinion of the group of relevant professional practitioners.
Rate
The amount of money for healthcare or the resources provided, which we take as the basis for reimbursement of that healthcare or those resources. We have different types of rates.
Treaty country
The Netherlands has a treaty for social security, including arrangements for the provision of medical healthcare, with the following states: Australia, Bosnia and Herzegovina, Cape Verde, Macedonia, Montenegro, Morocco, Serbia, Tunisia and Turkey.
The following are also treaty countries:
● all European Union (EU) member states other than the Netherlands;
● all states that are party to the Agreement on the European Economic Area (EEA);
● Switzerland;
● the United Kingdom.
Referral
For certain types of healthcare, you must have a referral before a consultation or before the start of the healthcare. This referral is the advice from one healthcare provider to go to another healthcare provider for a consultation or for healthcare. In the terms and conditions, we list which healthcare provider must provide this referral under ‘referral’.
Insured person
The individual entitled to insured healthcare (and reimbursement thereof) in accordance with our terms and conditions of insurance. The policyholder may also be the insured person. In the terms and conditions of insurance, we refer to the insured person and the policyholder using ‘you’ and ‘your’. You can determine from the scope and content of the terms and conditions of insurance whether we mean the insured person or the policyholder. Where we refer to ‘he’, ‘him’ and ‘his’, this also means ‘she’ and ‘her’ and ‘her’ respectively.
Insurance policy
An insurance agreement may consist of a general insurance policy with one or more additional insurance packages.
If the insurance consists of a combination of 2 or more insurance agreements, the combination can contain no more than one general insurance policy.
Policyholder
The person who takes out insurance with us, must pay the premium and costs and is the only person who can change and cancel the insurance. The policy is in the name of the policyholder. The policyholder may also be the insured person. In the terms and conditions of insurance, we refer to the insured person and the policyholder using ‘you’ and ‘your’. You can determine from the scope and content of the terms and conditions of insurance whether we mean the insured person or the policyholder. Where we refer to ‘he’, ‘him’ and ‘his’, this also means ‘she’ and ‘her’ and ‘her’ respectively.
‘Nationale-Nederlanden Zorg Vrij’ (‘Combinatie health insurance policy) valid from 01-01-2024 to Page 178 of 205 31-12-2024 (inclusive)
Appendix Definitions
 
Reimbursements and terms and conditions for 2024
 Statutory personal contribution
Healthcare that is covered under your general insurance policy and in relation to which you must pay the costs in full or in part yourself. Personal contributions are set by law. A statutory personal contribution may be a fixed amount per treatment or a set percentage of the costs. A statutory personal contribution is not the same as a deductible. Statutory personal contributions and deductibles may apply side by side for the same insured healthcare. This may mean you will be charged both a statutory personal contribution and a deductible.
Statutory maximum rate
The maximum rate set by the Dutch Healthcare Authority (‘Nederlandse Zorgautoriteit’, NZa) for certain types of healthcare, in accordance with the Dutch Healthcare (Market Regulation) Act (‘Wet marktordening gezondheidszorg’, Wmg). The rate used by a healthcare provider may be lower, but never higher.
Statutory fixed rate
The fixed rate set by the Dutch Healthcare Authority (‘Nederlandse Zorgautoriteit’, NZa) for certain types of healthcare, in accordance with the Dutch Healthcare (Market Regulation) Act (‘Wet marktordening gezondheidszorg’, Wmg). The rate used by a healthcare provider must be exactly the same as this rate. These rates are also known as set-point rates.
 ‘Nationale-Nederlanden Zorg Vrij’ (‘Combinatie health insurance policy) valid from 01-01-2024 to Page 179 of 205 31-12-2024 (inclusive)
Appendix Definitions

Reimbursements and terms and conditions for 2024
 Appendix General terms and conditions
A.1A. Additional definitions
General insurance policy
Your general insurance policy is health insurance under the Dutch Health Insurance Act (‘Zorgverzekeringswet’, Zvw). The Dutch government determines the content and scope of your general insurance policy.
Family members
Family members living at the same address and who make up a shared household. By this we mean:
● adults who are each other’s sole life partner;
● children up to the age of 18 (including adopted children and foster children);
● children aged 18 to 30 (inclusive) who are students (they do not have to be living at the same address as
the policyholder);
● a company or facility that has entered into a group agreement with us may also designate someone as a
family member.
A family member has their own policy or is co-insured on the policy of another family member.
Health insurer
Your health insurer is Centrale Zorgverzekeringen NZV NV, registered in the Trade Register of the Chamber of Commerce under number 27118912. This is a health insurer in accordance with the Dutch Health Insurance Act (‘Zorgverzekeringswet’, Zvw) that offers and/or administers health insurance. In these terms and conditions of insurance, ‘we’ or ‘us’ means NZV.
"""

Define a prompt that describes the task for the LLM. 

In [9]:
system_prompt = "You are a helpful assistant that finds definitions in a text. You will retrieve the descriptions for definitions in a piece of text. Also find the additional definitions. Based on the provided text you list the term/definition to its complete meaning/description. The meaning must be only taken from the document and must contain the full description. Don't truncate these. The output should be nothing more than a string representation of the JSON-object. This string must be directly loaded into the python json.loads() function"

In [10]:
message = [
    ("system", system_prompt),
    ("human", definition_text),
]

Invoke the LLM

In [11]:
llm_answer = llm.invoke(message)

In [12]:
json_obj = json.loads(llm_answer.content)
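`json.loads` works here because the model returned bare JSON, as the prompt requests. Chat models sometimes wrap their answer in a Markdown code fence, which makes the call fail. A minimal sketch of a more tolerant parser follows; `parse_llm_json` is a hypothetical helper, and the fence handling covers only the common case of a single fenced block.

```python
import json
import re

FENCE = "`" * 3  # Markdown code-fence marker, built here to keep this block readable


def parse_llm_json(raw):
    """Parse a JSON object from model output, tolerating one surrounding code fence."""
    text = raw.strip()
    # Match an optional language tag after the opening fence, then the payload.
    pattern = r"^" + FENCE + r"(?:json)?\s*(.*?)\s*" + FENCE + r"$"
    fenced = re.match(pattern, text, flags=re.DOTALL)
    if fenced:
        text = fenced.group(1)
    obj = json.loads(text)
    if not isinstance(obj, dict):
        raise ValueError("expected a JSON object mapping terms to descriptions")
    return obj
```

In the notebook this would be used as `json_obj = parse_llm_json(llm_answer.content)`.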

In [13]:
json_obj

{'Additional insurance package': 'An agreement that you can take out in addition to your general insurance policy for the reimbursement of healthcare and healthcare costs. The content and scope of your additional insurance package is set by us. We have it laid down in your terms and conditions of insurance.',
 'Agreed rate': 'The (average) rate we agree in contracts with healthcare providers for certain types of healthcare. These rates are available on our website.',
 'AGB code': 'This code is a unique administrative code assigned to healthcare providers in the Netherlands, identifying each one individually in Vektis. Vektis is a national register containing all information necessary to submit claims for the healthcare, to purchase and contract the healthcare and to help guide insured persons to the right healthcare.',
 'Treatment': 'Contact, physical or online, with one or more healthcare providers, involving the provision of healthcare and/or advice. Treatment does not include course

## Load Definitions to Neo4j

Create a DataFrame from the JSON.

In [14]:
definitions_df = pd.DataFrame.from_dict(json_obj, orient='index').reset_index()
definitions_df.columns = ['definition', 'description']
definitions_df = definitions_df.reset_index()

In [15]:
definitions_df

|   | index | definition | description |
|---|---|---|---|
| 0 | 0 | Additional insurance package | An agreement that you can take out in addition... |
| 1 | 1 | Agreed rate | The (average) rate we agree in contracts with ... |
| 2 | 2 | AGB code | This code is a unique administrative code assi... |
| 3 | 3 | Treatment | Contact, physical or online, with one or more ... |
| 4 | 4 | Treatment proposal (or prescription) | This proposal states which healthcare (examina... |
| 5 | 5 | Abroad | Any country other than the country where you l... |
| 6 | 6 | CAK | The Dutch Central Administration Office (‘Cent... |
| 7 | 7 | Consultation | Contact with a healthcare provider. This can i... |
| 8 | 8 | Day treatment | Healthcare in a department set up for day nurs... |
| 9 | 9 | DBC healthcare product | A Diagnosis-Treatment Combination (‘Diagnose B... |


### Load Definition Nodes

In [16]:
merge_definition_query = """
    MERGE(d:Definition {id: $index})
        ON CREATE SET
            d.definition = $definition,
            d.description = $description
    RETURN d
"""

In [17]:
for index, row in definitions_df.iterrows():
    clear_output(wait=True)
    d = {
        'index': row['index'],
        'definition': row['definition'],
        'description': row['description']
    }
    app.query_params(merge_definition_query, d)
    print("Progress: ", np.round(((index+1)/definitions_df.shape[0])*100,2), "%")

Progress:  100.0 %
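The loop sends one query per definition, which is fine for a few dozen rows. For larger tables, a common alternative is to pass a list of rows as a single parameter and let Cypher iterate with `UNWIND`. A minimal sketch under that assumption follows; `merge_definitions_batch_query` and `batched` are illustrative names, and the batch size of 500 is arbitrary.

```python
# Cypher sketch: one call handles a whole batch of rows via UNWIND.
# Each row is expected to be a map with the keys id, definition and description.
merge_definitions_batch_query = """
    UNWIND $rows AS row
    MERGE (d:Definition {id: row.id})
        ON CREATE SET
            d.definition = row.definition,
            d.description = row.description
"""


def batched(rows, size):
    """Yield consecutive slices of `rows` with at most `size` items each."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


# Usage with the notebook's objects (requires a running database):
# rows = definitions_df.rename(columns={"index": "id"}).to_dict("records")
# for batch in batched(rows, 500):
#     app.query_params(merge_definitions_batch_query, {"rows": batch})
```

This reduces the number of round trips to the database; the resulting nodes should be the same as with the row-by-row loop.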


### Load MENTIONS Relationships

In [18]:
merge_contains_query = """
    MATCH (d:Definition{definition:$definition})
    MATCH (c:Chunk)-[:PART_OF]->(p:Policy{file_name:$file_name})
    WHERE toLower(c.chunk) CONTAINS toLower(d.definition)
    MERGE (c)-[:MENTIONS]->(d)
"""

In [19]:
file_name = 'NN_zorg_vrij_basic_2024.pdf'

for index, row in definitions_df.iterrows():
    clear_output(wait=True)
    d = {
        'definition': row['definition'],
        'file_name': file_name
    }
    app.query_params(merge_contains_query, d)
    print("Progress: ", np.round(((index+1)/definitions_df.shape[0])*100,2), "%")

Progress:  100.0 %
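The Cypher `CONTAINS` operator matches substrings, so short terms such as "Rate", "Year" or "Month" also match inside longer words (for example "rate" inside "separate"). A minimal sketch of a stricter whole-word check in Python follows, which could be used to inspect or filter candidate links; `mentions_term` is a hypothetical helper and not part of the notebook.

```python
import re


def mentions_term(chunk_text, term):
    """Return True if `term` occurs in `chunk_text` as a whole word or phrase, ignoring case."""
    # Lookarounds instead of \b so that terms starting or ending with
    # punctuation, such as "(Medical) adviser", are handled as well.
    pattern = r"(?<!\w)" + re.escape(term) + r"(?!\w)"
    return re.search(pattern, chunk_text, flags=re.IGNORECASE) is not None


if __name__ == "__main__":
    print(mentions_term("The agreed rate applies.", "Rate"))
    print(mentions_term("These are separate costs.", "Rate"))
```

The trade-off is that plural or inflected forms, such as "rates", no longer match, so the stricter check trades recall for precision.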
