## Overview
This Jupyter notebook walks through the pipeline step by step, so that earlier steps do not need to be rerun each time. To run it, first install the dependencies listed in `requirements.txt`, then run `jupyter lab` in the terminal.

#### Imports
Add the project root directory to the system path so that the `src` package can be imported.

In [1]:
import sys
import os
sys.path.append(os.path.abspath("../.."))
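
A minimal sketch of the same idea using `pathlib`, with a guard so that re-running the cell does not append the same entry twice. The helper name `add_to_sys_path` is hypothetical and not part of the project. Note that `Path.resolve()` also follows symlinks, which `os.path.abspath` does not.

```python
import sys
from pathlib import Path


def add_to_sys_path(directory: Path) -> str:
    """Append an absolute directory to sys.path once and return it as a string."""
    resolved = str(directory.resolve())
    if resolved not in sys.path:
        sys.path.append(resolved)
    return resolved


# Two levels up from the working directory, similar to os.path.abspath("../..").
project_root = add_to_sys_path(Path.cwd() / ".." / "..")
```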

In [2]:
from src.legacy.clean_md import clean_page
from src.legacy.simplify_toc import simplify_table_of_contents
from src.legacy.convert_page_to_snippet import convert_page_to_snippets
from src.legacy.snippet_to_ontology import extract_ontology_from_next_snippet
from src.legacy.inject_ontology_to_snippet import inject_ontology_into_snippet
from src.legacy.convert_snippet_to_lap import convert_snippet_to_lap_simple
from src.legacy.harmnize_lap import harmonize_declarations
from src.legacy.apply_ontology import apply_ontology_to_lap
from src.legacy.convert_lap_to_json import convert_lap_to_json
from src.legacy.get_graph import get_graph
from src.legacy.assemble_clauses import assemble_clauses
from src.legacy.llm import LLM

#### Dummy data
Hard-coded test data: two pages, the table of contents (`p1`) and the first page of policy wording (`p2`).

In [3]:
p1 = """CONTENTS
Motor Breakdown Insurance Policy & Summary
A – Policy wording 3
Breakdown Causes 8
SECTION A – AXA LOCAL 9
SECTION B – AXA NATIONWIDE 10
SECTION C – AXA NATIONWIDE & HOMESTART 11
SECTION A, B, C – MISFUELLING 12
SECTION D – AXA EUROPEAN 13
SECTION E – GENERAL EXCLUSIONS THAT APPLY TO ALL PARTS 17 OF THIS POLICY
SECTION F – GENERAL CONDITIONS APPLYING TO ALL PARTS 20 OF THIS POLICY
B – Policy summary 26 AXA BREAKDOWN COVER POLICY SUMMARY 26"""

p2 = """A – Policy wording STATUS
This policy is provided on behalf of AXA Insurance by AXA Assistance (UK) Ltd. AXA Assistance (UK) Limited is authorised and regulated by
the Financial Conduct Authority. AXA Assistance (UK) Limited’s firm register number is 439069. You can check this on the Financial Services Register by visiting
the website www.fca.org.uk/ register. Its registered office is at The Quadrangle, 106-118 Station Road, Redhill, Surrey, RH1 1PR. It is registered in England under company number 02638890.
This policy is underwritten by Inter Partner Assistance SA UK Branch
(IPA) which is fully owned by the
AXA Assistance Group. Inter Partner Assistance is a Belgian firm authorised by the National Bank of Belgium and subject to limited regulation by the Financial Conduct Authority. Details about the extent of its regulation by the Financial Conduct Authority are available from us on request. Inter Partner Assistance SA’s register number is 202664. You can check this on the Financial Services Register by visiting the website www.fca.org.uk/register.
AXA Assistance (UK) Limited operates the 24-hour motoring assistance helpline.
This insurance is governed by the laws of England and Wales.
IMPORTANT INFORMATION
This document sets out the terms and conditions of your cover and it is important that you read it carefully. There are different levels of cover available. The cover you hold will be set out in the accompanying policy schedule. If changes are made, these will be confirmed to you separately in writing.
Each section of cover explains what
is and is not covered. There are also general exclusions (things that are not included) that apply to all sections
of the cover, and there are general conditions that you must follow so you are entitled to the cover.
CANCELLATION
If you find that the cover provided under this policy does not meet your needs, please contact us on 0800 169 0206 within 14 days of receiving this document and we will cancel this policy.
You will receive a full refund of your premium as long as you have not made any claims.
If you cancel the policy outside the 14-day period, as long as you have not made any claims, you will receive a refund of your premium for the
Motor Breakdown Insurance Policy & Summary
 3
Policy wording continued
amount of time left to run on the policy, less an administrative charge of £15.
We may cancel this policy by giving you at least 14 days’ written notice at your last-known address if:
 you fail to pay the premiums after we have sent you a reminder to
do so. If we have been unable to collect a premium payment, we will contact you in writing requesting payment to be made by a specific date. If we do not receive payment by this date we will cancel your policy by immediate effect and notify you in writing that such cancellation has taken place;
 you refuse to allow us reasonable access to your vehicle to provide the services you have asked for under this policy or if you fail to co- operate with our representatives;
 you otherwise stop keeping to the terms and conditions of this policy in any significant way; or
 the cost of providing this policy becomes prohibitive.
We may cancel this policy without giving you notice if, by law or other reason, we are prevented from providing it.
If we cancel the policy under this section, we will refund the
premium paid for the remaining period of insurance, unless you have made any claims. We can refuse to renew any individual policy.
We may cancel this policy without giving you notice and without refunding your premium if you:
 make or try to make a fraudulent claim under your policy;
 are abusive or threatening towards our staff; or
 repeatedly or seriously break the terms of this policy.
If you make a valid claim before the policy is cancelled, we will pay the claim before we cancel the policy.
MEANING OF WORDS
Wherever the following words
and phrases appear in bold in this document, they will always have the following meanings.
1. We, us, our
Inter Partner Assistance SA and AXA Assistance (UK Branch) Ltd both of The Quadrangle, 106-118 Station Road, Redhill, Surrey, RH1 1PR, UK.
2. Vehicle policy
This policy covers breakdown assistance for the specific vehicle
(or vehicles) shown on your policy schedule. These are the only vehicles that this cover applies to.
"""

In [4]:
md = [p1, p2]

#### Get table of contents

In [5]:
toc = simplify_table_of_contents(md[0], LLM)
print(toc)

# A
## Policy wording
## Breakdown Causes
## SECTION A – AXA LOCAL
## SECTION B – AXA NATIONWIDE
## SECTION C – AXA NATIONWIDE & HOMESTART
## SECTION A, B, C – MISFUELLING
## SECTION D – AXA EUROPEAN
## SECTION E – GENERAL EXCLUSIONS
## SECTION F – GENERAL CONDITIONS
# B
## Policy summary AXA BREAKDOWN COVER POLICY SUMMARY
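
Part of what the step above does is drop the page numbers from the raw contents page. The sketch below illustrates only that sub-task with a regular expression. It is not how `simplify_table_of_contents` works (that function calls the LLM), and the helper name is hypothetical. It handles a page number at the end of a line only, so entries where the number landed mid-line (such as the SECTION E and SECTION F lines in `p1`) would still need the LLM or manual handling.

```python
import re

# One or more digits at the very end of the line, preceded by whitespace.
TRAILING_PAGE_RE = re.compile(r"\s+\d+$")


def strip_trailing_page_number(line: str) -> str:
    """Remove a trailing page number from a single table-of-contents line."""
    return TRAILING_PAGE_RE.sub("", line.strip())


toc_lines = [
    "Breakdown Causes 8",
    "SECTION A – AXA LOCAL 9",
    "SECTION D – AXA EUROPEAN 13",
]
cleaned_lines = [strip_trailing_page_number(line) for line in toc_lines]
```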


#### Run the pipeline steps on the first content page

In [6]:
cleaned_page = clean_page(md[1], toc, LLM)
print(cleaned_page)

# A
## Policy wording

### STATUS
This policy is provided on behalf of AXA Insurance by AXA Assistance (UK) Ltd. AXA Assistance (UK) Limited is authorised and regulated by the Financial Conduct Authority. AXA Assistance (UK) Limited's firm register number is 439069. You can check this on the Financial Services Register by visiting the website www.fca.org.uk/register. Its registered office is at The Quadrangle, 106-118 Station Road, Redhill, Surrey, RH1 1PR. It is registered in England under company number 02638890.

This policy is underwritten by Inter Partner Assistance SA UK Branch (IPA) which is fully owned by the AXA Assistance Group. Inter Partner Assistance is a Belgian firm authorised by the National Bank of Belgium and subject to limited regulation by the Financial Conduct Authority. Details about the extent of its regulation by the Financial Conduct Authority are available from us on request. Inter Partner Assistance SA's register number is 202664. You can check this on the Fi

In [7]:
snippets = convert_page_to_snippets(cleaned_page)

In [8]:
snippet = snippets[4]
print(snippet)

# A
## Policy wording
### MEANING OF WORDS
Wherever the following words and phrases appear in bold in this document, they will always have the following meanings.

1. We, us, our
Inter Partner Assistance SA and AXA Assistance (UK Branch) Ltd both of The Quadrangle, 106-118 Station Road, Redhill, Surrey, RH1 1PR, UK.

2. Vehicle policy
This policy covers breakdown assistance for the specific vehicle (or vehicles) shown on your policy schedule. These are the only vehicles that this cover applies to.
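
The printed snippet suggests that each snippet carries its full heading path (`# A`, `## Policy wording`, `### MEANING OF WORDS`) followed by the section body. A minimal sketch of such a splitter is shown below. It is an assumption based on this output, not the actual implementation of `convert_page_to_snippets`, and `split_markdown_into_snippets` is a hypothetical name.

```python
import re

HEADING_RE = re.compile(r"^(#{1,6})\s+(.*)$")


def split_markdown_into_snippets(markdown: str) -> list[str]:
    """Emit one snippet per section body, prefixed with its ancestor headings."""
    snippets = []
    path = []  # stack of (level, heading line) for the current position
    body = []  # body lines collected since the last heading

    def flush():
        text = "\n".join(body).strip()
        if text:
            headings = "\n".join(line for _, line in path)
            snippets.append(f"{headings}\n{text}" if headings else text)
        body.clear()

    for line in markdown.splitlines():
        match = HEADING_RE.match(line)
        if match:
            flush()
            level = len(match.group(1))
            # Drop headings at the same or a deeper level before pushing.
            while path and path[-1][0] >= level:
                path.pop()
            path.append((level, line.strip()))
        else:
            body.append(line)
    flush()
    return snippets


demo = "# A\n## Policy wording\n\n### STATUS\nText one.\n\n### CANCELLATION\nText two."
demo_snippets = split_markdown_into_snippets(demo)
```

Headings with no body text of their own (here `# A` and `## Policy wording`) produce no snippet in this sketch; the real function may behave differently.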


In [9]:
ontologies = extract_ontology_from_next_snippet(snippet, LLM)

In [10]:
print(ontologies)

[{'word': 'We, us, our', 'definition': 'Inter Partner Assistance SA and AXA Assistance (UK Branch) Ltd both of The Quadrangle, 106-118 Station Road, Redhill, Surrey, RH1 1PR, UK.', 'scope': 'MEANING OF WORDS'}, {'word': 'Vehicle policy', 'definition': 'This policy covers breakdown assistance for the specific vehicle (or vehicles) shown on your policy schedule. These are the only vehicles that this cover applies to.', 'scope': 'MEANING OF WORDS'}]
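
Each ontology entry is a dict with `word`, `definition` and `scope` keys. For a numbered glossary like this one, the same shape can be produced without an LLM, which may be useful as a baseline when checking the model output. The sketch below is an illustration only: `parse_numbered_definitions` is a hypothetical helper, and it assumes every term sits on its own `N. term` line.

```python
import re

ENTRY_RE = re.compile(r"^\d+\.\s+(.*)$")


def parse_numbered_definitions(snippet: str, scope: str) -> list[dict]:
    """Turn 'N. term' lines and the text that follows them into ontology entries."""
    entries = []
    current = None
    for line in snippet.splitlines():
        stripped = line.strip()
        match = ENTRY_RE.match(stripped)
        if match:
            current = {"word": match.group(1).strip(), "definition": "", "scope": scope}
            entries.append(current)
        elif current is not None and stripped:
            # Join wrapped definition lines with single spaces.
            current["definition"] = (current["definition"] + " " + stripped).strip()
    return entries


demo_snippet = (
    "### MEANING OF WORDS\nIntro line.\n\n"
    "1. We, us, our\nThe insurer.\n\n"
    "2. Vehicle policy\nCovers the vehicle\nshown on the schedule."
)
demo_entries = parse_numbered_definitions(demo_snippet, scope="MEANING OF WORDS")
```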


In [11]:
snippet = "We, us, our will cover the car until 2000 chf"

In [12]:
snippet_with_ontology = inject_ontology_into_snippet(snippet, ontologies)
print(snippet_with_ontology)


We, us, our means Inter Partner Assistance SA and AXA Assistance (UK Branch) Ltd both of The Quadrangle, 106-118 Station Road, Redhill, Surrey, RH1 1PR, UK. with scope MEANING OF WORDS

We, us, our will cover the car until 2000 chf
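
Judging by this output, the injection step prepends a `<word> means <definition> with scope <scope>` line for each term that occurs in the snippet ("Vehicle policy" does not occur, so it is left out). A minimal sketch under that assumption follows; `inject_definitions` is a hypothetical stand-in, not the project's `inject_ontology_into_snippet`, and its matching rule (case-insensitive substring) is a guess.

```python
def inject_definitions(snippet: str, ontologies: list[dict]) -> str:
    """Prepend the definitions of all terms that occur in the snippet."""
    lines = [
        f"{entry['word']} means {entry['definition']} with scope {entry['scope']}"
        for entry in ontologies
        if entry["word"].lower() in snippet.lower()
    ]
    if not lines:
        return snippet
    return "\n".join(lines) + "\n\n" + snippet


demo_ontologies = [
    {"word": "We, us, our", "definition": "The insurer.", "scope": "MEANING OF WORDS"},
    {"word": "Vehicle policy", "definition": "Covers the vehicle.", "scope": "MEANING OF WORDS"},
]
demo_result = inject_definitions("We, us, our will cover the car until 2000 chf", demo_ontologies)
```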


In [25]:
lap = convert_snippet_to_lap_simple(snippet, ontologies, LLM)

In [26]:
print(lap)

[{'word': 'We, us, our', 'definition': 'The insurance company providing the coverage.', 'scope': 'MEANING OF WORDS'}, '(COVERED_OBJECT="Car") THEN LIMIT{amount:2000, unit:chf, per:claim}']
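
The second element of `lap` is a rule string of the form `(KEY="value") THEN ACTION{arg:value, ...}`. To inspect such rules programmatically, they can be parsed into a dict. The sketch below covers only this one shape, inferred from the single example above; the full LAP syntax is not shown in this notebook, and `parse_lap_rule` is a hypothetical helper.

```python
import re

RULE_RE = re.compile(
    r'^\((?P<key>\w+)="(?P<value>[^"]*)"\)\s+THEN\s+(?P<action>\w+)\{(?P<args>[^}]*)\}$'
)


def parse_lap_rule(rule: str) -> dict:
    """Parse a single-condition rule string into condition, action and arguments."""
    match = RULE_RE.match(rule.strip())
    if match is None:
        raise ValueError(f"unsupported rule format: {rule!r}")
    arguments = {}
    for pair in match.group("args").split(","):
        if not pair.strip():
            continue
        name, _, raw = pair.partition(":")
        raw = raw.strip()
        # Keep whole numbers as int, everything else as plain text.
        arguments[name.strip()] = int(raw) if raw.isdigit() else raw
    return {
        "condition": {match.group("key"): match.group("value")},
        "action": match.group("action"),
        "arguments": arguments,
    }


parsed_rule = parse_lap_rule('(COVERED_OBJECT="Car") THEN LIMIT{amount:2000, unit:chf, per:claim}')
```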


In [27]:
import json
from llama_index.llms import ChatMessage


HARMONIZED_ONTOLOGY = [
    {
       "name": "TRIGGER_EVENT",
       "asktohuman": "What triggered the event?",
       "description": "Denote the initiating event or circumstance",
       "type": "enum",
       "options": []
    },
    {
        "name": "EVENT",
        "asktohuman": "What was the event?",
        "description": "Specify the actual event or incident",
        "type": "enum",
        "options": []
    },
    {
        "name": "DAMAGE",
        "asktohuman": "What damage was caused?",
        "description": "Specify the type or extent of damage",
        "type": "enum",
        "options": []
    },
    {
        "name": "IMPACTED_OBJECT",
        "asktohuman": "What object was impacted?",
        "description": "Specify the object or item that sustained damage or was affected",
        "type": "enum",
        "options": []
    },
    {
        "name": "SERVICE",
        "asktohuman": "What service is required?",
        "description": "Specify the kind of service or action needed post-event",
        "type": "enum",
        "options": []
    }
]

HARMONIZED_ONTOLOGY = json.dumps(HARMONIZED_ONTOLOGY)

_SYSTEM_PROMPT = f"""User will input definitions in json format.
Your role is to output one or more entries in order to aggregate iteratively a harmonized ontology.

Current ontology is as follows :
BEGIN ONTOLOGY
{HARMONIZED_ONTOLOGY}
END ONTOLOGY

Your task is to reformulate this ontology entry if needed through several possibilities :
- splitting it into more atomic elements
- outputting an existing element if it expresses the same concept or a very similar one
- copying an existing element to add a possible option
- rewriting a boolean entry containing a reference to a numerical value to a number entry

You must respect the entry format and reuse the same wording as much as possible."""

EXAMPLE_1 = [
    ChatMessage(
        role="user",
        content=json.dumps({
            "name": "ACCIDENT",
            "asktohuman": "Was the breakdown caused by an accident?",
            "description": "Denote if the breakdown was caused by an accident",
            "type": "boolean"
        })
    ),
    ChatMessage(
        role="assistant",
        content=json.dumps({
            "name": "EVENT",
            "asktohuman": "What was the event?",
            "description": "Specify the actual event or incident",
            "type": "enum",
            "options": ["ACCIDENT"]
        })
    ),
]

EXAMPLE_2 = [
    ChatMessage(
        role="user",
        content=json.dumps({
            "name": "VEHICLE_HAS_MOT_AND_ROAD_FUND_LICENCE",
            "asktohuman": "Does the vehicle have a current MOT certificate and valid road fund licence or tax disc on display?",
            "description": "Denote if the vehicle has a current MOT certificate and valid road fund licence or tax disc on display",
            "type": "boolean"
        })
    ),
    ChatMessage(
        role="assistant",
        content=json.dumps([
            {
                "name": "VEHICLE_HAS_MOT",
                "asktohuman": "Does the vehicle have a current MOT certificate?",
                "description": "Denote if the vehicle has a current MOT certificate",
                "type": "boolean"
            },
            {
                "name": "VEHICLE_HAS_ROAD_FUND_LICENCE",
                "asktohuman": "Is there a valid road fund licence or tax disc on display?",
                "description": "Denote if the vehicle has a valid road fund licence or tax disc on display",
                "type": "boolean"
            }
        ])
    ),
]


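def harmonize_lap(lap, harmonized_ontology, llm):
    """Ask the LLM to map each LAP item onto the harmonized ontology."""
    # NOTE: _SYSTEM_PROMPT is built from the module-level HARMONIZED_ONTOLOGY,
    # so this argument is not yet injected into the prompt.
    if not harmonized_ontology:
        harmonized_ontology = HARMONIZED_ONTOLOGY

    harmonized_lap = []
    for lap_item in lap:
        messages = (
            [ChatMessage(role="system", content=_SYSTEM_PROMPT)]
            + EXAMPLE_1
            + EXAMPLE_2
            + [ChatMessage(role="user", content=json.dumps(lap_item))]
        )
        reply = llm.chat(messages).message.content
        try:
            harmonized_lap_item = json.loads(reply)
        except (json.JSONDecodeError, TypeError):
            # Keep the original item when the reply is not valid JSON.
            print(f"failed to load to json: {reply}")
            harmonized_lap.append(lap_item)
            continue

        if isinstance(harmonized_lap_item, list):
            if len(harmonized_lap_item) > 1:
                # The entry was split into several atomic entries.
                harmonized_lap.extend(harmonized_lap_item)
                continue
            if not harmonized_lap_item:
                # Empty list: nothing to merge, keep the original item.
                harmonized_lap.append(lap_item)
                continue
            harmonized_lap_item = harmonized_lap_item[0]

        if isinstance(lap_item, dict) and isinstance(harmonized_lap_item, dict):
            # Harmonized keys override the original ones on collision.
            new_lap_item = {**lap_item, **harmonized_lap_item}
        else:
            # Rule strings and other non-dict values cannot be merged.
            new_lap_item = harmonized_lap_item
        harmonized_lap.append(new_lap_item)
        print(f"LAP: {lap_item} new LAP: {new_lap_item}")

    return harmonized_lap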
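
The merge `{**lap_item, **harmonized_lap_item}` at the end of `harmonize_lap` keeps every key from both dicts and lets the harmonized value win when a key appears in both. A small illustration with made-up values (the `"type": "term"` entry is hypothetical and only there to show the override):

```python
lap_item = {"word": "We, us, our", "scope": "MEANING OF WORDS", "type": "term"}
harmonized_item = {"name": "INSURANCE_COMPANY", "type": "enum", "options": []}

# Later unpacking wins: "type" ends up as "enum", all other keys are kept.
merged = {**lap_item, **harmonized_item}
```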


In [28]:
lap

[{'word': 'We, us, our',
  'definition': 'The insurance company providing the coverage.',
  'scope': 'MEANING OF WORDS'},
 '(COVERED_OBJECT="Car") THEN LIMIT{amount:2000, unit:chf, per:claim}']

In [29]:
harmonized_lap = harmonize_lap(lap, [], LLM)

LAP: {'word': 'We, us, our', 'definition': 'The insurance company providing the coverage.', 'scope': 'MEANING OF WORDS'} new LAP: {'word': 'We, us, our', 'definition': 'The insurance company providing the coverage.', 'scope': 'MEANING OF WORDS', 'name': 'INSURANCE_COMPANY', 'asktohuman': 'Who is the insurance company providing the coverage?', 'description': 'Specify the insurance company providing the coverage', 'type': 'enum', 'options': []}
failed to load to json: {"name": "COVERED_OBJECT", "asktohuman": "What object is covered?", "description": "Specify the object or item that is covered", "type": "enum", "options": ["Car"]}

{"name": "LIMIT", "asktohuman": "What is the limit?", "description": "Specify the limit amount, unit, and per claim", "type": "object", "properties": {"amount": {"type": "number", "asktohuman": "What is the limit amount?"}, "unit": {"type": "enum", "options": ["chf"], "asktohuman": "What is the unit of the limit?"}, "per": {"type": "enum", "options": ["claim"],

In [31]:
harmonized_lap

[{'word': 'We, us, our',
  'definition': 'The insurance company providing the coverage.',
  'scope': 'MEANING OF WORDS',
  'name': 'INSURANCE_COMPANY',
  'asktohuman': 'Who is the insurance company providing the coverage?',
  'description': 'Specify the insurance company providing the coverage',
  'type': 'enum',
  'options': []},
 '(COVERED_OBJECT="Car") THEN LIMIT{amount:2000, unit:chf, per:claim}']

In [32]:
harmonized_lap_with_ontology = apply_ontology_to_lap(ontologies, harmonized_lap, LLM)

JSONDecodeError: Expecting value: line 1 column 1 (char 0)
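
`Expecting value: line 1 column 1 (char 0)` is the error `json.loads` raises when the input does not start with a JSON value, for example an empty string or a reply that opens with plain prose. Which of these happened inside `apply_ontology_to_lap` is not visible here. A defensive wrapper along the following lines could make such a step degrade gracefully; `loads_or_default` is a hypothetical helper, not an existing project function.

```python
import json


def loads_or_default(reply, default=None):
    """Parse a JSON reply, returning `default` for empty or non-JSON text."""
    if reply is None or not reply.strip():
        return default
    try:
        return json.loads(reply)
    except json.JSONDecodeError:
        return default


empty_result = loads_or_default("")
prose_result = loads_or_default("Sure, here is the result:")
valid_result = loads_or_default('{"name": "EVENT"}')
```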

In [None]:
lap_json = convert_lap_to_json(harmonized_lap_with_ontology, LLM)

In [36]:

from src.legacy.llm import LLM
from llama_index.llms import ChatMessage
LLM.chat([ChatMessage(role="system", content="you're a joker"), ChatMessage(role="user", content="tell me a joke")]).message.content

"Why don't scientists trust atoms? Because they make up everything!"