### Requirements
To run this Jupyter notebook yourself, make sure that Python 3.11 and the pip package manager are installed on your system.
Then install the dependencies by running `pip install -r requirements.txt`.

### Import
Next, all required functions and classes are imported.
This example uses the GPT-4 model. Enter your OpenAI API key in the cell below.

In [None]:
from io import BytesIO

from service.knowledge_injection import inject_knowledge
from service.language_model_connection import (
    KnowledgeLevel,
    LanguageModel,
    LanguageModelConnection,
)
from service.splitter import load_and_split_text

llm = LanguageModelConnection(LanguageModel.GPT_4, "ENTER KEY HERE")
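
Pasting the key directly into the cell makes it easy to commit it by accident. A minimal sketch of an alternative, assuming the key is stored in an environment variable; the variable name `OPENAI_API_KEY` and the helper `read_api_key` are assumptions for illustration and are not part of this repository:

```python
import os


def read_api_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Return the API key stored in `env_var`, or raise if it is missing."""
    # Hypothetical helper: the variable name is an assumption, not a project setting.
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first.")
    return key
```

With this helper, the connection above could be created as `LanguageModelConnection(LanguageModel.GPT_4, read_api_key())` instead of passing the placeholder string.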

### Determining the prior knowledge
To help the user, a questionnaire can be generated that gauges their prior knowledge of regulatory texts.

In [2]:
chunk = "1. This Regulation shall not invalidate any EU type-approvals granted to vehicles, systems, components or separate technical units which were granted in accordance with Regulation (EC) No <ent>78/2009</ent><ent_desc>This Regulation lays down requirements for the construction and functioning of motor vehicles and frontal protection systems in order to reduce the number and severity of injuries to pedestrians and other vulnerable road users who are hit by the fronts of vehicles and in order to avoid such collisions.</ent_desc>, Regulation (EC) No <ent>79/2009</ent><ent_desc>This Regulation establishes requirements for the type-approval of motor vehicles with regard to hydrogen propulsion and for the type-approval of hydrogen components and hydrogen systems. This Regulation also establishes requirements for the installation of such components and systems.</ent_desc> or Regulation (EC) No <ent>661/2009</ent><ent_desc>This Regulation establishes requirements: 1. for the type-approval of motor vehicles, their trailers and systems, components and separate technical units intended therefor with regard to their safety, 2. for the type-approval of motor vehicles, in respect of tyre pressure monitoring systems, with regard to their safety, fuel efficiency and CO2 emissions and, in respect of gear shift indicators, with regard to their fuel efficiency and CO2 emissions; and 3. for the type-approval of newly-manufactured tyres with regard to their safety, rolling resistance performance and rolling noise emissions.</ent_desc> and their implementing measures, by 5 July 2022, unless the relevant requirements applying to such vehicles, systems, components or separate technical units have been modified, or new requirements have been added, by this Regulation and the delegated acts adopted pursuant to it, as further specified in the implementing acts adopted pursuant to this Regulation."

print(llm.generate_questionnaire(chunk))

{'questions': [{'question': "What does 'EU type-approvals' refer to in the context of vehicle regulations?", 'answers': ["A. EU's general approval for vehicle use in any context", 'B. Certifications granted to vehicles meeting specific EU standards', 'C. Approval for EU funding for vehicle manufacturers', "D. EU's annual vehicle inspection requirement"], 'correct_answer': 'B. Certifications granted to vehicles meeting specific EU standards'}, {'question': 'What is the primary purpose of implementing measures in vehicle regulations?', 'answers': ['A. To increase the cost of vehicle production', 'B. To ensure compliance with updated safety and efficiency standards', 'C. To restrict the import of non-EU vehicles', 'D. To promote the use of public transportation'], 'correct_answer': 'B. To ensure compliance with updated safety and efficiency standards'}, {'question': "What does the term 'delegated acts' imply within the context of EU regulations?", 'answers': ['A. Acts performed by tempora
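
The printed dictionary has a `questions` list whose entries carry `question`, `answers` and `correct_answer` keys. A rough sketch of how a user's answers could be scored against that structure follows; `score_answers` is a hypothetical helper, not a function of this project, and `example` is a hand-written stand-in that reuses the first two questions from the output above:

```python
def score_answers(questionnaire: dict, user_answers: list[str]) -> float:
    """Return the fraction of questions answered correctly (0.0 to 1.0)."""
    questions = questionnaire["questions"]
    if not questions:
        return 0.0
    # Unanswered questions (a shorter user_answers list) count as wrong.
    correct = sum(
        1
        for item, given in zip(questions, user_answers)
        if given == item["correct_answer"]
    )
    return correct / len(questions)


# Stand-in data with the same keys as the generated questionnaire.
example = {
    "questions": [
        {
            "question": "What does 'EU type-approvals' refer to in the context of vehicle regulations?",
            "answers": [
                "A. EU's general approval for vehicle use in any context",
                "B. Certifications granted to vehicles meeting specific EU standards",
                "C. Approval for EU funding for vehicle manufacturers",
                "D. EU's annual vehicle inspection requirement",
            ],
            "correct_answer": "B. Certifications granted to vehicles meeting specific EU standards",
        },
        {
            "question": "What is the primary purpose of implementing measures in vehicle regulations?",
            "answers": [
                "A. To increase the cost of vehicle production",
                "B. To ensure compliance with updated safety and efficiency standards",
                "C. To restrict the import of non-EU vehicles",
                "D. To promote the use of public transportation",
            ],
            "correct_answer": "B. To ensure compliance with updated safety and efficiency standards",
        },
    ]
}

user_answers = [
    "B. Certifications granted to vehicles meeting specific EU standards",
    "A. To increase the cost of vehicle production",
]
print(score_answers(example, user_answers))  # prints 0.5
```

How such a score is mapped to a `KnowledgeLevel` is not shown in this notebook, so that step is left out here.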

### Chunk Summary
As an intermediate step in the summarization process, a summary is generated for each chunk. This chunk summary consists of the stakeholders involved, the key information in the chunk, and a complete summary of the chunk.

In [3]:
print(llm.generate_chunk_summary(chunk))

{'stakeholder': ['vehicle manufacturers', 'system manufacturers', 'component manufacturers', 'technical unit manufacturers'], 'key_information': ['Existing EU type-approvals for vehicles and related systems remain valid unless modified by new requirements.', 'Regulation (EC) No 78/2009 focuses on reducing injuries to pedestrians from vehicle fronts.', 'Regulation (EC) No 79/2009 deals with hydrogen propulsion and systems in vehicles.', 'Regulation (EC) No 661/2009 covers vehicle safety, fuel efficiency, and emissions standards.', 'Changes or additions to requirements by this Regulation and subsequent delegated acts may affect the validity of existing approvals.'], 'chunk_summary': 'Existing EU type-approvals for vehicles, systems, components, and technical units remain valid unless altered by this Regulation or new requirements are added. Specific regulations address pedestrian safety, hydrogen propulsion systems, and vehicle safety and emissions. Manufacturers of these vehicles and sy
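
Each chunk summary is a dictionary with the keys `stakeholder`, `key_information` and `chunk_summary`. When several chunks are summarized, the same stakeholder is likely to appear more than once. The following sketch merges the stakeholder lists while keeping their first-seen order; `collect_stakeholders` is a hypothetical helper for illustration, and the sample dictionaries are shortened stand-ins rather than real model output:

```python
def collect_stakeholders(chunk_summaries: list[dict]) -> list[str]:
    """Merge the 'stakeholder' lists, dropping duplicates but keeping order."""
    seen = set()
    merged = []
    for summary in chunk_summaries:
        # A summary without the key simply contributes nothing.
        for name in summary.get("stakeholder", []):
            if name not in seen:
                seen.add(name)
                merged.append(name)
    return merged


# Shortened stand-ins that only carry the key used here.
sample_summaries = [
    {"stakeholder": ["vehicle manufacturers", "system manufacturers"]},
    {"stakeholder": ["system manufacturers", "component manufacturers"]},
]
print(collect_stakeholders(sample_summaries))
# prints ['vehicle manufacturers', 'system manufacturers', 'component manufacturers']
```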

### Full Summary
As a last step, the full workflow is tested: a file is split into chunks, knowledge is injected into each chunk, and a final summary is generated. This example assumes a user without prior knowledge.

In [4]:
with open("tmp/CELEX_02019R2144-20220905_EN_TXT.pdf.tei.xml", "rb") as fh:
    buf = BytesIO(fh.read())

# Split the document into chunks.
docs = load_and_split_text(text=buf.getvalue(), chunk_size=15000, chunk_overlap=0, splitter_type="Text Structure")
summaries = []

# Inject knowledge into each chunk, then summarize the enriched chunk.
for doc in docs:
    enriched = inject_knowledge(doc)
    summaries.append(llm.generate_chunk_summary(enriched))

# Generate the full summary for a user without prior knowledge.
summary = llm.generate_policy_summary(summaries, knowledge_level=KnowledgeLevel.NO.name)
display(summary)
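
To illustrate what the `chunk_size` and `chunk_overlap` arguments control, here is a simplified character-based splitter. It is only a sketch under the assumption that both values are measured in characters; `split_fixed` is a hypothetical function, and the project's `load_and_split_text` with the "Text Structure" splitter may well work differently:

```python
def split_fixed(text: str, chunk_size: int, chunk_overlap: int = 0) -> list[str]:
    """Cut `text` into pieces of at most `chunk_size` characters.

    Consecutive pieces share `chunk_overlap` characters.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    # Each new chunk starts `step` characters after the previous one.
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


print(split_fixed("abcdefghij", chunk_size=4))
# prints ['abcd', 'efgh', 'ij']
print(split_fixed("abcdefghij", chunk_size=4, chunk_overlap=2))
# prints ['abcd', 'cdef', 'efgh', 'ghij', 'ij']
```

With `chunk_overlap=0`, as in the cell above, no text is repeated between chunks, so each chunk summary covers a disjoint part of the document.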

