# 1. General info 
- Cells you need to run:
  1. Cell 3.1.   --> pip installs
  2. Cell 4.3.   --> PDF generation from text
  3. Cell 4.5.   --> loading the model using transformers (as in the demo initially given to Caspar)

     
- If loading the model raises the following error, restart the kernel and retry:

  Error: _Some modules are dispatched on the CPU or the disk. Make sure you have enough GPU RAM to fit
          the quantized model. If you want to dispatch the model on the CPU or the disk while keeping
          these modules in 32-bit, you need to set `load_in_8bit_fp32_cpu_offload=True` and pass a custom
          `device_map` to `from_pretrained`. Check
          https://huggingface.co/docs/transformers/main/en/main_classes/quantization#offload-between-cpu-and-gpu
          for more details._
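If a kernel restart does not free enough GPU memory, the error message itself points at an alternative: let the modules that do not fit stay on the CPU in 32-bit. The following is a minimal sketch, assuming the model is loaded in 8-bit through bitsandbytes (cell 4.5 is not shown here, so that is an assumption). The linked documentation names the flag `llm_int8_enable_fp32_cpu_offload`. The helper names `build_quantization_kwargs` and `load_quantized_model` are hypothetical, and the documentation pairs the flag with a hand-written `device_map`; whether `"auto"` is sufficient may depend on the installed version.

```python
def build_quantization_kwargs(allow_cpu_offload: bool) -> dict:
    """Keyword arguments for transformers.BitsAndBytesConfig (8-bit loading assumed)."""
    kwargs = {"load_in_8bit": True}
    if allow_cpu_offload:
        # Keep modules that do not fit on the GPU on the CPU in fp32.
        kwargs["llm_int8_enable_fp32_cpu_offload"] = True
    return kwargs


def load_quantized_model(model_id: str, allow_cpu_offload: bool = True):
    """Hypothetical loader; needs transformers, accelerate and bitsandbytes."""
    # Imported lazily so the helper above can be used without the GPU stack.
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig

    config = BitsAndBytesConfig(**build_quantization_kwargs(allow_cpu_offload))
    return AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=config,
        device_map="auto",  # assumption: let accelerate place the modules
    )
```

A call would look like `load_quantized_model("BramVanroy/Llama-2-13b-chat-dutch")`. Modules that end up on the CPU run there, so generation is expected to be slower than with a model that fits on the GPU entirely.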

# 2. Useful documentation: 

- Chatbot Arena Leaderboard from HuggingFace can be found [here](https://huggingface.co/spaces/lmsys/chatbot-arena-leaderboard)
- __LangChain__: [Stuff, Map-Reduce & Refine](https://python.langchain.com/docs/use_cases/summarization)
- __LangChain__: [Quick start](https://python.langchain.com/docs/get_started/quickstart)
- __LangChain__ __HuggingFace__: [Click here](https://python.langchain.com/docs/integrations/chat/huggingface)
  1) Utilize the HuggingFaceTextGenInference, HuggingFaceEndpoint, or HuggingFaceHub integrations to instantiate an LLM.
  2) Utilize the ChatHuggingFace class to enable any of these LLMs to interface with LangChain’s Chat Messages abstraction.
  3) Demonstrate how to use an open-source LLM to power a ChatAgent pipeline
- __AutoModelForCausalLM__: how we managed to load the Dutch model for LangChain processing is described [here](https://python.langchain.com/docs/integrations/llms/huggingface_pipelines)
- __LlamaCpp__: [Parameters description](https://api.python.langchain.com/en/latest/llms/langchain_community.llms.llamacpp.LlamaCpp.html#)
- __Google Cloud Generative AI__ - Language __GitHub__ repository: [Click here](https://github.com/GoogleCloudPlatform/generative-ai/blob/main/language/use-cases/document-summarization/summarization_large_documents_langchain.ipynb)  
- __Medium__ article: [Generating Summaries for Large Documents with Llama2 using Hugging Face and Langchain](https://medium.com/@ankit941208/generating-summaries-for-large-documents-with-llama2-using-hugging-face-and-langchain-f7de567339d2)
- __Medium__ article: [Long Text Summarization, RetrievalQA and Vector Databases -LangChain Arxiv Tutor](https://medium.com/@baptisteloquette.entr/langchain-arxiv-tutor-long-text-summarization-retrievalqa-and-vector-databases-6d5cb1dc7e14)
- __Medium__ article: [Retrieval-Augmented Generation (RAG) architecture](https://pub.towardsai.net/advanced-rag-02-unveiling-pdf-parsing-b84ae866344e)
- __Medium__ article: [Prompt Engineering: How to Trick AI into Solving Your Problems](https://towardsdatascience.com/prompt-engineering-how-to-trick-ai-into-solving-your-problems-7ce1ed3b553f)
- __Medium__ article: [Llama 2 Prompt Engineering: Extracting Information From Articles Examples](https://medium.com/@eboraks/llama-2-prompt-engineering-extracting-information-from-articles-examples-45158ff9bd23)
- [Pinecone LangChain AI Handbook](https://www.pinecone.io/learn/series/langchain/langchain-conversational-memory/#ConversationChain)


# 3. Pip installs, Different Inputs (Pdf generation, Input texts, ...)

## 3.1. Installs

In [1]:
# Only needed when the llama.cpp (Cpp) method is used: build llama-cpp-python with cuBLAS and download the GGUF file for the Llama model
#!CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install --upgrade --force-reinstall llama-cpp-python --no-cache-dir
#!huggingface-cli download TheBloke/Llama-2-13B-GGUF llama-2-13b.Q6_K.gguf --local-dir . --local-dir-use-symlinks False
import time
start_time = time.time()
!pip install --upgrade transformers     # Installs the package, or upgrades an existing install
!pip install accelerate                 # Necessary for llama model
!pip install bitsandbytes
!pip install langchain
!pip install huggingface_hub
!pip install langdetect
!pip install tiktoken
!pip install pypdf
!pip install reportlab
!pip install pandas
!pip install streamlit
end_time = time.time()

print(f"Installing time: {end_time - start_time:.1f} sec")

Collecting transformers
  Downloading transformers-4.38.1-py3-none-any.whl.metadata (131 kB)
Collecting huggingface-hub<1.0,>=0.19.3 (from transformers)
  Downloading huggingface_hub-0.20.3-py3-none-any.whl.metadata (12 kB)
Collecting regex!=2019.12.17 (from transformers)
  Downloading regex-2023.12.25-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (40 kB)
Collecting tokenizers<0.19,>=0.14 (from transformers)
  Downloading tokenizers-0.15.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (6.7 kB)
Collecting safetensors>=0.4.1 (from transformers)
  Downloading safetensors-0.4.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (3.8 kB)
Collecting tqdm>=4.27 (from transformers)
  Downloading 
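
The install cell calls pip once per package. As an alternative sketch, the same packages can be passed to a single `pip install` call, which lets pip resolve their shared dependencies in one pass. The helper names `build_pip_command` and `install_all` are hypothetical; the package list is copied from the cell above.

```python
import subprocess
import sys
import time

# Package list copied from the install cell above.
PACKAGES = [
    "transformers", "accelerate", "bitsandbytes", "langchain",
    "huggingface_hub", "langdetect", "tiktoken", "pypdf",
    "reportlab", "pandas", "streamlit",
]


def build_pip_command(packages, upgrade=False):
    """Return the argv list for one `pip install` call in the current interpreter."""
    command = [sys.executable, "-m", "pip", "install"]
    if upgrade:
        command.append("--upgrade")
    return command + list(packages)


def install_all(packages=PACKAGES):
    """Run the install and return the elapsed time in seconds."""
    start = time.time()
    subprocess.run(build_pip_command(packages), check=True)
    return time.time() - start
```

In the notebook this would be used as `print(f"Installing time: {install_all():.1f} sec")`. Using `sys.executable -m pip` installs into the interpreter that runs the kernel.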

In [None]:
# Attempt to download the model locally so it could be loaded for LangChain (in the end this was not used)
!huggingface-cli download BramVanroy/Llama-2-13b-chat-dutch --local-dir ./Bram --local-dir-use-symlinks False

In [None]:
!huggingface-cli download meta-llama/Llama-2-13b-chat-hf --local-dir ./Bram --local-dir-use-symlinks False

In [None]:
# TODO: check whether this cell is still needed:
!pip install google-cloud-translate

# T5 needs the following package:
!pip install sentencepiece
  

In [None]:
# create the requirements.txt based on the installed packages
!pip freeze > requirements.txt
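
`pip freeze` records every package in the environment, including transitive dependencies. As a complementary sketch, the standard-library `importlib.metadata` module can pin only the packages the notebook installs directly. The helper name `pinned_requirements` is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def pinned_requirements(packages):
    """Return 'name==version' lines for the packages that are installed."""
    lines = []
    for name in packages:
        try:
            lines.append(f"{name}=={version(name)}")
        except PackageNotFoundError:
            # Not installed in this environment: skip instead of failing.
            continue
    return lines
```

Writing `"\n".join(pinned_requirements(["transformers", "langchain", "pypdf"]))` to a file gives a shorter requirements list than the full freeze, at the cost of leaving transitive versions unpinned.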

## 3.2. Different Inputs: 
3.2.1. text  
3.2.2. pdf  
3.2.3. text transformed to pdf  

### 3.2.1. Text (6p): [Landelijk dekkend netwerk van infrastructuren](https://test.polpo.nl/nl/document/eb024e5c-7779-4d2c-9b63-b6e2fe0a2431)

In [None]:
some_text = '''Landelijk dekkend netwerk van infrastructuren
Origineel: Tweede KamerDatum: 21-01-2024
Tweede Kamer der Staten-Generaal
2
Vergaderjaar 2023–2024




27 529 Informatie- en Communicatietechnologie (ICT) in
de Zorg




Nr. 313 BRIEF VAN DE MINISTER VAN VOLKSGEZONDHEID, WELZIJN EN
SPORT

Aan de Voorzitter van de Tweede Kamer der Staten-Generaal

Den Haag, 22 januari 2024

De gezondheidszorg staat onder grote druk; door de vergrijzing groeit de
vraag naar zorg, terwijl er minder capaciteit beschikbaar is door een
toenemend personeelstekort. De regeldruk en administratieve lasten zijn
hoog en zorgverleners zijn veel tijd kwijt aan het verzamelen, overnemen
en invoeren van de benodigde informatie voor het verlenen van goede
zorg. De zorgverlening vindt in toenemende mate plaats in een netwerk
van meerdere partijen, wat beter ondersteund moet worden met snelle en
veilige gegevensuitwisseling. Tegelijkertijd groeit de behoefte aan data
voor secundair gebruik ten behoeve van wetenschappelijk onderzoek,
kwaliteitsmonitoring, gezondheidsbeleid en AI-toepassingen. De noodzaak
om een landelijk dekkend netwerk van infrastructuren – voor de volle
breedte van de zorg – te realiseren, voor het uitwisselen en beschikbaar
stellen van relevante data aan patiënt, zorgverlener en onderzoeker,
neemt toe. Daarom heeft het zorgveld aan het Ministerie van VWS
gevraagd regie te pakken en sturing te geven aan de realisatie van een
landelijk dekkend netwerk.

In de brief van mijn ambtsvoorganger van 13 april 2023 over het Landelijk
dekkend netwerk van infrastructuren1 heeft hij uw Kamer geïnformeerd
over de analyse, die op zijn verzoek is uitgevoerd, naar scenario's die
aanvullend op het huidige beleid een toekomstbestendig landelijk
dekkend netwerk realiseren. Hieruit kwam het advies om gelijktijdig een
dubbele beweging te omarmen en te stimuleren:
Een gedistribueerd communicatienetwerk voor (geprotocolleerde)
overdracht van gegevens door één-op-één communicatie tussen
zorgverleners én om data beschikbaar te kunnen stellen voor andere
doeleinden.



1
Kamerstukken II 2022/23, 27 529, nr. 293




kst-27529-313
ISSN 0921 - 7371
's-Gravenhage 2024 Tweede Kamer, vergaderjaar 2023–2024, 27 529, nr. 313 1
Een data-centrische oplossing, bestaande uit gekoppelde dataplatfor-
men, die het gebruik van data scheidt van de functionaliteit en
databeschikbaarheid voor primair en secundair gebruik, alsmede
gezamenlijke dossiervorming in de context van netwerkzorg, faciliteert.
Mijn ambtsvoorganger heeft daarbij aangegeven beide geadviseerde
richtingen, met de kennis en expertise van veldpartijen, verder uit te laten
werken. In deze brief geef ik invulling aan de toezegging om uw Kamer te
informeren over de uitwerking van de geadviseerde richting en de te
nemen vervolgstappen om tot een landelijk dekkend netwerk te komen.

De infrastructuur moet de uitdagingen van de zorg ondersteunen

De technologische ontwikkelingen gaan razendsnel en bieden veel
potentie voor het verbeteren van de kwaliteit van de zorg en het vermin-
deren van de administratieve lasten. Er worden regionaal allerlei
initiatieven gestart om toepassingen te ontwikkelen, die netwerk- en
hybride zorg kunnen ondersteunen. Belangrijke innovaties, want met de
juiste en volledige data kan een zorgverlener betere en veilige zorg
leveren. En kunnen burgers meebeslissen over voor hen passende zorg.
We moeten echter voorkomen dat de huidige versnippering in het
zorglandschap continueert en zelfs versterkt wordt, door allerlei nieuwe
initiatieven met eigen standaarden die niet onderling interoperabel zijn. Er
is behoefte aan een solide en toekomstbestendige landelijke infra-
structuur, waarmee het zorgproces ondersteund wordt en waarop verdere
technologische ontwikkelingen kunnen plaats vinden. Open internationale
standaarden moeten hierbij het uitgangspunt vormen, opdat er geprofi-
teerd kan worden van internationale kennis en innovaties en de data
uiteindelijk niet alleen landelijk maar ook Europees via de European
Health Data Space (EHDS) kan stromen.

In de vorige brief is benoemd dat zorginformatie vaak binnen één bepaald
zorgproces wordt vastgelegd en gedeeld; er vindt een geprotocolleerde
overdracht van gegevens plaats. Denk hierbij aan een verwijzing van een
huisarts naar een specialist, of de overdracht van ziekenhuis naar
verpleeghuis of wijkverpleging. De geprioriteerde gegevensuitwisselingen
onder de Wet elektronische gegevensuitwisseling in de zorg (Wegiz)
ondersteunen dit. In de Nationale visie en strategie op het gezondheidsin-
formatiestelsel2 (NVS) wordt het belang van deze geprotocolleerde
overdrachten onderschreven, maar wordt ook onderkend dat alleen het
realiseren van een infrastructuur voor geprotocolleerde gegevensuit-
wisseling niet voldoende is om zorgverleners te faciliteren bij het verlenen
van hybride en netwerkzorg. Daarom willen we van gegevensuitwisseling
doorgroeien naar databeschikbaarheid.

Een communicatienetwerk voor gegevensuitwisseling

Voor gegevensuitwisseling is een communicatienetwerk nodig. In het
geadviseerde scenario (een gedistribueerd communicatienetwerk) heeft
elke zorgaanbieder zijn eigen knooppunt en kunnen zorgaanbieders
onderling tussen deze knooppunten gegevens uitwisselen zonder
tussenkomst van een derde partij.

De verdere uitwerking van dit scenario en de afstemming hierover met
veldpartijen heeft tot de conclusie geleid dat dit scenario momenteel een
nog te grote stap is. Aangezien veel zorgaanbieders al zijn aangesloten op
een zorginfrastructuur (bijvoorbeeld LSP en Chipsoft Zorgplatform) is het
efficiënter om eerst toe te werken naar een hybride situatie. In deze
hybride situatie ontsluiten bestaande infrastructuren de data bij hun

2
Kamerstukken II, 2022/23, 27 529, nr. 292




Tweede Kamer, vergaderjaar 2023–2024, 27 529, nr. 313 2
gekoppelde zorgaanbieders en stellen die gegevens vervolgens
beschikbaar via een (gezamenlijk) knooppunt. Een zorgaanbieder zonder
zorginfrastructuur kan een eigen knooppunt gebruiken. De verzameling
van al deze knooppunten vormt het communicatienetwerk.
Dit biedt alle zorgaanbieders de mogelijkheid om onderling (rechtstreeks
via een eigen knooppunt of via een bestaande infrastructuur) gegevens uit
te wisselen.
Om deze knooppunten te ontwikkelen en verbinden is het noodzakelijk om
te komen tot een landelijke kaderstelling voor standaardisatie van taal en
techniek, in de vorm van een landelijk vertrouwensstelsel (LVS)3. Het LVS
omvat het geheel aan technische, organisatorische en juridische
afspraken die zorgt voor vertrouwen in de landelijke elektronische
gegevensuitwisseling en het gebruik van gezondheidsgegevens.
Generieke functies en bijbehorende voorzieningen vormen een belangrijk
onderdeel van dit stelsel. In de recente Kamerbrief over Generieke
functies4 benoemt mijn ambtsvoorganger de interventies die nodig zijn
om de randvoorwaardelijke generieke functies en voorzieningen te
realiseren.

Doorgroeien naar databeschikbaarheid

We willen dat patiënten, zorgverleners en onderzoekers digitaal kunnen
beschikken over de juiste informatie, op het juiste moment en op de juiste
plek. Hiervoor zijn toepassingen nodig die data beschikbaar stellen voor
preventie, het primaire zorgproces en secundair datagebruik5. Deze
toepassingen maken gebruik van het communicatienetwerk voor
gegevensuitwisseling om bij meerdere betrokken zorgaanbieders
(tegelijkertijd) actuele en relevante zorggegevens op te vragen.

Veel initiatieven voor dergelijke toepassingen stagneren omdat het lastig
is om op een eenduidige wijze bij verschillende zorgaanbieders data te
ontsluiten en samen te voegen. Dit komt doordat de verschillende
infrastructuren en knooppunten nog niet met elkaar verbonden zijn, maar
ook doordat zorgaanbieders hun data nog niet (voldoende) gestandaardi-
seerd vastleggen.

Er wordt vaak gesproken over data uit de bron, maar «de bron» is niet
altijd geschikt voor (gestandaardiseerde) dataopslag en -ontsluiting. Het
Elektronisch Patiënten Dossier (EPD) en Elektronisch Cliënt Dossier (ECD)
zijn systemen voor zorgaanbieders, die initieel zijn toegespitst op het
functioneel gebruik binnen de organisatie. Om tijdig de juiste data voor
gegevensdeling beschikbaar te stellen, moet een EPD/ECD systeem
functionaliteit en data goed van elkaar kunnen scheiden. Tevens is er niet
altijd sprake van één bronsysteem; in sommige sectoren werken
zorgaanbieders met meerdere applicaties. Met name in de tweede-
lijnszorg kiezen steeds meer zorgaanbieders ervoor om data uit één of
meerdere bronsystemen separaat op te slaan (datawarehouse of data
lake). De data wordt in principe onbewerkt en ongestructureerd
gekopieerd vanuit de bron. De vertaling van die data naar een gestandaar-
diseerd datamodel kan in dezelfde opslagomgeving plaatsvinden – men
spreekt dan over een dataplatform – of wordt als separate functionaliteit
los van de opslag uitgevoerd. Wanneer een EPD- of ECD-systeem niet de
data volgens het afgesproken informatiemodel op kan slaan en/of als er
sprake is van meerdere bronsystemen, kan een dataplatform een

3
Eén van de randvoorwaarden voor gegevensuitwisseling zoals beschreven in de brief van
13 april 2023
4
Kamerstukken II, 2022/23, 27 529, nr. 312
5
Mits daarvoor toestemming is gegeven door de patiënt en aan alle voorwaarden ten aanzien
van identificatie, authenticatie en autorisatie is voldaan




Tweede Kamer, vergaderjaar 2023–2024, 27 529, nr. 313 3
gewenste aanvulling zijn voor een zorgaanbieder. Daarbij is het
uitgangspunt dat de data onder de verantwoordelijkheid en invloedssfeer
van de zorgaanbieder blijft.

Inspringen op een behoefte: netwerk- en integratiediensten

Na het verder uitwerken van de twee geadviseerde scenario's, vanuit het
architectuur- en techniekperspectief, zijn de contouren getoetst in de
praktijk. Aanbieders van regionale en sectorale toepassingen, die een
bijdrage leveren aan het realiseren van databeschikbaarheid, hebben de
functionaliteit van hun oplossing schriftelijk beschreven en vervolgens
persoonlijk toegelicht. Deze gesprekken en documentatie hebben
inzichtelijk gemaakt waar de uitdagingen en knelpunten zitten.

De standaardisatie van taal en techniek gaat niet snel genoeg om de
toenemende vraag naar toepassingen voor databeschikbaarheid te
faciliteren. Aanbieders van toepassingen moeten zorgaanbieders (en
diens leveranciers) ondersteunen bij het ontsluiten van de data en
daarnaast nog allerlei netwerk- en integratiediensten (bijvoorbeeld het
verzamelen, vertalen, valideren, samenvoegen en verrijken van data)
aanbieden om de data bruikbaar te maken voor de toepassingen. Nu
worden al deze «diensten» voor elke toepassing separaat en opnieuw
ontwikkeld en uitgevoerd; hiermee gaat veel tijd en geld verloren. Ook
moet voorkomen worden dat databeschikbaarheid een verdienmodel
wordt; het hele proces van data lokaliseren, opvragen, beschikbaar
stellen, verzamelen en integreren moet non-concurrentieel zijn. Toepas-
singen en diensten daarentegen mogen – met name voor de meer
specifieke functionaliteit – wel concurrentieel zijn om ontwikkeling en
innovatie te bevorderen.

Inmiddels hebben een aantal partijen en zorgkoepels6 hun ambities en
krachten gebundeld in de CumuluZ-coalitie met als doel toe te werken
naar een landelijke data infrastructuur met één non-concurrentieel
data-integratieplatform, die de data bij zorgaanbieders ontsluit, verwerkt
en vervolgens beschikbaar stelt aan toepassingen en diensten.

Aanscherping van de voorgestelde scenario's

Zoals in het begin van deze brief benoemd zijn de volgende twee
scenario's nader uitgewerkt:
Een gedistribueerd communicatienetwerk voor (geprotocolleerde)
overdracht van gegevens door één-op-één communicatie tussen
zorgverleners én om data beschikbaar te kunnen stellen voor andere
doeleinden.
Een data-centrische oplossing, bestaande uit gekoppelde dataplatfor-
men, die het gebruik van data scheidt van de functionaliteit en
databeschikbaarheid voor primair en secundair gebruik, alsmede
gezamenlijke dossiervorming in de context van netwerkzorg, faciliteert.

De uitwerking heeft inzichtelijk gemaakt waar de knelpunten zitten en
welke interventies nodig zijn om tot landelijke gegevensuitwisseling en
databeschikbaarheid te komen. Vervolgens zijn de zorgkoepels en
adviesorganen geconsulteerd over voorgestelde aanscherpingen op de
geadviseerde scenario's.
Deze aanscherpingen zijn:



6
Nederlandse Federatie van Universitair Medische Centra (NFU), Nederlandse Vereniging van
Ziekenhuizen (NVZ), Santeon en mProve




Tweede Kamer, vergaderjaar 2023–2024, 27 529, nr. 313 4
Communicatienetwerk
Om binnen de IZA-termijn te komen tot landelijke gegevensuit-
wisseling wordt gekozen voor een hybride oplossing waarbij
bestaande infrastructuren worden verbonden met elkaar en met
zorgaanbieders die geen onderdeel uitmaken van een specifieke
infrastructuur, maar een eigen knooppunt hebben.
o Verbinden van bestaande infrastructuren: mijn ministerie zorgt voor
het opstellen van «technical agreements» (TA's) per uitwisselings-
vorm, waarmee bestaande infrastructuren met knooppunten
verbonden worden. Hiermee zullen (leveranciers van) zorgaanbie-
ders van bijvoorbeeld ziekenhuiszorg, wijkverpleging en verpleeg-
zorg, de verbinding van infrastructuren beproeven en vervolgens in
gebruik nemen. De overdracht vindt plaats op basis van FHIR en
maakt gebruik van huidige landelijke afsprakenstelsels (zoals
MedMij, Twiin en Health-RI) en het TwiinxNuts groeipad dat richting
2025 wordt ontwikkeld onder het Landelijk vertrouwensstelsel;
o Infrastructuur voor uitwisseling van medische beelden: de DVD-exit
infrastructuur (het Twiin portaal) wordt ingezet als tijdelijke
oplossing, inclusief lokalisatiefunctie voor de benodigde historische
zorgtijdslijn.

Data- en integratieplatform(en)
Het uitgangspunt voor dataopslag is dat data wordt opgeslagen onder
verantwoordelijkheid en invloedssfeer van de zorgaanbieder.
o Hoe de data opgeslagen wordt (bronsysteem of datawarehouse/
data lake/dataplatform) en waar de vertaling naar landelijke
informatiestandaarden plaats vindt, is afhankelijk van de mogelijk-
heden en keuzes van de zorgaanbieder (en haar leverancier);
o Alleen centrale opslag van data (data van meerdere zorgaanbieders
opgeslagen op één en dezelfde locatie) wanneer de noodzaak
aangetoond is en er een juridische grondslag vanuit de Algemene
Verordening Gegevensbescherming (AVG) en de Wet op de
geneeskundige behandelovereenkomst (Wgbo) voor aanwezig is.
Inzetten op integratie- en netwerkdiensten om versnelling te realiseren.
o Kom tot een – op open internationale standaarden gebaseerde –
landelijke data infrastructuur voor primair en secundair gebruik,
waarbij het CumuluZ-concept als uitgangspunt dient voor een
non-concurrentiële data-integratie laag;
o Groei toe naar een publieke voorziening voor data-integratie laag
met daarbij de benodigde integratie en netwerkdiensten, die
bijdraagt aan de realisatie van databeschikbaarheid voor het hele
gezondheidsinformatiestelsel;
o Maak (her)gebruik van de kennis, ervaring en functionaliteit die
aanwezig is bij de bewezen initiatieven en werk actief samen om
die initiatieven te harmoniseren en uiteindelijk te integreren in één
infrastructuur;
o Start met regionale en sectorale initiatieven die aantoonbaar
schaalbaar zijn en ga die na een succesvolle beproeving landelijk
implementeren;
o Investeer alleen in nieuwe initiatieven als die nieuwe functionaliteit
voor de zorgsector toevoegen en passen in de beoogde
doelarchitectuur;
o Integratie- en netwerkdiensten zijn ter aanvulling en niet ter
vervanging van de benodigde eenheid van taal en techniek. Vanuit
kwaliteit, verantwoordelijkheid en kosten perspectief blijft gestan-
daardiseerde vastlegging de verantwoordelijkheid van de zorgaan-
bieder (die hierin gefaciliteerd moet worden door de leverancier
van het bronsysteem) en dient dit zo dicht mogelijk bij de bron
(zorgverlener) plaats te vinden.




Tweede Kamer, vergaderjaar 2023–2024, 27 529, nr. 313 5
De leden van het Informatieberaad zorg hebben hun steun uitgesproken
voor deze koers met de aanscherpingen op de twee scenario's. Dit moet
de komende maanden leiden tot een verbreding van de CumuluZ-coalitie,
met een vertegenwoordiging uit meerdere zorgdomeinen en het inregelen
van de governance van de coalitie (incl. juridische entiteit en rol van het
Ministerie van VWS). Daarnaast wil ik een «werkplaats» inrichten om niet
alleen op bestuurlijk niveau (CumuluZ-coalitie), maar vooral ook op
inhoud partijen actief samen te laten werken om proeftuinen te initiëren
en bestaande, bewezen toepassingen te harmoniseren en integreren in de
doelarchitectuur.

Vervolg

Ik zet de huidige koers van standaardisatie van taal en techniek voort, die
nodig is voor in eerste instantie gegevensuitwisseling groeiend naar
databeschikbaarheid. Zoals ook benoemd in de kamerbrief over Generieke
functies is het noodzakelijk om te komen tot een landelijke kaderstelling,
in de vorm van een landelijk vertrouwensstelsel.

In aanvulling hierop ga ik samen met de CumuluZ-coalitie de regie voeren
op het realiseren van een landelijke data infrastructuur voor primair en
secundair gebruik. Daarbij wil ik toegroeien naar een publieke voorziening
voor de data-integratie laag, met bijbehorende integratie- en netwerk-
diensten, om de doorontwikkeling van gegevensuitwisseling naar
databeschikbaarheid te ondersteunen en versnellen. De mogelijkheid,
intensiteit en snelheid van deze ambitie moeten worden bezien in
samenhang met de middelen die beschikbaar zijn op de aanvullende post
bij het Ministerie van Financiën voor het bevorderen van de gegevensuit-
wisseling in de zorg. Over de inzet van deze middelen moet nadere
besluitvorming plaatsvinden. Ik bericht u na het voorjaar over de stand
van zaken.

De volgende stap is het uitwerken van de aangescherpte scenario's in een
doelarchitectuur en een transitieplan, met concrete acties voor de korte
termijn en langere termijn doelen voor de daaropvolgende jaren. Hierbij
wordt aangesloten op de besluiten die in het IZA-uitvoeringsakkoord zijn
opgenomen en de doelstelling (per plateau) van de NVS. De regie ligt bij
mijn ministerie, en de invulling van het plan zal in afstemming met de
CumuluZ-coalitie en IZA-partijen plaatsvinden. Voor het einde van dit jaar
zal ik uw kamer informeren over de voortgang omtrent de doelstellingen
die in 2025 gerealiseerd moeten worden en de doelarchitectuur en het
transitieplan.

De Minister van Volksgezondheid, Welzijn en Sport,
C. Helder




Tweede Kamer, vergaderjaar 2023–2024, 27 529, nr. 313 6'''

### 3.2.2. PDF
- [Landelijk dekkend netwerk van infrastructuren](https://test.polpo.nl/nl/document/eb024e5c-7779-4d2c-9b63-b6e2fe0a2431)
- [Financieringsmonitor2024](https://test.polpo.nl/nl/document/d0f53dc4-a04b-40f4-b48c-b697760a0d7f/startup*%20%7C%20%22start-ups%22%20%7C%20scaleup*%20%7C%20%22scale-ups%22)

In [None]:
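#!pip install pypdf
# Newer LangChain releases expose this loader from langchain_community.document_loaders
from langchain.document_loaders import PyPDFLoader
pdf_loader = PyPDFLoader("LandelijkDekkendNetwerk.pdf")
docs = pdf_loader.load()              # one Document per PDF page
pages = pdf_loader.load_and_split()   # pages split further by the default text splitter
print(len(pages))
print(len(docs))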

In [None]:
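n_splits = 2                              # desired number of chunks
list_page_content = [page.page_content for page in pages]
some_text = ''.join(list_page_content)    # full document text as one string
ntchar = len(some_text)                   # total number of characters
chunk_size = ntchar // n_splits           # floor division; a remainder can produce an extra chunk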

In [None]:
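from langchain.text_splitter import RecursiveCharacterTextSplitter
r_splitter = RecursiveCharacterTextSplitter(
    chunk_size=chunk_size,
    chunk_overlap=0,
    # Tried in order: paragraph break, line break, end of sentence, space, single character
    separators=["\n\n", "\n", r"(?<=\. )", " ", ""],
    # Needed for the lookbehind pattern to be treated as a regex; otherwise it is matched literally
    is_separator_regex=True,
)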

In [None]:
docs = r_splitter.split_text(some_text)
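
The chunk size above is the character count divided by `n_splits`, rounded down. When the count is not an exact multiple, the remainder spills into an extra chunk. The sketch below illustrates the arithmetic with plain character slicing; it is not the LangChain splitter, which also respects separators, so its chunk counts can differ. The names `chunk_size_for` and `split_fixed` are hypothetical.

```python
import math


def chunk_size_for(n_chars: int, n_splits: int) -> int:
    """Smallest chunk size that covers n_chars in at most n_splits chunks."""
    return math.ceil(n_chars / n_splits)


def split_fixed(text: str, size: int) -> list:
    """Cut text into consecutive slices of at most `size` characters."""
    return [text[i:i + size] for i in range(0, len(text), size)]


text = "x" * 1001                          # stand-in for some_text
floor_size = len(text) // 2                # 500, what int(ntchar/n_splits) gives
ceil_size = chunk_size_for(len(text), 2)   # 501

print(len(split_fixed(text, floor_size)))  # 3 chunks: 500 + 500 + 1
print(len(split_fixed(text, ceil_size)))   # 2 chunks: 501 + 500
```

Rounding up instead of down keeps the number of chunks at or below `n_splits` for a fixed-width split.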

### 3.2.3. Text to PDF
3.2.3.1. Kunstmatige intelligentie  
3.2.3.2. Greek, Dutch and Belgian history  

- __REQUIREMENT__: You first need to run the definition of __create_pdf__ from the __Code__ section 4.3.

#### 3.2.3.1. Elasticsearch snippets, Kunstmatige Intelligentie [related pdf](https://test.polpo.nl/nl/document/2307c590-3f69-4170-88af-fc0841093623/%22kunstmatige%20intelligentie%22)

In [None]:
snippet1 = "Medewerkers gaan we verder ontwikkelen in het toepassen van de menselijke maat in onze dienstverlening. Met ondersteuning van kunstmatige intelligentie gaan we onze brieven leesbaarder en begrijpelijker maken. De benadering is van buiten naar binnen: knelpunten die onze cliënten ervaren worden in kaart gebracht op basis van verschillende vormen van (klant)onderzoek en analyses."
snippet2 = "Het programma Innovatie ondersteunt initiatieven en oplosteams in het effectief organiseren van verbetertrajecten en het bedenken van vernieuwende oplossingen, met kennis over de laatste (technologische) ontwikkelingen en trends. Zo wordt onderzocht hoe kunstmatige intelligentie (zoals ChatGPT) ingezet kan worden, bijvoorbeeld bij het herschrijven van algemene teksten in tientallen brieven om de leesbaarheid te verbeteren. Aandacht voor innovatie en design thinking draagt ook bij aan de gewenste ontwikkeling van een lerende organisatie."
snippet_collection_nl = snippet1 + "\n" + snippet2

snippet_collection_en = '''We will further develop employees in applying the human scale in our services. With support of
artificial intelligence we will make our letters more readable and make it more understandable. The approach is from the outside in: bottlenecks
that our clients experience are mapped on the basis of various forms of (customer) research and analyses. The program
Innovation supports initiatives and solution teams effectively organizing improvement processes and devising innovative ones
solutions, with knowledge of the latest (technological) developments and trends. For example, it is investigated how artificial intelligence (such as
ChatGPT) can be used, for example when rewriting general texts in dozens of letters to improve readability.
Attention to innovation and design thinking also contributes to the desired results development of a learning organization.'''
create_pdf([snippet_collection_nl], "KI_nl.pdf")
create_pdf([snippet_collection_en], "KI_en.pdf")

#### 3.2.3.2. Greek, Dutch and Belgian history (text generated with ChatGPT)
Depending on how the be, nl and gr input fragments are formatted, Map-Reduce gives different output. That is why I create two versions of the PDFs: the first is based on the text1 fragments and the second on the text2 fragments. Note the difference between them: text1 uses \n, whereas text2 uses carriage returns, which leave blank lines between paragraphs. The history1 and history2 PDFs look identical, yet the Map-Reduce output differs strongly. See the Map-Reduce output under __Useful Code__ section 8.
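
One plausible mechanism, stated here as an assumption and not verified against the Map-Reduce cell, is that a splitter preferring `"\n\n"` as separator finds paragraph boundaries only in the blank-line variant. The sketch below uses toy strings, not the notebook's fragments, to show that the two styles split differently even though they read the same once whitespace is collapsed. The helper names `paragraphs` and `rendered` are hypothetical.

```python
import re

text1_style = "Early history.\nMedieval period.\nModern era."       # single newlines
text2_style = "Early history.\n\nMedieval period.\n\nModern era."   # blank lines


def paragraphs(text: str) -> list:
    """Split on blank lines, the first separator a recursive splitter tries."""
    return [p for p in text.split("\n\n") if p.strip()]


def rendered(text: str) -> str:
    """Collapse every run of whitespace into one space, as a rough visual proxy."""
    return re.sub(r"\s+", " ", text).strip()


print(len(paragraphs(text1_style)))  # 1
print(len(paragraphs(text2_style)))  # 3
print(rendered(text1_style) == rendered(text2_style))  # True
```

If this is what happens, the map step receives differently sized pieces for the two variants, which would be enough to change the reduced summary.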

In [18]:
be_history_text1 = '''Belgian history is rich and complex, characterized by its strategic location in Western Europe and \n
its cultural diversity. Here's a brief summary:
Early History: The region now known as Belgium has been inhabited since prehistoric times. \n
It was later settled by Celtic and Germanic tribes before coming under Roman rule in the first century BC. The area flourished during Roman times as part of the province of Gallia Belgica.
Medieval Period: After the fall of the Roman Empire, the region was invaded and settled by various Germanic tribes. In the early Middle Ages, it became part of the Frankish Empire. During this period, the area saw the rise of powerful feudal lords and the emergence of important trading cities like Ghent, Bruges, and Antwerp.
Burgundian and Habsburg Rule: In the 15th century, the Burgundian dukes gained control of much of present-day Belgium. This period saw the flourishing of arts and culture, but also increased centralization of power. The region later came under Habsburg rule as part of the Spanish and Austrian Netherlands.
Dutch Independence: In the 16th and 17th centuries, the Dutch Revolt against Spanish rule led to the independence of the northern provinces of the Netherlands. However, the southern provinces, including present-day Belgium, remained under Spanish control until they were conquered by France in the late 17th century.
French Rule: Belgium became part of France under Napoleon Bonaparte's rule in the early 19th century. During this time, French revolutionary ideals influenced Belgian society and politics.Independence and Kingdom of Belgium: Following the defeat of Napoleon, the Congress of Vienna in 1815 united the southern provinces with the northern provinces to form the United Kingdom of the Netherlands. However, tensions between the Dutch-speaking north and the French-speaking south led to the Belgian Revolution in 1830. Belgium declared independence and established a constitutional monarchy, with Leopold I as its first king.
Industrialization and Colonialism: Throughout the 19th century, Belgium experienced rapid industrialization, particularly in coal mining and steel production. It also established a colonial empire in Africa, notably in the Congo, which was famously exploited under King Leopold II's rule.
20th Century: Belgium was heavily impacted by both World Wars, particularly during World War I when it served as a battleground. The country was occupied by Germany during World War II. After the war, Belgium played a key role in the founding of the European Coal and Steel Community, a precursor to the European Union.
Modern Belgium: Belgium has since become a prosperous and democratic country, known for its multiculturalism, chocolate, beer, and waffles. However, it continues to grapple with linguistic and political tensions between the Dutch-speaking Flanders region and the French-speaking Wallonia region, as well as issues related to regional autonomy and identity.'''

nl_history_text1 = '''The history of the Netherlands is rich and diverse, spanning thousands of years. Here's a brief summary of key periods and events in Dutch history:
Early Settlements: The region that is now the Netherlands has been inhabited since prehistoric times. During the Roman era, it was part of the Roman Empire's frontier region.
Middle Ages: In the early Middle Ages, the Franks established control over the region. The Netherlands gradually emerged as a distinct entity, with the development of feudal states and the growth of trade and commerce.
Golden Age (17th Century): The 17th century is often referred to as the Dutch Golden Age. During this time, the Netherlands experienced a period of economic prosperity, cultural flourishing, and naval dominance. The Dutch East India Company and Dutch West India Company were established, and Amsterdam became a leading financial center.
Colonial Empire: The Dutch established colonies and trading posts around the world, including in the East Indies (present-day Indonesia), Suriname, and the Caribbean. The Dutch colonial empire was significant but eventually declined over time.
Napoleonic Era: In the late 18th and early 19th centuries, the Netherlands fell under French control during the Napoleonic Wars. It later became part of the French Empire.
Independence and Kingdom: The Netherlands gained independence from France in 1815 and became a kingdom under King William I. Belgium initially formed part of the Kingdom of the Netherlands but later separated in 1830.
Industrialization and Modernization: The 19th century saw rapid industrialization and modernization in the Netherlands. The country became known for its innovations in trade, shipping, and agriculture.
World Wars: The Netherlands remained neutral during World War I but was invaded by Nazi Germany in World War II. The country suffered under German occupation but played a role in the Allied liberation of Europe.
Post-War Reconstruction: After World War II, the Netherlands underwent a period of reconstruction and economic recovery. It became a founding member of international organizations such as the United Nations and the European Union.
Contemporary Era: In recent decades, the Netherlands has become known for its progressive social policies, strong economy, and commitment to environmental sustainability. It continues to be a leading global player in areas such as trade, technology, and diplomacy.
This summary provides a broad overview of Dutch history, highlighting key moments and themes that have shaped the nation's identity and development over time.'''

gr_history_text1 = '''Greek history spans thousands of years and is marked by significant contributions to Western civilization, including democracy, philosophy, art, and literature. Here's a brief summary:
Ancient Greece: Ancient Greek civilization emerged around the 8th century BC and was comprised of city-states such as Athens, Sparta, Corinth, and Thebes. This period saw the rise of democracy in Athens, where citizens participated in governance, and the development of philosophy by figures like Socrates, Plato, and Aristotle. Greek art and architecture, exemplified by the Parthenon in Athens, also flourished during this time. The city-states often engaged in conflicts with each other, most notably the Peloponnesian War between Athens and Sparta.
Hellenistic Period: After the conquests of Alexander the Great in the 4th century BC, Greek culture spread throughout the Mediterranean and Middle East, creating a new era known as the Hellenistic period. Greek language, art, and philosophy influenced cultures across the region, including Egypt and Persia.
Roman Greece: Greece became part of the Roman Empire after the defeat of the Greek city-states in the 2nd century BC. During this time, Greece continued to be an important center of culture and learning, with cities like Corinth and Athens remaining influential.
Byzantine Empire: Following the division of the Roman Empire, Greece became part of the Byzantine Empire, centered in Constantinople (modern-day Istanbul). The Byzantine period saw the spread of Christianity and the construction of numerous churches and monasteries across Greece.
Ottoman Rule: Greece fell under Ottoman rule in the 15th century after the fall of Constantinople. The Greeks struggled for independence from Ottoman rule for centuries, culminating in the Greek War of Independence in the early 19th century.
Modern Greece: The Greek War of Independence began in 1821 and eventually led to the establishment of the modern Greek state in 1830, although some territories, including Crete and the Ionian Islands, were not incorporated until later. The monarchy was established, and Otto of Bavaria became the first king of Greece.
20th Century: Greece experienced political instability throughout much of the 20th century, including periods of monarchy, dictatorship, and democratic rule. Greece was occupied by Axis powers during World War II, and a brutal civil war followed the war's end. In 1974, Greece transitioned to democracy after the fall of the military junta.
European Union and Economic Challenges: Greece joined the European Union in 1981 and adopted the euro as its currency in 2001. However, the country faced significant economic challenges in the late 2000s, leading to a sovereign debt crisis and bailout agreements with the EU and International Monetary Fund.
Modern Greece: Today, Greece is a parliamentary republic and a member of the European Union. It remains a popular tourist destination known for its rich history, stunning landscapes, and cultural heritage. However, it continues to grapple with economic issues and challenges related to governance and social welfare. '''


be_history_text2 = '''Belgian history is rich and complex, characterized by its strategic location in Western Europe and its cultural diversity. Here's a brief summary:

Early History: The region now known as Belgium has been inhabited since prehistoric times. It was later settled by Celtic and Germanic tribes before coming under Roman rule in the first century BC. The area flourished during Roman times as part of the province of Gallia Belgica.

Medieval Period: After the fall of the Roman Empire, the region was invaded and settled by various Germanic tribes. In the early Middle Ages, it became part of the Frankish Empire. During this period, the area saw the rise of powerful feudal lords and the emergence of important trading cities like Ghent, Bruges, and Antwerp.

Burgundian and Habsburg Rule: In the 15th century, the Burgundian dukes gained control of much of present-day Belgium. This period saw the flourishing of arts and culture, but also increased centralization of power. The region later came under Habsburg rule as part of the Spanish and Austrian Netherlands.

Dutch Independence: In the 16th and 17th centuries, the Dutch Revolt against Spanish rule led to the independence of the northern provinces of the Netherlands. However, the southern provinces, including present-day Belgium, remained under Spanish control until they were conquered by France in the late 17th century.

French Rule: Belgium became part of France under Napoleon Bonaparte's rule in the early 19th century. During this time, French revolutionary ideals influenced Belgian society and politics.

Independence and Kingdom of Belgium: Following the defeat of Napoleon, the Congress of Vienna in 1815 united the southern provinces with the northern provinces to form the United Kingdom of the Netherlands. However, tensions between the Dutch-speaking north and the French-speaking south led to the Belgian Revolution in 1830. Belgium declared independence and established a constitutional monarchy, with Leopold I as its first king.

Industrialization and Colonialism: Throughout the 19th century, Belgium experienced rapid industrialization, particularly in coal mining and steel production. It also established a colonial empire in Africa, notably in the Congo, which was famously exploited under King Leopold II's rule.

20th Century: Belgium was heavily impacted by both World Wars, particularly during World War I when it served as a battleground. The country was occupied by Germany during World War II. After the war, Belgium played a key role in the founding of the European Coal and Steel Community, a precursor to the European Union.

Modern Belgium: Belgium has since become a prosperous and democratic country, known for its multiculturalism, chocolate, beer, and waffles. However, it continues to grapple with linguistic and political tensions between the Dutch-speaking Flanders region and the French-speaking Wallonia region, as well as issues related to regional autonomy and identity. '''

gr_history_text2 = '''Greek history spans thousands of years and is marked by significant contributions to Western civilization, including democracy, philosophy, art, and literature. Here's a brief summary:

Ancient Greece: Ancient Greek civilization emerged around the 8th century BC and was comprised of city-states such as Athens, Sparta, Corinth, and Thebes. This period saw the rise of democracy in Athens, where citizens participated in governance, and the development of philosophy by figures like Socrates, Plato, and Aristotle. Greek art and architecture, exemplified by the Parthenon in Athens, also flourished during this time. The city-states often engaged in conflicts with each other, most notably the Peloponnesian War between Athens and Sparta.

Hellenistic Period: After the conquests of Alexander the Great in the 4th century BC, Greek culture spread throughout the Mediterranean and Middle East, creating a new era known as the Hellenistic period. Greek language, art, and philosophy influenced cultures across the region, including Egypt and Persia.

Roman Greece: Greece became part of the Roman Empire after the defeat of the Greek city-states in the 2nd century BC. During this time, Greece continued to be an important center of culture and learning, with cities like Corinth and Athens remaining influential.

Byzantine Empire: Following the division of the Roman Empire, Greece became part of the Byzantine Empire, centered in Constantinople (modern-day Istanbul). The Byzantine period saw the spread of Christianity and the construction of numerous churches and monasteries across Greece.

Ottoman Rule: Greece fell under Ottoman rule in the 15th century after the fall of Constantinople. The Greeks struggled for independence from Ottoman rule for centuries, culminating in the Greek War of Independence in the early 19th century.

Modern Greece: The Greek War of Independence began in 1821 and eventually led to the establishment of the modern Greek state in 1830, although some territories, including Crete and the Ionian Islands, were not incorporated until later. The monarchy was established, and Otto of Bavaria became the first king of Greece.

20th Century: Greece experienced political instability throughout much of the 20th century, including periods of monarchy, dictatorship, and democratic rule. Greece was occupied by Axis powers during World War II, and a brutal civil war followed the war's end. In 1974, Greece transitioned to democracy after the fall of the military junta.

European Union and Economic Challenges: Greece joined the European Union in 1981 and adopted the euro as its currency in 2001. However, the country faced significant economic challenges in the late 2000s, leading to a sovereign debt crisis and bailout agreements with the EU and International Monetary Fund.

Modern Greece: Today, Greece is a parliamentary republic and a member of the European Union. It remains a popular tourist destination known for its rich history, stunning landscapes, and cultural heritage. However, it continues to grapple with economic issues and challenges related to governance and social welfare. '''

nl_history_text2 = '''The history of the Netherlands is rich and diverse, spanning thousands of years. Here's a brief summary of key periods and events in Dutch history:

Early Settlements: The region that is now the Netherlands has been inhabited since prehistoric times. During the Roman era, it was part of the Roman Empire's frontier region.

Middle Ages: In the early Middle Ages, the Franks established control over the region. The Netherlands gradually emerged as a distinct entity, with the development of feudal states and the growth of trade and commerce.

Golden Age (17th Century): The 17th century is often referred to as the Dutch Golden Age. During this time, the Netherlands experienced a period of economic prosperity, cultural flourishing, and naval dominance. The Dutch East India Company and Dutch West India Company were established, and Amsterdam became a leading financial center.

Colonial Empire: The Dutch established colonies and trading posts around the world, including in the East Indies (present-day Indonesia), Suriname, and the Caribbean. The Dutch colonial empire was significant but eventually declined over time.

Napoleonic Era: In the late 18th and early 19th centuries, the Netherlands fell under French control during the Napoleonic Wars. It later became part of the French Empire.

Independence and Kingdom: The Netherlands gained independence from France in 1815 and became a kingdom under King William I. Belgium initially formed part of the Kingdom of the Netherlands but later separated in 1830.

Industrialization and Modernization: The 19th century saw rapid industrialization and modernization in the Netherlands. The country became known for its innovations in trade, shipping, and agriculture.

World Wars: The Netherlands remained neutral during World War I but was invaded by Nazi Germany in World War II. The country suffered under German occupation but played a role in the Allied liberation of Europe.

Post-War Reconstruction: After World War II, the Netherlands underwent a period of reconstruction and economic recovery. It became a founding member of international organizations such as the United Nations and the European Union.

Contemporary Era: In recent decades, the Netherlands has become known for its progressive social policies, strong economy, and commitment to environmental sustainability. It continues to be a leading global player in areas such as trade, technology, and diplomacy.

This summary provides a broad overview of Dutch history, highlighting key moments and themes that have shaped the nation's identity and development over time. '''

print(f"Number of words in the History texts:\n(Gr, Nl, Be) = ({len(gr_history_text1.split(' '))}, {len(nl_history_text1.split(' '))}, {len(be_history_text1.split(' '))})")

Number of words in the History texts:
(Gr, Nl, Be) = (463, 381, 444)
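
Note that `split(' ')` splits on single space characters only, so two words separated only by a newline are counted as one word. The counts above are therefore lower bounds on the real word counts. A minimal sketch of the difference, using a made-up sample string (not one of the history texts):

```python
import re

# Hypothetical sample: a newline (not a space) separates "two" and "three".
sample = "one two\nthree four"

by_space = sample.split(' ')                 # splits on single spaces only
by_whitespace = sample.split()               # splits on any run of whitespace
by_regex = re.findall(r'\b\w+\b', sample)    # same pattern as count_words in 4.5

print(len(by_space), len(by_whitespace), len(by_regex))
```

If exact word counts matter, `split()` without arguments or the regex-based `count_words` from 4.5 is probably the safer choice.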


In [19]:
# history1: page order Belgium, Greece, Netherlands
filename = "history1_be_gr_nl.pdf"
create_pdf([be_history_text1, gr_history_text1, nl_history_text1], filename)
# history1: page order Greece, Belgium, Netherlands
filename = "history1_gr_be_nl.pdf"
create_pdf([gr_history_text1, be_history_text1, nl_history_text1], filename)

# history2: page order Belgium, Greece, Netherlands
filename = "history2_be_gr_nl.pdf"
create_pdf([be_history_text2, gr_history_text2, nl_history_text2], filename)
# history2: page order Greece, Belgium, Netherlands
filename = "history2_gr_be_nl.pdf"
create_pdf([gr_history_text2, be_history_text2, nl_history_text2], filename)

# 4. Code:


4.1. Loading a pdf file in a list of docs  
4.2. __select_sentences_range__, selects from a given text, all the sentences given between __start_index__ and __end_index__.  
4.3. __create_pdf__, generates a pdf from a list of input strings (one page per string).  
4.4. __create_pd_from_map_reduce_output__, to create a pandas __dataframe__ from the __LangChain map-reduce__ outputs  
4.5. Code to load the Llama model (dutch or english) using __pipeline__ from __transformers__ (as present in the demo given to Casper)  
4.6. Code to load the Llama model (dutch or english) using __HuggingFaceHub__ (fails: the model is too large, 26GB > 10GB)  
4.7. Code to upload the Llama model to a Hugging Face Space (although suggested by the error output of 4.6, this does not seem to be possible)  
4.8. Code to load the Llama model using __AutoModelForCausalLM__, __AutoTokenizer__ and __pipeline__ from transformers, wrapping everything in a __HuggingFacePipeline__ that can be used by LangChain (this approach was successful)  
4.9. __Cpp__ way of loading the model  
4.10. Login to hugging face  
4.11. Map-Reduce stress test: Map: "Extract nth sentence of the document splits" and Reduce: "concatenate them into final output"  

### 4.1. Loading a pdf file in a list of docs

In [None]:
from langchain.document_loaders import PyPDFLoader
loader = PyPDFLoader("KI_nl.pdf")
docs = loader.load()
print(f"Number of docs: {len(docs)}\n===============\n")
print(f"page_content of first page:\n===========================\n{docs[0].page_content}\n\n")
print(f"metadata:\n=========\n{docs[0].metadata}")

### 4.2. __select_sentences_range__, selects from a given text all the sentences between start_index and end_index
- __select_sentences_range(text, 3, 6)__ --> returns `sentences[3:6]`, i.e. the 4th, 5th and 6th sentence of the input text (indices are zero-based and end_index is exclusive)
- __select_sentences_range(text, 3, ":")__ --> returns all the sentences from index 3 (the 4th sentence) to the end of the text

In [None]:
import re
def select_sentences_range(text, start_index, end_index):
    # Split the text into sentences using regex
    sentences = re.split(r'(?<!\w\.\w.)(?<![A-Z][a-z]\.)(?<=\.|\?)\s', text)
    
    # Convert ":" to the length of the sentences list
    if end_index == ":":
        end_index = len(sentences)
    
    # Select sentences within the specified range
    selected_sentences = sentences[start_index:end_index]
    
    # Join the selected sentences into a single string
    selected_text = ' '.join(selected_sentences)
    
    return selected_text
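
To make the indexing concrete, here is a minimal sketch that applies the same regex to a made-up sample text (the sample sentences are illustrative only). It also shows that the regex does not split after an abbreviation such as "Dr.":

```python
import re

# Same sentence-splitting regex as in select_sentences_range above.
SENTENCE_SPLIT = r'(?<!\w\.\w.)(?<![A-Z][a-z]\.)(?<=\.|\?)\s'

# Hypothetical sample text, not taken from the notebook data.
text = "Dr. Smith arrived. He sat down? Yes. Then he left."
sentences = re.split(SENTENCE_SPLIT, text)

# Slices are zero-based and exclude the end index.
middle = ' '.join(sentences[1:3])
# This is what end_index=":" expands to.
tail = ' '.join(sentences[2:len(sentences)])

print(sentences)
print(middle)
print(tail)
```

One consequence of the second lookbehind is that a sentence ending in a capitalized two-letter word (for example "It.") is also treated like an abbreviation and not split.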


### 4.3. __create_pdf__ generates a pdf from a list of input strings. Each string in the list is placed on its own page of the pdf document

In [2]:
#!pip install reportlab
from reportlab.lib.pagesizes import letter
from reportlab.pdfgen import canvas

def create_pdf(list_strings, filename):
    # Create a canvas with letter size (8.5x11 inches)
    c = canvas.Canvas(filename, pagesize=letter)

    for string in list_strings:
        # Set the font on every page: showPage() resets the canvas state
        c.setFont("Helvetica", 12)

        # Draw the current string on its own page
        draw_multiline_string(c, 100, 700, string)

        # Finish the current page and start a new one
        c.showPage()

    # Save the PDF
    c.save()

def draw_multiline_string(canvas, x, y, text, max_width=400, line_spacing=15):
    lines = []
    current_line = ''
    for word in text.split():
        if canvas.stringWidth(current_line + ' ' + word) <= max_width:
            current_line += ' ' + word
        else:
            lines.append(current_line.strip())
            current_line = word
    lines.append(current_line.strip())

    for line in lines:
        canvas.drawString(x, y, line)
        y -= line_spacing
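
The wrapping in `draw_multiline_string` is a greedy word wrap: words are appended to the current line until the measured width would exceed `max_width`. The sketch below is a simplified, ReportLab-free variant of that idea. The helper name `wrap_words` and the use of `len` as the width measure are assumptions made for illustration, whereas the notebook code measures with `canvas.stringWidth`:

```python
def wrap_words(text, max_width, measure=len):
    """Greedy word wrap; `measure` returns the width of a candidate line."""
    lines, current = [], ''
    for word in text.split():
        candidate = f"{current} {word}".strip()
        if measure(candidate) <= max_width:
            # The word still fits on the current line.
            current = candidate
        else:
            # Close the current line and start a new one with this word.
            if current:
                lines.append(current)
            current = word
    if current:
        lines.append(current)
    return lines

lines = wrap_words("the quick brown fox jumps over the lazy dog", max_width=15)
print(lines)
```

Because each line is drawn at `y = 700 - 15 * k`, only roughly the first 47 lines stay on the page. A much longer string would run off the bottom, since `draw_multiline_string` does not start a new page.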



In [None]:
snippet1 = "Medewerkers gaan we verder ontwikkelen in het toepassen van de menselijke maat in onze dienstverlening. Met ondersteuning van kunstmatige intelligentie gaan we onze brieven leesbaarder en begrijpelijker maken. De benadering is van buiten naar binnen: knelpunten die onze cliënten ervaren worden in kaart gebracht op basis van verschillende vormen van (klant)onderzoek en analyses."
snippet2 = "Het programma Innovatie ondersteunt initiatieven en oplosteams in het effectief organiseren van verbetertrajecten en het bedenken van vernieuwende oplossingen, met kennis over de laatste (technologische) ontwikkelingen en trends. Zo wordt onderzocht hoe kunstmatige intelligentie (zoals ChatGPT) ingezet kan worden, bijvoorbeeld bij het herschrijven van algemene teksten in tientallen brieven om de leesbaarheid te verbeteren. Aandacht voor innovatie en design thinking draagt ook bij aan de gewenste ontwikkeling van een lerende organisatie."
snippet_collection_nl = snippet1 + "\n" + snippet2

snippet_collection_en = '''We will further develop employees in applying the human scale in our services. With support of
artificial intelligence we will make our letters more readable and make it more understandable. The approach is from the outside in: bottlenecks
that our clients experience are mapped on the basis of various forms of (customer) research and analyses. The program
Innovation supports initiatives and solution teams effectively organizing improvement processes and devising innovative ones
solutions, with knowledge of the latest (technological) developments and trends. For example, it is investigated how artificial intelligence (such as
ChatGPT) can be used, for example when rewriting general texts in dozens of letters to improve readability.
Attention to innovation and design thinking also contributes to the desired results development of a learning organization.'''
create_pdf([snippet_collection_nl], "KI_nl.pdf")
create_pdf([snippet_collection_en], "KI_en.pdf")

### 4.4. __create_pd_from_map_reduce_output__, to create a pandas dataframe from the LangChain map-reduce outputs

In [3]:
#!pip install pandas
from pathlib import Path as p
import pandas as pd

def create_pd_from_map_reduce_output(map_reduce_outputs):
    data_folder = p.cwd() 
    p(data_folder).mkdir(parents=True, exist_ok=True)


    final_mp_data = []
    for doc, out in zip(map_reduce_outputs["input_documents"], map_reduce_outputs["intermediate_steps"]):
        output = {}
        output["file_name"] = p(doc.metadata["source"]).stem
        output["file_type"] = p(doc.metadata["source"]).suffix
        output["page_number"] = doc.metadata["page"]
        output["chunks"] = doc.page_content
        output["concise_summary"] = out
        final_mp_data.append(output)

    pdf_mp_summary = pd.DataFrame.from_dict(final_mp_data)
    # Sort the dataframe by file name and page number
    pdf_mp_summary = pdf_mp_summary.sort_values(by=["file_name", "page_number"])
    pdf_mp_summary.reset_index(inplace=True, drop=True)
    return pdf_mp_summary
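
The function expects a dict with the keys `input_documents` and `intermediate_steps`. As far as I know, the map-reduce summarize chain only returns `intermediate_steps` when it is created with `return_intermediate_steps=True`. The sketch below mimics that structure with stand-in objects and builds the same rows without pandas. The sample documents and summaries are made up, and `SimpleNamespace` merely stands in for LangChain's `Document`:

```python
from pathlib import Path
from types import SimpleNamespace

# Hypothetical stand-in for the dict returned by the map-reduce chain.
fake_outputs = {
    "input_documents": [
        SimpleNamespace(page_content="Text of page two",
                        metadata={"source": "docs/KI_nl.pdf", "page": 1}),
        SimpleNamespace(page_content="Text of page one",
                        metadata={"source": "docs/KI_nl.pdf", "page": 0}),
    ],
    "intermediate_steps": ["Summary of page two", "Summary of page one"],
}

# Pair every input chunk with its per-chunk summary, as the function above does.
rows = [
    {
        "file_name": Path(doc.metadata["source"]).stem,
        "file_type": Path(doc.metadata["source"]).suffix,
        "page_number": doc.metadata["page"],
        "chunks": doc.page_content,
        "concise_summary": summary,
    }
    for doc, summary in zip(fake_outputs["input_documents"],
                            fake_outputs["intermediate_steps"])
]

# Same ordering as sort_values(by=["file_name", "page_number"]).
rows.sort(key=lambda row: (row["file_name"], row["page_number"]))
print(rows[0])
```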

In [None]:
# Print the output in a human-readable format.
# Assumes pdf_mp_summary = create_pd_from_map_reduce_output(map_reduce_outputs) was run first.
for i in range(len(pdf_mp_summary)):
    print(f"\n ========================================\n{pdf_mp_summary.iloc[i]['chunks']} \n+++++++++++++++++++++++\n{pdf_mp_summary.iloc[i]['concise_summary']}" )

### 4.5. Code to load the Llama model (dutch or english) using the __pipeline__ from __transformers__ (as present in the demo given to Casper)
- In the cell below, set the parameter __model_chosen_id__ to __1__ (english model) or __2__ (dutch model)

In [None]:
#!pip install langdetect
import re
import time
from transformers import pipeline, Conversation, AutoTokenizer
from langdetect import detect

# choose your model here by setting model_chosen_id equal to 1 or 2
model_chosen_id = 1
model_name_options = {
    1: "meta-llama/Llama-2-13b-chat-hf",
    2: "BramVanroy/Llama-2-13b-chat-dutch"
}
model_chosen = model_name_options[model_chosen_id]

my_config = {'model_name': model_chosen, 'do_sample': True, 'temperature': 0.1, 'repetition_penalty': 1.1, 'max_new_tokens': 500, }
print(f"Selected model: {my_config['model_name']}")
print(f"Parameters are: {my_config}")

def count_words(text):
    # Use a simple regular expression to count words
    words = re.findall(r'\b\w+\b', text)
    return len(words)

def generate_with_llama_chat(my_config):    
    # get the parameters from the config dict
    do_sample = my_config.get('do_sample', True)
    temperature = my_config.get('temperature', 0.1)
    repetition_penalty = my_config.get('repetition_penalty', 1.1)
    max_new_tokens = my_config.get('max_new_tokens', 500)
    
    start_time = time.time()
    model = my_config['model_name']
    tokenizer = AutoTokenizer.from_pretrained(model)
    
    # Language code for Dutch
    #lang_code = "nl_XX"
    #forced_bos_token_id = tokenizer.lang_code_to_id["nl_XX"] # Error: lang_code_to_id not known
    
    #potential useful parameters to tweak: ,"do_sample": True, "max_lengt
    chatbot = pipeline("conversational",model=model, 
                       tokenizer=tokenizer,
                       do_sample=do_sample, 
                       temperature=temperature, 
                       repetition_penalty=repetition_penalty,
                       #max_length=2000,
                       max_new_tokens=max_new_tokens, 
                       model_kwargs={"device_map": "auto","load_in_8bit": True})  #, "src_lang": "en", "tgt_lang": "nl"})  does not work!
    end_time = time.time()
    elapsed_time = end_time - start_time
    print(f"Loading the model: {elapsed_time} seconds")
    return chatbot
    
def get_answer(chatbot, input_text):
    start_time = time.time()
    print(f"Processing the input\n {input_text}\n")
    print('Processing the answer....')
    conversation = Conversation(input_text)
    print(f"Conversation(input_text): {conversation}")
    output = (chatbot(conversation))[1]['content']
    output_language = detect(output)
    print(f"{output}\n")
    print(f"output language detected is {output_language}\n")
    end_time = time.time()
    elapsed_time = end_time - start_time
    print(f"Answered in {elapsed_time:.1f} seconds, Nr generated words: {count_words(output)}\n")

    # Translate to Dutch as a fallback, in case the model answered in English (prompt engineering does not always work)
    if output_language == 'en':
        print("----------------------------------------------------")
        print("Need extra time to make the translation to Dutch....")
        start_time = time.time()
        conversation = Conversation(f"Translate the following text to Dutch: {output}")
        output = (chatbot(conversation))[1]['content']
        end_time = time.time()
        elapsed_time = end_time - start_time
        print(f"translated output is: {output}\n")
        print(f"Translation time: {elapsed_time:.1f} seconds, Nr generated words: {count_words(output)}")

    # Return the (possibly translated) answer so that the caller can use it
    return output


chatbot = generate_with_llama_chat(my_config)

In [None]:
snippet1 = "Medewerkers gaan we verder ontwikkelen in het toepassen van de menselijke maat in onze dienstverlening. Met ondersteuning van kunstmatige intelligentie gaan we onze brieven leesbaarder en begrijpelijker maken. De benadering is van buiten naar binnen: knelpunten die onze cliënten ervaren worden in kaart gebracht op basis van verschillende vormen van (klant)onderzoek en analyses."
snippet2 = "Het programma Innovatie ondersteunt initiatieven en oplosteams in het effectief organiseren van verbetertrajecten en het bedenken van vernieuwende oplossingen, met kennis over de laatste (technologische) ontwikkelingen en trends. Zo wordt onderzocht hoe kunstmatige intelligentie (zoals ChatGPT) ingezet kan worden, bijvoorbeeld bij het herschrijven van algemene teksten in tientallen brieven om de leesbaarheid te verbeteren. Aandacht voor innovatie en design thinking draagt ook bij aan de gewenste ontwikkeling van een lerende organisatie."
snippet_collection = snippet1 + "\n" + snippet2

In [None]:
question_dict_snippets = {
    1: "Wat wordt er over 'kunstmatige intelligentie' besproken?",
    2: "Geef me een samenvatting van het document.",
    #3: "Geef me een samenvatting van 'snippet2'.",
    #4: "Geef me een samenvatting van 'snippet1' and 'snippet2'.",
    #5: "Wat is de tekst die overeenkomt met 'snippet1'?",
    #6: "Wat is de tekst die overeenkomt met 'snippet2'?"
}

In [None]:
input_text = f"{question_dict_snippets[1]} in de volgende text tussen quotations: '{snippet_collection}'"
print(input_text)
output = get_answer(chatbot, input_text)

### 4.6. Code to load the Llama model (dutch or english) using __HuggingFaceHub__: (Does not work)
- __Error:__
  The model BramVanroy/Llama-2-13b-chat-dutch is too large to be loaded automatically (26GB > 10GB). Please use Spaces (https://huggingface.co/spaces) or Inference Endpoints (https://huggingface.co/inference-endpoints).

  - There are two options to circumvent this issue:
    1) Using Hugging Face Spaces
    2) Using Inference Endpoints 

In [None]:
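import os
from getpass import getpass

# Never hard-code access tokens in the notebook: read the token from the
# environment, or prompt for it when it is not set yet.
if not os.getenv("HUGGINGFACEHUB_API_TOKEN"):
    os.environ["HUGGINGFACEHUB_API_TOKEN"] = getpass("Hugging Face token: ")

# Verify that the environment variable is set (without printing the secret itself)
print("HUGGINGFACEHUB_API_TOKEN set:", bool(os.getenv("HUGGINGFACEHUB_API_TOKEN")))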

In [None]:
from langchain_community.llms import HuggingFaceHub
from langchain.callbacks.manager import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate
from langchain_community.llms import LlamaCpp

question = "Who won the FIFA World Cup in the year 1994? "
template = """Question: {question}
Answer: Let's think step by step."""

prompt = PromptTemplate.from_template(template)

llm = HuggingFaceHub(
    repo_id="BramVanroy/Llama-2-13b-chat-dutch", model_kwargs={"temperature": 0.5, "max_length": 64}
)
llm_chain = LLMChain(prompt=prompt, llm=llm)

print(llm_chain.invoke(question))

### 4.7. Code to upload the model to my Hugging Face __Space__ (Does not work)
__Error:__ ```usage: huggingface-cli <command> [<args>]
huggingface-cli: error: argument {env,login,whoami,logout,repo,upload,download,lfs-enable-largefiles,lfs-multipart-upload,scan-cache,delete-cache}: invalid choice: 'space' (choose from 'env', 'login', 'whoami', 'logout', 'repo', 'upload', 'download', 'lfs-enable-largefiles', 'lfs-multipart-upload', 'scan-cache', 'delete-cache')```

In [None]:
#!pip install huggingface_hub
#huggingface-cli login
!huggingface-cli space push model my_BramVanroy_Dutch_model BramVanroy/Llama-2-13b-chat-dutch --organization polpo

In [None]:
!huggingface-cli upload model my_BramVanroy_Dutch_model BramVanroy/Llama-2-13b-chat-dutch --organization polpo

### 4.8. Loading the model using __AutoModelForCausalLM__, the AutoTokenizer and the pipeline from transformers, wrapping everything in the __LangChain__ pipeline (called __HuggingFacePipeline__)
4.8.1. llm that can be used with prompting (this is a prerequisite for running 4.8.2-4.8.5)  
4.8.2. General question-prompt about the EEG without input context  
4.8.3. Prompt asking to summarize the input text with __LangChain Stuff__ method  
4.8.4. Prompt asking to summarize the input text with __LangChain Map-Reduce__ method  
4.8.5. Prompt asking to summarize the input text with __LangChain Refine__ method

#### 4.8.1. llm that can be used with prompting
- !pip install transformers
- !pip install accelerate
- !pip install bitsandbytes

In [None]:
from transformers import pipeline, Conversation, AutoTokenizer, AutoModelForCausalLM
from langchain.llms import HuggingFacePipeline
from langchain_community.llms import HuggingFaceHub
from langchain.callbacks.manager import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
#1: "meta-llama/Llama-2-13b-chat-hf",
#2: "BramVanroy/Llama-2-13b-chat-dutch"
my_config = {'model_name': "BramVanroy/Llama-2-13b-chat-dutch", #"./Bram", #BramVanroy/Llama-2-13b-chat-dutch", 
             'do_sample': True, 'temperature': 0.1, 
             'repetition_penalty': 1.1, 'max_new_tokens': 500, }

print(f"Selected model: {my_config['model_name']}")
print(f"Parameters are: {my_config}")

def generate_with_llama_chat(my_config):
    print('tokenizer')
    tokenizer = AutoTokenizer.from_pretrained(my_config['model_name'])
    print('causal')
    # Load the model once, quantized to 8-bit and dispatched automatically over the available devices
    model = AutoModelForCausalLM.from_pretrained(my_config['model_name'],
                                                 device_map="auto",
                                                 load_in_8bit=True)
    print('Pipeline')
    # Pass the loaded model object (not its name), so the pipeline does not load the model a second time
    chatbot = pipeline("text-generation", model=model,
                       tokenizer=tokenizer,
                       do_sample=my_config['do_sample'], 
                       temperature=my_config['temperature'], 
                       repetition_penalty=my_config['repetition_penalty'],
                       #max_length=my_config['max_length'],
                       max_new_tokens=my_config['max_new_tokens'])
    return chatbot

llama_chat = generate_with_llama_chat(my_config)

# Set up callback manager to print output word by word
callback_manager = CallbackManager([StreamingStdOutCallbackHandler()])

llm = HuggingFacePipeline(pipeline=llama_chat, callback_manager=callback_manager)

#### 4.8.2. General question-prompt about the EEG without input context

In [None]:
from langchain.prompts import PromptTemplate

template = """Question: {question}

Answer: Let's think step by step."""
prompt = PromptTemplate.from_template(template)

chain = prompt | llm

question = "What is electroencephalography?"

print(chain.invoke({"question": question}))

#### 4.8.3. Prompt asking to summarize the input text with __LangChain -> Stuff__ method
- You can specify your own prompt template (`prompt_template_A`, `prompt_template_B`, ...), but be sure to also change the code below so that the intended template is passed to `PromptTemplate`:
- ```prompt = PromptTemplate(template=prompt_template_B, input_variables=["text"])```

In [None]:
from langchain.chains.summarize import load_summarize_chain
from langchain.prompts import PromptTemplate
from langchain.document_loaders import PyPDFLoader

prompt_template_A = """Write a concise summary of the following text delimited by triple backquotes.
              Return your response in bullet points which covers the key points of the text.
              ```{text}```
              BULLET POINT SUMMARY:
  """

prompt_template_B = """Write a concise summary of the following text delimited by triple backquotes.
              ```{text}```
              HELPFUL SUMMARY:
  """
prompt = PromptTemplate(template=prompt_template_B, input_variables=["text"])

#1) stuff_chain only works when entire document fits into the n_ctx (number of context tokens)
stuff_chain = load_summarize_chain(llm= llm, chain_type="stuff",prompt=prompt)

loader = PyPDFLoader("KI_en.pdf")
pages = loader.load()

try:
    output = stuff_chain.invoke(pages)
    print(output)
except Exception as e:
    print(
        "The code failed since it won't be able to run inference on such a huge context and throws this exception: ",
        e,
    )

#### 4.8.4. Prompt asking to summarize the input text with __LangChain -> Map-Reduce__ method
- Result: As the table below shows, the concise_summary for the first chunk (Greek history) is gibberish: it talks about what a summary is instead of summarizing the chunk. The second (Belgian history) and third (Dutch history) chunks do give decent results, but I have the feeling the last summary is truncated.

 
  | index | file_name         | file_type | page_number | chunks                                            | concise_summary                                  |
  |-------|-------------------|-----------|-------------|---------------------------------------------------|--------------------------------------------------|
  | 0     | history2_gr_be_nl | .pdf      | 0           | Greek history spans thousands of years and is ... | Summaries are written in complete sentences t... |
  | 1     | history2_gr_be_nl | .pdf      | 1           | Belgian history is rich and complex, character... | Summaries of Belgian History.\n\nHere are som... |
  | 2     | history2_gr_be_nl | .pdf      | 2           | The history of the Netherlands is rich and div... | Here are three summaries based on the provid...  |

- I also appended the Greek history a second time (so the full pdf has 4 pages: Gr, Nl, Be and Gr history snippets) in the hope of seeing a summary for the last Gr chunk, but I got the same gibberish output as for the pdf with 3 pages.
- Reading the be_gr_nl history pdf instead of the gr_be_nl one now gives the gibberish answer for the second chunk (Gr history).
- CONCLUSION1: The gibberish follows the Greek history text regardless of its position, so something in the Greek history input leads to the gibberish answer.
- CONCLUSION2: ```UserWarning: You seem to be using the pipelines sequentially on GPU. In order to maximize efficiency please use a dataset``` Some research is needed to benefit from running the map step in parallel. All subdocuments could in principle be processed in parallel, and the reduce step could then collect the map outputs into the final output.
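
The parallel map idea from CONCLUSION2 can be sketched in plain Python. This is only an illustration of the pattern, not how LangChain schedules its calls: `summarize_chunk` is a hypothetical stand-in for one LLM call, and the thread pool is an assumption. With a single local GPU pipeline, batching the inputs is probably the more relevant route.

```python
from concurrent.futures import ThreadPoolExecutor


def summarize_chunk(chunk: str) -> str:
    """Hypothetical stand-in for one LLM call: keep the first sentence."""
    return chunk.split(".")[0].strip() + "."


def map_reduce_summary(chunks: list[str], max_workers: int = 4) -> dict:
    # Map: every chunk is summarized independently, so the calls can overlap.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        partial_summaries = list(pool.map(summarize_chunk, chunks))
    # Reduce: combine the partial summaries (a real chain would call the LLM again).
    final_summary = " ".join(partial_summaries)
    return {"intermediate_steps": partial_summaries, "output_text": final_summary}


chunks = [
    "Greek history spans thousands of years. It shaped Western civilization.",
    "Belgian history is rich and complex. It sits at a crossroads of Europe.",
    "The history of the Netherlands is rich and diverse. It spans thousands of years.",
]
result = map_reduce_summary(chunks)
print(result["output_text"])
```

`pool.map` returns the results in input order, so the intermediate steps line up with the chunks just like the rows of the table above.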

In [None]:
from langchain.chains import MapReduceDocumentsChain, ReduceDocumentsChain
from langchain.text_splitter import CharacterTextSplitter, TokenTextSplitter
from langchain.chains.combine_documents.stuff import StuffDocumentsChain
from langchain.chains import LLMChain
from langchain.document_loaders import PyPDFLoader
from langchain.callbacks.manager import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
from langchain.prompts import PromptTemplate
#from datasets import Dataset

loader = PyPDFLoader("history1_gr_be_nl.pdf")
#loader = PyPDFLoader("history1_be_gr_nl.pdf")
docs = loader.load()

# To create a list of strings from each doc in docs you can apply the following piece of code
#split_text = [doc.page_content for doc in docs]

#---------------------------------------------------------------------------------------------
# Map
map_template = """The following is a set of documents
{docs}
Based on this list of docs, please make summaries. Do not summarize the document when there are no full sentences in the document. 
Helpful Answer:"""
map_prompt = PromptTemplate.from_template(map_template)
map_chain = LLMChain(llm=llm, prompt=map_prompt)

#---------------------------------------------------------------------------------------------
# Reduce
#"""The following is set of summaries:
#{docs}
#Take the above summaries and combine them into a final summary 
#Helpful Answer: 
#"""
reduce_template = """The following is a set of summaries:
{doc_maps}
Take these and distill it into a final, consolidated summary of the main themes. 
Helpful Answer:"""
reduce_prompt = PromptTemplate.from_template(reduce_template)
reduce_chain = LLMChain(llm=llm, prompt=reduce_prompt)

#---------------------------------------------------------------------------------------------
# Takes a list of documents, combines them into a single string, and passes this to an LLMChain
combine_documents_chain = StuffDocumentsChain(
    llm_chain=reduce_chain, document_variable_name="doc_maps"
)

# Combines and iteratively reduces the mapped documents
reduce_documents_chain = ReduceDocumentsChain(
    # This is final chain that is called.
    combine_documents_chain=combine_documents_chain,
    # If documents exceed context for `StuffDocumentsChain`
    collapse_documents_chain=combine_documents_chain,
    # The maximum number of tokens to group documents into.
    token_max=4096
)

#---------------------------------------------------------------------------------------------
# Combining summaries by mapping a chain over them, then combining results
map_reduce_chain = MapReduceDocumentsChain(
    # Map chain
    llm_chain=map_chain,
    # Reduce chain
    reduce_documents_chain=reduce_documents_chain,
    # The variable name in the llm_chain to put the documents in
    document_variable_name="docs",
    # Return the results of the map steps in the output
    return_intermediate_steps=True
)

text_splitter = TokenTextSplitter(
    chunk_size=800, chunk_overlap=0
)

# For .txt files you can use the following text splitter
#split_text = text_splitter.split_text(text_input)
#print(f"Number of splits= {len(split_text)}")

# For pdf documents you can use the following documents splitter
split_docs = text_splitter.split_documents(docs)
print(f"Number of splits= {len(split_docs)}")

In [None]:
import time
start_time = time.time()
map_reduce_output1 = map_reduce_chain.invoke(split_docs)
end_time = time.time()
elapsed_time = end_time - start_time
print(f"\n\nElapsed time:  {elapsed_time}")

In [None]:
# load the create_pd_from_map_reduce_output from section 4.4
pdf_mp_summary = create_pd_from_map_reduce_output(map_reduce_output1)

# code that prints the output in a human readable format 
for i in range(len(pdf_mp_summary)):
   print(f"\n ========================================\n{pdf_mp_summary.iloc[i]['chunks']} \n+++++++++++++++++++++++\n{pdf_mp_summary.iloc[i]['concise_summary']}" )

In [None]:
pdf_mp_summary

In [None]:
print(map_reduce_output1['output_text'])

#### 4.8.5. Prompt asking to summarize the input text with __LangChain -> Refine__ method
- Source: [LangChain summarization docs](https://python.langchain.com/docs/use_cases/summarization)

The refine documents chain constructs a response by looping over the input documents and iteratively updating its answer. For each document, it passes all non-document inputs, the current document, and the latest intermediate answer to an LLM chain to get a new answer.
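
As a rough illustration of that loop, here is a plain-Python sketch. It is not LangChain's implementation: `refine_summarize`, the prompt wording and `fake_llm` are hypothetical names made up for this example, and `fake_llm` only records the prompts so the data flow is visible.

```python
from typing import Callable


def refine_summarize(chunks: list[str], llm: Callable[[str], str]) -> dict:
    """Plain-Python sketch of the refine loop; `llm` is any prompt -> text callable."""
    if not chunks:
        return {"intermediate_steps": [], "output_text": ""}
    # First chunk: initial summary from the question prompt.
    answer = llm(f"Summarize the following text.\nTEXT: {chunks[0]}\nSUMMARY:")
    steps = [answer]
    # Remaining chunks: the previous answer is passed along with the new chunk.
    for chunk in chunks[1:]:
        answer = llm(
            f"Existing summary: {answer}\n"
            f"Refine it with this new context (only if needed):\n{chunk}\n"
            "REFINED SUMMARY:"
        )
        steps.append(answer)
    return {"intermediate_steps": steps, "output_text": answer}


seen_prompts = []


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a model: records the prompt and returns a numbered label."""
    seen_prompts.append(prompt)
    return f"summary #{len(seen_prompts)}"


out = refine_summarize(["chunk one", "chunk two", "chunk three"], fake_llm)
print(out["intermediate_steps"], out["output_text"])
```

The point to notice is that every call after the first receives the previous answer, which is why a refine prompt needs a slot for the existing summary in addition to the new text.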

In [None]:
# to be tested/adjusted to my needs:
from langchain.chains.summarize import load_summarize_chain
from langchain.document_loaders import PyPDFLoader
from langchain.prompts import PromptTemplate
from langchain.text_splitter import CharacterTextSplitter, TokenTextSplitter

loader = PyPDFLoader("history2_be_gr_nl.pdf")
docs = loader.load()

question_prompt_template = """
                  Please provide a summary of the following text.
                  TEXT: {text}
                  SUMMARY:
                  """

question_prompt = PromptTemplate(
    template=question_prompt_template, input_variables=["text"]
)

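# The refine step receives the previous summary as {existing_answer} and the next chunk as {text}
refine_prompt_template = """
              Here is the existing summary so far: {existing_answer}
              Refine it (only if needed) with the following text delimited by triple backquotes.
              Return your response in bullet points which covers the key points of the text.
              ```{text}```
              BULLET POINT SUMMARY:
              """

refine_prompt = PromptTemplate(
    template=refine_prompt_template, input_variables=["existing_answer", "text"]
)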

refine_chain = load_summarize_chain(
    llm= llm,
    chain_type="refine",
    question_prompt=question_prompt,
    refine_prompt=refine_prompt,
    return_intermediate_steps=True,
)

text_splitter = TokenTextSplitter(
    chunk_size=800, chunk_overlap=0
)

# For .txt files you can use the following text splitter
#split_text = text_splitter.split_text(text_input)
#print(f"Number of splits= {len(split_text)}")

# For pdf documents you can use the following documents splitter
split_docs = text_splitter.split_documents(docs)
print(f"Number of splits= {len(split_docs)}")

refine_outputs = refine_chain.invoke({"input_documents": split_docs})

In [None]:
# load the create_pd_from_map_reduce_output from section 4.4
pdf_mp_summary = create_pd_from_map_reduce_output(refine_outputs)

# code that prints the output in a human readable format 
for i in range(len(pdf_mp_summary)):
   print(f"\n ========================================\n{pdf_mp_summary.iloc[i]['chunks']} \n+++++++++++++++++++++++\n{pdf_mp_summary.iloc[i]['concise_summary']}" )

In [None]:
refine_outputs

In [None]:
# to be tested/adjusted to my needs:
from langchain.chains.summarize import load_summarize_chain

prompt_template = """Write a concise summary of the following:
{text}
CONCISE SUMMARY:"""
prompt = PromptTemplate.from_template(prompt_template)

refine_template = (
    "Your job is to produce a text\n"
    "We have extracted the first summary from a document: {existing_answer}\n"
    "We have the opportunity to refine the existing summary "
    "(only if needed) with some more context below.\n"
    "------------\n"
    "{text}\n"
    "------------\n"
    "Given the new context, refine the original summary. "
    "If the context isn't useful, return the original summary."
)
refine_prompt = PromptTemplate.from_template(refine_template)

chain = load_summarize_chain(
    llm=llm,
    chain_type="refine",
    question_prompt=prompt,
    refine_prompt=refine_prompt,
    return_intermediate_steps=True,
    input_key="input_documents",
    output_key="output_text",
)

text_splitter = TokenTextSplitter(
    chunk_size=200, chunk_overlap=0
)

split_docs = text_splitter.split_documents(docs)

result = chain.invoke({"input_documents": split_docs}, return_only_outputs=True)

### 4.9. LlamaCpp:  
Subsections are:  
4.9.1. Llama - Cpp: llm instance created with __gguf-type__ file  
4.9.2. Llama - Cpp: __Stuff Chain__ (entire input document fits into the llm, no need to split input text)  
4.9.3. Llama Dutch Cpp: llm instance from __locally saved__ model under /workspace/Bram/  

__Parameter explanation:__
  - __temperature=0.1__,   _(default = 0.8)_
    <br>Higher temperatures lead to more diverse and varied outputs, because tokens with lower probabilities are more likely to be sampled. Lower temperatures produce more conservative, near-deterministic outputs that favor the highest-probability tokens.
  - __max_tokens=100__,    _(default = 256)_
    <br>The maximum number of tokens to generate. Once this number is reached the model stops, which keeps the output within the desired length and helps prevent overly verbose or irrelevant responses.
  - __n_ctx=4096__,         _(default = 512)_
    <br>The size of the context window: the maximum number of tokens the model can take into account when predicting the next token in a sequence. It controls how much past information the model can consider.
  - __repeat_penalty=1.5__,  _(default = 1.1)_
    <br>Values between 1.1 and 1.5 apply a moderate penalty to repeated tokens. Values between 1.5 and 2.0 impose a stronger penalty, which reduces repetition further and can be useful for longer texts or when minimizing redundancy is a priority. Values above 2.0 apply a very strong penalty; setting the repeat_penalty this high risks reducing the coherence or naturalness of the generated text.
  - __last_n_tokens_size=20__,  _(default = 64)_
    <br>The number of most recent tokens that are checked for repetition. A larger value means that a longer stretch of text is considered when deciding whether a token should be penalized.
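
To make the effect of these parameters concrete, here is a small numeric sketch. It is an illustration only: the logits are made up, and the penalty rule (divide positive logits, multiply negative ones for tokens seen in the last `last_n` positions) is the commonly described llama.cpp-style rule, assumed here rather than taken from the library source.

```python
import math


def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    """Turn logits into probabilities; a lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def apply_repeat_penalty(logits, recent_tokens, penalty, last_n):
    """Penalize tokens that occur in the last `last_n` generated positions (assumed rule)."""
    window = set(recent_tokens[-last_n:]) if last_n > 0 else set()
    adjusted = list(logits)
    for token_id in window:
        # Dividing a positive logit (or multiplying a negative one) lowers its probability.
        if adjusted[token_id] > 0:
            adjusted[token_id] /= penalty
        else:
            adjusted[token_id] *= penalty
    return adjusted


logits = [2.0, 1.0, 0.5]  # made-up scores for a 3-token vocabulary
cold = softmax_with_temperature(logits, 0.1)
warm = softmax_with_temperature(logits, 0.8)
penalized = apply_repeat_penalty(logits, recent_tokens=[0, 0, 1], penalty=1.5, last_n=2)
print(cold, warm, penalized)
```

At temperature 0.1 almost all probability mass sits on the top token, while at 0.8 the other tokens keep a noticeable share. With `last_n=2` only tokens 0 and 1 fall inside the penalty window, so token 2 keeps its original score.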

#### 4.9.1. Llama - Cpp: llm instance created with gguf-type file 

In [None]:
from langchain.callbacks.manager import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate
from langchain_community.llms import LlamaCpp

# Callbacks support token-wise streaming
callback_manager = CallbackManager([StreamingStdOutCallbackHandler()])

n_gpu_layers = -1  # The number of layers to put on the GPU. The rest will be on the CPU. If you don't know how many layers there are, you can use -1 to move all to GPU.
n_batch = 512  # Should be between 1 and n_ctx, consider the amount of VRAM in your GPU.

# Make sure the model path is correct for your system!
llm = LlamaCpp(
    model_path="./llama-2-13b.Q6_K.gguf",
    n_gpu_layers=n_gpu_layers,
    n_batch=n_batch,
    callback_manager=callback_manager,
    verbose=False,  # The LangChain example sets verbose=True so output reaches the callback manager; switch to True if nothing streams
    temperature=0.1,
    max_tokens=50,
    n_ctx=4096,
    repeat_penalty=1.2,
    last_n_tokens_size=96
)

#### 4.9.2. Llama Cpp: Stuff Chain (entire input document fits into the llm, no need to split input text):
- Observation: when the __max_tokens__ parameter of the llm instance is larger than the number of tokens in the input text, the full input text is returned as the output (no summary is produced).
- When __max_tokens__ is smaller than the number of input tokens, the answer is the input text truncated at the number of tokens specified by __max_tokens__.
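
Both observations involve how the input length relates to `max_tokens`, and for the stuff chain the prompt plus the generated tokens also have to fit into `n_ctx`. A minimal sketch of that budget check follows. The helper names are hypothetical and the 4-characters-per-token ratio is only a rough assumption; a real count should come from the model's tokenizer.

```python
def rough_token_count(text: str, chars_per_token: float = 4.0) -> int:
    """Very rough estimate; the characters-per-token ratio is an assumption."""
    return max(1, round(len(text) / chars_per_token))


def fits_in_context(prompt_tokens: int, max_tokens: int, n_ctx: int) -> bool:
    """True when the prompt plus the requested completion fit in the context window."""
    return prompt_tokens + max_tokens <= n_ctx


document = "word " * 2000  # 10,000 characters of filler text
prompt_tokens = rough_token_count(document)
print(prompt_tokens, fits_in_context(prompt_tokens, max_tokens=50, n_ctx=4096))
```

If the check fails, the document has to be split and handled with Map-Reduce or Refine instead of the stuff chain.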

In [None]:
from langchain.chains.summarize import load_summarize_chain
from langchain.prompts import PromptTemplate
from langchain.document_loaders import PyPDFLoader

prompt_template = """Write a concise summary of the following text delimited by triple backquotes.
              Return your response in bullet points which covers the key points of the text.
              ```{text}```
              BULLET POINT SUMMARY:
  """
prompt = PromptTemplate(template=prompt_template, input_variables=["text"])

#1) stuff_chain only works when entire document fits into the n_ctx (number of context tokens)
stuff_chain = load_summarize_chain(llm= llm, chain_type="stuff",prompt=prompt)

loader = PyPDFLoader("KI_en.pdf")
pages = loader.load()

try:
    output = stuff_chain.invoke(pages)
    print(output)
except Exception as e:
    print(
        "The code failed since it won't be able to run inference on such a huge context and throws this exception: ",
        e,
    )

#### 4.9.3. Llama Dutch Cpp: llm instance from __locally saved model__ under /workspace/Bram/
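- __Error__: `gguf_init_from_file: invalid magic characters ''`  
  `llama_model_load: error loading model: llama_model_loader: failed to load model from /workspace/Bram/`
- `model_path` has to point to a single GGUF file. Here it points to a directory, which is the most likely reason why llama.cpp cannot find the GGUF magic characters.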
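
A quick pre-check before constructing `LlamaCpp` can catch this case. The sketch below relies on GGUF files starting with the four bytes `GGUF`; the helper name `looks_like_gguf` is hypothetical, and the demo uses a temporary fake file rather than a real model.

```python
import tempfile
from pathlib import Path

GGUF_MAGIC = b"GGUF"  # first four bytes of a GGUF file


def looks_like_gguf(model_path: str) -> bool:
    """Return True if `model_path` is a regular file that starts with the GGUF magic bytes."""
    path = Path(model_path)
    if not path.is_file():  # a directory such as /workspace/Bram/ fails here
        return False
    with path.open("rb") as handle:
        return handle.read(4) == GGUF_MAGIC


# Demo with a temporary fake file and its parent directory
with tempfile.TemporaryDirectory() as tmp:
    fake_model = Path(tmp) / "fake.gguf"
    fake_model.write_bytes(GGUF_MAGIC + b"\x00" * 8)
    file_ok = looks_like_gguf(str(fake_model))
    dir_ok = looks_like_gguf(tmp)

print(file_ok, dir_ok)
```

A directory with Hugging Face weights would first have to be converted to a GGUF file before `LlamaCpp` can load it.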

In [None]:
from langchain.callbacks.manager import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate
from langchain_community.llms import LlamaCpp

# Callbacks support token-wise streaming
callback_manager = CallbackManager([StreamingStdOutCallbackHandler()])

n_gpu_layers = -1  # The number of layers to put on the GPU. The rest will be on the CPU. If you don't know how many layers there are, you can use -1 to move all to GPU.
n_batch = 512  # Should be between 1 and n_ctx, consider the amount of VRAM in your GPU.

# Make sure the model path is correct for your system!
llm = LlamaCpp(
    model_path="/workspace/Bram/",
    n_gpu_layers=n_gpu_layers,
    n_batch=n_batch,
    callback_manager=callback_manager,
    verbose=False,  # The LangChain example sets verbose=True so output reaches the callback manager; switch to True if nothing streams
    temperature=0.1,
    max_tokens=200,
    n_ctx=4096,
    repeat_penalty=1.2,
    last_n_tokens_size=96
)


### 4.10. Login to hugging face

In [4]:
# Do not hard-code access tokens or passwords in the notebook; read the token from the environment instead.
# HF_TOKEN is assumed to be set before the kernel starts (for example via `export HF_TOKEN=...`).
#!pip install huggingface_hub
import os
from huggingface_hub import login
login(token=os.environ["HF_TOKEN"])  # personal token (Christos)

Token will not been saved to git credential helper. Pass `add_to_git_credential=True` if you want to set the git credential as well.
Token is valid (permission: write).
Your token has been saved to /root/.cache/huggingface/token
Login successful


In [None]:
# Log in with the shared Polpo token; POLPO_HF_TOKEN is an assumed environment variable name
#!pip install huggingface_hub
import os
from huggingface_hub import login
polpo_hf_token = os.environ["POLPO_HF_TOKEN"]
login(token=polpo_hf_token)

### 4.11. Extract nth sentence of the document splits and concatenate them into final output
- This was a test to see whether Map-Reduce works: the map step extracts the n-th sentence from each subdocument, and the reduce step concatenates those sentences into one final output. This idea was not fully tested.
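
To judge the LLM output of such a test, it helps to know the expected result. The sketch below computes it without any model. It assumes a naive sentence split on `.`, `!` or `?` followed by whitespace, and the function names are hypothetical.

```python
import re


def nth_sentence(text: str, n: int) -> str:
    """Return the n-th sentence (1-based), or "" if the text has fewer sentences."""
    # Naive split on ., ! or ? followed by whitespace; abbreviations will confuse it.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s.strip()]
    return sentences[n - 1] if 0 < n <= len(sentences) else ""


def expected_map_reduce_output(chunks: list[str], n: int = 2) -> str:
    mapped = [nth_sentence(chunk, n) for chunk in chunks]  # map step
    return " ".join(s for s in mapped if s)                # reduce step


chunks = ["First one. Second one. Third one.", "Alpha. Beta!", "Only one sentence here."]
print(expected_map_reduce_output(chunks))
```

Comparing this string with the chain's `output_text` shows whether each map call really saw exactly one chunk and whether the reduce step kept every mapped result.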

In [None]:
from langchain.chains import LLMChain, MapReduceDocumentsChain, ReduceDocumentsChain
from langchain.text_splitter import CharacterTextSplitter, TokenTextSplitter
from langchain.chains.combine_documents.stuff import StuffDocumentsChain
from langchain.prompts import PromptTemplate
from langchain.document_loaders import PyPDFLoader
loader = PyPDFLoader("LandelijkDekkendNetwerk.pdf")
docs = loader.load()

#---------------------------------------------------------------------------------------------
# Map
map_template = """Extract the second sentence from each document split:
{docs}
"""
map_prompt = PromptTemplate.from_template(map_template)
map_chain = LLMChain(llm=llm, prompt=map_prompt)

#---------------------------------------------------------------------------------------------
# Reduce
#"""The following is set of summaries:
#{docs}
#Take the above summaries and combine them into a final summary 
#Helpful Answer: 
#"""
reduce_template = """Concatenate sentences together into a single text:
{doc_maps}
"""
reduce_prompt = PromptTemplate.from_template(reduce_template)
reduce_chain = LLMChain(llm=llm, prompt=reduce_prompt)

#---------------------------------------------------------------------------------------------
# Takes a list of documents, combines them into a single string, and passes this to an LLMChain
combine_documents_chain = StuffDocumentsChain(
    llm_chain=reduce_chain, document_variable_name="doc_maps"
)

# Combines and iteratively reduces the mapped documents
reduce_documents_chain = ReduceDocumentsChain(
    # This is final chain that is called.
    combine_documents_chain=combine_documents_chain,
    # If documents exceed context for `StuffDocumentsChain`
    collapse_documents_chain=combine_documents_chain,
    # The maximum number of tokens to group documents into.
    token_max=512,
)

#---------------------------------------------------------------------------------------------
# Combining summaries by mapping a chain over them, then combining results
map_reduce_chain = MapReduceDocumentsChain(
    # Map chain
    llm_chain=map_chain,
    # Reduce chain
    reduce_documents_chain=reduce_documents_chain,
    # The variable name in the llm_chain to put the documents in
    document_variable_name="docs",
    # Return the results of the map steps in the output
    return_intermediate_steps=False,
)

text_splitter = TokenTextSplitter(
    chunk_size=512, chunk_overlap=0
)

# For .txt files
# text_splitter.split_text(text_input)

split_docs = text_splitter.split_documents(docs)

# 5. Instructions on how to setup a __Streamlit Based Demo in Runpod__  
5.1. Installs streamlit dependencies  
5.2. Install nodejs and npx using the command line interpreter  
5.3. Writes the streamlit code for the app  

## 5.1. Installs streamlit dependencies

In [None]:
!pip install --upgrade transformers
!pip install streamlit 

## 5.2. Install nodejs and npx using the command line interpreter  



__In order to tunnel the streamlit app you need to install nodejs and npx.__ The following steps show how to do that.
To open a command-line terminal, open a new tab (this opens a tab called Launcher), choose "Other" --> "Terminal" and enter the following commands:
- root@49364ccf316b:/workspace# ```apt update```
- root@49364ccf316b:/workspace# ```apt install nodejs npm ```  

To check if nodejs is installed you can do (you should get the version number displayed in the output):
- root@49364ccf316b:/workspace# ```node --version```
- root@49364ccf316b:/workspace# ```npm --version```

Installing localtunnel using npm:
- root@49364ccf316b:/workspace# ```npm install -g localtunnel```

You can try to run the localtunnel by the following command:
- root@49364ccf316b:/workspace# ```npx localtunnel --port 8501```  
- click the http link that is provided to you in the output
  
If the new webpage opens and asks for a password, you can retrieve the password from the command line using the following command:
- root@49364ccf316b:/workspace# ```curl https://loca.lt/mytunnelpassword && echo```
- or try this one: ```46.227.68.162```

To see if any streamlit app is still running:
- root@49364ccf316b:/workspace# ```ps aux | grep streamlit```
  | USER | PID  | %CPU | %MEM | VSZ (KB) | RSS (KB) | TTY   | STAT | START | TIME | COMMAND           | APPLICATION              | CMD | NAME                 |
  |------|------|------|------|----------|----------|-------|------|-------|------|-------------------|--------------------------|-----|----------------------|
  | root | 5422 | 0.3  | 0.0  | 10598864 | 494092   | ?     | Sl   | 10:23 | 0:06 | /usr/bin/python   | /usr/local/bin/streamlit | run | LlamaDutchDemoApp.py |
  | root | 5615 | 0.1  | 0.0  | 3121308  | 131696   | ?     | Sl   | 10:27 | 0:02 | /usr/bin/python   | /usr/local/bin/streamlit | run | LlamaDutchDemoApp.py |
  | root | 6692 | 0.0  | 0.0  | 3840     | 1928     | pts/1 | S+   | 10:55 | 0:00 | grep --color=auto | streamlit                |     |                      |
  
To kill any running streamlit app:
- root@49364ccf316b:/workspace# ```kill -9 <PID>``` (replace the <PID> with the corresponding number from the __ps aux__ table)  
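
The lookup of the PIDs can also be scripted. The sketch below only parses text in the `ps aux` layout shown above, skipping the `grep` line itself. The function name is hypothetical, and the sample lines are copied from the table.

```python
def streamlit_pids(ps_output: str) -> list[int]:
    """Extract PIDs of streamlit processes from `ps aux` text, skipping the grep line itself."""
    pids = []
    for line in ps_output.splitlines():
        fields = line.split()
        if len(fields) < 11 or not fields[1].isdigit():
            continue  # header or malformed line
        command = " ".join(fields[10:])  # everything after the TIME column
        if "streamlit" in command and not command.startswith("grep"):
            pids.append(int(fields[1]))
    return pids


sample = """\
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 5422 0.3 0.0 10598864 494092 ? Sl 10:23 0:06 /usr/bin/python /usr/local/bin/streamlit run LlamaDutchDemoApp.py
root 5615 0.1 0.0 3121308 131696 ? Sl 10:27 0:02 /usr/bin/python /usr/local/bin/streamlit run LlamaDutchDemoApp.py
root 6692 0.0 0.0 3840 1928 pts/1 S+ 10:55 0:00 grep --color=auto streamlit
"""
print(streamlit_pids(sample))
```

Each returned PID could then be passed to `os.kill(pid, signal.SIGKILL)`, which mirrors `kill -9`.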
  


## 5.3. Writes the streamlit code for the app & Run it
Executing this cell creates the __LlamaDutchDemoApp.py__ file on the fly in the working directory  
5.3.1. __LlamaDutchDemoApp.py__: Create the LlamaDutchDemoApp.py code (No LangChain)  
   > 5.3.1.1. Prompt stuff  
   > 5.3.1.2. Demo where the operator can change the prompt question (code generator App.py)  
   > 5.3.1.3. Demo with predefined prompt (more for customer exposure) (code generator App.py)  

5.3.2. __Run__: the app  
   > 5.3.2.1. Using __tunneling__  
   > 5.3.2.2. Using __ngrok__  

5.3.3. __LlamaDutchDemoApp.py__ with LangChain  

### 5.3.1. __LlamaDutchDemoApp.py__: Create the LlamaDutchDemoApp.py code 
> 5.3.1.1. Prompt stuff  
> 5.3.1.2. Demo where the operator can change the prompt question (code generator App.py)  
> 5.3.1.3. Demo with predefined prompt (more for customer exposure) (code generator App.py)  


#### 5.3.1.1. Prompt stuff: Example on how to use the prompt.


In [None]:
print("""
<s>[INST] <<SYS>>
You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe.  Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content.  Please ensure that your responses are socially unbiased and positive in nature.
If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct.  If you don't know the answer to a question, please don't share false information.
<</SYS>>

Write a summary  in a couple of sentences for the following article. 

Article:  "The history of the Netherlands is rich and diverse, spanning thousands of years.  Early Settlements: The region that is now the Netherlands has been inhabited since prehistoric times. During the Roman era, it was part of the Roman Empire's frontier region. Middle Ages: In the early Middle Ages, the Franks established control over the region. The Netherlands gradually emerged as a distinct entity, with the development of feudal states and the growth of trade and commerce. Golden Age (17th Century): The 17th century is often referred to as the Dutch Golden Age. During this time, the Netherlands experienced a period of economic prosperity, cultural flourishing, and naval dominance. The Dutch East India Company and Dutch West India Company were established, and Amsterdam became a leading financial center. Colonial Empire: The Dutch established colonies and trading posts around the world, including in the East Indies (present-day Indonesia), Suriname, and the Caribbean. The Dutch colonial empire was significant but eventually declined over time. Napoleonic Era: In the late 18th and early 19th centuries, the Netherlands fell under French control during the Napoleonic Wars. It later became part of the French Empire. Independence and Kingdom: The Netherlands gained independence from France in 1815 and became a kingdom under King William I. Belgium initially formed part of the Kingdom of the Netherlands but later separated in 1830. Industrialization and Modernization: The 19th century saw rapid industrialization and modernization in the Netherlands. The country became known for its innovations in trade, shipping, and agriculture. World Wars: The Netherlands remained neutral during World War I but was invaded by Nazi Germany in World War II. The country suffered under German occupation but played a role in the Allied liberation of Europe. Post-War Reconstruction: After World War II, the Netherlands underwent a period of reconstruction and economic recovery. 
It became a founding member of international organizations such as the United Nations and the European Union. Contemporary Era: In recent decades, the Netherlands has become known for its progressive social policies, strong economy, and commitment to environmental sustainability. It continues to be a leading global player in areas such as trade, technology, and diplomacy. This summary provides a broad overview of Dutch history, highlighting key moments and themes that have shaped the nation's identity and development over time." [/INST]""")

In [9]:
prompt = """<s>[INST] <<SYS>>
You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe.  
Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content.  
Please ensure that your responses are socially unbiased and positive in nature.
If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct.  
If you don't know the answer to a question, please don't share false information.
<</SYS>>
Write a summary  in a couple of sentences for the following article.

Article:  \n{BODY}"""
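
The Llama 2 chat layout used in these prompts can also be assembled by a small helper, which avoids typos in the markers. This is a sketch for a single system prompt and a single user turn; `build_llama2_prompt` is a hypothetical name and the short example texts are placeholders.

```python
def build_llama2_prompt(system_prompt: str, user_message: str) -> str:
    """Wrap a system prompt and one user message in the Llama 2 chat markers."""
    return (
        f"<s>[INST] <<SYS>>\n{system_prompt.strip()}\n<</SYS>>\n\n"
        f"{user_message.strip()} [/INST]"
    )


example = build_llama2_prompt(
    "You are a helpful, respectful and honest assistant.",
    "Write a summary in a couple of sentences for the following article.\n\n"
    "Article: The history of the Netherlands is rich and diverse.",
)
print(example)
```

Many tokenizers add the beginning-of-sequence token on their own, so it is worth checking whether the literal `<s>` should stay in the string for the loading path that is used.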

In [None]:
text_nl = "The history of the Netherlands can be traced back to ancient civilizations like the Romans who ruled the area before becoming an independent state after gaining freedom from Napoleon Bonaparte at the end of WWI. Since then they have gone on to form one of most powerful economies globally due their focus on technological advancements along with other factors which helped shape what we see today -a thriving society full potential opportunities waiting those willing take risks!"
text_gr = "Greek history spans thousands of years and is marked by significant contributions to Western civilization, including democracy, philosophy, art, and literature. Ancient Greece: Ancient Greek civilization emerged around the 8th century BC and was comprised of city-states such as Athens, Sparta, Corinth, and Thebes. This period saw the rise of democracy in Athens, where citizens participated in governance, and the development of philosophy by figures like Socrates, Plato, and Aristotle. Greek art and architecture, exemplified by the Parthenon in Athens, also flourished during this time. The city-states often engaged in conflicts with each other, most notably the Peloponnesian War between Athens and Sparta. Hellenistic Period: After the conquests of Alexander the Great in the 4th century BC, Greek culture spread throughout the Mediterranean and Middle East, creating a new era known as the Hellenistic period. Greek language, art, and philosophy influenced cultures across the region, including Egypt and Persia. Roman Greece: Greece became part of the Roman Empire after the defeat of the Greek city-states in the 2nd century BC. During this time, Greece continued to be an important center of culture and learning, with cities like Corinth and Athens remaining influential. Byzantine Empire: Following the division of the Roman Empire, Greece became part of the Byzantine Empire, centered in Constantinople (modern-day Istanbul). The Byzantine period saw the spread of Christianity and the construction of numerous churches and monasteries across Greece. Ottoman Rule: Greece fell under Ottoman rule in the 15th century after the fall of Constantinople. The Greeks struggled for independence from Ottoman rule for centuries, culminating in the Greek War of Independence in the early 19th century. 
Modern Greece: The Greek War of Independence began in 1821 and eventually led to the establishment of the modern Greek state in 1830, although some territories, including Crete and the Ionian Islands, were not incorporated until later. The monarchy was established, and Otto of Bavaria became the first king of Greece. 20th Century: Greece experienced political instability throughout much of the 20th century, including periods of monarchy, dictatorship, and democratic rule. Greece was occupied by Axis powers during World War II, and a brutal civil war followed the war's end. In 1974, Greece transitioned to democracy after the fall of the military junta. European Union and Economic Challenges: Greece joined the European Union in 1981 and adopted the euro as its currency in 2001. However, the country faced significant economic challenges in the late 2000s, leading to a sovereign debt crisis and bailout agreements with the EU and International Monetary Fund. Modern Greece: Today, Greece is a parliamentary republic and a member of the European Union. It remains a popular tourist destination known for its rich history, stunning landscapes, and cultural heritage. However, it continues to grapple with economic issues and challenges related to government."
text_be = "Belgian history is rich and complex, characterized by its strategic location in Western Europe and its cultural diversity. Early History: The region now known as Belgium has been inhabited since prehistoric times. It was later settled by Celtic and Germanic tribes before coming under Roman rule in the first century BC. The area flourished during Roman times as part of the province of Gallia Belgica. Medieval Period: After the fall of the Roman Empire, the region was invaded and settled by various Germanic tribes. In the early Middle Ages, it became part of the Frankish Empire. During this period, the area saw the rise of powerful feudal lords and the emergence of important trading cities like Ghent, Bruges, and Antwerp. Burgundian and Habsburg Rule: In the 15th century, the Burgundian dukes gained control of much of present-day Belgium. This period saw the flourishing of arts and culture, but also increased centralization of power. The region later came under Habsburg rule as part of the Spanish and Austrian Netherlands. Dutch Independence: In the 16th and 17th centuries, the Dutch Revolt against Spanish rule led to the independence of the northern provinces of the Netherlands. However, the southern provinces, including present-day Belgium, remained under Spanish control until they were conquered by France in the late 17th century. French Rule: Belgium became part of France under Napoleon Bonaparte's rule in the early 19th century. During this time, French revolutionary ideals influenced Belgian society and politics. Independence and Kingdom of Belgium: Following the defeat of Napoleon, the Congress of Vienna in 1815 united the southern provinces with the northern provinces to form the United Kingdom of the Netherlands. However, tensions between the Dutch-speaking north and the French-speaking south led to the Belgian Revolution in 1830. Belgium declared independence and established a constitutional monarchy, with Leopold I as its first king. 
Industrialization and Colonialism: Throughout the 19th century, Belgium experienced rapid industrialization, particularly in coal mining and steel production. It also established a colonial empire in Africa, notably in the Congo, which was famously exploited under King Leopold II's rule. 20th Century: Belgium was heavily impacted by both World Wars, particularly during World War I when it served as a battleground. The country was occupied by Germany during World War II. After the war, Belgium played a key role in the founding of the European Coal and Steel Community, a precursor to the European Union. Modern Belgium: Belgium has since become a prosperous and democratic country, known for its multiculturalism, chocolate, beer, and waffles. However, it continues to grapple with linguistic and political tensions between the Dutch-speaking Flanders region and the French-speaking Wallonia region, as well as issues related to regional autonomy and identity."
print(prompt.format(BODY=text_nl))
print('============================')
print(prompt.format(BODY=text_gr))
print('============================')
print(prompt.format(BODY=text_be))

#### 5.3.1.2. Demo where the operator can change the prompt question (code generator App.py)   
This demo is meant for an operator who wants to experiment with different prompts.  
It contains two input fields:  
- the text to summarize  
- the question to add to the prompt
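
How the two fields end up in a single Llama 2 style prompt can be sketched in isolation, without Streamlit or the model. This is only an illustration: the function name `build_llama2_prompt` and the shortened `SYSTEM_PROMPT` are assumptions and do not appear in the app below, which uses a module-level template with `str.format` instead.

```python
# Hypothetical helper, not part of App.py: combines the operator's question
# and the article text into one Llama 2 chat prompt.
SYSTEM_PROMPT = "You are a helpful, respectful and honest assistant."  # shortened stand-in


def build_llama2_prompt(question: str, body: str, system: str = SYSTEM_PROMPT) -> str:
    """Wrap a system prompt, an operator question and an article in Llama 2 chat tags."""
    return (
        f"<s>[INST] <<SYS>>\n{system}\n<</SYS>>\n\n"
        f"{question.strip()}\n\n"
        f'Article: "{body.strip()}" [/INST]'
    )


if __name__ == "__main__":
    print(build_llama2_prompt("Summarize in two sentences.",
                              "Belgium declared independence in 1830."))
```

Both inputs are stripped of surrounding whitespace, so a trailing newline typed into a text area does not end up inside the quoted article.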

In [22]:
%%writefile App.py
# Streamlit demo: summarize a text using a prompt question supplied by the operator.
import re
import time
import streamlit as st
from transformers import pipeline, Conversation, AutoTokenizer

#my_config = {'do_sample': True, 'temperature': 0.1, 
#             'repetition_penalty': 1.1, 'max_new_tokens': 500}
#"BramVanroy/Llama-2-13b-chat-dutch"
prompt = """<s>[INST] <<SYS>>
You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe.  
Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content.  
Please ensure that your responses are socially unbiased and positive in nature.
If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct.  
If you don't know the answer to a question, please don't share false information.
<</SYS>>
{PersonalPrompt} 

Article:  "{BODY}" [/INST]"""

model_name = "BramVanroy/Llama-2-13b-chat-dutch"
def generate_with_llama_dutch(model_name):
    print("Into generate_with_llama_dutch")
    time_load_model_start = time.time()
    
    # Load the tokenizer and the 8-bit quantized model into a conversational pipeline
    llm = pipeline("conversational", 
                   model=model_name, 
                   tokenizer=AutoTokenizer.from_pretrained(model_name),
                   do_sample=True, 
                   temperature=0.1, 
                   repetition_penalty=1.3,
                   max_new_tokens=512,
                   model_kwargs={"device_map": "auto","load_in_8bit": True}
                  )
    time_load_model_end = time.time()
    loading_time = time_load_model_end - time_load_model_start
    print(f"Elapsed time to load the model: {loading_time:.2f} sec")
    return llm, loading_time  

def get_answer(chatbot, input_text):
    try:
        start_time = time.time()
        print(f"Processing the input\n {input_text}\n")
        print('Processing the answer....')
        
        conversation = Conversation(input_text)
        print(f"Conversation is: {conversation}\n")
        output = (chatbot(conversation))[1]['content']
        print(f"Output is:\n==========\n {output}\n")
        # Calculate elapsed time and add it to the output
        end_time = time.time()
        elapsed_time = end_time - start_time
        output += f"\n  ---> Answered in {elapsed_time:.1f} seconds, Nr generated words: {count_words(output)}"
        
        return output
    except Exception as e:
        st.error(f"Error processing input: {e}")
        return None
        
    
def count_words(text):
    # Use a simple regular expression to count words
    words = re.findall(r'\b\w+\b', text)
    return len(words)

if "model" not in st.session_state.keys():
    st.write(f"Loading {model_name}..., Please wait." )
    # Initialize the model with the default option
    #st.session_state["model_name"] = "BramVanroy/Llama-2-13b-chat-dutch"
    #my_config.update({'model_name': st.session_state['model_name']})
    llm_chatbot, loading_time = generate_with_llama_dutch(model_name)
    st.session_state["model"] = llm_chatbot
    if loading_time < 60:
        st.write(f"Loading time: {loading_time:.1f} sec.")
    else:
        st.write(f"Loading time: {loading_time/60:.1f} min.")
    
    
# Text area to input text
text = st.text_area("Enter text to summarize here.")
pprompt = st.text_area("Write your question/prompt here.")

if text and pprompt:
    
    # Display the model and input text
    new_prompt = prompt.format(PersonalPrompt=pprompt,BODY=text)
    print(f"The following prompt is used: {new_prompt}")

    st.write("Generating answer...")
    
    out = get_answer(st.session_state["model"], new_prompt)
    st.write(out)


Overwriting App.py


#### 5.3.1.3. Demo with predefined prompt (more for customer exposure) (code generator App.py)  
- The input field in the app contains only the text to be summarized; the prompt itself is predefined inside the app.
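
The app below reads the model's reply as the second message of the conversation (`chatbot(conversation)[1]['content']`) and appends a timing and word-count footer. A slightly more defensive variant of that step is sketched here. It assumes the conversation messages are available as a list of dicts with `role` and `content` keys; the helper names `last_assistant_reply` and `with_footer` are hypothetical.

```python
import re


def last_assistant_reply(messages):
    """Return the content of the most recent assistant message, or None if there is none."""
    for message in reversed(messages):
        if message.get("role") == "assistant":
            return message.get("content")
    return None


def count_words(text):
    """Count words with the same regular expression the app uses."""
    return len(re.findall(r"\b\w+\b", text))


def with_footer(reply, elapsed_seconds):
    """Append the timing and word-count footer shown in the demo."""
    return (f"{reply}\n  ---> Answered in {elapsed_seconds:.1f} seconds, "
            f"Nr generated words: {count_words(reply)}")
```

Searching from the end for the assistant role keeps the lookup correct if more than one exchange is ever stored in the conversation.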

In [5]:
%%writefile App.py
# Streamlit demo: summarize a text using a predefined, built-in prompt.
import re
import time
import streamlit as st
from transformers import pipeline, Conversation, AutoTokenizer

#my_config = {'do_sample': True, 'temperature': 0.1, 
#             'repetition_penalty': 1.1, 'max_new_tokens': 500}
#"BramVanroy/Llama-2-13b-chat-dutch"
prompt = """<s>[INST] <<SYS>>
You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe.  
Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content.  
Please ensure that your responses are socially unbiased and positive in nature.
If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct.  
If you don't know the answer to a question, please don't share false information.
<</SYS>>
Write a summary in a couple of sentences for the following article.
Article:  "{BODY}" [/INST]"""

model_name = "BramVanroy/Llama-2-13b-chat-dutch"
def generate_with_llama_dutch(model_name):
    print("Into generate_with_llama_dutch")
    time_load_model_start = time.time()
    
    # Load the tokenizer and the 8-bit quantized model into a conversational pipeline
    llm = pipeline("conversational", 
                   model=model_name, 
                   tokenizer=AutoTokenizer.from_pretrained(model_name),
                   do_sample=True, 
                   temperature=0.1, 
                   repetition_penalty=1.3,
                   max_new_tokens=512,
                   model_kwargs={"device_map": "auto","load_in_8bit": True}
                  )
    time_load_model_end = time.time()
    loading_time = time_load_model_end - time_load_model_start
    print(f"Elapsed time to load the model: {loading_time:.2f} sec")
    return llm, loading_time  

def get_answer(chatbot, input_text):
    try:
        start_time = time.time()
        print(f"Processing the input\n {input_text}\n")
        print('Processing the answer....')
        
        conversation = Conversation(input_text)
        print(f"Conversation is: {conversation}\n")
        output = (chatbot(conversation))[1]['content']
        print(f"Output is:\n==========\n {output}\n")
        # Calculate elapsed time and add it to the output
        end_time = time.time()
        elapsed_time = end_time - start_time
        output += f"\n  ---> Answered in {elapsed_time:.1f} seconds, Nr generated words: {count_words(output)}"
        
        return output
    except Exception as e:
        st.error(f"Error processing input: {e}")
        return None
        
    
def count_words(text):
    # Use a simple regular expression to count words
    words = re.findall(r'\b\w+\b', text)
    return len(words)

if "model" not in st.session_state.keys():
    st.write(f"Loading {model_name}..., Please wait." )
    # Initialize the model with the default option
    #st.session_state["model_name"] = "BramVanroy/Llama-2-13b-chat-dutch"
    #my_config.update({'model_name': st.session_state['model_name']})
    llm_chatbot, loading_time = generate_with_llama_dutch(model_name)
    st.session_state["model"] = llm_chatbot
    if loading_time < 60:
        st.write(f"Loading time: {loading_time:.1f} sec.")
    else:
        st.write(f"Loading time: {loading_time/60:.1f} min.")
    
    
# Text area to input text
text = st.text_area("Enter text to summarize here.")

if text:
    
    # Display the model and input text
    new_prompt = prompt.format(BODY=text)
    print(f"The following prompt is used: {new_prompt}")

    st.write("Generating answer...")
    
    out = get_answer(st.session_state["model"], new_prompt)
    st.write(out)


Writing App.py


### 5.3.2. Launching the demo: Streamlit   
> 5.3.2.1. Using __tunneling__   (see prerequisites under 5.1 and 5.2)  
> 5.3.2.2. Using __ngrok__ (see prerequisites under 5.2) alternative to tunneling  

#### 5.3.2.1. Using __tunneling__ (see prerequisites under 5.1 and 5.2)

In [None]:
# 46.227.68.162 is the public IP of this machine; localtunnel asks for it as the password when the URL is opened
!streamlit run App.py & npx localtunnel --port 8501

#### 5.3.2.2. Using __ngrok__ (see prerequisites under 5.2) alternative to tunneling  
- You need to create an account at ngrok
- You need to set up your authtoken (shown in your ngrok dashboard)
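
Rather than pasting the authtoken into a notebook cell, it can be read from an environment variable. The sketch below is one way to do that. The variable name `NGROK_AUTHTOKEN` and the helper `load_ngrok_token` are assumptions made for this example; the commented usage relies on `pyngrok`'s `ngrok.set_auth_token` and `ngrok.connect`.

```python
import os


def load_ngrok_token(env=None, var_name="NGROK_AUTHTOKEN"):
    """Read the ngrok authtoken from the environment and fail early if it is missing."""
    env = os.environ if env is None else env
    token = env.get(var_name, "").strip()
    if not token:
        raise RuntimeError(f"Set {var_name} before starting the tunnel.")
    return token


# Usage in the notebook (requires pyngrok to be installed):
#   from pyngrok import ngrok
#   ngrok.set_auth_token(load_ngrok_token())
#   tunnel = ngrok.connect(8501)
```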

In [26]:
!pip install pyngrok

Collecting pyngrok
  Downloading pyngrok-7.1.2-py3-none-any.whl.metadata (7.6 kB)
Downloading pyngrok-7.1.2-py3-none-any.whl (22 kB)
Installing collected packages: pyngrok
Successfully installed pyngrok-7.1.2
[notice] A new release of pip is available: 23.3.1 -> 24.0
[notice] To update, run: python -m pip install --upgrade pip


In [27]:
from pyngrok import ngrok
!ngrok authtoken YOUR_NGROK_AUTHTOKEN
get_ipython().system_raw('nohup streamlit run App.py &')

Authtoken saved to configuration file: /root/.config/ngrok/ngrok.yml                                

Collecting usage statistics. To deactivate, set browser.gatherUsageStats to False.


  You can now view your Streamlit app in your browser.

  Network URL: http://172.23.0.2:8501
  External URL: http://87.197.140.238:8501



In [28]:
url = ngrok.connect(8501)
url #generates our URL

<NgrokTunnel: "https://cee4-87-197-140-238.ngrok-free.app" -> "http://localhost:8501">

In [30]:
# click on the first https link from <NgrokTunnel: "https://cee4-87-197-140-238.ngrok-free.app" -> "http://localhost:8501">
!streamlit run --server.port 80 App.py > /dev/null

^C


### 5.3.3. __LlamaDutchDemoApp.py__ with LangChain  
5.3.3.1. How to handle text to pdf  
5.3.3.2. How to do LangChain in the notebook  
5.3.3.3. How to do LangChain in the Streamlit app

#### 5.3.3.1. How to handle text to pdf:
Since LangChain only worked with PDF inputs, the code below shows how to write the input text from the Streamlit app to a PDF file. The app then picks up this PDF file to generate the answer. This cell only tests the conversion of the input text to a PDF file.
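
The PDF writer in the cell below wraps the text greedily: words are added to the current line until the next word would exceed the maximum width. The wrapping step on its own can be sketched without reportlab. The function `wrap_words` and its `measure` parameter are hypothetical; in the app the measuring function would be the canvas's `stringWidth`, while here `len` stands in so the example runs anywhere.

```python
def wrap_words(text, max_width, measure=len):
    """Greedily pack words into lines whose measured width stays within max_width."""
    lines = []
    current = ""
    for word in text.split():
        candidate = f"{current} {word}" if current else word
        # A word that is too wide on its own still gets a line to itself.
        if measure(candidate) <= max_width or not current:
            current = candidate
        else:
            lines.append(current)
            current = word
    if current:
        lines.append(current)
    return lines


if __name__ == "__main__":
    for line in wrap_words("Belgian history is rich and complex", max_width=16):
        print(line)
```

Unlike the cell below, this variant returns an empty list for empty input instead of a single empty line.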

In [None]:
%%writefile App.py
import os
import time
import streamlit as st

from langchain.document_loaders import PyPDFLoader
from langchain.text_splitter import CharacterTextSplitter, TokenTextSplitter
from langchain.prompts import PromptTemplate
from reportlab.lib.pagesizes import letter
from reportlab.pdfgen import canvas

# Text area to input text
text = st.text_area("Enter text to summarize here.")
local_pdf_name = 'local.pdf'

def create_pdf(list_strings, filename):
    # Create a canvas with letter size (8.5x11 inches)
    c = canvas.Canvas(filename, pagesize=letter)

    # Set font and font size for first page
    c.setFont("Helvetica", 12)
    
    for string in list_strings:
        # Draw each string at the top of its own page
        draw_multiline_string(c, 100, 700, string)

        # Add a new page
        c.showPage()

    # Save the PDF
    c.save()

def draw_multiline_string(canvas, x, y, text, max_width=400, line_spacing=15):
    lines = []
    current_line = ''
    for word in text.split():
        if canvas.stringWidth(current_line + ' ' + word) <= max_width:
            current_line += ' ' + word
        else:
            lines.append(current_line.strip())
            current_line = word
    lines.append(current_line.strip())

    for line in lines:
        canvas.drawString(x, y, line)
        y -= line_spacing

# processes each time a new text is entered in the input field
if text:
    print(text)
    if os.path.exists(local_pdf_name):
        os.remove(local_pdf_name)
        
    create_pdf([text], local_pdf_name)
    loader = PyPDFLoader(local_pdf_name)
    docs = loader.load()
    print(docs)
    # Display the model and input text
    out = "\nProcessing, Please wait for the answer...\n"
    st.write(out)

#### 5.3.3.2. How to do LangChain in the notebook:
The code below runs without the Streamlit app (it is a variation on the code from 4.8.4).  
It was written to check that the approach works; the next step was to wrap the Streamlit app around it.
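
The cell below builds a map-reduce summarization chain: the document is split into chunks, each chunk is summarized (map), and the partial summaries are combined into one final summary (reduce), with an extra collapse pass when the partial summaries together exceed the token budget. The plain-Python sketch below only illustrates that control flow and is not LangChain's implementation. `summarize` is a stand-in for the LLM call, words stand in for tokens, and all function names are hypothetical.

```python
def chunk_words(text, chunk_size):
    """Split text into pieces of at most chunk_size whitespace-separated words."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]


def group_by_budget(texts, budget):
    """Greedily pack texts into groups whose total word count stays within budget."""
    groups, current, used = [], [], 0
    for t in texts:
        size = len(t.split())
        if current and used + size > budget:
            groups.append(current)
            current, used = [], 0
        current.append(t)
        used += size
    if current:
        groups.append(current)
    return groups


def map_reduce_summarize(text, summarize, chunk_size=800, token_max=4096):
    """Summarize each chunk, collapse while over budget, then reduce to one summary."""
    partial = [summarize(chunk) for chunk in chunk_words(text, chunk_size)]  # map step
    summaries = partial
    while len(summaries) > 1:
        groups = group_by_budget(summaries, token_max)
        # Stop when everything fits in one group or grouping makes no progress.
        if len(groups) == 1 or len(groups) == len(summaries):
            break
        summaries = [summarize("\n".join(group)) for group in groups]  # collapse step
    final = summarize("\n".join(summaries))  # reduce step
    return {"intermediate_steps": partial, "output_text": final}
```

The default values mirror the `chunk_size=800` and `token_max=4096` settings used in the cell below.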

In [None]:
import time
from transformers import pipeline, Conversation, AutoTokenizer, AutoModelForCausalLM
from langchain.llms import HuggingFacePipeline
from langchain_community.llms import HuggingFaceHub
from langchain.chains import MapReduceDocumentsChain, ReduceDocumentsChain
from langchain.chains import LLMChain
from langchain.chains.combine_documents.stuff import StuffDocumentsChain
from langchain.document_loaders import PyPDFLoader
from langchain.text_splitter import CharacterTextSplitter, TokenTextSplitter
from langchain.prompts import PromptTemplate
from reportlab.lib.pagesizes import letter
from reportlab.pdfgen import canvas

#1: "meta-llama/Llama-2-13b-chat-hf",
#2: "BramVanroy/Llama-2-13b-chat-dutch"

my_config = {'model_name': "BramVanroy/Llama-2-13b-chat-dutch", #"./Bram", #BramVanroy/Llama-2-13b-chat-dutch", 
             'do_sample': True, 'temperature': 0.1, 
             'repetition_penalty': 1.1, 'max_new_tokens': 500, }

print(f"Selected model: {my_config['model_name']}")
print(f"Parameters are: {my_config}")
local_pdf_name = 'local.pdf'

def generate_with_llama_chat(my_config):
    start_time = time.time()
    print('tokenizer')
    tokenizer = AutoTokenizer.from_pretrained(my_config['model_name'])
    print('causal')
    # NOTE: this model object is never used; the pipeline below loads the model again from its name
    model = AutoModelForCausalLM.from_pretrained(my_config['model_name'])
    print('Pipeline')
    chatbot = pipeline("text-generation",model=my_config['model_name'], 
                       tokenizer=tokenizer,
                       do_sample=my_config['do_sample'], 
                       temperature=my_config['temperature'], 
                       repetition_penalty=my_config['repetition_penalty'],
                       #max_length=my_config['max_length'],
                       max_new_tokens=my_config['max_new_tokens'], 
                       model_kwargs={"device_map": "auto","load_in_8bit": True})
    llm = HuggingFacePipeline(pipeline=chatbot)
    loading_time = time.time()-start_time
    return llm, loading_time

def create_pdf(list_strings, filename):
    # Create a canvas with letter size (8.5x11 inches)
    c = canvas.Canvas(filename, pagesize=letter)

    # Set font and font size for first page
    c.setFont("Helvetica", 12)
    
    for string in list_strings:
        # Draw each string at the top of its own page
        draw_multiline_string(c, 100, 700, string)

        # Add a new page
        c.showPage()

    # Save the PDF
    c.save()

def draw_multiline_string(canvas, x, y, text, max_width=400, line_spacing=15):
    lines = []
    current_line = ''
    for word in text.split():
        if canvas.stringWidth(current_line + ' ' + word) <= max_width:
            current_line += ' ' + word
        else:
            lines.append(current_line.strip())
            current_line = word
    lines.append(current_line.strip())

    for line in lines:
        canvas.drawString(x, y, line)
        y -= line_spacing    
        
#---------------------------------------------------------------------------------------------
# Builds the map-reduce summarization chain and the token-based text splitter
def langchain_block(llm, map_template, reduce_template):
    map_prompt = PromptTemplate.from_template(map_template)
    map_chain = LLMChain(llm=llm, prompt=map_prompt)
    
    reduce_prompt = PromptTemplate.from_template(reduce_template)
    reduce_chain = LLMChain(llm=llm, prompt=reduce_prompt)
    
    combine_documents_chain = StuffDocumentsChain(
        llm_chain=reduce_chain, document_variable_name="doc_maps"
    )

    # Combines and iteratively reduces the mapped documents
    reduce_documents_chain = ReduceDocumentsChain(
        # This is final chain that is called.
        combine_documents_chain=combine_documents_chain,
        # If documents exceed context for `StuffDocumentsChain`
        collapse_documents_chain=combine_documents_chain,
        # The maximum number of tokens to group documents into.
        token_max=4096
    )

    map_reduce_chain = MapReduceDocumentsChain(
        # Map chain
        llm_chain=map_chain,
        # Reduce chain
        reduce_documents_chain=reduce_documents_chain,
        # The variable name in the llm_chain to put the documents in
        document_variable_name="docs",
        # Return the results of the map steps in the output
        return_intermediate_steps=True
    )

    text_splitter = TokenTextSplitter(chunk_size=800, chunk_overlap=0)    
    return map_reduce_chain, text_splitter
    

#---------------------------------------------------------------------------------------------
# Map    
map_template = """The following is a set of documents
    {docs}
    Based on this list of docs, please make summaries. Do not summarize the document when there are no full sentences in the document. 
    Helpful Answer:"""
reduce_template = """The following is set of summaries:
    {doc_maps}
    Take these and distill it into a final, consolidated summary of the main themes. 
    Helpful Answer:"""

loader = PyPDFLoader('history1_be_gr_nl.pdf')
docs = loader.load()

llm_chatbot, loading_time = generate_with_llama_chat(my_config)
map_reduce_chain, text_splitter = langchain_block(llm_chatbot, map_template, reduce_template)

# For pdf documents you can use the following documents splitter
split_docs = text_splitter.split_documents(docs)
print(f"Number of splits= {len(split_docs)}")

map_reduce_output1 = map_reduce_chain.invoke(split_docs)


In [None]:
# requires create_pd_from_map_reduce_output from section 4.4 to be defined first
pdf_mp_summary = create_pd_from_map_reduce_output(map_reduce_output1)

# code that prints the output in a human readable format 
for i in range(len(pdf_mp_summary)):
   print(f"\n ========================================\n{pdf_mp_summary.iloc[i]['chunks']} \n+++++++++++++++++++++++\n{pdf_mp_summary.iloc[i]['concise_summary']}" )

In [None]:
map_reduce_output1

#### 5.3.3.3. How to do LangChain in the Streamlit app:
The app takes the text from the Streamlit input field, creates a one-page PDF file from it and then uses that PDF to generate the answer.  
There is probably a faster way that skips the PDF, but I did not investigate this further since I had a working example using a PDF file.
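
One possible way to skip the PDF, untested against this app, is to wrap the input text directly in a LangChain `Document` and pass that to the text splitter. The sketch below assumes `langchain_core.documents.Document` is available and falls back to a small stand-in class otherwise, so the example runs without LangChain. The helper `text_to_docs` and the `source` value are hypothetical; the `source` and `page` metadata keys are the ones `create_pd_from_map_reduce_output` reads.

```python
try:
    from langchain_core.documents import Document
except ImportError:  # stand-in so the sketch runs without LangChain installed
    from dataclasses import dataclass, field

    @dataclass
    class Document:
        page_content: str
        metadata: dict = field(default_factory=dict)


def text_to_docs(text, source="streamlit_input.txt"):
    """Wrap raw text in a single Document with the metadata the summary table expects."""
    return [Document(page_content=text, metadata={"source": source, "page": 0})]


# In the app this would replace create_pdf(...) and PyPDFLoader(...).load():
#   docs = text_to_docs(text)
#   split_docs = text_splitter.split_documents(docs)
```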

- If you want to test this with 'history1_be_gr_nl.pdf', you need to generate it first using the cells from section 3.2.3.2. and uncomment the line  
``` #loader = PyPDFLoader('history1_be_gr_nl.pdf')```   
in the code below (around line 168)

```python
    # loading this generated pdf (is called local_pdf_name = 'local.pdf')
    loader = PyPDFLoader(local_pdf_name)
    
    # if you want to load an example with 3 pages you can use the following line
    #loader = PyPDFLoader('history1_be_gr_nl.pdf')
```  
- There is (or was) a deprecation warning that still needs to be taken care of:
```
Importing from langchain will no longer be supported as of langchain==0.2.0. Please import from langchain-community instead:

`from langchain_community.llms import HuggingFacePipeline`.

To install langchain-community run `pip install -U langchain-community`.
```

In [31]:
%%writefile App.py
import time
import streamlit as st
from pathlib import Path as p
import pandas as pd

from transformers import pipeline, Conversation, AutoTokenizer, AutoModelForCausalLM
from langchain.llms import HuggingFacePipeline
from langchain_community.llms import HuggingFaceHub
from langchain.chains import MapReduceDocumentsChain, ReduceDocumentsChain
from langchain.chains import LLMChain
from langchain.chains.combine_documents.stuff import StuffDocumentsChain
from langchain.document_loaders import PyPDFLoader
from langchain.text_splitter import CharacterTextSplitter, TokenTextSplitter
from langchain.prompts import PromptTemplate
from reportlab.lib.pagesizes import letter
from reportlab.pdfgen import canvas

#1: "meta-llama/Llama-2-13b-chat-hf",
#2: "BramVanroy/Llama-2-13b-chat-dutch"

my_config = {'model_name': "BramVanroy/Llama-2-13b-chat-dutch", #"./Bram", #BramVanroy/Llama-2-13b-chat-dutch", 
             'do_sample': True, 'temperature': 0.1, 
             'repetition_penalty': 1.1, 'max_new_tokens': 500, }

print(f"Selected model: {my_config['model_name']}")
print(f"Parameters are: {my_config}")
local_pdf_name = 'local.pdf'

def generate_with_llama_chat(my_config):
    start_time = time.time()
    print('tokenizer')
    tokenizer = AutoTokenizer.from_pretrained(my_config['model_name'])
    print('causal')
    # NOTE: this model object is never used; the pipeline below loads the model again from its name
    model = AutoModelForCausalLM.from_pretrained(my_config['model_name'])
    print('Pipeline')
    chatbot = pipeline("text-generation",model=my_config['model_name'], 
                       tokenizer=tokenizer,
                       do_sample=my_config['do_sample'], 
                       temperature=my_config['temperature'], 
                       repetition_penalty=my_config['repetition_penalty'],
                       #max_length=my_config['max_length'],
                       max_new_tokens=my_config['max_new_tokens'], 
                       model_kwargs={"device_map": "auto","load_in_8bit": True})
    llm = HuggingFacePipeline(pipeline=chatbot)
    loading_time = time.time()-start_time
    return llm, loading_time

def create_pdf(list_strings, filename):
    # Create a canvas with letter size (8.5x11 inches)
    c = canvas.Canvas(filename, pagesize=letter)

    # Set font and font size for first page
    c.setFont("Helvetica", 12)
    
    for string in list_strings:
        # Draw each string at the top of its own page
        draw_multiline_string(c, 100, 700, string)

        # Add a new page
        c.showPage()

    # Save the PDF
    c.save()

def draw_multiline_string(canvas, x, y, text, max_width=400, line_spacing=15):
    lines = []
    current_line = ''
    for word in text.split():
        if canvas.stringWidth(current_line + ' ' + word) <= max_width:
            current_line += ' ' + word
        else:
            lines.append(current_line.strip())
            current_line = word
    lines.append(current_line.strip())

    for line in lines:
        canvas.drawString(x, y, line)
        y -= line_spacing    
        

if "model" not in st.session_state.keys():
    st.write(f"Loading {my_config['model_name']}..., Please wait." )
    # Initialize the model with the default option
    #st.session_state["model_name"] = "BramVanroy/Llama-2-13b-chat-dutch"
    #my_config.update({'model_name': st.session_state['model_name']})
    llm_chatbot, loading_time = generate_with_llama_chat(my_config)
    st.session_state["model"] = llm_chatbot
    st.write(f"Loading time: {loading_time/60:.1f} min.")

#---------------------------------------------------------------------------------------------
# Map    
map_template = """The following is a set of documents
    {docs}
    Based on this list of docs, please make summaries. Do not summarize the document when there are no full sentences in the document. 
    Helpful Answer:"""
reduce_template = """The following is set of summaries:
    {doc_maps}
    Take these and distill it into a final, consolidated summary of the main themes. 
    Helpful Answer:"""

#---------------------------------------------------------------------------------------------
# Builds the map-reduce summarization chain and the token-based text splitter
def langchain_block(llm, map_template, reduce_template):
    map_prompt = PromptTemplate.from_template(map_template)
    map_chain = LLMChain(llm=llm, prompt=map_prompt)
    
    reduce_prompt = PromptTemplate.from_template(reduce_template)
    reduce_chain = LLMChain(llm=llm, prompt=reduce_prompt)
    
    combine_documents_chain = StuffDocumentsChain(
        llm_chain=reduce_chain, document_variable_name="doc_maps"
    )

    # Combines and iteratively reduces the mapped documents
    reduce_documents_chain = ReduceDocumentsChain(
        # This is final chain that is called.
        combine_documents_chain=combine_documents_chain,
        # If documents exceed context for `StuffDocumentsChain`
        collapse_documents_chain=combine_documents_chain,
        # The maximum number of tokens to group documents into.
        token_max=4096
    )

    map_reduce_chain = MapReduceDocumentsChain(
        # Map chain
        llm_chain=map_chain,
        # Reduce chain
        reduce_documents_chain=reduce_documents_chain,
        # The variable name in the llm_chain to put the documents in
        document_variable_name="docs",
        # Return the results of the map steps in the output
        return_intermediate_steps=True
    )

    text_splitter = TokenTextSplitter(chunk_size=800, chunk_overlap=0)    
    return map_reduce_chain, text_splitter

def create_pd_from_map_reduce_output(map_reduce_outputs):
    data_folder = p.cwd() 
    p(data_folder).mkdir(parents=True, exist_ok=True)


    final_mp_data = []
    for doc, out in zip(map_reduce_outputs["input_documents"], map_reduce_outputs["intermediate_steps"]):
        output = {}
        output["file_name"] = p(doc.metadata["source"]).stem
        output["file_type"] = p(doc.metadata["source"]).suffix
        output["page_number"] = doc.metadata["page"]
        output["chunks"] = doc.page_content
        output["concise_summary"] = out
        final_mp_data.append(output)

    pdf_mp_summary = pd.DataFrame.from_dict(final_mp_data)
    pdf_mp_summary = pdf_mp_summary.sort_values(by=["file_name", "page_number"])  # sorting the dataframe by filename and page_number
    pdf_mp_summary.reset_index(inplace=True, drop=True)
    return pdf_mp_summary

# Text area to input text
text = st.text_area("Enter text to summarize here.")

# processes each time a new text is entered in the input field
if text:
    # creates a pdf file from the input text from the streamlit app 
    create_pdf([text], local_pdf_name)

    # loading this generated pdf (is called local_pdf_name = 'local.pdf')
    loader = PyPDFLoader(local_pdf_name)
    
    # if you want to load an example with 3 pages you can use the following line
    #loader = PyPDFLoader('history1_be_gr_nl.pdf')

    docs = loader.load()
    map_reduce_chain, text_splitter = langchain_block(st.session_state["model"], map_template, reduce_template)
    # For pdf documents you can use the following documents splitter
    split_docs = text_splitter.split_documents(docs)
    print(f"Number of splits= {len(split_docs)}")
    
    start_time = time.time()
    
    out = "\nProcessing, Please wait for the answer...\n"
    st.write(out)
    
    map_reduce_output1 = map_reduce_chain.invoke(split_docs)
    
    elapsed_time = time.time() - start_time
    print(map_reduce_output1)
    print(f"\n\nElapsed time:  {elapsed_time}")
    pdf_mp_summary = create_pd_from_map_reduce_output(map_reduce_output1)

    # code that prints the output in a human readable format 
    for i in range(len(pdf_mp_summary)):
       print(f"\n ========================================")
       print(f"{pdf_mp_summary.iloc[i]['chunks']}")
       print(f"+++++++++++++++++++++++")
       print(f"{pdf_mp_summary.iloc[i]['concise_summary']}" )
       st.write(f"Part{i}: {pdf_mp_summary.iloc[i]['chunks']}")
       st.write(f"{pdf_mp_summary.iloc[i]['concise_summary']}")
        
    st.write(f"\n\nElapsed time:  {elapsed_time}")

Overwriting App.py


In [None]:
# 104.255.9.187
# 87.197.140.238
!streamlit run App.py & npx localtunnel --port 8501


Collecting usage statistics. To deactivate, set browser.gatherUsageStats to False.

  You can now view your Streamlit app in your browser.

  Network URL: http://172.23.0.2:8501
  External URL: http://87.197.140.238:8501

your url is: https://ready-eagles-love.loca.lt

`from langchain_community.llms import HuggingFacePipeline`.

To install langchain-community run `pip install -U langchain-community`.

`from langchain_community.document_loaders import PyPDFLoader`.

To install langchain-community run `pip install -U langchain-community`.
Selected model: BramVanroy/Llama-2-13b-chat-dutch
Parameters are: {'model_name': 'BramVanroy/Llama-2-13b-chat-dutch', 'do_sample': True, 'temperature': 0.1, 'repetition_penalty': 1.1, 'max_new_tokens': 500}
tokenizer
causal
Loading checkpoint shards: 100%|██████████████████| 3/3 [00:06<00:00,  2.01s/it]
Pipeline
The `load_in_4bit` and `load_in_8bit` arguments are deprecated and will be rem