# Extract and summarize information with LLMs 

## Purpose of the notebook

This notebook shows how to run an LLM to extract useful information from human-written public contributions.

## Requirements

Make sure LangChain and Hugging Face Transformers are installed. It is recommended to install the Transformers package from its source repository.  

`!pip install langchain`  
`!pip install git+https://github.com/huggingface/transformers`

## Load the LLM model

In [None]:
import langchain
langchain.__version__

In [None]:
import json
from pprint import pprint

from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate
from langchain_community.llms import VLLM, HuggingFacePipeline, VLLMOpenAI
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer, pipeline

import gc
import torch

# Release unreferenced Python objects and cached GPU memory before loading a model
gc.collect()
torch.cuda.empty_cache()

## Run on local webserver

Serve the model from a local server. For instance, with the vLLM OpenAI-compatible server:
```
 python -m vllm.entrypoints.openai.api_server\
    --model TheBloke/NeuralBeagle14-7B-AWQ\
    --chat-template ./config/template_chatml.jinja\
    --quantization awq\
    --trust-remote-code\
    --max-model-len 2048
```

Or, more simply, run the bash script `run_server.sh`:
1. open a terminal and go to the project directory
2. execute: `$ sh run_server.sh`

*Warning*: the prompt context used in this tutorial takes at least 450 tokens, and the server above is started with `--max-model-len 2048`. Process only texts shorter than 1500 tokens.
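
The limit above can be checked before sending a text to the server. The following is a minimal sketch, not part of the original notebook: `fits_budget` and `rough_count` are hypothetical helpers, and the whitespace split is only a stand-in for a real tokenizer (it will usually undercount tokens).

```python
from typing import Callable

MAX_INPUT_TOKENS = 1500  # limit stated in the warning above


def fits_budget(
    text: str,
    count_tokens: Callable[[str], int],
    limit: int = MAX_INPUT_TOKENS,
) -> bool:
    """Return True if `text` has fewer than `limit` tokens."""
    return count_tokens(text) < limit


def rough_count(text: str) -> int:
    """Hypothetical stand-in: counts whitespace-separated words, not model tokens."""
    return len(text.split())


short_text = "Taxation des revenus financiers"
long_text = "mot " * 2000

print(fits_budget(short_text, rough_count))
print(fits_budget(long_text, rough_count))
```

For an accurate count, `count_tokens` could wrap the tokenizer of the served model, for example `lambda t: len(tokenizer.encode(t))` with a tokenizer loaded through `AutoTokenizer.from_pretrained`.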

## Process with the LLM

Load the configuration for the LLM you want to run.

In [None]:
import tomllib

with open('../config/local_llm.toml', 'rb') as file:
    configs = tomllib.load(file)

print(configs)  # the parsed config is a dictionary

Define the template that structures the query. The model used in this tutorial is based on Mistral 7B and uses the ChatML structure.
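
To make the ChatML structure concrete, here is a minimal sketch of how a list of chat messages is laid out in that format. `render_chatml` is a hypothetical helper written for illustration. When the server is started with a chat template, this rendering happens on the server side, so the notebook itself never needs to build the string.

```python
def render_chatml(messages: list[dict[str, str]]) -> str:
    """Lay out chat messages in ChatML and open the assistant turn."""
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
        for m in messages
    ]
    # The prompt ends with an open assistant turn for the model to complete
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)


rendered = render_chatml(
    [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize this text."},
    ]
)
print(rendered)
```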

Define the input to process:

In [None]:
input = """
Progressivité réelle de l'impôt sur le revenu sans en passer par les tranches mais en s'appuyant, par exemple,  sur un coefficient variable suivant le niveau de revenu (ou bien s'inspirer librement du modèle suédois).
-Taxation des revenus financiers issus de placements qui ne sont pas directement investis dans l'économie (exemple: les produits dérivés).
-imposer les ayants droits aux minima sociaux à raison d'une somme symbolique: 30 ou 50 euros par an par exemple.
-faire sauter certaines niches fiscales après audit de la Cour des comptes.
-taxer comme le font les USA les Français qui prennent une autre nationalité ou résident à l'étranger.
"""

In [None]:
from openai import OpenAI

model_name = configs["model"]["name"]
user_message = configs["template"]["user"].format(input=input)
system_message = configs["template"]["system"]

client = OpenAI(
    base_url=configs["server"]["base_url"],
    api_key=configs["server"]["api_key"],
)

completion = client.chat.completions.create(
    model=model_name,
    messages=[
        {"role": "system", "content": system_message},
        {"role": "user", "content": user_message},
    ],
    stop=configs["model"]["stop"],
    top_p=configs["model"]["top_p"],
    temperature=configs["model"]["temperature"],
)

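# Keep the text between the first pair of ``` fences in the reply.
# This raises IndexError if the reply contains no fence.
output = completion.choices[0].message.content.split("```")[1]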
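
Splitting on the fence marker fails when the model answers without a fenced block, and it keeps any language tag written after the opening fence. A more tolerant extraction could look like the sketch below. `extract_fenced_block` is a hypothetical helper, and the sample reply is invented for illustration, not real model output.

```python
import re

# Opening fence, optional language tag, then the shortest body up to the closing fence
FENCE_PATTERN = re.compile(r"```[^\n]*\n(.*?)```", re.DOTALL)


def extract_fenced_block(reply: str) -> str:
    """Return the content of the first fenced block, or the whole reply if none."""
    match = FENCE_PATTERN.search(reply)
    if match is None:
        return reply.strip()
    return match.group(1).strip()


sample_reply = 'Here is the result:\n```json\n{"topics": ["tax"]}\n```\nDone.'
print(extract_fenced_block(sample_reply))
print(extract_fenced_block("No fenced block in this reply."))
```

If the prompt template asks the model for JSON, the extracted string can then be passed to `json.loads`, ideally inside a `try`/`except json.JSONDecodeError` since the model may return malformed JSON.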

In [None]:
from pprint import pprint

pprint(output)