## Online Meeting

<a target="_blank" href="https://colab.research.google.com/github/microsoft/LLMLingua/blob/main/examples/OnlineMeeting.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

Using generative AI such as ChatGPT in online meetings (e.g., in **Teams**) can greatly improve work efficiency. However, the context in such applications tends to be conversational, with a high degree of redundancy and a large number of tokens (more than **40k**). Compressing prompts with LLMLingua significantly reduces their length, which in turn reduces latency. This makes the AI more responsive in real-time communication scenarios such as online meetings. We use meeting transcripts from the [**MeetingBank** dataset](https://huggingface.co/datasets/lytang/MeetingBank-transcript) as an example to demonstrate the capabilities of LLMLingua.

### MeetingBank Dataset

Next, we demonstrate the use of LongLLMLingua on the **MeetingBank** dataset, where it can achieve similar or even better performance with significantly fewer tokens. The online meeting scenario is quite similar to RAG, as it also suffers from the "lost in the middle" issue, where noise at the beginning or end of the prompt interferes with the LLM's ability to extract key information. This dataset closely resembles real-world online meeting scenarios, with prompt lengths exceeding **60k** tokens at their longest.
   
The original dataset can be found at https://huggingface.co/datasets/lytang/MeetingBank-transcript

In [1]:
%env PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True

env: PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True


In [2]:
# Install dependencies.
!pip install llmlingua
!pip install openai==1.14.1
#!pip install optimum auto-gptq
!pip install optimum
!pip install auto-gptq









In [3]:
import json
from openai import OpenAI
import time

In [28]:
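import os

client = OpenAI(
    # Read the key from the environment instead of hard-coding it.
    # OPENAI_API_KEY is also what the client falls back to when api_key is omitted.
    api_key=os.environ.get("OPENAI_API_KEY")
)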


In [5]:
# Setup LLMLingua
from llmlingua import PromptCompressor

llm_lingua = PromptCompressor("TheBloke/Llama-2-7b-Chat-GPTQ", model_config={"revision": "main"})


The cos_cached attribute will be removed in 4.39. Bear in mind that its contents changed in v4.38. Use the forward method of RoPE from now on instead. It is not used in the `LlamaAttention` class
The sin_cached attribute will be removed in 4.39. Bear in mind that its contents changed in v4.38. Use the forward method of RoPE from now on instead. It is not used in the `LlamaAttention` class


# Test with Stenographic Transcripts (Notas Taquigráficas)

In [6]:
%pwd

'/home/helen/token_compression'

In [7]:
import pandas as pd

df = pd.read_csv('./meetings.csv')

In [8]:
display(df)

Unnamed: 0,id_session,speaker_name,party,speech
0,25341,O SR. PRESIDENTE RODRIGO PACHECO,Rodrigo Pacheco. Bloco Parlamentar PSD/Republi...,Declaro aberta a sessão. Sob a proteção de Deu...
1,25341,O SR. OMAR AZIZ,Bloco Parlamentar PSD/Republicanos/PSD - AM,"Questão de ordem, Sr. Presidente. Sr. Presiden..."
2,25341,O SR. PRESIDENTE RODRIGO PACHECO,Rodrigo Pacheco. Bloco Parlamentar PSD/Republi...,Eu peço a permissão dos Srs. Senadores e das S...
3,25341,O SR. OMAR AZIZ,Bloco Parlamentar PSD/Republicanos/PSD - AM,"Sr. Presidente, uma questão de ordem, por favor."
4,25341,O SR. PRESIDENTE OMAR AZIZ,Omar Aziz. Bloco Parlamentar PSD/Republicanos/...,"Pois não. Senador Omar Aziz, para uma questão ..."
...,...,...,...,...
24607,25877,O SR. MAGNO MALTA,Bloco Parlamentar Vanguarda/PL - ES,"Aqui os meus pêsames a quem votou neles, porqu..."
24608,25877,O SR. MAGNO MALTA,Bloco Parlamentar Vanguarda/PL - ES,"Se eu estiver errado, V. Exa. pode me apartear..."
24609,25877,O SR. MAGNO MALTA,Bloco Parlamentar Vanguarda/PL - ES,E nós sabemos o que é o decálogo de Lenin: a d...
24610,25877,O SR. MAGNO MALTA,Bloco Parlamentar Vanguarda/PL - ES,"No dia em que V. Exa. sair dessa cadeira, eles..."


In [9]:
new_df = df[df["id_session"]==25355]

In [10]:
display(new_df)

Unnamed: 0,id_session,speaker_name,party,speech
432,25355,O SR. PRESIDENTE RODRIGO PACHECO,Rodrigo Pacheco. Bloco Parlamentar da Resistên...,Declaro aberta a sessão. Sob a proteção de Deu...
433,25355,O SR. JORGE KAJURU,Bloco Parlamentar da Resistência Democrática/P...,Questão de ordem!
434,25355,O SR. PRESIDENTE RODRIGO PACHECO,Rodrigo Pacheco. Bloco Parlamentar da Resistên...,"Questão de ordem, Senador Jorge Kajuru."
435,25355,O SR. JORGE KAJURU,Bloco Parlamentar da Resistência Democrática/P...,"Obrigado, Presidente Rodrigo Pacheco. Eu sei d..."
436,25355,O SR. PRESIDENTE RODRIGO PACHECO,Rodrigo Pacheco. Bloco Parlamentar da Resistên...,"Obrigado, Senador Jorge Kajuru. Com a palavra,..."
...,...,...,...,...
623,25355,O SR. PAULO PAIM,Bloco Parlamentar da Resistência Democrática/P...,"Não, vou ficar no tempo. Você marcou lá corret..."
624,25355,O SR. PRESIDENTE RODRIGO CUNHA,Rodrigo Cunha. Bloco Parlamentar Democracia/UN...,Obrigado.
625,25355,O SR. PAULO PAIM,Bloco Parlamentar da Resistência Democrática/P...,"Eu queria também, Presidente, dizer que para m..."
626,25355,O SR. PAULO PAIM,Bloco Parlamentar da Resistência Democrática/P...,... e agradeço muito a V. Exa. Neste momento q...


In [11]:
new_df = new_df.drop(columns=['party','id_session'])
print(len(new_df))
#new_df = new_df.head(500)

196


In [12]:
csv_to_prompt = new_df.to_csv(index=False)

In [13]:
csv_to_prompt

'speaker_name,speech\nO SR. PRESIDENTE RODRIGO PACHECO,"Declaro aberta a sessão. Sob a proteção de Deus, iniciamos os nossos trabalhos. As Senadoras e os Senadores poderão se inscrever para o uso da palavra por três minutos por meio do aplicativo Senado Digital, por lista de inscrição que se encontra sobre a mesa, ou por intermédio dos totens disponibilizados na Casa. A presente sessão deliberativa é destinada à apreciação do Projeto de Decreto Legislativo nº 2, de 2023, já disponibilizado em avulsos eletrônicos e na Ordem do Dia eletrônica de hoje. Esta Presidência, ao tempo em que manifesta uma vez mais os votos de boas-vindas aos colegas Senadores e às colegas Senadoras, tanto àqueles que iniciam a sua legislatura no Senado quanto àqueles que já estavam no Senado Federal, deseja que tenhamos os próximos anos de um trabalho produtivo, colaborativo, com respeito recíproco, em prol da sociedade brasileira e do Brasil. Portanto, sinceramente, meus votos de boas-vindas a todos os Senador

In [14]:
# def get_stance_system():
#     system = """Consider that it is an expert model in Stance Detection. Stance detection is the task of predicting an author's point of view on a subject of interest. A speech can represent one of three types of stance, for, against or neutral.
# For: When an author takes a stance "for" a subject, it means they support or advocate for it. Their speech or writing will likely include arguments, evidence, or opinions that highlight the positive aspects, benefits, or reasons to endorse the subject. For example, if the subject is a proposed policy change, someone taking a "for" stance might emphasize how it could improve people's lives or address important societal issues.
# Against: This stance indicates opposition or disagreement with the subject at hand. Authors taking an "against" stance will present arguments, evidence, or opinions that highlight flaws, risks, negative consequences, or reasons to reject the subject. Using the previous example of a proposed policy change, someone taking an "against" stance might argue that it would be ineffective, unfair, or harmful to certain groups.
# Neutral: A neutral stance means the author does not express explicit support or opposition towards the subject. They may present information, analysis, or perspectives in a balanced and objective manner without advocating for or against the subject. Neutral stances typically avoid strong opinions or judgments and instead focus on providing a comprehensive understanding of the topic without bias.
# Reply in json format with the following keys: summary, list_latent_topics, stances.
# summary: should contain the summary of the meeting.
# list_latent_topics: should contain the list of topics discussed in the meeting.
# stances: for each latent_topics key should contain the classification of the speaker's speech.
# """
#     return system

# Prompt Versions

In [15]:
def summarization_prompt_for_stances_v1(context: str, speakers: list[str]):
    speakers_string = ', '.join(speakers)
    
    prompt = f"""
    Consider that it is an expert model in Stance Detection. Stance detection is the task of predicting an author's point of view on a subject of interest. A speech can represent one of four types of stance, for, against, neutral and not related.
For: When an author takes a stance "for" a subject, it means they support or advocate for it. Their speech or writing will likely include arguments, evidence, or opinions that highlight the positive aspects, benefits, or reasons to endorse the subject. For example, if the subject is a proposed policy change, someone taking a "for" stance might emphasize how it could improve people's lives or address important societal issues.
Against: This stance indicates opposition or disagreement with the subject at hand. Authors taking an "against" stance will present arguments, evidence, or opinions that highlight flaws, risks, negative consequences, or reasons to reject the subject. Using the previous example of a proposed policy change, someone taking an "against" stance might argue that it would be ineffective, unfair, or harmful to certain groups.
Neutral: A neutral stance means the author does not express explicit support or opposition towards the subject. They may present information, analysis, or perspectives in a balanced and objective manner without advocating for or against the subject. Neutral stances typically avoid strong opinions or judgments and instead focus on providing a comprehensive understanding of the topic without bias. If the person doesn't say anything about that topic, it means that their position is "NOT RELATED".
Not related: Means that the person has not taken a position on the subject. They may have no direct interest in the topic in question or they may simply choose not to express an opinion on it. 
Reply in json format with the following keys: list_latent_topics, stances and summary.
list_latent_topics: should contain the list of topics discussed in the text, and a short description for each topic.
stances: for each latent_topics key should contain the classification of the all speaker's speechs.
summary: should contain the summary of the text.
    Consider that you will receive as input a csv with a set of speeches that make up ```{context}```.

    Identify the following items in the text:
    - Determine the all position topics being discussed in the text and a brief descriptions of these topics.
    - For each topic and for each speaker, classify the stance as being FOR, AGAINST, NEUTRAL or NOT RELATED. Being the following speakers: {speakers_string}
   
   Json response example:
   {{
    "list_latent_topics": {{
        "Topic 1": "Description",
        ...
    }},
    "stances": {{
        "Topic 1": {{
            "Speaker 1": "STANCE",
            ...
        }},
        ...
    }},
    "summary": "Text summary."
    }}
   """
    return prompt
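
The prompt above asks the model to reply in JSON with three keys: `list_latent_topics`, `stances` and `summary`. As a minimal sketch of how such a reply could be checked before further use, the helper below parses it and verifies that structure. The name `parse_stance_response` and the sample reply are hypothetical and are not part of the original notebook.

```python
import json

REQUIRED_KEYS = {"list_latent_topics", "stances", "summary"}
ALLOWED_STANCES = {"FOR", "AGAINST", "NEUTRAL", "NOT RELATED"}


def parse_stance_response(raw: str) -> dict:
    """Parse a model reply and check it has the structure the prompt asks for."""
    data = json.loads(raw)
    missing = REQUIRED_KEYS - set(data)
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    for topic, per_speaker in data["stances"].items():
        # Every topic with stances should also be described in the topic list.
        if topic not in data["list_latent_topics"]:
            raise ValueError(f"stance given for unknown topic: {topic!r}")
        for speaker, stance in per_speaker.items():
            if stance.upper() not in ALLOWED_STANCES:
                raise ValueError(f"unexpected stance {stance!r} for {speaker!r}")
    return data


# Hypothetical reply, for illustration only.
sample_reply = """
{
  "list_latent_topics": {"Topic 1": "Description"},
  "stances": {"Topic 1": {"Speaker 1": "FOR", "Speaker 2": "NOT RELATED"}},
  "summary": "Text summary."
}
"""
parsed = parse_stance_response(sample_reply)
print(parsed["stances"]["Topic 1"])
```

Failing early on a malformed reply makes it easier to compare prompt versions, since a version that produces unparseable output is flagged immediately.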

In [16]:
def summarization_prompt_for_stances_v2(context: str, speakers: list[str]):
    speakers_string = ', '.join(speakers)
    
    prompt = f"""
    Consider that it is an expert model in Stance Detection. Stance detection is the task of predicting an author's point of view on a subject of interest. A speech can represent one of four types of stance, for, against, neutral and not related.
For: When an author takes a stance "for" a subject, it means they support or advocate for it. Their speech or writing will likely include arguments, evidence, or opinions that highlight the positive aspects, benefits, or reasons to endorse the subject. For example, if the subject is a proposed policy change, someone taking a "for" stance might emphasize how it could improve people's lives or address important societal issues.
Against: This stance indicates opposition or disagreement with the subject at hand. Authors taking an "against" stance will present arguments, evidence, or opinions that highlight flaws, risks, negative consequences, or reasons to reject the subject. Using the previous example of a proposed policy change, someone taking an "against" stance might argue that it would be ineffective, unfair, or harmful to certain groups.
Neutral: A neutral stance means the author does not express explicit support or opposition towards the subject. They may present information, analysis, or perspectives in a balanced and objective manner without advocating for or against the subject. Neutral stances typically avoid strong opinions or judgments and instead focus on providing a comprehensive understanding of the topic without bias. If the person doesn't say anything about that topic, it means that their position is "NOT RELATED".
Not related: Means that the person has not taken a position on the subject. They may have no direct interest in the topic in question or they may simply choose not to express an opinion on it. 
Reply in json format with the following keys: list_latent_topics, stances and summary.
list_latent_topics: should contain the list of topics discussed in the text, and a short description for each topic.
stances: for each latent_topics key should contain the classification of the all speaker's speechs.
summary: should contain the summary of the text.
    Consider that you will receive as input a csv with a set of speeches that make up ```{context}```.

    Identify the following items in the text:
    - Determine the all position topics being discussed in the text and a brief descriptions of these topics.
    - For each topic and for each speaker, classify the stance as being FOR, AGAINST, NEUTRAL or NOT RELATED. Being the following speakers: {speakers_string}
   """
    return prompt

In [17]:
def summarization_prompt_for_stances_v3(context: str, speakers: list[str]):
    speakers_string = ', '.join(speakers)
    
    prompt = f"""
    Consider that it is an expert model in Stance Detection. Stance detection is the task of predicting an author's point of view on a subject of interest. A speech can represent one of four types of stance, for, against, neutral and not related.
For: When an author takes a stance "for" a subject, it means they support or advocate for it. Their speech or writing will likely include arguments, evidence, or opinions that highlight the positive aspects, benefits, or reasons to endorse the subject. For example, if the subject is a proposed policy change, someone taking a "for" stance might emphasize how it could improve people's lives or address important societal issues.
Against: This stance indicates opposition or disagreement with the subject at hand. Authors taking an "against" stance will present arguments, evidence, or opinions that highlight flaws, risks, negative consequences, or reasons to reject the subject. Using the previous example of a proposed policy change, someone taking an "against" stance might argue that it would be ineffective, unfair, or harmful to certain groups.
Neutral: A neutral stance means the author does not express explicit support or opposition towards the subject. They may present information, analysis, or perspectives in a balanced and objective manner without advocating for or against the subject. Neutral stances typically avoid strong opinions or judgments and instead focus on providing a comprehensive understanding of the topic without bias. If the person doesn't say anything about that topic, it means that their position is "NOT RELATED".
Not related: Means that the person has not taken a position on the subject. They may have no direct interest in the topic in question or they may simply choose not to express an opinion on it. 
Reply in json format with the following keys: list_latent_topics, stances and summary.
list_latent_topics: should contain the list of topics discussed in the text, and a short description for each topic.
stances: for each latent_topics key should contain the classification of the all speaker's speechs.
summary: should contain the summary of the text.
    Consider that you will receive as input a csv with a set of speeches that make up ```{context}```.

    Identify the following items in the text:
    - Determine the all position topics being discussed in the text and a brief descriptions of these topics.
    - For each topic and for each speaker, classify the stance as being FOR, AGAINST, NEUTRAL or NOT RELATED. Being the following speakers: {speakers_string}. If the person doesn't say anything about that topic, it means that their position is "NOT RELATED".
   """
    return prompt

In [18]:
def summarization_prompt_for_stances_v4(context: str, speakers: list[str]):
    speakers_string = ', '.join(speakers)
    
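    # These are plain (non-f) strings, so braces are written once;
    # doubling them here would leave literal "{{" and "}}" in the prompt.
    json_example_start = """{
    "list_latent_topics": {
        "Topic 1": "Description",
        ...
    },
    "stances": {
        "Topic 1": {
"""
    
    # One '"<speaker>": "STANCE"' entry per speaker, one per line.
    json_speakers = ',\n'.join(f'            "{speaker}": "STANCE"' for speaker in speakers)
    
    json_example_end = """,
            ...
        },
        ...
    },
    "summary": "Text summary."
    }"""
    
    json_example = json_example_start + json_speakers + json_example_end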
    
    prompt = f"""
    Consider that it is an expert model in Stance Detection. Stance detection is the task of predicting an author's point of view on a subject of interest. A speech can represent one of four types of stance, for, against, neutral and not related.
For: When an author takes a stance "for" a subject, it means they support or advocate for it. Their speech or writing will likely include arguments, evidence, or opinions that highlight the positive aspects, benefits, or reasons to endorse the subject. For example, if the subject is a proposed policy change, someone taking a "for" stance might emphasize how it could improve people's lives or address important societal issues.
Against: This stance indicates opposition or disagreement with the subject at hand. Authors taking an "against" stance will present arguments, evidence, or opinions that highlight flaws, risks, negative consequences, or reasons to reject the subject. Using the previous example of a proposed policy change, someone taking an "against" stance might argue that it would be ineffective, unfair, or harmful to certain groups.
Neutral: A neutral stance means the author does not express explicit support or opposition towards the subject. They may present information, analysis, or perspectives in a balanced and objective manner without advocating for or against the subject. Neutral stances typically avoid strong opinions or judgments and instead focus on providing a comprehensive understanding of the topic without bias. If the person doesn't say anything about that topic, it means that their position is "NOT RELATED".
Not related: Means that the person has not taken a position on the subject. They may have no direct interest in the topic in question or they may simply choose not to express an opinion on it. 
Reply in json format with the following keys: list_latent_topics, stances and summary.
list_latent_topics: should contain the list of topics discussed in the text, and a short description for each topic.
stances: for each latent_topics key should contain the classification of the all speaker's speechs.
summary: should contain the summary of the text.
    Consider that you will receive as input a csv with a set of speeches that make up ```{context}```.

    Identify the following items in the text:
    - Determine the all position topics being discussed in the text and a brief descriptions of these topics.
    - For each topic and for each speaker, classify the stance as being FOR, AGAINST, NEUTRAL or NOT RELATED. Being the following speakers: {speakers_string}. If the person doesn't say anything about that topic, it means that their position is "NOT RELATED".
    
    Json response example:
   {json_example}
   """
    return prompt
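
The v4 function mixes plain triple-quoted strings (the JSON example pieces) with an f-string (the prompt), and the two treat braces differently. The short sketch below, using made-up string contents, shows that doubled braces collapse to single braces only inside an f-string, and that interpolating a plain string into an f-string does not re-process its braces.

```python
# Plain string: braces are literal, so doubled braces stay doubled.
plain = """{{"a": 1}}"""

# f-string: "{{" and "}}" are escapes that produce "{" and "}".
formatted = f"""{{"a": 1}}"""

# Interpolated values are inserted as-is; their braces are not unescaped.
embedded = f"example: {plain}"

print(plain)      # {{"a": 1}}
print(formatted)  # {"a": 1}
print(embedded)   # example: {{"a": 1}}
```

When a JSON example is assembled from plain strings and then interpolated, its braces therefore need to be written singly in the plain strings.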

In [19]:
def summarization_prompt_for_stances_v5(context: str, speakers: list[str]):
    speakers_string = ', '.join(speakers)
    
    prompt = f"""
    Consider that it is an expert model in Stance Detection. Stance detection is the task of predicting an author's point of view on a subject of interest. A speech can represent one of three types of stance: for, against or neutral.
For: When an author takes a stance "for" a subject, it means they support or advocate for it. Their speech or writing will likely include arguments, evidence, or opinions that highlight the positive aspects, benefits, or reasons to endorse the subject. For example, if the subject is a proposed policy change, someone taking a "for" stance might emphasize how it could improve people's lives or address important societal issues.
Against: This stance indicates opposition or disagreement with the subject at hand. Authors taking an "against" stance will present arguments, evidence, or opinions that highlight flaws, risks, negative consequences, or reasons to reject the subject. Using the previous example of a proposed policy change, someone taking an "against" stance might argue that it would be ineffective, unfair, or harmful to certain groups.
Neutral: A neutral stance means the author does not express explicit support or opposition towards the subject. They may present information, analysis, or perspectives in a balanced and objective manner without advocating for or against the subject. Neutral stances typically avoid strong opinions or judgments and instead focus on providing a comprehensive understanding of the topic without bias. If the person doesn't say anything about that topic, it means that they should not be listed.
Reply in json format with the following keys: list_latent_topics, stances and summary.
list_latent_topics: should contain the list of topics discussed in the text, and a short description for each topic.
stances: for each latent_topics key should contain the classification of the all speaker's speechs.
summary: should contain the summary of the text.
    Consider that you will receive as input a csv with a set of speeches that make up ```{context}```.

    Identify the following items in the text:
    - Determine the all position topics being discussed in the text and a brief descriptions of these topics.
    - For each topic and for each speaker, if the person doesn't say anything about that topic, DONT add them on stances list, classify the stance as being FOR, AGAINST, NEUTRAL. Being the following speakers: {speakers_string}.
   """
    return prompt

In [20]:
def summarization_prompt_for_stances_v6(context: str, speakers: list[str]):
    speakers_string = ', '.join(speakers)
    
    prompt = f"""
    Consider that it is an expert model in Stance Detection. Stance detection is the task of predicting an author's point of view on a subject of interest. A speech can represent one of three types of stance: for, against or neutral.
For: When an author takes a stance "for" a subject, it means they support or advocate for it. Their speech or writing will likely include arguments, evidence, or opinions that highlight the positive aspects, benefits, or reasons to endorse the subject. For example, if the subject is a proposed policy change, someone taking a "for" stance might emphasize how it could improve people's lives or address important societal issues.
Against: This stance indicates opposition or disagreement with the subject at hand. Authors taking an "against" stance will present arguments, evidence, or opinions that highlight flaws, risks, negative consequences, or reasons to reject the subject. Using the previous example of a proposed policy change, someone taking an "against" stance might argue that it would be ineffective, unfair, or harmful to certain groups.
Neutral: A neutral stance means the author does not express explicit support or opposition towards the subject. They may present information, analysis, or perspectives in a balanced and objective manner without advocating for or against the subject. Neutral stances typically avoid strong opinions or judgments and instead focus on providing a comprehensive understanding of the topic without bias. If the person doesn't say anything about that topic, it means that they should not be listed.
Reply in json format with the following keys: list_latent_topics, stances and summary.
list_latent_topics: should contain the list of topics discussed in the text, and a short description for each topic.
stances: for each latent_topics key should contain the list of classification of the related speaker's speechs.
summary: should contain the summary of the text.
    Consider that you will receive as input a csv with a set of speeches that make up ```{context}```.

    Identify the following items in the text:
    - Determine the all position topics being discussed in the text and a brief descriptions of these topics.
    - For each topic and for each speaker, except if the person doesn't say anything about that topic, classify the stance as being FOR, AGAINST, NEUTRAL. Being the following speakers: {speakers_string}.
    
    Before your response, translate the summary and the topics to portuguese.
   """
    return prompt

In [21]:
def summarization_prompt_for_stances_v7(context: str, speakers: list[str]):
    speakers_string = ', '.join(speakers)
    
    prompt = f"""
    Consider that it is an expert model in Stance Detection. Stance detection is the task of predicting an author's point of view on a subject of interest. A speech can represent one of three types of stance: for, against or neutral.
For: When an author takes a stance "for" a subject, it means they support or advocate for it. Their speech or writing will likely include arguments, evidence, or opinions that highlight the positive aspects, benefits, or reasons to endorse the subject. For example, if the subject is a proposed policy change, someone taking a "for" stance might emphasize how it could improve people's lives or address important societal issues.
Against: This stance indicates opposition or disagreement with the subject at hand. Authors taking an "against" stance will present arguments, evidence, or opinions that highlight flaws, risks, negative consequences, or reasons to reject the subject. Using the previous example of a proposed policy change, someone taking an "against" stance might argue that it would be ineffective, unfair, or harmful to certain groups.
Neutral: A neutral stance means the author does not express explicit support or opposition towards the subject. They may present information, analysis, or perspectives in a balanced and objective manner without advocating for or against the subject. Neutral stances typically avoid strong opinions or judgments and instead focus on providing a comprehensive understanding of the topic without bias. If the person doesn't say anything about that topic, it means that they should not be listed.
Reply in json format with the following keys: list_latent_topics, stances and summary.
list_latent_topics: should contain the list of all topics discussed in the text, and a short description for each topic.
stances: for each latent_topics key should contain the list of classification of the related speaker's speechs.
summary: should contain the summary of the text.
    Consider that you will receive as input a csv with a set of speeches that make up ```{context}```.

    Do the following actions for the text:
    - Determine the all topics being discussed in the text and a brief descriptions of these topics.
    - For each topic and for each speaker, except if the person doesn't say anything about that topic, classify the stance as being FOR, AGAINST, NEUTRAL. Being the following speakers: {speakers_string}.
    - Before your response, translate the summary and the topics to portuguese.
   """
    return prompt

In [22]:
def summarization_prompt_for_stances_v8(context: str, speakers: list[str]):
    speakers_string = ', '.join(speakers)
    
    prompt = f"""
    Consider that it is an expert model in Stance Detection. Stance detection is the task of predicting an author's point of view on a subject of interest. A speech can represent one of three types of stance: for, against or neutral.
For: When an author takes a stance "for" a subject, it means they support or advocate for it. Their speech or writing will likely include arguments, evidence, or opinions that highlight the positive aspects, benefits, or reasons to endorse the subject. For example, if the subject is a proposed policy change, someone taking a "for" stance might emphasize how it could improve people's lives or address important societal issues.
Against: This stance indicates opposition or disagreement with the subject at hand. Authors taking an "against" stance will present arguments, evidence, or opinions that highlight flaws, risks, negative consequences, or reasons to reject the subject. Using the previous example of a proposed policy change, someone taking an "against" stance might argue that it would be ineffective, unfair, or harmful to certain groups.
Neutral: A neutral stance means the author does not express explicit support or opposition towards the subject. They may present information, analysis, or perspectives in a balanced and objective manner without advocating for or against the subject. Neutral stances typically avoid strong opinions or judgments and instead focus on providing a comprehensive understanding of the topic without bias. If the person doesn't say anything about that topic, it means that they should not be listed.
Reply in json format with the following keys: list_latent_topics, stances and summary.
list_latent_topics: should contain the list of all topics discussed in the text, and a short description for each topic.
stances: for each latent_topics key should contain the list of classification of the related speaker's speechs.
summary: should contain the summary of the text.
    Consider that you will receive as input a csv with a set of speeches that make up ```{context}```.

    Do the following actions for the text:
    - Determine the all topics being discussed in the text and a brief descriptions of these topics.
    - For each topic and for each speaker, except if the person doesn't say anything about that topic, classify the stance as being FOR, AGAINST, NEUTRAL. Being the following speakers: {speakers_string}.
    - Before your response, translate the summary and the topics to portuguese.
   """
    return prompt

In [23]:
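def transform_df_to_speeches_list(text_df):
    # Use the DataFrame passed in, not the global new_df.
    speakers = text_df['speaker_name']
    speeches = text_df['speech']
    
    speeches_list = ""
    # Wrap each speaker name in a tag so it is excluded from compression.
    # (Series.iteritems() was removed in pandas 2.0; zip avoids it entirely.)
    for speaker, speech in zip(speakers, speeches):
        speeches_list += f"<llmlingua, compress=False>{speaker}</llmlingua>: {speech}\n"
    return speeches_list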

speeches_list = transform_df_to_speeches_list(new_df)

def summarization_prompt_for_stances_v9(context: str, speakers: list[str]):
    speakers_string = ', '.join(speakers)
    
    prompt = f"""
    Consider that it is an expert model in Stance Detection. Stance detection is the task of predicting an author's point of view on a subject of interest. A speech can represent one of three types of stance: for, against or neutral.
For: When an author takes a stance "for" a subject, it means they support or advocate for it. Their speech or writing will likely include arguments, evidence, or opinions that highlight the positive aspects, benefits, or reasons to endorse the subject. For example, if the subject is a proposed policy change, someone taking a "for" stance might emphasize how it could improve people's lives or address important societal issues.
Against: This stance indicates opposition or disagreement with the subject at hand. Authors taking an "against" stance will present arguments, evidence, or opinions that highlight flaws, risks, negative consequences, or reasons to reject the subject. Using the previous example of a proposed policy change, someone taking an "against" stance might argue that it would be ineffective, unfair, or harmful to certain groups.
Neutral: A neutral stance means the author does not express explicit support or opposition towards the subject. They may present information, analysis, or perspectives in a balanced and objective manner without advocating for or against the subject. Neutral stances typically avoid strong opinions or judgments and instead focus on providing a comprehensive understanding of the topic without bias. If the person doesn't say anything about that topic, it means that they should not be listed.
Reply in json format with the following keys: list_latent_topics, stances and summary.
list_latent_topics: should contain the list of all topics discussed in the text, and a short description for each topic.
stances: for each latent_topics key should contain the list of classifications of the related speakers' speeches.
summary: should contain the summary of the text.
    Consider that you will receive as input a text with a set of speeches that make up ```{context}```.

    Do the following actions for the text:
    - Determine all the topics being discussed in the text and give a brief description of each topic.
    - For each topic and for each speaker, except if the person doesn't say anything about that topic, classify the stance as FOR, AGAINST, or NEUTRAL. The speakers are: {speakers_string}.
    - Before your response, translate the summary and the topics to Portuguese.
   """
    return prompt
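
To make the input format concrete, the sketch below rebuilds the tagged speech lines from a couple of made-up (speaker, speech) pairs instead of the real DataFrame. The function name `build_speeches_list` and the sample rows are hypothetical and exist only for illustration. Note that every line ends with `\n`, so a later `split("\n")` yields a trailing empty string.

```python
def build_speeches_list(rows):
    """Join (speaker_name, speech) pairs into tagged lines.

    `rows` is hypothetical sample data, not the notebook's DataFrame.
    """
    lines = []
    for speaker, speech in rows:
        # Same line layout as transform_df_to_speeches_list above.
        lines.append(f"<llmlingua, compress=False>{speaker}</llmlingua>: {speech}")
    # Every line, including the last one, is terminated by a newline.
    return "\n".join(lines) + "\n"


sample_rows = [
    ("SPEAKER A", "I support the proposal."),
    ("SPEAKER B", "I have doubts about the cost."),
]
speeches_text = build_speeches_list(sample_rows)
print(speeches_text)

# Splitting on "\n" leaves an empty last element because of the final newline.
chunks = speeches_text.split("\n")
print(len(chunks), repr(chunks[-1]))
```

In the compression output shown later in this notebook, the opening tag comes back as `<lling, compress=False>`, which suggests that `compress_prompt` treated the tags as ordinary text. LLMLingua also provides a `structured_compress_prompt` method for tagged input; whether it suits this workflow is an assumption worth verifying against the library documentation.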

In [24]:
# Get all speakers names
speakers_list = list(set(new_df['speaker_name']))
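
Because `set` does not preserve order, the speaker list (and therefore the prompt text built from it) can differ between runs. If a stable prompt matters, one option is to deduplicate while keeping first-appearance order. This is a minimal sketch with made-up names; `unique_in_order` is a hypothetical helper, not part of the notebook.

```python
def unique_in_order(names):
    # dict keys keep insertion order, so duplicates are dropped
    # while the first occurrence of each name stays in place.
    return list(dict.fromkeys(names))


sample_names = ["SPEAKER A", "SPEAKER B", "SPEAKER A", "SPEAKER C"]
print(unique_in_order(sample_names))
```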

In [25]:
# Compress the transcript to a target of 16,000 tokens
start_time = time.time()
compressed_prompt = llm_lingua.compress_prompt(
    speeches_list.split("\n"),
    question=summarization_prompt_for_stances_v9('parliamentary session', speakers_list),
    target_token=16000,
    condition_compare=True,
    condition_in_question="after",
    rank_method="longllmlingua",
)
end_time = time.time()
print(compressed_prompt)
print(f"Compression took {end_time - start_time:.6f} seconds to run.")

{'compressed_prompt': '<lling, compress=False>O SR. JORGE KAJURU</llmlingua>:brigado, Presidente Rodrigo Pacheco. Eu da sua posição, sei da suaerência. Eu acabo de falar, antes de a sessão iniciar três homens públicos deste país que merecem o respeito de toda a sociedade: o Vice-Pres e nosso Senador hoje General Mourão, que dispensa comentários; meu querido amigo Senador Magno Malta; fi também com o Sen Rogério Marinho; sei outros e outras aqui têm a mesma opinião. Presidente, então tudo se encaminha para a gente dar fim às votações remotas, ou seja, o Senado voltar a funcionar como ele sempre funcionou, porque é muito triste ver o Plenário do Senado hoje, depois de tudo que já aconteceu, vazio. O Senado só existe tendo o Senador presente. Então é um pedido que eu faço aqui. Sei que a maioria massacrante pensa do mesmo modo.ó quem tem um problema de saúde é que pode justificar a sua votação remamente; os demais, não, os demais precisam estar aqui, e é isso a gente espera que se defina.
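
The printed dictionary above is cut off, so the token counts are not visible here. The sketch below shows one way to summarise the size reduction, assuming the result also carries `origin_tokens` and `compressed_tokens` keys; that is an assumption about the returned dictionary, and the numbers in `example_result` are invented purely for illustration.

```python
def compression_summary(result):
    # Key names are assumed; fall back gracefully if they are missing.
    origin = result.get("origin_tokens")
    compressed = result.get("compressed_tokens")
    if not origin or compressed is None:
        return "token counts not available in this result"
    saved = origin - compressed
    return f"{origin} -> {compressed} tokens ({saved / origin:.1%} removed)"


# Hypothetical values, not measured results from this notebook.
example_result = {
    "compressed_prompt": "...",
    "origin_tokens": 40000,
    "compressed_tokens": 16000,
}
print(compression_summary(example_result))
```

In the notebook itself, the call would be `compression_summary(compressed_prompt)`.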

In [26]:
from pprint import pprint
pprint(compressed_prompt)

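# Write with an explicit encoding because the transcript contains non-ASCII characters.
with open("compressed_prompt_txt.txt", "w", encoding="utf-8") as file:
    file.write(compressed_prompt["compressed_prompt"])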

{'compressed_prompt': '<lling, compress=False>O SR. JORGE '
                      'KAJURU</llmlingua>:brigado, Presidente Rodrigo Pacheco. '
                      'Eu da sua posição, sei da suaerência. Eu acabo de '
                      'falar, antes de a sessão iniciar três homens públicos '
                      'deste país que merecem o respeito de toda a sociedade: '
                      'o Vice-Pres e nosso Senador hoje General Mourão, que '
                      'dispensa comentários; meu querido amigo Senador Magno '
                      'Malta; fi também com o Sen Rogério Marinho; sei outros '
                      'e outras aqui têm a mesma opinião. Presidente, então '
                      'tudo se encaminha para a gente dar fim às votações '
                      'remotas, ou seja, o Senado voltar a funcionar como ele '
                      'sempre funcionou, porque é muito triste ver o Plenário '
                      'do Senado hoje, depois de tudo que já aconteceu, va

In [29]:
message = [
    {"role": "user", "content": compressed_prompt["compressed_prompt"]},
]

request_data = {
    "messages": message,
    "temperature": 0,
    "top_p": 1,
    "n": 1,
    "stream": False,
}
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    **request_data,
)

print(json.dumps(compressed_prompt, indent=4))
print("Response:", response)

{
    "compressed_prompt": "<lling, compress=False>O SR. JORGE KAJURU</llmlingua>:brigado, Presidente Rodrigo Pacheco. Eu da sua posi\u00e7\u00e3o, sei da suaer\u00eancia. Eu acabo de falar, antes de a sess\u00e3o iniciar tr\u00eas homens p\u00fablicos deste pa\u00eds que merecem o respeito de toda a sociedade: o Vice-Pres e nosso Senador hoje General Mour\u00e3o, que dispensa coment\u00e1rios; meu querido amigo Senador Magno Malta; fi tamb\u00e9m com o Sen Rog\u00e9rio Marinho; sei outros e outras aqui t\u00eam a mesma opini\u00e3o. Presidente, ent\u00e3o tudo se encaminha para a gente dar fim \u00e0s vota\u00e7\u00f5es remotas, ou seja, o Senado voltar a funcionar como ele sempre funcionou, porque \u00e9 muito triste ver o Plen\u00e1rio do Senado hoje, depois de tudo que j\u00e1 aconteceu, vazio. O Senado s\u00f3 existe tendo o Senador presente. Ent\u00e3o \u00e9 um pedido que eu fa\u00e7o aqui. Sei que a maioria massacrante pensa do mesmo modo.\u00f3 quem tem um problema de sa\u00fa

In [30]:
# Use the current Unix timestamp to give each saved response a unique file name.
indexing = str(int(time.time()))
with open(f"./response_{indexing}.json", "w", encoding="utf-8") as outfile:
    outfile.write(response.choices[0].message.content)
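
The prompt asks for a JSON reply, but the text is written to a `.json` file without checking that it parses. A minimal sketch of a validation step follows, assuming the model may wrap its answer in a Markdown code fence. `parse_json_reply` is a hypothetical helper and the sample reply is invented for illustration.

```python
import json

FENCE = "`" * 3  # Markdown code fence marker


def parse_json_reply(text):
    """Return the parsed JSON object, or None if the reply is not valid JSON."""
    cleaned = text.strip()
    if cleaned.startswith(FENCE):
        # Drop the opening fence line (which may carry a language tag).
        cleaned = cleaned.split("\n", 1)[1] if "\n" in cleaned else ""
        # Drop the closing fence if present.
        if cleaned.rstrip().endswith(FENCE):
            cleaned = cleaned.rstrip()[: -len(FENCE)]
    try:
        return json.loads(cleaned)
    except json.JSONDecodeError:
        return None


# Hypothetical reply used only to exercise the helper.
sample_reply = FENCE + 'json\n{"summary": "ok"}\n' + FENCE
print(parse_json_reply(sample_reply))
print(parse_json_reply("not json"))
```

In the notebook, the input would be `response.choices[0].message.content`; a `None` result signals that the saved file should not be treated as valid JSON.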