<a href="https://colab.research.google.com/github/Ansh-Malik1/Conversation-Manager/blob/main/Conversation_Manager.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [78]:
!pip install openai jsonschema requests groq-openai-client

ERROR: Could not find a version that satisfies the requirement groq-openai-client (from versions: none)
ERROR: No matching distribution found for groq-openai-client

In [79]:
!pip install groq




In [80]:
from google.colab import userdata
api_key=userdata.get('GROQ_API_KEY')
if api_key:
  print("API KEY retrieved successfully from secrets")
else:
  print("Error retrieving API key")

API KEY retrieved successfully from secrets


## Task 1:
### 1. Maintain a running conversation history of user-assistant chats.
### 2. Summarize the history to keep it concise.
### 3. Truncate to the last n messages, or limit by character or word count.
### 4. Perform periodic summarization after every k-th run and replace the older history with the summary.
### 5. Demonstrate the above functionality in depth with different test cases.


### Conversation history
LLMs are stateless: they do not remember past messages between API calls. Each request is read as tokens, and the model's context window caps how many tokens fit in one request. Once that limit is reached, older messages have to be dropped to accommodate new ones. We must therefore maintain the list of messages explicitly.

Each message has:

*   role - "user", "system", or "assistant"
*   content - contains the text
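
To make the statelessness point concrete, here is a minimal sketch. `stub_model` is a hypothetical stand-in for a real chat model (it is not part of any library, and the example name is made up). It only sees the messages passed in on that call, so its answer depends on whether the history is resent.

```python
from typing import Dict, List

Message = Dict[str, str]

PREFIX = "My name is "


def stub_model(messages: List[Message]) -> str:
    """Hypothetical stand-in for a chat model: it can only use what is in `messages`."""
    for m in messages:
        if m["role"] == "user" and m["content"].startswith(PREFIX):
            return f"Your name is {m['content'][len(PREFIX):]}."
    return "I don't know your name."


history: List[Message] = [
    {"role": "user", "content": "My name is Ada"},
    {"role": "assistant", "content": "Nice to meet you!"},
    {"role": "user", "content": "What is my name?"},
]

# Sending only the latest message: the earlier turn is invisible to the model.
without_history = stub_model(history[-1:])
# Resending the whole list: the earlier turn is available again.
with_history = stub_model(history)

print(without_history)
print(with_history)
```

This is why the caller, not the model, has to own the message list.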

Approach I have followed to implement this:

*   To store the history, I use Python dictionaries, one per message, collected in a list.
*   A helper function adds messages to 'history'.
*   At later stages, this 'history' can be sent to the LLM.
*   This layer cleanly separates conversational memory from the rest of the logic and is the foundation for the whole assignment.









In [81]:
import json
from typing import List, Dict, Optional, Union
from copy import deepcopy
from groq import Groq
class ConversationManager:
  def __init__(self):
    self.history : List[Dict[str,str]]= []
    self.run_count=0 # Will be used in later stages to perform summarization after every k-th run
    self.client=Groq(api_key=api_key)

  def add_message(self,role:str,content:str):
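    # Raise instead of assert so the check still runs under `python -O`
    if role not in ("user","assistant","system"):
      raise ValueError(f"Invalid role: {role!r}")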
    self.history.append({'role':role,'content':content})

  def get_history(self)-> List[Dict[str,str]]:
    return deepcopy(self.history)

  # code for summarization
  def summarize_text(self,text:str)->str:
    response=self.client.chat.completions.create(
        model="llama-3.1-8b-instant",
        messages=[
            {"role":"system","content": "You are an assistant whose goal is to summarize conversations. Keep the contextual meaning intact but compress the conversation. Make sure the summary clearly depicts the conversation so far without missing any major points. Don't ask any follow-up questions. Just take the conversation and summarize it."},
            {"role":"user","content":f"summarize the following conversation: \n\n {text}"}
        ]
    )
    return response.choices[0].message.content.strip()

  def summarize_history(self,max_chars=750):
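    # Flatten the history into "role: content" lines for the summarizer
    combined = '\n'.join([f"{m['role']}: {m['content']}" for m in self.history])
    # Cap the summarizer input; this slice keeps the first max_chars characters (the oldest part)
    combined=combined[:max_chars]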
    summary=self.summarize_text(combined)
    self.history = [{'role': 'system', 'content': f"[SUMMARY]\n{summary}"}]
    return summary

  def periodic_summary(self,k:int):
    self.run_count+=1
    if k>0 and (self.run_count%k==0):
      return self.summarize_history()
    return None

  def truncate_by_turns(self,n:int)-> List[Dict[str,str]]:
    if n<=0:
      return []
    return deepcopy(self.history[-n:])

  def truncate_by_chars(self,max_chars:int)-> List[Dict[str,str]]:
    if max_chars<=0:
      return []
    kept=[]
    total=0
    for msg in reversed(self.history):
      l=len(msg['content'])
      if total+l<=max_chars:
        kept.insert(0,msg)
        total+=l
      else:
        remaining=max_chars-total
        if remaining>0:
          kept.insert(0,{
              "role":msg["role"],
              "content":msg["content"][-remaining:]
          })
        break
    # kept is already in chronological order because each message was inserted at index 0
    return deepcopy(kept)

  def truncate_by_words(self,max_words:int)-> List[Dict[str,str]]:
    if max_words<=0:
      return []
    kept=[]
    total=0
    for msg in reversed(self.history):
      words = msg["content"].split()
      w=len(words)
      if total+w<=max_words:
        kept.insert(0,msg )
        total+=w
      else:
        remaining=max_words-total
        if remaining>0:
          kept.insert(0,{
              "role":msg["role"],
              "content":" ".join(words[-remaining:])
          })
        break
    # kept is already in chronological order because each message was inserted at index 0
    return deepcopy(kept)





### Summarization
If we keep appending to the conversation forever, the history grows without bound and takes a toll on memory. Additionally, API calls become slower and more expensive because the whole history is sent as input tokens on every request. Summarization is therefore required.
Summarization compresses the history while keeping its contextual meaning intact.
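
As a rough illustration of that cost growth, the sketch below counts the characters sent when the full history is resent on every turn. It makes two simplifying assumptions: every message has the same length, and characters are a proxy for tokens. The function name `total_chars_sent` is hypothetical.

```python
def total_chars_sent(num_turns: int, chars_per_message: int) -> int:
    """Total characters sent over all requests when the full history is resent each turn."""
    total = 0
    history_len = 0  # characters currently stored in the history
    for _ in range(num_turns):
        history_len += chars_per_message  # the new user message is appended
        total += history_len              # the request carries the whole history
        history_len += chars_per_message  # the assistant reply is stored as well
    return total


for turns in (10, 100):
    print(turns, "turns:", total_chars_sent(turns, 100), "characters sent")
```

Under these assumptions the total grows with the square of the number of turns, which is what periodic summarization is meant to keep in check.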

Approach:


*   Keep track of `run_count` (already added in the code above).
*   The user provides a parameter `k` that controls how often summarization runs: whenever `run_count % k == 0`, the history is summarized.
*   Summarization uses Groq.
*   Once the summary is ready, it replaces the older messages in the history.
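
The steps above can be sketched without any network access. `fake_summarize` and `PeriodicSummarizer` are hypothetical names used only for illustration; in the notebook the summary comes from the Groq call, not from a string slice.

```python
from typing import Callable, Dict, List, Optional

Message = Dict[str, str]


def fake_summarize(text: str) -> str:
    """Hypothetical stand-in for the LLM call: keeps the first 40 characters."""
    return text[:40]


class PeriodicSummarizer:
    def __init__(self, k: int, summarize: Callable[[str], str] = fake_summarize):
        self.k = k
        self.summarize = summarize
        self.history: List[Message] = []
        self.run_count = 0

    def add_message(self, role: str, content: str) -> None:
        self.history.append({"role": role, "content": content})

    def end_run(self) -> Optional[str]:
        """Count one run; on every k-th run, replace the history with a summary."""
        self.run_count += 1
        if self.k > 0 and self.run_count % self.k == 0:
            combined = "\n".join(f"{m['role']}: {m['content']}" for m in self.history)
            summary = self.summarize(combined)
            self.history = [{"role": "system", "content": f"[SUMMARY]\n{summary}"}]
            return summary
        return None


ps = PeriodicSummarizer(k=2)
triggered = []
questions = ["Hi", "What is an LLM?", "How does it work?", "Name a few"]
for run, question in enumerate(questions, start=1):
    ps.add_message("user", question)
    ps.add_message("assistant", f"reply {run}")
    triggered.append(ps.end_run() is not None)

print(triggered)        # which runs triggered a summary
print(len(ps.history))  # messages left after the last summary
```

Passing the summarizer in as a callable keeps the counting logic testable without an API key.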





### Truncation
Truncation means dropping part of the history: whole older messages, or the leading part of a message that does not fit within a limit. Even after summarization, we sometimes need to truncate the history to improve efficiency further.

Approach:
I use three separate functions, one per truncation type: truncation by characters, truncation by words, and keeping only the last 'n' messages.
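
A minimal standalone sketch of the three truncation types follows. The function names (`last_n_messages`, `last_chars`, `last_words`, `_keep_tail`) are hypothetical and work on a plain list rather than on the class. All three keep the newest content and return it in chronological order.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


def last_n_messages(history: List[Message], n: int) -> List[Message]:
    """Keep only the n most recent messages."""
    return [dict(m) for m in history[-n:]] if n > 0 else []


def _keep_tail(history: List[Message], budget: int,
               measure: Callable[[str], int],
               cut: Callable[[str, int], str]) -> List[Message]:
    """Keep the newest messages that fit in `budget`, cutting the one that overflows."""
    if budget <= 0:
        return []
    kept: List[Message] = []
    used = 0
    for msg in reversed(history):  # walk from newest to oldest
        size = measure(msg["content"])
        if used + size <= budget:
            kept.append(dict(msg))
            used += size
        else:
            remaining = budget - used
            if remaining > 0:  # keep only the tail of the overflowing message
                kept.append({"role": msg["role"],
                             "content": cut(msg["content"], remaining)})
            break
    kept.reverse()  # restore chronological order
    return kept


def last_chars(history: List[Message], max_chars: int) -> List[Message]:
    return _keep_tail(history, max_chars, len, lambda text, r: text[-r:])


def last_words(history: List[Message], max_words: int) -> List[Message]:
    return _keep_tail(history, max_words,
                      lambda text: len(text.split()),
                      lambda text, r: " ".join(text.split()[-r:]))


history = [
    {"role": "user", "content": "Hello there"},
    {"role": "assistant", "content": "Hi, how can I help?"},
    {"role": "user", "content": "Explain LLMs"},
]
print(last_n_messages(history, 2))
print(last_chars(history, 19))
print(last_words(history, 4))
```

Note that the character and word limits here count only message content, not role names or separators.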

In [84]:
# Testing cell :
cm = ConversationManager()
messages=["Hello How are you",
          "I want to learn what is LLM",
          "How LLM works",
          "What are some popular LLMs"
]

for i, msg in enumerate(messages, start=1):
  cm.add_message("user", msg)
  response = cm.client.chat.completions.create(
        model="llama-3.1-8b-instant",
        messages=cm.get_history()
    )
  assistant_reply = response.choices[0].message.content.strip()
  cm.add_message("assistant", assistant_reply)
  # print(f"\n--- Turn {i} ---")
  # print("User:", msg)
  # print("Assistant:", assistant_reply)
  # summary = cm.periodic_summary(k=2)
  # if summary:
    # print("\n>>> [SUMMARY TRIGGERED]")
    # print(summary)
    # print("-------------------------")

#print('Full history ({} messages):'.format(len(cm.get_history())))
# for m in cm.get_history():
    # print(m['role'], ':', m['content'])

#print("Full history:", cm.history)
print("Truncated by last 30 chars:", cm.truncate_by_chars(30))
print("Truncated by 10 words:", cm.truncate_by_words(10))

Truncated by last 30 chars: [{'role': 'assistant', 'content': 'eloped and released regularly.'}]
Truncated by 10 words: [{'role': 'assistant', 'content': 'list, and new LLMs are being developed and released regularly.'}]
