In [28]:
import torch
from huggingface_hub import login
from transformers import (
    RobertaTokenizerFast,
    RobertaForSequenceClassification,
    TrainingArguments,
    Trainer,
    AutoConfig,
    pipeline
)
from langchain.embeddings import SentenceTransformerEmbeddings
from langchain.vectorstores import Chroma
from langchain_groq import ChatGroq
from langchain_core.prompts import ChatPromptTemplate
import groq
from dotenv import load_dotenv
import os

In [4]:
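load_dotenv()  # load variables from a local .env file
# Assumes HF_TOKEN is defined in .env; never hard-code access tokens in a notebook.
login(token=os.getenv("HF_TOKEN"))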

The token has not been saved to the git credentials helper. Pass `add_to_git_credential=True` in this function directly or `--add-to-git-credential` if using via `huggingface-cli` if you want to set the git credential as well.
Token is valid (permission: fineGrained).
Your token has been saved to C:\Users\pulki\.cache\huggingface\token
Login successful


In [22]:
model_id = "pulkitgautam/email-classifier"
persist_directory = "../data/document_embeddings"

In [8]:
id2label = {0:"student", 1:"corporate", 2:"researcher"}
config = AutoConfig.from_pretrained(model_id)
config.update({"id2label": id2label})
model = RobertaForSequenceClassification.from_pretrained(model_id, config=config)
tokenizer = RobertaTokenizerFast.from_pretrained(model_id)

To support symlinks on Windows, you either need to activate Developer Mode or to run Python as an administrator. In order to activate developer mode, see this article: https://docs.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development
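
The cell above overrides only `id2label`. A minimal sketch of deriving the inverse `label2id` mapping so the two stay consistent; passing it to `config.update` is an assumed extension of the cell, not something the original notebook does.

```python
# Same mapping as in the cell above.
id2label = {0: "student", 1: "corporate", 2: "researcher"}

# Invert it so label names map back to class indices.
label2id = {label: idx for idx, label in id2label.items()}

# The inversion is lossless only if every label name is unique.
assert len(label2id) == len(id2label), "duplicate label names in id2label"

print(label2id)

# Assumed extension of the cell above (not in the original notebook):
# config.update({"id2label": id2label, "label2id": label2id})
```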


In [43]:
text = """
I hope this message finds you well. I am writing to inform you that I am currently unwell and, due to my condition, I will not be able to attend the upcoming Maths scheduled for 13th October.

I sincerely apologize for any inconvenience this may cause. I would appreciate your guidance on the next steps, including any possibility of rescheduling the exam or alternative arrangements I could make.

Thank you for your understanding and support.
""" 

In [20]:
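# Build the pipeline once instead of on every call, and fall back to CPU when CUDA is unavailable.
device = "cuda" if torch.cuda.is_available() else "cpu"
pipe = pipeline("text-classification", model=model, tokenizer=tokenizer, device=device)

def getLabelRoBERTA(text):
    # truncation=True guards against emails longer than the model's maximum input length.
    result = pipe(text, truncation=True)
    return result[0]["label"]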

In [21]:
getLabelRoBERTA(text)

'researcher'
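
The sample email reads like a student request, yet the classifier returns 'researcher'. One way to guard against acting on a weak prediction is to look at the scores for every class and fall back when the best one is low. The sketch below assumes scores shaped like the list of `{"label", "score"}` dicts that a text-classification pipeline returns when called with `top_k=None` (depending on the transformers version, that list may be nested one level deeper). The helper name `pick_label`, the 0.6 threshold, and the "uncertain" fallback are hypothetical, and the example scores are made up rather than taken from the model.

```python
from typing import Any, Dict, List

def pick_label(scores: List[Dict[str, Any]],
               threshold: float = 0.6,
               fallback: str = "uncertain") -> str:
    """Return the top-scoring label, or `fallback` if its score is below `threshold`."""
    if not scores:
        return fallback
    best = max(scores, key=lambda s: s["score"])
    return best["label"] if best["score"] >= threshold else fallback

# Made-up scores for illustration only; not output from the model above.
example_scores = [
    {"label": "student", "score": 0.41},
    {"label": "corporate", "score": 0.04},
    {"label": "researcher", "score": 0.55},
]

print(pick_label(example_scores))                 # top score 0.55 is below 0.6
print(pick_label(example_scores, threshold=0.5))  # top score 0.55 clears 0.5
```

An "uncertain" result could then be routed to manual review rather than to an automatic reply.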

In [23]:
# Use the GPU when available; otherwise fall back to CPU.
model_kwargs = {'device': 'cuda' if torch.cuda.is_available() else 'cpu'}
embedding_function = SentenceTransformerEmbeddings(model_name="all-MiniLM-L6-v2", model_kwargs=model_kwargs)

vectordb = Chroma(persist_directory=persist_directory, embedding_function=embedding_function)

In [35]:
def getRelevantDocs(text):
    # Retrieve the two chunks most similar to the email and number them for the prompt.
    resources = vectordb.similarity_search(text, k=2)
    return "".join(
        f"Document No: {i}\n{resource.page_content}\n"
        for i, resource in enumerate(resources)
    )


In [36]:
print(getRelevantDocs(text))

Document No: 0
Humanities and Social Sciences Courses of Study  2022-2023272theory, theories of perceptual knowledge, theories of error; theories 
of causation and other relations, and key concepts of moral and 
aesthetic thought. Wherever appropriate, problems will be discussed 
in comparison with parallel discussions in western philosophy
HUL353 Philosophical Themes in Biological Sciences
3 Credits (3-0-0) 
Pre-requisite(s): Any Two courses from HUL2XX category
Allocation Preferences : HUL251, HUL258, HUL253, HUL256
This course addresses various philosophical questions that arise 
from the recent developments in evolutionary biology, genetics, 
immunology, sociobiology, molecular biology and synthetic biology.  
How do these developments affect our ideas about life, evolution 
and the place of man in relation to other living beings. What is the 
nature of explanation in biological sciences? Does the idea of immunity 
demand rethinking on the nature of our embodied self? What can 
bio
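
Document No: 0 in the output above is a course description that has little to do with the email, which suggests filtering weak matches before they reach the prompt. The sketch below assumes the hits come as `(text, distance)` pairs, for example built from Chroma's `similarity_search_with_score`, where a lower score means a closer match. The helper name `format_docs`, the cut-off of 1.0, and the sample hits are hypothetical; a sensible cut-off depends on the embedding model and distance metric.

```python
from typing import List, Tuple

def format_docs(hits: List[Tuple[str, float]], max_distance: float = 1.0) -> str:
    """Format (text, distance) pairs, keeping only hits at or below `max_distance`."""
    kept = [text for text, distance in hits if distance <= max_distance]
    return "".join(f"Document No: {i}\n{text}\n" for i, text in enumerate(kept))

# Made-up hits for illustration only; not output from the vector store above.
hits = [("first chunk", 0.8), ("second chunk", 1.4)]

print(format_docs(hits))

# Assumed wiring into the notebook (not in the original):
# pairs = vectordb.similarity_search_with_score(text, k=2)
# docs = format_docs([(doc.page_content, score) for doc, score in pairs])
```

An empty result string pairs naturally with the prompt's "Not enough info" rule.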

In [37]:
load_dotenv()
GROQ_API_KEY = os.getenv("GROQ_API_KEY")
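
`os.getenv` returns `None` when the variable is missing, and the error then surfaces only when the client is created or called. A small sketch of failing fast instead; the helper name `require_env` is hypothetical.

```python
import os

def require_env(name: str) -> str:
    """Return the value of environment variable `name`, or raise if it is unset or empty."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"{name} is not set; add it to your .env file or shell environment")
    return value

# Hypothetical replacement for the os.getenv call above:
# GROQ_API_KEY = require_env("GROQ_API_KEY")
```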

In [38]:
llm = ChatGroq(temperature=0.1, model="llama-3.1-70b-versatile", groq_api_key=GROQ_API_KEY)

In [39]:
system = """
You are an advanced automated email reply tool designed to assist the Head of Department (HOD) at a university. Your role is to generate professional and concise email replies based on predefined categories and the guidelines below.

Guidelines:
1. **Professionalism**: Maintain a formal tone in all responses. Keep replies clear, concise, and relevant to the subject of the email.
2. **Sensitive or Confidential Data**: 
   - If the email contains any sensitive or confidential information (such as legal matters or confidential partnerships), simply respond with: "Forwarding to HOD" and nothing more.
3. **Corporate Emails**: 
   - For emails categorized as 'Corporate', respond with: "Forwarded to HOD for review" and nothing more.
4. **Other Categories**: 
   - For non-sensitive emails (other than 'Corporate'), draft a reply relevant to the content of the email and its category. Ensure it directly addresses the inquiry or request.
5. **Document Usage**: 
   - If the email asks for specific data or details, check the provided documents for reference. Do not invent or guess any information. Use only the data found in the documents.
6. **General Queries (Students)**: 
   - If the email is categorized as 'Students', you may respond on behalf of the HOD. Ensure the response answers their query appropriately.
7. **Insufficient Information**: 
   - If there isn’t enough information to formulate a response, reply with: "Not enough info, will get back to you."
   
Your task is to return only the body of the reply, with no additional text or comments."""

human = """
Email: {email}
Category: {category}
Useful Documents: 
{documents}

Instructions:
- Emails containing sensitive or confidential information (e.g., legal matters, private partnerships) should always be escalated by responding with "Forwarding to HOD."
- For emails categorized as 'Corporate', simply respond with "Forwarded to HOD for review."
- For emails from 'Researchers', first check the provided documents for relevant information. If you can find the required details, draft an appropriate reply. If not, respond with "Will get back to you."
- For 'Students' queries or general inquiries, draft a full response on behalf of the HOD. 
- If insufficient data is available, use the response: "Not enough info, will get back to you."
- Ensure the reply is professional, to the point, and based on the given information.
- Only return the email body as your response, nothing else."""


prompt = ChatPromptTemplate.from_messages([("system", system), ("human", human)])

chain = prompt | llm
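
The prompt gives corporate emails a fixed reply, so that rule can be enforced in code before the model is called at all. The sketch below also lowercases the category, since the classifier emits lowercase labels ("corporate") while the prompt uses capitalised names ('Corporate'). The names `route_email`, `CORPORATE_REPLY`, and `fake_generate` are hypothetical; in the notebook, `generate` could wrap `chain.invoke`.

```python
from typing import Callable

# Fixed reply taken from the prompt's rule for corporate emails.
CORPORATE_REPLY = "Forwarded to HOD for review"

def route_email(email: str, category: str, generate: Callable[[str, str], str]) -> str:
    """Return the fixed reply for corporate emails; otherwise delegate to `generate`."""
    if category.strip().lower() == "corporate":
        return CORPORATE_REPLY
    return generate(email, category)

# Stand-in for the LLM call, for illustration only.
def fake_generate(email: str, category: str) -> str:
    return f"[LLM reply for {category} email]"

print(route_email("Partnership proposal ...", "corporate", fake_generate))
print(route_email("I will miss the exam ...", "student", fake_generate))
```

Handling the fixed case in code keeps that reply deterministic and avoids an API call for it.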

In [44]:
answer = chain.invoke({"email":text, "category":getLabelRoBERTA(text), "documents":getRelevantDocs(text)})
print(answer.content)

Dear [Student],

Thank you for reaching out to us regarding your absence from the upcoming Maths exam scheduled for 13th October due to health reasons. We appreciate your prompt notification and apologize for any inconvenience this may cause.

As per our institution's policies, you can apply for a medical certificate or Dean's permission to request an 'I' grade. To initiate this process, please submit an online application along with a medical certificate. The Academic Section will verify the certificate and forward the request to the concerned course coordinator. The course coordinator will then verify the attendance requirement and forward the application to the Head of the Department for approval.

If approved, you will be awarded an 'I' grade, and you will need to complete all evaluation requirements before the end of the first week of the next semester. Upon completion of all course requirements, the 'I' grade will be converted to a regular grade.

Please note that you should appl