<a href="https://colab.research.google.com/github/heruit777/Aavaaz_Hackthon/blob/main/Aavaaz_Hackthon_Project.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Follow these **steps** to use this notebook.
1. Go to [Deepgram](https://deepgram.com), create an account, and get an API key.
2. Go to [Groq's Console](https://console.groq.com/login), create an account, and get an API key.
3. Create Secrets (the key icon in the left sidebar) in this notebook, using exactly these variable names:
GROQ_API_KEY, DEEPGRAM_API_KEY
4. Upload an .mp3 file to this notebook. You can use a website that downloads the audio of a YouTube video as mp3. The audio must be in English.

**Note**:
Use YouTube videos that are shorter than 1 hour.

Now you are ready to use this notebook!

In [22]:
# Wrap long lines in cell output instead of scrolling horizontally.
from IPython.display import HTML, display

def set_css():
  display(HTML('''
  <style>
    pre {
        white-space: pre-wrap;
    }
  </style>
  '''))
get_ipython().events.register('pre_run_cell', set_css)

In [23]:
audio_file = "sample2.mp3"  # name of the uploaded audio file; the audio must be in English
language = "Hindi"  # language of the JSON output; any language can be given
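
Before sending anything to Deepgram, it can help to confirm that the uploaded file is where the notebook expects it. The following is a minimal sketch, not part of the original notebook: `check_audio_file` is a hypothetical helper that only checks the extension and that the file exists. It does not check the audio length or language.

```python
from pathlib import Path


def check_audio_file(name):
    """Return the file size in bytes, or raise if the file looks unusable.

    Hypothetical helper: it checks the extension and existence only.
    """
    path = Path(name)
    if path.suffix.lower() != ".mp3":
        raise ValueError(f"Expected an .mp3 file, got: {path.name}")
    if not path.is_file():
        raise FileNotFoundError(f"Upload {path.name} to the notebook first.")
    return path.stat().st_size


# Example usage with the file name from the cell above.
try:
    size = check_audio_file("sample2.mp3")
    print(f"Found sample2.mp3 ({size} bytes)")
except (ValueError, FileNotFoundError) as err:
    print(err)
```

Running a check like this first gives a clearer message than the generic error handler in the transcription cell.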

In [24]:
!pip install groq deepgram-sdk



In [25]:
# importing Secrets to use
from google.colab import files, userdata
GROQ_API_KEY = userdata.get('GROQ_API_KEY')
DEEPGRAM_API_KEY = userdata.get('DEEPGRAM_API_KEY')

In [26]:
# send an audio file to deepgram and get the response.
from deepgram import (
    DeepgramClient,
    PrerecordedOptions,
)
import json


try:
    deepgram = DeepgramClient(api_key=DEEPGRAM_API_KEY)
    with open(audio_file, "rb") as file:
        buffer_data = file.read()

    payload = {
        "buffer": buffer_data,
    }

    options = PrerecordedOptions(
        model="nova-2",
        smart_format=True,
        diarize=True,
        filler_words=True,
    )

    response = deepgram.listen.rest.v("1").transcribe_file(payload, options)
    response = json.loads(response.to_json(indent=4))

    # to store the response in the file
    # with open("sample2.json", "w") as file:
    #     json.dump(response, file, indent=4)

    # print(response)
    duration = response["metadata"]["duration"] # In seconds
    transcripted_data =  response["results"]["channels"][0]["alternatives"][0]["paragraphs"]["transcript"]
    list_of_paras = response["results"]["channels"][0]["alternatives"][0]["paragraphs"]["paragraphs"]
    # print(duration)
    # print(duration, transcripted_data, list_of_paras)
except Exception as e:
    # Print the exception itself; e.args can be empty, so e.args[0] may fail.
    print(e, '\nPlease run this cell again.')
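
The cell above stores `list_of_paras` but the rest of the notebook only uses the flat transcript. As a rough sketch of what the diarized paragraphs could be used for, the helper below totals speaking time per speaker. It assumes each paragraph is a dict with `speaker`, `start`, and `end` keys (times in seconds); that layout is an assumption and should be checked against the actual response. `talk_time_by_speaker` is a hypothetical name, and the sample data is made up.

```python
from collections import defaultdict


def talk_time_by_speaker(paragraphs):
    """Sum paragraph durations (end - start, in seconds) per speaker id."""
    totals = defaultdict(float)
    for para in paragraphs:
        # Assumed keys: "speaker", "start", "end".
        totals[para["speaker"]] += para["end"] - para["start"]
    return dict(totals)


# Made-up sample in the assumed shape; replace with list_of_paras.
sample_paras = [
    {"speaker": 0, "start": 0.0, "end": 4.5},
    {"speaker": 1, "start": 4.5, "end": 10.0},
    {"speaker": 0, "start": 10.0, "end": 12.0},
]
talk_time = talk_time_by_speaker(sample_paras)
print(talk_time)
```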

In [27]:
# System prompt to extract summary and key points based on audio type.
system_prompt_1 = '''
You are an advanced conversational assistant. The user will provide text content containing paragraphs spoken by different speakers.

Your tasks are:
1. Content Analysis: Analyze the content to determine its type(Meeting, Interview or Debate) and extract key details.
2. Response Generation: Provide a structured JSON response.


JSON schema for Meeting
{
  "type": "string (Meeting, Interview or Debate) (required)",
  "summary": "string (A detailed and long summary covering all the important aspects of the content) (required)",
  "key_takeaways": "array of strings (List of all the important decision/takeaways/events/action Items given in the content) (optional, use [] to represent optional)",
  "meeting_tone": "string explaining the tone of the meeting in one line and use emojis. (required)",
  "keywords": "array of strings (List all the important keywords/technical terms given in the content) (required)"
}
JSON schema for Interview
{
  "type": "string (Meeting, Interview or Debate) (required)",
  "candidate_name": "string (Name of the candidate) (required)",
  "interview_type": "string (identify the type of interview) (required)",
  "summary": "string (A detailed and long summary covering all the important aspects of the content) (required)",
  "questions": [
    {"question": "string (required)", "answer": "string (answer given by the candidate) (required)", "AI_answer": "string (give a better/alternative answer the candidate could have given based on the interview type.)"}
  ],
  "confidence_level": "string (judge the confidence level of the candidate based on the responses and categorize in (low,average or high)) (required)",
  "learnings": "string (Give some improvement suggestions to the candidate based on the responses of the candidate) (required)"
}

JSON schema for Debate
{
  "type": "string (Meeting, Interview or Debate) (required)",
  "topic_of_debate": "string (extract the topic of the debate) (required)",
  "summary": "string (A detailed and long summary covering all the important aspects of the content) (required)",
  "key_arguments": "array of strings (List of important arguments) (required)",
  "debate_tone": "string (explain the tone/emotion of the debate and use emojis.)",
  "keywords": "array of strings (List all the important keywords/technical terms given in the content) (required)"
}
'''

# System prompt for translation
system_prompt_2 = '''
You are an expert at translating array data into the user-specified language. Given an array of strings, your task is to translate each string in the array to the target language.
The output must be in a JSON format with the following structure:

{
  "text": [
    "translated_string_1",
    "translated_string_2",
    ...
  ]
}

Rules to follow:
- Ensure each string is translated accurately to the target language.
- The translations should be contextually appropriate and natural.
- Maintain the order of strings in the array.
- The output should only include the translated text in the array, without any additional explanation or changes to the structure.
'''
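
The first prompt marks several fields as required, but nothing in the notebook checks that the model actually returned them. Below is a small sketch of such a check. `REQUIRED_KEYS` and `missing_keys` are hypothetical names, and the key sets are copied from the fields tagged "(required)" in the prompt above; fields without that tag, such as `questions`, are left out on purpose.

```python
# Fields tagged "(required)" in system_prompt_1, grouped by content type.
REQUIRED_KEYS = {
    "Meeting": {"type", "summary", "meeting_tone", "keywords"},
    "Interview": {
        "type", "candidate_name", "interview_type",
        "summary", "confidence_level", "learnings",
    },
    "Debate": {"type", "topic_of_debate", "summary", "key_arguments", "keywords"},
}


def missing_keys(data):
    """Return the sorted list of required keys absent from the model's JSON."""
    kind = data.get("type")
    if kind not in REQUIRED_KEYS:
        raise ValueError(f"Unexpected type: {kind!r}")
    return sorted(REQUIRED_KEYS[kind] - set(data))


# Example: a Debate response that lacks most of its required fields.
print(missing_keys({"type": "Debate", "summary": "..."}))
```

A non-empty result could be used as a reason to rerun the summarisation cell before translating.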

In [28]:
# Use Groq's Llama models

from groq import Groq

try:
    client = Groq(api_key=GROQ_API_KEY)

    res1 = client.chat.completions.create(
        messages=[
            {
                "role": "system",
                "content": system_prompt_1,
            },
            {
                "role": "user",
                "content": f"Data: {transcripted_data}",
            },
        ],
        model="llama-3.1-8b-instant",
        response_format={"type": "json_object"},
        temperature=0.1
    )

    text = res1.choices[0].message.content
    json_data = json.loads(text)
    print(text)
except Exception as e:
    # Print the exception itself; e.args can be empty, so e.args[0] may fail.
    print(e, '\nPlease run this cell again.')

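# Extract all string values from the JSON in traversal order.
def extract_strings(obj, strings=None):
    # Create a fresh list per top-level call; a mutable default ([]) would be
    # shared across calls and keep accumulating strings on every rerun.
    if strings is None:
        strings = []
    if isinstance(obj, str):
        strings.append(obj)
    elif isinstance(obj, dict):
        for value in obj.values():
            extract_strings(value, strings)
    elif isinstance(obj, list):
        for item in obj:
            extract_strings(item, strings)
    return strings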

all_strings = extract_strings(json_data)

{
   "type": "Interview",
   "candidate_name": "Akshat Jain",
   "interview_type": "Civil Services Interview",
   "summary": "The interview was conducted to assess Akshat Jain's suitability for the Indian Administrative Service. The panelists asked him a range of questions on his background, interests, and experiences, as well as his views on various social and economic issues. Akshat demonstrated a good understanding of the topics and showed enthusiasm and confidence throughout the interview.",
   "questions": [
      {
         "question": "What is your opinion on the perception that civil servants are elitist?",
         "answer": "I think it can be attributed to the colonial legacy of our Indian bureaucracy and the perception that bureaucrats serve the interests of the elite sections and themselves.",
         "AI_answer": "This is a valid point, but it's also important to note that many bureaucrats have done positive work for society. It's not a blanket statement, and there are ma

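The translation step depends on the order in which strings are pulled out of the JSON, so it is worth seeing that order on a small input. The sketch below uses a hypothetical `flatten_strings` helper, written for illustration, that follows the same traversal idea: dictionary values in insertion order, list items in sequence, and non-string values skipped. Keys are never collected, only values.

```python
def flatten_strings(obj):
    """Return every string value in obj, depth-first, in insertion order."""
    if isinstance(obj, str):
        return [obj]
    if isinstance(obj, dict):
        return [s for value in obj.values() for s in flatten_strings(value)]
    if isinstance(obj, list):
        return [s for item in obj for s in flatten_strings(item)]
    return []  # numbers, booleans and None carry no text


sample = {
    "type": "Interview",
    "questions": [{"question": "Q1", "answer": "A1"}],
    "score": 7,
}
flat = flatten_strings(sample)
print(flat)  # ['Interview', 'Q1', 'A1']
```

Note that the value of `type` is part of the list, so it is sent for translation along with everything else.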
In [29]:
# translation to user's given language
try:
    res2 = client.chat.completions.create(
        messages=[
            {
                "role": "system",
                "content": system_prompt_2,
            },
            {
                "role": "user",
                "content": f"Array data: {all_strings} Translate to: {language}",
            },
        ],
        model="llama-3.1-70b-versatile",
        response_format={"type": "json_object"},
        temperature=1
    )

    text = res2.choices[0].message.content
    # print(text)
    translated_json_data = json.loads(text)
    replacement_array = translated_json_data["text"]

    def replace_strings(obj, replacement_array):
        # Walk obj in the same order as extract_strings and swap each string
        # for the next translated value. An iterator replaces the global index.
        replacements = iter(replacement_array)

        def walk(node):
            if isinstance(node, str):
                # Keep the original string if the array runs out.
                return next(replacements, node)
            if isinstance(node, dict):
                return {key: walk(value) for key, value in node.items()}
            if isinstance(node, list):
                return [walk(item) for item in node]
            return node

        # Returns a translated copy; the input object is left unchanged.
        return walk(obj)


    # The model may return fewer or more strings than it was given.
    if len(replacement_array) != len(all_strings):
        raise Exception(f"Translation error. Model returned {len(replacement_array)} strings for {len(all_strings)} inputs")

    # print(all_strings)
    # print(replacement_array)
    final_json_data = replace_strings(json_data, replacement_array)
    print(json.dumps(final_json_data, ensure_ascii=False, indent=4))
except Exception as e:
    print(e, '\nAn error occurred in the translation process. Please run this cell again.')

{
    "type": "साक्षात्कार",
    "candidate_name": "आकाश जैन",
    "interview_type": "सिविल सेवा प्रवेश साक्षात्कार",
    "summary": "साक्षात्कार आकाश जैन की भारतीय प्रशासनिक सेवा के लिए उपयुक्तता का मूल्यांकन करने के लिए आयोजित किया गया था।  पैनल ने उनसे उनकी पृष्ठभूमि, रुचियों और अनुभवों के साथ-साथ विभिन्न सामाजिक और आर्थिक मुद्दों पर उनके विचारों के बारे में विभिन्न प्रश्न पूछे थे। आकाश ने विषयों की अच्छी समझ दिखाई और साक्षात्कार में पूरी तरह से उत्साह और आत्मविश्वास दिखाया।",
    "questions": [
        {
            "question": "आपको लगता है कि हमारे सिविल सेवक अमीर लोगों के इशारे में क्यों चलते हैं?",
            "answer": "मुझे लगता है कि जो image हमारी इंडियन ब्यूरोक्रेसी में है, उसकी वजह है colonial युग का हमारा प्रशासनिक setup, जिसमें इतना फायदा उठाया गया, bureaucratic system का और यह धारणा की जाती रही की कुछ लोग अपने विचारों को खुद फैसले करने वाले पायदान पर रखते हैं।",
            "AI_answer": "यह एक वैध बिंदु है, लेकिन यह भी महत्वपूर्ण है कि कुछ अधिकारियों ने समाज के लिए सका