# Sentiment Analysis using Deepgram and Hugging Face

In this notebook, we will explore how to analyze the sentiment of any audio or video clip using the Deepgram API and the siebert/sentiment-roberta-large-english model, which is available on Hugging Face. <br> <br>
Link to the model: https://huggingface.co/siebert/sentiment-roberta-large-english?text=How+are+you+doing+today%3F

For the purpose of this tutorial we will be using the following packages:<br>

Flask: Flask is used as the backend framework to create a web application and API to receive audio data and respond with sentiment analysis results.<br>

Asyncio: Asyncio is utilized to manage asynchronous requests, allowing for efficient handling of multiple requests without blocking the main thread.<br>

Dotenv: The dotenv library is used to load environment variables from a .env file, which is commonly used to store configuration settings for the application.<br>

Flask-CORS: Flask-CORS simplifies handling Cross-Origin Resource Sharing (CORS) issues, enabling the application to respond to cross-origin requests from other domains.<br>

Transformers: The Transformers library from Hugging Face is utilized for natural language processing (NLP) tasks, such as sentiment analysis. The "pipeline" function from this library is used to build a sentiment analysis model.<br>

NumPy: NumPy is utilized for numerical computations and array operations. It is used in this code to define weights for each sentiment category and calculate compound scores.<br>

Pandas: Pandas is used for data manipulation and analysis. It is used to process and store the sentiment analysis results in a DataFrame format.<br>

Deepgram: The Deepgram SDK is used to perform speech-to-text transcription. The script sends audio files or URLs to Deepgram and receives transcriptions in response.



### Download the required packages

In [4]:
!pip install -r requirements.txt



### Importing the required packages

In [18]:
# For computation & Machine Learning
import numpy as np
import pandas as pd
from scipy.special import softmax
from scipy.ndimage import gaussian_filter1d
from collections import deque
import torch
from transformers import pipeline


# For plotting the results
import matplotlib.pyplot as plt

# For transcription
from deepgram import Deepgram

# For miscellaneous functionality
import os
from dotenv import load_dotenv
import datetime

### Loading the environment variables

We will store the DEEPGRAM_API_KEY in a .env file on our computer. For this step, copy the .env.sample file in the repository to .env and replace the placeholder value with your Deepgram API key. If you do not have a key yet, skip ahead to the Transcribing Audio section, which explains how to get one, then come back and run this cell!

In [19]:
load_dotenv()

True

In [20]:
DEEPGRAM_API_KEY = os.getenv("DEEPGRAM_API_KEY")
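
If the variable is missing, `os.getenv` returns `None` without complaint, and the problem only surfaces later when the transcription request fails. A minimal sketch of an early check is shown here; `require_env` is a hypothetical helper name used for illustration and is not part of dotenv or the Deepgram SDK.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or fail with a clear message.

    Hypothetical helper: not part of python-dotenv or the Deepgram SDK.
    """
    value = os.getenv(name)
    if not value:
        # Covers both an unset variable and an empty string in the .env file
        raise RuntimeError(f"{name} is not set; add it to your .env file")
    return value
```

After `load_dotenv()` has run, calling `require_env("DEEPGRAM_API_KEY")` in place of `os.getenv("DEEPGRAM_API_KEY")` would stop the notebook at this cell rather than at the first API call.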

### Processing Audio

For this tutorial, you can use any podcast or YouTube video of your choice. Although Deepgram supports some video formats, it is advisable to download files in an audio-only format, as they are usually smaller and faster to send to the server.

We will save all our audio files in the ./data folder inside the parent directory. You can also use Python packages like ytdl, which let you download YouTube videos from within this notebook itself.

Note: Deepgram can also transcribe audio clips hosted in the cloud directly from their URL, without downloading them first.
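
As a rough sketch of the hosted-audio route: the Deepgram Python SDK imported in this notebook accepts a source dictionary with a `url` key in place of a file buffer. The URL below is a placeholder and `url_source` is a hypothetical helper name; both are assumptions for illustration only.

```python
# Placeholder URL (assumption): replace with a publicly reachable audio file
AUDIO_URL = "https://example.com/audio/interview.mp3"


def url_source(url: str) -> dict:
    """Build a source dict for remotely hosted audio (hypothetical helper)."""
    if not url.startswith(("http://", "https://")):
        raise ValueError("expected an http(s) URL")
    return {"url": url}


source = url_source(AUDIO_URL)

# With a client created as `dg = Deepgram(DEEPGRAM_API_KEY)`, the request
# would then look like this in a notebook cell:
# response = await dg.transcription.prerecorded(source, {"paragraphs": True, "diarize": True})
```

The rest of this tutorial uses a local file, so this variant is optional.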

For the purpose of this tutorial, I will be using the Elon Musk BBC interview where he accuses the host of lying. <br>

Interview Link: https://www.youtube.com/watch?v=IflfP4XwzAI&ab_channel=TeslaIntelligenceUK

In [26]:
# Validating that the file is present
os.listdir('./data')

['elon-bbc.mp3']

### Transcribing Audio

Now that we have the audio locally stored, we can start with the first step, which involves converting the audio data to text -- something which our NLP models can understand for sentiment analysis!

The models we are using in this tutorial are built specifically for textual data; they have been trained on a corpus of statements. Therefore, to convert our audio files into text-based transcripts, we will use the <a href="https://www.deepgram.com">Deepgram API</a>.

Deepgram provides an end-to-end deep learning solution that makes it seamless to transcribe audio files. It takes away the hassle of running our own transcription models by providing us with cloud-hosted, high-accuracy models.

To get started with Deepgram, go to https://console.deepgram.com/signup and create a free account. Every new user gets $200 worth of free transcription credits, which is more than enough for our use case. Once you have created an account, head over to the API key section of the Deepgram console and create a new key. We will need it to transcribe the audio.

To interact with Deepgram, we will use the deepgram-python-sdk (https://github.com/deepgram/deepgram-python-sdk), which makes it easy to fetch transcription results without writing the API requests ourselves!

In [30]:
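# Read the audio file and send it to Deepgram for transcription
dg = Deepgram(DEEPGRAM_API_KEY)
with open('./data/elon-bbc.mp3', 'rb') as audio:
    # 'mimetype' should be a full MIME type; 'audio/mpeg' is the registered type for MP3
    source = {'buffer': audio, 'mimetype': 'audio/mpeg'}
    # 'paragraphs' splits the transcript into paragraphs; 'diarize' labels the speakers
    # Top-level await works here because notebook cells run inside an event loop
    response = await dg.transcription.prerecorded(source, {'paragraphs': True, 'diarize': True})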

In [32]:
# Inspecting the response returned from Deepgram!
response

{'metadata': {'transaction_key': 'deprecated',
  'request_id': '340fefcc-343a-490c-a764-34a20fd5303d',
  'sha256': 'c0c48fc685c47b855a522ca68951aec246fef472b13ca8f0616e3acfc771699e',
  'created': '2023-08-04T20:56:23.419Z',
  'duration': 3426.2217,
  'channels': 1,
  'models': ['96a295ec-6336-43d5-b1cb-1e48b5e6d9a4'],
  'model_info': {'96a295ec-6336-43d5-b1cb-1e48b5e6d9a4': {'name': 'general',
    'version': '2023-02-22.3',
    'arch': 'base'}}},
 'results': {'channels': [{'alternatives': [{'transcript': "Why why did you agree to do this? This with the Bbc. I don't know I like a sp eighty. And I know this look there's lot going on. It seems like I actually do have a lot respect with the Bbc. Well, that's else we get what the Bbc stands for, you know, But where when is... To. You know what it stands for. Yes I do. So Yeah. Yeah. So there's there's lot going on. So this might be a good opportunity to answer some questions. And You know, I guess, maybe get some feedback to. What should we

Don't worry too much about the structure of the response. In the next few steps, we will work with it to extract the actual transcripts from the JSON, which will then be passed into a custom sentiment analysis function that we will build.
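
To give a feel for the shape of the data, here is a minimal sketch of pulling text out of such a response. The path to the full transcript (`results` → `channels` → `alternatives` → `transcript`) is visible in the output above. The nested `paragraphs` layout with `sentences`, `speaker`, `start` and `end` keys is an assumption about what the `paragraphs` and `diarize` options return, and `extract_sentences` is a hypothetical helper name.

```python
from typing import Any, Dict, List


def extract_sentences(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Flatten a Deepgram-style response into one row per sentence.

    Hypothetical helper; the paragraph/sentence key names are assumptions.
    """
    # First channel, first (best) alternative, as seen in the response above
    alternative = response["results"]["channels"][0]["alternatives"][0]
    rows = []
    # Assumed layout: alternative["paragraphs"]["paragraphs"] is a list of paragraphs
    for paragraph in alternative.get("paragraphs", {}).get("paragraphs", []):
        for sentence in paragraph.get("sentences", []):
            rows.append({
                "speaker": paragraph.get("speaker"),
                "start": sentence.get("start"),
                "end": sentence.get("end"),
                "text": sentence.get("text"),
            })
    return rows
```

A list of flat dictionaries like this can be passed straight to `pd.DataFrame(rows)`, which makes it convenient to attach a sentiment score to each sentence later.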

### Setting up the sentiment analysis pipeline (Machine Learning stuff)!