# Sentiment Analysis App

## Intro
* Sentiment analysis is a very popular feature: for example, determining whether a product review is positive or negative.
* Our app will do more than that. It will be a text classification app, also called a "tagging" app.
* In short, we will create an app that classifies text into labels. These labels can be:
    * Sentiment: Sentiment Analysis app.
    * Language: Language Analysis app.
    * Style (formal, informal, etc): Style Analysis app.
    * Topics or categories: Topic or category Analysis app.
    * Political tendency: Political Analysis app.
    * Etc.

In [1]:
#!pip install python-dotenv

In [2]:
import os
from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv())
openai_api_key = os.environ["OPENAI_API_KEY"]
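
Indexing `os.environ` directly raises a bare `KeyError` when the variable is missing. The sketch below shows one way to fail with a clearer message. `require_env` is a hypothetical helper written for this example, not part of python-dotenv or LangChain, and the key value is a placeholder.

```python
import os


def require_env(name, env=None):
    """Return the value of an environment variable or fail with a clear message.

    Hypothetical helper for illustration only.
    `env` defaults to os.environ; passing a dict makes the function easy to test.
    """
    source = os.environ if env is None else env
    value = source.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set. Add it to your .env file or export it in your shell."
        )
    return value


# Example with an in-memory mapping instead of the real environment.
fake_env = {"OPENAI_API_KEY": "sk-example"}
print(require_env("OPENAI_API_KEY", env=fake_env))
```

In the notebook, this could be called as `require_env("OPENAI_API_KEY")` after `load_dotenv`.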

## Install LangChain

If you are using the pre-loaded poetry shell, you do not need to install the following package because it is already pre-loaded for you:

In [3]:
#!pip install langchain

## Connect with an LLM

If you are using the pre-loaded poetry shell, you do not need to install the following package because it is already pre-loaded for you:

In [4]:
#!pip install langchain-openai

* NOTE: We will use OpenAI by default, since its models are among the strongest LLMs currently available. You will see how to connect with open-source LLMs such as Llama3 or Mistral in a later lesson.

In [5]:
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-3.5-turbo-0125")

* Instead of using the previous llm, we will define a new llm in the following block of code and call its `with_structured_output` method, which OpenAI chat models support:

## Tag Definition
* In the following code we define the 3 tags we will analyze in this app:
    * sentiment.
    * political tendency.
    * language. 

In [6]:
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.pydantic_v1 import BaseModel, Field

tagging_prompt = ChatPromptTemplate.from_template(
    """
Extract the desired information from the following passage.

Only extract the properties mentioned in the 'Classification' function.

Passage:
{input}
"""
)

class Classification(BaseModel):
    sentiment: str = Field(description="The sentiment of the text")
    political_tendency: str = Field(
        description="The political tendency of the user"
    )
    language: str = Field(description="The language the text is written in")


# LLM configured to return a Classification instance instead of free text
llm = ChatOpenAI(temperature=0, model="gpt-3.5-turbo-0125").with_structured_output(
    Classification
)

tagging_chain = tagging_prompt | llm


* NOTE: Depending on your LangChain version, the cell that imports from `langchain_core.pydantic_v1` may print a deprecation warning. The warning suggests replacing imports like `from langchain_core.pydantic_v1 import BaseModel` with `from pydantic import BaseModel`, or with the v1 compatibility namespace, `from pydantic.v1 import BaseModel`, if you are working in a code base that has not been fully upgraded to pydantic 2 yet.
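
To see what text reaches the model, the sketch below fills the `{input}` placeholder of the tagging prompt with a sample passage. It uses plain `str.format` as a stand-in for `ChatPromptTemplate`, which additionally wraps the text in a chat message. `render_prompt` and the sample passage are hypothetical and exist only for this example.

```python
# Same template text as the tagging prompt, kept as a plain string here.
TEMPLATE = """
Extract the desired information from the following passage.

Only extract the properties mentioned in the 'Classification' function.

Passage:
{input}
"""


def render_prompt(passage):
    """Fill the {input} placeholder with the passage to classify."""
    return TEMPLATE.format(input=passage)


rendered = render_prompt("The product arrived late and broken.")
print(rendered)
```

The instructions stay fixed and only the passage changes between calls, which is why the same chain can be reused for every input.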


In [7]:
trump_follower = "I'm confident that President Trump's leadership and track record will once again resonate with Americans. His strong stance on economic growth and national security is exactly what our country needs at this pivotal moment. We need to bring back the proven leadership that can make America great again!"

In [8]:
biden_follower = "I believe President Biden's compassionate and steady approach is vital for our nation right now. His commitment to healthcare reform, climate change, and restoring our international alliances is crucial. It's time to continue the progress and ensure a future that benefits all Americans."

In [9]:
tagging_chain.invoke({"input": trump_follower})

Classification(sentiment='positive', political_tendency='pro-Trump', language='English')

In [10]:
tagging_chain.invoke({"input": biden_follower})

Classification(sentiment='positive', political_tendency='supportive', language='English')

* Careful schema definition gives us more control over the model's output.
* Specifically, we can define:
    * Possible values for each property.
    * Description to make sure that the model understands the property.
    * Required properties to be returned.
* Let's redeclare our Pydantic model to control for each of the previously mentioned aspects **using enums**:

In [11]:
class Classification(BaseModel):
    sentiment: str = Field(..., enum=["happy", "neutral", "sad"])
    political_tendency: str = Field(
        ...,
        description="The political tendency of the user",
        enum=["conservative", "liberal", "independent"],
    )
    language: str = Field(
        ..., enum=["spanish", "english"]
    )

In [12]:
tagging_prompt = ChatPromptTemplate.from_template(
    """
Extract the desired information from the following passage.

Only extract the properties mentioned in the 'Classification' function.

Passage:
{input}
"""
)

llm = ChatOpenAI(temperature=0, model="gpt-3.5-turbo-0125").with_structured_output(
    Classification
)

tagging_chain = tagging_prompt | llm

In [13]:
tagging_chain.invoke({"input": trump_follower})

Classification(sentiment='happy', political_tendency='conservative', language='english')

In [14]:
tagging_chain.invoke({"input": biden_follower})

Classification(sentiment='happy', political_tendency='liberal', language='english')
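
Passing `enum=` to `Field` adds the allowed values to the schema sent to the model. Whether the parsed result is also validated against them may depend on the pydantic and LangChain versions in use, so an explicit check can be a useful guard. The sketch below assumes a plain dataclass, `Tags`, as a stand-in for a `Classification` result. `Tags`, `ALLOWED`, and `invalid_fields` are hypothetical names for this example only.

```python
from dataclasses import dataclass

# Allowed values copied from the enum-constrained Classification model.
ALLOWED = {
    "sentiment": {"happy", "neutral", "sad"},
    "political_tendency": {"conservative", "liberal", "independent"},
    "language": {"spanish", "english"},
}


@dataclass(frozen=True)
class Tags:
    """Plain stand-in for a Classification result (hypothetical, for illustration)."""

    sentiment: str
    political_tendency: str
    language: str


def invalid_fields(tags):
    """Return the names of fields whose value is outside the allowed set."""
    return [
        name
        for name, allowed in ALLOWED.items()
        if getattr(tags, name) not in allowed
    ]


constrained = Tags("happy", "conservative", "english")
free_form = Tags("positive", "pro-Trump", "English")

print(invalid_fields(constrained))  # []
print(invalid_fields(free_form))  # ['sentiment', 'political_tendency', 'language']
```

The second call uses the values returned by the first, unconstrained schema, which shows why free-form labels are hard to aggregate or compare.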

## How to execute the code from Visual Studio Code
* In Visual Studio Code, see the file 001-sentiment-analysis.py
* In terminal, make sure you are in the directory of the file and run:
    * python 001-sentiment-analysis.py