<a href="https://colab.research.google.com/github/kuberiitb/artificial_intelligence/blob/main/AI101/02_sentimet_analysis_ner_example_prompt_engineering.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Install required packages

In [1]:
!pip install langchain-groq langchain python-dotenv --quiet


## Import required packages

In [2]:
import os
from langchain_groq import ChatGroq
from langchain_core.prompts import ChatPromptTemplate

## Load the Groq API key
- Storing the key in a `.env` file keeps it out of the notebook, so it is not exposed when you share the notebook.
- Upload your `.env` file to Colab with the value in this format:
- ```GROQ_API_KEY=<KEY YOU COPIED>```
- Then load the key with the command below, using the dotenv package.


In [3]:
from dotenv import load_dotenv
load_dotenv("/content/.env")

True

### The command above should print True; otherwise your key was not loaded.
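To confirm that the key really is in the environment without printing it, a minimal sketch follows. The helper name `describe_key` is hypothetical; it is not part of dotenv or LangChain and uses only the standard library.

```python
import os

def describe_key(name: str = "GROQ_API_KEY") -> str:
    """Report whether an environment variable is set, without revealing its value."""
    value = os.environ.get(name, "")
    if not value:
        return f"{name} is missing"
    # Only the length is shown, so the key never appears in the notebook output.
    return f"{name} is set ({len(value)} characters)"

print(describe_key())
```

Reporting only the length keeps the secret out of the saved notebook output, which matters if the notebook is later shared.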

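### Check that GROQ_API_KEY is in the environment so that LangChain can access it

In [4]:
# load_dotenv() has already exported the key; stop early if it is still missing.
if not os.environ.get("GROQ_API_KEY"):
  raise RuntimeError("GROQ_API_KEY is not set. Check /content/.env and rerun the cell above.")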

In [5]:
# Initialize the Groq LLM
llm = ChatGroq(model_name="llama-3.1-8b-instant", temperature=0.7)

response = llm.invoke([
    ("system", """You are a helpful assistant.
    """),
    ("user", "Hi")
])

print(response.content)

How can I assist you today?


### If you see a message like **How can I assist you today?**, your setup and key are working.
### Otherwise, you need to fix them before continuing.

## Identify sentiment of a sentence

In [8]:
# Define a prompt template
prompt = ChatPromptTemplate.from_messages([
    ("system", """You are a language sentiment expert.
    Given a sentence, classify if the sentence is positive, negative or neutral.
    """),
    ("user", "{input}")
])

# Create a chain
chain = prompt | llm

In [9]:
test_input = "Python is great."

# Invoke the chain
response = chain.invoke({"input": test_input})
print(response.content)

Based on the sentence "Python is great", I would classify it as **Positive**. The word "great" has a positive connotation, indicating a positive sentiment towards the programming language Python.


## Writing a better prompt with a structured output

In [15]:
# Define a prompt template
prompt = ChatPromptTemplate.from_messages([
    ("system", """You are a language sentiment expert.
    Given a sentence, classify if the sentence is positive, negative or neutral.
    Do not give any other information or detail.

    Give output in this format:
    input: <input>
    output: positive | negative | neutral
    """),
    ("user", "{input}")
])

# Create a chain
chain = prompt | llm

In [16]:
test_input = "Python is great."

# Invoke the chain
response = chain.invoke({"input": test_input})
print(response.content)

input: Python is great.
output: positive


In [17]:
sentence_list = [
    "Python is great.",
    "I love working with data.",
    "This tool is very useful.",
    "The model performed well.",
    "Learning AI is exciting.",

    "Python is not good.",
    "I hate debugging this code.",
    "The results are disappointing.",
    "This approach failed badly.",
    "The system is too slow.",

    "Python is a programming language.",
    "The meeting is scheduled today.",
    "The dataset has 100 rows.",
    "He is writing code.",
    "The experiment is complete."
]

for sentence in sentence_list:
  response = chain.invoke({"input": sentence})
  print(response.content)

input: Python is great.
output: positive
input: I love working with data.
output: positive
input: This tool is very useful.
output: positive
input: The model performed well.
output: positive
input: Learning AI is exciting.
output: positive
input: Python is not good.
output: negative
input: I hate debugging this code.
output: negative
input: The results are disappointing.
output: negative
input: This approach failed badly.
output: negative
input: The system is too slow.
output: negative
input: Python is a programming language.
output: positive | neutral
input: The meeting is scheduled today.
output: neutral
input: The dataset has 100 rows.
output: positive | neutral
input: He is writing code.
output: neutral
input: The experiment is complete.
output: neutral
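Because the replies now follow a fixed `input:` / `output:` layout, they can be turned into a plain label in code. The sketch below is one way to do that; `parse_sentiment` and `VALID_LABELS` are hypothetical names introduced here, and the function treats any reply that is not exactly one of the three labels (such as `positive | neutral`) as unusable.

```python
from typing import Optional

VALID_LABELS = {"positive", "negative", "neutral"}

def parse_sentiment(reply: str) -> Optional[str]:
    """Return the single label on the 'output:' line, or None if it is missing or ambiguous."""
    for line in reply.splitlines():
        key, _, value = line.partition(":")
        if key.strip().lower() == "output":
            label = value.strip().lower()
            # Reject anything that is not exactly one of the expected labels.
            return label if label in VALID_LABELS else None
    return None

print(parse_sentiment("input: Python is great.\noutput: positive"))  # positive
print(parse_sentiment("input: The dataset has 100 rows.\noutput: positive | neutral"))  # None
```

In the notebook this would be called as `parse_sentiment(response.content)`; a `None` result is a signal to retry the request or tighten the prompt.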


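It works well on most sentences: every positive and negative example is labelled correctly. Two neutral sentences ("Python is a programming language." and "The dataset has 100 rows.") came back as `positive | neutral`, so the model did not always commit to a single label.
Let's try another use case.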

## Identify entities in a sentence (NER)
Read more about [NER](https://en.wikipedia.org/wiki/Named-entity_recognition) here.

In [18]:
# Define a prompt template for NER
ner_prompt = ChatPromptTemplate.from_messages([
    ("system", """You are a Named Entity Recognition (NER) expert.
    Given a sentence, identify named entities and their types.
    Use standard entity types such as PERSON, ORGANIZATION, LOCATION, DATE, TIME, and MISC.
    Do not provide explanations.

    Give output in this format:
    input: <input>
    output: [(entity, entity_type), ...]
    """),
    ("user", "{input}")
])

# Create a chain
ner_chain = ner_prompt | llm

In [19]:
test_inputs_for_ner = [
    "Barack Obama was born in Hawaii.",
    "Apple Inc. released the iPhone in 2007.",
    "Sundar Pichai is the CEO of Google.",
    "The meeting is scheduled on 12 March 2024.",
    "Microsoft acquired GitHub for $7.5 billion.",
    "I live in New York City.",
    "The conference will be held in Berlin next week.",
    "Amazon was founded by Jeff Bezos.",
    "The flight departs at 10:30 AM.",
    "Tesla opened a new factory in Texas.",
    "The contract expires on 31 December 2025.",
    "Python was created by Guido van Rossum.",
    "The package was delivered on Monday.",
    "NASA launched the Artemis mission.",
    "She works at Infosys in Bangalore.",
    "The World Cup will start in June.",
    "Facebook changed its name to Meta.",
    "The price of Bitcoin crossed $60,000.",
    "Dr. A. P. J. Abdul Kalam was the President of India.",
    "The train arrives at Mumbai Central at 6 PM."
]


In [20]:
for sentence in test_inputs_for_ner:
  response = ner_chain.invoke({"input": sentence})
  print(response.content)

input: Barack Obama was born in Hawaii.
output: [('Barack Obama', 'PERSON'), ('Hawaii', 'LOCATION')]
input: Apple Inc. released the iPhone in 2007.
output: [(Apple Inc., ORGANIZATION), (iPhone, MISC), (2007, DATE)]
input: Sundar Pichai is the CEO of Google.
output: [('Sundar Pichai', 'PERSON'), ('Google', 'ORGANIZATION')]
input: The meeting is scheduled on 12 March 2024.
output: [(12, DATE), (March, DATE), (2024, DATE)]
input: Microsoft acquired GitHub for $7.5 billion.
output: [('Microsoft', 'ORGANIZATION'), ('GitHub', 'ORGANIZATION')]
input: I live in New York City.
output: [('New York City', 'LOCATION')]
input: The conference will be held in Berlin next week.
output: [('Berlin', 'LOCATION')]
input: Amazon was founded by Jeff Bezos.
output: [('Amazon', 'ORGANIZATION'), ('Jeff Bezos', 'PERSON')]
input: The flight departs at 10:30 AM.
output: [(flight, MISC), (10:30, TIME)]
input: Tesla opened a new factory in Texas.
output: [('Tesla', 'ORGANIZATION'), ('Texas', 'LOCATION')]
input: The
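The replies above mix two styles, quoted pairs such as `('Hawaii', 'LOCATION')` and unquoted pairs such as `(iPhone, MISC)`, so `ast.literal_eval` would fail on some of them. The sketch below uses a regular expression that tolerates both. `parse_entities` is a hypothetical helper, and it assumes entity types are written in upper case as the prompt requests.

```python
import re
from typing import List, Tuple

# Matches (entity, TYPE) with or without quotes around either part.
_PAIR = re.compile(r"""\(\s*(['"]?)(.+?)\1\s*,\s*(['"]?)([A-Z_]+)\3\s*\)""")

def parse_entities(reply: str) -> List[Tuple[str, str]]:
    """Extract (entity, entity_type) pairs from the 'output:' line of a reply."""
    for line in reply.splitlines():
        line = line.strip()
        if line.lower().startswith("output:"):
            return [(m.group(2).strip(), m.group(4)) for m in _PAIR.finditer(line)]
    return []  # no 'output:' line found

print(parse_entities("output: [('Barack Obama', 'PERSON'), ('Hawaii', 'LOCATION')]"))
print(parse_entities("output: [(flight, MISC), (10:30, TIME)]"))
```

Only the `output:` line is scanned, so parentheses inside the echoed input sentence are not mistaken for entity pairs.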

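### The model handles both use cases reasonably well: most sentiment labels are correct and the main entities are identified. It still makes mistakes, such as splitting "12 March 2024" into three separate dates, tagging "flight" as MISC, and quoting entities inconsistently.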