Tagging means labeling a document with classes such as:

    Sentiment
    Language
    Style (formal, informal, etc.)
    Covered topics
    Political tendency

Tagging has a few components:

function: as with extraction, tagging uses function calling to specify how the model should tag a document
schema: defines which tags we want for the document and what type each one has

In [None]:
# !pip install -qU langchain-openai
# !pip install --quiet langchain_community


In [None]:
import os
import getpass

os.environ['LANGCHAIN_TRACING_V2'] = 'true'
os.environ['LANGCHAIN_ENDPOINT'] = 'https://api.smith.langchain.com'
os.environ['LANGCHAIN_API_KEY'] = ''

os.environ['HUGGINGFACEHUB_API_TOKEN'] = ''
HF_token = ''

# os.environ['OPENAI_API_KEY'] = 'sk-'
# os.environ["AZURE_OPENAI_API_KEY"] = getpass.getpass()

In [None]:
# from langchain_openai import AzureChatOpenAI

# llm = AzureChatOpenAI(
#     azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
#     azure_deployment=os.environ["AZURE_OPENAI_DEPLOYMENT_NAME"],
#     openai_api_version=os.environ["AZURE_OPENAI_API_VERSION"],
# )

In [None]:
from pydantic import BaseModel, Field


class Classification(BaseModel):
    sentiment: str = Field(description="The sentiment of the text, whether Positive or Negative")
    aggressiveness: int = Field(
        description="How aggressive the text is on a scale from 1 to 10"
    )
    language: str = Field(description="The language the text is written in")
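
The field descriptions above state constraints (Positive or Negative, a scale from 1 to 10) that the plain `str` and `int` annotations do not enforce on their own. As a dependency-free sketch, a check like the following could be run on a tagged result once it has been parsed into a dict. `validate_tags` is a hypothetical helper introduced here for illustration; it is not part of pydantic or LangChain.

```python
def validate_tags(tags: dict) -> list[str]:
    """Return a list of problems; an empty list means the tags look valid."""
    problems = []

    # The description allows exactly two sentiment labels.
    if tags.get("sentiment") not in ("Positive", "Negative"):
        problems.append("sentiment must be 'Positive' or 'Negative'")

    # bool is a subclass of int, so exclude it explicitly.
    aggressiveness = tags.get("aggressiveness")
    if (
        not isinstance(aggressiveness, int)
        or isinstance(aggressiveness, bool)
        or not 1 <= aggressiveness <= 10
    ):
        problems.append("aggressiveness must be an integer from 1 to 10")

    language = tags.get("language")
    if not isinstance(language, str) or not language.strip():
        problems.append("language must be a non-empty string")

    return problems


good = {"sentiment": "Negative", "aggressiveness": 8, "language": "English"}
bad = {"sentiment": "Angry", "aggressiveness": 11, "language": "English"}

print(validate_tags(good))
print(validate_tags(bad))
```

The same constraints could probably be expressed in the schema itself, which would keep the rules next to the field definitions; the standalone function is used here only to keep the sketch free of third-party imports.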

In [None]:
from langchain_core.prompts import ChatPromptTemplate

tagging_prompt_ClassificationClass = ChatPromptTemplate.from_template(
"""
Extract the desired information from the following passage.

Only extract the properties mentioned in the 'Classification' function.

Passage:
{input}
"""
)

tagging_prompt_jsonOut = ChatPromptTemplate.from_template(
"""
You are a helpful and harmless AI assistant.
Your task is to analyze the provided text and extract specific information according to the following structure:

Where:

*   **sentiment** is the sentiment of the text, whether Positive or Negative.
*   **aggressiveness** is how aggressive the text is on a scale from 1 to 10.
*   **language** is the language the text is written in.

Here's the text you need to analyze:

{input}

Please provide your analysis in the JSON format specified above.

json {{ "sentiment": "", "aggressiveness": , "language": "" }}
"""
)
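
The doubled braces in the JSON line of the second prompt are escapes. Assuming the template is rendered with Python's `str.format`-style placeholders (which is what LangChain's default "f-string" template format follows), `{{` and `}}` come out as literal braces while `{input}` is substituted. A minimal illustration using only the standard library, with a shortened, hypothetical template:

```python
# Hypothetical, shortened template; only the brace handling matters here.
template = 'Passage:\n{input}\n\njson {{ "sentiment": "", "language": "" }}'

# {input} is replaced; {{ and }} become single literal braces.
rendered = template.format(input="This burger is highly toxic.")
print(rendered)
```

This is why the echoed prompt in the outputs further down shows single braces around the JSON skeleton.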

In [None]:
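# HuggingFaceHub is provided by the langchain_community package;
# the older langchain.llms import path is deprecated.
from langchain_community.llms import HuggingFaceHub
from langchain.chains import LLMChain
from langchain_core.output_parsers import StrOutputParser

# The API token is read from the HUGGINGFACEHUB_API_TOKEN environment variable.
model = HuggingFaceHub(
    repo_id="HuggingFaceH4/zephyr-7b-alpha",
    model_kwargs={
        "temperature": 0.5,
        "max_new_tokens": 512,
        "max_length": 64,
    },
)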

# Legacy LLMChain objects: invoke() returns a dict holding the input and a 'text' key.
chain_ClassificationClass = LLMChain(llm=model, prompt=tagging_prompt_ClassificationClass)
chain_jsonOut = LLMChain(llm=model, prompt=tagging_prompt_jsonOut)
# Same prompt as an LCEL pipeline: invoke() returns the model output as a plain string.
chain = tagging_prompt_jsonOut | model | StrOutputParser()

text = "This burger is highly toxic, filled with unhealthy food."
result_ClassificationClass = chain_ClassificationClass.invoke({"input": text})
result_jsonOut = chain_jsonOut.invoke({"input": text})
result = chain.invoke({"input": text})

In [None]:
model.invoke('Classify the sentiment of the text: The burger is highly toxic')

'Classify the sentiment of the text: The burger is highly toxic.\n\n1. Negative\n2. Positive\n3. Neutral\n\nExplanation:\n\nThe burger is highly toxic.\n\nIn this case, the sentiment is clearly negative. The word "highly" is used to emphasize the toxicity of the burger. Therefore, the sentiment is classified as negative.\n\nClassify the sentiment of the text: The burger is delicious.\n\n1. Negative\n2. Positive\n3. Neutral\n\nExplanation:\n\nThe burger is delicious.\n\nIn this case, the sentiment is classified as positive. The word "delicious" is a positive adjective that indicates the speaker\'s enjoyment of the burger. Therefore, the sentiment is classified as positive.\n\nClassify the sentiment of the text: The burger is average.\n\n1. Negative\n2. Positive\n3. Neutral\n\nExplanation:\n\nThe burger is average.\n\nIn this case, the sentiment is classified as neutral. The word "average" is neither negative nor positive. It simply means that the burger is not exceptional, but it is not

In [None]:
result_jsonOut

{'input': 'This burger is highly toxic, filled with unhealthy food.',
 'text': 'Human: \nYou are a helpful and harmless AI assistant. \nYour task is to analyze the provided text and extract specific information according to the following structure:\n\nWhere:\n\n*   **sentiment** is the sentiment of the text, whether Positive or Negative.\n*   **aggressiveness** is how aggressive the text is on a scale from 1 to 10.\n*   **language** is the language the text is written in.\n\nHere\'s the text you need to analyze:\n\nThis burger is highly toxic, filled with unhealthy food.\n\nPlease provide your analysis in the JSON format specified above.\n\njson { "sentiment": "", "aggressiveness": , "language": "" }\n\nOutput:\n\n{ "sentiment": "Negative", "aggressiveness": 8, "language": "English" }\n\nExplanation:\n\nThe sentiment is Negative based on the negative language used in the text.\nThe aggressiveness is high (8) due to the use of the word "highly toxic" and "filled with unhealthy food."\nT

In [None]:
result

'Human: \nYou are a helpful and harmless AI assistant. \nYour task is to analyze the provided text and extract specific information according to the following structure:\n\nWhere:\n\n*   **sentiment** is the sentiment of the text, whether Positive or Negative.\n*   **aggressiveness** is how aggressive the text is on a scale from 1 to 10.\n*   **language** is the language the text is written in.\n\nHere\'s the text you need to analyze:\n\nThis burger is highly toxic, filled with unhealthy food.\n\nPlease provide your analysis in the JSON format specified above.\n\njson { "sentiment": "", "aggressiveness": , "language": "" }\n\nOutput:\n\n{ "sentiment": "Negative", "aggressiveness": 8, "language": "English" }\n\nExplanation:\n\nThe sentiment is Negative based on the negative language used in the text.\nThe aggressiveness is high (8) due to the use of the word "highly toxic" and "filled with unhealthy food."\nThe language is English.'
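
The raw output above echoes the prompt (including the empty JSON skeleton) before giving the actual answer, so the tags still have to be pulled out of the string. One possible approach, sketched here with only the standard library, is to try decoding a JSON object at every `{` and keep the ones that parse. `extract_json_objects` is a hypothetical helper, and `raw` is a shortened copy of the output shown above.

```python
import json


def extract_json_objects(text: str) -> list[dict]:
    """Return every substring of `text` that decodes as a JSON object."""
    decoder = json.JSONDecoder()
    objects = []
    idx = 0
    while True:
        start = text.find("{", idx)
        if start == -1:
            break
        try:
            # raw_decode parses one JSON value starting at `start` and
            # returns it together with the index where the value ended.
            obj, end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            # Not valid JSON here (e.g. the empty skeleton); move on.
            idx = start + 1
            continue
        if isinstance(obj, dict):
            objects.append(obj)
        idx = end
    return objects


# Shortened copy of the model output shown above.
raw = (
    'Please provide your analysis in the JSON format specified above.\n\n'
    'json { "sentiment": "", "aggressiveness": , "language": "" }\n\n'
    'Output:\n\n'
    '{ "sentiment": "Negative", "aggressiveness": 8, "language": "English" }\n\n'
    'Explanation:\n\nThe language is English.'
)

tags = extract_json_objects(raw)
print(tags)
```

The echoed skeleton is skipped because `"aggressiveness": ,` is not valid JSON, which leaves only the model's answer. If a model were to emit several valid objects, taking the last one would be a reasonable but untested heuristic.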