# Aspect-Based Sentiment Analysis Test Pipeline
- This notebook explores how Aspect-Based Sentiment Analysis (ABSA) could be used in this project.
- First, it runs a complete ABSA chain on hand-written posts as a proof of concept, without touching the project data.
- Then, the project data is reshaped so it can be passed through the chain.

In [1]:
## Load libraries
from langchain.llms.openai import OpenAI
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate
from langchain.prompts.few_shot import FewShotPromptTemplate
from langchain.chains import SequentialChain
import openai
from getpass import getpass
import os
import warnings
import pandas as pd

warnings.filterwarnings("ignore")

In [2]:
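# OpenAI API key: prompt for it so the key is never stored in the notebook
OPENAI_API_KEY = getpass()
os.environ["OPENAI_API_KEY"] = OPENAI_API_KEY
openai.api_key = OPENAI_API_KEY

# temperature = 0.7 allows some variation between runs
llm = OpenAI(temperature = 0.7)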

### Aspect Extraction Chain

In [3]:
examples = [{
    "post": "Had so much fun at the game this weekend, so glad NSC won! The atmosphere at the stadium was amazing, although I wish the tickets and food were cheaper.",
    "aspects": "Relevant Aspects are team performance, pricing, and Geodis Park."
},
{
    "post": "Man, I wish NSC was playing better, we would be lucky to make the playoffs! Our coaching has got to be better, but at least the game was still a blast!",
    "aspects": "Relevant Aspects are team performance, coaching performance, fan attitude, and Geodis Park."
}]

In [4]:
prompt_template = '''
Post: {post}
{aspects}
'''

example_prompt = PromptTemplate(input_variables = ["post", "aspects"], template = prompt_template)

print(example_prompt.format(**examples[0]))


Post: Had so much fun at the game this weekend, so glad NSC won! The atmosphere at the stadium was amazing, although I wish the tickets and food were cheaper.
Relevant Aspects are team performance, pricing, and Geodis Park.



In [6]:
final_prompt = FewShotPromptTemplate(
    examples = examples,
    example_prompt = example_prompt,
    suffix = "Post: {post}\n",
    input_variables = ["post"],
    prefix = "We are extracting aspects from the post made by a Nashville SC fan after a game. Nashville SC is a soccer team that plays at Geodis Park in Nashville, Tennessee. Take the post as input and extract the different aspects about the fan's opinion on team performance, stadium atmosphere, coaching, and more general aspects related to sports fandom. Then, return those aspects as a list."
)

print(final_prompt.format(post = "So happy the team won this weekend. My family had so much fun at Geodis and would love to go back for another game, we just hope the tickets are cheaper next time!"))

We are extracting aspects from the post made by a Nashville SC fan after a game. Nashville SC is a soccer team that plays at Geodis Park in Nashville, Tennessee. Take the post as input and extract the different aspects about the fan's opinion on team performance, stadium atmosphere, coaching, and more general aspects related to sports fandom. Then, return those aspects as a list.


Post: Had so much fun at the game this weekend, so glad NSC won! The atmosphere at the stadium was amazing, although I wish the tickets and food were cheaper.
Relevant Aspects are team performance, pricing, and Geodis Park.



Post: Man, I wish NSC was playing better, we would be lucky to make the playoffs! Our coaching has got to be better, but at least the game was still a blast!
Relevant Aspects are team performance, coaching performance, fan attitude, and Geodis Park.


Post: So happy the team won this weekend. My family had so much fun at Geodis and would love to go back for another game, we just hope the tickets are cheaper next time!
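The printed prompt above is the prefix, each formatted example, and the suffix joined together. The following is a minimal plain-Python sketch of that assembly, meant only to make the structure visible; it is not LangChain's implementation. The function name `build_few_shot_prompt`, the blank-line separator, and the demo example are assumptions made for illustration.

```python
# Plain-Python sketch of few-shot prompt assembly (illustrative only).
EXAMPLE_TEMPLATE = "\nPost: {post}\n{aspects}\n"


def build_few_shot_prompt(prefix, examples, suffix, separator="\n\n", **inputs):
    """Join the prefix, each formatted example, and the filled-in suffix."""
    pieces = [prefix]
    # Each example is rendered with the same template used for the examples.
    pieces += [EXAMPLE_TEMPLATE.format(**example) for example in examples]
    # Only the suffix contains the placeholder for the new post.
    pieces.append(suffix.format(**inputs))
    return separator.join(pieces)


# Hypothetical example, not taken from the project data.
demo_examples = [{
    "post": "Great win, but parking was pricey.",
    "aspects": "Relevant Aspects are team performance and pricing.",
}]

prompt = build_few_shot_prompt(
    prefix="Extract the aspects from the post.",
    examples=demo_examples,
    suffix="Post: {post}\n",
    post="Loved the atmosphere tonight!",
)
print(prompt)
```

Because the prompt ends with an unanswered `Post:` line, the model tends to continue in the same shape as the examples before it.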

In [6]:
aspect_extraction_chain = LLMChain(llm = llm, prompt = final_prompt, output_key = 'aspects')

In [7]:
output = aspect_extraction_chain.predict(post = "So happy the team won this weekend. My family had so much fun at Geodis and would love to go back for another game, we just hope the tickets are cheaper next time! The coaching was awful though.")

In [8]:
output

'Relevant Aspects are team performance, pricing, Geodis Park, and coaching performance.'

#### Initial Thoughts
- The output seems to rely heavily on the few-shot examples: it only returns aspects that appear in them.
- This may be a result of weak prompt engineering; more diverse examples might make the output less cookie-cutter.
- The model temperature (0.7 here) could also play a role.
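One upside of the model mirroring the examples is that its output follows a predictable sentence pattern, so it can be turned into a Python list with a small parser. The sketch below assumes the output always starts with "Relevant Aspect is" or "Relevant Aspects are" and separates aspects with commas; `parse_aspects` is a hypothetical helper, not part of the chain.

```python
import re


def parse_aspects(text):
    """Turn 'Relevant Aspects are a, b, and c.' into ['a', 'b', 'c']."""
    # Drop the leading 'Relevant Aspect is' / 'Relevant Aspects are' phrase.
    body = re.sub(
        r"^\s*Relevant Aspects? (?:is|are)\s*", "", text.strip(), flags=re.IGNORECASE
    )
    body = body.rstrip(".")
    # Split on commas, then drop the 'and ' that precedes the final item.
    parts = [re.sub(r"^and\s+", "", part.strip()) for part in body.split(",")]
    return [part for part in parts if part]


print(parse_aspects(
    "Relevant Aspects are team performance, pricing, Geodis Park, and coaching performance."
))
```

Two aspects joined only by "and" (no comma) would come back as a single item, so a more varied output format would need a stricter prompt or a more careful parser.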

### Sentiment Chain

In [9]:
prompt_template2 = '''
Given below post and the extracted aspects, tell me about the sentiment of those aspects. For example: 'positive', 'negative', 'neutral' in a format like (aspect, sentiment).
Post: {post}
Aspects: {aspects}
[(Aspect1, Sentiment of Aspect1), (Aspect2, Sentiment of Aspect2),.....]
'''

example_prompt2 = PromptTemplate(input_variables = ["post", "aspects"], template = prompt_template2)

print(example_prompt2)

input_variables=['post', 'aspects'] template="\nGiven below post and the extracted aspects, tell me about the sentiment of those aspects. For example: 'positive', 'negative', 'neutral' in a format like (aspect, sentiment).\nPost: {post}\nAspects: {aspects}\n[(Aspect1, Sentiment of Aspect1), (Aspect2, Sentiment of Aspect2),.....]\n"


In [10]:
aspect_sentiment_chain = LLMChain(llm = llm, prompt = example_prompt2, output_key = "Aspects_with_sentiment")

### Final Sequential Chain

In [11]:
overall_chain = SequentialChain(
    chains = [aspect_extraction_chain, aspect_sentiment_chain],
    input_variables = ["post"],
    output_variables = ["post", "aspects", "Aspects_with_sentiment"],
    verbose = True
)

In [12]:
x = overall_chain({"post": "The team needs to play better if we want to make playoffs. The fans are creating a great atmosphere in the stadium but the quality of coaching and players isn't there."})



> Entering new SequentialChain chain...

> Finished chain.


In [13]:
x

{'post': "The team needs to play better if we want to make playoffs. The fans are creating a great atmosphere in the stadium but the quality of coaching and players isn't there.",
 'aspects': 'Relevant Aspects are team performance, coaching performance, fan attitude, and Geodis Park.',
 'Aspects_with_sentiment': '(Team Performance, Negative), (Coaching Performance, Negative), (Fan Attitude, Positive), (Geodis Park, Neutral)'}
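The `Aspects_with_sentiment` value is a single string, so it has to be parsed before it can be used downstream. A minimal sketch, assuming the model keeps the `(aspect, sentiment)` format requested in the prompt; `parse_aspect_sentiments` is a hypothetical helper name.

```python
import re

# Matches '(aspect, sentiment)' pairs; neither part may contain commas or parentheses.
PAIR_PATTERN = re.compile(r"\(\s*([^(),]+?)\s*,\s*([^(),]+?)\s*\)")


def parse_aspect_sentiments(text):
    """Return a list of (aspect, sentiment) tuples, lower-cased for consistency."""
    return [
        (aspect.lower(), sentiment.lower())
        for aspect, sentiment in PAIR_PATTERN.findall(text)
    ]


pairs = parse_aspect_sentiments(
    "(Team Performance, Negative), (Coaching Performance, Negative), "
    "(Fan Attitude, Positive), (Geodis Park, Neutral)"
)
print(pairs)
```

Lower-casing is a design choice here: the model capitalised the pairs in this run but not in every run, and normalising makes later comparisons simpler.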

## Trying with NSC Post Data

In [14]:
nsc_posts = pd.read_csv('../Reddit Data/nsc_posts.csv')
nsc_comments = pd.read_csv('../Reddit Data/nsc_comments.csv')

In [15]:
nsc_posts.head()

| Unnamed: 0.1 | Unnamed: 0 | id | Title | Content | Author | Post Date |
|---|---|---|---|---|---|---|
| 0 | 0 | ut4efu | I played the guitar riff yesterday! | | Grace-Music | 2022-05-19 14:16:19 |
| 1 | 1 | n3x5zx | Took my daughter to her first MLS game yesterd... | | JiuManji | 2021-05-03 14:15:03 |
| 2 | 2 | jy4kdq | FIRST PLAYOFF WIN UPVOTE PARTY!! | LETS GO!! What a Game!!! | BigBlueNate33 | 2020-11-21 04:26:24 |
| 3 | 3 | fbuyj5 | Thought this guy deserved a shoutout | | fullthrottle13 | 2020-03-01 14:44:32 |
| 4 | 4 | k0isnq | ANOTHER PLAYOFF WIN UPVOTE PARTY!! | LETS. FREAKING. GO!!!!!!! MASSIVE CLUB!!! Semi... | BigBlueNate33 | 2020-11-25 01:56:02 |


In [16]:
def combine_title_content(row):
    # Join title and body text; posts with no body (NaN Content) keep the title only
    if pd.notna(row['Content']):
        return row['Title'] + ': ' + row['Content']
    return row['Title']

In [17]:
test_combined = nsc_posts.apply(combine_title_content, axis = 1)

In [18]:
# Run the chain on the first two posts only, as a small test
output = []
for post in test_combined.head(2):
    output.append(overall_chain({"post": post}))



> Entering new SequentialChain chain...

> Finished chain.


> Entering new SequentialChain chain...


Retrying langchain.llms.openai.completion_with_retry.<locals>._completion_with_retry in 4.0 seconds as it raised RateLimitError: Rate limit reached for text-davinci-003 in organization org-mZpDGmkoo1WsEP9bv7bbg2OB on requests per min. Limit: 3 / min. Please try again in 20s. Contact us through our help center at help.openai.com if you continue to have issues. Please add a payment method to your account to increase your rate limit. Visit https://platform.openai.com/account/billing to add a payment method..


KeyboardInterrupt: 
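The error above reports a limit of 3 requests per minute, and each post goes through two chain steps, so presumably two requests per post. One way to stay under such a limit is to space the calls out. The sketch below is a generic throttling helper under those assumptions; `run_throttled` and the 40-second interval are illustrative choices, not settings from LangChain or OpenAI.

```python
import time


def run_throttled(func, items, min_interval, clock=time.monotonic, sleep=time.sleep):
    """Call func on each item, starting calls at least min_interval seconds apart."""
    results = []
    last_start = None
    for item in items:
        if last_start is not None:
            # Wait out whatever is left of the interval since the previous call began.
            wait = min_interval - (clock() - last_start)
            if wait > 0:
                sleep(wait)
        last_start = clock()
        results.append(func(item))
    return results


# Assumed usage in this notebook (2 requests per post at 3 requests/min -> 40 s per post):
# output = run_throttled(lambda post: overall_chain({"post": post}),
#                        test_combined.head(2), min_interval=40)
```

The `clock` and `sleep` parameters exist only so the helper can be exercised without real waiting; in normal use the defaults apply.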

In [19]:
output

[{'post': 'I played the guitar riff yesterday!',
  'aspects': 'Relevant Aspect is fan attitude.',
  'Aspects_with_sentiment': '(fan attitude, positive)'}]
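Once more posts have been processed, the list of result dictionaries can be summarised by counting how often each (aspect, sentiment) pair occurs. This is a minimal sketch, assuming every result keeps the `(aspect, sentiment)` string format; `tally_sentiments` is a hypothetical helper, and the second sample entry is made up for illustration.

```python
import re
from collections import Counter

# Matches '(aspect, sentiment)' pairs inside the chain's output string.
PAIR = re.compile(r"\(\s*([^(),]+?)\s*,\s*([^(),]+?)\s*\)")


def tally_sentiments(results, key="Aspects_with_sentiment"):
    """Count (aspect, sentiment) pairs across a list of chain outputs."""
    counts = Counter()
    for result in results:
        for aspect, sentiment in PAIR.findall(result.get(key, "")):
            counts[(aspect.lower(), sentiment.lower())] += 1
    return counts


sample_results = [
    {"Aspects_with_sentiment": "(fan attitude, positive)"},
    # Hypothetical second result, not produced by the run above.
    {"Aspects_with_sentiment": "(Team Performance, Negative), (Fan Attitude, Positive)"},
]
tally = tally_sentiments(sample_results)
print(tally.most_common())
```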