We analyse each review by extracting the problems and benefits of the application that it mentions.
To do so, we use the Hugging Face question-answering pipeline.
Hugging Face also provides a sentiment-analysis pipeline, which lets us check the sentiment of each review.

In [6]:
#libraries
import pandas as pd


In [7]:
#import data
appStore = pd.read_csv('data/AppStoreData.csv')
googlePlay = pd.read_csv('data/PlayStoreData.csv')

In [8]:
#combine review data 
as_review = appStore['review']
gp_review = googlePlay['text']

reviews = as_review.tolist() + gp_review.tolist()

In [9]:
#inspect the combined reviews to spot odd comments that need cleaning
print(reviews)
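
The cell above only prints the reviews, so no cleaning is applied yet. A minimal sketch of what a cleaning step could look like is given here. The helper name `clean_reviews` and the `min_words` threshold are assumptions for illustration, not part of the original notebook.

```python
def clean_reviews(raw_reviews, min_words=2):
    """Return de-duplicated, whitespace-normalised reviews (hypothetical helper)."""
    cleaned, seen = [], set()
    for item in raw_reviews:
        # pandas represents missing text as float NaN; skip those and any other non-strings
        if not isinstance(item, str):
            continue
        text = " ".join(item.split())  # collapse runs of whitespace and newlines
        if len(text.split()) < min_words:
            continue  # assumed threshold: drop empty or one-word comments
        key = text.lower()
        if key in seen:
            continue  # drop exact duplicates, ignoring case
        seen.add(key)
        cleaned.append(text)
    return cleaned
```

It could be applied as `reviews = clean_reviews(reviews)` before the reviews are passed to any pipeline.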



In [10]:
from transformers import pipeline
sent_pipeline = pipeline("sentiment-analysis")
qa_pipeline = pipeline("question-answering")

No model was supplied, defaulted to distilbert-base-uncased-finetuned-sst-2-english and revision af0f99b (https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.
No model was supplied, defaulted to distilbert-base-cased-distilled-squad and revision 626af31 (https://huggingface.co/distilbert-base-cased-distilled-squad).
Using a pipeline without specifying a model name and revision in production is not recommended.
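
The warnings above recommend naming the model and revision explicitly. A possible way to do that is sketched below, with the model names and revisions copied from the warning text. The `PINNED_MODELS` dictionary and the `build_pipeline` helper are illustrative names, not part of the original notebook.

```python
# Model names and revisions are copied from the warnings printed above.
PINNED_MODELS = {
    "sentiment-analysis": {
        "model": "distilbert-base-uncased-finetuned-sst-2-english",
        "revision": "af0f99b",
    },
    "question-answering": {
        "model": "distilbert-base-cased-distilled-squad",
        "revision": "626af31",
    },
}

def build_pipeline(task):
    """Create a pipeline for `task` with an explicit model and revision."""
    # Imported lazily so the configuration can be inspected without loading the library.
    from transformers import pipeline
    spec = PINNED_MODELS[task]
    return pipeline(task, model=spec["model"], revision=spec["revision"])
```

With this in place, `sent_pipeline = build_pipeline("sentiment-analysis")` and `qa_pipeline = build_pipeline("question-answering")` should load the same models as the defaults without triggering the warning.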


Sentiment analysis

In [11]:
sentiment_scoring = [sent_pipeline(review) for review in reviews]
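
Calling the pipeline one review at a time works, but long reviews can exceed the model's maximum input length. A hedged alternative is sketched below. It assumes the pipeline accepts a list of texts and a `truncation` keyword, as the transformers text-classification pipeline does. The helper name `score_reviews` and the default `batch_size` are assumptions.

```python
def score_reviews(pipe, texts, batch_size=16):
    """Run a sentiment pipeline over texts in batches; returns one result dict per text."""
    results = []
    for start in range(0, len(texts), batch_size):
        batch = texts[start:start + batch_size]
        # truncation=True asks the tokenizer to cut inputs that exceed the model's maximum length
        results.extend(pipe(batch, truncation=True))
    return results
```

Note that this returns a flat list of dictionaries, whereas the list comprehension above produces a list of single-element lists, so downstream code would read `result['label']` rather than `lst[0]['label']`.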

In [12]:
label = []
score = []
for lst in sentiment_scoring:
    label.append(lst[0]['label'])
    score.append((lst[0]['score']))

In [13]:
sentiment_df = pd.DataFrame(list(zip(reviews, label, score)), columns=['review','label', 'sentiment_score'])

In [14]:
sentiment_df.head()

,review,label,sentiment_score
0,Great banking app with attractive interest rat...,POSITIVE,0.994833
1,"A bank like no other, no bank have such amazin...",POSITIVE,0.99939
2,Notice that the drop in interest rate of 0.8% ...,NEGATIVE,0.998625
3,Sending money into my GXS account is a breeze ...,NEGATIVE,0.994006
4,I have to say that the UI/UX is one of the bes...,POSITIVE,0.998633
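
The score in the table is the model's confidence in the predicted label, so a high score on a NEGATIVE row means "confidently negative", not "positive". If a single sortable number per review is wanted, one option is to fold the label into the sign. This is a sketch only; `signed_score` is a hypothetical helper.

```python
def signed_score(label, score):
    """Map (label, confidence) to a value in [-1, 1]; negative values mean NEGATIVE."""
    if label == "POSITIVE":
        return score
    if label == "NEGATIVE":
        return -score
    raise ValueError(f"unexpected label: {label!r}")

# Two rows taken from the table above
rows = [("POSITIVE", 0.994833), ("NEGATIVE", 0.998625)]
signed = [signed_score(label, score) for label, score in rows]
```

The same function could be applied row-wise to `sentiment_df` to add a signed column.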


Review analysis

In [15]:
reviews[0]

'Great banking app with attractive interest rates! Please allow us to add and/or save payees so we don’t have to keep typing out UEN numbers or account numbers. Would be nice to be able to add the debit card to Apple Pay too!!'

In [16]:
#context = reviews #this part, make it into a forloop or a function to output the desired data

def report(review):
    #returns a dictionary with the extracted positive point and the suggested improvement
    ans = {}
    ans['Good'] = qa_pipeline(question="What is good and positive about this application?", context=review)["answer"]
    ans['Suggested improvements'] = qa_pipeline(question="How can the application improve?", context=review)["answer"]
    return ans

context = reviews[0]
print(qa_pipeline(question="What is good and positive about this application?", context=context))
print(qa_pipeline(question="What can be added to the application?", context=context))
print(qa_pipeline(question="What do the application lack", context=context))

{'score': 0.27957335114479065, 'start': 23, 'end': 48, 'answer': 'attractive interest rates'}
{'score': 0.7127752304077148, 'start': 196, 'end': 206, 'answer': 'debit card'}
{'score': 0.10196660459041595, 'start': 128, 'end': 158, 'answer': 'UEN numbers or account numbers'}


In [17]:
check = reviews[0]
print(check)
report(check)

Great banking app with attractive interest rates! Please allow us to add and/or save payees so we don’t have to keep typing out UEN numbers or account numbers. Would be nice to be able to add the debit card to Apple Pay too!!


{'Good': 'attractive interest rates',
 'Suggested improvements': 'Please allow us to add and/or save payees'}
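
An extractive question-answering pipeline returns a span even when its score is low, as with the 0.10 score for the "lack" question above. A possible guard is sketched below. The helper name `confident_answer` and the `min_score` value of 0.2 are assumptions that would need tuning on real reviews.

```python
def confident_answer(qa, question, context, min_score=0.2):
    """Return the answer span, or None when the model's score is below min_score."""
    result = qa(question=question, context=context)
    return result["answer"] if result["score"] >= min_score else None
```

For example, `confident_answer(qa_pipeline, "What can be added to the application?", reviews[0])` would keep the answer only when the reported score clears the threshold.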

In [18]:
#qa on all the reviews
qa_reports = [report(review) for review in reviews]

In [19]:
print(qa_reports)



Another question-answering method that may be more accurate

In [20]:

# import
from transformers.pipelines import pipeline
from transformers import AutoModelForQuestionAnswering
from transformers import AutoTokenizer

# var
model_name = "deepset/xlm-roberta-base-squad2"

# generate pipeline
nlp = pipeline('question-answering', model=model_name, tokenizer=model_name)

input = {
    'question': 'How can the application improve?' ,
    'context': 'My name is Mohit. I am going to visit my grandmother. She is old.'
}
print(nlp(input))
# The context does not address the question, so expect a low-confidence answer


{'score': 0.024813469499349594, 'start': 37, 'end': 53, 'answer': ' my grandmother.'}


Comparing the default Hugging Face QA pipeline (qa_pipeline) with the xlm-roberta pipeline (nlp)

In [21]:
#i think what we can do is to analyse the sentiment of the review. If it is bad
print(reviews[3])
input = {
    'question': 'How can the application be improved?',
    'context': reviews[3]
}
print(nlp(input))

input = {
    'question': 'what is good about the application',
    'context': reviews[3]
}
print(nlp(input))

#compared with the default Hugging Face pipeline (via report)
check = reviews[3]
report(check)


Sending money into my GXS account is a breeze and instantaneous - regardless of the amounts. I’m able to immediately see that my funds are in GXS. 

Transferring money OUT is a huge issue. Since June I’ve had problems transferring amounts higher than $500 back to my other banking accounts, each time a red banner will pop up and said something went wrong please try again later. TODAY I can’t transfer more than $1000 back to myself - even the $1000 had to be transferred in TWO transactions of $500 each. Customer service officers did their best to help each time but it’s annoying that the advice provided (killing the app, re-logging in with SingPass) still don’t work.
{'score': 0.14726075530052185, 'start': 608, 'end': 655, 'answer': ' (killing the app, re-logging in with SingPass)'}
{'score': 0.018554577603936195, 'start': 608, 'end': 655, 'answer': ' (killing the app, re-logging in with SingPass)'}


{'Good': 'it’s annoying that the advice provided',
 'Suggested improvements': 'still don’t work'}
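
To make this kind of side-by-side comparison easier to repeat, the same question can be sent to each model and the results collected in one structure. The following is a sketch under the assumption that every model is callable with `question` and `context` keywords and returns a dictionary with `answer` and `score`. The name `compare_models` is hypothetical.

```python
def compare_models(models, question, context):
    """Ask every QA callable the same question; return rows sorted by score, highest first."""
    rows = []
    for name, qa in models.items():
        result = qa(question=question, context=context)
        rows.append({
            "model": name,
            "answer": result["answer"].strip(),  # some models return a leading space
            "score": result["score"],
        })
    return sorted(rows, key=lambda row: row["score"], reverse=True)
```

A call such as `compare_models({"default": qa_pipeline, "xlm-roberta": nlp}, "How can the application be improved?", reviews[3])` would then list both answers with their scores.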

In [22]:
import torch
from transformers import BertForQuestionAnswering
from transformers import BertTokenizer

#Model
model = BertForQuestionAnswering.from_pretrained('bert-large-uncased-whole-word-masking-finetuned-squad')

#Tokenizer
tokenizer = BertTokenizer.from_pretrained('bert-large-uncased-whole-word-masking-finetuned-squad')

Some weights of the model checkpoint at bert-large-uncased-whole-word-masking-finetuned-squad were not used when initializing BertForQuestionAnswering: ['bert.pooler.dense.bias', 'bert.pooler.dense.weight']
- This IS expected if you are initializing BertForQuestionAnswering from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForQuestionAnswering from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
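
The BERT model and tokenizer loaded above are not used in the cells that follow. Extractive QA models of this kind produce one start logit and one end logit per token, and the answer is the span with the best combined score. The span-selection step can be sketched in plain Python as follows. The function name `best_span` and the `max_answer_len` default are assumptions.

```python
def best_span(start_logits, end_logits, max_answer_len=30):
    """Return (start, end) token indices maximising start + end logit with start <= end."""
    best, best_score = (0, 0), float("-inf")
    for i, start_logit in enumerate(start_logits):
        # only consider end positions at or after the start, within the length limit
        for j in range(i, min(i + max_answer_len, len(end_logits))):
            score = start_logit + end_logits[j]
            if score > best_score:
                best, best_score = (i, j), score
    return best
```

Taking the argmax of each logit vector independently can yield an end position before the start position. Searching over valid pairs, as above, avoids that case.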


Using the H2O LLM (h2oGPTe)

In [23]:
import os


In [24]:
#read the key from the environment instead of hard-coding a secret in the notebook
API_KEY = os.getenv("H2O_GPT_E_API_KEY")

if not API_KEY:
    raise ValueError("Please configure h2ogpte API key")

REMOTE_ADDRESS = "https://h2ogpte.genai.h2o.ai"

from h2ogpte import H2OGPTE

client = H2OGPTE(address=REMOTE_ADDRESS, api_key=API_KEY)

In [25]:
llm = "h2oai/h2ogpt-4096-llama2-70b-chat"

answer = client.answer_question(question="Who are you?", llm=llm).content
print(f"{llm}: {answer}", flush=True)

h2oai/h2ogpt-4096-llama2-70b-chat: Hello! My name is LLaMA, I'm a large language model trained by a team of researcher at Meta AI. My primary function is to understand and respond to human input in a helpful and engaging manner. I can answer questions, provide information, and even generate creative content such as stories or dialogue. Is there anything specific you would like to know or talk about?


In [26]:
answer = client.answer_question(question="list out the good and bad things about the application from the following review Great banking app with attractive interest rates! Please allow us to add and/or save payees so we don’t have to keep typing out UEN numbers or account numbers. Would be nice to be able to add the debit card to Apple Pay too!! ", llm=llm).content
print(f"{llm}: {answer}", flush=True)

h2oai/h2ogpt-4096-llama2-70b-chat: Sure, here's a list of good and bad things about the application based on the review:

Good:

* The banking app offers attractive interest rates.
* The app is user-friendly and easy to navigate.

Bad:

* The app does not allow users to add and/or save payees, requiring them to repeatedly type out UEN numbers or account numbers.
* The app does not support adding the debit card to Apple Pay.
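
The question above has the review text pasted directly into the instruction. To run the same prompt over many reviews, the instruction and the review can be kept separate. This is a small sketch; the template wording and the names `PROMPT_TEMPLATE` and `build_prompt` are assumptions.

```python
PROMPT_TEMPLATE = (
    "List the good and bad things about the application "
    "from the following review.\n\nReview:\n{review}"
)

def build_prompt(review):
    """Fill the (assumed) template with one whitespace-normalised review."""
    return PROMPT_TEMPLATE.format(review=" ".join(review.split()))
```

The result could be passed on as `client.answer_question(question=build_prompt(reviews[0]), llm=llm).content`.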


In [27]:
print(answer)

Sure, here's a list of good and bad things about the application based on the review:

Good:

* The banking app offers attractive interest rates.
* The app is user-friendly and easy to navigate.

Bad:

* The app does not allow users to add and/or save payees, requiring them to repeatedly type out UEN numbers or account numbers.
* The app does not support adding the debit card to Apple Pay.


In [33]:
#data extraction
def review_analysis(review):
    extract = client.extract_data(
        text_context_list= [review],
        #pre_prompt_extract="Pay attention and look at all people. Your job is to collect their names.\n",
        prompt_extract="List the good thing and suggestions for improvement. Ignore grammatical errors and awkward languages"
    )
    # List of LLM answers per text input
    for extract_list_item in extract.content:
        for s in extract_list_item.split("\n"):
            print(s)


In [34]:
context = "Have been waiting for a slot for the account since GXS started and have been waiting until now (about 7 months) and still nothing, whenever anyone ask for why, the excuses given are still mostly the same with no actual changes or improvements (people can’t help but compare to the other bank “trust bank Singapore” that also launched just 1 day before GXS’s launch and believe trust bank Singapore to be better since trust bank can allow people to register without waiting for this long)"
review_analysis(context)

Good things:

* The user has been patiently waiting for an account slot on GXS for approximately 7 months.

Suggestions for improvement:

* Address the long wait time for account slots and provide clear explanations for the delay.
* Improve the registration process to be more efficient and comparable to other banks, such as Trust Bank Singapore, which allows people to register without a long wait time.
* Provide regular updates and improvements to the registration process to show progress and maintain user confidence.
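
The output above is printed as free text. If the results are to be stored alongside the reviews, the headings and bullets can be parsed into a dictionary. The sketch below assumes the LLM keeps the "Heading:" plus "* bullet" layout seen above, which is not guaranteed. The name `parse_sections` is hypothetical.

```python
def parse_sections(text):
    """Group '* ' bullet lines under the preceding 'Heading:' line."""
    sections, current = {}, None
    for line in text.splitlines():
        line = line.strip()
        if line.endswith(":") and not line.startswith("*"):
            current = line[:-1]
            sections[current] = []
        elif line.startswith("* ") and current is not None:
            sections[current].append(line[2:])
    # drop headings without bullets, such as an introductory sentence ending in ':'
    return {name: items for name, items in sections.items() if items}
```

Inside `review_analysis`, each item of `extract.content` could be passed to this function and the result returned instead of printed.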
