## Freshchat Analysis

First we will use the Freshchat API to fetch all passenger and driver chats for one day.

In [70]:
import requests
import json
import pandas as pd
import getpass

accesstoken = getpass.getpass('Freshchat access token')

Freshchat access token ········


In [78]:
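baseURL = 'https://baly9457.freshchat.com/'

def apicall(path, method='GET', data=None):
    """Send an authenticated request to the Freshchat v2 API and return the response."""
    # Strip any leading slash so 'groups' and '/reports/raw' both resolve without a double slash.
    url = baseURL + 'v2/' + path.lstrip('/')

    headers = {
        'Authorization': f'Bearer {accesstoken}'
    }

    response = requests.request(
        method,
        url,
        headers=headers,
        timeout=180,
        json=data
    )
    return response

response = apicall('groups', 'GET')
print(response.status_code)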

200


In [33]:
jsonbody = json.loads(response.content)
group_names = [group['name'] for group in jsonbody['groups']]
print(group_names)

['BI', 'chat assign', 'Inbound TAXI CC', 'LR', 'Operation', 'Outbound TAXI CC', 'Social media', 'TEST', 'Testing']


In [31]:
response = apicall('channels','GET')
print(response.status_code)
jsonbody = json.loads(response.content)
channel_names = [channel['name'] for channel in jsonbody['channels']]
print(channel_names)

200
['Baly - بلي', 'IG_baly_captain', 'IG_baly_iq', 'بلي', 'زبون', 'كابتن', 'كابتن بلي']


Here, زبون ("customer") is the channel for the passenger app and كابتن ("captain") is the channel for the driver app. Now let's fetch the conversations created over the previous day, which we can then filter by channel.

In [64]:
from datetime import datetime, timedelta
now = datetime.utcnow()
days = 1
start = datetime(now.year, now.month, now.day) - timedelta(days=days)
end = start + timedelta(days=days)
bodyjson = {
    "start": f"{start.isoformat()}Z",
    "end": f"{end.isoformat()}Z",
    "event": "Conversation-Created",
    #"event": "classic",
    "format": "csv"
}
response = apicall('/reports/raw','POST',bodyjson)
print(response.status_code)

202
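
The date window used in the request can be pulled into a small helper so it is easy to check in isolation. This is only a sketch: `previous_day_window` is a hypothetical name, and it assumes `now` is a timezone-aware datetime.

```python
from datetime import datetime, timedelta, timezone

def previous_day_window(now, days=1):
    """Return (start, end) ISO-8601 strings for the `days` full UTC days before `now`."""
    # Truncate to midnight UTC of the current day; the window ends there.
    midnight = now.astimezone(timezone.utc).replace(hour=0, minute=0, second=0, microsecond=0)
    start = midnight - timedelta(days=days)
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    return start.strftime(fmt), midnight.strftime(fmt)

start_iso, end_iso = previous_day_window(datetime(2024, 5, 22, 13, 11, tzinfo=timezone.utc))
print(start_iso, end_iso)  # 2024-05-21T00:00:00Z 2024-05-22T00:00:00Z
```

The two strings can be dropped into the `start` and `end` fields of the request body.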


In [73]:
body = json.loads(response.content)
resp = apicall(body['link']['href'],'GET')
print(resp.status_code)
body = json.loads(resp.content)
url = body['links'][0]['link']['href']
print(url)

200
https://fc-use1-00-reports-api-prod.s3.amazonaws.com/extraction/2024/5/20706bf6-fa89-4c5f-947a-18aa13e08ff2-1716249600000.zip?response-content-disposition=attachment%3B%20filename%3Dextraction%2F2024%2F5%2F20706bf6-fa89-4c5f-947a-18aa13e08ff2-1716249600000.zip&response-content-type=application%2Foctet-stream&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Date=20240522T131107Z&X-Amz-SignedHeaders=host&X-Amz-Expires=3599&X-Amz-Credential=AKIAYXZNLQ643P6LQBGG%2F20240522%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Signature=b9e6dd631a3846b811ccd836c4b5e09dc00dc277e8626d380f1b8f56ca07022a
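
The 202 status means the report is generated asynchronously, so the download link may not be there on the first GET. Below is a rough polling sketch. It assumes (not verified against the Freshchat docs) that an unfinished report simply comes back without any entries under `links`; `wait_for_report_url` and `fetch_status` are hypothetical names, and the fake bodies only stand in for real API responses.

```python
import time

def wait_for_report_url(fetch_status, attempts=10, delay_seconds=5.0, sleep=time.sleep):
    """Call fetch_status() until the parsed body lists a download link, then return its URL."""
    for attempt in range(attempts):
        body = fetch_status()
        links = body.get("links") or []  # assumption: no links yet means "still processing"
        if links:
            return links[0]["link"]["href"]
        if attempt < attempts - 1:
            sleep(delay_seconds)
    raise TimeoutError(f"report not ready after {attempts} attempts")

# Stand-in for the API: two "not ready" bodies, then one that carries a link.
fake_bodies = iter([
    {},
    {"links": []},
    {"links": [{"link": {"href": "https://example.invalid/report.zip"}}]},
])
url = wait_for_report_url(lambda: next(fake_bodies), sleep=lambda seconds: None)
print(url)
```

In this notebook the callable would be something like `lambda: json.loads(apicall(body['link']['href']).content)`.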


In [71]:
df = pd.read_csv(url,compression='zip')

In [72]:
tdf = df[['channel_name','user_name','conversation_id']]
tdf.head()

Unnamed: 0,channel_name,user_name,conversation_id
0,كابتن,محمد ماجد عطيه,fb607122-8b99-494f-994c-834f0b4c462c
1,IG_baly_captain,id.31,3487d097-e800-4fcf-98cc-433fa6133281
2,كابتن,رافع عباس عليوي,fe0cf415-869e-43e5-acd4-417257e87865
3,كابتن,حسين يقضان عبدالحسن,a50894ed-9b2d-4658-aecb-42ad2afd7abb
4,كابتن,زينب,1f573bc4-c99c-479a-98de-ecd319c15d1a


In [74]:
ddf = tdf[tdf['channel_name'] == 'كابتن']


In [76]:
ddf.describe()

Unnamed: 0,channel_name,user_name,conversation_id
count,2590,2590,2590
unique,1,1778,1787
top,كابتن,سيف مانع نجم,48af7344-0813-4d6a-8d49-1a4b7c7026bb
freq,2590,7,7


In [77]:
pdf = tdf[tdf['channel_name'] == 'زبون']
pdf.describe()

Unnamed: 0,channel_name,user_name,conversation_id
count,1012,1012,1012
unique,1,651,681
top,زبون,علي,fd7d81c2-12d4-403b-b05e-0b7b80795045
freq,1012,7,5


The getchats function fetches the messages for a given conversation ID. By default it keeps only messages whose actor_type is user; system and agent messages are skipped because I don't think they currently add much value.

In [142]:
def getchats(conversationID,actor='user'):
    path = "/conversations/{0}/messages".format(conversationID)
    resp = apicall(path)
    if resp.status_code != 200:
        print(resp.content)
        return None
        
    messages = json.loads(resp.content)['messages']
    
    extracted_data = [(message_part['text']['content'])
        for message in messages
        for message_part in message['message_parts']
        if 'text' in message_part and 'actor_type' in message and message['actor_type'] == actor]
    return extracted_data

In [143]:
print(getchats('fd7d81c2-12d4-403b-b05e-0b7b80795045'))

['BLY-240521-32876-1087', 'وقال لي اذا اخبرتي الشركة سوف تندمين ', 'لديّ مشكلة مع السائق', 'مشكلة في رحلة سابقة', 'Conversation reopened due to new incoming message', 'اسم السائق مصطفى جودت كاظم كان اسلوبه غير مهذب وقمت بالغاء الرحلة لم اصعد معه والسبب كان عندي بايسكل أطفال وقال لي اذا اخبرتي الشركة ', 'يعني يهدد بيه']


### Driver Data
Here we fetch the chat messages for every driver conversation, then save them to a CSV file.

In [169]:
testdf = ddf.copy()

chats = [getchats(cid) for cid in testdf['conversation_id']]
testdf['messages'] = chats

testdf.head()

Unnamed: 0,channel_name,user_name,conversation_id,messages
0,كابتن,محمد ماجد عطيه,fb607122-8b99-494f-994c-834f0b4c462c,"[الو, السلام عليكم, الو, Conversation was reop..."
2,كابتن,رافع عباس عليوي,fe0cf415-869e-43e5-acd4-417257e87865,"[Thank you, شكرا جزيلا ياطيب, مع التحية والاحت..."
3,كابتن,حسين يقضان عبدالحسن,a50894ed-9b2d-4658-aecb-42ad2afd7abb,"[تحديث معلومات الحساب, Conversation reopened d..."
4,كابتن,زينب,1f573bc4-c99c-479a-98de-ecd319c15d1a,"[قيمته قبل لا يراسلني, حذفت المحادثه, Conversa..."
6,كابتن,ستار محيميد علي,5e9ba414-9aa3-4698-97dd-f178d69fbc29,[هو انتو دائما تكولون نرفع شكوى وكله كذب وتطبك...
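
The driver frame has 2590 rows but only 1787 distinct conversation IDs, so fetching once per row repeats a lot of API calls. A possible way around that is to cache by ID, sketched below. `fetch_once_per_id` and `fake_fetch` are hypothetical names; in the notebook the `fetch` argument would be `getchats`.

```python
def fetch_once_per_id(conversation_ids, fetch):
    """Call fetch once per distinct ID and return results aligned with the input order."""
    cache = {}
    results = []
    for cid in conversation_ids:
        if cid not in cache:
            cache[cid] = fetch(cid)
        results.append(cache[cid])
    return results

# Fake fetcher that records which IDs were actually requested.
calls = []
def fake_fetch(cid):
    calls.append(cid)
    return [f"message for {cid}"]

ids = ["a", "b", "a", "c", "b"]
messages = fetch_once_per_id(ids, fake_fetch)
print(len(messages), len(calls))  # 5 3
```

Rows that share an ID end up pointing at the same list object, which is fine as long as the lists are not modified in place afterwards.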


In [171]:
datestring = datetime.now().strftime('%Y%m%d')
filename = "data.{0}.csv".format(datestring)
print(datestring)
testdf.to_csv(filename)

20240523
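
One thing to watch when saving: the `messages` column holds Python lists, and a CSV cell can only hold text, so the lists come back as plain strings on reload. A minimal sketch of one workaround is to store each list as JSON. `encode_messages` and `decode_messages` are hypothetical helpers, and the standard-library `csv` module stands in for the DataFrame round trip here.

```python
import csv
import io
import json

def encode_messages(messages):
    """Serialize a list of messages to a JSON string; a failed fetch (None) becomes []."""
    return json.dumps(messages or [], ensure_ascii=False)

def decode_messages(cell):
    """Parse a CSV cell written by encode_messages back into a list."""
    return json.loads(cell)

rows = [("c1", ["الو", "hello, my trip was cancelled"]), ("c2", None)]

# Write the rows to an in-memory CSV, then read them back.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["conversation_id", "messages"])
for cid, messages in rows:
    writer.writerow([cid, encode_messages(messages)])

buffer.seek(0)
restored = [(row["conversation_id"], decode_messages(row["messages"])) for row in csv.DictReader(buffer)]
print(restored)
```

With the DataFrame this would be `testdf['messages'].map(encode_messages)` before `to_csv`, and `.map(decode_messages)` on the column after `pd.read_csv`.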


### Passenger Data

In [261]:
pchats = [getchats(cid) for cid in pdf['conversation_id']]
pdfchats = pdf.copy()
pdfchats['messages'] = pchats
pdfchats.head()

filename = "data.passenger.{0}.csv".format(datetime.now().strftime('%Y%m%d'))
pdfchats.to_csv(filename)

## Summarization with Ollama
Let's summarize the chats using Ollama.

In [177]:
%%capture
pip install ollama tiktoken

In [316]:
unique_df = testdf.drop_duplicates(subset='conversation_id', keep='first')
unique_df.describe()

Unnamed: 0,channel_name,user_name,conversation_id,messages
count,1787,1787,1787,1787
unique,1,1778,1787,1762
top,كابتن,هيثم رعد فاضل,fb607122-8b99-494f-994c-834f0b4c462c,"[مركز بغداد, تحديث معلومات الحساب, السلام عليكم]"
freq,1787,2,1,7


In [317]:
import re
# Join each chat's messages with '|', dropping "Conversation ..." system notices.
flattened_chats = ['|'.join(filter(lambda x: re.search(r"^Conversation.*", x) is None, inner_array)) for inner_array in unique_df['messages']]
print(flattened_chats[1])

Thank you|شكرا جزيلا ياطيب|مع التحية والاحترام|السلام عليكم وافقت على رحله بالساعه '0053من باب السرقي شارع الفرزدق الى جسر ديالى وعند وصولي للنقطه قال انا اخذت تكسي فقلت له لماذا تطلب يااخي قال طلبت وتركتها قلت له انا الان جىت من مسافه بعيده ليش مالغيتها من الاول اذا ماتروح قال نسيت وللعلم لم يلغي الكلب مدة طويله الى ان لغيته انا يرجى التفضل بالاطلا
|التقييم في التطبيق


In [321]:
systemPrompt = "You are an expert analyst and you are hired to help a ride hailing company improve their customer satisfaction based on customer feedback."
llmModel = 'llama3'
ollamaBase = 'http://ollama-alibo-gpu-testing.apps.private.okd4.teh-2.snappcloud.io/'

import ollama


client = ollama.Client(ollamaBase)

In [322]:
print("|".join([x for x in chats[2] if not re.search(r"^Conversation.*",x)]))

تحديث معلومات الحساب|هلا كابتن|تحديث معلومات الحساب


In [323]:
def ollama_llm(question, context):
    formatted_prompt = f"Question: {question}\n\nData: {context}"
    response = client.chat(model=llmModel, options = { 'temperature': 0}, messages=[{'role': 'system', 'content': systemPrompt},{'role': 'user', 'content': formatted_prompt}])
    return response['message']['content']

In [325]:
prompt = f"""Summarize these chats for a ride hailing app with top 4 actionable insights for management. Each line contains the chat messages sent by a single driver.
            For each of your insights, you must include one or more conversations translated to English that best support your insight.
         """

context = "\n".join(flattened_chats[0:60])

import tiktoken
tokenizer = tiktoken.encoding_for_model('gpt-3.5-turbo')
token_length = len(tokenizer.encode(context))
print(f"using {token_length} tokens in input")
resp = ollama_llm(prompt,context)
print(resp)

using 6173 tokens in input
I'll do my best to help you with the ride-hailing company's technical issues. It seems like there are several problems occurring, including:

1. The app is not functioning correctly, and users are experiencing difficulties.
2. There are issues with the payment system, and some users are unable to pay for their rides.
3. Some users are reporting that they are being charged incorrectly or that their payments are not being processed.
4. There are also reports of users being unable to cancel their rides or request new ones.

To help resolve these issues, I would recommend the following steps:

1. Conduct a thorough analysis of the app's code and identify any potential bugs or errors that may be causing the problems.
2. Implement a system for tracking and resolving user complaints in real-time, so that users can quickly get assistance if they encounter any issues.
3. Develop a more robust payment processing system to ensure that payments are processed correctly an
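
Instead of picking a fixed slice such as the first 60 chats, the batches could be sized by a token budget. The sketch below is one way to do that. `batch_by_budget` is a hypothetical helper, and the whitespace word count is only a stand-in for a real tokenizer. Note that the tiktoken count printed above is itself only an approximation for llama3, which uses a different tokenizer.

```python
def batch_by_budget(chats, max_tokens, count_tokens):
    """Group consecutive chats so each newline-joined batch stays within max_tokens."""
    batches, current, used = [], [], 0
    for chat in chats:
        cost = count_tokens(chat) + 1  # +1 as a rough allowance for the joining newline
        if current and used + cost > max_tokens:
            batches.append(current)
            current, used = [], 0
        current.append(chat)  # an oversized chat still gets a batch of its own
        used += cost
    if current:
        batches.append(current)
    return batches

def word_count(text):
    """Stand-in token counter: number of whitespace-separated words."""
    return len(text.split())

demo = ["a b c", "d e", "f g h i", "j"]
print(batch_by_budget(demo, max_tokens=8, count_tokens=word_count))
# [['a b c', 'd e'], ['f g h i', 'j']]
```

With the notebook's tokenizer, `count_tokens` could be `lambda text: len(tokenizer.encode(text))`.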

Now we do the same thing for passengers.

In [349]:
unique_pdf = pdf.drop_duplicates(subset='conversation_id', keep='first')
unique_pdf.describe()

Unnamed: 0,channel_name,user_name,conversation_id
count,681,681,681
unique,1,651,681
top,زبون,علي,28516b4d-3cef-4270-95b0-43ae1a3139c2
freq,681,4,1


In [358]:
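# unique_pdf comes from pdf, which lacks 'messages' (the original KeyError); pdfchats shares its index.
unique_pmessages = pdfchats.loc[unique_pdf.index, 'messages']
flattened_pchats = ['|'.join(filter(lambda x: re.search(r"^(Conversation|التحدث مع موظف الدعم).*", x) is None, inner_array)) for inner_array in unique_pmessages]
print(flattened_pchats[0])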

In [352]:
prompt = """Summarize these chats for a ride hailing app with top 4 actionable insights for management. Each line contains the chat messages sent by a single user. For each of your insights,
            include one or more conversations translated to English that best support your insight."""

context = "\n".join(flattened_pchats[0:40])

import tiktoken
tokenizer = tiktoken.encoding_for_model('gpt-3.5-turbo')
token_length = len(tokenizer.encode(context))
print(f"using {token_length} tokens in input")
resp = ollama_llm(prompt,context)
print(resp)

using 4350 tokens in input
I'm happy to help you with your ride-hailing company's customer service issue. It seems like there are several problems that need to be addressed.

Firstly, it appears that there is a technical issue within the app that needs to be resolved. You mentioned that the app stopped working and that you were unable to access your account information. This is unacceptable and I apologize for any inconvenience this may have caused.

Secondly, it seems like there was an issue with the payment system. You mentioned that you were charged 3650 dinars without your consent, which is a significant amount of money. I understand that this must be very frustrating for you.

Thirdly, it appears that there are some issues with the customer service team's response time. You mentioned that it took them several days to respond to your initial complaint, and even then, they didn't provide a satisfactory solution.

To resolve these issues, I recommend the following:

1. Technical issu

## Summarization with GPT-3.5 Turbo

In [326]:
%%capture
pip install openai

In [283]:
from openai import AzureOpenAI
from dotenv import load_dotenv
import os
load_dotenv()

True

In [333]:
aclient = AzureOpenAI(
        api_key=os.environ['AZURE_OPENAI_KEY'],
        api_version = "2023-05-15")

deployment=os.environ['AZURE_OPENAI_DEPLOYMENT']
ENCODING_MODEL = "gpt-3.5-turbo"

In [357]:
def get_completion(prompt):
    messages = [
                {"role": "system", "content": "You are an expert analyst hired to help a ride hailing company improve their CSAT score."},
                {"role":"user", "content":prompt}
               ]
    response = aclient.chat.completions.create(
            model = deployment,
            messages = messages,
            temperature = 0,
            max_tokens = 900)
    return response.choices[0].message.content

In [356]:
context = "\n".join(flattened_pchats[0:25])

prompt = f"""
        Each line represents a conversation, containing messages sent by a single user in a session. Generate top 5 actionable insights for management.
        For each of your insights, you must include some conversations with original text and its English translation that best support it.
        ```{context}```
         """

promptmap = f"""
        Each line represents a conversation, containing messages sent by a single user in a session. Filter those conversations which are about maps and location/pickup related issues and generate
        actionable insights for management. For each of your insights, you must include some conversations with original text and its English translation that best support your insight.
        ```{context}```
         """
response = get_completion(prompt)
print(response)

Insight 1: Improve driver communication and professionalism
- Conversation: السلام عليكم وافقت على رحله بالساعه '0053من باب السرقي شارع الفرزدق الى جسر ديالى وعند وصولي للنقطه قال انا اخذت تكسي فقلت له لماذا تطلب يااخي قال طلبت وتركتها قلت له انا الان جىت من مسافه بعيده ليش مالغيتها من الاول اذا ماتروح قال نسيت وللعلم لم يلغي الكلب مدة طويله الى ان لغيته انا يرجى التفضل بالاطلا
- Translation: "Hello, I agreed to a trip at 00:53 from Bab Al-Sarayi Street to Jisr Diyala. When I arrived at the pickup point, the driver told me he took another passenger. I asked him why he requested the trip if he already had a passenger. He said he forgot and didn't cancel the trip for a long time until I canceled it. Please take action on this."

Insight 2: Streamline the complaint resolution process
- Conversation: دعم هذا ضلم الي ديصير لغيت الرحله اني لان الزبونه جايبه وياها 6نفرات سيارتي ماتكفي وخصمتو مني 1000دينار واني كتبلتكم سبب الغاء BLY-240522-50706-0596
- Translation: "Support, this is unfair. I 
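
Since `prompt` is an f-string, it bakes in whatever `context` held when the cell ran. To run several batches, it may be simpler to build the prompt from a function. This is a sketch only: `build_prompt` and the shortened `instruction` text are hypothetical, not the exact wording used above.

```python
FENCE = "`" * 3  # the triple backticks that delimit the chat data inside the prompt

def build_prompt(chats, instruction):
    """Join one batch of flattened chats and wrap it in the instruction text."""
    context = "\n".join(chats)
    return f"{instruction}\n{FENCE}{context}{FENCE}"

instruction = "Each line represents a conversation. Generate top 5 actionable insights for management."
first = build_prompt(["chat one", "chat two"], instruction)
second = build_prompt(["chat three"], instruction)
print("chat one" in first, "chat one" in second)  # True False
```

Each batch would then be sent with something like `get_completion(build_prompt(flattened_pchats[31:60], instruction))`.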

In [340]:
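context = "\n".join(flattened_pchats[31:60])
# prompt is an f-string that captured context when it was defined; re-run the cell
# that builds prompt after changing context, otherwise this call reuses the earlier batch.
response = get_completion(prompt)
print(response)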

Insight 1: Improve driver behavior and professionalism
- Conversation: "والولد بقى يراسلني ويعاكسني وكتله اني مره مزوجه تلافي للمشكله" (The driver kept messaging and harassing me, saying that I am married to avoid the problem)
- Translation: The driver's unprofessional behavior and harassment towards customers can negatively impact their experience. It is important to address this issue and ensure that drivers maintain a high level of professionalism.

Insight 2: Enhance communication with customer support
- Conversation: "التحدث مع موظف الدعم|سلام عليكم|رجاءاً كلش محتاجتكم" (Talking to customer support, Hello, I really need your help)
- Translation: Customers value effective communication with customer support. It is crucial to provide prompt and helpful assistance to address their concerns and resolve any issues they may have.

Insight 3: Ensure accurate fare calculation and payment
- Conversation: "السائق طلب مبلغ إضافي|مشكلة في رحلة سابقة|الكابتن اخذ مبلغ اضافي لعب بعداد الكروة" (T

In [338]:
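context = "\n".join(flattened_pchats[61:90])
# Re-run the cell that builds prompt first; the f-string does not pick up the new context on its own.
response = get_completion(prompt)
print(response)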

Insight 1: Improve driver behavior and professionalism
- Conversation: "ان شاء الـلَّــــﷻـــه|يرجى الاهتمام بهذه الامور|فضلا عن نظافة وترتيب السيارة|وكذلك رقم ونوع ولون السيارة|غالبا الكابتن الموجود بالتطبيق يختلف عن الحقيقة"
- Translation: "Please pay attention to these matters, including cleanliness and tidiness of the car, as well as the accuracy of the car's number, type, and color. Often, the driver shown in the app is different from the actual driver."
- This conversation highlights the importance of improving driver behavior and professionalism. Customers expect drivers to maintain cleanliness in their cars and provide accurate information about the car. The company should ensure that the drivers displayed in the app match the actual drivers to avoid any discrepancies.

Insight 2: Address issues related to overcharging
- Conversation: "والرحلة بالتطبيق 3000|صديقي دفعت 7000|على هذا الرقم|07724111351|صديقي رحت رحلة والكابتن اخذ مبلغ اضافي"
- Translation: "The trip in the app was 