# **Project : A Case Study of ExpressWay Logistics**

**Business Overview:**

ExpressWay Logistics is a dynamic logistics service provider, committed to delivering efficient, reliable and cost-effective courier transportation and warehousing solutions. With a focus on speed, precision and customer satisfaction, we aim to be the go-to partner for customers seeking seamless courier services. Our core service involves ensuring operational efficiency throughout our delivery and courier services, including inventory management, durable packaging, swift dispatch of couriers, real-time tracking of shipments and on-time delivery of couriers as promised. We are committed to enhancing our logistics and courier services and improving connectivity for our customers.

**Current Challenge:**

ExpressWay Logistics faces numerous challenges in ensuring seamless deliveries and customer satisfaction. These challenges include managing various customer demands simultaneously, addressing delays in deliveries and ensuring products arrive intact and safe. Additionally, the company struggles with the complexity of efficiently storing and handling a large volume of packages while still meeting customer expectations. Moreover, maintaining a skilled workforce capable of handling various aspects of logistics operations presents its own set of challenges. Overcoming these obstacles requires a comprehensive approach that integrates innovative technology, strategic planning, and continuous improvement initiatives to ensure smooth operations and exceptional service delivery.

**Objective:**

Our primary objective is to conduct a sentiment analysis of user-generated reviews across various digital channels and platforms. By paying attention to their feedback, we want to find ways to make our services better - like handling different customer demands simultaneously, dealing with late deliveries, and keeping packages secured and intact. Through the application of prompt engineering methodologies and sentiment analysis, we'll figure out if sentiments expressed by users for our courier services are Positive or Negative. This will help us understand where we need to improve in order to meet customer expectations and keep them happy. With a focus on getting better all the time, we'll overcome the challenges at ExpressWay Logistics and make our services the best.

**Data Description:**

The dataset, "courier_service_review.csv", is structured to facilitate sentiment analysis of courier service reviews. Here's a brief description of the data columns:

1. id: This column contains unique identifiers for each review entry. It helps in distinguishing and referencing individual reviews.
2. review: This column includes the actual text of the courier service reviews. The reviews are likely composed of customer opinions and experiences regarding different aspects of the services provided by ExpressWay Logistics.
3. sentiment: This column contains the sentiment label (Positive or Negative) assigned to each review.

## **Step 1. Setup**


### Installation

In [None]:
# Installing necessary packages: openai (LLM), tiktoken (token count)

!pip install -q openai==1.55.3 tiktoken==0.6.0 session-info --quiet

### Imports

In [None]:
# Import all Python packages required to access the Azure Open AI API.
# Import additional packages required to access datasets and create examples.

from openai import AzureOpenAI
import json
import random
import tiktoken
import session_info

import pandas as pd
import numpy as np

from collections import Counter
from tqdm import tqdm
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score
from tabulate import tabulate

In [None]:
#session_info.show()

### Authentication

**(A) Reading the config.json file**

In [None]:
# Creating variable data for the azure credentials

with open('config.json', 'r') as az_creds:
    data = az_creds.read()

In [None]:
# Converting it to a json dictionary

creds = json.loads(data)

In [None]:
# Creating an Azure OpenAI instance

client = AzureOpenAI(
    azure_endpoint=creds["AZURE_OPENAI_ENDPOINT"],
    api_key=creds["AZURE_OPENAI_KEY"],
    api_version=creds["AZURE_OPENAI_APIVERSION"]
)

In [None]:
# Storing the chat model name from the credentials in a variable

chat_model_id = creds["CHATGPT_MODEL"]

### Utilities

This function calculates the number of tokens used in a list of messages, which is useful for estimating API usage costs in OpenAI models. It first sets up token encoding based on the gpt-4 model. Each message is counted with an overhead of three tokens due to the special formatting (<|start|>, role (system, user, or assistant), and <|end|>). It then iterates through the messages, encoding and counting tokens for each key-value pair. Finally, an additional three tokens are added to account for the assistant's reply formatting.

In [None]:
# Defining a function to evaluate the token consumption per model for cost estimating

def num_tokens_from_messages(messages):

    """
    Return the number of tokens used by a list of messages.
    Adapted from the Open AI cookbook token counter
    """

    encoding = tiktoken.encoding_for_model("gpt-4")

    # Each message is sandwiched with <|start|>role and <|end|>
    # Hence, messages look like: <|start|>system or user or assistant{message}<|end|>

    tokens_per_message = 3 # token1:<|start|>, token2:system(or user or assistant), token3:<|end|>

    num_tokens = 0

    for message in messages:
        num_tokens += tokens_per_message
        for key, value in message.items():
            num_tokens += len(encoding.encode(value))

    num_tokens += 3  # every reply is primed with <|start|>assistant<|message|>

    return num_tokens
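
The accounting used by the function above can be checked by hand with a stand-in tokenizer. The sketch below is only an illustration: `count_message_tokens` and `whitespace_encode` are hypothetical names, and splitting on whitespace is an assumption that does not match the real tiktoken encoding. It shows how the per-message overhead and the reply priming add up.

```python
def count_message_tokens(messages, encode, tokens_per_message=3, reply_priming=3):
    """Count tokens for a list of chat messages using any `encode` callable."""
    total = 0
    for message in messages:
        total += tokens_per_message          # fixed overhead per message
        for value in message.values():
            total += len(encode(value))      # tokens in the role and content strings
    return total + reply_priming             # overhead for the assistant's reply


# Stand-in tokenizer (assumption): one token per whitespace-separated word.
whitespace_encode = str.split

demo_messages = [
    {"role": "system", "content": "Classify the review as positive or negative."},
    {"role": "user", "content": "Parcel arrived late and damaged."},
]

# (3 + 1 + 7) + (3 + 1 + 5) + 3 = 23
demo_total = count_message_tokens(demo_messages, whitespace_encode)
print(demo_total)
```

With the real encoder, the same arithmetic applies; only the per-string token counts change.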

## Task : Sentiment Analysis

## **Step 2: Assemble Data**


**(A) Upload and read csv file**

In [None]:
# Reading CSV File with dataset into a dataframe

cs_reviews_df = pd.read_csv("courier_service_review.csv")

In [None]:
# Checking variables, data types, missing values, and size of dataset

cs_reviews_df.info()

In [None]:
# Printing some examples to get to know the data

cs_reviews_df.sample(5)

**(B) Count Positive and Negative Sentiment Reviews**

In [None]:
# Verifying if label is skewed

cs_reviews_df['sentiment'].value_counts()

In [None]:
# Checking the number of rows and columns

cs_reviews_df.shape

**(C) Split the Dataset**

In [None]:
# Splitting the data, holding out a larger test set for more reliable evaluation

cs_examples_df, cs_gold_examples_df = train_test_split(
    cs_reviews_df,
    test_size=0.40, # 40% random sample selected for gold examples - increased from initial 20%
    random_state=42 # ensures that the splits are the same for every session
)

In [None]:
# Verifying size of train and test sets

(cs_examples_df.shape, cs_gold_examples_df.shape)

To select gold examples for this session, we sample randomly from the test data using `random_state=42`. This ensures that the examples are the same across multiple runs of the sampling (i.e., they are randomly selected but do not change between different runs of the notebook). Note that we limit the number of gold examples only to keep execution times low for illustration. In practice, a large number of gold examples facilitates robust estimates of model accuracy.
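
The effect of a fixed seed can be illustrated without pandas. The sketch below uses Python's `random` module as a stand-in for `DataFrame.sample(random_state=42)`; the `review_ids` list and the `sample_ids` helper are hypothetical and only serve to show that the same seed yields the same sample.

```python
import random

review_ids = list(range(1, 21))  # hypothetical review ids


def sample_ids(ids, k, seed):
    """Draw k ids without replacement using a dedicated, seeded generator."""
    rng = random.Random(seed)
    return rng.sample(ids, k)


first_draw = sample_ids(review_ids, 5, seed=42)
second_draw = sample_ids(review_ids, 5, seed=42)

print(first_draw == second_draw)  # True: same seed, same sample
```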

In [None]:
# Selecting columns to extract values for gold_examples

columns_to_select = ['review','sentiment']

In [None]:
# Sampling 53 gold examples for evaluation. With a fixed random_state the same
# examples are drawn, in the same shuffled order, in every session.
# The result is serialized to a JSON string.

gold_examples = (
    cs_gold_examples_df.loc[:, columns_to_select]
    .sample(53, random_state=42)  # <- ensures that gold examples are the same for every session
    .to_json(orient='records')
)

In [None]:
# Checking the gold examples

gold_examples

In [None]:
# Parsing the JSON string into a list of dictionaries. Verifying format with one sample.

json.loads(gold_examples)[0]

## **Step 3: Derive Prompt**


In [None]:
# Establishing user input message

user_message_template = """```{courier_service_review}```"""

**(A) Write Zero Shot System Message**

In [None]:
# Writing zero shot system message

zero_shot_system_message = """
Classify customer review in the input as positive or negative in sentiment.
Reviews will be delimited by triple backticks, that is, ```.
Do not explain your answer. Your answer should only contain the label: positive or negative.
"""


**(B) Create Zero Shot Prompt**

In [None]:
# Creating zero shot prompt to be input ready for completion function.
# Only the system message is included here; evaluate_prompt appends the
# formatted user message for each review.

zero_shot_prompt = [
    {"role": "system", "content": zero_shot_system_message}
]

In [None]:
# Getting the token consumption for the zero shot prompt

num_tokens_from_messages(zero_shot_prompt)

**(C) Write Few Shot System Message**

In [None]:
# Writing few shot system message

few_shot_system_message = """
Classify customer review in the input as positive or negative in sentiment.
Reviews will be delimited by triple backticks, that is, ```.
Do not explain your answer. Your answer should only contain the label: positive or negative.
"""

Merely selecting random samples from the polarity subsets is not enough because the examples included in a prompt are prone to a set of known biases such as:
 - Majority label bias (frequent answers in predictions)
 - Recency bias (examples near the end of the prompt)


To reduce these biases, it is important to have a balanced set of examples arranged in random order. Let us create a Python function that generates such examples:

The ```create_examples``` function generates a randomized set of example reviews with equal representation from the two sentiment classes: Positive and Negative. It first filters the dataset into two separate groups based on sentiment labels. Then, it randomly selects `n` examples from each class and combines them into a single dataset. The combined examples are shuffled to ensure randomness before being converted to JSON. Each time the function runs, it produces a different set of randomized examples from the dataset.

In [None]:
# Function to create examples. See description above. (For higher efficiency, I'm using n=16)

def create_examples(dataset, n=16):

    """
    Return a JSON list of randomized examples of size 2n with two classes.
    Create subsets of each class, choose random samples from the subsets,
    merge and randomize the order of samples in the merged list.
    Each run of this function creates a different random sample of examples
    chosen from the training data.

    Args:
        dataset (DataFrame): A DataFrame with examples (review + label)
        n (int): number of examples of each class to be selected

    Output:
        randomized_examples (JSON): A JSON with examples in random order
    """

    positive_reviews = (dataset.sentiment == 'Positive')
    negative_reviews = (dataset.sentiment == 'Negative')
    columns_to_select = ['review', 'sentiment']

    positive_examples = dataset.loc[positive_reviews, columns_to_select].sample(n)
    negative_examples = dataset.loc[negative_reviews, columns_to_select].sample(n)

    examples = pd.concat([positive_examples, negative_examples])

    # sampling without replacement is equivalent to random shuffling

    randomized_examples = examples.sample(2*n, replace=False)

    return randomized_examples.to_json(orient='records')

**(D) Create Examples For Few shot prompt**

In [None]:
# Creating the examples (2n=32, 16 per class) from the training split only,
# so that no gold example can leak into the few-shot prompt

examples = create_examples(cs_examples_df, n=16)

In [None]:
# Parsing the JSON string to inspect the examples

json.loads(examples)
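
As a quick sanity check on the two biases mentioned earlier, one can count the labels in the JSON string and look at how long the longest run of identical labels is. The sketch below is one possible way to do this; `label_counts`, `longest_label_run` and the `demo_examples` reviews are hypothetical and not part of the original notebook.

```python
import json
from collections import Counter


def label_counts(examples_json):
    """Return a Counter of sentiment labels in a JSON list of examples."""
    return Counter(example["sentiment"] for example in json.loads(examples_json))


def longest_label_run(examples_json):
    """Return the length of the longest run of identical consecutive labels."""
    labels = [example["sentiment"] for example in json.loads(examples_json)]
    longest = current = 0
    previous = None
    for label in labels:
        current = current + 1 if label == previous else 1
        longest = max(longest, current)
        previous = label
    return longest


# Made-up reviews in the same record format that create_examples returns.
demo_examples = json.dumps([
    {"review": "Fast and careful delivery.", "sentiment": "Positive"},
    {"review": "Package arrived crushed.", "sentiment": "Negative"},
    {"review": "Nobody answered my calls.", "sentiment": "Negative"},
    {"review": "Tracking was accurate.", "sentiment": "Positive"},
])

print(label_counts(demo_examples))       # equal counts indicate a balanced set
print(longest_label_run(demo_examples))  # long runs near the end hint at recency bias
```

Applied to the output of `create_examples`, equal counts per class confirm the balance, while a very long run of one label at the end of the list may be a reason to draw a new sample.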

With the examples in place, we can now assemble a few-shot prompt. Since we will be using the few-shot prompt several times during evaluation, let us write a function to create a few-shot prompt (the logic of this function is described below).

The `create_prompt` function constructs a few-shot prompt formatted for the OpenAI API, incorporating system instructions, example interactions, and a user message template. It starts by adding the system message, which provides guidelines for sentiment analysis. Then, it loops through the provided examples, formatting each review as a user message and its corresponding sentiment as an assistant response. These are appended to the prompt list in the required sequence. The final output is a structured list of dictionaries that can be directly used as input for an OpenAI model.

In [None]:
# Defining function to create few_shot_prompt

def create_prompt(system_message, examples, user_message_template):

    """
    Return a prompt message in the format expected by the Open AI API.
    Loop through the examples and parse them as user message and assistant
    message.

    Args:
        system_message (str): system message with instructions for sentiment analysis
        examples (str): JSON string with list of examples
        user_message_template (str): string with a placeholder for courier service reviews

    Output:
        few_shot_prompt (List): A list of dictionaries in the Open AI prompt format
    """

    few_shot_prompt = [{'role':'system', 'content': system_message}]

    for example in json.loads(examples):
        example_review = example['review']
        example_sentiment = example['sentiment']

        few_shot_prompt.append(
            {
                'role': 'user',
                'content': user_message_template.format(
                    courier_service_review=example_review
                )
            }
        )

        few_shot_prompt.append(
            {'role': 'assistant', 'content': f"{example_sentiment}"}
        )

    return few_shot_prompt
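
The message sequence that `create_prompt` is expected to produce (one system message, then alternating user and assistant turns) can be verified with a small structural check. This is a minimal sketch; `has_few_shot_structure` and the demo prompt are hypothetical additions for illustration.

```python
def has_few_shot_structure(prompt):
    """Check for one system message followed by alternating user/assistant turns."""
    if not prompt or prompt[0]["role"] != "system":
        return False
    turns = prompt[1:]
    if len(turns) % 2 != 0:
        return False  # every user example needs a matching assistant label
    expected_roles = ["user", "assistant"] * (len(turns) // 2)
    return [message["role"] for message in turns] == expected_roles


# Hand-written prompt in the same format (made-up review text).
demo_prompt = [
    {"role": "system", "content": "Classify the review as positive or negative."},
    {"role": "user", "content": "```Parcel arrived on time.```"},
    {"role": "assistant", "content": "Positive"},
]

print(has_few_shot_structure(demo_prompt))       # True
print(has_few_shot_structure(demo_prompt[:-1]))  # False: missing assistant label
```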

**(E) Create Few Shot Prompt**

In [None]:
# Creating Few shot prompt

few_shot_prompt = create_prompt(
    few_shot_system_message,
    examples,
    user_message_template
    )

In [None]:
# Verifying content for executed few shot prompt

few_shot_prompt

In [None]:
# Getting the token count for few_shot_prompt

num_tokens_from_messages(few_shot_prompt)

## **Step 4: Evaluate prompts**


Now we have two prompts that we need to evaluate using gold labels. Since the few-shot prompt depends on the sample of examples that was drawn to make up the prompt, we expect some variability in evaluation. Hence, we evaluate each prompt multiple times to get a sense of the average and the variation around the average.

To reiterate, the choice of prompt should account for variability due to the choice of the random sample. To aid repeated evaluation, we assemble an evaluation function.

The ```evaluate_prompt``` function evaluates the performance of a sentiment analysis prompt using a micro-F1 score by comparing model predictions to gold-standard examples. It iterates through the gold examples, formats each review into a user input message, and appends it to the provided prompt. The prompt is then sent to the OpenAI model for prediction, with the temperature set to 0 to make outputs as deterministic as possible and `max_tokens` restricted so that the reply contains only the label. The predicted sentiments are collected alongside ground truth labels for evaluation. Finally, the function prints a comparison table of reviews, predictions, and actual labels, and returns the micro-F1 score.

In [None]:
# Defining function to evaluate both prompting techniques

def evaluate_prompt(prompt, gold_examples, user_message_template):

    """
    Return the micro-F1 score for predictions on gold examples.
    For each example, we make a prediction using the prompt. Gold labels and
    model predictions are aggregated into lists and compared to compute the
    F1 score.

    Args:
        prompt (List): list of messages in the Open AI prompt format
        gold_examples (str): JSON string with list of gold examples
        user_message_template (str): string with a placeholder for courier service review

    Output:
        micro_f1_score (float): Micro-F1 score computed by comparing model predictions
                                with ground truth
    """

    model_predictions, ground_truths, review_texts = [], [], []

    for example in json.loads(gold_examples):
        gold_input = example['review']
        user_input = [
            {
                'role':'user',
                'content': user_message_template.format(courier_service_review=gold_input)
            }
        ]

        try:
            response = client.chat.completions.create(
                model=chat_model_id,
                messages=prompt+user_input,
                temperature=0, # <- Note the low temperature (For a deterministic response)
                max_tokens=2 # <- Note how we restrict the output to not more than 2 tokens
            )

            prediction = response.choices[0].message.content

            model_predictions.append(prediction.strip().lower()) # <- removes extraneous white spaces & sets all in lower case
            ground_truths.append(example['sentiment'].lower())
            review_texts.append(gold_input)

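        except Exception as e:
            # Report the failure instead of skipping it silently, then move on
            print(f"Skipping example due to API error: {e}")
            continue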

    micro_f1_score = f1_score(ground_truths, model_predictions, average="micro")

    table_data = [[text, pred, truth] for text, pred, truth in zip(review_texts, model_predictions, ground_truths)]
    headers = ["Review", "Model Prediction", "Ground Truth"]
    print(tabulate(table_data, headers=headers, tablefmt="grid"))

    return micro_f1_score
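
For single-label classification such as this one, micro-averaged F1 pools true positives, false positives and false negatives over all classes, which makes it equal to plain accuracy. The sketch below reimplements both metrics in pure Python to show this. It is an illustration only, not the scikit-learn implementation, and the label lists are made up.

```python
def micro_f1(truths, predictions):
    """Micro-averaged F1: pool TP, FP and FN over all labels before computing F1."""
    labels = set(truths) | set(predictions)
    tp = fp = fn = 0
    for label in labels:
        for truth, prediction in zip(truths, predictions):
            if prediction == label and truth == label:
                tp += 1
            elif prediction == label:
                fp += 1
            elif truth == label:
                fn += 1
    return 2 * tp / (2 * tp + fp + fn)


def accuracy(truths, predictions):
    """Share of predictions that match the ground truth."""
    return sum(t == p for t, p in zip(truths, predictions)) / len(truths)


# Made-up labels: 4 of 5 predictions are correct.
demo_truths = ["positive", "negative", "negative", "positive", "negative"]
demo_predictions = ["positive", "negative", "positive", "positive", "negative"]

print(micro_f1(demo_truths, demo_predictions))  # 0.8
print(accuracy(demo_truths, demo_predictions))  # 0.8
```

One practical consequence: with 53 gold examples, each misclassified review lowers the score by 1/53, roughly 0.019.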


Let us now use this function to do one evaluation of both prompts assembled so far, each time computing the micro-F1 score.

**(A) Evaluate zero shot prompt**

In [None]:
# Executing function to evaluate zero shot prompt

evaluate_prompt(zero_shot_prompt, gold_examples, user_message_template)

**(B) Evaluate few shot prompt**

In [None]:
# Executing function to evaluate few_shot_prompt

evaluate_prompt(few_shot_prompt, gold_examples, user_message_template)

However, this is just *one* choice of examples. We will need to run these evaluations with multiple choices of examples to get a sense of variability in F1 score for the few-shot prompt. As an example, let us run evaluations for the few-shot prompt 5 times (increased to 10 in final run).

This part evaluates the variability of the micro-F1 score for few-shot and zero-shot prompts by running multiple evaluations with different example selections. It loops num_eval_runs times, generating a new random set of examples in each iteration. A zero-shot prompt is created using only system instructions, while a few-shot prompt includes both system instructions and the selected examples. Each prompt is then evaluated on gold-standard examples to measure performance. The resulting micro-F1 scores for both methods are stored in separate lists for further analysis.


In [None]:
# Setting the number of evaluation runs used to compare F1 score variability across prompts

num_eval_runs = 10

In [None]:
# Storing results in lists

zero_shot_performance = []
few_shot_performance = []

In [None]:
# Running the evaluations

for _ in tqdm(range(num_eval_runs)):

    # For each run create a new sample of examples
    examples = create_examples(cs_examples_df)

    # Assemble the zero shot prompt (system message only, no examples)
    zero_shot_prompt = [{'role':'system', 'content': zero_shot_system_message}]

    # Assemble the few shot prompt with these examples
    few_shot_prompt = create_prompt(few_shot_system_message, examples, user_message_template)

    # Evaluate zero shot prompt accuracy on gold examples
    zero_shot_micro_f1 = evaluate_prompt(zero_shot_prompt, gold_examples, user_message_template)

    # Evaluate few shot prompt accuracy on gold examples
    few_shot_micro_f1 = evaluate_prompt(few_shot_prompt, gold_examples, user_message_template)

    zero_shot_performance.append(zero_shot_micro_f1)
    few_shot_performance.append(few_shot_micro_f1)

**(C) Calculate Mean and Standard Deviation for Zero Shot Prompt and Few Shot Prompt**

Compute the average (mean) and measure the variability (standard deviation) of the evaluation scores for both zero shot and few shot prompts.

In [None]:
# Calculating mean and standard deviation of the micro-F1 score for the Zero Shot Prompt

np.array(zero_shot_performance).mean(), np.array(zero_shot_performance).std()

In [None]:
# Calculating mean and standard deviation of the micro-F1 score for the Few Shot Prompt

np.array(few_shot_performance).mean(), np.array(few_shot_performance).std()
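
Note that NumPy's `.std()` uses `ddof=0` by default, i.e. the population standard deviation. With only a handful of runs, the sample standard deviation (`ddof=1`) is noticeably larger. The sketch below shows the difference with the standard library; the scores are hypothetical and are not results from this notebook.

```python
import statistics

# Hypothetical micro-F1 scores from five evaluation runs (illustration only).
demo_scores = [0.94, 0.96, 0.96, 0.98, 0.96]

mean_score = statistics.mean(demo_scores)
population_std = statistics.pstdev(demo_scores)  # divides by n, like NumPy's default ddof=0
sample_std = statistics.stdev(demo_scores)       # divides by n - 1, like ddof=1

print(f"mean={mean_score:.3f}, population std={population_std:.4f}, sample std={sample_std:.4f}")
```

Passing `ddof=1` to `.std()` would report the sample standard deviation instead, which is the more conservative choice when the number of runs is small.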

## **Step 5: Observations, Insights and Business Perspective**




#### **5.1 Observations on the model**

- As we have learned throughout this course, prompt engineering is an iterative process, and this was certainly the case with the ExpressWay Logistics project.
- After an initial review and run of the notebook, the first scores did not look promising: both techniques produced the same score, with standard deviations of 0.0. Further investigation was required.
- I decided to apply a common data science practice step by step: the more data points we have, the more reliable the results should be. I started to methodically modify certain snippets.
- Moreover, a comment we have seen in more than one notebook throughout the course coincides with this principle: "In practice, large number of gold examples facilitate robust estimates of model accuracy."
- My first adjustment was to `test_size`. I increased it to 40%, which resulted in 53 data points (up from an initial 27) in the test set, that is, `cs_gold_examples_df`. The gold examples sample looks more robust now.
- As a general practice, to see the effect of this change I executed the notebook up to the final scores. The scores improved, but they were still identical for both prompting techniques. This meant that there was still more work to do.
- The same principle had to be applied to the other numeric components of the task at hand. I increased the number of examples per class from 4 (8 in total) to 16 (32 in total). Individual results were better, but the samples were still not enough to create a distinguishable difference between the two prompts.
- Then I extended the number of evaluation runs from 5 to 10, and this time the principle paid off.
- Numerically, the zero-shot prompt comes out ahead: it has a slightly higher mean F1 score (0.962) with negligible standard deviation, so its predictions are consistent, while the few-shot prompt has a slightly lower mean F1 score (0.960) and a non-zero standard deviation (0.00566), which suggests less predictable and marginally poorer average outcomes. Both scores are good, but the zero-shot prompt is the more stable of the two.
- This, coupled with its lower token cost, leads to the conclusion that the zero-shot prompt is the winner for this task.
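
The token cost argument can be made concrete with simple arithmetic: every evaluation call sends the full prompt plus one review, once per gold example and per run. The sketch below uses hypothetical token counts (the real values come from `num_tokens_from_messages`); `total_prompt_tokens` is an illustrative helper, not part of the notebook.

```python
def total_prompt_tokens(prompt_tokens, avg_review_tokens, n_gold, n_runs):
    """Estimate the input tokens sent across all evaluation calls."""
    tokens_per_call = prompt_tokens + avg_review_tokens
    return tokens_per_call * n_gold * n_runs


# Hypothetical token counts for illustration; 53 gold examples and 10 runs as in this notebook.
zero_shot_total = total_prompt_tokens(prompt_tokens=60, avg_review_tokens=40, n_gold=53, n_runs=10)
few_shot_total = total_prompt_tokens(prompt_tokens=1500, avg_review_tokens=40, n_gold=53, n_runs=10)

print(zero_shot_total)                   # 53000
print(few_shot_total)                    # 816200
print(few_shot_total / zero_shot_total)  # input-token ratio under these assumed counts
```

Under these assumed counts the few-shot prompt sends many times more input tokens for a comparable score, which is the intuition behind preferring the zero-shot prompt here.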


#### **5.2 Insights from sentiment analysis and its impact on the business**

**5.2.1 Customer dissatisfaction level**

In [None]:
print(f"Percentage of dissatisfied customers as per the reviews data is {63/131*100:.0f}%")

- It is clear from the sample of 131 customer reviews that there is a high level of customer dissatisfaction: nearly half of them (63 of 131) are negative.
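
Rather than hard-coding the counts, the same percentage can be derived from the labels themselves. The sketch below rebuilds a label list with the counts reported above (63 negative out of 131) and computes the share; `negative_share` is a hypothetical helper, and in the notebook the labels would come from the `sentiment` column.

```python
from collections import Counter


def negative_share(labels):
    """Return the percentage of labels that are 'Negative'."""
    counts = Counter(labels)
    return 100 * counts["Negative"] / len(labels)


# Label list rebuilt from the counts reported above: 63 negative, 68 positive.
demo_labels = ["Negative"] * 63 + ["Positive"] * 68

print(f"{negative_share(demo_labels):.0f}%")  # 48%
```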

**5.2.2 Top 5 customer complaints**

Let's find out the top 5 complaints from the courier_service_review dataset

In [None]:
# Let's create a list of negative reviews

negative_review_texts = cs_reviews_df[cs_reviews_df['sentiment'] == 'Negative']['review'].tolist()

In [None]:
len(negative_review_texts)

In [None]:
# Checking a sample of the output

negative_review_texts[0]

In [None]:
complaint_extraction_system_message = """
You are an AI assistant specialized in analyzing customer feedback. Your task is to
identify and list the most common complaints found in the provided courier service
reviews. Summarize them concisely.
"""

# Concatenating the negative reviews into a single string
combined_negative_reviews = "\n".join(negative_review_texts[:63])

complaint_extraction_user_message = f"""Here are several negative courier service reviews:

{combined_negative_reviews}

Based on these reviews, what are the 5 most common complaints? List them concisely."""

messages_for_complaints = [
    {"role": "system", "content": complaint_extraction_system_message},
    {"role": "user", "content": complaint_extraction_user_message}
]

# Let's call the API

try:
    response = client.chat.completions.create(
        model=chat_model_id,
        messages=messages_for_complaints,
        temperature=0.5,  # Use a slightly higher temperature for more diverse summaries
        max_tokens=200  # Adjust as needed for the length of expected complaints
    )
    complaints = response.choices[0].message.content
    print(complaints)

except Exception as e:
    print(f"Error extracting complaints: {e}")



Customer complaints are very serious and could significantly undermine the business if no corrective action is taken.
The recommendations below are based on the top 5 customer complaints obtained from the provided data.

#### **5.3 Business recommendations**

The following recommended actions are based on the business insights obtained from the data and reflect the 5 main areas in which customers have issues.

#### 5.3.1 **Delivery delays**

5.3.1.1-**Enhance Predictive Analytics for ETAs**  
- Implement or refine AI-driven ETA (Estimated Time of Arrival) models that incorporate real-time traffic data, weather conditions, historical delivery performance, driver availability, and typical handling times. This allows for more realistic and accurate delivery promises from the outset.  
- Build realistic time buffers into delivery promises, especially for complex or less predictable routes, rather than over-optimizing for speed and risking missed deadlines.  

5.3.1.2-**Optimize Routing and Fleet Management**  
- Invest in dynamic route optimization software that can re-calculate and adjust routes in real-time based on unexpected events (e.g., traffic jams, sudden road closures, new urgent pickups/deliveries).  
- Utilize advanced fleet management systems to monitor driver location, vehicle status, and performance, enabling dispatchers to intervene proactively if delays are anticipated.  

5.3.1.3-**Proactive and Transparent Communication**  
- Establish a system for automated, real-time notifications (SMS, email, app push) to customers regarding shipment status updates, including immediate alerts for any unforeseen delays, along with revised ETAs and clear reasons for the delay.  
- Provide customers with robust, self-service tracking portals that offer granular updates and, crucially, display the most current predicted delivery window.  
- Follow up with customers to assess improvements and hear suggestions.  

5.3.1.4-**Improve Last-Mile Efficiency**  
- Explore strategies like micro-hubs or urban distribution centers to reduce travel time and congestion in dense delivery areas.  
- Ensure drivers are equipped with efficient navigation tools and communication devices to report issues and receive real-time support.  

5.3.1.5-**Identify and Address Root Causes**  
- Conduct thorough data analysis on all late deliveries to identify recurring patterns (e.g., specific routes, times of day, types of goods, loading/unloading inefficiencies at certain points, specific internal processes) and implement targeted corrective actions.  
- Gather post-delivery feedback specifically on timeliness to uncover customer-centric reasons for dissatisfaction.  
- Meet with company drivers to get feedback on the issues they face, and establish goals and incentives for the percentage of successful deliveries.  




#### 5.3.2 **Poor Customer Service**

5.3.2.1-**Elevate Staff Training & Development**  

- Intensive Soft Skills Training: Implement mandatory and ongoing training modules focused on active listening, empathy, de-escalation techniques, professional communication, and maintaining a positive demeanor even with difficult customers. Role-playing scenarios can be highly effective.  

- Deep Product & Process Knowledge: Ensure all customer service representatives (CSRs) possess a thorough understanding of all services, policies, and common operational procedures, enabling them to provide accurate and comprehensive assistance.  

- Empowerment & Problem-Solving: Train CSRs to identify root causes of issues quickly and empower them with the necessary authority and tools to resolve common problems during the first contact, minimizing transfers and follow-ups.

5.3.2.2-**Optimize Responsiveness and Accessibility**  

- Multi-Channel Strategy: Ensure seamless and consistent service across all customer contact points, including phone, email, live chat, and social media, with clear Service Level Agreements (SLAs) for response times.

- AI-Powered Automation for Tier 0: Deploy intelligent chatbots or IVR (Interactive Voice Response) systems to handle routine inquiries (e.g., tracking updates, FAQs) instantly, freeing human agents to focus on more complex, high-value issues.   

- Staffing & Scheduling Optimization: Use historical data and forecasting to ensure adequate staffing levels during peak hours, minimizing wait times across all channels.   

5.3.2.3-**Enhance Resolution Capabilities**  

- Centralized Knowledge Base: Provide CSRs with an easily accessible, comprehensive, and regularly updated knowledge base containing solutions to common and complex issues.  

- Integrated CRM System: Implement or upgrade your CRM to give agents a 360-degree view of customer history, previous interactions, and shipment details, eliminating the need for customers to repeat information.  

- Clear Escalation Paths: Establish well-defined and efficient protocols for escalating complex or unresolved issues, ensuring smooth transitions and timely resolution by specialized teams.

5.3.2.4-**Implement Robust Quality Assurance & Feedback Loops**  

- Regular Interaction Monitoring: Systematically monitor calls, chat transcripts, and email communications for adherence to quality standards, professionalism, and effectiveness. Provide constructive feedback and coaching to agents.  

- Customer Feedback Mechanisms: Deploy post-interaction surveys (e.g., CSAT, NPS) and actively solicit feedback to identify specific pain points, measure satisfaction with resolution, and pinpoint areas for agent-specific or process-wide improvement.

#### 5.3.3 **Damaged Packages**

This issue involves two stages: shipper practices and internal handling protocols.

5.3.3.1 **Enhance Shipper Education & Support on Packaging**  

- Comprehensive Digital Guides: Develop user-friendly online guides, video tutorials, and interactive tools demonstrating best practices for packaging various item types (fragile, heavy, liquids) with specific material recommendations.  

- Offer Premium Packaging Services/Materials: Provide an option for customers to purchase high-quality, pre-approved packaging materials or even opt for professional packing services at an additional cost, reducing reliance on inadequate shipper-provided packaging.  

- Pre-shipment Advisory: For high-value or unusual shipments, offer a consultation service where customers can get expert advice on optimal packaging solutions.  

5.3.3.2 **Strengthen Internal Handling Protocols & Training**

- Mandatory Handling Training: Conduct rigorous, recurrent training for all personnel involved in the package journey (sorters, loaders, drivers) on proper lifting techniques, careful stacking, fragile item recognition, and minimizing impacts.

- Automated System Optimization: Regularly audit and maintain automated sorting and conveyance systems to ensure they handle packages gently, reducing damage from drops, crushes, or impacts within facilities.  

- Secure Loading Procedures: Implement strict protocols for loading vehicles, ensuring proper weight distribution, secure bracing of cargo, and safe stacking to prevent shifting and damage during transit.  

5.3.3.3 **Implement Quality Control & Pre-emptive Checks**

- Initial Packaging Inspection: Empower and train pickup drivers or drop-off point staff to perform a quick visual assessment of external packaging. If clearly inadequate for the contents (e.g., box too weak, obvious liquid leaks), advise the customer and potentially refuse the shipment without proper re-packaging.  

- Spot Checks at Sortation Centers: Introduce random or targeted quality control checks at key transfer points to identify poorly packaged items that may require re-packaging by your staff (potentially with associated fees) before further transit.  

- Damage Reporting Tools: Equip drivers with tools to quickly document any package damage discovered before delivery, including photo evidence and immediate reporting back to the hub.  

5.3.3.4 **Data-Driven Damage Prevention**

- Analyze Claim Data: Systematically collect and analyze data from all damage claims to identify patterns. Look for common transit lanes, specific types of items, particular sorting centers, or even specific vehicle types associated with higher damage rates.  

- Iterative Process Improvement: Use these insights to implement targeted operational adjustments, new equipment, or revised handling instructions to address the root causes of damage.
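
The claim analysis described above could start with something as simple as the following sketch. It assumes shipments and claims are available as lists of records sharing a grouping field (here a hypothetical `hub` key), computes claims per 1,000 shipments for each group, and flags groups whose rate exceeds a chosen multiple of the median. The field names, the threshold, and the sample figures are illustrative assumptions.

```python
from collections import Counter
from statistics import median


def damage_rates(shipments, claims, key):
    """Damage claims per 1,000 shipments, grouped by `key` (e.g. hub, lane, item type)."""
    shipped = Counter(s[key] for s in shipments)
    damaged = Counter(c[key] for c in claims)
    # Counter returns 0 for groups that had shipments but no claims.
    return {group: 1000.0 * damaged[group] / count for group, count in shipped.items()}


def flag_outliers(rates, multiple=2.0):
    """Groups whose damage rate is more than `multiple` times the median rate."""
    if not rates:
        return []
    threshold = multiple * median(rates.values())
    return sorted(group for group, rate in rates.items() if rate > threshold)


if __name__ == "__main__":
    # Made-up sample data for illustration only.
    shipments = [{"hub": "A"}] * 200 + [{"hub": "B"}] * 100 + [{"hub": "C"}] * 100
    claims = [{"hub": "A"}] * 2 + [{"hub": "B"}] * 6 + [{"hub": "C"}] * 1
    rates = damage_rates(shipments, claims, key="hub")
    print(rates, flag_outliers(rates))
```

Running the same two functions with a different `key` (transit lane, item category, vehicle type) gives the other breakdowns listed above without changing the code.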


#### 5.3.4 **Inaccurate Tracking System**

5.3.4.1 **Enhance Data Capture and Integration at Every Touchpoint**

- Advanced Scanning Technology: Upgrade to state-of-the-art scanning equipment at all processing points (pickup, sorting, loading, transit, delivery) to ensure precise and instantaneous capture of package status updates.

- Driver Mobile App Optimization: Equip drivers with intuitive, reliable mobile applications that facilitate accurate and real-time scanning and status updates (e.g., "Out for Delivery," "Delivery Attempted," "Delivered"), including GPS timestamps for location verification.

- Seamless System Integration: Ensure all operational systems (warehouse management, fleet management, dispatch, customer service) are fully integrated to allow for immediate data synchronization and prevent information silos.

5.3.4.2 **Ensure Real-time Data Transmission and System Reliability**

- Instant Data Sync: Implement a system architecture that pushes data from scan points directly to the central tracking database in real-time, minimizing any lag between an event occurring and its reflection in the tracking system.

- Robust Network Infrastructure: Invest in reliable internet connectivity and network systems across all facilities and mobile units to guarantee consistent data flow without interruptions.

- Cloud-Based Scalability: Utilize a cloud-native tracking platform that can scale to handle high volumes of concurrent updates and ensure uptime, providing customers with reliable access 24/7.

5.3.4.3 **Improve Clarity and User Experience of Tracking Information**

- Simplified Status Definitions: Translate internal operational codes into clear, plain-language status updates for customers (e.g., "In Transit," "Ready for Pickup," "Delivery Attempted – Customer Not Home").

- Dynamic Estimated Delivery Windows: Display evolving estimated delivery windows or times on the tracking portal that adjust based on real-time factors like traffic or delays, rather than just static dates.

- Proactive Explanations for Exceptions: For any unusual statuses or significant delays, provide concise, automated explanations directly within the tracking interface (e.g., "Delay due to inclement weather," "Customs clearance in progress").

5.3.4.4 **Implement Rigorous Monitoring and Auditing**

- Automated Discrepancy Alerts: Set up internal monitoring tools that automatically flag inconsistencies, missed scans, or packages remaining in the same status for an unusual duration, prompting immediate investigation by operations teams.

- Regular Data Audits: Conduct routine audits of tracking data to identify common points of error or delay in information flow, allowing for targeted process improvements or technology upgrades.

- Customer Feedback on Tracking: Actively solicit feedback on the clarity and accuracy of the tracking system through surveys to pinpoint areas needing improvement from the user's perspective.
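
One way the automated discrepancy alerts could work is sketched below: each package record carries its current status and the time of its last scan, and any undelivered package that has stayed in one status longer than an allowed dwell time is flagged. The record fields (`tracking_id`, `status`, `last_scan`) and the dwell limits are hypothetical values chosen for illustration, not actual ExpressWay thresholds.

```python
from datetime import datetime, timedelta

# Assumed maximum time a package may remain in each status before it is flagged.
MAX_DWELL = {
    "At Sorting Center": timedelta(hours=12),
    "In Transit": timedelta(hours=48),
    "Out for Delivery": timedelta(hours=10),
}


def find_stalled(packages, now, default_limit=timedelta(hours=24)):
    """Return tracking IDs of undelivered packages whose last scan is older than the limit."""
    stalled = []
    for pkg in packages:
        if pkg["status"] == "Delivered":
            continue  # delivered packages are final and never stall
        limit = MAX_DWELL.get(pkg["status"], default_limit)
        if now - pkg["last_scan"] > limit:
            stalled.append(pkg["tracking_id"])
    return stalled


if __name__ == "__main__":
    now = datetime(2024, 1, 10, 12, 0)
    packages = [
        {"tracking_id": "PKG1", "status": "Out for Delivery", "last_scan": datetime(2024, 1, 10, 8, 0)},
        {"tracking_id": "PKG2", "status": "In Transit", "last_scan": datetime(2024, 1, 7, 12, 0)},
        {"tracking_id": "PKG3", "status": "Delivered", "last_scan": datetime(2024, 1, 1, 9, 0)},
    ]
    print(find_stalled(packages, now))
```

In practice such a check would run on a schedule against the central tracking database, and the flagged IDs would be routed to the operations team for investigation.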

#### 5.3.5 **Hidden Fees and Misleading Pricing**

5.3.5.1 **Implement a Comprehensive, All-Inclusive Quoting System**

- Real-time, Final Pricing: Develop an online quoting tool that provides an upfront, all-inclusive estimated cost at the point of inquiry, encompassing all standard surcharges (e.g., fuel, residential delivery, remote area fees, weekend delivery) based on the input details (weight, dimensions, origin, destination, service level).

- Transparent Breakdown: Within the quote, clearly itemize and explain every component of the price (base rate, all surcharges, taxes, and any value-added service fees), even if only the total is initially highlighted.

- Dynamic Calculation: Ensure the system accurately calculates potential fees for specific services or destinations, adjusting in real-time as customers input details.
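
The sketch below illustrates what an all-inclusive, itemized quote might look like in code. Every rate in it (base fee, per-kilogram charge, fuel percentage, flat surcharges, tax rate) is a made-up placeholder, and for brevity the calculation uses only weight and three delivery options, whereas a real quoting tool would also account for dimensions, origin, destination, and service level.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def build_quote(weight_kg, residential=False, remote_area=False, weekend=False):
    """Return an itemized quote; all rates below are illustrative assumptions."""
    base = Decimal("5.00") + Decimal("1.20") * Decimal(str(weight_kg))
    lines = [
        ("Base rate", base),
        ("Fuel surcharge (8% of base)", base * Decimal("0.08")),
    ]
    if residential:
        lines.append(("Residential delivery", Decimal("3.00")))
    if remote_area:
        lines.append(("Remote area fee", Decimal("7.50")))
    if weekend:
        lines.append(("Weekend delivery", Decimal("4.00")))
    # Round each line to whole cents so the displayed lines add up to the total.
    lines = [(label, amount.quantize(CENT, rounding=ROUND_HALF_UP)) for label, amount in lines]
    subtotal = sum((amount for _, amount in lines), Decimal("0"))
    tax = (subtotal * Decimal("0.10")).quantize(CENT, rounding=ROUND_HALF_UP)
    lines.append(("Tax (10%)", tax))
    return {"lines": lines, "total": subtotal + tax}


if __name__ == "__main__":
    quote = build_quote(10, residential=True, weekend=True)
    for label, amount in quote["lines"]:
        print(f"{label:<30}{amount:>8}")
    print(f"{'Total':<30}{quote['total']:>8}")
```

Using `Decimal` rather than floating-point numbers keeps the itemized lines and the total consistent to the cent, which matters when the invoice is later compared against the quote.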

5.3.5.2 **Proactive and Clear Disclosure of Potential Additional Charges**

- Prominent Warnings: Clearly list and explain common scenarios that might trigger post-quote additional charges (e.g., re-delivery attempts, address corrections due to customer error, customs duties/taxes for international shipments, special handling for undeclared oversized/fragile items). Display these warnings at relevant stages (e.g., during booking, in confirmation emails).

- Pre-shipment Verification: If possible, implement a system to verify key shipment characteristics (e.g., weight, dimensions) early in the process. If discrepancies are found that affect pricing, inform the customer before the final invoice is issued, offering a chance to adjust or cancel.

- Accessible Terms: Ensure your pricing terms and conditions, including all potential fees, are easily findable on your website and written in clear, unambiguous language.

5.3.5.3 **Standardize and Justify All Surcharges**

- Clear Surcharge Explanations: Provide concise, simple explanations for why each surcharge exists (e.g., "Fuel Surcharge: Reflects the variable cost of fuel for transportation," "Peak Season Surcharge: Applied during periods of unusually high demand to manage network capacity").  

- Regular Review and Rationalization: Periodically review all existing surcharges to ensure they remain relevant, fair, and competitive. Eliminate any obsolete or confusing fees.  

5.3.5.4 **Improve Invoicing Clarity and Dispute Resolution**

- Detailed, Itemized Invoices: Issue invoices that precisely match the quoted breakdown and clearly show all charges applied, linking them directly to the services provided.  

- Automated Discrepancy Flagging: Implement internal systems that automatically flag significant differences between the initial quote and the final charge, prompting internal review and proactive communication with the customer before billing.  

- Streamlined Dispute Process: Establish a clear, accessible, and prompt process for customers to dispute unexpected charges, ensuring rapid investigation and transparent resolution.
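
As a rough illustration of automated discrepancy flagging, the following sketch compares quoted and invoiced charges line by line and returns every charge whose amount changed by more than a tolerance, including charges that appear on only one of the two documents. The `(label, amount)` line format, the tolerance, and the sample charges are assumptions made for this example.

```python
from decimal import Decimal


def review_invoice(quote_lines, invoice_lines, tolerance=Decimal("0.50")):
    """Return {label: invoiced minus quoted} for charges that differ by more than `tolerance`."""
    quoted = dict(quote_lines)
    invoiced = dict(invoice_lines)
    issues = {}
    for label in sorted(set(quoted) | set(invoiced)):
        # A charge missing from one side is treated as an amount of zero there.
        delta = invoiced.get(label, Decimal("0")) - quoted.get(label, Decimal("0"))
        if abs(delta) > tolerance:
            issues[label] = delta
    return issues


if __name__ == "__main__":
    quote_lines = [("Base rate", Decimal("17.00")), ("Fuel surcharge", Decimal("1.36"))]
    invoice_lines = [
        ("Base rate", Decimal("17.00")),
        ("Fuel surcharge", Decimal("1.40")),
        ("Address correction", Decimal("7.50")),
    ]
    print(review_invoice(quote_lines, invoice_lines))
```

Any non-empty result would hold the invoice for internal review and trigger a message to the customer explaining the extra charge before billing.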

### **Final Note**

The suggestions described above should be discussed as part of ExpressWay Logistics' short-, medium-, and long-term improvement plan and performance goal setting. Implementation of these recommendations is subject to the company's priorities and the availability of required resources, such as financial, managerial, administrative, and human resources.