# AMASUM APP

In this notebook we set up and run the app that performs inference with the fine-tuned Llama 2 7B model (the smallest Llama 2 variant) and shows the **AMASUM** project in action.

To run this model, make sure you have at least one ***NVIDIA A100 GPU***: loading the model for the app requires at least ***36GB of GPU memory***.

Also make sure you have a **Hugging Face** account with an access token set up, so that you can sign in to load the data and use the models.
https://huggingface.co/

You will also need to request access to Llama 2 to use the model.
https://ai.meta.com/resources/models-and-libraries/llama-downloads/



---

First we'll need to install all of our dependencies and import the necessary libraries.

We will then set a parameter that will let our model know to use the GPU (DEVICE) when it is available.

---

In [1]:
#install dependencies
!pip install -Uqqq pip
!pip install -qqq torch
!pip install -qqq transformers
!pip install -qqq datasets
!pip install -qqq peft
!pip install -qqq bitsandbytes
!pip install -qqq trl
!pip install -qqq gradio
!pip install -qqq plotly


#import libraries
import gradio as gr
import plotly.graph_objects as go
import pandas as pd
import numpy as np
import torch

from pprint import pprint

from datasets import Dataset, load_dataset
from huggingface_hub import notebook_login
from peft import LoraConfig, PeftModel
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    TrainingArguments,
)
from trl import SFTTrainer

# select device
DEVICE = "cuda:0" if torch.cuda.is_available() else "cpu"
# select base model
MODEL_NAME = "meta-llama/Llama-2-7b-hf"




### Parameter Loading and Model Set Up

---

**Model Quantization Configuration Function**

The `create_model_and_tokenizer` function is used to set up a language model and tokenizer for natural language processing.

**1. Model Quantization Configuration:**
   - Model quantization is a technique to make the model more memory and computationally efficient.
   - Inside this function, a configuration called `bnb_config` is created with specific settings:
     - `load_in_4bit=True`: This tells the model to load in 4-bit quantized weights, which use less memory.
     - `bnb_4bit_quant_type="nf4"`: It specifies the quantization type for 4-bit weights.
     - `bnb_4bit_compute_dtype=torch.float16`: This sets the data type used for computations to 16-bit floating-point numbers, which are faster than full precision (32-bit) numbers.

**2. Loading the Language Model:**
   - The function loads a pre-trained language model (specified by `MODEL_NAME`) for causal language modeling.
   - `use_safetensors=True`: This loads the weights from files in the `safetensors` format rather than from pickled PyTorch checkpoints.
   - `quantization_config=bnb_config`: It applies the quantization configuration defined earlier to the model.
   - `trust_remote_code=True`: This allows custom model code hosted in the model's Hub repository to be executed locally, so it should only be enabled for repositories you trust.
   - `device_map=DEVICE`: Places the model on the GPU (`"cuda:0"`) if one is available, or on the CPU (`"cpu"`) if not.

**3. Tokenizer Configuration:**
   - A tokenizer is created to process text data for the language model.
   - `tokenizer.pad_token = tokenizer.eos_token` and `tokenizer.padding_side = "right"` are used to configure padding behavior.

**4. Return Values:**
   - The function returns both the configured language model (`model`) and the tokenizer, making them ready for use in natural language processing tasks.

In summary, this function prepares a language model and tokenizer, optimizing them for efficient computation and memory usage while ensuring compatibility with the available hardware.
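
To make the memory argument concrete, here is a back-of-the-envelope sketch of how much space the weights alone take at different precisions. It assumes a round 7 billion parameters and ignores quantization metadata, activations, and the generation cache, so the numbers are rough lower bounds rather than measurements. The helper name `weight_memory_gb` is made up for this illustration.

```python
# Rough weight-memory estimate for a 7B-parameter model at different precisions.
# Assumption: exactly 7e9 parameters; quantization metadata, activations and
# the generation cache are NOT counted.
N_PARAMS = 7_000_000_000


def weight_memory_gb(n_params: int, bits_per_param: int) -> float:
    """Return approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9


for label, bits in [("float32", 32), ("float16", 16), ("4-bit (nf4)", 4)]:
    print(f"{label:>12}: ~{weight_memory_gb(N_PARAMS, bits):.1f} GB")
```

Under these assumptions, 4-bit weights need roughly an eighth of the space of 32-bit weights, which is why the quantized base model is much cheaper to hold in GPU memory.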

---

In [2]:
# model quantization configuration function
def create_model_and_tokenizer():
    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.float16,
    )

    model = AutoModelForCausalLM.from_pretrained(
        MODEL_NAME,
        use_safetensors=True,
        quantization_config=bnb_config,
        trust_remote_code=True,
        device_map=DEVICE,
    )

    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    tokenizer.pad_token = tokenizer.eos_token
    tokenizer.padding_side = "right"

    return model, tokenizer

---

Next we will log in to Hugging Face with our access token and load the models from the Hugging Face Hub.

---

In [3]:
notebook_login()


---

**Load Base Model and Tokenizer**

In this section of code, the base language model and tokenizer are loaded using the `create_model_and_tokenizer` function. Let's break down what's happening:

**1. Load Model and Tokenizer:**
   - The `create_model_and_tokenizer` function is called to set up the language model and tokenizer.
   - `model` and `tokenizer` are assigned the values returned by this function.
   - `model.config.use_cache` is set to `False`, which disables the key/value cache for the base model.

**2. Access Token:**
   - An access token (`access_token`) is defined, but it is not passed to any call in this cell; authentication is handled by `notebook_login()` above.

**Load Fine-Tuned Model**

Now, the fine-tuned model is loaded and configured:

**3. Load Fine-Tuned Model:**
   - `model_name` is set to `"wefussell/amasum-pos-model"`, the Hub repository that holds the fine-tuned model.
   - `trained_model` is assigned the fine-tuned model loaded using `AutoModelForCausalLM.from_pretrained(model_name)`.

**4. Set Model to GPU:**
   - The `trained_model` is moved to the GPU (`DEVICE`) if a GPU is available. This ensures that computations are performed on the GPU for faster processing.

**Load Datasets**

Next, datasets are loaded from the Hugging Face library:

**5. Load Datasets:**
   - Two datasets are loaded using `load_dataset`: `dataset` holds the review summary test set (`wefussell/amasum-final-test`) and `dataset2` holds the temporal ratings data (`wefussell/amasum-temporal-df`).

**Convert to Pandas DataFrames**

Finally, the loaded datasets are converted into Pandas DataFrames:

**6. Convert to Pandas DataFrames:**
   - `sum_df` and `temp_df` are created as Pandas DataFrames to store the contents of `dataset["train"]` and `dataset2["train"]`, respectively.

---


In [4]:
# load base model and tokenizer
model, tokenizer = create_model_and_tokenizer()
model.config.use_cache = False

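# never hard-code a real token in a shared notebook; read it from the environment instead
# (access_token is not used below, since notebook_login() already handles authentication)
import os
access_token = os.environ.get("HF_TOKEN")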
#load fine-tuned model
model_name = "wefussell/amasum-pos-model"
trained_model = AutoModelForCausalLM.from_pretrained(model_name )

#set model to GPU
trained_model = trained_model.to(DEVICE)

#load datasets from Hugging Face
dataset = load_dataset("wefussell/amasum-final-test")
dataset2 = load_dataset("wefussell/amasum-temporal-df")


#convert to pandas dataframe
sum_df = pd.DataFrame(dataset["train"])
temp_df = pd.DataFrame(dataset2["train"])


---

Next is a small pre-processing step that we missed earlier and that the app needs in order to work correctly.

We need to make sure that the ***Temporal Dataset*** only contains product ids that also appear in the ***Summary Dataset***, which is what the cell below does.

---

In [5]:
# make sure tempdf and sum df have same product ids
filtered_temp_df = temp_df[temp_df['asin'].isin(sum_df['asin'])]


In [6]:
filtered_temp_df['asin'].nunique()

22

In [7]:
sum_df['asin'].nunique()

118

---

The counts are not the same, but the dropdown menu in our app takes its options from the smaller dataset, which guarantees that each component of the app has the data it needs to work.
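
Because the filter is applied on `asin` while the app later looks products up by `title`, a quick sanity check can confirm that every dropdown option resolves to a summary row. The sketch below uses plain Python lists with made-up ids and titles as stand-ins for `sum_df` and `temp_df`. It illustrates the idea and is not the notebook's actual data.

```python
# Toy stand-ins for rows of sum_df and temp_df (ids and titles are made up).
summary_rows = [
    {"asin": "A1", "title": "Widget"},
    {"asin": "A2", "title": "Gadget"},
    {"asin": "A3", "title": "Gizmo"},
]
temporal_rows = [
    {"asin": "A1", "title": "Widget"},
    {"asin": "A3", "title": "Gizmo"},
    {"asin": "A9", "title": "Orphan"},  # no summary data for this product
]

summary_ids = {row["asin"] for row in summary_rows}

# Same idea as temp_df[temp_df['asin'].isin(sum_df['asin'])]
filtered_temporal = [row for row in temporal_rows if row["asin"] in summary_ids]

# Titles that would be offered in the dropdown, and titles that have summaries.
dropdown_titles = sorted({row["title"] for row in filtered_temporal})
summary_titles = {row["title"] for row in summary_rows}

# Every dropdown option should match at least one summary row.
missing = [title for title in dropdown_titles if title not in summary_titles]
print(dropdown_titles)
print(missing)
```

An empty `missing` list means every selectable product has both temporal data and review data to summarize.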

---

#### Prompt Creation for App

---

Next we will create a prompt that is slightly different from the one we used for training. It is basically the same, except that it now contains only the **Instruction** and the **Review Text**.

---

In [9]:
def create_prompt_column_2(df):
    # Define the system prompt
    SYSTEM_PROMPT = ("Below is a group of reviews for a product. Summarize the following reviews "
                     "in no more than 120 words. Provide three bullet points highlighting the most discussed "
                     "topics and their associated issues. Keep the summary concise and focused on the key points.")

    # Using DataFrame.apply for efficient row-wise operation
    def create_prompt(row):
        return (f"### Instruction:\n\n{SYSTEM_PROMPT}\n\n"
                f"### Input:\n\n{row['reviewText']}\n\n")


    # Apply the function to each row to create the 'prompt' column
    df['prompt'] = df.apply(create_prompt, axis=1)

    return df

In [10]:
create_prompt_column_2(sum_df)

| Unnamed: 0 | asin | title | reviewText | sentiment | review_length | prompt |
|---|---|---|---|---|---|---|
| 0 | B00029K20C | Superior 18-1201 Spring-Lox,2-Way Adjustable S... | would not do what i wanted \n Did not fit my t... | negative | 214 | ### Instruction:\n\nBelow is a group of review... |
| 1 | B00029K20C | Superior 18-1201 Spring-Lox,2-Way Adjustable S... | Gr8 Product Will recommend to any buyer, had t... | positive | 298 | ### Instruction:\n\nBelow is a group of review... |
| 2 | B0002BEVMK | Metra TurboWires 71-2003-1 Wiring Harness | Wrong connectors. These are male and should be... | negative | 412 | ### Instruction:\n\nBelow is a group of review... |
| 3 | B0002BEVMK | Metra TurboWires 71-2003-1 Wiring Harness | nice \n Good quality and arrived quickly. Made... | positive | 129 | ### Instruction:\n\nBelow is a group of review... |
| 4 | B00042L0IA | Hi-Lift Jack JP-350 Jack Protector | God what a waste of money, the cover is so thi... | negative | 557 | ### Instruction:\n\nBelow is a group of review... |
| ... | ... | ... | ... | ... | ... | ... |
| 231 | B01D64RNJM | iTimo 2Pcs White Universal Led License Plate L... | Great concept poor execution. Lights are very ... | negative | 451 | ### Instruction:\n\nBelow is a group of review... |
| 232 | B01DVPN8ZE | Onshowy Car Vacuum Cleaner, 12 Volt 45 W Porta... | Amazing,the mighty mite of compact auto vacs. ... | positive | 401 | ### Instruction:\n\nBelow is a group of review... |
| 233 | B01DVPN8ZE | Onshowy Car Vacuum Cleaner, 12 Volt 45 W Porta... | Broke down after a month. I contacted the sell... | negative | 356 | ### Instruction:\n\nBelow is a group of review... |
| 234 | B01EB6DC6C | Best Bicycle Smartphone Handlebar Mount - Crad... | Broke going down the rd about 40 to 50 mph and... | negative | 271 | ### Instruction:\n\nBelow is a group of review... |


---

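Now we are good to go. We declare the model as global so that the Gradio callbacks can reach it. At the top level of a notebook, `trained_model` is already a module-level name, so the statement below has no practical effect and is kept only for explicitness.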

---

In [11]:
global trained_model


---

Below is the app, built with Gradio. The user selects a product from a dropdown menu and can then generate positive and negative summaries for it, which gives a complete and nuanced picture of what people really think about the product: the good and the bad.

The app also generates a temporal plot that tracks sentiment toward the product over time, measured as the 30-day rolling mean of its ratings.

Together, these functions give users a complete picture of the product.
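
To show what the 30-day rolling mean computes, here is a dependency-free sketch with made-up dates and ratings. It assumes distinct timestamps and a window that covers the 30 days up to and including each review date, which is how a time-based pandas rolling window behaves by default. The function name `rolling_mean_30d` is hypothetical and is not part of the app.

```python
from datetime import datetime, timedelta


def rolling_mean_30d(dates, ratings, window_days=30):
    """For each date t, average the ratings dated within (t - window, t]."""
    pairs = sorted(zip(dates, ratings))  # order reviews chronologically
    window = timedelta(days=window_days)
    means = []
    for t, _ in pairs:
        in_window = [r for d, r in pairs if t - window < d <= t]
        means.append(sum(in_window) / len(in_window))
    return [d for d, _ in pairs], means


# Made-up review dates and star ratings for one product.
dates = [datetime(2020, 1, 1), datetime(2020, 1, 15), datetime(2020, 2, 20)]
ratings = [5, 3, 1]

sorted_dates, means = rolling_mean_30d(dates, ratings)
for d, m in zip(sorted_dates, means):
    print(d.date(), m)
```

The second point averages both January reviews, while the February review stands alone because the earlier ones fall outside its 30-day window. This is why the plotted line can jump after quiet periods.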

---

In [None]:
# sentiment drift analysis function
def plot_rolling_mean_rating_trend_by_title(title):
    """
    Plot the trend of 30-day rolling mean ratings for a specific product title using Plotly.
    This function is designed for use in a Gradio app and uses a globally accessible DataFrame.

    Parameters:
    - title: The title of the product for which to plot the trend.

    Returns:
    - Plotly figure object.
    """

    # Filter the DataFrame for the specified title and create a copy
    title_df = filtered_temp_df[filtered_temp_df['title'] == title].copy()

    # Set 'date' as the index of the DataFrame
    title_df.set_index('date', inplace=True)

    # Sort the DataFrame chronologically by its 'date' index
    title_df.sort_index(inplace=True)

    # Calculate the 30-day rolling mean of 'overall' ratings
    rolling_mean = title_df['overall'].rolling('30D').mean()

    # Create an interactive line plot using Plotly
    fig = go.Figure()
    fig.add_trace(go.Scatter(x=rolling_mean.index, y=rolling_mean, mode='lines', name='Rolling Mean Rating'))
    fig.update_layout(title=f'30-Day Rolling Mean Rating Trend for "{title}"',
                      xaxis_title='Date',
                      yaxis_title='30-Day Rolling Mean Rating',
                      hovermode='x')

    return fig

def summarize(model, text: str):
    inputs = tokenizer(text, return_tensors="pt").to(DEVICE)
    inputs_length = len(inputs["input_ids"][0])
    with torch.inference_mode():
        outputs = model.generate(**inputs, max_new_tokens=256, temperature=0.0001)

    generated_text = tokenizer.decode(outputs[0][inputs_length:], skip_special_tokens=True)

    # Find the positions of the first and second "###"
    first_hash_index = generated_text.find("###")
    second_hash_index = generated_text.find("###", first_hash_index + 1)

    # Extract the text from the first "###" up to (but not including) the second "###"
    if first_hash_index != -1 and second_hash_index != -1:
        text_between_hashes = generated_text[first_hash_index:second_hash_index].strip()
    else:
        text_between_hashes = generated_text  # Fallback to the full text if "###" not found

    return text_between_hashes

def generate_summaries(title):
    comp_df = sum_df[sum_df['title'] == title]
    positive_summary = ""
    negative_summary = ""

    # Check and summarize for positive and negative reviews separately
    if not comp_df.empty:
        # Filter for positive reviews and summarize
        positive_reviews = comp_df[comp_df['sentiment'] == 'positive']['prompt'].tolist()
        if positive_reviews:
            positive_summary = summarize(trained_model, ' '.join(positive_reviews))

        # Filter for negative reviews and summarize
        negative_reviews = comp_df[comp_df['sentiment'] == 'negative']['prompt'].tolist()
        if negative_reviews:
            negative_summary = summarize(trained_model, ' '.join(negative_reviews))

    return positive_summary, negative_summary


# Extract unique titles for the dropdown
unique_titles = filtered_temp_df['title'].unique().tolist()


# Modified Gradio app setup with a centered title
with gr.Blocks(theme=gr.themes.Glass()) as demo:
    gr.Markdown("<h1 style='text-align: center;'>AMASUM</h1>")  # Centered title with HTML styling

    with gr.Row():
        title_input = gr.Dropdown(choices=unique_titles, label="Select Product Title")

    with gr.Row():
        with gr.Column(scale=1):
            btn_summaries = gr.Button("Generate Summaries")
            positive_output = gr.Textbox(label="Positive Summary")
            negative_output = gr.Textbox(label="Negative Summary")
            btn_summaries.click(generate_summaries, title_input, [positive_output, negative_output])

        # keep this column inside the same Row so that scale=1 / scale=2 take effect
        with gr.Column(scale=2):
            plot_output = gr.Plot()
            btn_plot = gr.Button("Generate Plot")
            btn_plot.click(plot_rolling_mean_rating_trend_by_title, inputs=title_input, outputs=plot_output)

# Launch the Gradio app
demo.launch(debug=True)

Setting queue=True in a Colab notebook requires sharing enabled. Setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).

Colab notebook detected. This cell will run indefinitely so that you can see errors and logs. To turn off, set debug=False in launch().
Running on public URL: https://e995d479dfc222367f.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from Terminal to deploy to Spaces (https://huggingface.co/spaces)
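
The post-processing step inside `summarize` can be tried on its own, without loading the model. The sketch below repeats the same slicing logic on a made-up generation. The `### Response:` heading in the sample is an assumption about what the model emits, and `extract_between_markers` is a hypothetical helper name.

```python
def extract_between_markers(generated_text: str, marker: str = "###") -> str:
    """Keep the text from the first marker up to (not including) the second."""
    first = generated_text.find(marker)
    second = generated_text.find(marker, first + 1)
    if first != -1 and second != -1:
        return generated_text[first:second].strip()
    return generated_text  # fall back to the full text if two markers are not found


# Made-up model output: a response followed by the start of a repeated prompt.
sample = (
    "### Response:\n\n- Sturdy build\n- Easy to install\n\n"
    "### Instruction:\n\nBelow is a group of reviews"
)

print(extract_between_markers(sample))
print(extract_between_markers("Plain text without markers"))
```

Note that the slice starts at the first marker, so the heading itself is kept in the result. Trimming it would require starting the slice after the marker instead.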


# Conclusion

Our app works! AMASUM breathes its first breath, and consumers can now use the platform to gain real insights into the products they may, or may not, want to buy.

Next steps include training with more data and more GPU power, tuning parameters for better loss results, and experimenting with parameters for faster inference.

This app is the culmination of 6 months of work for my capstone project at BrainStation, so I hope you enjoy the sweat and tears that went into it!