# **Comprehensive Guide for Applying Amazon Product Review Sentiment Analysis in Azure AI Studio**  

This guide details the **steps to apply sentiment analysis** on Amazon product reviews using **Microsoft Azure AI Studio** and the **LiYuan/amazon-review-sentiment-analysis** model.

---

## **Step 1: Define Your AI Task**
### **Task: Sentiment Analysis of Product Reviews**
- **Objective:** Automatically classify **Amazon product reviews into star ratings (1-5 stars)**.
- **Expected Outcomes:**
  - Analyze **customer sentiment** from Amazon reviews.
  - Provide **detailed insights on product performance** based on ratings.
  - Demonstrate **Azure AI capabilities for text classification**.
- **Real-World Application:**
  - Used by **e-commerce platforms** to understand customer feedback.
  - Helps businesses **track product performance based on customer sentiment**.
  - Enhances **customer experience management**.

---

## **Step 2: Explore the Model Catalog in Azure AI Studio**
1. **Log in to Azure AI Studio:**
   - Navigate to **[Azure AI Studio](https://ai.azure.com/studio)**.
   - Ensure you have an **Azure subscription** and **resource group**.

2. **Browse Pre-trained Models in the Model Catalog:**
   - Go to **Model Catalog** in Azure AI Studio.
   - Search for **Sentiment Analysis** models.
   - Identify models from providers like:
     - **Microsoft** (Azure Text Analytics Sentiment Analysis)
     - **OpenAI** (GPT-based models)
     - **Hugging Face** (BERT-based sentiment models)

3. **Select the Model:**
   - Choose **"LiYuan/amazon-review-sentiment-analysis"** from Hugging Face.
   - This model **predicts review sentiment as a star rating (1-5 stars)**.

---

## **Step 3: Manage Your Model in Azure AI Studio**
1. **Add the Selected Model to Your Project:**
   - Click **"Deploy Model"** in Azure AI Studio.
   - Assign a **resource group** and **compute instance**.

2. **Organize and Label the Model:**
   - Label the model as **Amazon Review Sentiment Analysis**.
   - Enable **version control** to track updates.

---

## **Step 4: Develop Your AI Solution**
### **4.1 Input Data Preparation**
1. **Download the Dataset:**
   - Get the dataset from [Kaggle](https://www.kaggle.com/datasets/promptcloud/amazon-product-reviews-dataset).
   - Upload the dataset to **Azure Storage (Blob Storage or Azure Data Lake Storage)**.

2. **Preprocess the Data:**
   - Use **Azure Machine Learning Notebooks** or **Azure Data Factory** to clean and transform the data.
   - Load data into **pandas** using Python:
     ```python
     import pandas as pd
     df = pd.read_csv("amazon_com-product_reviews__20200101_20200331_sample.csv")
     ```
   - Drop unnecessary columns, keeping the **review text** (and also the rating column if you plan to compare predictions against actual ratings in Step 5):
     ```python
     df = df[['review_body']]
     ```

3. **Split Data into Training and Testing Sets** (needed only if you plan to fine-tune; the pre-trained model can score reviews directly):
   ```python
   from sklearn.model_selection import train_test_split
   train, test = train_test_split(df, test_size=0.2, random_state=42)
   ```

4. **Store Processed Data in Azure AI Studio:**
   - Upload cleaned datasets to **Azure Machine Learning Datastore**.
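
The cleaning step is left open above. As one possible sketch, the helper below collapses whitespace, drops empty or missing reviews, and removes exact duplicates before inference. `clean_reviews` is a hypothetical helper (not part of pandas or the dataset), and the rules it applies are assumptions to adapt to your data.

```python
import re

def clean_reviews(texts):
    """Normalize whitespace, drop empty entries, and remove exact duplicates.

    Hypothetical helper: the cleaning rules are illustrative assumptions.
    """
    seen = set()
    cleaned = []
    for text in texts:
        if not isinstance(text, str):
            continue  # skip missing values such as None or NaN
        normalized = re.sub(r"\s+", " ", text).strip()
        if not normalized or normalized in seen:
            continue  # skip blanks and reviews already kept
        seen.add(normalized)
        cleaned.append(normalized)
    return cleaned

sample = ["  Great   product! ", None, "", "Great product!", "Stopped working\nafter a week."]
print(clean_reviews(sample))
```

With the DataFrame from the earlier step, this could be called as `clean_reviews(df["review_body"].tolist())`.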

---

### **4.2 Model Integration**
1. **Connect to the Pre-trained Model**
   - Call the model through the **Hugging Face Inference API**:
     ```python
     import requests
     import os
     from dotenv import load_dotenv

     # Load API key from .env
     load_dotenv()
     API_KEY = os.getenv("HF_API_KEY")

     API_URL = "https://api-inference.huggingface.co/models/LiYuan/amazon-review-sentiment-analysis"
     headers = {"Authorization": f"Bearer {API_KEY}"}

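     def query(payload):
         """Send a payload to the hosted model and return the parsed JSON response."""
         response = requests.post(API_URL, headers=headers, json=payload, timeout=30)
         response.raise_for_status()  # surface HTTP errors instead of returning an error body
         return response.json()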
     ```

2. **Test Sentiment Analysis on Sample Reviews**
   ```python
   review_text = "This product is fantastic! Exceeded my expectations."
   predictions = query({"inputs": review_text})
   print(predictions)
   ```
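
To score more than one review, the same call can be made in batches. The sketch below assumes the endpoint accepts a list under `"inputs"` and returns one list of label/score dictionaries per input. `classify_in_batches` and `fake_query` are hypothetical names, and `fake_query` stands in for the real `query()` so the example runs without network access.

```python
def classify_in_batches(reviews, query_fn, batch_size=8):
    """Send reviews to the model in fixed-size batches and collect the results.

    Assumes query_fn accepts {"inputs": [text, ...]} and returns one list of
    {"label", "score"} dicts per input text.
    """
    results = []
    for start in range(0, len(reviews), batch_size):
        batch = reviews[start:start + batch_size]
        results.extend(query_fn({"inputs": batch}))
    return results

# Stand-in for the real query() so the sketch runs offline.
def fake_query(payload):
    return [[{"label": "5 stars", "score": 1.0}] for _ in payload["inputs"]]

batch_predictions = classify_in_batches(["Great!", "Awful.", "Fine."], fake_query, batch_size=2)
print(len(batch_predictions))
```

Passing `query` instead of `fake_query` would send the batches to the hosted model.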

---

### **4.3 Convert Star Ratings for Analysis**
The model returns a confidence score for each of the five star labels, so take the **label with the highest score** as the predicted rating.

```python
def get_star_rating(predictions):
    # Extract highest confidence label
    top_prediction = max(predictions[0], key=lambda x: x['score'])
    return top_prediction['label']  # Returns '5 stars', '4 stars', etc.

review_text = "This product is amazing! I love it."
predictions = query({"inputs": review_text})
star_rating = get_star_rating(predictions)

print(f"Predicted Star Rating: {star_rating}")
```

✅ **Example Output:**
```
Predicted Star Rating: 5 stars
```
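
For numeric analysis it can help to turn the label into an integer and, optionally, to compute a score-weighted average rating. The sketch below is one way to do that. `label_to_stars` and `expected_stars` are hypothetical helpers, and the scores are illustrative values in the response format the model returns.

```python
def label_to_stars(label):
    """Convert a label such as '5 stars' or '1 star' to the integer 5 or 1."""
    return int(label.split()[0])

def expected_stars(scores):
    """Score-weighted average star rating for one review."""
    return sum(label_to_stars(s["label"]) * s["score"] for s in scores)

# Illustrative scores for a single review (they sum to 1).
scores = [
    {"label": "5 stars", "score": 0.9719},
    {"label": "4 stars", "score": 0.0265},
    {"label": "3 stars", "score": 0.0008},
    {"label": "1 star", "score": 0.0005},
    {"label": "2 stars", "score": 0.0003},
]
print(label_to_stars("1 star"), round(expected_stars(scores), 3))
```

The weighted average keeps information that the top label alone discards, for example when a review sits between 3 and 4 stars.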

---

## **Step 5: Evaluate Your Solution**
- **Analyze Accuracy & Limitations:**
  - Compare the predicted star ratings with the ratings reviewers actually gave.
  - If accuracy is low, consider **fine-tuning the model**.
  - Identify commonly misclassified reviews.

- **Challenges Encountered:**
  - Some reviews may be **sarcastic or contain mixed sentiment**.

---

## **Step 6: Write a Report**
Your final report should include:

### **1. Task Definition**
- Define **sentiment analysis** using **star ratings (1-5 stars)**.

### **2. Model Selection**
- Justify why **LiYuan/amazon-review-sentiment-analysis** was chosen.

### **3. Management Process**
- Outline **model tracking, versioning, and deployment** steps.

### **4. Solution Development**
- Document **data preprocessing, model integration, and inference steps**.

### **5. Evaluation Results**
- Show **classification report metrics**.

### **6. Future Improvements**
- Discuss **fine-tuning approaches**.

---

## **Step 7: Deployment (Optional)**
1. **Deploy as an API with Azure AI Studio**
   - Create an **Azure Function or REST API**.
   - Example API call:
     ```python
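     import requests

     # Placeholder: replace with the scoring URL of your deployed endpoint.
     url = "YOUR_DEPLOYED_MODEL_URL"
     # The payload schema depends on how the endpoint is implemented; "review" is illustrative.
     data = {"review": "The product quality is excellent!"}
     response = requests.post(url, json=data, timeout=30)
     response.raise_for_status()
     print(response.json())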
     ```

2. **Integrate with Power BI or Web Apps**
   - Visualize **sentiment analysis insights** in **Power BI**.

---

## **Final Summary**
| Step | Action |
|------|--------|
| **1. Define Task** | Sentiment Analysis for Amazon Product Reviews (1-5 stars) |
| **2. Explore Model Catalog** | Select **LiYuan/amazon-review-sentiment-analysis** |
| **3. Manage Model** | Deploy in Azure AI Studio |
| **4. Develop Solution** | Preprocess data, integrate model, classify sentiment |
| **5. Evaluate Solution** | Compute accuracy and refine predictions |
| **6. Write Report** | Document findings and improvements |
| **7. Deploy (Optional)** | API integration, Power BI visualization |


---

# [distilbert-base-uncased-finetuned-mnli-amazon-query-shopping](https://huggingface.co/LiYuan/amazon-review-sentiment-analysis)

This model is a fine-tuned version of [nlptown/bert-base-multilingual-uncased-sentiment](https://huggingface.co/nlptown/bert-base-multilingual-uncased-sentiment?text=I+like+you.+I+love+you) on the [Amazon US Customer Reviews Dataset](https://www.kaggle.com/datasets/cynthiarempel/amazon-us-customer-reviews-dataset). The fine-tuning process is documented in [this notebook](https://github.com/vanderbilt-data-science/bigdata/blob/main/06-fine-tune-BERT-on-our-dataset.ipynb).

This is an **uncased model**, meaning it treats "english" and "English" as the same. 

### Evaluation Results
- **Loss:** 0.5203  
- **Accuracy:** 80%

---

## [Model Description](https://huggingface.co/LiYuan/amazon-review-sentiment-analysis#model-description)

This is a **bert-base-multilingual-uncased** model fine-tuned for **sentiment analysis** on product reviews in six languages:  
**English, Dutch, German, French, Spanish, and Italian.**  

It predicts review sentiment as a **star rating (1 to 5 stars).**  

### Fine-Tuning Details
- The original model’s head was replaced with a custom classifier.
- Fine-tuned on **17,280 training samples** and **validated on 4,320 samples**.
- Evaluated on a **held-out test set of 2,400 samples**.

This model is suitable for direct use in **product review sentiment analysis** or further fine-tuning for related tasks.
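
As a quick check on the sample counts above, the snippet below computes each set's share of the total. It assumes the three sets are disjoint parts of one sample, which the description suggests but does not state outright.

```python
train, validation, test = 17_280, 4_320, 2_400
total = train + validation + test

# Print each split with its percentage of all samples.
for name, size in [("train", train), ("validation", validation), ("test", test)]:
    print(f"{name}: {size} samples ({size / total:.0%} of {total})")
```

Under that assumption the split is roughly 72% training, 18% validation, and 10% test.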

---

## [Intended Uses & Limitations](https://huggingface.co/LiYuan/amazon-review-sentiment-analysis#intended-uses--limitations)

### Intended Uses
- **Review Sentiment Analysis**: Predicts the star rating for a given product review.
- **Fine-tuning**: Can be further refined for domain-specific sentiment tasks.

### Limitations
- Optimized for **Amazon product reviews**; performance may degrade on reviews from other platforms or different text domains.

---

## [How to Use](https://huggingface.co/LiYuan/amazon-review-sentiment-analysis#how-to-use)

You can load the model and tokenizer using the **Hugging Face Transformers** library:

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("LiYuan/amazon-review-sentiment-analysis")
model = AutoModelForSequenceClassification.from_pretrained("LiYuan/amazon-review-sentiment-analysis")
```
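
After loading, a forward pass such as `model(**tokenizer(text, return_tensors="pt")).logits` yields one logit per star class. The sketch below shows, in plain Python, how those logits can be turned into a label and a probability. The `ID2LABEL` order is an assumption (check `model.config.id2label` on the loaded model), and the logits are made-up values.

```python
import math

# Assumed label order; verify against model.config.id2label.
ID2LABEL = {0: "1 star", 1: "2 stars", 2: "3 stars", 3: "4 stars", 4: "5 stars"}

def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def top_label(logits):
    """Return the most probable label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best], probs[best]

# Made-up logits for illustration; real ones come from the model's output.
label, prob = top_label([-2.1, -1.7, -0.3, 1.2, 3.4])
print(label, round(prob, 3))
```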

---

## [Training and Evaluation Data](https://huggingface.co/LiYuan/amazon-review-sentiment-analysis#training-and-evaluation-data)

The raw dataset is available on [Kaggle](https://www.kaggle.com/datasets/cynthiarempel/amazon-us-customer-reviews-dataset).

---

## [Training Procedure](https://huggingface.co/LiYuan/amazon-review-sentiment-analysis#training-procedure)

### **Training Hyperparameters**
- **Learning Rate:** 2e-5  
- **Batch Size (Train/Eval):** 16  
- **Seed:** 42  
- **Optimizer:** Adam (β₁=0.9, β₂=0.999, ε=1e-8)  
- **LR Scheduler:** Linear  
- **Epochs:** 2  
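
To make these settings concrete, the sketch below derives the number of optimizer steps from the 17,280 training samples mentioned in the fine-tuning details and evaluates a linear decay schedule. It assumes no warmup and no gradient accumulation, neither of which is stated here. `linear_lr` is a hypothetical helper.

```python
import math

TRAIN_SAMPLES = 17_280  # from the fine-tuning details
BATCH_SIZE = 16
EPOCHS = 2
BASE_LR = 2e-5

steps_per_epoch = math.ceil(TRAIN_SAMPLES / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS

def linear_lr(step):
    """Learning rate after `step` optimizer updates, decaying linearly to zero.

    Assumes no warmup phase.
    """
    return BASE_LR * max(0.0, 1 - step / total_steps)

print(steps_per_epoch, total_steps)
print(linear_lr(0), linear_lr(steps_per_epoch), linear_lr(total_steps))
```

With these numbers one epoch is 1,080 steps and the full run is 2,160 steps, so the learning rate has fallen to half its starting value at the end of the first epoch.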

---

## [Training Results](https://huggingface.co/LiYuan/amazon-review-sentiment-analysis#training-results)

| Epoch | Step | Training Loss | Validation Loss | Accuracy |
|-------|------|--------------|----------------|----------|
| 1.0   | 1080 | 0.5554       | 0.5203         | 80.00%   |
| 2.0   | 2160 | 0.4243       | 0.5496         | 79.84%   |

---

### Summary:
This **multilingual BERT-based model** reaches about **80% accuracy** on its evaluation set for **Amazon review star-rating prediction**, and its base model covers six languages. It is well suited to **predicting review star ratings** and can be fine-tuned for other **text classification** tasks.

For more details, visit the [model page](https://huggingface.co/LiYuan/amazon-review-sentiment-analysis).

---

```shell
pip install python-dotenv
```

```python
import requests
import os
from dotenv import load_dotenv

# Load API key from .env
load_dotenv()
API_KEY = os.getenv("HF_API_KEY")

# Debugging: Check if API key was loaded correctly
if not API_KEY:
    raise ValueError("API key not found! Make sure .env file is set up correctly.")

API_URL = "https://api-inference.huggingface.co/models/LiYuan/amazon-review-sentiment-analysis"
headers = {"Authorization": f"Bearer {API_KEY}"}

def query(payload):
    response = requests.post(API_URL, headers=headers, json=payload)

    # Debugging: Check if API request was successful
    print(f"Status Code: {response.status_code}")
    print(f"Response Text: {response.text}")

    if response.status_code != 200:
        raise ValueError(f"API Request Failed: {response.text}")

    return response.json()

# Test a sample review
data = query({"inputs": "This product is fantastic! Exceeded my expectations."})
print(data)
```


```
Status Code: 200
Response Text: [[{"label":"5 stars","score":0.9718555212020874},{"label":"4 stars","score":0.026542097330093384},{"label":"3 stars","score":0.0007599125965498388},{"label":"1 star","score":0.0005056087975390255},{"label":"2 stars","score":0.0003368980251252651}]]
[[{'label': '5 stars', 'score': 0.9718555212020874}, {'label': '4 stars', 'score': 0.026542097330093384}, {'label': '3 stars', 'score': 0.0007599125965498388}, {'label': '1 star', 'score': 0.0005056087975390255}, {'label': '2 stars', 'score': 0.0003368980251252651}]]
```
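
The request code above raises on any non-200 status. Hosted inference endpoints can answer with a temporary 503 while a model is still loading. That is an assumption about endpoint behavior, not something shown in the output above. The sketch below retries in that case. `query_with_retry` and its `send` callable (returning a status code and a parsed body) are hypothetical, and the stand-in transport lets the example run offline.

```python
import time

def query_with_retry(send, payload, max_attempts=3, wait_seconds=1.0, sleep=time.sleep):
    """Call send(payload) and retry while it reports HTTP 503.

    `send` is a hypothetical callable returning (status_code, parsed_body).
    """
    for attempt in range(1, max_attempts + 1):
        status, body = send(payload)
        if status == 200:
            return body
        if status != 503 or attempt == max_attempts:
            raise RuntimeError(f"request failed with status {status}: {body}")
        sleep(wait_seconds * attempt)  # wait a little longer after each failure

# Stand-in transport: fails once with 503, then succeeds.
responses = iter([
    (503, {"error": "loading"}),
    (200, [[{"label": "5 stars", "score": 0.97}]]),
])
result = query_with_retry(lambda payload: next(responses), {"inputs": "Great!"}, sleep=lambda s: None)
print(result)
```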


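```python
import kagglehub

# Download the dataset (latest version)
dataset_path = kagglehub.dataset_download("promptcloud/amazon-product-reviews-dataset")

print("Path to dataset files:", dataset_path)
```

```
ModuleNotFoundError: No module named 'kagglehub'
```

The error means the `kagglehub` package is not installed in the notebook environment. Install it with `pip install kagglehub` and rerun the cell.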