# Notebook: IMDb Sentiment Analysis with ClearML

## Objective
The goal of this project is to perform sentiment analysis on the IMDb Movie Reviews dataset, classifying reviews as positive or negative. We'll leverage ClearML for experiment tracking, visualization, and comparison.

## Workflow
1. Load and explore the dataset.
2. Preprocess the text data.
3. Train and evaluate machine learning models.
4. Track experiments and log results using ClearML.

## Tools and Libraries
- Python
- Pandas, Scikit-learn for data handling and modeling
- ClearML for experiment tracking and logging

## Step 1: Load and Explore the Dataset


In [19]:
import pandas as pd
from clearml import Dataset

# Create a new dataset version
dataset = Dataset.create(dataset_name="IMDb Reviews", dataset_project="Sentiment Analysis")
dataset.add_files("../data/IMDB-Dataset.csv")
dataset.upload()
dataset.finalize()

df = pd.read_csv("../data/IMDB-Dataset.csv")

df.head()

ClearML results page: https://app.clear.ml/projects/4940d4acad2948baa4975ef47ca43225/experiments/d4bcea17b73547268812d44ef74d5eca/output/log
ClearML dataset page: https://app.clear.ml/datasets/simple/4940d4acad2948baa4975ef47ca43225/experiments/d4bcea17b73547268812d44ef74d5eca
Uploading dataset changes (1 files compressed to 25.33 MiB) to https://files.clear.ml
File compression and upload completed: total size 25.33 MiB, 1 chunk(s) stored (average size 25.33 MiB)


   review                                              sentiment
0  One of the other reviewers has mentioned that ...  positive
1  A wonderful little production. <br /><br />The...  positive
2  I thought this was a wonderful way to spend ti...  positive
3  Basically there's a family where a little boy ...  negative
4  Petter Mattei's "Love in the Time of Money" is...  positive


In [9]:
df.isnull().sum()

review       0
sentiment    0
dtype: int64

In [None]:
df['sentiment'].value_counts()

sentiment
positive    25000
negative    25000
Name: count, dtype: int64
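
With 25,000 reviews per class, a classifier that always predicts a single label would score 50% accuracy, which gives a floor for reading any model's result. A minimal sketch of that baseline follows; `majority_baseline_accuracy` is a hypothetical helper name, and the label list is rebuilt from the counts shown above rather than read from the CSV.

```python
from collections import Counter


def majority_baseline_accuracy(labels):
    """Accuracy of always predicting the most frequent label."""
    counts = Counter(labels)
    return max(counts.values()) / sum(counts.values())


# Rebuild the label distribution from the value_counts() output above.
labels = ["positive"] * 25000 + ["negative"] * 25000
baseline = majority_baseline_accuracy(labels)
print(f"Majority-class baseline accuracy: {baseline}")
```

In the notebook itself the same helper could be called as `majority_baseline_accuracy(df['sentiment'])`.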

## Step 2: Data Preprocessing
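We'll split the data into training and test sets, then convert the reviews into numerical features using TF-IDF with English stop words removed. No other text cleaning is applied in this step.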

In [11]:
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import TfidfVectorizer

# Split the data into training and testing sets
X = df['review']  
y = df['sentiment']  
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Convert text to numerical representation using TF-IDF
vectorizer = TfidfVectorizer(max_features=5000, stop_words='english')
X_train_vec = vectorizer.fit_transform(X_train)
X_test_vec = vectorizer.transform(X_test)

X_train_vec

<Compressed Sparse Row sparse matrix of dtype 'float64'
	with 2806532 stored elements and shape (40000, 5000)>
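
The sample rows in Step 1 contain `<br />` tags, and with the default tokenizer these would likely surface as a `br` token in the vocabulary. The following is a minimal sketch of a cleaning step that could run before vectorizing. `clean_review` is a hypothetical helper and the sample string is made up for illustration; neither is part of the original notebook.

```python
import html
import re

_TAG_RE = re.compile(r"<[^>]+>")  # matches HTML tags such as <br />
_WS_RE = re.compile(r"\s+")       # matches runs of whitespace


def clean_review(text: str) -> str:
    """Strip HTML tags, decode HTML entities, and collapse whitespace."""
    text = _TAG_RE.sub(" ", text)   # replace tags with a space so words stay separated
    text = html.unescape(text)      # e.g. "&amp;" -> "&"
    return _WS_RE.sub(" ", text).strip()


# Made-up sample in the style of the IMDb reviews.
sample = "Great film.<br /><br />Loved the cast &amp; score."
cleaned = clean_review(sample)
print(cleaned)
```

To apply it here, one option is `X = df['review'].map(clean_review)` before calling `train_test_split`.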

## Step 3: Model Training and ClearML Integration
We'll train a Logistic Regression model and track the experiment using ClearML.


In [15]:
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report
from clearml import Task

# Initialize ClearML Task
task = Task.init(project_name="IMDb Sentiment Analysis", task_name="Logistic Regression")

# Train a Logistic Regression model
model = LogisticRegression(max_iter=1000)
model.fit(X_train_vec, y_train)

# Evaluate the model
y_pred = model.predict(X_test_vec)
accuracy = accuracy_score(y_test, y_pred)
print(f"Test Accuracy: {accuracy}")
print("\nClassification Report:")
print(classification_report(y_test, y_pred))

Test Accuracy: 0.8889

Classification Report:
              precision    recall  f1-score   support

    negative       0.90      0.87      0.89      4961
    positive       0.88      0.90      0.89      5039

    accuracy                           0.89     10000
   macro avg       0.89      0.89      0.89     10000
weighted avg       0.89      0.89      0.89     10000
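
The per-class figures in the report follow from counts of true positives, false positives, and false negatives. The sketch below recomputes them for a tiny made-up label set (not the IMDb predictions). `per_class_metrics` is a hypothetical helper written for illustration, not a scikit-learn function.

```python
def per_class_metrics(y_true, y_pred, label):
    """Compute precision, recall, F1, and support for a single class label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)  # correctly predicted as label
    fp = sum(t != label and p == label for t, p in pairs)  # wrongly predicted as label
    fn = sum(t == label and p != label for t, p in pairs)  # label missed by the model

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "support": tp + fn}


# Toy labels for illustration only.
toy_true = ["negative", "negative", "positive", "positive"]
toy_pred = ["negative", "positive", "positive", "positive"]

negative_metrics = per_class_metrics(toy_true, toy_pred, "negative")
positive_metrics = per_class_metrics(toy_true, toy_pred, "positive")
print(negative_metrics)
print(positive_metrics)
```

Support is the number of true examples of the class, which is why the two support values in the report sum to the size of the test set.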



In [16]:
# Log the accuracy to ClearML
task.get_logger().report_scalar("Accuracy", "Test", iteration=1, value=accuracy)
task.close()

## Step 4: Experiment Tracking
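After this run, the ClearML dashboard shows:
1. The reported scalar (test accuracy) and the captured console output, which includes the printed classification report.
2. Any artifacts or model files the task uploads (none are uploaded explicitly in this notebook).
3. Metadata about the experiment, such as the task ID and project.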

You can compare multiple experiments (e.g., different models) on the dashboard.
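
Reporting several metrics under consistent titles makes side-by-side comparison easier. Below is a minimal sketch of a helper that sends a dictionary of metrics through `report_scalar`, the same call used in this notebook. `log_metrics` and `RecordingLogger` are hypothetical names; the stand-in logger only records calls so the example runs without a ClearML server. The values are the test accuracy and macro-average F1 printed in Step 3.

```python
def log_metrics(logger, metrics, series="Test", iteration=0):
    """Report each metric as its own scalar, using the metric name as the plot title."""
    for title, value in metrics.items():
        logger.report_scalar(
            title=title, series=series, value=float(value), iteration=iteration
        )


class RecordingLogger:
    """Offline stand-in for a ClearML logger (assumption: only report_scalar is needed)."""

    def __init__(self):
        self.calls = []

    def report_scalar(self, title, series, value, iteration):
        self.calls.append((title, series, value, iteration))


demo_logger = RecordingLogger()
log_metrics(demo_logger, {"Accuracy": 0.8889, "F1 (macro)": 0.89})
for call in demo_logger.calls:
    print(call)
```

In a real run, the object returned by `task.get_logger()` would be passed in place of `demo_logger`, before `task.close()` is called.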

## Step 5: Compare Multiple Models
We'll train another model (Naive Bayes) and log its results for comparison.

In [17]:
from sklearn.naive_bayes import MultinomialNB

# Start a separate ClearML task so this run can be compared with the previous one
task = Task.init(project_name="IMDb Sentiment Analysis", task_name="Naive Bayes")

# Train a Naive Bayes model
model_nb = MultinomialNB()
model_nb.fit(X_train_vec, y_train)

# Evaluate the Naive Bayes model
y_pred_nb = model_nb.predict(X_test_vec)
accuracy_nb = accuracy_score(y_test, y_pred_nb)
print(f"Naive Bayes Test Accuracy: {accuracy_nb}")

Could not read Jupyter Notebook: No module named 'nbconvert'
Please install nbconvert using "pip install nbconvert"


ClearML Task: created new task id=279d5440a220455ca2933295946161d9
ClearML results page: https://app.clear.ml/projects/8d7422d16fe44ab780913011deb6f3f8/experiments/279d5440a220455ca2933295946161d9/output/log
Naive Bayes Test Accuracy: 0.8508


ClearML Monitor: GPU monitoring failed getting GPU reading, switching off GPU monitoring


In [18]:
task.get_logger().report_scalar("Accuracy", "Test", iteration=1, value=accuracy_nb)
task.close()
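
When more than two models are compared, the fit-and-score steps can be folded into a loop. The sketch below shows one way to do that on a tiny made-up corpus so it runs on its own. `compare_models` is a hypothetical helper, and the toy texts and labels are not from the IMDb data.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.naive_bayes import MultinomialNB


def compare_models(models, X_train, y_train, X_test, y_test):
    """Fit each model on the same split and return {name: test accuracy}."""
    results = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        results[name] = accuracy_score(y_test, model.predict(X_test))
    return results


# Toy corpus for illustration only.
train_texts = ["great movie", "loved it", "wonderful film",
               "terrible movie", "hated it", "awful film"]
train_labels = ["positive"] * 3 + ["negative"] * 3
test_texts = ["great film", "awful movie"]
test_labels = ["positive", "negative"]

toy_vectorizer = TfidfVectorizer()
toy_train_vec = toy_vectorizer.fit_transform(train_texts)
toy_test_vec = toy_vectorizer.transform(test_texts)

results = compare_models(
    {
        "Logistic Regression": LogisticRegression(max_iter=1000),
        "Naive Bayes": MultinomialNB(),
    },
    toy_train_vec, train_labels, toy_test_vec, test_labels,
)
for name, acc in sorted(results.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {acc}")
```

In this notebook the same helper could be called with `X_train_vec`, `y_train`, `X_test_vec`, and `y_test`, with one ClearML task opened and closed per model inside the loop.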

## Step 6: Conclusion

### Key Findings
1. **Logistic Regression** achieved a test accuracy of 88.89%.
2. **Naive Bayes** achieved a test accuracy of 85.08%.
3. ClearML streamlined experiment tracking and comparison, providing a centralized platform for logging and visualizing results.

### Future Work
- Experiment with advanced models (e.g., Random Forest, Neural Networks).
- Automate the pipeline using ClearML Agents.
- Deploy the best-performing model using ClearML Serving or a custom web app.

## Thank You!
