# AI Model to Understand and Process Abstract Concepts

## Introduction
In this notebook, we will build a model that classifies text by an abstract concept, here the emotion a sentence expresses (happiness or sadness). We will combine a pretrained BERT model, used to turn sentences into feature vectors, with a logistic regression classifier trained on those vectors.

## Step 1: Setup Environment
First, make sure the required libraries are installed. Run the following command to install `transformers`, `scikit-learn`, and `torch` if you haven't already.

In [1]:
!pip install transformers scikit-learn torch



## Step 2: Import Libraries
We will import the libraries used in the rest of the notebook: `numpy` for numerical operations, `scikit-learn` (imported as `sklearn`) for the train/test split and the classifier, `transformers` for the BERT model and tokenizer, and `torch` for tensor operations.

In [2]:
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from transformers import BertTokenizer, BertModel
import torch


## Step 3: Create a Dataset
We will create a small dataset of 42 sentences, 21 labeled as expressing happiness and 21 as expressing sadness. This dataset will be used to train and evaluate our model.

In [3]:
data = [
    ("I feel great today!", "happiness"),
    ("This is the best day ever!", "happiness"),
    ("I am so sad and depressed.", "sadness"),
    ("Why does everything bad happen to me?", "sadness"),
    ("I am thrilled with the news!", "happiness"),
    ("I can't stop crying.", "sadness"),
    ("What a wonderful world!", "happiness"),
    ("I am feeling blue.", "sadness"),
    ("Everything is going well.", "happiness"),
    ("I lost my job today.", "sadness"),
    ("I am excited about the future.", "happiness"),
    ("I am heartbroken.", "sadness"),
    ("Life is beautiful.", "happiness"),
    ("I am in despair.", "sadness"),
    ("The sun is shining bright.", "happiness"),
    ("I feel lonely.", "sadness"),
    ("I have achieved my goals.", "happiness"),
    ("I am worried about the test.", "sadness"),
    ("I love spending time with my family.", "happiness"),
    ("I am scared of the dark.", "sadness"),
    ("I am so proud of myself.", "happiness"),
    ("I regret my decisions.", "sadness"),
    ("This is an amazing experience!", "happiness"),
    ("I am feeling hopeless.", "sadness"),
    ("I am on top of the world!", "happiness"),
    ("I am completely lost.", "sadness"),
    ("I am grateful for everything.", "happiness"),
    ("I feel like a failure.", "sadness"),
    ("Today is a fantastic day!", "happiness"),
    ("I am disappointed.", "sadness"),
    ("I am full of energy.", "happiness"),
    ("I am exhausted.", "sadness"),
    ("I am enjoying every moment.", "happiness"),
    ("I am devastated.", "sadness"),
    ("I am content with my life.", "happiness"),
    ("I feel numb.", "sadness"),
    ("I am in love.", "happiness"),
    ("I am anxious.", "sadness"),
    ("I am proud of my achievements.", "happiness"),
    ("I feel empty inside.", "sadness"),
    ("I am celebrating my success.", "happiness"),
    ("I feel abandoned.", "sadness")
]

sentences, labels = zip(*data)
labels = np.array([1 if label == "happiness" else 0 for label in labels])
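Before training, it can help to confirm that the labels were encoded as intended and that both classes are represented. The following is a minimal sketch of that check. It uses a three-sentence stand-in list (`sample_data` is a hypothetical name, not part of the notebook) so that it runs on its own; the same steps apply to the full `data` list.

```python
from collections import Counter

# Hypothetical stand-in for the notebook's `data` list (same (sentence, label) shape).
sample_data = [
    ("I feel great today!", "happiness"),
    ("I am so sad and depressed.", "sadness"),
    ("I am thrilled with the news!", "happiness"),
]

# zip(*pairs) transposes the list of pairs into a tuple of sentences and a tuple of labels.
sample_sentences, sample_labels = zip(*sample_data)

# Encode labels as integers: 1 for happiness, 0 for sadness (the notebook's convention).
encoded = [1 if label == "happiness" else 0 for label in sample_labels]

# Count how many sentences carry each label.
label_counts = Counter(sample_labels)
print(encoded)
print(label_counts)
```

A heavily skewed count would be a reason to treat plain accuracy with caution later on.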

## Step 4: Tokenize and Extract Features using BERT
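We will use BERT (Bidirectional Encoder Representations from Transformers) to tokenize the sentences and turn each one into a fixed-length feature vector. The model's final hidden states are averaged over the token dimension, which gives one 768-dimensional vector per sentence for `bert-base-uncased`.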

In [4]:
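# Load the pretrained tokenizer and encode all sentences as one padded batch.
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
tokens = tokenizer(list(sentences), return_tensors='pt', padding=True, truncation=True, max_length=128)

# Load the pretrained encoder and run it without tracking gradients.
model = BertModel.from_pretrained('bert-base-uncased')
with torch.no_grad():
    outputs = model(**tokens)
    # Average the final hidden states over the token dimension: one vector per sentence.
    features = outputs.last_hidden_state.mean(dim=1).numpy()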
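One detail of the pooling step is worth spelling out: `mean(dim=1)` averages over every token position, so in a padded batch the padding positions of shorter sentences also contribute to the sentence vector. A common alternative is to average only the positions where the tokenizer's `attention_mask` is 1. The sketch below illustrates the difference on plain Python lists with made-up numbers; `plain_mean` and `masked_mean` are hypothetical helper names, and this is not the pooling the notebook uses.

```python
def plain_mean(token_vectors):
    """Average all token vectors, padding included."""
    dim = len(token_vectors[0])
    return [sum(vec[i] for vec in token_vectors) / len(token_vectors) for i in range(dim)]


def masked_mean(token_vectors, attention_mask):
    """Average only the token vectors whose mask entry is 1 (real tokens)."""
    kept = [vec for vec, keep in zip(token_vectors, attention_mask) if keep == 1]
    return plain_mean(kept)


# Made-up hidden states for one sentence: two real tokens, then one padding position.
token_vectors = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
attention_mask = [1, 1, 0]

print(plain_mean(token_vectors))                   # padding vector pulls the average up
print(masked_mean(token_vectors, attention_mask))  # average of the two real tokens only
```

On tensors, the same idea amounts to multiplying the hidden states by the mask, summing over the token dimension, and dividing by the number of real tokens per sentence.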

## Step 5: Train a Classifier
We will split the data into training and test sets and then train a logistic regression classifier on the features extracted by BERT.

In [5]:
X_train, X_test, y_train, y_test = train_test_split(features, labels, test_size=0.2, random_state=42)

classifier = LogisticRegression()
classifier.fit(X_train, y_train)
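For intuition about what the fitted classifier does, logistic regression scores a feature vector with a weighted sum plus a bias and passes that score through a sigmoid to obtain the probability of label 1. The following is a minimal sketch with made-up weights in three dimensions; the names `predict_proba_one`, `weights`, and `bias` are illustrative, whereas the trained model stores its learned values in `classifier.coef_` and `classifier.intercept_`.

```python
import math


def predict_proba_one(features, weights, bias):
    """Probability of label 1 for a single feature vector."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-score))


# Hypothetical 3-dimensional example; the BERT features have 768 dimensions.
weights = [0.5, -1.0, 2.0]
bias = 0.0
features = [2.0, 0.0, 0.5]  # score = 0.5*2.0 + (-1.0)*0.0 + 2.0*0.5 = 2.0

p = predict_proba_one(features, weights, bias)
label = 1 if p > 0.5 else 0  # 1 = happiness, 0 = sadness
print(f"{p:.3f} -> {label}")
```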

## Step 6: Evaluate the Model
We will evaluate the accuracy of our model on the test set to see how well it performs.

In [6]:
accuracy = classifier.score(X_test, y_test)
print(f'Accuracy: {accuracy * 100:.2f}%')

Accuracy: 100.00%
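A perfect score on a very small test set leaves a lot of uncertainty about the true accuracy. One way to quantify this is a Wilson score interval for the proportion of correct predictions. The sketch below assumes that the 20% split of the 42 sentences yields 9 test sentences, all classified correctly; `wilson_interval` is a hypothetical helper, not something the notebook defines.

```python
import math


def wilson_interval(correct, total, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives roughly 95% coverage)."""
    p = correct / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    margin = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - margin, centre + margin


# Assumption: 9 test sentences, all classified correctly.
low, high = wilson_interval(correct=9, total=9)
print(f"95% interval: {low:.2f} to {high:.2f}")  # prints: 95% interval: 0.70 to 1.00
```

Under that assumption, the data are consistent with a true accuracy anywhere from about 70% upward, so the 100% figure should be read as an optimistic point estimate.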


## Step 7: Make Predictions
Finally, we will demonstrate making a prediction on a new sentence to classify it as expressing either happiness or sadness.

In [7]:
test_sentence = "I am extremely happy today!"
test_tokens = tokenizer(test_sentence, return_tensors='pt', padding=True, truncation=True, max_length=128)
with torch.no_grad():
    test_features = model(**test_tokens).last_hidden_state.mean(dim=1).numpy()
prediction = classifier.predict(test_features)

# predict returns one label per input row; read the single entry explicitly.
print(f'The sentence "{test_sentence}" is classified as: {"happiness" if prediction[0] == 1 else "sadness"}')

The sentence "I am extremely happy today!" is classified as: happiness
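Since the same tokenize, embed, and predict steps apply to every new sentence, they can be wrapped in a small helper. The following is a sketch under stated assumptions: `classify_sentence`, `toy_embed`, and `ToyClassifier` are hypothetical names, and the two toy stand-ins exist only so the example runs without loading BERT. In the notebook, `embed` would wrap the tokenizer and model calls shown above, and `classifier` would be the trained logistic regression.

```python
def classify_sentence(sentence, embed, classifier):
    """Return "happiness" or "sadness" for one sentence.

    `embed` maps a sentence to a 2-D feature array with one row, and
    `classifier` is any object with a scikit-learn style `predict` method.
    """
    prediction = classifier.predict(embed(sentence))
    # predict returns one label per row; take the single entry explicitly.
    return "happiness" if prediction[0] == 1 else "sadness"


class ToyClassifier:
    """Toy stand-in for the trained classifier: label 1 if the first feature is positive."""

    def predict(self, rows):
        return [1 if row[0] > 0 else 0 for row in rows]


def toy_embed(sentence):
    """Toy stand-in for BERT pooling: one feature, +1.0 if "happy" appears, else -1.0."""
    return [[1.0 if "happy" in sentence.lower() else -1.0]]


print(classify_sentence("I am extremely happy today!", toy_embed, ToyClassifier()))
```

Passing the embedding function and classifier as arguments keeps the helper independent of any particular model.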


## Conclusion
In this notebook, we implemented a model that classifies text by the emotion it expresses, using BERT for feature extraction and logistic regression for classification. Because the dataset contains only 42 sentences and the test set is a small fraction of them, the reported accuracy is a rough estimate; a larger and more varied dataset would give a more reliable measure of how well the approach generalizes.

In [8]:
test_sentence = "I am not well"
test_tokens = tokenizer(test_sentence, return_tensors='pt', padding=True, truncation=True, max_length=128)
with torch.no_grad():
    test_features = model(**test_tokens).last_hidden_state.mean(dim=1).numpy()
prediction = classifier.predict(test_features)

print(f'The sentence "{test_sentence}" is classified as: {"happiness" if prediction[0] == 1 else "sadness"}')


The sentence "I am not well" is classified as: sadness


In [9]:
test_sentence = "I am good"
test_tokens = tokenizer(test_sentence, return_tensors='pt', padding=True, truncation=True, max_length=128)
with torch.no_grad():
    test_features = model(**test_tokens).last_hidden_state.mean(dim=1).numpy()
prediction = classifier.predict(test_features)

print(f'The sentence "{test_sentence}" is classified as: {"happiness" if prediction[0] == 1 else "sadness"}')


The sentence "I am good" is classified as: happiness


In [10]:
test_sentence = "I feel low"
test_tokens = tokenizer(test_sentence, return_tensors='pt', padding=True, truncation=True, max_length=128)
with torch.no_grad():
    test_features = model(**test_tokens).last_hidden_state.mean(dim=1).numpy()
prediction = classifier.predict(test_features)

print(f'The sentence "{test_sentence}" is classified as: {"happiness" if prediction[0] == 1 else "sadness"}')


The sentence "I feel low" is classified as: sadness
