<a href="https://colab.research.google.com/github/lynaBoukari/Coding4Integrity/blob/main/ExerciseDifficultyClassifierNN.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Exercise Difficulty Classification Model

This section implements a neural network that classifies exercises into three difficulty levels (easy, medium, or hard). The predicted level is then used to update exercise metadata automatically.

### Model Architecture

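The model is a small feedforward network built with the Keras `Sequential` API. It has the following layers:

- Input Layer: An `Embedding` layer maps each token ID in the padded exercise description to a 100-dimensional vector, and a `Flatten` layer concatenates these vectors into a single feature vector.
- Hidden Layer: A `Dense` layer with 64 units and ReLU activation transforms the flattened features.
- Output Layer: A `Dense` layer with 3 units and softmax activation produces a probability for each difficulty level.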
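
To make the layer description concrete, here is a minimal pure-Python sketch of a forward pass through one hidden layer and a softmax output. The weights, biases, and input features are hypothetical toy values chosen for illustration; they are not taken from the trained model, and the real network in this notebook is built with Keras.

```python
import math

def relu(values):
    # ReLU keeps positive activations and zeroes out negative ones
    return [max(0.0, v) for v in values]

def softmax(values):
    # Subtract the maximum for numerical stability before exponentiating
    shift = max(values)
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def dense(inputs, weights, biases):
    # weights[j] holds the incoming weights of output unit j
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

# Hypothetical toy parameters: 2 input features, 2 hidden units, 3 classes
hidden_w = [[0.5, -0.2], [0.1, 0.4]]
hidden_b = [0.0, 0.1]
out_w = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
out_b = [0.0, 0.0, 0.0]

features = [1.0, 2.0]
hidden = relu(dense(features, hidden_w, hidden_b))
probs = softmax(dense(hidden, out_w, out_b))

# The predicted class is the index with the highest probability
predicted_class = max(range(len(probs)), key=lambda i: probs[i])
```

The softmax output always sums to 1, so each entry can be read as the model's estimated probability for one difficulty level.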

### Training and Evaluation

The model is trained on a labeled dataset of exercises, where each exercise description is paired with a difficulty level. During training, the model learns to map the tokenized description to that level.

After training, the model is evaluated on a held-out test set (20% of the data) to measure its accuracy and how well it generalizes to unseen exercises.
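
As an illustration of the evaluation step, the sketch below computes accuracy and per-pair confusion counts from true and predicted labels. The label lists are made-up examples, and the helper names `accuracy` and `confusion_counts` are hypothetical. In the notebook itself, Keras reports accuracy through `model.evaluate`.

```python
from collections import Counter

def accuracy(y_true, y_pred):
    # Fraction of predictions that match the true label
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

def confusion_counts(y_true, y_pred):
    # Counts of (true label, predicted label) pairs
    return Counter(zip(y_true, y_pred))

# Made-up labels for illustration only
y_true = ["easy", "medium", "hard", "medium", "easy"]
y_pred = ["easy", "hard", "hard", "medium", "medium"]

acc = accuracy(y_true, y_pred)
counts = confusion_counts(y_true, y_pred)
print(f"accuracy = {acc:.2f}")
print(counts[("medium", "hard")], "medium exercise(s) predicted as hard")
```

Accuracy alone can hide which classes are confused with each other, so looking at the pair counts alongside it is often worthwhile when the classes are imbalanced.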

### Usage

The trained model is integrated into the exam paper platform, where it automatically classifies exercises based on their features and updates the exercise metadata with the predicted difficulty level.
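
The metadata update could look roughly like the sketch below. The platform's real data model is not shown in this notebook, so the function name `apply_prediction`, the `difficulty_confidence` field, and the exercise dictionary are all hypothetical. The class order is also an assumption: scikit-learn's `LabelEncoder` assigns indices in sorted label order, so real code should read the order from `label_encoder.classes_` rather than hard-coding it.

```python
# Assumed class order; with LabelEncoder, read label_encoder.classes_ instead
CLASS_NAMES = ["easy", "hard", "medium"]

def apply_prediction(exercise, probabilities, class_names=CLASS_NAMES):
    # Pick the class with the highest predicted probability
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    # Return a copy so the original exercise record is left untouched
    updated = dict(exercise)
    updated["difficulty"] = class_names[best]
    updated["difficulty_confidence"] = round(probabilities[best], 4)
    return updated

# Hypothetical exercise record and model output
exercise = {"id": 17, "description": "Reverse a linked list in place."}
result = apply_prediction(exercise, [0.10, 0.15, 0.75])
```

Storing the confidence next to the label lets the platform flag low-confidence predictions for manual review instead of applying them blindly.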

### Implementation Details

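- Dataset: The data consists of exercise descriptions with labeled difficulty levels, loaded from a CSV file. Before training, the descriptions are tokenized and padded to a common length, and the labels are encoded as integers.
- Training: The model is trained for 10 epochs with the Adam optimizer (a gradient-descent variant) and sparse categorical cross-entropy loss, with gradients computed by backpropagation. 20% of the training set is held out for validation.
- Evaluation: The model's performance is assessed on a separate test set (20% of the data) using accuracy.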
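
The preprocessing step can be illustrated with a simplified pure-Python version of tokenization and padding. This is only a sketch of the idea: it splits on whitespace and does not strip punctuation, so it will not reproduce the Keras `Tokenizer` output exactly. The helper names and example texts are hypothetical. Like Keras `pad_sequences` with its default settings, it pads with zeros at the front.

```python
from collections import Counter

def build_vocab(texts):
    # Index words from 1 by descending frequency; 0 is reserved for padding
    counts = Counter(word for text in texts for word in text.lower().split())
    return {word: i for i, (word, _) in enumerate(counts.most_common(), start=1)}

def encode(texts, vocab):
    # Replace each known word with its integer index
    return [[vocab[w] for w in text.lower().split() if w in vocab]
            for text in texts]

def pad(sequences, pad_value=0):
    # Left-pad every sequence to the length of the longest one
    width = max(len(s) for s in sequences)
    return [[pad_value] * (width - len(s)) + s for s in sequences]

# Made-up exercise descriptions for illustration
texts = ["sort a list", "sort a linked list in place"]
vocab = build_vocab(texts)
padded = pad(encode(texts, vocab))
print(padded)
```

After this step every description is a fixed-length list of integers, which is the shape an embedding layer expects as input.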

### Code Implementation

The following code cell loads the data, builds the neural network, trains it, and evaluates it on the test set.

In [5]:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from keras.preprocessing.text import Tokenizer
from keras.preprocessing.sequence import pad_sequences
from keras.models import Sequential
from keras.layers import Embedding, Flatten, Dense

# Load your exercise difficulty data from your CSV file
data = pd.read_csv('/content/level_difficulty.csv')

# Encode the difficulty labels to numerical values
label_encoder = LabelEncoder()
data['difficulty'] = label_encoder.fit_transform(data['difficulty'])

# Tokenize the exercise text
tokenizer = Tokenizer()
tokenizer.fit_on_texts(data['description'])
X = tokenizer.texts_to_sequences(data['description'])
X = pad_sequences(X)

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, data['difficulty'], test_size=0.2, random_state=42)

# Create a simple neural network model
model = Sequential()
model.add(Embedding(input_dim=len(tokenizer.word_index) + 1, output_dim=100, input_length=X.shape[1]))
model.add(Flatten())
model.add(Dense(units=64, activation='relu'))
model.add(Dense(units=3, activation='softmax'))  # 3 output classes (Easy, Medium, Hard)

# Compile the model
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

# Train the model
model.fit(X_train, y_train, epochs=10, batch_size=32, validation_split=0.2)

# Evaluate the model on test data
loss, accuracy = model.evaluate(X_test, y_test)
print(f"Test loss: {loss:.4f}, Test accuracy: {accuracy:.4f}")


Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
Test loss: 0.5516, Test accuracy: 0.8077
