# Title:  Lightweight CNN for Text Classification with LSTM-Based Performance Enhancement

#### Group Member Names :

 Kelvin Ikrokoto

 Clinton Avornu


### INTRODUCTION:
Text classification plays a crucial role in Natural Language Processing (NLP), supporting applications such as document categorization, spam detection, and sentiment analysis.
This project focuses on reproducing the methodology of the research paper “Light-Weighted CNN for Text Classification” and implementing an additional contribution to evaluate the model's performance under a modified architecture. The baseline model is a lightweight Convolutional Neural Network (CNN), and the contribution involves introducing an LSTM-based model for comparison.
*********************************************************************************************************************
#### AIM :
- To reproduce a lightweight CNN text classification model from a selected research paper.

- To introduce a significant contribution by developing an LSTM model for comparative analysis.

- To evaluate and compare the performance of both models using accuracy as the primary metric.
*********************************************************************************************************************
#### Github Repo:
https://github.com/RituYadav92/Lightweighted-CNN-for-Document-Classification
*********************************************************************************************************************
#### DESCRIPTION OF PAPER:
The selected research paper proposes a lightweight CNN architecture designed to improve text classification efficiency while maintaining competitive performance. The paper emphasizes reducing model complexity, speeding up training, and maintaining classification accuracy across multiple classes.
The author's implementation employs convolutional layers to extract local textual features and a pooling layer to reduce dimensionality, followed by dense layers for classification.
*********************************************************************************************************************
#### PROBLEM STATEMENT :
Traditional deep learning approaches for text classification often involve heavy architectures that require significant computational resources. The need for lightweight and efficient models is increasing, especially for deployment in low-resource environments.
This project aims to reproduce the lightweight CNN proposed in the paper and evaluate whether incorporating a sequential model (LSTM) can further enhance classification performance.
*********************************************************************************************************************
#### CONTEXT OF THE PROBLEM:
Modern NLP systems must balance accuracy and efficiency. CNNs capture local text patterns effectively but may struggle with long-term dependencies. LSTMs, by contrast, are designed to capture sequential patterns.
By comparing both architectures on the same dataset, this project explores the trade-off between lightweight design and sequence modeling.
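
The contrast between local pattern detection and sequence modeling can be made concrete with a toy example. The following is a minimal pure-Python sketch, not code from the paper: the one-dimensional "embeddings", the filter weights, and the function names are all illustrative assumptions. It applies a single convolution filter followed by global max pooling and shows that the pooled feature responds to a local pattern wherever it occurs, which is also why information about position is lost.

```python
def conv1d_valid(sequence, kernel):
    """Slide `kernel` over `sequence` (no padding) and return one dot product per window."""
    width = len(kernel)
    return [
        sum(w * x for w, x in zip(kernel, sequence[start:start + width]))
        for start in range(len(sequence) - width + 1)
    ]


def global_max_pool(feature_map):
    """Keep only the strongest activation, discarding where it occurred."""
    return max(feature_map)


# Hypothetical scalar "embeddings": the bigram (1, 2) is the pattern of interest.
kernel = [1, 2]
pattern_at_start = [1, 2, 0, 0, 0]
pattern_at_end = [0, 0, 0, 1, 2]

pooled_start = global_max_pool(conv1d_valid(pattern_at_start, kernel))
pooled_end = global_max_pool(conv1d_valid(pattern_at_end, kernel))

print(pooled_start, pooled_end)  # both are 5
```

Both inputs produce the same pooled value, so a classifier placed on top cannot tell whether the pattern appeared early or late in the text. A recurrent layer, which processes tokens in order, does not have this limitation by construction.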
*********************************************************************************************************************
#### SOLUTION:
- A lightweight CNN was implemented as the baseline model.

- An LSTM model was introduced as the contribution to analyze the impact of sequential learning.

- Both models were trained and evaluated on the same dataset for fair comparison.

- Accuracy was used as the main evaluation metric.
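
For reference, accuracy is simply the share of test samples whose predicted class matches the true class. The snippet below is a small illustrative sketch with made-up labels; the helper name `accuracy` is an assumption and is not taken from the original notebook, which relies on the Keras `accuracy` metric.

```python
def accuracy(true_labels, predicted_labels):
    """Fraction of predictions that match the true labels."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(1 for t, p in zip(true_labels, predicted_labels) if t == p)
    return correct / len(true_labels)


# Made-up labels in the 0-9 range used by this project.
y_true = [0, 3, 3, 7, 9]
y_pred = [0, 3, 1, 7, 2]
print(f"{accuracy(y_true, y_pred):.2%}")  # 3 of 5 correct
```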


# Background
*********************************************************************************************************************


### Reference
Light-Weighted CNN for Text Classification (Research Paper)
### Explanation
The study proposes a minimalistic CNN architecture aimed at reducing computational cost while achieving adequate multi-class classification performance. The architecture typically consists of embedding → convolution → pooling → dense layers.
### Dataset/Input
- Training Data: tobacco-data folder

- Testing Data: sfsf_test_pol_data_bbf folder

- Labels range from 0 to 9; each .txt file corresponds to one class.
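
A folder laid out this way could be read with a loader along the following lines. This is only a sketch under two assumptions that the report does not confirm: that each file is named after its class id (for example `3.txt`) and that every non-empty line in a file is one sample. The function name `load_labeled_texts` is hypothetical.

```python
from pathlib import Path


def load_labeled_texts(folder):
    """Read `<label>.txt` files from `folder`; each non-empty line is one sample."""
    texts, labels = [], []
    for path in sorted(Path(folder).glob("*.txt")):
        label = int(path.stem)  # assumes the file name is the class id, e.g. "3.txt"
        with path.open(encoding="utf-8", errors="ignore") as handle:
            for line in handle:
                line = line.strip()
                if line:
                    texts.append(line)
                    labels.append(label)
    return texts, labels
```

Under those assumptions, the training set would be loaded with `load_labeled_texts("tobacco-data")` and the test set with the same call on the testing folder.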
### Weakness
- CNN may not capture long-range dependencies.

- The original implementation used TensorFlow 1.x, which is no longer compatible with modern environments.

- Limited documentation on hyperparameters and dataset preprocessing.
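
Because the preprocessing is not documented, the sketch below shows one plausible way to turn raw text into fixed-length integer sequences of the kind an embedding layer expects. It is an assumption for illustration, not the original pipeline: the whitespace tokenizer, the id scheme (0 for padding, 1 for unknown words), the right-padding, and the names `build_vocab` and `encode_and_pad` are all hypothetical.

```python
from collections import Counter


def build_vocab(texts, max_words=10000):
    """Map the most frequent words to ids starting at 2 (0 = padding, 1 = unknown)."""
    counts = Counter(word for text in texts for word in text.lower().split())
    return {word: idx for idx, (word, _) in enumerate(counts.most_common(max_words), start=2)}


def encode_and_pad(text, vocab, max_len):
    """Convert text to ids, truncate to `max_len`, and right-pad with zeros."""
    ids = [vocab.get(word, 1) for word in text.lower().split()][:max_len]
    return ids + [0] * (max_len - len(ids))


# Tiny made-up corpus for illustration only.
texts = ["tobacco advert report", "report on tobacco sales"]
vocab = build_vocab(texts)
vocab_size = len(vocab) + 2  # plus the padding and unknown ids
max_len = 5
encoded = encode_and_pad("tobacco report today", vocab, max_len)
print(encoded)
```

The two values produced here, `vocab_size` and `max_len`, are the same quantities that the model definitions in this report take as inputs.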



*********************************************************************************************************************






# Implementation of Paper Code :
*********************************************************************************************************************

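    from tensorflow.keras.layers import Conv1D, Dense, Dropout, Embedding, GlobalMaxPooling1D
    from tensorflow.keras.models import Sequential

    # vocab_size and max_len are defined by the preprocessing step (not shown here).
    embedding_dim = 100

    # Baseline: embedding -> 1D convolution -> global max pooling -> dense classifier.
    cnn_model = Sequential([
        Embedding(input_dim=vocab_size,
                  output_dim=embedding_dim,
                  input_length=max_len),
        Conv1D(128, 5, activation="relu"),  # 128 filters over 5-token windows
        GlobalMaxPooling1D(),
        Dense(64, activation="relu"),
        Dropout(0.5),
        Dense(10, activation="softmax")  # 10 classes: 0–9
    ])

    cnn_model.compile(
        loss="sparse_categorical_crossentropy",  # integer labels, no one-hot encoding
        optimizer="adam",
        metrics=["accuracy"]
    )

    cnn_model.summary()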
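
To make the data flow through the baseline easier to follow, the sketch below traces the per-sample output shape after each layer using plain arithmetic. It assumes the Keras defaults for `Conv1D` (no padding, stride 1); the helper name `cnn_output_shapes` and the value `max_len=200` are hypothetical, since the report does not state the sequence length.

```python
def cnn_output_shapes(max_len, embedding_dim=100, filters=128, kernel_size=5,
                      dense_units=64, num_classes=10):
    """Per-sample output shape after each layer of the baseline CNN (batch axis omitted)."""
    conv_steps = max_len - kernel_size + 1  # "valid" padding, stride 1
    if conv_steps < 1:
        raise ValueError("max_len must be at least kernel_size")
    return [
        ("Embedding", (max_len, embedding_dim)),
        ("Conv1D", (conv_steps, filters)),
        ("GlobalMaxPooling1D", (filters,)),
        ("Dense", (dense_units,)),
        ("Dropout", (dense_units,)),
        ("Dense (softmax)", (num_classes,)),
    ]


# max_len = 200 is a hypothetical value; the notebook sets it during preprocessing.
shapes = cnn_output_shapes(max_len=200)
for layer, shape in shapes:
    print(f"{layer:<20} {shape}")
```

Whatever the sequence length, global max pooling reduces the convolution output to a fixed vector of 128 values, which keeps the classifier head small.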



*********************************************************************************************************************
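### Contribution Code :

    from tensorflow.keras.layers import Dense, Dropout, Embedding, LSTM
    from tensorflow.keras.models import Sequential

    # Reuses vocab_size, max_len, and embedding_dim from the baseline above.
    # The LSTM layer replaces the Conv1D + GlobalMaxPooling1D pair; the rest is unchanged.
    lstm_model = Sequential([
        Embedding(input_dim=vocab_size,
                  output_dim=embedding_dim,
                  input_length=max_len),
        LSTM(128, dropout=0.2, recurrent_dropout=0.2),
        Dense(64, activation="relu"),
        Dropout(0.5),
        Dense(10, activation="softmax")  # 10 classes: 0–9
    ])

    lstm_model.compile(
        loss="sparse_categorical_crossentropy",
        optimizer="adam",
        metrics=["accuracy"]
    )

    lstm_model.summary()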
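
Since the project is about lightweight models, it helps to compare how many trainable parameters the two feature extractors need. The sketch below applies the standard parameter-count formulas to the layer sizes used in this report; the helper names are hypothetical. The embedding layer is left out because it is identical in both models (`vocab_size` × 100 weights).

```python
def conv1d_params(kernel_size, in_channels, filters):
    return kernel_size * in_channels * filters + filters  # weights + biases


def lstm_params(input_dim, units):
    # Four gates, each with input weights, recurrent weights, and a bias vector.
    return 4 * (units * (input_dim + units) + units)


def dense_params(inputs, units):
    return inputs * units + units  # weights + biases


embedding_dim = 100
head = dense_params(128, 64) + dense_params(64, 10)  # classifier head shared by both models

cnn_core = conv1d_params(5, embedding_dim, 128)
lstm_core = lstm_params(embedding_dim, 128)

print("Conv1D layer:", cnn_core)
print("LSTM layer:  ", lstm_core)
print("Dense head:  ", head)
```

With these sizes the LSTM layer has 117,248 parameters against 64,128 for the convolution layer, and it must also process tokens one step at a time, which is consistent with its longer training time.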

### Results :
- Baseline CNN accuracy: 41.48%

- LSTM contribution accuracy: 42.99%

The LSTM model outperformed the baseline CNN by 1.51 percentage points, indicating that incorporating sequential modeling provides a small but measurable performance improvement on this dataset.
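
The size of the gain can be expressed in two ways, and the distinction is easy to blur. The short calculation below uses only the two accuracies reported above and separates the absolute difference (in percentage points) from the improvement relative to the baseline.

```python
cnn_accuracy = 41.48   # percent, baseline CNN
lstm_accuracy = 42.99  # percent, LSTM contribution

absolute_gain = lstm_accuracy - cnn_accuracy        # percentage points
relative_gain = absolute_gain / cnn_accuracy * 100  # percent of the baseline

print(f"Absolute gain: {absolute_gain:.2f} percentage points")
print(f"Relative gain: {relative_gain:.2f}%")
```

This gives an absolute gain of 1.51 percentage points, or about 3.64% relative to the CNN baseline.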
*******************************************************************************************************************************


#### Observations :
- CNN trains faster but may miss long-term dependencies.

- LSTM captures sequential information, leading to slightly better accuracy.

- Dataset appears noisy, contributing to moderate accuracy scores for both models.

- Increasing epochs or using hybrid CNN-LSTM architectures might further improve performance.
*******************************************************************************************************************************


### Conclusion and Future Direction :
The project successfully addressed the key objectives: replicating the lightweight CNN for text classification and implementing an LSTM-based contribution. The LSTM model achieved a higher test accuracy than the CNN baseline, validating the significance of the contribution.
*******************************************************************************************************************************
#### Learnings :
- CNNs are efficient but limited in modeling long-range sequential dependencies.

- LSTMs improve contextual understanding in text classification tasks.

- TensorFlow version compatibility is crucial when reproducing research code.
*******************************************************************************************************************************
#### Results Discussion :
The LSTM model’s improved accuracy suggests that sequential dependencies contribute to classification on this dataset, even when the texts are short fragments. The CNN baseline still provides competitive results at a lower computational cost.

*******************************************************************************************************************************
#### Limitations :
- Dataset labels may not be perfectly balanced.

- Training time increases significantly with LSTM.

- Only one dataset was used for evaluation.
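
The first limitation matters because accuracy is hard to interpret when classes are unevenly sized. A quick check like the one below, shown here with made-up labels, reports each class's share and the accuracy that a trivial "always predict the largest class" rule would reach. The helper names are hypothetical, and the real labels would come from the dataset folders.

```python
from collections import Counter


def class_distribution(labels):
    """Return {label: share of samples}, ordered by label."""
    counts = Counter(labels)
    total = len(labels)
    return {label: counts[label] / total for label in sorted(counts)}


def majority_baseline_accuracy(labels):
    """Accuracy of always predicting the most frequent class."""
    return max(Counter(labels).values()) / len(labels)


# Made-up labels for illustration; real labels come from the dataset folders.
labels = [0, 0, 0, 0, 1, 1, 2, 3, 3, 9]
print(class_distribution(labels))
print(majority_baseline_accuracy(labels))
```

Comparing the reported accuracies against this majority baseline on the real test labels would show how much the models learn beyond class frequency.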


*******************************************************************************************************************************
#### Future Extension :
- Explore hybrid CNN-LSTM models.

- Test transformer-based models such as BERT.

- Apply hyperparameter tuning and regularization techniques.

- Evaluate model generalization using additional datasets.
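
The first item above, a hybrid CNN-LSTM, could look roughly like the sketch below. It is an untested design idea rather than a result of this project: the layer sizes, the pooling width, the function name `build_cnn_lstm`, and the example `vocab_size` and `max_len` values are all assumptions. The convolution extracts local n-gram features, local max pooling shortens the sequence, and the LSTM then reads the remaining steps in order.

```python
from tensorflow.keras import Input
from tensorflow.keras.layers import Conv1D, Dense, Dropout, Embedding, LSTM, MaxPooling1D
from tensorflow.keras.models import Sequential


def build_cnn_lstm(vocab_size, max_len, embedding_dim=100, num_classes=10):
    """Convolution extracts local n-gram features; the LSTM then reads them in order."""
    model = Sequential([
        Input(shape=(max_len,)),
        Embedding(input_dim=vocab_size, output_dim=embedding_dim),
        Conv1D(128, 5, activation="relu"),
        MaxPooling1D(pool_size=4),  # shortens the sequence the LSTM has to process
        LSTM(64),
        Dropout(0.5),
        Dense(num_classes, activation="softmax"),
    ])
    model.compile(
        loss="sparse_categorical_crossentropy",
        optimizer="adam",
        metrics=["accuracy"],
    )
    return model


# Hypothetical sizes; the real values come from the preprocessing step.
hybrid_model = build_cnn_lstm(vocab_size=10000, max_len=200)
hybrid_model.summary()
```

Unlike the baseline, this variant uses local pooling instead of global pooling, so the order of the detected features is preserved for the recurrent layer.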

# References:

[1]:  Light-Weighted CNN for Text Classification — Research Paper

[2]:  Source Code Repository: https://github.com/RituYadav92/Lightweighted-CNN-for-Document-Classification

[3]:  TensorFlow Documentation (Keras API)