TYMBert is our submission for NCIM 2025, a spam classifier that makes use of knowledge distillation to compress the model while preserving accuracy

mirzaazwad/TYMBert

TYMBert

Introduction

TYMBert is our submission for NCIM 2025: a spam classifier that uses knowledge distillation to compress the model while preserving accuracy.
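
For readers unfamiliar with knowledge distillation, the following is a minimal sketch of a commonly used distillation loss. It is not taken from this repository's code: the function names, the temperature of 2.0, and the weighting factor `alpha` are illustrative assumptions. A small student model is trained to match both the true label and the softened output distribution of a larger teacher model.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    """Weighted sum of hard-label cross-entropy and soft-target KL divergence.

    temperature and alpha are hypothetical defaults, not values from TYMBert.
    """
    # Hard-label term: cross-entropy of the student against the true class.
    hard = -math.log(softmax(student_logits)[label])

    # Soft-target term: KL(teacher || student), both softened by the temperature.
    teacher_probs = softmax(teacher_logits, temperature)
    student_probs = softmax(student_logits, temperature)
    soft = sum(p * math.log(p / q)
               for p, q in zip(teacher_probs, student_probs) if p > 0)

    # The T^2 factor keeps the soft term's gradient scale comparable across temperatures.
    return alpha * hard + (1 - alpha) * temperature ** 2 * soft
```

When the student's logits equal the teacher's, the soft term is zero and only the hard-label term remains.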

This repository provides a Conda environment configuration file (environment.yml) for setting up the tymbert environment. Follow the steps below to install and configure it correctly on your system.

Prerequisites

Installation Steps

  1. Clone the Repository

    git clone https://github.com/mirzaazwad/TYMBert.git
    cd TYMBert
  2. Create the Conda Environment

    Run the following command to create the tymbert environment from the environment.yml file:

    conda env create -f environment.yml
  3. Update the Environment Prefix

    The environment.yml file may contain an absolute path under the prefix field, which may not match your system's Conda installation directory. To fix this:

    • Open the environment.yml file in a text editor
    • Locate the prefix: field at the bottom of the file (if present)
    • Change it to your own Conda environment path, which can be found using:
      conda info --envs
    • Alternatively, create the environment without using the prefix by running:
      conda env create --name tymbert --file environment.yml
  4. Activate the Environment

    conda activate tymbert
  5. Verify Installation

    Check that the necessary dependencies are installed:

    conda list
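
As an alternative to editing the prefix field by hand in step 3, the line can be dropped with a short script. This is a sketch under the assumption that `prefix:` appears as an unindented top-level key in environment.yml; the helper name `strip_prefix` is hypothetical and not part of this repository. Note that the script overwrites environment.yml in place.

```python
from pathlib import Path


def strip_prefix(yaml_text):
    """Return the environment file text without any top-level `prefix:` line."""
    kept = [line for line in yaml_text.splitlines()
            if not line.startswith("prefix:")]
    return "\n".join(kept) + "\n"


if __name__ == "__main__":
    # Assumes the script is run from the repository root.
    path = Path("environment.yml")
    if path.exists():
        path.write_text(strip_prefix(path.read_text()))
```

After running it, the environment can be created by name with the `conda env create --name tymbert --file environment.yml` command shown in step 3.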

Updating the Environment

If you make changes to environment.yml and need to update the existing environment:

conda env update --name tymbert --file environment.yml --prune

Deactivating and Removing the Environment

To deactivate the environment:

conda deactivate

To remove the environment completely:

conda env remove --name tymbert

Jupyter Use

Once the environment is set up, select it as the kernel in Jupyter Notebook or in VSCode with the Jupyter extension.
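
To confirm that a notebook is actually running on the tymbert environment, a cell like the one below can help. It is a sketch that assumes the environment was created by name, so that the interpreter's prefix directory is named `tymbert`; the helper names are hypothetical.

```python
import sys
from pathlib import Path


def env_name_from_prefix(prefix):
    """Return the last path component of an interpreter prefix."""
    return Path(prefix).name


def running_in_env(expected="tymbert", prefix=None):
    """Check whether the given prefix (default: this interpreter's) ends in `expected`."""
    return env_name_from_prefix(prefix or sys.prefix) == expected


# Show which interpreter the kernel is using and whether it matches.
print("Interpreter prefix:", sys.prefix)
print("Running in tymbert:", running_in_env())
```

If the second line prints `False`, the notebook is probably attached to a different kernel.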

Datasets

| Dataset Name | Description | Link |
|---|---|---|
| SPStudy | A dataset for spam research, containing various studies and data points. | GitHub - SPStudy |
| SMS Spam Collection Dataset | A dataset containing SMS messages labeled as spam or ham. | Kaggle - SMS Spam Collection |
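
As an illustration of turning spam/ham data into training examples, the sketch below parses lines of the form `label<TAB>message`. The tab-separated layout and the helper name `parse_labeled_lines` are assumptions; the actual downloads may use a different format (for example CSV), in which case the split logic needs adjusting.

```python
def parse_labeled_lines(lines):
    """Parse 'label<TAB>message' lines into (text, is_spam) pairs.

    Assumes each label is either 'ham' or 'spam'; is_spam is 1 for spam, 0 for ham.
    """
    examples = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        label, _, text = line.partition("\t")
        if label not in ("ham", "spam"):
            raise ValueError(f"unexpected label: {label!r}")
        examples.append((text, 1 if label == "spam" else 0))
    return examples


# Made-up sample lines, not taken from either dataset.
sample = ["ham\tSee you at lunch?\n", "spam\tYou won a prize, reply now\n"]
examples = parse_labeled_lines(sample)
```

The same function can be applied to an open file object, since iterating over a text file yields its lines.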

Additional References

The quantization logic and code were developed with the help of the BERT-Quantization repository by srimoyee1212 on GitHub.

License

This project is licensed under the MIT License. See the LICENSE file for details.
