Classification of IMDB data using model hidden states and their sparse encodings

I'm interested in finding out whether sparse autoencoder representations of LLM hidden activations are better for classification than the hidden activations themselves. The idea was inspired by this paper and my previous work.
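To make the term "sparse encoding" concrete, here is a minimal sketch of one common sparse autoencoder encoder: a linear map followed by a ReLU, so that most features are exactly zero. This is an illustrative assumption, not necessarily the formulation used by the pretrained autoencoders in this project; the function name `sparse_encode` and the toy weights are hypothetical.

```python
def sparse_encode(x, weights, bias):
    """Encode a hidden-state vector as ReLU(W x + b).

    Assumption: a plain linear-plus-ReLU encoder, a common choice for
    sparse autoencoders. The pretrained autoencoders may differ.
    """
    features = []
    for row, b in zip(weights, bias):
        pre_activation = sum(w * xi for w, xi in zip(row, x)) + b
        # ReLU sets negative pre-activations to zero, which is what
        # makes the resulting code sparse.
        features.append(max(0.0, pre_activation))
    return features


# Toy example: a 2-dimensional "hidden state" mapped to 3 features.
hidden_state = [2.0, 1.0]
toy_weights = [[1.0, -1.0], [-1.0, 1.0], [0.5, 0.5]]
toy_bias = [0.0, 0.0, -1.0]

features = sparse_encode(hidden_state, toy_weights, toy_bias)
print(features)  # [1.0, 0.0, 0.5]
```

In practice the encoded vector is much wider than the hidden state, and a classifier can be trained on either representation to compare them.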

Install

You will have to clone the sparse coding repository to access the relevant class definitions. You also need to add the cloned folder to your Python path so that the classes can be imported.

git clone https://github.com/loganriggs/sparse_coding.git
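One way to add the cloned folder to the path is from inside Python, before importing anything from it. The sketch below assumes the repository was cloned into a folder named `sparse_coding` next to the notebook; the helper name `add_repo_to_path` is hypothetical, so adjust the location to wherever you cloned it.

```python
import sys
from pathlib import Path


def add_repo_to_path(repo_dir="sparse_coding"):
    """Put the cloned repository on sys.path so its modules can be imported.

    Assumption: repo_dir is where the repository was cloned.
    """
    repo_path = str(Path(repo_dir).resolve())
    # Avoid adding the same entry twice when the cell is re-run.
    if repo_path not in sys.path:
        sys.path.insert(0, repo_path)
    return repo_path


repo_path = add_repo_to_path("sparse_coding")
```

Setting the `PYTHONPATH` environment variable before starting Jupyter achieves the same thing without changing any code.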

Then install the requirements:

pip install -r requirements.txt

I tested with Python 3.11.5.

Usage

You can run the notebook with either pythia-70m-deduped or pythia-410m-deduped, the two models for which I had access to pretrained sparse autoencoders.
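As a small sketch of how the model choice could be guarded, the helper below accepts only the two supported names and returns the matching Hugging Face Hub identifier under the `EleutherAI` organization. The names `SUPPORTED_MODELS` and `resolve_model_name` are hypothetical and may not match how the notebook selects the model.

```python
# The two models named above as having pretrained sparse autoencoders.
SUPPORTED_MODELS = ("pythia-70m-deduped", "pythia-410m-deduped")


def resolve_model_name(name):
    """Return the Hugging Face Hub id for a supported model name.

    Assumption: the notebook loads the Pythia checkpoints published
    under the EleutherAI organization on the Hub.
    """
    if name not in SUPPORTED_MODELS:
        raise ValueError(
            f"Unsupported model {name!r}; choose one of {SUPPORTED_MODELS}"
        )
    return f"EleutherAI/{name}"


model_id = resolve_model_name("pythia-70m-deduped")
print(model_id)  # EleutherAI/pythia-70m-deduped
```

Failing early on an unsupported name avoids loading a model for which no matching autoencoder is available.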

Results

The sparse representations perform significantly worse on the classification task than the raw hidden states.

Repository: annahdo/classifying_sparse_representations
