Simple Model Distillation: Making GPT-2 Smarter
This project makes a smaller language model (GPT-2 124M) smarter by teaching it with a bigger model (GPT-Neo-2.7B). We use the HellaSwag benchmark score to check whether it worked.
We use a technique called "knowledge distillation." Think of it like a teacher (the big model) helping a student (the small model) learn.
- Teacher: GPT-Neo-2.7B (big and smart)
- Student: GPT-2 124M (smaller, we want to improve it)
- The teacher has roughly 20× more parameters than the student (2.7B vs. 124M).
HellaSwag is a test that checks if a model understands common sense. It gives the model a sentence and some choices for how to finish it. Only one choice makes sense.
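As a rough picture of how this kind of multiple-choice test is scored, the sketch below picks the ending to which the model assigns the highest average log-probability. It is a minimal sketch assuming the Hugging Face `transformers` library; the example context, endings, and the `ending_logprob` helper are illustrative, and this is not the official HellaSwag evaluation harness.

```python
# Minimal sketch of HellaSwag-style multiple-choice scoring (illustrative, not
# the official harness): score each candidate ending by the average
# log-probability the model assigns to its tokens, then pick the best one.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def ending_logprob(context: str, ending: str) -> float:
    """Average log-probability of the ending tokens, given the context."""
    ctx_ids = tokenizer.encode(context)
    end_ids = tokenizer.encode(ending)
    input_ids = torch.tensor([ctx_ids + end_ids])
    with torch.no_grad():
        logits = model(input_ids).logits            # (1, seq_len, vocab)
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
    targets = input_ids[0, 1:]                      # each position predicts the next token
    token_lp = log_probs[torch.arange(targets.size(0)), targets]
    return token_lp[len(ctx_ids) - 1:].mean().item()  # keep only the ending tokens

# Toy example (not from the real dataset): the sensible ending should win.
context = "She put the kettle on the stove and"
endings = [" waited for the water to boil.", " painted the kettle purple."]
scores = [ending_logprob(context, e) for e in endings]
print("Chosen ending:", endings[scores.index(max(scores))].strip())
```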
- Our Goal: Beat the original GPT-2's score on HellaSwag, which was 0.2955. Higher score = better.
- Teacher's Answers: The big model (GPT-Neo-2.7B) is used to generate "soft targets" on the WikiText-2 dataset. This means we get its predictions, including how sure it is about each word (not just the best one).
- Student Learns: The small model (GPT-2) is trained on WikiText-2 in two ways:
- It tries to predict the correct next word in the text (like normal language model training).
- It also tries to match how sure the teacher was about each word. This is the "distillation" part. It learns from the teacher's "soft targets."
- Softening the Answers: The raw scores (called "logits") from both models are divided by a temperature value before the softmax. The higher the temperature, the "softer" (less confident) the resulting probabilities become, which gives the student a richer signal to learn from (see the sketch after this list).
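Putting the two training signals together, the block below is a minimal sketch of the combined loss, assuming PyTorch and the Hugging Face `transformers` library. The temperature `T`, the mixing weight `alpha`, and the `distillation_step` helper are illustrative choices, not this project's actual training code.

```python
# Minimal sketch of the distillation loss: ordinary next-word cross-entropy
# plus a KL term that pushes the student toward the teacher's softened
# probabilities. Hyperparameters here are illustrative.
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
student = AutoModelForCausalLM.from_pretrained("gpt2")
teacher = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-neo-2.7B").eval()

T = 2.0      # temperature used to soften both models' logits
alpha = 0.5  # weight between the hard-label loss and the distillation loss

def distillation_step(batch_ids: torch.Tensor) -> torch.Tensor:
    """One training loss: cross-entropy on the real next tokens + KL to the teacher."""
    labels = batch_ids[:, 1:]                        # next-token targets
    vocab = student.config.vocab_size                # GPT-Neo shares GPT-2's BPE vocabulary
    student_logits = student(batch_ids).logits[:, :-1].reshape(-1, vocab)
    with torch.no_grad():                            # the teacher is frozen
        teacher_logits = teacher(batch_ids).logits[:, :-1].reshape(-1, vocab)

    # Hard loss: predict the correct next word in the text.
    hard_loss = F.cross_entropy(student_logits, labels.reshape(-1))

    # Soft loss: match the teacher's temperature-softened distribution.
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)  # standard scaling so the soft-loss gradients stay comparable across T

    return alpha * hard_loss + (1 - alpha) * soft_loss

# Example usage with a toy batch (in practice, batches come from WikiText-2):
batch = tokenizer("The quick brown fox jumps over the lazy dog.", return_tensors="pt").input_ids
loss = distillation_step(batch)
loss.backward()
```

Matching the teacher's logits token for token only works because GPT-2 and GPT-Neo use the same tokenizer and vocabulary; with mismatched vocabularies the soft targets would first have to be re-aligned.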
We'll update the table below after we train the student model.
| Model | HellaSwag Score |
|---|---|
| Original GPT-2 | 0.2955 |
| Smarter GPT-2 (Ours) | [To be added] |