Train baseline models for evaluation #42

Open · huu4ontocord opened this issue May 4, 2023 · 10 comments

@huu4ontocord (Owner) commented May 4, 2023

We need to evaluate the merged experts against a 1B Pythia model trained all together, in two configurations:

  1. Trained with all layers on the 6 datasets we have.
  2. Trained with just the upper layers.

To keep it fair, we would need to use the exact same 8,000 random training examples from each of the 7 datasets we used in the other experiments. We would then merge the 6 experts with basic averaging and run the same eval from the 7 datasets on that model.
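
For concreteness, here is a minimal sketch of drawing a fixed 8,000-example sample per dataset with Hugging Face `datasets`; the dataset name and seed are placeholders, not the project's actual settings:

```python
# Sketch only: deterministic 8k-example sample per dataset.
# "dataset_name" and SEED are placeholders, not the values used in MDEL.
from datasets import load_dataset

SEED = 42
N_EXAMPLES = 8_000

def sample_train_split(dataset_name: str):
    ds = load_dataset(dataset_name, split="train")
    # Shuffling with a fixed seed and taking the first 8k rows gives every
    # experiment the exact same subset.
    return ds.shuffle(seed=SEED).select(range(N_EXAMPLES))
```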

This will give us a comparison of:

  1. training all layers on the same tokens and data
  2. training some layers on the same tokens and data
  3. merging the different experts trained with the same compute
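
A minimal sketch of the "basic averaging" merge, assuming the six expert checkpoints share the same architecture; the checkpoint paths are placeholders:

```python
# Sketch only: element-wise average of expert parameters ("basic averaging").
# Checkpoint paths are placeholders for the six domain experts.
import torch
from transformers import AutoModelForCausalLM

EXPERTS = ["experts/expert-0", "experts/expert-1"]  # placeholder paths, six in total

def merge_by_averaging(checkpoints):
    models = [AutoModelForCausalLM.from_pretrained(c) for c in checkpoints]
    merged = models[0]
    state = merged.state_dict()
    for name in state:
        # Stack the same parameter from every expert and take the mean.
        state[name] = torch.stack(
            [m.state_dict()[name].float() for m in models]
        ).mean(dim=0)
    merged.load_state_dict(state)
    return merged
```
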
@mrseeker (Collaborator) commented May 4, 2023

Have you tried the EleutherAI eval harness? It should give you a good picture of how well the model performs and could be used as an indicator.
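
For reference, a rough sketch of driving the harness from Python; the model path is a placeholder, and the exact API and task names differ between harness versions, so treat this as illustrative rather than the recommended invocation:

```python
# Sketch only: running a couple of harness tasks against a checkpoint.
# The pretrained path is a placeholder; the API may differ across versions.
from lm_eval import evaluator

results = evaluator.simple_evaluate(
    model="hf-causal",
    model_args="pretrained=EleutherAI/pythia-1b",
    tasks=["lambada_openai", "hellaswag"],
    batch_size=8,
)
print(results["results"])
```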

@mrcabbage972 mrcabbage972 added enhancement New feature or request and removed enhancement New feature or request labels May 5, 2023
@mrcabbage972 mrcabbage972 changed the title train baseline model on 6 instruction set Full fine-tune of baseline model for evaluation May 5, 2023
@mrcabbage972 mrcabbage972 changed the title Full fine-tune of baseline model for evaluation Train baseline models for evaluation May 5, 2023

@mrcabbage972 (Collaborator) commented

I didn't understand the part about the 1000 training examples. Our datasets are much bigger than that!

@huu4ontocord (Owner, Author) commented

Didn't we just train our models on 1,000 examples only? Or did I misunderstand that?

@huu4ontocord (Owner, Author) commented

We definitely should try the EleutherAI eval harness, but just comparing validation loss will tell us something too: regular fine-tuning vs. expert fine-tuning + merge.
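
A rough sketch of that validation-loss comparison, assuming both checkpoints are Hugging Face causal LMs evaluated on the same held-out texts; the checkpoint name and device are placeholders:

```python
# Sketch only: mean validation loss / perplexity for one checkpoint, so the
# full fine-tune and the merged model can be compared on identical held-out text.
import math
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def validation_loss(checkpoint: str, texts, device: str = "cuda"):
    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForCausalLM.from_pretrained(checkpoint).to(device).eval()
    losses = []
    with torch.no_grad():
        for text in texts:
            batch = tokenizer(text, return_tensors="pt", truncation=True).to(device)
            out = model(**batch, labels=batch["input_ids"])
            losses.append(out.loss.item())
    mean_loss = sum(losses) / len(losses)
    return mean_loss, math.exp(mean_loss)  # loss, perplexity
```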

@mrcabbage972 (Collaborator) commented

We have an issue for Eval Harness in the backlog.

@huu4ontocord (Owner, Author) commented

So I am told that:
It seems they were trained on 1k batches.
I think the batch size was 8 because of the number of GPUs.
So that gives us 1,000 batches × 8 = 8,000 samples.

So the above 1,000 examples should be 8K examples.

@jordiclive (Collaborator) commented

@ontocord For 2., do we want layers 9, 10, 11, 12, 13?

@mrcabbage972 (Collaborator) commented

@jordiclive @ontocord
We had used layers 9-13 when we trained the experts. See: https://github.com/ontocord/MDEL/blob/main/src/mdel/train.sh#L4
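
For item 2., a minimal sketch of restricting training to those layers in a Pythia (GPT-NeoX) model; the checkpoint name is a placeholder, and the indexing assumes positions in `model.gpt_neox.layers` match the layer numbers in train.sh:

```python
# Sketch only: freeze all parameters except transformer layers 9-13 so a
# standard fine-tuning loop only updates the upper layers.
from transformers import GPTNeoXForCausalLM

TRAINABLE_LAYERS = {9, 10, 11, 12, 13}

model = GPTNeoXForCausalLM.from_pretrained("EleutherAI/pythia-1b")

for param in model.parameters():
    param.requires_grad = False            # freeze everything by default

for idx, layer in enumerate(model.gpt_neox.layers):
    if idx in TRAINABLE_LAYERS:
        for param in layer.parameters():
            param.requires_grad = True     # unfreeze only layers 9-13
```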

@mrcabbage972 (Collaborator) commented

@jordiclive Any updates on this issue?

@jordiclive (Collaborator) commented

@mrcabbage972 I trained 1., a model (all layers), on the exact splits: https://wandb.ai/ontocord/jordi_testing/runs/hu8j9ta1?workspace=user-jordanclive (you can see the results if you toggle the evaluation).

But I then thought we had decided to automate the experiment again with more training data / less validation data, and maybe the same amount of final testing data (#47).
