[Question] are there some train replication results? #469

Closed
YannDubs opened this issue Mar 2, 2024 · 5 comments

YannDubs commented Mar 2, 2024

Hi,

Thanks for the library! I'm new to the JAX+LLM ecosystem and trying to understand which library I should be using.

I see a lot of (very impressive) computational efficiency benchmarks for maxtext, but I can't find any benchmarks of model quality. Do you have perplexity numbers or eval results for a model trained with maxtext in a standard setting? E.g., nanoGPT on wikitext (evaluated with perplexity or MMLU), or Llama finetuning on Vicuna or Alpaca data? That would be very useful for deciding which JAX library to use for training LLMs!

Thank you for your help!

rwitten (Collaborator) commented Mar 8, 2024

@YannDubs Sorry for the late response here; I was traveling.

MaxText has historically focused on the largest customers, who were training custom models of their own design. So we've concentrated on correctness and on performance/scalability, assuming pretraining customers would bring their own secret sauce regarding convergence, etc. To demonstrate correctness, we verify that we can directly reproduce the Chinchilla results:
https://github.com/google/maxtext/blob/main/end_to_end/test_convergence_1b_params.sh
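
If it helps, here is roughly what running that check looks like. The setup step and the OUTPUT_PATH/DATASET_PATH arguments below are assumptions on my part, so check the script header and the repo README for the exact invocation:

```bash
# Rough sketch of running the convergence check; the argument names are
# assumptions -- the script header documents the exact parameters it expects.
git clone https://github.com/google/maxtext.git
cd maxtext
bash setup.sh  # installs dependencies
bash end_to_end/test_convergence_1b_params.sh \
  OUTPUT_PATH=gs://your-bucket/convergence-run \
  DATASET_PATH=gs://your-bucket/c4-data
```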

We've gotten a lot of interest in off-the-shelf models that appeal to different folks, so we've been adding support for more models (currently Gemma, Llama, and Mistral).

We also have high-performance inference coming soon.

But I think you're asking for something more. Happy to talk live as well (rwitten at google.com).

YannDubs (Author) commented Mar 8, 2024

Thanks @rwitten, the Chinchilla reproduction is the kind of thing I was asking about, but I was hoping to see the actual training curves and final evaluation results, to (1) be able to compare them against the original results, and (2) have a reference curve to compare with when modifying the configs/model.

Thanks!

rwitten (Collaborator) commented Mar 11, 2024

@gobbleturk can you provide these?

gobbleturk (Collaborator) commented Mar 11, 2024

I've uploaded loss-curve data from a test_convergence_1b_params.sh run here

Here is a screenshot of some learning metrics from that run, as displayed in TensorBoard:
[screenshot: TensorBoard learning-metric curves from the convergence run]
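
To browse those metrics interactively rather than from a screenshot, you can point TensorBoard at the run's output directory; the path below is only a placeholder for wherever your run writes its logs:

```bash
# Placeholder log directory -- substitute your run's base output path.
tensorboard --logdir=gs://your-bucket/convergence-run/tensorboard
```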

YannDubs (Author) commented

Great, thanks!
