Create Google--GcpGPT #107

NeelamMandavia · 2023-12-18T13:22:25Z

Creating a Google GcpGPT based on the model google/flan-t5-base on huggingface.co. If you already know T5, FLAN-T5 is just better at everything. For the same number of parameters, these models have been fine-tuned on more than 1000 additional tasks, also covering more languages. As mentioned in the first few lines of the abstract:

Flan-PaLM 540B achieves state-of-the-art performance on several benchmarks, such as 75.2% on five-shot MMLU. We also publicly release Flan-T5 checkpoints, which achieve strong few-shot performance even compared to much larger models, such as PaLM 62B. Overall, instruction finetuning is a general method for improving the performance and usability of pretrained language models.

Disclaimer: Content from this model card has been written by the Hugging Face team, and parts of it were copy pasted from the T5 model card.
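For reference, here is a minimal sketch of how the google/flan-t5-base checkpoint named above could be loaded and prompted with the Hugging Face transformers library. It is not the configuration added by this PR. The helper names `build_prompt` and `generate`, the prompt layout, and the `max_new_tokens` default are illustrative assumptions, and the sketch assumes `transformers` and PyTorch are installed.

```python
MODEL_ID = "google/flan-t5-base"  # checkpoint named in this PR description


def build_prompt(instruction: str, text: str) -> str:
    """Join an instruction and its input into one prompt (hypothetical helper)."""
    return f"{instruction.strip()}: {text.strip()}"


def generate(prompt: str, max_new_tokens: int = 32) -> str:
    """Run one prompt through FLAN-T5 and return the decoded text.

    Assumption: transformers and PyTorch are installed; the checkpoint is
    downloaded from the Hugging Face Hub on first use.
    """
    # Imported lazily so build_prompt works without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Usage (downloads the checkpoint on the first call):
#   generate(build_prompt("Translate English to German", "How old are you?"))
```

Because FLAN-T5 is instruction-tuned, the task is stated in plain language inside the prompt itself, so no task-specific head or prefix table is needed.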

Signed-off-by: NeelamMandavia <153899354+NeelamMandavia@users.noreply.github.com>

Google

NeelamMandavia · 2023-12-18T13:30:44Z

google GPT

Signed-off-by: Antoni Baum <antoni.baum@protonmail.com>

NeelamMandavia changed the base branch from master to ray_llm_0_3 December 18, 2023 13:25

NeelamMandavia changed the base branch from ray_llm_0_3 to master December 18, 2023 13:27

NeelamMandavia commented Dec 18, 2023

NeelamMandavia closed this Dec 18, 2023

NeelamMandavia reopened this Dec 18, 2023

alanwguo pushed a commit that referenced this pull request Jan 25, 2024

Increase input length, reduce batch size (#107)

5f1fa11

Signed-off-by: Antoni Baum <antoni.baum@protonmail.com>
