Collaborator

@lsy323 lsy323 commented May 8, 2024

  • Update convert_checkpoint.py to convert Gemma weights from HuggingFace format to either safetensors format or a PyTorch state_dict.
  • Enable loading converted Gemma weights from either safetensors format or a state_dict.
  • Update README.md with instructions for running Gemma.
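
For readers unfamiliar with supporting both formats, loading could be sketched roughly as below. This is an illustration only, not the code in this PR: the function names `detect_checkpoint_format` and `load_weights` are hypothetical, and choosing the format from the file extension is an assumption. The only library calls used are `safetensors.torch.load_file` and `torch.load`.

```python
from pathlib import Path


def detect_checkpoint_format(path: str) -> str:
    """Guess the checkpoint format from the file extension (an assumed convention)."""
    suffix = Path(path).suffix.lower()
    if suffix == ".safetensors":
        return "safetensors"
    if suffix in {".pt", ".pth", ".bin"}:
        return "state_dict"
    raise ValueError(f"Unrecognized checkpoint extension: {suffix!r}")


def load_weights(path: str) -> dict:
    """Return a name -> tensor mapping for either supported format."""
    fmt = detect_checkpoint_format(path)
    if fmt == "safetensors":
        # Imported lazily so format detection works without safetensors installed.
        from safetensors.torch import load_file
        return load_file(path)
    # Imported lazily for the same reason; map_location="cpu" keeps tensors on the host.
    import torch
    return torch.load(path, map_location="cpu")
```

Keeping format detection separate from the actual load means the dispatch logic can be checked without either library installed.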

Tested by running convert_checkpoint.py and run_interactive.py.

@lsy323 lsy323 requested review from FanhaiLu1, qihqi and wang2yn84 May 8, 2024 18:17
@lsy323 lsy323 marked this pull request as draft May 8, 2024 18:26
@lsy323 lsy323 marked this pull request as ready for review May 8, 2024 22:54
Collaborator

qihqi commented May 8, 2024

Can you also add the instructions in your PR to the README.md? thanks!

Collaborator

@FanhaiLu1 FanhaiLu1 left a comment

Can you share the real output result with the gemma weights?

Collaborator Author

lsy323 commented May 9, 2024

Can you share the real output result with the gemma weights?

There is an issue with the response: the model generates English, but it repeats itself. @qihqi will update the attention module for Gemma in a follow-up PR, which should fix the accuracy issue.

Prompt

"I believe the meaning of life is"

The response is:

 to experience.

Here is my reasoning:

* **Life is a journey.** We are all on a journey through time. The experiences we gain through life's journey is to experience the world and all that we experience, both positive and negative experiences, both good and bad experiences, both positive and negative experiences, both joy and suffering. We all go through life to experience and learn from both good and bad and positive and negative and positive and negative experiences, both joy and suffering and pain and joy, both good and bad and good and bad and negative and positive and negative and positive and negative and good and bad and the good and bad and negative and positive and negative and positive and negative and the good and bad and negative and positive and negative experiences, both good and bad and the good and bad and negative and the good and bad. The is the good and bad and the good and bad and the the good and bad and the the are the experiences. Through the joy and negative. The the the journey is the good. The through the the.
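
The repetition in a response like the one above can be given a rough number by counting repeated word n-grams. The sketch below is an illustrative metric only, not something used in this PR; the function name `repeated_ngram_fraction` and the default of `n = 3` are assumptions.

```python
from collections import Counter


def repeated_ngram_fraction(text: str, n: int = 3) -> float:
    """Fraction of word n-grams that duplicate an earlier n-gram in the text."""
    words = text.lower().split()
    ngrams = [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
    if not ngrams:
        return 0.0
    counts = Counter(ngrams)
    # The first occurrence of each n-gram is new; every later occurrence is a repeat.
    repeats = sum(c - 1 for c in counts.values())
    return repeats / len(ngrams)
```

A value near 0 means almost every n-gram is new, while looping text such as "good and bad and good and bad" pushes the value up, so comparing the score before and after the attention fix would be one way to check for improvement.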

@FanhaiLu1
Collaborator

Great! The result is reasonable.

Collaborator Author

lsy323 commented May 9, 2024

Can you also add the instructions in your PR to the README.md? thanks!

Added instructions for running Gemma to README.md.

@lsy323 lsy323 requested review from FanhaiLu1 and qihqi May 9, 2024 02:44
Collaborator Author

lsy323 commented May 9, 2024

Updated convert_checkpoint.py to generate weights in safetensors format by default. Loading weights from a state_dict is still supported in the weight-loading logic.

@FanhaiLu1
Collaborator

Updated convert_checkpoint.py to generate weights in safetensors format by default. Loading weights from a state_dict is still supported in the weight-loading logic.

Great! All looks good to me now.

@qihqi qihqi merged commit 811d718 into AI-Hypercomputer:main May 9, 2024
@lsy323 lsy323 deleted the lsiyuan/convert-gemma-hf-weight branch May 9, 2024 16:23