Hi, I can run the 1.3B model using the benchmark code here, but 2.7B is still not working (bad results) with the following params:
parser = argparse.ArgumentParser(description='H3 generation benchmarking')
parser.add_argument('--dmodel', type=int, default=2560)  # 2048
parser.add_argument('--nlayer', type=int, default=32)  # 24
parser.add_argument('--attn-layer-idx', type=list, default=[8, 16, 24])  # [8, 16]
parser.add_argument('--nheads', type=int, default=20)  # 16
parser.add_argument('--ckpt', type=str, default='/fsx/BlinkDL/CODE/_PUBLIC_/H3/H3-2.7B/model-3attn.pt')
parser.add_argument('--promptlen', type=int, default=1024)
parser.add_argument('--genlen', type=int, default=128)
args = parser.parse_args()
We're looking into this, stay tuned!
Thanks for the bug report, we've just fixed this. There was a mistake in the mapping between old and new parameter names, which has now been corrected.
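For readers hitting similar issues: a checkpoint-name mismatch like this typically means state-dict keys saved under old parameter names no longer match the module names in the current model. Below is a minimal, hypothetical sketch of such a remapping; the prefixes (`backbone.` → `model.`) are made up for illustration and are not the actual H3 key names.

```python
def remap_state_dict(state_dict, renames):
    """Return a new state dict with old parameter-name prefixes replaced.

    `renames` maps old key prefixes to new ones; keys matching no prefix
    are kept unchanged. Illustrative sketch only -- the real mapping
    lives in the H3 repository's loading code.
    """
    out = {}
    for key, value in state_dict.items():
        new_key = key
        for old, new in renames.items():
            if key.startswith(old):
                new_key = new + key[len(old):]
                break
        out[new_key] = value
    return out

# Example with made-up prefixes:
old_sd = {"backbone.layers.0.mixer.weight": 1, "lm_head.weight": 2}
new_sd = remap_state_dict(old_sd, {"backbone.": "model."})
```

A bug in a table like `renames` (e.g. a missing or wrong prefix) silently leaves some weights unloaded, which produces exactly the "runs but gives bad results" symptom reported above.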
Great. How about the configurations for 125M and 355M?
Here are examples of how to load all the models, with example outputs: https://github.com/HazyResearch/H3/blob/main/examples/README.md