
RoPE scaling config confusing #40

Open
tattrongvu opened this issue Nov 15, 2023 · 0 comments

Comments

@tattrongvu

Hi YaRN team,
thank you for the awesome work. I'm currently trying to evaluate several RoPE scaling methods, and fortunately they are all available in this repo. I have some questions about the RoPE scaling config.
I see that requirements.txt already includes transformers >= 4.34.0, which means I can use "linear" and dynamic NTK scaling out of the box with transformers, just by adding the rope scaling to the config loaded with AutoConfig.from_pretrained(), like this:
config.rope_scaling = { "type": "linear", "factor": args.linear }
or
config.rope_scaling = { "type": "dynamic", "factor": args.dynamic_ntk }
I tried that, removed the patch for linear & dynamic NTK, and the results look identical to the ones produced with your implemented patch.
Moreover, it also supports the Falcon architecture (https://github.com/huggingface/transformers/blob/main/src/transformers/models/falcon/modeling_falcon.py#L162).
So my question is: are there any differences between these two implementations, or is your linear & dynamic NTK patch there to keep the reproduction of the evals consistent?
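For reference, here is roughly what I'm doing with the built-in transformers scaling (the model name and factor are just placeholders; in my script the factor comes from args as above):

```python
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer

model_name = "NousResearch/Llama-2-7b-hf"  # placeholder, any LLaMA-style checkpoint

config = AutoConfig.from_pretrained(model_name)
# Built-in transformers RoPE scaling: "linear" or "dynamic" (dynamic NTK), factor > 1.0
config.rope_scaling = {"type": "linear", "factor": 4.0}
# config.rope_scaling = {"type": "dynamic", "factor": 4.0}

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, config=config)
```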
