fix: Set Mistral sliding window to max position embeddings when None #128

Merged: tgaddair merged 1 commit into main from fix-mistral-window on Dec 13, 2023

tgaddair
Contributor

Fixes #127.

The latest Mistral v0.2 model has sliding_window set to None in its config. Passing None through as-is appears to cause the model to output nonsense. Empirically, falling back to the max position embeddings when sliding_window is None gets the model to generate results that seem reasonable.
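
A minimal sketch of the fallback described above, not the actual diff from this PR. `ToyMistralConfig` and `resolve_sliding_window` are hypothetical names introduced for illustration; only the `sliding_window` and `max_position_embeddings` fields come from the description, and the numeric values are illustrative.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyMistralConfig:
    # Hypothetical stand-in for the model config; holds only the two
    # fields discussed in this PR.
    max_position_embeddings: int
    sliding_window: Optional[int] = None


def resolve_sliding_window(config: ToyMistralConfig) -> int:
    """Return the configured window, or max_position_embeddings when it is None."""
    if config.sliding_window is None:
        # Assumed fallback: treat a missing window as "attend over the full context".
        return config.max_position_embeddings
    return config.sliding_window


# Illustrative values only: one config with an explicit window, one with None.
with_window = ToyMistralConfig(max_position_embeddings=32768, sliding_window=4096)
without_window = ToyMistralConfig(max_position_embeddings=32768, sliding_window=None)

print(resolve_sliding_window(with_window))
print(resolve_sliding_window(without_window))
```

With this shape, downstream code always receives an integer window, so no attention path has to handle None.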

@tgaddair tgaddair merged commit 7256d15 into main Dec 13, 2023
1 check passed
@tgaddair tgaddair deleted the fix-mistral-window branch December 13, 2023 18:41
Development

Successfully merging this pull request may close these issues.

issues launching docker cmd for "mistralai/Mistral-7B-Instruct-v0.2"