
add support for OpenELM #63

Merged: 3 commits into ml-explore:main on Apr 30, 2024

Conversation

@smdesai (Contributor) commented Apr 29, 2024:

@awni Here's the PR:

public static let openelm270m4bit = ModelConfiguration(
    id: "mlx-community/OpenELM-270M-Instruct"
) { prompt in
    "\(prompt)"
}
@MatthewWaller commented:

First off, phenomenal work! I tested it, and it seems to be doing completion just fine. Do you know how we should format it for instruction? Is it like the phi model at all, or like the others, such as the model above? I didn't see any special tokens for the instruct version thus far.

@smdesai replied:

Thanks very much, Matthew. I've no idea what the chat template format should be; perhaps someone from Apple can comment.

@MatthewWaller replied:

Ah, looks like I found something helpful here, just not sure how to translate it to code: https://github.com/apple/corenet/blob/main/projects/openelm/instruction_tuning/openelm-instruct.yaml
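
A minimal sketch of how a chat template from that yaml might be wired into the ModelConfiguration prompt closure shown at the top of this PR; the template string below is a placeholder assumption, not the one from the yaml:

public static let openelm270m4bit = ModelConfiguration(
    id: "mlx-community/OpenELM-270M-Instruct"
) { prompt in
    // Placeholder chat-role markers (assumed), not the corenet template
    "<|user|>\n\(prompt)\n<|assistant|>\n"
}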

@smdesai replied:

Oh that looks useful, let me grab the relevant content from the chat template and try it out. Thanks for looking into it.

@smdesai replied:

@MatthewWaller I tried the following template without luck.
"<|system|>\nYou are a helpful assistant<|end|>\n<|user|>(prompt)<|end|>\n<|assistant|>"

@smdesai replied Apr 29, 2024:

Using the default template for the Llama tokenizer, <s>[INST]\(prompt)[/INST] seems to work, but I'll leave it to someone who knows better.
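
Plugged into the same ModelConfiguration closure, that would look roughly like this (a sketch of the attempt described above, not necessarily the code as merged):

public static let openelm270m4bit = ModelConfiguration(
    id: "mlx-community/OpenELM-270M-Instruct"
) { prompt in
    // Llama-style [INST] wrapping that appeared to work in testing
    "<s>[INST]\(prompt)[/INST]"
}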

@MatthewWaller replied:

Yeah, overall this is working well for completion. I think for chat there are some things we can do after this PR: there is a bug in swift-transformers, just fixed in main, pertaining to encoding special tokens. Also, the config isn't set up to recognize tokens like “<|user|>” and such, so we would need to adjust that too eventually.

@davidkoski mentioned this pull request on Apr 30, 2024.
@awni (Member) commented Apr 30, 2024:

@smdesai could you run the swift formatting?

pre-commit run --all-files

@smdesai replied Apr 30, 2024:

@awni It's run.

@davidkoski (Collaborator) left a comment:

Thank you for the contribution!!

@davidkoski merged commit 4d20785 into ml-explore:main on Apr 30, 2024
3 checks passed