Neuron model for causal lm partial support #117

Merged
merged 19 commits into from
Jul 6, 2023

dacorvo
Collaborator

@dacorvo dacorvo commented Jun 28, 2023

This is a very preliminary integration of a NeuronModelForCausalLM generic class to wrap neuronx optimized models.

For now, only the gpt2 model type is supported.

The following features are available:

  • instantiate (export + compile) a neuron model from a transformers model_id (from_pretrained),
  • save the exported model locally (save_pretrained),
  • instantiate (export + reload compilation artifacts) a neuron model from a local directory (from_pretrained),
  • generate outputs (generate).
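
As a rough illustration of the four steps above, the sketch below chains them in one helper. It is not code from this pull request: the helper name `export_save_reload_generate` is hypothetical, the keyword arguments accepted by `from_pretrained` and `generate` are passed through untouched because their exact form is an assumption here, and the import path shown in the trailing comment is assumed as well.

```python
def export_save_reload_generate(
    model_cls,
    tokenizer,
    model_id,
    save_dir,
    prompt,
    export_kwargs=None,
    generate_kwargs=None,
):
    """Chain export, save, reload and generate (hypothetical helper).

    `model_cls` is expected to expose `from_pretrained`, `save_pretrained`
    and `generate`, as NeuronModelForCausalLM is described to do above.
    """
    export_kwargs = export_kwargs or {}
    generate_kwargs = generate_kwargs or {}

    # 1. Instantiate (export + compile) from a transformers model_id.
    model = model_cls.from_pretrained(model_id, **export_kwargs)

    # 2. Save the exported model locally.
    model.save_pretrained(save_dir)

    # 3. Instantiate again from the local directory, which reloads the
    #    compilation artifacts instead of starting from the hub model.
    reloaded = model_cls.from_pretrained(save_dir)

    # 4. Generate outputs from a tokenized prompt and decode them.
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = reloaded.generate(**inputs, **generate_kwargs)
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)


# Assumed usage on a machine with Neuron devices (not verified here):
#
#   from transformers import AutoTokenizer
#   from optimum.neuron import NeuronModelForCausalLM
#
#   tokenizer = AutoTokenizer.from_pretrained("gpt2")
#   texts = export_save_reload_generate(
#       NeuronModelForCausalLM, tokenizer, "gpt2", "./gpt2-neuron", "Hello"
#   )
```

Taking the model class as a parameter keeps the helper importable on machines without Neuron hardware; only the commented usage needs the real class.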

What is missing:

  • between two generations, a specific reset_generation method must be called: I plan to remove that in a subsequent pull-request,
  • we should check the compiler version when instantiating from a local path,
  • the export should be available through the CLI,
  • the new model class should have a corresponding text-generation pipeline.
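
To make the compiler-version item above more concrete, here is a minimal sketch of what such a check could look like. It is only an assumption about one possible design, not what this pull request implements: the function name `check_compiler_version`, the `config.json` file name and the `compiler_version` key are all hypothetical, and the installed version is passed in by the caller rather than queried from the Neuron SDK.

```python
import json
from pathlib import Path


def check_compiler_version(
    model_dir,
    installed_version,
    config_name="config.json",    # hypothetical file name
    key="compiler_version",       # hypothetical key
):
    """Compare the compiler version recorded at export time with the installed one."""
    config = json.loads((Path(model_dir) / config_name).read_text())
    saved_version = config.get(key)
    if saved_version is None:
        raise ValueError(f"No '{key}' entry found in {config_name}")
    if saved_version != installed_version:
        # Artifacts compiled by another compiler version may not load correctly,
        # so fail early with an explicit message.
        raise RuntimeError(
            f"Model was compiled with version {saved_version}, "
            f"but version {installed_version} is installed"
        )
    return saved_version
```

A strict equality test is the simplest policy; a real implementation might accept a range of compatible versions instead.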

What could be added if this is a use case:

  • use cache (is it even possible, since the compilation happens behind the scenes in transformers-neuronx?),
  • support recompilation with different parameters when instantiating from a local path.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

@dacorvo dacorvo force-pushed the neuron_model_for_causal_lm branch 5 times, most recently from c357500 to c76e6fe Compare June 30, 2023 12:28
@dacorvo dacorvo force-pushed the neuron_model_for_causal_lm branch from c76e6fe to 2635024 Compare June 30, 2023 14:14
Member

@michaelbenayoun michaelbenayoun left a comment

Left some comments, but overall looks great!

Member

@michaelbenayoun michaelbenayoun left a comment

LGTM!
Thanks for the great work @dacorvo

Can be merged once @JingyaHuang approved.

Collaborator

@JingyaHuang JingyaHuang left a comment

LGTM as well!

Amazing job, let's get it merged 🚀 !

@JingyaHuang
Collaborator

Oh btw, let's add a guiding doc for decoder model as well (maybe in a separate PR):

(And also state the limitation, e.g. that it currently only works for gpt2, in the supported architectures: https://huggingface.co/docs/optimum-neuron/package_reference/configuration)

@dacorvo
Collaborator Author

dacorvo commented Jul 6, 2023

> Oh btw, let's add a guiding doc for decoder model as well (maybe in a separate PR):
>
> (And also state the limitation, e.g. that it currently only works for gpt2, in the supported architectures: https://huggingface.co/docs/optimum-neuron/package_reference/configuration)

Ok, that will be my next pull-request, now that I finally figured out that I needed to actually edit the mdx files to get something generated 🤭.

@dacorvo dacorvo merged commit eeea740 into main Jul 6, 2023
7 of 10 checks passed
@dacorvo dacorvo deleted the neuron_model_for_causal_lm branch July 6, 2023 14:46