
LitGPT Python API v1 #1463

Merged: 15 commits into main from python-api-v1 on Jun 20, 2024

Conversation

@rasbt (Collaborator) commented Jun 5, 2024

This is a PR to implement a subset of the LitGPT Python API as discussed in #1459 (CC @aniketmaurya). This subset will focus only on the inference aspects (not the training and finetuning, yet).

TODOs

  • add class structure
  • add model loading
  • add generate method
  • add type hints
  • change checkpoint_dir to model
  • add docstrings
  • add tests
  • add docs

To get the v1 out for inference soon, and to keep the review burden of a single PR manageable, these features will be added in separate PRs:

  • Add streaming option
  • Add functionality to download the model automatically
  • Add multi-GPU loading
  • Add finetuning functions
  • Add pretraining function
  • Extend to multi-device inference

@rasbt rasbt marked this pull request as draft June 5, 2024 22:22
@rasbt (Collaborator, Author) commented Jun 7, 2024

So far, the basic use case on a single device works:

# run `litgpt download EleutherAI/pythia-160m` first

from litgpt.api import LLM
llm = LLM.load("EleutherAI/pythia-160m", device_type="cuda", devices=1)
text = llm.generate("What do Llamas eat?", top_k=1)
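
As an aside on the `top_k=1` argument in the snippet above: top-k sampling restricts sampling to the k highest-scoring tokens, and with k=1 it reduces to greedy decoding (always picking the argmax). The following is only an illustrative plain-Python sketch of the idea, not LitGPT's actual implementation:

```python
import math
import random

def top_k_sample(logits, k, rng=random.Random(0)):
    """Sample a token index from the k highest-scoring logits.

    Illustrative sketch only: keeps the k largest logits, drops the
    rest, and samples proportionally to their softmax weights.
    """
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    weights = [math.exp(logits[i]) for i in top]
    return rng.choices(top, weights=weights)[0]

logits = [0.1, 2.5, -1.0, 0.7]
# With k=1 only the single largest logit survives, so sampling
# is deterministic and equivalent to argmax (greedy decoding).
assert top_k_sample(logits, k=1) == 1
```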

I am concerned about the multi-GPU support though. I remember from a discussion with Luca that we want the option to load the model onto multiple devices. But how would I do that with Fabric if I don't want to train / use FSDP right away? I think the natural way would be fabric.launch(train, ...), but that wouldn't work with the

from litgpt.api import LLM
llm = LLM.load("EleutherAI/pythia-160m", device_type="cuda", devices=4)
llm.instruction_finetune(dataset, ...)

approach (not implemented yet, but just thinking down the road).

I am honestly a bit stuck. Would the only option be to load the model on a single device and then use multiple devices only when finetuning? Any ideas here @awaelchli? (And how does the overall code look? It feels a bit ugly to me, but I hope it's not too terrible.)

@aniketmaurya (Contributor) commented:

Hi @rasbt, which method do we use here to stream the response?

@rasbt (Collaborator, Author) commented Jun 10, 2024

@aniketmaurya I haven't added streaming to this v1, but I can add it if it's important, since it should be relatively straightforward.

@aniketmaurya (Contributor) commented:

Yes, I think we would need that for the LitServe streaming example, and also if we want to serve an OpenAI-compatible API.

@rasbt (Collaborator, Author) commented Jun 10, 2024

I tried to add it, but it gets messy to do all of that in a single PR: streaming requires more refactoring, because Python treats any function that contains a yield as a generator, even if it also has a return. We can add streaming in a separate PR later, once we have the basics working.
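
For reference, the Python behavior mentioned here can be shown in a tiny standalone example: a function containing yield is always compiled as a generator, and a return value inside it is not handed back to the caller directly but attached to the StopIteration exception.

```python
def gen():
    # The presence of `yield` makes this a generator function,
    # regardless of the `return` statement below.
    yield "token"
    return "done"  # becomes StopIteration.value, not a normal return

g = gen()
assert next(g) == "token"
try:
    next(g)
except StopIteration as exc:
    # The `return` value is only reachable via the exception.
    assert exc.value == "done"
```

This is why a single method cannot cleanly serve both a streaming (yielding) and a non-streaming (returning) code path without some restructuring.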

@rasbt rasbt mentioned this pull request Jun 14, 2024
@rasbt rasbt marked this pull request as ready for review June 18, 2024 12:43
@rasbt (Collaborator, Author) commented Jun 18, 2024

If you have some time, could you take a look at whether this v1 looks structurally OK, @lantiga @awaelchli?

As mentioned at the top, more features will be added later. This is a simple v1 that focuses on the basics so as not to bloat the PR too much.

@rasbt (Collaborator, Author) commented Jun 20, 2024

Let's merge it so that Aniket can use it as an experimental feature from the main branch. It won't be advertised or recommended to users yet, until it's a bit more mature and more functionality is added. If you have a chance sometime (maybe after all the conferences), your expert feedback and a second pair of eyes would be much appreciated.

@rasbt rasbt merged commit ff50df2 into main Jun 20, 2024
9 checks passed
@rasbt rasbt deleted the python-api-v1 branch June 20, 2024 19:47