Move batching from Task to LLM, fix vLLM.generate and add DISTILABEL_LOG_LEVEL #371

Merged
alvarobartt merged 9 commits into core-refactor from generate-handles-batch on Mar 2, 2024

@alvarobartt (Member) commented Mar 1, 2024

Description

This PR moves batching from the Task to the LLM, so that the LLM can process each batch in the most efficient way available rather than through a simple for-loop; some LLM engines have dedicated mechanisms for handling batches. The prepare_input abstract method has also been removed from LLM, since it is no longer needed unless a specific LLM implementation requires it.
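To illustrate the design change, here is a minimal sketch of what "batching in the LLM" could look like. All class and method names below (`LoopLLM`, `BatchedLLM`, `_engine_batch_call`, `Task.process`) are hypothetical stand-ins and not the actual distilabel classes; the point is only that the Task hands over the whole batch and each LLM decides how to process it.

```python
from abc import ABC, abstractmethod
from typing import List


class LLM(ABC):
    """Hypothetical base class: `generate` receives the whole batch."""

    @abstractmethod
    def generate(self, inputs: List[str]) -> List[str]:
        ...


class LoopLLM(LLM):
    """Engine without native batching: falls back to a per-input loop."""

    def _generate_one(self, prompt: str) -> str:
        # Stand-in for a single-prompt call to some engine.
        return f"echo: {prompt}"

    def generate(self, inputs: List[str]) -> List[str]:
        return [self._generate_one(prompt) for prompt in inputs]


class BatchedLLM(LLM):
    """Engine with native batching: one engine call per batch."""

    def __init__(self) -> None:
        self.engine_calls = 0

    def _engine_batch_call(self, prompts: List[str]) -> List[str]:
        # Stand-in for an engine that accepts many prompts at once.
        self.engine_calls += 1
        return [f"echo: {prompt}" for prompt in prompts]

    def generate(self, inputs: List[str]) -> List[str]:
        return self._engine_batch_call(inputs)


class Task:
    """Hypothetical task that no longer loops over the batch itself."""

    def __init__(self, llm: LLM) -> None:
        self.llm = llm

    def process(self, batch: List[str]) -> List[str]:
        # The batch is handed to the LLM as-is; no for-loop here.
        return self.llm.generate(batch)
```

With this split, swapping `LoopLLM` for `BatchedLLM` changes how the batch is executed without touching the Task.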

This PR also fixes vLLM.generate and stops propagating unsolicited inputs through the Pipeline, so that only the inputs requested via the property are kept.
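As a rough sketch of the filtering idea, assuming each step declares the keys it wants through a property (called `inputs` here, which is an assumed name, as are `Step` and `keep_solicited`), the pipeline could drop every other key before passing rows along:

```python
from typing import Any, Dict, List


class Step:
    """Hypothetical step that declares the input keys it asks for."""

    @property
    def inputs(self) -> List[str]:
        # Assumed property name; the real one may differ.
        return ["instruction"]


def keep_solicited(step: Step, rows: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return copies of the rows containing only the keys the step requested."""
    wanted = set(step.inputs)
    return [{k: v for k, v in row.items() if k in wanted} for row in rows]
```

For example, a row such as `{"instruction": "hi", "extra": 1}` would be reduced to `{"instruction": "hi"}` before reaching the step.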

In addition, this PR adds the DISTILABEL_LOG_LEVEL environment variable to control the log level of distilabel; it defaults to INFO.
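One way such a variable could be resolved with the standard `logging` module is sketched below. The helper name `resolve_log_level` and the fallback to INFO for unrecognized values are assumptions for illustration, not necessarily how distilabel implements it.

```python
import logging
import os


def resolve_log_level(default: str = "INFO") -> int:
    """Read DISTILABEL_LOG_LEVEL and map it to a numeric logging level."""
    name = os.environ.get("DISTILABEL_LOG_LEVEL", default).upper()
    level = logging.getLevelName(name)
    # For unknown names, getLevelName returns a string such as "Level FOO",
    # so fall back to INFO in that case (an assumed policy).
    return level if isinstance(level, int) else logging.INFO


# Hypothetical logger name, configured with the resolved level.
logger = logging.getLogger("distilabel-sketch")
logger.setLevel(resolve_log_level())
```

Under this sketch, running a script with `DISTILABEL_LOG_LEVEL=DEBUG` set in the environment would make the logger emit debug messages.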

Example

Find a full example at https://huggingface.co/datasets/alvarobartt/instruction-dataset-mistral-7b-instruct-v0.2/blob/main/example.py

@alvarobartt changed the title WIP Move batching from Task to LLM, fix vLLM.generate and add DISTILABEL_LOG_LEVEL Mar 1, 2024
@alvarobartt self-assigned this Mar 1, 2024
@alvarobartt added this to the 1.0.0 milestone Mar 1, 2024
@alvarobartt marked this pull request as ready for review March 1, 2024 13:28

@plaguss (Contributor) left a comment

🚀

@alvarobartt merged commit 9937c43 into core-refactor Mar 2, 2024
0 of 4 checks passed
@alvarobartt deleted the generate-handles-batch branch March 2, 2024 13:45