
RankLLM Revamp - VLLM support, APEER prompt, fixed caching, improved documentation #119

Merged: 16 commits into main from vllm_new, Jul 17, 2024

Conversation

ronakice (Member)

Pull Request Checklist

Reference Issue

Please provide the reference to the issue this PR is addressing (# followed by the issue number). If there is no associated issue, write "N/A".

ref:

Checklist Items

Before submitting your pull request, please review these items:

  • Have you followed the contributing guidelines?
  • Have you verified that there are no existing Pull Requests for the same update/change?
  • Have you updated any relevant documentation or added new tests where needed?

PR Type

What kind of change does this PR introduce?

  • Bugfix
  • Feature
  • Code style update (formatting, local variables)
  • Refactoring (no functional changes, no API changes)
  • Documentation content changes
  • Other...
    • Description:

ronakice requested a review from sahel-sh on May 22, 2024 18:27
@@ -153,5 +155,10 @@ def main(args):
default="You are RankLLM, an intelligent assistant that can rank passages based on their relevancy to the query.",
help="the system message used in prompts",
)
parser.add_argument(
"--batched",

Member:

This batch flag can be confused with rerank_batch: the latter reranks multiple queries regardless of whether they are processed one at a time or all at once, while the former refers to reranking multiple queries at once. From the default parameter values it looks like we want to support the rerank_batch function with batch=false? If so, the two different meanings of "batch" can be confusing; maybe we need to rename one of the two use cases? WDYT @ronakice, @lintool

Member Author:

Hmm, I agree this could be a source of confusion for users. Batching usually means multiple queries being processed at once, correct? In which case rerank_batch needs renaming?

Member:

In my understanding, batch processing is async and, yes, it normally means multiple queries at once, but it doesn't restrict the batch size to be strictly greater than 1. IIRC, @lintool chose rerank and rerank_batch to align with the pyserini function names (retrieve and retrieve_batch).
@ronakice WDYT about changing --batched to --vllm-batched?
I also propose an --inference-method enum (or some similar name) with FastChat as the default value; the user can set it to vllm if they want vLLM rather than FastChat inference.

Member Author:

I can add these two, yes 👍🏼 I agree that for the time being this is clearer.

Member Author:

@sahel-sh would we still need to mention --inference-method? --vllm-batched is the only vLLM setting; otherwise it defaults to FastChat.

Member:

Less than ideal, since "batch" in rerank_batch has a different implication than "batch" in --vllm-batched, but it is OK for now.
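
For readers following the thread, here is a minimal argparse sketch of the flag the discussion converged on; the name --vllm-batched comes from this conversation, but the help text and surrounding code are illustrative assumptions rather than the PR's actual implementation.

import argparse

parser = argparse.ArgumentParser(description="illustrative rank_llm CLI sketch")
# --vllm-batched opts into vLLM batched inference (multiple queries per call);
# when left unset, inference falls back to the FastChat default.
# The name avoids overloading "batch", which rerank_batch already uses for
# multi-query reranking regardless of how many queries run at once.
parser.add_argument(
    "--vllm-batched",
    action="store_true",
    help="use vLLM batched inference instead of the default FastChat backend",
)
args = parser.parse_args()

With the flag omitted, the sketch falls back to FastChat, matching the default behavior described in the comments above.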

ronakice changed the title from "VLLM Support" to "RankLLM Revamp - VLLM support, APEER prompt, fixed caching, improved documentation" on Jul 15, 2024
ronakice requested a review from sahel-sh on July 15, 2024 18:12
Resolved review threads on:
  • run_beir.sh
  • src/rank_llm/rerank/rank_gpt.py (two threads)
  • src/rank_llm/rerank/rank_listwise_os_llm.py
import json
import os

def convert_json_to_jsonl(input_file, output_file):
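    """Convert requests from the unsupported JSON format to the newer JSONL request format (per the author's reply below)."""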

Member:

Why is this needed?

Member Author:

It converts from the unsupported JSON request format to the newer JSONL request format.

Member:

Got it, thanks. Please mention in the high-level comment above where the data class that the new format comes from is defined.
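
Since the converter above appears only as a fragment, here is a minimal sketch of what such a JSON-to-JSONL conversion could look like; it assumes the input file holds a top-level JSON array of request objects, since the actual request data class discussed here is not shown in this thread.

import json

def convert_json_to_jsonl(input_file, output_file):
    """Rewrite a JSON file containing a list of request objects as JSONL, one object per line."""
    # Assumption: the input is a top-level JSON array; the real request data
    # class referenced in the review comment above is not shown in this thread.
    with open(input_file, "r") as f_in:
        records = json.load(f_in)
    with open(output_file, "w") as f_out:
        for record in records:
            f_out.write(json.dumps(record) + "\n")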

ronakice merged commit 851471f into main on Jul 17, 2024
ronakice deleted the vllm_new branch on July 17, 2024 at 15:31