Option to Use GPU, CUDA #8

Open
hieuhthh opened this issue Feb 20, 2024 · 4 comments

Comments

@hieuhthh

I really appreciate this repository. I hope the rerank model can optionally use a GPU to fully utilize the performance increase, potentially even with multi-GPU support.

Thank you.

@PrithivirajDamodaran
Owner

Thanks for raising this. It is on our list.

@prashantg445

Hey @PrithivirajDamodaran,
Can you publish this list of next action items somewhere, so that people interested in contributing can get started?

P.S.: I am interested in contributing.

@PrithivirajDamodaran
Owner

Thanks for reaching out, @prashantg445

@prabhkaran is working on a few optimisations. He will share those.

Besides that, we are going to work on extending FlashRank to support listwise rerankers. Today we support pointwise / pairwise rerankers, which frame reranking as a classification task. Given a query q and a passage p, a pointwise reranker produces a real-valued score indicating the relevance of the passage to the query. The model is optimized using cross-entropy or contrastive loss, based on binary relevance judgments from human annotators. At inference time, the top-k passages returned by the first-stage retriever are each scored independently. The final passages are then ranked in decreasing order of their relevance scores. Listwise rerankers, by contrast, consider all the candidate passages together.
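
To make the pointwise flow concrete, here is a minimal sketch of the inference step only: score each (query, passage) pair independently, then sort by descending score. This is not FlashRank's code. `overlap_score` and `pointwise_rerank` are hypothetical names, and the token-overlap scorer is just a toy stand-in for a trained cross-encoder.

```python
from typing import Callable, List, Tuple


def overlap_score(query: str, passage: str) -> float:
    """Hypothetical stand-in for a trained model.

    Returns the fraction of query tokens that also appear in the passage.
    A real pointwise reranker would return a learned relevance score here.
    """
    q_tokens = set(query.lower().split())
    p_tokens = set(passage.lower().split())
    if not q_tokens:
        return 0.0
    return len(q_tokens & p_tokens) / len(q_tokens)


def pointwise_rerank(
    query: str,
    passages: List[str],
    score_fn: Callable[[str, str], float] = overlap_score,
) -> List[Tuple[str, float]]:
    """Score every passage independently, then rank by decreasing score."""
    # Each passage is scored on its own; no passage sees the others.
    scored = [(passage, score_fn(query, passage)) for passage in passages]
    # sorted() is stable, so ties keep the first-stage retriever's order.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    query = "gpu support for reranking"
    passages = [
        "Bananas are rich in potassium.",
        "FlashRank reranking runs on CPU today.",
        "GPU support for reranking is on the list.",
    ]
    for passage, score in pointwise_rerank(query, passages):
        print(f"{score:.2f}  {passage}")
```

Because each pair is scored on its own, the scoring loop is easy to batch or parallelize. A listwise reranker would instead need all candidates as a single input.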

@YVMVN

YVMVN commented Aug 17, 2024

Good day!
I really appreciate this repo. However, listwise is too slow on CPUs with llama-cpp. Is there any update on GPU support?
