Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RAGatouille API Server #92

Open
wants to merge 7 commits into
base: main
Choose a base branch
from
Open

RAGatouille API Server #92

wants to merge 7 commits into from

Conversation

0-hero
Copy link

@0-hero 0-hero commented Jan 30, 2024

No changes on core RAGatouille module. Just a fastapi addition

Looking for suggestion on the implementation, routes etc...

API Endpoints

Index

  • POST /api/v1/index/: Create an index with documents.

Document Management

  • POST /api/v1/documents/search/: Search documents.
  • [ISSUE] POST /api/v1/documents/add/: Add documents to an existing index.
  • [ISSUE] DELETE /api/v1/documents/delete/: Delete documents from an index.

Encode

  • POST /api/v1/encode/: Encode documents.
  • POST /api/v1/encode/search/: Search through encoded documents.
  • POST /api/v1/encode/clear/: Clear all encoded documents.

Rerank

  • POST /api/v1/rerank/: Rerank a set of documents based on a query.

Known Issues. Still exploring the cause

  1. Add to index doesn’t work
2024-01-31 00:15:41,992 - app.api.endpoints.document_management - ERROR - Failed to add documents to index: Error in void faiss::Clustering::train_encoded(faiss::idx_t, const uint8_t *, const faiss::Index *, faiss::Index &, const float *) at /Users/runner/work/faiss-wheels/faiss-wheels/faiss/faiss/Clustering.cpp:281: Error: 'nx >= k' failed: Number of training points (114) should be at least as large as number of clusters (128)
  1. Search after delete & Search after failed "add" document throws below error.
2024-01-30 18:11:28,756 - app.api.endpoints.search - INFO - Initiating search with query: 'egypt'. Index: 'string', k: 10
2024-01-30 18:11:28,848 - app.api.endpoints.search - ERROR - Search operation failed for query: 'egypt': 0
Traceback (most recent call last):
  File “…/RAGatouille-experiments/serve/app/api/endpoints/search.py", line 13, in search
    results = rag.search(
  File "…/RAGatouille-experiments/ragatouille/RAGPretrainedModel.py", line 292, in search
    return self.model.search(
  File "…/RAGatouille-experiments/ragatouille/models/colbert.py", line 458, in search
    document_id = self.pid_docid_map[id_]
KeyError: 0
INFO:     127.0.0.1:64170 - "POST /api/v1/search/ HTTP/1.1" 500 Internal Server Error

@0-hero
Copy link
Author

0-hero commented Jan 30, 2024

@bclavie @anirudhdharmarajan Any thoughts on the above issues?

EDIT: Just saw issue 1 might be related to #71

@bclavie
Copy link
Owner

bclavie commented Jan 30, 2024

Thanks a lot for this! I’m likely not going to be able to review bigger PRs in the next few days but will do so ASAP (or defer to @Anmol6 / @anirudhdharmarajan if they have the bandwidth to do so!)

@0-hero
Copy link
Author

0-hero commented Jan 30, 2024

@bclavie No problem, thanks! I’m running additional tests, while trying to fix the above

@0-hero
Copy link
Author

0-hero commented Feb 7, 2024

@bclavie gentle bump!

@bclavie
Copy link
Owner

bclavie commented Feb 7, 2024

Hey, sorry for the delay in reviewing this, still getting some random bouts of poor health (COVID really is tricky!)

As for add_to_index, I think it's fine for it to have this known issue, as the issue is being fixed by @anirudhdharmarajan and isn't related to this service (the function itself has a fundamental issue atm). Moreover, I think this style of on-the-fly serving would work better with an HNSW style index than the optimised one currently used if CRUD is an issue!

Things are looking pretty good at first glance and I really do appreciate it, it lowers the barrier to entry to even more!

A nitpick: could you migrate the dependency management to poetry, or, at least (I don't mind this option!) pin the versions in requirements.txt? Some of those libs move fast and it'd be best to have them pinned!

@0-hero
Copy link
Author

0-hero commented Feb 13, 2024

Sorry to hear that @bclavie. Updated the requirements & migrated the dependency management to poetry

@0-hero
Copy link
Author

0-hero commented Feb 14, 2024

@bclavie do you think this is ready for the merge?

@hiepxanh
Copy link

@bclavie hi, what do you think, I love this feature so I can implement it and deploy on docker <3

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants