bug: RAG_RERANKING not functional with manual Win10 install #2354

Open
2 of 4 tasks
chrisoutwright opened this issue May 17, 2024 · 3 comments

chrisoutwright commented May 17, 2024

Bug Report

Description

Bug Summary:
After files are fetched and processed, the reranking model does not work as expected. Some weights of DebertaV2ForSequenceClassification are not initialized from the checkpoint, so reranking has no useful effect. In addition, the reranker does not appear to interact with the documents at all, which suggests it is not processing document content.

Steps to Reproduce:

  1. Run start_windows.bat with the following environment variables set:
ENABLE_RAG_HYBRID_SEARCH=True
RAG_EMBEDDING_MODEL=snowflake-arctic-embed:latest
RAG_EMBEDDING_ENGINE=ollama 
OLLAMA_BASE_URLS=http://host.docker.internal:11434;http://host.docker.internal:11435;http://192.168.1.225:11439;http://192.168.1.225:11437;http://192.168.1.225:11436
CHUNK_SIZE=500
CHUNK_OVERLAP=100
RAG_RERANKING_MODEL_AUTO_UPDATE=True
RAG_RERANKING_MODEL=mightbe/Better-PairRM 
  2. Ensure that the RAG_RERANKING_MODEL variable points to a functional model.
  3. Observe the system logs for errors or warnings related to model initialization and embedding generation.
  4. Note the behavior of the reranking model during document processing.
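
Before starting the server, the settings from step 1 can be sanity-checked. The following is a minimal sketch, not Open WebUI's own parsing: `check_rag_settings` is a hypothetical helper, and the rules it applies (semicolon-separated URLs, overlap smaller than chunk size, no stray whitespace) are assumptions about what a sane configuration looks like. The whitespace check is included because in a batch file `set NAME=value ` keeps a trailing space as part of the value; the listing above, as captured, shows a trailing space after `ollama` and after `mightbe/Better-PairRM`, though it is unknown whether the real file contains them.

```python
from urllib.parse import urlparse


def check_rag_settings(env):
    """Return a list of human-readable problems found in RAG-related settings.

    `env` is any mapping of variable names to string values.
    """
    problems = []

    # OLLAMA_BASE_URLS is written as a semicolon-separated list in the report.
    raw_urls = env.get("OLLAMA_BASE_URLS", "")
    for url in (u.strip() for u in raw_urls.split(";")):
        if not url:
            continue
        parsed = urlparse(url)
        if parsed.scheme not in ("http", "https") or not parsed.hostname:
            problems.append(f"OLLAMA_BASE_URLS entry is not an http(s) URL: {url!r}")

    # The overlap between chunks should be smaller than the chunk itself.
    try:
        size = int(env.get("CHUNK_SIZE", ""))
        overlap = int(env.get("CHUNK_OVERLAP", ""))
    except ValueError:
        problems.append("CHUNK_SIZE and CHUNK_OVERLAP must both be integers")
    else:
        if overlap >= size:
            problems.append("CHUNK_OVERLAP must be smaller than CHUNK_SIZE")

    # Stray whitespace changes the value that the application sees.
    for name in ("RAG_EMBEDDING_ENGINE", "RAG_RERANKING_MODEL"):
        value = env.get(name, "")
        if not value.strip():
            problems.append(f"{name} is not set")
        elif value != value.strip():
            problems.append(f"{name} has leading or trailing whitespace: {value!r}")

    return problems
```

Calling `check_rag_settings(os.environ)` from the same shell that runs start_windows.bat (after `import os`) prints nothing to fix when the list comes back empty.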

Expected Behavior:
The reranking model should successfully initialize with pre-trained weights and effectively interpret and process documents to enhance search and retrieval outcomes.
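
To make the expected behavior concrete, here is a minimal sketch of what a reranking step does in general: score each (query, document) pair and reorder the candidates by that score. It is not Open WebUI's implementation. `rerank` and `token_overlap` are hypothetical names, and `token_overlap` is a toy stand-in for the cross-encoder model. If the model's classification weights are newly initialized rather than loaded from the checkpoint, the scoring function would return essentially arbitrary numbers and the resulting order would carry no meaning.

```python
from typing import Callable, List, Tuple


def rerank(
    query: str,
    docs: List[str],
    score_fn: Callable[[str, str], float],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Score every (query, doc) pair and return the top_k docs, best first."""
    scored = [(doc, score_fn(query, doc)) for doc in docs]
    # sort() is stable, so documents with equal scores keep their input order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


def token_overlap(query: str, doc: str) -> float:
    """Toy scorer (assumption): fraction of query tokens that occur in doc."""
    query_tokens = set(query.lower().split())
    doc_tokens = set(doc.lower().split())
    if not query_tokens:
        return 0.0
    return len(query_tokens & doc_tokens) / len(query_tokens)
```

With a working scorer, documents that match the query move to the front; that change in order is the effect one would expect to see when reranking is enabled.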

Actual Behavior:

  • Initialization errors occur due to some weights not being loaded from the checkpoint.
  • The reranker does not interact effectively with document content, so search and document retrieval are not meaningfully improved; its results do not appear to be used.

Environment

  • Open WebUI Version: v0.1.124
  • Operating System: Windows 10, manual install, CUDA support
  • Browser: Chrome

Reproduction Details

Confirmation:

  • I have read and followed all the instructions provided in the README.md.
  • I am on the latest version of Open WebUI.
  • I have included the browser console logs.
  • I have included the CMD logs.

Additional Notes:

  • The issues occur consistently across multiple attempts with varying document types and contents.
  • Potential compatibility issues with the mightbe/Better-PairRM model could be a contributing factor to the initialization problems.
  • Other reranking models show the same issue.

Logs and Screenshots

WARNING:models.DebertaV2:Some weights of DebertaV2ForSequenceClassification were not initialized from the model checkpoint at C:\Users\Chris\.cache\huggingface\hub\models--mightbe--Better-PairRM\snapshots\c26058437c327e56e878163f74b345b4fb75ce98 and are newly initialized
INFO:apps.rag.main:file.content_type: application/pdf
INFO:apps.rag.main:store_data_in_vector_db
INFO:apps.ollama.main:generate_ollama_embeddings model='snowflake-arctic-embed:latest' prompt='[details of prompt not fully captured]'
INFO:     192.168.1.224:61610 - "GET /_app/immutable/nodes/5.9117d953.js HTTP/1.1" 304 Not Modified
INFO:apps.ollama.main:generate_ollama_embeddings

Effectively it uses:

INFO:apps.ollama.main:generate_ollama_embeddings model='snowflake-arctic-embed:latest' prompt='who involved' options=None keep_alive=None
INFO:apps.ollama.main:url: http://localhost:11435

and then does
INFO:apps.ollama.main:generate_ollama_embeddings

Where does reranking show up in these logs?

@chrisoutwright
Author

With reranking:

[two screenshots, not captured]

Without reranking:

[two screenshots, not captured]

[one further screenshot, not captured]

tjbck changed the title from "RAG_RERANKING not functional with manual Win10 install" to "bug: RAG_RERANKING not functional with manual Win10 install" on May 18, 2024
Contributor

tjbck commented May 18, 2024

PR welcome!

@chrisoutwright
Author

With hybrid search enabled, one would expect RerankCompressor to be involved and its output to be logged as query_doc_with_hybrid_search:result.

I will try again on the Win10 setup to collect the logs. We need to understand the problem better first. Or are there known issues?
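
One way to check a captured log for that expectation is to count the lines that mention each name. This is a rough sketch: `count_markers` is a hypothetical helper, and it is an assumption that this version of Open WebUI writes RerankCompressor or query_doc_with_hybrid_search into its logs at the configured log level.

```python
from typing import Dict, Iterable, Sequence

# Names taken from this thread; whether they appear in the log is an assumption.
MARKERS = (
    "RerankCompressor",
    "query_doc_with_hybrid_search",
    "generate_ollama_embeddings",
)


def count_markers(lines: Iterable[str], markers: Sequence[str] = MARKERS) -> Dict[str, int]:
    """Count how many log lines mention each marker string."""
    counts = {marker: 0 for marker in markers}
    for line in lines:
        for marker in markers:
            if marker in line:
                counts[marker] += 1
    return counts
```

Passing an open file object works, since a file iterates line by line. A zero count for the first two names next to a non-zero count for generate_ollama_embeddings would match the excerpt above, where only the embedding calls are visible.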
