feat: add weave documentation to ingestion pipeline #73

Merged: 9 commits, Apr 25, 2024

parambharat
Contributor

This PR does two things:

  1. Adds the Weave documentation to the data ingestion pipeline
  2. Removes the JA (Japanese) documentation from the ingestion

We chose to remove the JA documentation for the following reasons:

  1. In the new chat pipeline, the response synthesis module is instructed to respond in the language of the user's query. Even if the retrieved context is in English, a query in JA gets a response in JA.
  2. The JA documentation has moved to a new location in the Docodile repository. Because of this change, the last two ingestions didn't ingest any JA docs, yet the JA responses in the JA Slack have kept working.
  3. New languages such as Korean (KR) have since been introduced. Indexing each language separately just stores redundant information in the vectorstore.
  4. Since we use multilingual embedding search, we can retrieve the relevant information even when the documentation is in English and the query is in another language.
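
To illustrate reason 4, here is a minimal toy sketch of cross-lingual retrieval over an English-only index. It is not the pipeline's actual code: the `INDEX` contents, the `retrieve` helper, and all vectors are hypothetical hand-written placeholders standing in for what a multilingual embedding model might produce. The only point is that when a JA query and an English chunk land close together in a shared embedding space, cosine similarity finds the English chunk without any JA documents being indexed.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


# Hypothetical English-only index: chunk text -> placeholder vector.
# Real vectors would come from a multilingual embedding model.
INDEX = {
    "How to log metrics with wandb.log": [0.9, 0.1, 0.0],
    "How to trace calls with Weave": [0.1, 0.9, 0.1],
}


def retrieve(query_vector, index, top_k=1):
    """Return the top_k chunk texts ranked by cosine similarity to the query."""
    ranked = sorted(
        index,
        key=lambda text: cosine(query_vector, index[text]),
        reverse=True,
    )
    return ranked[:top_k]


# Placeholder vector standing in for a Japanese query about logging metrics.
ja_query_vector = [0.8, 0.2, 0.1]

print(retrieve(ja_query_vector, INDEX))
```

Under these assumptions the retrieved context is English, and the response synthesis step described in reason 1 is what turns it into a JA answer.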

@morganmcg1 merged commit cd9fb7a into main on Apr 25, 2024
3 checks passed
@morganmcg1 deleted the feat/add-weave-docs branch on April 25, 2024 11:07

2 participants