QnA won't get any content from Kendra if the Kendra content is not in English. #713

Closed
Guillaume-Bourque-Levio opened this issue Apr 9, 2024 · 15 comments
@Guillaume-Bourque-Levio

Guillaume-Bourque-Levio commented Apr 9, 2024

Describe the bug
From the default QnA client, QnA bot only returns English content.

Even if the QnA language is set to French and we have French content in Kendra, no data is returned for my French question.

As soon as I add an English data source to Kendra, I see only those answers in the chatbot.

To Reproduce
Create the default QnA bot and select French as the language.

Create a Kendra index and add a web crawler with data in French.

Expected behavior
French responses should be returned, the same ones we can see from the Kendra console when we set the language to French.

Please complete the following information about the solution:

  • Version: [e.g. v5.5.1]

  • Region: ca-central-1

  • Was the solution modified from the version published on this repository?
    No

  • Have you checked your service quotas for the services this solution uses? N/A

  • Were there any errors in the CloudWatch Logs?
    No

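  • I was told that the Kendra call must include a language code, something like this:

import boto3

kendra = boto3.client("kendra")

# text and index_id are assumed to be defined by the caller
response = kendra.retrieve(
    IndexId=index_id,
    QueryText=text,
    PageSize=5,
    PageNumber=1,
    # Restrict results to documents indexed with the French language code
    AttributeFilter={
        "EqualsTo": {
            "Key": "_language_code",
            "Value": {"StringValue": "fr"},
        }
    },
)

  • Another point: can we pass more than one language to kendra.retrieve?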

TIA

Guillaume.
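
On the question of passing more than one language: the thread does not settle whether a single kendra.retrieve call can match several languages, so one conservative option is one call per language. The sketch below assumes that approach. `language_filter` and `retrieve_per_language` are hypothetical helper names, not part of boto3 or QnABot, and `client` is assumed to be a `boto3.client("kendra")` instance.

```python
from typing import Any, Dict, List


def language_filter(language_code: str) -> Dict[str, Any]:
    """Build an AttributeFilter that keeps only documents in one language."""
    return {
        "EqualsTo": {
            "Key": "_language_code",
            "Value": {"StringValue": language_code},
        }
    }


def retrieve_per_language(
    client: Any, index_id: str, text: str, language_codes: List[str]
) -> Dict[str, List[Dict[str, Any]]]:
    """Run one retrieve call per language and group the passages by language code."""
    results: Dict[str, List[Dict[str, Any]]] = {}
    for code in language_codes:
        # Hypothetical usage: one request per language, each with its own filter.
        response = client.retrieve(
            IndexId=index_id,
            QueryText=text,
            AttributeFilter=language_filter(code),
        )
        results[code] = response.get("ResultItems", [])
    return results
```

With a real client this could be called as `retrieve_per_language(boto3.client("kendra"), index_id, text, ["fr", "en"])`. The cost is one request per language, which may matter for latency.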

@dougtoppin
Member

@Guillaume-Bourque-Levio thanks for your report. We will investigate and get back to you.

@bios6
Member

bios6 commented Apr 10, 2024

Hi @Guillaume-Bourque-Levio ,

Please try the following and confirm. In the AWS Console, search for Kendra, open Data Management on the left-hand side, and click Search indexed content. Then run the same question you are asking QnABot (in French). If it does not retrieve anything, the issue is on the Kendra side. In that case, try setting the Kendra data sources' language to English (the default); you can still pass in the French data sources, and you should be able to query in French.

Thanks!

@Guillaume-Bourque-Levio
Author

Hello,

If I search in the text box I get nothing, because the search only covers the English documents and my documents are in French.

But if I change the Kendra search language to French, as you can see here, Kendra returns information from my French documents.

So from what I understand, Kendra is fine.

image

TIA

@bios6
Member

bios6 commented Apr 11, 2024

Hello. In this case, can you change KENDRA_INDEXED_DOCUMENTS_LANGUAGES to fr in the QnABot settings? Thanks!

@Guillaume-Bourque-Levio
Author

Hello, if I replace en with fr I get no results.

Can we specify more than one language in that parameter?

Best

@bios6
Member

bios6 commented Apr 16, 2024

Hi @Guillaume-Bourque-Levio ,

I'm unable to see any issue with the Kendra integration in QnABot. I just deployed an environment with Swedish as the language in the CloudFormation parameter and a Kendra index with Swedish data, and I am able to query in Swedish (with or without the multi-language setting enabled) as well as in other languages (if the multi-language setting is enabled). Our recommendation is to turn on debugging in the settings and to try adjusting the score threshold for Kendra to see if that helps.

To answer your previous question, KENDRA_INDEXED_DOCUMENTS_LANGUAGES supports a comma-separated list. For more information about the QnABot settings, all parameters and their descriptions are listed here: https://github.com/aws-solutions/qnabot-on-aws/blob/main/docs/settings.md . You can also download our Implementation Guide here https://aws.amazon.com/solutions/implementations/qnabot-on-aws/ , which contains a lot of information about this AWS solution.
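
Since the setting takes a comma-separated list, a value such as en,fr would cover both languages. As a rough illustration of how such a value can be split into individual codes, here is a small sketch. `parse_indexed_languages` is a hypothetical helper written for this example, not QnABot's actual parsing code.

```python
from typing import List


def parse_indexed_languages(setting_value: str) -> List[str]:
    """Split a comma-separated language setting into clean, lowercase codes."""
    codes: List[str] = []
    for part in setting_value.split(","):
        code = part.strip().lower()
        # Skip empty entries (for example a trailing comma) and duplicates.
        if code and code not in codes:
            codes.append(code)
    return codes
```

Under these assumptions, stray spaces or a trailing comma in the setting value would not produce empty or duplicate language codes.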

Screenshot 2024-04-16 at 8 43 51 AM Screenshot 2024-04-16 at 8 48 26 AM

@Guillaume-Bourque-Levio
Author

Hello @bios6

Could you please share the QnA bot configuration parameters that allow your solution to return the Swedish information?

TIA.

@bios6
Member

bios6 commented Apr 16, 2024

Hi @Guillaume-Bourque-Levio ,

Here are some images of how I set up my CloudFormation parameters and QnABot settings. I only recall touching 2-3 settings, mainly for debugging; most are default.

CFN Parameters:
Screenshot 2024-04-16 at 2 25 14 PM
Screenshot 2024-04-16 at 2 26 44 PM

QnABot settings:
Screenshot 2024-04-16 at 2 32 06 PM
Screenshot 2024-04-16 at 2 32 28 PM
Screenshot 2024-04-16 at 2 33 54 PM
Screenshot 2024-04-16 at 2 35 44 PM

I hope this helps!

@Guillaume-Bourque-Levio
Author

Thanks, I'm now able to get my French answers.

@fhoueto-amz
Member

@Guillaume-Bourque-Levio
Can you please share what the root cause of the issue was on your side? @bios6 suggested changing KENDRA_INDEXED_DOCUMENTS_LANGUAGES a few comments earlier, but you mentioned that it was still not working then.

@Guillaume-Bourque-Levio
Author

@bios6 can you confirm that your Kendra index had no English content at all when you ran your test?

Also, was Kendra configured from the QnA bot console, or did you go into the Amazon console and create an index from there?

We are using the crawler v2 extension to bring data into Kendra, with the locale set to fr.

We are still having a hard time getting a working solution when there is no English content in Kendra.

Best

@Guillaume-Bourque-Levio
Author

@fhoueto-amz ,

I went into the Amazon console and created a Kendra index with no particular configuration. Then I added only the French locale, with French data, using the crawler v2.

From the QnA default web client I tried to get information that I can see in Kendra, but no luck. From the Amazon Kendra search, however, I can see all my French content.

If I add the French content again with the web crawler, but with the default content language set to en, then I can get answers from the QnA default web client.

The KENDRA_INDEXED_DOCUMENTS_LANGUAGES variable is either not used or does not affect how answers are given in our simple setup.

Should we let the QnA CloudFormation stack create the Kendra index? We created the Kendra configuration outside the CloudFormation stack, but as I said earlier, the solution works fine if we only have English content, so this is not a missing policy.

That's what I understand right now.

@Guillaume-Bourque-Levio
Author

One more note: in Canada the locale sent by the browser is fr_CA, not fr. Could that be the issue, since the Kendra source is fr and not fr_CA?
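
To make the question concrete: if a locale is compared verbatim against the indexed languages, fr_CA would not match fr. The thread does not confirm that this was the cause. The sketch below only illustrates the mismatch and one way to reduce a locale to its primary language part. Both function names are hypothetical and this is not QnABot code.

```python
from typing import List


def primary_language(locale: str) -> str:
    """Reduce a locale such as 'fr_CA' or 'fr-CA' to its language part, 'fr'."""
    return locale.replace("-", "_").split("_")[0].lower()


def locale_is_indexed(locale: str, indexed_languages: List[str]) -> bool:
    """Check whether the locale's primary language is among the indexed languages."""
    return primary_language(locale) in [code.lower() for code in indexed_languages]
```

A verbatim check such as `"fr_CA" in ["fr"]` is false, while `locale_is_indexed("fr_CA", ["fr"])` is true.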

@fhoueto-amz fhoueto-amz reopened this Apr 22, 2024
@fhoueto-amz
Member

@Guillaume-Bourque-Levio, currently you need to index in English for this to work, as you did. We are looking into the issue.

@fhoueto-amz
Member

Fixed in v6.0.0
