Query any structured data with Natural Language Understanding using Amazon Q Business. In this example, we look at an architecture for querying structured data with Amazon Q Business, and build an application that queries cost and usage data in Amazon Athena. Amazon Q Business can be used to create SQL queries against your data sources when it is provided with the database schema, additional metadata describing the columns and tables, and prompting instructions. This architecture can be extended with additional data sources, query validation, and prompting techniques to cover a wider range of use cases.
The workflow includes the following steps:
- First, the user accesses the chatbot application, which is hosted behind an Application Load Balancer.
- The user is prompted to log in with Amazon Cognito.
- The application exchanges the token from Cognito for an IAM Identity Center token with the scope for Amazon Q Business.
- The application assumes an IAM role and retrieves an AWS session from AWS Security Token Service (AWS STS), augmented with the IAM Identity Center token, to interact with Amazon Q Business.
- The application calls the chat_sync API of Amazon Q Business with the relevant prompt and metadata based on the natural language query. Amazon Q Business responds with the Athena query to run.
- The query is run against Athena and the results are displayed in the web app.
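The token exchange and role assumption in steps 3 and 4 can be sketched roughly as follows. This is a minimal illustration, not the repository's code: it assumes `oidc` and `sts` are boto3 clients for `sso-oidc` and `sts`, that the Identity Center token carries an `sts:identity_context` claim, and the function names and the session name are hypothetical.

```python
import base64
import json

JWT_BEARER_GRANT = "urn:ietf:params:oauth:grant-type:jwt-bearer"
IDC_CONTEXT_PROVIDER = "arn:aws:iam::aws:contextProvider/IdentityCenter"


def jwt_claims(token: str) -> dict:
    """Decode the payload segment of a JWT without verifying its signature."""
    payload = token.split(".")[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(payload))


def credentials_for_user(oidc, sts, cognito_id_token, idc_app_arn, role_arn):
    """Swap the Cognito token for an Identity Center token, then assume the
    IAM role with the user's identity context attached."""
    exchanged = oidc.create_token_with_iam(
        clientId=idc_app_arn,
        grantType=JWT_BEARER_GRANT,
        assertion=cognito_id_token,
    )
    identity_context = jwt_claims(exchanged["idToken"])["sts:identity_context"]
    assumed = sts.assume_role(
        RoleArn=role_arn,
        RoleSessionName="qbusiness-chat",  # hypothetical session name
        ProvidedContexts=[
            {
                "ProviderArn": IDC_CONTEXT_PROVIDER,
                "ContextAssertion": identity_context,
            }
        ],
    )
    # Temporary credentials: AccessKeyId, SecretAccessKey, SessionToken
    return assumed["Credentials"]
```

The returned credentials can then be passed to `boto3.Session(...)` to create the `qbusiness` client, so that calls to Amazon Q Business are made on behalf of the signed-in user.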
- AWS IAM Identity Center is set up, with the users that you intend to give access to your Amazon Q Business application.
- An existing, working Amazon Q Business application, with access granted to the users created in the previous step.
- CUR (AWS Cost and Usage Report) data is available in Athena. If you already have CUR data, you can skip the CUR data setup steps below. If not, you have a few options to set it up:
  - To set up sample CUR data, go to this lab and follow the instructions.
  - You'll also need to set up an AWS Glue crawler to make the data available in Athena.
- If you already have an SSL certificate, you can skip this step; otherwise, generate a private certificate.
- Import the certificate into AWS Certificate Manager (ACM). For more details, refer to Importing a certificate.
git clone https://github.com/aws-samples/data-insights-with-amazon-q-business.git
- Review the table name in app/schemas/cur_schema.txt. It should match the table name you created in the CUR data setup steps. By default, the table name is customer_all. You can also modify the schema and table name to match your data.
- Also review the prompts in app/qb_config.py. You can modify the prompts later based on your test results. For the demo, zip up the code repository and upload it to an S3 bucket.
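Before zipping the repository, a quick check that the schema file actually mentions your table name can catch a mismatch early. The sketch below makes no assumption about the layout of cur_schema.txt beyond it being a text file; it only looks for the table name as a whole word, and the helper names are hypothetical.

```python
import re
from pathlib import Path


def schema_mentions_table(schema_text: str, table_name: str) -> bool:
    """Return True if table_name appears as a whole identifier in the text."""
    pattern = (
        r"(?<![A-Za-z0-9_])" + re.escape(table_name) + r"(?![A-Za-z0-9_])"
    )
    return re.search(pattern, schema_text, flags=re.IGNORECASE) is not None


def check_schema_file(path: str, table_name: str = "customer_all") -> bool:
    """Read the schema file and check it against the expected table name."""
    text = Path(path).read_text(encoding="utf-8")
    return schema_mentions_table(text, table_name)
```

For example, `check_schema_file("app/schemas/cur_schema.txt")` checks the default table name, and passing a second argument checks a custom one.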
Step 2: Launch the AWS CloudFormation template to deploy the Application Load Balancer, the Cognito user pool, and the EC2 instance that hosts the web app.
Provide the following parameters for the stack:
- Stack name: The name of the CloudFormation stack (for example, AmazonQ-Data-Insights-Demo)
- AthenaDbName: The Athena database name where the CUR table resides
- AthenaS3Loc: The S3 location for Athena output
- AuthName: A globally unique name to assign to the Amazon Cognito user pool
- CertificateARN: The certificate ARN generated in the previous step
- IdcApplicationArn: The ARN of the Identity Center customer managed application. Leave it blank on the first run, because the Cognito user pool must be created by this stack before you can create the IAM Identity Center application with a trusted token issuer
- PublicSubnetIds: The IDs of the public subnets that can be used to deploy the EC2 instance and the Application Load Balancer. Use at least two
- QApplicationId: The existing application ID of Amazon Q Business
- S3CodeLoc: The full S3 location of the code zip file
- VPCId: The ID of the existing VPC that can be used to deploy the demo
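If you launch the stack from a script rather than the console, the parameters above map onto the `Parameters` list that the boto3 CloudFormation client expects. The sketch below only builds that list. Every value is a placeholder, and joining PublicSubnetIds with commas assumes the template declares it as a list-type parameter.

```python
def to_cfn_parameters(values: dict) -> list:
    """Convert a plain dict into CloudFormation's ParameterKey/Value list."""
    params = []
    for key, value in values.items():
        if isinstance(value, (list, tuple)):
            value = ",".join(value)  # list parameters are comma-delimited
        params.append({"ParameterKey": key, "ParameterValue": str(value)})
    return params


# All values below are placeholders; replace them with your own.
stack_parameters = to_cfn_parameters(
    {
        "AthenaDbName": "cur_db",
        "AthenaS3Loc": "s3://example-bucket/athena-output/",
        "AuthName": "example-unique-auth-name",
        "CertificateARN": "arn:aws:acm:us-east-1:111122223333:certificate/example",
        "IdcApplicationArn": "",  # blank on the first run
        "PublicSubnetIds": ["subnet-aaaa1111", "subnet-bbbb2222"],
        "QApplicationId": "example-q-application-id",
        "S3CodeLoc": "s3://example-bucket/code.zip",
        "VPCId": "vpc-cccc3333",
    }
)
```

The resulting list can be passed as `Parameters=stack_parameters` to `create_stack` on a boto3 `cloudformation` client, together with the stack name and the template.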
Audience: The audience used to set up the customer managed application in Identity Center
RoleArn: The ARN of the IAM role required to set up token exchange in Identity Center
TrustedIssuerUrl: The endpoint of the trusted issuer used to set up Identity Center
URL: The load balancer URL used to access the Streamlit app
Next steps: To proceed, follow steps 2-6 listed in this solution to deploy the web app.
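The stack outputs listed above can also be read programmatically once the stack has finished deploying. This is a small sketch that assumes `cloudformation` is a boto3 CloudFormation client and that the output keys are named exactly Audience, RoleArn, TrustedIssuerUrl, and URL; the helper names are hypothetical.

```python
EXPECTED_OUTPUTS = ("Audience", "RoleArn", "TrustedIssuerUrl", "URL")


def stack_outputs(cloudformation, stack_name: str) -> dict:
    """Return the stack's outputs as a {key: value} dict."""
    stacks = cloudformation.describe_stacks(StackName=stack_name)["Stacks"]
    outputs = stacks[0].get("Outputs", [])
    return {item["OutputKey"]: item["OutputValue"] for item in outputs}


def missing_outputs(outputs: dict) -> list:
    """List the expected output keys that the stack did not return."""
    return [key for key in EXPECTED_OUTPUTS if key not in outputs]
```

The Audience, RoleArn, and TrustedIssuerUrl values are the ones needed when configuring the trusted token issuer and the application in Identity Center.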
You can now log in to the app using your credentials.
The end-to-end workflow has five major steps:
- User Intent - Natural Language Query
- Prompt Builder - Open domain prompt for Q Business along with table schema and metadata info
- Amazon Q Business generates the query
- Query is run against Athena
- Results are displayed on the webapp.
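The five steps above can be sketched as a single function. This is an illustration under stated assumptions rather than the application's actual code: `qbusiness` and `athena` are boto3 clients, the prompt layout is invented for the example, and the reply is assumed to contain the SQL either in a fenced block or as the whole message. All function names are hypothetical.

```python
import re
import time

FENCE = "`" * 3  # markdown code fence, built this way to keep the sample readable


def build_prompt(question: str, schema: str, instructions: str) -> str:
    """Combine prompting instructions, table schema, and the user's question."""
    return (
        f"{instructions}\n\n"
        f"Table schema and metadata:\n{schema}\n\n"
        f"Question: {question}"
    )


def extract_sql(answer: str) -> str:
    """Take the SQL from a fenced block if present, else the whole reply."""
    pattern = FENCE + r"(?:sql)?\s*(.*?)" + FENCE
    match = re.search(pattern, answer, flags=re.DOTALL | re.IGNORECASE)
    return (match.group(1) if match else answer).strip().rstrip(";")


def run_query(athena, sql, database, output_location, poll_seconds=1.0):
    """Start an Athena query, wait for it to finish, return the first page."""
    query_id = athena.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )["QueryExecutionId"]
    while True:
        execution = athena.get_query_execution(QueryExecutionId=query_id)
        status = execution["QueryExecution"]["Status"]
        if status["State"] in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(poll_seconds)
    if status["State"] != "SUCCEEDED":
        raise RuntimeError(status.get("StateChangeReason", status["State"]))
    # For SELECT statements the first row holds the column headers.
    rows = athena.get_query_results(QueryExecutionId=query_id)["ResultSet"]["Rows"]
    return [[cell.get("VarCharValue") for cell in row["Data"]] for row in rows]


def answer_question(qbusiness, athena, app_id, question, schema,
                    instructions, database, output_location, poll_seconds=1.0):
    """Natural language question in, generated SQL and result rows out."""
    reply = qbusiness.chat_sync(
        applicationId=app_id,
        userMessage=build_prompt(question, schema, instructions),
    )
    sql = extract_sql(reply["systemMessage"])
    rows = run_query(athena, sql, database, output_location, poll_seconds)
    return sql, rows
```

Returning the generated SQL alongside the rows makes it easy to show the query in the web app, which helps when tuning the prompts. A production version would likely validate the SQL before running it and page through larger result sets.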
Note that the sample data set is from 2023. Natural language queries that refer to the current year will not return results.
- what were the top 3 services by spend last year
- Total spend for ES for each month of 1st quarter of last year
- Give me a list of the top 3 products by total spend last year. For each of these products, what percentage of the overall spend is from this product?
- what all sagemaker instance types i used last year and what was their cost
Delete the CloudFormation stack, the Amazon Q Business application, and the Athena tables.
See CONTRIBUTING for more information.
This library is licensed under the MIT-0 License. See the LICENSE file.