This repository contains the AWS CDK Infrastructure-as-code for an AWS AppSync web socket API to provide real-time functionality backed by AWS Bedrock. It is designed with serverless in mind and uses a variety of AWS services to provide a scalable Gen AI API.
A simple Vite and TypeScript based frontend is provided to demonstrate the functionality of the API.
The following demo shows the project in action with an example persona: an expert in viticulture that has been provided with documents on French wineries, so that it can reference specific wineries by name when asked questions.
Demo.mp4
- Model: A model is a Gen AI model that is used to process user messages.
- Persona: A persona is the identity of the chatbot and contains all information that is specific to a given chatbot, but most notably the system prompt and model that they use for processing user messages.
- Thread: A thread is a conversation between a user and the chatbot.
- Message: A message is a single message in a thread.
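As a rough illustration of how these four concepts relate, the following TypeScript sketch models them as plain types. The field names and the `modelForPersona` and `appendMessage` helpers are assumptions made for this example; they are not taken from the repository's actual schema.

```typescript
// Hypothetical shapes for the four concepts. Field names are illustrative assumptions.
interface Model {
  id: string; // identifier of the Gen AI model
  name: string;
}

interface Persona {
  id: string;
  name: string;
  systemPrompt: string; // instructions that define the chatbot's identity
  modelId: string; // the Model this persona uses to process user messages
}

type Sender = "USER" | "ASSISTANT";

interface Message {
  sender: Sender;
  text: string;
}

interface Thread {
  id: string;
  personaId: string;
  messages: Message[]; // ordered conversation history
}

// Look up the model a persona is configured to use.
function modelForPersona(persona: Persona, models: Model[]): Model | undefined {
  return models.find((model) => model.id === persona.modelId);
}

// Append a message to a thread without mutating the original thread.
function appendMessage(thread: Thread, message: Message): Thread {
  return { ...thread, messages: [...thread.messages, message] };
}
```

A thread therefore points at one persona, and the persona in turn decides which model and system prompt are used for every message in that thread.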
Frontend and Backend Infrastructure Build Tools:
- AWS CDK: AWS Cloud Development Kit (CDK) is an open-source software development framework to define cloud infrastructure in code and provision it through AWS CloudFormation.
- TypeScript: TypeScript is a language that builds on JavaScript by adding static type definitions.
- Node.js: Node.js is an open-source, cross-platform, back-end JavaScript runtime environment that runs on the V8 engine and executes JavaScript code outside a web browser.
- AWS Tools and Services:
- AWS AppSync: AWS AppSync is a managed serverless GraphQL service that simplifies application development by letting you create a flexible API to securely access, manipulate, and combine data from one or more data sources.
- AWS Bedrock: AWS Bedrock is a service that provides access to Gen AI models like Anthropic's Claude 2 or Jurassic-1.
- AWS DynamoDB: Amazon DynamoDB is a key-value and document database that delivers single-digit millisecond performance at any scale.
- AWS Lambda: AWS Lambda is a serverless computing service that lets you run code without provisioning or managing servers.
- AWS Cognito: Amazon Cognito lets you add user sign-up, sign-in, and access control to your web and mobile apps quickly and easily.
- AWS S3: Amazon Simple Storage Service (Amazon S3) is an object storage service that offers industry-leading scalability, data availability, security, and performance.
- AWS CloudFront: Amazon CloudFront is a fast content delivery network (CDN) service that securely delivers data, videos, applications, and APIs to customers globally with low latency, and high transfer speeds, all within a developer-friendly environment.
- AWS SQS: Amazon Simple Queue Service (SQS) is a fully managed message queuing service that enables you to decouple and scale microservices, distributed systems, and serverless applications.
- Third-Party Tools and Services:
- Pinecone: Pinecone is a vector database that allows you to store and query high-dimensional vectors.
- ElevenLabs API: ElevenLabs is a company that provides a variety of AI services, including speech-to-text and text-to-speech. Their API is used in this project to provide the text-to-speech functionality for the chatbot.
Frontend App Build Tools:
- Vite: Vite is a build tool that aims to provide a faster and leaner development experience for modern web projects.
- React: React is a JavaScript library for building user interfaces.
- TypeScript: TypeScript is a language that builds on JavaScript by adding static type definitions.
- Tailwind CSS: Tailwind CSS is a utility-first CSS framework packed with classes like flex, pt-4, text-center, and rotate-90 that can be composed to build any design, directly in your markup.
- AWS Amplify Library: The AWS Amplify library is a collection
Frontend:
Service | Description |
---|---|
CloudFront | Acts as a CDN for the static files of the front end and serves them from the S3 bucket. |
S3 | Stores the static files of the front end. |
Backend:
Service | Description |
---|---|
AppSync | Provides a GraphQL API to the frontend, using both mutations and queries, as well as subscriptions for real-time chat messages. |
Bedrock | AWS Bedrock is used to provide the AI capabilities of the chatbot. |
DynamoDB | Used to store the threads, messages, and the status of the Gen AI processing. |
SQS | Acts as a queue for the messages that are sent to the Bedrock invocation lambda. |
Lambda | Handles the invocation of AWS Bedrock and pre-processes the message before sending it to SQS. |
Cognito | Handles the authentication of the users and authorizes them to use the AppSync API. |
- A user sends a message to the AppSync API.
- If provided a threadId, an AppSync resolver will retrieve the thread from DynamoDB.
- The Queue trigger lambda will be invoked with the conversation history, the user's new prompt, and the threadId.
- The Queue trigger lambda will check the status of the thread in DynamoDB to see if it is currently being processed and only continue if it is not.
- The Queue trigger lambda will send the message to SQS after additional validation checks and update the status of the thread in DynamoDB to `PROCESSING`.
- The Bedrock invocation lambda will be invoked by SQS and will send the message to AWS Bedrock for processing.
- As the Bedrock invocation lambda receives chunks, it will store the completed result and send the chunks back to the user via an AppSync subscription.
- Once the Bedrock invocation lambda has received all the chunks, it will update the status of the thread in DynamoDB to `PROCESSED`, store the result, and send the status update to the user via an AppSync subscription to indicate that the last chunk has been sent.
- The user will receive the result of the Gen AI processing via an AppSync subscription.
- The user can then send another message to the AppSync API.
- The process repeats.
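The status handling in the steps above can be sketched as a small state machine. This is a minimal in-memory sketch, not the repository's implementation: the `threads` map stands in for DynamoDB, the `queue` array for SQS, the `onChunk` callback for the AppSync subscription, and the empty-prompt check is a hypothetical example of a validation step.

```typescript
type ThreadStatus = "PROCESSING" | "PROCESSED";

interface ThreadRecord {
  threadId: string;
  status: ThreadStatus;
}

interface QueuedMessage {
  threadId: string;
  prompt: string;
}

// In-memory stand-ins for DynamoDB and SQS (assumptions for illustration only).
const threads = new Map<string, ThreadRecord>();
const queue: QueuedMessage[] = [];

// Mirrors the queue trigger step: refuse if the thread is busy,
// otherwise enqueue the message and mark the thread as PROCESSING.
function enqueueMessage(threadId: string, prompt: string): boolean {
  const thread = threads.get(threadId);
  if (thread !== undefined && thread.status === "PROCESSING") return false;
  if (prompt.trim().length === 0) return false; // hypothetical validation check
  queue.push({ threadId, prompt });
  threads.set(threadId, { threadId, status: "PROCESSING" });
  return true;
}

// Mirrors the Bedrock invocation step: forward each chunk as it arrives,
// accumulate the full result, then mark the thread as PROCESSED.
function processNext(chunks: string[], onChunk: (chunk: string) => void): string | undefined {
  const job = queue.shift();
  if (job === undefined) return undefined;
  let result = "";
  for (const chunk of chunks) {
    result += chunk;
    onChunk(chunk);
  }
  threads.set(job.threadId, { threadId: job.threadId, status: "PROCESSED" });
  return result;
}
```

The key design point is that a thread accepts a new message only when it is not already `PROCESSING`, which keeps two model invocations from interleaving their chunks on the same thread.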
This section guides you through the process of setting up and deploying the infrastructure. The project is written in TypeScript and uses [AWS CDK](https://aws.amazon.com/cdk/) to deploy the infrastructure. Before we dive into deploying the frontend and backend, we will need to get the prerequisites out of the way.
Before you start, it is highly encouraged to review the AWS CDK Getting Started Guide to get a better understanding of the AWS CDK and how it works.
- Pre-requisites:
- Node.js version 20 or later.
- AWS CLI version 2 or later.
- AWS CDK version 2 or later.
- Pinecone account (optional).
- ElevenLabs account (optional).
- Configure your AWS CLI. If you have not already done so, you can configure your AWS CLI by running the following command and following the prompts. For more information, see Configuring the AWS CLI.
aws configure
- Clone the repository and navigate to the infrastructure directory.
git clone git@github.com:Kyle-L/Bedrock-AppSync-API.git
cd Bedrock-AppSync-API/infrastructure
- Install the dependencies.
npm install
- Install the AWS CDK.
npm install -g aws-cdk
- Bootstrap the AWS CDK environment.
cdk bootstrap
- (Optional) If you would like to speed up some of the steps, a `setup.sh` script is provided to help you set up the infrastructure. This will install all Node dependencies and copy `.env.example` to `.env` for you. You will still need to update the `.env` file with your appropriate values.
./setup.sh
- Congrats! You are now ready to deploy the infrastructure. See [Backend](#backend) and [Frontend](#frontend) for more information on deploying the frontend and the backend of the project.
If you are not familiar with AWS Bedrock, it is a service that grants access to Gen AI models like Anthropic's Claude 2 or Jurassic-1. It is used in this project to provide the Gen AI capabilities of the chatbot. While it is an AWS service that is largely set up by the infrastructure, you will need to have access to the service to use it.
- Go to the AWS Bedrock console at https://console.aws.amazon.com/bedrock/.
- Go to "Model Access" and request access to the model that you would like to use.
  - The default models used in this project are Anthropic's Claude 1 Instant, Anthropic's Claude 2.1, and Anthropic's Claude 3 Sonnet.
  - If you would like to use a different model, you can request access to it by clicking on "Request Access" and then update the model tuning file at `infrastructure/src/lib/utils/ai/model-tuning.ts` to include the new model.
- Wait and check your email for the approval of your request. Approval should take only a few minutes, but I have had a few take several hours. Once approved, you are ready to deploy the infrastructure.
If you are not familiar with Pinecone, Pinecone is a vector database that allows you to store and query high-dimensional vectors. It is used in this project to store the embeddings of the messages and to query for similar messages. For this project, we are using it as the vector database that backs a Bedrock Knowledge Base to allow our chatbot to provide relevant responses to the user based on data from an S3 bucket.
- Create a Pinecone account at https://www.pinecone.io/.
- Create a new Pinecone index.
  - There are additional configurations that you must provide when creating a Pinecone index:
    - Name – The name of the vector index. Choose any valid name of your choice. Later, when you create your knowledge base, enter the name you choose in the Vector index name field.
    - Dimensions – The number of dimensions in the vector. Choose `1536`. This is what the Knowledge Base uses.
    - Distance metric – The metric used to measure the similarity between vectors. While the Knowledge Base supports multiple distance metrics, choose `cosine` for this example. You can experiment with other distance metrics later.
- Get the Pinecone API key and the Pinecone index name.
  - You can find the API key and the index name in the Pinecone console.
- Create a new AWS Secrets Manager secret for Pinecone. Save the ARN of the secret for later.
aws secretsmanager create-secret --name <MY_SECRET_NAME> --secret-string '{"apiKey":"<MY_API_KEY>"}'
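For readers unfamiliar with the `cosine` distance metric chosen for the index, the following sketch shows how cosine similarity between two embedding vectors is computed. It is an illustration of the metric only; `cosineSimilarity` is a hypothetical helper, and Pinecone performs this computation on its side when the index is queried.

```typescript
// Cosine similarity: the dot product of two vectors divided by the product of their lengths.
// 1 means same direction, 0 means orthogonal, -1 means opposite direction.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same number of dimensions");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    throw new Error("Cosine similarity is undefined for zero vectors");
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

Because the metric depends only on direction and not on magnitude, two embeddings that point the same way are treated as similar regardless of their length.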
If you are not familiar with ElevenLabs, ElevenLabs is a company that provides a variety of AI services, including speech-to-text and text-to-speech. It is used in this project to provide the text-to-speech functionality for the chatbot.
- Create an ElevenLabs account at https://www.elevenlabs.io/.
- Get the ElevenLabs API key.
  - You can find the API key in ElevenLabs by clicking on your profile picture in the upper right corner and then selecting Profile.
  - Documentation for the ElevenLabs API Query can be found here.
- Create a new AWS Secrets Manager secret for ElevenLabs. Save the ARN of the secret for later.
aws secretsmanager create-secret --name <MY_SECRET_NAME> --secret-string '{"apiKey":"<MY_API_KEY>"}'
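Both secrets created above store a JSON string of the form `{"apiKey":"<MY_API_KEY>"}`. The following is a minimal sketch of how code that receives the secret string might validate it before use. `parseApiKeySecret` is a hypothetical helper and not a function from the repository; fetching the secret from Secrets Manager is deliberately left out.

```typescript
interface ApiKeySecret {
  apiKey: string;
}

// Parse and validate a secret string of the form {"apiKey":"..."}.
function parseApiKeySecret(secretString: string): ApiKeySecret {
  const parsed: unknown = JSON.parse(secretString);
  if (typeof parsed !== "object" || parsed === null) {
    throw new Error("Secret must be a JSON object");
  }
  const apiKey = (parsed as Record<string, unknown>).apiKey;
  if (typeof apiKey !== "string" || apiKey.length === 0) {
    throw new Error('Secret must contain a non-empty "apiKey" string');
  }
  return { apiKey };
}
```

Validating the shape up front turns a misspelled key name in the `create-secret` command into an immediate, readable error instead of a failed API call later on.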
This section guides you through the process of deploying a custom domain for related frontend and backend infrastructure.
Note: We are doing this outside of CDK because DNS validation is required. If you are managing your domain outside of AWS, this can be a hassle: CloudFormation will wait for DNS validation to complete before deploying the stack, which can take a while depending on your DNS provider and the TTL of your DNS records.
- Choose a custom domain for the backend and the frontend. For example, `api.example.com` for the backend and `app.example.com` for the frontend.
  - It is important to know whether the custom domains for the backend and the frontend are entirely different domains or subdomains of the same domain, as this determines how many certificates you need.
- Create an ACM certificate for the custom domain(s).
  - You can create an ACM certificate by navigating to the ACM console, clicking on `Request a certificate`, and following the prompts.
    - If one or both of the custom domains are subdomains of the same domain, you can create a single certificate with multiple domain names, but you will need to specify both domain names when creating the certificate or create a wildcard certificate.
      - Example: If I chose `api.example.com` and `app.example.com` as my custom domains, I can create a single certificate that covers both by using `example.com` as the domain name with `*.example.com` as a subject alternative name (SAN).
    - If the frontend and backend are entirely different domains, you will need to create two separate certificates.
      - Example: If I chose `api.example.com` and `app.example.io` as my custom domains, I will need to create two separate certificates, one for `api.example.com` and one for `app.example.io`.
- Update your DNS records to validate the ACM certificate.
  - You can validate the certificate by navigating to the ACM console and clicking on the certificate that you created. You will then need to click on `Create record in Route 53` or `Create record in DNS` and follow the prompts.
  - If you are using a DNS provider outside of AWS, you will need to create the DNS records manually. You can find the DNS records that you need to create by clicking on the certificate and navigating to the `Domain` section of the certificate.
- Complete [API Deployment](#api-deployment) and [Frontend Deployment](#frontend-deployment) before continuing.
- Create a new DNS record for the AppSync API.
  - Once you have deployed the backend, navigate to the AppSync console and click on the API that you created. Then click on the `Settings` tab and copy the value of the `API URL` field.
  - Create a new DNS record for the AppSync API. The type of the DNS record will be `CNAME` and the value will be the `API URL` of the AppSync API.
- Create a new DNS record for the CloudFront distribution.
  - Once you have deployed the frontend, navigate to the CloudFront console and click on the distribution that you created. Then copy the value of the `Domain Name` field.
  - Create a new DNS record for the CloudFront distribution. The type of the DNS record will be `CNAME` and the value will be the `Domain Name` of the CloudFront distribution.
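The certificate step above distinguishes between custom domains that share a parent domain and custom domains that do not. The following sketch expresses that decision in code. It is an illustration only: `parentDomain` and `planCertificates` are hypothetical helpers, and taking the last two labels as the parent domain is a simplifying assumption that does not hold for multi-part suffixes such as `co.uk`.

```typescript
interface CertificateRequest {
  domainName: string;
  subjectAlternativeNames: string[];
}

// Naive parent domain: the last two labels of the host name (simplifying assumption).
function parentDomain(domain: string): string {
  return domain.toLowerCase().split(".").slice(-2).join(".");
}

// Decide which certificates to request for the backend and frontend domains.
function planCertificates(backendDomain: string, frontendDomain: string): CertificateRequest[] {
  const backendParent = parentDomain(backendDomain);
  const frontendParent = parentDomain(frontendDomain);
  if (backendParent === frontendParent) {
    // Same parent domain: one certificate with a wildcard SAN covers both.
    return [{ domainName: backendParent, subjectAlternativeNames: [`*.${backendParent}`] }];
  }
  // Entirely different domains: one certificate per domain.
  return [
    { domainName: backendDomain, subjectAlternativeNames: [] },
    { domainName: frontendDomain, subjectAlternativeNames: [] },
  ];
}
```

For instance, `planCertificates("api.example.com", "app.example.com")` yields a single request for `example.com` with `*.example.com` as a SAN, while `planCertificates("api.example.com", "app.example.io")` yields two separate requests.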
This section guides you through the process of deploying the backend infrastructure. This includes our AppSync API, DynamoDB tables, Lambda functions, and more.
- Change into the `infrastructure` directory.
cd Bedrock-AppSync-API/infrastructure
- Install the dependencies.
npm install
- Copy `.env.example` to `.env` and update the values with your appropriate values. The following are the variables that you will need to update:
Variable | Description | Optional/Required |
---|---|---|
`FRONTEND_ACM_CERTIFICATE_ARN` | The Amazon Resource Name (ARN) of the AWS Certificate Manager (ACM) certificate for the frontend's custom domain. | Optional |
`FRONTEND_DOMAIN` | The custom domain name for the frontend. | Optional |
`BACKEND_API_ACM_CERTIFICATE_ARN` | The Amazon Resource Name (ARN) of the AWS Certificate Manager (ACM) certificate for the backend's custom domain. | Optional |
`BACKEND_API_DOMAIN` | The custom domain name for the backend. | Optional |
`config.backend.pinecone.connectionString` | The connection string for the Pinecone service. | Optional |
`PINECONE_CONNECTION_STRING` | The connection string for the Pinecone service. | Optional |
`PINECONE_SECRET_ARN` | The Amazon Resource Name (ARN) of the AWS Secrets Manager secret for Pinecone. | Optional |
`SPEECH_SECRET_ARN` | The Amazon Resource Name (ARN) of the AWS Secrets Manager secret for ElevenLabs. | Optional |
- Note: While you can deploy everything at once, it is highly encouraged to update `customDomain`, `pinecone`, and `speechSecretArn` after the initial deployment of the backend infrastructure. This is to minimize deployment time should issues arise with the deployment.
- Deploy the backend infrastructure.
cdk deploy GenAI/Backend
- Congrats! You have now deployed the backend infrastructure. See [Frontend](#frontend) for more information on deploying the frontend.
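The custom domain variables in the table above come in pairs: a domain name and the ARN of its ACM certificate. The sketch below shows one way such a pair could be read from the environment. It assumes that a domain and its certificate ARN must be provided together, which is this example's assumption rather than documented behaviour of the stack, and `readCustomDomain` is a hypothetical helper.

```typescript
interface CustomDomain {
  domain: string;
  certificateArn: string;
}

// Read an optional domain/certificate pair from an environment-like object.
// Returns undefined when neither value is set; throws when only one is set.
function readCustomDomain(
  env: Record<string, string | undefined>,
  domainKey: string,
  certificateKey: string
): CustomDomain | undefined {
  const domain = env[domainKey];
  const certificateArn = env[certificateKey];
  if (!domain && !certificateArn) return undefined; // custom domain not configured
  if (!domain || !certificateArn) {
    throw new Error(`${domainKey} and ${certificateKey} must be set together`);
  }
  return { domain, certificateArn };
}
```

In a Node.js context this could be called as `readCustomDomain(process.env, "BACKEND_API_DOMAIN", "BACKEND_API_ACM_CERTIFICATE_ARN")`, which makes a half-configured custom domain fail fast before any deployment starts.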
This section guides you through deploying the front-end infrastructure and the corresponding Vite app. This includes our S3 bucket and CloudFront distribution.
- Change into the `infrastructure` directory.
cd Bedrock-AppSync-API/infrastructure
- Install the dependencies.
npm install
- Copy `.env.example` to `.env` and update the values with your appropriate values. The following are the variables that you will need to update:
Variable | Description | Optional/Required |
---|---|---|
`VITE_COGNITO_USER_POOL_ID` | The user pool ID of the Cognito user pool. | Required |
`VITE_COGNITO_USER_POOL_CLIENT_ID` | The user pool client ID of the Cognito user pool. | Required |
`VITE_API_URL` | The URL of the AppSync API. | Required |
- Deploy the frontend infrastructure.
cdk deploy GenAI/Frontend
- Congrats! You have now deployed the frontend infrastructure. You can now access the frontend by navigating to the CloudFront distribution URL.
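Since all three frontend variables are required, a quick check before building can catch a missing value early. The following is a minimal sketch of such a check; `missingFrontendVars` is a hypothetical helper and is not part of the repository.

```typescript
// The required variables listed in the table above.
const REQUIRED_FRONTEND_VARS = [
  "VITE_COGNITO_USER_POOL_ID",
  "VITE_COGNITO_USER_POOL_CLIENT_ID",
  "VITE_API_URL",
] as const;

// Return the names of required variables that are unset or empty.
function missingFrontendVars(env: Record<string, string | undefined>): string[] {
  return REQUIRED_FRONTEND_VARS.filter((name) => !env[name]);
}
```

An empty array means every required variable is present; otherwise the returned names tell you exactly which entries in `.env` still need a value.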
- Change into the `frontend` directory.
cd Bedrock-AppSync-API/frontend
- Install the dependencies.
npm install
- Build the Vite app.
npm run build
- Deploy the Vite app to S3.
aws s3 sync dist/ s3://<YOUR_BUCKET_NAME>/ --delete