title | description | ms.date | ms.topic | ms.custom |
---|---|---|---|---|
Scale Azure OpenAI for Java chat sample using RAG | Learn how to add load balancing to your Java solution to extend the chat app beyond the Azure OpenAI token and model quota limits. | 05/13/2024 | get-started | devx-track-java, devx-track-java-ai, devx-track-extended-java, build-2024-intelligent-apps |
[!INCLUDE aca-load-balancer-intro]
- Azure subscription. Create one for free
- Access granted to Azure OpenAI in the desired Azure subscription.
  Currently, access to this service is granted only by application. You can apply for access to Azure OpenAI by completing the form at https://aka.ms/oai/access.
- Dev containers are available for both samples, with all dependencies required to complete this article. You can run the dev containers in GitHub Codespaces (in a browser) or locally using Visual Studio Code.
- GitHub account
- Docker Desktop. Start Docker Desktop if it's not already running.
- Visual Studio Code
- Dev Container Extension
[!INCLUDE scaling-load-balancer-aca-procedure.md]
[!INCLUDE py-deployment-procedure]
[!INCLUDE logs]
[!INCLUDE capacity.md]
[!INCLUDE py-aca-cleanup]
Samples used in this article include:
- Use Azure Load Testing to load test your chat app with