Add scalability instructions to readme #450
Looks good to me!
@@ -117,6 +117,17 @@ const getUserInfoList = async () => {
## Common Customization Scenarios
Feel free to fork this repository and make your own modifications to the UX or backend logic. For example, you may want to change aspects of the chat display, or expose some of the settings in `app.py` in the UI for users to try out different behaviors.

### Scalability
For apps published with `az webapp up` or from the Azure AI Studio, you can increase your app's ability to handle concurrent requests from multiple users with the following steps:
FYI, you could also add a `gunicorn.conf.py` to the app source itself to dynamically compute the optimal number of workers. It's what we often do for our App Service apps.
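A minimal sketch of what such a `gunicorn.conf.py` might look like, assuming the (2 * cores) + 1 rule of thumb mentioned later in this thread. This is not the commenter's original snippet; the helper name `recommended_workers` and the choice of one thread per worker are illustrative assumptions.

```python
# gunicorn.conf.py (illustrative sketch, not the commenter's original file)
import multiprocessing


def recommended_workers(num_cores: int) -> int:
    """Hypothetical helper: the (2 * cores) + 1 rule of thumb for worker count."""
    return (2 * num_cores) + 1


# Gunicorn reads module-level names in its config file as settings.
workers = recommended_workers(multiprocessing.cpu_count())

# Assumption: one thread per worker; raise this for I/O-bound workloads.
threads = 1
```

Because the worker count is computed from the machine's core count at startup, the same file should adapt when the App Service Plan is scaled to a tier with more cores, without editing any app settings.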
Thank you, I will look into this!
2. Configure the following app settings on your App Service in the Azure Portal:
- `PYTHON_ENABLE_GUNICORN_MULTIWORKERS`: true
- `PYTHON_GUNICORN_CUSTOM_WORKER_NUM`: 5 (may be higher or lower depending on your App Service Plan tier)
- `PYTHON_GUNICORN_CUSTOM_THREAD_NUM`: 5 (may be higher or lower depending on your App Service Plan tier)
I checked the Oryx repo and found this pull request, microsoft/Oryx#1707, with a great explanation of how these parameters are processed.
So maybe you could add a comment that `PYTHON_GUNICORN_CUSTOM_WORKER_NUM` and `PYTHON_GUNICORN_CUSTOM_THREAD_NUM` are NOT required. If someone is unsure what the correct setting is for their underlying App Service Plan, they should not add these settings. The web app will then use the gunicorn recommendation, (2 * numCores) + 1, for the number of workers and threads.
Good point! I discussed that in this PR where I made this setting default for the azd appservice Bicep files: Azure/azure-dev#2571
(But then I forgot that Oryx did the right thing)
So yeah, if you're not already overriding the gunicorn command, then the best practice is to just set `PYTHON_ENABLE_GUNICORN_MULTIWORKERS` to true, with no other customization.
From the Oryx PR, it looks like the default thread count will be 1 if not set, but workers will default to (2 * numCores) + 1. I'll make a follow-up PR to add more detail to the suggestions here. Thanks, folks!
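To make the defaults discussed above concrete, here is a small Python model of how the three app settings might resolve. It is only a sketch of the behavior as described in this thread, not Oryx's actual code; the function name `resolve_gunicorn_settings` and the returned dictionary shape are hypothetical.

```python
from typing import Mapping, Optional


def resolve_gunicorn_settings(env: Mapping[str, str], num_cores: int) -> Optional[dict]:
    """Model the defaults described in this thread (not Oryx's implementation)."""
    # Assumption: multi-worker mode is opt-in via this setting.
    if env.get("PYTHON_ENABLE_GUNICORN_MULTIWORKERS", "").lower() != "true":
        return None

    # Workers fall back to the (2 * cores) + 1 rule of thumb when not overridden.
    workers = int(env.get("PYTHON_GUNICORN_CUSTOM_WORKER_NUM") or (2 * num_cores) + 1)
    # Threads fall back to 1 when not overridden.
    threads = int(env.get("PYTHON_GUNICORN_CUSTOM_THREAD_NUM") or 1)
    return {"workers": workers, "threads": threads}


# Only the enable flag set, on a 2-core plan:
print(resolve_gunicorn_settings({"PYTHON_ENABLE_GUNICORN_MULTIWORKERS": "true"}, 2))
# prints {'workers': 5, 'threads': 1}
```

Under this model, setting only `PYTHON_ENABLE_GUNICORN_MULTIWORKERS` scales the worker count with the plan's cores, while the two custom settings pin fixed values regardless of tier.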
Follow up #451
Add scalability instructions to readme (microsoft#450)
Add instructions for additional app settings for multi-worker and multi-thread processing.