Check and Possibly Reduce Memory usage of Pre-built PDF #1280
Here is the method in the PDF-handling Python code that joins multiple PDFs into one and returns its bytes:
The largest file size of any
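The join method is not reproduced above, but its shape can be sketched. The following is a hypothetical stand-in (the real code presumably appends pages with a PDF library rather than raw bytes); it illustrates the memory pattern at issue: every input and the combined output are held in memory at once, so peak usage scales with the total size of the bundle.

```python
import io

def join_pdfs_in_memory(pdf_byte_chunks):
    """Hypothetical sketch of an in-memory join.

    Each input PDF arrives as bytes and the result is returned as
    bytes, so peak memory is roughly the sum of all inputs plus the
    output. (The real method would append pages via a PDF library;
    raw concatenation here just stands in for that step.)
    """
    buffer = io.BytesIO()
    for chunk in pdf_byte_chunks:
        buffer.write(chunk)  # stand-in for appending a PDF's pages
    return buffer.getvalue()

# A bundle of 1000s of PDFs keeps every byte resident until the
# function returns, which matches the memory jumps described below.
combined = join_pdfs_in_memory([b"%PDF-1", b"%PDF-2"])
```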
It's not clear whether the memory issues come from the Python or the Java processes; it could be either.
Here is a script that should be able to reproduce the memory issues caused by PDFs: https://gist.github.com/bengolder/ec437271f0ea5f3050b15ba3082ff983
After managing to SSH into the active web and worker dynos, I was able to determine that in a zero-traffic state the app uses ~250Mb of memory, made up of 4-8 worker processes of ~50Mb each. From the outside, we can see that when traffic causes SF to generate a bundled PDF (on stage), memory usage jumps by ~20Mb and does not come back down for a long time, and subsequent requests increase it by another ~20Mb. When testing on stage, it takes longer than 5-10 minutes to generate a PDF containing 1000s of generated PDFs. So it's plausible that 5-10 submissions from SF over the course of a day could cause these problems if the bundled PDFs are really big.

An easy mitigation is to reduce the number of Celery worker processes on a dyno to just 2, which should cut the baseline down to ~100Mb. If the problem keeps happening, we can look into other ways to reduce how much memory is left over and for how long.
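The "reduce to 2 workers" suggestion would be a one-line change to the worker invocation. A sketch, assuming a standard Celery worker command (the app module name `project` here is a placeholder, not the actual one):

```shell
# Cap Celery at 2 worker processes: ~2 x 50Mb = ~100Mb baseline,
# instead of the 4-8 x 50Mb (~250Mb) observed on the dynos.
celery -A project worker --concurrency=2
```

`--concurrency` defaults to the number of CPU cores, which would explain the 4-8 processes seen without an explicit setting.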
I was not able to build PDFs while SSHed into the dynos, though, because Heroku breaks PDF creation when in Java remote-debug mode.
Because Lambda runs one instance per task, this problem should go away once we are on Lambda.
This issue is the result of investigating #1275.
If San Francisco does not log in for long enough, their prebuilt PDF will keep growing in memory as new applications are added, and may become big enough to exceed our memory quota during construction of the concatenated PDF.
Assess the memory usage of a large prebuilt PDF, determine the ways we can reduce memory usage, and make changes as needed to prevent exceeding our memory quota.
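One way to reduce memory usage, sketched below under the assumption that the consumer can accept a file path or file object instead of a bytes value: stream each input to a temporary file in fixed-size chunks, so peak memory stays near the chunk size rather than the bundle size. (A real implementation would append pages via a PDF library that can write to a file object; raw byte copying stands in for that here.)

```python
import tempfile

def join_pdfs_to_tempfile(pdf_paths):
    """Hypothetical mitigation sketch: write the combined output to a
    named temp file instead of returning its bytes, copying inputs in
    64 KiB chunks so memory use stays roughly constant regardless of
    how large the bundled PDF grows."""
    out = tempfile.NamedTemporaryFile(suffix=".pdf", delete=False)
    with out:
        for path in pdf_paths:
            with open(path, "rb") as src:
                while True:
                    chunk = src.read(64 * 1024)  # 64 KiB at a time
                    if not chunk:
                        break
                    out.write(chunk)
    return out.name  # caller streams or uploads the file, then deletes it
```

This trades disk I/O for memory, which seems like the right trade on a dyno with a hard memory quota; it also avoids the lingering per-request ~20Mb growth, since nothing large stays referenced after the request.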