Dynamically set heap and direct memory limits. #49

Merged
merged 2 commits into from
Nov 21, 2019

@aws-patlin aws-patlin commented Nov 13, 2019

Description of changes:
The MaxDirectMemorySize parameter limits the amount of direct memory that the JVM can use. This directly limits the size of the inference requests we can make: an OutOfDirectMemoryError is raised as soon as the 10MB limit is reached. The limit was temporarily raised to 1GB during testing.

After raising the direct memory limit, the Xmx parameter, which limits the max heap size, began causing OutOfMemoryErrors when testing batch transform jobs. After raising that limit as well, the errors were resolved.

The Xmx and MaxDirectMemorySize vmargs are now set to (number of workers * max payload size * a buffering factor of 1.2) + a base amount of 128MB, which is theoretically sufficient for the server under full load (all workers occupied). MMS does not yet parse environment variables for vmargs, so the current workaround is to write the values to the config.properties file before model startup.
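
As a rough illustration of the sizing rule described above, the calculation could be sketched as follows. This is not the code from this PR: the function names `memory_limit_mb` and `vmargs_line` are hypothetical, and using the same value for both Xmx and MaxDirectMemorySize is an assumption based on the description. The remaining JVM flags are copied from the existing vmargs line.

```python
import math

BUFFER_FACTOR = 1.2  # buffering factor quoted in the description
BASE_MB = 128        # base amount quoted in the description


def memory_limit_mb(num_workers, max_payload_mb,
                    buffer_factor=BUFFER_FACTOR, base_mb=BASE_MB):
    """Hypothetical helper: workers * payload * buffer + base, in whole MB."""
    # Round the variable part up so the limit never undershoots the formula.
    return int(math.ceil(num_workers * max_payload_mb * buffer_factor)) + base_mb


def vmargs_line(limit_mb):
    """Hypothetical helper: build a config.properties vmargs line.

    Assumes the same limit is applied to the heap and to direct memory.
    """
    return (
        "vmargs=-Xmx{limit}m -XX:-UseLargePages -XX:+UseG1GC "
        "-XX:MaxMetaspaceSize=32M -XX:MaxDirectMemorySize={limit}m "
        "-XX:+ExitOnOutOfMemoryError"
    ).format(limit=limit_mb)
```

For example, 4 workers with a 20MB payload cap would give 4 * 20 * 1.2 + 128 = 224MB.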

Changes were tested by running container tests and batch transform jobs with small (<5MB) and larger payloads (up to 20MB).

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@@ -1,4 +1,4 @@
-vmargs=-Xmx128m -XX:-UseLargePages -XX:+UseG1GC -XX:MaxMetaspaceSize=32M -XX:MaxDirectMemorySize=10m -XX:+ExitOnOutOfMemoryError
+vmargs=-Xmx128m -XX:-UseLargePages -XX:+UseG1GC -XX:MaxMetaspaceSize=32M -XX:MaxDirectMemorySize=1G -XX:+ExitOnOutOfMemoryError
Contributor

Have you done any measurement? Why 1G and, say, not 100M?

cbalioglu previously approved these changes Nov 13, 2019
ericangelokim previously approved these changes Nov 13, 2019

@ericangelokim

Please leave a comment explaining the TODO that needs follow-up.

@aws-patlin aws-patlin force-pushed the inference branch 2 times, most recently from 58b544a to d57d2e6 Compare November 13, 2019 23:31
iyerr3 commented Nov 13, 2019

Maybe this was discussed offline: is the parameter mandatory? Can MMS provide the functionality of detecting the right value per available memory?

@aws-patlin aws-patlin changed the title Updated MaxDirectMemorySize from 10MB to 1GB. Increased heap and direct memory limits to 1GB. Nov 14, 2019
@aws-patlin

> Maybe this was discussed offline: is the parameter mandatory? Can MMS provide the functionality of detecting the right value per available memory?

I think it becomes harder to control the memory behavior once we start hosting multiple containers on an endpoint, so it may be better to have a well defined limit. It also seems like removing the MaxDirectMemorySize parameter falls back to some other value, as I was seeing the same error around 100MB (could be tied to the Xmx parameter, which we also needed to increase).

aws-patlin commented Nov 14, 2019

To confirm the default behavior when Xmx and MaxDirectMemorySize are not set:

  • Max heap size defaults to ¼ of the physical memory, up to 1GB if the server is 32-bit, or 32GB if the server is 64-bit. Ref
  • Max direct memory defaults to the value of the max heap size. Ref

In this case, it makes sense for us to be setting the limits to higher values depending on the instance type instead of relying on these defaults.
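
To make the defaults listed above concrete, here is a small sketch that restates them as arithmetic. It only mirrors the two bullet points as written in this comment and has not been checked against any particular JVM. The function names are hypothetical.

```python
GIB = 1024 ** 3


def default_max_heap_bytes(physical_bytes, is_64bit=True):
    """Hypothetical helper: 1/4 of physical memory, capped as the comment states."""
    # Cap taken from the comment: 1GB on 32-bit servers, 32GB on 64-bit servers.
    cap = 32 * GIB if is_64bit else 1 * GIB
    return min(physical_bytes // 4, cap)


def default_max_direct_bytes(physical_bytes, is_64bit=True):
    """Hypothetical helper: per the comment, direct memory defaults to max heap."""
    return default_max_heap_bytes(physical_bytes, is_64bit)
```

Under these assumptions, a 64-bit host with 16GB of physical memory would default to a 4GB heap and a 4GB direct memory limit, regardless of how many workers or how large the payloads are.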

@aws-patlin aws-patlin changed the title Increased heap and direct memory limits to 1GB. Dynamically set heap and direct memory limits. Nov 15, 2019
@aws-patlin aws-patlin dismissed stale reviews from ericangelokim and cbalioglu November 15, 2019 01:24

Code changed since last review.

@aws-patlin aws-patlin force-pushed the inference branch 2 times, most recently from dcf1b51 to 9d7eb10 Compare November 15, 2019 17:17
            + " -XX:MaxDirectMemorySize=" + os.environ["SAGEMAKER_MAX_DIRECT_MEMORY_SIZE"] + "\n")
        g.write(f.read())
except Exception:
    pass
Contributor

Is the server going to start without config.properties? If not, is it ok to fail silently here?

Contributor Author

I added this try-except because the path to the config.properties only exists on the actual container, which was causing the unit tests to fail. Is there a better way to do this?

Contributor

I just read the code comments above the try-except block. If the plan is to remove this code in the near future, then I guess it's okay. Reading `try: something, except Exception: pass` made me think "it's okay to fail this block".

Contributor Author

Yeah, ideally we want to have the config.properties as it was before and use the environment variables to get the vmargs. The issue is that the MMS team hasn't implemented environment variable parsing for vmargs, so this is a temporary solution until they release that feature.
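
One possible way to avoid the blanket `except Exception: pass` while still letting unit tests run outside the container is to check for the file explicitly and log when it is missing. The following is only a sketch of that idea, not the code in this PR: the function name `write_config_with_vmargs` and its parameters are hypothetical.

```python
import logging
import os

logger = logging.getLogger(__name__)


def write_config_with_vmargs(src_path, dst_path, vmargs):
    """Hypothetical helper: write a vmargs line followed by the original config.

    Returns False (and logs a warning) when the source file does not exist,
    e.g. when running unit tests outside the container. Any other I/O error
    is allowed to propagate instead of being silently swallowed.
    """
    if not os.path.exists(src_path):
        logger.warning("config file %s not found; skipping vmargs override", src_path)
        return False

    with open(src_path) as f:
        original = f.read()

    with open(dst_path, "w") as g:
        g.write("vmargs={}\n".format(vmargs))
        g.write(original)
    return True
```

With this shape, a missing file is an expected and visible condition, while unexpected failures such as permission errors still surface.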

Contributor

Minor: instead of concat'ing strings, can you use format and use variable names to put values in the string for readability?

Also, I'm ok with this temporary solution, but another solution worth thinking about would have been to take the config files and a dict of overrides, and use them to create a new config file at its own location.
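
Both suggestions in this comment could look roughly like the sketch below: named `format` placeholders instead of string concatenation, and a dict of overrides applied to the text of a base properties file. All names here (`build_vmargs`, `render_properties`) are hypothetical, and the `key=value` parsing is deliberately simplistic. It is not a full Java properties parser.

```python
def build_vmargs(max_heap, max_direct):
    """Hypothetical helper: build the vmargs value with named placeholders."""
    return "-Xmx{heap} -XX:MaxDirectMemorySize={direct}".format(
        heap=max_heap, direct=max_direct
    )


def render_properties(base_text, overrides):
    """Hypothetical helper: return base_text with matching keys replaced.

    Keys in `overrides` that are not present in the base text are appended
    at the end, in sorted order.
    """
    remaining = dict(overrides)
    out = []
    for line in base_text.splitlines():
        key = line.split("=", 1)[0].strip()
        is_property = "=" in line and not line.lstrip().startswith("#")
        if is_property and key in remaining:
            out.append("{}={}".format(key, remaining.pop(key)))
        else:
            out.append(line)
    for key in sorted(remaining):
        out.append("{}={}".format(key, remaining[key]))
    return "\n".join(out) + "\n"
```

The rendered text could then be written to a new location, leaving the original config.properties untouched.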

@aws-patlin aws-patlin force-pushed the inference branch 2 times, most recently from dbb9276 to d6176db Compare November 15, 2019 17:52
…umber of workers.

Cap max_content_length to 20mb.
@aws-patlin aws-patlin force-pushed the inference branch 2 times, most recently from addb00b to 27b54ef Compare November 20, 2019 01:40
…osting.

Use job queue size in max heap size calculation.