-
Notifications
You must be signed in to change notification settings - Fork 69
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Dynamically set heap and direct memory limits. #49
Conversation
@@ -1,4 +1,4 @@ | |||
vmargs=-Xmx128m -XX:-UseLargePages -XX:+UseG1GC -XX:MaxMetaspaceSize=32M -XX:MaxDirectMemorySize=10m -XX:+ExitOnOutOfMemoryError | |||
vmargs=-Xmx128m -XX:-UseLargePages -XX:+UseG1GC -XX:MaxMetaspaceSize=32M -XX:MaxDirectMemorySize=1G -XX:+ExitOnOutOfMemoryError |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have you done any measurement? Why 1G and, say, not 100M?
Please leave a comment explaining the TODO that should follow up |
58b544a
to
d57d2e6
Compare
Maybe this was discussed offline: is the parameter mandatory? Can MMS provide the functionality of detecting the right value per available memory? |
I think it becomes harder to control the memory behavior once we start hosting multiple containers on an endpoint, so it may be better to have a well defined limit. It also seems like removing the |
To confirm the default behavior when
In this case, it makes sense for us to be setting the limits to higher values depending on the instance type instead of relying on these defaults. |
d57d2e6
to
c21683b
Compare
c21683b
to
f8606eb
Compare
Code changed since last review.
dcf1b51
to
9d7eb10
Compare
+ " -XX:MaxDirectMemorySize=" + os.environ["SAGEMAKER_MAX_DIRECT_MEMORY_SIZE"] + "\n") | ||
g.write(f.read()) | ||
except Exception: | ||
pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the server going to start without config.properties
? If not, is it ok to fail silently here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added this try-except because the path to the config.properties only exists on the actual container, which was causing the unit tests to fail. Is there a better way to do this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just read the code comments above the try-except block. If the plan is to remove this code in the near future, then I guess it's okay. Reading try: something, except Exception: pass
made me think "it's okay to fail this block".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, ideally we want to have the config.properties as it was before and use the environment variables to get the vmargs. The issue is that the MMS team hasn't implemented environment variable parsing for vmargs, so this is a temporary solution until they release that feature.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor: instead of concat'ing strings, can you use format and use variable names to put values in the string for readability?
Also, I'm ok with this temporary solution but another solution worth thinking about would've been to take files and a dict and do overrides to create a new config file and its location.
dbb9276
to
d6176db
Compare
+ " -XX:MaxDirectMemorySize=" + os.environ["SAGEMAKER_MAX_DIRECT_MEMORY_SIZE"] + "\n") | ||
g.write(f.read()) | ||
except Exception: | ||
pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just read the code comments above the try-except block. If the plan is to remove this code in the near future, then I guess it's okay. Reading try: something, except Exception: pass
made me think "it's okay to fail this block".
d6176db
to
26a7ade
Compare
…umber of workers. Cap max_content_length to 20mb.
26a7ade
to
fe55154
Compare
+ " -XX:MaxDirectMemorySize=" + os.environ["SAGEMAKER_MAX_DIRECT_MEMORY_SIZE"] + "\n") | ||
g.write(f.read()) | ||
except Exception: | ||
pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor: instead of concat'ing strings, can you use format and use variable names to put values in the string for readability?
Also, I'm ok with this temporary solution but another solution worth thinking about would've been to take files and a dict and do overrides to create a new config file and its location.
addb00b
to
27b54ef
Compare
…osting. Use job queue size in max heap size calculation.
27b54ef
to
bb74d9c
Compare
Description of changes:
The
MaxDirectMemorySize
parameter limits the amount of direct memory that the JVM can use. This has a direct effect of limiting the size of inference requests we can make, causingOutOfDirectMemoryError
s as soon as the 10MB limit is reached. This was temporarily raised during testing to 1GB.After raising the direct memory limit, the
Xmx
parameter, which limits the max heap size, began causingOutOfMemoryError
s when testing batch transform jobs. After raising that limit as well, the errors were resolved.The
Xmx
andMaxDirectMemorySize
vmargs are now set based on the number of workers * the max payload size * a buffering factor of 1.2 + a base amount of 128mb, which is theoretically sufficient for the server when under full load (all workers occupied). Environment variables are not parsed for vmargs (MMS has not implemented this yet), so the current workaround is to write the values to the config.properties file before model startup.Change were tested by running container tests and batch transform jobs with small (<5MB) and larger payloads (up to 20MB).
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.