[ML] More accurate job memory overhead #47516
Merged
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
When an ML job runs the memory required can be
broken down into:
ancilliary processes that is not instrumented
Previously we added a simple fixed overhead to
account for 1 and 3. This was 100MB for anomaly
detection jobs (large because of the completely
uninstrumented categorization function and
normalize process), and 20MB for data frame
analytics jobs.
However, this was an oversimplification because
the executable code only needs to be loaded once
per machine. Also the 100MB overhead for anomaly
detection jobs was probably too high in most cases
because categorization and normalization don't use
that much memory.
This PR therefore changes the calculation of memory
requirements as follows:
job of any type to be run on a given node - this
is to account for loading the executable code
model memory limit of the job
jobs and 5MB for data frame analytics jobs, to
account for the uninstrumented memory usage
This change will enable more jobs to be run on the
same node. It will be particularly beneficial when
there are a large number of small jobs. It will
have less of an effect when there are a small number
of large jobs.