Validate resource constraint (RAM and CPU) in RoundRobinPacking #3142
T increaseBy(int percentage);

boolean greaterThan(T other);
The `Comparable` interface already provides the `compareTo` method. Do we really need these 4 comparison methods?
Good question. `ByteAmount` had these methods and I preserved them to keep changes to a minimum. I guess they were there just for readability?
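For readers following this thread, here is a minimal sketch of how readability helpers like these can sit on top of `compareTo` without duplicating logic. The names `ReadableMeasure` and `CoreShare` are hypothetical, and the three helpers other than `greaterThan` are assumed for illustration; this is not the actual Heron source.

```java
// Hypothetical sketch; not the actual Heron ResourceMeasure or ByteAmount source.
interface ReadableMeasure<T extends ReadableMeasure<T>> extends Comparable<T> {
  // Each helper is a thin, readable wrapper around compareTo.
  default boolean greaterThan(T other) {
    return compareTo(other) > 0;
  }

  default boolean greaterOrEqual(T other) {
    return compareTo(other) >= 0;
  }

  default boolean lessThan(T other) {
    return compareTo(other) < 0;
  }

  default boolean lessOrEqual(T other) {
    return compareTo(other) <= 0;
  }
}

// Illustrative CPU measure backed by a double (assumed representation).
final class CoreShare implements ReadableMeasure<CoreShare> {
  private final double value;

  CoreShare(double value) {
    this.value = value;
  }

  @Override
  public int compareTo(CoreShare other) {
    return Double.compare(value, other.value);
  }
}
```

With this shape, a call site can read `required.greaterThan(available)` instead of `required.compareTo(available) > 0`, which is the readability argument made in the reply above.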
T plus(T other);

T multiply(int factor);
Why are `factor` and `percentage` of type `int`?
The semantics of `factor` and `percentage` follow `ByteAmount`, where they are whole numbers and decimal places are not allowed. The original method signatures in `ByteAmount` use `int` too.
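As a rough illustration of the whole-number semantics described here, the sketch below applies an `int` factor and an `int` percentage to a value stored as a `double`. The class name `PercentShare` and its arithmetic are assumptions made for this example, not the PR's `CPUShare` implementation.

```java
// Hypothetical sketch of int-based scaling on a double-backed measure.
final class PercentShare {
  private final double value;

  PercentShare(double value) {
    this.value = value;
  }

  double getValue() {
    return value;
  }

  // factor is a whole-number multiplier, e.g. 3 means "three times as much".
  PercentShare multiply(int factor) {
    return new PercentShare(value * factor);
  }

  // percentage is a whole-number percent, e.g. 10 means "grow by 10%".
  PercentShare increaseBy(int percentage) {
    return new PercentShare(value * (1.0 + percentage / 100.0));
  }
}
```

Only the arguments are restricted to whole numbers; the result can still be fractional, as with a 10% increase on 2 cores.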
import java.util.HashMap;
import java.util.Map;

public final class CPUShare implements ResourceMeasure<CPUShare> {
Do we really need to create a separate `CPUShare` class to handle the CPU resource calculation? Is it possible to reuse what we already have in the RAM resource calculation? Or will there be `RamShare` and `DiskShare` classes in following patches?
It doesn't seem necessary to separate `RamShare` and `DiskShare`, as RAM and disk are both measured in `ByteAmount`. CPU is measured in share of time used.
Previously, we didn't take CPU resource constraints into consideration when composing the packing plan. Now that we do, we need to collect the instance CPU resource mapping, and the logic for that is very similar to that of the instance RAM resource mapping (previously `getInstancesRAMMapInContainer()`). But due to the difference in the type of measurement (`ByteAmount` vs `double`), we might end up with a lot of duplicated code. Hence, I abstracted out `CPUShare` so that it looks very similar to what we have in `ByteAmount`, and thus we can have one `calculateInstancesResourceMapInContainer()` that works for both CPU and RAM (and potentially disk if needed) instance resource mapping collection.
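The idea of one generic routine serving both CPU and RAM can be sketched as follows. All names here (`SketchMeasure`, `SketchBytes`, `SketchCpu`, `ResourceMapSketch`) are hypothetical stand-ins, and the logic is a simplified assumption about what such a routine might do, not the body of `calculateInstancesResourceMapInContainer()`.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a shared measure interface; only plus() is needed here.
interface SketchMeasure<T extends SketchMeasure<T>> extends Comparable<T> {
  T plus(T other);
}

// Byte-based measure, as used for RAM and disk.
final class SketchBytes implements SketchMeasure<SketchBytes> {
  final long bytes;

  SketchBytes(long bytes) {
    this.bytes = bytes;
  }

  @Override
  public SketchBytes plus(SketchBytes other) {
    return new SketchBytes(bytes + other.bytes);
  }

  @Override
  public int compareTo(SketchBytes other) {
    return Long.compare(bytes, other.bytes);
  }
}

// Double-based measure, as used for CPU.
final class SketchCpu implements SketchMeasure<SketchCpu> {
  final double cores;

  SketchCpu(double cores) {
    this.cores = cores;
  }

  @Override
  public SketchCpu plus(SketchCpu other) {
    return new SketchCpu(cores + other.cores);
  }

  @Override
  public int compareTo(SketchCpu other) {
    return Double.compare(cores, other.cores);
  }
}

final class ResourceMapSketch {
  // Resolves each component's resource: the configured value if present, else the default.
  static <T extends SketchMeasure<T>> Map<String, T> instanceResources(
      List<String> components, Map<String, T> configured, T defaultValue) {
    Map<String, T> result = new HashMap<>();
    for (String component : components) {
      result.put(component, configured.getOrDefault(component, defaultValue));
    }
    return result;
  }

  // Sums all resolved resources on top of the container padding.
  static <T extends SketchMeasure<T>> T total(Map<String, T> resources, T padding) {
    T sum = padding;
    for (T value : resources.values()) {
      sum = sum.plus(value);
    }
    return sum;
  }
}
```

Because both methods are written against the bounded type parameter, the same code path handles `SketchCpu` and `SketchBytes` without a CPU-specific or RAM-specific copy.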
dc78dbb to 5bd71c0
   * Test the scenario CPU map config is completely set
   */
  @Test
  public void testCompleteCpuMapRequested() throws Exception {
Can we add these cases:
- container CPU is more than needed by instances.
- container CPU is less than needed by instances.
Also, other processes in the container could need a fixed amount of CPU as well. So if container CPU == needed by instances, it might be better to return false.
👍 Added more tests to cover the mentioned scenarios
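The three scenarios raised above (more than needed, less than needed, exactly equal) can be illustrated with a small check. This is a sketch under the assumption that a fixed padding is reserved for other processes in the container; `CpuFitSketch` and its parameters are hypothetical and do not reproduce the tests added in the PR.

```java
// Hypothetical validation sketch: does the container CPU cover instances plus padding?
final class CpuFitSketch {
  static boolean fits(double containerCpu, double[] instanceCpus, double paddingCpu) {
    // Start from the padding reserved for non-instance processes.
    double required = paddingCpu;
    for (double cpu : instanceCpus) {
      required += cpu;
    }
    return required <= containerCpu;
  }
}
```

Under this assumption, a container whose CPU equals exactly the sum of its instances is rejected whenever the padding is positive, which matches the reviewer's suggestion for the equality case.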
ByteAmount containerDiskInBytes = getContainerDiskHint(roundRobinAllocation);
double containerCpu = getContainerCpuHint(roundRobinAllocation);
double containerCpuHint = getContainerCpuHint(roundRobinAllocation);
`containerCpuHint` (the configured container CPU, or max (padding + 1 per instance)) is not used. As a result, I think the user-specified container CPU is ignored. Maybe we need a max(containerCpuHint, containerCpu)?
It is actually used down in `calculateInstancesResourceMapInContainer` as the `containerResHint`. It is used to validate that all instances + padding in `roundRobinAllocation` do not exceed the `containerResHint`. It's just not used anymore in `packingInternal`, similar to how we treated `containerRamHint` before the change.
That is true. Assume the user configures:
- container CPU to be 5
- 1 CPU per instance
- 3 instances in the container
Then:
- the CPU hint is 5,
- the instance CPU is valid (enough for padding and instances).
In the old calculation, container CPU is 5 because the user configured it.
In the new calculation, container CPU is 4 (1 * 3 + 1 padding).
With "max(containerCpuHint, containerCpu)", container CPU is the same as before.
I think it is important to keep the result consistent. Some users might rely on setting container CPU to allocate more processing power to their instances instead of configuring per-instance CPU (although the latter is technically correct).
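The arithmetic in this example can be written out as a short sketch. The class and method names are hypothetical, and the padding of 1 is taken from the numbers in the comment above rather than from the Heron defaults.

```java
// Hypothetical sketch of the container CPU arithmetic discussed in this thread.
final class ContainerCpuSketch {
  // CPU computed from the instances themselves plus a fixed padding.
  static double computedCpu(double cpuPerInstance, int instanceCount, double padding) {
    return cpuPerInstance * instanceCount + padding;
  }

  // The suggested fix: never go below what the user configured for the container.
  static double finalCpu(double containerCpuHint, double computed) {
    return Math.max(containerCpuHint, computed);
  }
}
```

With 1 CPU per instance, 3 instances, and 1 CPU of padding, `computedCpu` gives 4; taking the max with the configured hint of 5 restores the 5 that users saw before the change.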
Hmm, I see. Then I think we don't even need to take the max of the two. `containerCpuHint` is always going to be greater no matter what.
But if container CPU is not configured, `getContainerCpuHint()` would return padding + 1 * instance number, which is smaller than your new value, unless you update the `getContainerCpuHint()` function.
Yeah, you are right, I just figured out that situation :) Updated.
The original logic is messy and makes it hard to update. :(
LGTM
…he#3142) * init * general resource constraint validation * pass existing unit tests * add more tests * rename * rename * generic ResourceMeasure * fixed wc example * even more general generics * address comments * address comments by putting more tests * set safe amount of cpu * meaningful constants in ExampleResource
This PR might break some misconfigured topologies in production.
This PR addresses issues raised previously in the Slack chat: the packing algorithm did not honor container-level and instance-level constraints on CPU resources. Additionally, we added resource constraint validation in RoundRobinPacking. Specifically:
- `topology.container.cpu` and `topology.container.ram` are honored and treated as the hard cap for the corresponding resource in one container.
- `topology.component.cpumap` and `topology.component.rammap` are honored and treated as the instance resource mapping in one container.