
Bug 1375350 Edits to OOR info #3023

Merged
bfallonf merged 1 commit into openshift:master from bfallonf:swap-1375350
Nov 8, 2016

Conversation

@bfallonf:

As per: https://bugzilla.redhat.com/show_bug.cgi?id=1375350

But also, I went through all of #2690 because I wanted to get to some of the vague bits.

@derekwaynecarr I'll put some comments in the PR. Can I get your thoughts? And of course if you have any comments on my changes.

Thanks!

cc: @ahardin-rh

Author:

Would it be easier to say "The pod will fail"? I feel like otherwise, we'd need to say "the *PodPhase* is transitioned to Failed in the X file".

Member:

a node does not lose resources, so i think that phrasing is awkward. maybe, in cases where a node is running low on available resources...

The upstream documentation I wrote (which this copied) was crisp on what it meant to fail a pod as it caused confusion. I would prefer we keep that description.

Author:

I do think this could be changed to something like "How the node signals that it's almost full" (though that's a terrible suggestion). But really, the paragraph below seems to be describing a setting that says that. Any other suggestions?

Member:

I would prefer to not change. The list of signals will grow, and fullness is not a term used in this area.

Author:

"The node can support the ability"

Does this mean it is not enabled by default? Is this what we're enabling here?

Member:

in 3.3, they are not enabled by default.

in 3.4, we will set some default values for memory.

in 3.5, we will set some default values for disk related resources.

Author:

Changed to "the node can be configured to..." because 3.3 is the current release.

Author:

Where are these signals actually given out? Is this in the logs? Which files exactly?

Member:

the signals are calculated from the summary stats api on the node.

the user can invoke that api by calling:

`curl <certificate details> https://<master>/api/v1/nodes/<node>/proxy/stats/summary`

the right hand side of these equations are literally taken from those values in the API response above.
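To make that concrete, here is a small sketch of how the `memory.available` signal maps onto the API response. The field names (such as `node.memory.availableBytes`) follow the Kubernetes summary stats schema as I understand it, and the sample JSON is hand-written for illustration; on a live node the JSON would come from the curl call above.

```shell
# Illustrative only: derive the memory.available signal from a sample
# summary stats response rather than a live node.
cat > /tmp/summary.json <<'EOF'
{"node": {"memory": {"availableBytes": 524288000, "workingSetBytes": 1073741824}}}
EOF
python3 - <<'PYEOF'
import json

stats = json.load(open('/tmp/summary.json'))
avail = stats['node']['memory']['availableBytes']
# Express the signal as a quantity in Mi (1Mi = 1048576 bytes)
print(f"memory.available = {avail // (1024 * 1024)}Mi")
PYEOF
```

With the sample values above this prints `memory.available = 500Mi`, which is the form the threshold comparisons operate on.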

Author (@bfallonf, Oct 11, 2016):

I think what's missing is an explanation of what an eviction threshold is. Why would you want to use a soft one over a hard one?

Member:

an example for a soft eviction and hard eviction threshold would be the following:

  • operator wants to evict immediately if available memory falls below 5% (hard threshold)
  • operator wants to evict if available memory falls below 30% for 1 minute (soft threshold)

in this scenario, the operator would like their machines to steady at 70% utilization, but are willing to go above that for short periods of time. that is the general idea. our ops team will typically use two thresholds for a similar reason.
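The two-threshold setup described here could be sketched as a `node-config.yaml` fragment along the following lines. This is a sketch only: 3.3 accepts literal quantities rather than percentages, so absolute values stand in for the 5%/30% figures above, and the exact numbers are illustrative.

```shell
# Hypothetical node-config.yaml kubeletArguments fragment pairing a hard and
# a soft memory eviction threshold; all values are illustrative stand-ins.
cat <<'EOF' > node-config-eviction.yaml
kubeletArguments:
  eviction-hard:
  - "memory.available<500Mi"        # evict immediately below this
  eviction-soft:
  - "memory.available<3Gi"          # evict if below this...
  eviction-soft-grace-period:
  - "memory.available=1m"           # ...sustained for one minute
EOF
```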

Author:

With the below, and this is said a few times: we're saying "The node supports the ability to X" a lot. I take that to mean something needs to be enabled in order to use it. Is that the case? If it's on by default, it should just say "The node can X".

Member:

we have no default eviction thresholds enabled in 3.3, so a user today must opt-in to this behavior. Meaning they need to set up the node to do this now. we will get defaults in the future.

Author:

Where is this found? A file somewhere?

Member:

this is basically describing a syntax for a threshold.

in 3.3, we just support literal thresholds (`memory.available<100Mi`)

in 3.4, we will support percentage thresholds as well (`memory.available<10%`)

so in 3.4, we will need to update this doc to reflect that both input styles are valid.

a sample node configuration is given in the "Example scenario" section.
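As a concrete reading of that syntax, a toy parser (purely illustrative, not the kubelet's actual implementation) splits a threshold expression into its signal, operator, and quantity parts:

```shell
# Toy breakdown of the <signal><operator><quantity> threshold syntax;
# "<" is the only operator supported in 3.3.
python3 - <<'PYEOF'
import re

def parse_threshold(expr):
    # signal: dotted lowercase name; quantity: number with optional suffix
    m = re.match(r'^([a-z.]+)(<)(\d+(?:\.\d+)?(?:Ki|Mi|Gi|%)?)$', expr)
    if not m:
        raise ValueError(f"unrecognized threshold: {expr}")
    return m.groups()  # (signal, operator, quantity)

print(parse_threshold("memory.available<100Mi"))
PYEOF
```

This prints `('memory.available', '<', '100Mi')`, and in 3.4 the same shape would also accept a percentage quantity.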

Author:

How does this tie into the above? There's nothing to really explain that at all.

Member:

it's meant to mean the signals in table 1 (which now lists one), but in 3.4 it needs to list 4 more (all disk related)

Author:

What does that mean? Is that the only valid value for the bit?

Member:

right now, < is the only supported operator, but there was discussion on potentially offering more. i kept the doc this way to allow it to grow without major re-writes.

Author:

How do you find that? Quantity of what?

Member:

this basically means 'the syntax you use to express a quantity anywhere in openshift/kube' - so whether it's how you declare the amount of cpu or memory for a pod, a constraint on quota, a limit in a limit range, etc. they all use the quantity representation. maybe it's just obvious. we don't appear to have a good doc to describe it:
https://github.com/kubernetes/kubernetes/blob/master/docs/design/resources.md#resource-quantities
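For illustration, the binary suffixes that quantities use for memory can be expanded like so. This is a sketch of the common Ki/Mi/Gi cases only, not the full Kubernetes quantity grammar (which also covers decimal suffixes, exponents, and milli-units for CPU):

```shell
python3 - <<'PYEOF'
# Expand the common binary suffixes used by Kubernetes quantities;
# the full grammar is larger than this sketch.
SUFFIXES = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3}

def to_bytes(quantity):
    for suffix, mult in SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[:-len(suffix)]) * mult
    return int(quantity)  # a bare integer is plain bytes

print(to_bytes("100Mi"))  # 104857600
PYEOF
```

So a threshold like `memory.available<100Mi` is comparing the signal against 104857600 bytes.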

Author:

OK. I get you. I made a link out to the docs above, but you're right, it might be a good idea for us on docs to document this at some point.

Author:

With the below, I think an example would be a lot better... @derekwaynecarr Do you have one we could put into the docs?

Member:

can we hold on a soft eviction scenario until we have disk? i think that will make more sense in that context for 3.4.

Author:

Where is the "Housekeeping" interval?

Member:

As noted earlier, I think we should just say:

"The node evaluates eviction thresholds every 10s."

In the future, hard eviction thresholds for memory will not use polling every 10s, and instead we will have the kernel tell us the threshold has been passed and act immediately. That is planned for Kubernetes 1.5 / Origin 1.5.

Member:

so drop any mention of housekeeping interval.

Author:

I did some minor rewrites so that it's specified that 10 seconds is the housekeeping interval.

Author:

Scratch that, after your next couple comments I scrapped it all instead.

Author:

Is cAdvisor something supported by OpenShift? I've not heard of it before. Is there a better place we could link out to?

Member:

we should drop housekeeping-interval from this document. it's hard-coded, and was included here in error.

Author:

I'm not sure what this is saying. Is it saying that the scheduler can read the eviction signal and do something accordingly? If so, I'd move this above to the signal section.

Member:

I would not move this to the signal section.

The list of reported Node conditions will grow in 3.4 to include DiskPressure.

This table is saying the following:

The scheduler looks at the NodeConditions reported by the node, and if it sees the node reporting "MemoryPressure" it will not place BestEffort pods on that node.

In 3.4, if the scheduler sees nodes that report "DiskPressure", it will not schedule any pods to that node.

The list of pressure conditions will grow, and the scheduler will do something slightly different for each.

Member:

so the scheduler does NOT read eviction signals, it reads Node Conditions that are driven based on the configured eviction thresholds. So for example, if i set an eviction threshold like the following:

eviction-hard is "memory.available<500Mi"

if available memory falls below that value, the Node has a value reported in Node.Status.Conditions[] whose Type will be MemoryPressure and whose Status will be True. It's that value on the node object that the scheduler integrates with when making scheduling decisions.
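A small sketch of what the scheduler consumes: filtering a node status for the MemoryPressure condition. On a live cluster the JSON would come from something like `oc get node <node> -o json`; the sample below is hand-written for illustration, including the reason string.

```shell
# Sample Node.Status.Conditions[] payload, hand-written for illustration.
cat > /tmp/node-status.json <<'EOF'
{"status": {"conditions": [
  {"type": "Ready", "status": "True"},
  {"type": "MemoryPressure", "status": "True", "reason": "KubeletHasInsufficientMemory"}
]}}
EOF
python3 - <<'PYEOF'
import json

node = json.load(open('/tmp/node-status.json'))
for cond in node['status']['conditions']:
    if cond['type'] == 'MemoryPressure':
        # When this is True, the scheduler stops placing BestEffort pods here
        print(f"MemoryPressure={cond['status']}")
PYEOF
```

With the sample above this prints `MemoryPressure=True`, which is the value the scheduler integrates with rather than the raw eviction signal.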

Author:

@derekwaynecarr This seems odd. It's as though having it enabled is a bad thing. Is there a reason why it's not just disabled by default instead of the user needing to do this?

Member:

It is a bad thing, but some customers want to have it enabled because they used that feature to meet certain densities in v2. unfortunately, swap being enabled means you can't use other features.

@sdodson -- would it be bad for us to disable swap by default in our install, and instead write doc to discuss how it could be turned back on if desired? is that something we can look to do in 3.4/3.5?

Member (@sdodson, Oct 18, 2016):

Yeah, it's not hard, just consensus building. We'll try to get it in the first update after 3.4. https://trello.com/c/vGmZYJ79/296-disable-swap-at-install-and-upgrade

Author:

Sounds good. The initial BZ is about this contradiction, so I'll check if that's enough for Eric ( @TheDiemer ) and continue with the rest of the comments. Thanks, all.

@bfallonf (Author):

@derekwaynecarr ^ bump

Member (@derekwaynecarr) left a comment:

Thanks for improving the documentation, please address the comments.

Member:

i would stick with low.





Member:

basically, I am trying to explain that the only rational thing to do when a node is running only guaranteed pods but system services are consuming too much resource is to fail the guaranteed pods, since I can't really fail node system services.



Member:

s/it/it's

I think this is so important that it should be called out maybe in the top of the document with something like ensuring your node has been configured correctly.

Author:

Agree. I moved this up into the Overview in an admonition.


@bfallonf (Author):

@derekwaynecarr Thanks for taking a look. I've made edits, pretty much to what you suggested. Can I get a final ack there's nothing else before I move forward with this? Thanks, again.

@bfallonf (Author):

@derekwaynecarr ^ Bump. (Thanks!)

@derekwaynecarr (Member):

@bfallonf -- this is a big improvement. LGTM

@bfallonf (Author):

Big thanks @derekwaynecarr ! Good to see it's an improvement. Can I ask you to please approve the changes? Seems that new GitHub review feature means people need to give the thumbs up by clicking a button.

@adellape @ahardin-rh Any comments before I merge?

@derekwaynecarr (Member):

@bfallonf -- approved changes, if we can merge this today I can send my updates for the disk-eviction support in 1.4 origin release.

@bfallonf (Author):

bfallonf commented Nov 1, 2016

@derekwaynecarr Sure thing. Thanks much. I'll get this merged. If this has been reviewed before tomorrow morning, I can maybe get someone in BNE to take a look.

cc @adellape @ahardin-rh

Contributor:

With the updated doc guidelines, we can apply the new formatting here:

`PodPhase`

Contributor:

Apply updated formatting here:

`Node.Status.Conditions`
`MemoryPressure`

Contributor:

`MemoryPressure`

Contributor:

`oom_score_adj`

Contributor (@ahardin-rh, Nov 7, 2016):

 `oom_score_adj`

There are more instances across the page.

Contributor:

`BestEffort`

@ahardin-rh (Contributor):

@bfallonf just a few minor comments from me regarding updated style guidelines. ⭐

@bfallonf (Author):

bfallonf commented Nov 8, 2016

Thanks @ahardin-rh . I'll pay more attention to the new guidelines... I'll merge away!

@bfallonf bfallonf merged commit 611370a into openshift:master Nov 8, 2016
@bfallonf bfallonf deleted the swap-1375350 branch November 8, 2016 04:54
@ahardin-rh ahardin-rh modified the milestones: Next Release, Staging, Published - 11/14/16 Nov 14, 2016