F33 feature/change proposal SwapOnZRAM by default #509

Closed
cmurf opened this issue Jun 1, 2020 · 22 comments · Fixed by coreos/fedora-coreos-config#687

Comments

@cmurf

cmurf commented Jun 1, 2020

This will be proposed for Fedora 33 all editions and spins
https://fedoraproject.org/wiki/Changes/SwapOnZRAM

I'd like to make the case that some swap, especially if it's fast, is better than no swap. In the no swap case, if the system comes under any memory pressure, it means the system must resort to reclaim of file pages because it's not possible to evict even inactive anonymous pages due to lack of swap. Such a system starts to do a kind of swap thrashing, which is really not at all swap related since it doesn't exist, but it's this churn of reading file pages on demand, and then almost immediately they get dropped out of memory. Whereas if there were a swap-on-ZRAM device, those inactive anonymous pages would get evicted, compressed, and free up memory to avoid reclaiming file pages.

Other than totally opting out, I think there are two options Fedora CoreOS could consider:

  1. include zram-generator, but not include a configuration file. Without a configuration file, there is no setup of a swap-on-zram device.

  2. include both zram-generator and configuration; but you could have a different configuration than the default if you want to go even more minimalist than proposed. e.g. maybe use a fraction of 20% RAM instead of 50%.

Note that Fedora IoT has been using swap-on-ZRAM for some time and they're defaulting to 50% RAM which is the same as this proposal.
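
For a feel of what "a fraction of RAM" means in practice, here is a minimal sketch of the sizing arithmetic. The function name and its parameters (`fraction`, `cap_mib`) are made up for illustration and are not zram-generator option names; the optional cap is an assumption, included only to show how an upper bound would interact with the fraction.

```python
def zram_size_mib(ram_mib, fraction=0.5, cap_mib=None):
    """Return an illustrative swap-on-zram device size in MiB.

    ram_mib:  total RAM of the machine in MiB
    fraction: share of RAM used to size the device (0.5 = 50%)
    cap_mib:  optional upper bound on the device size (hypothetical parameter)
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    size = int(ram_mib * fraction)
    if cap_mib is not None:
        size = min(size, cap_mib)
    return size


if __name__ == "__main__":
    # Compare the proposed 50% with a more minimalist 20% on a 2 GiB machine.
    for fraction in (0.5, 0.2):
        print(f"fraction={fraction}: {zram_size_mib(2048, fraction)} MiB")
```

Keep in mind this is the nominal device size, i.e. how much uncompressed data the device can hold, not how much RAM it consumes.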

@cmurf
Author

cmurf commented Jun 1, 2020

It's in the change proposal, but just to put a fine point on it here: when the /dev/zram0 device is created with whatever size, this is not a preallocation. It doesn't actually consume memory right away. There is about 0.1% overhead to create it, but otherwise the memory is dynamically allocated and deallocated based on demand.
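
If you want to check this on a running system, the kernel exposes per-device counters in `/sys/block/zram0/mm_stat`. The sketch below reads only the first three fields (original data size, compressed data size, total memory used, all in bytes). The helper names are made up, and the sample line in the usage example is invented, not real measurement data.

```python
from collections import namedtuple

# First three fields of the kernel's mm_stat file for a zram device.
MmStat = namedtuple("MmStat", "orig_data_size compr_data_size mem_used_total")


def parse_mm_stat(text):
    """Parse the contents of /sys/block/<device>/mm_stat (values in bytes)."""
    fields = text.split()
    if len(fields) < 3:
        raise ValueError("unexpected mm_stat format")
    return MmStat(*(int(f) for f in fields[:3]))


def read_mm_stat(device="zram0"):
    """Read and parse mm_stat for a device; requires the device to exist."""
    with open(f"/sys/block/{device}/mm_stat") as f:
        return parse_mm_stat(f.read())


if __name__ == "__main__":
    # Invented sample line; on a real system call read_mm_stat() instead.
    stat = parse_mm_stat("4096 74 12288 0 12288 0 0 0\n")
    print(stat.mem_used_total, "bytes of RAM in use by the device")
```

On an idle system `mem_used_total` should stay small no matter how large the device was configured.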

@cgwalters
Member

Note that Kubernetes explicitly fails if swap is enabled: kubernetes/kubernetes#53533
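
For anyone wanting to check whether a node currently has any swap active (a zram device counts too), a rough sketch is to look at `/proc/swaps`. This is only an illustration of reading that file; the function names are made up and this is not meant to describe how the kubelet itself performs its check.

```python
def active_swaps(proc_swaps_text):
    """Return the device/file names listed in /proc/swaps content."""
    lines = proc_swaps_text.splitlines()[1:]  # first line is the column header
    return [line.split()[0] for line in lines if line.strip()]


def swap_enabled(path="/proc/swaps"):
    """True if the running system has at least one active swap area."""
    with open(path) as f:
        return bool(active_swaps(f.read()))


if __name__ == "__main__":
    # Invented sample content for illustration.
    sample = (
        "Filename\t\t\t\tType\t\tSize\tUsed\tPriority\n"
        "/dev/zram0                              partition\t2097148\t0\t100\n"
    )
    print(active_swaps(sample))
```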

Of course this swap isn't really the same as other swap, particularly if you're doing swap on any kind of rotational storage (but hopefully no one is doing that anymore).

Personally, I think zram is a convenient and cheap approach in some scenarios, but what we really want is to make the operating system behave more like iOS/Android by default and actively evict (i.e. kill) applications (and yes, this requires intelligence in the framework and apps). Scheduling applications more intelligently is basically what Kubernetes is doing.

@cmurf
Author

cmurf commented Jun 1, 2020

From my reading, Android has been using swap-on-zram for a while already, as has Chromium/Chrome OS. It's not consistently deployed by OEMs, I guess.

https://source.android.com/devices/tech/perf/low-ram

@dustymabe dustymabe added the meeting topics for meetings label Jun 3, 2020
@cmurf
Author

cmurf commented Jun 10, 2020

"Swap can make a system slower to OOM kill". I don't know if this concern is why cloud environments tend to not have swap configured. But oomd2 and likely future systemd-oomd depends on various PSI metrics including swap pressure, i.e. swap needs to exist to do this. I think using zram based swap for this is an open question; I did ask some upstream kernel cgroupsv2 folks about it, and they kinda shrugged and said it all depends, and may even need to be made dynamic based on the workload.

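As a point of reference for the PSI metrics mentioned above, the kernel publishes memory pressure in `/proc/pressure/memory` when PSI is enabled. The following is a minimal sketch of reading that interface only; it makes no claim about what oomd2 or systemd-oomd actually do with the numbers. The function names and the sample values are invented.

```python
def parse_pressure(text):
    """Parse a /proc/pressure/<resource> file into nested dicts.

    Returns e.g. {"some": {"avg10": 0.0, ...}, "full": {...}}.
    """
    result = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        kind, *pairs = line.split()
        result[kind] = {
            key: float(value)
            for key, value in (pair.split("=", 1) for pair in pairs)
        }
    return result


def read_memory_pressure(path="/proc/pressure/memory"):
    """Read memory pressure from a running kernel with PSI enabled."""
    with open(path) as f:
        return parse_pressure(f.read())


if __name__ == "__main__":
    # Invented sample values for illustration.
    sample = (
        "some avg10=0.00 avg60=1.25 avg300=0.40 total=12345\n"
        "full avg10=0.00 avg60=0.00 avg300=0.00 total=0\n"
    )
    print(parse_pressure(sample)["some"]["avg60"])
```
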
@lucab
Contributor

lucab commented Jun 15, 2020

We covered this in the last meeting. There were several thumbs-up on both not having swap by default (i.e. current status) and allowing users to opt-in swap-on-zram (i.e. the F33 way, minus the always-on default).

The general flow that we could be targeting is:

  1. write zram conf via Ignition
  2. write formatting + mounting units via Ignition
  3. let the zram-generator create the devices
  4. let the other units format and enable the swap-on-zram

For this to work, we assume that FCOS can generally just follow the vanilla Fedora approach here. The only point of contention/customization would be around vendor defaults. zram-generator does not currently support the whole set of fragments and overlays like other systemd components, and I opened systemd/zram-generator#29 to push that forward.

@cmurf I didn't see the change-ticket for this F33 feature, but IMHO it would be nice to bring up systemd/zram-generator#29 as a soft-blocker there.

PS: I'm deliberately ignoring the whole "is swap default better on or off" discussion here. I'd like to keep this ticket focused on the swap-on-zram topic.

@lucab lucab removed the meeting topics for meetings label Jun 15, 2020
@cmurf
Author

cmurf commented Jun 15, 2020

@lucab The feature is still brand new, so it doesn't have a change tracking bug yet and hasn't yet been approved by FESCo. I agree with the approach in zram-generator#29 but leave it up to CoreOS folks to decide whether to ship a missing /usr config to indicate disabled by default, or if you want Ignition to drop an empty file into /etc by default to indicate it.

@lucab
Contributor

lucab commented Jun 18, 2020

For reference, the discussion here brought to light CVE-2020-10781 (local DoS, fix upcoming).

@dustymabe
Member

Nice work Luca!

@dustymabe dustymabe added the meeting topics for meetings label Jun 24, 2020
@lucab
Contributor

lucab commented Jul 1, 2020

systemd/zram-generator#33 added support for configuration fragments. If/once this proposal lands in Fedora, we just need to add a vendor fragment to disable the default configuration, and then document how people can opt back in.

@cmurf
Author

cmurf commented Jul 1, 2020

I think what you'd do is install zram-generator package, and not install zram-generator-defaults package. The generator will be present but do nothing. And then you can opt in by any means of creating /etc/systemd/zram-generator.conf that you wish.

@dustymabe
Member

Now that I know a little bit more about how swap on zram works, I think we could consider enabling it by default after getting some real-world experience with it. It could also be something we do at a later time (opt-in for now, default later). It would be nice if it could give us some wins in environments with fewer resources.

@cmurf
Author

cmurf commented Jul 1, 2020

Fedora IoT has been enabling it since the start. These defaults are a bit more conservative considering the 4G cap, which are subject to change. I've talked to a few kernel fs/storage/mm/cgroups folks about it and it's definitely better than no swap. Eviction at 50% efficacy (based on compression ratio estimate) is not as good as 100% efficacy using disk-based swap; but is still better than 0% which causes anonymous pages to be pinned to memory, and increases the chance of otherwise unnecessary reclaim. So even without swap you can get "swap like" behavior, and repetitive reclaim is expensive.
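
To put numbers on the "efficacy" comparison, here is a small worked sketch. It assumes a fixed compression ratio, which is a simplification since real ratios depend on the data; the function name is made up. Disk-based swap is modeled as an infinite ratio because the evicted pages leave RAM entirely.

```python
def net_freed_bytes(evicted_bytes, compression_ratio):
    """RAM actually freed when evicting anonymous pages to a compressed device.

    compression_ratio is uncompressed size / compressed size (2.0 means 2:1).
    """
    if compression_ratio < 1:
        raise ValueError("compression_ratio must be >= 1")
    return evicted_bytes - evicted_bytes / compression_ratio


if __name__ == "__main__":
    one_gib = 1024 ** 3
    # 2:1 compression: half of the evicted memory is really freed (50% efficacy).
    print(net_freed_bytes(one_gib, 2.0) / one_gib)
    # Disk swap: nothing stays in RAM, so all of it is freed (100% efficacy).
    print(net_freed_bytes(one_gib, float("inf")) / one_gib)
```

With no swap at all nothing can be evicted, so the freed amount is zero and reclaim has to come from file pages instead.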

@dustymabe
Member

A bit more information: The configuration for the zram-generator has a setting:

# The maximum amount of memory (in MiB). If the machine has more RAM
# than this, zram device will not be created.
#
# "host-memory-limit = none" may be used to disable this limit. This
# is also the default.
host-memory-limit = 9048

So hosts with more than $host-memory-limit RAM will see no change if we were to implement this. I think currently we've accepted that including the zram-generator package (pending any security or bug fixes that are found) is something we want to do.
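
As a rough sketch of the gate described by that comment (my reading of the config comment, not the generator's actual code): the device is skipped when the machine has more RAM than the limit, and `None` stands in for `host-memory-limit = none`. The function name is made up.

```python
def device_would_be_created(ram_mib, host_memory_limit_mib=None):
    """Return True if a zram device would be set up for this amount of RAM.

    host_memory_limit_mib=None models "host-memory-limit = none" (no limit).
    """
    if host_memory_limit_mib is None:
        return True
    # "If the machine has more RAM than this, zram device will not be created."
    return ram_mib <= host_memory_limit_mib


if __name__ == "__main__":
    for ram in (4096, 8192, 16384):
        print(ram, device_would_be_created(ram, host_memory_limit_mib=9048))
```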

The real question is:

  • enabled by default? options:
    • yes: what defaults values do we want to use? Do we want to use the zram-generator-defaults package?
    • maybe: maybe this is something we do later on after some soak in Fedora?
    • no: document how to take advantage of it

@lucab
Contributor

lucab commented Jul 9, 2020

@dustymabe I'd like to answer "yes/maybe" here, but my understanding is that all higher level orchestration systems (e.g. k8s, nomad, etc.) basically assume a "no".
Their memory accounting and scheduling logic usually does not cover the swap case as it makes the logic way more complex (a hierarchy of memory pools with different access properties), see Colin's first comment.
In short, I doubt we have free will on the default value here right now, similarly to the cgroupsv1 case.

@dustymabe
Member

I think what you're saying is reasonable.

@dustymabe
Member

ok so it seems like we are leaning towards included but not enabled by default. We have two options for that that I see:

  1. Include both zram-generator-defaults and zram-generator packages. Place override at /etc/systemd/zram-generator.conf to disable.

This means in order to enable the defaults the user just deletes the /etc/systemd/zram-generator.conf file. Documentation is slightly easier.

  2. Include just the zram-generator package.

In order to enable zram you'd need to create a file at /etc/systemd/zram-generator.conf with at least [zram0] in it. In this case documentation is slightly longer and probably needs to explain the contents of the file briefly, which might be desirable anyway.
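
To illustrate what "at least [zram0] in it" amounts to, the sketch below lists the zram sections found in a config text. It uses Python's `configparser` as a stand-in parser, which is an assumption; the real generator has its own parsing and may treat edge cases differently. The function name is made up.

```python
import configparser


def zram_devices(conf_text):
    """Return the section names that look like zram devices, e.g. ["zram0"]."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    return [name for name in parser.sections() if name.startswith("zram")]


if __name__ == "__main__":
    minimal = (
        "# This config file enables a /dev/zram0 device with the default settings\n"
        "[zram0]\n"
    )
    print(zram_devices(minimal))  # a bare section header is enough to opt in
    print(zram_devices(""))       # an empty file defines no device
```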

@dustymabe
Member

dustymabe commented Oct 14, 2020

ok so it seems like we are leaning towards included but not enabled by default.

Though one thing we could do is create our own FCOS config with host-memory-limit = 4096 so it would only be enabled on systems with less than 4GiB of ram by default (or some other ram value we deem appropriate).

@bgilbert
Contributor

Though one thing we could do is create our own FCOS config with host-memory-limit = 4096 so it would only be enabled on systems with less than 4GiB of ram by default (or some other ram value we deem appropriate).

It seems like it could surprise users if we enable a potentially Kubernetes-breaking feature only on machines with certain amounts of RAM.

@cmurf
Author

cmurf commented Oct 14, 2020

Maybe ask Kubernetes users if noswap actually manifests well in practice? It's leaving a lot on the table to either insist on significant overprovision of memory to avoid both the need for page eviction and reclaim, or suffer with reclaim which can be worse than incidental paging, especially when using a memory based swap. The noswap by default policy translates into an expectation to throw more memory at such setups (and pay for it).

I think it's better to optimize for the general purpose use cases CoreOS is targeting, rather than papering over Kubernetes design oversight. That is, they fail on swap because they haven't worked out how swap should look with the semantics of guaranteed pods, not because there's some real technical limitation of swap.

@dustymabe dustymabe added the meeting topics for meetings label Oct 14, 2020
@dustymabe
Member

We discussed this in the meeting today.

13:05:25      dustymabe | #agreed We'll include the zram-generator package for now, which will allow
                        | users to drop down a config file to enable swaponzram. Additionally we'll
                        | add docs to show users how to do this. In the future we'll re-evaluate if
                        | creating a swaponzram device by default, is the right thing for us to do.

@dustymabe dustymabe added the jira for syncing to jira label Oct 14, 2020
dustymabe added a commit to dustymabe/fedora-coreos-config that referenced this issue Oct 14, 2020
This was part of a F33 proposed change. We'll include the generator
but not the defaults subpackage because we don't want it enabled by
default just yet. We'll add docs for users instructing them how to
enable it by dropping down a file via Ignition.

Closes coreos/fedora-coreos-tracker#509
dustymabe added a commit to dustymabe/fedora-coreos-config that referenced this issue Oct 14, 2020
This was part of a F33 proposed change. We'll include the generator
but not the defaults subpackage because we don't want it enabled by
default just yet. We'll add docs for users instructing them how to
enable it by dropping down a file via Ignition/FCCT. The following
snippet is an example:

```
variant: fcos
version: 1.1.0
passwd:
  users:
    - name: core
      ssh_authorized_keys:
        - ssh-rsa AAAAB user@example.com
storage:
  files:
    - path: /etc/systemd/zram-generator.conf
      mode: 0644
      contents:
        inline: |
          # This config file enables a /dev/zram0 device with the default settings
          [zram0]
```

Closes coreos/fedora-coreos-tracker#509
dustymabe added a commit to dustymabe/fedora-coreos-config that referenced this issue Oct 15, 2020
dustymabe added a commit to dustymabe/fedora-coreos-docs that referenced this issue Oct 15, 2020
dustymabe added a commit to dustymabe/fedora-coreos-config that referenced this issue Oct 15, 2020
dustymabe added a commit to dustymabe/fedora-coreos-config that referenced this issue Oct 15, 2020
dustymabe added a commit to coreos/fedora-coreos-config that referenced this issue Oct 15, 2020
dustymabe added a commit to coreos/fedora-coreos-docs that referenced this issue Oct 21, 2020
@dustymabe
Member

The fix for this went into testing stream release 32.20201018.2.0. Please try out the new release and report issues.

@dustymabe dustymabe added the status/pending-stable-release Fixed upstream and in testing. Waiting on stable release. label Oct 21, 2020
@dustymabe
Member

The fix for this went into stable stream release 32.20201018.3.0.

@dustymabe dustymabe removed the status/pending-stable-release Fixed upstream and in testing. Waiting on stable release. label Nov 12, 2020
kelvinfan001 pushed a commit to kelvinfan001/fedora-coreos-config that referenced this issue Dec 14, 2020
@bgilbert bgilbert removed the meeting topics for meetings label Mar 10, 2021