Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[dev] Enforce quota for storing charm and uniter state #11390

Conversation

achilleasa
Copy link
Contributor

@achilleasa achilleasa commented Apr 1, 2020

Description of change

This PR introduces multi-level quota checks when attempting to update the charm and internal uniter state stored on the controller.

The quota checks for charm state (modified via calls to the state-set tool ) operate on two different levels:

  • Each entry in the state map is allowed to have:
    • A maximum key length of 256 bytes.
    • A maximum value length of 65536 bytes (64k).
  • The total serialized size of the state cannot exceed a particular threshold (configured at the controller level via a setting).

The first quota check is always performed by the uniter when state-set is invoked so we can offer early feedback to charms if charm authors are attempting to abuse this feature to store BLOB/CLOB data to the controller instead of using a mechanism such as resources.

Both checks are always performed by the controller when attempting to use the SetState or CommitHookChanges API calls. If the quota checks fail, a typed error (QuotaLimitExceeded from juju/errors) is returned and reconstructed by the API client.

A separate, quota check is also performed when setting the uniter (uniter, per-relation and storage) state. The threshold for this check is also configured at the controller level by the operator.

The rationale behind having separate checks for the charm and uniter state is to allow us to block misbehaving charms but at the same time still allow the unit to persist its internal state (e.g. the hook that failed etc.)

The two quotas are controlled by the following controller options:

option default value notes
max-charm-state-size 2M Can be set to 0 to bypass check
max-uniter-state-size 512K Can be set to 0 to bypass check

The values where guestimated to provide charms with enough breathing room for storing an ample amount of data but to discourage charm authors from mis-using the feature. Note that the maximum combined quota cannot exceed 16M which is the hard-limit imposed by mongo for each document.

To enforce the quota, we approximate the total size of stored data by serializing to BSON and counting the length of the serialized data. The logic for the quota checks has been placed in core/quota because:

  • It is generic and potentially re-usable by future work.
  • It needs to be used by both the controller and the uniter worker.

Fun spelunking facts: If you were to import apiserver/common from api/uniter (a common use-case since the RestoreError code lives in there) you will cause an import cycle! See the comments at the end of api/uniter/uniter.go for more details.

QA steps

With actions

# Trigger max-key limit
$ juju run-action ubuntu-plus/0 write-charm-state num-entries=1 key-len=300 val-len=100
Action queued with id: "2"

$ juju show-action-output 2
UnitId: ubuntu-plus/0
id: "2"
message: exit status 1
results:
  ReturnCode: 1
  Stderr: |
    ERROR max allowed key length (256) exceeded
  Stdout: |
    attempting to write a payload with 1 entries where keys have length 300 and values have length 100
status: failed
timing:
  completed: 2020-04-01 15:45:05 +0000 UTC
  enqueued: 2020-04-01 15:45:01 +0000 UTC
  started: 2020-04-01 15:45:04 +0000 UTC

# Trigger max-value-limit
$ juju run-action ubuntu-plus/0 write-charm-state num-entries=1 key-len=10 val-len=65537
Action queued with id: "4"

$ juju show-action-output 4
UnitId: ubuntu-plus/0
id: "4"
message: exit status 1
results:
  ReturnCode: 1
  Stderr: |
    ERROR max allowed value length (65536) exceeded
  Stdout: |
    attempting to write a payload with 1 entries where keys have length 10 and values have length 65537
status: failed
timing:
  completed: 2020-04-01 15:47:04 +0000 UTC
  enqueued: 2020-04-01 15:47:00 +0000 UTC
  started: 2020-04-01 15:47:04 +0000 UTC

# List default limits
$ juju controller-config | grep state-size
max-charm-state-size       2.097152e+06
max-uniter-state-size      524288

# Trigger max state limit
$ juju controller-config max-charm-state-size=1337
$ juju run-action ubuntu-plus/0 write-charm-state num-entries=5 key-len=10 val-len=1000
Action queued with id: "6"

$ juju show-action-output 6
UnitId: ubuntu-plus/0
id: "6"
message: 'cannot apply changes: persisting charm state: max allowed size (1337) exceeded'
results:
  Stdout: |
    attempting to write a payload with 5 entries where keys have length 10 and values have length 1000
status: failed
timing:
  completed: 2020-04-01 15:51:00 +0000 UTC
  enqueued: 2020-04-01 15:50:58 +0000 UTC
  started: 2020-04-01 15:50:59 +0000 UTC

# Make sure we can still write to the state; 640k should be enough for everyone ;-)
$ juju controller-config max-charm-state-size=640000
$ juju run-action ubuntu-plus/0 write-charm-state num-entries=5 key-len=10 val-len=1000
Action queued with id: "8"

$ juju show-action-output 8  
UnitId: ubuntu-plus/0
id: "8"
results:
  Stdout: |
    attempting to write a payload with 5 entries where keys have length 10 and values have length 1000
status: completed
timing:
  completed: 2020-04-01 15:53:12 +0000 UTC
  enqueued: 2020-04-01 15:53:08 +0000 UTC
  started: 2020-04-01 15:53:11 +0000 UTC

$ juju run --unit ubuntu-plus/0 'state-get'
"0000000001": aaaaaaa ...
...
"0000000005": aaaaaaa ...

With regular hooks

Besides actions we should test a similar scenario for hooks by running the write-charm-state tool via juju run, e.g:

$ juju run --unit ubuntu-plus/0 'tools/write-charm-state 5 10 1000'
attempting to write a payload with 5 entries where keys have length 10 and values have length 1000

Documentation changes

We need to update the charm author resources to communicate the fixed limits per key/value as well as that there is an operator-configurable knob for the total state size and its default value.

@achilleasa
Copy link
Contributor Author

@howbazaar Since we were chatting about this you may want to also take a look.

This version brings in the QuotaExceeded error type.
The two options come with sensible defaults, 2M for charm-related state
data and 512k for the uniter's internal state data. Mongo imposes a hard
limit of 16M per document so the two options combined cannot exceed that
hard limit and will trigger a validation error if the do.

Note that the defaults where selected so that we give enough breathing
room for storing charm data but not large enough so that charm authors
are discouraged from mis-using this feature for storing large
CLOBs/BLOBs.
This commit introduces some useful helpers for enforcing quota limits.
More specifically:

- the MapKeyValueSizeChecker allows us to enforce a limit to the length
of map keys and values.
- the BSONTotalSizeChecker allows us to limit the total size of one or
more objects when serialized into BSON.
- the MultiChecker allows us to compose multiple checkers from this package.
This commit modifies the SetState and SetStateOperation methods to
accept an extra argument that specifies a limit to be applied when
updating the charm and unit state data.

The reasoning behind having two quotas is that it allows us to block
misbehaving charms from storing state data that exceeds a threshold
while at the same time still allowing the uniter to persist its internal
state which may contain information about failed hooks etc. The
individual quota limits may also be set to zero by the operator to
bypass quota checks alltogether. In this case, the max limit is enforced
by mongo's hard limit of 16M per document.

Apart from checking the total (operator-configurable) size for charm
state data, the setter will additionally check that the keys and values
in the state map satisfy the *fixed* length limits which are described
in quote/fixed_limits.go. The limit values are provided by the quota
package since they also need to be enforced by the uniter worker when
charm authors invoke the 'state-set' hook tool.
The command docs for state-set have been updated to include the
aforementioned limits. Note that the uniter only enforces the fixed
limits which allow us to give some useful early feedback to the charm
that the charm author is probably attempting to mis-use this particular
feature.

As the changes are buffered in memory and committed in bulk when the
hook context is flushed (after the hook scripts have finished
executing), quota checks for the total charm state size are enforced by
the controller. This makes the implementation simpler as the uniter does
not need to fetch and/or watch the controller's configuration to obtain
the active limits. What's more, the buffer-and-let-the-controller-sort-it-out
approach exhibits better performance charactericts as hooks (either
directly or via the operator framework) may call state-set multiple
times and we certainly want to avoid having to serialize the entire
state for each call to enforce the quota limit.
If the controller manages a large number of units, each unit will have
its own state document in the unitstates collection. Since our queries
automatically filter on the model-uuid, having an index should allow for
faster lookups.
@achilleasa achilleasa force-pushed the dev-enforce-quota-for-storing-charm-and-uniter-state branch from a7b39b0 to 4605310 Compare April 2, 2020 10:17
@achilleasa
Copy link
Contributor Author

Rebased to resolve conflicts after the merge of #11383

Copy link
Member

@manadart manadart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nicely done. QA is all good.

err := api.SetState(params.SetUnitStateArg{
State: &map[string]string{"one": "two"},
})
c.Assert(err, jc.Satisfies, errors.IsQuotaLimitExceeded, gc.Commentf("expected the client to reconstruct QuotaLimitExceeded error from server response"))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice. We should use these comments more.

// Copyright 2020 Canonical Ltd.
// Licensed under the AGPLv3, see LICENCE file for details.

package quota_test
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should have a general rule that package_test.go is local to the package.

@achilleasa
Copy link
Contributor Author

$$merge$$

@jujubot jujubot merged commit 47f29e6 into juju:develop Apr 3, 2020
@achilleasa achilleasa deleted the dev-enforce-quota-for-storing-charm-and-uniter-state branch April 3, 2020 08:52
jujubot added a commit that referenced this pull request Apr 3, 2020
…g-knob-for-uniter-state-size

#11401

This PR renames the controller config option for setting a limit on the internal uniter state (originally introduced by #11390 ) to make the configuration setting a bit more intuitive and congruent with future work to unify the various agents.
jujubot added a commit that referenced this pull request Apr 7, 2020
…s-for-uniter-state

#11422

## Description of change

The quota-limit checks that were introduced by #11390 accidentally broke the migration steps for the agent -> controller state migration due to a mismatch (the addition of an extra argument) in the interface expected by the upgrade worker and the implicit interface implemented by `state.Unit`. 

Unfortunately, this problem was not caught by our tests and surfaced as part of the QA steps for another related chunk of work.

This PR:
- Fixes the interface incompatibility.
- Adds compile-time checks to prevent this from happening in the future.
- Switches from calling `SetState` to using the `SetStateOperation, ApplyOperation` method combo.
- Also renames the `WriteUniterState` API call to `WriteAgentState` so it better conveys its purpose which is to store the migrated agent (uniter agent for now) state that may include workers other than the uniter (e.g. the meter status worker).

## QA steps

```console
# Bootstrap 2.7.5 from snap
$ /snap/bin/juju bootstrap lxd test --no-gui

$ juju deploy cs:~jameinel/ubuntu-lite-7

# Wait for the charm to settle and then upgrade to the code from this PR

$ juju upgrade-controller --build-agent
$ juju upgrade-model

# Verify that no upgrade-related errors pop up in the controller/model logs and
# that the agent state upgrade step has completed successfully
$ juju debug-log -m controller --level ERROR
$ juju debug-log -m default --level INFO
unit-ubuntu-lite-0: 11:12:45 INFO juju.upgrade running upgrade step: write unit agent state to controller for all running units and remove files
unit-ubuntu-lite-0: 11:12:45 INFO juju.upgrade All upgrade steps completed successfully
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants