Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid index (epoch) cleanup and compaction during index reads #3638

Closed
julio-lopez opened this issue Feb 8, 2024 · 7 comments · Fixed by #3834
Closed

Avoid index (epoch) cleanup and compaction during index reads #3638

julio-lopez opened this issue Feb 8, 2024 · 7 comments · Fixed by #3834
Assignees
Labels
enterprise For anything enterprise related helping reqs from the industry. keep-open performance resource-utilization

Comments

@julio-lopez
Copy link
Collaborator

julio-lopez commented Feb 8, 2024

Objectives

  • Increase safety in terms of repository consistency.
  • Improve user experience by providing a consistent expected behavior in terms of duration of operations and resource utilization.

Current Issues:

Currently, kopia is performing writes on the index load path.

  • This is not desirable for read-only operations. It is also surprising for the user (See Make "read" commands/operations really read-only.  #3639).
  • It delays the completion of other commands such as repository connect or snapshot list among others.
  • Results in unexpected high resource utilization due to aggressive parallelism.
  • Has the potential for duplicated (and wasted) work. Two clients, or in some cases the same client may perform the same expensive work multiple times. For example, the same epoch could be compacted more than once.
@julio-lopez
Copy link
Collaborator Author

julio-lopez commented Feb 9, 2024

Proposed approach

@julio-lopez
Copy link
Collaborator Author

julio-lopez commented Feb 10, 2024

Index loading simplified call graph before the proposed change

eim-call

julio-lopez added a commit that referenced this issue Feb 16, 2024
Refactor: move index compaction and cleanup out of refreshAttemptLocked

Introduces an `allowWritesOnLoadHelper` to check whether or not writes
can be performed when loading the indexes. Currently this is only
a function of whether the storage is in read-only mode. In the near
future, an explicit flag will be added to control this behavior.

Fix epoch manager: avoid single-epoch compaction when writes are disallowed.
Functional change: prevents compacting single epochs when writes are
disallowed, that is when using read-only storage. Currently, the epoch
manager will attempt to perform single-epoch compactions for all
eligible epochs, even on read-only storage.

Ref:
- #3224
- #3225
- #3638
- #3639
@ashmrtn
Copy link
Collaborator

ashmrtn commented Feb 16, 2024

silly question, but I could use a little extra context: what type(s) of maintenance will allow index compaction? Will it be full and quick maintenance or just one of them?

julio-lopez added a commit that referenced this issue Feb 16, 2024
Conditionally disables epoch index maintenance operations when loading indexes.
This prevents (potentially expensive) cleanup write operations on the index
read path.
The behavior is controlled via the `epoch.Manager.allowCleanupWritesOnIndexLoad`
field, which can be temporarily overridden via an environment variable.
This override mechanism will be removed in the near future.

Refs:

- #3174
- #3224
- #3225
- #3638
- #3639
@julio-lopez
Copy link
Collaborator Author

julio-lopez commented Feb 16, 2024

silly question, but I could use a little extra context: what type(s) of maintenance will allow index compaction? Will it be full and quick maintenance or just one of them?

@ashmrtn Good question.

The approach implemented in the current PR is as follows.

For quick maintenance:

  • Single epoch compactions
  • Epoch advancement

For full maintenance:

  • Single epoch compactions
  • Epoch range compaction
  • Epoch advancement
  • Epoch marker cleanup
  • Deletion watermark cleanup

Does that make sense?

@ashmrtn
Copy link
Collaborator

ashmrtn commented Feb 16, 2024

Thanks for the info Julio! The details you laid out make sense from an understanding perspective

However, looking at the additional info I'm unsure how to feel about the range compaction being in full maintenance. I don't have hard numbers, but I feel like it starts to turn the index compaction/data blob compaction into a bit of a balancing act. Keeping the index compacted with range compactions helps reduce GETs and general load time to do any work on the repo. However, if it's with full maintenance, which also does data blob compaction, power users need to consider the cost of potentially compacting data blobs which could trigger GET requests to slower storage tiers

Do you have any thoughts/guidance on this?

@julio-lopez
Copy link
Collaborator Author

@ashmrtn The change should not adversely impact users, and instead in some cases result in resource savings, potentially including fewer storage requests (GETs, etc). Overall, this does not change the effective frequency of single epoch or range compactions. See details below.

Here's the reasoning, based also on observations with various repositories:

  • The actual (effective) frequency of single epoch compaction would remain the same, it would happen ~ every 24 hours or so. What could change is that it "may" be delayed by about 4 hours in the worst case. Given the current default parameters, the minimum epoch duration is 24 hours (AND a minimum number of index blobs). The difference is that instead of implicitly checking every 15 minutes, the check would happen on every maintenance execution (both quick and full), which by default would be done every 4 hours.

  • The actual (effective) frequency of epoch range compaction would remain the same, it would happen every 7 epochs with default parameters, this is roughly ~7 days for repos with enough churn. In the worst case scenario, the range compaction could be delayed ~24 hours, since this would be done in full maintenance. Notice that at this point single-epoch compacted indexes are already being used for these epochs and not the individual raw indexes that get originally written. And for older epochs, the previously range-compacted indexes will be used.

Does that make sense? Does that answer your question?

julio-lopez added a commit to kastenhq/kopia that referenced this issue Mar 6, 2024
julio-lopez added a commit that referenced this issue Mar 7, 2024
Add
- TestMabyeAdvanceEpoch
- TestMabyeAdvanceEpoch_Empty
- TestMaybeAdvanceEpoch_Error
- TestMaybeAdvanceEpoch_GetParametersError

Ref:
- #3638
- #3645
- #3651
julio-lopez added a commit to kastenhq/kopia that referenced this issue Mar 12, 2024
julio-lopez added a commit that referenced this issue Mar 12, 2024
Extracted from #3651.

Thanks to @plar and @redgoat650 for the suggestions.

Ref:

- #3603
- #3645
- #3638
- #3639
Lyndon-Li pushed a commit to Lyndon-Li/kopia that referenced this issue Mar 18, 2024
Refactor: move index compaction and cleanup out of refreshAttemptLocked

Introduces an `allowWritesOnLoadHelper` to check whether or not writes
can be performed when loading the indexes. Currently this is only
a function of whether the storage is in read-only mode. In the near
future, an explicit flag will be added to control this behavior.

Fix epoch manager: avoid single-epoch compaction when writes are disallowed.
Functional change: prevents compacting single epochs when writes are
disallowed, that is when using read-only storage. Currently, the epoch
manager will attempt to perform single-epoch compactions for all
eligible epochs, even on read-only storage.

Ref:
- kopia#3224
- kopia#3225
- kopia#3638
- kopia#3639
Lyndon-Li pushed a commit to Lyndon-Li/kopia that referenced this issue Mar 18, 2024
…#3645)

Conditionally disables epoch index maintenance operations when loading indexes.
This prevents (potentially expensive) cleanup write operations on the index
read path.
The behavior is controlled via the `epoch.Manager.allowCleanupWritesOnIndexLoad`
field, which can be temporarily overridden via an environment variable.
This override mechanism will be removed in the near future.

Refs:

- kopia#3174
- kopia#3224
- kopia#3225
- kopia#3638
- kopia#3639
julio-lopez added a commit that referenced this issue Mar 19, 2024
Refactoring for the original implementation with intRange and
getKeyRange from closed-open ranges [lo, hi) to closed ranges: [lo, hi].
The primary motivation is for consistency with the implementation
of epoch.RangeMetadata in the same package, and thus avoid
confusion and reduce cognitive load.

Changes:

- adds a getContiguousKeyRange wrapper that checks for contiguity.
- getKeyRange simply returns a range with minimum and maximum
  values for the keys in the map.
- changes the range implementation from closed-open ranges [lo, hi)
   to closed ranges: [lo, hi] where both lo and hi are included in the range.
- Additional unit tests are included.
- renames intRange to closedIntRange to reflect new functionality.

Ref:
- Follow up refactor(general): add epoch.getKeyRange helper #3721
- Needed for refactor(general): add epoch.Manager.MaybeCompactSingleEpoch #3728
- Avoid index (epoch) cleanup and compaction during index reads #3638
julio-lopez added a commit that referenced this issue Mar 20, 2024
Add:
- epoch.Manager.MaybeCompactSingleEpoch
- getCompactedEpochRange helper
- oldestUncompactedEpoch helper
- TestOldestUncompactedEpoch
- Tests for MaybeCompactSingleEpoch

Ref:
- Subset and dependency of #3651
- Depends on #3735
- Avoid index (epoch) cleanup and compaction during index reads #3638
- Make "read" commands/operations really read-only.  #3639
Lyndon-Li pushed a commit to Lyndon-Li/kopia that referenced this issue Mar 22, 2024
Refactor: move index compaction and cleanup out of refreshAttemptLocked

Introduces an `allowWritesOnLoadHelper` to check whether or not writes
can be performed when loading the indexes. Currently this is only
a function of whether the storage is in read-only mode. In the near
future, an explicit flag will be added to control this behavior.

Fix epoch manager: avoid single-epoch compaction when writes are disallowed.
Functional change: prevents compacting single epochs when writes are
disallowed, that is when using read-only storage. Currently, the epoch
manager will attempt to perform single-epoch compactions for all
eligible epochs, even on read-only storage.

Ref:
- kopia#3224
- kopia#3225
- kopia#3638
- kopia#3639
Lyndon-Li pushed a commit to Lyndon-Li/kopia that referenced this issue Mar 22, 2024
…#3645)

Conditionally disables epoch index maintenance operations when loading indexes.
This prevents (potentially expensive) cleanup write operations on the index
read path.
The behavior is controlled via the `epoch.Manager.allowCleanupWritesOnIndexLoad`
field, which can be temporarily overridden via an environment variable.
This override mechanism will be removed in the near future.

Refs:

- kopia#3174
- kopia#3224
- kopia#3225
- kopia#3638
- kopia#3639
julio-lopez added a commit that referenced this issue Mar 27, 2024
…3651)

Perform index epoch compaction and cleanup during repository maintenance

* refactor: rename maintenance task for deleting superseded indexes
* maintenance task to cleanup epoch markers
* maintenance task to advance write index epoch
* maintenance task to compact epoch ranges
* quick maintenance task to compact a single (index) epoch
* full maintenance task to compact a single (index) epoch

Ref:
- #3638
- #3639

Followup to:
- #3603
- #3645

Related helpers and changes:
- #3721
- #3735
- #3709
- #3728
- #3727
- #3726
julio-lopez added a commit that referenced this issue Mar 27, 2024
Lyndon-Li pushed a commit to Lyndon-Li/kopia that referenced this issue Mar 29, 2024
…opia#3651)

Perform index epoch compaction and cleanup during repository maintenance

* refactor: rename maintenance task for deleting superseded indexes
* maintenance task to cleanup epoch markers
* maintenance task to advance write index epoch
* maintenance task to compact epoch ranges
* quick maintenance task to compact a single (index) epoch
* full maintenance task to compact a single (index) epoch

Ref:
- kopia#3638
- kopia#3639

Followup to:
- kopia#3603
- kopia#3645

Related helpers and changes:
- kopia#3721
- kopia#3735
- kopia#3709
- kopia#3728
- kopia#3727
- kopia#3726
Lyndon-Li pushed a commit to Lyndon-Li/kopia that referenced this issue Mar 29, 2024
julio-lopez added a commit that referenced this issue Apr 1, 2024
Change default 'allowWritesOnIndexLoad' to false when env var is unset
Add TestNoEpochAdvanceOnIndexRead

Ref:
- Followup to #3645
- Avoid index (epoch) cleanup and compaction during index reads #3638
- Make "read" commands/operations really read-only.  #3639
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enterprise For anything enterprise related helping reqs from the industry. keep-open performance resource-utilization
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants