
[dbnode] Add claims for index segments volume index #2846

Merged: 17 commits merged into master on Nov 11, 2020

Conversation

robskillington (Collaborator)

What this PR does / why we need it:

This adds the ability to safely flush index segments for the same block start in parallel, without concerns about concurrency or writes to the same file paths. It also makes it safe to perform more compactions, updates, etc. against the same block start for index segments.
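
As a rough illustration of the claim mechanism (names and signatures here are illustrative sketches, not the PR's actual code): a shared claims manager hands out a strictly increasing volume index per namespace and block start, so concurrent flushes targeting the same block start can never collide on a fileset path.

package main

import (
	"fmt"
	"sync"
	"time"
)

type claimKey struct {
	namespace  string
	blockStart int64 // block start as UnixNanos
}

type indexClaimsManager struct {
	mu              sync.Mutex
	nextVolumeIndex map[claimKey]int
}

func newIndexClaimsManager() *indexClaimsManager {
	return &indexClaimsManager{nextVolumeIndex: make(map[claimKey]int)}
}

// ClaimNextIndexFileSetVolumeIndex hands out a volume index that is unique
// for the given namespace and block start, even under concurrent callers.
func (m *indexClaimsManager) ClaimNextIndexFileSetVolumeIndex(ns string, blockStart time.Time) int {
	m.mu.Lock()
	defer m.mu.Unlock()
	key := claimKey{namespace: ns, blockStart: blockStart.UnixNano()}
	idx := m.nextVolumeIndex[key]
	m.nextVolumeIndex[key] = idx + 1
	return idx
}

func main() {
	mgr := newIndexClaimsManager()
	blockStart := time.Now().Truncate(2 * time.Hour)

	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// Each concurrent flush of the same block start claims a distinct
			// volume index, so fileset paths never overlap.
			fmt.Println("claimed volume index:", mgr.ClaimNextIndexFileSetVolumeIndex("metrics", blockStart))
		}()
	}
	wg.Wait()
}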

Special notes for your reviewer:

Does this PR introduce a user-facing and/or backwards incompatible change?:

NONE

Does this PR require updating code package or user-facing documentation?:

NONE

@notbdu (Contributor) left a comment

Left a couple nits, but changes LGTM.

Seems like this logic works fine w/ how we determine whether we've warm flushed an index block:

func (i *nsIndex) hasIndexWarmFlushedToDisk(
        infoFiles []fs.ReadIndexInfoFileResult,
        blockStart time.Time,
) bool {
        var hasIndexWarmFlushedToDisk bool
        // NB(bodu): We consider the block to have been warm flushed if there are any
        // filesets on disk. This is consistent with the "has warm flushed" check in the db shard.
        // Shard block starts are marked as having warm flushed if an info file is successfully read from disk.
        for _, f := range infoFiles {
                indexVolumeType := idxpersist.DefaultIndexVolumeType
                if f.Info.IndexVolumeType != nil {
                        indexVolumeType = idxpersist.IndexVolumeType(f.Info.IndexVolumeType.Value)
                }
                if f.ID.BlockStart == blockStart && indexVolumeType == idxpersist.DefaultIndexVolumeType {
                        hasIndexWarmFlushedToDisk = true
                }
        }
        return hasIndexWarmFlushedToDisk
}

We check the index volume type so there is no dependency on the volume index.

// Now check if previous claim exists.
blockStart := xtime.ToUnixNano(blockStartTime)
namespace := namespaceMetadata.ID()
key := fmt.Sprintf("%s/%s/%d", filePathPrefix, namespace.String(),
Contributor

nit: is it better to have a nested map here so we're not generating a ton of keys with duplicated strings for long retention?

Also, isn't the file path prefix static once set? I don't think we have multiple file path prefixes or dynamically changing file paths at runtime.
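
For illustration, a sketch of the nested-map shape being suggested (type and function names are assumptions, not the PR's): keyed by namespace and then block start, so no formatted string key is allocated per claim, and the static file path prefix is left out of the key entirely.

type volumeIndexClaim struct {
	nextVolumeIndex int
}

// namespace -> block start (UnixNanos) -> claim state.
var indexVolumeIndexClaims = make(map[string]map[int64]volumeIndexClaim)

func claimNextVolumeIndex(namespace string, blockStartUnixNanos int64) int {
	byBlockStart, ok := indexVolumeIndexClaims[namespace]
	if !ok {
		byBlockStart = make(map[int64]volumeIndexClaim)
		indexVolumeIndexClaims[namespace] = byBlockStart
	}
	claim := byBlockStart[blockStartUnixNanos]
	next := claim.nextVolumeIndex
	claim.nextVolumeIndex++
	byBlockStart[blockStartUnixNanos] = claim
	return next
}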

rp, bs, t := retOpts.RetentionPeriod(), indexOpts.BlockSize(), nowFn()
earliestBlockStart := retention.FlushTimeStartForRetentionPeriod(rp, bs, t)
earliestBlockStartUnixNanos := xtime.ToUnixNano(earliestBlockStart)
for key, claim := range indexVolumeIndexClaims {
Contributor

nit: probably a micro-optimization, but would it be better to iterate over block starts starting at the edge of retention and going backwards in time until we can't find any claim, then bail? That way we only ever iterate over block starts that are out of retention (if any). If there are none, it checks the map once and bails early.
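
A rough sketch of that suggestion (illustrative names, not the PR's code), assuming a per-namespace map keyed by block start and that expired claims are contiguous in time:

// deleteOutOfRetentionClaims walks block starts backwards from the edge of
// retention, deleting claims until a block start with no claim is found.
// Assumes expired claims are contiguous, so a missing entry means everything
// older was already cleaned up.
func deleteOutOfRetentionClaims(
	claims map[int64]struct{}, // block start (UnixNanos) -> claim
	earliestBlockStartUnixNanos int64,
	blockSizeNanos int64,
) {
	for bs := earliestBlockStartUnixNanos - blockSizeNanos; ; bs -= blockSizeNanos {
		if _, ok := claims[bs]; !ok {
			// Nothing claimed at this block start; bail early.
			return
		}
		delete(claims, bs)
	}
}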

// to validate the index segment.
// If fails validation will rebuild since missing from
// fulfilled range.
fsOpts = fsOpts.SetIndexReaderAutovalidateIndexSegments(true)
Contributor

+1 on this change. Are there other places where we'd also want to validate index segments on read, or are we already doing that?

Contributor

Self follow-up on this: it looks like the only other place we read index segments is in the persist manager, after flushing the segments to disk. We've just written to disk at that point, so that should be fine.

@nbroyles self-requested a review November 6, 2020 16:22
@nbroyles (Collaborator) left a comment

LG pending code gen + test fix ups

@@ -654,3 +657,70 @@ func (pm *persistManager) SetRuntimeOptions(value runtime.Options) {
pm.currRateLimitOpts = value.PersistRateLimitOptions()
pm.Unlock()
}

var (
indexVolumeIndexClaimsLock sync.Mutex
Contributor

If we decide to go down the path of a global persist manager instance vs. a global lock shared between persist managers: currently flush and cold flush each have their own persist manager instance.

We also create one for the fs/peers bootstrappers, but those run serially.

Contributor

Spoke offline; decided to go with a global index volume index claims manager approach instead.
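
Roughly, the agreed shape looks something like the following sketch (names are illustrative only): a single claims manager instance is created once and handed to both the flush and cold flush persist managers, rather than each persist manager coordinating through a package-level lock.

type indexClaimsManager struct {
	// Claim state keyed by namespace and block start lives here, guarded by
	// the manager's own lock.
}

type persistManager struct {
	icm *indexClaimsManager // shared instance, not per-persist-manager state
}

func newPersistManager(icm *indexClaimsManager) *persistManager {
	return &persistManager{icm: icm}
}

func wireUp() {
	shared := &indexClaimsManager{}
	warmFlushPM := newPersistManager(shared) // flush path
	coldFlushPM := newPersistManager(shared) // cold flush path
	_, _ = warmFlushPM, coldFlushPM
}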

@notbdu force-pushed the r/add-claims-for-index-segment-volume-indexes branch 2 times, most recently from 7bda086 to a8e0720 on November 9, 2020 19:57

// NewIndexClaimsManager returns an instance of the index claim manager. This manages
// concurrent claims for volume indices per ns and block start.
// NB(bodu): There should be only a single shared index claim manager among all threads
@nbroyles (Collaborator) commented Nov 10, 2020

nit: does it make sense to make this method cache an instance and just return one if it already exists?

Contributor

Hmm, there are unit tests that rely on the constructor returning separate instances of the index claims manager.

I could add a reset method to the index claims manager and explicitly reset it before each test, but that feels kind of awkward, especially just for tests...

I think it's better to keep it as is for now. We have some confidence we're using a single instance since we set it at the storage.Options level and propagate it through the db, with validation checks if it's not set.
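
A rough illustration of that pattern (hypothetical names, not the actual storage.Options API): the claims manager is set once on the options and validated before use, which gives reasonable confidence a single instance is shared without forcing the constructor to cache one.

package storage // hypothetical package, illustrating the pattern only

import "errors"

type indexClaimsManager struct{} // stand-in for the real claims manager type

type options struct {
	indexClaimsManager *indexClaimsManager
}

func (o options) SetIndexClaimsManager(v *indexClaimsManager) options {
	o.indexClaimsManager = v
	return o
}

// Validate fails fast if the shared claims manager was never set; a single
// instance is then enforced by convention (set once, propagated everywhere)
// rather than by caching one inside the constructor.
func (o options) Validate() error {
	if o.indexClaimsManager == nil {
		return errors.New("index claims manager is not set")
	}
	return nil
}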

@nbroyles (Collaborator)

Still LG @notbdu. Codegen failure looks lint related so nbd. 👍

@notbdu force-pushed the r/add-claims-for-index-segment-volume-indexes branch from 544f20e to 5fb09f5 on November 10, 2020 21:14

codecov bot commented Nov 10, 2020

Codecov Report

Merging #2846 (f1ededb) into master (5b5c050) will increase coverage by 0.0%.
The diff coverage is 83.6%.


@@           Coverage Diff            @@
##           master    #2846    +/-   ##
========================================
  Coverage    72.1%    72.1%            
========================================
  Files        1100     1101     +1     
  Lines       99968   100071   +103     
========================================
+ Hits        72077    72184   +107     
+ Misses      22941    22933     -8     
- Partials     4950     4954     +4     
Flag         Coverage Δ
aggregator   75.8% <ø> (-0.1%) ⬇️
cluster      85.0% <ø> (ø)
collector    84.3% <ø> (ø)
dbnode       79.2% <83.6%> (+<0.1%) ⬆️
m3em         74.4% <ø> (ø)
m3ninx       73.1% <ø> (ø)
metrics      17.2% <ø> (ø)
msg          74.0% <ø> (-0.1%) ⬇️
query        68.8% <ø> (ø)
x            80.6% <ø> (+0.2%) ⬆️

Flags with carried forward coverage won't be shown.


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 5b5c050...f1ededb.

@notbdu merged commit 251dc3d into master Nov 11, 2020
@notbdu deleted the r/add-claims-for-index-segment-volume-indexes branch November 11, 2020 05:14
soundvibe added a commit that referenced this pull request Nov 11, 2020
* master: (28 commits)
  [dbnode] Add claims for index segments volume index (#2846)
  [dbnode] Remove namespaces from example config and integration tests (#2866)
  [dbnode] Resurrect flaky test skip (#2868)
  [aggregator] Fix checkCampaignStateLoop (#2867)
  [dbnode] implement deletion method in namespace kvadmin service (#2861)
  Replace closer with resource package (#2864)
  Add coding style guide (#2831)
  Add GOVERNANCE.md to describe governance (#2830)
  Add COMPATIBILITY.md to describe version compatibility (#2829)
  Refactor etcd config as discovery section with convenience types (#2843)
  Refactor x/lockfile into dbnode/server (#2862)
  [lint] Disable nlreturn linter (#2865)
  [m3cluster] Expose placement algorithm in placement service (#2858)
  [etcd] Set reasonable cluster connection/sync settings by default (#2860)
  [dbnode] Use bits.LeadingZeros64 to improve encoder performance (#2857)
  Cleanup m3nsch leftovers (#2856)
  Update ci-scripts to correct coverage tracking (#2854)
  [aggregator] Process campaign state without waiting for first campaign check interval (#2855)
  Bump go to 1.14 (#2853)
  [query] Remove single series error from M3
  ...
@notbdu restored the r/add-claims-for-index-segment-volume-indexes branch November 12, 2020 05:32