Improved location claim algorithm #1008

Merged on Jun 12, 2023 (32 commits)

Commits on Mar 27, 2023

  1. Configuration and module alignment

    Add claim function and target_n_val configuration into cuttlefish.
    
    Move modules around to try and make it more obvious where functions used in membership reside.
    martinsumner committed Mar 27, 2023
    234e863
  2. e1ce784
  3. Update framework for Claim

    Stops fake wants being required to prompt claim on a location change.
    
    Allow for a claim module to implement a sort_members_for_choose(Ring, Members, Owners) -> SortedMembers function, to pre-sort the members being passed into claim_rebalance.
    
    Add further specs.
    martinsumner committed Mar 27, 2023
    28fbcf9
  4. Add choose_claim_v4

    martinsumner committed Mar 27, 2023
    eb13a48
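
The optional pre-sort hook described in "Update framework for Claim" can be pictured with a minimal Python sketch. It is not the riak_core code: every name below is hypothetical, and the `Ring` argument of `sort_members_for_choose(Ring, Members, Owners)` is left out for brevity. The only idea taken from the commit message is that a claim module may supply a sorter, and that the default order is kept when it does not.

```python
from typing import Callable, List, Optional, Sequence

# Hypothetical type for a module-supplied sorter: (members, owners) -> sorted members.
Sorter = Callable[[Sequence[str], Sequence[str]], List[str]]


def members_for_rebalance(
    members: Sequence[str],
    owners: Sequence[str],
    sorter: Optional[Sorter] = None,
) -> List[str]:
    """Return the member list to hand to the rebalance step.

    If the claim module supplies a sorter it is applied first;
    otherwise the members are passed through in their given order.
    """
    if sorter is None:
        return list(members)
    return list(sorter(members, owners))


def fewest_owned_first(members: Sequence[str], owners: Sequence[str]) -> List[str]:
    """Example sorter (an assumption): nodes owning the fewest vnodes come first."""
    counts = {m: 0 for m in members}
    for owner in owners:
        if owner in counts:
            counts[owner] += 1
    # Ties are broken by node name so the order is deterministic.
    return sorted(members, key=lambda m: (counts[m], m))
```

A module without a sorter behaves exactly as before, which is the point of making the hook optional.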

Commits on Mar 29, 2023

  1. Location claim improvements

    Location claim improved so that it will try to balance the spread of vnodes, if it reaches the end and is still unbalanced.
    
    Also uses a stronger meets_target_n to fall back to sequential_claim more reliably on incorrect spacing (of vnodes across nodes, but not yet across locations).
    martinsumner committed Mar 29, 2023
    35b9e2f
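
As an illustration of the kind of spacing check that `meets_target_n` refers to, here is a minimal Python sketch. It assumes the check means "every run of `target_n` consecutive partitions, including runs that wrap from the tail of the ring back to the head, is owned by distinct nodes". That reading, and the representation of the ring as a plain list of owner names, are assumptions and not taken from the riak_core source.

```python
from typing import Sequence


def meets_target_n(owners: Sequence[str], target_n: int) -> bool:
    """Return True if every window of target_n consecutive partitions
    has distinct owners, treating the owner list as a ring (wrap-around)."""
    ring_size = len(owners)
    if target_n > ring_size:
        # A window longer than the ring cannot have distinct positions.
        return False
    for start in range(ring_size):
        window = [owners[(start + i) % ring_size] for i in range(target_n)]
        if len(set(window)) < target_n:
            return False
    return True
```

With three nodes striped over eight partitions, the layout looks fine until the tail wraps onto the head, which is the case a weaker check that ignores wrap-around would miss.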

Commits on Mar 30, 2023

  1. Refinements to claim_v4

    Extended potential for test by determining what nodes are safe to both add and remove from loops, rather than simply relying on sequential order.
    martinsumner committed Mar 30, 2023
    4e2c963
  2. 3ab0f4b

Commits on Mar 31, 2023

  1. 8fe7fc7
  2. Better order of initial striping

    Resolves some test issues.  Also tries harder to do safe removals when looking back in the ring (as opposed to the removal of other additions).
    martinsumner committed Mar 31, 2023
    58d6264

Commits on May 3, 2023

  1. A new claim algorithm (#1003)

    * Support two transition changes
    
    Where the second transition is triggered by a change of location.  Need to ensure that the location_changed status update is recognised in the ring
    
    * Unrelated fix to remove reference to gen_fsm_compat
    
    * unrelated fix to get rid of deprecation warning
    
    * Testing claim
    
    * The new claim algorithm as a purely functional algorithm
    
    * add new entry for version 5 claiming
    
    * Refactor v5 into v4
    
    * move impossible config test to place where we actually may enter recursion
    
    * Documentation
    
    The algorithm should be described in more detail in a markup document
    
    * Allow configurations with zero nodes in location for better placement update
    
    This works better when a location is emptied of nodes.
    Fewer transfers.
    
    * Keep order of nodes to avoid back translate issue
    
    ---------
    
    Co-authored-by: Martin Sumner <martin.sumner@adaptip.co.uk>
    ThomasArts and martinsumner committed May 3, 2023
    bfe605b
  2. 278ffa4

Commits on May 9, 2023

  1. Always return indices

    Otherwise, if all vnodes have become excluded there is no escape from this condition (unless other traffic can trigger the creation of vnodes).  This is helpful in situations where transfers are performed on standby clusters with no other traffic.
    
    This commit also logs a timing of the claim each time it is called.
    martinsumner committed May 9, 2023
    e13f4e0
  2. e5083f1

Commits on May 11, 2023

  1. Remember v4 solutions via claimant

    To allow for the riak_core_ring_manager and riak_core_claimant to remember v4 solutions, they are shared via the state of the claimant
    martinsumner committed May 11, 2023
    6820c7b

Commits on May 15, 2023

  1. Long-running tests

    martinsumner committed May 15, 2023
    6527033

Commits on May 17, 2023

  1. Adding an extra test (#1004)

    * Add an extra test to show owners may stay the same if only location changes
    
    This is not always the case, but holds when there is a solution in the first place
    
    * Fix type error that dialyzer could not find
    
    * Introduce necessary conditions to fall back to version 2
    
    * update tests
    
    * Check whether it is worth using brute force
    
    * make historic values the norm
    
    * Introduce nvals map type
    
    * Take the number of nodes into account when checking the brute force condition.
    
    * Property to evaluate skipping brute force strategy
    
    * QuickCheck property starts with choosing ring size.
    
    * Remove fallback for necessary conditions
    
    * Filter tests to get rid of flakiness
    
    * In order to re-run the test suite, remove strict precondition
    
    * Check in test suite
    
    * Replace claim_suite.suite by larger claim.suite
    
    * Sometimes it is worth using brute_force to reach a zero node violation
    
    * Better documentation of the binring algorithm
    
    * Run property with a sufficient condition
    ThomasArts committed May 17, 2023
    0da5c68
  2. Revert "Long-running tests"

    This reverts commit 6527033.
    martinsumner committed May 17, 2023
    be0451c
  3. e7329f0
  4. Test adjustments

    martinsumner committed May 17, 2023
    c43d0a6

Commits on May 18, 2023

  1. Test adjustments

    martinsumner committed May 18, 2023
    b6dcbc2

Commits on May 19, 2023

  1. 2dd845f
  2. Memoise fixes

    The cache of v4 solutions is required by the ring_manager and the claimant - so specifically update both of these processes each time.  Otherwise cache will be missed when the ring_manager calls to riak_core_claimant:ring_changed/2.
    
    There is a fix to the last gasp check before writing the ring file.  prune_write_notify_ring function does not care if the write of a ring errors - so error for this function rather than crashing the ring manager.  This otherwise causes instability in location tests.
    martinsumner committed May 19, 2023
    0c05dc4
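
The reason both processes have to be updated on every change, as described in "Memoise fixes", can be shown with a small Python sketch. It is only an analogy: two dictionaries stand in for the state held by the ring manager and the claimant, and the function name, cache key, and solver are all hypothetical. The point it illustrates is that a solution stored in only one holder is recomputed when the other holder is asked for it.

```python
from typing import Callable, Dict, Hashable, List


def cached_solve(
    own_cache: Dict[Hashable, object],
    all_caches: List[Dict[Hashable, object]],
    key: Hashable,
    solver: Callable[[Hashable], object],
) -> object:
    """Look up a solution in this holder's cache; on a miss, compute it
    and store it in every holder's cache so later lookups hit."""
    if key in own_cache:
        return own_cache[key]
    solution = solver(key)
    for cache in all_caches:
        cache[key] = solution
    return solution


# Two holders standing in for the claimant and the ring manager.
claimant_cache: Dict[Hashable, object] = {}
ring_manager_cache: Dict[Hashable, object] = {}
holders = [claimant_cache, ring_manager_cache]

solver_calls: List[Hashable] = []


def expensive_solver(key: Hashable) -> object:
    """Placeholder for a costly claim computation; records each call."""
    solver_calls.append(key)
    return ("plan-for", key)


# Hypothetical key: (ring size, nodes per location, target n_val).
example_key = (64, (3, 3, 2), 4)
first = cached_solve(claimant_cache, holders, example_key, expensive_solver)
second = cached_solve(ring_manager_cache, holders, example_key, expensive_solver)
```

Because the first call writes to both holders, the second call is served from the cache and the solver runs once.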

Commits on May 23, 2023

  1. Example configurations saved in source format (#1005)

    * Remove pre-computed test suite
    
    * cleanup
    
    * Make claim_eqc tests not fail on weird configs by supplying a diverse list of options
    ThomasArts committed May 23, 2023
    057b17e
  2. Add full-rebalance for v4

    The leave call on a failure of simple_transfer will call sequential_claim - which is part of the v2 claim family.  Now we have v4, if this is configured it should call v4 as it does handle leaves.
    martinsumner committed May 23, 2023
    c9ca336
  3. ba4ef70
  4. 470a57f
  5. Update - to use correct claim_fun on leave

    Property temporarily changed to consider only failures with locations
    martinsumner committed May 23, 2023
    c6a3dd5

Commits on May 25, 2023

  1. Use application env to read target_n_val (#1007)

    * Use application env to read target_n_val
    
    * Re-introduce v2 in riak_core_claim_eqc
    
    * Move precondition to postcondition to also test less perfect cases
    
    * Fixed error in transfer_node usage
    
    * cleanup not using remove_from_cluster/5.
    ThomasArts committed May 25, 2023
    a49697c
  2. Add warning if simple_transfer produces unbalanced result

    In this case - full_rebalance should be enabled
    martinsumner committed May 25, 2023
    60e7199
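
One way to picture the check behind the "unbalanced result" warning is the Python sketch below. The commit message does not say how balance is measured, so the rule used here (no member owns more than one vnode more than any other member) is an assumption, as are the function names and the list-of-owners representation.

```python
from typing import Dict, Sequence


def ownership_counts(owners: Sequence[str], members: Sequence[str]) -> Dict[str, int]:
    """Count vnodes per node; members owning nothing are reported with 0."""
    counts = {m: 0 for m in members}
    for owner in owners:
        counts[owner] = counts.get(owner, 0) + 1
    return counts


def is_balanced(owners: Sequence[str], members: Sequence[str]) -> bool:
    """Assumed rule: balanced when the largest and smallest ownership
    counts differ by at most one."""
    counts = ownership_counts(owners, members)
    if not counts:
        return True
    return max(counts.values()) - min(counts.values()) <= 1
```

If a transfer of a leaving node's vnodes hands most of them to a single remaining node, a check of this kind fails, and that is the situation in which the commit suggests enabling full_rebalance.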

Commits on May 26, 2023

  1. only_swap/swap_only confusion

    Add recommendation to use full_rebalance_on_leave for locations
    martinsumner committed May 26, 2023
    57675fb

Commits on May 27, 2023

  1. bf1e668

Commits on Jun 12, 2023

  1. Mas i1001 docupdate (#1009)

    * Change doc
    
    Change introduction to refer to vnodes and nodes.  Removes the recommendation not to vary location_n_val and node_n_val.
    
    * Update comments on having different target n_vals
    
    * Further doc updates
    
    * Update docs/claim-version4.md
    
    * Update docs/claim-version4.md
    
    * Update docs/claim-version4.md
    
    Co-authored-by: Thomas Arts <thomas.arts@quviq.com>
    
    * Update docs/claim-version4.md
    
    Co-authored-by: Thomas Arts <thomas.arts@quviq.com>
    
    ---------
    
    Co-authored-by: Thomas Arts <thomas.arts@quviq.com>
    martinsumner and ThomasArts committed Jun 12, 2023
    5d8912f