[IMPROVEMENT] General stability and bug fixes. #3999

Merged
merged 20 commits into from
Mar 30, 2023

Commits on Mar 29, 2023

  1. Bug fixes and general stability improvements.

    1. Fixed a bug that would process a removal of a message after the message block was closed.
    2. Improved removal of a non-existent message when we know the store is empty.
    3. Improved last write index size tracking when opening the file descriptor after being closed.
    4. Improved Compact() by not loading messages for last block twice.
    5. Improved Compact() determination of calling purge by determining last sequence under write lock.
    6. Improved Compact() by only compacting underlying message block if over certain size threshold.
    7. Improved Compact() by writing the index file if needed while still holding the lock, avoiding an unnecessary re-lock.
    8. Improved Compact() by not calling out to upper layers on no messages being purged.
    9. Fixed a bug in Compact() that would not delete members from a block's delete map.
    10. Fixed a bug in reset() when a callback was not registered (raft logs), which avoided msg block cleanup.
    11. Improved consumer store Update() call for when to avoid an outdated update.
    
    Signed-off-by: Derek Collison <derek@nats.io>
    derekcollison committed Mar 29, 2023
    6d43041
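Item 6 of the commit above gates block compaction on a size threshold. A minimal sketch of that kind of check follows, assuming a hypothetical `msgBlock` type, a hypothetical `minCompactBytes` constant, and a hypothetical `shouldCompact` helper; none of these names or values are taken from the server code.

```go
package main

import "fmt"

// minCompactBytes is a hypothetical threshold chosen for illustration.
// It is not the value used by the server.
const minCompactBytes = 2 * 1024 * 1024

// msgBlock is a simplified, hypothetical stand-in for a message block.
type msgBlock struct {
	rawBytes  uint64 // bytes on disk, including deleted messages
	liveBytes uint64 // bytes belonging to messages that still exist
}

// shouldCompact reports whether rewriting the block is worth the I/O.
// Small blocks are skipped, and so are blocks with nothing to reclaim.
func (mb *msgBlock) shouldCompact() bool {
	if mb.rawBytes <= minCompactBytes {
		return false
	}
	return mb.liveBytes < mb.rawBytes
}

func main() {
	small := &msgBlock{rawBytes: 64 * 1024, liveBytes: 1024}
	large := &msgBlock{rawBytes: 8 * 1024 * 1024, liveBytes: 1024 * 1024}
	fmt.Println(small.shouldCompact(), large.shouldCompact()) // prints "false true"
}
```

The point of such a gate is that rewriting a small block costs more than the space it reclaims.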
  2. Bug fixes and general stability improvements.

    1. If reset, ignore Applied() calls that are greater than our commit.
    2. Improved StepDown() by placing at back of queue if preferred.
    3. Improved handling of leadership transfer during StepDown().
    4. Do not store EntryLeaderTransfer records on disk.
    5. Remove un-needed processing of older terms.
    6. If append entry has higher term, also inherit pterm.
    7. Only inherit a candidate's term if we decide to vote for them.
    
    Signed-off-by: Derek Collison <derek@nats.io>
    derekcollison committed Mar 29, 2023
    182bf6c
  3. Improvements to consumers attached to an interest retention stream.

    1. Do not process an ack if we are closed.
    2. When checking for needing an ack for a given consumer, hold lock entire time.
    3. During recovery and restarts we check if we need to replay acks to the parent stream.
    
    Signed-off-by: Derek Collison <derek@nats.io>
    derekcollison committed Mar 29, 2023
    e516c47
  4. General improvements around handling interest retention.

    1. During ackMsg processing hold write lock to block concurrent access.
    2. Check for presence of preAcks before and force removal if present.
    3. Rework check for orphan msgs on startup to use checkStateForInterestStream().
    
    Signed-off-by: Derek Collison <derek@nats.io>
    derekcollison committed Mar 29, 2023
    5cabc36
  5. General improvements to interest based stream processing when acks arrive before the actual msgs.
    
    1. If we are retention based, make sure our consumers are running before entering into monitorStream logic.
    2. If we skip messages and are interest based, make sure we check for a preAck state.
    3. On finalization of recovery for consumers have them check against the interest based stream.
    4. Do not process ack state updates if consumer is closed and shutting down.
    5. When processing final state for a stream after upper layer catchup, check all attached consumers for ack skew.
    6. During catchup of stream messages consult preAck state and skip messages as needed.
    
    Signed-off-by: Derek Collison <derek@nats.io>
    derekcollison committed Mar 29, 2023
    71af150
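The commit above deals with acks that reach a replica before the message they acknowledge. One way to picture the preAck idea is the sketch below, which assumes a single consumer and a hypothetical in-memory `stream` type; the field and method names are illustrative only and are not the server's.

```go
package main

import "fmt"

// stream is a hypothetical, single-consumer model of an interest based stream.
type stream struct {
	lastSeq uint64              // highest sequence seen so far
	msgs    map[uint64]string   // messages currently stored
	preAcks map[uint64]struct{} // acks that arrived before their message
}

func newStream() *stream {
	return &stream{msgs: map[uint64]string{}, preAcks: map[uint64]struct{}{}}
}

// ack removes an acknowledged message, or records a preAck if the
// message has not been seen yet.
func (s *stream) ack(seq uint64) {
	if seq > s.lastSeq {
		s.preAcks[seq] = struct{}{}
		return
	}
	delete(s.msgs, seq)
}

// store applies a message during catchup. It returns false when the
// message was already acked and is therefore skipped.
func (s *stream) store(seq uint64, msg string) bool {
	if seq > s.lastSeq {
		s.lastSeq = seq
	}
	if _, ok := s.preAcks[seq]; ok {
		delete(s.preAcks, seq)
		return false
	}
	s.msgs[seq] = msg
	return true
}

func main() {
	s := newStream()
	s.ack(2)                       // ack for seq 2 arrives first
	fmt.Println(s.store(1, "one")) // true: stored
	fmt.Println(s.store(2, "two")) // false: skipped because of the preAck
	fmt.Println(len(s.msgs), len(s.preAcks))
}
```

A real stream has to track interest per consumer, which this sketch deliberately leaves out.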
  6. Additional tests to stress interest based streams with pull subscribers during rolling restarts.
    
    Signed-off-by: Derek Collison <derek@nats.io>
    derekcollison committed Mar 29, 2023
    0d9f707
  7. Since we no longer store leaderTransfers, which is proper, some tests were getting an advantage from that after server restart.

    This change speeds up the raft layer further to avoid timeouts.
    
    Signed-off-by: Derek Collison <derek@nats.io>
    derekcollison committed Mar 29, 2023
    52fbac6
  8. Tweak tests due to changes, make test timeouts uniform.

    Signed-off-by: Derek Collison <derek@nats.io>
    derekcollison committed Mar 29, 2023
    e97ddcd
  9. Make sure consumer is valid and state was returned

    Signed-off-by: Derek Collison <derek@nats.io>
    derekcollison committed Mar 29, 2023
    c4da37e
  10. Snapshots of no length can hold state as well

    Signed-off-by: Derek Collison <derek@nats.io>
    derekcollison committed Mar 29, 2023
    35d1a77
  11. Fix for flapping test

    Signed-off-by: Derek Collison <derek@nats.io>
    derekcollison committed Mar 29, 2023
    a9a4df8
  12. Additional protection for bad state when rebuilding a message block

    Signed-off-by: Derek Collison <derek@nats.io>
    derekcollison committed Mar 29, 2023
    ddfa5cd
  13. Always make sure cluster and meta raft node available when needed

    Signed-off-by: Derek Collison <derek@nats.io>
    derekcollison committed Mar 29, 2023
    6c3e64b
  14. On bad or corrupt message load during commit, reset WAL vs mark write error
    
    Signed-off-by: Derek Collison <derek@nats.io>
    derekcollison committed Mar 29, 2023
    e274693
  15. Double check here if the jetstream cluster was shutdown when we released the lock
    
    Signed-off-by: Derek Collison <derek@nats.io>
    derekcollison committed Mar 29, 2023
    2b89fea
  16. Update server/jetstream_cluster.go

    Pre-allocate
    
    Co-authored-by: Neil <neil@nats.io>
    derekcollison and neilalexander authored Mar 29, 2023
    c77872b
  17. Update server/stream.go

    Pre-allocate
    
    Co-authored-by: Neil <neil@nats.io>
    derekcollison and neilalexander authored Mar 29, 2023
    152b25c
  18. Update based on review feedback

    Signed-off-by: Derek Collison <derek@nats.io>
    derekcollison committed Mar 29, 2023
    9a714e7
  19. Snapshot meta for this function to use in case it gets removed out from underneath us.
    
    Signed-off-by: Derek Collison <derek@nats.io>
    derekcollison committed Mar 29, 2023
    ade0e9d
  20. Moved long running test to NoRace suite

    Signed-off-by: Derek Collison <derek@nats.io>
    derekcollison committed Mar 29, 2023
    c546828