feature: Health warnings on long network ping times, add "dump_osd_network" to get a report #28755

Merged: 18 commits, Sep 5, 2019

Commits on Aug 26, 2019

  1. osd mon: Track heartbeat ping times and report health warning

    Fixes: http://tracker.ceph.com/issues/40640
    
    Signed-off-by: David Zafman <dzafman@redhat.com>
    dzafman committed Aug 26, 2019
    66d44e7
  2. osd: Add "dump_osd_network" osd admin request to get a sorted report

    Signed-off-by: David Zafman <dzafman@redhat.com>
    dzafman committed Aug 26, 2019
    025b10a
  3. mgr: Add "dump_osd_network" mgr admin request to get a sorted report

    Signed-off-by: David Zafman <dzafman@redhat.com>
    dzafman committed Aug 26, 2019
    5d3c185
  4. osd mgr mon: Add mon_warn_on_slow_ping_ratio config as 5% of osd_heartbeat_grace
    
    Compute network ping threshold based on ratio (5% of 20 seconds is 1 second)
    Make the threshold value used part of dump_osd_network for osd and mgr
    Keep mon_warn_on_slow_ping_time (default 0) to optionally override the ratio
    
    Signed-off-by: David Zafman <dzafman@redhat.com>
    dzafman committed Aug 26, 2019
    0d1bbd3
  5. doc: Add documentation and release notes

    Signed-off-by: David Zafman <dzafman@redhat.com>
    dzafman committed Aug 26, 2019
    f4a0be2
  6. osd mgr: Add minimum and maximum tracking to network ping time

    Signed-off-by: David Zafman <dzafman@redhat.com>
    dzafman committed Aug 26, 2019
    297a0e7
  7. osd mgr: Store last pingtime for possible graphing

    Signed-off-by: David Zafman <dzafman@redhat.com>
    dzafman committed Aug 26, 2019
    3f846d7
  8. osd: After first interval populate vectors so 5min/15min values aren't 0

    Signed-off-by: David Zafman <dzafman@redhat.com>
    dzafman committed Aug 26, 2019
    6555699
  9. osd mon: Add last_update to osd_stat_t heartbeat info

    Ignore old heartbeat info which hasn't updated
    
    Signed-off-by: David Zafman <dzafman@redhat.com>
    dzafman committed Aug 26, 2019
    ea20d35
  10. mon: Indicate when an osd with slow ping time is down

    Signed-off-by: David Zafman <dzafman@redhat.com>
    dzafman committed Aug 26, 2019
    5ab145d
  11. osd mgr: Add osd_mon_heartbeat_stat_stale option to time out ping info

    after 1 hour
    
    Signed-off-by: David Zafman <dzafman@redhat.com>
    dzafman committed Aug 26, 2019
    048f809
  12. osd: Add debug_disable_randomized_ping config for use in testing

    Signed-off-by: David Zafman <dzafman@redhat.com>
    dzafman committed Aug 26, 2019
    f2b26d8
  13. osd: Add debug_heartbeat_testing_span to allow quicker testing

    Signed-off-by: David Zafman <dzafman@redhat.com>
    dzafman committed Aug 26, 2019
    573aea2
  14. test: Add basic test for network ping tracking

    Signed-off-by: David Zafman <dzafman@redhat.com>
    dzafman committed Aug 26, 2019
    4fb42ea
  15. common: Add support routines to generate strings for fixed point

    Signed-off-by: David Zafman <dzafman@redhat.com>
    dzafman committed Aug 26, 2019
    8ac1562
  16. osd mon mgr: Convert all network ping time output to milliseconds

    To output milliseconds (usec / 1000), treat as fixed point integers
    
    Signed-off-by: David Zafman <dzafman@redhat.com>
    dzafman committed Aug 26, 2019
    9d02e5d
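
The threshold rule described in commit 4 above (a ratio of osd_heartbeat_grace, with mon_warn_on_slow_ping_time as an optional override when non-zero) can be sketched roughly as follows. This is an illustrative Python sketch, not the Ceph implementation: the function names are hypothetical, and treating the override as a millisecond value is an assumption based on the later "To milliseconds for config value" commit.

```python
from typing import Dict, List


def slow_ping_threshold_ms(heartbeat_grace_s: float = 20.0,
                           slow_ping_ratio: float = 0.05,
                           slow_ping_time_ms: float = 0.0) -> float:
    """Return the ping-time warning threshold in milliseconds.

    Hypothetical helper: parameter names mirror the config options
    osd_heartbeat_grace, mon_warn_on_slow_ping_ratio and
    mon_warn_on_slow_ping_time, but the units are assumptions.
    """
    # A non-zero explicit time overrides the ratio-based value.
    if slow_ping_time_ms > 0:
        return slow_ping_time_ms
    # Otherwise: ratio of the grace period, converted from seconds to ms
    # (5% of 20 seconds is 1 second, i.e. 1000 ms).
    return heartbeat_grace_s * slow_ping_ratio * 1000.0


def slow_pings(ping_times_ms: Dict[str, float], threshold_ms: float) -> List[str]:
    """Return the sorted names of peers whose ping time exceeds the threshold."""
    return sorted(peer for peer, t in ping_times_ms.items() if t > threshold_ms)
```

With the defaults, the threshold works out to about 1000 ms; passing a non-zero `slow_ping_time_ms` returns that value unchanged.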

Commits on Sep 4, 2019

  1. osd doc mon mgr: To milliseconds for config value, user input and threshold out
    
    Signed-off-by: David Zafman <dzafman@redhat.com>
    dzafman committed Sep 4, 2019
    5f83a61
  2. doc: Document network performance monitoring

    Signed-off-by: David Zafman <dzafman@redhat.com>
    dzafman committed Sep 4, 2019
    71015b9
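
The millisecond conversion mentioned in these commits ("usec / 1000, treat as fixed point integers") can be illustrated with a small sketch. It assumes ping times are stored as integer microseconds and printed with three decimal places. The helper name is hypothetical and is not the routine added to Ceph's common code.

```python
def usec_to_ms_string(usec: int) -> str:
    """Format an integer microsecond count as a millisecond string.

    Hypothetical helper: uses only integer arithmetic (fixed point with
    three decimal digits), so no floating-point rounding is involved.
    """
    if usec < 0:
        raise ValueError("usec must be non-negative")
    # Split into whole milliseconds and the remaining microseconds.
    whole_ms, frac_usec = divmod(usec, 1000)
    # Zero-pad the fraction so 5 usec prints as 0.005, not 0.5.
    return f"{whole_ms}.{frac_usec:03d}"
```

For example, `usec_to_ms_string(1234)` gives `"1.234"`. Keeping the value as an integer until the final formatting step avoids the rounding surprises a float division could introduce.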