mon: Monitor: validate prefix on handle_command() #9700

Merged
merged 1 commit into ceph:master on Jun 30, 2016

@JiYou JiYou commented Jun 14, 2016

This patch fixes the following issue:

http://tracker.ceph.com/issues/16297

Ceph monitors crash when mon_command receives an empty prefix sent through rados.py. For example, using the script below to query pool stats crashes the monitors.

"""Utilities and helper functions."""

import json
import subprocess
import logging
import rados


LOG = logging.getLogger(__name__)


class CephConfig(dict):
    """
    Validate ceph cluster connection configs
    """

    def __init__(self, cluster_name):
        dict.__init__(self)

        config = {"ceph_config": "/etc/ceph/%s.conf" % cluster_name,
                  "client_name": "client.admin",
                  "keyring": "/etc/ceph/%s.client.admin.keyring" % cluster_name}
        self['conffile'] = config['ceph_config']
        self['conf'] = dict()

        if 'keyring' in config:
            self['conf']['keyring'] = config['keyring']

        if 'client_name' in config:
            self['name'] = config['client_name']


def ceph_status(cluster_name, timeout):
    """Use rados client to fetch the information."""
    def _run(cluster, timeout, **kwargs):
        ret, buf, err = cluster.mon_command(json.dumps(kwargs), '',
                                            timeout=timeout)
        if ret != 0:
            LOG.error('cluster.mon_command failed: %s', err)
            return None

        return json.loads(buf)

    config = CephConfig(cluster_name)
    with rados.Rados(**config) as cluster:
        status = _run(cluster, timeout,
                      abc="osd pool stats",  # <-- any key other than prefix="..." here crashes the monitor
                      format="json")
        return status
    LOG.error('Failed to fetch info through rados.Rados.')
    return None

# Use the cluster named "ceph".
status = ceph_status('ceph', 60)
print(json.dumps(status, indent=2))

If the mon_command payload does not contain a prefix=xxx parameter, the monitor reads an empty prefix and then crashes.
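
A client-side guard can keep such a payload from ever reaching the monitor. The following is a minimal sketch and not part of this patch: `build_mon_command` is a hypothetical helper name, and it only builds the JSON string that would then be passed to `cluster.mon_command`.

```python
import json


def build_mon_command(prefix, **kwargs):
    """Return the JSON payload for mon_command, insisting on a usable prefix.

    Hypothetical helper for illustration only; it is not part of rados.py.
    """
    # Reject a missing, empty, or whitespace-only prefix before sending anything.
    if not isinstance(prefix, str) or not prefix.strip():
        raise ValueError("mon_command requires a non-empty 'prefix'")
    cmd = dict(kwargs)
    cmd["prefix"] = prefix
    return json.dumps(cmd)
```

With this helper, `build_mon_command("osd pool stats", format="json")` yields a payload with an explicit prefix, while `build_mon_command("")` raises `ValueError` on the client side.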

Signed-off-by: You Ji <youji@ebay.com>

return;
}

if (!valid_cmd) {
Member


I suggest moving this check to just below the cmd_getval() call. There is no need to check whether prefix is empty if the call failed, and the order of the checks will be clearer.
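
The ordering suggested here can be modeled in a few lines. This is a Python toy model rather than the monitor's C++ code: `handle_command` as written below, its return tuple, and the error strings are all hypothetical, and a dictionary lookup stands in for the `cmd_getval()` call.

```python
import errno
import json


def handle_command(raw):
    """Toy model of the check order; returns (retcode, message)."""
    try:
        cmd = json.loads(raw)
    except ValueError:
        return -errno.EINVAL, "command is not valid JSON"
    if not isinstance(cmd, dict):
        return -errno.EINVAL, "command is not a JSON object"

    # Step 1: look up 'prefix' (stands in for the cmd_getval() call).
    if "prefix" not in cmd:
        return -errno.EINVAL, "prefix not found"
    prefix = cmd["prefix"]

    # Step 2: directly below the lookup, reject an empty prefix.
    if not isinstance(prefix, str) or prefix == "":
        return -errno.EINVAL, "prefix is empty or not a string"

    # Step 3: only now continue with command validation and dispatch.
    return 0, "dispatch " + prefix
```

The emptiness check runs only when the lookup succeeded, so each failure has exactly one place where it is reported.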

@jecluis
Member

jecluis commented Jun 14, 2016

Have a few comments on your patch.

Further, please replace all occurrences of 'crush' with 'crash'. While this would otherwise be a minor typo, we don't want to pollute the history with a commit that may unintentionally refer to CRUSH.

Also, adjust your commit message title. Instead of

mon/bug16297: ceph monitor will crush when get empty command prefix

please consider

mon: Monitor: validate prefix on handle_command()

Also, this pull request lacks tests. I recommend implementing a few test cases using an existing test [1] or creating a new unit test.

[1] - test/mon/test-mon-msg.cc -- please be aware that the test may require some tweaks if you want to wait for a reply message to consider success in case of error.

@jecluis jecluis self-assigned this Jun 14, 2016
@jecluis jecluis added this to the kraken milestone Jun 14, 2016
@JiYou JiYou changed the title mon/bug16297: ceph monitor will crush when get empty command prefix mon/bug16297: ceph monitor will crash when get empty command prefix Jun 14, 2016
@JiYou JiYou changed the title mon/bug16297: ceph monitor will crash when get empty command prefix mon: Monitor: validate prefix on handle_command() Jun 14, 2016
@JiYou
Contributor Author

JiYou commented Jun 14, 2016

thanks @jecluis

@JiYou
Contributor Author

JiYou commented Jun 15, 2016

thanks @jecluis @xiaoxichen

I'm now writing test cases. I've found the following cases, which can cause the existing code to crash.

Case 1

If no prefix parameter is provided:

def _run(cluster, timeout, **kwargs):
    ret, buf, err = cluster.mon_command(json.dumps(kwargs), '', timeout=timeout)
with rados.Rados(**config) as cluster:
    status = _run(cluster, timeout)

The Ceph monitor crashes with the following log (I've added some debug output):

        -8> 2016-06-14 22:23:36.231615 7f72f7622700  1 -- 10.147.250.178:6789/0 <== client.314116 10.147.250.178:0/2670976543 4 ==== mon_command({ } v 0) v1 ==== 48+0+0 (3130553754 0 0) 0x3b31800 con 0x3e86420
        -7> 2016-06-14 22:23:36.231746 7f72f7622700  0 mon.cceph-5635@2(peon) e2 is_valid = 0
        -6> 2016-06-14 22:23:36.231771 7f72f7622700  0 mon.cceph-5635@2(peon) e2 prefix =
        -5> 2016-06-14 22:23:36.231784 7f72f7622700  0 mon.cceph-5635@2(peon) e2 handle_command mon_command({ } v 0) v1
        -4> 2016-06-14 22:23:36.231800 7f72f7622700  0 mon.cceph-5635@2(peon) e2 format = plain
        -1> 2016-06-14 22:23:36.231868 7f72f7622700  0 mon.cceph-5635@2(peon) e2 fullcmd.length = 0
     0> 2016-06-14 22:23:36.234271 7f72f7622700 -1 *** Caught signal (Segmentation fault) **

Case 2

If no prefix parameter is provided but extra parameters are:

def _run(cluster, timeout, **kwargs):
    ret, buf, err = cluster.mon_command(json.dumps(kwargs), '', timeout=timeout)
with rados.Rados(**config) as cluster:
    status = _run(cluster, timeout, abc='something', format='json')

This raises the same error as Case 1.

Case 3

If prefix="" is set in the parameters:

def _run(cluster, timeout, **kwargs):
    ret, buf, err = cluster.mon_command(json.dumps(kwargs), '',  timeout=timeout)
with rados.Rados(**config) as cluster:
    status = _run(cluster, timeout, prefix="", format='json')

This produces the following error:

        -7> 2016-06-14 21:42:58.347244 7f06ad1b9700  0 mon.bceph-4634@1(peon) e2 is_valid = 1
        -6> 2016-06-14 21:42:58.347251 7f06ad1b9700  0 mon.bceph-4634@1(peon) e2prefix =
        -5> 2016-06-14 21:42:58.347253 7f06ad1b9700  0 mon.bceph-4634@1(peon) e2 handle_command mon_command({ " p r e f i x " :   " " ,   " f o r m a t " :   " j s o n " } v 0) v1
        -4> 2016-06-14 21:42:58.347447 7f06ad1b9700  0 mon.bceph-4634@1(peon) e2 format = json
        -1> 2016-06-14 21:42:58.347470 7f06ad1b9700  0 mon.bceph-4634@1(peon) e2 fullcmd.length = 0
     0> 2016-06-14 21:42:58.349774 7f06ad1b9700 -1 *** Caught signal (Segmentation fault) **

Case 4

If prefix is set to a string containing only spaces:

def _run(cluster, timeout, **kwargs):
    ret, buf, err = cluster.mon_command(json.dumps(kwargs), '', timeout=timeout)
with rados.Rados(**config) as cluster:
    status = _run(cluster, timeout, prefix="          ", format='json')

This produces the following error:

        -7> 2016-06-14 21:48:38.211439 7f54f4c5a700  0 mon.dceph-6636@3(peon) e2 is_valid = 1
        -6> 2016-06-14 21:48:38.211446 7f54f4c5a700  0 mon.dceph-6636@3(peon) e2 prefix =  "         "   // print with spaces here.
        -5> 2016-06-14 21:48:38.211447 7f54f4c5a700  0 mon.dceph-6636@3(peon) e2 handle_command mon_command({ " p r e f i x " :   "           " ,   " f o r m a t " :   " j s o n " } v 0) v1
        -4> 2016-06-14 21:48:38.211716 7f54f4c5a700  0 mon.dceph-6636@3(peon) e2 format = json
        -1> 2016-06-14 21:48:38.211736 7f54f4c5a700  0 mon.dceph-6636@3(peon) e2 fullcmd.length = 0
     0> 2016-06-14 21:48:38.214421 7f54f4c5a700 -1 *** Caught signal (Segmentation fault) **

Case 5

If prefix=";;;,,,;;;;" is set:

def _run(cluster, timeout, **kwargs):
    ret, buf, err = cluster.mon_command(json.dumps(kwargs), '', timeout=timeout)
with rados.Rados(**config) as cluster:
    status = _run(cluster, timeout, prefix=";;;,,,;;;;", format='json')

This produces the following error:

        -7> 2016-06-14 21:48:38.211439 7f54f4c5a700  0 mon.dceph-6636@3(peon) e2 is_valid = 1
        -6> 2016-06-14 21:48:38.211446 7f54f4c5a700  0 mon.dceph-6636@3(peon) e2 prefix =  ;;;,,,;;;;
        -5> 2016-06-14 21:48:38.211447 7f54f4c5a700  0 mon.dceph-6636@3(peon) e2 handle_command mon_command({ " p r e f i x " :   ";;;,,,;;;;",   " f o r m a t " :   " j s o n " } v 0) v1
        -4> 2016-06-14 21:48:38.211716 7f54f4c5a700  0 mon.dceph-6636@3(peon) e2 format = json
        -1> 2016-06-14 21:48:38.211736 7f54f4c5a700  0 mon.dceph-6636@3(peon) e2 fullcmd.length = 0
     0> 2016-06-14 21:48:38.214421 7f54f4c5a700 -1 *** Caught signal (Segmentation fault) **

Case 6

If prefix is set to something like ";x;a;", the monitors do not crash and keep running, and the client gets an error message.
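
The six cases can be collected into one table. The sketch below rests on an assumption that is not stated in this thread: that the monitor splits the prefix on the delimiters `;`, `,`, `=`, space, and tab, and that `fullcmd.length = 0` in the logs means no tokens were left afterwards. `prefix_tokens`, `DELIMS`, and `CASES` are hypothetical names used only for illustration.

```python
import re

# Assumed delimiter set (semicolon, comma, equals sign, space, tab).
DELIMS = re.compile(r"[;,= \t]+")


def prefix_tokens(cmd):
    """Split cmd['prefix'] on the assumed delimiters, dropping empty pieces."""
    prefix = cmd.get("prefix", "")
    if not isinstance(prefix, str):
        return []
    return [tok for tok in DELIMS.split(prefix) if tok]


# (label, command payload) for the six cases described above.
CASES = [
    ("case 1", {}),
    ("case 2", {"abc": "something", "format": "json"}),
    ("case 3", {"prefix": "", "format": "json"}),
    ("case 4", {"prefix": "          ", "format": "json"}),
    ("case 5", {"prefix": ";;;,,,;;;;", "format": "json"}),
    ("case 6", {"prefix": ";x;a;", "format": "json"}),
]

for label, cmd in CASES:
    tokens = prefix_tokens(cmd)
    print(label, tokens, "non-empty" if tokens else "empty")
```

Under that assumption, cases 1 to 5 leave no tokens, which matches the `fullcmd.length = 0` lines logged just before each segmentation fault, while case 6 leaves the tokens `x` and `a`.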

@JiYou
Contributor Author

JiYou commented Jun 15, 2016

@jecluis @xiaoxichen
I think the test for this patch could go in test/librados/cmd.cc, where there are already some test cases for Ceph monitor commands, whereas [1], suggested by @jecluis, is mainly for testing monitor messages.

[1] - test/mon/test-mon-msg.cc

@xiaoxichen
Contributor

@dachary, it seems the check failed with a timeout. Is there an issue with the Jenkins slave?

@ghost

ghost commented Jun 16, 2016

jenkins, could you test this please (timeout) ?

@JiYou
Contributor Author

JiYou commented Jun 16, 2016

@xiaoxichen @dachary

Why could this happen? Does CI run for a fixed time, with tests terminated once that time is exceeded?

PASS: test/cephtool-test-mon.sh
...
make: *** [check-recursive] Terminated 
make[2]: *** [check-recursive] Terminated
make[5]: make[4]: *** [check-TESTS] Terminated
Build step 'Execute shell' marked build as failure
[PostBuildScript] - Execution post build scripts.
[ceph-pull-requests] $ /bin/sh -xe /tmp/hudson5552035644780642355.sh
+ sudo reboot
[BFA] Scanning build for known causes...
..[BFA] Found failure cause(s):
[BFA] Unexpected die of job from category Development
[BFA] Done. 2s
Setting status of 0b5dce697b22598bf29c54fdb186cc51ebdffffb to FAILURE with url https://jenkins.ceph.com/job/ceph-pull-requests/7416/ and message: 'Build finished. '
Finished: FAILURE

@JiYou
Contributor Author

JiYou commented Jun 17, 2016

@dachary

I ran make check on my desktop and found the following:

Case 1

Some make processes hang:

root@youji-work:~# ps aux | grep make
root     10364  0.0  0.0  11656  1088 ?        S     6月16   0:00 make check
root     10365  0.0  0.0  16644  1548 ?        S     6月16   0:00 /bin/bash -c fail=; \ if (target_option=k; case ${target_option-} in ?) ;; *) echo "am__make_running_with_option: internal error: invalid" "target option '${target_option-}' specified" >&2; exit 1;; esac; has_opt=no; sane_makeflags=$MAKEFLAGS; if test -n ' Makefile' && test -n '0'; then sane_makeflags=$MFLAGS; else case $MAKEFLAGS in *\\[\ \?]*) bs=\\; sane_makeflags=`printf '%s\n' "$MAKEFLAGS" | sed "s/$bs$bs[$bs $bs?]*//g"`;; esac; fi; skip_next=no; strip_trailopt () { flg=`printf '%s\n' "$flg" | sed "s/$1.*$//"`; }; for flg in $sane_makeflags; do test $skip_next = yes && { skip_next=no; continue; }; case $flg in *=*|--*) continue;; -*I) strip_trailopt 'I'; skip_next=yes;; -*I?*) strip_trailopt 'I';; -*O) strip_trailopt 'O'; skip_next=yes;; -*O?*) strip_trailopt 'O';; -*l) strip_trailopt 'l'; skip_next=yes;; -*l?*) strip_trailopt 'l';; -[dEDm]) skip_next=yes;; -[JT]) skip_next=yes;; esac; case $flg in *$target_option*) has_opt=yes; break;; esac; done; test $has_opt = yes); then \   failcom='fail=yes'; \ else \   failcom='exit 1'; \ fi; \ dot_seen=no; \ target=`echo check-recursive | sed s/-recursive//`; \ case "check-recursive" in \   distclean-* | maintainer-clean-*) list='. src man doc systemd selinux' ;; \   *) list='. src man doc systemd selinux' ;; \ esac; \ for subdir in $list; do \   echo "Making $target in $subdir"; \   if test "$subdir" = "."; then \     dot_seen=yes; \     local_target="$target-am"; \   else \     local_target="$target"; \   fi; \   (CDPATH="${ZSH_VERSION+.}:" && cd $subdir && make  $local_target) \   || eval $failcom; \ done; \ if test "$dot_seen" = "no"; then \   make  "$target-am" || exit 1; \ fi; test -z "$fail"
root     13181  0.0  0.0  15940   972 pts/5    S+   06:00   0:00 grep --color=auto make
root     14395  0.0  0.0  16644   696 ?        S     6月16   0:00 /bin/bash -c fail=; \ if (target_option=k; case ${target_option-} in ?) ;; *) echo "am__make_running_with_option: internal error: invalid" "target option '${target_option-}' specified" >&2; exit 1;; esac; has_opt=no; sane_makeflags=$MAKEFLAGS; if test -n ' Makefile' && test -n '0'; then sane_makeflags=$MFLAGS; else case $MAKEFLAGS in *\\[\ \?]*) bs=\\; sane_makeflags=`printf '%s\n' "$MAKEFLAGS" | sed "s/$bs$bs[$bs $bs?]*//g"`;; esac; fi; skip_next=no; strip_trailopt () { flg=`printf '%s\n' "$flg" | sed "s/$1.*$//"`; }; for flg in $sane_makeflags; do test $skip_next = yes && { skip_next=no; continue; }; case $flg in *=*|--*) continue;; -*I) strip_trailopt 'I'; skip_next=yes;; -*I?*) strip_trailopt 'I';; -*O) strip_trailopt 'O'; skip_next=yes;; -*O?*) strip_trailopt 'O';; -*l) strip_trailopt 'l'; skip_next=yes;; -*l?*) strip_trailopt 'l';; -[dEDm]) skip_next=yes;; -[JT]) skip_next=yes;; esac; case $flg in *$target_option*) has_opt=yes; break;; esac; done; test $has_opt = yes); then \   failcom='fail=yes'; \ else \   failcom='exit 1'; \ fi; \ dot_seen=no; \ target=`echo check-recursive | sed s/-recursive//`; \ case "check-recursive" in \   distclean-* | maintainer-clean-*) list='. src man doc systemd selinux' ;; \   *) list='. src man doc systemd selinux' ;; \ esac; \ for subdir in $list; do \   echo "Making $target in $subdir"; \   if test "$subdir" = "."; then \     dot_seen=yes; \     local_target="$target-am"; \   else \     local_target="$target"; \   fi; \   (CDPATH="${ZSH_VERSION+.}:" && cd $subdir && make  $local_target) \   || eval $failcom; \ done; \ if test "$dot_seen" = "no"; then \   make  "$target-am" || exit 1; \ fi; test -z "$fail"
root     14396  0.0  0.1  63952 53540 ?        S     6月16   0:03 make check
root     14400  0.0  0.1  63952 53536 ?        S     6月16   0:03 make check-recursive
root     14404  0.0  0.0  16652  1552 ?        S     6月16   0:00 /bin/bash -c fail=; \ if (target_option=k; case ${target_option-} in ?) ;; *) echo "am__make_running_with_option: internal error: invalid" "target option '${target_option-}' specified" >&2; exit 1;; esac; has_opt=no; sane_makeflags=$MAKEFLAGS; if { if test -z '2'; then false; elif test -n ''; then true; elif test -n '3.81' && test -n '/var/www/html/new/youji/ceph/src'; then true; else false; fi; }; then sane_makeflags=$MFLAGS; else case $MAKEFLAGS in *\\[\ \?]*) bs=\\; sane_makeflags=`printf '%s\n' "$MAKEFLAGS" | sed "s/$bs$bs[$bs $bs?]*//g"`;; esac; fi; skip_next=no; strip_trailopt () { flg=`printf '%s\n' "$flg" | sed "s/$1.*$//"`; }; for flg in $sane_makeflags; do test $skip_next = yes && { skip_next=no; continue; }; case $flg in *=*|--*) continue;; -*I) strip_trailopt 'I'; skip_next=yes;; -*I?*) strip_trailopt 'I';; -*O) strip_trailopt 'O'; skip_next=yes;; -*O?*) strip_trailopt 'O';; -*l) strip_trailopt 'l'; skip_next=yes;; -*l?*) strip_trailopt 'l';; -[dEDm]) skip_next=yes;; -[JT]) skip_next=yes;; esac; case $flg in *$target_option*) has_opt=yes; break;; esac; done; test $has_opt = yes); then \   failcom='fail=yes'; \ else \   failcom='exit 1'; \ fi; \ dot_seen=no; \ target=`echo check-recursive | sed s/-recursive//`; \ case "check-recursive" in \   distclean-* | maintainer-clean-*) list='gmock ocf java' ;; \   *) list='ocf java' ;; \ esac; \ for subdir in $list; do \   echo "Making $target in $subdir"; \   if test "$subdir" = "."; then \     dot_seen=yes; \     local_target="$target-am"; \   else \     local_target="$target"; \   fi; \   (CDPATH="${ZSH_VERSION+.}:" && cd $subdir && make  $local_target) \   || eval $failcom; \ done; \ if test "$dot_seen" = "no"; then \   make  "$target-am" || exit 1; \ fi; test -z "$fail"
root     14414  0.0  0.1  69808 59184 ?        S     6月16   0:04 make check-am
root     14518  0.0  0.1  64088 53536 ?        S     6月16   0:03 make check-TESTS
root     14528  0.0  0.0  16680  1560 ?        S     6月16   0:00 /bin/bash -c set +e; bases='unittest_erasure_code_plugin.log unittest_erasure_code.log unittest_erasure_code_jerasure.log unittest_erasure_code_plugin_jerasure.log unittest_erasure_code_isa.log unittest_erasure_code_plugin_isa.log unittest_erasure_code_lrc.log unittest_erasure_code_plugin_lrc.log unittest_erasure_code_shec.log unittest_erasure_code_shec_all.log unittest_erasure_code_shec_thread.log unittest_erasure_code_shec_arguments.log unittest_erasure_code_plugin_shec.log unittest_erasure_code_example.log unittest_compression_plugin.log unittest_compression_snappy.log unittest_compression_plugin_snappy.log unittest_compression_zlib.log unittest_compression_plugin_zlib.log unittest_librados.log unittest_librados_config.log unittest_journal.log unittest_rbd_replay.log unittest_encoding.log unittest_base64.log unittest_run_cmd.log unittest_simple_spin.log unittest_libcephfs_config.log unittest_bluefs.log unittest_bit_alloc.log unittest_bluestore_types.log unittest_transaction.log unittest_mon_moncap.log unittest_mon_pgmap.log unittest_ecbackend.log unittest_osdscrub.log unittest_pglog.log unittest_hitset.log unittest_osd_osdcap.log unittest_pageset.log unittest_rocksdb_option_static.log unittest_chain_xattr.log unittest_lfnindex.log unittest_mds_authcap.log unittest_addrs.log unittest_blkdev.log unittest_bloom_filter.log unittest_histogram.log unittest_prioritized_queue.log unittest_weighted_priority_queue.log unittest_str_map.log unittest_mutex_debug.log unittest_shunique_lock.log unittest_sharedptr_registry.log unittest_shared_cache.log unittest_sloppy_crc_map.log unittest_time.log unittest_util.log unittest_crush_wrapper.log unittest_crush.log unittest_osdmap.log unittest_workqueue.log unittest_striper.log unittest_prebufferedstreambuf.log unittest_str_list.log unittest_log.log unittest_throttle.log unittest_ceph_argparse.log unittest_ceph_compatset.log unittest_mds_types.log 
unittest_osd_types.log unittest_lru.log unittest_io_priority.log unittest_gather.log unittest_signals.log unittest_bufferlist.log unittest_xlist.log unittest_crc32c.log unittest_arch.log unittest_crypto.log unittest_crypto_init.log unittest_perf_counters.log unittest_admin_socket.log unittest_ceph_crypto.log unittest_utf8.log unittest_mime.log unittest_escape.log unittest_strtol.log unittest_confutils.log unittest_config.log unittest_context.log unittest_safe_io.log unittest_heartbeatmap.log unittest_formatter.log unittest_daemon_config.log unittest_ipaddr.log unittest_texttable.log unittest_on_exit.log unittest_readahead.log unittest_tableformatter.log unittest_bit_vector.log unittest_interval_set.log ceph-detect-init/run-tox.sh.log ceph-disk/run-tox.sh.log test/run-rbd-unit-tests.sh.log test/ceph_objectstore_tool.py.log test/test-ceph-helpers.sh.log test/cephtool-test-osd.sh.log test/cephtool-test-mon.sh.log test/cephtool-test-mds.sh.log test/cephtool-test-rados.sh.log test/test_pool_create.sh.log test/test_crush_bucket.sh.log unittest_bufferlist.sh.log test/encoding/check-generated.sh.log test/mon/osd-pool-create.sh.log test/mon/misc.sh.log test/mon/osd-crush.sh.log test/mon/mon-ping.sh.log test/mon/mon-created-time.sh.log test/mon/osd-erasure-code-profile.sh.log test/mon/mkfs.sh.log test/mon/mon-scrub.sh.log test/mon/test_pool_quota.sh.log test/osd/osd-scrub-snaps.sh.log test/osd/osd-config.sh.log test/osd/osd-reuse-id.sh.log test/osd/osd-bench.sh.log test/osd/osd-reactivate.sh.log test/osd/osd-copy-from.sh.log test/osd/osd-markdown.sh.log test/mon/mon-handle-forward.sh.log test/libradosstriper/rados-striper.sh.log test/test_objectstore_memstore.sh.log test/test_pidfile.sh.log test/pybind/test_ceph_argparse.py.log test/pybind/test_ceph_daemon.py.log ../qa/workunits/erasure-code/encode-decode-non-regression.sh.log test/encoding/readable.sh.log'; bases=`for i in $bases; do echo $i; done | sed 's/\.log$//'`; bases=`echo $bases`; \ log_list=`for i in $bases; do 
echo $i.log; done`; \ trs_list=`for i in $bases; do echo $i.trs; done`; \ log_list=`echo $log_list`; trs_list=`echo $trs_list`; \ make  test-suite.log TEST_LOGS="$log_list"; \ exit $?;
root     14537  0.0  0.1  69276 58816 ?        S     6月16   0:04 make test-suite.log TEST_LOGS=unittest_erasure_code_plugin.log unittest_erasure_code.log unittest_erasure_code_jerasure.log unittest_erasure_code_plugin_jerasure.log unittest_erasure_code_isa.log unittest_erasure_code_plugin_isa.log unittest_erasure_code_lrc.log unittest_erasure_code_plugin_lrc.log unittest_erasure_code_shec.log unittest_erasure_code_shec_all.log unittest_erasure_code_shec_thread.log unittest_erasure_code_shec_arguments.log unittest_erasure_code_plugin_shec.log unittest_erasure_code_example.log unittest_compression_plugin.log unittest_compression_snappy.log unittest_compression_plugin_snappy.log unittest_compression_zlib.log unittest_compression_plugin_zlib.log unittest_librados.log unittest_librados_config.log unittest_journal.log unittest_rbd_replay.log unittest_encoding.log unittest_base64.log unittest_run_cmd.log unittest_simple_spin.log unittest_libcephfs_config.log unittest_bluefs.log unittest_bit_alloc.log unittest_bluestore_types.log unittest_transaction.log unittest_mon_moncap.log unittest_mon_pgmap.log unittest_ecbackend.log unittest_osdscrub.log unittest_pglog.log unittest_hitset.log unittest_osd_osdcap.log unittest_pageset.log unittest_rocksdb_option_static.log unittest_chain_xattr.log unittest_lfnindex.log unittest_mds_authcap.log unittest_addrs.log unittest_blkdev.log unittest_bloom_filter.log unittest_histogram.log unittest_prioritized_queue.log unittest_weighted_priority_queue.log unittest_str_map.log unittest_mutex_debug.log unittest_shunique_lock.log unittest_sharedptr_registry.log unittest_shared_cache.log unittest_sloppy_crc_map.log unittest_time.log unittest_util.log unittest_crush_wrapper.log unittest_crush.log unittest_osdmap.log unittest_workqueue.log unittest_striper.log unittest_prebufferedstreambuf.log unittest_str_list.log unittest_log.log unittest_throttle.log unittest_ceph_argparse.log unittest_ceph_compatset.log unittest_mds_types.log 
unittest_osd_types.log unittest_lru.log unittest_io_priority.log unittest_gather.log unittest_signals.log unittest_bufferlist.log unittest_xlist.log unittest_crc32c.log unittest_arch.log unittest_crypto.log unittest_crypto_init.log unittest_perf_counters.log unittest_admin_socket.log unittest_ceph_crypto.log unittest_utf8.log unittest_mime.log unittest_escape.log unittest_strtol.log unittest_confutils.log unittest_config.log unittest_context.log unittest_safe_io.log unittest_heartbeatmap.log unittest_formatter.log unittest_daemon_config.log unittest_ipaddr.log unittest_texttable.log unittest_on_exit.log unittest_readahead.log unittest_tableformatter.log unittest_bit_vector.log unittest_interval_set.log ceph-detect-init/run-tox.sh.log ceph-disk/run-tox.sh.log test/run-rbd-unit-tests.sh.log test/ceph_objectstore_tool.py.log test/test-ceph-helpers.sh.log test/cephtool-test-osd.sh.log test/cephtool-test-mon.sh.log test/cephtool-test-mds.sh.log test/cephtool-test-rados.sh.log test/test_pool_create.sh.log test/test_crush_bucket.sh.log unittest_bufferlist.sh.log test/encoding/check-generated.sh.log test/mon/osd-pool-create.sh.log test/mon/misc.sh.log test/mon/osd-crush.sh.log test/mon/mon-ping.sh.log test/mon/mon-created-time.sh.log test/mon/osd-erasure-code-profile.sh.log test/mon/mkfs.sh.log test/mon/mon-scrub.sh.log test/mon/test_pool_quota.sh.log test/osd/osd-scrub-snaps.sh.log test/osd/osd-config.sh.log test/osd/osd-reuse-id.sh.log test/osd/osd-bench.sh.log test/osd/osd-reactivate.sh.log test/osd/osd-copy-from.sh.log test/osd/osd-markdown.sh.log test/mon/mon-handle-forward.sh.log test/libradosstriper/rados-striper.sh.log test/test_objectstore_memstore.sh.log test/test_pidfile.sh.log test/pybind/test_ceph_argparse.py.log test/pybind/test_ceph_daemon.py.log ../qa/workunits/erasure-code/encode-decode-non-regression.sh.log test/encoding/readable.sh.log

Case 2

The make processes finish but print Hangup messages:

LD_PRELOAD=liblttng-ust-fork.so
make[5]: *** Deleting file `unittest_erasure_code_shec_thread.log'
make[4]: *** wait: No child processes.  Stop.
make[4]: *** Waiting for unfinished jobs....
make[4]: *** wait: No child processes.  Stop.
make[3]: *** [check-am] Error 2
make: ]: *** [check-recursive] Hangup

make[5]: *** [unittest_erasure_code_shec_thread.log] Hangup
make[1]: *** [check] Hangup
Hangup

@xiaoxichen
Contributor

@jecluis this PR passed the Jenkins bot. Would you mind reviewing it?

@tchaikov tchaikov self-assigned this Jun 29, 2016
@liewegas liewegas merged commit 957ece7 into ceph:master Jun 30, 2016
@xiaoxichen
Contributor

@liewegas, thanks. Can we go ahead with the backport to hammer and jewel (infernalis is EOL soon?)

@jtek

jtek commented Jul 6, 2016

Hammer backport is important. Gentoo just broke the no downtime upgrade path from its latest stable package (Firefly) because they removed Firefly and Hammer packages and stabilised Infernalis at the same time. Current Gentoo users must stop their whole cluster and upgrade to Infernalis or manually install Hammer and not have any supported stable package.

@xiaoxichen
Contributor

@jtek the hammer backport is done. Are you asking for an infernalis backport?

@jtek

jtek commented Jul 7, 2016

I suppose that 0.94.7 doesn't have the fix and this is scheduled for 0.94.8 ?

In the meantime, Gentoo devs may be confused about which version to use and have only marked Infernalis stable (previously the only stable release version on Gentoo was Firefly 0.80.10):
https://bugs.gentoo.org/show_bug.cgi?id=587568

Until this is cleared up Gentoo users upgrading Ceph or installing new servers will have a nasty surprise.

Is there a patch available for 0.94.7 I can refer to in the bug above ?

@xiaoxichen
Contributor

@jtek here you go #10038

@jtek

jtek commented Jul 7, 2016

@xiaoxichen thanks, I would probably have spent quite some time finding it.
