Commits on Apr 19, 2018
  1. Tree: propagation channels shouldn't update Task rc

    thiell committed Apr 16, 2018
    Use a temporary fix to prevent the gateways from updating Task's rc dict.
    The permanent fix will be to clean this mess with Task Event Handlers
    (see at least #369 and #231).
    
    Change-Id: I2049a4b6ce3931f1b245b29326ba3daa065b36ba
  2. Tree: support offline gateways (#260)

    thiell committed Apr 16, 2018
    With this change, WorkerTree is now able to safely repropagate commands
    when the communication channel to a specific gateway cannot be
    established at all.  Such offline gateways were already marked as
    unreachable, but the 'relaunch' code was missing. A new
    WorkerTree._relaunch() method has been added for that purpose.
    
    Note that if a gateway fails once a propagation channel is "set up", the
    remote commands cannot be safely repropagated at this point and so will
    be seen as failed, but this was already the case.
    
    Closes #260
    
    Change-Id: I2be52b01638de95f58cedc0f2671f02cfffe8297
  3. CLI: add options to specify alternate config files (#336)

    thiell committed Apr 12, 2018
    Added the following command line options:
    --conf to specify alternative clush.conf (clush only)
    --groupsconf to specify alternative groups.conf (all CLIs)
    
    For example, this can be useful to programmatically create/update
    clustershell config files with an alternate group source setup.
    
    Closes #336
    
    Change-Id: I3aff60057274198b700b44b4d5bbdc916a5b5d0b
Commits on Apr 17, 2018
  1. EventHandler: reinstate ev_error and ev_timeout (as deprecated)

    thiell committed Apr 17, 2018
    Both EventHandler.ev_error() and EventHandler.ev_timeout() methods were
    mistakenly removed from 1.8.0, leading to possible issues with
    EventHandler subclasses. This patch reinstates both methods to improve
    backward compatibility but adds DEPRECATED notes in the documentation.
    
    Closes #377
    
    Change-Id: I1d12e3ab6b190e91dca32745d83145d7fe5bb75b
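    
    A minimal sketch of the kind of breakage this avoids. The classes below
    are hypothetical stand-ins, not ClusterShell's actual EventHandler code:
    a subclass written against the old API chains up to the base method,
    which only works if the base class still defines it.

```python
class BaseHandler:
    """Hypothetical base handler standing in for an EventHandler-like class."""

    def ev_error(self, worker):
        """DEPRECATED: kept as a no-op so that old subclasses keep working."""

    def ev_timeout(self, worker):
        """DEPRECATED: kept as a no-op so that old subclasses keep working."""


class LegacyHandler(BaseHandler):
    """Subclass written against the old API."""

    def __init__(self):
        self.events = []

    def ev_timeout(self, worker):
        # Chaining up raises AttributeError if the base method is removed.
        super().ev_timeout(worker)
        self.events.append(("timeout", worker))


def notify_timeout(handler, worker):
    """Hypothetical dispatcher: calls ev_timeout() on any handler."""
    handler.ev_timeout(worker)
```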
  2. Worker: make abort() safe to call on an already closing client

    thiell committed Apr 16, 2018
    Fix the underlying EngineClient.abort() call so that abort() can safely be
    called on an already closing (or aborting) worker.
    
    This fix (and API clarification) will help for upcoming patches regarding
    aborting gateway channels (#260).
    
    Change-Id: Ic714dc71db1aa9a0d5b16607780891f019c10859
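    
    One common way to make an abort call safe to repeat is a guard flag.
    The sketch below assumes a hypothetical Client class and attribute
    names; it is not the actual EngineClient implementation.

```python
class Client:
    """Hypothetical client; names are illustrative only."""

    def __init__(self):
        self.closing = False
        self.close_count = 0

    def _close(self):
        # Stand-in for the real teardown work (closing file descriptors, etc.).
        self.close_count += 1

    def abort(self):
        # Guard: abort() on an already closing client is a no-op.
        if self.closing:
            return
        self.closing = True
        self._close()
```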
Commits on Apr 12, 2018
  1. CLI/Error: add error handling for malformed groups.conf (#379)

    thiell committed Apr 12, 2018
    Handle configparser's exceptions in case of malformed clush.conf or
    groups.conf.
    
    Closes #379
    
    Change-Id: Iffa3bf3f425affefd976a4c4f252a3dcee7fda1a
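    
    A sketch of this style of error handling with the standard configparser
    module. The load_config() helper and its return convention are
    assumptions made for illustration; only configparser.Error, the base
    class of the parser's exceptions, comes from the standard library.

```python
import configparser


def load_config(text):
    """Parse INI text; return (parser, None) or (None, error message)."""
    parser = configparser.ConfigParser()
    try:
        parser.read_string(text)
    except configparser.Error as exc:
        # Covers missing section headers, unparsable lines, duplicates, etc.
        return None, "config error: %s" % exc
    return parser, None
```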
Commits on Mar 9, 2018
  1. vim: update groups.conf vim syntax file

    degremont committed Feb 15, 2018
    Update vim syntax file for groups.conf with missing changes
    like $CFGDIR
    
    Change-Id: I58e1b6e7bc3515cf70fa4d542c323966eb4c74fc
Commits on Mar 8, 2018
  1. (#378) Tree: fix variable name conflict in _on_remote_node_close()

    degremont committed Mar 8, 2018
    The 'node' variable was superseded by another use of the same
    variable name in a for-loop in the same method, leading to wrong
    variable content and a KeyError exception when doing rcopy in tree
    mode.
    
    Closes #378
    
    Change-Id: I8a7d7d0042ec221bc2aa2cd91e7866051b8e09cd
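    
    The class of bug described here can be reproduced in a few lines. The
    function names and arguments below are hypothetical and unrelated to
    the real _on_remote_node_close() code: in Python, a for-loop variable
    shares the enclosing function scope, so it silently overwrites an
    argument of the same name.

```python
def close_buggy(node, pending):
    """'node' is reused as the loop variable, so the argument is lost."""
    for node in pending:  # overwrites the 'node' argument
        pass
    return node  # last element of 'pending' if it is not empty


def close_fixed(node, pending):
    """A distinct loop variable name keeps the argument intact."""
    for other in pending:
        pass
    return node
```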
Commits on Feb 7, 2018
  1. Task: pchannel logging updated from info to debug

    bamb0u authored and degremont committed Feb 5, 2018
    Closes #372
Commits on Jan 10, 2018
  1. (#370) NodeSet: speed-up nodeset parsing

    degremont committed Jan 8, 2018
    Rewrite NodeSet._next_op() using re.search(), testing
    all supported operators with only one call instead of calling
    str.find() for each operator (we support 4 of them),
    which is more expensive.
    
    ParsingEngine.OP_CODES dict has been swapped as we now need
    an index using an operator character as key.
    
    This is useful when the pattern to parse has a lot of operators
    (e.g. foo1,foo2,foo3,...,foo5000).
    
    Benchmark shows a 60% improvement with this patch when parsing
    large strings with a lot of operators.
    
    Change-Id: Idca7763afb34269210799e8a5176785c314382da
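    
    A simplified sketch of the two approaches, assuming the four operator
    characters are ',', '!', '&' and '^'. The function names are
    hypothetical, and unlike a real nodeset parser this ignores commas
    inside bracketed ranges such as foo[1,3].

```python
import re

OPERATORS = ",!&^"
_OP_RE = re.compile(r"[,!&^]")


def next_op_find(pattern):
    """One str.find() call per operator, keeping the leftmost hit."""
    best = None
    for op in OPERATORS:
        idx = pattern.find(op)
        if idx != -1 and (best is None or idx < best[0]):
            best = (idx, op)
    return best


def next_op_regex(pattern):
    """A single re.search() call that tests all operators at once."""
    match = _OP_RE.search(pattern)
    if match is None:
        return None
    return match.start(), match.group()
```

    Both functions return the position and character of the first operator,
    or None when the pattern contains no operator.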
Commits on Nov 28, 2017
  1. doc/sphinx: add reference to package availability in openSUSE Leap

    thiell committed Nov 28, 2017
    Change-Id: If7451fc8c5394c1c5f21b48fa139b2e6678f2e82
Commits on Oct 23, 2017
  1. Release 1.8 (#360)

    thiell committed Oct 23, 2017
    http://clustershell.readthedocs.io/en/v1.8/release.html
    
    Closes #360
    
    Change-Id: I0db2b382d9c2c4be2845c023a639f80323e34316
Commits on Oct 18, 2017
  1. Release 1.7.91 (1.8 RC1) (#360)

    thiell committed Oct 18, 2017
    Change-Id: Iee6fc3a0b44e779b8ff74e4d3282ee62573af338
Commits on Oct 17, 2017
  1. NodeSet: implement node wildcards (#349)

    thiell committed Oct 15, 2017
    Node wildcard expansion, using the well-known characters '*' and '?',
    allows users to filter on node names that match the provided wildcard
    mask.
    
    The implementation modifies ParsingEngine to search for nodes from 'all'
    using the provided wildcard mask, that is, nodes that are returned by
    the all upcall, or in case of absence, the union of list+map.
    
    Simultaneous use of wildcards and ranges (like '*foo[1-3]') is fully
    supported.
    
    With this change,
        $ nodeset -f '*'
    becomes the same as:
        $ nodeset -f -a
    
    Closes #349
    
    Change-Id: I1b99811dc65bd7fd9a5a9fedc6e5f1f3b8cb875e
    Signed-off-by: Stephane Thiell <sthiell@stanford.edu>
    Signed-off-by: Aurelien Degremont <aurelien.degremont@cea.fr>
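    
    The filtering idea can be sketched with the standard fnmatch module.
    This is an illustration only: expand_wildcard() is a hypothetical
    helper, the list of nodes is passed in explicitly instead of coming
    from an 'all' upcall, and fnmatch's own '[...]' syntax would need
    extra care to coexist with nodeset ranges.

```python
from fnmatch import fnmatchcase


def expand_wildcard(mask, all_nodes):
    """Return the nodes from all_nodes whose name matches the wildcard mask."""
    return [node for node in all_nodes if fnmatchcase(node, mask)]
```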
  2. Worker.Tree: rename WorkerTree to TreeWorker

    thiell committed Oct 16, 2017
    Keep WorkerTree alias for compat.
    
    Change-Id: I22e4fb4dc1da259559b4ee107952c556bf284926
  3. Travis CI: load virtualenv when sshing to localhost

    thiell committed Oct 17, 2017
    Activate virtualenv when sshing non-interactively to localhost to enable
    TreeWorkerTest's real gateway launches. That way, the gateway code is
    running within the same Python environment.
    
    The virtualenv file 'bin/activate' is added at the very beginning of
    .bashrc so that it is sourced even if not running interactively (there
    is a check in the file on Ubuntu).
    
    Change-Id: I281ff75dcb8e8a4d5eebdd45c7bb4fef0b883cd1
Commits on Oct 16, 2017
  1. Worker.Tree: ev_pickup, ev_close and rcopy fixes (#363)

    thiell committed Oct 12, 2017
    Pickup events were not generated in tree mode! This is now fixed and
    ev_pickup() is called for each target node after ev_start().
    
    Regarding ev_close, commit f7710da made the false assumption that the
    EventHandler used by WorkerTree was using the new signature. Add a
    signature check to easily fix this issue.
    
    New tests revealed a wrong string definition instead of bytes in rcopy,
    but bytes is required for Python 3 support.
    
    Added new test TreeWorkerTest with 96% coverage of WorkerTree.
    
    Removed obsolete TreeCopyTest.
    
    Closes #363.
    
    Change-Id: I00dea854dc4cb7af6e6db69594286a0039fc0bf7
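    
    A signature check of this kind can be sketched with inspect. The
    sketch assumes the new ev_close() signature takes an extra 'timedout'
    argument and the old one takes only the worker; the helper name is
    hypothetical and this is not the code from the commit.

```python
import inspect


def call_ev_close(handler, worker, timedout):
    """Call handler.ev_close() with the arguments its signature accepts."""
    # For a bound method, 'self' is not part of the reported parameters.
    params = inspect.signature(handler.ev_close).parameters
    if len(params) >= 2:
        handler.ev_close(worker, timedout)  # assumed new-style signature
    else:
        handler.ev_close(worker)  # assumed old-style signature
```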
  2. doc/sphinx: add new Slurm bindings example section (#359)

    thiell committed Oct 13, 2017
    Also add some documentation about cache_time.
    
    Change-Id: I888791182caec4a1da3cdfe5ad0516359158d5b2
  3. doc: remove that numeric node sets are not supported

    thiell committed Oct 13, 2017
    Support for fully numeric node sets was added in d0ea10c.
    
    Change-Id: I703d89da0d4741e99c5d7749ebe8086f578ceba3
Commits on Oct 13, 2017
  1. Gateway: ignore logging initialization errors

    thiell committed Oct 13, 2017
    Ignore logging initialization errors, like when the user doesn't have
    permission to write.
    
    Note that, by default, the tree mode enables logging at level INFO on
    the gateways in /tmp.
    
    Change-Id: I440338fa2e94d452762039460442360378d7b4d4
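    
    A minimal sketch of tolerant logging setup with the standard logging
    module. The function name, logger name and return value are
    assumptions for illustration, not the gateway's actual code.

```python
import logging


def init_gateway_logging(path, level=logging.INFO):
    """Try to log to a file; carry on without one if it cannot be opened."""
    logger = logging.getLogger("gateway-sketch")
    logger.setLevel(level)
    try:
        handler = logging.FileHandler(path)  # opens the file immediately
    except OSError:
        # e.g. permission denied or missing directory: ignore and continue
        logger.addHandler(logging.NullHandler())
        return False
    logger.addHandler(handler)
    return True
```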
Commits on Oct 12, 2017
  1. Update Slurm bindings (#362)

    kcgthb authored and thiell committed Oct 12, 2017
    * shortcuts for group names
    * remove additional trailing characters in node state
    * added job bindings
    
    Closes #359
  2. CLI/Nodeset: fix defect when no group sources are found

    thiell committed Oct 11, 2017
    Issue easily reproducible by creating an empty groups.conf.
    
    Change-Id: I4ac063b09493cd4d1f453557dae361b8a819d058
  3. doc/sphinx: new section for installing on openSUSE (#360)

    thiell committed Oct 11, 2017
    Change-Id: Ib3e66f22ffcc9b50c44c32d9049b220b09bf8abb
  4. doc/sphinx: update sphinx documentation for 1.8 (#360)

    thiell committed Oct 9, 2017
    Change-Id: I317fa745f0af707bac073c7823d9307d1aae869c
Commits on Oct 11, 2017
  1. NodeUtils: check that yaml dict key type is string (#361)

    thiell committed Oct 10, 2017
    YAML syntax is more powerful than other configuration files, which can
    be confusing for users not used to its syntax. Switching to PyYAML's
    BaseLoader would use strings but breaks other things, so it doesn't seem
    to be an option here. This patch adds a check to ensure that the type of
    group source names and group names is string (e.g. properly quoted if
    there are only digits) and raises a GroupResolverConfigError with a
    proper error message if not.
    
    Closes #361
    
    Change-Id: If52b726b29490c5f5f7ef97a2b9256280a3055a0
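    
    A sketch of such a check on an already loaded mapping. With PyYAML's
    default resolvers, an unquoted key such as 100 is loaded as an int
    rather than a string. GroupConfigError and check_string_keys() are
    hypothetical names used for illustration only.

```python
class GroupConfigError(Exception):
    """Hypothetical stand-in for a group configuration error."""


def check_string_keys(sources):
    """Ensure group source names and group names are all strings."""
    for source, groups in sources.items():
        if not isinstance(source, str):
            raise GroupConfigError(
                "group source name %r is not a string (quote it?)" % (source,))
        for group in groups:
            if not isinstance(group, str):
                raise GroupConfigError(
                    "group name %r in source %r is not a string (quote it?)"
                    % (group, source))
```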
  2. CLI: fix GroupResolverConfigError handling

    thiell committed Oct 10, 2017
    GroupResolverConfigError exceptions were not properly handled since
    2a3823f that introduced lazy init of group sources. Error checking was
    previously done during import, but this is now useless (and wasn't even
    ported to Python 3). The exception is now handled as any other error by
    the CLI.Error module.
    
    Change-Id: Idf01fdb3cb50ebce355b4938f6f149d913bbf62d
Commits on Oct 10, 2017
  1. packaging: a few cleanups

    thiell committed Oct 6, 2017
    spec.in: avoid cleaning buildroot in %install as we don't support EL5
    anymore
    
    setup.py: re-add shebang
    
    Change-Id: I836d6ad6cc7022d65bb1266cf39be45b650120db
Commits on Oct 6, 2017
  1. tests: a few cleanups

    thiell committed Oct 6, 2017
    Removed unused imports.
    
    NodeSetGroupTest:
    - replaced shutil.rmtree with os.rmdir()
    - added missing os.rmdir() so that /tmp stays clean
    - skip test_yaml_permission_denied if run as root
    
    Change-Id: I85450983b66b8e9184611ed287f0525ceb26e548
Commits on Oct 2, 2017
  1. Release 1.7.82 (1.8 beta2)

    thiell committed Oct 2, 2017
    Change-Id: I5c67b3e43e175178f2f8f8ee85f0b8e1b62b7647
  2. doc: fix a few broken links in clush.rst

    thiell committed Oct 2, 2017
    Links in bold don't seem to be supported by sphinx.
    
    Change-Id: I535227f6e3bff6d81a380f7c6f7a24fa54de20e2
  3. CLI/Nodeset: display warning on misplaced set operation (#318)

    thiell committed Oct 2, 2017
    This change adds a warning to stderr in the following case:
    
        $ nodeset -f -i foo1 foo[1-2]
        WARNING: empty left operand for set operation
        foo[1-2]
    
    That way, the user knows that -i foo1 placed here is useless. The same
    applies to -x. For -X (xor), the empty left operand doesn't make it a
    no-op, but the warning still makes sense.
    
    Closes #318
    
    Change-Id: I5c53646bb3edabc64eff4933fc5afd9a94422ab2
  4. CLI/Clush.py: fix mishandled broken pipe in Python 3

    thiell committed Oct 2, 2017
    This change fixes a mishandled broken pipe error when using clush under
    Python 3, like in `clush ... | head`.
    
    IOError was merged into OSError in Python 3, but clush's exception
    handler was treating them separately to handle broken pipe and max
    resource limit errors. Merge both exception handlers into Errors.py
    using "careful exception handling", as recommended by PEP 3151.
    
    Change-Id: I2746ed09a80a684055d21b35dcdf8becb5581c5a
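    
    A sketch of errno-based handling in a single OSError handler. The
    helper name and the choice of EMFILE/ENFILE for the resource limit
    case are assumptions; the actual code in Errors.py may differ.

```python
import errno

# In Python 3, IOError is just an alias of OSError.


def describe_os_error(exc):
    """Map an OSError to a short message, or None if it is not handled here."""
    if exc.errno == errno.EPIPE:
        return "broken pipe"
    if exc.errno in (errno.EMFILE, errno.ENFILE):
        return "too many open files"
    return None
```

    A caller would catch OSError once, use the message when one is
    returned, and re-raise the exception otherwise.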
  5. NodeUtils: ignore YAML group files with permission error (#348)

    thiell committed Oct 1, 2017
    Sometimes users don't have access to some YAML group files. The current
    behavior is a fatal error in that case, making clush or nodeset unusable
    in some cases. For example, at the SRCC, we have operators without root
    access that use clush and cluset on the cluster to gather cluster
    accounting and other info, but they don't have access to all group
    files. This change just ignores the YAML files the user doesn't have
    access to, prints a debug message and continues processing the rest.
    
    Closes #348.
    
    Change-Id: I0e8a82304ad592a2f0866b542191ebe1912e642d
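    
    The skip-and-continue behavior can be sketched as follows. The
    function, the logger name and the 'opener' parameter (added here so
    the sketch can be exercised without real unreadable files) are
    hypothetical and not taken from NodeUtils.

```python
import logging

LOGGER = logging.getLogger("groups-sketch")


def read_group_files(paths, opener=open):
    """Read each file, skipping those the user is not allowed to open."""
    contents = {}
    for path in paths:
        try:
            with opener(path) as stream:
                contents[path] = stream.read()
        except PermissionError as exc:
            # Not fatal: log at debug level and process the remaining files.
            LOGGER.debug("skipping %s: %s", path, exc)
    return contents
```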
  6. CLI/Clush.py: initialize logging earlier (#348)

    thiell committed Oct 1, 2017
    Python logging was initialized quite late in clush. This patch fixes
    this so the user can also see early debug messages, which is especially
    useful now that we support lazy init of group sources (2a3823f).
    
    Change-Id: I6583ffbd68fec280ff0d7d700ec659f893cb0c90