[WIP] - Libssh2 based clients #86

Merged: 50 commits into master on Oct 14, 2017

pkittenis commented Jul 19, 2017

This is a WIP pull request for new libssh2 based clients, created to give visibility on this work in progress.

TODO:

  • Usable, asynchronous libssh2 based SSHClient
  • Libssh2 client tests - integration tests based on openssh server
  • Programmatic private key authentication
  • SSH agent authentication
  • Standard identity files attempted if available
  • ParallelSSHClient using libssh2 based client
  • Port parallel client tests to libssh2 client (partly done)
  • Raise proper exceptions on errors from libssh2 (libssh2 bindings)
  • Update libssh2 bindings for new style classes
  • Implement libssh2 bindings for all of the libssh2 API
  • Figure out correct way to poll if remote command has finished executing
  • Implement TCP forwarding
  • Implement proxy connections via TCP forwarding
  • Implement connection timeout
  • Implement SFTP client functions
  • Refactor parallel clients for code reuse
  • Finalise public API for libssh2 clients
  • Packaging, wheel building etc

Motivation

Motivation for creating new libssh2 clients as opposed to continuing with paramiko:

  1. Performance.
    In testing, libssh2 has been several orders of magnitude faster than paramiko and other SSH library alternatives, in everything from simple connection and authentication overhead to large data transfers.

  2. Stability.
    Paramiko often suffers from memory leaks and from regressions in compatibility, API, stability and performance, the latter particularly evident with paramiko version 2. Libssh2, on the other hand, is a well tested native code base used in many other applications.

  3. Asynchronous/non-blocking support.
    Paramiko does not support non-blocking operations. It also makes use of threads and suffers from race conditions in several areas, which parallel-ssh has had to work around. Its use in parallel-ssh therefore relies on monkey patching the Python standard library via gevent in order to make it non-blocking.
    This has several drawbacks, most importantly that all other Python libraries in the same process also have to use the monkey patched standard library and may not work correctly as a result.
    Libssh2 has native support for non-blocking mode, which allows a natively non-blocking client to be written directly against the gevent API, without monkey patching. It is also thread safe and can be scaled further, beyond what a single event loop thread can handle.
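
    The non-blocking pattern described in point 3 can be sketched as follows. This is a minimal illustration using only the standard library, not the code in this branch: `read_nonblocking` is a hypothetical helper, and a plain socket stands in for a libssh2 channel. With libssh2 the "would block" condition is reported as `LIBSSH2_ERROR_EAGAIN` rather than as a Python exception, and in a gevent application `gevent.select.select` or `gevent.socket.wait_read` would take the place of `select.select`, so that only the waiting greenlet yields.

```python
import select
import socket


def read_nonblocking(sock, nbytes, timeout=1.0):
    """Read up to nbytes from a non-blocking socket.

    Hypothetical helper for illustration only. Rather than blocking
    inside recv(), retry the call whenever it reports that it would
    block, and wait for readiness in between attempts.
    """
    while True:
        try:
            return sock.recv(nbytes)
        except BlockingIOError:
            # No data available yet (EAGAIN). Wait until the socket is
            # readable; an event loop would switch to other work here.
            readable, _, _ = select.select([sock], [], [], timeout)
            if not readable:
                raise TimeoutError(
                    "socket not readable within %s seconds" % timeout)


if __name__ == "__main__":
    # A connected socket pair stands in for a client/server connection.
    client, server = socket.socketpair()
    client.setblocking(False)
    server.sendall(b"hello")
    print(read_nonblocking(client, 1024))
    client.close()
    server.close()
```

    The point of the pattern is that the wait happens in one explicit place, so an event loop can be plugged in there without patching the standard library for every other library in the process.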

Native code extension dependency

  • Why libssh2, why depend on native code extensions?

  • There are not many options for SSH libraries in Python. Paramiko is the de facto standard, mostly due to said lack of options. Paramiko itself depends on other native code extensions for the heavy lifting, particularly around cryptography.

    Alternatives to paramiko like asyncssh are Python 3 only and also make use of third party native code extensions, while at the same time using an incompatible license. Writing yet another SSH library is not being considered as it would be (a) too time consuming and (b) re-inventing the wheel.

    In other words, there is no option which does not rely in some way on native code extensions. The argument for not having to depend on native code extensions is therefore moot. So why not use an existing, well tested, well performing native code library directly via Python bindings?
    Options for that are similarly limited and it comes down to libssh2 vs libssh (no relation). Libssh2 was chosen for its better performance and greater number of features.

Features

Features of the new clients:

  • Vastly improved performance and better scaling in all areas.
  • Asynchronous by nature - no monkey patching.
  • Use the gevent API directly.
  • Allows for the client to be used within other applications with no side effects.

Missing Features

Features in paramiko not present in libssh2 at the time of writing:

  • SSH agent forwarding - libssh2/#185
  • GSS-API authentication (aka kerberos)

The intention is to provide both the existing paramiko and new libssh2 based clients (both parallel and single clients), allowing users to choose which to use depending on requirements.

Scope

A new Python wrapper for libssh2 has been written from scratch as ssh2-python and is available as a standalone library. It implements the vast majority of the libssh2 API, including all server side features. A new from-scratch library written in Cython, including a C-API layer, was preferred to extensively modifying pylibssh2 (hand crafted C), in which some issues were also found.

ssh2-python is mostly feature complete at this stage, with support up to the latest libssh2 release, and includes binary wheels as well as system packages.

Unfortunately, the existing Python bindings for libssh2 have been unmaintained for years, and the last release (from 2011) is missing key features since added by third parties. The original project has therefore been forked at ParallelSSH/pylibssh2, with several changes and additions made, including merging of PRs, for use by parallel-ssh. Luckily, I have experience writing Python bindings for C code and have written some extension libraries as well; see NodeTrie.

This fork is intended for use by parallel-ssh and has had its package name changed so as not to conflict with system installed versions of the original library. It will most likely be shipped as part of parallel-ssh, possibly in addition to being published on PyPI under a new name. There is no intention to maintain or support the forked project for use by other libraries, though any contributions to it will be reviewed on a case by case basis.

Implications to users and package maintainers

  • The libssh2 based ParallelSSHClient will have a slightly different end-user API. Whether the existing client will change, or a new one will be made, is still TBD. Nevertheless, any breaking API changes will only come in a new major version (2.x.x).
  • If libssh2 bindings are shipped as part of parallel-ssh (likely), package architecture will change from noarch to x86_64. Binary wheels will be provided.

Contributions

Contributions are most welcome and highly encouraged!

Feel free to contribute either on this branch or at ssh2-python.

The libssh2 clients in this branch are currently usable for testing purposes only.

@pkittenis pkittenis added this to the 2.0.0 milestone Jul 19, 2017

pkittenis added some commits Jul 2, 2017

Updated libssh2 references for new package. Updated pssh logger to not use hardcoded name. Improved logic of finished function and get_output
Added openssh based embedded server for testing and test client keys. Added programmatic public key authentication to libssh2 client. Updated libssh2 client. Added libssh2 client tests and parallel client tests based on libssh2.
Updated read from stdout logic, updated tests. Updated embedded openssh server start to wait for port to be opened
Removed pylibssh submodule.
Updated embedded server.
Updated setup.py
Updated libssh2 client tests.
Updated embedded openssh server, dev requirements.
Updated ssh2 client for new API.
Updated ssh2 client tests.
Refactoring - added base pssh module and class, updated ssh2 class to be sub-class of it.

Updated embedded openssh server to take listen IP parameter and be able to handle starting multiple servers at same time.
Updated ssh2 client to handle encoding, sudo, shell, allow agent and timeout parameters and to handle authentication, session and connection errors on session startup/auth.
Added SessionError exception.
Updated libssh2 parallel client integration tests to cover all currently supported functionality of the libssh2 based client.
Refactored paramiko parallel client to use base class.
PEP8 and cleanups for ssh2 parallel and single clients.
Re-enabled travis tests for all clients.
Updated base and paramiko parallel clients.
Cleaned up ssh2 client.
Utils and test updates.
Updated openssh server to fix directory permissions before start up.
Updated native select function, parallel ssh2 client test cleanups.
Added SFTP copy local and remote operations to ssh2 client.
Updated base pssh module and paramiko parallel client.
Enabled parallel SFTP operation tests for ssh2 parallel client.
Added SFTP exceptions for ssh2 clients.
Updated native functions.
Updated embedded openssh server.
Updated API docstrings for ssh2 single and parallel clients.
Improved sftp init error handling in ssh2 client.
Added unknown host error handling to ssh2 client and tests.
Added connection error and retries tests for ssh2 clients.
Added retry delay optional parameter for ssh2 clients and delay default in constants.
Added native sftp get and put functions, updated ssh2 client.
Updates for py3 compatibility.
Better ssh2 client sftp error handling.
Added CI scripts for building system packages and binary wheels.
Migrated ssh2 client line parsing code to Cython.
Updated readme.
Updated test imports.
Added monkey patching to paramiko single host client as well as parallel client

codecov bot commented Oct 7, 2017

Codecov Report

Merging #86 into master will increase coverage by 2.77%.
The diff coverage is 92.07%.

@@            Coverage Diff             @@
##           master      #86      +/-   ##
==========================================
+ Coverage   89.63%   92.41%   +2.77%     
==========================================
  Files           8       11       +3     
  Lines         444      844     +400     
==========================================
+ Hits          398      780     +382     
- Misses         46       64      +18
Impacted Files                       Coverage Δ
tests/embedded_server/__init__.py    100% <ø> (ø)
pssh/exceptions.py                   100% <100%> (ø) ⬆️
pssh/pssh2_client.py                 100% <100%> (ø)
pssh/utils.py                        100% <100%> (ø) ⬆️
pssh/constants.py                    100% <100%> (ø) ⬆️
pssh/ssh_client.py                   87.14% <85.18%> (-0.3%) ⬇️
pssh/ssh2_client.py                  89.52% <89.52%> (ø)
pssh/pssh_client.py                  96.2% <90%> (+5.97%) ⬆️
pssh/base_pssh.py                    97.19% <97.19%> (ø)
... and 5 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 2e82198...b5805e9.

pkittenis added some commits Oct 7, 2017

Updated ssh2 client sftp functions.
Updated travis, appveyor cfgs.
Updated readme.
Updated documentation.
Updated sftp tests.
Added get_last_output function and cmds attribute to parallel clients and tests.
Updated changelog, documentation.
Updated ssh2 client for non-unix platforms.
Updated appveyor cfg and upload script.
Fixed output line parsing bug, added test.

@pkittenis pkittenis merged commit 900c3e2 into master Oct 14, 2017

6 checks passed

codecov/patch 92.07% of diff hit (target 89.63%)
codecov/project 92.41% (+2.77%) compared to 2e82198
continuous-integration/appveyor/branch AppVeyor build succeeded
continuous-integration/appveyor/pr AppVeyor build succeeded
continuous-integration/travis-ci/pr The Travis CI build passed
continuous-integration/travis-ci/push The Travis CI build passed

@pkittenis pkittenis referenced this pull request Oct 14, 2017

Closed

[WIP] - Libssh2 functionality #95

18 of 18 tasks complete