
New features:

  1. Added support for SRQ (shared receive queue) in RDMA. This allows receive resources (such as the xio_task and its adjacent memory) to be shared among several QPs, significantly reducing Accelio's memory footprint. SRQ is currently supported only in user-mode Accelio and is therefore disabled by default. It can be enabled with a flag during configuration:

    ./configure --enable-shared-receive-queue=yes

  2. Added new types of session events:

    1. XIO_SESSION_CONNECTION_RECONNECTING_EVENT, which indicates that a reconnect has begun
    2. XIO_SESSION_CONNECTION_RECONNECTED_EVENT, which indicates that a reconnect has ended successfully
  3. Added support for Debian packaging
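
A minimal sketch of reacting to the two new reconnect events in a session-event callback. The callback signature follows the usual xio session_ops convention; the pause/resume comments are illustrative only and not part of the API.

```c
#include <stdio.h>
#include "libxio.h"

/* Session-event callback sketch: the two new reconnect events let the
 * application distinguish a reconnect in progress from its completion. */
static int on_session_event(struct xio_session *session,
			    struct xio_session_event_data *event_data,
			    void *cb_user_context)
{
	switch (event_data->event) {
	case XIO_SESSION_CONNECTION_RECONNECTING_EVENT:
		/* reconnect has begun: e.g. pause sending until it completes */
		fprintf(stderr, "connection is reconnecting\n");
		break;
	case XIO_SESSION_CONNECTION_RECONNECTED_EVENT:
		/* reconnect ended successfully: resume normal operation */
		fprintf(stderr, "connection reconnected\n");
		break;
	default:
		break;
	}
	return 0;
}
```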

Changes:

  1. The user application can now configure the nexus close timeout by passing the XIO_OPTNAME_TRANSPORT_CLOSE_TIMEOUT parameter to xio_set_opt(). The parameter is in milliseconds (default: 1 minute) and represents the time that passes between closing the last xio_connection on a transport and closing the transport itself (TCP socket or RDMA QP).
  2. The user can now configure the number of tasks in the RDMA receive queue by passing rq_depth in xio_context_params when creating the xio_context. Passing 0 indicates that the default should be used (the receive queue will be "XIO_MAX_IOV + constant" deep). This is useful, for example, when the user knows for a fact that all QPs on this context will only seldom receive messages, all with a small number of vector elements, and wishes to save memory.
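
The two changes above can be sketched together as follows. The xio_set_opt(NULL, XIO_OPTLEVEL_ACCELIO, ...) calling convention and the exact xio_context_create() argument list are assumed from this release's API and may differ slightly between versions.

```c
#include "libxio.h"

/* Sketch: shorten the nexus close timeout and size the RDMA receive
 * queue when creating a context. */
static struct xio_context *create_ctx(void)
{
	int close_timeout_ms = 10000;  /* 10 s instead of the 1 min default */
	struct xio_context_params ctx_params = {0};

	xio_set_opt(NULL, XIO_OPTLEVEL_ACCELIO,
		    XIO_OPTNAME_TRANSPORT_CLOSE_TIMEOUT,
		    &close_timeout_ms, sizeof(close_timeout_ms));

	ctx_params.rq_depth = 64;  /* 0 would select the default depth */

	/* polling timeout 0, no cpu hint */
	return xio_context_create(&ctx_params, 0, -1);
}
```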

Bug fixes:

  1. Fixed the header not appearing in assign_data_in_buf() in TCP. When using XIO_TCP_READ in dual stream, the user header was attached to the user data rather than to the xio_header, causing the assign_data_in_buf() callback to be triggered on a message without the header.
  2. When a nexus was shared by multiple connections, not all "connection established" notifications were posted, since the work member was overwritten repeatedly. The work member is now dynamically allocated and freed.
Pre-release: Merge branch 'for_next' for v1.7-rc1

@ilansmith ilansmith released this Jul 1, 2016 · 3 commits to master since this release

@katyakats katyakats released this Apr 5, 2016 · 38 commits to master since this release


New features:

1. xio_connection_ioctl() has an option to check whether a connection is the leading connection on the server side. This can be useful when the user chooses to forward and wants to differentiate between the closing of the leading connection (the initial connection received immediately following the on_new_session callback) and the connection on which messages are received.
2. Kernel reconnect.
3. The XIO_THREAD_SAFE_DEBUG flag was added to configure. This flag checks whether the application written on top of Accelio is thread safe. It hurts performance significantly and should be used only during development.
4. Added an RPM package.

Changes:

1. Accelio/user space: xio_init() must be called explicitly before calling any xio function, and xio_shutdown() must be called after the last call to any Accelio function. If xio_init() was not called, xio_context_create() will fail.
2. A warning is now printed if the user tries to register memory whose size is not aligned to the page size.
3. Added a disconnect timeout to xio_connection_params. The timeout is at least 1 second; the default is 5 minutes. This makes the disconnect timeout (the timer armed when sending the FIN request) configurable.
4. Added a flag in xio_context_params to apply memory registration. This causes memory to be registered when RDMA is configured on the machine, even when the user is running over TCP.
5. The disable-raio-build flag skips compiling the raio example, even when compiling with the enable-kernel-module flag.
6. Nbdx/raio:
  • raio_server: add an option to set the number of threads per server process
  • raio_server: cpumask was extended to support large multi-CPU systems
  • raio_server: a minimal iodepth is now set for portal data
  • keep_alive messages were disabled for nbdx and raio_server
7. Accelio/fio: supports fio 2.2.11+.
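
The explicit library lifecycle required by change 1 can be sketched as below; the xio_context_create() argument list is assumed and may differ between versions.

```c
#include "libxio.h"

int main(void)
{
	struct xio_context *ctx;

	xio_init();			/* must precede any other xio call */

	ctx = xio_context_create(NULL, 0, -1);
	if (ctx) {
		/* ... create sessions, run the event loop, etc. ... */
		xio_context_destroy(ctx);
	}

	xio_shutdown();			/* after the last xio call */
	return 0;
}
```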

Bug fixes:

1. Multiple bug fixes in the flow where xio_connect() is called from a thread other than the one running the Accelio event loop.
2. Multiple bug fixes in user-space reconnect.
3. RDMA/kernel: bug fix for the case where a response contains a header only (setting txd.nents to 1 in that case).
4. Multiple bug fixes in the TCP transport:
  • the on_msg callback parameter provided an erroneous last_in_rxq indication
  • completed messages were wrongly flushed
  • synchronized fd addition to and removal from xio_context; before the fix, removing an fd that was never added to xio_context ended in an error message
5. If a connection already exists for a given session+ctx, the existing one should not be freed.
6. A connection timeout in FIN_WAIT_1 did not notify the "session teardown" event.
7. Solved a race condition when the transport gets disconnected (between notifying connection teardown and transport closed) that caused the nexus to be released twice.
8. Avoid calling on_msg_error on Accelio's internal messages.
9. Allow connection removal immediately after it is created but before it is established.
10. When a connection was in FIN_WAIT_1 state and a sudden disconnect occurred, flushing the FIN request resulted in a miscalculated reference count on connection teardown.
11. On nexus error/refused session, lead_con and redir_con were not set to NULL. If needed, this now happens after the connection is destroyed.
12. Check whether the connection still exists by the time the nexus is refused.
13. Check whether RDMA devices are installed, and if not, bypass RDMA.
14. fio: on some machines, the fio plugin could not be built due to relative paths.

Open issues:

1. Occasionally seeing error prints of "user object leaked".
2. Accelio/tcp: the header is not received in the assign_in_buf callback.
3. The raio server sometimes crashed after a second run of client fio with nbdx.
4. Occasionally getting a crash in kernel/rdma in xio_unmap_desc following xio_context_destroy.

@katyakats katyakats released this Sep 16, 2015 · 107 commits to master since this release


New features:

1. Reconnect (user space, GA): allows detection of and recovery from connection failure. The feature is disabled by default and can be turned on using xio_set_opt (for more information see the programmer guide). Configuration manual: https://community.mellanox.com/docs/DOC-2158.
2. Connection keep-alive: both client and server send heartbeat messages to one another. If no heartbeat responses are received for 3 messages, the connection is closed (for more information see the programmer guide).
3. New programmer guide, which can be found at http://www.accelio.org/wp-admin/accelio_doc/index.html.
4. Added support for ConnectX-4 (MLNX_OFED 3.1 is needed).
5. Added support for kernel 4.2.

Changes:

1. NBDX is merged into the accelio/examples/raio/kernel directory.
2. RAIO:
  • fio's raio plugin directory is moved into raio directory
  • adding nbdx's admin tools and scripts to enable running with fio
  • support for buffer alignment for o_direct operations
  • update fio scripts to include gtod_reduce; setting gtod_reduce in fio results in slightly better IOPS at the expense of omitting latency measurements
3. accelio/kernel: rework on hello world example:
  • initializing, filling data and releasing xio_msg correctly
  • server sends response data to client
4. xio_context_params has a new parameter: max_conns_per_ctx. This allows tuning the maximum number of tasks required to serve all connections attached to the current context.
5. The xio_context_modify_ev_handler method is exported.
6. Accelio build flags:
  • enable-extra-checks: on by default. To disable, run ./configure with enable-extra-checks=no.
  • stat-counters: on by default. To disable, run ./configure with stat-counters=no.

Bug fixes:

1. Accelio tcp:
  • simulated "send completion" in TCP was triggered only after a long time
  • messages were not transmitted if the send completion batching threshold was reached
  • batching messages without setting "last in batch" would cause messages to wait in the queue for a long time
  • error "epoll_ctl failed. Bad file descriptor" after shutting down the application; happened on TCP when the socket fd was closed but not removed from the event loop
  • epoll fd lookup failed due to removal of an fd that was never added to the epoll
2. Check last_in_rxq for WC_RDMA_READ as well; previously this resulted in a notification that the message is not last in batch.
3. xio_mempool crash when alloc_quantum_nr=1.
4. Accelio kernel: occasionally, when sending a response of size 8K-10K using the RDMA transport, there is an IB_WC_REM_INV_REQ_ERR error and the response is not sent.

Known issues:

1. Sometimes when running fio over NBDX with a big num_jobs there is a crash.
2. Occasionally the kernel crashes in xio_rdma_task_pre_put.
3. Kernel reconnect is not working.
4. TCP TPS performance degradation.

@katyakats katyakats released this Jun 3, 2015 · 177 commits to master since this release


New Features:

  1. Reconnect (beta). This feature allows detection and recovery from connection failure.
    Configuration manual: https://community.mellanox.com/docs/DOC-2158
  2. Task pools are now created on a per-context basis, substantially reducing memory allocation and registration

Changes:

  1. Add hello_lat test for kernel
  2. Add option for cpu bind for kernel tests
  3. API change:
    a. one consistent API for memory allocation/registration and memory pooling; registered memory is returned in a single type, xio_reg_mem
    b. xio_context_create method now receives xio_context_params structure
    c. XIO_OPTNAME_MAX_INLINE_DATA was renamed to XIO_OPTNAME_MAX_INLINE_XIO_DATA and XIO_OPTNAME_MAX_INLINE_HEADER was renamed to XIO_OPTNAME_MAX_INLINE_XIO_HEADER.
    d. The xio_msg_flag XIO_MSG_FLAG_SMALL_ZERO_COPY was deprecated. It is split into 2 flags:
    i. XIO_MSG_FLAG_PEER_WRITE_RSP
    ii. XIO_MSG_FLAG_PEER_READ_REQ
  4. Kernel latency was improved by adding polling mechanism.
  5. Add an option to change the default max_inline_data
  6. Add option for memory allocation before process start
  7. Add new functions:
    a. xio_version - returns the current version of accelio in the form "accelio_v1.3-rc3-0-gb69a343"
    b. xio_context_poll_wait - poll for events for a specified (possibly infinite) amount of time
    c. xio_context_poll_events - poll for events using direct access to the event signaling resources
    d. xio_connection_ioctl - enables querying a connection's runtime parameters, such as credits
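
Two of the new functions above can be sketched in use as follows; the exact signatures (a string returned by xio_version(), a millisecond timeout for xio_context_poll_wait()) are assumed from their descriptions and may differ between versions.

```c
#include <stdio.h>
#include "libxio.h"

/* Sketch: report the running library version and poll the context once. */
static void poll_once(struct xio_context *ctx)
{
	/* version string of the form "accelio_v1.3-rc3-0-gb69a343" */
	printf("running %s\n", xio_version());

	/* poll for events for up to 100 ms */
	xio_context_poll_wait(ctx, 100);
}
```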

Bug Fixes:

  1. Session: teardown work was removed from inside the work function, preventing a kernel crash.
  2. cm_rejected arriving after the cm_established event caused a crash
  3. flushing tasks before closing the transport
  4. protect server session teardown when session is destructed while connections are still trying to connect
  5. In case ctx of last connection in session was destroyed before session_destroy, calling session_destroy resulted in dangling pointer
  6. protect against both notification of connection "disconnect" and "close"
  7. protect against NULL verbs that is returned from rdmacm
  8. xio_mem_register/xio_mem_alloc failed on hosts with installed rdma stack and without installed rdma capable HCAs
  9. xio_context_run_loop, when run with a timeout, exited long after the timeout had passed
  10. a beacon send that never completed left xio's rdma_handle with a positive reference count

Known issues:

  1. Accelio kernel: occasionally when sending response of size 8K-10K using rdma transport there is IB_WC_REM_INV_REQ_ERR error and the response is not being sent

System Requirements

Driver stack:

  • Inbox Infiniband stack
  • MLNX_OFED 3.0/2.4

User Space Operating system:

  • RHEL 6.4, and up
  • Ubuntu 12.04 and up

Kernel Module:

  • Linux Kernel 3.13 and up

Tested RDMA NICs:

  • Mellanox ConnectX-3
  • Mellanox ConnectX-3 Pro
  • Mellanox ConnectIB
Accelio v1.4-rc1

@rosenbaumalex rosenbaumalex released this Dec 30, 2014 · 444 commits to master since this release


New Features:

  1. TCP transport (kernel)
  2. Flow Control
    Support for message-based and byte-based queue depth.
    This feature is disabled by default; to enable flow control, use xio_set_opt
    with the following options:
    a. XIO_OPTNAME_ENABLE_FLOW_CONTROL, // enables byte based flow control
    b. XIO_OPTNAME_SND_QUEUE_DEPTH_MSGS, // maximum tx queued msgs
    c. XIO_OPTNAME_RCV_QUEUE_DEPTH_MSGS, // maximum rx queued msgs
    d. XIO_OPTNAME_SND_QUEUE_DEPTH_BYTES, // maximum tx queued bytes
    e. XIO_OPTNAME_RCV_QUEUE_DEPTH_BYTES, // maximum rx queued bytes
  3. TOS (type of service) for connection
    Currently supported in the RDMA transport only (user & kernel)
  4. Tune Accelio internal buffer pool via xio_set_opt(XIO_OPTNAME_CONFIG_MEMPOOL)
  5. RDMA transport improved resilience (QP retry_count=3 & rnr_retry_count=3)
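
The flow-control feature above can be sketched as follows; the XIO_OPTLEVEL_ACCELIO option level and the integer types of the option values are assumptions, not confirmed by these notes.

```c
#include <stdint.h>
#include "libxio.h"

/* Sketch: enable flow control and bound the send queue by both
 * message count and byte count. */
static void enable_flow_control(void)
{
	int on = 1;
	int snd_msgs = 1024;		/* maximum tx queued msgs */
	uint64_t snd_bytes = 1 << 20;	/* maximum tx queued bytes */

	xio_set_opt(NULL, XIO_OPTLEVEL_ACCELIO,
		    XIO_OPTNAME_ENABLE_FLOW_CONTROL, &on, sizeof(on));
	xio_set_opt(NULL, XIO_OPTLEVEL_ACCELIO,
		    XIO_OPTNAME_SND_QUEUE_DEPTH_MSGS,
		    &snd_msgs, sizeof(snd_msgs));
	xio_set_opt(NULL, XIO_OPTLEVEL_ACCELIO,
		    XIO_OPTNAME_SND_QUEUE_DEPTH_BYTES,
		    &snd_bytes, sizeof(snd_bytes));
}
```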

Changes:

  1. API Change: msg direction parameter was added to on_msg_error, differentiating incoming and outgoing errors
  2. Session established notification comes before first connection established notification
  3. Buffer pool is now created per context and not per transport – this improves multi-threaded buffer allocation from pool
  4. Adding new configure flag "--enable-rdma" to be able to disable rdma transport in Accelio. This enables tcp only build even if InfiniBand development files are installed
  5. Include files re-factoring – there are 3 '*.h' files: xio_base.h, xio_user.h and xio_kernel.h
  6. New configuration options under xio_set_opts:
    a. XIO_OPTNAME_ENABLE_FORK_INIT option to call Infiniband's ibv_fork_init()
    b. XIO_OPTNAME_MAX_INLINE_HEADER and XIO_OPTNAME_MAX_INLINE_DATA – determine the internal buffer size that is used by data send /recv
    c. XIO_OPTNAME_TRANS_BUF_THRESHOLD is deprecated
    d. XIO_OPTNAME_CONFIG_MEMPOOL - configure internal memory pool
  7. RDMA Transport close flow rework (kernel and user)
  8. Improving context creation time
  9. Support user-assigned buffers of different sizes for TCP
  10. Statistic counters for one way messages
  11. Add Accelio kernel hello-test
  12. Accelio is C++ compatible

Bug Fixes:

  1. After assigning data-in buffers, an RDMA local protection error occurred when the user provided a buffer with a NULL address
  2. When the server rejected a client session, one message was still sent from the client to the server
  3. CQ overflow triggered asynchronous event 'local queue catastrophic error'
  4. Large arriving messages (RDMA read) were queued in the receiver and not executed; they were executed only after the arrival of a new message
  5. Kernel:
    a. Crash during rdma module unloading
    b. RDMA_WRITE: allow the user to rely on Accelio's internal pool
    c. RDMA_WRITE for vector sizes > 1
    d. Upon async error call rdma_disconnect without changing state to disconnected
    e. Compilation on RH7
    f. Multi-thread example fixes

Known issues:

  1. Rare kernel crashes after resources released
  2. High connection establishment time for RDMA transport (~24 msec)
  3. When using ibv_fork_init() there is performance penalty during memory registration or new connection establishment
  4. High Accelio kernel latency

System Requirements

Driver stack:

  • Inbox Infiniband stack
  • MLNX_OFED 2.3/2.2

User Space Operating system:

  • RHEL 6.4, and up
  • Ubuntu 12.04 and up

Kernel Module:

  • Linux Kernel 3.13 and up

Tested RDMA NICs:

  • Mellanox ConnectX-3
  • Mellanox ConnectX-3 Pro
  • Mellanox ConnectIB