Standing alan_working pull request. Do not merge! #181

Merged
merged 260 commits into accre:master on Apr 7, 2020

Conversation

@tacketar (Contributor) commented on Sep 7, 2017

This is a placeholder to make it easy to generate RPMs and DEBs. It should NEVER be merged into master.

A pipe() is used to communicate with the central 0mq processing thread
about outgoing messages.  The 0mq event thread handles both incoming
and outgoing communication and uses 0mq's ability to monitor both 0mq
sockets and arbitrary FDs under the same notification umbrella.
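As a rough illustration of that pattern (not the actual LServer code; the loop and handler names below are hypothetical), a single zmq_poll() call can watch the 0mq socket and the pipe's read end together:

```c
#include <zmq.h>

void handle_incoming(void *zsock);    /* hypothetical stand-ins for the */
void handle_outgoing(int pipe_rfd);   /* real message handlers          */

/* One event loop blocks on both notification sources at once. */
void event_loop(void *zsock, int pipe_rfd)
{
    zmq_pollitem_t items[2] = {
        { .socket = zsock, .fd = 0,        .events = ZMQ_POLLIN },  /* 0mq socket  */
        { .socket = NULL,  .fd = pipe_rfd, .events = ZMQ_POLLIN }   /* arbitrary FD */
    };

    for (;;) {
        zmq_poll(items, 2, -1);   /* wait until either source is ready */
        if (items[0].revents & ZMQ_POLLIN) handle_incoming(zsock);
        if (items[1].revents & ZMQ_POLLIN) handle_outgoing(pipe_rfd);
    }
}
```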

The old code protected both the pipe read and the pipe write with the same lock. When the pipe buffer filled (around 64K of pending tasks), the next pipe write would block while still holding the lock, which in turn blocked the pipe read: a deadlock.
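In miniature, the broken locking looked something like this (hypothetical names; the real code paths are more involved):

```c
#include <pthread.h>
#include <unistd.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static int pipe_rfd, pipe_wfd;   /* read/write ends of the notification pipe */

/* Submitter side: writes to the pipe while holding the shared lock. */
void notify_task(const void *msg, size_t len)
{
    pthread_mutex_lock(&lock);
    write(pipe_wfd, msg, len);    /* blocks once the pipe buffer fills... */
    pthread_mutex_unlock(&lock);  /* ...so the lock is never released     */
}

/* Event-thread side: needs the same lock before it can drain the pipe. */
void drain_notifications(void *buf, size_t len)
{
    pthread_mutex_lock(&lock);    /* blocks forever once the writer stalls */
    read(pipe_rfd, buf, len);
    pthread_mutex_unlock(&lock);
}
```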

There is no need for the pipe I/O to be protected by a lock, but moving it outside the lock exposed another issue: thread explosion. There is no throttle on the thread pool, since tasks can submit other thread pool tasks (and so on), each waiting for the other to complete. Logic is in place to minimize thread creation, which significantly limits this explosion, but every external task still generates a new thread on the LServer. So when a large, rapid-fire succession of incoming commands is received, the server tries to process them as quickly as possible by spawning threads as needed.

Simply creating a smaller thread pool doesn't work, because the task submission loops mentioned earlier could then deadlock. For the LServer this occurs solely with tasks that require multiple responses for the same task, all of which are handled via the mq_stream routines, as sketched below.
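Sketched with a hypothetical pool API (tp_submit()/tp_wait() are illustrative stand-ins, not the real calls), the loop that rules out a simple cap looks like:

```c
typedef struct task_s task_t;
typedef struct tp_s   tp_t;

extern tp_t *pool;                                     /* the shared thread pool */
task_t *tp_submit(tp_t *p, void (*fn)(void *), void *arg);
void    tp_wait(task_t *t);
void    child_task(void *arg);

/* A pooled task submits a child to the same pool and waits on it.  With a
 * hard cap of N threads and N parents all parked in tp_wait(), no child
 * can ever be scheduled: deadlock. */
void parent_task(void *arg)
{
    task_t *child = tp_submit(pool, child_task, arg);  /* queued, not yet running */
    tp_wait(child);                                    /* holds a pool slot while waiting */
}
```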

The solution implemented by the patch is to artificially create two different thread pools: one for short-running and one for long-running tasks. Technically there is only a single thread pool, along with a counter holding the number of tasks in the short-running "pool". All incoming tasks are placed in the short-running "pool". This pool has a cap on the number of tasks that can be running at the same time, controlled by the "bind_short_running_max" variable in the MQ section of the config, as shown below.

This acts as an inherent throttle on incoming tasks. Any task that requires multiple mq_stream responses automatically moves itself into the unlimited long-running "pool". This way the long-running task can wait for an "OK to send more data" response from the client, which arrives through the short-running "pool", without deadlocking. This is implemented by adding a couple of routines to the API that allow any task to switch pools, along the lines of the sketch below.
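A minimal sketch of that bookkeeping, assuming a counter guarded by a mutex (the names below are hypothetical, not the actual API additions):

```c
#include <pthread.h>

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    int short_running;      /* tasks currently in the short-running "pool" */
    int short_running_max;  /* bind_short_running_max from the config      */
} tp_throttle_t;

/* Called before launching an incoming task: block while the short pool is full. */
void tp_short_acquire(tp_throttle_t *t)
{
    pthread_mutex_lock(&t->lock);
    while (t->short_running >= t->short_running_max)
        pthread_cond_wait(&t->cond, &t->lock);
    t->short_running++;
    pthread_mutex_unlock(&t->lock);
}

/* Called by a task about to issue multiple mq_stream responses: leave the
 * capped pool so the replies routed through it cannot deadlock us. */
void tp_move_to_long(tp_throttle_t *t)
{
    pthread_mutex_lock(&t->lock);
    t->short_running--;              /* frees a slot for new incoming tasks */
    pthread_cond_signal(&t->cond);
    pthread_mutex_unlock(&t->lock);
}
```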
All tasks are put on a WQ, where a single thread pops them off, sorts them, merges them into larger tasks, and executes them.
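A rough sketch of that consumer loop (every type and helper here is a hypothetical stand-in):

```c
typedef struct wq_s        wq_t;          /* opaque work queue (hypothetical) */
typedef struct task_list_s task_list_t;   /* a batch of pending tasks         */

task_list_t *wq_pop_all(wq_t *q);          /* block until tasks are pending   */
void task_sort(task_list_t *b);            /* e.g. order by file and offset   */
void task_merge_adjacent(task_list_t *b);  /* coalesce into larger operations */
void task_execute(task_list_t *b);

/* The single WQ consumer: drain, sort, merge, execute, repeat. */
void *wq_consumer(void *arg)
{
    wq_t *q = arg;
    for (;;) {
        task_list_t *batch = wq_pop_all(q);
        task_sort(batch);
        task_merge_adjacent(batch);
        task_execute(batch);
    }
    return NULL;   /* not reached */
}
```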

The patch supports both AIO and WQ. The default is AIO, with WQ enabled via a lio_wq_enable() call.
You can now test any of the I/O modes, including hitting a local file system. The local file system support was added to test the FUSE driver. The tester also supports hitting multiple targets to detect inter-file issues.
tacketar and others added 29 commits October 24, 2019 09:38
Currently several options need to be specified in two locations: on the command line and on the fuse_conn itself. Keeping them in sync is problematic.
… code

This adds support for lio_cp, lio_get, and lio_put to work on purely local data without requiring a LIO target. The purpose is to help with performance tuning of the FUSE mount.
@PerilousApricot merged commit 9a4d925 into accre:master on Apr 7, 2020