Standing alan_working pull request. Do not merge! #181

Merged
merged 260 commits into accre:master on Apr 7, 2020

Conversation

@tacketar (Contributor) commented on Sep 7, 2017

This is a placeholder to make it easy to generate RPMs and DEBs. It should NEVER be merged into master.

A pipe() is used to communicate with the central 0mq processing thread
about outgoing messages.  The 0mq event thread handles both incoming
and outgoing communication and uses 0mq's ability to monitor both 0mq
sockets and arbitrary FDs under the same notification umbrella.
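As a rough illustration of that pattern (not the actual LServer code; the loop and handler names below are hypothetical), a single zmq_poll() call can watch the 0mq socket and the pipe's read end together:

```c
#include <zmq.h>

void handle_incoming(void *zsock);    /* hypothetical stand-ins for the */
void handle_outgoing(int pipe_rfd);   /* real message handlers          */

/* One event loop blocks on both notification sources at once. */
void event_loop(void *zsock, int pipe_rfd)
{
    zmq_pollitem_t items[2] = {
        { .socket = zsock, .fd = 0,        .events = ZMQ_POLLIN },  /* 0mq socket  */
        { .socket = NULL,  .fd = pipe_rfd, .events = ZMQ_POLLIN }   /* arbitrary FD */
    };

    for (;;) {
        zmq_poll(items, 2, -1);   /* wait until either source is ready */
        if (items[0].revents & ZMQ_POLLIN) handle_incoming(zsock);
        if (items[1].revents & ZMQ_POLLIN) handle_outgoing(pipe_rfd);
    }
}
```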

The old code protected both the pipe read and the pipe write with the same lock. When the pipe buffer filled (around 64K of pending tasks), the next pipe write would block while still holding the lock, which in turn blocked the pipe read: a deadlock.
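In miniature, the broken locking looked something like this (hypothetical names; the real code paths are more involved):

```c
#include <pthread.h>
#include <unistd.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static int pipe_rfd, pipe_wfd;   /* read/write ends of the notification pipe */

/* Submitter side: writes to the pipe while holding the shared lock. */
void notify_task(const void *msg, size_t len)
{
    pthread_mutex_lock(&lock);
    write(pipe_wfd, msg, len);    /* blocks once the pipe buffer fills... */
    pthread_mutex_unlock(&lock);  /* ...so the lock is never released     */
}

/* Event-thread side: needs the same lock before it can drain the pipe. */
void drain_notifications(void *buf, size_t len)
{
    pthread_mutex_lock(&lock);    /* blocks forever once the writer stalls */
    read(pipe_rfd, buf, len);
    pthread_mutex_unlock(&lock);
}
```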

There is no need for the pipe I/O to be protected by a lock, but moving it outside the lock exposed another issue: thread explosion. There is no throttle on the thread pool, since tasks can submit other thread pool tasks (and so on), each waiting for the other to complete. Logic is in place to minimize thread creation, which significantly limits this explosion, but every external task still generates a new thread on the LServer. So when a large, rapid-fire succession of incoming commands is received, the server tries to process them as quickly as possible by spawning threads as needed.

Simply creating a smaller thread pool doesn't work, because the task submission loops mentioned earlier could then deadlock. For the LServer this occurs solely with tasks that require multiple responses for the same task, all of which are handled via the mq_stream routines, as sketched below.
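Sketched with a hypothetical pool API (tp_submit()/tp_wait() are illustrative stand-ins, not the real calls), the loop that rules out a simple cap looks like:

```c
typedef struct task_s task_t;
typedef struct tp_s   tp_t;

extern tp_t *pool;                                     /* the shared thread pool */
task_t *tp_submit(tp_t *p, void (*fn)(void *), void *arg);
void    tp_wait(task_t *t);
void    child_task(void *arg);

/* A pooled task submits a child to the same pool and waits on it.  With a
 * hard cap of N threads and N parents all parked in tp_wait(), no child
 * can ever be scheduled: deadlock. */
void parent_task(void *arg)
{
    task_t *child = tp_submit(pool, child_task, arg);  /* queued, not yet running */
    tp_wait(child);                                    /* holds a pool slot while waiting */
}
```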

The solution implemented by the patch is to artificially create two different thread pools: one for short-running and one for long-running tasks. Technically there is only a single thread pool, along with a counter holding the number of tasks in the short-running "pool". All incoming tasks are placed in the short-running "pool". This pool has a cap on the number of tasks that can be running at the same time, controlled by the "bind_short_running_max" variable in the MQ section of the config, as shown below.

This acts as an inherent throttle on incoming tasks. Any task that requires multiple mq_stream responses automatically moves itself into the unlimited long-running "pool". This way the long-running task can wait for an "OK to send more data" response from the client, which arrives through the short-running "pool", without deadlocking. This is implemented by adding a couple of routines to the API that allow any task to switch pools, along the lines of the sketch below.
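A minimal sketch of that bookkeeping, assuming a counter guarded by a mutex (the names below are hypothetical, not the actual API additions):

```c
#include <pthread.h>

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    int short_running;      /* tasks currently in the short-running "pool" */
    int short_running_max;  /* bind_short_running_max from the config      */
} tp_throttle_t;

/* Called before launching an incoming task: block while the short pool is full. */
void tp_short_acquire(tp_throttle_t *t)
{
    pthread_mutex_lock(&t->lock);
    while (t->short_running >= t->short_running_max)
        pthread_cond_wait(&t->cond, &t->lock);
    t->short_running++;
    pthread_mutex_unlock(&t->lock);
}

/* Called by a task about to issue multiple mq_stream responses: leave the
 * capped pool so the replies routed through it cannot deadlock us. */
void tp_move_to_long(tp_throttle_t *t)
{
    pthread_mutex_lock(&t->lock);
    t->short_running--;              /* frees a slot for new incoming tasks */
    pthread_cond_signal(&t->cond);
    pthread_mutex_unlock(&t->lock);
}
```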
All tasks are put on a WQ, where a single thread pops them off, sorts them, merges them into larger tasks, and executes them.
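A rough sketch of that consumer loop (every type and helper here is a hypothetical stand-in):

```c
typedef struct wq_s        wq_t;          /* opaque work queue (hypothetical) */
typedef struct task_list_s task_list_t;   /* a batch of pending tasks         */

task_list_t *wq_pop_all(wq_t *q);          /* block until tasks are pending   */
void task_sort(task_list_t *b);            /* e.g. order by file and offset   */
void task_merge_adjacent(task_list_t *b);  /* coalesce into larger operations */
void task_execute(task_list_t *b);

/* The single WQ consumer: drain, sort, merge, execute, repeat. */
void *wq_consumer(void *arg)
{
    wq_t *q = arg;
    for (;;) {
        task_list_t *batch = wq_pop_all(q);
        task_sort(batch);
        task_merge_adjacent(batch);
        task_execute(batch);
    }
    return NULL;   /* not reached */
}
```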

The patch supports both AIO and WQ. The default is AIO, with WQ enabled via a lio_wq_enable() call.
You can now test any of the I/O modes, including hitting a local file system. The local file system support was added to test the FUSE driver. The tester also supports hitting multiple targets to detect inter-file issues.
tacketar and others added 29 commits October 24, 2019 09:38
Currently several options need to be specified in two locations: on the command line and on the fuse_conn itself. Keeping them in sync is problematic.
… code

This adds support for lio_cp, lio_get, and lio_put to work on purely local data without requiring a LIO target. The purpose is to help with performance tuning of the FUSE mount.
@PerilousApricot merged commit 9a4d925 into accre:master on Apr 7, 2020