[BUG] Errors on socket write while connections/filters are closing #116

Closed
frikilax opened this issue Nov 5, 2019 · 0 comments · Fixed by #100

frikilax (Member) commented Nov 5, 2019

Describe the bug
A filter can get stuck while sending data to the next filter when the inbound connection is closing or the filter has received a SIGTERM.

Platform (please complete the following information):

  • OS (version): Ubuntu and FreeBSD 12.0
  • Darwin version: 1.0.2 (+ threading modifications)

To Reproduce
Steps to reproduce the behavior:

  1. Launch a Content_inspection filter (either standalone or with darwin manager) with a next filter (either Logs or a simple unix socket with nc -Ulk) and type LOG, set several threads (around 5)
  2. Launch Rsyslog with impcap and mmdarwin, route mmdarwin output for content_inspection (see content inspection wiki), make sure to set the response_type to "darwin" or "both"
  3. Generate traffic (if necessary) to make all threads work (open/refresh several web pages at once, for example)
  4. Send a SIGTERM to the filter (either with Ctrl-C when executed with -z, or with htop/top/kill/whatever)
  5. Some of the threads should stop, but not all

Expected behavior
The filter should close cleanly.

Additional context
The filter seems to hang on the synchronous write to the next filter.
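
To illustrate how a synchronous write to a unix socket can hang, here is a minimal POSIX sketch. It is not taken from Darwin; the function name `BytesUntilWriteWouldBlock` is hypothetical. It fills a `socketpair` whose peer never reads, using a non-blocking descriptor so that the demo reports `EAGAIN` instead of hanging. With a blocking descriptor, the same `write` call would simply not return.

```cpp
#include <cassert>
#include <cerrno>
#include <fcntl.h>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical demo, not Darwin code.
// Returns the number of bytes the kernel accepted before a further write
// would have blocked, or -1 on any setup or unexpected write error.
inline long BytesUntilWriteWouldBlock() {
    int fds[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0) return -1;

    // Non-blocking mode: a full buffer yields EAGAIN instead of a hang.
    const int flags = fcntl(fds[0], F_GETFL, 0);
    if (flags == -1 || fcntl(fds[0], F_SETFL, flags | O_NONBLOCK) == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    const char chunk[4096] = {};
    long total = 0;
    bool would_block = false;
    for (;;) {
        // fds[1] is never read, like a next filter that stopped consuming.
        const ssize_t n = write(fds[0], chunk, sizeof(chunk));
        if (n > 0) {
            total += n;
            continue;
        }
        if (n == -1 && errno == EINTR) continue;  // interrupted, retry
        would_block = (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK));
        break;
    }

    close(fds[0]);
    close(fds[1]);
    return would_block ? total : -1;
}
```

A positive return value means the socket buffer filled up; at that point a blocking writer would wait until the peer reads or closes, which matches the symptom described above.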

frikilax added the bug, technical and important labels Nov 5, 2019
frikilax self-assigned this Nov 5, 2019
frikilax added this to the Version 1.1 milestone Nov 5, 2019
HugoSoszynski mentioned this issue Nov 6, 2019
7 tasks
HugoSoszynski added a commit that referenced this issue Nov 6, 2019
**Breaking change**: fix or feature that would cause existing functionality to not work as expected.

## Related Issue(s)

- Resolve #98 
- Fix #74 
- Fix #116

## Description

### Core
Added a simple thread group to the project to run the asio callbacks.
The ThreadGroup class uses a plain list of threads because it only needs to create the threads once and join them all once.
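
As an illustration of the "create once, join once" idea, here is a minimal standard-library sketch. It is not the actual ThreadGroup class from this commit; `SimpleThreadGroup`, `CreateThread`, `JoinAll` and `RunWorkers` are hypothetical names.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <list>
#include <thread>
#include <utility>

// Hypothetical sketch, not Darwin's ThreadGroup.
class SimpleThreadGroup {
public:
    SimpleThreadGroup() = default;
    SimpleThreadGroup(const SimpleThreadGroup&) = delete;
    SimpleThreadGroup& operator=(const SimpleThreadGroup&) = delete;

    // Join on destruction so no std::thread is destroyed while joinable.
    ~SimpleThreadGroup() { JoinAll(); }

    // Start one thread running `fn` and keep it in the list.
    template <typename Fn>
    void CreateThread(Fn&& fn) {
        threads_.emplace_back(std::forward<Fn>(fn));
    }

    // Wait for every thread, then drop them from the list.
    void JoinAll() {
        for (auto& t : threads_) {
            if (t.joinable()) t.join();
        }
        threads_.clear();
    }

    std::size_t Size() const { return threads_.size(); }

private:
    std::list<std::thread> threads_;
};

// Usage helper: run `count` workers that each increment a shared counter.
inline int RunWorkers(int count) {
    std::atomic<int> counter{0};
    SimpleThreadGroup group;
    for (int i = 0; i < count; ++i) {
        group.CreateThread([&counter] { ++counter; });
    }
    group.JoinAll();
    assert(group.Size() == 0);
    return counter.load();
}
```

A list is enough here because threads are only appended and then joined together; nothing needs random access or removal of a single thread.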

On a stopping signal, the sessions are no longer cleaned before the handlers
have finished running.
The io_context is now stopped to prevent any callback from being pushed to the
queue during a graceful exit.
This removes memory errors, leaks, bad_weak_ptr and bad file descriptors.
Added it to monitoring thread too.
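
The shutdown order described above (stop the queue, wait for running handlers, then clean the sessions) can be sketched with the standard library only. This is a simplified stand-in for the io_context, not Boost.Asio and not Darwin's code: `StoppableQueue` and `GracefulShutdownDemo` are hypothetical, and rejecting posts after `Stop()` is an assumption made to keep the sketch short.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for an event loop; not Boost.Asio's io_context.
class StoppableQueue {
public:
    // Returns false once Stop() has been called: nothing new is queued.
    bool Post(std::function<void()> task) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (stopped_) return false;
        tasks_.push_back(std::move(task));
        cv_.notify_one();
        return true;
    }

    // Worker loop: runs queued tasks until Stop() is called.
    void Run() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                cv_.wait(lock, [this] { return stopped_ || !tasks_.empty(); });
                if (stopped_) return;  // pending tasks are abandoned
                task = std::move(tasks_.front());
                tasks_.pop_front();
            }
            task();  // run outside the lock
        }
    }

    void Stop() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopped_ = true;
        }
        cv_.notify_all();
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    std::deque<std::function<void()>> tasks_;
    bool stopped_ = false;
};

// Stop first, join the worker, and only then clean the shared state.
inline bool GracefulShutdownDemo() {
    StoppableQueue queue;
    std::vector<int> sessions{1, 2, 3};  // stand-in for session objects
    std::atomic<int> handled{0};

    std::thread worker([&queue] { queue.Run(); });
    const bool accepted = queue.Post([&sessions, &handled] {
        if (!sessions.empty()) ++handled;
    });
    assert(accepted);
    (void)accepted;

    queue.Stop();      // no new callbacks from here on
    worker.join();     // the handler has either finished or was never started
    sessions.clear();  // safe: no handler can still touch the sessions

    return !queue.Post([] {});  // true: posts are rejected after Stop()
}
```

Clearing `sessions` before `worker.join()` would let a running handler read freed state, which is the kind of memory error the change above is meant to remove.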

Reworked the "Send" methods of the session to fix filter closing and a potential data race on closing filters.

Some refactoring around the Generator and the Session.

### Cmake
- Updated C++ version from 14 to 17
- Removed fend from default compilation
- Removed freputation from default compilation

### Redis Manager
- works for heavy multithreading
- supports individual connections for each thread
- supports master/slave connections
- easy query, wide response types (int, string, array)
- automatic disconnection/reconnection timeouts (default 20 seconds)
- keep alive thread
- fallback to slave connection if master failed and slave exists
- still connects to IP:port or unix sockets
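
The fallback behaviour listed above (use the slave connection if the master fails and a slave exists) could look roughly like the sketch below. This is not the RedisManager API: `RedisRole`, `RedisEndpoint`, `ConnectFn` and `ConnectWithFallback` are hypothetical, and the actual connection attempt is injected as a callback so that the sketch does not depend on any Redis client library.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <string>

// Hypothetical types for illustration only; not Darwin's RedisManager.
enum class RedisRole { Master, Slave };

struct RedisEndpoint {
    std::string address;  // "ip:port" or a unix socket path
};

// Injected connection attempt: returns true when the endpoint is reachable.
using ConnectFn = std::function<bool(const RedisEndpoint&)>;

// Try the master first, then the slave if one is configured.
// Returns the role that was connected, or std::nullopt if both failed.
inline std::optional<RedisRole> ConnectWithFallback(
        const RedisEndpoint& master,
        const std::optional<RedisEndpoint>& slave,
        const ConnectFn& try_connect) {
    assert(try_connect);  // an empty connector is a programming error
    if (try_connect(master)) return RedisRole::Master;
    if (slave && try_connect(*slave)) return RedisRole::Slave;
    return std::nullopt;
}
```

Passing the connector as a parameter keeps the decision logic testable without a running Redis server, which fits test cases such as "master fail" and "master off".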

### Filters
- Updating all the filters to support threading.
- Updating all the filters to match the refactor of Generator and Session.

### Content Inspection
- Fix memory leaks
- Rename implementation of TCP states to avoid conflicts (GNU implementation not used as order of definitions is important)
- Add conditional inclusion for assert()

### DGA
- Now closing tensorflow Session the proper way.
- Tensorflow still has internal memory leaks that we cannot solve

### Session _(the filter)_
- Rework functions to send data to filters/clients
- Removed cache because of potential security issues while using it (e.g. session deleted in redis but still in the in-memory cache)

### Tests
**RedisManager:**
- Simple master connect
- Slave -> master connect
- Slave -> master connect, master fail
- Slave -> master connect, master off
- Master connect, disconnect/reconnect after timeout
- Multi-threading tests to the RedisManager

**DGA:** _(The tests do not use valgrind due to internal tensorflow memory leaks)_
- Token map loading tests

## Test Environments

### FreeBSD 12.0
- Redis 4.0.14
- Boost 1.71.0
- clang++ 8.0.1
- CMake 3.15.2
- Python 3.6.9

### Ubuntu 19.04
- Redis 5.0.3
- Boost 1.71.0
- g++ 8.3.0
- CMake 3.13.4
- Python 3.7.3
- Valgrind 3.14.0