[BUG] Darwin error at service stop in Vulture OS #74

Closed
HugoSoszynski opened this issue Oct 2, 2019 · 4 comments · Fixed by #100

@HugoSoszynski
Contributor

Describe the bug
Running service darwin stop produces CRITICAL log entries reporting that some filters (not all) cannot open their UNIX socket because of a bad_weak_ptr exception.

Platform (please complete the following information):

  • OS (version): FreeBSD 12.0
  • Darwin version: 1.0
  • Vulture BASE version: 0.9.7
  • Vulture GUI version: 0.9.92

To Reproduce
Steps to reproduce the behavior:

  1. Install Vulture 4
  2. Create a new Darwin Policy containing:
       • fconnection
       • finspection
       • fdga
       • fhostlookup
       • ftanomaly
  3. Open logs in one terminal (tail -f /var/log/darwin/darwin.log)
  4. Run service darwin stop

Expected behavior
Darwin filters terminating without errors.

Screenshots/logs
Try 1:

{"date":"Wed Oct  2 12:48:49 2019","level":"CRITICAL","filter":"logs_1","message":"Core::run:: Cannot open unix socket: bad_weak_ptr"} 
{"date":"Wed Oct  2 12:48:49 2019","level":"CRITICAL","filter":"content_inspection_2","message":"Core::run:: Cannot open unix socket: bad_weak_ptr"} 
{"date":"Wed Oct  2 12:48:49 2019","level":"CRITICAL","filter":"tanomaly_2","message":"Core::run:: Cannot open unix socket: bad_weak_ptr"} 

Try 2:

{"date":"Wed Oct  2 12:49:34 2019","level":"CRITICAL","filter":"logs_1","message":"Core::run:: Cannot open unix socket: bad_weak_ptr"}
{"date":"Wed Oct  2 12:49:34 2019","level":"CRITICAL","filter":"content_inspection_2","message":"Core::run:: Cannot open unix socket: bad_weak_ptr"}
{"date":"Wed Oct  2 12:49:34 2019","level":"CRITICAL","filter":"dga_2","message":"Core::run:: Cannot open unix socket: bad_weak_ptr"}
{"date":"Wed Oct  2 12:49:34 2019","level":"CRITICAL","filter":"tanomaly_2","message":"Core::run:: Cannot open unix socket: bad_weak_ptr"}

Additional context
Using Rsyslog's mmdarwin module with an impcap listener.

@HugoSoszynski HugoSoszynski added this to the Version 1.0.1 milestone Oct 2, 2019
@NS4nti
Contributor

NS4nti commented Oct 4, 2019

I can't reproduce this bug. Could you post your config file?

@frikilax
Member

Additional context, observed while working on something else:
this was with the logs filter and a new version of the Redis manager.
The filter was launched in DEVELOPER mode; the logs below are what followed a Ctrl+C.

{"date":"Thu Oct 17 23:19:28 2019","level":"DEBUG","filter":"logs_1","message":"Server::Handle:: Closing acceptor"}
{"date":"Thu Oct 17 23:19:28 2019","level":"DEBUG","filter":"logs_1","message":"Session::Stop::"}
{"date":"Thu Oct 17 23:19:28 2019","level":"DEBUG","filter":"logs_1","message":"Server::Handle:: Closing acceptor"}
{"date":"Thu Oct 17 23:19:28 2019","level":"INFO","filter":"logs_1","message":"Server::HandleAccept:: Acceptor closed, closing server..."}                                                                                                  
{"date":"Thu Oct 17 23:19:28 2019","level":"DEBUG","filter":"logs_1","message":"Session::ReadHeaderCallback:: Reading header"}                                                                                                              
{"date":"Thu Oct 17 23:19:28 2019","level":"INFO","filter":"logs_1","message":"Monitor::HandleAccept:: Acceptor closed, closing monitor..."}                                                                                                
{"date":"Thu Oct 17 23:19:28 2019","level":"WARNING","filter":"logs_1","message":"Session::ReadHeaderCallback:: Operation canceled"}                                                                                                        
{"date":"Thu Oct 17 23:19:28 2019","level":"CRITICAL","filter":"logs_1","message":"Core::run:: Cannot open unix socket: bad_weak_ptr"}                                                                                                      
{"date":"Thu Oct 17 23:19:28 2019","level":"DEBUG","filter":"logs_1","message":"Core::ClearPID:: Removing PID file"}

@HugoSoszynski
Contributor Author

It looks like this happens when we close a filter that is still running a callback.

I changed the way a filter cleans up its resources.
This seems to solve a lot of memory issues when stopping a filter that is running callbacks.

It seems to solve this issue, but needs further testing.
The change will ship in v1.1 along with the addition of threading (moving this issue to the Version 1.1 milestone).

@HugoSoszynski HugoSoszynski self-assigned this Oct 28, 2019
@HugoSoszynski
Contributor Author

HugoSoszynski commented Oct 30, 2019

Seems to be resolved in #100.

It was caused by the Sessions being cleaned up too early.

@HugoSoszynski HugoSoszynski mentioned this issue Nov 6, 2019
HugoSoszynski added a commit that referenced this issue Nov 6, 2019
**Breaking change**: fix or feature that would cause existing functionality to not work as expected.

## Related Issue(s)

- Resolve #98 
- Fix #74 
- Fix #116

## Description

### Core
Added a simple thread group to the project to run the asio callbacks.
The ThreadGroup class uses a simple list of threads because it is designed to create threads once and join them all once.

On a stop signal, the sessions are no longer cleaned up before the handlers have finished running.
The io_context is now stopped to prevent any callback from being pushed onto the queue during a graceful exit.
This removes memory errors, leaks, bad_weak_ptr exceptions and bad file descriptor errors.
The same change was applied to the monitoring thread.

Reworked the "Send" methods of the session to fix filter closing and a potential data race when closing filters.

Some refactoring around the Generator and the Session.

### Cmake
- Updated C++ version from 14 to 17
- Removed fend from default compilation
- Removed freputation from default compilation

### Redis Manager
- works for heavy multithreading
- supports individual connections for each thread
- supports master/slave connections
- easy query, wide response types (int, string, array)
- automatic disconnection/reconnection timeouts (default 20 seconds)
- keep alive thread
- fallback to slave connection if master failed and slave exists
- still connects to IP:port or unix sockets

### Filters
- Updating all the filters to support threading.
- Updating all the filters to match the refactor of Generator and Session.

### Content Inspection
- Fix memory leaks
- Rename implementation of TCP states to avoid conflicts (GNU implementation not used as order of definitions is important)
- Add conditional inclusion for assert()

### DGA
- The TensorFlow Session is now closed properly.
- TensorFlow still has internal memory leaks that we cannot fix

### Session _(the filter)_
- Rework functions to send data to filters/clients
- Removed cache because of potential security issues while using it (e.g. session deleted in redis but still in the in-memory cache)

### Tests
**RedisManager:**
- Simple master connect
- Slave -> master connect
- Slave -> master connect, master fail
- Slave -> master connect, master off
- Master connect, disconnect/reconnect after timeout
- Multi-threading tests to the RedisManager

**DGA:** _(The tests do not use Valgrind due to internal TensorFlow memory leaks)_
- Token map loading tests

## Test Environments

### FreeBSD 12.0
- Redis 4.0.14
- Boost 1.71.0
- clang++ 8.0.1
- CMake 3.15.2
- Python 3.6.9

### Ubuntu 19.04
- Redis 5.0.3
- Boost 1.71.0
- g++ 8.3.0
- CMake 3.13.4
- Python 3.7.3
- Valgrind 3.14.0