logreader/logwriter: fix race condition of idle timer and scheduled I/O job #2650

Merged
merged 3 commits into from Apr 3, 2019


gaborznagy
Collaborator

@gaborznagy gaborznagy commented Mar 28, 2019

There is a race condition between I/O operations scheduled in worker threads and the closing of idle connections (response/ack_timeout timer).
The afsocket source driver has no protection against destroying connections (readers) that have a scheduled I/O operation (unlike the file destination side, where we reap writers).

The idle timer is only stopped once an I/O operation has finished and we want to reschedule the next action on the source side (update_watches()).
Until then, the timer can trigger a connection close even though an I/O operation is scheduled.

The timer should be stopped when we schedule an I/O operation.
This eliminates the race condition between the two sides.
We already stop I/O polling at that point (see the call to log_reader_stop_watches() in log_reader_io_handle_in()), so stopping the idle timer there makes sense too.
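To make the ordering problem concrete, here is a minimal single-threaded sketch of the states described above. It is not syslog-ng code: `ReaderModel` and all `reader_model_*` functions are hypothetical names, and the worker thread is reduced to a flag so the interleaving stays deterministic.

```c
#include <stdbool.h>

/* Hypothetical model of a source connection; NOT syslog-ng's LogReader. */
typedef struct
{
  bool watches_running;   /* fd polling is active */
  bool idle_timer_armed;  /* idle timeout is pending */
  bool io_job_scheduled;  /* a worker thread currently owns the reader */
  bool closed;            /* the connection was torn down */
} ReaderModel;

void
reader_model_init(ReaderModel *r)
{
  r->watches_running = true;
  r->idle_timer_armed = true;
  r->io_job_scheduled = false;
  r->closed = false;
}

/* Main loop: input is available, so the work is handed to a worker thread. */
void
reader_model_io_handle_in(ReaderModel *r, bool stop_idle_timer)
{
  r->watches_running = false;     /* polling is already stopped at this point */
  if (stop_idle_timer)
    r->idle_timer_armed = false;  /* the change this PR proposes */
  r->io_job_scheduled = true;
}

/* Main loop: the idle timeout elapsed. Returns true if the timer fired. */
bool
reader_model_idle_timeout(ReaderModel *r)
{
  if (!r->idle_timer_armed)
    return false;                 /* a stopped timer never fires */
  r->idle_timer_armed = false;
  r->closed = true;               /* the idle connection gets closed */
  return true;
}

/* Main loop: the worker finished; polling and the idle timer are re-armed. */
void
reader_model_work_finished(ReaderModel *r)
{
  r->io_job_scheduled = false;
  r->watches_running = true;
  r->idle_timer_armed = true;
}

/* True when a connection was closed underneath a scheduled I/O job. */
bool
reader_model_raced(const ReaderModel *r)
{
  return r->closed && r->io_job_scheduled;
}
```

Calling `reader_model_io_handle_in(&r, false)` followed by `reader_model_idle_timeout(&r)` leaves the model closed while a job is still scheduled, which is the race. With `true`, the timeout cannot fire until `reader_model_work_finished()` re-arms it.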

UPDATE:
About the stop_idle_timer calls in update_watches:
Initially I thought that calling stop_idle_timer in update_watches was unnecessary,
since we stop the timer in io_handle. I even put an assertion into the timer registration
to forbid re-registering the timer.
I forgot that in case of suspend we call update_watches again, where the idle_timer
registration would fail due to the assertion.
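The suspend case can be sketched as follows, assuming the registration asserts that the timer is not already registered. `TimerModel` and the `timer_model_*` functions are hypothetical stand-ins rather than the actual syslog-ng functions, and whether the merged code keeps such an assertion is not stated in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the idle timer state; not syslog-ng code. */
typedef struct
{
  bool idle_timer_registered;
  int registrations;        /* how many times the timer was (re)armed */
} TimerModel;

/* Stopping is safe even when the timer is already stopped. */
void
timer_model_stop(TimerModel *t)
{
  t->idle_timer_registered = false;
}

/* Registration refuses to run twice in a row without a stop in between. */
void
timer_model_start(TimerModel *t)
{
  assert(!t->idle_timer_registered);
  t->idle_timer_registered = true;
  t->registrations++;
}

/* May run again after a suspend, so it stops the timer before re-arming it. */
void
timer_model_update_watches(TimerModel *t)
{
  timer_model_stop(t);
  timer_model_start(t);
}
```

Without the `timer_model_stop()` call, a second `timer_model_update_watches()` with no I/O handler in between would trip the assertion, which matches the failure described above.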

@kira-syslogng
Contributor

Build FAILURE

@gaborznagy gaborznagy changed the title logreader: stop idle timer when we have an I/O operation. WIP: logreader: stop idle timer when we have an I/O operation. Mar 28, 2019
@gaborznagy
Collaborator Author

@kira-syslogng retest this please;

@kira-syslogng
Contributor

Build SUCCESS

@gaborznagy gaborznagy changed the title WIP: logreader: stop idle timer when we have an I/O operation. logreader: stop idle timer when we have an I/O operation. Mar 28, 2019
@kira-syslogng
Contributor

Build SUCCESS

@gaborznagy gaborznagy added this to the syslog-ng-3.21 milestone Apr 1, 2019
furiel
furiel previously approved these changes Apr 2, 2019
MrAnno
MrAnno previously approved these changes Apr 2, 2019
@gaborznagy
Collaborator Author

@alltilla kindly pointed out that we should make LogReader and LogWriter symmetric.
The same issue can occur in LogWriter as well.

I'll add that soon and rerun some additional tests. Marking as WIP until then.

@gaborznagy gaborznagy changed the title logreader: stop idle timer when we have an I/O operation. WIP: logreader: stop idle timer when we have an I/O operation. Apr 2, 2019
@gaborznagy gaborznagy dismissed stale reviews from MrAnno and furiel via 85cb7fa April 2, 2019 14:00
@gaborznagy gaborznagy force-pushed the idle-timer-race branch 2 times, most recently from 85cb7fa to e1ee285 Compare April 2, 2019 14:07
@kira-syslogng
Contributor

Build FAILURE

@gaborznagy
Collaborator Author

@kira-syslogng retest this please

@kira-syslogng
Contributor

Build SUCCESS

There is a race condition between I/O operations scheduled in worker threads
and the closing of idle connections (response/ack_timeout timer).

The idle timer is only stopped once an I/O operation has finished and
we want to reschedule the next action on the source side (update_watches()).
Until then, the timer can trigger a connection close even though an I/O
operation is scheduled.

The timer should be stopped when we schedule an I/O operation.
This eliminates the race condition between the two sides.
We already stop I/O polling there (see the calls to log_reader_stop_watches() in
log_reader_io_handle_in() and to log_writer_stop_watches() in log_writer_io_handler()).

About the stop_idle_timer calls in update_watches:
Initially I thought that calling stop_idle_timer in update_watches was unnecessary,
since we stop the timer in io_handle. I even put an assertion into the timer
registration to forbid re-registering the timer.
I forgot that in case of suspend we call update_watches again, where the idle_timer
registration would fail due to the assertion.
I've put back the idle timer stopping.

Signed-off-by: Gabor Nagy <gabor.nagy@balabit.com>
@gaborznagy
Collaborator Author

About the stop_idle_timer calls in update_watches:
Initially I thought that calling stop_idle_timer in update_watches was unnecessary, since we stop the timer in io_handle.
I even put an assertion into the timer registration to forbid re-registering the timer.
I forgot that in case of suspend we call update_watches again, where the idle_timer registration would fail due to the assertion.

@kira-syslogng
Contributor

Build SUCCESS

Gabor Nagy added 2 commits April 3, 2019 14:11
The previous patch eliminates the race condition between idle_timer and the scheduled I/O job.
Still, adding protection can help with future changes.

Signed-off-by: Gabor Nagy <gabor.nagy@balabit.com>
Signed-off-by: Gabor Nagy <gabor.nagy@balabit.com>
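
The thread does not say what form the added protection takes. One possible shape, shown purely as an assumption, is a guard in the close path that refuses to tear down a connection while an I/O job is scheduled. `GuardedConn` and `guarded_conn_close_if_idle()` are hypothetical names.

```c
#include <stdbool.h>

/* Hypothetical connection state; not the structures used in syslog-ng. */
typedef struct
{
  bool io_job_scheduled;  /* a worker thread still owns the connection */
  bool closed;
} GuardedConn;

/* Returns true if the connection was closed, false if the close was refused. */
bool
guarded_conn_close_if_idle(GuardedConn *c)
{
  if (c->io_job_scheduled)
    return false;         /* defensive: never close under a scheduled job */
  c->closed = true;
  return true;
}
```

Such a guard is redundant once the timer is stopped at scheduling time, but it would keep a later change that re-arms the timer too early from reintroducing the race silently.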
@gaborznagy gaborznagy changed the title WIP: logreader: stop idle timer when we have an I/O operation. WIP: logreader/logwriter: fix race condition of idle timer and scheduled I/O job. Apr 3, 2019
@kira-syslogng
Contributor

Build SUCCESS

@bazsi
Collaborator

bazsi commented Apr 3, 2019

This pull request fixes 1 alert when merging b48c6a6 into b1620af - view on LGTM.com

fixed alerts:

  • 1 for Ambiguously signed bit-field member

Comment posted by LGTM.com

@gaborznagy
Collaborator Author

I have executed all additional test cases and they PASSED.
Removing WIP flag.

@gaborznagy gaborznagy changed the title WIP: logreader/logwriter: fix race condition of idle timer and scheduled I/O job. logreader/logwriter: fix race condition of idle timer and scheduled I/O job. Apr 3, 2019
@gaborznagy gaborznagy changed the title logreader/logwriter: fix race condition of idle timer and scheduled I/O job. logreader/logwriter: fix race condition of idle timer and scheduled I/O job Apr 3, 2019
@alltilla alltilla merged commit ca015ba into syslog-ng:master Apr 3, 2019