Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Get initial network_time in 3.1 #938

Closed
J-Gras opened this issue Apr 27, 2020 · 7 comments · Fixed by #948
Closed

Get initial network_time in 3.1 #938

J-Gras opened this issue Apr 27, 2020 · 7 comments · Fixed by #948
Assignees
Labels
Area: Event Engine Type: Bug 🐛 Unexpected behavior or output.
Milestone

Comments

@J-Gras
Copy link
Contributor

J-Gras commented Apr 27, 2020

When processing PCAPs, network time might significantly differ from current time. As already documented, network_time() is not yet initialized in zeek_init(). To get network time before any packet is processed, one could schedule an event with 0sec:

event pcap_init()
	{
	print network_time(); # Time is initialized
	}

event zeek_init()
	{
	print network_time(); # Time is 0
	schedule 0sec { pcap_init() };
	}

With version 3.1 this approach stopped working for me. pcap_init() is executed before network time is set. I guess this is a side effect of the changes to the IO loop and the timer manager. One solution might be to introduce a new event (in case there isn't already one I missed) after network time is initialized but before the first packet is processed.

@0xxon 0xxon added Area: Event Engine Type: Bug 🐛 Unexpected behavior or output. labels Apr 28, 2020
@0xxon
Copy link
Member

0xxon commented Apr 28, 2020

Flagging this as a bug since it seems like a regression.

@0xxon 0xxon added this to the 3.2.0 milestone Apr 28, 2020
@0xxon 0xxon added this to Unassigned / Todo in Release 3.2.0 via automation Apr 28, 2020
@timwoj timwoj self-assigned this Apr 28, 2020
@timwoj
Copy link
Contributor

timwoj commented Apr 28, 2020

I guess this is a side effect of the changes to the IO loop and the timer manager.

That's correct. The new timer architecture would wake the loop up immediately for that zero-second timer and fire the event before anything else was processed.

One solution might be to introduce a new event (in case there isn't already one I missed) after network time is initialized but before the first packet is processed.

I don't know of one off the top of my head that exists already but I'll look around. It'd be easy enough to add a new one.

@timwoj
Copy link
Contributor

timwoj commented Apr 28, 2020

How are you running it, out of curiosity? I'm getting the same result on both 3.0 and master:

coregeek:Desktop tim$ cat init_test.zeek 
event pcap_init()
	{
	print network_time(); # Time is initialized
	}

event zeek_init()
	{
	print network_time(); # Time is 0
	schedule 0sec { pcap_init() };
	}
coregeek:Desktop tim$ zeek -v
zeek version 3.0.6-dev.2-debug
coregeek:Desktop tim$ zeek -B tm -b init_test.zeek -i en0
listening on en0

0.0
1588105770.998826
^C1588105772.383974 received termination signal
1588105772.383974 106 packets received on interface en0, 0 dropped
coregeek:Desktop tim$ more debug.log 
         0.000000/1588105770.999172 [tm] Adding timer ScheduleTimer to TimeMgr 0x7ff5c2a1d040
         0.000000/1588105770.999285 [tm] Adding timer ScheduleTimer to TimeMgr 0x7ff5c2a1d040
         0.000000/1588105771.010024 [tm] no tag, using global timer mgr 0x7ff5c2a1d040
1588105770.998826/1588105771.010067 [tm] no tag, using global timer mgr 0x7ff5c2a1d040
1588105770.998826/1588105771.010076 [tm] advancing global timer mgr 0x7ff5c2a1d040 to 1588105770.998826
1588105770.998826/1588105771.010086 [tm] Dispatching timer ScheduleTimer in TimeMgr 0x7ff5c2a1d040
1588105770.998826/1588105771.010100 [tm] Dispatching timer ScheduleTimer in TimeMgr 0x7ff5c2a1d040
coregeek:Desktop tim$ zeek --version
zeek version 3.2.0-dev.405-debug
coregeek:Desktop tim$ zeek -B tm -b init_test.zeek -i en0
listening on en0

0.0
1588105755.283502
^C1588105757.882301 received termination signal
1588105757.882301 77 packets received on interface en0, 0 (0.00%) dropped
coregeek:Desktop tim$ cat debug.log 
         0.000000/1588105755.278389 [tm] Adding timer ScheduleTimer (0x7ffcac51d140) at 1.000000
         0.000000/1588105755.278473 [tm] Adding timer ScheduleTimer (0x7ffcac51db30) at 0.000000
1588105755.278536/1588105755.278536 [tm] advancing timer mgr to 1588105755.278536
1588105755.278536/1588105755.278545 [tm] Dispatching timer ScheduleTimer (0x7ffcac51db30)
1588105755.278536/1588105755.278555 [tm] Dispatching timer ScheduleTimer (0x7ffcac51d140)
1588105755.283502/1588105755.289643 [tm] advancing timer mgr to 1588105755.283502

I tried the same without -b and it's the same results, just with extra debug logging as it schedules a bunch more timers at startup.

@J-Gras
Copy link
Contributor Author

J-Gras commented Apr 29, 2020

Sorry, I forgot to mention that this only occurs when reading a PCAP. I think in case of live captures, zeek takes another path.

@J-Gras
Copy link
Contributor Author

J-Gras commented Apr 29, 2020

I wrote a test case and added a new event in #943. Playing around with the test I noticed that scheduled_delayed_event seems to be executed immediately after it's scheduled. I will take a look at it today or tomorrow and complete the PR.

@J-Gras
Copy link
Contributor Author

J-Gras commented Apr 29, 2020

Writing the comment I noticed my mistake. As stated in the docs, the event is executed together with the first packet as it has been scheduled when network_time is 0 and thus the timer expires immediately.

Assuming we don't look at PCAPs with 0 timestamps, one could use that to schedule an event when network time is initialized. This way we wouldn't need a new event.

@jsiwek
Copy link
Contributor

jsiwek commented Apr 29, 2020

I like the idea of adding an explicit event for first time network_time is set and suggesting people use that to inspect initial network_time, but still seems the new behavior lands slightly on the side of "regression" or at least points out a minor improvement to make in the IO loop -- stepping through the first call to FindReadySources(), it skips past the pcap source, but probably shouldn't.

A quick hack/PoC that restores old behavior: b6756a5

timwoj added a commit that referenced this issue May 1, 2020
…always-ready'

* origin/topic/jsiwek/gh-938-pcap-source-always-ready:
  GH-938: fix IO loop iterations sometimes skipping offline pcap sources
Release 3.2.0 automation moved this from Unassigned / Todo to Done May 1, 2020
jsiwek added a commit that referenced this issue May 6, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: Event Engine Type: Bug 🐛 Unexpected behavior or output.
Projects
No open projects
Release 3.2.0
  
Done
Development

Successfully merging a pull request may close this issue.

4 participants