Support multi-file logfiles #118

Open · m3d opened this Issue Dec 4, 2018 · 1 comment

m3d commented Dec 4, 2018

Yesterday we already hit the 1-hour recording limit (1.3 GB) for the SubT ROS simulation in the Virtual Track:

Exception in thread Thread-2:
Traceback (most recent call last):
  File "/usr/lib/python3.6/threading.py", line 916, in _bootstrap_inner
    self.run()
  File "/home/martind/md/osgar/examples/subt/mytimer.py", line 20, in run
    self.bus.publish('tick', None)
  File "/home/martind/md/osgar/osgar/bus.py", line 34, in publish
    timestamp = self.logger.write(stream_id, serialize(data))
  File "/home/martind/md/osgar/osgar/logger.py", line 69, in write
    assert dt.seconds < 3600, dt  # overflow not supported yet
AssertionError: 1:00:00.091434

So it is time to support indexed files, right? The default split criterion should be time >= 3600 s, but it could also be a shorter time or a size limit. I would use the ADTF scheme for DAT files (and probably many others), with _000, _001, etc. suffixes (an alternative is to use dots, i.e. .000). I would still keep the .log extension, and each part should be "independent" in the sense that if you want the recording from the 2nd hour, you should not need to load the first (un-indexed) file, OK?
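A minimal sketch of such a split check (should_split and the default limits are hypothetical, not existing osgar.logger API):

def should_split(dt, bytes_written, max_seconds=3600, max_bytes=2**30):
    # hypothetical check, run before each write: rotate to the next
    # part when either the time or the size limit is exceeded
    return dt.total_seconds() >= max_seconds or bytes_written >= max_bytes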

On the other hand, if you open the first file with default parameters, replay should transparently go through all the files; with some extra parameter it would read a single file only.
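Discovering the parts could look like this (lookup_parts is a hypothetical helper following the _000 naming above):

import os

def lookup_parts(filename):
    # yield the base file followed by its _000, _001, ... continuations
    yield filename
    root, ext = os.path.splitext(filename)
    i = 0
    while os.path.exists('%s_%03d%s' % (root, i, ext)):
        yield '%s_%03d%s' % (root, i, ext)
        i += 1

A reader opened with default parameters would chain these files; the extra parameter would restrict it to a single item.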

Do we want to mark in the first file that the recording is not complete?

It is surely necessary to copy all named streams from the zero stream, but should we also duplicate the command line and config? (Replay functions would probably fail on the other parts anyway.)
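Roughly (a sketch; register() and write() as in the current LogWriter, the rest is an assumption):

def start_new_part(writer, stream_names):
    # make the part self-contained: re-register all named streams
    # so stream ids can be resolved without the previous files
    for name in stream_names:
        writer.register(name)
    # open question: also repeat command line and config here?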

subt-go1m-181203_195930.log
subt-go1m-181203_195930_000.log
subt-go1m-181203_195930_001.log
subt-go1m-181203_195930_002.log

Note that the 3600 s limit is due to our timestamp representation, but it is nevertheless reasonable to split multi-gigabyte files into smaller pieces.
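For reference, if the per-block timestamp is a 32-bit microsecond counter (an assumption about the format), it overflows only slightly above one hour, hence the round 3600 s cap:

from datetime import timedelta

# assumed representation: unsigned 32-bit microseconds since log start
print(timedelta(microseconds=2**32))  # 1:11:34.967296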

m3d added the question label Dec 4, 2018

m3d commented Dec 6, 2018

I would add Zbynek's comment from yesterday: maybe we should have all nodes serialized at the beginning of a log file. Then they can be "deserialized in the middle" (say, at the beginning of the next log file) and you do not need to re-run everything in order to replay some part.
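A sketch of that idea (pickle and the 'system.snapshot' stream name are placeholders):

import pickle

def snapshot_nodes(writer, nodes):
    # dump the state of all nodes at the beginning of a new log part,
    # so replay can resume there without re-running the earlier parts
    stream_id = writer.register('system.snapshot')
    state = {name: node.__dict__ for name, node in nodes.items()}
    writer.write(stream_id, pickle.dumps(state))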

Another note is related to #115 (logging of shutdown): once that is in place, we will not need any extra "mark" that this is the end of a split log file.
