Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update OSITrace to be used in osi-validation #762

Closed
wants to merge 17 commits into from

Conversation

ClemensLinnhoff
Copy link
Contributor

@ClemensLinnhoff ClemensLinnhoff commented Jan 17, 2024

Reference to a related issue in the repository

#761

Add a description

Re-add cache_messages_in_index_range() and some minor variables that were removed in #783 to ensure compatibility with osi-validation.

Take this checklist as orientation for yourself, if this PR is ready for the Change Control Board:

Signed-off-by: ClemensLinnhoff <clemens.linnhoff@partner.bmw.de>
@ClemensLinnhoff ClemensLinnhoff linked an issue Jan 17, 2024 that may be closed by this pull request
Signed-off-by: ClemensLinnhoff <clemens.linnhoff@persival.de>
… OpenSCENARIO

Signed-off-by: ClemensLinnhoff <clemens.linnhoff@persival.de>
Signed-off-by: ClemensLinnhoff <clemens.linnhoff@persival.de>
@ClemensLinnhoff ClemensLinnhoff changed the title Add SPDX license identifiers and copyright notices Update OSITrace to be used in osi-validation Jan 18, 2024
ClemensLinnhoff and others added 10 commits January 18, 2024 10:50
Signed-off-by: ClemensLinnhoff <clemens.linnhoff@persival.de>
Signed-off-by: ClemensLinnhoff <clemens.linnhoff@persival.de>
Signed-off-by: ClemensLinnhoff <clemens.linnhoff@persival.de>
Signed-off-by: ClemensLinnhoff <clemens.linnhoff@persival.de>
Signed-off-by: ClemensLinnhoff <clemens.linnhoff@persival.de>
Signed-off-by: ClemensLinnhoff <clemens.linnhoff@persival.de>
Signed-off-by: ClemensLinnhoff <clemens.linnhoff@persival.de>
Signed-off-by: ClemensLinnhoff <clemens.linnhoff@partner.bmw.de>
Signed-off-by: ClemensLinnhoff <clemens.linnhoff@partner.bmw.de>
Signed-off-by: ClemensLinnhoff <clemens.linnhoff@partner.bmw.de>
@ClemensLinnhoff ClemensLinnhoff marked this pull request as ready for review January 22, 2024 09:51
Signed-off-by: ClemensLinnhoff <clemens.linnhoff@partner.bmw.de>
Signed-off-by: ClemensLinnhoff <clemens.linnhoff@partner.bmw.de>
@ClemensLinnhoff ClemensLinnhoff requested review from pmai and removed request for vkresch April 4, 2024 15:31
Signed-off-by: ClemensLinnhoff <clemens.linnhoff@partner.bmw.de>
@ClemensLinnhoff ClemensLinnhoff added this to the V3.7.0 milestone Apr 4, 2024
Copy link
Contributor

@pmai pmai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Except for the bug fix in the if condition, I think this is all a bad idea, and rather osi-validation message reading should be rewritten from scratch to be more efficient, cleaner and more transparent. Given the little that is actually needed, this should not be too hard to do; I'll maybe give it a try after the OSI release.

@@ -32,6 +32,7 @@ def __init__(self, path=None, type_name="SensorView", cache_messages=False):
self.read_complete = False
self.message_cache = {} if cache_messages else None
self._header_length = 4
self.timestep_count = 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is redundant to the length of message_offsets, and only valid once a file has been fully read in, so is not an apropriate API to the outside.

@@ -49,13 +50,16 @@ def from_file(self, path, type_name="SensorView", cache_messages=False):
self.message_offsets = [0]
self.message_cache = {} if cache_messages else None

self.timestep_count = len(self.retrieve_offsets())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This turns a simple open operation into a very expensive operation, as it must read through the whole file to extract all the offsets. Sometimes this is needed, but in most instances you want to actually read and process a file sequentially, either step-by-step, or you need all messages anyway, in which case you also do not want to retrieve the offsets but rather read the messages at the same time. So that is a big no-no.

@@ -49,13 +50,16 @@ def from_file(self, path, type_name="SensorView", cache_messages=False):
self.message_offsets = [0]
self.message_cache = {} if cache_messages else None

self.timestep_count = len(self.retrieve_offsets())
self.type = self.map_message_type(type_name)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a duplication of line 40.

Comment on lines -57 to +62
while (
not self.read_complete and not limit or len(self.message_offsets) <= limit
while not self.read_complete and (
not limit or len(self.message_offsets) <= limit
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The bugfix is important,

It just makes me regret that we added black reformatting checks that this is convoluted by the arbitrary re-indentations needed. We probably should increase the line length to avoid stupidity like this.

@@ -156,6 +160,21 @@ def get_messages_in_index_range(self, begin, end):
yield message
current += 1

def cache_messages_in_index_range(self, begin, end):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This whole method is a Bad Idea(tm): A cache is something that stores things that have already been accessed.
If you just want to read everything into memory, this is not a cache, and should not reside inside OSI Trace, but rather inside the recipient that wants to work with the data, i.e. osi-validation. I actually hesitated in re-implementing the cacheing behavior because of the ripe potential for mis-use, as shown here.

In any case, the iterator returned from get_messages_in_index_range will already place all read messages into the cache, so the implementation is also needlessly redundant.

If one just wants to have all messages available in memory, then this really is not cacheing, it is just buffer reading and should be handled as such; i.e. if I want all messages anyway, then the simple one-liner suffices:

   trace = OSITrace(path, type) # Note: No cacheing in OSITrace
   messages = [*trace.get_messages()]
   trace.close()

And then access the messages by index via messages[i]. This is both the most efficient, and straightforward. If one wants to read in batches, something very similar can be done:

   from itertools import islice
   from functools import partial

   trace = OSITrace(path, type) # Note: No cacheing in OSITrace
   for batch in iter(partial(lambda n,i: list(islice(i,n)),batch_size,iter(trace.get_messages())), []):
     # Process batch here
   trace.close()

In Python 3.12 and later itertools.batched could be used directly.

In summary I would say osi-validation code that handles messages is currently a mess and should be rewritten to more clearly do the batch-processing it needs without multiple reads/re-reads of the file and deep interaction with internal mechanisms of the reader.

@ClemensLinnhoff
Copy link
Contributor Author

Current OSITrace implementation will be used. We are going to make changes to osi-validation to use it: OpenSimulationInterface/osi-validation#70

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Update OSITrace
2 participants