Bug in reading .dat file #124

drhannah94 · 2019-02-07T16:43:43Z

Python version: 3.6.1

from asammdf import MDF

def merge_mdfs(files):
    return MDF.merge(files).resample(0.1)

def main():
    # Here I have code to read in a series of '.dat' files as a list into the variable 'files'
    merge_file = merge_mdfs(files)

    # The variable 'merge_file' contains channel names with the corrupt data

Code

MDF version

4.7.8

Code snippet

return MDF.merge(files).resample(0.1)

Traceback

This code doesn't produce an error or a traceback

Description

Let me preface by saying that I am relatively new to python and even newer to this library, but I understand the importance of collaboration so I will try my best to help. Since I work for a big company, I can't share exactly everything I am working on, especially the data files I am using. However, I will do my best to supply all available information I can to resolve this issue.

The issue I'm having is that when I read in a series of '.dat' files, sometimes the data gets read in perfectly, but other times the data gets all messed up and values that were not in the original data find their way in.

Example: I am reading in acceleration data from an accelerometer. The min and max values of this data trace are confirmed by one of the other tools my company uses to plot data to be max: ~6.5, min: ~-1.5 (units = m/s^2). When I read a series of these same files in, I get the same max but a min of ~-13 m/s^2. When I go in and look at the data there are more data points than there should be, and the data doesn't flow like what I would expect to see (i.e. a lot of repeating values).
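
One way to narrow down where a value like the ~-13 m/s^2 minimum comes from is to compare each file's own min/max with the merged result. Plain concatenation cannot produce a value outside the combined per-file range, so anything outside it points at the merge step rather than at the source files. A minimal sketch, assuming the samples have already been pulled into plain sequences (with asammdf that would typically be `MDF(path).get(name).samples`); `range_report`, the tolerance and the numbers are hypothetical and only for illustration:

```python
def range_report(per_file_samples, merged_samples, tol=1e-9):
    """Compare the merged min/max with the combined per-file min/max."""
    expected_min = min(min(s) for s in per_file_samples if len(s))
    expected_max = max(max(s) for s in per_file_samples if len(s))
    merged_min, merged_max = min(merged_samples), max(merged_samples)
    return {
        "expected": (expected_min, expected_max),
        "merged": (merged_min, merged_max),
        # True when the merged data leaves the range seen in the inputs.
        "suspicious": merged_min < expected_min - tol
        or merged_max > expected_max + tol,
    }


# Illustrative numbers only, loosely modelled on the report above.
files = [[-1.5, 0.2, 6.5], [-0.9, 3.1]]
merged = [-1.5, 0.2, 6.5, -13.0, 3.1]
report = range_report(files, merged)
print(report["suspicious"])
```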

Please let me know if anyone needs more information to help solve this issue. I will try my best to supply any additional information requested.

Thanks for supporting this awesome library! :)

danielhrisca · 2019-02-07T16:59:20Z

Hello @drhannah94 ,
I appreciate your effort; thank you!

I have some questions:

if you open the files individually and plot the problematic channels do you still see wrong values?

MDF(filename).get(channelname).plot()

what is the mdf version of your files?

print(MDF(filename).version)

what does the problematic channel look like?

mdf = MDF(filename)
for group_index, channel_index in mdf.whereis(channelname):
    print(mdf.groups[group_index]['channels'][channel_index])

drhannah94 · 2019-02-07T18:04:05Z

Hey @danielhrisca! Thanks for getting back to me so fast! Here is the output from the information you requested. I also attached a plot of what the actual data should look like.

*Note that the data in the Excel plot is sampled at 250 ms so the x-axis scale is different.

    print(MDF(files[0]).version) # --> 3.00 - version of the original files
    MDF(files[0]).get('Accel_Chasis_X').plot() # --> Figure 1
    merge_file = MDF.merge(files)
    print(merge_file.version) # --> 4.10 - version of the merged files
    merge_file.get('Accel_Chasis_X').plot() # --> Figure 2
    merge_file.resample(0.1)
    merge_file.get('Accel_Chasis_X').plot() # --> Figure 3

    for group_index, channel_index in merge_file.whereis('Accel_Chasis_X'):
        print(merge_file.groups[group_index]['channels'][channel_index])

    # Output from for loop:
    # <Channel (name: Accel_Chasis_X\ES650 / AD/Thermo:1, unit: m/s^2, comment: , address: 0x0,
    # conversion: <ChannelConversion (name: , unit: m/s^2, comment: , formula: , referenced blocks: None, address: 0x0, fields: {'id': b'##CC', 'reserved0': 0, 'block_len': 96, 'links_nr': 4, 'name_addr': 0, 'unit_addr': 0, 'comment_addr': 0, 'inv_conv_addr': 0, 'conversion_type': 1, 'precision': 1, 'flags': 0, 'ref_param_nr': 0, 'val_param_nr': 2, 'min_phy_value': 0.0, 'max_phy_value': 0.0, 'b': -24.608724153475002, 'a': 9.80665e-06})>,
    # source: None,
    # fields: {'id': b'##CN', 'reserved0': 0, 'block_len': 160, 'links_nr': 8, 'next_ch_addr': 0, 'component_addr': 0, 'name_addr': 0, 'source_addr': 0, 'conversion_addr': 0, 'data_block_addr': 0, 'unit_addr': 0, 'comment_addr': 0, 'channel_type': 0, 'sync_type': 0, 'data_type': 2, 'bit_offset': 0, 'byte_offset': 16, 'bit_count': 32, 'flags': 0, 'pos_invalidation_bit': 0, 'precision': 3, 'reserved1': 0, 'attachment_nr': 0, 'min_raw_value': 0, 'max_raw_value': 0, 'lower_limit': 0, 'upper_limit': 0, 'lower_ext_limit': 0, 'upper_ext_limit': 0})>
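
The conversion block printed above is linear (`conversion_type: 1`) with parameters `a` and `b`. Assuming the usual convention `physical = a * raw + b` (an assumption, the thread does not spell it out), the raw integer counts behind the values under discussion can be worked out as below. The factor `a` equals standard gravity (9.80665 m/s^2) times 1e-6, so the raw counts are plausibly micro-g, although nothing in the thread confirms that.

```python
A = 9.80665e-06          # 'a' from the ChannelConversion fields above
B = -24.608724153475002  # 'b' from the ChannelConversion fields above


def raw_to_phys(raw):
    """Linear conversion, assuming physical = a * raw + b."""
    return A * raw + B


def phys_to_raw(phys):
    """Inverse of the linear conversion, rounded to the nearest count."""
    return round((phys - B) / A)


# Expected max, expected min, and the unexpected min from the report.
for phys in (6.5, -1.5, -13.0):
    print(phys, "m/s^2 ->", phys_to_raw(phys), "raw counts")
```

All three raw values fit comfortably in the 32-bit integer the channel uses, so the out-of-range minimum cannot be ruled out by data type alone.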

Figure 1: Output First File Before Merge (Looks correct)

Figure 2: Output of Corrupt Channel From Merge

Figure 3: Output of Corrupt Channel From Merge Resampled at 100ms

Figure 4: Plot of What Acceleration Data is Supposed to Look Like

danielhrisca · 2019-02-07T19:49:48Z

What is the output of the snippet?

mdf = MDF(files[0])
group_index, _ = mdf.whereis(channelname)[0]
group = mdf.groups[group_index]
mdf._prepare_record(group)
print(*group['types'], sep='\n')

danielhrisca · 2019-02-07T19:52:48Z

Please run the snippet from point 3 using one of the original files (version 3.00)

mdf = MDF(files[0])
for group_index, channel_index in mdf.whereis(channelname):
    print(mdf.groups[group_index]['channels'][channel_index])

drhannah94 · 2019-02-07T21:07:31Z

I am getting an error when trying to print '*group['types']'

mdf = MDF(files[0])
group_index, _ = mdf.whereis('Accel_Chasis_X')[0]
group = mdf.groups[group_index]
mdf._prepare_record(group)
print(*group['types'], sep='\n') # --> Error Here

Output

File "My Directory", line 247, in merge_mdfs
print(*group['types'], sep='\n')
KeyError: 'types'

mdf = MDF(files[0])
for group_index, channel_index in mdf.whereis('Accel_Chasis_X'):
    print(mdf.groups[group_index]['channels'][channel_index])

Output

Channel (name: Accel_Chasis_X\ES650 / AD/Thermo:1, display name: Accel_Chasis_X\ES650 / AD/Thermo:1, comment: , address: 0x2406, fields: {'id': b'CN', 'block_len': 228, 'next_ch_addr': 0, 'conversion_addr': 5826, 'source_addr': 0, 'ch_depend_addr': 0, 'comment_addr': 9144, 'channel_type': 0, 'short_name': b'Accel_Chasis_X\ES650 / AD/Thermo', 'description': b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00', 'start_offset': 128, 'bit_count': 32, 'data_type': 1, 'range_flag': 0, 'min_raw_value': 0.0, 'max_raw_value': 0.0, 'sampling_rate': 0.001, 'long_name_addr': 2187, 'display_name_addr': 9149, 'aditional_byte_offset': 0})

danielhrisca · 2019-02-07T21:15:29Z

Right, please run the snippet again without the asterisk

print(group["types"])

drhannah94 · 2019-02-07T21:18:48Z

For some reason there is no key 'types'.

danielhrisca · 2019-02-07T21:19:39Z

You need to call _prepare_record first

drhannah94 · 2019-02-07T21:31:44Z

I'm still calling '_prepare_record' first.

mdf = MDF(files[0])
group_index, _ = mdf.whereis('Accel_Chasis_X')[0]
group = mdf.groups[group_index]
mdf._prepare_record(group)
print(group['types'], sep='\n')

group: Before _prepare_record

group: After _prepare_record

danielhrisca · 2019-02-08T06:01:47Z

This should finally be the correct snippet

mdf = MDF(files[0])
group_index, _ = mdf.whereis('Accel_Chasis_X')[0]
group = mdf.groups[group_index]
print(mdf._prepare_record(group))

drhannah94 · 2019-02-08T14:19:45Z

Output:

({0: ('time', 0), 1: ('Accel_Vertical_Wheel\ES650 / AD/Thermo:1', 0), 2: ('Accel_Chasis_Y\ES650 / AD/Thermo:1', 0), 3: ('Accel_Chasis_X\ES650 / AD/Thermo:1', 0)}, dtype([('time', '<f8'), ('Accel_Vertical_Wheel\ES650 / AD/Thermo:1', '<i4'), ('Accel_Chasis_Y\ES650 / AD/Thermo:1', '<i4'), ('Accel_Chasis_X\ES650 / AD/Thermo:1', '<i4')]))
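
The dtype in this output describes one packed record: an 8-byte float timestamp followed by three 4-byte little-endian integers. A small sketch using only the standard `struct` module to recover the byte offsets from that description; the channel names are shortened here (the `\ES650 / AD/Thermo:1` suffix is dropped) and `field_offsets` is a hypothetical helper:

```python
import struct

# Field names and struct codes taken from the dtype printed above:
# '<f8' -> 'd' (8-byte float), '<i4' -> 'i' (4-byte signed int).
FIELDS = [
    ("time", "d"),
    ("Accel_Vertical_Wheel", "i"),
    ("Accel_Chasis_Y", "i"),
    ("Accel_Chasis_X", "i"),
]


def field_offsets(fields):
    """Return ({name: byte offset}, record size) for a packed record."""
    offsets, position = {}, 0
    for name, code in fields:
        offsets[name] = position
        position += struct.calcsize("<" + code)
    return offsets, position


offsets, record_size = field_offsets(FIELDS)
print(offsets["Accel_Chasis_X"], record_size)
```

The offset of 16 bytes for `Accel_Chasis_X` agrees with the `start_offset: 128` (bits) printed for the version 3.00 channel earlier in the thread.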

danielhrisca · 2019-02-08T14:23:33Z

There is nothing really interesting in the print.

Let's try this:

mdf = MDF(files[0])
print(mdf.info())

drhannah94 · 2019-02-08T14:26:34Z

Output:

{'author': 'rdo9fe', 'department': '', 'project': '', 'subject': '', 'version': '3.00', 'groups': 26, 'group 0': {'cycles': 6, 'comment': 'AC', 'channels count': 2, 'channel 0': 'name="time" type=master', 'channel 1': 'name="AC\CAN-Monitoring:1" type=value'}, 'group 1': {'cycles': 2671, 'comment': 'Accelerometer', 'channels count': 2, 'channel 0': 'name="time" type=master', 'channel 1': 'name="Accelerometer\CAN-Monitoring:1" type=value'}, 'group 2': {'cycles': 0, 'comment': 'Brake', 'channels count': 2, 'channel 0': 'name="time" type=master', 'channel 1': 'name="Brake\CAN-Monitoring:1" type=value'}, 'group 3': {'cycles': 2668, 'comment': 'Brake_Position', 'channels count': 2, 'channel 0': 'name="time" type=master', 'channel 1': 'name="Brake_Position\CAN-Monitoring:1" type=value'}, 'group 4': {'cycles': 0, 'comment': 'Message_HS_CAN_2701', 'channels count': 2, 'channel 0': 'name="time" type=master', 'channel 1': 'name="TCC\CAN-Monitoring:1" type=value'}, 'group 5': {'cycles': 0, 'comment': 'Message_HS_CAN_33', 'channels count': 2, 'channel 0': 'name="time" type=master', 'channel 1': 'name="gear_ratio1\CAN-Monitoring:1" type=value'}, 'group 6': {'cycles': 2668, 'comment': 'Vehicle_Speed', 'channels count': 2, 'channel 0': 'name="time" type=master', 'channel 1': 'name="VehicleSpeed_Filtered\CAN-Monitoring:1" type=value'}, 'group 7': {'cycles': 2671, 'comment': 'Wheel_Speed', 'channels count': 5, 'channel 0': 'name="time" type=master', 'channel 1': 'name="Wheel_Speed_RR\CAN-Monitoring:1" type=value', 'channel 2': 'name="Wheel_Speed_FR\CAN-Monitoring:1" type=value', 'channel 3': 'name="Wheel_Speed_FL\CAN-Monitoring:1" type=value', 'channel 4': 'name="Wheel_Speed_RL\CAN-Monitoring:1" type=value'}, 'group 8': {'cycles': 536, 'comment': 'Brake_bit', 'channels count': 2, 'channel 0': 'name="time" type=master', 'channel 1': 'name="Brake_bit\CAN-Monitoring:2" type=value'}, 'group 9': {'cycles': 55, 'comment': 'Drivemode', 'channels count': 2, 'channel 0': 'name="time" 
type=master', 'channel 1': 'name="DriveMode\CAN-Monitoring:2" type=value'}, 'group 10': {'cycles': 79, 'comment': 'Gears1', 'channels count': 2, 'channel 0': 'name="time" type=master', 'channel 1': 'name="CommandedGear\CAN-Monitoring:2" type=value'}, 'group 11': {'cycles': 5336, 'comment': 'Message_HS_CAN_27', 'channels count': 2, 'channel 0': 'name="time" type=master', 'channel 1': 'name="Unmanaged_Torque\CAN-Monitoring:2" type=value'}, 'group 12': {'cycles': 2668, 'comment': 'Message_HS_CAN_29', 'channels count': 3, 'channel 0': 'name="time" type=master', 'channel 1': 'name="EngineSpeed_leading_smooth\CAN-Monitoring:2" type=value', 'channel 2': 'name="DFCO\CAN-Monitoring:2" type=value'}, 'group 13': {'cycles': 5336, 'comment': 'Pedal', 'channels count': 2, 'channel 0': 'name="time" type=master', 'channel 1': 'name="Pedal_lagging_sharp\CAN-Monitoring:2" type=value'}, 'group 14': {'cycles': 5336, 'comment': 'Pedal2', 'channels count': 2, 'channel 0': 'name="time" type=master', 'channel 1': 'name="Demanded_or_available_acceleration\CAN-Monitoring:2" type=value'}, 'group 15': {'cycles': 536, 'comment': 'Selector_Lever', 'channels count': 2, 'channel 0': 'name="time" type=master', 'channel 1': 'name="Selector_Lever\CAN-Monitoring:2" type=value'}, 'group 16': {'cycles': 5336, 'comment': 'Torque_1', 'channels count': 2, 'channel 0': 'name="time" type=master', 'channel 1': 'name="Managed_Torque\CAN-Monitoring:2" type=value'}, 'group 17': {'cycles': 373, 'comment': 'Torque_4', 'channels count': 2, 'channel 0': 'name="time" type=master', 'channel 1': 'name="Torque_higher\CAN-Monitoring:2" type=value'}, 'group 18': {'cycles': 2668, 'comment': 'Vehicle_Speed', 'channels count': 2, 'channel 0': 'name="time" type=master', 'channel 1': 'name="VehicleSpeed_Filtered\CAN-Monitoring:2" type=value'}, 'group 19': {'cycles': 90, 'comment': 'gears', 'channels count': 2, 'channel 0': 'name="time" type=master', 'channel 1': 'name="Gears_slow\CAN-Monitoring:2" type=value'}, 'group 20': 
{'cycles': 5336, 'comment': 'm0a0', 'channels count': 2, 'channel 0': 'name="time" type=master', 'channel 1': 'name="TurbineSpeed\CAN-Monitoring:2" type=value'}, 'group 21': {'cycles': 536, 'comment': 'vGrp_100ms', 'channels count': 2, 'channel 0': 'name="time" type=master', 'channel 1': 'name="INCA_timestamp" type=value'}, 'group 22': {'cycles': 53330, 'comment': '1ms', 'channels count': 4, 'channel 0': 'name="time" type=master', 'channel 1': 'name="Accel_Vertical_Wheel\ES650 / AD/Thermo:1" type=value', 'channel 2': 'name="Accel_Chasis_Y\ES650 / AD/Thermo:1" type=value', 'channel 3': 'name="Accel_Chasis_X\ES650 / AD/Thermo:1" type=value'}, 'group 23': {'cycles': 1, 'comment': 'VGEvent', 'channels count': 2, 'channel 0': 'name="time" type=master', 'channel 1': 'name="$EVENT_COMMENTS" type=value'}, 'group 24': {'cycles': 0, 'comment': 'VGPause', 'channels count': 2, 'channel 0': 'name="time" type=master', 'channel
1': 'name="$PAUSE_COMMENTS" type=value'}, 'group 25': {'cycles': 0, 'comment': 'VGSnapshot', 'channels count': 2, 'channel 0': 'name="time" type=master', 'channel 1': 'name="$SNAPSHOT" type=value'}}
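
The dictionary above is long, so it can help to filter it down to the groups that actually contain the channel of interest. A minimal sketch based only on the shape of the output shown here (keys such as `'group 22'`, `'cycles'` and `'channel 3'`); `groups_containing` is a hypothetical helper, not part of asammdf, and the sample is a trimmed copy of two groups from the output:

```python
def groups_containing(info, channel_name):
    """Return (group key, cycles) for every group that lists channel_name."""
    hits = []
    for key, value in info.items():
        # Skip header entries such as 'version' and the 'groups' counter.
        if not (key.startswith("group ") and isinstance(value, dict)):
            continue
        descriptions = [v for k, v in value.items() if k.startswith("channel ")]
        if any('name="' + channel_name in text for text in descriptions):
            hits.append((key, value["cycles"]))
    return hits


# Trimmed-down sample in the same shape as the output above.
sample = {
    "version": "3.00",
    "groups": 2,
    "group 21": {
        "cycles": 536, "comment": "vGrp_100ms", "channels count": 2,
        "channel 0": 'name="time" type=master',
        "channel 1": 'name="INCA_timestamp" type=value',
    },
    "group 22": {
        "cycles": 53330, "comment": "1ms", "channels count": 4,
        "channel 0": 'name="time" type=master',
        "channel 1": 'name="Accel_Vertical_Wheel\\ES650 / AD/Thermo:1" type=value',
        "channel 2": 'name="Accel_Chasis_Y\\ES650 / AD/Thermo:1" type=value',
        "channel 3": 'name="Accel_Chasis_X\\ES650 / AD/Thermo:1" type=value',
    },
}
print(groups_containing(sample, "Accel_Chasis_X"))
```

Running the same helper over the `info()` output of every file would show at a glance whether the channel always lives in a group with the same layout.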

danielhrisca · 2019-02-08T14:35:21Z

please try to install the Py3 branch and check if you still have errors

pip install -I --no-deps https://github.com/danielhrisca/asammdf/archive/Py3.zip

drhannah94 · 2019-02-08T14:56:01Z

Py3 Branch Successfully Installed

After the install, I am still seeing the same errors.

print(MDF(files[0]).version) # --> 3.00
MDF(files[0]).get('Accel_Chasis_X').plot() # --> Figure 1
merge_file = MDF.concatenate(files)
print(merge_file.version) # --> 4.10
merge_file.get('Accel_Chasis_X').plot() # --> Figure 2
merge_file.resample(0.1)
merge_file.get('Accel_Chasis_X').plot() # --> Figure 3

Figure 1

Figure 2

Figure 3

danielhrisca · 2019-02-08T15:34:48Z

I understand. Please share the output of print(mdf.info()) for all the original files

drhannah94 · 2019-02-08T16:44:56Z

So when I was running my code...

for file in files:
    print(MDF(file).info())

I was getting an error here in mdf_v3.py in the 'info' function.

value = self.header[key].decode("latin-1").strip(" \n\t\0") # Line 3058

Traceback:
AttributeError: 'str' object has no attribute 'decode'

I changed it to this to get the output...

try:
    value = self.header[key].decode("latin-1").strip(" \n\t\0")
except AttributeError:  # the header value is already a str
    value = self.header[key].strip(" \n\t\0")
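
The same workaround can be written without exception handling by checking the type first. This is only a sketch of the idea, not the fix that went into asammdf; `to_text` is a hypothetical helper name:

```python
def to_text(value, encoding="latin-1"):
    """Return a stripped str whether the header field is bytes or str."""
    if isinstance(value, bytes):
        value = value.decode(encoding)
    return value.strip(" \n\t\0")


print(repr(to_text(b"rdo9fe\0\0  ")))
print(repr(to_text("rdo9fe \n")))
```

Checking the type explicitly avoids hiding unrelated errors, which a bare `except:` would silently swallow.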

Output:

Output from all individual files.txt

danielhrisca · 2019-02-08T17:09:18Z

Thank you @drhannah94 I appreciate your effort

danielhrisca · 2019-02-09T09:40:13Z

I've found the problem and I'm working on a fix

danielhrisca · 2019-02-09T10:40:33Z

@drhannah94
please check the Py3 branch. If this works I will include the fix in development as well

drhannah94 · 2019-02-11T19:13:59Z

I updated the Py3 branch and then re-ran this code

print(MDF(files[0]).version)
MDF(files[0]).get('Accel_Chasis_X').plot()
merge_file = MDF.concatenate(files)
print(merge_file.version)
merge_file.get('Accel_Chasis_X').plot()
merge_file.resample(0.1)
merge_file.get('Accel_Chasis_X').plot()

At the line...

merge_file = MDF.concatenate(files)

I got an error saying: internal structure of file {22} is different

When I look at the structure of the lists channel_names and chans, the values at indexes 1 and 3 are switched.

danielhrisca · 2019-02-11T19:23:35Z

Hello @drhannah94 ,

that was indeed the issue; the first file had the channels switched. What does the channel "Accel_Chasis_X" look like if you concatenate the other files and ignore the first one?

I could work on a fallback that checks the channel names and would handle strange cases like your first file, but that would certainly not be as memory and speed efficient as the case where the files have identical structure.
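
The fallback described here amounts to matching channels by name instead of by position. A minimal sketch of that idea in plain Python; it is not asammdf's actual implementation, `align_to_reference` is a hypothetical helper, the channel names are shortened, and the sample values are made up:

```python
def align_to_reference(reference_names, names, columns):
    """Reorder `columns` (one list per channel) to follow reference_names.

    Raises ValueError when the two files do not contain the same channels,
    since reordering cannot repair a genuinely different structure.
    """
    if sorted(reference_names) != sorted(names):
        raise ValueError("channel sets differ; cannot align by name")
    position = {name: index for index, name in enumerate(names)}
    return [columns[position[name]] for name in reference_names]


reference = ["time", "Accel_Vertical_Wheel", "Accel_Chasis_Y", "Accel_Chasis_X"]
# A file whose entries 1 and 3 are switched relative to the reference.
first_file_names = ["time", "Accel_Chasis_X", "Accel_Chasis_Y", "Accel_Vertical_Wheel"]
first_file_columns = [[0.0, 0.001], [10, 11], [20, 21], [30, 31]]

aligned = align_to_reference(reference, first_file_names, first_file_columns)
print(aligned[3])  # the Accel_Chasis_X samples, now at the reference index
```

The extra name lookup and reordering per file is the kind of overhead the comment above refers to; when every file has the identical layout, the columns can be appended directly.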

danielhrisca · 2019-02-12T20:01:14Z

@drhannah94
I've had a go at trying to handle situations where the channels are not in the same order in the channel group. Please have a try with the Py3 branch code.

drhannah94 · 2019-02-13T18:04:38Z

@danielhrisca

That definitely did something. I re-ran the same code...

print(MDF(files[0]).version)
MDF(files[0]).get('Accel_Chasis_X').plot()
merge_file = MDF.concatenate(files)
print(merge_file.version)
merge_file.get('Accel_Chasis_X').plot()
merge_file.resample(0.1)
merge_file.get('Accel_Chasis_X').plot()

The output from the code above is...

The graphs look like what I would expect. There are just weird gaps between some of the files.

drhannah94 · 2019-02-13T18:08:32Z

Here is what the plot looks like if you ignore the first file.

merge_file = MDF.concatenate(files[1:])
merge_file.get('Accel_Chasis_X').plot()

danielhrisca · 2019-02-13T18:11:42Z

There is probably a problem with the way the recording software has set the start time in the measurement file header.

What does this print?

for f in files:
    print(MDF(f).header.start_time)

A possible fix would be to disable the sync for concatenation:

MDF.concatenate(files, sync=False)
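
The effect of `sync` can be pictured with a simplified model. This is an assumption for illustration, not asammdf's actual code: with syncing, each file's timestamps are shifted by the difference between its header start time and the first file's; without it, each file simply continues where the previous one ended. The start times and timestamps below are invented.

```python
from datetime import datetime


def concatenate_times(files, sync=True):
    """files: list of (start_time: datetime, timestamps: list of seconds)."""
    merged = []
    origin = files[0][0]
    for start_time, timestamps in files:
        if not timestamps:
            continue
        if sync:
            # Place the file on a common time axis using its header start time.
            offset = (start_time - origin).total_seconds()
        elif merged:
            # Continue after the previous file, keeping one sample step of spacing.
            step = timestamps[1] - timestamps[0] if len(timestamps) > 1 else 0.0
            offset = merged[-1] + step - timestamps[0]
        else:
            offset = 0.0
        merged.extend(t + offset for t in timestamps)
    return merged


files = [
    (datetime(2019, 1, 1, 12, 0, 0), [0.0, 1.0, 2.0]),
    (datetime(2019, 1, 1, 12, 1, 27), [0.0, 1.0, 2.0]),
]
print(concatenate_times(files, sync=True))
print(concatenate_times(files, sync=False))
```

In this model the synced result jumps from 2 s to 87 s between the two files, which is the kind of empty stretch that shows up as a gap in a plot.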

drhannah94 · 2019-02-13T18:21:37Z

@danielhrisca
yeah it looks like there are jumps in the start times.

2018-10-25 10:20:16
2018-10-25 11:48:40
2018-10-25 11:50:07
2018-10-25 11:52:04
2018-10-25 11:53:59
2018-10-25 11:55:58
2018-10-25 11:58:06
2018-10-25 12:02:05
2018-10-25 12:06:00
2018-10-25 12:09:25
2018-10-25 12:38:26
2018-10-25 12:42:39
2018-10-25 12:47:57
2018-10-25 12:52:04
2018-10-25 12:55:20
2018-10-25 12:59:04
2018-10-25 13:02:31

Concatenating with sync set to false seems to have done the trick!!
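
The size of those jumps can be computed directly from the start times listed above. A short sketch using only the standard library; the 600-second threshold for a "long" gap is an arbitrary choice for illustration:

```python
from datetime import datetime

START_TIMES = [
    "2018-10-25 10:20:16", "2018-10-25 11:48:40", "2018-10-25 11:50:07",
    "2018-10-25 11:52:04", "2018-10-25 11:53:59", "2018-10-25 11:55:58",
    "2018-10-25 11:58:06", "2018-10-25 12:02:05", "2018-10-25 12:06:00",
    "2018-10-25 12:09:25", "2018-10-25 12:38:26", "2018-10-25 12:42:39",
    "2018-10-25 12:47:57", "2018-10-25 12:52:04", "2018-10-25 12:55:20",
    "2018-10-25 12:59:04", "2018-10-25 13:02:31",
]


def start_gaps(stamps):
    """Seconds between consecutive header start times."""
    times = [datetime.strptime(s, "%Y-%m-%d %H:%M:%S") for s in stamps]
    return [(b - a).total_seconds() for a, b in zip(times, times[1:])]


gaps = start_gaps(START_TIMES)
# (index of the later file, gap in seconds) for gaps longer than 10 minutes
long_gaps = [(i + 1, g) for i, g in enumerate(gaps) if g > 600]
print(long_gaps)
```

The two long gaps, before the second file and before the eleventh, are where a synced concatenation would leave the widest empty stretches on the time axis.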

danielhrisca · 2019-02-13T18:23:39Z

I'm glad this worked

danielhrisca · 2019-02-13T18:25:38Z

Just a quick suggestion: install pyqtgraph (preferably from the github develop branch pip install https://github.com/pyqtgraph/pyqtgraph/archive/develop.zip) to get nicer interactive plotting

drhannah94 · 2019-02-13T18:29:15Z

@danielhrisca Thank you so much for the help! I really appreciate the work you've done for this package!

And thanks for the suggestion. I'll definitely take a look at it. 👍

danielhrisca added the bug label Feb 11, 2019

danielhrisca added a commit that referenced this issue Feb 12, 2019

refine fix for issue #124

c9ffba8

danielhrisca closed this as completed Feb 13, 2019
