mdf4reader.py fails to get max len on unicode channels #139

laurentvm · 2018-03-21T07:46:21Z

Line 1873 of mdf4reader.py:
maxlen = max([len(str(ref)) for ref in cc_ref])
fails if ref is unicode. I had to change it to the following to export to Matlab:

maxlen = max([len(repr(ref).encode('utf-8')) for ref in cc_ref])


ratal · 2018-03-22T20:32:54Z

Hi Laurent,
Which Python version are you using? Unicode handling changed a lot between 2.7 and 3.x.
I guess you are on 2.7: according to my tests, len(str(u'ö$oi')) fails only on Python 2.7.
However, len(u'ö$oi') works properly on both Python versions --> can you try without the str()?
Actually, I do not think len(repr(ref).encode('utf-8')) gives the right length.
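
To illustrate the length concern, here is a minimal Python 3 sketch. The list cc_ref below is made-up sample data standing in for the conversion references, not values from the reporter's file. It compares the plain character count with the byte count produced by the repr/encode workaround, which also counts the quotes added by repr() and the extra bytes of non-ASCII characters.

```python
# Python 3 sketch; cc_ref is hypothetical sample data, not taken from a real MDF file.
cc_ref = ['\u00f6$oi', 'Init', '0']  # '\u00f6' is the character 'ö'

# Character count of each string, without any str() conversion.
len_plain = [len(ref) for ref in cc_ref]

# Workaround from the report: repr() adds the surrounding quotes,
# and encoding to UTF-8 counts bytes rather than characters.
len_workaround = [len(repr(ref).encode('utf-8')) for ref in cc_ref]

maxlen = max(len_plain)

print(len_plain)       # [4, 4, 1]
print(len_workaround)  # [7, 6, 3]
print(maxlen)          # 4
```

On Python 2.7, str(u'ö$oi') raises UnicodeEncodeError because of the implicit ASCII encoding, which is why dropping str() avoids the failure.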

laurentvm · 2018-03-22T20:50:03Z

Hi,

Yes, it was with Python 2.7. I can try with 3.x too if you need.
I will try with len only.

By the way, the file I got (mf4) is full of unicode. I'm not sure, but it seems that the export to matlab/xlsx fails with these unicode values.

For instance, I have a numpy array with u'0' or u'Init' values, which was a pain to output from the numpy array. I will continue my investigation. Unfortunately, I cannot share the file with you, but I will see if I can replicate the structure to share it.

laurentvm · 2018-03-22T20:50:55Z

... Just saw that you're in Belgium. So am I.

ratal · 2018-03-22T21:19:31Z

mf4 is unicode-only by specification and uses XML for metadata; this was a major change from 3.x to 4.x.
If there is an error with the Matlab export, maybe we should rather focus on that method to improve its robustness?

laurentvm · 2018-03-23T10:07:41Z

Just an update, using Python 2.7: the XLSX export fails as mentioned.

Excel file output:

Python output with:
for s in signal:
    val = yop[s]['data']
    print s, val

The signal contains values, but in Excel/Matlab there is nothing.

laurentvm · 2018-03-23T11:50:59Z

Same with 3.5

ratal · 2018-03-24T15:34:01Z

Try with the latest dev branch; it should be fixed.

laurentvm · 2018-03-26T09:53:26Z

Hi, just tested with Python 3; the Excel output gives much more information now. Great!

I have many signals named t. Before, your code completed the header names as t, t_1, t_2, ...
Now it appends the suffix twice.

I still have to check the Matlab output.

ratal · 2018-03-26T18:34:15Z

This could be normal behaviour. If you have unsorted data, such as several channel groups per data group, the 't' channel can also be present several times, so the data group and channel group numbers are appended to duplicated channel names.
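
As a rough illustration of that renaming, here is a hypothetical sketch. The function disambiguate, the tuple layout and the name_dg_cg suffix format are all assumptions made for this example and are not mdfreader's actual implementation.

```python
from collections import Counter


def disambiguate(channels):
    """Return unique names for a list of (name, data_group, channel_group) tuples.

    Hypothetical helper: the suffix format name_dg_cg is an assumption.
    """
    counts = Counter(name for name, _, _ in channels)
    result = []
    for name, dg, cg in channels:
        if counts[name] > 1:
            # Duplicated name: append data group and channel group numbers.
            result.append('{}_{}_{}'.format(name, dg, cg))
        else:
            # Unique name: keep it unchanged.
            result.append(name)
    return result


names = disambiguate([('t', 0, 0), ('t', 0, 1), ('speed', 0, 0)])
print(names)  # ['t_0_0', 't_0_1', 'speed']
```

With two numbers appended, a duplicated 't' ends up with two suffixes, which would match the double suffix seen in the Excel header.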

ratal · 2018-06-02T18:34:57Z

No feedback for a while.
If it is still an issue, you can reopen.

laurentvm · 2018-06-12T19:36:01Z

Hi, it’s working ok. Thanks for the fix

ratal added the Waiting fix confirmation label Mar 22, 2018

ratal closed this as completed Jun 2, 2018
