case-sensitive matching in MDF.search(mode='wildcard', case_insensitive=True) #776

Closed
timohencken opened this issue Oct 11, 2022 · 0 comments


timohencken commented Oct 11, 2022

Python version

  • Windows machine:

    • ('python=3.8.10 (tags/v3.8.10:3d8993a, May 3 2021, 11:48:03) [MSC v.1928 64 bit (AMD64)]')
    • 'os=Windows-10-10.0.19042-SP0'
    • 'numpy=1.23.3'
    • 'asammdf=7.1.1'
  • Linux machine:

    • 'python=3.8.13 (default, Mar 16 2022, 17:28:59) \n[GCC 7.5.0]'
    • 'os=Linux-5.4.0-125-generic-x86_64-with-glibc2.27'
    • 'numpy=1.23.3'
    • 'asammdf=7.1.1'

Code

MDF version

  • affected: all
  • tested with: 4.1.0
  • sample file attached (zipped): sample.zip

Code snippet

  • Windows
>>> from asammdf import MDF
>>> import fnmatch
>>> import os
>>> mdf = MDF('stripped_sample.mf4')
>>> mdf.channels_db.keys()
dict_keys(['time', 'Channel_no_conversion', 'Channel_linear_conversion', 'Channel_algebraic', 'Channel_rational_conversion', 'Channel_string', 'Channel_bytearay', 'Channel_tabular', 'Channel_value_to_text', 'Channel_value_range_to_value', 'Channel_value_range_to_text', 'Channel_lookup_with_axis', 'channel_axis_1', 'channel_axis_2', 'Channel_lookup_with_default_axis', 'Channel_structure_composition', 'struct_channel_0', 'struct_channel_1', 'struct_channel_2', 'struct_channel_3', 'struct_channel_4', 'struct_channel_5', 'struct_channel_6', 'struct_channel_7', 'Nested_structures', 'level11', 'level21', 'level31', 'level41', 'level42', 'level43', 'level44', 'level32', 'level33', 'level22'])

# OK this returns the expected result
>>> mdf.search(pattern='channel_.*', mode='regex', case_insensitive=True)
['Channel_no_conversion', 'Channel_linear_conversion', 'Channel_algebraic', 'Channel_rational_conversion', 'Channel_string', 'Channel_bytearay', 'Channel_tabular', 'Channel_value_to_text', 'Channel_value_range_to_value', 'Channel_value_range_to_text', 'Channel_lookup_with_axis', 'channel_axis_1', 'channel_axis_2', 'Channel_lookup_with_default_axis', 'Channel_structure_composition', 'struct_channel_0', 'struct_channel_1', 'struct_channel_2', 'struct_channel_3', 'struct_channel_4', 'struct_channel_5', 'struct_channel_6', 'struct_channel_7']

# OK this returns case-sensitive matches
>>> mdf.search(pattern='channel_*', mode='wildcard', case_insensitive=False)
['channel_axis_1', 'channel_axis_2']

# OK this returns case-sensitive matches
>>> mdf.search(pattern='Channel_*', mode='wildcard', case_insensitive=False)
['Channel_no_conversion', 'Channel_linear_conversion', 'Channel_algebraic', 'Channel_rational_conversion', 'Channel_string', 'Channel_bytearay', 'Channel_tabular', 'Channel_value_to_text', 'Channel_value_range_to_value', 'Channel_value_range_to_text', 'Channel_lookup_with_axis', 'Channel_lookup_with_default_axis', 'Channel_structure_composition']

# OK this returns case-insensitive matches
>>> mdf.search(pattern='channel_*', mode='wildcard', case_insensitive=True)
['Channel_no_conversion', 'Channel_linear_conversion', 'Channel_algebraic', 'Channel_rational_conversion', 'Channel_string', 'Channel_bytearay', 'Channel_tabular', 'Channel_value_to_text', 'Channel_value_range_to_value', 'Channel_value_range_to_text', 'Channel_lookup_with_axis', 'channel_axis_1', 'channel_axis_2', 'Channel_lookup_with_default_axis', 'Channel_structure_composition']

# OK this returns case-insensitive matches
>>> mdf.search(pattern='Channel_*', mode='wildcard', case_insensitive=True)
['Channel_no_conversion', 'Channel_linear_conversion', 'Channel_algebraic', 'Channel_rational_conversion', 'Channel_string', 'Channel_bytearay', 'Channel_tabular', 'Channel_value_to_text', 'Channel_value_range_to_value', 'Channel_value_range_to_text', 'Channel_lookup_with_axis', 'channel_axis_1', 'channel_axis_2', 'Channel_lookup_with_default_axis', 'Channel_structure_composition']

# fnmatch is case-insensitive in Windows
>>> fnmatch.filter(['aa', 'AA'], 'a*')
['aa', 'AA']

# fnmatch is case-insensitive in Windows as normcase() is case-insensitive in Windows
>>> os.path.normcase('aa')
'aa'
>>> os.path.normcase('AA')
'aa'
  • Linux:
>>> from asammdf import MDF
>>> import fnmatch
>>> import os
>>> mdf = MDF('stripped_sample.mf4')
>>> mdf.channels_db.keys()
dict_keys(['time', 'Channel_no_conversion', 'Channel_linear_conversion', 'Channel_algebraic', 'Channel_rational_conversion', 'Channel_string', 'Channel_bytearay', 'Channel_tabular', 'Channel_value_to_text', 'Channel_value_range_to_value', 'Channel_value_range_to_text', 'Channel_lookup_with_axis', 'channel_axis_1', 'channel_axis_2', 'Channel_lookup_with_default_axis', 'Channel_structure_composition', 'struct_channel_0', 'struct_channel_1', 'struct_channel_2', 'struct_channel_3', 'struct_channel_4', 'struct_channel_5', 'struct_channel_6', 'struct_channel_7', 'Nested_structures', 'level11', 'level21', 'level31', 'level41', 'level42', 'level43', 'level44', 'level32', 'level33', 'level22'])

# OK this returns the expected result
>>> mdf.search(pattern='channel_.*', mode='regex', case_insensitive=True)
['Channel_no_conversion', 'Channel_linear_conversion', 'Channel_algebraic', 'Channel_rational_conversion', 'Channel_string', 'Channel_bytearay', 'Channel_tabular', 'Channel_value_to_text', 'Channel_value_range_to_value', 'Channel_value_range_to_text', 'Channel_lookup_with_axis', 'channel_axis_1', 'channel_axis_2', 'Channel_lookup_with_default_axis', 'Channel_structure_composition', 'struct_channel_0', 'struct_channel_1', 'struct_channel_2', 'struct_channel_3', 'struct_channel_4', 'struct_channel_5', 'struct_channel_6', 'struct_channel_7']

# OK this returns case-sensitive matches
>>> mdf.search(pattern='channel_*', mode='wildcard', case_insensitive=False)
['channel_axis_1', 'channel_axis_2']

# OK this returns case-sensitive matches
>>> mdf.search(pattern='Channel_*', mode='wildcard', case_insensitive=False)
['Channel_no_conversion', 'Channel_linear_conversion', 'Channel_algebraic', 'Channel_rational_conversion', 'Channel_string', 'Channel_bytearay', 'Channel_tabular', 'Channel_value_to_text', 'Channel_value_range_to_value', 'Channel_value_range_to_text', 'Channel_lookup_with_axis', 'Channel_lookup_with_default_axis', 'Channel_structure_composition']

# NOK this returns only case-sensitive matches although case_insensitive=True
>>> mdf.search(pattern='channel_*', mode='wildcard', case_insensitive=True)
['channel_axis_1', 'channel_axis_2']

# NOK this returns only case-sensitive matches although case_insensitive=True
>>> mdf.search(pattern='Channel_*', mode='wildcard', case_insensitive=True)
['Channel_no_conversion', 'Channel_linear_conversion', 'Channel_algebraic', 'Channel_rational_conversion', 'Channel_string', 'Channel_bytearay', 'Channel_tabular', 'Channel_value_to_text', 'Channel_value_range_to_value', 'Channel_value_range_to_text', 'Channel_lookup_with_axis', 'Channel_lookup_with_default_axis', 'Channel_structure_composition']

# fnmatch is case-sensitive in Linux
>>> fnmatch.filter(['aa', 'AA'], 'a*')
['aa']
>>> fnmatch.filter(['aa', 'AA'], 'A*')
['AA']

# fnmatch is case-sensitive in Linux as normcase() is case-sensitive in Linux
>>> os.path.normcase('aa')
'aa'
>>> os.path.normcase('AA')
'AA'
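
The platform difference shown in the two sessions above can probably be reproduced on a single machine, because the standard library exposes both normcase() variants as ntpath.normcase() and posixpath.normcase(). The following is only a sketch: filter_like_fnmatch is a hypothetical helper that imitates what fnmatch.filter() does, it is not asammdf or stdlib code.

```python
import fnmatch
import ntpath
import posixpath


def filter_like_fnmatch(names, pattern, normcase):
    """Imitate fnmatch.filter() with an explicitly chosen normcase function.

    Hypothetical helper for illustration only.
    """
    pattern = normcase(pattern)
    # fnmatchcase() never normalizes, so all case handling comes from normcase
    return [name for name in names if fnmatch.fnmatchcase(normcase(name), pattern)]


names = ["aa", "AA"]

# Windows-style normalization lowercases both names and pattern
windows_like = filter_like_fnmatch(names, "a*", ntpath.normcase)

# POSIX-style normalization leaves the strings untouched
posix_like = filter_like_fnmatch(names, "a*", posixpath.normcase)

print(windows_like)
print(posix_like)
```

With the Windows-style function both names match, with the POSIX-style function only 'aa' matches, which is consistent with the behaviour reported here.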

Traceback

No traceback, as no exception is raised.
Tracked down to the following code:

asammdf/asammdf/mdf.py

Lines 5448 to 5450 in af0d4a6

            if case_insensitive:
                channels = fnmatch.filter(self.channels_db, pattern)
            else:

Description

Using wildcard search in MDF.search() with case_insensitive=True is not case-insensitive on Linux, while it works as expected on Windows.

This happens because the fnmatch library is used, which is designed for file path comparisons. According to its documentation, fnmatch.fnmatch() performs "case normalization" (https://docs.python.org/3/library/fnmatch.html#fnmatch.fnmatch), but this does not mean "case-insensitive": it relies on os.path.normcase(), which lowercases on Windows and leaves the string unchanged on Linux, where the default file systems are case-sensitive.

            if case_insensitive:
-                 channels = fnmatch.filter(self.channels_db, pattern)
+                 pattern = pattern.casefold()
+                 channels = [
+                     name
+                     for name in self.channels_db
+                     if fnmatch.fnmatch(name.casefold(), pattern)
+                 ]
            else:
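
The idea behind the change above can be tried outside of asammdf with a small standalone sketch. search_wildcard is a hypothetical stand-in for the wildcard branch of MDF.search(), and the case-sensitive branch is my assumption, since that part of the original code is not shown here. The sketch uses fnmatch.fnmatchcase() instead of fnmatch.fnmatch() so that os.path.normcase() is not involved at all.

```python
import fnmatch


def search_wildcard(names, pattern, case_insensitive=False):
    """Hypothetical stand-in for the wildcard branch of MDF.search()."""
    if case_insensitive:
        # Fold the pattern once, fold each name, then compare case-sensitively
        folded_pattern = pattern.casefold()
        return [
            name
            for name in names
            if fnmatch.fnmatchcase(name.casefold(), folded_pattern)
        ]
    # Assumed case-sensitive branch: compare names and pattern unchanged
    return [name for name in names if fnmatch.fnmatchcase(name, pattern)]


channel_names = ["time", "Channel_string", "channel_axis_1", "struct_channel_0"]

insensitive = search_wildcard(channel_names, "channel_*", case_insensitive=True)
sensitive = search_wildcard(channel_names, "channel_*", case_insensitive=False)

print(insensitive)
print(sensitive)
```

The original channel names are returned, not the folded ones, so the result can still be used as keys into the channel database.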
  • Possible workarounds:
    • use mode='regex'
    • use Windows ;)
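
For anyone who wants to keep the wildcard syntax until this is fixed, a user-side workaround might look like the sketch below. It translates the wildcard into a regular expression with fnmatch.translate() and matches with re.IGNORECASE, so it does not depend on the platform. wildcard_filter is a hypothetical helper; passing mdf.channels_db as names is an assumption based on the snippets in this issue.

```python
import fnmatch
import re


def wildcard_filter(names, pattern):
    """Case-insensitive wildcard filter that behaves the same on every OS.

    Hypothetical helper, not part of asammdf.
    """
    # fnmatch.translate() turns a shell-style wildcard into a regex string
    regex = re.compile(fnmatch.translate(pattern), re.IGNORECASE)
    # match() anchors at the start; the translated regex anchors the end
    return [name for name in names if regex.match(name)]


channel_names = ["time", "Channel_string", "channel_axis_1", "struct_channel_0"]
matches = wildcard_filter(channel_names, "channel_*")
print(matches)
```

Note that, unlike the mode='regex' results shown above, this keeps wildcard semantics: the whole name has to match, so 'struct_channel_0' is not returned for 'channel_*'.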