ENH: implemented arbitrary data separator, for now supported Tab and Comma. #15

jankoslavic · 2024-05-13T16:07:40Z

This PR adds handling of an arbitrary data separator.

Professor-0 · 2024-05-13T23:21:54Z

lvm_read.py

+        if line.startswith('Separator'):
+            separator = line.strip()[10:]
+            break
+    file.seek(0) 


Using file.seek will fail when this is called from read_str.
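
A small sketch of the concern, assuming read_str passes the parser a list of lines rather than a file object (the thread does not show read_str, so this is an assumption). A list has no seek method, while an io.StringIO wrapper does. The helper name rewind_if_possible is hypothetical.

```python
import io

def rewind_if_possible(source):
    """Rewind file-like sources; report False for plain sequences (hypothetical helper)."""
    if hasattr(source, "seek"):
        source.seek(0)
        return True
    return False

text = "Separator\tTab\n***End_of_Header***\n"
lines = text.splitlines(keepends=True)  # assumed shape of the read_str input
stream = io.StringIO(text)              # file-like alternative built from the same string

stream.readline()                       # consume the first line
rewound = rewind_if_possible(stream)    # the stream can be rewound
list_rewound = rewind_if_possible(lines)  # the list cannot
```

Either guarding the seek call or wrapping the string in io.StringIO would let one code path serve both entry points.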

Professor-0 · 2024-05-13T23:25:08Z

lvm_read.py

+            break
+    file.seek(0) 
+    if separator in separators.keys():
+        return separators[separator]


From the specification: "To find out what the separator character(s) is, read the entire header block and search for the keyword Separator. The character(s) that follows the keyword is the separator." Any separator can be supported by just using the character following 'Separator'.
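
One way to read that rule, as a sketch rather than the code in this PR: take the single character immediately after the keyword. The function name and the semicolon example are made up for illustration, and multi-character separators are not handled here.

```python
KEYWORD = "Separator"

def separator_from_header_line(line, default="\t"):
    """Return the character right after 'Separator', or default if none is present."""
    if line.startswith(KEYWORD) and len(line) > len(KEYWORD):
        candidate = line[len(KEYWORD)]
        # A bare keyword followed only by a line ending carries no separator.
        if candidate not in "\r\n":
            return candidate
    return default

tab_sep = separator_from_header_line("Separator\tTab\n")
comma_sep = separator_from_header_line("Separator,Comma\n")
other_sep = separator_from_header_line("Separator;Semicolon\n")  # hypothetical header line
missing = separator_from_header_line("Separator\n")
```

With this approach no lookup table of separator names is needed, so a writer using a delimiter other than Tab or Comma is still parsed.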

Professor-0 · 2024-05-13T23:28:54Z

lvm_read.py

+    }
+    for line in file:
+        if line.startswith('Separator'):
+            separator = line.strip()[10:]


This could cause an indexing error if a writer decides to write a header like:
Separator;
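
To make the edge case concrete, here is how the fixed-offset slice from the diff behaves on a few header lines. In Python, a slice that starts past the end of a string returns an empty string rather than raising, so the slice itself succeeds on `Separator;`. Any later single-element index on the result, such as `[0]`, would raise IndexError, and a name lookup would find nothing. The helper name is hypothetical.

```python
def name_after_keyword(line):
    # Same expression as in the diff: drop 'Separator' plus one delimiter character.
    return line.strip()[10:]

tab_name = name_after_keyword("Separator\tTab\n")
comma_name = name_after_keyword("Separator,Comma\n")
bare_name = name_after_keyword("Separator;\n")   # the header from the comment above

try:
    first_char = bare_name[0]   # indexing the empty result is what fails
except IndexError:
    first_char = None
```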

jankoslavic · 2024-05-14T04:24:00Z

Thank you for the suggestions. I made a few changes to address the issues opened. Any further ideas?

Professor-0 · 2024-05-15T01:53:16Z

Changes look good. _get_separator is not called from read_str. Also, a check for the end-of-header symbol could be added to _get_separator to avoid reading the entire file if the Separator header is missing.
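
A sketch of the suggested end-of-header check, assuming the header block ends with a line starting with `***End_of_Header***`. The function name find_separator is hypothetical and this is not the code in the PR.

```python
END_OF_HEADER = "***End_of_Header***"
KEYWORD = "Separator"

def find_separator(lines, default="\t"):
    """Scan header lines for the separator and stop at the end-of-header marker."""
    for line in lines:
        if line.startswith(KEYWORD) and len(line.rstrip("\r\n")) > len(KEYWORD):
            return line[len(KEYWORD)]
        if line.startswith(END_OF_HEADER):
            break  # header finished without a Separator entry
    return default

# A header without a Separator entry, followed by data rows.
source = iter([
    "LabVIEW Measurement\n",
    "***End_of_Header***\n",
    "0.0\t1.0\n",
    "0.1\t1.1\n",
])
sep = find_separator(source)
unread = list(source)  # rows the scan never touched
```

Because the loop breaks at the marker, the data rows are never read when the Separator entry is missing.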

jankoslavic · 2024-05-15T06:13:32Z

_get_separator should be called only from _read_lvm_base, and only on the file handle.
If the separator is not found in the first 20 lines, parsing continues with the default tab separator.
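
A minimal sketch of the behaviour described here, not the merged code: look at no more than the first 20 lines, fall back to tab, and rewind the file handle. The contents of the SEPARATORS lookup and the max_lines parameter are assumptions.

```python
import io
from itertools import islice

SEPARATORS = {"Tab": "\t", "Comma": ","}  # assumed contents of the lookup in the diff

def get_separator(file, max_lines=20, default="\t"):
    """Return the separator named in the first max_lines lines, else default."""
    found = default
    for line in islice(file, max_lines):
        if line.startswith("Separator"):
            name = line.strip()[10:]            # text after 'Separator' and one delimiter
            found = SEPARATORS.get(name, default)
            break
    file.seek(0)  # rewind so the caller can parse from the top
    return found

comma_file = io.StringIO("LabVIEW Measurement\nSeparator,Comma\n***End_of_Header***\n")
late_file = io.StringIO("x\n" * 25 + "Separator,Comma\n")  # entry beyond line 20

comma_sep = get_separator(comma_file)
late_sep = get_separator(late_file)
```

The fixed line limit bounds the scan without needing to recognise the end-of-header marker, at the cost of missing a Separator entry that appears later than line 20.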

implemented arbitrary data separator, for now supported Tab and Comma.

20cab1c

jankoslavic mentioned this pull request May 13, 2024

Modified parser to recognise a comma delimited lvm format #14

Closed

Professor-0 reviewed May 13, 2024

some simplifications and improving robustness

c35c7fb

check only first 20 lines for Separator

9c734d8

jankoslavic merged commit 5793c74 into master May 16, 2024
4 checks passed
