Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better handling for DOS line endings in ASCII readers #5862

Open
markcmiller86 opened this issue Jul 6, 2021 · 3 comments
Open

Better handling for DOS line endings in ASCII readers #5862

markcmiller86 opened this issue Jul 6, 2021 · 3 comments
Labels
enhancement New feature or request impact medium Productivity partially degraded (not easily mitigated bug) or improved (enhancement) likelihood medium Neither low nor high likelihood low-hanging fruit A cognizant developer has a shot at resolving in <1/2 day of work reviewed Issue has been reviewed and labeled by a developer

Comments

@markcmiller86
Copy link
Member

Is your feature request related to a problem?

We have many readers that read ascii files. We typically read them using STL istream objects. However, our parsing logic is not necessarily able to deal with DOS line endings (\r\n) in all cases. We should audit and improve this situation. Instead of using ifstream::getline() directly, we should use a VisIt defined wrapper method for this that detects such cases and removes the offending characters from whatever gets handed back to the plugin parsing logic.

I encountered cases where NASTRAN reader was not behaving as expected with DOS line endings present. It was doing a ton of extra string to number conversion attempts on empty strings because it was failing to see the end of a line.

Is your feature request specific to a data set?

This is specific to ASCII file formats. Any such formats whose files are produced in Windows systems can have DOS line endings. We should be able to support them transparently.

Describe the solution you'd like.

Create a line reading method that ensures all plugin logic only ever sees "normal" line endings in strings read from files. This method can then be used to read files in all ASCII plugins.

Describe alternatives you've considered.

We could instead support a tool that removes DOS line endings from files. But, that seems less useful.

@markcmiller86 markcmiller86 added the enhancement New feature or request label Jul 6, 2021
@brugger1 brugger1 added impact medium Productivity partially degraded (not easily mitigated bug) or improved (enhancement) likelihood medium Neither low nor high likelihood low-hanging fruit A cognizant developer has a shot at resolving in <1/2 day of work reviewed Issue has been reviewed and labeled by a developer labels Jul 6, 2021
@brugger1
Copy link
Collaborator

brugger1 commented Jul 6, 2021

We should check with @biagas in case the behavior is different on Windows.

@biagas
Copy link
Contributor

biagas commented Jul 12, 2021

I think @markcmiller86 changes to look for '\r' will work if the file was created on Windows and read on linux.
But what about the reverse? Or created on Mac and read elsewhere (and the reverse)? On MacOS are line endings '\n' like on linux?

@markcmiller86
Copy link
Member Author

markcmiller86 commented Jul 12, 2021

I think @markcmiller86 changes to look for '\r' will work if the file was created on Windows and read on linux.
But what about the reverse? Or created on Mac and read elsewhere (and the reverse)? On MacOS are line endings '\n' like on linux?

If you are talking about, for example, this code

avtNASTRANFileFormat::ParseLine(vtkIdType *verts, char *line, int start,
int count)
{
char *valstart = line + INDEX_FIELD_WIDTH * start;
char *valend = valstart + INDEX_FIELD_WIDTH;
char val_t;
int i;
for (i = 0; (i < count) && (*valstart != '\0') && (*valstart != '\r'); i++) {
if(*valstart == '+' || *valstart == '*')
{ // At blank continuation element
valstart += INDEX_FIELD_WIDTH;
valend += INDEX_FIELD_WIDTH;
}
val_t = *valend;
*valend = '\0';
verts[i] = Geti(valstart)-1;

That logic will continue marching through the line until it hits either the null char or the \r char. So, I think that means in this paritcular case anyways, it works for files with either kind of line ending on any machine.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request impact medium Productivity partially degraded (not easily mitigated bug) or improved (enhancement) likelihood medium Neither low nor high likelihood low-hanging fruit A cognizant developer has a shot at resolving in <1/2 day of work reviewed Issue has been reviewed and labeled by a developer
Projects
None yet
Development

No branches or pull requests

3 participants