Windows line endings not working in CSVInputFormat #791

Open
markus-h opened this issue May 12, 2014 · 6 comments

Comments

@markus-h
Contributor

A user running Stratosphere on Windows reported this issue. After converting the input file to UNIX line endings, it worked.

@rmetzger
Member

I'm currently preparing a pull request to ease debugging these issues.

@markus-h
Contributor Author

I had a look at this issue, and to me CsvInputFormat seems to work as intended. The problem is probably that DEFAULT_LINE_DELIMITER is set to "\n", while files created under Windows use "\r\n".
I am not sure how to deal with this. We could set the default delimiter to System.getProperty("line.separator"), but then files checked out from git, for example, would not work by default on Windows, because git keeps them in Linux format.
Another possibility would be to accept both by default and dynamically remove a trailing "\r" if it is present.
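
The second option could look roughly like the sketch below. It assumes records have already been split on "\n" and arrive as a byte range, and it shortens that range by one byte when the last byte is "\r". The class and method names (LineEndingSketch, lengthWithoutTrailingCr) are hypothetical and are not part of CsvInputFormat; this only illustrates the idea, not how any existing fix is implemented.

```java
import java.nio.charset.StandardCharsets;

public class LineEndingSketch {

    /**
     * Hypothetical helper: given a record stored in
     * bytes[offset .. offset + numBytes) that was split on '\n',
     * returns the record length with a single trailing '\r' excluded.
     */
    public static int lengthWithoutTrailingCr(byte[] bytes, int offset, int numBytes) {
        if (numBytes > 0 && bytes[offset + numBytes - 1] == '\r') {
            return numBytes - 1; // record came from a "\r\n" file
        }
        return numBytes; // record came from a "\n" file, nothing to strip
    }

    public static void main(String[] args) {
        // A record from a Windows file after splitting on '\n'.
        byte[] record = "a,b,c\r".getBytes(StandardCharsets.UTF_8);
        int len = lengthWithoutTrailingCr(record, 0, record.length);
        System.out.println(new String(record, 0, len, StandardCharsets.UTF_8));
    }
}
```

Because only the length changes, files with UNIX line endings pass through untouched, and no platform check or copy of the buffer is needed. One trade-off of this approach is that a record whose last field legitimately ends in "\r" would lose that character.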

@rmetzger
Member

I think it's okay to accept both IF the user is on Windows.

@StephanEwen
Contributor

We had a fix for this. Did it get lost during some merge?

@rmetzger
Member

That's the pull request: #582

@StephanEwen
Contributor

Have a look at #582


3 participants