Handle 1.#INF, 1.#IND, 1.#QNAN and 1.#SNAN #1800

Closed
j0r1 opened this issue Aug 4, 2016 · 1 comment

@j0r1 j0r1 commented Aug 4, 2016

In a pull request I was asked to open an issue on this subject, so here it is.

The reason for the proposed patch for fread is that C/C++ code compiled with a Visual Studio compiler will output e.g. 1.#INF instead of inf, or 1.#IND instead of nan, when writing such floating point values to a text file using a printf-like function. Since R compiles its packages using gcc, it will not recognize these strings as doubles, causing entire columns to be interpreted as text instead of numbers.

The current patch always performs this check, and therefore causes a slowdown of 2.2%. I've also tried making it optional by adding an extra boolean argument to fread. Depending on that argument, a function pointer is set either to strtod directly, or to the modified code in strtod_wrapper that performs the extra checks. However, even such a straightforward change still causes a slowdown of 1.5% when the extra checks are not used.
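For illustration, the function-pointer idea described above might look roughly like the sketch below. This is not the code from the pull request: the names `strtod_msvc`, `select_parser` and `strtod_fn` are hypothetical, and the sketch assumes the special values appear exactly as 1.#INF, 1.#IND, 1.#QNAN or 1.#SNAN with an optional leading sign. It relies on the fact that strtod parses the leading "1." and stops at the '#', so ordinary numbers pay only one extra character comparison.

```c
#include <assert.h>
#include <ctype.h>
#include <math.h>
#include <stdlib.h>
#include <string.h>

/* Signature shared by strtod and the wrapper, so either can be selected. */
typedef double (*strtod_fn)(const char *, char **);

/* Hypothetical wrapper: behaves like strtod for everything except the
 * Visual Studio spellings 1.#INF, 1.#IND, 1.#QNAN and 1.#SNAN. */
double strtod_msvc(const char *s, char **endptr)
{
    static const char *const nan_names[] = { "1.#IND", "1.#QNAN", "1.#SNAN" };
    char *end;
    double value;

    assert(s != NULL);
    value = strtod(s, &end);

    /* strtod consumes the "1." prefix and stops at '#', so a single
     * character comparison rules out ordinary input. */
    if (*end == '#') {
        const char *p = s;
        int negative = 0;
        size_t i;

        while (isspace((unsigned char)*p)) p++;
        if (*p == '+' || *p == '-') negative = (*p++ == '-');

        if (strncmp(p, "1.#INF", 6) == 0) {
            value = negative ? -INFINITY : INFINITY;
            end = (char *)p + 6;
        } else {
            for (i = 0; i < sizeof nan_names / sizeof nan_names[0]; i++) {
                size_t n = strlen(nan_names[i]);
                if (strncmp(p, nan_names[i], n) == 0) {
                    value = NAN;
                    end = (char *)p + n;
                    break;
                }
            }
        }
    }
    if (endptr != NULL) *endptr = end;
    return value;
}

/* Hypothetical selector mirroring the optional boolean argument:
 * plain strtod when the extra checks are switched off. */
strtod_fn select_parser(int handle_msvc_specials)
{
    return handle_msvc_specials ? strtod_msvc : strtod;
}
```

A caller would pick the parser once, e.g. `strtod_fn parse = select_parser(1);`, and then call `parse(field, &end)` for each field. The indirect call through the pointer is one plausible source of the residual slowdown mentioned above, though that is a guess and not something measured here.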


@j0r1 j0r1 commented Sep 10, 2016

The approach was modified so that the extra checks are now completely optional.
