Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

colnamesInt() has unfriendly behavior w.r.t. missing (NA) columns #6070

Open
MichaelChirico opened this issue Apr 10, 2024 · 4 comments
Open

Comments

@MichaelChirico
Copy link
Member

          It's a bit strange that `NA` returns `0L` for character input, but `NA_integer_` for numeric input.

In fact I find the current behavior odd as well:

data.table:::colnamesInt(iris, NA_integer_)
# Error in data.table:::colnamesInt(iris, NA_integer_) : 
#   argument specifying columns received non-existing column(s): cols[1]=-2147483648

I think it's outside the scope of this PR though, I'll file a follow-up issue.

Originally posted by @MichaelChirico in #6068 (comment)

First, we should check that it's actually possible to provide NA to this column since it's a private function (maybe we fully control all inputs & know for sure it can't receive missing inputs).

If user might provide NA, we should handle this better.

If NA input is impossible, we should drop tests with missing input.

@Nj221102
Copy link
Contributor

Nj221102 commented Apr 11, 2024

I believe making NA_integer return 0L as output would align it with the behavior of NA as character input. What do you think, @MichaelChirico ?

@MichaelChirico
Copy link
Member Author

First, we should check that it's actually possible to provide NA to this column since it's a private function (maybe we fully control all inputs & know for sure it can't receive missing inputs).

Please start from here -- can you write an example using only the public API that generates the above "bad" error message?

@Nj221102
Copy link
Contributor

First, we should check that it's actually possible to provide NA to this column since it's a private function (maybe we fully control all inputs & know for sure it can't receive missing inputs).

Please start from here -- can you write an example using only the public API that generates the above "bad" error message?

Certainly, I'll begin by locating an example that triggers this message. However, what if the error cannot be replicated using the public API? Will we simply let it remain unresolved?

@MichaelChirico
Copy link
Member Author

Yes. We shouldn't waste time over-complicating an internal function that has edge case behavior no user can ever see (unless they "violate the contract" and access the private API).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants