Tables with text columns containing and starting with missing values produce "NA" strings where the value should be NA #111
Comments
Correction: the issue cannot be solved by pre-sorting. It looks like the last row is also parsed as "NA" instead of NA
The code pasted from the command line was not properly displayed in the previous comment:
Sorry, the bug is even more severe. It looks like the last row with a value that is followed by a missing value is parsed as a missing value. In this example there are 10 sequences with their names. seq_010 was replaced by a missing value, NA, in kIn. What is also very strange is that the column sampleNumber.text, which is a text column as well, looks fine. At this point it is also not clear to me what NA in R means. Here is the example:
I saw this again today in newer software versions. To me this is a severe bug, because it changes the data where it should not, and the user might not even notice (=> changed Priority). Win11 Pro. Current example of a string column passed from KNIME to R and back to KNIME (all "NA" entries were previously ?, i.e. missing values):
Ok, I just realized the full extent of the bug again:
When a table contains text columns that start with missing values in the first rows, these are correctly parsed to NA in R.
However, when a row contains a text value, the row before it is parsed as "NA". As a consequence, the table returned to KNIME contains "NA" entries where there was a missing value before the R snippet. This is a bug.
Checking on the R side shows that this already happens in the parsed kIn table (from the data provided below):
kIn$`Name (Sense)`[1:20]
 [1] NA NA NA NA NA NA NA NA NA
[10] NA NA NA "NA" "seq_001" "seq_002" "seq_003" "seq_004" "seq_005"
[19] "seq_006" "seq_007"
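For anyone unsure how a real missing value differs from the text "NA" in R, here is a minimal sketch. The vector `x` is made up for illustration and only mimics the pattern in the output above; it is not the real kIn data. `is.na()` is TRUE only for real missing values, while a comparison with "NA" finds the literal text.

```r
# Made-up vector mimicking the pattern seen in kIn (not the real data).
x <- c(NA, NA, "NA", "seq_001", "seq_002")

# is.na() is TRUE only for real missing values (NA_character_ here).
missing_mask <- is.na(x)

# The literal string "NA" is ordinary text. Guard with !is.na() because
# comparing NA with anything yields NA rather than FALSE.
literal_mask <- !is.na(x) & x == "NA"

print(missing_mask)
print(which(literal_mask))  # positions that hold the text "NA"
```

Running the same two checks on an affected column of kIn should show whether the wrong entries are already strings when the table arrives in R.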
Sorting the table in KNIME in such a way that missing values are at the end of the table solves the issue for that column.
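As another possible stopgap on the R side, the literal "NA" strings could be turned back into missing values at the top of the snippet. The sketch below assumes that no cell legitimately contains the text NA; `restore_missing` and `kIn_demo` are hypothetical names used for illustration and are not part of KNIME or Rserve.

```r
# Hypothetical helper: convert the literal text "NA" back to a real
# missing value in every character column of a data frame.
# Assumption: the text "NA" never occurs as genuine data.
restore_missing <- function(df) {
  is_text <- vapply(df, is.character, logical(1))
  df[is_text] <- lapply(df[is_text], function(col) {
    col[!is.na(col) & col == "NA"] <- NA_character_
    col
  })
  df
}

# Small stand-in for kIn (made-up data).
kIn_demo <- data.frame(
  name = c(NA, "NA", "seq_001"),
  n = c(1, 2, 3),
  stringsAsFactors = FALSE
)
fixed <- restore_missing(kIn_demo)
print(is.na(fixed$name))
```

This only hides the symptom. The parsing problem itself still needs to be fixed.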
Attached KNIME workflow with data showing the problem...
Win7
KNIME 4.5.2
R version 3.6.1 (2019-07-05)
Rserve 1.8-6
R snippted problem with NAs.zip