dataset_summarize(): open-text variable with categories "Values present in dataset" output #5

Closed
zchenmr opened this issue Jun 2, 2023 · 6 comments
Labels
bug (Something isn't working), question (Further information is requested), to test

Comments

@zchenmr

zchenmr commented Jun 2, 2023

R CanPath: harmonizR v1.1.0.1000, madshapR 1.1.0.1000, fabR 1.1.1

In the "Values present in dataset" column of the "Categorical variable summary" tab, there is the following output for open text variables with a category in the DataSchema:

[-7] - Not applicable : 68.28%
 : 0.01%
 : 0.01%
 : 0.01%
 : 0.01%
 : 0.01%
 [......]

Other values (non categorical) : 9.81%

NA values : 20.32%

The value : 0.01% often repeats many times (sometimes a few hundred lines - possibly related to the number of responses for the variable). Is this output expected? Are there supposed to be so many lines and is the space preceding the percentage supposed to be blank? Most of the open-text variables showed some variation of the result above, but one of the variables had this instead:

[-7] - Not applicable : 85.64%
 : 0.06%

Other values (non categorical) : 1.86%

NA values : 12.44%

A different report also showed output with a majority of : 0.02% values (with some other values like : 0.13%, : 0.1%, etc.), so there is some variation based on the dataset.

GuiFabre added the bug label Jun 14, 2023
@GuiFabre
Contributor

Hello @zchenmr, can you tell me if this bug still occurs? Thanks!

GuiFabre added the question label Aug 31, 2023
@zchenmr
Author

zchenmr commented Sep 5, 2023

Yes, this bug still occurs with harmonizR v.1.1.0.1003, madshapR v1.0.0, fabR v1.1.1 in R CanPath.

GuiFabre transferred this issue from another repository Sep 22, 2023
@zchenmr
Author

zchenmr commented Oct 18, 2023

This is still occurring in R CanPath with Rmonize v1.0.0, madshapR v1.0.2, fabR 2.0.0.

GuiFabre pinned this issue Oct 19, 2023
@rwissa

rwissa commented Oct 20, 2023

20/10/2023: estimate the time needed to fix. 27/10: decide whether to fix before CRAN.

GuiFabre added a commit to maelstrom-research/madshapR that referenced this issue Oct 20, 2023
@GuiFabre
Contributor

This has been corrected. Can you test it, @zchenmr?
Thanks 😄

GuiFabre unpinned this issue Oct 20, 2023
@zchenmr
Author

zchenmr commented Oct 20, 2023

It looks like this issue has been resolved with Rmonize v1.0.0.9003 - all of the percentages are now included under "Other values (non categorical)", which gives a single value.
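
For anyone comparing reports before and after the fix, the resolved behaviour described above can be sketched roughly as follows. This is an illustrative Python sketch only, not the madshapR or Rmonize implementation: the function name `summarize_values`, the sample responses, and the rounding to two decimals are all assumptions made for the example. It shows each DataSchema category getting its own line, while every free-text response is folded into a single "Other values (non categorical)" line instead of one blank-labelled line per distinct response.

```python
from collections import Counter


def summarize_values(values, categories):
    """Return a {label: percentage} mapping for one variable.

    `values` is a list of raw responses (None stands for a missing value).
    `categories` maps category codes to labels, e.g. {"-7": "Not applicable"}.
    Both names are hypothetical and exist only for this sketch.
    """
    total = len(values)
    if total == 0:
        return {}

    counts = Counter()
    for value in values:
        if value is None:
            counts["NA values"] += 1
        elif value in categories:
            # Declared categories keep their own labelled line.
            counts[f"[{value}] - {categories[value]}"] += 1
        else:
            # All open-text responses share one line.
            counts["Other values (non categorical)"] += 1

    return {label: round(100 * n / total, 2) for label, n in counts.items()}


# Hypothetical data: six "not applicable" codes, two free-text answers, two missing.
responses = ["-7"] * 6 + ["asthma", "migraine"] + [None] * 2
summary = summarize_values(responses, {"-7": "Not applicable"})
for label, pct in summary.items():
    print(f"{label} : {pct}%")
```

With the sample data, the two distinct free-text answers contribute to one combined percentage rather than producing two separate lines.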

3 participants