In its current form, PathoStat accepts "batch" and "condition" as possible discrete variables, and gives the user the option to color or group data (in various plots) by either of those. However, we're adding functionality: PathoStat will accept any number of covariates, such as patient age, weight, race, or disease status. We still want to let users color or group data by these covariates, but grouping doesn't make much sense for continuous variables: without binning, how do you group people by weight? You can, however, order data by a continuous variable. We want to at least distinguish between discrete and continuous variables, and we may want to add functionality specific to continuous ones.
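To make the grouping-versus-ordering distinction concrete, here is a minimal base R sketch. The `samples` data frame, its column names, and the bin break points are all made up for illustration; none of this is existing PathoStat code.

```r
# Hypothetical sample metadata; names and values are illustrative only.
samples <- data.frame(
  id     = c("s1", "s2", "s3", "s4"),
  weight = c(82.5, 61.0, 95.2, 70.3),
  stringsAsFactors = FALSE
)

# Ordering works directly on a continuous variable.
ordered_ids <- samples$id[order(samples$weight)]

# Grouping needs binning first; these break points are arbitrary assumptions.
samples$weight_bin <- cut(
  samples$weight,
  breaks = c(0, 65, 85, Inf),
  labels = c("low", "medium", "high"),
  ordered_result = TRUE
)
```

After binning, `weight_bin` is an ordered factor and can be used for grouping or coloring like any other discrete variable.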
However, I think we need to be explicit in assigning types to sample variables. A function should be implemented that accepts user input to assign types, or attempts to infer from the data. Inferring may not be 100% accurate. For example, R (read.table or similar) interprets "Subject ID" as an integer, but it should be a factor, since there is no meaningful ordering to the subjects. Still, inferring from the data would be a good first step.
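A rough sketch of what such a function might look like, assuming base R only. The name `infer_covariate_types`, the `overrides` argument, and the `max_levels` threshold are hypothetical choices for illustration, not an existing PathoStat API.

```r
# Hypothetical helper: guess a type for each column of the sample data,
# then let explicit user-supplied overrides take precedence.
infer_covariate_types <- function(df, overrides = list(), max_levels = 10) {
  guess_one <- function(x) {
    if (is.ordered(x)) return("ordered")
    if (is.factor(x) || is.logical(x)) return("factor")
    if (is.character(x)) {
      # Few distinct strings suggests a categorical variable;
      # many suggests free text. The threshold is an assumption.
      if (length(unique(x)) <= max_levels) return("factor")
      return("character")
    }
    if (is.integer(x)) return("integer")
    if (is.numeric(x)) return("numeric")
    "character"
  }
  types <- vapply(df, guess_one, character(1))
  # User input wins over inference for any column it names.
  for (nm in intersect(names(overrides), names(df))) {
    types[[nm]] <- overrides[[nm]]
  }
  types
}

# Made-up metadata: subject_id is read as an integer, as described above.
meta <- data.frame(
  subject_id = c(101L, 102L, 103L),
  age        = c(34.5, 51.0, 47.2),
  condition  = c("case", "control", "case"),
  stringsAsFactors = FALSE
)
inferred  <- infer_covariate_types(meta)
corrected <- infer_covariate_types(meta, overrides = list(subject_id = "factor"))
```

Here inference alone labels `subject_id` as an integer, and the override is what turns it into a factor, which is the case where user input is needed.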
I propose we have more than two types. I think our types should follow the standard R data types:
- factors: categorical/nominal variables
- ordered factors: ordinal variables, useful for representing longitudinal variables and discretized continuous variables
- integer: whole-number values, treated as a continuous type
- numeric/double: continuous type
- character: text that does not need to be treated as a variable, mostly for display purposes
These types will naturally suggest how to display them. For example, factors can be displayed using "select" inputs and qualitative color palettes, while ordered factors may also use "select" inputs but be displayed with sequential color palettes.
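One way this mapping could be expressed is a small lookup function. The function name `display_hints` and the type labels are hypothetical, and the slider-plus-sequential-palette choice for integer and numeric types is my own assumption, since only the factor cases are specified above.

```r
# Hypothetical lookup: suggest an input widget and palette family per type.
display_hints <- function(type) {
  switch(type,
    factor    = list(input = "select", palette = "qualitative"),
    ordered   = list(input = "select", palette = "sequential"),
    integer   = ,  # falls through to the numeric case
    numeric   = list(input = "slider", palette = "sequential"),  # assumption
    character = list(input = "none", palette = "none"),
    stop("unknown covariate type: ", type)
  )
}

hints_factor  <- display_hints("factor")
hints_integer <- display_hints("integer")
```

Keeping the mapping in one place would let the plotting code ask for a hint by type instead of hard-coding "batch" and "condition".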
In addition, users should be able to indicate which covariates are "of interest". Perhaps there should be several categories, such as secondary/confounders, batch covariates, and random effects.