Clarification on --minexpr and --nan values #174

Open
JohnHadish opened this issue Jul 10, 2020 · 1 comment

@JohnHadish
Collaborator

JohnHadish commented Jul 10, 2020

It is not clear whether -Inf values should be handled with --nan (as missing values) or with --minexpr (as a minimum threshold). The documentation mentions -Inf in both places. It should state how each parameter affects the analysis.

From the Step 1: Import the GEM

... In the example above, the --nan argument indicates that the file uses "NA" to represent missing values. This value should be set to whatever indicates missing values. This could be "0.0", "-Inf", etc. and the GEM file has a header describing each column so the number ...

From the Step 2: Perform Correlation Analysis

... The --minexp argument is set to negative infinity (-inf) to indicate there is no limit on the minimum expression value. If we wanted to exclude samples whose log2 expression values dipped below 0.2, for instance, we could do so with this argument. ...

From the command-line documentation for kinc help run similarity, the value is described as "Floating Point", but the default looks like a string:

--minexpr <value>
Value Type: Floating Point
Minimum Value: -inf
Maximum Value: inf
Default Value: -inf
Minimum threshold for a sample to be included in a gene pair.

--maxexpr <value>
Value Type: Floating Point
Minimum Value: -inf
Maximum Value: inf
Default Value: inf
Maximum threshold for a sample to be included in a gene pair.
@bentsherman
Member

The notes on --nan are just saying that you can set any value to be parsed as a nan value, for example "NA" or "-inf" or "0.0", depending on your situation.
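
To illustrate the idea, here is a minimal Python sketch of a parser with a configurable missing-value token. It is not KINC's actual import code; `parse_cell` and `nan_token` are hypothetical names used only for this example.

```python
import math


def parse_cell(token: str, nan_token: str = "NA") -> float:
    """Hypothetical helper: map the missing-value marker to NaN, else parse a float."""
    if token == nan_token:
        return math.nan
    return float(token)


# With "NA" as the marker, "-Inf" is still read as an ordinary float (-infinity).
default_parse = [parse_cell(t) for t in ["1.5", "NA", "-Inf", "0.0"]]

# With "-Inf" as the marker, the same token is treated as a missing value instead.
inf_as_missing = [parse_cell(t, nan_token="-Inf") for t in ["1.5", "-Inf", "0.0"]]
```

In this sketch the same "-Inf" token ends up either as a real value or as a missing value, depending only on what the marker is set to.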

As for --minexpr and --maxexpr, the IEEE floating point standard has special values reserved for infinity and NaN, so "-inf" and "inf" can be parsed as valid floating point values. I hope that clears up your questions.
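
As a rough illustration of how infinite thresholds behave, here is a Python sketch of a per-sample filter. It assumes a sample is dropped when either value is NaN or falls strictly outside the range; whether KINC compares strictly or not is an assumption here, and `keep_sample` is a hypothetical name, not a KINC function.

```python
import math

# "-inf" and "inf" parse as ordinary IEEE floating point values.
min_expr = float("-inf")
max_expr = float("inf")


def keep_sample(x: float, y: float, lo: float, hi: float) -> bool:
    """Hypothetical filter: keep a sample only if both values are present and in [lo, hi]."""
    for v in (x, y):
        if math.isnan(v) or v < lo or v > hi:
            return False
    return True


# With the default thresholds nothing is excluded by range, not even -inf itself.
kept_default = keep_sample(-math.inf, 2.0, min_expr, max_expr)

# Raising the minimum to 0.2 excludes a sample where either gene dips below it.
kept_thresholded = keep_sample(0.1, 2.0, 0.2, max_expr)

# A NaN (missing) value is excluded regardless of the thresholds.
kept_missing = keep_sample(math.nan, 2.0, min_expr, max_expr)
```

Under these assumptions, a -Inf that means "missing" would need to be declared through --nan, since the default --minexpr of -inf would not filter it out.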
