-
Notifications
You must be signed in to change notification settings - Fork 587
-
Notifications
You must be signed in to change notification settings - Fork 587
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Using the native PairHMM in HaplotypeCaller can change the QUAL field vs. using the java LOGLESS_CACHING PairHMM #1572
Comments
@gspowley any thoughts on this? |
To reproduce, check out the
Then: |
@droazen Yes, this is because the native PairHMM is using single precision floating point and Flush To Zero (FTZ), while the Java PairHMM is using double precision and not using FTZ. I planned to address this when we integrate native PairHMM into HaplotypeCaller. It looks like the time is here. For now, you can configure native PairHMM to use double precision and not use FTZ. With the diff below, the VCFs from native and Java PairHMM are exactly the same. In the future, we need to enable FTZ in the Java PairHMM and provide the option to use single precision or double precision in native PairHMM. --- i/src/main/cpp/VectorLoglessPairHMM/LoadTimeInitializer.cc
+++ w/src/main/cpp/VectorLoglessPairHMM/LoadTimeInitializer.cc
@@ -23,7 +23,7 @@ LoadTimeInitializer::LoadTimeInitializer() //will be called when library is loa
//Very important to get good performance on Intel processors
//Function: enabling FTZ converts denormals to 0 in hardware
//Denormals cause microcode to insert uops into the core causing big slowdown
- _MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON);
+ // _MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON);
//Profiling: times for compute and transfer (either bytes copied or pointers copied)
m_compute_time = 0;
diff --git i/src/main/cpp/VectorLoglessPairHMM/org_broadinstitute_hellbender_utils_pairhmm_VectorLoglessPairHMM.cc w/src/main/cpp/VectorLoglessPairH
index f45153e..70cf54f 100644
--- i/src/main/cpp/VectorLoglessPairHMM/org_broadinstitute_hellbender_utils_pairhmm_VectorLoglessPairHMM.cc
+++ w/src/main/cpp/VectorLoglessPairHMM/org_broadinstitute_hellbender_utils_pairhmm_VectorLoglessPairHMM.cc
@@ -6,7 +6,7 @@
using namespace std;
-bool use_double = false;
+bool use_double = true;
//Should be called only once for the whole Java process - initializes field ids for the classes JNIReadDataHolderClass
//and JNIHaplotypeDataHolderClass
diff --git i/src/main/java/org/broadinstitute/hellbender/utils/pairhmm/VectorLoglessPairHMM.java w/src/main/java/org/broadinstitute/hellbender/utils
index 18370b0..74a1dad 100644
--- i/src/main/java/org/broadinstitute/hellbender/utils/pairhmm/VectorLoglessPairHMM.java
+++ w/src/main/java/org/broadinstitute/hellbender/utils/pairhmm/VectorLoglessPairHMM.java
@@ -1,6 +1,7 @@
package org.broadinstitute.hellbender.utils.pairhmm;
-import org.apache.log4j.Logger;
+import org.apache.logging.log4j.LogManager;
+import org.apache.logging.log4j.Logger;
import org.broadinstitute.hellbender.exceptions.UserException;
import org.broadinstitute.hellbender.utils.genotyper.ReadLikelihoods;
import org.broadinstitute.hellbender.utils.genotyper.LikelihoodMatrix;
@@ -20,7 +21,7 @@ import java.util.Map;
*/
public final class VectorLoglessPairHMM extends LoglessPairHMM {
- final static Logger logger = Logger.getLogger(VectorLoglessPairHMM.class);
+ private static final Logger logger = LogManager.getLogger(VectorLoglessPairHMM.class);
final static Boolean runningOnMac = System.getProperty("os.name", "unknown").toLowerCase().startsWith("mac");
long threadLocalSetupTimeDiff = 0;
long pairHMMSetupTime = 0; |
@gspowley Excellent, thanks for the detailed reply! I didn't realize that the I'll leave the native code as-is in my HC branch for now and let you address the issue as you see fit in a separate branch. |
Solution is to set FTZ in main(), as discussed in #1771 |
@droazen Would it make sense to make the setting of FTZ a CLI argument? For example something like |
@erniebrau Yes, that would certainly make sense. |
OK. I will submit a PR with this as soon as we release a version of GKL with the necessary FTZ functionality. |
@droazen I cannot reproduce this issue. I've tried doing what you say in your comment above but the results of running Java and (FTZ-enabled) AVX versions of HC are identical. I also tried running analogous experiments using several different BAM files in the repo but, again, got identical results when comparing Java vs AVX. I need a pair of files whose results are different between the implementations, so that I can confirm that enabling FTZ in the Java PairHMM fixes this issue. Any suggestions as to where I can find such files? |
Was the |
@sooheelee No, I don't believe that this option was ever implemented. |
Right, the code to change the flush to zero flag is available in GKL, but the option to set the flush to zero flag from GATK was not implemented. |
I have a researcher interested in understanding the WARN:
at https://gatkforums.broadinstitute.org/gatk/discussion/11433/haplotypecaller-warnings-depthpersamplehc#latest. I'd like to be able to enumerate more on the matter. I see these parameters for HaplotypeCaller that seem to be related (please correct me if I am mistaken):
and
where the former is |
@sooheelee "Flush to zero" or FTZ is a CPU flag that causes extremely small floating-point values to be treated as 0. Enabling this flag improves performance and consistency across PairHMM implementations, without causing precision loss of any significance. |
@droazen do we still care about this? Given that we've been moving to non-exact match integration tests and fuzzy comparisons in production, maybe this isn't worth the effort. |
Using the native PairHMM in the current version of the HaplotypeCaller can sometimes change the QUAL field in called variants. Eg.,
The text was updated successfully, but these errors were encountered: