Error when using a high number of threads for shapeit #48
Comments
Please have a look at SHAPEIT's log file for chromosome 1, since this looks like a SHAPEIT issue. |
In the files
The other log files By the way, I also got the error on a virtual machine with only 4 cores, so it seems to be independent of the number of available cores. |
This is a SHAPEIT issue. I manually tried SHAPEIT with 40 threads.
./bin/shapeit \
    --thread 40 \
    -B genipe/chr1/chr1.final \
    -M /home/lemieuxl/genipe_tutorial/1000GP_Phase3/genetic_map_chr1_combined_b37.txt \
    -O genipe/chr1/chr1.final.phased \
    -L genipe/chr1/chr1.final.phased.log
I got the following error in the console.
There is no usable information in the log file ( According to SHAPEIT's documentation:
Note that the dataset used in the tutorial only has 90 samples. |
I now used genipe with a real dataset (2562 instances, about 840 thousand SNPs) and got the error lines (with other values) also for
Please note the INFO messages among the ERROR messages. Could genipe be using an unfavorable parametrization for IMPUTE2? Could there be two independent problems (one that terminated genipe and one that produced the error messages in the log file)? Does anyone know of a workaround that would let me use genipe for our real dataset? |
genipe executes tasks in parallel (according to the To investigate why IMPUTE2 failed, you need to have a look at the corresponding log file. Could it be a memory issue? |
You said that a rerun of genipe will redo failed tasks. That is a very good general workaround (maybe you can add it to the documentation): "If errors were reported, just try rerunning genipe." I thought that every rerun would give the same errors, but that does not seem to be the case (see below).
You asked whether there could be a memory issue (very good idea), and the answer is yes. Directly after starting, most of the impute2 processes use less than 1 GB of memory, but after some time they use about 12 GB each, and the system does not have 240 GB (20 threads times 12 GB) of memory. I will therefore try running it with a smaller number of threads (just started it).
You pointed me to the log files, but they contain no special messages apart from those I cited above. But for every finished impute2 task there are 5 files ( Now I understand that there were three problems:
Suggestion: Add something like the following to the description of the option |
I was successful in running genipe for our real dataset (the test was done only for chromosome 1). The problem was that the number of impute2 processes was too large for the available memory (problem 1 in the preceding comment). Problem 2 in the preceding comment is still unsolved (if it really is a problem). Should I open a new issue for that? |
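The memory arithmetic in this thread (about 12 GB per impute2 process, so 20 parallel processes would need roughly 240 GB) can be turned into a quick check before choosing how many processes to run in parallel. The following is only a sketch: the function name max_parallel is hypothetical and not part of genipe, and the 12 GB figure is the value observed for this particular dataset, so it should be measured again for other data.

```shell
# Hypothetical helper (not part of genipe): estimate how many processes
# fit in memory, given total memory and per-process memory in whole GB.
max_parallel() {
    total_gb=$1      # total memory available, in GB
    per_proc_gb=$2   # peak memory of one process, in GB (assumed known)
    n=$(( total_gb / per_proc_gb ))
    # Always allow at least one process, even on a small machine.
    if [ "$n" -lt 1 ]; then
        n=1
    fi
    echo "$n"
}

# Example: a machine with 64 GB and processes peaking at about 12 GB.
max_parallel 64 12
```

On Linux, the total memory in GB can be read with awk '/MemTotal/ {print int($2/1024/1024)}' /proc/meminfo and passed as the first argument. Leaving some headroom below the printed value is probably wise, since the estimate ignores memory used by other programs.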
I have the same issue with genipe-launcher: error: the following task did not work: ['SHAPEIT phase chr1']
My error message looks like this:
ERROR: Reference and Main panels are not well aligned:
Have you solved this issue? Thank you. |
Does this error message come from If it comes from the former, this was referenced in the issue #50 (comment). To get the log information for the phasing step of chromosome 1, make sure to look at the content of the file |
The error message of ERROR: Reference and Main panels are not well aligned:
from I checked Parameters :
Reading site list in [genipe/chr1/chr1.final.bim]
Reading sample list in [genipe/chr1/chr1.final.fam]
Reading genotypes in [genipe/chr1/chr1.final.bed]
Reading genetic map in [/data1/home/jungj7/genipe_tutorial/1000GP_Phase3/genetic_map_chr1_combined_b37.txt]
Checking missingness and MAF...
Building graphs ...
Thank you. |
So the first error you see in For the phasing issue, I'm guessing a memory issue, as the graphs are not building... What you can do is manually execute shapeit and see what happens. To do so, just use the shapeit binary and add the options used for the analysis (second line of the file It should look something like this. Make sure to be in the same directory as the one where you ran genipe.
"$PATH_TO_BINARY"/shapeit \
    --thread 1 \
    -B genipe/chr1/chr1.final \
    -M "$PATH_TO_GENETIC_MAP"/genetic_map_chr1_combined_b37.txt \
    -O genipe/chr1/chr1.final.phased \
    -L genipe/chr1/chr1.final.phased.log |
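Before launching shapeit by hand as suggested above, it may help to confirm that the three PLINK input files behind the -B prefix (the .bed, .bim and .fam files listed in the SHAPEIT log) exist and are not empty in the current directory. This is a minimal sketch; check_plink_prefix is a hypothetical helper name and is not part of genipe or SHAPEIT.

```shell
# Hypothetical helper: verify that PREFIX.bed, PREFIX.bim and PREFIX.fam
# all exist and are non-empty. Returns 0 if all are present, 1 otherwise.
check_plink_prefix() {
    prefix=$1
    status=0
    for ext in bed bim fam; do
        if [ ! -s "$prefix.$ext" ]; then
            echo "missing or empty: $prefix.$ext" >&2
            status=1
        fi
    done
    return "$status"
}
```

A call such as check_plink_prefix genipe/chr1/chr1.final, run from the directory where genipe was started, reports any missing file on standard error before time is spent on phasing.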
I ran shapeit manually. Now it is building graphs with [859/4321]. I was wondering, do I have to do the phasing for each chromosome manually? Thank you. |
I attached the current state of my shapeit run. Why does it take so long to phase only chr1? Is there any other way to do the phasing for each chromosome? |
The phasing should be done by genipe. I asked you to run it manually for debugging purposes. Now we know that shapeit can phase chromosome 1. Using genipe, how many chromosomes were you phasing at the same time? This would be the Also, calling genipe again will redo the failed tasks and continue where it left off. |
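Since calling genipe again redoes the failed tasks and continues where it left off, the rerun can be wrapped in a small retry loop. The sketch below is not from genipe itself: retry is a hypothetical helper, and it assumes the command being rerun exits with a non-zero status when a task fails, which should be verified for the script in use.

```shell
# Hypothetical helper: run a command up to MAX_ATTEMPTS times and stop
# at the first success. Usage: retry MAX_ATTEMPTS command [args...]
retry() {
    max_attempts=$1
    shift
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        # "$@" is the command to run; success ends the loop immediately.
        if "$@"; then
            return 0
        fi
        echo "attempt $attempt of $max_attempts failed" >&2
        attempt=$(( attempt + 1 ))
    done
    return 1
}
```

For instance, retry 3 ./execute.sh run inside genipe_tutorial would launch the tutorial script up to three times. If the failures come from insufficient memory, lowering the number of parallel processes first is more likely to help than retrying with the same settings.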
I used I'm not sure why my computer cannot run with only 1 thread. Do I need to use |
Use 1 in both cases, and rerun genipe, see what it does. |
It looks like it's running now. I don't know how many threads you were using at first, but I'm pretty sure it was a memory issue... How much memory does your computer have? If you already have imputed data (from IMPUTE2), you can use SKAT directly on those. Otherwise, you need to let genipe finish the imputation process. |
When I run shapeit using a VCF file consisting of 30 peanut samples, some chromosomes run successfully while others do not. The error message is "Assertion `conditional_index[segment].size() >= 2' failed.".
|
I installed genipe on a CentOS system with 20 physical cores and 40 logical cores. I tested genipe as described in the documentation at http://pgxcentre.github.io/genipe/installation.html and executed genipe_tutorial. I added the option --shapeit-thread to the generated script and was able to execute it. But after I increased the value of that parameter, I got an error without an informative message about the reason.
In detail, I did the following:
In genipe_tutorial/execute.sh I made the following changes:
I replaced --chrom autosomes by --chrom 1 (to impute only the SNPs on the first chromosome for a test).
Next to the line with --thread I added a line with --shapeit-thread 20 \
Then I started the imputation:
The imputation was successful.
Then I changed the value of --shapeit-thread in genipe_tutorial/execute.sh from 20 to 40, removed the generated directory and started the imputation again:
I got the following messages: