Error when using a high number of threads for shapeit #48

Closed
dlippold opened this issue Nov 15, 2017 · 19 comments

@dlippold

I installed genipe on a CentOS system with 20 physical cores and 40 logical cores. I tested genipe as described in the documentation at http://pgxcentre.github.io/genipe/installation.html and executed genipe-tutorial. I added the option --shapeit-thread to the generated script and was able to execute it. But after I increased the value of that parameter, I got an error without an informative message about the reason.

In detail I did the following:

  • Installed genipe and tested the installation
  • Executed the following commands

cd
wget http://statgen.org/wp-content/uploads/Softwares/genipe/supp_files/hg19.tar.bz2
wget https://mathgen.stats.ox.ac.uk/impute/1000GP_Phase3.tgz

mkdir $HOME/genipe_tutorial

mkdir $HOME/genipe_tutorial/hg19
cd $HOME/genipe_tutorial/hg19
tar -jxf $HOME/hg19.tar.bz2

cd $HOME/genipe_tutorial
tar -zxf $HOME/1000GP_Phase3.tgz
touch 1000GP_Phase3/genipe_tut_done

cd
source genipe_pyvenv/bin/activate
genipe-tutorial
deactivate

In genipe_tutorial/execute.sh I made the following changes:

  • Replaced --chrom autosomes with --chrom 1 (to impute only the SNPs on the first chromosome as a test)
  • After the line with --thread I added a line with --shapeit-thread 20 \
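
A minimal sketch of how these two edits could be scripted rather than made by hand. It assumes that execute.sh lists one option per line, each ending in a backslash, and that the --thread line is not the last option; the function name patch_execute_sh is made up for this example.

```shell
# Sketch: print a patched copy of execute.sh on standard output.
# Assumption: one option per line, each line ending in a backslash.
patch_execute_sh() {
    in_file=$1      # path to the generated execute.sh
    n_threads=$2    # value to use for --shapeit-thread
    awk -v n="$n_threads" '
        # Restrict the run to chromosome 1.
        { sub(/--chrom autosomes/, "--chrom 1") }
        { print }
        # Add --shapeit-thread right after the standalone --thread option.
        /(^|[ \t])--thread[ \t]/ { print "    --shapeit-thread " n " \\" }
    ' "$in_file"
}
```

It could be used as `patch_execute_sh genipe_tutorial/execute.sh 20 > execute_chr1.sh`, which leaves the original script untouched.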

Then I started the imputation:

source genipe_pyvenv/bin/activate
genipe_tutorial/execute.sh
deactivate

The imputation was successful.

Then I changed the value of --shapeit-thread in genipe_tutorial/execute.sh from 20 to 40, removed the generated directory and started the imputation again:

rm -r genipe_tutorial/genipe/
source genipe_pyvenv/bin/activate
genipe_tutorial/execute.sh

I got the following messages:

[... INFO] Phasing markers
[... ERROR] Task 'SHAPEIT phase chr1': did not finish...
[... ERROR] the following task did not work: ['SHAPEIT phase chr1']
usage: genipe-launcher [-h] [-v] [--debug] [--thread THREAD] --bfile PREFIX
                       [--reference FILE] [--chrom CHROM [CHROM ...]]
                       [--output-dir DIR] [--bgzip] [--use-drmaa]
                       [--drmaa-config FILE] [--preamble FILE]
                       [--shapeit-bin BINARY] [--shapeit-thread INT]
                       [--shapeit-extra OPTIONS] [--plink-bin BINARY]
                       [--hap-template TEMPLATE] [--legend-template TEMPLATE]
                       [--map-template TEMPLATE] --sample-file FILE
                       [--hap-nonPAR FILE] [--hap-PAR1 FILE] [--hap-PAR2 FILE]
                       [--legend-nonPAR FILE] [--legend-PAR1 FILE]
                       [--legend-PAR2 FILE] [--map-nonPAR FILE]
                       [--map-PAR1 FILE] [--map-PAR2 FILE]
                       [--impute2-bin BINARY] [--segment-length BP]
                       [--filtering-rules RULE [RULE ...]]
                       [--impute2-extra OPTIONS] [--probability FLOAT]
                       [--completion FLOAT] [--info FLOAT]
                       [--report-number NB] [--report-title TITLE]
                       [--report-author AUTHOR]
                       [--report-background BACKGROUND]
genipe-launcher: error: the following task did not work: ['SHAPEIT phase chr1']
@lemieuxl
Member

Please have a look at SHAPEIT's log file for chromosome 1, since this looks like a SHAPEIT issue.

@dlippold
Author

dlippold commented Nov 15, 2017

In the files genipe_tutorial/genipe/chr1/chr1.alignments.log and genipe_tutorial/genipe/chr1/chr1.to_exclude.alignments.log I found the following lines at the end:

Reading SNPs in [.../genipe_tutorial/1000GP_Phase3/1000GP_Phase3_chr1.legend.gz]
  * 149343 reference panel sites included
  * 6351015 reference panel sites excluded

ERROR: Reference and Main panels are not well aligned:
  * #Missing sites in reference panel = 443
  * #Misaligned sites between panels = 56
  * #Multiple alignments between panels = 0

The other log files genipe_tutorial/genipe/chr1/*.log seem to contain no remarkable entries.

By the way, I also got the error on a virtual machine with only 4 cores, so it seems to be independent of the number of available cores.

@lemieuxl
Member

This is a SHAPEIT issue. I manually tried SHAPEIT with 40 threads.

./bin/shapeit \
    --thread 40 \
    -B genipe/chr1/chr1.final \
    -M /home/lemieuxl/genipe_tutorial/1000GP_Phase3/genetic_map_chr1_combined_b37.txt \
    -O genipe/chr1/chr1.final.phased \
    -L genipe/chr1/chr1.final.phased.log

I got the following error in the console.

shapeit: src/modes/phaser/phaser_algorithm.cpp:150: void phaser::phaseSegment(int): Assertion `conditional_index[segment].size() >= 2' failed.
Aborted (core dumped)

There is no usable information in the log file (chr1.final.phased.log) because of the failure.

According to SHAPEIT's documentation:

This option is recommended only if you have a large number of individuals in your dataset.

Note that the dataset used in the tutorial only has 90 samples.
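
One rough way to see how thinly the samples are spread across threads is to count the lines of the PLINK .fam file (one line per sample) and divide by the requested thread count. This is only a sketch: the function name is made up, and no safe ratio is documented in this thread, so it reports the number without judging it.

```shell
# Sketch: integer number of samples available per SHAPEIT thread.
# A PLINK .fam file has one line per sample.
samples_per_thread() {
    fam_file=$1     # e.g. genipe/chr1/chr1.final.fam
    n_threads=$2    # value intended for --shapeit-thread
    awk -v t="$n_threads" 'END { print int(NR / t) }' "$fam_file"
}
```

Assuming all 90 tutorial samples are in the .fam file, this gives 4 for 20 threads (which ran) and 2 for 40 threads (which hit the assertion).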

@dlippold
Author

dlippold commented Nov 16, 2017

I have now used genipe with a real dataset (2562 instances, about 840 thousand SNPs) and got the same error lines (with other values) with 20 threads, and even without the options --shapeit-thread and --thread at all. In that case (with the value 20; I didn't get a result without that option because of the long running time) genipe didn't stop when the error was written to the log file, but seemed to produce some results. After it ended (after some hours, for chromosome 1 only) I got error messages like the following:

[... WARNING] impute2_chr1_135000001_140000000: there are no SNPs in the imputation interval
[... ERROR] Task 'IMPUTE2 chr1 from 1 to 5000000': did not finish...
[... ERROR] Task 'IMPUTE2 chr1 from 5000001 to 10000000': did not finish...
[... INFO] Task 'IMPUTE2 chr1 from 10000001 to 15000000': performed in 16,726 seconds
[... ERROR] Task 'IMPUTE2 chr1 from 15000001 to 20000000': did not finish...
...
[... INFO] Task 'IMPUTE2 chr1 from 245000001 to 250000000': performed in 7,475 seconds
[... ERROR] the following task did not work: ['IMPUTE2 chr1 from 1 to 5000000', 'IMPUTE2 chr1 from 5000001 to 10000000', ...]
usage: genipe-launcher [-h] [-v] [--debug] [--thread THREAD] --bfile PREFIX
...

Please notice the INFO messages among the ERROR messages.

Could it be that genipe uses an unfavorable parametrization for IMPUTE2? Could there be two independent problems (one that terminated genipe and one that produced the error messages in the log file)?

Does anyone know a workaround that would let me use genipe for our real dataset?

@lemieuxl
Member

lemieuxl commented Nov 16, 2017

genipe executes tasks in parallel (according to the --thread option). It waits until all the tasks of a specific step (in your case, imputation) have run before it stops because of an error. This explains the error messages among the information messages. If you rerun genipe, it will only redo the failed or incomplete tasks.
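
Because a rerun only redoes the failed or incomplete tasks, the rerun can be wrapped in a small retry loop. The sketch below is a generic helper (its name is made up) and not part of genipe. A failure with a persistent cause, such as too little memory, will simply repeat until the settings change.

```shell
# Sketch: rerun a command until it succeeds or the attempts run out.
rerun_until_ok() {
    max_attempts=$1
    shift                       # the remaining arguments are the command
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        if "$@"; then
            echo "succeeded on attempt $attempt"
            return 0
        fi
        echo "attempt $attempt failed" >&2
        attempt=$(( attempt + 1 ))
    done
    return 1
}
```

With the tutorial layout this could be called as `rerun_until_ok 3 genipe_tutorial/execute.sh`.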

To investigate why IMPUTE2 failed, you need to have a look at the corresponding log file. Could it be a memory issue?

@dlippold
Author

dlippold commented Nov 16, 2017

You said that a rerun of genipe redoes the failed tasks. That is a very good general workaround (maybe you can add it to the documentation): "If errors were reported, just try to rerun genipe." I thought that every rerun would give the same errors, but that seems not to be the case (see below).

You asked if there could be a memory issue (very good idea), and the answer is: yes. Directly after the impute2 processes start, most of them use less than one GB of memory. But after some time they use about 12 GB of memory each, and the system does not have 240 GB (20 threads times 12 GB) of memory. Therefore I will try to run it with a smaller number of threads (I just started it).
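
The arithmetic above can be turned around to estimate how many impute2 processes fit into the available memory. This is a sketch with made-up function names: the 12 GB per process is the figure observed here and will differ for other datasets, and reading /proc/meminfo works on Linux only.

```shell
# Sketch: how many processes of a given size fit into memory (at least 1).
max_parallel_tasks() {
    total_gb=$1       # memory available on the machine, in GB
    per_task_gb=$2    # observed peak memory of one impute2 process, in GB
    n=$(( total_gb / per_task_gb ))
    if [ "$n" -lt 1 ]; then
        n=1
    fi
    echo "$n"
}

# Linux only: total physical memory in whole GB (MemTotal is given in kB).
total_memory_gb() {
    awk '/^MemTotal:/ { print int($2 / 1048576) }' /proc/meminfo
}
```

For example, `max_parallel_tasks 64 12` prints 5, so on a 64 GB machine a --thread value of at most 5 would be a cautious starting point under these assumptions.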

You pointed me to the log files, but there are no special messages in them apart from those I cited above. For every finished impute2 task there are 5 files (*.impute2, *.impute2_info, *.impute2_info_by_sample, *.impute2_warnings, *.impute2_summary), and I noticed that none of these files exist for the tasks for which there was an error message (the tasks which did not finish).
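
The five expected files make it easy to check which segments did not finish. In the sketch below the function name and the prefix argument are illustrative; the actual output paths that genipe uses are not shown in this thread.

```shell
# Sketch: print every expected impute2 output file that is missing
# for one segment, given the common file prefix of that segment.
missing_impute2_outputs() {
    prefix=$1
    for suffix in impute2 impute2_info impute2_info_by_sample \
                  impute2_warnings impute2_summary; do
        [ -e "$prefix.$suffix" ] || echo "$prefix.$suffix"
    done
}
```

An empty output means that all five files exist for that prefix.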

Now I understand that there were three problems:

  1. The main problem for the processing of our real dataset seems to be too little memory for the specified number of impute2 threads/processes. With a smaller number of impute2 threads it should work. Maybe you can add a warning to the documentation about using too many threads relative to the available memory.

  2. The other problem which results in the error messages in the log files cited above comes from shapeit. I get that for the test dataset even with --shapeit-thread 1. But I don't know if it is critical. At least it does not stop genipe.

  3. And there is another serious problem with shapeit. When the value for --shapeit-thread is too large (probably with respect to the number of samples in the dataset) shapeit and genipe stop working (in a rerun genipe stops after a few seconds).

Suggestion: Add something like the following to the description of the option --thread: "That is the number of impute2 processes which are started in parallel."

@dlippold
Author

dlippold commented Nov 17, 2017

I was successful in running genipe for our real dataset (the test was done only for chromosome 1). The problem was the too large number of impute2 processes for the available memory (problem 1 in the preceding comment).

Problem 2 in the preceding comment is still unsolved (if it really is a problem). Should I open a new issue for that?

@jystatistics

Hi, @dlippold @lemieuxl

I have the same issue with

genipe-launcher: error: the following task did not work: ['SHAPEIT phase chr1']

My error message is like this.

ERROR: Reference and Main panels are not well aligned:

  • #Missing sites in reference panel = 2994
  • #Misaligned sites between panels = 525
  • #Multiple alignments between panels = 0

Have you solved this issue?

Thank you.

@lemieuxl
Member

Does this error message come from chr1.alignments*.log or chr1.final.phased.log?

If it comes from the former, this was referenced in the issue #50 (comment).

To get the log information for the phasing step of chromosome 1, make sure to look at the content of the file chr1.final.phased.log.
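
To tell which log an error comes from, the ERROR lines of all logs in a chromosome directory can be listed together with their file names. A minimal sketch, assuming the logs end in .log as in this thread; the function name is made up.

```shell
# Sketch: list every line containing ERROR in the *.log files of a directory,
# prefixed with file name and line number.
show_log_errors() {
    log_dir=$1      # e.g. genipe/chr1
    for log in "$log_dir"/*.log; do
        [ -f "$log" ] || continue
        # -H prints the file name, -n the line number of each match;
        # "|| true" keeps a log without matches from counting as a failure.
        grep -Hn 'ERROR' "$log" || true
    done
}
```

A phasing log such as chr1.final.phased.log that stops without any ERROR line, as with the assertion failure above, produces no output here, so its last lines still need to be read directly.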

@jystatistics

@lemieuxl

The error message

ERROR: Reference and Main panels are not well aligned:

  • #Missing sites in reference panel = 2994
  • #Misaligned sites between panels = 525
  • #Multiple alignments between panels = 0

comes from chr1.alignments*.log.

I checked chr1.final.phased.log; it shows the following:

Parameters :

  • Seed : 1574885865
  • Parallelisation: 1 threads
  • Ref allele is NOT aligned on the reference genome
  • MCMC: 35 iterations [7 B + 1 runs of 8 P + 20 M]
  • Model: 100 states per window [100 H + 0 PM + 0 R + 0 COV ] / Windows of ~2.0 Mb / Ne = 15000

Reading site list in [genipe/chr1/chr1.final.bim]

  • 72809 sites included

Reading sample list in [genipe/chr1/chr1.final.fam]

  • 4321 samples included
  • 4321 unrelateds / 0 duos / 0 trios in 4321 different families

Reading genotypes in [genipe/chr1/chr1.final.bed]

  • Plink binary file SNP-major mode

Reading genetic map in [/data1/home/jungj7/genipe_tutorial/1000GP_Phase3/genetic_map_chr1_combined_b37.txt]

  • 248796 genetic positions found
  • #set=59644 / #interpolated=13165
  • Physical map [0.09 Mb -> 249.21 Mb] / Genetic map [0.00 cM -> 293.39 cM]

Checking missingness and MAF...

  • 0 individuals with high rates of missing data (>5%)
  • 0 SNPs with high rates of missing data (>5%)
  • 2637 monomorphic SNPs
  • 10874 missing genotypes automatically imputed at monomorphic SNPs
  • 494 singletons SNPs

Building graphs ...

Thank you.

@lemieuxl
Member

So the first error you see in chr1.alignments.log is normal, as genipe is trying to fix the strand misalignment between your dataset and the reference panel (as per issue #50 (comment)).

For the phasing issue, I'm guessing a memory issue, as the graphs are not building... What you can do is manually execute shapeit and see what happens. To do so, just use the shapeit binary and add the options used for the analysis (second line of the file chr1.final.phased.log).

It should look something like this. Make sure you are in the same directory as the one from which you ran genipe.

"$PATH_TO_BINARY"/shapeit \
    --thread 1 \
    -B genipe/chr1/chr1.final \
    -M "$PATH_TO_GENETIC_MAP"/genetic_map_chr1_combined_b37.txt \
    -O genipe/chr1/chr1.final.phased \
    -L genipe/chr1/chr1.final.phased.log

@jystatistics

I ran shapeit manually.
It seems to be working now.

Now, it is building graphs with [859/4321].

I was wondering, do I have to do the phasing manually for each chromosome?

Thank you.

@jystatistics

@lemieuxl

I attached the current situation of my shapeit.

Why does it take so long to phase only chr1?

Is there any other way we can do phasing for each chr?

[Image: screenshot of the current SHAPEIT run]

@lemieuxl
Member

The phasing should be done by genipe. I asked you to run it manually for debugging purposes. Now we know that shapeit can phase chromosome 1.

Using genipe, how many chromosomes were you phasing at the same time? This would be the --thread option. If this value is too high and your computer doesn't have enough memory, it could explain why the task failed.

Also, rerunning genipe will redo the failed tasks and continue where it left off.

@jystatistics

I used --thread 1.

I'm not sure why my computer cannot run it with only 1 thread. Do I need to use --shapeit-thread?

@lemieuxl
Member

Use 1 in both cases, and rerun genipe, see what it does.

@jystatistics

jystatistics commented Nov 28, 2019

I used 1 in both cases, but it is still running in genipe.

Do I have to use shapeit? My goal is to use SKAT from the final imputed data set in genipe.

Thank you.
[Image: attached screenshot]

@lemieuxl
Member

It looks like it's running now. I don't know how many threads you were using at first, but I'm pretty sure it was a memory issue... How much memory does your computer have?

If you already have imputed data (from IMPUTE2), you can use SKAT directly on those. Otherwise, you need to let genipe finish the imputation process.

@JiaoBingke

JiaoBingke commented Jun 16, 2021

When I ran shapeit on a VCF file of 30 peanut samples, some chromosomes ran successfully while others did not. The error message was "Assertion `conditional_index[segment].size() >= 2' failed.".
I solved this problem in three steps after reading through the mailing list archive at https://www.jiscmail.ac.uk/cgi-bin/wa-jisc.exe?A2=ind1503&L=OXSTATGEN&P=R10095. I found the list through the shapeit instruction page ("To ask a question about SHAPEIT please subscribe to the OXSTATGEN mailing list"): https://mathgen.stats.ox.ac.uk/genetics_software/shapeit/shapeit.html#Contact.

  1. Included more SNPs in the VCF file. At first I filtered my raw VCF file using --max-missing 1. Because one sample had a missing rate above 38%, the number of SNPs shrank to half of that in the raw VCF. I therefore removed this high-missingness sample and used --max-missing 0.95, which increased the number of SNPs.
  2. Set -T 1 in shapeit, as I only had 30 samples in the VCF file.
  3. Set --window 10 in shapeit. I guessed that the SNP density of my peanut samples was lower than that of human data, so I increased the window from the default value of 2 to 5, and then to 10. The value 10 finally succeeded.
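
The trial and error in step 3 can be sketched as a loop that appends --window with increasing values to an otherwise complete shapeit command and stops at the first value that works. The helper name try_windows is made up; the remaining shapeit arguments are whatever the manual run already uses and are not spelled out here.

```shell
# Sketch: try increasing --window values until the command succeeds.
# Usage: try_windows "2 5 10" shapeit <the other shapeit arguments>
try_windows() {
    windows=$1      # space-separated window sizes to try, in order
    shift           # the remaining arguments are the command to run
    for w in $windows; do
        if "$@" --window "$w"; then
            # Printed after the command's own output.
            echo "$w"
            return 0
        fi
    done
    return 1
}
```

The last line printed is the first window size that succeeded; a non-zero exit status means that none of the listed sizes worked.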
