Minimac4 -- 'segv' / 'Segmentation Fault (core dumped)' Error #8
Hi. We recently fixed a bug that caused a segmentation fault, and Minimac4 has been updated to version 1.0.1. Could you please check whether the software you are using is up to date? Thanks!
Hello! Thanks for the quick response. We have updated to version 1.0.1, but we are still observing the following error:
".../1532025929.2118.shell: line 34: 16770 Aborted (core dumped) $MINIMAC4_EXEC --refHaps $REF_HAPLOTYPES --haps $SAMPLE_HAPLOTYPES --format GT,DS,GP --passOnly -- allTypedSites --chr $CHR --start $CHUNK_START --end $CHUNK_END --window 100000 --prefix chr11.02.03 --log"
We are attempting to run this on a cohort of more than 300,000 individuals on a single 1 Mb chunk with a +/- 100 kb window. Do you have any advice for circumventing the error?
As a quick test, would it be possible to run the test examples that came with the Minimac3 package (link provided below)?
I apologize that we removed the test cases in Minimac4. We will fix that soon. Until then, please try with the M3 test cases and let us know if it still seg faults.
https://github.com/Santy-8128/Minimac3/tree/master/test
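A rough sketch of such a sanity check is below; the file names under test/ are placeholders (check what the directory actually contains), and the flags simply mirror the ones already used in this thread:
```bash
# Grab the Minimac3 test data and run Minimac4 on it as a quick sanity check.
# NOTE: the file names below are placeholders -- use whatever is in the test/ folder.
git clone https://github.com/Santy-8128/Minimac3.git
cd Minimac3/test

minimac4 --refHaps referencePanel.m3vcf \
         --haps targetStudy.vcf \
         --format GT,DS,GP \
         --prefix m3_test_run \
         --log
```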
*Regards,*
Sayantan Das,
Hi Sayantan,
Thank you so much for your responses. We are now able to successfully get Minimac4 to run on a 5 megabase chunk (+/- 1 Mb window) for up to 300,000 individuals. When we exceed that and try to run the imputation on ~450,000 individuals, we observe a core dump error. In addition to alerting you to this error, I wanted to ask whether there are any scientific consequences to breaking the imputation input into two separate groups. Do the samples (not the reference panel) influence one another's imputation?
Best, Scott
Hi Scott,
My guess would be that you may be running out of memory.
No, samples don't influence each other's imputation. The only advantage to doing them together would be that you would get the whole R-square estimate for each variant (across all samples). If you split it into two groups, your imputed results would be the same, but you would end up with two R-square values for each variant (one per group). In such a case, you would need to calculate the omnibus R-square (across all samples) from the individual batch-specific R-squares (which shouldn't be difficult to derive if you know the formula for the minimac R-square).
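For what it's worth, here is one way to do that pooling, assuming the commonly quoted minimac definition R² = Var(dosage)/(p(1-p)) with p the mean estimated allele dosage; this is a sketch of the derivation, not an official formula:
```latex
% Batch k: n_k haplotypes, mean dosage \hat{p}_k, reported R^2_k, so its
% dosage variance is s_k^2 = R^2_k \,\hat{p}_k (1 - \hat{p}_k).
% Pool via E[d^2] - E[d]^2 with N = \sum_k n_k:
\[
  \bar{p} = \frac{1}{N}\sum_k n_k \hat{p}_k, \qquad
  s^2_{\mathrm{pool}} = \frac{1}{N}\sum_k n_k\!\left(s_k^2 + \hat{p}_k^{\,2}\right) - \bar{p}^{\,2}, \qquad
  R^2_{\mathrm{pool}} = \frac{s^2_{\mathrm{pool}}}{\bar{p}\,(1-\bar{p})}.
\]
```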
*Regards,*
Sayantan Das,
*23andMe*
Excellent. Sayantan, we are also trying to impute the pseudoautosomal regions, but when defining regions to impute, Minimac4 doesn't seem to know that there is a big gap between PAR1 and PAR2. This leads to massive regions to impute and automatic chunking [20 Mb]. The only way we have found to get around this is to exclude the block containing this gap from the m3vcf file and define the pseudoautosomal regions so that they don't include this block. **Do you have any advice for circumventing the issue? In addition, how does the automatic chunking and merging of Minimac4 work? Would there be any scientific consequences compared to manually setting regions to impute?**
Hi,
Yes, you can always manually use the --chr, --start and --end options to impute PAR1 and PAR2 separately. One should NOT impute PAR1 and PAR2 together anyway, since they are on opposite ends of the chromosome and I don't think there is any LD across them (I am not sure of this statement though, just an intuition since they are really far apart). Please get back if that does not answer your question.
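As an illustration of that suggestion, the two regions could be run as separate jobs roughly like this (the coordinates are approximate GRCh37 PAR boundaries and the file names are placeholders; adjust for your build and data):
```bash
# PAR1 (roughly chrX:60,001-2,699,520 on GRCh37) -- coordinates are illustrative
minimac4 --refHaps refPanel.chrX.m3vcf.gz --haps study.chrX.vcf.gz \
         --chr X --start 60001 --end 2699520 --window 500000 \
         --format GT,DS,GP --prefix chrX.PAR1

# PAR2 (roughly chrX:154,931,044-155,260,560 on GRCh37), run as a separate job
minimac4 --refHaps refPanel.chrX.m3vcf.gz --haps study.chrX.vcf.gz \
         --chr X --start 154931044 --end 155260560 --window 500000 \
         --format GT,DS,GP --prefix chrX.PAR2
```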
*Regards,*
Sayantan Das,
*23andMe*
Hi Sayantan,
Could you answer this part of the question as well: In addition, how does the automatic chunking and merging of Minimac4 work? Would there be any scientific consequences compared to manually setting regions to impute?
Best, Scott
Hi Scott,
Yes, of course.
The automatic chunking and merging doesn't do anything special apart from:
(a) It reads the variant list and uses the values of --chunkLengthMb and --ChunkOverlapMb to work out how to chunk the data (the constraint being that the resulting chunks should be at least 20 Mb long with at least 3 Mb of overlap on either side, based on default values).
(b) Next, it imputes each chunk sequentially (including the overlap parts) and saves the result by appending the data (without the overlap) to a single final output file. This way there is no need to run a separate concatenation step at the end.
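As a rough illustration of what (a) amounts to (this is not Minimac4's actual code; edge handling is simplified and the chromosome length is only approximate):
```bash
# Sketch of the chunk layout implied by (a): ~20 Mb chunks, 3 Mb buffer each side.
REGION_START=1
REGION_END=135000000      # roughly chromosome 11 (illustrative)
CHUNK=20000000            # --chunkLengthMb 20 (default)
OVERLAP=3000000           # --ChunkOverlapMb 3 (default)

start=$REGION_START
while [ "$start" -le "$REGION_END" ]; do
  end=$(( start + CHUNK - 1 ))
  [ "$end" -gt "$REGION_END" ] && end=$REGION_END
  echo "core ${start}-${end}, imputed with a ${OVERLAP} bp buffer on each side"
  start=$(( end + 1 ))
done
```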
As minimac4 runs the automated chunking, it will print out a summary of the start and end positions of each chunk it ran. If one runs those chunks manually using the --start, --end and --window options, they would get the exact same results. So, in terms of accuracy, there is no difference between automated and manual chunking, assuming the exact same chunk configurations were run. However, the automated chunking can only impute the chunks sequentially, whereas when running the chunks manually one could impute all chunks in parallel. On the other hand, if one runs the chunks manually, they would have to concatenate the results to get whole-chromosome files, whereas the automated chunking of minimac4 gives you whole-chromosome files directly. Does that help?
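For illustration, the manual, parallel route could look roughly like this; the chunk boundaries should be taken from Minimac4's own chunk summary, and the .dose.vcf.gz output names are assumed here rather than confirmed:
```bash
# Impute three adjacent 20 Mb chunks in parallel (placeholder files and boundaries),
# each with a 3 Mb buffer, then stitch the core regions back together.
for start in 1 20000001 40000001; do
  end=$(( start + 20000000 - 1 ))
  minimac4 --refHaps refPanel.chr11.m3vcf.gz --haps study.chr11.vcf.gz \
           --chr 11 --start "$start" --end "$end" --window 3000000 \
           --format GT,DS,GP --prefix chr11.${start}_${end} &
done
wait

# Concatenate the per-chunk outputs into a whole-chromosome file
bcftools concat -Oz -o chr11.dose.vcf.gz \
    chr11.1_20000000.dose.vcf.gz \
    chr11.20000001_40000000.dose.vcf.gz \
    chr11.40000001_60000000.dose.vcf.gz
```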
And lastly, the automated chunking of minimac4 is still invoked when manually chunking using --start and --end. However, if the region specified by --start and --end is smaller than the value of --chunkLengthMb (which is 20 by default), then the automated chunking treats it as a single chunk. In other words, to override the automated chunking one needs to specify a value of --chunkLengthMb larger than the region one wants to impute as a single chunk. Please let me know if this helps and/or if there are any other questions.
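For example, a hypothetical 30 Mb region forced through as one chunk (file names and coordinates are placeholders):
```bash
# Force a single chunk by setting --chunkLengthMb above the region size (30 Mb here).
minimac4 --refHaps refPanel.chr11.m3vcf.gz --haps study.chr11.vcf.gz \
         --chr 11 --start 30000001 --end 60000000 --window 3000000 \
         --chunkLengthMb 40 \
         --format GT,DS,GP --prefix chr11.30_60Mb
```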
*Regards,*
Sayantan Das,
*23andMe*
Hello, I've been trying to run Minimac4 to impute a cohort of roughly 300,000 individuals. When attempting to impute a single chunk of 5 Mb +/- 1 Mb on chromosome 11, I am routinely encountering a 'Segmentation Fault (core dumped)' error. The segv error occurs even when I reduce the input size to 5,000 individuals. Are you aware of any bugs in the software that would produce this error type?