Bug in benchmarking #213

Closed
rl4940 opened this issue May 19, 2024 · 4 comments

Comments

@rl4940

rl4940 commented May 19, 2024

Hi team, thank you so much for creating such an excellent tool!

I just downloaded Truvari and ran into an issue when running a benchmark.
As a test, I passed the same VCF (GIAB HG002) as both the "true" and the "test" input. My command is below:

truvari bench -b HG002_SVs_Tier1_v0.6.vcf.gz \
              -c HG002_SVs_Tier1_v0.6.vcf.gz \
              -o output_dir/

The log is below:

2024-05-19 22:02:45,486 [INFO] Params:
{
    "base": "/data/renqgli/project/TGS/pipeline/benchmark/truvari_self_bench/HG002_SVs_Tier1_v0.6.vcf.gz",
    "comp": "/data/renqgli/project/TGS/pipeline/benchmark/truvari_self_bench/HG002_SVs_Tier1_v0.6.vcf.gz",
    "output": "output_dir/",
    "includebed": null,
    "extend": 0,
    "debug": false,
    "reference": null,
    "refdist": 500,
    "pctseq": 0.7,
    "minhaplen": 50,
    "pctsize": 0.7,
    "pctovl": 0.0,
    "typeignore": false,
    "chunksize": 1000,
    "bSample": "HG002",
    "cSample": "HG002",
    "dup_to_ins": false,
    "sizemin": 50,
    "sizefilt": 30,
    "sizemax": 50000,
    "passonly": false,
    "no_ref": false,
    "pick": "single",
    "check_monref": true,
    "check_multi": true
}
2024-05-19 22:03:10,109 [INFO] Zipped 148024 variants Counter({'base': 74012, 'comp': 74012})
2024-05-19 22:03:10,110 [INFO] 32790 chunks of 148024 variants Counter({'__filtered': 72331, 'comp': 45753, 'base': 29940})
[E::bgzf_flush] File write failed (wrong size)
Traceback (most recent call last):
  File "/data/renqgli/miniconda3/envs/truvari/bin/truvari", line 10, in <module>
    sys.exit(main())
  File "/data/renqgli/miniconda3/envs/truvari/lib/python3.9/site-packages/truvari/__main__.py", line 105, in main
    TOOLS[args.cmd](args.options)
  File "/data/renqgli/miniconda3/envs/truvari/lib/python3.9/site-packages/truvari/bench.py", line 759, in bench_main
    output = m_bench.run()
  File "/data/renqgli/miniconda3/envs/truvari/lib/python3.9/site-packages/truvari/bench.py", line 508, in run
    output.close_outputs()
  File "/data/renqgli/miniconda3/envs/truvari/lib/python3.9/site-packages/truvari/bench.py", line 395, in close_outputs
    truvari.compress_index_vcf(i)
  File "/data/renqgli/miniconda3/envs/truvari/lib/python3.9/site-packages/truvari/utils.py", line 434, in compress_index_vcf
    out_hdlr.write(bcftools.sort(fn))
  File "/data/renqgli/miniconda3/envs/truvari/lib/python3.9/site-packages/pysam/utils.py", line 83, in __call__
    raise SamtoolsError(
pysam.utils.SamtoolsError: 'bcftools returned with error -1: stdout=, stderr=Writing to /tmp/bcftools.re4flI\n[buf_flush] Error: cannot write to /tmp/bcftools.re4flI/00001.bcf\nCleaning\n'
@rl4940
Author

rl4940 commented May 19, 2024

I also tried different VCF files under real conditions, but the same error appeared.

@ACEnglish
Owner

It seems Truvari is able to finish the comparisons, but the error is raised when it calls bcftools to sort/index the output.

Does your machine have a /tmp directory?

Could you try setting the $TMPDIR environment variable? I don't expect this to work, since bcftools doesn't check that variable (it should) and older versions of bcftools sort used a hardcoded /tmp.
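To answer the two questions above, a quick check of the temporary directory can help. This is a minimal sketch, assuming a hypothetical helper named check_temp_dir (it is not part of Truvari or bcftools). It reports whether the directory exists, whether it is writable, and how much free space it has. By default it inspects whatever Python's tempfile module resolves, which honors $TMPDIR; bcftools may be writing somewhere else, so pass "/tmp" explicitly to inspect the path from the error message.

```python
import os
import shutil
import tempfile


def check_temp_dir(path=None):
    """Report basic health of a temporary directory.

    check_temp_dir is a hypothetical helper for illustration only.
    """
    # Fall back to the directory Python itself would use ($TMPDIR is honored).
    if path is None:
        path = tempfile.gettempdir()

    report = {"path": path, "exists": os.path.isdir(path),
              "writable": False, "free_bytes": None}
    if report["exists"]:
        # Creating files in a directory needs write and execute permission.
        report["writable"] = os.access(path, os.W_OK | os.X_OK)
        # Free space on the filesystem that holds the directory.
        report["free_bytes"] = shutil.disk_usage(path).free
    return report


if __name__ == "__main__":
    for key, value in check_temp_dir("/tmp").items():
        print(f"{key}: {value}")
```

A low free_bytes value, or writable being False, would be consistent with the "cannot write to /tmp/bcftools..." message in the log.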

The long-term solution: it looks like pysam's bcftools dispatch might expose the --temp-dir option, so perhaps I can fix it by having Truvari pass tempfile.gettempdir() to it.
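A rough sketch of that idea, not the actual change made in Truvari. It assumes the pysam dispatch accepts the same arguments as the bcftools sort command line, and the helper names build_sort_args and sort_vcf are hypothetical. The call shape bcftools.sort(...) returning the sorted text follows the traceback above.

```python
import tempfile


def build_sort_args(vcf_path, temp_dir=None):
    """Return arguments for `bcftools sort` with an explicit temp directory.

    build_sort_args is a hypothetical helper for illustration only.
    """
    # tempfile.gettempdir() honors $TMPDIR before falling back to
    # platform defaults such as /tmp.
    if temp_dir is None:
        temp_dir = tempfile.gettempdir()
    return ["--temp-dir", temp_dir, vcf_path]


def sort_vcf(vcf_path, temp_dir=None):
    """Sort a VCF through pysam's bcftools dispatch and return the output.

    Assumption: the dispatch forwards these arguments unchanged.
    """
    # Imported lazily so build_sort_args works without pysam installed.
    from pysam import bcftools
    return bcftools.sort(*build_sort_args(vcf_path, temp_dir))
```

With this shape, exporting TMPDIR=/some/larger/disk before running would redirect the sort's temporary files without any further change to the call.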

@ACEnglish
Owner

If setting $TMPDIR doesn't work, the long-term solution is now in develop, and you can install from there.

@rl4940
Author

rl4940 commented May 20, 2024

@ACEnglish
Hi English, thank you for your immediate help!
I figured it out: it was due to my RAM being at full capacity, because I have multiple containers running in the background.
Now it's up and running.
Thank you so much!!
