Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

using custom glf instead of angsd output #15

Closed
LZeitler opened this issue Feb 25, 2021 · 4 comments
Closed

using custom glf instead of angsd output #15

LZeitler opened this issue Feb 25, 2021 · 4 comments

Comments

@LZeitler
Copy link

Hi Filipe

When extracting GL tags from a vcf (called with freebayes) and converting to glf I am encountering some problems. NgsF does not want to process, I am wondering why. I know, it's not described in the readme and it's sort of file hacking but I don't see why it would be wrong. Maybe you can help.

Here my workflow

bcftools query -f '[%GL,]\n' my.vcf > my.gl
sed -i 's/\.,/0,0,0,/g' my.gl        # replace missing GL with 0,0,0,
sed -i 's/,/\t/g' my.gl                 # make tab sep
cat my.gl | perl -an -e 'for($i=1; $i<=$#F; $i++){print(pack("d",$F[$i]))}' > my.glf
N_IND=$(awk '{print NF/3; exit}' my.gl)
N_SITES=$(cat my.gl | wc -l)
cat my.glf | $NGSF --n_ind $N_IND --n_sites $N_SITES  --glf - --min_epsilon 0.001 \
       --out my.approx_indF --n_threads 1 --verbose 2 --approx_EM --seed 12345 --init_values r

output

 ==> Input Arguments:
    glf file: -
    init_values: r
    calc_LRT: false
    freq_fixed: false
    out file: fb_run5_all_nn_bi_mj_sn_mm.gl.0s.t.glf.approx_indF
    n_ind: 339
    n_sites: 489999
    chunk_size: 100000
    approx_EM: true
    call_geno: false
    max_iters: 1500
    min_iters: 10
    min_epsilon: 0.0010000000
    n_threads: 1
    seed: 12345
    quick: false
    verbose: 2
    version: 1.2.0-STD (Feb 23 2021 @ 12:18:32)

 ==> Analysis will be run in 5 chunk(s)
 ==> Using native I/O library

 [main] ERROR: cannot read GLF file!

 [main] ERROR: cannot read GLF file!
    : No such file or directory
@fgvieira
Copy link
Owner

What is the size, in bytes, of the my.glf file?

@LZeitler
Copy link
Author

3982711872 bytes

@fgvieira
Copy link
Owner

Without looking at the files it is difficult, but it seems your GLF file might not be complete. The GLF file should have a size in bytes of = n_ind * n_sites * 3 * 8 (3 GL, and 8 for the size of a double).

maybe,
cat my.gl | perl -an -e 'for($i=1; $i<=$#F; $i++){print(pack("d",$F[$i]))}' > my.glf
should be:
cat my.gl | perl -an -e 'for($i=0; $i<=$#F; $i++){print(pack("d",$F[$i]))}' > my.glf

@LZeitler
Copy link
Author

LZeitler commented Mar 3, 2021

That was it, thanks!

@LZeitler LZeitler closed this as completed Mar 3, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants