Generating required files for variant effect prediction #17

Closed Answered by gonzalobenegas
katiana22 asked this question in Q&A

The above file seems to include all variants observed in the 1001 Genomes Project, while the file below contains all possible SNPs in a 1 Mb region.

I used this code to generate the file used as input for Ensembl VEP:

rule make_simulated_variants:
    input:
        "output/genome.fa.gz",
    output:
        # Then upload this file to the Ensembl VEP web interface and run it
        # with the option upstream/downstream = 500.
        "output/simulated_variants/variants.vcf.gz",
        "output/simulated_variants/variants.parquet",
    run:
        genome = Genome(input[0])
        chrom = "5"
        start = 3500000
        end = start + 1_000_000
        rows = []
        nucleotides = list("ACGT")
        for pos 
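
The rule above is truncated at the loop, so its body is not shown. As a rough illustration of the idea (all possible SNPs in a window), here is a minimal sketch. It is not the original rule: `Variant`, `enumerate_snps`, and `to_vcf_lines` are hypothetical names, the reference sequence is passed in as a plain string instead of being read through `Genome`, and `start` is assumed to be a 0-based offset, so VCF positions are `start + offset + 1`.

```python
from typing import Iterable, Iterator, NamedTuple

NUCLEOTIDES = "ACGT"


class Variant(NamedTuple):
    chrom: str
    pos: int  # 1-based position, as used in VCF
    ref: str
    alt: str


def enumerate_snps(chrom: str, start: int, seq: str) -> Iterator[Variant]:
    """Yield every possible SNP for `seq`.

    Assumption: `start` is the 0-based coordinate of the first base of `seq`.
    """
    for offset, ref in enumerate(seq.upper()):
        if ref not in NUCLEOTIDES:
            continue  # skip N and other ambiguity codes
        for alt in NUCLEOTIDES:
            if alt != ref:
                yield Variant(chrom, start + offset + 1, ref, alt)


def to_vcf_lines(variants: Iterable[Variant]) -> list[str]:
    """Format variants as minimal VCF text (header plus one line per SNP)."""
    header = [
        "##fileformat=VCFv4.2",
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    ]
    body = [f"{v.chrom}\t{v.pos}\t.\t{v.ref}\t{v.alt}\t.\t.\t." for v in variants]
    return header + body


# Toy example: a 4-base "region" on chromosome 5 containing one N.
toy = list(enumerate_snps("5", 3_500_000, "ACNg"))
vcf_lines = to_vcf_lines(toy)
```

Each A/C/G/T reference base contributes three alternate alleles, so a 1 Mb window yields at most 3,000,000 rows. The resulting lines could then be written to a `.vcf` file and gzipped before uploading to Ensembl VEP.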

Answer selected by katiana22