# Landscape Genomics Pipeline

This notebook runs a landscape genomic analysis on allele frequency data and environmental data.
Input requirements are:
1) A VCF file (this notebook downloads demo data from DEST.bio)
2) Environmental data as a CSV file ([Demo Notebook to get Environmental Data](/home/sonjastndl/s3/CDSdataforLGA.ipynb))
- This notebook provides WorldClim data and Copernicus Near Surface Air Temperature for all samples in the DEST dataset to run the analysis on.


## Workflow Step By Step 

### Creating required directories to process and store outputs


In [27]:
# Set your working directory first
#!cd /home/sonjastndl/s3/LGA/uc3-drosophola-genetics/projects/LandscapeGenomicsPipeline/EOXHUB_TEST/
wd="/home/sonjastndl/s3/LGA/uc3-drosophola-genetics/projects/LandscapeGenomicsPipeline/EOXHUB_TEST/"
%cd $wd
#!mkdir results
#!mkdir data


/home/sonjastndl/s3/LGA/uc3-drosophola-genetics/projects/LandscapeGenomicsPipeline/EOXHUB_TEST


### Download VCF from DEST.bio and extract samplenames 

DEST.bio is the data source for Drosophila data from Europe collected by the DrosEU consortium.
See https://dest.bio/ and https://droseu.net/.
Download or upload your genomic data in the following section.
The corresponding *samps.csv* file carries information on latitude, longitude, the number of flies in each PoolSeq sample, and other metadata. Genomic data and metadata are intersected by the sample IDs that occur in both the CSV and the VCF file.
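
The intersection by sample ID can be sketched in a few lines of Python. This is only an illustration, not the pipeline's own code: the helper names `vcf_sample_ids` and `shared_samples` are hypothetical, and the sketch assumes that the sample ID is the first CSV column and that the VCF has a standard tab-separated `#CHROM` header line.

```python
import csv
import gzip


def vcf_sample_ids(vcf_path):
    """Return the sample names listed after the FORMAT column of the #CHROM line."""
    opener = gzip.open if vcf_path.endswith(".gz") else open
    with opener(vcf_path, "rt") as handle:
        for line in handle:
            if line.startswith("#CHROM"):
                # the first nine columns are fixed VCF fields, samples follow
                return line.rstrip("\n").split("\t")[9:]
    return []


def shared_samples(vcf_path, csv_path):
    """Return the VCF samples that also occur in the first CSV column, in VCF order."""
    with open(csv_path, newline="") as handle:
        reader = csv.reader(handle)
        next(reader, None)  # skip the header row
        csv_ids = {row[0] for row in reader if row}
    return [s for s in vcf_sample_ids(vcf_path) if s in csv_ids]
```

Keeping the VCF order in the result is a deliberate choice here, because downstream files usually have to follow the column order of the genotype data.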

In [None]:
# Please make sure to get the latest version of your genomic data
# Here we are using the DrosEU DEST data set from 2023
# `!cd data` would only change directory inside a throwaway subshell, so download into data/ with -P
!wget --tries=inf -P data "http://berglandlab.uvadcos.io/gds/dest.all.PoolSNP.001.50.25Feb2023.norep.ann.gdsdest.all.PoolSNP.001.50.25Feb2023.norep.vcf.gz"
# Use the raw file URL; the github.com/.../blob/... page returns HTML instead of the CSV
!wget -P data "https://raw.githubusercontent.com/DEST-bio/DESTv2/main/populationInfo/dest_v2.samps_25Feb2023.csv"

## Extract the information on the available samples (first column, header skipped)
!awk -F, 'NR != 1 {print $1}' data/dest_v2.samps_25Feb2023.csv > data/samplenames.csv
!mv data/dest.all.PoolSNP.001.50.25Feb2023.norep.vcf.gz data/PoolSeq2023.vcf.gz

### Naming the analysis 

Here we name the analysis after the region we are analyzing: the full genome. Keep in mind that the parameter "arm" only determines the name of the output folder. It does NOT select the genomic region that the pipeline analyzes.
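
Since "arm" is only a label, the folder layout it produces can be sketched as below. The helper `make_output_dirs` is hypothetical and not part of the pipeline scripts; it assumes the layout `results/<arm>/Summary` used in this notebook.

```python
from pathlib import Path


def make_output_dirs(arm, base="results"):
    """Create <base>/<arm>/ and <base>/<arm>/Summary and return both paths."""
    run_dir = Path(base) / arm
    summary_dir = run_dir / "Summary"
    # parents=True creates missing parent folders; exist_ok=True makes reruns harmless
    summary_dir.mkdir(parents=True, exist_ok=True)
    return run_dir, summary_dir
```

Calling `make_output_dirs("fullgenome_test")` a second time does nothing, which avoids the "File exists" errors that a plain `mkdir` reports when a cell is rerun.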

In [44]:
!pwd

/home/sonjastndl/s3/LGA/uc3-drosophola-genetics/projects/LandscapeGenomicsPipeline/EOXHUB_TEST


In [29]:
# 1. Give your analysis a name and create the corresponding directory
arm = "fullgenome_test"
!mkdir -p results/$arm
#!output="results/${arm}/Subsampled_${arm}.recode.vcf" #name must match with the awk of the chromosomes
#!outaf="results/${arm}/Subsampled_${arm}.af"

In [47]:
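# Define FinalOut as a Python variable so later cells can pass it to shell commands via $FinalOut
FinalOut = "results/" + arm + "/Summary"
!mkdir -p $FinalOut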


/home/sonjastndl/s3/LGA/uc3-drosophola-genetics/projects/LandscapeGenomicsPipeline/EOXHUB_TEST
results/fullgenome_test/Summary


In [None]:
!awk -F, 'NR > 1 && $6 == "Europe" {print $1}' dest_v2.samps_3May2024.csv > data/EuropeSamples.csv

In [70]:
# Take vcf as input
inputt = "LandscapeGenomicsTest1/uc3-drosophola-genetics/projects/LandscapeGenomicsPipeline/data/PoolSeq2023.vcf.gz"
# If only a subset of samples is desired to be analysed, change in the samplenames.csv accordingly
sample = "LandscapeGenomicsTest1/uc3-drosophola-genetics/projects/LandscapeGenomicsPipeline/data/samplenames.csv"

In [71]:
# Remove polyploidies, focus on region (Chromosome), subsample population samples, and exclude all sites with missing data
!bash vcftools.sh $inputt $sample $arm

LandscapeGenomicsTest1/uc3-drosophola-genetics/projects/LandscapeGenomicsPipeline/data/dest.all.PoolSNP.001.50.25Feb2023.norep.vcf.gz
LandscapeGenomicsTest1/uc3-drosophola-genetics/projects/LandscapeGenomicsPipeline/data/samplenames.csv
fullgenome_test
DONE


### Randomly pick n (10k) lines from VCF

Even if we do not limit the analysis to specific genomic regions, we can run it on a subset of SNPs.
In this case we want to investigate 10,000 random SNPs distributed across the full genome.
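
For very large VCF files, the random subset can also be drawn in a single pass without holding every SNP in memory. The sketch below uses reservoir sampling, which is a swapped-in technique and not what the cells in this notebook do; the function name `reservoir_sample` and its `seed` parameter are assumptions made for illustration.

```python
import random


def reservoir_sample(lines, n, seed=None):
    """Return n items drawn uniformly at random from an iterable of unknown length."""
    rng = random.Random(seed)
    reservoir = []
    for i, line in enumerate(lines):
        if i < n:
            # fill the reservoir with the first n items
            reservoir.append(line)
        else:
            # item i replaces a random slot with probability n / (i + 1)
            j = rng.randint(0, i)
            if j < n:
                reservoir[j] = line
    return reservoir
```

Feeding it only the non-header lines of the VCF, for example `(l for l in handle if not l.startswith("#"))`, yields the sampled records; they would still need to be sorted by chromosome and position before writing.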

In [4]:
import csv
import gzip
import os
import random
import re
import subprocess
import sys
from collections import defaultdict as d

In [76]:
inputt = "results/" + arm + "/Subsampled_" + arm + ".recode2.vcf.gz"
#!output="results/${arm}/Subsampled_${arm}.recode3.vcf.gz"

In [1]:
inputt="/home/sonjastndl/s3/LGA/uc3-drosophola-genetics/projects/LandscapeGenomicsPipeline/EOXHUB_TEST3/results/fullgenome/Subsampled_fullgenome.recode2.vcf.gz"

In [2]:
def load_data(x):
    """Import data from a gzipped file, an uncompressed file, or STDIN ("-")."""
    import gzip
    import sys

    if x == "-":
        y = sys.stdin
    elif x.endswith(".gz"):
        y = gzip.open(x, "rt", encoding="latin-1")
    else:
        y = open(x, "r", encoding="latin-1")
    return y

In [6]:
n = 10000  # number of SNPs to draw at random
SNPs = {}
with open("/home/sonjastndl/s3/LGA/uc3-drosophola-genetics/projects/LandscapeGenomicsPipeline/EOXHUB_TEST3/results/fullgenome/Subsample_small_test.vcf", "w") as f:
    for l in load_data(inputt):
        if l.startswith("#"):
            # copy meta-information lines and the column header unchanged
            f.write(l)
        else:
            a = l.rstrip().split()
            # key each record by (chromosome, position)
            SNPs[(a[0], int(a[1]))] = l.rstrip()
    # draw at most n SNPs without replacement
    KEYS = random.sample(list(SNPs.keys()), min(n, len(SNPs)))
    # write the sampled records sorted by chromosome and position
    for k in sorted(KEYS):
        f.write(SNPs[k] + "\n")

### Running BCF Tools

In [105]:
# Run BCFtools
!bash bcftools.sh results/fullgenome_test/Subsample_small.vcf results/fullgenome_test/Subsample_Europe300.vcf.gz

results/fullgenome_test/Subsample_small.vcf
results/fullgenome_test/Subsample_Europe300.vcf.gz
[W::vcf_parse] Contig '2L' is not defined in the header. (Quick workaround: index the file with tabix.)
[W::vcf_parse] Contig '2R' is not defined in the header. (Quick workaround: index the file with tabix.)
[W::vcf_parse] Contig '3L' is not defined in the header. (Quick workaround: index the file with tabix.)
[W::vcf_parse] Contig '3R' is not defined in the header. (Quick workaround: index the file with tabix.)
[W::vcf_parse] Contig 'X' is not defined in the header. (Quick workaround: index the file with tabix.)


In [111]:
# Redefine input and output and convert count data to allele frequency data

inputt = "/home/sonjastndl/s3/results/fullgenome_test/Subsample_Europe300.vcf.gz"
output_file = "/home/sonjastndl/s3/results/fullgenome_test/Subsample_Europe_300_AF_new.af"  # Specify the name of the output file

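with open(output_file, "w") as f:
    for l in load_data(inputt):
        a = l.rstrip().split()
        # skip meta-information lines
        if l.startswith("##"):
            continue
        # column header: keep only the sample names
        if l.startswith("#"):
            header = a[9:]
            f.write("Chr\tPos\t" + "\t".join(header) + "\n")
            continue
        pops = a[9:]
        fmt = a[8].split(":")  # renamed from `format` to avoid shadowing the builtin
        # skip multiallelic sites
        if len(a[4].split(",")) > 1:
            continue
        AFs = []
        for i in pops:
            P = dict(zip(fmt, i.split(":")))
            # missing genotype or zero depth: no frequency can be computed
            if "./." in i or float(P["DP"]) == 0:
                AFs.append("NA")
                continue
            AFs.append(str(round(float(P["AD"]) / float(P["DP"]), 9)))
        # skip sites where the alternative allele is absent from every sample
        if sum([float(x) for x in AFs if x != "NA"]) == 0:
            continue
        f.write(a[0] + "\t" + a[1] + "\t" + "\t".join(AFs) + "\n")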
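
The per-sample step of this conversion can be isolated into a small function, which makes it easier to test on single entries. The sketch below is an illustration under assumptions: `allele_frequency` is a hypothetical helper, and it assumes, as the cell above does, that `AD` holds a single alternative read count and `DP` the total depth. The FORMAT string in the usage note is only an example.

```python
def allele_frequency(format_field, sample_field):
    """Return AD / DP for one sample column, or None if it cannot be computed."""
    values = dict(zip(format_field.split(":"), sample_field.split(":")))
    try:
        alt_depth = float(values["AD"])
        depth = float(values["DP"])
    except (KeyError, ValueError):
        # missing call such as "./." or a non-numeric entry such as "."
        return None
    if depth == 0:
        # avoid a division by zero for uncovered sites
        return None
    return round(alt_depth / depth, 9)
```

For example, `allele_frequency("GT:RD:AD:DP:FREQ", "0/1:30:10:40:0.25")` divides 10 by 40, while a missing call returns `None`, which can then be written as `NA`.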

In [107]:
wd = os.getcwd()

/home/sonjastndl/s3


In [86]:
# Define the variables once in Python; IPython substitutes them into `!` shell commands via $name.
# A shell assignment such as `!AF=...` is lost as soon as its subshell exits.
AF = "/home/sonjastndl/s3/results/fullgenome_test/Subsample_Europe_5k_AF.af"
metadata = "/home/sonjastndl/s3/LGA/uc3-drosophola-genetics/projects/LandscapeGenomicsPipeline/EOXHUB_TEST/data/metadata.csv"
samplelist = "/home/sonjastndl/s3/LGA/uc3-drosophola-genetics/projects/LandscapeGenomicsPipeline/EOXHUB_TEST/data/samplenames.csv"

### Perform Linear Regression 

In [87]:
# Perform linear regression on the data with R
!bash RunR.sh "/home/sonjastndl/s3/LGA/uc3-drosophola-genetics/projects/LandscapeGenomicsPipeline/scripts/Plot_pvalues.R" /home/sonjastndl/s3 /home/sonjastndl/s3/results/fullgenome_test/Subsample_5k_AF.af /home/sonjastndl/s3/LGA/uc3-drosophola-genetics/projects/LandscapeGenomicsPipeline/EOXHUB_TEST/data/metadata.csv "$arm"  "$FinalOut"

THESE ARE THE VARAIBLES: WD; AF-File, METADATA, ARM
/home/sonjastndl/s3
/home/sonjastndl/s3/results/fullgenome_test/Subsample_5k_AF.af
/home/sonjastndl/s3/LGA/uc3-drosophola-genetics/projects/LandscapeGenomicsPipeline/EOXHUB_TEST/data/metadata.csv
fullgenome_test
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.4.4     ✔ tibble    3.2.1
✔ lubridate 1.9.3     ✔ tidyr     1.3.0
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted packa

In [37]:
## LFMM ANALYSIS
# Create variables for the BayPass analysis: also used in LFMM
# BAYPASS analysis

# Script "main" (Including geno_creation.py, some shell commands to create necessary files, run Baypass)
bayin = "/home/sonjastndl/s3/s3/results/fullgenome_test/Subsample_5k.vcf"
baydir = "/home/sonjastndl/s3/results/" + arm + "/BAYPASS"
bayout = baydir + "/baypass.geno"
baycov = baydir + "/covariates.csv"
metadata = "/home/sonjastndl/s3/LGA/uc3-drosophola-genetics/projects/LandscapeGenomicsPipeline/EOXHUB_TEST/data/metadata.csv"
samples = "/home/sonjastndl/s3/LandscapeGenomicsTest1/uc3-drosophola-genetics/projects/LandscapeGenomicsPipeline/data/samplenames.csv"

In [38]:
!mkdir -p $baydir


In [42]:
bayin = "/home/sonjastndl/s3/results/fullgenome_test/Subsample3.vcf.gz"

vcf = gzip.open(bayin, "rt", encoding="utf-8").readlines()[1:]
geno_file = []

for line in vcf:
    if line.startswith("#CHROM"):
        popcol = line.split()
        # print(popcol)

['#CHROM', 'POS', 'ID', 'REF', 'ALT', 'QUAL', 'FILTER', 'INFO', 'FORMAT', 'AT_Kar_See_1_2014-08-17', 'AT_Kar_See_1_2016-08-01', 'AT_Nie_Mau_1_2014-07-20', 'AT_Nie_Mau_1_2014-10-19', 'AT_Nie_Mau_1_2015-07-20', 'AT_Nie_Mau_1_2015-10-19', 'AT_Wie_Gro_1_2012-08-03', 'AT_Wie_Gro_1_2012-10-20']


In [2]:
with open(samples, "r") as f:
    matching_pops = [line.strip() for line in f]
    columns = [i for i, x in enumerate(popcol) if x in matching_pops]
    print("The following populations will be analyzed:")
    print(matching_pops)
    # print(columns)

NameError: name 'samples' is not defined

In [44]:
popsize = []

with open(metadata, "r") as meta:
    for line in meta:
        fields = line.rstrip("\n").split(",")
        # match the exact sample ID in the first column; nFlies is the fourth column
        if fields[0] in matching_pops:
            # print(line)
            popsize.append(fields[3])

AT_Kar_See_1_2016-08-01,46.8136889,13.50794792,40,284.1451,78.0

AT_Nie_Mau_1_2014-07-20,48.375,15.56,40,298.4561,88.0

AT_Nie_Mau_1_2014-10-19,48.375,15.56,40,285.8957,88.0

AT_Nie_Mau_1_2015-07-20,48.375,15.56,40,296.2886,88.0

AT_Nie_Mau_1_2015-10-19,48.375,15.56,40,280.2592,88.0

AT_Wie_Gro_1_2012-08-03,48.2,16.37,62,295.844,98.0

AT_Wie_Gro_1_2012-10-20,48.2,16.37,44,284.129,98.0



In [45]:
output_file_path = "/home/sonjastndl/s3/results/fullgenome_test/BAYPASS/size.poolsize"

# Write the data to the output file with error handling
try:
    with open(output_file_path, "w") as file:
        file.write(" ".join(map(str, popsize)))
    print(f"File '{output_file_path}' created successfully.")
except Exception as e:
    print(f"Error writing to '{output_file_path}': {e}")

File '/home/sonjastndl/s3/results/fullgenome_test/BAYPASS/size.poolsize' created successfully.
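
The pool sizes are only useful if they line up with the population columns of the genotype file. The sketch below orders them explicitly by the sample columns of the VCF. `pool_sizes_in_vcf_order` is a hypothetical helper; the column names `sampleId` and `nFlies` are taken from the metadata header shown in this notebook. Whether BayPass expects haploid sizes (twice the number of diploid flies) should be checked against the BayPass manual, since this sketch passes the values through unchanged.

```python
import csv


def pool_sizes_in_vcf_order(vcf_samples, metadata_path, id_col="sampleId", size_col="nFlies"):
    """Return pool sizes as strings, ordered like the sample columns of the VCF."""
    with open(metadata_path, newline="") as handle:
        sizes = {row[id_col]: row[size_col] for row in csv.DictReader(handle)}
    # fail loudly instead of silently writing a shorter line
    missing = [s for s in vcf_samples if s not in sizes]
    if missing:
        raise KeyError(f"No metadata entry for: {', '.join(missing)}")
    return [sizes[s] for s in vcf_samples]
```

With `popcol` from the header cell, `" ".join(pool_sizes_in_vcf_order(popcol[9:], metadata))` would give the single line written to the pool size file.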


In [46]:
# Create Covariable File
import csv
import os
import sys
from collections import defaultdict as d
from csv import reader, writer
from operator import itemgetter
import numpy as np

In [49]:
samples = []
with open(metadata, "r") as f:
    reader = csv.reader(f)
    for line in reader:
        # print(line)
        samples.extend(line)


def get_numeric_columns(x):
    """Return the indices of columns whose value in the first data row parses as a float."""
    with open(x, "r") as f:
        reader = csv.reader(f)
        next(reader)  # skip the header
        first_row = next(reader, [])
    numeric_cols = []
    for j, val in enumerate(first_row):
        try:
            float(val)
            numeric_cols.append(j)
        except ValueError:
            pass
    return numeric_cols


indices_to_select = get_numeric_columns(metadata)
# indices_to_select= get_numeric_columns("/media/inter/ssteindl/FC/usecaserepo/uc3-drosophola-genetics/projects/LandscapeGenomicsPipeline/testNSAT/data/metadata.csv")

In [51]:
# data_transposed = np.transpose(data)
def filter_samples(meta, samples):
    filtered_rows = []
    with open(meta, "r") as file:
        reader = csv.reader(file)
        for row in reader:
            if row and row[0] in samples:
                filtered_rows.append(row)
    return filtered_rows


print(samples)
samps = filter_samples(metadata, samples)
# samps=filter_samples("/media/inter/ssteindl/FC/usecaserepo/uc3-drosophola-genetics/projects/LandscapeGenomicsPipeline/testNSAT/data/metadata.csv", samples)
print(samps)

['sampleId', 'lat', 'long', 'nFlies', 'near_surface_air_temperature', 'bio1', 'AT_Kar_See_1_2016-08-01', '46.8136889', '13.50794792', '40', '284.1451', '78.0', 'AT_Nie_Mau_1_2014-07-20', '48.375', '15.56', '40', '298.4561', '88.0', 'AT_Nie_Mau_1_2014-10-19', '48.375', '15.56', '40', '285.8957', '88.0', 'AT_Nie_Mau_1_2015-07-20', '48.375', '15.56', '40', '296.2886', '88.0', 'AT_Nie_Mau_1_2015-10-19', '48.375', '15.56', '40', '280.2592', '88.0', 'AT_Wie_Gro_1_2012-08-03', '48.2', '16.37', '62', '295.844', '98.0', 'AT_Wie_Gro_1_2012-10-20', '48.2', '16.37', '44', '284.129', '98.0', 'BY_Bre_Bre_1_2015-09-11', '52.142193', '23.662434', '40', '287.8902', '75.0', 'CH_Vau_Cha_1_2014-07-24', '46.5670416', '6.701867', '40', '290.8203', '74.0', 'CH_Vau_Cha_1_2014-10-05', '46.5670416', '6.701867', '40', '283.9144', '74.0', 'CH_Vau_Cha_1_2015-08-01', '46.5670416', '6.701867', '40', '287.1291', '74.0', 'CH_Vau_Cha_1_2016-08-15', '46.5670416', '6.701867', '40', '292.4576', '74.0', 'CH_Vau_Vul_1_2018-

In [52]:
from operator import itemgetter

import numpy as np

data = [
    list(itemgetter(*indices_to_select)(row)) for i, row in enumerate(samps) if i > 0
]
data_transposed = np.transpose(data)

In [88]:
with open(baycov, "w", newline="") as f:
    writer = csv.writer(f, delimiter=" ")
    writer.writerows(data_transposed)


def get_colnames(x):
    with open(x, "r") as f:
        reader = csv.reader(f)
        header = next(reader)
        numeric_cols = []
        for i, row in enumerate(reader):
            for j, val in enumerate(row):
                if i == 0:
                    try:
                        float(val)
                        numeric_cols.append(j)
                    except ValueError:
                        pass
        numcols = []
        for k in numeric_cols:
            numcols.append(header[k])
        return numcols


colnames = get_colnames(metadata)


def get_colnames_write(x, bn):
    """Write one space-separated file per numeric covariate, named <bn>_<column>.csv."""
    with open(x, "r") as f:
        header = next(csv.reader(f))
    # the rows of data_transposed follow the order of indices_to_select,
    # so non-numeric columns such as sampleId are skipped
    for row, j in zip(data_transposed, indices_to_select):
        path = bn + "_" + header[j] + ".csv"
        with open(path, "w", newline="") as out:
            out.write(" ".join(map(str, row)))
    return 0


base_name, ext = os.path.splitext(baycov)
info_base_name = base_name + ".covariate.info"
ooouutt = info_base_name + ext
get_colnames_write(
    metadata,
    info_base_name,
)

with open(ooouutt, "w") as f:
    print(*colnames, file=f)

In [2]:
import csv

import pandas as pd

# variables= open("/home/sonjastndl/s3/results/fullgenome_test/BAYPASS/covariates.covariate.info.csv", "r")
# with open("/home/sonjastndl/s3/results/fullgenome_test/BAYPASS/covariates.covariate.info.csv", 'r', newline='') as file:

elements_list = []

# Open the covariate info file for reading (the names were written space-separated)
with open(
    "/home/sonjastndl/s3/results/fullgenome_test/BAYPASS/covariates.covariate.info.csv",
    "r",
) as file:
    # Iterate over each line in the file
    for line in file:
        # Split on tabs; a space-separated line stays in one piece and is split later
        elements = line.strip().split("\t")
        # Add the elements to the list
        elements_list.extend(elements)

### Latent Factor Mixed Models (R-Package LEA)

In [54]:
# Make Output Directories
LeaOut = "results/" + arm + "/LEA"
!mkdir -p $LeaOut

In [3]:
# Print each covariate name stored in elements_list
for line in elements_list:
    # Split the line into elements using the space delimiter
    elements = line.strip().split(" ")
    # Iterate over each element in the list
    for element in elements:
        # Print the covariate name
        print(element)
        var = element

lat
long
nFlies
near_surface_air_temperature
bio1


In [141]:
!bash /home/sonjastndl/s3/Run_LeaInstallation.sh

trying URL 'https://cloud.r-project.org/src/contrib/BiocManager_1.30.23.tar.gz'
Content type 'application/x-gzip' length 589753 bytes (575 KB)
downloaded 575 KB

* installing *source* package ‘BiocManager’ ...
** package ‘BiocManager’ successfully unpacked and MD5 sums checked
** using staged installation
** R
** inst
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** installing vignettes
** testing if installed package can be loaded from temporary location
** testing if installed package can be loaded from final location
** testing if installed package keeps a record of temporary installation path
* DONE (BiocManager)

The downloaded source packages are in
	‘/tmp/Rtmpiapi5Y/downloaded_packages’
Updating HTML index of packages in '.Library'
Making 'packages.html' ... done
Bioconductor version 3.18 (BiocManager 1.30.23), R 4.3.2 (2023-10-31)
Old packages: 'boot', 'bslib', 'callr', 'codetools', 'commonmark', 'crul',
  '

In [15]:
### Test whether lfmm2 is working

/home/sonjastndl/s3/results/fullgenome_test/Subsample_AF.af


In [144]:
!bash RunLeaR.sh /home/sonjastndl/s3/results/fullgenome_test/LEA/lat1 /home/sonjastndl/s3/results/fullgenome_test/Subsample_Europe_300_AF_new.af $metadata lat 1

THESE ARE THE VARAIBLES TO TEST:
Error in library(factoextra) : there is no package called ‘factoextra’
Execution halted

In [137]:
# Choose the number of estimated latent factors (nK) and
# the number of calculation repetitions for each factor (nR).
nR = 3
nK = 7

for line in elements_list:
    # Split the line into elements using space delimiter
    elements = line.strip().split(" ")
    # Iterate over each element in the list
    for element in elements:
        for rep in range(1, nR + 1):
            print(element, rep)
            outdir = "/home/sonjastndl/s3/results/" + arm + "/LEA/" + element + str(rep)
            #!bash RunLeaR.sh $outdir /home/sonjastndl/s3/results/fullgenome_test/Subsample_Europe_300_AF_new.af $metadata $element $rep
        ##change this script to average pvalues if needed
        #!bash RunZPCalc.sh "/home/sonjastndl/s3/results/fullgenome_test/LEA/" $nK $nR /home/sonjastndl/s3/results/fullgenome_test/Subsample_Europe_300_AF_new.af $element

lat 1
lat 2
lat 3
[1] "/home/sonjastndl/s3/results/fullgenome_test/LEA/"

── Column specification ────────────────────────────────────────────────────────
cols(
  .default = col_double(),
  Chr = col_character()
)
ℹ Use `spec()` for the full column specifications.

Error in file(file, "rt") : cannot open the connection
Calls: cbind -> read.table -> file
In file(file, "rt") :
  cannot open file 'lat_run1/genotypes_gradients.lfmm/K7/run1/genotypes_r1_s1.7.zscore': No such file or directory
Execution halted
long 1
long 2
long 3
[1] "/home/sonjastndl/s3/results/fullgenome_test/LEA/"

── Column specification ────────────────────────────────────────────────────────
cols(
  .default = col_double(),
  Chr = col_character()
)
ℹ Use `spec()` for the full column specifications.

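The loop above runs each covariate several times, and the repetitions then have to be combined into one p-value per SNP. A common way to do this with LFMM z-scores is to take the median z-score per SNP across runs and to recalibrate it with the genomic inflation factor. The contents of RunZPCalc.sh are not shown in this notebook, so the Python sketch below is an assumption about that kind of step and not a description of the script; `combine_zscores` is a hypothetical name.

```python
import math
from statistics import median

# median of a chi-square distribution with 1 degree of freedom
CHI2_1_MEDIAN = 0.454936423119572


def combine_zscores(zscores_per_run):
    """Combine per-run z-scores (one list per run, one value per SNP) into p-values."""
    # median z-score of each SNP across the runs
    z_median = [median(snp) for snp in zip(*zscores_per_run)]
    # genomic inflation factor; it is zero (and unusable) if all z-scores are zero
    lam = median(z * z for z in z_median) / CHI2_1_MEDIAN
    # upper tail of chi-square(1): P(X > x) = erfc(sqrt(x / 2))
    pvalues = [math.erfc(math.sqrt(z * z / lam / 2)) for z in z_median]
    return lam, pvalues
```

The function returns the inflation factor together with the adjusted p-values, so that a value far from 1 can be spotted before the p-values are interpreted.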
### RDA (Redundancy Analysis With R)

In [150]:
!bash /home/sonjastndl/s3/analyseResults.sh /home/sonjastndl/s3/results/fullgenome_test/Subsample_Europe_300_AF.af /home/sonjastndl/s3/results/fullgenome_test/GM "BIO1" "NSAT" /home/sonjastndl/s3/results/fullgenome_test/LEA

[1] "/home/sonjastndl/s3/results/fullgenome_test/Subsample_Europe_300_AF.af"
[1] "/home/sonjastndl/s3/results/fullgenome_test/GM"
null device 
          1 
Removed 112 rows containing missing values (`geom_point()`). 
null device 
          1 

## Output Interpretation

In [None]:
!Rscript /home/sonjastndl/s3/LGA/uc3-drosophola-genetics/projects/LandscapeGenomicsPipeline/scripts/PLotLEAPValues.r $wd $AF $metadata_new $arm $FinalOut
!Rscript /home/sonjastndl/s3/LGA/uc3-drosophola-genetics/projects/LandscapeGenomicsPipeline/scripts/ComparePValues.R $AF $wd/results/$arm/GM $LeaOut $FinalOut
