# Extension of GSCAN results to nicotine dependence (issue #59)
We have preliminary results from a very large-scale genome-wide study of cigarette smoking phenotypes that relate to our nicotine dependence GWAS results. See the supplemental Tables S7-S9 here: \\rcdcollaboration01.rti.ns\GxG\Analysis\GSCAN\shared MS version 1\ . The phenotypes of interest to us are cigarettes per day (CPD), smoking cessation (SC), and smoking initiation (SI).

We're interested in seeing whether these associations extend to nicotine dependence. I use the SNP look-up script to extend Tables S6-S9 with our GWAS results (analysis sets 044, 045, and 046 here: \\rcdcollaboration01.rti.ns\GxG\Analysis\META\1df).

## Create directory structure and copy SNP look-up
The data in which we will look up SNPs are located on the `gxg share drive`. We will copy the data to our local machine due to bandwidth issues with the share drive.

In [None]:
# Create directory structure locally
cd /cygdrive/c/Users/jmarks/Desktop/Projects
# results go under develop/results/<set>_results/<table>, matching the --out paths used below
mkdir -p Nicotine/GSCAN_extended_results_nicotine/develop/results/\
{044_results,045_results,046_results}/{Table_S6,Table_S7,Table_S8,Table_S9}

# each line inside the braces needs a trailing backslash, otherwise the brace expansion breaks at the newline
mkdir -p Nicotine/GSCAN_extended_results_nicotine/data/\
{044.eur13cohorts.afr.9cohorts.eagle_lung.jhs_aric.1000G_p3_markerName_FinalDatasetDNMT3Bpaper,\
045.eur13cohorts.eagle_lung.jhs_aric.1000G_p3_markerName_FinalDatasetDNMT3Bpaper,\
046.afr.9cohorts.eagle_lung.jhs_aric.1000G_p3_markerName_FinalDatasetDNMT3Bpaper}

# copy data over to local machine and change permission of file
cd /cygdrive/c/Users/jmarks/Desktop/Projects/Nicotine/GSCAN_extended_results_nicotine/data/

# copy 044 data
cp //rcdcollaboration01.rti.ns/gxg/Analysis/META/1df/044.eur13cohorts.afr.9cohorts.eagle_lung.jhs_aric.1000G_p3_markerName_FinalDatasetDNMT3Bpaper/*.1df \
/cygdrive/c/Users/jmarks/Desktop/Projects/Nicotine/GSCAN_extended_results_nicotine/data/044.eur13cohorts.afr.9cohorts.eagle_lung.jhs_aric.1000G_p3_markerName_FinalDatasetDNMT3Bpaper/

# copy 045 data
for i in //rcdcollaboration01.rti.ns/gxg/Analysis/META/1df/045.eur13cohorts.eagle_lung.jhs_aric.1000G_p3_markerName_FinalDatasetDNMT3Bpaper/cogend+copdgene+decode+eagle+sage+uw-tturc+gain+nongain+yale-penn+ntr+finnish+dental_caries+cogend2.eur.chr{1..22}.exclude_singletons.1df
do
cp $i 045.eur13cohorts.eagle_lung.jhs_aric.1000G_p3_markerName_FinalDatasetDNMT3Bpaper/
done

# copy 046 data
for i in //rcdcollaboration01.rti.ns/gxg/Analysis/META/1df/046.afr.9cohorts.eagle_lung.jhs_aric.1000G_p3_markerName_FinalDatasetDNMT3Bpaper/cogend+copdgene+sage+uw-tturc+gain+yale-penn+aand+jhs+cogend2.afr.chr{1..22}.exclude_singletons.1df
do
cp $i 046.afr.9cohorts.eagle_lung.jhs_aric.1000G_p3_markerName_FinalDatasetDNMT3Bpaper/
done


# change permission on all files 
chmod 755 044.eur13cohorts.afr.9cohorts.eagle_lung.jhs_aric.1000G_p3_markerName_FinalDatasetDNMT3Bpaper/*
chmod 755 045.eur13cohorts.eagle_lung.jhs_aric.1000G_p3_markerName_FinalDatasetDNMT3Bpaper/*
chmod 755 046.afr.9cohorts.eagle_lung.jhs_aric.1000G_p3_markerName_FinalDatasetDNMT3Bpaper/*


# move to the develop directory, where the look-ups below are run
cd /cygdrive/c/Users/jmarks/Desktop/Projects/Nicotine/GSCAN_extended_results_nicotine/develop/

# I need to create a directory for each table because each table has a different set of
# SNPs to look up. Each set of SNPs needs to be searched for in our GWAS results (044,045,046)
mkdir -p SNP_finder/{Table_S6,Table_S7,Table_S8,Table_S9}

# copy SNP look-up script to directory structure
for i in {6..9}; do  cp -r /cygdrive/c/Users/jmarks/Desktop/Code/SNP_finder/* SNP_finder/Table_S$i; done

# make a copy of the perl script in this directory for ease of use 
# (this is where it will actually be run from)

cp /cygdrive/c/Users/jmarks/Desktop/Code/SNP_finder/extract_rows.pl .

## Customize accompanying files to the SNP look-up script

The SNP look-up script takes as an argument a file listing the SNPs to search for. The SNPs of interest here are those listed in the supplementary tables at:

`//rcdcollaboration01.rti.ns/gxg/Analysis/GSCAN/shared MS version 1/Supplementary_Tables_S6-S12_Loci.xlsx`

* We are focusing on the SNPs from supplementary tables 6-9. The fourth column, titled `rsID`, contains the SNPs. We copy/paste this column into the accompanying file `SNP_ids.txt`, which is located in the same directory as the SNP look-up script, `extract_rows.pl`. Each table gets its own `SNP_ids.txt` under `SNP_finder/Table_S6` through `SNP_finder/Table_S9`.

* The other accompanying file is `perlRun.txt`. It is customized to detail the location of all files needed for the SNP script to run and is located in the same directory as `extract_rows.pl` and `SNP_ids.txt`.
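
A column pasted from Excel on Windows can carry carriage returns, blank lines, the column header, or duplicate IDs, any of which might cause the look-up to miss SNPs. The sketch below shows one way to tidy such a list before use. It is only an illustration: the file name `SNP_ids_raw.txt` and the IDs `rs111` and `rs222` are placeholders, and it assumes `extract_rows.pl` wants one bare rsID per line with no header, which should be checked against the script itself.

```shell
# Hypothetical example: tidy a pasted rsID list before running the look-up.
tmpdir=$(mktemp -d)
raw="$tmpdir/SNP_ids_raw.txt"   # placeholder name for the pasted column
clean="$tmpdir/SNP_ids.txt"

# Simulate a pasted column: header, Windows line endings, a blank line, a duplicate.
printf 'rsID\r\nrs111\r\n\r\nrs222\r\nrs111\r\n' > "$raw"

# Strip carriage returns, keep only lines that look like rsIDs,
# and drop duplicates while preserving the original order.
tr -d '\r' < "$raw" | grep -E '^rs[0-9]+$' | awk '!seen[$0]++' > "$clean"

cat "$clean"
```

Comparing `wc -l` on the cleaned file with the number of rows in the supplementary table is a quick check that no SNP was lost in the paste.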

## Supplementary Table 6
Run the SNP look-up script for the SNPs in supplementary table 6

In [None]:
cd /cygdrive/c/Users/jmarks/Desktop/Projects/Nicotine/GSCAN_extended_results_nicotine/develop

# Look up the SNPs from supplementary table 6 in the 044 data (cross-ancestry)
j=0
for i in /cygdrive/c/Users/jmarks/Desktop/Projects/Nicotine/GSCAN_extended_results_nicotine/data/044.eur13cohorts.afr.9cohorts.eagle_lung.jhs_aric.1000G_p3_markerName_FinalDatasetDNMT3Bpaper/cogend+copdgene+decode+eagle+sage+uw-tturc+gain+nongain+yale-penn+ntr+finnish+aand+jhs+dental_caries+cogend2.afr+eur.chr{1..22}.1df; do
let j++
# location of the Perl SNP look-up script 
perl extract_rows.pl \
--source $i \
--id_list SNP_finder/Table_S6/SNP_ids.txt \
--out results/044_results/Table_S6/chr$j.overlap.txt \
--header 1 \
--id_column 0
done



# Lookup the SNPs from supplementary table 6 in the 045 data (EA-ancestry)
j=0
for i in /cygdrive/c/Users/jmarks/Desktop/Projects/Nicotine/GSCAN_extended_results_nicotine/data/045.eur13cohorts.eagle_lung.jhs_aric.1000G_p3_markerName_FinalDatasetDNMT3Bpaper/cogend+copdgene+decode+eagle+sage+uw-tturc+gain+nongain+yale-penn+ntr+finnish+dental_caries+cogend2.eur.chr{1..22}.exclude_singletons.1df; do

let j++ 
perl extract_rows.pl \
--source $i \
--id_list SNP_finder/Table_S6/SNP_ids.txt \
--out results/045_results/Table_S6/chr$j.overlap.txt \
--header 1 \
--id_column 0
done


# Lookup the SNPs from supplementary table 6 in the 046 data (AA-ancestry)
j=0
for i in /cygdrive/c/Users/jmarks/Desktop/Projects/Nicotine/GSCAN_extended_results_nicotine/data/046.afr.9cohorts.eagle_lung.jhs_aric.1000G_p3_markerName_FinalDatasetDNMT3Bpaper/cogend+copdgene+sage+uw-tturc+gain+yale-penn+aand+jhs+cogend2.afr.chr{1..22}.exclude_singletons.1df; do
let j++
# location of the Perl SNP look-up script 
perl extract_rows.pl \
--source $i \
--id_list SNP_finder/Table_S6/SNP_ids.txt \
--out results/046_results/Table_S6/chr$j.overlap.txt \
--header 1 \
--id_column 0
done

## Supplementary Table 7
Run the SNP look-up script for the SNPs in supplementary table 7

In [None]:
cd /cygdrive/c/Users/jmarks/Desktop/Projects/Nicotine/GSCAN_extended_results_nicotine/develop

# Look up the SNPs from supplementary table 7 in the 044 data (cross-ancestry)
j=0
for i in /cygdrive/c/Users/jmarks/Desktop/Projects/Nicotine/GSCAN_extended_results_nicotine/data/044.eur13cohorts.afr.9cohorts.eagle_lung.jhs_aric.1000G_p3_markerName_FinalDatasetDNMT3Bpaper/cogend+copdgene+decode+eagle+sage+uw-tturc+gain+nongain+yale-penn+ntr+finnish+aand+jhs+dental_caries+cogend2.afr+eur.chr{1..22}.1df; do
let j++
# location of the Perl SNP look-up script 
perl extract_rows.pl \
--source $i \
--id_list SNP_finder/Table_S7/SNP_ids.txt \
--out results/044_results/Table_S7/chr$j.overlap.txt \
--header 1 \
--id_column 0
done


# Lookup the SNPs from supplementary table 7 in the 045 data (EA-ancestry)
j=0
for i in /cygdrive/c/Users/jmarks/Desktop/Projects/Nicotine/GSCAN_extended_results_nicotine/data/045.eur13cohorts.eagle_lung.jhs_aric.1000G_p3_markerName_FinalDatasetDNMT3Bpaper/cogend+copdgene+decode+eagle+sage+uw-tturc+gain+nongain+yale-penn+ntr+finnish+dental_caries+cogend2.eur.chr{1..22}.exclude_singletons.1df; do
let j++ 
perl extract_rows.pl \
--source $i \
--id_list SNP_finder/Table_S7/SNP_ids.txt \
--out results/045_results/Table_S7/chr$j.overlap.txt \
--header 1 \
--id_column 0
done


# Lookup the SNPs from supplementary table 7 in the 046 data (AA-ancestry)
j=0
for i in /cygdrive/c/Users/jmarks/Desktop/Projects/Nicotine/GSCAN_extended_results_nicotine/data/046.afr.9cohorts.eagle_lung.jhs_aric.1000G_p3_markerName_FinalDatasetDNMT3Bpaper/cogend+copdgene+sage+uw-tturc+gain+yale-penn+aand+jhs+cogend2.afr.chr{1..22}.exclude_singletons.1df; do
let j++
# location of the Perl SNP look-up script 
perl extract_rows.pl \
--source $i \
--id_list SNP_finder/Table_S7/SNP_ids.txt \
--out results/046_results/Table_S7/chr$j.overlap.txt \
--header 1 \
--id_column 0
done

## Supplementary Table 8
Run the SNP look-up script for the SNPs in supplementary table 8

In [None]:
cd /cygdrive/c/Users/jmarks/Desktop/Projects/Nicotine/GSCAN_extended_results_nicotine/develop

# Look up the SNPs from supplementary table 8 in the 044 data (cross-ancestry)
j=0
for i in /cygdrive/c/Users/jmarks/Desktop/Projects/Nicotine/GSCAN_extended_results_nicotine/data/044.eur13cohorts.afr.9cohorts.eagle_lung.jhs_aric.1000G_p3_markerName_FinalDatasetDNMT3Bpaper/cogend+copdgene+decode+eagle+sage+uw-tturc+gain+nongain+yale-penn+ntr+finnish+aand+jhs+dental_caries+cogend2.afr+eur.chr{1..22}.1df; do
let j++
# location of the Perl SNP look-up script 
perl extract_rows.pl \
--source $i \
--id_list SNP_finder/Table_S8/SNP_ids.txt \
--out results/044_results/Table_S8/chr$j.overlap.txt \
--header 1 \
--id_column 0
done


# Lookup the SNPs from supplementary table 8 in the 045 data (EA-ancestry)
j=0
for i in /cygdrive/c/Users/jmarks/Desktop/Projects/Nicotine/GSCAN_extended_results_nicotine/data/045.eur13cohorts.eagle_lung.jhs_aric.1000G_p3_markerName_FinalDatasetDNMT3Bpaper/cogend+copdgene+decode+eagle+sage+uw-tturc+gain+nongain+yale-penn+ntr+finnish+dental_caries+cogend2.eur.chr{1..22}.exclude_singletons.1df; do
let j++ 
perl extract_rows.pl \
--source $i \
--id_list SNP_finder/Table_S8/SNP_ids.txt \
--out results/045_results/Table_S8/chr$j.overlap.txt \
--header 1 \
--id_column 0
done


# Lookup the SNPs from supplementary table 8 in the 046 data (AA-ancestry)
j=0
for i in /cygdrive/c/Users/jmarks/Desktop/Projects/Nicotine/GSCAN_extended_results_nicotine/data/046.afr.9cohorts.eagle_lung.jhs_aric.1000G_p3_markerName_FinalDatasetDNMT3Bpaper/cogend+copdgene+sage+uw-tturc+gain+yale-penn+aand+jhs+cogend2.afr.chr{1..22}.exclude_singletons.1df; do
let j++
# location of the Perl SNP look-up script 
perl extract_rows.pl \
--source $i \
--id_list SNP_finder/Table_S8/SNP_ids.txt \
--out results/046_results/Table_S8/chr$j.overlap.txt \
--header 1 \
--id_column 0
done

## Supplementary Table 9
Run the SNP look-up script for the SNPs in supplementary table 9

In [None]:
cd /cygdrive/c/Users/jmarks/Desktop/Projects/Nicotine/GSCAN_extended_results_nicotine/develop

# Look up the SNPs from supplementary table 9 in the 044 data (cross-ancestry)
j=0
for i in /cygdrive/c/Users/jmarks/Desktop/Projects/Nicotine/GSCAN_extended_results_nicotine/data/044.eur13cohorts.afr.9cohorts.eagle_lung.jhs_aric.1000G_p3_markerName_FinalDatasetDNMT3Bpaper/cogend+copdgene+decode+eagle+sage+uw-tturc+gain+nongain+yale-penn+ntr+finnish+aand+jhs+dental_caries+cogend2.afr+eur.chr{1..22}.1df; do
let j++
# location of the Perl SNP look-up script 
perl extract_rows.pl \
--source $i \
--id_list SNP_finder/Table_S9/SNP_ids.txt \
--out results/044_results/Table_S9/chr$j.overlap.txt \
--header 1 \
--id_column 0
done


# Lookup the SNPs from supplementary table 9 in the 045 data (EA-ancestry)
j=0
for i in /cygdrive/c/Users/jmarks/Desktop/Projects/Nicotine/GSCAN_extended_results_nicotine/data/045.eur13cohorts.eagle_lung.jhs_aric.1000G_p3_markerName_FinalDatasetDNMT3Bpaper/cogend+copdgene+decode+eagle+sage+uw-tturc+gain+nongain+yale-penn+ntr+finnish+dental_caries+cogend2.eur.chr{1..22}.exclude_singletons.1df; do
let j++ 
perl extract_rows.pl \
--source $i \
--id_list SNP_finder/Table_S9/SNP_ids.txt \
--out results/045_results/Table_S9/chr$j.overlap.txt \
--header 1 \
--id_column 0
done


# Lookup the SNPs from supplementary table 9 in the 046 data (AA-ancestry)
j=0
for i in /cygdrive/c/Users/jmarks/Desktop/Projects/Nicotine/GSCAN_extended_results_nicotine/data/046.afr.9cohorts.eagle_lung.jhs_aric.1000G_p3_markerName_FinalDatasetDNMT3Bpaper/cogend+copdgene+sage+uw-tturc+gain+yale-penn+aand+jhs+cogend2.afr.chr{1..22}.exclude_singletons.1df; do
let j++
# location of the Perl SNP look-up script 
perl extract_rows.pl \
--source $i \
--id_list SNP_finder/Table_S9/SNP_ids.txt \
--out results/046_results/Table_S9/chr$j.overlap.txt \
--header 1 \
--id_column 0
done
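
Each look-up above writes one file per chromosome, so every table ends up with up to 22 small outputs per analysis set. A single merged file per set and table is easier to paste back into the supplementary tables. The sketch below is one way to do that merge on toy data. It assumes every `chr*.overlap.txt` begins with the same one-line header (not confirmed here for `extract_rows.pl` output), and the column names and the merged file name `all_chr.overlap.txt` are made up for illustration.

```shell
# Sketch: combine per-chromosome look-up outputs into a single table.
# Assumption: each chr*.overlap.txt starts with the same one-line header.
workdir=$(mktemp -d)
outdir="$workdir/results/044_results/Table_S6"
mkdir -p "$outdir"

# Toy stand-ins for two per-chromosome outputs (column names are made up).
printf 'MARKER\tP\nrs111\t0.01\n' > "$outdir/chr1.overlap.txt"
printf 'MARKER\tP\nrs222\t0.20\n' > "$outdir/chr2.overlap.txt"

merged="$outdir/all_chr.overlap.txt"   # hypothetical output name
first=1
for chr in $(seq 1 22); do
    f="$outdir/chr$chr.overlap.txt"
    [ -f "$f" ] || continue            # skip chromosomes without an output file
    if [ "$first" -eq 1 ]; then
        cat "$f" > "$merged"           # keep the header from the first file
        first=0
    else
        tail -n +2 "$f" >> "$merged"   # drop the repeated header
    fi
done

cat "$merged"
```

Looping over chromosome numbers rather than a glob keeps the rows in chromosome order, since a glob would sort `chr10` before `chr2`.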

## Copy results to GxG drive
Due to issues with copying the files from Windows to the gxg drive, the best approach I have found is to tarball the data first, then copy it over, and finally untar it in the new location.

In [None]:
# local machine
cd /cygdrive/c/Users/jmarks/Desktop/Projects/Nicotine/GSCAN_extended_results_nicotine/develop/

tar -czvf results.tar.gz results/
cp results.tar.gz //rcdcollaboration01.rti.ns/gxg/Analysis/META/GSCAN_extended_results_nicotine/

cd //rcdcollaboration01.rti.ns/gxg/Analysis/META/GSCAN_extended_results_nicotine/

# untar
tar -xzvf results.tar.gz
rm results.tar.gz
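
Because the copy to the share drive has been unreliable, it may be worth confirming that the tarball arrived intact before untarring it and deleting anything. The sketch below illustrates the idea with `cksum` on temporary directories that stand in for the local `develop/` directory and the share-drive destination. The variable names are placeholders, and the real paths would need to be substituted.

```shell
# Sketch: confirm a copied tarball matches the original before untarring it.
src_dir=$(mktemp -d)   # stands in for the local develop/ directory
dst_dir=$(mktemp -d)   # stands in for the share-drive destination

mkdir -p "$src_dir/results"
echo "toy result" > "$src_dir/results/chr1.overlap.txt"

tar -czf "$src_dir/results.tar.gz" -C "$src_dir" results
cp "$src_dir/results.tar.gz" "$dst_dir/"

# Reading from stdin makes cksum print only "<checksum> <byte count>",
# so the two strings can be compared directly.
src_sum=$(cksum < "$src_dir/results.tar.gz")
dst_sum=$(cksum < "$dst_dir/results.tar.gz")

if [ "$src_sum" = "$dst_sum" ]; then
    copy_status="match"
    tar -xzf "$dst_dir/results.tar.gz" -C "$dst_dir"
else
    copy_status="mismatch"   # leave both copies in place and retry the cp
fi
echo "$copy_status"
```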