## Mac - Convert your spreadsheet to a popART traits block

Convert your spreadsheet to a popART traits block. No manual coding of traits with 1s and 0s is necessary; this notebook does it all for you!

### User Input Required: 
- **Lines 2 and 4** (.csv input file name and traits name)
- Optional: lines 5 and 12 (special "use_name" option and "auto-concat" file concatenation with your .nex alignment)
    - If using optional lines, remember to toggle cell to "Code" option in top menu


**Note: 2 files will be created**
- **pART_traits.txt** is the popART traits file containing everything you need for the popART traits block and matrix (this is the one you want to save, and the one that will be appended to the nexus file you specify).
- **pART_matrix.txt** is just the traits matrix. It is a useful file to check if anything does not look as expected in the pART_traits file.

*Prerequisites - In order to run this file you must have the following programs and packages installed:*
- *Python*
- *Jupyter Notebooks*
- *Pandas* (use `conda install pandas`)
- *Numpy* (use `conda install numpy`)

**Run this cell:**

In [1]:
import pandas as pd
import numpy as np

### Input your file name:

`df = pd.read_csv('YOUR_FILE_NAME_HERE.csv')`

- File requirements: .csv format, all column headers lowercase, and a "sample_name" column

- If your file does not have all lowercase column headers, see below:

In [2]:
df = pd.read_csv('thermi_testC.csv')
df

Unnamed: 0,Sample_name,BIC_Accession,Genus,Species,Locality,Latitude,Longitude
0,PB4088,A7969,Thermiphione,rapanui,Southern,-37.8,-110.92
1,PBS4088,A7970,Thermiphione,rapanui,Southern,-37.8,-110.92
2,PB4096A,A7971,Thermiphione,rapanui,Easter_Island,-23.55,-115.57
3,PB4096C,A7972,Thermiphione,rapanui,Easter_Island,-23.55,-115.57


In [3]:
# (Optional) if your file does not have all lowercase headers:

df.columns = [x.lower() for x in df.columns]
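
A minimal sketch of how you might confirm that the columns used later in this notebook are present once the headers are lowercased. The toy dataframe `df_demo` and the helper `missing_columns` are hypothetical names used only for illustration; swap in your own `df` and column list.

```python
import pandas as pd

# Toy frame standing in for the real spreadsheet (values are illustrative only).
df_demo = pd.DataFrame({
    "Sample_name": ["PB4088", "PB4096A"],
    "Genus": ["Thermiphione", "Thermiphione"],
    "Species": ["rapanui", "rapanui"],
    "Locality": ["Southern", "Easter_Island"],
})

# Same normalization as the cell above.
df_demo.columns = [c.lower() for c in df_demo.columns]

def missing_columns(frame, required):
    """Return the required column names that are absent from frame (hypothetical helper)."""
    return [c for c in required if c not in frame.columns]

missing = missing_columns(df_demo, ["sample_name", "locality"])
print(missing)  # an empty list means every required column was found
```

If the list is not empty, check the spelling of the headers in the .csv before continuing.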

### Input your traits column name:
This is the column that you want to code as your traits in your haplotype network (e.g. locality, depth).

`def get_traits():
    df['traits'] = df['YOUR_TRAIT_NAME_HERE']
get_traits()`

In [4]:
def get_traits():
    df['traits'] = df['locality']
get_traits()
df

Unnamed: 0,sample_name,bic_accession,genus,species,locality,latitude,longitude,traits
0,PB4088,A7969,Thermiphione,rapanui,Southern,-37.8,-110.92,Southern
1,PBS4088,A7970,Thermiphione,rapanui,Southern,-37.8,-110.92,Southern
2,PB4096A,A7971,Thermiphione,rapanui,Easter_Island,-23.55,-115.57,Easter_Island
3,PB4096C,A7972,Thermiphione,rapanui,Easter_Island,-23.55,-115.57,Easter_Island


### Use_name
This is the name that will be used to create the popART matrix. You have two options, explained below:

**Option 1:**
`use_name()` function creates a new "use_name" column that concatenates the "genus", "species", and the "sample_name" columns, separated by underscores. This is the name that will be used for the popART matrix.
- **Note**: Your columns can be in any order in the dataframe, but must be all lowercase like this (no spaces): genus, species, sample_name

`def use_name():
    df['use_name'] = df[['genus', 'species',
                         'sample_name']].apply(lambda x: '_'.join(x), axis=1)
use_name()`

**Option 2:**
`use_orig()` function copies your "sample_name" column into the "use_name" column so that only the sample names will be used to create the matrix (use this option if genus and species are not available).

`def use_orig():
    df['use_name'] = df['sample_name']
use_orig()`
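
To make the difference between the two options concrete, here is a small sketch on a toy dataframe. The frame `df_demo` and the variables `option1` and `option2` are hypothetical; the `astype(str)` call is an added safeguard (not part of the notebook) so that numeric sample names would not break the join.

```python
import pandas as pd

# Toy data with the lowercase headers this notebook expects.
df_demo = pd.DataFrame({
    "genus": ["Thermiphione", "Thermiphione"],
    "species": ["rapanui", "rapanui"],
    "sample_name": ["PB4088", "PB4096A"],
})

# Option 1: genus, species and sample name joined with underscores.
option1 = (
    df_demo[["genus", "species", "sample_name"]]
    .astype(str)
    .apply(lambda row: "_".join(row), axis=1)
)

# Option 2: the sample name on its own.
option2 = df_demo["sample_name"].copy()

print(option1.tolist())
print(option2.tolist())
```

Whichever option you choose, the names must match the sequence names in your nexus alignment.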

In [5]:
# Option 2:
def use_orig():
    df['use_name'] = df['sample_name']
use_orig()
#df

### Setup Finished! 
**If you want to set your output file now**, scroll down to the bottom and enter the file name.

Otherwise, click "restart and run all" to run the rest of the code as is and get your finished popART traits file.

In [6]:
df2 = df[['use_name', 'traits']]
# Creates dummy values (1 and 0) and concatenates them with the traits dataframe
df2_matrix = pd.get_dummies(df2['traits'], dtype=int) # dtype=int keeps 1/0 output (newer pandas defaults to True/False); must keep this dataframe for use later
df3 = pd.concat([df2, df2_matrix], axis=1)
del df3['traits'] # This deletes the Traits column
df3
# Save as csv without headers
df3.to_csv('p_matrix.csv', header=None,index=False)
# Convert to txt KEEPING commas
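
As an illustration of what the dummy-coding step produces, the sketch below runs the same calls on a three-row toy frame (the frame `demo` and its values are made up for the example). Passing `dtype=int` is an assumption on my part about what you want: recent pandas versions return `True`/`False` by default, while the traits matrix needs 1s and 0s.

```python
import pandas as pd

# Toy stand-in for df2: one name column and one traits column.
demo = pd.DataFrame({
    "use_name": ["PB4088", "PBS4088", "PB4096A"],
    "traits": ["Southern", "Southern", "Easter_Island"],
})

# One column per distinct trait value, with 1 where the sample has that trait.
dummies = pd.get_dummies(demo["traits"], dtype=int)

# Name column followed by the 1/0 columns, as in the cell above.
matrix = pd.concat([demo[["use_name"]], dummies], axis=1)

# With no path given, to_csv returns the text instead of writing a file.
csv_text = matrix.to_csv(header=False, index=False)
print(list(dummies.columns))
print(csv_text)
```

The column order printed here is the order the trait labels must follow in the traits block.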

In [None]:
# Mac only: rename the .csv so it is saved as a text file (shell command)
%mv p_matrix.csv p_matrix.txt

In [8]:
#Textfile Editing

# Replace the first comma with 2 spaces
with open("p_matrix.txt") as in_file, open("pART_matrix.txt", "w") as out_file:
    for line in in_file:
        out_file.write(line.replace(',', '  ', 1))
# Add ending format to file
outfile = 'pART_matrix.txt'
with open(outfile, 'a') as target:
    target.write(";")
    target.write("\n")
    target.write("\n")
    target.write("END;")
    target.write("\n")
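
The matrix file can also be written directly from the dataframe, without the intermediate .csv or any shell commands. This is only a sketch of that alternative, not the notebook's method; `write_matrix`, `demo` and the temporary file path are hypothetical names.

```python
import os
import tempfile
import pandas as pd

def write_matrix(matrix, path):
    """Write each row as 'name  v1,v2,...', then the closing ';' and 'END;' lines."""
    with open(path, "w") as out:
        for row in matrix.itertuples(index=False):
            name, *values = row
            # Two spaces after the name, commas between the 1/0 values.
            out.write("%s  %s\n" % (name, ",".join(str(v) for v in values)))
        out.write(";\n\nEND;\n")

# Toy matrix: first column is the name, the rest are dummy-coded traits.
demo = pd.DataFrame({
    "use_name": ["PB4088", "PB4096A"],
    "Easter_Island": [0, 1],
    "Southern": [1, 0],
})

demo_path = os.path.join(tempfile.mkdtemp(), "pART_matrix_demo.txt")
write_matrix(demo, demo_path)

with open(demo_path) as fh:
    matrix_text = fh.read()
print(matrix_text)
```

Because it uses only Python file handling, this version would behave the same on any operating system.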

In [None]:
# MACS: remove 'p_matrix.txt' because it is no longer needed
%rm 'p_matrix.txt'

In [10]:
# Make traitlabels from imported dataframe
labels = list(df2_matrix.columns.values)
labels # print output to confirm
# Make NTRAITS from imported dataframe and convert to string
ntraits = len(df2_matrix.columns)
ntraits = str(ntraits) # convert to string format
# Make traits file
outfile = 'pART_traits.txt'
with open(outfile, 'w') as target:
    target.write("BEGIN TRAITS;")
    target.write("\n")
    target.write("[This is the traits block specific to PopART. Check that NTRAITS matches the number of traits and they are in the same order as the matrix.]")
    target.write("\n")
    target.write("\tDimensions NTRAITS=")
    target.write("%s;" % ntraits) # write the whole count once so 10 or more traits are not split into separate digits
    target.write("\n")
    target.write("\tFormat labels=yes missing=? separator=Comma;")
    target.write("\n")
    target.write("\tTraitlabels")
    for item in labels:
        target.write(" %s" % item)
    target.write(";")
    target.write("\n")
    target.write("\tMatrix")
    target.write("\n")
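
The header can also be assembled as a single string, which makes it easy to inspect before anything is written to disk. The function `traits_header` below is a hypothetical helper, and the sketch leaves out the bracketed comment line that the cell above writes.

```python
def traits_header(labels):
    """Return the opening lines of a PopART traits block for the given trait labels."""
    lines = [
        "BEGIN TRAITS;",
        "\tDimensions NTRAITS=%d;" % len(labels),
        "\tFormat labels=yes missing=? separator=Comma;",
        "\tTraitlabels %s;" % " ".join(labels),
        "\tMatrix",
    ]
    return "\n".join(lines) + "\n"

header = traits_header(["Easter_Island", "Southern"])
print(header)
```

Check that the NTRAITS value equals the number of labels and that the labels appear in the same order as the matrix columns.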

In [None]:
# MACS only: append files to create final traits file
%cat pART_matrix.txt >> pART_traits.txt

### Optional Auto-Concat: ###
We can now append this to our nexus file for use in popART. Type the name and path (if needed) of your nexus file here:

`cat pART_traits.txt >> YOUR_FILENAME.nex`

**OR** just copy and paste the contents of pART_traits.txt into the desired nexus file if you don't want to type the file name in here.
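
If you would rather not rely on the shell `cat` command, the same append can be done in plain Python. This is a sketch under assumptions: `append_file` is a hypothetical helper, and the demonstration uses throwaway files with placeholder contents rather than a real alignment.

```python
import os
import tempfile

def append_file(src_path, dest_path):
    """Append the full contents of src_path to the end of dest_path."""
    with open(src_path) as src, open(dest_path, "a") as dest:
        dest.write(src.read())

# Demonstration on temporary files; names and contents are placeholders.
workdir = tempfile.mkdtemp()
traits_path = os.path.join(workdir, "pART_traits.txt")
nexus_path = os.path.join(workdir, "alignment_demo.nex")

with open(traits_path, "w") as fh:
    fh.write("BEGIN TRAITS;\nEND;\n")
with open(nexus_path, "w") as fh:
    fh.write("#NEXUS\n")

append_file(traits_path, nexus_path)

with open(nexus_path) as fh:
    combined = fh.read()
print(combined)
```

Run the append only once per nexus file, since each call adds another copy of the traits block.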