
# **Aim 3: Trait variation: CNM**
* This file contains frequency counts and the code to prepare the dfs for trait variation analysis (performed in R).
* Some outputs have been removed to protect PII and the raw data.

The cnm data frequencies will be calculated here to assess potential trait variation among populations.

# Input data files:

**My data**
*   *CES_cnm_recoded.xlsx*: This file includes the cranial nonmetric and macromorphoscopic data and demographics after left and right sides have been collapsed and all traits have been recoded.
*   *CES_cnm_prepped.xlsx*: Contains my data, prepared for merging with the other dfs.

**Comp1 data**
*   *Comp1_cnm_collapsed.xlsx*: This file contains the Comp1 cnm data with the left and right sides collapsed (prior to being recoded).
*   *Comp1_cnm_prepped.xlsx*: Contains the Comp1 data, prepared for merging with the other dfs.

**Comp2 data**
*   *Comp2_sex.xlsx*: This file contains the male and female counts for the Comp2 dataset.
*   *Comp2_cnm_prepped.xlsx*: Contains the Comp2 data, prepared for merging with the other dfs.

**Comp3 data**
*   *Comp3_cnm_prepped.xlsx*: Contains the Comp3 data, prepared for merging with the other dfs.
*   *cnm_for_TMD.xlsx*: Contains all four dfs merged. This file was subsequently altered in Excel to remove all demographic columns except Population Code.

**All cnm data merged**
*   *cnm_all_merged.xlsx*: This file contains all four cnm dfs merged. This file can be used in subsequent analyses (except the TMD analysis, which required my df to be matched individually to each comparative df).

**TMD analysis files**
*   *CES_Comp1_for_TMD.xlsx*: This file contains my dataset and Comp1 matched and prepared for the TMD analysis.
*   *CES_Comp2_for_TMD.xlsx*: This file contains my dataset and Comp2 matched and prepared for the TMD analysis.
*   *CES_Comp3_for_TMD.xlsx*: This file contains my dataset and Comp3 matched and prepared for the TMD analysis.

### Import libraries

In [None]:
# Import libraries
import pandas as pd
import scipy
import numpy as np
import seaborn as sns
import os

In [None]:
!pip install --upgrade openpyxl



### Set print options

In [None]:
import sys
import pandas as pd
pd.set_option('display.max_rows', 500)
pd.set_option('display.max_columns', 500)
pd.set_option('display.width', 1000)
np.set_printoptions(threshold=sys.maxsize)

### Set export

In [None]:
from google.colab import drive
drive.mount('/drive')

Mounted at /drive


---

# CES vs. Comp1

In [None]:
CES_Comp1 = pd.read_csv('CES_Comp1_for_TMD3.csv')

The output for the following code cell was removed to protect PII and/or the raw data. The output contained the following columns:
* Population, APF, AST, BREG....TYM



In [None]:
CES_Comp1.head()

## Frequency counts

In [None]:
CES_Comp1['APF'].value_counts(normalize=True)

0.0    0.565789
1.0    0.434211
Name: APF, dtype: float64
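
`value_counts(normalize=True)` drops missing values by default, so the proportions above are out of the scored (non-null) observations rather than all rows. A minimal sketch on made-up scores illustrates the difference; the `toy` Series is hypothetical and is not the study data.

```python
import numpy as np
import pandas as pd

# Hypothetical scores: three absent (0), one present (1), one unscored (NaN)
toy = pd.Series([0, 0, 0, 1, np.nan], name='APF')

# Default behaviour: NaN is excluded, so the denominator is 4
freq_scored = toy.value_counts(normalize=True)

# dropna=False keeps NaN as its own category, so the denominator is 5
freq_all = toy.value_counts(normalize=True, dropna=False)

print(freq_scored)
print(freq_all)
```

This is why the manual calculation later in this section, which divides by `notnull().sum()`, should agree with `value_counts(normalize=True)`.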


In [None]:
CES_Comp1_groups = CES_Comp1.groupby(['Population'])

In [None]:
CES_Comp1_groups

<pandas.core.groupby.generic.DataFrameGroupBy object at 0x7f2356fe3590>

In [None]:
CES_Comp1_groups['APF'].value_counts()

Population      APF
Asian_American  1.0    56
                0.0     9
Black           0.0    40
                1.0    11
Hispanic        0.0    21
                1.0     5
Japanese        1.0    18
                0.0    14
White           0.0    88
                1.0    42
Name: APF, dtype: int64
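
The per-population counts above can be converted into presence frequencies (the proportion scored 1 within each population). The sketch below rebuilds a toy `APF` column from the printed counts only, so it does not reproduce any other part of the raw data; the names `counts`, `toy`, and `presence_freq` are illustrative.

```python
import pandas as pd

# Per-population APF counts taken from the output above:
# population -> (number scored 0, number scored 1)
counts = {
    'Asian_American': (9, 56),
    'Black': (40, 11),
    'Hispanic': (21, 5),
    'Japanese': (14, 18),
    'White': (88, 42),
}

# Expand the counts back into one row per scored individual
rows = []
for pop, (n0, n1) in counts.items():
    rows += [(pop, 0.0)] * n0 + [(pop, 1.0)] * n1
toy = pd.DataFrame(rows, columns=['Population', 'APF'])

# For 0/1 data the group mean equals the proportion scored 1;
# mean() skips NaN, so unscored individuals would not enter the denominator
presence_freq = toy.groupby('Population')['APF'].mean()
print(presence_freq.round(6))
```

On the real data frame, the same idea applied to all trait columns at once (a `groupby` on `Population` followed by `mean()`) would give a population-by-trait frequency table, assuming every trait column is coded 0/1.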

In [None]:
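# CES_Comp1_groups[0] would look up a column named 0 and raise a KeyError;
# list the group labels instead
list(CES_Comp1_groups.groups)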

In [None]:
CES_Comp1['APF'].value_counts()

0.0    172
1.0    132
Name: APF, dtype: int64

In [None]:
# calculate number of 0 scores in APF
# (comparing the GroupBy object itself to 0 only returns False, so compare the column instead)
(CES_Comp1['APF'] == 0).sum()

172

In [None]:
CES_Comp1[CES_Comp1['APF'] == 0].shape[0]

In [None]:
# cnm[cnm['SONR'].values == 0].count()

In [None]:
# calculate number of non-null (non-NA) scores in APF
CES_Comp1['APF'].notnull().sum()


304

In [None]:
# Only scores 0 and 1 are included because the data are dichotomized to match Comp1 and for the TMD analysis (binary data only)
n_scored = CES_Comp1['APF'].notnull().sum()  # number of non-missing APF scores
print('Frequency of score=0:', '\t', (CES_Comp1['APF'] == 0).sum() / n_scored)
print('Frequency of score=1:', '\t', (CES_Comp1['APF'] == 1).sum() / n_scored)
# Scores 2-5 are kept for reference only (for use with non-dichotomized data)
# print('Frequency of score=2:', '\t', (CES_Comp1['APF'] == 2).sum() / n_scored)
# print('Frequency of score=3:', '\t', (CES_Comp1['APF'] == 3).sum() / n_scored)
# print('Frequency of score=4:', '\t', (CES_Comp1['APF'] == 4).sum() / n_scored)
# print('Frequency of score=5:', '\t', (CES_Comp1['APF'] == 5).sum() / n_scored)
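
Because the TMD analysis expects binary data only, it can help to confirm that every trait column holds nothing but 0, 1, or missing values before computing frequencies. The following is a minimal sketch on a hypothetical table; the column `BAD` and the names `traits`, `ok`, and `non_binary` are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical trait table; 'BAD' contains a leftover ordinal score of 2
toy = pd.DataFrame({
    'Population': ['A', 'A', 'B', 'B'],
    'APF': [0, 1, np.nan, 1],
    'BAD': [0, 2, 1, np.nan],
})
traits = toy.drop(columns='Population')

# A cell is acceptable if it is 0, 1, or missing
ok = traits.isin([0, 1]) | traits.isna()

# Names of trait columns with at least one unexpected value
non_binary = [col for col in traits.columns if not ok[col].all()]
print(non_binary)
```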


In [None]:
# slice from column 1 because that is the first cnm trait column

CES_Comp1_VarsOnly = pd.DataFrame(CES_Comp1.iloc[:, 1:])

In [None]:
CES_Comp1_VarsOnly.head()

In [None]:
#column_names = ('Trait Code', 0, 1, 2, 3, 4, 5)
column_names = ('Trait Code', 0, 1)
CES_Comp1_freq_counts = pd.DataFrame(columns = column_names)
for i in CES_Comp1_VarsOnly.columns:
    #print(cnm[i].value_counts(normalize=True))
    freq_0 = (CES_Comp1_VarsOnly[CES_Comp1_VarsOnly[i] == 0].shape[0]) / (CES_Comp1_VarsOnly[i].notnull().sum())
    freq_1 = (CES_Comp1_VarsOnly[CES_Comp1_VarsOnly[i] == 1].shape[0]) / (CES_Comp1_VarsOnly[i].notnull().sum())
    #freq_2 = (CES_Comp1_VarsOnly[CES_Comp1_VarsOnly[i] == 2].shape[0]) / (CES_Comp1_VarsOnly[i].notnull().sum())
    #freq_3 = (CES_Comp1_VarsOnly[CES_Comp1_VarsOnly[i] == 3].shape[0]) / (CES_Comp1_VarsOnly[i].notnull().sum())
    #freq_4 = (CES_Comp1_VarsOnly[CES_Comp1_VarsOnly[i] == 4].shape[0]) / (CES_Comp1_VarsOnly[i].notnull().sum())
    #freq_5 = (CES_Comp1_VarsOnly[CES_Comp1_VarsOnly[i] == 5].shape[0]) / (CES_Comp1_VarsOnly[i].notnull().sum())
    #freq_6 = (CES_Comp1_VarsOnly[CES_Comp1_VarsOnly[i] == 6].shape[0]) / (CES_Comp1_VarsOnly[i].notnull().sum())
    #print(i,'\t', freq_0,'\t', freq_1,'\t', freq_2,'\t', freq_3,'\t', freq_4,'\t', freq_5)
    #CES_Comp1_freq_counts.loc[i] = [i, round(freq_0, 2), round(freq_1, 2), round(freq_2, 2), round(freq_3, 2), round(freq_4, 2), round(freq_5, 2)]
    print(i,'\t', freq_0, '\t', freq_1)
    CES_Comp1_freq_counts.loc[i] = [i, round(freq_0, 2), round(freq_1, 2)]

APF 	 0.5657894736842105 	 0.4342105263157895
AST 	 0.889967637540453 	 0.11003236245954692
BREG 	 0.9761092150170648 	 0.023890784982935155
CRB 	 0.6151202749140894 	 0.3848797250859107
EPB 	 0.8901098901098901 	 0.10989010989010989
FTA 	 0.9633333333333334 	 0.03666666666666667
HYP 	 0.8734567901234568 	 0.12654320987654322
IFS 	 0.8163934426229508 	 0.18360655737704917
INCA 	 0.9934426229508196 	 0.006557377049180328
LBLa 	 0.6395759717314488 	 0.36042402826855124
LBM 	 0.716 	 0.284
MANT 	 0.7024221453287197 	 0.2975778546712803
MEN 	 0.9437086092715232 	 0.056291390728476824
METO 	 0.953125 	 0.046875
MF 	 0.2246153846153846 	 0.7753846153846153
MFLo 	 0.41139240506329117 	 0.5886075949367089
MHB 	 0.8300653594771242 	 0.16993464052287582
MIF 	 0.8892508143322475 	 0.11074918566775244
OMB 	 0.8848920863309353 	 0.11510791366906475
PALT 	 0.5374592833876222 	 0.46254071661237783
PF 	 0.37577639751552794 	 0.6242236024844721
PHAR 	 0.7543859649122807 	 0.24561403508771928
PNB 	 0.79
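
The same per-trait frequencies can also be computed without an explicit loop. This is a sketch of an equivalent vectorized approach on hypothetical data, not the code used to produce the output above; `toy`, `n_scored`, and `freq_table` are illustrative names.

```python
import numpy as np
import pandas as pd

# Hypothetical binary trait scores with a missing value (not the study data)
toy = pd.DataFrame({
    'APF': [0, 1, 1, np.nan],
    'AST': [0, 0, 0, 1],
})

# Number of non-missing scores per trait (the denominator)
n_scored = toy.notna().sum()

# One row per trait, one column per score, rounded as in the loop version
freq_table = pd.DataFrame({
    0: (toy == 0).sum() / n_scored,   # proportion scored 0
    1: (toy == 1).sum() / n_scored,   # proportion scored 1
}).round(2)
freq_table.index.name = 'Trait Code'
print(freq_table)
```

Comparisons with `NaN` evaluate to `False`, so missing scores are counted in neither numerator, and the denominator excludes them as in the loop.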

In [None]:
CES_Comp1_freq_counts.tail()

Unnamed: 0,Trait Code,0,1
PF,PF,0.38,0.62
PHAR,PHAR,0.75,0.25
PNB,PNB,0.79,0.21
SOF,SOF,0.7,0.3
TYM,TYM,0.94,0.06


In [None]:
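# Load the per-population frequency table from Excel
# (this overwrites the pooled frequency table built in the loop above)
CES_Comp1_freq_counts = pd.read_excel('CES_Comp1_for_TMD3_freq_table for diss.xlsx')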

In [None]:
CES_Comp1_freq_counts.head()

Unnamed: 0,Trait Code,Asian American,Japanese,Black,White,Hispanic
0,APF,0.861538,0.5625,0.215686,0.323077,0.192308
1,AST,0.125,0.0625,0.122807,0.10219,0.148148
2,BREG,0.039216,0.0,0.0,0.037037,0.0
3,CRB,0.571429,0.03125,0.4,0.43609,0.185185
4,EPB,0.157895,0.125,0.019231,0.138211,0.071429


The output for the following code cell was removed to protect PII and/or the raw data. The output contained the following columns:
* Trait Code, Trait Name, Skipper, Comp1, Comp2, Comp3, Notes, R script names, R dataframe names, write.csv in "Recoded data" file
* Skipper, Comp1, Comp2, Comp3 columns contain the original and recoded values for each column/df for each trait

In [None]:
codes_names = pd.read_excel('Trait list and scoring.xlsx')
codes_names.head()

In [None]:
codes_names = pd.DataFrame(codes_names.iloc[:, :2])
codes_names.head()

Unnamed: 0,Trait Code,Trait Name
0,METO,Metopic suture
1,,
2,INCA,Oss inca
3,OMB,Occipito-mastoid suture ossicle
4,AST,Asterionic ossicle


In [None]:
CES_Comp1_freq_counts_labeled = pd.merge(CES_Comp1_freq_counts, codes_names, on='Trait Code')

In [None]:
CES_Comp1_freq_counts_labeled

Unnamed: 0,Trait Code,Asian American,Japanese,Black,White,Hispanic,Trait Name
0,APF,0.861538,0.5625,0.215686,0.323077,0.192308,Accessory lesser palatine foramen
1,AST,0.125,0.0625,0.122807,0.10219,0.148148,Asterionic ossicle
2,BREG,0.039216,0.0,0.0,0.037037,0.0,Bregma ossicle
3,CRB,0.571429,0.03125,0.4,0.43609,0.185185,Coronal ossicle
4,EPB,0.157895,0.125,0.019231,0.138211,0.071429,Epipteric bone
5,FTA,0.0,0.03125,0.086207,0.030075,0.035714,Fronto-temporal articulation
6,HYP,0.0,0.0,0.084746,0.214286,0.206897,Hypoglossal canal bridged or double
7,IFS,0.0,0.0,0.145455,0.317829,0.269231,Infraorbital suture
8,LBLa,0.520833,0.15625,0.288462,0.408,0.230769,Lambdoid ossicle lateral
9,LBM,0.3125,0.1875,0.265306,0.324561,0.217391,Lambdoid ossicle medial


In [None]:
CES_Comp1_freq_counts_labeled.to_excel('/drive/My Drive/Colab Notebooks/Statistical analysis/Aim 3 - Trait variation/CES_Comp1_freq_counts_labeled.xlsx', index=True)

---

# CES vs. Comp2

In [None]:
CES_Comp2 = pd.read_excel('CES_Comp2_for_TMD2_freq_table for diss.xlsx')

In [None]:
CES_Comp2.head()

Unnamed: 0,Trait Code,AA,J,USB,KEN,TAN,SUD,S,GAB,GHA,NIG,W,HK,AL,SAL,SLS,NAL,CAR,EAR,ARM,AT,CAN,BV,CZ,GE,RU,HU,IC,IND,SIE,C,N,NN,CHN,MON,SIB,NMV,ILL,NFL,ONT,NPC,PEC,PLN,PLT,CH,PT,TF,AU,CHAT,MQ,NZ
0,AST,0.125,0.0625,0.150943,0.190476,0.217391,0.190476,0.222222,0.333333,0.354839,0.25,0.173729,0.209581,0.132867,0.214286,0.173913,0.193634,0.083333,0.094891,0.113208,0.086667,0.066265,0.0,0.166667,0.166667,0.153846,0.166667,0.026316,0.174312,0.168831,0.172414,0.121951,0.176471,0.169231,0.178571,0.179012,0.115207,0.366667,0.1,0.181818,0.208333,0.308219,0.13,0.186047,0.214286,0.0,0.076923,0.22449,0.368421,0.266667,0.166667
1,CIV,0.0,0.0,0.016949,0.0,0.0,0.0,0.018182,0.0,0.0,0.0,0.039301,0.043478,0.040248,0.039216,0.052239,0.072539,0.148789,0.043624,0.036697,0.06962,0.058824,0.0,0.076923,0.142857,0.0,0.017857,0.078947,0.034483,0.034884,0.019355,0.017442,0.069767,0.088235,0.033898,0.017857,0.071111,0.05,0.0,0.0,0.094851,0.048276,0.095455,0.13089,0.064516,0.0,0.0,0.0,0.0,0.042553,0.068182
2,HYP,0.0,0.0,0.166667,0.095238,0.116279,0.136364,0.156863,0.0,0.096774,0.086957,0.139241,0.216374,0.242718,0.192233,0.238971,0.171875,0.229452,0.206081,0.214953,0.173333,0.216495,0.428571,0.153846,0.333333,0.166667,0.368421,0.194444,0.172414,0.264368,0.100649,0.156977,0.102564,0.102941,0.068966,0.122699,0.231481,0.236111,0.12,0.136364,0.226667,0.159091,0.20197,0.19186,0.241379,0.4,0.384615,0.02,0.157895,0.111111,0.159091
3,MEN,0.046875,0.09375,0.166667,0.0,0.222222,0.080645,0.26,0.0,0.2,0.0,0.115556,0.209302,0.121849,0.090476,0.080402,0.096916,0.077869,0.068966,0.075472,0.075758,0.071429,0.0,0.0,0.0,0.0,0.096154,0.0,0.043478,0.05814,0.151316,0.123457,0.025,0.113208,0.152174,0.112245,0.097345,0.0375,0.107143,0.03125,0.120521,0.09375,0.106509,0.069231,0.111111,0.25,0.125,0.116279,0.25,0.2,0.130435
4,METO,0.0,0.0,0.035714,0.047619,0.042553,0.015625,0.0,0.0,0.0,0.035714,0.074803,0.054348,0.042553,0.015625,0.032028,0.011331,0.011719,0.003876,0.112069,0.025157,0.087719,0.0,0.076923,0.142857,0.0,0.051724,0.078947,0.025862,0.130952,0.073864,0.117647,0.02,0.161765,0.152542,0.00578,0.010274,0.022472,0.0,0.016949,0.02611,0.013245,0.013453,0.004926,0.03125,0.0,0.0,0.02,0.05,0.02,0.0


In [None]:
codes_names.head()

Unnamed: 0,Trait Code,Trait Name
0,METO,Metopic suture
1,,
2,INCA,Oss inca
3,OMB,Occipito-mastoid suture ossicle
4,AST,Asterionic ossicle


In [None]:
CES_Comp2_freq_counts_labeled_IP = pd.merge(CES_Comp2, codes_names, on='Trait Code')
CES_Comp2_freq_counts_labeled_IP.head()

Unnamed: 0,Trait Code,AA,J,USB,KEN,TAN,SUD,S,GAB,GHA,NIG,W,HK,AL,SAL,SLS,NAL,CAR,EAR,ARM,AT,CAN,BV,CZ,GE,RU,HU,IC,IND,SIE,C,N,NN,CHN,MON,SIB,NMV,ILL,NFL,ONT,NPC,PEC,PLN,PLT,CH,PT,TF,AU,CHAT,MQ,NZ,Trait Name
0,AST,0.125,0.0625,0.150943,0.190476,0.217391,0.190476,0.222222,0.333333,0.354839,0.25,0.173729,0.209581,0.132867,0.214286,0.173913,0.193634,0.083333,0.094891,0.113208,0.086667,0.066265,0.0,0.166667,0.166667,0.153846,0.166667,0.026316,0.174312,0.168831,0.172414,0.121951,0.176471,0.169231,0.178571,0.179012,0.115207,0.366667,0.1,0.181818,0.208333,0.308219,0.13,0.186047,0.214286,0.0,0.076923,0.22449,0.368421,0.266667,0.166667,Asterionic ossicle
1,CIV,0.0,0.0,0.016949,0.0,0.0,0.0,0.018182,0.0,0.0,0.0,0.039301,0.043478,0.040248,0.039216,0.052239,0.072539,0.148789,0.043624,0.036697,0.06962,0.058824,0.0,0.076923,0.142857,0.0,0.017857,0.078947,0.034483,0.034884,0.019355,0.017442,0.069767,0.088235,0.033898,0.017857,0.071111,0.05,0.0,0.0,0.094851,0.048276,0.095455,0.13089,0.064516,0.0,0.0,0.0,0.0,0.042553,0.068182,Pterygospinous bridge complete (foramen of civ...
2,HYP,0.0,0.0,0.166667,0.095238,0.116279,0.136364,0.156863,0.0,0.096774,0.086957,0.139241,0.216374,0.242718,0.192233,0.238971,0.171875,0.229452,0.206081,0.214953,0.173333,0.216495,0.428571,0.153846,0.333333,0.166667,0.368421,0.194444,0.172414,0.264368,0.100649,0.156977,0.102564,0.102941,0.068966,0.122699,0.231481,0.236111,0.12,0.136364,0.226667,0.159091,0.20197,0.19186,0.241379,0.4,0.384615,0.02,0.157895,0.111111,0.159091,Hypoglossal canal bridged or double
3,MEN,0.046875,0.09375,0.166667,0.0,0.222222,0.080645,0.26,0.0,0.2,0.0,0.115556,0.209302,0.121849,0.090476,0.080402,0.096916,0.077869,0.068966,0.075472,0.075758,0.071429,0.0,0.0,0.0,0.0,0.096154,0.0,0.043478,0.05814,0.151316,0.123457,0.025,0.113208,0.152174,0.112245,0.097345,0.0375,0.107143,0.03125,0.120521,0.09375,0.106509,0.069231,0.111111,0.25,0.125,0.116279,0.25,0.2,0.130435,Accessory mental foramen
4,MHB,0.415385,0.125,0.074074,0.0,0.1,0.063492,0.24,0.0,0.04,0.0,0.068182,0.179688,0.266667,0.090465,0.084158,0.123894,0.144681,0.114679,0.078431,0.209302,0.088235,0.0,0.0,0.166667,0.0,0.056604,0.166667,0.06087,0.093023,0.043333,0.075472,0.073171,0.0,0.043478,0.081633,0.295238,0.220779,0.3,0.272727,0.196721,0.077519,0.358824,0.174603,0.222222,0.5,0.25,0.04878,0.25,0.038462,0.086957,Mylohyoid bridge
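
`pd.merge` performs an inner join by default, so any trait code without an exact match in `codes_names` is dropped silently. For example, METO appears in `CES_Comp2.head()` but not in the merged head above. A left join with `indicator=True` is one way to audit such cases. The sketch below uses hypothetical stand-in tables, and the trailing space in `'METO '` is only an assumed cause of a mismatch, chosen for illustration.

```python
import pandas as pd

# Hypothetical stand-ins for the frequency table and the trait lookup;
# the trailing space in 'METO ' is an assumption made for this example
freqs = pd.DataFrame({'Trait Code': ['AST', 'METO '], 'AA': [0.125, 0.0]})
names = pd.DataFrame({
    'Trait Code': ['AST', 'METO'],
    'Trait Name': ['Asterionic ossicle', 'Metopic suture'],
})

# A left join keeps every frequency row; _merge records whether a name was found
audit = pd.merge(freqs, names, on='Trait Code', how='left', indicator=True)
unmatched = audit.loc[audit['_merge'] == 'left_only', 'Trait Code'].tolist()
print(unmatched)

# Normalizing the key before merging resolves this particular kind of mismatch
freqs['Trait Code'] = freqs['Trait Code'].str.strip()
labeled = pd.merge(freqs, names, on='Trait Code', how='left')
print(labeled)
```

If the list of unmatched codes is empty on the real tables, the inner join and the left join give the same rows.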


In [None]:
CES_Comp2_freq_counts_labeled_IP.to_excel('/drive/My Drive/Colab Notebooks/Statistical analysis/Aim 3 - Trait variation/CES_Comp2_freq_counts_labeled_IP.xlsx', index=True)

In [None]:
CES_Comp2_freq_counts_labeled_IP2 = pd.read_excel('CES_Comp2_freq_counts_labeled_IP2.xlsx')
CES_Comp2_freq_counts_labeled_IP2.head()

Unnamed: 0,Population Code,AST,CIV,HYP,MEN,MHB,OMB,PNB,PTB,SOF,TYM
0,AA,0.125,0.0,0.0,0.046875,0.415385,0.3125,0.370968,0.153846,0.384615,0.0
1,J,0.0625,0.0,0.0,0.09375,0.125,0.03125,0.1875,0.25,0.6875,0.0
2,USB,0.150943,0.016949,0.166667,0.166667,0.074074,0.0,0.111111,0.35,0.116667,0.169492
3,KEN,0.190476,0.0,0.095238,0.0,0.0,0.095238,0.285714,0.428571,0.333333,0.238095
4,TAN,0.217391,0.0,0.116279,0.222222,0.1,0.181818,0.191489,0.404255,0.191489,0.382979
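
The table above has populations as rows and traits as columns, the transpose of the labeled table exported earlier. The notebook does not show how this reshaped file was produced; one way to do the same reshaping in pandas is sketched below on a toy table built from a few of the values shown above (the trait name for CIV is shortened here).

```python
import pandas as pd

# Toy trait-by-population table using a few values from the output above
wide = pd.DataFrame({
    'Trait Code': ['AST', 'CIV'],
    'AA': [0.125, 0.0],
    'J': [0.0625, 0.0],
    'USB': [0.150943, 0.016949],
    'Trait Name': ['Asterionic ossicle', 'Pterygospinous bridge complete'],
})

# Drop the text label, index by trait code, then transpose so populations become rows
by_pop = wide.drop(columns='Trait Name').set_index('Trait Code').T
by_pop.index.name = 'Population Code'
by_pop.columns.name = None
by_pop = by_pop.reset_index()
print(by_pop)
```

Dropping `Trait Name` before transposing keeps the frequency columns numeric, which matters if the table is later merged or analyzed further.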


The output for the following code cell was removed to protect PII and/or the raw data. The output contained the following columns:
* Cranial nonmetrics, Cranial macromorphoscopics, Provenience, PopID, Population Code, Population Name, Location, Females, Males, Total, Source

In [None]:
cnm_pops = pd.read_excel('2.3. Cranial nonmetric and macromorphoscopic datasets.xlsx')
cnm_pops.head()

In [None]:
# Keep only the population code-to-name lookup columns
cnm_pops2 = cnm_pops[['Population Code', 'Population Name']].copy()
cnm_pops2.head()

Unnamed: 0,Population Code,Population Name
0,AI,Amerindian
1,AMBL,American Black
2,AMWH,American White
3,HH,Hispanic MaMD
4,B,Black


In [None]:
CES_Comp2_freq_counts_labeled = pd.merge(CES_Comp2_freq_counts_labeled_IP2, cnm_pops2, on='Population Code')
CES_Comp2_freq_counts_labeled.head()

Unnamed: 0,Population Code,AST,CIV,HYP,MEN,MHB,OMB,PNB,PTB,SOF,TYM,Population Name
0,AA,0.125,0.0,0.0,0.046875,0.415385,0.3125,0.370968,0.153846,0.384615,0.0,Asian American
1,J,0.0625,0.0,0.0,0.09375,0.125,0.03125,0.1875,0.25,0.6875,0.0,Japanese
2,USB,0.150943,0.016949,0.166667,0.166667,0.074074,0.0,0.111111,0.35,0.116667,0.169492,African American
3,KEN,0.190476,0.0,0.095238,0.0,0.0,0.095238,0.285714,0.428571,0.333333,0.238095,Kenya
4,TAN,0.217391,0.0,0.116279,0.222222,0.1,0.181818,0.191489,0.404255,0.191489,0.382979,Tanzania


In [None]:
CES_Comp2_freq_counts_labeled.tail()

Unnamed: 0,Population Code,AST,CIV,HYP,MEN,MHB,OMB,PNB,PTB,SOF,TYM,Population Name
45,TF,0.076923,0.0,0.384615,0.125,0.25,0.153846,0.0,0.230769,0.5,0.538462,Terra del Fuego
46,AU,0.22449,0.0,0.02,0.116279,0.04878,0.229167,0.104167,0.32,0.2,0.145833,Australia
47,CHAT,0.368421,0.0,0.157895,0.25,0.25,0.25,0.15,0.1,0.3,0.2,Chatham Island
48,MQ,0.266667,0.042553,0.111111,0.2,0.038462,0.195652,0.152174,0.06383,0.367347,0.041667,Marquesas
49,NZ,0.166667,0.068182,0.159091,0.130435,0.086957,0.119048,0.095238,0.068182,0.363636,0.209302,New Zealand


In [None]:
CES_Comp2_freq_counts_labeled.to_excel('/drive/My Drive/Colab Notebooks/Statistical analysis/Aim 3 - Trait variation/CES_Comp2_freq_counts_labeled.xlsx', index=True)