# Calculating Relative and Absolute Abundances
Author: Lindsay Hopson and James Ziegler<br>Date: April 27th 2021<br>Written in: Jupyter Notebook<br>Availability: https://github.com/GW-HIVE/microbiome


### Objective 

The goal of this code is to calculate the relative and absolute abundances of bacteria using the number of 'hits' produced by Hexagon computations. After downloading the Hexagon output CSV files from Hive to a local computer, rename each CSV file to match the sample the Hexagon computation was performed on. On the local computer, save all the Hexagon output CSV files in the same folder as this code. This code does the following tasks: 
1. Parses each Hexagon computation output CSV file saved in the local directory 
2. Calculates the relative and absolute abundances of each organism present in the file 
3. Writes the new abundance information into columns within the original CSV file saved in the current directory 
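
As a small worked example of the two calculations in step 2, the sketch below uses a made-up table. It assumes, as the implementation in this notebook appears to, that the first row of the `Hits` column holds reads that are not counted as aligned, so that row is left out of the relative abundance denominator. The counts are invented for illustration, and `abundances` is a hypothetical helper name, not part of the original code.

```python
import numpy as np
import pandas as pd


def abundances(hits: pd.Series) -> pd.DataFrame:
    """Return relative and absolute abundances for a column of hit counts.

    Assumption: the first entry is the row the notebook skips (for example,
    unaligned reads), so it gets no relative abundance.
    """
    hits = hits.reset_index(drop=True)
    total_hits = hits.sum()             # every row, including the first
    aligned_hits = hits.iloc[1:].sum()  # every row except the first

    relative = hits / aligned_hits
    relative.iloc[0] = np.nan           # first row is excluded
    absolute = hits / total_hits
    return pd.DataFrame({"Relative Abundance": relative,
                         "Absolute Abundance": absolute})


# Invented counts: 20 reads in the first row, 80 aligned reads in total.
example = pd.Series([20, 60, 20], name="Hits")
result = abundances(example)
print(result)
```

With these invented counts, the two organisms have relative abundances of 0.75 and 0.25 (out of 80 aligned reads) and absolute abundances of 0.6 and 0.2 (out of 100 total reads).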

#### <p style="color:red;">Important Note:</p>

Before running this code, save a copy of all the __original Hexagon output CSV files__ and __this code__ into a new directory on your computer. If you do not do this, other CSV files within the directory where this code is saved will be changed irreversibly.
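
One possible way to make that copy from Python is sketched below. It is only an illustration: `back_up_csv_files` and the folder name `originals_backup` are hypothetical names chosen for this sketch, and copying the files by hand works just as well.

```python
import shutil
from pathlib import Path


def back_up_csv_files(source_dir, backup_dir_name="originals_backup"):
    """Copy every CSV file in source_dir into a backup subfolder.

    backup_dir_name is a hypothetical folder name chosen for this sketch.
    Returns the list of paths of the copied files.
    """
    source = Path(source_dir)
    backup = source / backup_dir_name
    backup.mkdir(exist_ok=True)
    copied = []
    # Only files directly inside source_dir are matched, not subfolders.
    for csv_path in sorted(source.glob("*.csv")):
        target = backup / csv_path.name
        shutil.copy2(csv_path, target)  # copy2 also preserves timestamps
        copied.append(target)
    return copied
```

Calling `back_up_csv_files(".")` from the folder that holds the Hexagon outputs would copy them into a subfolder. Because the implementation below only matches `*.csv` files directly in the current directory, files inside that subfolder are not picked up by it.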

### Implementation

In [None]:
import pandas as pd
import numpy as np
import glob

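# Process every CSV file in the current directory (files are overwritten in place)
files = glob.glob('*.csv')
for file in files:
    df = pd.read_csv(file)
    # Aligned total: every row of 'Hits' except the first
    aligned_hits = sum(df['Hits'][1:])
    # Overall total: every row of 'Hits'
    total_hits = sum(df['Hits'])
    # The first row gets a 'NaN' placeholder for relative abundance
    rel_ab = ['NaN']
    abs_ab = []
    # Relative abundance: hits divided by the aligned total
    for hit in df['Hits'][1:]:
        rel_ab.append(hit/aligned_hits)
    # Absolute abundance: hits divided by the overall total
    for hit in df['Hits']:
        abs_ab.append(hit/total_hits)
    df = df.reset_index(drop=True)
    # Remove columns left by an earlier run so the code can be re-run on the same file
    df.drop("Relative Abundance", inplace=True, axis=1, errors='ignore')
    df.drop("Absolute Abundance", inplace=True, axis=1, errors='ignore')
    # Insert the new columns at positions 3 and 4 and overwrite the original file
    df.insert(3, "Relative Abundance", pd.Series(rel_ab))
    df.insert(4, "Absolute Abundance", pd.Series(abs_ab))
    df.to_csv(file, index=False)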
    
print("Calculations complete! Return to the folder where your Hexagon outputs have been saved to view the updated files.")
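
As an optional sanity check after the run, the sketch below reads a processed file back and confirms that each abundance column sums to approximately 1. It assumes the file already contains the `Relative Abundance` and `Absolute Abundance` columns written above; `check_abundances` and the tolerance value are hypothetical choices for this illustration.

```python
import math

import pandas as pd


def check_abundances(csv_path, tol=1e-9):
    """Return True if both abundance columns in csv_path sum to about 1.

    Assumes the file has already been processed, so both abundance columns
    exist. Missing values (such as the first-row placeholder) are skipped
    by pandas when summing.
    """
    df = pd.read_csv(csv_path)
    rel_sum = df["Relative Abundance"].sum()  # NaN entries are ignored
    abs_sum = df["Absolute Abundance"].sum()
    return (math.isclose(rel_sum, 1.0, abs_tol=tol)
            and math.isclose(abs_sum, 1.0, abs_tol=tol))
```

A file name can be passed directly, for example `check_abundances("sample1.csv")`, where `sample1.csv` stands in for one of the renamed Hexagon outputs.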