# Analyzing the Effects of Space Flight on Telomere Length Dynamics in NASA Astronauts

In 2024 NASA will execute Project Artemis, sending humans back to the Moon to establish a permanent lunar base. However, space is inherently dangerous to human health. As part of my Ph.D., I'm researching how spaceflight impacts human health, and whether these impacts could compromise current (Project Artemis) or future missions to Mars and beyond. Specifically, I'm examining how time aboard the International Space Station affects telomeres (the ends of human DNA) and DNA stability in NASA's astronauts. My research takes the *first look at the changes to telomeres in unrelated astronauts as a result of spaceflight*, informing NASA policy and approach for current and future missions.

Let's get started with the analysis!

And, please feel free to contact me.

**Contact:**  
Jared Luxton  
jLuxton@colostate.edu

<table><tr>
<td><img src=https://upload.wikimedia.org/wikipedia/commons/thumb/c/c3/Python-logo-notext.svg/200px-Python-logo-notext.svg.png width="150"> 
<td><img src=https://cdn1.medicalnewstoday.com/content/images/articles/319/319971/space-explorer.jpg width="300">
<td><img src=https://abm-website-assets.s3.amazonaws.com/rdmag.com/s3fs-public/embedded_image/2017/04/telomere-chromosome-stock.jpg width="250">
</tr></table>

&nbsp;
&nbsp;   

## Table of Contents:

* [Background: Health Risks and Obstacles to Space Flight](#background) 
* [Approach: Identifying Risks by Measuring Telomere Lengths](#approach)
* [Methods: Blood Collection, Cell Culture, Telomere Measurements](#methods)
* [Data Cleaning: Handling Telomere Length Data](#data-cleaning)
* [Data Analysis: Visualization and Statistics](#data-analysis)
* [Conclusions](#conclusions)

&nbsp;    

<a id='background'></a>
&nbsp;
## Background 
**Health Risks and Obstacles to Space Flight**   
Did you know that NASA is sending humans to the Moon in 2024? Yes! And not only that: this mission is the first of many that will develop permanent lunar colonies and provide a bridge to exploring Mars and beyond. Dubbed Artemis, this NASA project entails sending the *first* woman (and the next man) to the lunar surface and developing a *permanent lunar outpost called the Gateway* in orbit around the Moon. The objectives of Artemis are part of NASA's overarching goal (and humanity's common dream) for humans to explore our solar system: Mars and beyond.

The immediate challenges facing Artemis are substantial in terms of technology and health considerations for the astronauts. Even as we approach Project Artemis in 2024, the short- and long-term health effects of spaceflight, especially those from chronic exposure to galactic cosmic rays, a type of radiation unique to space and not found on Earth, remain relatively unknown. 

Galactic cosmic rays (GCRs) are highly energetic particles hurtling through space at nearly the speed of light. Though such strikes are rare, when GCRs hit human cells they shred all cellular contents in their path, including DNA. This damage accumulates over time and could lead to tissue degeneration and cancer. Currently, we simply don't understand how much cellular damage humans accumulate in space, or how much it increases cancer risk. Without understanding these issues, we cannot address them. My research tackles them directly by examining how spaceflight affects telomeres (the ends of DNA) and DNA stability in NASA astronauts aboard the International Space Station.

&nbsp; 




<a id='approach'></a>
&nbsp;
## Approach 
**Identifying Risks by Measuring Telomere Lengths**  
Telomeres are repetitive sequences of DNA, covered by protein, found at the very ends of DNA. Telomeres shorten with each cell division and thus shorten as we age. When the telomeres in a cell reach a critically short length, the cell will either die or persist in a state that damages neighboring cells (termed senescence). Cell death resulting from telomeres shortening too quickly can contribute to age-related diseases such as cancer. Environmental exposures, including space radiation, air pollution, stress, inflammation, and others, can all contribute to telomere shortening and thus to age-related diseases. Telomeres therefore link environmental exposures with age-related diseases. By measuring telomere length over a period that involves environmental stressors and exposures, we can use the changes in telomere length to quantify the short- and long-term effects of that experience in terms of cancer risk and disease. This is what we've done with the astronauts.
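
To make the idea of tracking telomere length over time concrete, here is a minimal sketch that expresses each timepoint's mean telomere length as a percent change from the pre-flight baseline. The timepoint labels, the values, and the helper name `percent_change_from_baseline` are all invented for illustration; they are not the astronaut data and imply nothing about the actual results.

```python
from statistics import mean

# Hypothetical relative telomere lengths (arbitrary units) for one person.
# Labels and values are invented for illustration only.
samples = {
    "pre-flight": [1.00, 1.10, 0.90],
    "mid-flight": [1.20, 1.30, 1.10],
    "post-flight": [0.80, 0.90, 0.70],
}


def percent_change_from_baseline(samples, baseline="pre-flight"):
    """Return each timepoint's mean length as a % change from the baseline mean."""
    base = mean(samples[baseline])
    return {
        timepoint: round(100 * (mean(values) - base) / base, 1)
        for timepoint, values in samples.items()
    }


changes = percent_change_from_baseline(samples)
print(changes)
```

Comparing every timepoint to the same person's own baseline is one simple way to separate change over time from differences between individuals.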


&nbsp; 

<a id='methods'></a>
&nbsp;
## Methods
**Blood Collection, Cell Culture, Telomere Measurements**  
We monitored telomere lengths in 11 unrelated astronauts at pre-, mid-, and post-spaceflight timepoints (where available) for missions aboard the International Space Station. In all, we have telomere data for about seven timepoint samples for each astronaut. For our analyses, we directly monitored the lengths of *all individual telomeres* in each cell for each timepoint for each astronaut; we also have the telomere length means for those timepoints.

To measure telomere length in astronauts, we used a noninvasive approach for sample collection and analysis. Blood was taken from astronauts at pre-, mid- (yes, blood was drawn aboard the International Space Station, sent down on the Soyuz capsule to Texas, and mailed to us), and post-spaceflight timepoints. From these blood samples we specifically cultured white blood cells (namely T-cells). By culturing and using only white blood cells for quantifying telomere length, we reduced the variation in our measurements that different cell types would introduce.

But how exactly can we measure telomere length? I used a technique called *Telomere Fluorescence In Situ Hybridization*, aka Telo-FISH. This technique measures telomere length by taking advantage of the fact that telomeres are repetitive sequences of DNA. Longer telomeres have more DNA repeats; shorter telomeres have fewer. If we have a colorful substance (a fluorescent 'probe') that binds to these repetitive sequences, we can visualize the telomeres using a microscope. When telomeres are longer, more probe will bind, and under the microscope they'll appear bright. Conversely, shorter telomeres bind less probe and are dimmer under the microscope. Thus, to quantify telomere length we attach fluorescent probes to telomeres, image the colored telomeres, and quantify their fluorescent intensity. I used software called Telometer (part of ImageJ) to convert the fluorescent intensities of telomeres to pixel values, yielding measurements of telomeres. This technique, Telo-FISH, is a powerful approach that enables us to determine the relative telomere lengths of all individual telomeres.
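
The principle that brighter spots mean longer telomeres can be sketched in a few lines. This is a toy illustration only, not how Telometer works internally (its algorithm isn't described here): the function name `integrated_intensity`, the pixel values, and the background level are all made up.

```python
def integrated_intensity(spot_pixels, background=0):
    """Sum the background-subtracted pixel values of one telomere spot."""
    return sum(max(pixel - background, 0) for pixel in spot_pixels)


# Hypothetical pixel values for two telomere spots (invented for illustration)
long_telomere = [200, 220, 210, 190]   # bright spot: more probe bound
short_telomere = [60, 70, 65, 55]      # dim spot: less probe bound
background = 50                        # assumed uniform background level

long_signal = integrated_intensity(long_telomere, background)
short_signal = integrated_intensity(short_telomere, background)

# The ratio of the two signals is a relative, not absolute, length comparison
print(long_signal, short_signal, long_signal / short_signal)
```

Because the signal depends on the microscope and imaging session, such values are only comparable across samples after calibration, which is why the cleaning step later divides by bead-derived factors.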

I mentioned earlier that I have 11 unrelated astronauts with about seven timepoint samples each; for each timepoint I imaged the telomeres of 30 cells. I also have a cohort of 11 unrelated, age-matched controls for the astronauts; they also have seven timepoint samples with 30 images per timepoint. I think that's sufficient background to understand what I've done. Let's move on to the data analysis!


&nbsp; 

<a id='data-cleaning'></a>
&nbsp; 
## Data Cleaning 
**Handling Telomere Length Data**  




&nbsp; 


In [1]:
import os

import numpy as np
from numpy import array
import pandas as pd
from pandas import ExcelWriter
from pandas import ExcelFile

import matplotlib.pyplot as plt
from matplotlib.ticker import StrMethodFormatter
from matplotlib import colors
from matplotlib.ticker import PercentFormatter

from scipy import stats
from statsmodels.graphics.gofplots import qqplot

In [2]:
def generate_histograms_and_dataframes_forTeloLengthData(patharg):

    """
    USAGE:
    From the command line, call this script and pass the location (directory) containing
    the Excel files to analyze as an argument, e.g.
    $ python Name_Of_This_Script directory/containing/excelfiles/to/analyze
    
    This is the main function for this script. It opens the custom Excel (.xlsx) files we use
    for quantifying raw telomere lengths, derived from ImageJ analyses.
    The individual telomere lengths column is extracted and cleaned of NA values and
    DAPI-intensity values, outliers (more than 3 std devs from the column mean) are removed,
    and the length values are standardized to each other to correct for discrepancies in the
    fluorescent intensity of the microscope, determined by fluorescent bead measurements.
    The astronaut ID & sample timepoint (from the filename) are associated with the
    individual telo length column (for that Excel file) as a KEY:VALUE pair in a dictionary.
    The dictionary is returned so that downstream code can loop over it to derive
    descriptive stats and create histogram graphs.
    """

    dict_astro_individ_telos_dfs = {}

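    for file in os.scandir(patharg):
        # Skip non-Excel files and the temporary '~$' lock files Excel creates
        if file.name.endswith('.xlsx') and not file.name.startswith('~$'):
            print(f'{file.name} telomere data acquisition in progress..')

            try:
                df = pd.read_excel(file.path)

            except Exception as err:
                # Skip files that cannot be read rather than aborting the whole run
                print(f'{file.name} could not be read ({err}); skipping..')
                continue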

            df.rename(columns={'Unnamed: 3':'Mean Individ Telos'}, inplace=True)
            
            # Row labels holding DAPI-intensity values rather than telomere lengths:
            # every 187th row starting at label 5 (30 in total)
            DAPI_values_to_drop = list(range(5, 5429, 187))

            individual_telos_lengths = df['Mean Individ Telos']
            individual_telos_lengths = individual_telos_lengths.drop(labels=DAPI_values_to_drop)
            individual_telos_lengths = individual_telos_lengths.iloc[7:5611]
            # Coerce non-numeric entries to NaN, then drop them
            telos_str_toNaN = pd.to_numeric(individual_telos_lengths, errors='coerce')
            individual_telos_cleaned = telos_str_toNaN.dropna(axis=0, how='any')
            telos_df = individual_telos_cleaned.to_frame()
            # Keep only values within 3 standard deviations of the column mean
            telos_individ_df = telos_df[(np.abs(stats.zscore(telos_df)) < 3).all(axis=1)]


            # Standardize by the fluorescent-bead calibration factor matching this sample's ID
            if ('5163' in file.name) or ('1536' in file.name):
                telos_individ_df_cy3Cal = telos_individ_df.div(59.86)

            elif '2171' in file.name:
                telos_individ_df_cy3Cal = telos_individ_df.div(80.5)

            elif '7673' in file.name:
                telos_individ_df_cy3Cal = telos_individ_df.div(2.11)

            elif '2479' in file.name:
                telos_individ_df_cy3Cal = telos_individ_df.div(2.18)

            elif '1261' in file.name:
                telos_individ_df_cy3Cal = telos_individ_df.div(2.16)

            else:
                telos_individ_df_cy3Cal = telos_individ_df

            file_name_trimmed = file.name.replace('.xlsx', '')
            dict_astro_individ_telos_dfs[file_name_trimmed] = telos_individ_df_cy3Cal
    
    print('Done collecting all astronaut telomere length excel files')
    return dict_astro_individ_telos_dfs
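
The outlier step in the function above (keep values whose z-score is below 3) can be illustrated in isolation. The sketch below uses only the standard library and made-up numbers; `drop_outliers` is a hypothetical helper, not part of the analysis code. It uses the population standard deviation, which matches the default `ddof=0` of `scipy.stats.zscore`.

```python
from statistics import mean, pstdev


def drop_outliers(values, threshold=3.0):
    """Keep values whose absolute z-score is below the threshold."""
    mu = mean(values)
    sigma = pstdev(values)  # population std dev, like scipy.stats.zscore's default
    if sigma == 0:
        # All values are identical, so there is nothing to flag as an outlier
        return list(values)
    return [v for v in values if abs(v - mu) / sigma < threshold]


# Invented intensities: 20 typical values and one extreme value
raw = [100] * 20 + [1000]
cleaned = drop_outliers(raw)
print(len(raw), len(cleaned))
```

Note that a single extreme value also inflates the mean and standard deviation it is judged against, so a z-score cutoff can miss outliers in small samples; with thousands of telomeres per sample this matters less.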

In [21]:
dict_astro_individ_telos_dfs = generate_histograms_and_dataframes_forTeloLengthData('../all astros for pre mid post and pre post')

dso7673 mphase TeloFISH R+270.xlsx telomere data acquisition in progress..
dso2494 mphase TeloFISH R+270.xlsx telomere data acquisition in progress..
dso2479 mphase TeloFISH R+270.xlsx telomere data acquisition in progress..
dso1062 mphase TeloFISH R+270.xlsx telomere data acquisition in progress..
DSO1536 L-270.xlsx telomere data acquisition in progress..
dso2381 mphase TeloFISH R+270.xlsx telomere data acquisition in progress..
DSO1536 FD140.xlsx telomere data acquisition in progress..
DSO2171 L-180.xlsx telomere data acquisition in progress..
DSO1536 FD90.xlsx telomere data acquisition in progress..
dso1261 mphase TeloFISH R+270.xlsx telomere data acquisition in progress..
dso3228 mphase TeloFISH R+270.xlsx telomere data acquisition in progress..
dso4819 mphase TeloFISH R+270.xlsx telomere data acquisition in progress..
DSO5163 R+180.xlsx telomere data acquisition in progress..
dso2494 mphase TeloFISH L-270.xlsx telomere data acquisition in progress..
DSO2171 FD260.xlsx telomere dat
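
Since each dictionary key is a filename without its extension, the astronaut ID and timepoint can be recovered from the key. Below is a hedged sketch of one way to do that. The regular expression is inferred from the sample names printed above, the helper `parse_sample_key` is hypothetical, and the reading of the prefixes (`L-` as before launch, `FD` as flight day, `R+` as after return) is an assumption rather than something stated in this notebook.

```python
import re

# Assumed filename pattern: "dso<ID> ... <timepoint>", where the timepoint is
# L-<days>, FD<days>, or R+<days>. Inferred from the sample names shown above.
KEY_PATTERN = re.compile(r"dso(\d+).*?((?:L-|FD|R\+)\d+)\s*$", re.IGNORECASE)

# Assumed meaning of the prefixes (not confirmed by the notebook)
PHASES = {"L-": "pre-flight", "FD": "mid-flight", "R+": "post-flight"}


def parse_sample_key(key):
    """Split a dictionary key into (astronaut ID, timepoint, flight phase)."""
    match = KEY_PATTERN.search(key)
    if match is None:
        raise ValueError(f"Unrecognized sample name: {key!r}")
    astro_id, timepoint = match.group(1), match.group(2).upper()
    return astro_id, timepoint, PHASES[timepoint[:2]]


for key in ["dso7673 mphase TeloFISH R+270", "DSO1536 FD140", "DSO2171 L-180"]:
    print(parse_sample_key(key))
```

Raising an error on unrecognized names, instead of silently skipping them, makes it obvious when a file does not follow the expected naming scheme.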

<a id='data-analysis'></a>
&nbsp; 
## Data Analysis
**Visualization and Statistics**  

...

&nbsp; 

<a id='conclusions'></a>
&nbsp; 
## Conclusions
**Highlights and Final Thoughts**  

...

&nbsp; 