# Association Analysis Using a Generalized Linear Mixed Model (GLMM)

This notebook is a guide to using the `IDEAL-GENOM` library to perform a genome-wide association study (GWAS). The analysis is built on a generalized linear mixed model (GLMM).
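
For orientation, a GLMM for single-variant association testing is commonly written in the generic form below. This is the textbook formulation, given only as background; the symbols are introduced here for illustration, and the exact model fitted by `GWASrandom` may differ.

$$
g\bigl(\mathbb{E}[y_i \mid u_i]\bigr) = \mathbf{x}_i^{\top}\boldsymbol{\alpha} + s_i\,\beta + u_i,
\qquad
\mathbf{u} \sim \mathcal{N}\bigl(\mathbf{0},\, \sigma_g^{2}\mathbf{K}\bigr)
$$

Here $y_i$ is the phenotype of individual $i$, $g$ is the link function (identity for a quantitative trait, logit for a binary one), $\mathbf{x}_i$ holds the covariates with fixed effects $\boldsymbol{\alpha}$, $s_i$ is the genotype at the tested variant with effect $\beta$, and $\mathbf{u}$ is a random effect whose covariance is proportional to a genetic relationship matrix $\mathbf{K}$. Association is assessed by testing $\beta = 0$ for each variant.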

In [1]:
import sys
import os

# add parent directory to path
library_path = os.path.abspath('..')
if library_path not in sys.path:
    sys.path.append(library_path)

from ideal_genom.gwas.gwas_glmm import GWASrandom

Use the widgets below to enter the paths and file names needed to run the GWAS.

1. `input_path`: folder containing the input data. The pipeline expects `PLINK` binary files (`.bed`, `.bim`, `.fam`);
2. `input_name`: prefix of the `PLINK` binary files;
3. `output_path`: folder to output the results;
4. `output_name`: the prefix of the output files.
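
Before running the pipeline it can help to confirm that the three `PLINK` binary files actually exist under the given prefix. The following is a minimal sketch; `missing_plink_files` is a hypothetical helper written for this notebook and is not part of `IDEAL-GENOM`.

```python
from pathlib import Path

PLINK_SUFFIXES = ('.bed', '.bim', '.fam')


def missing_plink_files(input_path, input_name):
    """Return the expected PLINK binary files that are absent.

    Hypothetical helper for illustration; not part of IDEAL-GENOM.
    """
    folder = Path(input_path)
    # Build names by concatenation so prefixes that contain dots are preserved
    expected = [folder / (input_name + suffix) for suffix in PLINK_SUFFIXES]
    return [str(path) for path in expected if not path.is_file()]
```

An empty list means that all three files were found. Once the widgets are filled in, the check could be run as `missing_plink_files(input_path.value, input_name.value)`.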

In [None]:
import ipywidgets as widgets
from IPython.display import display

# Create interactive widgets for input
input_path = widgets.Text(
    value='',
    description='Path to input PLINK binary files:',
    style={'description_width': 'initial'},
    layout=widgets.Layout(width='50%')
)

input_name = widgets.Text(
    value='',
    description='Prefix of PLINK binary files:',
    style={'description_width': 'initial'},
    layout=widgets.Layout(width='50%')
)

output_path = widgets.Text(
    value='',
    description='Path to output files:',
    style={'description_width': 'initial'},
    layout=widgets.Layout(width='50%')
)

output_name = widgets.Text(
    value='',
    description='Name of the resulting files:',
    style={'description_width': 'initial'},
    layout=widgets.Layout(width='50%')
)

# Display the widgets
display(input_path, input_name, output_path, output_name)

# Function to get the text parameter values
def get_params():
    return input_path.value, input_name.value, output_path.value, output_name.value

In [None]:
path_params = get_params()
print('input_path: ', path_params[0])
print('input_name: ', path_params[1])
print('output_path: ', path_params[2])
print('output_name: ', path_params[3])

With this information we can instantiate the `GWASrandom` class.

In [None]:
gwas_random = GWASrandom(
    input_path=path_params[0],
    input_name=path_params[1],
    output_path=path_params[2],
    output_name=path_params[3]
)

Use the next widget to provide the parameters needed to run the pipeline.

1. `maf`: minor allele frequency (MAF) value passed to both the GWAS and the top-hits steps; the widget defaults to `0.01`.
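
For context, the MAF of a biallelic variant is the frequency of its less common allele. The sketch below computes it from genotypes coded as 0, 1 or 2 copies of the alternate allele (with `None` for missing calls) and shows how a cutoff such as `0.01` could be applied. The function name, the variable names and the genotype coding are assumptions made for illustration; the library's own handling of `maf` may be implemented differently.

```python
def minor_allele_frequency(genotypes):
    """MAF of one biallelic variant.

    `genotypes` holds alternate-allele counts (0, 1 or 2) per diploid sample;
    missing calls are given as None and ignored. Illustrative helper only.
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        raise ValueError('no called genotypes')
    # Each diploid sample carries two alleles
    alt_freq = sum(called) / (2 * len(called))
    # The minor allele is whichever of the two alleles is rarer
    return min(alt_freq, 1 - alt_freq)


maf_cutoff = 0.01                      # same value as the widget default
variant = [0, 0, 1, 0, 2, None, 0, 1]  # toy genotypes for a single variant
keep = minor_allele_frequency(variant) >= maf_cutoff
```

In this toy example four of the fourteen called alleles are the alternate allele, so the variant is well above the cutoff and `keep` is `True`.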

In [None]:
maf = widgets.FloatText(
    value=0.01,
    description='Minor Allele Frequency:',
    style={'description_width': 'initial'},
    layout=widgets.Layout(width='50%')
)

display(maf)

# Function to collect the GWAS parameters from the widgets
def get_gwas_params():
    return {'maf': maf.value}

In [None]:
gwas_params = get_gwas_params()
gwas_params

Execute the pipeline steps.

In [None]:
gwas_steps = {
    'aux_files': (gwas_random.prepare_aux_files, {}),
    'compute_grm': (gwas_random.compute_grm, {}),
    'run_gwas': (gwas_random.run_gwas_random, {'maf': gwas_params['maf']}),
    'top_hits': (gwas_random.get_top_hits, {'maf': gwas_params['maf']})
}

step_description = {
    'aux_files': 'Prepare auxiliary files',
    'compute_grm': 'Compute genetic relationship matrix',
    'run_gwas': 'Run GWAS',
    'top_hits': 'Get top hits'
}

for name, (func, params) in gwas_steps.items():
    print(f"\033[1m{step_description[name]}.\033[0m")
    func(**params)
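
The second step above computes a genetic relationship matrix (GRM). How `compute_grm` obtains it is not shown in this notebook, so the following is only a sketch of one widely used definition, in which each genotype is centred by twice the allele frequency and scaled by its expected variance. The function name and the list-of-lists input format are assumptions made for illustration.

```python
def genetic_relationship_matrix(genotypes):
    """GRM from standardized genotypes (illustrative sketch).

    `genotypes[j][i]` is the alternate-allele count (0, 1 or 2) of
    sample j at variant i, with no missing values.
    """
    n_samples = len(genotypes)
    n_variants = len(genotypes[0])
    # Alternate-allele frequency of each variant (two alleles per sample)
    freqs = [sum(row[i] for row in genotypes) / (2 * n_samples)
             for i in range(n_variants)]
    # Monomorphic variants have zero variance and cannot be standardized
    used = [i for i, p in enumerate(freqs) if 0 < p < 1]
    if not used:
        raise ValueError('no polymorphic variants')

    grm = [[0.0] * n_samples for _ in range(n_samples)]
    for j in range(n_samples):
        for k in range(j, n_samples):
            total = 0.0
            for i in used:
                p = freqs[i]
                total += ((genotypes[j][i] - 2 * p)
                          * (genotypes[k][i] - 2 * p)
                          / (2 * p * (1 - p)))
            # Average over variants; the matrix is symmetric
            grm[j][k] = grm[k][j] = total / len(used)
    return grm


# Toy data: four samples, two variants
toy_grm = genetic_relationship_matrix([[0, 1], [1, 1], [2, 0], [1, 2]])
```

Because the genotypes are centred with frequencies estimated from the same samples, every row of the resulting matrix sums to zero, which is a quick sanity check on an implementation like this one.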