# Statistical Analysis
This notebook analyzes a selected dataset by generating a histogram and a box-and-whisker plot for each variable. It walks through loading the dataset, listing its variables, and producing the plots. The histograms show each variable's frequency distribution, with the mean and median marked; the box-and-whisker plots show the quartiles, outliers, and overall spread. Together they help identify patterns and outliers in the data.

## Libraries
Import the libraries used throughout the notebook.

In [None]:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
import tkinter as tk
from tkinter.filedialog import askopenfilename
import warnings

warnings.filterwarnings("ignore")

## Import Database
This section imports the data to be plotted. The file must be in comma-separated values (".csv") format. A summary table of descriptive statistics is then generated.

In [None]:
# The root window is created.
root = tk.Tk()
# show askopenfilename dialog without the Tkinter window
root.withdraw()

# select csv file
datos = askopenfilename(filetypes=[("csv files", "*.csv")])

# Read the data into a DataFrame.
# rgp = pd.read_csv(datos, index_col=0)  # alternative: use the first column as the index
rgp = pd.read_csv(datos)
rgp.describe().apply(lambda s: s.apply('{0:.5f}'.format))
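
The load-and-summarize step can also be tried without the file dialog, for example on a machine with no display. The following is a minimal sketch under stated assumptions: the inline CSV text and the helper name `summarize_csv` are hypothetical and only stand in for the file chosen with `askopenfilename`.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for the selected ".csv" file.
csv_text = "Time,A,B\n0,1.0,10.0\n1,2.0,20.0\n2,3.0,30.0\n"


def summarize_csv(source):
    """Read a CSV and return the data plus describe() formatted to five decimals."""
    df = pd.read_csv(source)
    # Format every statistic as a string with five decimal places.
    summary = df.describe().apply(lambda s: s.map("{0:.5f}".format))
    return df, summary


df, summary = summarize_csv(io.StringIO(csv_text))
print(summary)
```

`pd.read_csv` accepts a path or a file-like object, so the same helper should also work with the path returned by the dialog.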

## Variable Names

In [None]:
rgp.columns.tolist()

### Observation Counts and Number of Bins

In [None]:
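# Drop the time column so only the measured variables remain.
# errors='ignore' keeps the cell safe to re-run after the column is gone.
rgp = rgp.drop(columns='Time', errors='ignore')
# Number of non-null observations in each remaining column.
cantidad_col = rgp.count().astype(float)
cantidad_col.to_numpy()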

In [None]:
len(cantidad_col)

In [None]:
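# Sturges' rule: k = ceil(log2(n)) + 1, where n is the number of observations (rows),
# not the number of columns.
bin_count = int(np.ceil(np.log2(len(rgp))) + 1)
bin_count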
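
The bin-count expression has the form of Sturges' rule, k = ceil(log2(n)) + 1, where n is the number of observations. As a rough illustration of how the bin count grows with sample size, here is a small sketch; the helper name `sturges_bins` is hypothetical and not part of the notebook.

```python
import math


def sturges_bins(n_observations):
    """Return ceil(log2(n)) + 1 for a sample of n >= 1 observations."""
    if n_observations < 1:
        raise ValueError("need at least one observation")
    return math.ceil(math.log2(n_observations)) + 1


# The bin count grows slowly (logarithmically) with the sample size.
for n in (10, 100, 1000):
    print(n, sturges_bins(n))
```

Because the growth is logarithmic, the rule tends to give few bins for large samples, so it may be worth comparing against other bin counts when the data are strongly skewed.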


In [None]:
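def his_box_p(a):
    """Plot a box plot above a histogram for column `a` of `rgp`."""
    sns.set(style="ticks")

    # Drop missing values so the mean and median lines are well defined.
    x = rgp[a].dropna()

    # Two stacked axes sharing the x-axis: a thin box plot over a tall histogram.
    f, (ax_box, ax_hist) = plt.subplots(2, sharex=True,
                                        gridspec_kw={"height_ratios": (.15, .85)}, figsize=(10, 7))

    sns.boxplot(x=x, ax=ax_box)
    sns.histplot(x=x, ax=ax_hist, bins=bin_count, stat='count')

    ax_box.set(yticks=[])
    ax_hist.axvline(np.mean(x), color='black', linestyle='-', label='Mean')
    ax_hist.axvline(np.median(x), color='red', linestyle='--', label='Median')
    sns.despine(ax=ax_hist)
    sns.despine(ax=ax_box, left=True)
    ax_hist.legend()
    # savefig needs a target path; naming the file after the column is an assumption.
    plt.savefig(f"{a}.png", dpi=200, bbox_inches='tight')
    plt.show()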
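
The quantities a box-and-whisker plot displays can also be computed directly, which helps when checking a plot against numbers. The sketch below assumes the common 1.5 × IQR whisker rule (to my knowledge the seaborn default, but treat that as an assumption) and linear-interpolated percentiles; the helper name `box_stats` and the sample values are hypothetical.

```python
import numpy as np


def box_stats(values, whis=1.5):
    """Return quartiles, IQR, whisker fences, and outliers for a 1-D sample."""
    data = np.asarray(values, dtype=float)
    data = data[~np.isnan(data)]  # ignore missing values
    q1, median, q3 = np.percentile(data, [25, 50, 75])
    iqr = q3 - q1
    lower_fence = q1 - whis * iqr
    upper_fence = q3 + whis * iqr
    # Points beyond the fences are the ones drawn as individual outliers.
    outliers = data[(data < lower_fence) | (data > upper_fence)]
    return {
        "q1": float(q1),
        "median": float(median),
        "q3": float(q3),
        "iqr": float(iqr),
        "lower_fence": float(lower_fence),
        "upper_fence": float(upper_fence),
        "outliers": outliers.tolist(),
    }


stats = box_stats([1, 2, 3, 4, 100])
print(stats)
```

In the notebook, the same helper could be called as `box_stats(rgp[a])` for a column `a`.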

## Plots

In [None]:
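# Columns to plot: every numeric variable left in rgp.
subtit = rgp.select_dtypes(include="number").columns.tolist()
for a in subtit:
    his_box_p(a)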