# Statistical analysis of a fracture network

This notebook shows how to perform a statistical analysis of a fracture network. It covers how to:
+ Fit different distributions to the dataset
+ Plot summary plots for each fitted distribution
+ Visually compare different fits using the probability integral transform (PIT)
+ Show and export statistical summary tables

In [None]:
from fracability.examples import data  # import the path of the sample data
from fracability import Entities, Statistics  # import the Entities and Statistics modules

import scipy.stats as ss
import matplotlib.pyplot as plt
# Jupyter only: open plots in separate Qt windows instead of inline
%matplotlib qt5

## Import the Pontrelli quarry Set a and calculate the topology

In [None]:
pontrelli_data = data.Pontrelli()
data_dict = pontrelli_data.data_dict  # Get dict of paths for the data

# Create the fractures and boundary objects.
# To use your own data, pass the absolute path of the .shp file.
set_a = Entities.Fractures(shp=data_dict['Set_a.shp'], set_n=1)

boundary = Entities.Boundary(shp=data_dict['Interpretation_boundary.shp'], group_n=1)

fracture_net = Entities.FractureNetwork()

fracture_net.add_fractures(set_a)
fracture_net.add_boundaries(boundary)

fracture_net.calculate_topology()

## NetworkFitter 

The `NetworkFitter` class is responsible for running the statistical analysis on the fracture network. It accepts the following options:
1. use_survival: Boolean flag to use survival analysis (True) or treat the data as if there were no censoring (False). Default is True.
2. complete_only: Boolean flag to use only complete measurements (True) or the whole dataset (False). This flag is used only when use_survival is False. Default is False.
3. use_AIC: Boolean flag to use AIC (True) or AICc (False) for model selection. Default is True.
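
For reference, the two criteria in option 3 differ only by a small-sample correction term. A minimal sketch of the standard textbook formulas follows; the function names `aic` and `aicc` are illustrative and are not part of fracability.

```python
def aic(log_likelihood: float, k: int) -> float:
    """Akaike information criterion: 2k - 2 ln(L), with k fitted parameters."""
    return 2 * k - 2 * log_likelihood


def aicc(log_likelihood: float, k: int, n: int) -> float:
    """AIC plus the small-sample correction; requires n > k + 1 observations."""
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined when n <= k + 1")
    return aic(log_likelihood, k) + (2 * k * (k + 1)) / (n - k - 1)
```

The correction shrinks as the number of observations `n` grows, so the two criteria tend to agree on large datasets.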


These options make it possible to compare survival analysis with other ways of fitting the data. However, we strongly suggest always using survival analysis: when there is no censoring, the results are the same as with the other methods.
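
The statement that survival analysis reduces to an ordinary fit when nothing is censored can be checked with a small sketch. It assumes right-censored lengths and an exponential model, for which the censored maximum-likelihood estimate has a closed form. The helper `expon_censored_mle` and the synthetic data are illustrative only and are not fracability's implementation.

```python
import numpy as np
import scipy.stats as ss


def expon_censored_mle(lengths, censored):
    """Scale (mean) of an exponential model fitted to right-censored data.

    Complete measurements contribute the density f(x) to the likelihood and
    censored ones the survival function S(x); maximising the product gives
    scale = sum(all lengths) / number of complete measurements.
    """
    lengths = np.asarray(lengths, dtype=float)
    censored = np.asarray(censored, dtype=bool)
    n_complete = np.count_nonzero(~censored)
    if n_complete == 0:
        raise ValueError("at least one complete measurement is required")
    return lengths.sum() / n_complete


rng = np.random.default_rng(0)
lengths = rng.exponential(scale=2.0, size=500)  # synthetic lengths

# No censoring: the survival estimate equals the ordinary MLE (the sample mean).
no_censoring = np.zeros(lengths.size, dtype=bool)
scale_survival = expon_censored_mle(lengths, no_censoring)
_, scale_ordinary = ss.expon.fit(lengths, floc=0)

# With censoring: cut long fractures at 3.0 and flag them as censored.
censored = lengths > 3.0
observed = np.minimum(lengths, 3.0)
scale_censored = expon_censored_mle(observed, censored)
scale_naive = observed.mean()  # ignores censoring, so it underestimates the scale
```

Without censoring the two estimates coincide; with censoring, the naive mean of the truncated lengths is smaller than the survival estimate.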

In [None]:
fitter = Statistics.NetworkFitter(fracture_net)

### Fit different distributions

All the `rv_continuous` distributions available in SciPy are valid (https://docs.scipy.org/doc/scipy/reference/stats.html#continuous-distributions).

Each time a fit is run, the Akaike information criterion and the Kolmogorov-Smirnov, Koziol and Green, and Anderson-Darling distances are calculated and saved.
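
As an illustration of one of these metrics, the Kolmogorov-Smirnov distance for an uncensored sample can be computed with SciPy alone. The sketch below uses synthetic data and `scipy.stats.kstest`; it ignores censoring, so it should not be read as the way fracability computes the distance.

```python
import numpy as np
import scipy.stats as ss

rng = np.random.default_rng(42)
sample = rng.lognormal(mean=0.5, sigma=0.5, size=300)  # synthetic lengths

# Fit two candidate models by maximum likelihood (location fixed at 0).
lognorm_params = ss.lognorm.fit(sample, floc=0)
expon_params = ss.expon.fit(sample, floc=0)

# KS distance: largest gap between the empirical CDF and the fitted CDF.
ks_lognorm = ss.kstest(sample, 'lognorm', args=lognorm_params).statistic
ks_expon = ss.kstest(sample, 'expon', args=expon_params).statistic

print(f"lognorm KS distance: {ks_lognorm:.3f}")
print(f"expon KS distance:   {ks_expon:.3f}")
```

A smaller distance indicates a fitted CDF that follows the empirical one more closely.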

In [None]:
fitter.fit('lognorm')
fitter.fit('expon')
fitter.fit('norm')
fitter.fit('gengamma')
fitter.fit('powerlaw')
fitter.fit('weibull_min')

### Show the model rank table

In [None]:
# iloc drops the last column, which is not useful in this case
fitter.fit_records(sort_by='Akaike').iloc[:, :-1]

### Plot the different models using PIT and summary plots
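
The idea behind the PIT plots can be sketched with SciPy. If a fitted model describes the data well, passing each value through the model CDF yields values that are approximately uniform on [0, 1], so the sorted values fall close to the diagonal. The snippet uses synthetic, uncensored data and the illustrative helper `pit_deviation`; it is not fracability's plotting code.

```python
import numpy as np
import scipy.stats as ss

rng = np.random.default_rng(1)
sample = rng.lognormal(mean=0.0, sigma=0.5, size=400)  # synthetic lengths


def pit_deviation(dist, data):
    """Largest gap between sorted PIT values and uniform plotting positions."""
    params = dist.fit(data, floc=0)         # maximum-likelihood fit, loc fixed at 0
    pit = np.sort(dist.cdf(data, *params))  # probability integral transform
    uniform = (np.arange(1, data.size + 1) - 0.5) / data.size
    return np.max(np.abs(pit - uniform))


dev_lognorm = pit_deviation(ss.lognorm, sample)
dev_expon = pit_deviation(ss.expon, sample)
```

The model whose PIT values stay closest to the uniform positions (the smaller deviation) is the one whose curve hugs the diagonal in a PIT plot.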

In [None]:
# Plot a specific model
fitter.plot_PIT(position=[3], sort_by='Akaike')

In [None]:
# Plot specific models together
fitter.plot_PIT(position=[1, 2, 3], sort_by='Akaike', show_plot=True)

In [None]:
# Plot all the models
fitter.plot_PIT(sort_by='Akaike') 

In [None]:
# Plot specific model
fitter.plot_summary(position=[1], sort_by='Mean_rank')

In [None]:
# Plot specific models (separate plots)
fitter.plot_summary(position=[1,2,3], sort_by='Mean_rank')

In [None]:
# Plot all the models (separate plots)
fitter.plot_summary(sort_by='Mean_rank')

### Export the fit_records table

The fit_records table can also be saved as CSV or Excel, or copied directly to the clipboard in an Excel-friendly format.

In [None]:
fitter.fit_result_to_csv('test_export.csv')
fitter.fit_result_to_excel('test_export.xlsx')
fitter.fit_result_to_clipboard()