# Event Analysis
This jupyter notebook will show you how to use the Event Analysis Library You can install the library by running
```
pip install event-analysis
```
If you have an Nvidia GPU and you want to accelerate your workload using the GPU, you will need to install PyCuda which you can by running
```
pip install pycuda
```
The library provides functions for Event Synchronisation and Event Coincidence Analysis. Read the documentation if you want to learn more about them.



In [None]:
!pip install geopandas pycuda event-analysis

To use the library you first need to create a pandas dataframe. The dataframe must satisfy 2 properties

1) The index of the dataframe must be a pandas DatetimeIndex.
2) The internal data of the dataframe must be boolean

By default the library quantiese the time to hours, ignoring minutes and seconds. However this behavious can be overridden by passing the time_normalization_factor argument to the constructor. You can set it to 1 and quantise to seconds , however this limits you to just 69 years of data (nice??). Setting it to 60 will quantize it to minutes allowing for 4000 years of data, finally setting it to 3600 (the default value) will allow for 250 centuries of data! 

For this example I have written helper function to convert my data into this format. You will have to write your own!



In [2]:
import datetime
import numpy as np
import pandas as pd
import geopandas as gpd
from helperfunction import numpy_from_csv_data, get_Df_From_numpy

In [5]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [6]:
india_mp_gdf = gpd.read_file("/content/drive/MyDrive/shapefile/Ind.shp")
map_shape_object = india_mp_gdf.geometry
file_name = "/content/drive/MyDrive/trmm_3hrs_precp_2003_2019.csv"

In [7]:
numpy_mat, missing_vals_index_set, viable_columns = numpy_from_csv_data(file_name, map_shape_object)
starting_date = datetime.datetime(2003,1,1,hour=0,minute=0,second=0,microsecond=0)
time_delta = datetime.timedelta(hours=3)
rainDataDf = get_Df_From_numpy(numpy_mat,  viable_columns, starting_date = starting_date, time_delta = time_delta)
monsoonDataDf = rainDataDf[rainDataDf.index.month >= 6 | (rainDataDf.index.month <=9) ]
monsoonEventDf = monsoonDataDf > np.percentile(monsoonDataDf.to_numpy(copy = True), 98, interpolation = "lower", overwrite_input = True)


Dropping Not Point elements / Points outside shape
Number of elements dropped :: 45002
Starting main Loop :: Remove Missing Values


14384it [00:48, 297.64it/s]


Number of values dropped :: 115526


I am going to be analysing the rainfall events over the Indian sub-continent at 3 hour different from the year 2003 to 2019

In [8]:
monsoonEventDf.head()

Unnamed: 0_level_0,X93.875Y6.875,X93.625Y7.375,X77.375Y8.125,X77.125Y8.375,X77.375Y8.375,X77.625Y8.375,X77.875Y8.375,X76.875Y8.625,X77.125Y8.625,X77.375Y8.625,...,X74.875Y36.625,X75.125Y36.625,X75.375Y36.625,X75.625Y36.625,X73.875Y36.875,X74.375Y36.875,X74.625Y36.875,X74.875Y36.875,X75.125Y36.875,X75.375Y36.875
DateTime,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1
2003-07-01 00:00:00,False,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False
2003-07-01 03:00:00,False,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False
2003-07-01 06:00:00,False,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False
2003-07-01 09:00:00,False,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,True
2003-07-01 12:00:00,False,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False


In [9]:
from EventAnalysis import EventAnalysis

In [10]:
EA = EventAnalysis(monsoonEventDf)

In [11]:
ES_Q = EA.ES_Cuda()

Device ID was not specified, reverting to 0
Device in use 
	Tesla T4
Time elapsed to run the computation :: 2 minutes 24 seconds 


In [12]:
ES_Q

Coordinates,X93.875Y6.875,X93.625Y7.375,X77.375Y8.125,X77.125Y8.375,X77.375Y8.375,X77.625Y8.375,X77.875Y8.375,X76.875Y8.625,X77.125Y8.625,X77.375Y8.625,...,X74.875Y36.625,X75.125Y36.625,X75.375Y36.625,X75.625Y36.625,X73.875Y36.875,X74.375Y36.875,X74.625Y36.875,X74.875Y36.875,X75.125Y36.875,X75.375Y36.875
Coordinates,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1
X93.875Y6.875,1.000000,0.155602,0.087019,0.008548,0.007796,0.010177,0.033337,0.105532,0.146826,0.005051,...,0.172280,0.085991,0.126926,0.050039,0.029070,0.026991,0.035718,0.008702,0.058730,0.017190
X93.625Y7.375,0.155602,1.000000,0.151794,0.000000,0.010199,0.026628,0.016615,0.128061,0.162210,0.154198,...,0.087652,0.121429,0.149024,0.098198,0.084515,0.070624,0.006231,0.053128,0.059761,0.004998
X77.375Y8.125,0.087019,0.151794,1.000000,0.015844,0.007225,0.031437,0.039722,0.148830,0.099786,0.070219,...,0.062091,0.151794,0.117629,0.139122,0.071842,0.058366,0.006620,0.008065,0.063500,0.021241
X77.125Y8.375,0.008548,0.000000,0.015844,1.000000,0.166783,0.077205,0.086712,0.000000,0.004456,0.009197,...,0.013070,0.007456,0.048887,0.005694,0.026465,0.032763,0.084544,0.039610,0.044556,0.114759
X77.375Y8.375,0.007796,0.010199,0.007225,0.166783,1.000000,0.132370,0.090942,0.011429,0.008127,0.004194,...,0.027813,0.023798,0.040531,0.031159,0.048271,0.082167,0.071172,0.093922,0.016254,0.066601
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
X74.375Y36.875,0.026991,0.070624,0.058366,0.032763,0.082167,0.091009,0.045632,0.052758,0.028137,0.048400,...,0.055025,0.109859,0.037421,0.095893,0.092848,1.000000,0.109517,0.066704,0.056274,0.120784
X74.625Y36.875,0.035718,0.006231,0.006620,0.084544,0.071172,0.123876,0.065217,0.020945,0.059576,0.000000,...,0.036408,0.006231,0.044568,0.066621,0.000000,0.109517,1.000000,0.026481,0.029788,0.095903
X74.875Y36.875,0.008702,0.053128,0.008065,0.039610,0.093922,0.044012,0.044136,0.025514,0.045357,0.093626,...,0.062091,0.030359,0.045242,0.057967,0.071842,0.066704,0.026481,1.000000,0.018143,0.042481
X75.125Y36.875,0.058730,0.059761,0.063500,0.044556,0.016254,0.053044,0.019859,0.105231,0.010204,0.000000,...,0.019955,0.059761,0.071247,0.000000,0.060609,0.056274,0.029788,0.018143,1.000000,0.059732


In [13]:
EC_p_max,EC_p_mean, EC_t_max,EC_t_mean, p_prec, p_trig = EA.ECA_Cuda(datetime.timedelta(hours = 100), return_p_values = True)

Time elapsed to run the computation :: 0 minutes 5 seconds 


In [14]:
EC_t_max

Coordinates,X93.875Y6.875,X93.625Y7.375,X77.375Y8.125,X77.125Y8.375,X77.375Y8.375,X77.625Y8.375,X77.875Y8.375,X76.875Y8.625,X77.125Y8.625,X77.375Y8.625,...,X74.875Y36.625,X75.125Y36.625,X75.375Y36.625,X75.625Y36.625,X73.875Y36.875,X74.375Y36.875,X74.625Y36.875,X74.875Y36.875,X75.125Y36.875,X75.375Y36.875
Coordinates,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1
X93.875Y6.875,1.000000,0.734884,0.530233,0.011583,0.023256,0.241860,0.176744,0.683721,0.631313,0.148837,...,0.688372,0.772093,0.762791,0.446512,0.218605,0.283333,0.176744,0.187500,0.431373,0.117241
X93.625Y7.375,0.734884,1.000000,0.756000,0.039007,0.080386,0.127660,0.181818,0.680000,0.833333,0.887097,...,0.390071,0.695035,0.719858,0.680328,0.519231,0.283333,0.117021,0.218750,0.372549,0.117241
X77.375Y8.125,0.530233,0.756000,1.000000,0.069498,0.096463,0.220000,0.325359,0.604444,0.532000,0.478495,...,0.524000,0.800000,0.668000,0.532787,0.480769,0.266667,0.074468,0.312500,0.196078,0.124138
X77.125Y8.375,0.011583,0.039007,0.069498,1.000000,0.903475,0.899614,0.548263,0.050193,0.030303,0.102151,...,0.077220,0.049645,0.166023,0.090164,0.019231,0.300000,0.633205,0.234375,0.470588,0.714286
X77.375Y8.375,0.023256,0.080386,0.096463,0.903475,1.000000,0.938907,0.722488,0.053333,0.032154,0.010753,...,0.260870,0.122186,0.105528,0.086817,0.288462,0.400000,0.574468,0.359375,0.470588,0.710345
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
X74.375Y36.875,0.283333,0.283333,0.266667,0.300000,0.400000,0.700000,0.366667,0.283333,0.050000,0.311828,...,0.283333,0.466667,0.233333,0.250000,0.269231,1.000000,0.150000,0.359375,0.411765,0.416667
X74.625Y36.875,0.176744,0.117021,0.074468,0.633205,0.574468,0.808511,0.311005,0.053333,0.297980,0.095745,...,0.404255,0.212766,0.404255,0.095745,0.000000,0.150000,1.000000,0.328125,0.255319,0.510638
X74.875Y36.875,0.187500,0.218750,0.312500,0.234375,0.359375,0.656250,0.203125,0.250000,0.141414,0.295699,...,0.453125,0.500000,0.093750,0.375000,0.250000,0.359375,0.328125,1.000000,0.078431,0.609375
X75.125Y36.875,0.431373,0.372549,0.196078,0.470588,0.470588,0.568627,0.215686,0.431373,0.126263,0.048387,...,0.411765,0.470588,0.392157,0.039216,0.076923,0.411765,0.255319,0.078431,1.000000,0.490196


In [15]:
for ECA_res in EA.ECA_vec_Cuda([datetime.timedelta(hours = h) for h in range(3, 30, 10)]):
  EC_p_max,EC_p_mean, EC_t_max,EC_t_mean = ECA_res

Time elapsed to run the computation :: 0 minutes 5 seconds 
