# Categorising CME events by the DST reading using a neural network

## The Data
The data was collected from [OMNIWeb](https://omniweb.gsfc.nasa.gov/form/dx1.html). Each event has the following columns:

- Date: The UTC date when the CME was detected in space
- Time: The UTC time of detection
- DST: Disturbance Storm Time, measured in nT
    - This gets encoded into the Dst_label column as 1 (magnitude above 100 nT) or 0 (magnitude below 100 nT)
    - This is the 'Label', 1 or 0
- Disturbance_date: The date when the disturbance was detected on Earth
- Disturbance_time: The time when the disturbance was detected on Earth

There are then 23 measured parameters, covering:
- Magnetic field
    - B(IMF): average IMF magnitude, nT
    - Vector_B: magnitude of the average IMF vector, nT
    - Lat_angle_B: Latitude of average IMF, degrees
    - Long_angle_B: Longitude of average IMF, degrees
    - Bz(GSE): z component of the IMF in Geocentric Solar Ecliptic coordinates
    - Bz(GSM): z component of the IMF in Geocentric Solar Magnetospheric coordinates
    - RMS_magnitude: error of magnetic field
    - RMS_field_vector: error of field vector
    - RMS_BZ_GSE: error of z component of GSE field
- Plasma
    - Plasma_Temp: temperature of plasma
    - Proton_Density: the density of protons in the CME
    - Plasma_speed: speed of plasma
    - Alpha/proton: ratio of alpha particles to protons
    - sigma-T 
    - sigma-n
    - sigma-V
    - sigma-ratio
    - flow_pressure: the flow pressure of the plasma
    - E_field: electric field
    - Plasma_beta
- Indices
    - Kp_index
    - Ap_index 
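
The parameter groups above can be written down as a small lookup so that the feature columns are selected by name rather than by position. This is only a sketch: it assumes the CSV column names match the names in the list exactly, the helper `select_features` is hypothetical, and the toy frame holds made-up values purely for illustration.

```python
import pandas as pd

# Column names copied from the list above; assumed to match the CSV headers.
FEATURE_GROUPS = {
    "magnetic_field": [
        "B(IMF)", "Vector_B", "Lat_angle_B", "Long_angle_B", "Bz(GSE)",
        "Bz(GSM)", "RMS_magnitude", "RMS_field_vector", "RMS_BZ_GSE",
    ],
    "plasma": [
        "Plasma_Temp", "Proton_Density", "Plasma_speed", "Alpha/proton",
        "sigma-T", "sigma-n", "sigma-V", "sigma-ratio", "flow_pressure",
        "E_field", "Plasma_beta",
    ],
    "indices": ["Kp_index", "Ap_index"],
}


def select_features(frame, groups=FEATURE_GROUPS):
    """Return the listed feature columns found in `frame`, plus the names not found."""
    wanted = [name for names in groups.values() for name in names]
    present = [name for name in wanted if name in frame.columns]
    missing = [name for name in wanted if name not in frame.columns]
    return frame[present], missing


# Toy frame with made-up values, only to show the call.
toy = pd.DataFrame({"DST": [-52.5, -180.0], "B(IMF)": [8.9, 13.6], "Kp_index": [30, 70]})
features, missing = select_features(toy)
print(list(features.columns))
print(len(missing))
```

Returning the missing names alongside the selected columns makes it easy to spot a header that is spelled differently in the file than in the list.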

## The labels
The labels are based on the DST value of the event. Different DST values indicate different strengths of geomagnetic storm when the event reaches Earth. The ranges below refer to the magnitude of the DST reading in nT (the recorded values are negative).
- 50 to 100 is moderate -> label is 0 (not geoeffective)
- 100 to 250 is storm -> label is 1 (geoeffective)
- 250+ is extreme -> label is 2 (extremely geoeffective)
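
One possible way to turn these ranges into labels is sketched below. It is not necessarily the encoding used in this notebook and rests on several assumptions: the recorded DST values are negative, so the thresholds are compared against the magnitude; readings with a magnitude below 50 nT are also given label 0; and a reading of exactly 100 or 250 nT falls into the higher class. The function name `encode_dst` is hypothetical and the example readings are illustrative.

```python
import numpy as np
import pandas as pd


def encode_dst(dst: pd.Series) -> pd.Series:
    """Map DST readings (nT) to labels 0, 1 or 2 by magnitude."""
    magnitude = dst.abs()
    # Conditions are checked in order, so the strongest class comes first.
    conditions = [magnitude >= 250, magnitude >= 100]
    choices = [2, 1]
    # Assumption: everything below 100 nT (including below 50 nT) is label 0.
    labels = np.select(conditions, choices, default=0)
    return pd.Series(labels, index=dst.index, name="Dst_label")


# Illustrative readings only.
example = pd.Series([-12.5, -52.5, -87.5, -100.0, -180.0, -420.85], name="DST")
labels = encode_dst(example)
print(labels.tolist())
```

If only the binary geoeffective / not geoeffective split is needed, the same idea reduces to `(dst.abs() >= 100).astype(int)`.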

## Loading the data

In [8]:
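import pandas as pd

# Load the events, using the CME detection date as the index
data = pd.read_csv('dst_events.csv', index_col='Date')
# Drop any event with a missing measurement
data = data.dropna()
# Summary statistics for every numeric column
print(data.describe())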

              DST      B(IMF)    Vector_B  Lat_angle_B  Long_angle_B  \
count  310.000000  310.000000  310.000000   310.000000    310.000000   
mean   -72.568383   11.077097    9.773226   -15.341290    192.394194   
std     64.242992    6.969821    6.696144    29.962048    102.219610   
min   -420.851064    3.100000    0.600000   -82.000000      0.600000   
25%    -87.500000    6.400000    5.325000   -37.575000    113.875000   
50%    -52.500000    8.900000    7.700000   -14.400000    159.850000   
75%    -32.500000   13.575000   12.175000     3.100000    296.425000   
max    -12.500000   54.800000   53.300000    77.900000    358.900000   

          Bz(GSE)     Bz(GSM)  RMS_magnitude  RMS_field_vector  RMS_BZ_GSE  \
count  310.000000  310.000000     310.000000        310.000000  310.000000   
mean    -2.340645   -2.904194       0.751290          3.895161    2.514516   
std      5.527222    5.834543       1.104294          3.767869    2.619410   
min    -20.800000  -26.600000       0.0

In [None]:
# encode the DST values