# Predicting Solar Flares with Machine Learning

### Yasser Abduallah, Jason T. L. Wang, Haimin Wang

## 1. Introduction

Solar flare prediction plays an important role in understanding and forecasting space weather. The main goal of the Helioseismic and Magnetic Imager (HMI), one of the instruments on NASA's Solar Dynamics Observatory, is to study the origin of solar variability and characterize the Sun's magnetic activity. HMI provides continuous, high-cadence, full-disk observations of the solar vector magnetic field, data that lend themselves to reliable prediction; yet efforts to predict solar flares with these data are still limited.

In this notebook we provide an overview of the FlareML system to demonstrate how to predict solar flares using machine learning (ML).

## 2. FlareML Workflow

### 2.1 Data Preparation & Loading

The data folder includes two sub-directories: train_data and test_data.
* The train_data includes a CSV training data file that is used to train the model. 
* The test_data includes a CSV test data file that is used to predict the included flares.

The files are loaded and used during the training and testing processes.
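
Before training, it can help to check what a CSV file actually contains. The sketch below is not part of FlareML: it uses only Python's standard `csv` module, the helper name `preview_csv` is hypothetical, and the column names in the demo file are made up for illustration rather than taken from the FlareML data.

```python
import csv
import os
import tempfile


def preview_csv(path, n_rows=3):
    """Return the header and the first n_rows data rows of a CSV file."""
    with open(path, newline="") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])
        # zip stops after n_rows items without reading the rest of the file.
        rows = [row for _, row in zip(range(n_rows), reader)]
    return header, rows


# Demo on a throwaway file; these columns are invented for illustration
# and are not the actual FlareML feature names.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo.csv")
    with open(demo_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["label", "feature_a", "feature_b"])
        writer.writerow(["M", "1.0", "2.0"])
        writer.writerow(["X", "3.0", "4.0"])
    demo_header, demo_rows = preview_csv(demo_path, n_rows=1)

print(demo_header)
print(demo_rows)
```

To inspect the sample training file instead, the same helper could be called as `preview_csv('data/train_data/flaringar_training_sample.csv')`.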


### 2.2 Model Training and Testing
You may train the model with your own data or retrain it with the default data (see Sections 2.2.1 and 2.2.2), or run predictions using the pretrained model (see Section 2.2.3).

#### 2.2.1 Model Training and Prediction
To train the model with your own data:
1. First upload or create your file in the data directory (in the file list on the left-hand side).
2. Edit the args variable in the following code and update the path to the training file:<br> 'train_data_file':'data/train_data/flaringar_training_sample.csv' <br>replacing the value 'data/train_data/flaringar_training_sample.csv' with your new file name.
3. You may also train your model with one of the following algorithms: ENS, RF, MLP, or ELM, by changing the args variable in the following code:<br>'algorithm': 'ENS'


In [None]:
print('Loading the train_model function...')
from flareml_train import train_model
# Training configuration: input CSV, algorithm (ENS, RF, MLP, or ELM), and model id.
args = {'train_data_file': 'data/train_data/flaringar_training_sample.csv',
        'algorithm': 'ENS',
        'modelid': 'custom_model_id'}
train_model(args)
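
A typo in the `algorithm` value is easy to make when editing `args` by hand. The following sketch is not part of FlareML: `make_train_args` is a hypothetical helper that builds a dictionary with the same keys as the one above and rejects any algorithm name outside the four listed in this section. Whether `train_model` performs its own validation is not stated here, and the file name in the example call is a placeholder.

```python
SUPPORTED_ALGORITHMS = ("ENS", "RF", "MLP", "ELM")  # names listed in this section


def make_train_args(train_data_file, algorithm="ENS", modelid="custom_model_id"):
    """Build a training args dict, rejecting unknown algorithm names."""
    algorithm = algorithm.strip().upper()
    if algorithm not in SUPPORTED_ALGORITHMS:
        raise ValueError(
            f"Unknown algorithm {algorithm!r}; expected one of {SUPPORTED_ALGORITHMS}"
        )
    return {
        "train_data_file": train_data_file,
        "algorithm": algorithm,
        "modelid": modelid,
    }


# 'my_training_file.csv' is a placeholder name, not a file shipped with FlareML.
train_args = make_train_args("data/train_data/my_training_file.csv", algorithm="rf")
print(train_args)
```

The resulting dictionary could then be passed to `train_model` in place of the hand-written `args`.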


#### 2.2.2 Predicting with Your Model
To run predictions on the test data with the model you trained above, make sure the modelid value in the args variable in the following code exactly matches the one used in training, for example: 'custom_model_id'.

In [None]:
from flareml_test import test_model
args = {'test_data_file': 'data/test_data/flaringar_simple_random_40.csv',
        'algorithm': 'ENS',
        'modelid': 'custom_model_id'}
custom_result = test_model(args)
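
Since the `modelid` used for testing has to match the one used for training, a mismatch between the two dictionaries is worth catching before any prediction runs. The sketch below is an illustration rather than FlareML code: `check_matching_args` is a hypothetical helper, and comparing `algorithm` as well is an assumption, since the text above only requires `modelid` to match.

```python
def check_matching_args(train_args, test_args, keys=("modelid", "algorithm")):
    """Return the list of keys whose values differ between the two dicts."""
    return [key for key in keys if train_args.get(key) != test_args.get(key)]


train_args = {"algorithm": "ENS", "modelid": "custom_model_id"}
test_args = {"algorithm": "ENS", "modelid": "custom_model"}  # deliberate typo

mismatched = check_matching_args(train_args, test_args)
if mismatched:
    print("Mismatched keys:", mismatched)
```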

#### 2.2.3 Prediction with Pretrained Model
Default pretrained models are available, so you can run predictions without training your own model. Set modelid to default_model, which uses all pretrained algorithms.

In [None]:
from flareml_test import test_model
args = {'test_data_file': 'data/test_data/flaringar_simple_random_40.csv',
        'algorithm': 'ENS',
        'modelid': 'default_model'}
result = test_model(args)

#### 2.2.4 Plotting the Results
The prediction result can be plotted by passing the result variable to the plot_result function, as in the following example.

In [None]:
from flareml_utils import plot_result
plot_result(result)

### 2.3 Training and Predicting with a Specific Model
You can train and predict with a specific model by setting the algorithm value in the following code to one of the following: RF, MLP, or ELM.
* RF is the random forest algorithm.
* MLP is the multilayer perceptron algorithm.
* ELM is the extreme learning machine algorithm.

In [None]:
from flareml_train import train_model
from flareml_test import test_model
from flareml_utils import plot_result
# One dictionary carries both the training and testing settings.
args = {'train_data_file': 'data/train_data/flaringar_training_sample.csv',
        'test_data_file': 'data/test_data/flaringar_simple_random_40.csv',
        'algorithm': 'ELM',
        'modelid': 'custom_model_id'}
print('Start the training')
train_model(args)

print('Start the prediction task.')
model_result = test_model(args)
print('Plotting the result')
plot_result(model_result)
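
To compare the individual algorithms, the same train and test steps can be repeated for each name. The sketch below shows one way to structure such a loop. It is an illustration under stated assumptions: `run_algorithms` is a hypothetical helper, the training and testing functions are passed in as parameters so the block runs without FlareML installed, and giving each algorithm its own `modelid` is a design choice of this sketch, not something FlareML is stated to require.

```python
def run_algorithms(train_fn, test_fn, train_file, test_file,
                   algorithms=("RF", "MLP", "ELM")):
    """Train and test once per algorithm; return results keyed by algorithm."""
    results = {}
    for algorithm in algorithms:
        args = {
            "train_data_file": train_file,
            "test_data_file": test_file,
            "algorithm": algorithm,
            # Hypothetical naming scheme so the runs do not share a model id.
            "modelid": f"custom_model_{algorithm.lower()}",
        }
        train_fn(args)
        results[algorithm] = test_fn(args)
    return results


# Stand-in functions so the sketch is runnable on its own.
def fake_train(args):
    print("training", args["algorithm"])


def fake_test(args):
    return {"modelid": args["modelid"]}


demo_results = run_algorithms(
    fake_train,
    fake_test,
    "data/train_data/flaringar_training_sample.csv",
    "data/test_data/flaringar_simple_random_40.csv",
)
print(sorted(demo_results))
```

In the notebook, `train_model` and `test_model` would be passed in place of the stand-ins, and each entry of the returned dictionary could be handed to `plot_result`.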

## 3. Acknowledgment
We thank the team of *SDO*/HMI for producing vector magnetic field data products.
The flare catalogs were prepared by and made available through NOAA NCEI.
The CME and SEP event records were provided by DONKI.
This work was supported by U.S. NSF grants AGS-1927578 and AGS-1954737.
J.W. thanks Manolis K. Georgoulis for helpful conversations at the SHINE 2019 Conference.
Q.L. and H.W. acknowledge the support of NASA under grants 80NSSC20K1282,
80NSSC18K0673, and 80NSSC18K1705.

## 4. References
DeepSun: machine-learning-as-a-service for solar flare prediction

Yasser Abduallah, Jason T. L. Wang and Haimin Wang 

https://iopscience.iop.org/article/10.1088/1674-4527/21/7/160