This repository is a concise and simplified PyTorch implementation of the model in the paper "The Neural Hawkes Process: A Neurally Self-Modulating Multivariate Point Process" by Hongyuan Mei and Jason Eisner.
Sequences of events of different types arise throughout our lives. For instance, a patient may be diagnosed with different diseases over the course of a medical record history; a stock may be bought or sold many times in a given day. We can represent the ith event in such a sequence as a tuple (ki, ti), where ki denotes the type of the event and ti denotes when it happens. A sequence of events can therefore be represented as a sequence of such tuples. Such sequences are usually called marked point processes or multivariate point processes. The problem we care about is predicting when the next event will happen and what its type will be, given a stream of events. That is, given a stream of events of the form:
(k1, t1), (k2, t2), (k3, t3) ... (kn, tn)
we want to predict the next event time and type (kn+1, tn+1)
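As a minimal sketch (my own illustration, not code from this repository), the event stream above can be held as a list of (type, time) tuples, from which the inter-event durations most point-process models consume are easily derived:

```python
# Illustrative event stream: (k_i, t_i) tuples with increasing timestamps.
events = [(2, 0.5), (0, 1.0), (2, 2.0), (1, 3.5)]

types = [k for k, _ in events]
times = [t for _, t in events]
# Inter-event duration: d_i = t_i - t_{i-1}
durations = [t1 - t0 for t0, t1 in zip(times, times[1:])]
print(durations)  # [0.5, 1.0, 1.5]
```

Predicting the next event then means predicting both the next duration (hence t_{n+1}) and the next type k_{n+1}.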
Please refer to J. G. Rasmussen, "Temporal Point Processes: the Conditional Intensity Function" (2009), for proofs and detailed derivations of the formulas at the places marked [1] above.
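For reference, the central objects in Rasmussen's notes are the conditional intensity function and the sequence log-likelihood it induces, which take the standard form:

```latex
\lambda^*(t) = \lim_{\Delta t \to 0}
  \frac{P\big(\text{event in } [t, t+\Delta t) \mid \mathcal{H}_t\big)}{\Delta t},
\qquad
\log L = \sum_{i=1}^{n} \log \lambda^*(t_i) - \int_{0}^{T} \lambda^*(s)\, ds
```

where \(\mathcal{H}_t\) is the history of events before time \(t\) and \(T\) is the end of the observation window.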
To learn more about how neural networks, RNNs, and LSTMs work, Dive into Deep Learning is a good source.
- To run the program on your computer, please make sure that you have the following files and packages installed.
- Python3: you can download through the link here: https://www.python.org/
- Numpy: you can download it through the command
pip install numpy
- Scikit-Learn: you can download it through the command
pip install scikit-learn
- matplotlib: you can download it through the command
pip install matplotlib
- PyTorch installation is more involved than for the packages above. See https://pytorch.org/get-started/locally/ for more information. If you still cannot install it on a Windows computer through pip, you can install Anaconda first and then install PyTorch through the method described here: https://dziganto.github.io/data%20science/python/anaconda/Creating-Conda-Environments/
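Before running the training script, a quick sanity check like the following (my own snippet, not part of the repository) verifies that all four packages can be found:

```python
import importlib.util

def check(packages):
    """Return {package_name: bool} without actually importing the packages."""
    return {p: importlib.util.find_spec(p) is not None for p in packages}

# 'sklearn' is the import name of Scikit-Learn; 'torch' is PyTorch's.
status = check(["numpy", "sklearn", "matplotlib", "torch"])
for name, ok in status.items():
    print(f"{name}: {'installed' if ok else 'MISSING'}")
```

If any package prints MISSING, install it with the commands above before continuing.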
2. To train the model, run the command below for more information:
!python train.py --help
Examples include:
!python train.py --dataset conttime
!python train.py --dataset hawkes --seq_len 75 --batch_size 64
3. To test the model, run the command below for more information:
!python test.py --help
Examples include:
!python test.py --dataset conttime --test_type 2
!python test.py --dataset self-correcting --test_type 1
- Google Colab:
Because of the model's complexity and long training time, it is better to train and test it on a cloud service such as Google Colab rather than on your laptop or desktop. Using Google Colab accelerates training and protects your personal laptop from overheating caused by long stretches of intense computation. If you are using a desktop built for neural-network training or scientific computing, you can simply ignore this section.
Google Colab is a platform that lets you write and run Python Jupyter notebooks on CPUs, GPUs, or TPUs designed for neural-network training. Google Colab also lets you run Linux command lines to execute Python files such as the ones in this repository.
To use Google Colab, open it in the Chrome browser, log in to your Google account, and follow the picture below. It is recommended to use a GPU to train the model. To switch to GPU mode, select Runtime > Change runtime type, and under Hardware accelerator select GPU. Type the commands below cell by cell:
!git clone https://github.com/Hongrui24/NeuralHawkesPytorch
%cd NeuralHawkesPytorch
Then you can use the commands in sections 2. and 3. above to train and test the model.
We use the data provided by Hongyuan Mei and Nan Du for our tests.
Name | Type of Dataset | Number of types | Number of training sequence | Number of testing sequence | Sequence Length Mean | Sequence Length Min | Sequence Length Max |
---|---|---|---|---|---|---|---|
data_hawkes, data_hawkeshib, data_conttime | Simulated | 5 | 8000 | 1000 | 60 | 20 | 100 |
MIMIC-II(1)(2)(3)(4)(5) | Real World Dataset | 75 | 527 | 65 | 3 | 1 | 31 |
SO(Stack Overflow) (1)(2)(3)(4)(5) | Real World Dataset | 22 | 4777 | 1326 | 72 | 41 | 736 |
hawkes, self-correcting | Simulated | 1 | 64 | 64 | train: 1406, testing: 156 | train: 1406, testing: 156 | train: 1406, testing: 156 |
The Electronic Medical Record dataset (MIMIC-II) is a collection of de-identified clinical visits of Intensive Care Unit patients over 7 years. Each event in the dataset records a time stamp and a disease diagnosis.
Description of SO (Stack Overflow) Datasets
The Stack Overflow dataset represents two years of user awards on a question-answering website: each user received a sequence of badges.
Notice:
The datasets 'data_hawkes', 'data_hawkeshib', 'conttime', 'hawkes', 'data_so', and 'self-correcting' in this repository are truncated from the original data because of upload limits and long training times. You may train on the data in this repository with more epochs to get results similar to those below. The original datasets can be found in this page and here
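The datasets ship as pickle files. Assuming they follow the layout of Hongyuan Mei's released data, where each file holds a dict of sequences and each event carries a type index and a timestamp, loading looks roughly like this. The key names ('train', 'type_event', 'time_since_start') are my assumption about that format and should be checked against the actual files:

```python
import pickle

# Build a toy file in the assumed layout, then read it back the way a
# training script would. This is an illustration, not the repo's loader.
toy = {"train": [
    [{"type_event": 0, "time_since_start": 0.5},
     {"type_event": 2, "time_since_start": 1.25}],
]}
with open("toy.pkl", "wb") as f:
    pickle.dump(toy, f)

with open("toy.pkl", "rb") as f:
    data = pickle.load(f)

first_seq = data["train"][0]
types = [e["type_event"] for e in first_seq]
times = [e["time_since_start"] for e in first_seq]
print(types, times)  # [0, 2] [0.5, 1.25]
```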
The first test calculates the average log-likelihood of events in the test file of "data_conttime" and compares it with the results in Hongyuan Mei's paper. The model is trained with lr = 0.01, epochs = 30, mini-batch size = 10. Test results:
Metric | Model Result | Result in Paper |
---|---|---|
log-likelihood over seqs | -0.99 | -1.00 to -0.98 |
log-likelihood over time | 0.447 | 0.440 to 0.455 |
log-likelihood over type | -1.44 | -1.44 to -1.43 |
We use this test to verify that our PyTorch implementation matches the Neural Hawkes model described in Hongyuan Mei's paper.
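The three rows in the table are related by an exact decomposition: the per-sequence log-likelihood splits into a time part and a type part. A numpy sketch with made-up stand-in values (not model output) shows the bookkeeping; `lam[i, k]` plays the role of the intensity of type k at the i-th event time, and the integral term would in practice be a Monte Carlo estimate:

```python
import numpy as np

lam = np.array([[0.2, 0.5, 0.3],     # intensity of each of K=3 types
                [0.1, 0.1, 0.8]])    # at each of the 2 event times
event_types = np.array([1, 2])       # observed types k_i
integral = 1.7                       # stand-in for the integral of total intensity

lam_total = lam.sum(axis=1)          # total intensity at each event time
lam_observed = lam[np.arange(len(event_types)), event_types]

# log-likelihood of the full sequence
ll_seq = np.log(lam_observed).sum() - integral
# "time" part: when events happen, ignoring their types
ll_time = np.log(lam_total).sum() - integral
# "type" part: log-probability of the observed type at each event
ll_type = np.log(lam_observed / lam_total).sum()

# Exact identity: ll_seq == ll_time + ll_type
print(ll_seq, ll_time, ll_type)
```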
- We also test our model on the self-correcting and Hawkes data provided in Du, Nan, et al., "Recurrent Marked Temporal Point Processes." We predict inter-event durations and intensities, and calculate the RMSE between the real inter-event durations and our predictions for events in a test sequence. We also compare the results with the predictions of Nan Du's RMTPP and with the optimal prediction. We train the model for 10 epochs with learning rate = 0.01 and truncated sequence length = 75.
- Result of "hawkes" (the first picture shows results by Neural Hawkes; the second shows results by RMTPP from Du et al.'s paper):
- Result of "self-correcting" (the first picture shows results by Neural Hawkes; the second shows results by RMTPP from Du et al.'s paper):
This test shows that the Neural Hawkes model can match the optimal prediction (the prediction made by the actual equation behind the dataset) for the hawkes and self-correcting data.
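The RMSE figure reported above is the root-mean-square error between predicted and true inter-event durations over a test sequence. A sketch with dummy values (not actual model predictions):

```python
import numpy as np

true_durations = np.array([0.8, 1.2, 0.5, 2.0])   # real d_i = t_i - t_{i-1}
pred_durations = np.array([1.0, 1.0, 0.7, 1.6])   # model's predicted d_i

# Root-mean-square error over the sequence
rmse = np.sqrt(np.mean((pred_durations - true_durations) ** 2))
print(rmse)
```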
The third test measures type-prediction accuracy. We choose two datasets: 'MIMIC-II' and 'SO' (Stack Overflow). For testing, we feed each sequence in the test file, except its last event, to the model trained on 'train.pkl' and compare the model's prediction with the actual type of the last event. We also look at how the loss and type-prediction accuracy change with the number of epochs, and we compare the type-prediction accuracy with that of a PyTorch implementation of RMTPP.
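The error rates in the tables that follow come from this comparison: one held-out last event per test sequence, predicted type versus actual type. A sketch with dummy values:

```python
# One entry per test sequence: the model's predicted type for the held-out
# last event, and the true type of that event (dummy values here).
predicted = [3, 0, 7, 1, 4]
actual    = [3, 2, 7, 1, 0]

# Error rate = fraction of sequences whose last-event type was mispredicted
errors = sum(p != a for p, a in zip(predicted, actual))
error_rate = errors / len(actual)
print(f"error rate: {error_rate:.1%}")  # 2 of 5 wrong -> 40.0%
```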
Model During Training:
Testing Results on MIMIC-II:
Dataset | (# epochs, lr) Error by Neural Hawkes | (# epochs, lr) Error by RMTPP |
---|---|---|
data_mimic1 | (200, 0.001) 10.8% | (700, 0.0005) 20% |
data_mimic2 | (300, 0.001) 16.9% | (900, 0.0005) 38.5% |
data_mimic3 | (200, 0.001) 16.9% | (900, 0.0005) 32.3% |
data_mimic4 | (200, 0.001) 20% | (2000, 0.0002) 36.9% |
data_mimic5 | (200, 0.001) 9.2% | (2000, 0.0002) 35.4% |
MIMIC-II Average | (---, ----)14.76% | (---, ----)32.62% |
Testing Results on SO dataset:
Dataset | (# epochs, lr) Error by Neural Hawkes |
---|---|
data_so1 | (20, 0.01) 62% |
data_so2 | (20, 0.01) 61.5% |
data_so3 | (20, 0.01) 59.5% |
data_so4 | (20, 0.01) 63% |
data_so5 | (20, 0.01) 62.3% |
average | (--, ---) 61.66% |
Type prediction achieves a lower error rate on the MIMIC-II dataset than on the Stack Overflow dataset. This may be caused by the simpler type sequences in MIMIC-II: event types within a single MIMIC-II sequence seldom change. The following are sample printouts of a sequence from MIMIC-II and from SO:
Sample MIMIC-II Sequence:
Sample SO Sequence:
Thus, the better type prediction on the MIMIC-II dataset may be caused by the recurrence of the same event type in each sequence.
This model was built by Hongrui Lyu, supervised by Hyunouk Ko and Dr. Huo. The file cont-time-cell is a copy from Hongyuan Mei's code, but all other files were written by us. As noted on the original GitHub page of the PyTorch implementation, this license needs to be included.