Part of 1st place solution (LSTM CNN Transformer Encoder) of Google-Brain-Ventilator competition

Competition website: https://www.kaggle.com/c/ventilator-pressure-prediction/overview
Our solution write-up: https://www.kaggle.com/c/ventilator-pressure-prediction/discussion/285256

Features

The features I use include a few lag and diff features: previous values of u_in, and the differences between the current u_in and those previous values. R and C are one-hot encoded, and so are the combinations of R and C. Cumulative u_in integrated over time is also calculated (area_true). For more details, see add_features in Functions.py.
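As a rough illustration of these feature types, here is a minimal pandas sketch. It is not the add_features function from Functions.py: the function name add_basic_features, the single lag of 1, and the output column names other than area_true are assumptions made for the example. The input columns (breath_id, time_step, u_in, R, C) follow the competition's train.csv.

```python
import pandas as pd


def add_basic_features(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical sketch of lag, diff, area and one-hot features."""
    df = df.copy()

    # Lag feature: previous u_in within the same breath (0 at the first step).
    df["u_in_lag1"] = df.groupby("breath_id")["u_in"].shift(1).fillna(0.0)

    # Diff feature: current u_in minus previous u_in within the same breath.
    df["u_in_diff1"] = df.groupby("breath_id")["u_in"].diff().fillna(0.0)

    # Cumulative u_in integrated over time, restarted for every breath.
    df["time_delta"] = df.groupby("breath_id")["time_step"].diff().fillna(0.0)
    df["area_true"] = (df["u_in"] * df["time_delta"]).groupby(df["breath_id"]).cumsum()

    # One-hot encode R, C, and the combination of R and C.
    df["R_C"] = df["R"].astype(str) + "_" + df["C"].astype(str)
    df = pd.get_dummies(df, columns=["R", "C", "R_C"])
    return df


example = pd.DataFrame(
    {
        "breath_id": [1, 1, 1, 2, 2],
        "time_step": [0.0, 0.5, 1.0, 0.0, 0.5],
        "u_in": [1.0, 2.0, 4.0, 3.0, 5.0],
        "R": [5, 5, 5, 20, 20],
        "C": [10, 10, 10, 50, 50],
    }
)
features = add_basic_features(example)
```

Grouping by breath_id keeps lags, diffs, and the running integral from leaking across breaths.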

Architecture

My deep learning architecture combines LSTM, 1D convolution, and transformers. LSTM is necessary for this data because the target pressure depends heavily on previous time points. Convolution pairs well with transformers: the transformer models global dependencies, while convolution makes up for the transformer's inability to capture local interactions.
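To make the convolution plus transformer pairing concrete, here is a minimal PyTorch sketch of one such block. It is an illustration only, not the code in Network.py: the class name ConvTransformerBlockSketch, the kernel size, the number of heads, and the feedforward width are all assumptions.

```python
import torch
import torch.nn as nn


class ConvTransformerBlockSketch(nn.Module):
    """Hypothetical block: 1D convolution (local) then self-attention (global)."""

    def __init__(self, d_model: int, n_heads: int = 4, kernel_size: int = 3):
        super().__init__()
        # An odd kernel with padding kernel_size // 2 keeps the sequence length.
        self.conv = nn.Conv1d(d_model, d_model, kernel_size, padding=kernel_size // 2)
        self.encoder = nn.TransformerEncoderLayer(
            d_model=d_model,
            nhead=n_heads,
            dim_feedforward=4 * d_model,
            batch_first=True,
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model). Conv1d expects (batch, channels, seq_len).
        local = torch.relu(self.conv(x.transpose(1, 2))).transpose(1, 2)
        # Self-attention then mixes information across all time steps.
        return self.encoder(local)


conv_block = ConvTransformerBlockSketch(d_model=16)
conv_out = conv_block(torch.randn(2, 80, 16))
```

The convolution sees only a few neighbouring time steps, and the encoder layer that follows attends over the whole breath.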

Since I use a series of many different modules, the network becomes quite deep, and I eventually ran into issues with gradient propagation because nn.LSTM has no residual connection. I therefore created a new module called ResidualLSTM, which adds a feedforward network (FFN) after the LSTM and connects the input of the LSTM to the output of the FFN with a residual connection. A simplified visualization of the architecture is in arch.png (Nl is the number of ResidualLSTM blocks and Nt is the number of convolution+transformer blocks).
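The idea behind ResidualLSTM can be sketched as follows. This is a simplified guess at the structure described above, not the module from Network.py: the class name ResidualLSTMSketch, the bidirectional LSTM, the FFN width, and the ReLU activation are assumptions.

```python
import torch
import torch.nn as nn


class ResidualLSTMSketch(nn.Module):
    """Hypothetical sketch: LSTM -> FFN, with the block input added back."""

    def __init__(self, d_model: int, dropout: float = 0.0):
        super().__init__()
        # Assumption: a bidirectional LSTM, so its output has 2 * d_model features.
        self.lstm = nn.LSTM(d_model, d_model, batch_first=True, bidirectional=True)
        # The FFN projects back to d_model so the residual sum is well defined.
        self.ffn = nn.Sequential(
            nn.Linear(2 * d_model, 4 * d_model),
            nn.ReLU(),
            nn.Dropout(dropout),
            nn.Linear(4 * d_model, d_model),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        lstm_out, _ = self.lstm(x)
        # Residual connection from the LSTM input to the FFN output.
        return x + self.ffn(lstm_out)


res_block = ResidualLSTMSketch(d_model=16)
res_in = torch.randn(2, 80, 16, requires_grad=True)
res_out = res_block(res_in)
res_out.sum().backward()
```

Because the output has the same shape as the input, several of these blocks can be stacked, and the identity path gives gradients a direct route back to earlier layers.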

Packages you need

PyTorch
Ranger optimizer: https://github.com/lessw2020/Ranger-Deep-Learning-Optimizer
scikit-learn
tqdm

To run

1. run.sh runs training. The only argument you need to change in run.sh is --path. Set it to the directory that contains train.csv and test.csv.
2. calculate_cv.py calculates the CV score and writes it to cv.txt.
3. predict.sh makes predictions, generates the prediction file, and saves the 10-fold predictions. As in step 1, set --path to the directory that contains train.csv, test.csv, and sample_submission.csv.
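For context on the CV score, the competition evaluates mean absolute error on the inspiratory phase only (rows where u_out is 0). The sketch below shows that metric with NumPy. It is not taken from calculate_cv.py, and the function name inspiratory_mae is an assumption for the example.

```python
import numpy as np


def inspiratory_mae(pressure_true, pressure_pred, u_out) -> float:
    """Hypothetical helper: MAE restricted to rows where u_out == 0."""
    pressure_true = np.asarray(pressure_true, dtype=float)
    pressure_pred = np.asarray(pressure_pred, dtype=float)
    # Only the inspiratory phase (u_out == 0) counts toward the score.
    mask = np.asarray(u_out) == 0
    return float(np.abs(pressure_true - pressure_pred)[mask].mean())


score = inspiratory_mae(
    pressure_true=[10.0, 20.0, 30.0, 40.0],
    pressure_pred=[11.0, 18.0, 30.0, 0.0],
    u_out=[0, 0, 1, 1],
)
```

Here the large error on the last row does not affect the score, because that row has u_out equal to 1.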

Files

Network.py has the architecture
Dataset.py has the dataset object
Functions.py has some functions I use (mainly add_features)
Logger.py is the custom CSV logger I use to log train/val loss and metrics

Repository: wmmxk/Google-Brain-Ventilator
