In this section, we provide some background and motivation for the project.
The tropical stratospheric Quasi-Biennial Oscillation is driven largely by gravity wave breaking, which models have to parametrize. This makes the parametrization both an interesting and a necessary task.
Currently, the parametrization is largely physics-based. We hope that, by adopting machine learning, we can obtain a parametrization with improved accuracy and computational feasibility.
Our project concerns a very interesting phenomenon in atmospheric fluid dynamics called the Quasi-Biennial Oscillation, QBO for short. The equatorial zonal wind in the tropical stratosphere oscillates between easterlies and westerlies with a mean period of 28 to 29 months (hence quasi-biennial) due to gravity wave forcing.
Our job, therefore, is to parametrize the gravity waves that drive the oscillation. In particular, our work is largely based on the one-dimensional QBO model (QBO-1d).
The QBO-1d model is largely a hybrid of the models used by Holton and Lindzen (1972) and Plumb (1977). The (advection-diffusion) equation reads:
where
Currently, the forcing term is parametrized in the following way.
where the wave flux 𝐹(𝑢,𝑧) is parameterized as follows:
and
Note that, when
The stochasticity enters the story by making
The pipeline of the project can be visualized in the following image.
We have goals from two perspectives.
- Offline: The machine learning model's predictions on the test dataset should be satisfactory. For offline performance we mainly use (level-wise mean) / RMSE as the metric.
- Online: The machine learning model should behave similarly to the physical model when inserted into the PDE, hopefully with better accuracy and efficiency.
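To make the offline metric concrete, here is a minimal sketch of a level-wise RMSE. It assumes that predictions and targets are stored as arrays of shape `(n_samples, n_levels)`, one column per vertical level. The function name `level_wise_rmse` and the toy data are hypothetical and not taken from the project code.

```python
import numpy as np


def level_wise_rmse(y_true, y_pred):
    """Return the RMSE for each vertical level.

    Both inputs are assumed to have shape (n_samples, n_levels).
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    if y_true.shape != y_pred.shape:
        raise ValueError("y_true and y_pred must have the same shape")
    # Average the squared error over samples (axis 0), keeping one value per level.
    return np.sqrt(np.mean((y_true - y_pred) ** 2, axis=0))


# Hypothetical toy data: 4 samples, 3 levels.
y_true = np.zeros((4, 3))
y_pred = np.array([[1.0, 0.0, 2.0]] * 4)

rmse_per_level = level_wise_rmse(y_true, y_pred)
mean_rmse = rmse_per_level.mean()  # the level-wise mean of the RMSE
```

Averaging over levels gives a single number for model comparison, while the per-level values show where in the column the emulator struggles.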
Despite the great breakthroughs in classification, regression is still a hard task. Here is a list of the machine learning algorithms for regression that I can think of:
- Linear regression
- Neural Nets
- Regression trees / forests
- Support Vector Machine
Linear models will by no means be used in practice; we treat them as a baseline. Neural nets, trees, and forests have already been explored by other members of the group. Therefore, our main focus is the application of support vector machines for regression (SVR for short) to our problem.
For an introduction to support vector machines and regression, please refer to my presentation slides and weekly report.
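As a rough illustration of the comparison described above (a linear baseline against SVR), the sketch below fits both models with scikit-learn on synthetic data. Everything here is an assumption for illustration: the sine-shaped toy target, the RBF kernel, the standardization step, and the small hyperparameter grid are not the project's actual data, preprocessing, or settings.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic nonlinear regression problem (a stand-in, not the QBO dataset).
rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(300, 1))
y = np.sin(X[:, 0]) + 0.05 * rng.normal(size=300)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Baseline: ordinary least squares.
baseline = LinearRegression().fit(X_train, y_train)

# SVR with input standardization and a small, illustrative grid search.
svr = make_pipeline(StandardScaler(), SVR(kernel="rbf"))
param_grid = {
    "svr__C": [1.0, 10.0],
    "svr__gamma": [0.1, 1.0],
    "svr__epsilon": [0.01, 0.1],
}
search = GridSearchCV(
    svr, param_grid, cv=3, scoring="neg_root_mean_squared_error"
)
search.fit(X_train, y_train)


def rmse(model, X, y):
    """Root-mean-square error of a fitted model on (X, y)."""
    return float(np.sqrt(np.mean((model.predict(X) - y) ** 2)))


baseline_rmse = rmse(baseline, X_test, y_test)
svr_rmse = rmse(search.best_estimator_, X_test, y_test)
```

Note that scikit-learn's `SVR` predicts a single output; a multi-level target would need one model per level (for example via `sklearn.multioutput.MultiOutputRegressor`), which is one possible design and not necessarily the one used in this project.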
With some highly nontrivial data preprocessing and hyperparameter tuning, SVR models are able to achieve fairly satisfactory performance on the dataset (
With some highly nontrivial hyperparameter tuning, SVR models can successfully emulate the physics-based gravity wave parametrization, thus producing the correct oscillation.
Unfortunately, the efficiency of SVR models is not satisfactory compared with pre-existing NN models (which is very counter-intuitive). The low efficiency is mainly due to the computation of the exponential function inside the kernel.
The SVR model also has some ability to 'generalize'. For the definition of generalization and further details, please refer to my presentation slides.
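To see where the cost comes from, the following sketch evaluates an SVR prediction by hand, assuming an RBF (Gaussian) kernel, which is consistent with the exponential mentioned above but is not confirmed by the text. The function `rbf_svr_predict` and the toy coefficients are hypothetical. Each prediction needs one exponential per support vector, so the cost grows with the number of support vectors.

```python
import numpy as np


def rbf_svr_predict(X, support_vectors, dual_coef, intercept, gamma):
    """Evaluate f(x) = sum_i dual_coef[i] * exp(-gamma * ||x - sv_i||^2) + intercept."""
    X = np.asarray(X, dtype=float)
    sv = np.asarray(support_vectors, dtype=float)
    # Squared Euclidean distance between every query point and every support vector.
    sq_dist = ((X[:, None, :] - sv[None, :, :]) ** 2).sum(axis=2)
    # One exponential per (query point, support vector) pair.
    kernel = np.exp(-gamma * sq_dist)
    return kernel @ np.asarray(dual_coef, dtype=float) + intercept


# Hypothetical toy model with two support vectors in one dimension.
sv = np.array([[0.0], [1.0]])
coef = np.array([2.0, -1.0])
queries = np.array([[0.0]])

pred = rbf_svr_predict(queries, sv, coef, intercept=0.5, gamma=1.0)
n_exponentials = queries.shape[0] * sv.shape[0]  # exponentials per call
```

Under this assumption, a model that keeps fewer support vectors evaluates fewer exponentials per time step, which is one way to trade a little accuracy for speed.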
- A Support Vector Regression model can emulate the physics-based gravity wave parametrization (in the QBO-1d model).
- The relation between offline performance and online performance is unknown.
- The Support Vector Regression model has some ability to generalize.
- A small but dense model outperforms a big but sparse model in efficiency, with comparable online emulation results. However, SVR is not an inherently efficient method.