Description

The code is a user-friendly version of the code used in Chinn et al. (2023), allowing to forecast using the three-step approach. Please note this is a simplified version.

For more details over the three-step approach and the possible methods at each step, please refer to Chinn, M. D., Meunier, B., Stumpner, S. (2023). "Nowcasting World Trade with Machine Learning: a Three-Step Approach", NBER Working Paper, No 31419, National Bureau of Economic Research

How to run the code?

The code works by running Main.R . The user only needs to provide an input dataset and to select the settings (methods, periods, horizons) of the three-step approach.

The input dataset contains the target variable and a set of potential regressors. It should be located in the folder ‘1-Inputs’. An example input dataset with randomly generated series is provided to give a broad idea of the required data structure. The input dataset should:

include a variable named 'target' (= world CPB trade in Chinn et al., 2023). NB: Please do NOT include lags of 'target' (two lags are created automatically in the code).
include a variable named 'date' which must:
- be at monthly frequency (other frequencies can be supported with a slight adaptation of the code)
- be encoded in the format YYYY-MM-15 (NB: please put 15 as day of the month)
have variables already transformed (no transformation is performed in the code)

Different sorts of input variables are supported including:

series with leading and trailing missing values: see for example, first 25 variables (columns C to AA) in the example dataset
series with random missing observations: see for example, last 20 variables (columns CE to CX) in the example dataset
non-stationary series: see for example, 3 variables named 'nonstat' (columns CY to DA) in the example dataset (respectively a RW, a RW with drift, and a RW with linear trend)

Settings of the three-step approach should be entered directly in Main.R which provides a detailed description of each required user setting. In a nutshell, the user can produce forecasts and evaluate models by:

selecting one (or several) method(s) for pre-selection
selecting one (or several) method(s) for regression
selecting one (or several) horizon(s) for forecast
setting the period for out-of-sample predictions (on which models are evaluated)
setting the starting date for in-sample estimation

NB: for pre-selection, the user must also specify the number of variables that are kept (=60 in Chinn et al., 2023).

What are the outputs of the code?

Outputs are created in the folder '2-Output' with sub-folders by forecast horizons (e.g. folder 'h0' for a nowcast). Three types of Excel outputs are provided for each run of Main.R . The type of output is indicated by the name:

'pred' = out-of-sample predictions at each point in time. Columns are:
- 'date' = date of the actual data (NOT the date of the forecast)
- 'true_value' = actual data
- one column for each regression technique
'rmse' = out-of-sample RMSE, separated between crisis (2008-2009 and 2020-2021) and non-crisis (other years)
'summaryALL' = summary of out-of-sample RMSE. The structure is:
- second row corresponds to the pre-selection technique used
- Third row corresponds to the number of variables kept after pre-selection
- each row (>3) corresponds to a regression technique. Values indicate the out-of-sample RMSE

NB: Per horizon, each run of Main.R creates one ‘pred’ Excel and one ‘rmse’ Excel for each pre-selection technique tested. It also creates different Excels for each number of variables kept. For example, if the user tests LARS and SIS with 50 and 60 variables kept, four different ‘pred’ and four different ‘rmse’ Excels will be created (LARS-50, LARS-60, SIS-50, SIS-60). By contrast, only a single 'summaryALL' Excel per horizon is created at each run of Main.R .

Besides the type of output, the name of the Excel indicates the settings of the three-step approach:

'sel_xxx' = xxx relates to the pre-selection technique(s) used
'n_yyy' = yyy relates to the number of variables kept after pre-selection
'reg_zzz' = zzz relates to the regression technique(s) used
'h_aaa' = aaa relates to the horizon of forecast
last items in the name of the Excel refer to start and end dates of the out-of-sample forecasts

Examples of output Excel are provided in the folder ‘2-Output’.

Contact and references

Please report any bug and address any question to baptistemeunier@hotmail.fr.

Please cite as Chinn, M. D., Meunier, B., Stumpner, S. (2023). "Nowcasting World Trade with Machine Learning: a Three-Step Approach", NBER Working Paper, No 31419, National Bureau of Economic Research

Where appropriate, please also cite

Macroeconomic random forest as: Goulet-Coulombe, P. (2020). “The Macroeconomy as a Random Forest”, arXiv pre-print
Linear gradient boosting as: Chen, T., and Guestrin, C. (2016). “XGBoost: A Scalable Tree Boosting System”, in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 785–794
Least Angle Regression (LARS) as: Efron, B., Hastie, T., Johnstone, I., and Tibshirani, R. (2004). “Least angle regression”, Annals of Statistics, 32(2), pp. 407–499
Sure Independence Screening (SIS) as: Fan, J., and Lv, J. (2008). “Sure independence screening for ultrahigh dimensional feature space”, Journal of the Royal Statistical Society Series B, 70(5), pp. 849–911
T-stat-based pre-selection as: Bair, E., Hastie, T., Paul, D., and Tibshirani, R. (2006). “Prediction by supervised principal components”, Journal of the American Statistical Association, 101(473), pp. 119–137
Iterated Bayesian Model Averaging as: Yeung, K., Bumgarner, R., and Raftery, A. (2005). “Bayesian Model Averaging: Development of an improved multi-class, gene selection and classification tool for microarray data”, Bioinformatics, 21(10), pp. 2394–2402

Name		Name	Last commit message	Last commit date
Latest commit History 15 Commits
11-Github		11-Github
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

11-Github

11-Github

README.md

README.md

Repository files navigation

Description

How to run the code?

What are the outputs of the code?

Contact and references

About

Releases

Packages

Languages

baptiste-meunier/NowcastingML_3step

Folders and files

Latest commit

History

11-Github

11-Github

README.md

README.md

Repository files navigation

Description

How to run the code?

What are the outputs of the code?

Contact and references

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages