NARX uses missing values to split time series, but this does not work for auto-regression (AR) models in multi-step-ahead (MSA) fitting. In MSA fitting, the optimiser recognises the boundary between two time series from the missing values in the inputs X, and AR models have no inputs X.
In addition, splitting the time series with session_sizes uses less memory than the previous separation method, which injects max_delay missing values into X and y.
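
The difference between the two separation methods can be illustrated with a minimal NumPy sketch. It is only an illustration under stated assumptions, not the library's implementation: the helper names `join_with_nan_gap`, `join_with_session_sizes` and `split_sessions` are hypothetical, and only `session_sizes` and `max_delay` come from the description above. An AR-style target `y` without inputs X is used for brevity.

```python
import numpy as np


def join_with_nan_gap(series_list, max_delay):
    """Padding-based separation: put max_delay NaNs between consecutive series."""
    gap = np.full(max_delay, np.nan)
    parts = []
    for i, s in enumerate(series_list):
        if i > 0:
            parts.append(gap)
        parts.append(np.asarray(s, dtype=float))
    return np.concatenate(parts)


def join_with_session_sizes(series_list):
    """Size-based separation: concatenate without padding, record each length."""
    y = np.concatenate([np.asarray(s, dtype=float) for s in series_list])
    session_sizes = np.array([len(s) for s in series_list])
    return y, session_sizes


def split_sessions(y, session_sizes):
    """Recover the individual series from the boundaries implied by session_sizes."""
    boundaries = np.cumsum(session_sizes)[:-1]
    return np.split(y, boundaries)


# Two short sessions of an AR-style series (hypothetical data).
y_a = np.array([1.0, 2.0, 3.0])
y_b = np.array([10.0, 20.0])
max_delay = 2

y_padded = join_with_nan_gap([y_a, y_b], max_delay)
y_flat, session_sizes = join_with_session_sizes([y_a, y_b])
sessions = split_sessions(y_flat, session_sizes)

print(len(y_padded), len(y_flat), session_sizes.tolist())
```

In this sketch the padded array stores `max_delay` extra elements per boundary, whereas the size-based variant stores only one integer per session, and the boundaries stay available even when there is no X to carry missing values.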