docs-ref-autogen/azure-ai-ml/azure.ai.ml.automl.ForecastingSettings.yml

### YamlMime:PythonClass
uid: azure.ai.ml.automl.ForecastingSettings
name: ForecastingSettings
fullName: azure.ai.ml.automl.ForecastingSettings
module: azure.ai.ml.automl
inheritances:
- azure.ai.ml.entities._mixins.RestTranslatableMixin
summary: Forecasting settings for an AutoML Job.
constructor:
  syntax: 'ForecastingSettings(*, country_or_region_for_holidays: str | None = None,
    cv_step_size: int | None = None, forecast_horizon: str | int | None = None, target_lags:
    str | int | List[int] | None = None, target_rolling_window_size: str | int | None
    = None, frequency: str | None = None, feature_lags: str | None = None, seasonality:
    str | int | None = None, use_stl: str | None = None, short_series_handling_config:
    str | None = None, target_aggregate_function: str | None = None, time_column_name:
    str | None = None, time_series_id_column_names: str | List[str] | None = None,
    features_unknown_at_forecast_time: str | List[str] | None = None)'
  parameters:
  - name: country_or_region_for_holidays
    description: 'The country/region used to generate holiday features. These should
      be ISO

      3166 two-letter country/region code, for example ''US'' or ''GB''.'
    isRequired: true
    types:
    - <xref:typing.Optional>[<xref:str>]
  - name: cv_step_size
    description: 'Number of periods between the origin_time of one CV fold and the
      next fold. For

      example, if *n_step* = 3 for daily data, the origin time for each fold will
      be

      three days apart.'
    isRequired: true
    types:
    - <xref:typing.Optional>[<xref:int>]
  - name: forecast_horizon
    description: 'The desired maximum forecast horizon in units of time-series frequency.
      The default value is 1.


      Units are based on the time interval of your training data, e.g., monthly, weekly
      that the forecaster

      should predict out. When task type is forecasting, this parameter is required.
      For more information on

      setting forecasting parameters, see [Auto-train a time-series forecast model](https://docs.microsoft.com/azure/machine-learning/how-to-auto-train-forecast).'
    isRequired: true
    types:
    - <xref:typing.Optional>[<xref:typing.Union>[<xref:int>, <xref:str>]]
  - name: target_lags
    description: "The number of past periods to lag from the target column. By default\
      \ the lags are turned off.\n\nWhen forecasting, this parameter represents the\
      \ number of rows to lag the target values based\non the frequency of the data.\
      \ This is represented as a list or single integer. Lag should be used\nwhen\
      \ the relationship between the independent variables and dependent variable\
      \ do not match up or\ncorrelate by default. For example, when trying to forecast\
      \ demand for a product, the demand in any\nmonth may depend on the price of\
      \ specific commodities 3 months prior. In this example, you may want\nto lag\
      \ the target (demand) negatively by 3 months so that the model is training on\
      \ the correct\nrelationship. For more information, see [Auto-train a time-series\
      \ forecast model](https://docs.microsoft.com/azure/machine-learning/how-to-auto-train-forecast).\n\
      \n**Note on auto detection of target lags and rolling window size.\nPlease see\
      \ the corresponding comments in the rolling window section.**\nWe use the next\
      \ algorithm to detect the optimal target lag and rolling window size.\n\n1.\
      \ Estimate the maximum lag order for the look back feature selection. In our\
      \ case it is the number of periods till the next date frequency granularity\
      \ i.e. if frequency is daily, it will be a week (7), if it is a week, it will\
      \ be month (4). That values multiplied by two is the largest possible values\
      \ of lags/rolling windows. In our examples, we will consider the maximum lag\
      \ order of 14 and 8 respectively). \n\n2. Create a de-seasonalized series by\
      \ adding trend and residual components. This will be used in the next step.\
      \ \n\n3. Estimate the PACF - Partial Auto Correlation Function on the on the\
      \ data from (2) and search for points, where the auto correlation is significant\
      \ i.e. its absolute value is more then 1.96/square_root(maximal lag value),\
      \ which correspond to significance of 95%. \n\n4. If all points are significant,\
      \ we consider it being strong seasonality and do not create look back features.\
      \ \n\n5. We scan the PACF values from the beginning and the value before the\
      \ first insignificant auto correlation will designate the lag. If first significant\
      \ element (value correlate with itself) is followed by insignificant, the lag\
      \ will be 0 and we will not use look back features."
    isRequired: true
    types:
    - <xref:typing.Union>[<xref:str>, <xref:int>, <xref:typing.List>[<xref:int>]]
  - name: target_rolling_window_size
    description: 'The number of past periods used to create a rolling window average
      of the target column.


      When forecasting, this parameter represents *n* historical periods to use to
      generate forecasted values,

      <= training set size. If omitted, *n* is the full training set size. Specify
      this parameter

      when you only want to consider a certain amount of history when training the
      model.

      If set to ''auto'', rolling window will be estimated as the last

      value where the PACF is more then the significance threshold. Please see target_lags
      section for details.'
    isRequired: true
    types:
    - <xref:typing.Optional>[<xref:typing.Union>[<xref:str>, <xref:int>]]
  - name: frequency
    description: 'Forecast frequency.


      When forecasting, this parameter represents the period with which the forecast
      is desired,

      for example daily, weekly, yearly, etc. The forecast frequency is dataset frequency
      by default.

      You can optionally set it to greater (but not lesser) than dataset frequency.

      We''ll aggregate the data and generate the results at forecast frequency. For
      example,

      for daily data, you can set the frequency to be daily, weekly or monthly, but
      not hourly.

      The frequency needs to be a pandas offset alias.

      Please refer to pandas documentation for more information:

      [https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#dateoffset-objects](https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#dateoffset-objects)'
    isRequired: true
    types:
    - <xref:typing.Optional>[<xref:str>]
  - name: feature_lags
    description: Flag for generating lags for the numeric features with 'auto' or
      None.
    isRequired: true
    types:
    - <xref:typing.Optional>[<xref:str>]
  - name: seasonality
    description: 'Set time series seasonality as an integer multiple of the series
      frequency.

      If seasonality is set to ''auto'', it will be inferred.

      If set to None, the time series is assumed non-seasonal which is equivalent
      to seasonality=1.'
    isRequired: true
    types:
    - <xref:typing.Optional>[<xref:typing.Union>[<xref:int>, <xref:str>]]
  - name: use_stl
    description: 'Configure STL Decomposition of the time-series target column.

      use_stl can take three values: None (default) - no stl decomposition, ''season''
      - only generate

      season component and season_trend - generate both season and trend components.'
    isRequired: true
    types:
    - <xref:typing.Optional>[<xref:str>]
  - name: short_series_handling_config
    description: 'The parameter defining how if AutoML should handle short time series.


      Possible values: ''auto'' (default), ''pad'', ''drop'' and None.

      * **auto** short series will be padded if there are no long series,

      otherwise short series will be dropped.

      * **pad** all the short series will be padded.

      * **drop**  all the short series will be dropped".

      * **None** the short series will not be modified.

      If set to ''pad'', the table will be padded with the zeroes and

      empty values for the regressors and random values for target with the mean

      equal to target value median for given time series id. If median is more or
      equal

      to zero, the minimal padded value will be clipped by zero.

      Input:


      :::row:::

      :::column:::

      **Date**

      :::column-end:::

      :::column:::

      **numeric_value**

      :::column-end:::

      :::column:::

      **string**

      :::column-end:::

      :::column:::

      **target**

      :::column-end:::

      :::row-end:::

      :::row:::

      :::column:::

      2020-01-01

      :::column-end:::

      :::column:::

      23

      :::column-end:::

      :::column:::

      green

      :::column-end:::

      :::column:::

      55

      :::column-end:::

      :::row-end:::


      Output assuming minimal number of values is four:


      :::row:::

      :::column:::

      **Date**

      :::column-end:::

      :::column:::

      **numeric_value**

      :::column-end:::

      :::column:::

      **string**

      :::column-end:::

      :::column:::

      **target**

      :::column-end:::

      :::row-end:::

      :::row:::

      :::column:::

      2019-12-29

      :::column-end:::

      :::column:::

      0

      :::column-end:::

      :::column:::

      NA

      :::column-end:::

      :::column:::

      55.1

      :::column-end:::

      :::row-end:::

      :::row:::

      :::column:::

      2019-12-30

      :::column-end:::

      :::column:::

      0

      :::column-end:::

      :::column:::

      NA

      :::column-end:::

      :::column:::

      55.6

      :::column-end:::

      :::row-end:::

      :::row:::

      :::column:::

      2019-12-31

      :::column-end:::

      :::column:::

      0

      :::column-end:::

      :::column:::

      NA

      :::column-end:::

      :::column:::

      54.5

      :::column-end:::

      :::row-end:::

      :::row:::

      :::column:::

      2020-01-01

      :::column-end:::

      :::column:::

      23

      :::column-end:::

      :::column:::

      green

      :::column-end:::

      :::column:::

      55

      :::column-end:::

      :::row-end:::


      **Note:** We have two parameters short_series_handling_configuration and

      legacy short_series_handling. When both parameters are set we are

      synchronize them as shown in the table below (short_series_handling_configuration
      and

      short_series_handling for brevity are marked as handling_configuration and handling

      respectively).


      :::row:::

      :::column:::

      **handling**

      :::column-end:::

      :::column:::

      **handling configuration**

      :::column-end:::

      :::column:::

      **resulting handling**

      :::column-end:::

      :::column:::

      **resulting handlingconfiguration**

      :::column-end:::

      :::row-end:::

      :::row:::

      :::column:::

      True

      :::column-end:::

      :::column:::

      auto

      :::column-end:::

      :::column:::

      True

      :::column-end:::

      :::column:::

      auto

      :::column-end:::

      :::row-end:::

      :::row:::

      :::column:::

      True

      :::column-end:::

      :::column:::

      pad

      :::column-end:::

      :::column:::

      True

      :::column-end:::

      :::column:::

      auto

      :::column-end:::

      :::row-end:::

      :::row:::

      :::column:::

      True

      :::column-end:::

      :::column:::

      drop

      :::column-end:::

      :::column:::

      True

      :::column-end:::

      :::column:::

      auto

      :::column-end:::

      :::row-end:::

      :::row:::

      :::column:::

      True

      :::column-end:::

      :::column:::

      None

      :::column-end:::

      :::column:::

      False

      :::column-end:::

      :::column:::

      None

      :::column-end:::

      :::row-end:::

      :::row:::

      :::column:::

      False

      :::column-end:::

      :::column:::

      auto

      :::column-end:::

      :::column:::

      False

      :::column-end:::

      :::column:::

      None

      :::column-end:::

      :::row-end:::

      :::row:::

      :::column:::

      False

      :::column-end:::

      :::column:::

      pad

      :::column-end:::

      :::column:::

      False

      :::column-end:::

      :::column:::

      None

      :::column-end:::

      :::row-end:::

      :::row:::

      :::column:::

      False

      :::column-end:::

      :::column:::

      drop

      :::column-end:::

      :::column:::

      False

      :::column-end:::

      :::column:::

      None

      :::column-end:::

      :::row-end:::

      :::row:::

      :::column:::

      False

      :::column-end:::

      :::column:::

      None

      :::column-end:::

      :::column:::

      False

      :::column-end:::

      :::column:::

      None

      :::column-end:::

      :::row-end:::'
    isRequired: true
    types:
    - <xref:typing.Optional>[<xref:str>]
  - name: target_aggregate_function
    description: "The function to be used to aggregate the time series target\n  \
      \ column to conform to a user specified frequency. If the\n   target_aggregation_function\
      \ is set, but the freq parameter\n   is not set, the error is raised. The possible\
      \ target\n   aggregation functions are: \"sum\", \"max\", \"min\" and \"mean\"\
      .\n\n* The target column values are aggregated based on the specified operation.\
      \ Typically, sum is appropriate for most scenarios. \n\n* Numerical predictor\
      \ columns in your data are aggregated by sum, mean, minimum value, and maximum\
      \ value. As a result, automated ML generates new columns suffixed with the aggregation\
      \ function name and applies the selected aggregate operation. \n\n* For categorical\
      \ predictor columns, the data is aggregated by mode, the most prominent category\
      \ in the window. \n\n* Date predictor columns are aggregated by minimum value,\
      \ maximum value and mode. \n\n:::row:::\n:::column:::\n**freq**\n:::column-end:::\n\
      :::column:::\n**target_aggregation_function**\n:::column-end:::\n:::column:::\n\
      **Data regularityfixing mechanism**\n:::column-end:::\n:::row-end:::\n:::row:::\n\
      :::column:::\nNone (Default)\n:::column-end:::\n:::column:::\nNone (Default)\n\
      :::column-end:::\n:::column:::\nThe aggregation is notapplied. If the validfrequency\
      \ can not bedetermined the error willbe raised.\n:::column-end:::\n:::row-end:::\n\
      :::row:::\n:::column:::\nSome Value\n:::column-end:::\n:::column:::\nNone (Default)\n\
      :::column-end:::\n:::column:::\nThe aggregation is notapplied. If the numberof\
      \ data points compliantto given frequency gridis less then 90% these pointswill\
      \ be removed, otherwisethe error will be raised.\n:::column-end:::\n:::row-end:::\n\
      :::row:::\n:::column:::\nNone (Default)\n:::column-end:::\n:::column:::\nAggregation\
      \ function\n:::column-end:::\n:::column:::\nThe error about missingfrequency\
      \ parameteris raised.\n:::column-end:::\n:::row-end:::\n:::row:::\n:::column:::\n\
      Some Value\n:::column-end:::\n:::column:::\nAggregation function\n:::column-end:::\n\
      :::column:::\nAggregate to frequency usingprovided aggregation function.\n:::column-end:::\n\
      :::row-end:::"
    isRequired: true
    types:
    - <xref:str>
  - name: time_column_name
    description: 'The name of the time column. This parameter is required when forecasting
      to specify the datetime

      column in the input data used for building the time series and inferring its
      frequency.'
    isRequired: true
    types:
    - <xref:typing.Optional>[<xref:str>]
  - name: time_series_id_column_names
    description: 'The names of columns used to group a timeseries.

      It can be used to create multiple series. If time series id column names is
      not defined or

      the identifier columns specified do not identify all the series in the dataset,
      the time series identifiers

      will be automatically created for your dataset.'
    isRequired: true
    types:
    - <xref:typing.Union>[<xref:str>, <xref:typing.List>[<xref:str>]]
  - name: features_unknown_at_forecast_time
    description: 'The feature columns that are available for training but unknown
      at the time of forecast/inference.

      If features_unknown_at_forecast_time is set to an empty list, it is assumed
      that

      all the feature columns in the dataset are known at inference time. If this
      parameter is not set

      the support for future features is not enabled.'
    isRequired: true
    types:
    - <xref:typing.Optional>[<xref:typing.Union>[<xref:str>, <xref:typing.List>[<xref:str>]]]
  keywordOnlyParameters:
  - name: country_or_region_for_holidays
    isRequired: true
  - name: cv_step_size
    isRequired: true
  - name: forecast_horizon
    isRequired: true
  - name: target_lags
    isRequired: true
  - name: target_rolling_window_size
    isRequired: true
  - name: frequency
    isRequired: true
  - name: feature_lags
    isRequired: true
  - name: seasonality
    isRequired: true
  - name: use_stl
    isRequired: true
  - name: short_series_handling_config
    isRequired: true
  - name: target_aggregate_function
    isRequired: true
  - name: time_column_name
    isRequired: true
  - name: time_series_id_column_names
    isRequired: true
  - name: features_unknown_at_forecast_time
    isRequired: true