In [None]:
#| hide
import logging
import warnings

In [None]:
#| hide
warnings.simplefilter('ignore')
logging.getLogger('statsforecast').setLevel(logging.ERROR)

# Ray

> Run StatsForecast in a distributed fashion on top of Ray.

StatsForecast works on top of Spark, Dask, and Ray through [Fugue](https://github.com/fugue-project/fugue/). StatsForecast will read the input DataFrame and use the corresponding engine. For example, if the input is a Ray Dataset, StatsForecast will use the existing Ray instance to run the forecast.

A benchmark (written with older syntax) can be found [here](https://www.anyscale.com/blog/how-nixtla-uses-ray-to-accurately-predict-more-than-a-million-time-series), in which we forecast one million time series in under half an hour.


## Installation

As long as Ray is installed and configured, StatsForecast will be able to use it. If executing on a distributed Ray cluster, make sure the `statsforecast` library is installed across all the workers.
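
As a quick sanity check, the sketch below reports which of the relevant packages are installed in the current Python environment. The helper name `installed_versions` and the default package list are illustrative assumptions, not part of StatsForecast. It only inspects the environment it runs in, so checking a cluster would mean running it on each worker (for example, inside a Ray remote task).

```python
from importlib import metadata


def installed_versions(packages=("statsforecast", "ray", "fugue")):
    """Map each package name to its installed version, or None if it is missing.

    Hypothetical helper for illustration; the default package list is an assumption.
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in this environment.
            versions[name] = None
    return versions


print(installed_versions())
```

A `None` entry means that package still needs to be installed in that environment.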

## StatsForecast on Pandas

Before running on Ray, it's recommended to test on a smaller Pandas dataset to make sure everything is working. This example also helps show the small differences when using Ray.

In [None]:
from statsforecast.core import StatsForecast
from statsforecast.models import AutoARIMA, AutoETS
from statsforecast.utils import generate_series

In [None]:
n_series = 4
horizon = 7

series = generate_series(n_series)

sf = StatsForecast(
    models=[AutoETS(season_length=7)],
    freq='D',
)
sf.forecast(df=series, h=horizon).head()

|   | unique_id | ds | AutoETS |
|---|---|---|---|
| 0 | 0 | 2000-08-10 | 5.261609 |
| 1 | 0 | 2000-08-11 | 6.196357 |
| 2 | 0 | 2000-08-12 | 0.282309 |
| 3 | 0 | 2000-08-13 | 1.264195 |
| 4 | 0 | 2000-08-14 | 2.262453 |


## Executing on Ray

To distribute the forecasts on Ray, pass a Ray Dataset as `df` instead of a pandas DataFrame.

In [None]:
import ray
import logging

In [None]:
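# Start Ray, logging only errors
ray.init(logging_level=logging.ERROR)

# Cast the series identifiers to strings
series['unique_id'] = series['unique_id'].astype(str)
# Disable the streaming executor in the current Ray Data context
ctx = ray.data.context.DatasetContext.get_current()
ctx.use_streaming_executor = False
# Convert the pandas DataFrame into a Ray Dataset split into 4 blocks
ray_series = ray.data.from_pandas(series).repartition(4)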

In [None]:
sf.forecast(df=ray_series, h=horizon).take(5)
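
The call above uses `.take(5)` to pull a few rows back from the distributed result. When the full forecast is small enough to fit in memory on the driver, it can be convenient to collect it into a pandas DataFrame instead. The sketch below is a minimal, hypothetical helper (`collect_to_pandas` is not a StatsForecast function). It assumes the result exposes a `to_pandas()` method, as `ray.data.Dataset` does.

```python
def collect_to_pandas(distributed_result):
    """Materialize a distributed result on the driver as a pandas DataFrame.

    Hypothetical helper: assumes the object has a `to_pandas()` method,
    as a Ray Dataset does. Only use it when the result fits in memory.
    """
    if not hasattr(distributed_result, "to_pandas"):
        raise TypeError(
            f"{type(distributed_result).__name__} has no to_pandas() method"
        )
    return distributed_result.to_pandas()
```

With the objects defined earlier, this would be called as `collect_to_pandas(sf.forecast(df=ray_series, h=horizon))`. Once the work is done, `ray.shutdown()` stops the Ray instance started by `ray.init`.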