README

Introduction

Beta release.

This package creates simulations of wind power forecasts from actuals or, conversely, of actuals from forecasts. It is implemented so that it can be generalized to any type of dataframe that provides actuals and forecasts.

Documentation is available at readthedocs and there is a paper published in Energy Systems <http://link.springer.com/article/10.1007/s12667-020-00408-6>.

The main inputs of the package are:

an input dataset giving forecasts and actuals for specified datetimes
a simulation input dataset: the subset of the input dataset between a given start date and end date
r_tilde: the desired MAPE (mean absolute percent error; see Percent Errors and MAPEs) of the output simulations
user-specified technical parameters

The mape_maker class estimates the conditional distribution of the errors given the input values. It adjusts these distributions to satisfy the specified target MAPE. After fitting a base process, it simulates highly autocorrelated errors and finally, if the user requests it, runs a curvature optimization. With these three steps, MapeMaker aims to satisfy the plausibility criteria. See more at Plausibility criteria.

Accordingly, two technical specifications must be made for each simulation:

Base Process: IID (generates an IID, and therefore uncorrelated, base process) or ARMA (default)
Curvature: boolean (usually not needed, so the default is False)
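
To make the difference between the two base processes concrete, here is a minimal sketch that compares the lag-1 autocorrelation of IID noise with that of an AR(1) process, the simplest member of the ARMA family. It is only an illustration and not the package's implementation: the function names, the coefficient phi=0.9 and the sample size are assumptions chosen for this example.

```python
import random


def lag1_autocorrelation(xs):
    """Sample lag-1 autocorrelation of a sequence."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs)
    cov = sum((xs[i] - mean) * (xs[i + 1] - mean) for i in range(n - 1))
    return cov / var


def iid_process(n, rng):
    """Independent standard normal draws: no memory between time steps."""
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def ar1_process(n, rng, phi=0.9):
    """AR(1) process x_t = phi * x_{t-1} + e_t (phi is an illustrative value)."""
    xs = [rng.gauss(0.0, 1.0)]
    for _ in range(n - 1):
        xs.append(phi * xs[-1] + rng.gauss(0.0, 1.0))
    return xs


rng = random.Random(1234)
iid = iid_process(5000, rng)
ar1 = ar1_process(5000, rng)
print(f"lag-1 autocorrelation, iid  : {lag1_autocorrelation(iid):.2f}")
print(f"lag-1 autocorrelation, AR(1): {lag1_autocorrelation(ar1):.2f}")
```

The IID series should show an autocorrelation near zero, while the AR(1) series stays close to phi, which is the kind of persistence an ARMA base process is meant to carry into the simulated errors.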

Setup

You can install the package with the setup.py file:

python setup.py develop

Then you can run the package from the command line. For a quick first run:

python -m mape_maker -xf "mape_maker/samples/wind_total_forecast_actual_070113_063015.csv"

This is real data from CAISO with negative values that mape_maker treats as zero. The presence of many low power values makes it difficult to exactly hit a target MAPE.
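
The effect of low power values on percent errors can be illustrated with a small sketch. This is not the package's code: the helper names clip_negative and percent_errors are hypothetical, the numbers are made up, and skipping zero denominators is an assumption made here only to keep the example runnable.

```python
def clip_negative(values):
    """Replace negative values with zero."""
    return [max(v, 0.0) for v in values]


def percent_errors(forecasts, actuals):
    """Absolute percent error relative to each actual.

    Points whose actual is zero are skipped (an assumption of this sketch).
    """
    return [100.0 * abs(f - a) / a for f, a in zip(forecasts, actuals) if a != 0]


actuals = clip_negative([-3.0, 2.0, 1000.0])  # the negative value becomes 0.0
forecasts = [1.0, 3.0, 1001.0]                # every forecast is off by 1 unit or less

errors = percent_errors(forecasts, actuals)
print(errors)  # the same 1-unit error is 50% at a low value and 0.1% at a high value
```

Because the same absolute error weighs far more at low power values, a dataset with many of them makes the average percent error sensitive to a few points.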

Options

The options of the package are:

python mape_maker --help
Options:
  -sf, --input_sid_file TEXT        path to a simulation input dataset with one or two timeseries (e.g. actuals),
                                    from which scenarios for the other timeseries are generated (e.g. forecasts)
  -o, --output_dir TEXT             path to destination dir where the scenario csv file(s) are saved
  -vo, --verbosity_output TEXT      the name of the verbosity output file
  -is, --input_start_dt TEXT        start date for the estimation of the distributions, format = 'Y-m-d H:M:S'
  -ie, --input_end_dt TEXT          end date for the estimation of the distributions, format = 'Y-m-d H:M:S'

  -ss, --simulation_start_dt TEXT   start date for the simulation of scenarios, format='Y-m-d H:M:S'
  -se, --simulation_end_dt TEXT     end date for the simulation of scenarios, format='Y-m-d H:M:S'
  -t, --target_mape FLOAT           desired mape, otherwise will take the mape based on the input dataset
  -a, --a FLOAT                     percent of data on the left and/or on the right for the estimation
                                    of conditional beta distribution parameter
  -ct, --curvature_target FLOAT     the target of the second difference for curvature optimization
  -m, --mip_gap FLOAT               mip gap for curvature optimization
  -bp, --base_process TEXT          base process - 'iid' or 'ARMA'
  -n, --number_simulations INTEGER  number_simulations
  -tl, --time_limit INTEGER         time limit for curvature optimization
  -ps, --plot_start_date INTEGER    start date for plot(if 0, the start date is the first date of the simulations)
  -s, --seed INTEGER                seed for the pseudo-random seed
  -v, --verbosity INTEGER           verbosity level
  -f, --sid_feature TEXT            feature you want to simulate - 'actuals' or 'forecasts'
  -lp, --load_pickle BOOLEAN        load the pickle file for the dataset instead of estimation
  -c, --curvature BOOLEAN           optimize the curvature for the simulated scenarios
  -p, --plot BOOLEAN                plot scenarios
  -sv, --solver TEXT                name of the solver (e.g. "gurobi")
  -tt, --title TEXT                 title for the plot
  -xl, --x_legend TEXT              legend for x in plot
  --help                            show this message and exit.

Available options

sid_feature:
- "actuals": simulating actuals from forecasts
- "forecasts": simulating forecasts from actuals
base_process:
- "iid"
- "ARMA"
input_start_dt, input_end_dt, simulation_start_dt and simulation_end_dt:
- format "Y-m-d H:M:S", for example "2014-7-2 00:00:00"
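
If you build the date arguments in a script, it can help to check them before launching a run. The sketch below assumes that the 'Y-m-d H:M:S' format given in the options list corresponds to the strptime pattern "%Y-%m-%d %H:%M:%S"; how mape_maker parses dates internally is not stated here, and the helper parse_dt is hypothetical.

```python
from datetime import datetime

# Assumed strptime equivalent of the 'Y-m-d H:M:S' format in the options list.
DT_FORMAT = "%Y-%m-%d %H:%M:%S"


def parse_dt(text):
    """Parse a date string such as "2014-7-2 00:00:00"; raises ValueError if malformed."""
    return datetime.strptime(text, DT_FORMAT)


start = parse_dt("2014-7-2 00:00:00")
end = parse_dt("2014-7-31 00:00:00")

# A simulation window only makes sense if it starts before it ends.
if start >= end:
    raise ValueError("start date must be earlier than end date")
print(f"simulation window: {(end - start).days} days")
```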

Default options

input_sid_file : None, will take the input dataset as sid
output_dir : None, no output file will be created; only a plot will be produced
verbosity_output : None, no verbosity output file will be created
input_start_dt : None, will use the whole dataset for the computation of the distributions
input_end_dt : None, will use the whole dataset for the computation of the distributions
simulation_start_dt : None, will simulate over the whole dataset
simulation_end_dt : None, will simulate over the whole dataset
target_mape : the mape of the current dataset
a : 4
curvature_target : mean of the second difference of the dataset
mip_gap : 0.3
number_simulations : 1
time_limit : 3600 seconds
plot_start_date : 0
seed : 1234
verbosity : 2
sid_feature : "actuals"
base_process : "ARMA"
load_pickle : False
curvature : False
show_curv_model : False
plot : True
solver : gurobi
title : None, no additional title will be added to the plot
x_legend : None, will use the feature of curves (actuals or forecasts)

Examples

The following command takes the data from the sample CAISO csv file and launches n=4 simulations from actuals to forecasts with a target_mape of 30% using an IID base process. It simulates from "2014-7-2 00:00:00" to "2014-7-31 00:00:00" and writes a plot with the title "first test" to the file mmFinalFig.png.

python -m mape_maker -xf "mape_maker/samples/wind_total_forecast_actual_070113_063015.csv" -n 4 -f "forecasts"  -bp "iid" -t 30 -ss "2014-7-2 00:00:00" -se "2014-7-31 00:00:00" -tt "first test"

The next example reports some errors about the bounds while finding simulation parameters and creates a directory called output. If that directory already exists, an error message is issued. The directory will contain a csv file with the output.

python -m mape_maker -xf "mape_maker/samples/wind_total_forecast_actual_070113_063015.csv" -f "actuals" -n 4 -bp "ARMA" -is "2014-6-1 0:0:0" -ie "2014-6-30 23:0:0" --target_mape 30 --output_dir "output"

The final example is for optimization-under-uncertainty methodologists who just want some scenarios. It puts two hundred scenarios in the directory wind_actuals_ARMA. If that directory already exists, an error message is issued. The directory will contain a csv file of scenarios based on CAISO data.

python -m mape_maker -xf "mape_maker/samples/wind_total_forecast_actual_070113_063015.csv" -f "actuals" -n 200 -bp "ARMA" -o "wind_actuals_ARMA" -s 1234 -ss "2014-7-12 00:00:00" -se "2014-7-13 00:00:00"
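
When scenarios are generated from a script, the check on the output directory can be done before the run starts. The sketch below only assembles the command shown above from flags that appear in this README; the function build_command and its default values are hypothetical, and the command is printed rather than executed.

```python
import os
import sys


def build_command(input_file, output_dir, n=200, base_process="ARMA", seed=1234):
    """Assemble the mape_maker command line as an argument list.

    Raises FileExistsError early, since the README states that an existing
    output directory leads to an error message.
    """
    if os.path.exists(output_dir):
        raise FileExistsError(f"output directory already exists: {output_dir}")
    return [
        sys.executable, "-m", "mape_maker",
        "-xf", input_file,
        "-f", "actuals",
        "-n", str(n),
        "-bp", base_process,
        "-o", output_dir,
        "-s", str(seed),
    ]


cmd = build_command(
    "mape_maker/samples/wind_total_forecast_actual_070113_063015.csv",
    "wind_actuals_ARMA",
)
print(" ".join(cmd))
# To launch the run, pass the list to subprocess.run(cmd, check=True).
```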

Percent Errors and MAPEs

We denote by f and a the time series of forecasts and actuals, respectively. Two MAPEs can be defined, depending on the direction of the simulation.

If you are simulating forecasts from actuals,

$$mape = \frac{100}{n} \sum_{i=1}^n \frac{|f_i - a_i|}{a_i}$$

If you are simulating actuals from forecasts,

$$mape = \frac{100}{n} \sum_{i=1}^n \frac{|f_i - a_i|}{f_i}$$
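
The two definitions differ only in the denominator, as the following minimal sketch shows. The function names and the sample numbers are made up for illustration and are not part of the package.

```python
def mape_forecasts_from_actuals(forecasts, actuals):
    """MAPE with the actuals in the denominator (simulating forecasts from actuals)."""
    n = len(actuals)
    return 100.0 / n * sum(abs(f - a) / a for f, a in zip(forecasts, actuals))


def mape_actuals_from_forecasts(forecasts, actuals):
    """MAPE with the forecasts in the denominator (simulating actuals from forecasts)."""
    n = len(forecasts)
    return 100.0 / n * sum(abs(f - a) / f for f, a in zip(forecasts, actuals))


forecasts = [110.0, 90.0, 100.0, 80.0]
actuals = [100.0, 100.0, 80.0, 100.0]

print(round(mape_forecasts_from_actuals(forecasts, actuals), 2))  # 16.25
print(round(mape_actuals_from_forecasts(forecasts, actuals), 2))  # 16.3
```

The two values are close here but not equal, so a target MAPE should always be read together with the direction of the simulation.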

Plausibility criteria

A scenario set is said to be "plausible" if:

its distribution of errors is close to the empirical distribution of errors, i.e. its plausibility score is close to 1
its autocorrelation coefficients are close to the empirical values
when the output is forecast scenarios, its second differences are close to the empirical values
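
As an illustration of the third criterion, the sketch below computes the second differences of a series and summarizes them by their mean absolute value. Whether the package uses the signed or the absolute mean is not stated here, so the summary statistic, like the function names and sample series, is an assumption of this example.

```python
def second_differences(xs):
    """Second differences x[i+1] - 2*x[i] + x[i-1] for the interior points."""
    return [xs[i + 1] - 2 * xs[i] + xs[i - 1] for i in range(1, len(xs) - 1)]


def mean_abs_second_difference(xs):
    """Mean absolute second difference, used here as a simple curvature measure."""
    d2 = second_differences(xs)
    return sum(abs(d) for d in d2) / len(d2)


smooth = [0.0, 1.0, 2.0, 3.0, 4.0]   # straight line: no curvature
jagged = [0.0, 3.0, 0.0, 3.0, 0.0]   # zigzag: large curvature

print(mean_abs_second_difference(smooth))  # 0.0
print(mean_abs_second_difference(jagged))  # 6.0
```

A simulated forecast scenario that zigzags much more than the historical forecasts would score far above the empirical value and would look implausible by this measure.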

References

[1] "Mape_Maker: A Scenario Creator", Guillaume Goujard, Jean-Paul Watson, and David L. Woodruff, Energy Systems <http://link.springer.com/article/10.1007/s12667-020-00408-6>

[2] "Fitting Time-Series Input Processes for Simulation", Bahar Biller, Barry L. Nelson, Operations Research, Vol. 53, No. 3, May–June 2005, pp. 549–559
