OPPerTune is a framework that enables configuration tuning of applications, including those in live deployment. It reduces application interruptions while maximizing application performance as the workload or the underlying infrastructure changes. It automates three essential processes that facilitate post-deployment configuration tuning:

- Determining which configurations to tune
- Automatically managing the scope at which to tune the configurations (using AutoScope)
- Using a novel reinforcement learning algorithm to simultaneously and quickly tune numerical and categorical configurations, keeping the overhead of configuration tuning low
- Project page: https://aka.ms/OPPerTune
- Python 3 (>= 3.7)

Install the latest versions of `pip` and `setuptools`:

```shell
python3 -m pip install --upgrade pip setuptools
```

To set up the package locally, run:

```shell
python3 -m pip install .
```

Or install directly from the repository:

```shell
python3 -m pip install git+https://github.com/microsoft/OPPerTune.git
```
We define the parameters of the system to be tuned. To specify a numerical parameter (`ContinuousValue` or `DiscreteValue`), you need to provide the following:

- `name`: The name of the parameter.
- `initial_value`: The initial value of the parameter.
- `lb`: The lower bound on the values the parameter can take.
- `ub`: The upper bound on the values the parameter can take.
- `step_size` (optional): The minimum amount by which a parameter's value can be perturbed. In the example below, the values for `p2` are restricted to `(100, 200, 300, ..., 900)` because we have specified a `step_size` of `100`. For `ContinuousValue`, the default `step_size` is `None`, which allows an arbitrary amount of perturbation, whereas for `DiscreteValue`, the default (and also the minimum) value for `step_size` is `1`.
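To see what the `step_size` restriction implies, the set of values a discrete parameter with `lb=100`, `ub=900`, and `step_size=100` can take can be enumerated with plain Python (an illustration only; OPPerTune handles this internally):

```python
# Enumerate the values a discrete parameter can take, given its
# bounds and step size (the same bounds used for p2 below).
lb, ub, step_size = 100, 900, 100

allowed_values = list(range(lb, ub + 1, step_size))
print(allowed_values)  # [100, 200, 300, 400, 500, 600, 700, 800, 900]
```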
For a `CategoricalValue` parameter, you need to provide the following:

- `name`: The name of the parameter.
- `initial_value`: The initial value of the parameter.
- `categories`: The list of allowed values (at least 2).
```python
from oppertune import CategoricalValue, ContinuousValue, DiscreteValue

parameters = [
    ContinuousValue(
        name="p1",
        initial_value=0.45,
        lb=0.0,
        ub=1.0,
    ),
    DiscreteValue(
        name="p2",
        initial_value=100,
        lb=100,
        ub=900,
        step_size=100,
    ),
    CategoricalValue(
        name="p3",
        initial_value="medium",
        categories=["low", "medium", "high"],
    ),
]
```
```python
algorithm = "hybrid_solver"  # Supports continuous, discrete and categorical parameters
algorithm_args = dict(
    numerical_solver="bluefin",  # For the numerical (continuous and discrete) parameters
    numerical_solver_args=dict(
        feedback=2,
        eta=0.01,
        delta=0.1,
        random_seed=123,  # Just for reproducibility
    ),
    categorical_solver="exponential_weights_slates",  # For the categorical parameters
    categorical_solver_args=dict(
        random_seed=123,  # Just for reproducibility
    ),
)
```
```python
from oppertune import OPPerTune

tuner = OPPerTune(parameters, algorithm, algorithm_args)

prediction, metadata = tuner.predict()
app.set_config(prediction)

# prediction will be a dictionary, with the keys as the names of the parameters
# and the values as the ones predicted by OPPerTune.
# E.g., {"p1": 0.236, "p2": 300, "p3": "medium"}

# metadata is any additional (possibly None) data required by the algorithm.
# This should always be passed back in the set_reward call.
```
OPPerTune uses a reward to compute an update to the parameter values. This reward needs to be a function of the metrics (e.g., throughput, latency) of the current state of the system.
```python
def calculate_reward(metrics) -> float:
    """Compute a reward from the observed metrics.

    We assume that the metrics of concern for us are latency and throughput.
    There may be more (or fewer) metrics that you want to optimize for.
    """
    latency = metrics["latency"]
    throughput = metrics["throughput"]

    # The higher the throughput and the lower the latency, the higher the reward
    reward = throughput / latency

    # (Optional) You can scale the reward to [0, 1]
    # reward = sigmoid(reward)
    return reward
```
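If you want the optional scaling hinted at in the comment above, a minimal `sigmoid` could look like this. This is a sketch using only the standard library; the name `sigmoid` comes from the comment, not from an OPPerTune API, and the squashed value lies in the open interval (0, 1):

```python
import math


def sigmoid(x: float) -> float:
    """Squash an unbounded reward into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


# A raw reward of 0 maps to 0.5; larger rewards approach 1,
# smaller (negative) rewards approach 0.
print(sigmoid(0.0))  # 0.5
```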
Start observing the metrics and deploy the app.
```python
metrics_monitor.start()  # Start monitoring the metrics of concern
app.deploy()  # Using the new parameter values. Wait till the app finishes the job.
metrics = metrics_monitor.stop()

reward = calculate_reward(metrics)
tuner.set_reward(reward, metadata)
```
```python
from oppertune import CategoricalValue, ContinuousValue, DiscreteValue, OPPerTune


def calculate_reward(metrics):
    latency = metrics["latency"]
    throughput = metrics["throughput"]

    # The higher the throughput and the lower the latency, the higher the reward
    reward = throughput / latency

    # (Optional) You can scale the reward to [0, 1]
    # reward = sigmoid(reward)
    return reward


def main():
    parameters = [
        ContinuousValue(
            name="p1",
            initial_value=0.45,
            lb=0.0,
            ub=1.0,
        ),
        DiscreteValue(
            name="p2",
            initial_value=100,
            lb=100,
            ub=900,
            step_size=100,
        ),
        CategoricalValue(
            name="p3",
            initial_value="medium",
            categories=["low", "medium", "high"],
        ),
    ]

    algorithm = "hybrid_solver"  # Supports continuous, discrete and categorical parameters
    algorithm_args = dict(
        numerical_solver="bluefin",  # For the numerical (continuous and discrete) parameters
        numerical_solver_args=dict(
            feedback=2,
            eta=0.01,
            delta=0.1,
            random_seed=123,  # Just for reproducibility
        ),
        categorical_solver="exponential_weights_slates",  # For the categorical parameters
        categorical_solver_args=dict(
            random_seed=123,  # Just for reproducibility
        ),
    )

    tuner = OPPerTune(parameters, algorithm, algorithm_args)

    while True:
        prediction, metadata = tuner.predict()
        app.set_config(prediction)

        metrics_monitor.start()  # Start monitoring the metrics of concern
        app.deploy()  # Using the new parameter values. Wait till the app finishes the job.
        metrics = metrics_monitor.stop()

        reward = calculate_reward(metrics)
        tuner.set_reward(reward, metadata)

        # Optionally, you can stop once the metrics are good enough,
        # which is indicated by a high reward
        if reward >= 0.95:  # Assuming the reward lies in [0, 1]
            break


if __name__ == "__main__":
    main()
```
For a working example, refer to `examples/hybrid_solver/main.py`.
This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.
When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.
This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.
```shell
pip install -e "./[dev]"
```

To ensure your code follows the style guidelines, install `black>=23.1` and `isort>=5.11`:

```shell
pip install "black>=23.1"
pip install "isort>=5.11"
```

then run:

```shell
isort . --sp=pyproject.toml
black . --config=pyproject.toml
```
This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos is subject to those third parties' policies.