perfsize is a tool that uses automated performance testing to determine the right size of
infrastructure that can meet a given set of requirements. This repo is for common interfaces and
shared components that can be imported into technology-specific implementations.
One example implementation is
perfsize-sagemaker,
the open source implementation specific to AWS SageMaker infrastructure.
The environment update uses the boto3 SDK, and the traffic generator uses
sagemaker-gatling
to call SageMaker endpoints directly.
The goal is to make perfsize general enough that it can be included in some other
implementation for "x" infrastructure, which could have its own SDK for updating infrastructure and
its own load generator for sending traffic.
- `Condition`
  - `function`: any function that takes a Decimal value and returns True for success, False for failure. Some examples provided in the tests use comparison operators `lt`, `lte`, `eq`, `ne`, `gt`, `gte` to create functions that check a given value against specified thresholds.
  - `description`: string summary of what the function is checking, useful for printing in reports.
  - Example:
    ```
    Condition(lt(Decimal("200")), "value < 200")
    ```
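To make the idea concrete, here is a minimal sketch (not the actual perfsize source) of how a comparison factory like `lt` and a `Condition` pairing it with a description might look:

```python
from decimal import Decimal
from typing import Callable

def lt(limit: Decimal) -> Callable[[Decimal], bool]:
    # Comparison factory: returns a checker that passes when value < limit.
    return lambda value: value < limit

def gte(limit: Decimal) -> Callable[[Decimal], bool]:
    return lambda value: value >= limit

class Condition:
    def __init__(self, function: Callable[[Decimal], bool], description: str):
        self.function = function
        self.description = description

condition = Condition(lt(Decimal("200")), "value < 200")
print(condition.function(Decimal("199")))  # True
print(condition.function(Decimal("201")))  # False
```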
- `Result`
  - `metric`: string id of the metric being measured. Examples could be `error_percent`, `p99_total`, `p99_pass`, `p99_fail`.
  - `value`: the Decimal value measured.
  - `conditions`: list of `Condition` to check.
  - `successes()`: subset list of `Condition` that succeeded based on the given value.
  - `failures()`: subset list of `Condition` that failed based on the given value.
  - Example:
    ```
    Result("p99 response time", Decimal("201"), conditions)
    ```
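A sketch of how `successes()` and `failures()` partition the conditions by the measured value (illustrative, not the actual perfsize source):

```python
from decimal import Decimal

class Condition:
    def __init__(self, function, description):
        self.function = function
        self.description = description

class Result:
    def __init__(self, metric, value, conditions):
        self.metric = metric
        self.value = value
        self.conditions = conditions

    def successes(self):
        # Conditions whose check passes for the measured value.
        return [c for c in self.conditions if c.function(self.value)]

    def failures(self):
        # Conditions whose check fails for the measured value.
        return [c for c in self.conditions if not c.function(self.value)]

conditions = [
    Condition(lambda v: v < Decimal("200"), "value < 200"),
    Condition(lambda v: v >= Decimal("0"), "value >= 0"),
]
result = Result("p99 response time", Decimal("201"), conditions)
print([c.description for c in result.failures()])   # ['value < 200']
print([c.description for c in result.successes()])  # ['value >= 0']
```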
- `Run`
  - `start`: starting timestamp of test run.
  - `end`: ending timestamp of test run.
  - `results`: list of `Result`.
  - `status()`: False if any result has failures, else True if any result has successes, else None.
  - Example:
    ```
    result1 = Result("p99 response time", Decimal("199.01"), p99conditions)
    result2 = Result("error rate", Decimal("0"), errorconditions)
    run = Run(
        datetime.fromisoformat("2021-04-01T00:00:00"),
        datetime.fromisoformat("2021-04-01T01:00:00"),
        [result1, result2],
    )
    ```
- `Config`
  - `parameters`: dictionary mapping parameter name to parameter value, describing all settings for a particular configuration to be tested. Example:
    ```
    parameters = {
        "instance type": "ml.m5.large",
        "instance count": "1",
        "tps": "100",
    }
    ```
  - `requirements`: dictionary mapping a metric name to a list of `Condition`, describing how each measured `Result` will be checked for success. Example:
    ```
    p99conditions = [
        Condition(lt(Decimal("200")), "value < 200"),
        Condition(gte(Decimal("0")), "value >= 0"),
    ]
    errorconditions = [
        Condition(lt(Decimal("0.01")), "value < 0.01"),
        Condition(gte(Decimal("0")), "value >= 0"),
    ]
    requirements = {
        "p99 response time": p99conditions,
        "error rate": errorconditions,
    }
    ```
  - `runs`: list of test `Run` for the given config.
  - Example:
    ```
    config = Config(parameters, requirements)
    ```
- `Plan`
  - `parameter_lists`: dictionary mapping parameter name to a list of possible values to test. Example:
    ```
    parameter_lists = {
        "instance type": [
            "ml.m5.large",
            "ml.m5.xlarge",
            "ml.m5.2xlarge",
            "ml.m5.4xlarge",
        ],
        "instance count": ["1"],
        "tps": [
            "1", "2", "3", "4", "5", "6", "7", "8", "9", "10",
            "20", "30", "40", "50", "60", "70", "80", "90", "100",
            "200", "300", "400",
        ],
    }
    ```
  - `requirements`: dictionary mapping a metric name to a list of `Condition`, describing how each measured `Result` will be checked for success.
  - `combinations`: list of tuples covering the cross product of possible combinations using the values from the given parameter lists. Example:
    ```
    [
        ("ml.m5.large", "1", "1"),
        ("ml.m5.large", "1", "2"),
        ("ml.m5.large", "1", "3"),
        ...
        ("ml.m5.xlarge", "1", "1"),
        ...
        ("ml.m5.2xlarge", "1", "1"),
        ...
        ("ml.m5.4xlarge", "1", "1"),
        ...
        ("ml.m5.4xlarge", "1", "300"),
        ("ml.m5.4xlarge", "1", "400"),
    ]
    ```
  - `configs`: dictionary mapping each tuple above to a `Config` object. Example:
    ```
    {
        ("ml.m5.large", "1", "1"): Config(
            {
                "instance type": "ml.m5.large",
                "instance count": "1",
                "tps": "1",
            },
            requirements
        ),
        ("ml.m5.large", "1", "2"): Config(...),
        ("ml.m5.large", "1", "3"): Config(...),
        ...
        ("ml.m5.xlarge", "1", "1"): Config(...),
        ...
        ("ml.m5.2xlarge", "1", "1"): Config(...),
        ...
        ("ml.m5.4xlarge", "1", "1"): Config(...),
        ...
        ("ml.m5.4xlarge", "1", "300"): Config(...),
        ("ml.m5.4xlarge", "1", "400"): Config(...),
    }
    ```
  - `history`: list of `Config` tested so far.
  - `recommendation`: mapping of parameter name to parameter value, describing all settings for a recommended configuration found to be the best so far. Example:
    ```
    recommendation = {
        "instance type": "ml.m5.large",
        "instance count": "2",
    }
    ```
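The cross product in `combinations` can be generated with the standard library's `itertools.product`, as in this sketch (a smaller parameter set is used for brevity; this is not necessarily how perfsize builds it internally):

```python
from itertools import product

parameter_lists = {
    "instance type": ["ml.m5.large", "ml.m5.xlarge"],
    "instance count": ["1"],
    "tps": ["1", "2"],
}

# Cross product of the value lists, in the key order of parameter_lists.
combinations = list(product(*parameter_lists.values()))
print(combinations)
# [('ml.m5.large', '1', '1'), ('ml.m5.large', '1', '2'),
#  ('ml.m5.xlarge', '1', '1'), ('ml.m5.xlarge', '1', '2')]
```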
- `StepManager`
  - `next()`: based on current state of configs tested and their results, return the next `Config` to test, or None if the process is completed.
  - Example implementations:
    - `AllStepManager`: test every config in order.
    - `FirstFailureStepManager`: current ml-pathfinder algorithm.
    - `BinarySearchStepManager`: try binary search over the given TPS range.
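The simplest of these, an `AllStepManager`-style iteration, might be sketched like this (illustrative only, not the actual perfsize source):

```python
class AllStepManager:
    """Sketch: return every planned config in order, then None."""

    def __init__(self, configs):
        self._pending = list(configs)

    def next(self):
        # Pop the next untested config; None signals the plan is complete.
        return self._pending.pop(0) if self._pending else None

manager = AllStepManager(["config-a", "config-b"])
print(manager.next())  # config-a
print(manager.next())  # config-b
print(manager.next())  # None
```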
- `EnvironmentManager`
  - `setup()`: update target environment based on settings from the given `Config`.
  - `teardown()`: clean up target environment based on settings from the given `Config`.
  - Example implementations:
    - `SageMakerEnvironmentManager`: update AWS SageMaker environment using boto3.
    - `XEnvironmentManager`: update "X" environment using "X" SDK.
- `LoadManager`
  - `send()`: create a `Run` to track timing and results, and send the target test traffic of the given `Config`.
  - Example implementations:
    - `SageMakerLoadManager`: send traffic using the `sagemaker-gatling` library.
    - `XLoadManager`: send traffic using "X" SDK.
- `ResultManager`
  - `query()`: for the time period of a given `Run`, gather metrics and append them as `Result` items on the `Run`.
  - Example implementations:
    - `GatlingResultManager`: get results from parsing Gatling output files.
    - `SplunkResultManager`: get results from a Splunk server with a given query.
    - `WavefrontResultManager`: get results from Wavefront.
    - `CloudWatchResultManager`: get results from CloudWatch.
- `Reporter`
  - `render()`: generate a report given the tested configs and their results.
  - Example implementations:
    - `DefaultReporter`: string representation formatted by white space.
    - `HTMLReporter`: HTML string.
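A `DefaultReporter`-style render might look like this sketch, which formats rows of (config, status) as white-space-aligned text (the row shape here is hypothetical, chosen for illustration):

```python
def render(rows):
    # Each row is (instance_type, tps, status); align into fixed-width columns.
    header = f"{'instance type':<15}{'tps':>6}{'status':>8}"
    lines = [header]
    for instance_type, tps, status in rows:
        lines.append(f"{instance_type:<15}{tps:>6}{str(status):>8}")
    return "\n".join(lines)

report = render([("ml.m5.large", "100", True), ("ml.m5.large", "200", False)])
print(report)
```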
- `Workflow`
  - `run()`: set up the test plan and call the various managers to do work and determine next steps. When all test runs are completed, update results and return the recommended settings.
  - Example implementations:
    - `DefaultWorkflow`: call each manager above and print placeholder text to simulate test activity.
    - `SageMakerWorkflow`: call SageMaker-specific managers.
    - `XWorkflow`: call "X"-specific managers.
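The control loop that ties the managers together can be sketched as below. All the `Fake*` classes are hypothetical stand-ins so the example runs on its own; the real perfsize interfaces are described above.

```python
class FakeStepManager:
    def __init__(self, configs):
        self._pending = list(configs)
    def next(self):
        return self._pending.pop(0) if self._pending else None

class FakeEnvironmentManager:
    def setup(self, config): pass
    def teardown(self, config): pass

class FakeLoadManager:
    def send(self, config):
        # Return a run record tracking the traffic sent for this config.
        return {"config": config, "results": []}

class FakeResultManager:
    def query(self, run):
        # Gather metrics for the run's time period and append them.
        run["results"].append("p99 ok")

class FakeReporter:
    def render(self, history):
        return f"{len(history)} configs tested"

def run(step_manager, environment_manager, load_manager, result_manager, reporter):
    # Loop: next config -> setup -> send load -> query results -> teardown.
    history = []
    while (config := step_manager.next()) is not None:
        environment_manager.setup(config)
        test_run = load_manager.send(config)
        result_manager.query(test_run)
        history.append(config)
        environment_manager.teardown(config)
    return reporter.render(history)

report = run(FakeStepManager(["config-a", "config-b"]),
             FakeEnvironmentManager(), FakeLoadManager(),
             FakeResultManager(), FakeReporter())
print(report)  # 2 configs tested
```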
perfsize is available on the Python Package Index (PyPI).
Install using pip (or your preferred dependency manager and virtual environment):
```
pip install perfsize
```
See the tests folder for examples.
Clone the repository:
```
git clone https://github.com/intuit/perfsize.git
cd perfsize
```
Install poetry for dependency management and packaging:
https://python-poetry.org/docs/
Set up your virtual environment (with package versions from the poetry.lock file):
```
poetry install
```
Start a shell for your virtual environment for running additional commands that need access to the installed packages:
```
poetry shell
python anything.py
```
Other commands from the Makefile:
- `make format`: format code with black
- `make test`: run all tests
- `make build`: create artifacts
- `make publish`: push artifacts to PyPI
See packages installed:
```
poetry show --tree
```
See info about environments:
```
poetry env info
poetry env list
```
Optional: for integration with your IDE, you can run `poetry env info` to get the Virtualenv path,
like `/Users/your-name-here/Library/Caches/pypoetry/virtualenvs/perfsize-VvRdEPIE-py3.9`, and point your IDE
to the `bin/python` there.
In IntelliJ:
- Create a new Python SDK option under File > Project Structure > Platform Settings > SDKs > Add new Python SDK > Virtualenv Environment > Existing environment > Interpreter, and specify the path above including `bin/python`.
- Update Project Settings > Project SDK and Project Settings > Modules > Module SDK to point to the SDK you just created.
Make sure you are doing this from a clean working directory.
Possible release types are:
- patch
- minor
- major
- prepatch
- preminor
- premajor
- prerelease
```
VERSION_BUMP_MSG=$(poetry version --no-ansi <release>)
NEW_VERSION=$(poetry version --no-ansi | cut -d ' ' -f2)
git commit -am "${VERSION_BUMP_MSG}"
git tag "v${NEW_VERSION}"
git push && git push --tags
```
Once the tag is published as a release, the GitHub Action python-publish.yml will publish the artifacts to perfsize on PyPI.
Feel free to open an issue or pull request!
For major changes, please open an issue first to discuss what you would like to change.
Make sure to read our code of conduct.
This project is licensed under the terms of the Apache License 2.0.
