Add new benchmarks
greshilov committed Nov 21, 2021
1 parent f483451 commit d280cf4
Showing 19 changed files with 500 additions and 57 deletions.
8 changes: 4 additions & 4 deletions .github/workflows/ci.yaml
@@ -25,14 +25,14 @@ jobs:
- name: Cache PyPI
uses: actions/cache@v2.1.6
with:
key: pip-lint-${{ hashFiles('requirements.dev.txt') }}
key: pip-lint-${{ hashFiles('requirements/*.txt') }}
path: ~/.cache/pip
restore-keys: |
pip-lint-
- name: Install dependencies
uses: py-actions/py-dependency-install@v2
with:
path: requirements.dev.txt
path: requirements/requirements.dev.txt
- name: Pre-Commit hooks
uses: pre-commit/action@v2.0.3
- name: Build docs
@@ -63,14 +63,14 @@ jobs:
- name: Cache PyPI
uses: actions/cache@v2
with:
key: pip-ci-${{ runner.os }}-${{ matrix.pyver }}-${{ hashFiles('requirements.dev.txt') }}
key: pip-ci-${{ runner.os }}-${{ matrix.pyver }}-${{ hashFiles('requirements/*.txt') }}
path: ${{ steps.pip-cache.outputs.dir }}
restore-keys: |
pip-ci-${{ runner.os }}-${{ matrix.pyver }}
- name: Install dependencies
uses: py-actions/py-dependency-install@v2
with:
path: requirements.dev.txt
path: requirements/requirements.dev.txt
- name: Install itself
run: |
pip install -e .
1 change: 1 addition & 0 deletions .pre-commit-config.yaml
@@ -7,6 +7,7 @@ repos:
rev: '5.10.0'
hooks:
- id: isort
args: ["--profile", "black"]
- repo: https://github.com/psf/black
rev: '21.10b0'
hooks:
21 changes: 21 additions & 0 deletions DEVELOPMENT.md
@@ -0,0 +1,21 @@
# Development

## Install dependencies
```
pip install -r requirements/requirements.dev.txt
pip install -e .
```
## Lock dependencies
```
make compile-deps
```

## Test
```
make test
```

## Benchmark
```
python benchmarks/benchmark.py
```
6 changes: 4 additions & 2 deletions Makefile
@@ -5,7 +5,7 @@
@touch .update-pip

.develop: .update-pip
@pip install -r requirements.dev.txt
@pip install -r requirements/requirements.dev.txt
@pip install -e .
@touch .develop

@@ -28,7 +28,9 @@ clean:
.PHONY: compile-deps
compile-deps: .update-pip
pip-compile --allow-unsafe -q --strip-extras \
requirements.dev.in
requirements/requirements.dev.in
pip-compile --allow-unsafe -q --strip-extras \
requirements/requirements.bench.in

.PHONY: doc
doc: .develop
53 changes: 35 additions & 18 deletions README.md
@@ -27,6 +27,7 @@ $ pip install pysegmenttree
>> from pysegmenttree import stree
# Build the tree
# 'sum' function is used by default
>> tree = stree([5, 1, 9, 4, 5, 11])
# Find sum on the interval [1, 4)
@@ -42,24 +43,40 @@ $ pip install pysegmenttree
# Docs
Docs are available [here](https://pysegmenttree.readthedocs.io/en/latest/).

# Development
# Performance

## Insall dependencies
```
pip install -r requirements.dev.txt
pip install -e .
```
## Lock dependencies
```
pip-compile requirements.dev.in
```
Three basic segment tree operations (`init`, `query` and `update`) were benchmarked for three element types: `int`, `float` and `Vec2D`.
Results for three other Python segment tree libraries are included for comparison.
All benchmarking code lives in the `benchmarks` subdirectory.

## Test
```
pytest -v
```
* [segment-tree](https://github.com/evgeth/segment_tree)
* [segmenttree](https://github.com/1e0ng/segmenttree)
* [c-segment-tree](https://github.com/gilaniasher/segtree-c-python)

## Benchmark
```
python benchmarks/benchmark.py
```
## init
| Param | Value |
| --------- | ------- |
| Tree size | 100 000 |


[<img src="benchmarks/with_other_libs/data/init.png"/>](benchmarks/with_other_libs/data/init.png "init")

## query
| Param | Value |
| --------- | ------- |
| Tree size | 100 000 |
| Queries performed | 10 000 |

[<img src="benchmarks/with_other_libs/data/query.png"/>](benchmarks/with_other_libs/data/query.png "query")

## update
| Param | Value |
| --------- | ------- |
| Tree size | 100 000 |
| Updates performed | 10 000 |

[<img src="benchmarks/with_other_libs/data/update.png"/>](benchmarks/with_other_libs/data/update.png "update")
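The charts report the mean of the five timed repeats for each library and type, converted to milliseconds (this is what `plotter.py` computes). A minimal sketch of that aggregation follows. The helper name `mean_ms` is hypothetical, and the inline sample is copied from the `init` entries of `bench_results.json` purely for illustration.

```python
import statistics

# Two entries copied from the "init" section of bench_results.json.
# "result" holds per-repeat timings in seconds, or None when a library
# does not support the benchmark for that type.
sample = [
    {
        "lib": "pysegmenttree",
        "type": "int",
        "result": [
            0.0035756060024141334,
            0.003142731999105308,
            0.0032137849993887357,
            0.0032744030031608418,
            0.003281003999290988,
        ],
    },
    {"lib": "c-segment-tree", "type": "float", "result": None},
]


def mean_ms(entry):
    """Return the mean timing in milliseconds, or None if there is no result."""
    timings = entry["result"]
    if not timings:
        return None
    return statistics.mean(timings) * 1000


for entry in sample:
    print(entry["lib"], entry["type"], mean_ms(entry))
```

Entries whose `result` is `null` in the JSON are the library and type combinations that have no bar in the charts.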


# Development
Read more [here](DEVELOPMENT.md).
Empty file added bench_results.json
Empty file.
1 change: 1 addition & 0 deletions benchmarks/with_other_libs/data/bench_results.json
@@ -0,0 +1 @@
{"init": [{"lib": "segment_tree", "type": "int", "size": 100000, "result": [0.4950565179969999, 0.49545792899880325, 0.49299253100616625, 0.501899640999909, 0.49569579899980454]}, {"lib": "segment_tree", "type": "float", "size": 100000, "result": [0.516590555998846, 0.529954080004245, 0.5295553499963717, 0.5284192909966805, 0.5341700610006228]}, {"lib": "segment_tree", "type": "Vec2D", "size": 100000, "result": null}, {"lib": "segmenttree", "type": "int", "size": 100000, "result": [3.436378117003187, 3.359400483997888, 3.3743358140054625, 3.3423399679959402, 3.395128267999098]}, {"lib": "segmenttree", "type": "float", "size": 100000, "result": [3.5194874169974355, 3.54250311999931, 3.497459190002701, 3.459336081999936, 3.4730496590054827]}, {"lib": "segmenttree", "type": "Vec2D", "size": 100000, "result": null}, {"lib": "c-segment-tree", "type": "int", "size": 100000, "result": [0.0035811330017168075, 0.0033734430035110563, 0.0034247379953740165, 0.003644535994681064, 0.0037995039965608157]}, {"lib": "c-segment-tree", "type": "float", "size": 100000, "result": null}, {"lib": "c-segment-tree", "type": "Vec2D", "size": 100000, "result": null}, {"lib": "pysegmenttree", "type": "int", "size": 100000, "result": [0.0035756060024141334, 0.003142731999105308, 0.0032137849993887357, 0.0032744030031608418, 0.003281003999290988]}, {"lib": "pysegmenttree", "type": "float", "size": 100000, "result": [0.00233873200340895, 0.002194407003116794, 0.0020830599969485775, 0.002199442002165597, 0.00207147300534416]}, {"lib": "pysegmenttree", "type": "Vec2D", "size": 100000, "result": [0.09601272299914854, 0.09568678100185934, 0.09474269799829926, 0.09465558900410542, 0.09999653500563]}], "query": [{"lib": "segment_tree", "type": "int", "size": 100000, "result": [3.6246276259989827, 3.985174123001343, 3.9856986659942777, 3.925152572002844, 3.931508036002924]}, {"lib": "segment_tree", "type": "float", "size": 100000, "result": [3.974614147002285, 3.8920451360027073, 3.915745499005425, 
4.05408359500143, 4.118509927000559]}, {"lib": "segment_tree", "type": "Vec2D", "size": 100000, "result": null}, {"lib": "segmenttree", "type": "int", "size": 100000, "result": [2.0365582829981577, 1.9988330940032029, 2.106063384002482, 2.128903099997842, 2.0784016090037767]}, {"lib": "segmenttree", "type": "float", "size": 100000, "result": [2.0103782629958005, 2.24844694800413, 2.5023136640011217, 2.120354243001202, 2.1585201090056216]}, {"lib": "segmenttree", "type": "Vec2D", "size": 100000, "result": null}, {"lib": "c-segment-tree", "type": "int", "size": 100000, "result": [0.07157418800488813, 0.06825177700375207, 0.06837935699877562, 0.06721083999582333, 0.06763002799561946]}, {"lib": "c-segment-tree", "type": "float", "size": 100000, "result": null}, {"lib": "c-segment-tree", "type": "Vec2D", "size": 100000, "result": null}, {"lib": "pysegmenttree", "type": "int", "size": 100000, "result": [0.04932399899553275, 0.04654256200592499, 0.045853061994421296, 0.046224268997320905, 0.04585722499905387]}, {"lib": "pysegmenttree", "type": "float", "size": 100000, "result": [0.045541424995462876, 0.04557060400111368, 0.046319664994371124, 0.04528419399866834, 0.0463766510001733]}, {"lib": "pysegmenttree", "type": "Vec2D", "size": 100000, "result": [1.4812883819977287, 1.6258879259985406, 1.68618632399739, 1.5976089920004597, 1.5695207160024438]}], "update": [{"lib": "segment_tree", "type": "int", "size": 100000, "result": [5.520094412997423, 5.841197776004265, 5.150616339000408, 5.427858158996969, 5.542583589005517]}, {"lib": "segment_tree", "type": "float", "size": 100000, "result": [5.425468474997615, 5.465630778002378, 5.662542193997069, 5.739678728001309, 5.280571990995668]}, {"lib": "segment_tree", "type": "Vec2D", "size": 100000, "result": null}, {"lib": "segmenttree", "type": "int", "size": 100000, "result": null}, {"lib": "segmenttree", "type": "float", "size": 100000, "result": null}, {"lib": "segmenttree", "type": "Vec2D", "size": 100000, "result": null}, 
{"lib": "c-segment-tree", "type": "int", "size": 100000, "result": [0.051225787996372674, 0.056256093994306866, 0.05795971499901498, 0.05704092999803834, 0.053529493998212274]}, {"lib": "c-segment-tree", "type": "float", "size": 100000, "result": null}, {"lib": "c-segment-tree", "type": "Vec2D", "size": 100000, "result": null}, {"lib": "pysegmenttree", "type": "int", "size": 100000, "result": [0.03714599100203486, 0.04030320300080348, 0.040829017001669854, 0.04023172899906058, 0.03740006700536469]}, {"lib": "pysegmenttree", "type": "float", "size": 100000, "result": [0.04381488499348052, 0.0434026270013419, 0.04631680300371954, 0.04376739299914334, 0.04007001599529758]}, {"lib": "pysegmenttree", "type": "Vec2D", "size": 100000, "result": [2.3138798520012642, 2.262647192001168, 2.26163187000202, 2.2630646460020216, 2.5101676029953524]}]}
Binary file added benchmarks/with_other_libs/data/init.png
Binary file added benchmarks/with_other_libs/data/query.png
Binary file added benchmarks/with_other_libs/data/update.png
156 changes: 156 additions & 0 deletions benchmarks/with_other_libs/main.py
@@ -0,0 +1,156 @@
import collections
import functools
import json
import pathlib
import random
import timeit

from benchmarks.with_other_libs.wrappers import *
from pysegmenttree.test_utils import Vec2D

DATA_DIR = pathlib.Path(__file__).parent / "data"
RESULT_FILE_NAME = DATA_DIR / "bench_results.json"

LIBS = [
    Segment_TreeWrapper,
    SegmentTreeWrapper,
    CSegmentTreeWrapper,
    PySegmentTreeWrapper,
]
REFRENCE_SIZE = 100_000
TYPES = [int, float, Vec2D]
QUERY_COUNT = 10_000
DEF_RANGE = 10_000


def typ_random(typ: type):
    """Return one random value of the requested element type."""
    if typ is int:
        return random.randint(-DEF_RANGE, DEF_RANGE)
    elif typ is float:
        return random.randint(-DEF_RANGE, DEF_RANGE) * random.random()
    elif typ is Vec2D:
        return Vec2D(
            random.randint(-DEF_RANGE, DEF_RANGE), random.randint(-DEF_RANGE, DEF_RANGE)
        )
    raise TypeError(f"Unsupported type: {typ!r}")


@functools.lru_cache
def generate_data(typ: type, size: int):
    # Fixed seed so every library is benchmarked on identical input.
    random.seed(42)
    return [typ_random(typ) for _ in range(size)]


@functools.lru_cache
def generate_sum_queries(size: int):
    # NOTE: this yields `size` queries; QUERY_COUNT is not used here.
    random.seed(42)
    return [
        sorted([random.randint(0, size - 1), random.randint(0, size - 1)])
        for _ in range(size)
    ]


@functools.lru_cache
def generate_update_queries(typ: type, size: int):
    random.seed(42)
    return [[random.randint(0, size - 1), typ_random(typ)] for _ in range(size)]


def bench_init(kls: type, source: list):
    def func():
        return kls(source, "sum")

    context = {**globals(), **locals()}
    try:
        return timeit.repeat(
            "func()",
            globals=context,
            number=1,
            repeat=5,
        )
    except NotImplementedError:
        return


def bench_query_sum(kls: type, source: list, queries: list):
    random.seed(42)

    def func(tree):
        for left, right in queries:
            tree.query_sum(left, right)

    context = {**globals(), **locals()}
    try:
        return timeit.repeat(
            "func(tree)",
            setup="tree = kls(source, 'sum')",
            globals=context,
            number=1,
            repeat=5,
        )
    except NotImplementedError:
        return


def bench_update_sum(kls: type, source: list, queries: list):
    random.seed(42)

    def func(tree):
        for i, el in queries:
            tree.update(i, el)

    context = {**globals(), **locals()}
    try:
        return timeit.repeat(
            "func(tree)",
            setup="tree = kls(source, 'sum')",
            globals=context,
            number=1,
            repeat=5,
        )
    except NotImplementedError:
        return


def main():
    results = collections.defaultdict(list)
    for kls in LIBS:
        lib = kls.LIB
        print(f"Testing {lib}")

        for typ in TYPES:
            source = generate_data(typ, REFRENCE_SIZE)
            results["init"].append(
                {
                    "lib": lib,
                    "type": typ.__name__,
                    "size": REFRENCE_SIZE,
                    "result": bench_init(kls, source),
                }
            )

            sum_queries = generate_sum_queries(REFRENCE_SIZE)
            results["query"].append(
                {
                    "lib": lib,
                    "type": typ.__name__,
                    "size": REFRENCE_SIZE,
                    "result": bench_query_sum(kls, source, sum_queries),
                }
            )

            update_queries = generate_update_queries(typ, REFRENCE_SIZE)
            results["update"].append(
                {
                    "lib": lib,
                    "type": typ.__name__,
                    "size": REFRENCE_SIZE,
                    "result": bench_update_sum(kls, source, update_queries),
                }
            )

    with open(RESULT_FILE_NAME, "w") as f:
        json.dump(results, f)


if __name__ == "__main__":
    main()
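
The `wrappers` module imported by `main.py` is not shown in this diff. Judging only from how `main.py` uses it, each wrapper exposes a `LIB` name, a constructor taking `(source, "sum")`, and `query_sum` / `update` methods, and unsupported combinations surface as `NotImplementedError`. A minimal sketch under those assumptions: `NaiveSumWrapper` is a hypothetical list-backed stand-in (not one of the real wrappers), and the half-open `[left, right)` query range is an assumption borrowed from the README example.

```python
import random
import timeit


class NaiveSumWrapper:
    """Hypothetical wrapper with the call shape main.py expects; O(n) queries."""

    LIB = "naive-list"  # label that would appear in the results

    def __init__(self, source, func="sum"):
        if func != "sum":
            raise NotImplementedError(func)
        self._data = list(source)

    def query_sum(self, left, right):
        # Assumed half-open interval [left, right).
        return sum(self._data[left:right])

    def update(self, index, value):
        self._data[index] = value


def bench_query_sum(kls, source, queries):
    """Time all queries against a freshly built tree, five repeats."""

    def func(tree):
        for left, right in queries:
            tree.query_sum(left, right)

    try:
        return timeit.repeat(
            "func(tree)",
            # setup runs once before each repeat and is excluded from the timing
            setup="tree = kls(source, 'sum')",
            globals={"func": func, "kls": kls, "source": source},
            number=1,
            repeat=5,
        )
    except NotImplementedError:
        # Same convention as main.py: no result for unsupported libraries.
        return None


random.seed(42)
source = [random.randint(-10_000, 10_000) for _ in range(1_000)]
queries = [sorted(random.sample(range(1_000), 2)) for _ in range(100)]
timings = bench_query_sum(NaiveSumWrapper, source, queries)
print(len(timings))
```

Building the tree in `setup` keeps construction cost out of the query timings, which is why `init` is measured by a separate benchmark.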
76 changes: 76 additions & 0 deletions benchmarks/with_other_libs/plotter.py
@@ -0,0 +1,76 @@
import collections
import itertools
import json
import statistics

import matplotlib.pyplot as plt
import numpy as np

from benchmarks.with_other_libs.main import (
    DATA_DIR,
    QUERY_COUNT,
    REFRENCE_SIZE,
    RESULT_FILE_NAME,
)


def plot():
    # Load the timings written by main.py.
    with open(RESULT_FILE_NAME) as f:
        bench_data = json.load(f)

    graph_titles = {
        "init": "init",
        "query": "query",
        "update": "update",
    }

    for bench_name, results in bench_data.items():
        lib_labels = collections.defaultdict(list)
        bars_count = collections.defaultdict(int)

        for result in results:
            if result["result"]:
                # Mean of the repeats, converted from seconds to milliseconds.
                res = statistics.mean(result["result"]) * 1000
                bars_count[result["type"]] += 1
            else:
                # No result: the library does not support this type/operation.
                res = 0
            lib_labels[result["lib"]].append(res)

        labels = list(bars_count)
        bars = list(bars_count.values())
        x = list(range(len(labels)))

        width = 0.25
        fig, ax = plt.subplots()

        # Centre each group of bars on its tick; groups may hold a different
        # number of bars because some libraries lack results for some types.
        current_coords = [
            xi - width * (bc - 1) / 2 if bc > 1 else xi for xi, bc in zip(x, bars)
        ]
        for lib, lib_results in lib_labels.items():
            x_coords = []
            for i, res in enumerate(lib_results):
                x_coords.append(current_coords[i])
                if res > 0 and bars[i] > 1:
                    current_coords[i] += width

            rects = ax.bar(x_coords, lib_results, width, label=lib)
            ax.bar_label(rects, fmt="%.2f", padding=3)

        ax.set_ylabel("Time, ms")
        ax.set_yscale("log")
        ax.set_title(graph_titles.get(bench_name))
        ax.set_xticks(x)
        ax.set_xticklabels(labels)

        # ax.legend(bbox_to_anchor=(1,0), loc="lower left")
        ax.legend()
        ax.margins(0.05, 0.6)

        fig.tight_layout()

        fig.savefig(DATA_DIR / "{}.png".format(bench_name))
        plt.close()


if __name__ == "__main__":
    plot()
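
The bar-placement step in `plot()` is easier to follow in isolation. Below is a standalone sketch of the same idea, assuming (as the plotter does) that a value of 0 means "no result". `bar_positions` is a hypothetical helper name and is not part of the repository.

```python
def bar_positions(values_by_series, width=0.25):
    """Compute x coordinates for grouped bars where some bars are missing.

    values_by_series maps a series label to one value per group;
    a value of 0 means "no bar" for that series in that group.
    """
    series = list(values_by_series.values())
    n_groups = len(series[0])

    # Number of real (non-zero) bars in each group.
    bars = [sum(1 for vals in series if vals[g] > 0) for g in range(n_groups)]

    # Left-most bar position per group, so the group is centred on its tick.
    current = [
        g - width * (bc - 1) / 2 if bc > 1 else float(g)
        for g, bc in enumerate(bars)
    ]

    positions = {}
    for label, vals in values_by_series.items():
        coords = []
        for g, value in enumerate(vals):
            coords.append(current[g])
            # Only a real bar in a multi-bar group consumes a slot.
            if value > 0 and bars[g] > 1:
                current[g] += width
        positions[label] = coords
    return positions


print(bar_positions({"A": [1.0, 0], "B": [2.0, 3.0]}))
```

A missing bar is still given a coordinate (it is drawn with zero height) but does not shift the bars that follow it, so a group with a single result stays centred on its tick.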
