Test dumping performances #3126 (Closed)

madt1m wants to merge 17 commits into base: master
Conversation

@madt1m (Member) commented May 14, 2018

First GSoC PR here! Good day :)

So, this should set the foundations to meaningfully test the performance of the Python protobuf implementation. The results are encouraging: a GET of a 4 MB PDF yields the following:
[screenshot: benchmark results]

As documented in the README, some things are yet to be implemented, such as a way to loop dumps and average the results, or to write blobs to a SQLite DB (and time the process).

Mind that this (in particular dummyhttp_pb2) requires the protobuf package to work. To test the addon, I just put the google package produced by python setup.py build from https://github.com/google/protobuf/tree/master/python into my serialization directory.

Request for comments here! 😄

@kajojify (Contributor) commented May 15, 2018

Congratulations! It is a great PR! I wish my PRs were like this one :D
One thing you might change is the arrangement of your imports.
Your imports:

```python
from mitmproxy import ctx
from mitmproxy.io import tnetstring
from mitmproxy.addons.serialization import protobuf
from mitmproxy import http
from functools import wraps
import sys
import time
```

I looked through all the addons and they follow the second paragraph of the PEP 8 imports recommendations:

Imports should be grouped in the following order:

1. standard library imports
2. related third party imports
3. local application/library specific imports

You should put a blank line between each group of imports.

So, it would be nice to have:

```python
import sys
import time
from functools import wraps

from mitmproxy import ctx
from mitmproxy.io import tnetstring
from mitmproxy.addons.serialization import protobuf
from mitmproxy import http
```

Personally, I like imports ordered by increasing length:

```python
import sys
import time
from functools import wraps

from mitmproxy import ctx
from mitmproxy import http
from mitmproxy.io import tnetstring
from mitmproxy.addons.serialization import protobuf
```

But that is absolutely up to you, I think.
Thanks again and good luck!

@madt1m (Member, Author) commented May 15, 2018

Oh yeah, those imports are not that pretty. Thanks for the advice, fixing that :D

@cortesi (Member) commented May 15, 2018

This looks like an excellent start. Eventually, I would like to mature the serialisation benchmarks in /test/bench as part of our general benchmarking suite (though for now let's keep it in the addon directory). A few notes:

  • The test suite is failing because of a flaky client replay test. This isn't your fault, it's mine. I'll look at this separately.
  • I think the test here should be from flow objects all the way to disk. That means we can include the sqlite component in the test, which might be a significant bottleneck.
  • I'm going to push out 4.0 in the next few days. We'll start merging your work at that point.
  • We'll have to clean up the lint failures and exempt the auto-generated protobuf files from lint checks.
@madt1m (Member, Author) commented May 15, 2018

I think the test here should be from flow objects all the way to disk. That means we can include the sqlite component in the test, which might be a significant bottleneck.

Working on it. Will push in the next day(s) 😁

@madt1m (Member, Author) commented May 19, 2018

Ok so, new chapter in the saga 😁

Basically, I have extended the dumping flow all the way to disk, to get a more meaningful benchmark of the process. What I implemented in the past two days is mostly focused on the new DummySession class.

DummySession

This is quite dummy, actually:

  • It opens, and keeps open, a connection to an SQLite DB.
  • On creation, it creates a DUMMY_SESSION table with schema (MID, PBUF_BLOB).
  • Every time a protobuf blob is stored, the store method returns the mid of the inserted tuple, which can then be used as a ticket to retrieve the flow afterwards. A minimal sketch follows.
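
For reference, a minimal sketch of the idea (the actual addon code differs; this just mirrors the behaviour described above):

```python
import sqlite3


class DummySession:
    def __init__(self, path="dummy_session.db"):
        # Open, and keep open, a connection to the SQLite DB.
        self._con = sqlite3.connect(path)
        self._con.execute(
            "CREATE TABLE IF NOT EXISTS DUMMY_SESSION "
            "(MID INTEGER PRIMARY KEY AUTOINCREMENT, PBUF_BLOB BLOB)"
        )

    def store(self, blob: bytes) -> int:
        # "with" wraps the INSERT in its own committed transaction.
        with self._con:
            cur = self._con.execute(
                "INSERT INTO DUMMY_SESSION (PBUF_BLOB) VALUES (?)", (blob,)
            )
        # The rowid is the mid: a ticket to retrieve the flow later.
        return cur.lastrowid

    def retrieve(self, mid: int) -> bytes:
        row = self._con.execute(
            "SELECT PBUF_BLOB FROM DUMMY_SESSION WHERE MID = ?", (mid,)
        ).fetchone()
        return row[0]
```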

As @cortesi guessed, writing to the DB has quite an impact on performance:

[three screenshots: benchmark results with DB writes included]

ANYWAY, I am quite convinced that performance here can be improved a bit. I suspect the major hit on perf time comes from my use of with self._con: statements, which makes every INSERT query an isolated, committed transaction. Now, quoting the SQLite FAQ:

Actually, SQLite will easily do 50,000 or more INSERT statements per second on an average desktop computer. But it will only do a few dozen transactions per second. Transaction speed is limited by the rotational speed of your disk drive. A transaction normally requires two complete rotations of the disk platter, which on a 7200RPM disk drive limits you to about 60 transactions per second.

So, for me, it's time to study how to cut the numbers in those screenshots 😄
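
A sketch of the obvious first fix, batching inserts into one committed transaction (using the DUMMY_SESSION schema above):

```python
def store_many(con, blobs):
    # One committed transaction for the whole batch, instead of
    # one transaction per INSERT (the current "with" per store()).
    with con:
        con.executemany(
            "INSERT INTO DUMMY_SESSION (PBUF_BLOB) VALUES (?)",
            [(b,) for b in blobs],
        )
```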

@mhils @Kriechi join the pit! I think I need some uber-pythonic advice here!

"""
if self.f:
for s in self.serializers:
ctx.log('{} module: '.format(s))

@kajojify (Contributor) commented May 19, 2018

You used f-strings in all other places. Do you want to use f-string here as well? :)

@madt1m (Member, Author) commented May 19, 2018

Good catch!

@cortesi (Member) commented May 19, 2018

Great work. This is all very interesting. A few stream-of-consciousness thoughts here:

Commit unit

In practice, flows will be put on a queue for serialisation. At any one point in time, we will be able to inspect that queue, batch together all the flows that are currently there, and commit them as a single transaction. With asyncio queues, we can efficiently wait on new items in the queue without polling. We should explore this as an optimisation, and adjust the performance tests to be throughput tests - i.e. we add N flows per second to the queue in a constant stream, and see how long they take to write to disk. Ideally, we'd comfortably handle about N=150, which is our performance limit on a current reasonable system. This strategy will also let us handle spikes reasonably.
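
A minimal sketch of that batching idea (names are illustrative, and the blocking SQLite call would eventually move off the event loop):

```python
import asyncio
import sqlite3


async def writer(queue: asyncio.Queue, con: sqlite3.Connection):
    while True:
        # Wait (without polling) for at least one flow, then drain
        # whatever else has accumulated in the queue.
        batch = [await queue.get()]
        while not queue.empty():
            batch.append(queue.get_nowait())
        # Commit the whole batch as a single transaction.
        with con:
            con.executemany(
                "INSERT INTO DUMMY_SESSION (PBUF_BLOB) VALUES (?)",
                [(b,) for b in batch],
            )
```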

Hot and cold flows

It occurs to me that we might end up with a system where we have hot flows, that live in-memory, and haven't yet been shed to disk, and cold flows that have already been saved. When the user iterates through flows in one of the tools, the hot store is checked first, and then the cold store. We should keep an eye on whether this is necessary as we develop the state addon.
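
The lookup path might be as simple as this sketch (names hypothetical):

```python
def load_flow(mid, hot: dict, cold):
    # Hot store: in-memory flows not yet shed to disk.
    if mid in hot:
        return hot[mid]
    # Cold store: flows already serialized to the DB.
    return cold.retrieve(mid)
```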

Separate flow bodies

I think we'll end up storing flow bodies separately from the flows themselves. There are many situations in which we only need to retrieve the flow metadata (i.e. the flow index in mitmproxy console, search that doesn't include content bodies, etc.). We should think about including this in the tests - flow bodies might live in a separate large blob storage table to optimize access patterns, or just in a separate table column.
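
One hypothetical shape for the separate-table variant, with invented table and column names:

```python
# Metadata is small and fetched often; bodies are large and fetched
# only when content is actually needed. Apply with con.executescript().
SCHEMA = """
CREATE TABLE IF NOT EXISTS FLOW_META (
    MID INTEGER PRIMARY KEY AUTOINCREMENT,
    META_BLOB BLOB
);
CREATE TABLE IF NOT EXISTS FLOW_BODY (
    MID INTEGER REFERENCES FLOW_META(MID),
    BODY_BLOB BLOB
);
"""
```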

@madt1m (Member, Author) commented May 20, 2018

@cortesi thanks for the notes, they sparked some really useful thoughts on the task. For throughput testing purposes, do you think I should let the addon generate a constant stream of flows? Or instead, just request a large number of pages and intercept those flows?

@madt1m (Member, Author) commented May 31, 2018

So, some thoughts on the current scenario.

First, I realized that this whole testing process is not self-contained: I am implicitly using these addons as a testing ground to build the basic blocks of the upcoming serializer.

So, I've extended the test to the SQLite module, adding some asyncio machinery:

  • stream produces a constant stream of flow blobs into an asyncio queue.
  • writer reads up to N flows from the queue and flushes them to disk.
  • For every flush, the ratio N / time(_fflush()) is computed and added to the results.
  • Every 10 flushes, the results are averaged.

Note that SQLite is NOT asynchronous yet, so the writer wakes up periodically to avoid a ping-pong between producer and consumer coroutines. I want to remove that sleep period.
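
A sketch of the writer-side measurement (only _fflush and the 10-flush averaging window come from the description above; the rest is assumed):

```python
import time


class Writer:
    def __init__(self, session):
        self.session = session
        self.results = []

    def _fflush(self, flows):
        # Flush up to N queued flow blobs to disk.
        for blob in flows:
            self.session.store(blob)

    def flush(self, flows):
        start = time.perf_counter()
        self._fflush(flows)
        elapsed = time.perf_counter() - start
        # Ratio N / time(_fflush()), in flows per second.
        self.results.append(len(flows) / elapsed)
        if len(self.results) == 10:
            avg = sum(self.results) / 10
            print(f"average throughput: {avg:.1f} flows/s")
            self.results.clear()
```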

Some results:
[screenshot: throughput results]

I feel I'm only scratching the surface. I'm reading some sweet SQLite docs on performance optimization.

Next steps:

  • Improve unit performance, especially on the SQLite side.
  • Improve integration performance, introducing non-blocking writes to the DB and improving the overall mechanics.
  • Figure out the requirements and the design of Session storage.

@mitmproxy/devs any thoughts on the code?
over and out.

@cortesi (Member) commented Jun 2, 2018

Hi there! I think we should get the streaming performance measurement ready to merge so we can move to the next step.

Measurements on my system roughly match yours. Expressed in flows saved per second (f/s), I see about 2k f/s for a 160 KB file, and about 50 f/s for a 5 MB file. This is within acceptable range for ordinary use, and we should be able to improve it further as we focus on performance down the track.

A few comments:

  • Could you please add the protobuf dependency to setup.py?
  • Please exempt *_pb2.py files from lint, and then we can also fix the linting issues in the hand-written code.
  • Do we still need the dumpwatcher addon, or is that now obsolete?
  • To get streamtester ready for a shift into ./test/bench, we should make it terminate and write a total report after N cycles, for some sensible choice of N; a sketch follows this list.
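
Roughly something like this, where measure_one_flush is a stand-in for whatever streamtester measures per flush:

```python
async def run(tester, n_cycles: int = 100):
    # Run a fixed number of flush cycles, then write a total report
    # and terminate; n_cycles is "some sensible choice of N".
    results = []
    for _ in range(n_cycles):
        results.append(await tester.measure_one_flush())
    avg = sum(results) / len(results)
    print(f"total report: {n_cycles} cycles, {avg:.1f} flows/s average")
```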
@madt1m (Member, Author) commented Jun 2, 2018

Could you please add the protobuf dependency to setup.py.
Please exempt *_pb2.py files from lint, and then we can also fix the linting issues in the hand-written code.

I'll take care of this.

Do we still need the dumpwatcher addon, or is that now obsolete?

It only served the purpose of testing protobuf.dumps() performance in comparison with tnetstring. I think it can be safely removed now, given the results.

By the way, I am already working on a more mature system, and this quite experimental testing addon will reflect the evolving changes.

@kajojify kajojify added the gsoc label Jun 8, 2018

@madt1m (Member, Author) commented Jul 22, 2018

Closing this - replicated, clean work in #3256.

@madt1m madt1m closed this Jul 22, 2018
