Optimize IPCProvider.make_request #842

Closed
medvedev1088 opened this issue May 12, 2018 · 5 comments

@medvedev1088
Contributor

medvedev1088 commented May 12, 2018

  • Version: 4.2.1

What was wrong?

In the current implementation, the accumulated response is re-parsed after every recv from the socket: https://github.com/ethereum/web3.py/blob/master/web3/providers/ipc.py#L175

If the response is large, this can become a bottleneck, e.g. in ETL tasks such as https://github.com/medvedev1088/ethereum-etl

How can it be fixed?

Is it possible to first check whether the last bytes of the response are a valid JSON-RPC terminating character, such as } or ], and only then try to parse the JSON?
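To illustrate the idea, here is a minimal, self-contained sketch that does not touch web3.py at all. It simulates a response arriving in 4096-byte pieces and counts how often the JSON parser is invoked with and without the terminator check. The helper names (`chunks`, `read_naive`, `read_guarded`) and the fake payload are assumptions made up for this illustration, not code from the library.

```python
import json


def chunks(payload: bytes, size: int = 4096):
    """Yield the payload in recv-sized pieces, as a socket might deliver it."""
    for start in range(0, len(payload), size):
        yield payload[start:start + size]


def read_naive(payload: bytes):
    """Try to parse after every chunk. Returns (result, parse_attempts)."""
    raw, attempts = b"", 0
    for piece in chunks(payload):
        raw += piece
        attempts += 1
        try:
            return json.loads(raw.decode("utf-8")), attempts
        except ValueError:  # json.JSONDecodeError is a subclass of ValueError
            continue
    raise ValueError("incomplete response")


def read_guarded(payload: bytes):
    """Only parse when the buffer ends with a JSON closing character."""
    raw, attempts = b"", 0
    for piece in chunks(payload):
        raw += piece
        if not raw.endswith((b"}", b"]")):
            continue  # cannot be a complete object or array yet
        attempts += 1
        try:
            return json.loads(raw.decode("utf-8")), attempts
        except ValueError:
            continue  # a "}" in the middle of the stream; keep reading
    raise ValueError("incomplete response")


# Hypothetical large response: 400 fake transactions (ASCII only).
fake = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "transactions": [{"hash": "0x" + "ab" * 32, "input": "0x" + "00" * 200}] * 400
    },
}
payload = json.dumps(fake).encode("utf-8")

naive_result, naive_attempts = read_naive(payload)
guarded_result, guarded_attempts = read_guarded(payload)
print("parse attempts without check:", naive_attempts)
print("parse attempts with check:   ", guarded_attempts)
```

The check is only a filter: a chunk boundary can still land on a `}` in the middle of the response, so the `except` branch has to stay.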

@pipermerriam
Member

Some thoughts.

  • Seems viable, let's try it.
  • I'd like to take a benchmark beforehand so we have an objective performance metric.
  • Our integration tests should catch most classes of errors for this approach.

Any chance you'd like to take a stab at this?

@medvedev1088
Contributor Author

Sure. For the benchmark I'm going to

  • Pick an API call with a large response (probably eth_getBlockByNumber for a block with many transactions)
  • Run IPCProvider.make_request with this call a few thousand times against geth or parity
  • Run it again with the "}" / "]" check in place
  • Compare the running times

Let me know if you have other ideas.
Thanks.

@medvedev1088
Contributor Author

medvedev1088 commented May 13, 2018

I tried this script:

import argparse
import socket
import threading
import time
from json import JSONDecodeError

from web3 import IPCProvider
from web3.providers.base import JSONBaseProvider
from web3.providers.ipc import get_default_ipc_path, PersistantSocket
from web3.utils.threads import Timeout

parser = argparse.ArgumentParser(description='web3py IPCProvider benchmarks')
parser.add_argument('--ipc-path', required=True, type=str, help='The path to the IPC file.')
parser.add_argument('--block-number', default=237368, type=int, help='The block number to benchmark against.')

args = parser.parse_args()


# Copied from web3/providers/ipc.py; the only change is the closing-brace check.
class OptimizedIPCProvider(JSONBaseProvider):
    _socket = None

    def __init__(self, ipc_path=None, testnet=False, timeout=10, *args, **kwargs):
        if ipc_path is None:
            self.ipc_path = get_default_ipc_path(testnet)
        else:
            self.ipc_path = ipc_path

        self.timeout = timeout
        self._lock = threading.Lock()
        self._socket = PersistantSocket(self.ipc_path)
        super().__init__(*args, **kwargs)

    def make_request(self, method, params):
        request = self.encode_rpc_request(method, params)

        with self._lock, self._socket as sock:
            try:
                sock.sendall(request)
            except BrokenPipeError:
                # one extra attempt, then give up
                sock = self._socket.reset()
                sock.sendall(request)

            raw_response = b""
            with Timeout(self.timeout) as timeout:
                while True:
                    try:
                        raw_response += sock.recv(4096)
                    except socket.timeout:
                        timeout.sleep(0)
                        continue
                    if raw_response == b"":
                        timeout.sleep(0)
                    else:
                        try:
                            # Check if the last character or the one before last is closing brace
                            if raw_response[-1:] == b"}" or raw_response[-2:-1] == b"}":
                                response = self.decode_rpc_response(raw_response)
                            else:
                                timeout.sleep(0)
                                continue
                        except JSONDecodeError:
                            timeout.sleep(0)
                            continue
                        else:
                            return response


def benchmark(ipc_provider, ipc_provider_name):
    start = time.time()
    for i in range(0, 100):
        ipc_provider.make_request('eth_getBlockByNumber', [hex(args.block_number), True])
    end = time.time()
    print('Running time for {} is {} seconds'.format(ipc_provider_name, (end - start)))


benchmark(IPCProvider(args.ipc_path), 'IPCProvider')
benchmark(OptimizedIPCProvider(args.ipc_path), 'OptimizedIPCProvider')

The output is

Running time for IPCProvider is 1.700517177581787 seconds
Running time for OptimizedIPCProvider is 0.4113609790802002 seconds

That's about a 75% reduction in running time (roughly 4x faster). I'm going to try it on bigger blocks tomorrow.

This is the line that does the check for closing brace:
if raw_response[-1:] == b"}" or raw_response[-2:-1] == b"}":

That's a quick and dirty way just for testing the idea. The last character is usually \n so I guess I'll need to "right trim" the response before checking for the last character.
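A minimal sketch of what the "right trim" variant could look like, assuming the only trailing bytes to worry about are ASCII whitespace such as \n. The function name `may_be_complete` is hypothetical and is not part of web3.py.

```python
def may_be_complete(raw_response: bytes) -> bool:
    """Cheap pre-check before attempting a full JSON parse.

    Strips trailing ASCII whitespace (e.g. the newline a node may append)
    and tests whether the remaining buffer ends with "}" or "]".
    """
    return raw_response.rstrip().endswith((b"}", b"]"))


# A few illustrative inputs (made up for this example).
print(may_be_complete(b'{"jsonrpc": "2.0", "id": 1, "result": "0x1"}\n'))
print(may_be_complete(b'{"jsonrpc": "2.0", "id": 1, "resu'))
print(may_be_complete(b""))
```

Note that `rstrip()` returns a new bytes object whenever it actually removes something, so on a very large buffer it adds a copy per recv. That is still far cheaper than a failed parse, but trimming only a short tail slice would avoid it.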

Update
For --block-number=4775653 (has 381 transactions) the output is:

Running time for IPCProvider is 9.711205005645752 seconds
Running time for OptimizedIPCProvider is 1.5380918979644775 seconds

Roughly a 6-fold gain (9.71 s vs. 1.54 s).

I also tried switching the order in which the benchmarks are run, i.e. the optimized version first followed by the current version, and the gain was similar.

@pipermerriam
Member

Those look like solid gains. You might check whether raw_response.endswith(b'}') is faster. There is a decent chance that it saves a few object allocations and comparisons, or does them in a faster way. +1 to a pull request with these changes if you're up for it.
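One way to check this is a quick timeit comparison. This is only a sketch: the buffer contents and function names are made up for illustration, and the absolute numbers will vary by machine and Python version.

```python
import timeit

# Hypothetical buffer standing in for a large accumulated response.
raw_response = b'{"jsonrpc": "2.0", "id": 1, "result": "0x0"}' * 10_000


def by_slicing(buf: bytes) -> bool:
    """The check from the benchmark script: last or second-to-last byte is "}"."""
    return buf[-1:] == b"}" or buf[-2:-1] == b"}"


def by_endswith(buf: bytes) -> bool:
    """The suggested alternative: a single bytes.endswith call."""
    return buf.endswith(b"}")


for check in (by_slicing, by_endswith):
    seconds = timeit.timeit(lambda: check(raw_response), number=200_000)
    print(f"{check.__name__}: {seconds:.3f} s for 200,000 calls")
```

The two checks are not equivalent: the slicing version also accepts a "}" followed by one trailing byte such as \n, while `endswith(b"}")` does not, so the latter needs the buffer trimmed first.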

medvedev1088 changed the title from "Possibility to optimize IPCProvider.make_request" to "Optimize IPCProvider.make_request" on May 21, 2018

@medvedev1088
Contributor Author

Fixed here #849
