
Events API project #1349

Merged
merged 29 commits into from Apr 18, 2017

Conversation

vrde
Contributor

@vrde vrde commented Mar 28, 2017

This PR is the hub to all the sub-PRs related to #1086 and specifically the WebSocket event stream API.

@codecov-io

codecov-io commented Mar 28, 2017

Codecov Report

Merging #1349 into master will increase coverage by 0.15%.
The diff coverage is 96.06%.

@@            Coverage Diff             @@
##           master    #1349      +/-   ##
==========================================
+ Coverage   98.21%   98.37%   +0.15%     
==========================================
  Files          54       56       +2     
  Lines        2469     2587     +118     
==========================================
+ Hits         2425     2545     +120     
+ Misses         44       42       -2


@vrde vrde closed this Mar 29, 2017
@vrde vrde reopened this Mar 30, 2017
@vrde vrde requested review from ssadler and removed request for ssadler April 3, 2017 11:52
try:
    queue.put_nowait(value)
    return
except asyncio.QueueFull:
    queue.get_nowait()
Contributor

@sbellem sbellem Apr 7, 2017

A try/except/else pattern could be used here, for readability and style reasons, since obviously not much can go wrong with the return statement :).

try:
    queue.put_nowait(value)
except asyncio.QueueFull:
    queue.get_nowait()
else:
    return

some related links:

Contributor

@sbellem sbellem left a comment

code cov!

@vrde vrde changed the title [WIP] Events API project Events API project Apr 11, 2017
# put a valid block event in the queue
e.handle_block_events({'status': Bigchain.BLOCK_VALID}, block_id)
event = e.event_handler.get_event()
assert event.type == EventTypes.BLOCK_VALID
Contributor

@sbellem sbellem Apr 11, 2017

We can treat this in a separate issue, as a clean-up task, but I'll take the opportunity to share some thoughts on one possible approach regarding imports in tests. There's this idea of not importing a module-under-test (MUT) at module scope. One helpful reference for this approach is http://pylonsproject.org/community-unit-testing-guidelines.html

The module-under-test (MUT) can be a function, class, or method, for instance. In the test above one could say that the MUT is the class Election. As per the recommendation of the approach discussed here, one would then import the Election class inside the test function testing it. The reason given for doing so is:

Import failures in the module-under-test (MUT) should cause individual test cases to fail, and they should never prevent those tests from being run. Depending on the test runner, import problems may be much harder to distinguish at a glance than normal test failures.

So that's it. We do not need to be maniacal about it though 😄 as this is most likely debatable.
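A minimal sketch of the deferred-import style being described (the standard library's `json` module stands in for the real module-under-test, since the actual import path is project-specific):

```python
# Module-scope import of the MUT is avoided; each test imports what it
# needs in its own body.

def test_loads_list():
    # If this import failed, only this test would error; the rest of the
    # file's tests would still be collected and run.
    from json import loads  # stand-in for the real module-under-test
    assert loads('[1, 2]') == [1, 2]


def test_loads_dict():
    from json import loads
    assert loads('{"a": 1}') == {'a': 1}
```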

Contributor

I'm not too hot on this requirement. I've yet to see a circumstance where having the imports at the top of the file is masking a problem.

Contributor Author

I personally like the MUT approach; it makes the unit test self-contained. I'd defer it to another PR though.

from bigchaindb import events
from bigchaindb.web.websocket_server import init_app, POISON_PILL, EVENTS_ENDPOINT

event_source = asyncio.Queue(loop=loop)
app = init_app(event_source, loop=loop)
client = yield from test_client(app)
ws = yield from client.ws_connect(EVENTS_ENDPOINT)
block = create_block(b, 10).to_dict()
block = _block.to_dict()
Contributor

@sbellem sbellem Apr 11, 2017

General note for this commit:

Using a fixture is proposed here instead of a function because we already have some tests that also rely on having a block, and these tests also have their own function to create the dummy block. See https://github.com/bigchaindb/bigchaindb/blob/master/tests/db/test_bigchain_api.py#L26-L31 for instance.

We now have the opportunity to take care of this, hence this proposed change. The fixture could perhaps be renamed, and moved up into the main conftest module. If the proposed change is accepted, then both the rename and move can be easily taken care of.

Contributor Author

OK but I believe the previous solution was more understandable:

def create_block(b, total=1):
    # ...


def test_websocket_block_event(b, test_client, loop):
    # ...
    block = create_block(b, 10)
    # ...

vs

@pytest.fixture
def _block(b, request):
    total = getattr(request, 'param', 1)
    # ...


@pytest.mark.parametrize('_block', (10,), indirect=('_block',), ids=('block',))
def test_websocket_block_event(b, _block, test_client, loop):
    # ...
    block = _block.to_dict()
    # ...

Contributor

@sbellem sbellem Apr 12, 2017

Yes, I understand.

For this specific case, if we make the default number of transactions in a block 10, then the decorator part is not needed. Moreover, the fixture name could be simplified, such that we end up with something like:

@pytest.fixture
def block(b, request):
    total = getattr(request, 'param', 10)
    # ...

def test_websocket_block_event(b, block, test_client, loop):
    # ...
    block_dict = block.to_dict()
    # ...

The whole point with this approach is that it provides us with a fixture we can re-use, and when we need to fine tune the quantity of transactions that the block contains we can do so with pytest's indirect parametrization mechanism.

I think that the complexity is worth it in this case given what it provides us with, but that is my opinion. Lastly, this pattern is not uncommon in tests.
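The `getattr(request, 'param', ...)` fallback at the heart of this proposal can be illustrated in isolation (`FakeRequest` is a stand-in for pytest's real request object, not part of the PR):

```python
class FakeRequest:
    """Stand-in for pytest's request object; under indirect
    parametrization the real one carries a .param attribute."""


class FakeParametrizedRequest(FakeRequest):
    param = 25


def block_total(request):
    # Same fallback the proposed fixture uses: honor an indirect
    # parameter when supplied, otherwise default to 10 transactions.
    return getattr(request, 'param', 10)


print(block_total(FakeRequest()))              # 10 (default)
print(block_total(FakeParametrizedRequest()))  # 25 (parametrized)
```

Tests that are happy with the default just take the fixture; only tests that need a different transaction count pay the cost of the `indirect` parametrization decorator.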

sync_queue.put('we')
sync_queue.put('are')
sync_queue.put('the')
sync_queue.put('robots')
Contributor

    sync_queue.put('dreaming')
    sync_queue.put('we')
    sync_queue.put('are')
    sync_queue.put('humans')

@@ -59,6 +59,10 @@
        'workers': None,  # if none, the value will be cpu_count * 2 + 1
        'threads': None,  # if none, the value will be cpu_count * 2 + 1
    },
    'wsserver': {
        'host': os.environ.get('BIGCHAINDB_WSSERVER_HOST') or 'localhost',
Contributor

My personal preference would be to s/wsserver/websocket/g, a little bit more readable, but that's personal.

Contributor Author

I noticed that the ws prefix is used quite a lot to name libraries and variables. I was debating between using websocket or wsserver. In the end I thought that websocket is the name of a protocol, while wsserver is more aligned with the configuration key server. I'd like to rename server to apiserver actually 😅.

Contributor

Oh ok.

        self.events_queue = events_queue

    def put_event(self, event, timeout=None):
        # TODO: handle timeouts
Contributor

Will these TODOs be addressed before merging? I don't think we should have "TODO" comments in master, but there should be issues addressing the requirement.

Contributor Author

@libscott: hybrid approach, it's fine to have TODOs as long as they reference an issue in our tracker. What do you think?

@r-marques: can you write an issue for this TODO please?

Contributor

I do not mind TODOs.

Contributor

Personally I use TODO to indicate something I need to address in a PR. If there are TODOs in master it's no problem, I will choose another random word to use.

    '_links': {
        'docs': ''.join(docs_url),
        'self': api_root,
        'statuses': api_root + 'statuses/',
        'transactions': api_root + 'transactions/',
        # TODO: The version should probably not be hardcoded
Contributor

I think it should be hardcoded, the point is to have a stable API.



logger = logging.getLogger(__name__)
POISON_PILL = 'POISON_PILL'
Contributor

🦇


        for tx in block['block']['transactions']:
            asset_id = tx['id'] if tx['operation'] == 'CREATE' else tx['asset']['id']
            data = {'blockid': block['id'],
Contributor

I'd prefer block_id, asset_id, tx_id, though I've been guilty of using txid in the past.

Contributor Author

@vrde vrde Apr 12, 2017

I just followed the specification... but 💯% agree here 💃 🕺

Lemme ask quickly to the team.

Question

Given the original message:

{
    "txid": "<sha3-256 hash>",
    "assetid": "<sha3-256 hash>",
    "blockid": "<sha3-256 hash>"
}

Do you agree in changing the message using those new keys?

{
    "tx_id": "<sha3-256 hash>",
    "asset_id": "<sha3-256 hash>",
    "block_id": "<sha3-256 hash>"
}

Yes: 👍
No: 👎

Contributor

@sbellem sbellem Apr 12, 2017

I initially voted down because our current HTTP API uses txid in multiple response payloads.

I thought that we should be consistent, such that if we use tx_id we use it in all responses.

But as @ttmc pointed out, and as #1134 points out, the long term points in the direction of using underscores.

Contributor

In broader terms, I see that as an API design specification question, and it has already been answered for the HTTP API, and in some way also for the websocket API. It is too late right now to change it I think.

Contributor

@sbellem sbellem Apr 12, 2017

If we wish to change the specifications (HTTP and websocket APIs) we can:

  1. do so after the planned release v0.10
  2. change everything now (both HTTP and websocket APIs)
  3. only change the websocket API now for v0.10 and change the HTTP API for v0.11

In any case, a change to the HTTP API will break clients.


Contributor

@TimDaub TimDaub Apr 12, 2017

We're already inconsistent with txid and tx_id. We use tx_id in HTTP API endpoints, but txid inside transactions (e.g. inputs.fulfills.txid); see issue #1134

Yup. How about putting a stake in the ground now:

OK?

Contributor

I agree with using tx_id in place of txid from here on, at least as far as APIs are concerned (local variables are another thing).

Contributor

@sbellem sbellem Apr 12, 2017

Updated my comments and vote. Basically going for:

only change the websocket API now for v0.10 and change the HTTP API for v0.11

Contributor

Yes, I agree that consistency is better (at least in the publicly-visible stuff), and tx_id is easier to read.

def _put_into_capped_queue(queue, value):
    """Put a new item in a capped queue.

    If the queue reached its limit, get the first element
Contributor

So, to clarify, this queue will lose messages rather than block if the consumer doesn't empty it fast enough? If that's the case, why is that the desired behaviour?

Contributor Author

Uhm, I see what you mean. I don't have a good answer for that. My assumption is that creating blocks is computationally more intense than dispatching messages to the connected clients, but I've never tested it. Let's address this issue when we have more tools to do stress testing.

Contributor

I think since we don't have the problem of high throughput right now, it might be safer to make it a blocking queue, i.e., one that doesn't drop messages. If a client is not reading fast enough, then the server may get blocked waiting for their socket to become writeable. In that case, the slow client socket should be somehow dealt with. If it isn't dealt with, a slow client may interfere with what other clients see just by connecting and reading slowly.

Contributor Author

In that case, the slow client socket should be somehow dealt with. If it isn't dealt with, a slow client may interfere with what other clients see just by connecting and reading slowly.

AFAIK this should be handled by aiohttp itself. If a slow client cannot keep up with the stream, the output buffer will be drained.

I'll investigate a bit more and comment again.

Contributor Author

@libscott: I spent quite some time on it but I'm still not 100% convinced.

I've opened an issue in the aiohttp repo:

@@ -31,5 +31,6 @@ def test_api_v1_endpoint(client):
        'self': 'http://localhost/api/v1/',
        'statuses': 'http://localhost/api/v1/statuses/',
        'transactions': 'http://localhost/api/v1/transactions/',
        'streams_v1': 'ws://localhost:9985/api/v1/streams/valid_tx',
Contributor

Perhaps a trailing forward slash, to be consistent with the other paths?

Contributor Author

There is a bit of personal interpretation in the following statement.

The difference is that http://localhost/api/v1/transactions/ represents a "directory". Under it you can find transactions that are the "files".

On the other side ws://localhost:9985/api/v1/streams/valid_tx is a "file" since it's the stream endpoint.

We could argue if streams_v1 should point to ws://localhost:9985/api/v1/streams/valid_tx or http://localhost:9985/api/v1/streams/

Contributor

Oh, that kinda makes sense then

@ssadler
Contributor

ssadler commented Apr 12, 2017

Please address codecov 👍

@sbellem
Contributor

sbellem commented Apr 12, 2017

fyi: I am fixing the codecov problem


        for _, websocket in self.subscribers.items():
            for str_item in str_buffer:
                websocket.send_str(str_item)
Contributor

Are these blocking calls?

Contributor Author

Nope! Weird, right? I didn't have time to dig into the implementation details. I guess that when you call send_str you are writing to a buffer that will be eventually emptied.
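A toy model of the buffered-write idea described here; purely a sketch, as aiohttp's actual transport internals differ (`BufferedSender`, `drain`, and the attribute names are all invented for illustration):

```python
import asyncio


class BufferedSender:
    """Toy model of a non-blocking send_str: the call only appends to an
    internal buffer, and a background step drains it to the wire later."""

    def __init__(self):
        self.buffer = []
        self.wire = []

    def send_str(self, item):
        # Returns immediately; nothing is written to the socket here.
        self.buffer.append(item)

    async def drain(self):
        # Runs later on the event loop, flushing the buffer.
        await asyncio.sleep(0)
        self.wire.extend(self.buffer)
        self.buffer.clear()


async def main():
    ws = BufferedSender()
    ws.send_str('block-1')   # non-blocking
    ws.send_str('block-2')   # non-blocking
    assert ws.wire == []     # nothing on the wire yet
    await ws.drain()
    return ws.wire


print(asyncio.run(main()))  # ['block-1', 'block-2']
```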

Contributor

@ssadler ssadler left a comment

The pending item is to make it so that the queue does not drop messages.

@vrde vrde merged commit 414d915 into master Apr 18, 2017
@vrde vrde deleted the events-api-first-cut branch March 21, 2018 15:02