Push notifications on data changes (client-server data sync) #300

Mec-iS · 2018-12-05T10:56:34Z

I'm submitting a

[x] feature request.

Current Behaviour:

There is currently no way to make the client-side graph representation of the data aware of data changes. Clients have to query the server every time to be sure that data has not been updated since the last query. This opens up the possibility of the client using stale data.

Expected Behaviour:

hydrus should allow clients to connect through WebSocket so that data changes for every object are pushed directly to the client.
Some sort of security mechanism may also be needed: the server keeps a log of changes in an outbox table and the client keeps an inbox. Before sending a request, the client checks the server's outbox to confirm that its inbox is synced with the latest changes. This implies that every request is, under the hood, two requests.
This refers to #218 as the inbox/outbox may be a DAG.

Do you want to work on this issue?

This will probably be a task for GSOC 2019.

Please start collecting ideas for implementation here, and tell the community how you would like to see this feature implemented.

shravandoda · 2019-03-22T13:37:07Z

@Mec-iS shouldn't this work both ways? Rather than just making the client aware of data changes on the server side, the client should be able to push changes to the server as well (given it has been authorized). I don't know if hydrus currently supports any way to push changes to the server database. Please guide.

vddesai1871 · 2019-03-22T13:41:32Z

I do not think the client is supposed to change HydraDoc. If I am correct, this feature is mainly about pushing data changes made by other clients to any client connected with that instance of hydrus.

shravandoda · 2019-03-22T14:35:23Z

@vddesai1871 I think hydra-python-agent can add new instances of resources. It might not be able to add new classes or collections. Please take a look:
https://github.com/HydraCG/Specifications/blob/master/drafts/use-cases/5.1.creating-event-with-put.md

vddesai1871 · 2019-03-22T14:38:33Z

> @vddesai1871 I think hydra-python-agent can add new instances of resources. It might not be able to add new classes or collections.

That's what I wrote above

shravandoda · 2019-03-22T14:39:42Z

I guess I should've said server database instead of HydraDoc

vddesai1871 · 2019-03-22T14:47:45Z

> client should be able to push changes to server as well (given it has been authorized)

The client does this with standard HTTP methods (through the operations provided in the HydraDoc).
We need a push mechanism at the server to propagate such individual changes to the other clients connected to it, so that the data at every client remains synchronized with the server data.

Mec-iS · 2019-03-23T11:42:43Z

Absolutely not, the only source of truth for data shall be the server.

shravandoda · 2019-05-03T05:14:55Z

Can we use some stateful protocol for synchronization between client and server?

Mec-iS · 2019-05-03T09:50:29Z

The best approach is something like PubSub (with Redis, as we already have it available) or, better, an inbox system like the one used by actors in the Actor Model: the server and the clients each have an inbox, and messages sit in a queue that is consumed by the actor. Types of messages can be, for example: PUT this payload in the resource with this id, POST this payload in the resource with this id, GET the resource with this id.
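
A minimal sketch of the inbox idea, assuming an in-process queue instead of Redis. The `Message` and `Actor` names and the dict-backed store are hypothetical and are not part of hydrus; they only illustrate how an actor consumes PUT, POST and GET messages from its own inbox.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class Message:
    """One inbox entry; the fields mirror the message types listed above."""
    method: str                     # "PUT", "POST" or "GET"
    resource_id: str
    payload: Optional[dict] = None  # unused for GET


class Actor:
    """Hypothetical actor: owns an inbox and consumes it in FIFO order."""

    def __init__(self) -> None:
        self.inbox = deque()
        self.store = {}  # resource_id -> payload, stands in for real storage

    def send(self, message: Message) -> None:
        """Queue a message; nothing is processed until drain() runs."""
        self.inbox.append(message)

    def drain(self) -> list:
        """Process every queued message and return the GET results."""
        results = []
        while self.inbox:
            msg = self.inbox.popleft()
            if msg.method in ("PUT", "POST"):
                self.store[msg.resource_id] = msg.payload
            elif msg.method == "GET":
                results.append(self.store.get(msg.resource_id))
            else:
                raise ValueError(f"unsupported method: {msg.method}")
        return results


server = Actor()
server.send(Message("PUT", "42", {"name": "drone"}))
server.send(Message("GET", "42"))
replies = server.drain()
```

Because only the owning actor touches its store, messages are applied one at a time in arrival order, which is the property that makes this model attractive for concurrent changes.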

Guttz · 2019-05-11T16:48:40Z

After a bit of discussion with @vddesai1871 and inspiration from other comments here, I would like to propose a solution that uses WebSockets and an Inbox mechanism.

Websocket implementation

To implement the socket mechanism we would use Flask-SocketIO, a Socket.IO integration for Flask applications. It allows the client to be connected via a socket when it queries for the API Doc, and it also deals with connection inconsistencies and the like.

Regarding the socket functionality, @vddesai1871 already developed an experiment that adds some basic WebSocket support to hydrus and simulates a small client. You can run multiple clients; when a request is made to hydrus, it forwards the modification to them after successfully adding the resource in resource.py.

Inbox mechanism

Below is the inbox that is kept by the server and by the clients.

This solution introduces modifications in both hydrus and the Hydra Python Agent. In hydrus, it is necessary to add the table log mechanism as well as a WebSocket that notifies clients of modifications to the table. On the client side, the agent should keep its own internal table, connect to the WebSocket, and handle the four different situations described below to keep its data up to date.

Client Initialization

When starting, a client queries the basic API structure and also copies the current server modification log table. After that, there are three situations to address when dealing with new rows in the modification table: the client holds an outdated resource that needs an update, the client never queried that resource, and the client was the one that made the transaction.

The client finds a JOB ID referring to an outdated resource

When the client finds a new JOB ID, it queries its Redis graph to check whether it already has that specific resource. If it does, it compares the internal resource date with the one provided by the server. If the resource provided by the server is more recent, the client queries that resource again from the hydrus server and updates its internal copy accordingly.

The client finds a JOB ID to a nonexisting resource internally

When the client finds a new JOB ID, it queries its Redis graph to check whether it already has that specific resource. If it does not, the client has not yet queried for that resource, so it can ignore the modification and simply add the row to its internal table, since the change is not relevant to it.

The Client finds a JOB ID made by itself on the table

When the client finds a new job that isn't in its table, the first thing it has to do is check whether its internal resource has a Date equal to or later than the one in the server table. If the internal representation is at least as recent, the client simply adds that transaction to its table, since it already holds the up-to-date resource whose change is about to show up in the server table.

An important observation here is that the client should set the internal Date for a resource from the Date header sent in the hydrus server's response (to make sure it uses a standard, centralized date). Also, it should only set its internal Date for a resource after receiving a successful response to the HTTP request sent to modify or create that resource on the server.
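
The three situations and the Date rule above can be sketched as one small decision function. This is only an illustration under stated assumptions: the `decide` and `date_from_response_headers` helpers, the action names, and the dict of local dates are hypothetical, and the sketch assumes each row of the server table carries a date for the modified resource.

```python
from datetime import datetime
from email.utils import parsedate_to_datetime


def date_from_response_headers(headers: dict) -> datetime:
    """Parse the HTTP Date header so client timestamps follow the server clock."""
    return parsedate_to_datetime(headers["Date"])


def decide(local_dates: dict, resource_url: str, server_date: datetime) -> str:
    """Return the action for one new row of the modification table.

    local_dates maps resource URL -> Date stored by the client (hypothetical).
    """
    local_date = local_dates.get(resource_url)
    if local_date is None:
        return "ignore"   # never queried: only record the row
    if server_date > local_date:
        return "refetch"  # local copy is outdated
    return "record"       # own change, or local copy already current


url = "http://server.com/Collection/98e8e272-e5ae-4f1a-a0b2-117fb052ca50"
# Date taken from a (made-up) successful response to the client's own request.
stored = date_from_response_headers({"Date": "Sat, 11 May 2019 16:48:40 GMT"})
local = {url: stored}
newer = parsedate_to_datetime("Sat, 11 May 2019 17:00:00 GMT")
```

With these values, `decide(local, url, newer)` asks for a refetch, `decide(local, url, stored)` only records the row, and `decide({}, url, newer)` ignores the change because the resource was never queried.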

That's the overall concept. I've been trying to grasp more concepts from the Actor Model so the solution is more robust and can perhaps process some concurrent changes.

Guttz · 2019-07-19T12:05:44Z

[Following the discussion at https://github.com/HTTP-APIs/hydra-python-agent/pull/123]
The server automatically launches a Flask server and creates a socket in the namespace '/sync'; all clients connect to that socket and listen for events.

The server has a limited-size modifications_table like the following:

JOB_ID	METHOD	RESOURCE_URL
fece6d5e...	POST	http://server.com/Collection/98e8e272-e5ae-4f1a-a0b2-117fb052ca50
aaa49974...	DELETE	http://server.com/Collection/f1404e8d-0a52-4359-88c3-29ec9f208525
360df976...	POST	http://server.com/Collection/f1404e8d-0a52-4359-88c3-29ec9f208525

On connecting, every client copies the last_job_id (fece6d5e here).
When hydrus has POST and DELETE modifications, it simply informs all clients that "there's updates".
Each client then sends hydrus the last job ID it processed (fece6d5e), gets all the new rows above fece6d5e, and processes them.
After that, it updates its last_job_id variable with the new last_job_id on the server.
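
A rough sketch of what a client might do on each "there's updates" notification, assuming the table is ordered newest first as in the example above. The `sync` function, the `fetch_diff` callable (standing in for the request to hydrus), and the dict cache are hypothetical, and the job IDs are the shortened ones shown in the table.

```python
from typing import Callable


def sync(cache: dict, last_job_id: str, fetch_diff: Callable[[str], list]) -> str:
    """Handle one notification and return the new last_job_id.

    fetch_diff must return the rows newer than last_job_id, newest first.
    """
    new_rows = fetch_diff(last_job_id)
    for row in reversed(new_rows):   # replay oldest first
        url = row["resource_url"]
        if row["method"] == "DELETE":
            cache.pop(url, None)     # drop the local copy
        elif row["method"] == "POST" and url in cache:
            cache[url] = None        # mark as stale; refetch later
    return new_rows[0]["job_id"] if new_rows else last_job_id


base = "http://server.com/Collection/"
table = [
    {"job_id": "fece6d5e", "method": "POST",
     "resource_url": base + "98e8e272-e5ae-4f1a-a0b2-117fb052ca50"},
    {"job_id": "aaa49974", "method": "DELETE",
     "resource_url": base + "f1404e8d-0a52-4359-88c3-29ec9f208525"},
    {"job_id": "360df976", "method": "POST",
     "resource_url": base + "f1404e8d-0a52-4359-88c3-29ec9f208525"},
]


def fake_fetch_diff(job_id: str) -> list:
    """Stand-in for the server: rows above job_id in the table."""
    ids = [row["job_id"] for row in table]
    return table[: ids.index(job_id)]


cache = {base + "f1404e8d-0a52-4359-88c3-29ec9f208525": {"name": "old"}}
last_job_id = sync(cache, "360df976", fake_fetch_diff)
```

Here the client had stopped at 360df976, so it replays the DELETE and then the POST; the deleted resource leaves the cache and last_job_id moves to fece6d5e.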

In the end, hydrus has three core implementations:

A socket connection that broadcasts that there were new events to all Clients
A limited size table that should contain modifications made to resources with POST and DELETE
An endpoint for the clients to fetch the modifications table difference:

@app.route('/modification-table-diff')

It receives a Job ID as a parameter and returns the table diff relative to the last updated resource the Agent had.
GET example: https://localhost:5000/modification-table-diff?agent_job_id=2
Note: if the parameter is empty, the endpoint returns the full table (for initialization purposes).
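
The table and the diff logic behind such an endpoint could look roughly like the following. This is a sketch, not the code in hydrus: the `ModificationTable` class, its `max_rows` parameter, the example URLs, and the choice to return the full table for an unknown Job ID are all assumptions, and the Flask route itself is left out so the block runs with the standard library only.

```python
from collections import deque
from typing import Optional
from uuid import uuid4


class ModificationTable:
    """Hypothetical bounded log of POST and DELETE changes, newest first."""

    def __init__(self, max_rows: int = 100) -> None:
        # With maxlen set, appendleft() drops the oldest row (right end) when full.
        self.rows = deque(maxlen=max_rows)

    def record(self, method: str, resource_url: str) -> str:
        """Log one modification and return its generated job ID."""
        job_id = uuid4().hex
        self.rows.appendleft(
            {"job_id": job_id, "method": method, "resource_url": resource_url}
        )
        return job_id

    def diff(self, agent_job_id: Optional[str] = None) -> list:
        """Rows newer than agent_job_id; the full table if it is empty or unknown."""
        rows = list(self.rows)
        if not agent_job_id:
            return rows
        for index, row in enumerate(rows):
            if row["job_id"] == agent_job_id:
                return rows[:index]
        return rows  # ID already evicted: the agent has to resync fully


log = ModificationTable(max_rows=2)
first = log.record("POST", "http://server.com/Collection/a")
second = log.record("DELETE", "http://server.com/Collection/a")
third = log.record("POST", "http://server.com/Collection/b")
```

With `max_rows=2` the first row has been evicted, so `log.diff(first)` falls back to the full table, while `log.diff(second)` returns only the newest row.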

A simulated server that reproduces this behavior and works with the current Agent PR is available at: https://github.com/Guttz/simulated-hydrus-sync-socket.

Mec-iS · 2019-09-20T14:00:21Z

@chrizandr please close if it is done.

Mec-iS added the GSOC-2019 label Dec 5, 2018

This was referenced Dec 22, 2018

Research possible improvements for the models and architecture #47

Closed

Implement the checksum mechanism we discussed over mail. #218

Closed

PrajwalM2212 mentioned this issue Jan 18, 2019

Ideas page for 2019. HTTP-APIs/http-apis.github.io#10

Closed

Mec-iS changed the title ~~Push notifications on data changes to client~~ Push notifications on data changes (client-server data sync) Mar 19, 2019

Mec-iS mentioned this issue Mar 21, 2019

Create a synchronisation mechanism for server-client udpates HTTP-APIs/hydra-python-agent#98

Closed

Mec-iS mentioned this issue Mar 23, 2019

Considering HTTP/2 support, porting to Quart #372

Closed

Mec-iS added this to To do in [GSOC-2019]__Enhancement Mar 25, 2019

HTTP-APIs deleted a comment from HarsheetKakar May 3, 2019

Guttz moved this from To do to In progress in [GSOC-2019]__Enhancement May 17, 2019

Mec-iS assigned vddesai1871 and Guttz Jun 7, 2019

Guttz mentioned this issue Jul 18, 2019

Synchronization Mechanism for Server-Client Updates HTTP-APIs/hydra-python-agent#123

Merged

2 tasks

vddesai1871 mentioned this issue Jul 23, 2019

Add support for client-server synchronization #414

Merged

2 tasks

Guttz moved this from In progress to Done in [GSOC-2019]__Enhancement Aug 23, 2019

chrizandr closed this as completed Sep 20, 2019
