Collaboration access improvement #472

AntoLC · 2024-12-03T15:21:30Z

Purpose

Access improvement on the collaboration server.

Proposal

✨(backend) add subrequest auth view for collaboration server
✨(y-provider) endpoint POST /collaboration/api/reset-connections
✅(y-provider) add tests for y-provider server
✨(backend) notify collaboration server
🔧(helm) add ingress collaboration api

sampaccoud · 2024-12-07T21:29:42Z

docker/files/etc/nginx/conf.d/default.conf

+    location /collaboration-auth {
+        proxy_pass http://app-dev:8000/api/v1.0/documents/collaboration-auth/;
+        proxy_set_header Host $host;
+        proxy_set_header X-Real-IP $remote_addr;
+        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
+        proxy_set_header X-Original-URL $request_uri;
+
+        # Prevent the body from being passed
+        proxy_pass_request_body off;
+        proxy_set_header Content-Length "";
+        proxy_set_header X-Original-Method $request_method;
+    }


I don't see why this is necessary instead of using the normal API URL.

Maybe I can clarify a bit.
We cannot use the normal API URL: at this point we are inside the nginx container, so we need to call the backend by its Docker Compose service name.

Let's pair on this, because I can't see clearly how it works just from reviewing.

sampaccoud · 2024-12-07T21:30:03Z

docker/files/etc/nginx/conf.d/default.conf

+    location  /collaboration/api/ {
+        # Collaboration server
+        proxy_pass http://y-provider:4444;
+        proxy_set_header Host $host;
+    }


then who is doing authentication on this route? 🤔

This route will be used only by the Django backend, so we don't want it to go through collaboration_auth.

nginx is here only to provide the stickiness, so that requests automatically reach the right pod.
The auth is inside the httpSecurity middlelayer on the endpoint:
https://github.com/numerique-gouv/impress/blob/c6c4eec18f5e1b7de32d838dc81ff4a8421f7de3/src/frontend/servers/y-provider/src/server.ts#L83-L88

The security check is mainly this part:
https://github.com/numerique-gouv/impress/blob/c6c4eec18f5e1b7de32d838dc81ff4a8421f7de3/src/frontend/servers/y-provider/src/middlelayers.ts#L28-L32
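
To illustrate the kind of check described here ("the keys have to match"), below is a minimal sketch in Python rather than the server's TypeScript. It is not the code at the linked lines: the function name is_authorized is hypothetical, and the only details taken from this PR are that the secret travels in the Authorization header and must equal the value known to the server.

```python
import hmac


def is_authorized(headers: dict, expected_secret: str) -> bool:
    """Return True only when the Authorization header equals the shared secret.

    Hypothetical helper: the real check lives in the y-provider middlelayer.
    """
    provided = headers.get("Authorization", "")
    # Refuse everything when no secret is configured, then compare in
    # constant time so the comparison does not leak matching prefixes.
    return bool(expected_secret) and hmac.compare_digest(
        provided.encode(), expected_secret.encode()
    )
```

A request without the header, or with a different value, is rejected before it reaches the route handler; an unset secret rejects every request rather than accepting an empty header.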

sampaccoud · 2024-12-07T21:34:13Z

src/backend/core/api/viewsets.py

+    def perform_update(self, serializer):
+        """Update an access to the document and notify the collaboration server."""
+        access = serializer.save()
+
+        access_user_id = None
+        if access.user:
+            access_user_id = str(access.user.id)
+
+        # Notify collaboration server about the access change
+        CollaborationService().reset_connections(
+            str(access.document.id), access_user_id
+        )
+
+    def perform_destroy(self, instance):
+        """Delete an access to the document and notify the collaboration server."""
+        instance.delete()
+
+        # Notify collaboration server about the access removed
+        CollaborationService().reset_connections(
+            str(instance.document.id), str(instance.user.id)
+        )
+


A signal directly on the DocumentAccess model would be a better place to do that, because the access could be changed from elsewhere.

I think I will do it in another PR, if that is OK. It requires modifying a lot of tests in the models.

sampaccoud · 2024-12-07T21:37:59Z

src/backend/impress/settings.py

+    COLLABORATION_API_URL = values.Value(
+        None, environ_name="COLLABORATION_API_URL", environ_prefix=None
+    )
+    COLLABORATION_SERVER_SECRET = values.Value(
+        None, environ_name="COLLABORATION_SERVER_SECRET", environ_prefix=None
+    )


I would find it easier to understand if these 2 settings shared the same root:
COLLABORATION_SERVER_URL
COLLABORATION_SERVER_SECRET

I used COLLABORATION_API_URL because we also have COLLABORATION_WS_URL. Both of them target the server in some way, and we need separate base URLs because of Docker Compose:

COLLABORATION_API_URL=http://nginx:8083/collaboration/api/
COLLABORATION_WS_URL=ws://localhost:8083/collaboration/ws/

If you think it is better, I can change it, no problem.

sampaccoud · 2024-12-07T21:40:56Z

src/frontend/servers/y-provider/src/server.ts

+
+    if (documentName !== roomParam) {
+      console.error(
+        'Invalid room name - Probable hacking attempt:',


Why do you need the room parameter if you already have the info in documentName? 🤔

roomParam is needed for 2 things:

We extract the document from the request_uri in the collaboration-auth endpoint - (here)

We need it to route all clients of a document to the same pod:
https://github.com/numerique-gouv/impress/pull/472/files#diff-259585b31c6635e6f972a090c26e576f7ae41f7cc7d3f65be7cf16c08fbd2f2fR80

documentName is used by HocusPocus to create a room; it is a mandatory parameter.

So both are needed, and they must be identical by the time the request reaches the server.
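
As a rough illustration of the two roles described above, here is a minimal Python sketch. It assumes the original URL looks like /collaboration/ws/?room=<document id> (the path comes from COLLABORATION_WS_URL, the room parameter from this thread); the function names are hypothetical and this is not the backend or y-provider code.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def room_from_original_url(original_url: str) -> Optional[str]:
    """Extract the `room` query parameter from the forwarded original URL."""
    query = parse_qs(urlparse(original_url).query)
    values = query.get("room")
    return values[0] if values else None


def room_matches_document(document_name: str, original_url: str) -> bool:
    """Mirror the server-side guard: the room must equal the document name."""
    room = room_from_original_url(original_url)
    return room is not None and room == document_name
```

The first helper corresponds to what the auth view needs (which document is being requested), and the second to the documentName !== roomParam check in onConnect: a mismatch means the access rights were computed for a different document than the one being opened.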

sampaccoud · 2024-12-07T21:42:04Z

src/frontend/servers/y-provider/src/server.ts

+   * Route to reset connections in a room
+   */
+  app.post(
+    routes.RESET_CONNECTIONS,


Do you need authentication on this view, or can it be abused?

This route is not for the clients, it is for the backend (signal), so we don't need or want collaboration_auth.
The security comes from the httpSecurity middlelayer; the "authentication" is here:
https://github.com/numerique-gouv/impress/pull/472/files#diff-4fa4cb104ce6aac1c7d731c6a4dbd6faaece1554a0052bfd6e4cf4f7a3f769acR28-R32

The keys have to match. We can strengthen the security if you want.
This endpoint does not give access to the document; at most a caller could reset a connection, which triggers a reconnection on the client side.

OK, understood. Then you can remove this ingress and access the service directly via its svc. Exposing an ingress is only needed to give access from the internet.

sampaccoud · 2024-12-07T21:46:13Z

src/helm/impress/values.yaml

+## @param ingressCollaborationApi.className IngressClass to use for the Ingress
+## @param ingressCollaborationApi.host Host for the Ingress
+## @param ingressCollaborationApi.path Path to use for the Ingress
+ingressCollaborationApi:


I didn't understand the need for another full url range. To me, the collaboration server should only be available through urls that are checked by Django as this is the only way to compute access rights. Let's not create things that we don't need yet and see later when the problem arises?

I do need it. I really tried not to create a new ingress, but together with Jacques this was the cleanest solution we found.
First, the /collaboration/api/ endpoints are callable only by the backend (Django); we don't want clients to use them, and they are secured with a key known only to the two servers.
Second, we don't want the backend to go through the collaboration-auth auth_url. collaboration-auth is there to secure client access; the backend has no notion of accesses or abilities, and it makes no sense for the backend to call itself again to obtain a key it already has.

To bypass the collaboration-auth auth_url, we need another ingress.

I first tried server to server (Django / Yjs server), but it only worked randomly, when the backend happened to connect to the right pod. The backend must connect to the same pod as the client.
Then I tried creating a ConfigMap so that I could configure the stickiness and skip the collaboration-auth auth_url depending on the route, but that is not the right approach: the ConfigMap is shared with services other than Impress, so it could break things.

If you have other ideas in mind, I'm curious to hear them.

See the comment above. We should remove this ingress.

YousefED

Nice! Left some comments. Note that I haven't looked in depth at the Python / nginx / kubernetes code as this is not my expertise.

One suggestion about the architecture to make it safer would be to periodically reauthenticate (refresh) websocket connections (to make sure the auth is still valid) instead of 100% relying on the "kick" mechanism - but I think it's

YousefED · 2024-12-09T04:06:50Z

src/frontend/servers/y-provider/__tests__/server.test.ts

+      }
+    });
+
+    await new Promise<void>((resolve) => {


You can just do: await hocuspocusServer.configure({ port: portWS }).listen()

Personally, I try to avoid .then and .catch as much as possible and prefer async / await

e7565f0#diff-28f2602f409539020f08398a838eea589448bd58ccf86f9670f9c06f1fbeeef4R28-R30

I added this helper, to reduce promise hell:
e7565f0#diff-7b4e113632ac6a5668894882898d4c59b8260390aa9c61ebe6962395e4b5c94e

YousefED · 2024-12-09T04:08:51Z

src/frontend/servers/y-provider/__tests__/server.test.ts

+      .set('Authorization', 'test-secret-api-key');
+
+    expect(response.status).toBe(200);
+    expect(response.body.message).toBe('Connections reset');


Do you want to test whether the connection has actually been reset (i.e. users "kicked")?

I added a check on hocuspocusServer.closeConnections. I haven't managed to write a full check: I think we are in a half-mocked environment with Jest, so users either never fully connect to the websocket, or they are kicked out because of a ping-pong exchange between the HocusPocus server and the HocusPocus Provider.

I also have an e2e test for it that asserts the disconnection and the reconnection. I will try to come back to these tests when I have more time.
096837a#diff-aabe2f42c0dee383e91e6ad77a11be21ddb2246bee1007ee1678dcc09c55557e

YousefED · 2024-12-09T04:13:09Z

src/frontend/servers/y-provider/src/server.ts

+        return;
+      }
+
+      const docConnection = await hocuspocusServer.openDirectConnection(room);


I think you don't need to call openDirectConnection; you can just access hocuspocusServer.documents

Actually, there seems to be a hocuspocusServer.closeConnections(docName)

Improved:
0297052#diff-931f2e9c315f6fc95eae032ee4f167e486cab63966647873520cb12d39d56fdfR106-R125

lebaudantoine · 2024-12-10T14:09:39Z

src/frontend/servers/y-provider/src/middlelayers.ts

@@ -0,0 +1,59 @@
+import { NextFunction, Request, Response } from 'express';


nit: should it be middlewares.ts?

src/frontend/servers/y-provider/src/middlelayers.ts

lebaudantoine · 2024-12-10T16:21:56Z

Discussed IRL: The current code organization is becoming confusing. Placing server code in the frontend folder is misleading, especially now that it functions as a backend microservice.

src/frontend/servers/y-provider/src/routes.ts

lebaudantoine · 2024-12-10T16:15:22Z

src/frontend/servers/y-provider/src/server.ts

-});
+  onConnect({ requestHeaders, connection, documentName, requestParameters }) {
+    const roomParam = requestParameters.get('room');
+    const canEdit = requestHeaders['x-can-edit'] === 'True' ? true : false;


nit: const canEdit = requestHeaders['x-can-edit'] === 'True'

src/frontend/servers/y-provider/src/server.ts

We want to use the same pattern for the websocket collaboration service authorization as what we use for media files. This addition comes in the next commit but doing it efficiently required factorizing some code with the media auth view.

We need to improve security on the access to the collaboration server. We can use the same pattern as for media files, leveraging the nginx subrequest feature.

Using "impress" as the name of minio's root user in Tilt's dev environment, was triggering obfuscation of the logs in Tilt's console each time the word "impress" was used. This made the logs hard to read.

We want to be able to reset the connections of a document. To do this, we need to be able to send a request to the collaboration server. To do so, we added the endpoint POST "/collaboration/api/reset-connections" to the collaboration server thanks to "express".

We add jest tests for the y-provider server. The CI will be able to run the tests.

When an access is updated or removed, the collaboration server is notified to reset that access's connection. Once disconnected, the client automatically reconnects through the nginx subrequest and so gets the correct rights. We do the same when the document link is updated, except that here we reset every connection to the document.
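
The notification described above could look roughly like the following sketch. It only builds the HTTP request and does not send it. The endpoint path, the Authorization header carrying the shared secret, and the room query parameter come from this PR; the function name build_reset_request and the user_id JSON body are assumptions made for illustration, not the actual CollaborationService implementation.

```python
import json
from typing import Optional
from urllib.parse import urlencode
from urllib.request import Request


def build_reset_request(
    api_url: str, secret: str, room: str, user_id: Optional[str] = None
) -> Request:
    """Build (without sending) a POST to the reset-connections endpoint."""
    # The room goes in the query string so the ingress can hash on it
    # and route the call to the pod that holds the document.
    endpoint = f"{api_url.rstrip('/')}/reset-connections?{urlencode({'room': room})}"
    # Assumed payload shape: omit user_id to reset every connection.
    payload = {"user_id": user_id} if user_id is not None else {}
    return Request(
        endpoint,
        data=json.dumps(payload).encode(),
        headers={"Authorization": secret, "Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would perform the call; keeping construction separate makes the URL, header, and body easy to assert on in tests.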

Add sentry to the collaboration server. It will be used to log errors and exceptions.

We need to keep the stickiness between the collaboration API and the WS server. To do so, we use "upstream-hash-by: $arg_room", meaning that the stickiness is based on the room query parameter. We need 2 ingresses to handle the "collaboration_auth": only the WS routes have to use the "collaboration_auth" subrequest.
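
The idea behind hashing on the room can be sketched as follows. This is a deliberately simplified modulo hash, not the algorithm the ingress controller actually uses, and the pod names are made up; it only shows why two requests carrying the same room value land on the same pod.

```python
import hashlib
from typing import List


def pick_pod(room: str, pods: List[str]) -> str:
    """Map a room to one pod deterministically (simplified illustration)."""
    digest = hashlib.sha256(room.encode()).digest()
    # Use the first 8 bytes of the digest as an integer, then wrap
    # it onto the list of available pods.
    index = int.from_bytes(digest[:8], "big") % len(pods)
    return pods[index]


# Hypothetical pod names, for illustration only.
PODS = ["y-provider-0", "y-provider-1", "y-provider-2"]
```

Because the choice depends only on the room value, the websocket connection of a client and the reset call from the backend reach the same pod as long as both carry the same room parameter and the pod list is unchanged.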

We remove the debounce on useHeadings: it degrades the user experience and is not necessarily a big performance improvement.

AntoLC added frontend backend helm collaboration labels Dec 3, 2024

AntoLC self-assigned this Dec 3, 2024

AntoLC changed the title ~~Collab improvement~~ Collaborationimprovement Dec 3, 2024

AntoLC changed the title ~~Collaborationimprovement~~ Collaboration improvement Dec 3, 2024

AntoLC changed the title ~~Collaboration improvement~~ Collaboration access improvement Dec 3, 2024

AntoLC force-pushed the pod-yjs-direct-auth branch 2 times, most recently from 53d3e96 to a7c1173 Compare December 5, 2024 14:16

lebaudantoine mentioned this pull request Dec 5, 2024

Add create document api endpoint #467

Merged

AntoLC force-pushed the pod-yjs-direct-auth branch 5 times, most recently from 9479a54 to 145f876 Compare December 5, 2024 20:50

AntoLC mentioned this pull request Dec 6, 2024

♻️Misc refacto #480

Merged

AntoLC force-pushed the pod-yjs-direct-auth branch 2 times, most recently from e53c70f to c6c4eec Compare December 6, 2024 14:39

AntoLC marked this pull request as ready for review December 6, 2024 15:03

AntoLC requested review from YousefED and sampaccoud December 6, 2024 15:03

sampaccoud reviewed Dec 7, 2024

YousefED reviewed Dec 9, 2024

lebaudantoine reviewed Dec 10, 2024

lebaudantoine mentioned this pull request Dec 10, 2024

Mardown converter endpoint #488

Merged

AntoLC force-pushed the pod-yjs-direct-auth branch 2 times, most recently from 0f9399b to a109366 Compare December 11, 2024 09:31

lebaudantoine reviewed Dec 11, 2024

sampaccoud and others added 9 commits December 11, 2024 11:16

✨(backend) add subrequest auth view for collaboration server

8ae5f95

We need to improve security on the access to the collaboration server. We can use the same pattern as for media files, leveraging the nginx subrequest feature.

🧑‍💻(helm) rename minio root user password

4ca5568

Using "impress" as the name of minio's root user in Tilt's dev environment, was triggering obfuscation of the logs in Tilt's console each time the word "impress" was used. This made the logs hard to read.

✅(y-provider) add tests for y-provider server

e7565f0

We add jest tests for the y-provider server. The CI will be able to run the tests.

📈(collaboration) add sentry

cb2bf25

Add sentry to the collaboration server. It will be used to log errors and exceptions.

⚡️(frontend) remove debounce on useHeadings

1e5976e

We remove the debounce on useHeadings: it degrades the user experience and is not necessarily a big performance improvement.

AntoLC force-pushed the pod-yjs-direct-auth branch from a109366 to 1e5976e Compare December 11, 2024 10:16

AntoLC requested a review from lebaudantoine December 11, 2024 13:06

AntoLC merged commit a8310fa into main Dec 11, 2024
15 of 16 checks passed

AntoLC deleted the pod-yjs-direct-auth branch December 11, 2024 13:54

AntoLC mentioned this pull request Dec 11, 2024

🔖(minor) release 1.9.0 #492

Merged
