Skip to content

Conversation

@AntoLC
Copy link
Collaborator

@AntoLC AntoLC commented Dec 3, 2024

Purpose

Access improvement on the collaboration server.

Proposal

  • ✨(backend) add subrequest auth view for collaboration server
  • ✨(y-provider) endpoint POST /collaboration/api/reset-connections
  • ✅(y-provider) add tests for y-provider server
  • ✨(backend) notify collaboration server
  • 🔧(helm) add ingress collaboration api

@AntoLC AntoLC self-assigned this Dec 3, 2024
@AntoLC AntoLC changed the title Collab improvement Collaborationimprovement Dec 3, 2024
@AntoLC AntoLC changed the title Collaborationimprovement Collaboration improvement Dec 3, 2024
@AntoLC AntoLC changed the title Collaboration improvement Collaboration access improvement Dec 3, 2024
@AntoLC AntoLC force-pushed the pod-yjs-direct-auth branch 2 times, most recently from 53d3e96 to a7c1173 Compare December 5, 2024 14:16
@AntoLC AntoLC force-pushed the pod-yjs-direct-auth branch 5 times, most recently from 9479a54 to 145f876 Compare December 5, 2024 20:50
@AntoLC AntoLC mentioned this pull request Dec 6, 2024
4 tasks
@AntoLC AntoLC force-pushed the pod-yjs-direct-auth branch 2 times, most recently from e53c70f to c6c4eec Compare December 6, 2024 14:39
@AntoLC AntoLC marked this pull request as ready for review December 6, 2024 15:03
@AntoLC AntoLC requested review from YousefED and sampaccoud December 6, 2024 15:03
Comment on lines +36 to +47
location /collaboration-auth {
proxy_pass http://app-dev:8000/api/v1.0/documents/collaboration-auth/;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Original-URL $request_uri;

# Prevent the body from being passed
proxy_pass_request_body off;
proxy_set_header Content-Length "";
proxy_set_header X-Original-Method $request_method;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see why this is necessary and not use the normal api url

Copy link
Collaborator Author

@AntoLC AntoLC Dec 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can maybe clear a bit.
We cannot use the normal api url, when we arrive here we are in the ngnix container, so we need to call the back with its docker compose service name.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's peer code because I don't see very clearly how it could be by just reviewing

Comment on lines +49 to +53
location /collaboration/api/ {
# Collaboration server
proxy_pass http://y-provider:4444;
proxy_set_header Host $host;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

then who is doing authentication on this route? 🤔

Copy link
Collaborator Author

@AntoLC AntoLC Dec 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This route will be used only by the django backend, so we don't want to pass by collaboration_auth.

Ngnix is here only to do the "sticky" to find automatically the good pods.
The auth is inside the middlelayer httpSecurity on the endpoint:
https://github.com/numerique-gouv/impress/blob/c6c4eec18f5e1b7de32d838dc81ff4a8421f7de3/src/frontend/servers/y-provider/src/server.ts#L83-L88

The security is mainly this part;
https://github.com/numerique-gouv/impress/blob/c6c4eec18f5e1b7de32d838dc81ff4a8421f7de3/src/frontend/servers/y-provider/src/middlelayers.ts#L28-L32

Comment on lines +823 to +844
def perform_update(self, serializer):
"""Update an access to the document and notify the collaboration server."""
access = serializer.save()

access_user_id = None
if access.user:
access_user_id = str(access.user.id)

# Notify collaboration server about the access change
CollaborationService().reset_connections(
str(access.document.id), access_user_id
)

def perform_destroy(self, instance):
"""Delete an access to the document and notify the collaboration server."""
instance.delete()

# Notify collaboration server about the access removed
CollaborationService().reset_connections(
str(instance.document.id), str(instance.user.id)
)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a signal directly on the DocumentAccess model is better place to do that because the access could be changed from elsewhere.

Copy link
Collaborator Author

@AntoLC AntoLC Dec 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I will do it in another PR if it is ok. It asks to modify lot of tests in the models.

Comment on lines +375 to +380
COLLABORATION_API_URL = values.Value(
None, environ_name="COLLABORATION_API_URL", environ_prefix=None
)
COLLABORATION_SERVER_SECRET = values.Value(
None, environ_name="COLLABORATION_SERVER_SECRET", environ_prefix=None
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would find it easier to understand if these 2 settings shared the same root:
COLLABORATION_SERVER_URL
COLLABORATION_SERVER_SECRET

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I put COLLABORATION_API_URL because we have as well COLLABORATION_WS_URL, both of them target the server somehow, we need to dissociate the base url as well because of docker compose:

COLLABORATION_API_URL=http://nginx:8083/collaboration/api/
COLLABORATION_WS_URL=ws://localhost:8083/collaboration/ws/

If you think it is better, I can change, no prob.


if (documentName !== roomParam) {
console.error(
'Invalid room name - Probable hacking attempt:',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you need the room parameter if you already have the info in documentName? 🤔

Copy link
Collaborator Author

@AntoLC AntoLC Dec 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

roomParam is needed for 2 things:

documentName is used by HocusPocus to create a room, it is a mandatory parameter.

So both of them are needed, they must be the same when we are on the server.

* Route to reset connections in a room
*/
app.post(
routes.RESET_CONNECTIONS,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you need authentication on this view or it can be abused?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This route is not for the clients, it is for the backend (signal), so we don't need / want the collaboration_auth.
The security is from the midlelayer httpSecurity, the "authentication" is here:
https://github.com/numerique-gouv/impress/pull/472/files#diff-4fa4cb104ce6aac1c7d731c6a4dbd6faaece1554a0052bfd6e4cf4f7a3f769acR28-R32

The keys have to match. We can increase the secu if you want.
This endpoint does not gives you access to the document, you could reset a connection, that will trigger a reconnection on the client side.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok understood. Then you can remove this ingress and access the service directly via its svc. Exposing an ingress is only to give access on internet.

## @param ingressCollaborationApi.className IngressClass to use for the Ingress
## @param ingressCollaborationApi.host Host for the Ingress
## @param ingressCollaborationApi.path Path to use for the Ingress
ingressCollaborationApi:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't understand the need for another full url range. To me, the collaboration server should only be available through urls that are checked by Django as this is the only way to compute access rights. Let's not create things that we don't need yet and see later when the problem arises?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need it, I really try to not create a new ingress, but with Jacques it was what we found the cleanest.
First of all, the endpoints /collaboration/api/ is callable only by the backend (django), we don't want clients to use them, the endpoints are secure with a key known only by the 2 servers.
Second things, we don't want the backend to pass by the auth_url collaboration-auth, collaboration-auth is made to secures clients access, the backend does not have notion of accesses or abilities, it does not make sense for the backend to call again himself, to get a key that it has already.

To avoid the auth_url collaboration-auth, we need to have another ingress.

I try first to server to server (django / yjs server), but it is working only aleatory, when the backend connect to the good pod. The backend must connect to the same pod as the client.
Then I try to create a ConfiMaps, like that I could indicate the stickiness, and depend the road avoid the auth_url collaboration-auth, but it is not the good way, the ConfigMaps is commun to other service than Impress, it can messed up things

If you have others things in mind, curious to hear about.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see above comment. We should remove this ingress.

Copy link
Collaborator

@YousefED YousefED left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! Left some comments. Note that I haven't looked in depth at the Python / nginx / kubernetes code as this is not my expertise.

One suggestion about the architecture to make it safer would be to periodically reauthenticate (refresh) websocket connections (to make sure the auth is still valid) instead of 100% relying on the "kick" mechanism - but I think it's

}
});

await new Promise<void>((resolve) => {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can just do await hocuspocusServer .configure({ port: portWS, }) .listen()

Personally, I try to avoid .then and .catch as much as possible and prefer async / await

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.set('Authorization', 'test-secret-api-key');

expect(response.status).toBe(200);
expect(response.body.message).toBe('Connections reset');
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you want to test whether the connection has actually been reset (i.e.: users "kicked"?

Copy link
Collaborator Author

@AntoLC AntoLC Dec 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a check on hocuspocusServer.closeConnections, I don't succeed to do a full check test, I think we are in half mock env with Jest, so users don't really connect totally to the websocket, or they are kicked out because of a ping pong thing between the HocusPocus server and the HocusPocus Provider?

I have as well a e2e test on it, that assert the deconnection and the reconnection. I will try to come back on these tests when I will get more times.
096837a#diff-aabe2f42c0dee383e91e6ad77a11be21ddb2246bee1007ee1678dcc09c55557e

return;
}

const docConnection = await hocuspocusServer.openDirectConnection(room);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you don't need to call openDirectConnection, but it you can just access hocuspocusServer.documents

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, there seems to be a hocuspocusServer.closeConnections(docName)

Copy link
Collaborator Author

@AntoLC AntoLC Dec 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -0,0 +1,59 @@
import { NextFunction, Request, Response } from 'express';
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: should it be middlewares.ts?

@lebaudantoine
Copy link
Contributor

Discussed IRL: The current code organization is becoming confusing. Placing server code in the frontend folder is misleading, especially now that it functions as a backend microservice.

@AntoLC AntoLC force-pushed the pod-yjs-direct-auth branch 2 times, most recently from 0f9399b to a109366 Compare December 11, 2024 09:31
});
onConnect({ requestHeaders, connection, documentName, requestParameters }) {
const roomParam = requestParameters.get('room');
const canEdit = requestHeaders['x-can-edit'] === 'True' ? true : false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: const canEdit = requestHeaders['x-can-edit'] === 'True'

sampaccoud and others added 9 commits December 11, 2024 11:16
We want to use the same pattern for the websocket collaboration service
authorization as what we use for media files.

This addition comes in the next commit but doing it efficiently
required factorizing some code with the media auth view.
We need to improve security on the access to The collaboration server
We can use the same pattern as for media files leveraging the nginx
subrequest feature.
Using "impress" as the name of minio's root user in Tilt's
dev environment, was triggering obfuscation of the logs in Tilt's
console each time the word "impress" was used.
This made the logs hard to read.
We want to be able to reset the connections of a document.
To do this, we need to be able to send a
request to the collaboration server.
To do so, we added the endpoint
POST "/collaboration/api/reset-connections"
to the collaboration server thanks to "express".
We add jest tests for the y-provider server.
The CI will be able to run the tests.
When an access is updated or removed, the
collaboration server is notified to reset the
access connection; by being disconnected, the
accesses will automatically reconnect by passing
by the ngnix subrequest, and so get the good
rights.
We do the same system when the document link is
updated, except here we reset every access
connection.
Add sentry to the collaboration server.
It will be used to log errors and exceptions.
We need to keep the stickyness between the
collaboration api and the ws server, to do so,
we will use "upstream-hash-by: $arg_room", meaning
that the stickyness will be based on the room query.
We need to ahve 2 ingress to handle the
"collaboration_auth", only the ws routes has to
use the "collaboration_auth" subrequest.
We remove the debounce on useHeadings, it
decreases the user experience and it's not
necessary a big performance improvement.
@AntoLC AntoLC force-pushed the pod-yjs-direct-auth branch from a109366 to 1e5976e Compare December 11, 2024 10:16
@AntoLC AntoLC requested a review from lebaudantoine December 11, 2024 13:06
@AntoLC AntoLC merged commit a8310fa into main Dec 11, 2024
15 of 16 checks passed
@AntoLC AntoLC deleted the pod-yjs-direct-auth branch December 11, 2024 13:54
@AntoLC AntoLC mentioned this pull request Dec 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants