Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improved Pelita server mode #777

Merged
merged 38 commits into from
May 22, 2024
Merged

Conversation

Debilski
Copy link
Member

@Debilski Debilski commented Oct 9, 2023

Will still need refinements (that I’m still working on) but already runs and closes #776 and #769.

The short news is you start a Pelita server like this:

pelita-server remote-server  --address 0.0.0.0 \
        --team ../BayesAvengers.py avengers \
        --team ../trilobots.py trilobots \
        --advertise 10.10.1.2

(--advertise is only needed when we want to use zeroconf)

This will start a server on port 41736 (which is the default for the new URL scheme) and we can run a match with both teams on the server as follows:

pelita --ascii pelita://10.10.1.2/avengers pelita://10.10.1.2/trilobots

(The specific IP should not be needed and could also be a hostname.)

When the network allows for using zeroconf then it should also be possible to use SCAN and detect all players on the server automatically.

It is not wildly incompatible to the old version but I did not put in any special effort to make it work with the current PyPI version (Pelita 2.3.1), so all clients will need an update in order to play against the new server.

@codecov-commenter
Copy link

codecov-commenter commented Oct 10, 2023

Codecov Report

Attention: Patch coverage is 95.83333% with 1 lines in your changes are missing coverage. Please review.

Project coverage is 85.64%. Comparing base (bb147a5) to head (e2e4f9c).
Report is 5 commits behind head on main.

Files Patch % Lines
pelita/team.py 93.33% 1 Missing ⚠️

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #777      +/-   ##
==========================================
- Coverage   85.74%   85.64%   -0.10%     
==========================================
  Files          21       21              
  Lines        2371     2383      +12     
==========================================
+ Hits         2033     2041       +8     
- Misses        338      342       +4     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@Debilski
Copy link
Member Author

It is not wildly incompatible to the old version but I did not put in any special effort to make it work with the current PyPI version (Pelita 2.3.1), so all clients will need an update in order to play against the new server.

Actually, the old clients do still work (also SCAN works) but they cannot select a player with the path element and will always play against the default player on the server.

@Debilski Debilski force-pushed the feature/pelita-server branch 3 times, most recently from 3881fe9 to 0a255d0 Compare October 16, 2023 22:23
@otizonaizit
Copy link
Member

Hey @Debilski : I am running remote games for my students using this branch during the holidays... please don't break it for the time being :)))

@otizonaizit
Copy link
Member

(including not force-pushing anymore :))) )

@otizonaizit
Copy link
Member

(by the way: the interface is quite cool, thanks!!!)

@Debilski
Copy link
Member Author

This is why they invented personal repositories and branches, you know. :)

But noted.

@otizonaizit
Copy link
Member

otizonaizit commented Dec 21, 2023 via email

@Debilski
Copy link
Member Author

Ah, ok, so you want me to still push but only the good commits. Got it. :D

@otizonaizit
Copy link
Member

otizonaizit commented Dec 21, 2023 via email

@Debilski
Copy link
Member Author

@otizonaizit nothing urgent but could you share some of the error messages from your server that were supposedly caused by bots?

@otizonaizit
Copy link
Member

otizonaizit commented Mar 11, 2024

Yes, but my guess is that they come from port-scanners/bots, not from legitimate bots:

  • type of error a):
Error JSONDecodeError('Expecting value: line 1 column 1 (char 0)') when parsing incoming message. Ignoring. pelita_server.py:172
  • type of error b):
Error UnicodeDecodeError('utf-32-be', b'\x00\x00\x00\x00\x00Cookie: mstshash=Administr\r\n\x01\x00\x08\x00\x03\x00\x00\x00', 4, 8, 'code point not in range(0x110000)') when parsing incoming message. Ignoring. pelita_server.py:172
  • type of error c):
Traceback (most recent call last):
File "/home/tiziano/software/virtualenvs/pelita/bin/pelita-server", line 8, in <module>
  sys.exit(main())
           ^^^^^^
File "/usr/lib/python3/dist-packages/click/core.py", line 1157, in __call__
  return self.main(*args, **kwargs)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/click/core.py", line 1078, in main
  rv = self.invoke(ctx)
       ^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/click/core.py", line 1688, in invoke
  return _process_result(sub_ctx.command.invoke(sub_ctx))
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/click/core.py", line 1434, in invoke
  return ctx.invoke(self.callback, **ctx.params)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/click/core.py", line 783, in invoke
  return __callback(*args, **kwargs)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/tiziano/git/pelita/pelita/scripts/pelita_server.py", line 425, in remote_server
  server.start()
File "/home/tiziano/git/pelita/pelita/scripts/pelita_server.py", line 316, in start
  dealer_id, message = self.router_sock.recv_multipart()
  ^^^^^^^^^^^^^^^^^^
ValueError: too many values to unpack (expected 2)
Use --session-key XXXX to for the admin API.  

@Debilski
Copy link
Member Author

Debilski commented Mar 19, 2024

Unnecessary for the fix, but I’ve found out why this happens and I will share it here so that future chat bots can help me better next time (looking at you ChatGPT 3.5).

In the version 1 format (there seem to be newer format specs that are a bit more complicated and also I’ve only managed to figure it out for the router type socket but anyway), a zmq message consists of a bunch of frames that are specified as octets of:

LENGTH, FLAGS, FRAME_DATA*

The length is the number of octets of data+1. I’ll get to the flags later. This is a frame:

echo -e '\x06\x00Hello'

A message that is accepted by a router socket consists of (minimum) two frames. One containing an optional identifier (the empty frame \x01\x00 will have zmq generate an id for us) from the sender and the data.

Adding a print statement to the pelita server:

    dealer_id, message = self.router_sock.recv_multipart()
    print(dealer_id, message)

and sending the message

echo -e '\x06\x00Hello\x07\x00Pelita'  | nc localhost 41736

prints

 b'Hello' b'Pelita'
[11:23:32] Error JSONDecodeError('Expecting value: line 1 column 1 (char 0)')   pelita_server.py:174
           when parsing incoming message. Ignoring.                                                 
Error JSONDecodeError('Expecting value: line 1 column 1 (char 0)') when parsing incoming message. 
Ignoring.

Obviously, b'Pelita' is not a proper json string so this rightly gets rejected and we found the cause of error type a).

Error type b) is simply sending something that cannot get parsed to unicode correctly.

For error type c) to occur we need to look at the FLAGS octet from the data frame. The logic is: If the final bit is 0 then this is the final frame. If it is set to 1 then we can just add another frame afterwards.

echo -e '\x06\x00Hello\x07\x01Pelita\x07\x00server'  | nc localhost 41736

This will the code to break as now self.router_sock.recv_multipart() receives three messages (which is kind of the only proper bug here as the other error types keep working).

Finally, why do we even see this so often? Malformed messages usually get quietly ignored by zmq, however the rdp format (run rdesktop against a pelita server and it will log something) can somehow fit the format and I guess is quite often tried by spam bots like so:

echo -e '\x03\x00\x00\x13\x0e\xd0\x00\x00\x00\x00\x00Cookie: mstshash=Admin'

The quick solution would be to ignore (and only trace log) everything that is so obviously malformed. (The only initial message that we accept is a json dict with a REQUEST key so there is a lot of messages that we can rightly ignore, but this still means that we have our json parser converting every random bot’s garbage so I’m still thinking about changing the format here slightly.)

@Debilski
Copy link
Member Author

The value errors with incoming messages will now only trigger a debug message (which won’t be shown usually).

@Debilski
Copy link
Member Author

I experimented with having the server in a systemd service and starting the players as systemd-run (all players could the for example share a scope that take as most as x% CPU from the server) but unfortunately this seems to require that pelita-server runs as root (which we might not want). With the server run as root it works fine.

In jupyterhub/systemdspawner#100 they lay out a solution for a similar situation using template unit files + polkit to start them. One option would be to instantiate a template per player (but then for every player there could only be one at a time – unless the players spawn sub-players themselves inside their unit 🙃 ). The other option would have us write startup configuration files on the fly and initiate the templates with the file name. (We need to have a way to pass a communication socket from server to player.)

I sense the need for a pelita for kubernetes. 😬

@otizonaizit
Copy link
Member

otizonaizit commented Apr 25, 2024 via email

@Debilski
Copy link
Member Author

Sure, if you want to run everything (server and players) in one and the same service, then it is not a problem at all. (My post was mostly meant as a future reference point.) I guess I should have made that more clear.

If you don’t care about a rogue player interfering with everything inside the service, then it is easily made secure for everyone else outside. You can run it as a DynamicUser and harden it almost as much as you like:

[Unit]
Description=Pelita server
Documentation=https://github.com/ASPP/pelita
After=network-online.target
Wants=network-online.target


[Service]
WorkingDirectory=/opt/pelita_server
ExecStart=/opt/pelita_server/venv/bin/pelita-server remote-server --address 0.0.0.0 --config /opt/pelita_server/config.yaml

DynamicUser=yes

ProtectHome=yes
ProtectSystem=strict

DevicePolicy=closed
KeyringMode=private
LockPersonality=yes
MemoryDenyWriteExecute=yes
NoNewPrivileges=yes
PrivateDevices=yes
PrivateTmp=yes
PrivateUsers=yes
ProtectClock=yes
ProtectControlGroups=yes
ProtectHostname=yes
ProtectKernelLogs=yes
ProtectKernelModules=yes
ProtectKernelTunables=yes
ProtectProc=invisible
RemoveIPC=yes
RestrictAddressFamilies=AF_UNIX AF_INET AF_INET6 AF_NETLINK
RestrictNamespaces=yes
RestrictRealtime=yes
RestrictSUIDSGID=yes

SystemCallArchitectures=native
SystemCallFilter=

CapabilityBoundingSet=~CAP_CHOWN CAP_FSETID CAP_SETFCAP
CapabilityBoundingSet=~CAP_DAC_OVERRIDE CAP_DAC_READ_SEARCH CAP_FOWNER CAP_IPC_OWNER
CapabilityBoundingSet=~CAP_LINUX_IMMUTABLE CAP_IPC_LOCK CAP_SYS_CHROOT CAP_BLOCK_SUSPEND CAP_LEASE 
CapabilityBoundingSet=~CAP_SETUID CAP_SETGID CAP_SETPCAP CAP_CHOWN CAP_NET_ADMIN 
CapabilityBoundingSet=~CAP_SYS_ADMIN CAP_SYS_BOOT CAP_SYS_PACCT CAP_SYS_PTRACE CAP_SYS_RAWIO CAP_SYS_TIME CAP_SYS_TTY_CONFIG 
CapabilityBoundingSet=~CAP_WAKE_ALARM  CAP_MAC_ADMIN CAP_MAC_OVERRIDE 

[Install]
WantedBy=multi-user.target
> systemd-analyze security pelita-server.service
→ Overall exposure level for pelita-server.service: 3.0 OK 🙂

@Debilski
Copy link
Member Author

(I copy-pasted the options together. Not sure what really is needed and what else we could activate. systemd-analyze security needs a bisect mode. :) )

@otizonaizit
Copy link
Member

otizonaizit commented Apr 25, 2024 via email

@Debilski
Copy link
Member Author

Debilski commented Apr 26, 2024

I managed to get it to run with socket activation. Now we can even turn off the network inside the service. (Score is now 2.6)

@otizonaizit
Copy link
Member

Hey @Debilski , is this branch ready to merge? It would be cool to have a release latest next week. For me it is important that the communication protocol is fixed. The server code can be refined later, as long as the client running a release can still talk to the server...

@Debilski
Copy link
Member Author

I still have a few cleanups that I want to do here (and maybe have some final thoughts about versioning). And there needs to be more documentation. Other than that, I plan to merge latest on Friday and I suppose we can make a release then.
Pelita 2.4 will be compatible to Pelita server 2.4; Pelita 2.5 might not be.

NB: Property is not added to the Bot API yet.
We change the old remote:tcp://ip:port schema for connecting to a pelita
server to use our own url scheme: pelita://ip/path (using a default
port). The supplied path is used to select a team on the server.

We switch to click for the server command line.

TODO: zeroconf for multiple players and API-based management controls
Debilski and others added 19 commits May 21, 2024 16:37
Too complicated to implement and rich more or less does the right thing
already when run as a background service.
This makes it easier for backends/middleends to capture the last state
that a player receives without having to parse it and ensure that it
actually contains a state.

A corollary to not doing that kind of deep package inspection is that,
once a game has been started, we cannot distinguish between control
messages and game info (set_initial/get_move) messages anymore and have
to assume that everything is a message that needs to be forwarded to the
backend player (unless we change our ZMQ protocoll to include an
additional control channel).
Adds an option to silence bots that speak too much
Previously, a pelita_player process would print its name when it was
started. Apart from giving a normal user the name info, it also served
as a tiny debugging hint in that it would tell a Pelita developer that
the player process had in fact started (and, more subtly, how long the
startup took).

However, this also required telling pelita_player on the command line
about its colour and necessitated an additional flag that told the
player to be quiet (when used on a server for example where this info
was not useful). In addition, games against a remote player would then
not show this info to a normal user.

This commit moves the welcome messages to the main game. Further work
needs to be done to properly give debugging hints for
slow/non-functional players.
When a port should be fixed, it must be explicitly given. In all other cases we let zmq guess a free port.
@Debilski Debilski marked this pull request as ready for review May 22, 2024 12:20
@Debilski
Copy link
Member Author

Debilski commented May 22, 2024

I think there will still be changes in the protocol in a future version. If we can live with this, I think we should merge the server now and refine it in the future.

@Debilski Debilski mentioned this pull request May 22, 2024
@otizonaizit
Copy link
Member

yes, merge!

@Debilski Debilski merged commit cf13d16 into ASPP:main May 22, 2024
25 checks passed
@Debilski Debilski deleted the feature/pelita-server branch May 22, 2024 14:01
github-actions bot pushed a commit that referenced this pull request May 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add zmq protocol to dispatch multiple players with --remote
3 participants