Offline negates all rules #190

Closed
vaidls opened this issue Dec 10, 2023 · 13 comments
@vaidls

vaidls commented Dec 10, 2023

Please check the app's offline behaviour.

The Linux daemon ignores the network connectivity status, which allows the user to bypass limits.

Could the daemon disconnect the user if network connectivity is lost, or is not present after OS startup?

Thanks

@marcus67 marcus67 self-assigned this Dec 10, 2023
@marcus67 marcus67 added bug Something isn't working high priority in progress labels Dec 10, 2023
@marcus67
Owner

@vaidls : I just checked. Kicking out all active users is the default once the connection to the server has been lost for maximum_time_without_send_events seconds, which defaults to 50 seconds. It takes some more seconds before the KILL command is executed but after about one minute the session should be killed.

When you look at the log output of the client, do you see entries like this:

2023-12-10 16:42:26,162 - AppControl - WARNING - No successful send events for 45 seconds
2023-12-10 16:42:26,162 - App - DEBUG - Executing task app_control.check 0.000 [s] behind schedule... *** END ***
2023-12-10 16:42:26,162 - App - DEBUG - Sleeping for 4.99542 seconds (or until next signal)
2023-12-10 16:42:31,157 - App - DEBUG - Woken by signal
2023-12-10 16:42:31,158 - App - DEBUG - Executing task app_control.check 0.000 [s] behind schedule... *** START ***
2023-12-10 16:42:31,158 - AppControl - DEBUG - Sending 0 event(s) to master
2023-12-10 16:42:31,159 - MasterConnector - DEBUG - Executing POST API call 'http://localhost-wifi:5560/api/events'
2023-12-10 16:42:31,161 - MasterConnector - ERROR - Unknown error code 'None' with message 'None' while accessing artifact 'http://localhost-wifi:5560/api/events'
2023-12-10 16:42:31,161 - AppControl - ERROR - Exception 'HTTPConnectionPool(host='localhost-wifi', port=5560): Max retries exceeded with url: /api/events (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f4473388310>: Failed to establish a new connection: [Errno 111] Connection refused'))' while sending events to master. Requeueing events...
2023-12-10 16:42:31,161 - ProcessHandlerManager - INFO - Artificially killing 1 active processes on handler ClientProcessHandler
2023-12-10 16:42:31,161 - EventHandler - DEBUG - Queue locally: AdminEvent (type=KILL_PROCESS, host=ute.client, user=leon, process=None, PID=790092)
2023-12-10 16:42:31,162 - ProcessHandlerManager - INFO - Artificially killing 0 active processes on handler ClientDeviceHandler
2023-12-10 16:42:31,162 - EventHandler - DEBUG - Processing AdminEvent (type=KILL_PROCESS, host=ute.client, user=leon, process=None, PID=790092)
2023-12-10 16:42:31,162 - ClientProcessHandler - DEBUG - Kill process 790092 of user leon on host ute.client with signal SIGHUP
2023-12-10 16:42:31,162 - ClientProcessHandler - INFO - Killing process 790092 of user 'leon' on host 'ute.client' with signal SIGHUP
2023-12-10 16:42:31,162 - ClientProcessHandler - DEBUG - Executing sudo command '/usr/bin/sudo /bin/kill -SIGHUP 790092'...
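
The timeout behaviour described above can be pictured with a minimal sketch. This is not LittleBrother's actual code: the class `SendWatchdog`, its methods, and the `>=` comparison are assumptions made for illustration. Only the setting name `maximum_time_without_send_events` and its default of 50 seconds come from the comment.

```python
import time
from typing import Callable

# Default taken from the comment above; everything else here is hypothetical.
MAXIMUM_TIME_WITHOUT_SEND_EVENTS = 50  # seconds


class SendWatchdog:
    """Tracks how long the client has gone without a successful send."""

    def __init__(
        self,
        max_silence: float = MAXIMUM_TIME_WITHOUT_SEND_EVENTS,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._max_silence = max_silence
        self._clock = clock
        # Treat construction time as the last known good contact.
        self._last_success = clock()

    def record_success(self) -> None:
        """Call after every successful POST of events to the master."""
        self._last_success = self._clock()

    def seconds_silent(self) -> float:
        """Seconds elapsed since the last successful send."""
        return self._clock() - self._last_success

    def should_kill_sessions(self) -> bool:
        """True once the silence has reached the configured maximum."""
        return self.seconds_silent() >= self._max_silence
```

A periodic task (every 5 seconds in the log above) would call `should_kill_sessions()` after each failed send and queue the kill events once it returns `True`.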

@marcus67
Owner

@vaidls I just tried the other scenario: the server is not available at the time a user logs into the client. In this case the user is kicked out immediately.
Could you describe the exact order of steps in which you experienced the issue? Thanks a lot!

@vaidls
Author

vaidls commented Dec 10, 2023

In my case, I disconnect the network before the restart, and the problem occurs after the restart.
If I disconnect the network after the restart, sessions are disconnected as expected.


This is a log after the restart (from offline to offline: "clean offline").


2023-12-10 18:39:49,691 - root - INFO - Started logging in CWD=/ using module python_base_app.log_handling
2023-12-10 18:39:49,691 - Configuration - INFO - Reading configuration file from '/etc/little-brother/little-brother.config'
2023-12-10 18:39:49,692 - root - INFO - Set logging level to INFO
2023-12-10 18:39:49,692 - Persistence - INFO - Database URL for normal access: 'sqlite://var/spool/little-brother/little-brother.sqlite.db'
2023-12-10 18:39:49,701 - Persistence - INFO - Checking whether to create database 'little_brother'...
2023-12-10 18:39:49,701 - App - INFO - Upgrading database to revision 'head' using alembic with working directory /var/lib/little-brother/virtualenv/lib/python3.11/site-packages/little_brother...
2023-12-10 18:39:49,706 - alembic.runtime.migration - INFO - Context impl SQLiteImpl.
2023-12-10 18:39:49,706 - alembic.runtime.migration - INFO - Will assume non-transactional DDL.
2023-12-10 18:39:49,712 - root - INFO - Starting daemon process...
2023-12-10 18:39:49,712 - App - INFO - Starting daemon...
2023-12-10 18:39:49,715 - App - INFO - Started daemon.
2023-12-10 18:39:49,715 - App - INFO - Starting app 'LittleBrother'
2023-12-10 18:39:49,715 - Persistence - INFO - Database URL for normal access: 'sqlite://var/spool/little-brother/little-brother.sqlite.db'
2023-12-10 18:39:49,718 - alembic.runtime.migration - INFO - Context impl SQLiteImpl.
2023-12-10 18:39:49,718 - alembic.runtime.migration - INFO - Will assume non-transactional DDL.
2023-12-10 18:39:49,720 - App - INFO - Database is on alembic version ed5e0310d209
2023-12-10 18:39:49,722 - Persistence - INFO - Open database for normal access
2023-12-10 18:39:49,736 - App - INFO - Module python_base_app is using Babel translations in /var/lib/little-brother/virtualenv/lib/python3.11/site-packages/python_base_app/translations
2023-12-10 18:39:49,737 - UnixUserHandler - INFO - Using admin user 'None'
2023-12-10 18:39:49,744 - UserManager - INFO - Watching usernames: teo,feja,minecraft
2023-12-10 18:39:49,745 - App - WARNING - Client instance will not start web server due to missing port number
2023-12-10 18:39:49,745 - DeviceActivationManager - INFO - No handlers for device activation registered.
2023-12-10 18:39:49,745 - AppControl - INFO - Starting application in CLIENT mode communicating with master at URL http://192.168.254.12:5580/api/
2023-12-10 18:39:49,745 - AppControl - INFO - Using fully qualified domain name 'teodoras' for process infos
2023-12-10 18:39:49,745 - App - INFO - Entering event queue...
2023-12-10 18:39:49,763 - UserManager - WARNING - Cannot find user information for user 'teo', will retry later...
2023-12-10 18:39:49,763 - UserManager - WARNING - Cannot find user information for user 'feja', will retry later...
2023-12-10 18:39:49,763 - UserManager - WARNING - Cannot find user information for user 'minecraft', will retry later...
2023-12-10 18:39:49,779 - MasterConnector - ERROR - Unknown error code 'None' with message 'None' while accessing artifact 'http://192.168.254.12:5580/api/events'
2023-12-10 18:39:49,779 - AppControl - ERROR - Exception 'HTTPConnectionPool(host='192.168.254.12', port=5580): Max retries exceeded with url: /api/events (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f857c7e4e10>: Failed to establish a new connection: [Errno 101] Network is unreachable'))' while sending events to master. Requeueing events...
2023-12-10 18:39:54,763 - UserManager - WARNING - Cannot find user information for user 'teo', will retry later...
2023-12-10 18:39:54,763 - UserManager - WARNING - Cannot find user information for user 'feja', will retry later...
2023-12-10 18:39:54,763 - UserManager - WARNING - Cannot find user information for user 'minecraft', will retry later...
2023-12-10 18:39:54,765 - MasterConnector - ERROR - Unknown error code 'None' with message 'None' while accessing artifact 'http://192.168.254.12:5580/api/events'
2023-12-10 18:39:54,765 - AppControl - ERROR - Exception 'HTTPConnectionPool(host='192.168.254.12', port=5580): Max retries exceeded with url: /api/events (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f857c5aa3d0>: Failed to establish a new connection: [Errno 101] Network is unreachable'))' while sending events to master. Requeueing events...
2023-12-10 18:39:59,763 - UserManager - WARNING - Cannot find user information for user 'teo', will retry later...
2023-12-10 18:39:59,764 - UserManager - WARNING - Cannot find user information for user 'feja', will retry later...
2023-12-10 18:39:59,764 - UserManager - WARNING - Cannot find user information for user 'minecraft', will retry later...
2023-12-10 18:39:59,765 - MasterConnector - ERROR - Unknown error code 'None' with message 'None' while accessing artifact 'http://192.168.254.12:5580/api/events'
:

@marcus67
Owner

@vaidls OK. Now I understand. The server is offline and the client is restarted. It tries to retrieve information about the state of the users but does not get any because the server is down. Without the information it will not start monitoring the users and will never terminate their processes.

I think there are two options: a) The client assumes that the processes of all configured users have to be terminated. b) The client persists the information about the users so that it can work with that as a preliminary state until corrected by the server.

I will look into this a little more next weekend. Option a) seems to be pretty easy to implement.

@vaidls
Author

vaidls commented Dec 10, 2023 via email

@marcus67
Owner

marcus67 commented Mar 3, 2024

@vaidls: Sorry, completely forgot about this. My new branch (not public yet) for the Angular front end is too tempting...

I checked the two options again. Option (a) wouldn't have worked since the mapping of usernames to uids would have been missing. Hence, I'm implementing option (b) now. It's almost done. The client will keep partial user configurations in an SQLite database, enough to kill processes upon startup in case the server is not reachable.
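
As a rough illustration of option (b), the sketch below persists a username-to-uid mapping with Python's standard `sqlite3` module and reads it back at startup. It is not the code from the actual commit: the table name `uid_mapping` and the functions `open_cache`, `save_mappings`, and `load_mappings` are hypothetical, and the uid values in the usage note are made up.

```python
import sqlite3
from typing import Dict


def open_cache(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the local cache; the schema is an assumption."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS uid_mapping ("
        "username TEXT PRIMARY KEY, uid INTEGER NOT NULL)"
    )
    return conn


def save_mappings(conn: sqlite3.Connection, mappings: Dict[str, int]) -> None:
    """Store the mappings last received from the server, overwriting old uids."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR REPLACE INTO uid_mapping (username, uid) VALUES (?, ?)",
            list(mappings.items()),
        )


def load_mappings(conn: sqlite3.Connection) -> Dict[str, int]:
    """Return the persisted mappings, e.g. at startup while the server is down."""
    rows = conn.execute("SELECT username, uid FROM uid_mapping").fetchall()
    return dict(rows)
```

While the server is reachable, the client would call something like `save_mappings(conn, {"teo": 1001, "feja": 1002})`. After a restart without network it could use `load_mappings(conn)` as preliminary state until the server corrects it.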

@marcus67
Owner

marcus67 commented Mar 3, 2024

Seems to work. Only one small issue remaining when the master starts up again. Will probably look into this tomorrow.

marcus67 added a commit that referenced this issue Mar 4, 2024
* Closes #190
* Persist uid mappings on the client. This is required just in case the client
  restarts with no server active to enable process termination.
@marcus67
Owner

marcus67 commented Mar 4, 2024

@vaidls: I fixed the problem. The Debian package for the master branch should be available. Could you test it, please? The Docker image did not build. If you need that, it will take a little more time. Thanks!

@vaidls
Author

vaidls commented Mar 5, 2024 via email

@vaidls
Author

vaidls commented Mar 8, 2024 via email

@marcus67
Owner

marcus67 commented Mar 9, 2024

@vaidls : This must have been temporary. I was able to download it just now with 19 downloads this week overall.

@marcus67
Owner

@vaidls Hi there! Have you had a chance to test? If it works, let me know, please. I would build a release then. Thanks!

@vaidls
Author

vaidls commented Mar 16, 2024 via email
