Sopel Appears to not know the Python PATH #1604

Closed
deathbybandaid opened this issue May 13, 2019 · 13 comments
Labels
Needs Info
Not Us: Issues that are not Sopel's responsibility, e.g. a bug in the environment or a dependency
Stale: Mostly used for PRs that no longer work and need updating before re-review/merge.

Comments

@deathbybandaid
Contributor

While using bot.restart(), the bot occasionally hangs during the shutdown procedure (about 1 in 4 uses of bot.restart()). This did not occur prior to the recent updates to the CLI.

Traceback:

Job Scheduler stopped.
Calling shutdown for 8 modules.

Then all the modules' shutdown procedures run.

Closed!
Traceback (most recent call last):
  File "/usr/local/bin/sopel", line 11, in <module>
    load_entry_point('sopel==6.6.8', 'console_scripts', 'sopel')()
  File "/usr/local/lib/python3.6/dist-packages/sopel/cli/run.py", line 654, in main
    return command(opts)
  File "/usr/local/lib/python3.6/dist-packages/sopel/cli/run.py", line 617, in command_legacy
    os.execv(sys.executable, ['python'] + sys.argv)
FileNotFoundError: [Errno 2] No such file or directory

The bot then hangs for a few minutes before it actually restarts:

Welcome to Sopel. Loading modules...

This is possibly related to the bot being run via systemd.

systemd file:

[Unit]
Description=Sopel IRC bot
Documentation=http://sopel.chat/
After=network.target

[Service]
Type=simple
User=sopel
PIDFile=/run/sopel/sopel-SpiceDBB.pid
ExecStart=/usr/local/bin/sopel -c /home/sopel/.sopel/SpiceDBB.cfg
Restart=on-failure
RestartPreventExitStatus=2
RestartSec=30
Environment=LC_ALL=en_US.UTF-8

[Install]
WantedBy=multi-user.target

I can provide additional information if need be.

@dgw
Member

dgw commented May 13, 2019

Went googling about for similar errors and found rkt/rkt#2322… Is this happening under SELinux?

I'm kinda grasping at whatever, since I don't have that much time to dig into this, but if you have a chance to run with strace (https://stackoverflow.com/a/49757349/5991) that could help narrow it down.

It's possible (even likely) that this has nothing to do with the CLI rework, and everything to do with how Sopel's restart function actually works. We might be able to work around it with tweaks to the example service unit files, or we might be able to change how restarting works when loading/startup is sufficiently refactored (@Exirel continues to make steady progress on that front). Or both. Who knows?
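
The traceback in the original report ends inside os.execv(sys.executable, ...), so one way to narrow this down is to check the interpreter path before re-executing. The following is only a minimal sketch under assumptions: build_restart_argv is a hypothetical helper, not part of Sopel, and it passes the interpreter path as argv[0] instead of the literal 'python' seen in the traceback. It turns an empty or missing sys.executable into an explicit error message instead of a bare FileNotFoundError.

```python
import os
import sys


def build_restart_argv(executable=None, argv=None):
    """Return (path, args) suitable for os.execv, or raise a clear error.

    Hypothetical helper for illustration; not Sopel's actual restart code.
    """
    executable = sys.executable if executable is None else executable
    argv = list(sys.argv if argv is None else argv)

    # sys.executable can be an empty string when Python cannot determine
    # the path of its own interpreter.
    if not executable:
        raise RuntimeError("sys.executable is empty; cannot re-exec")

    # os.execv does not search PATH, so the path must point at a real,
    # executable file.
    if not os.path.isfile(executable) or not os.access(executable, os.X_OK):
        raise RuntimeError(
            "interpreter not found or not executable: %r" % executable)

    return executable, [executable] + argv
```

A restart would then look like os.execv(*build_restart_argv()). Logging the returned path right before that call would show what the process believes its interpreter is at the moment the restart fails.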

@deathbybandaid
Contributor Author

I'm AFK, but it's Ubuntu 18.04.

@dgw
Member

dgw commented May 13, 2019

Ubuntu is a distro and SELinux is an access control module, but you're probably using AppArmor then instead unless something changed in recent Ubuntu versions. AppArmor and SELinux serve the same purpose, so you'll only have one or the other.

This Ubuntu bug looks possibly related: https://bugs.launchpad.net/cloud-images/+bug/1791691

@dgw dgw added the Needs Info and Not Us labels May 13, 2019
@Exirel
Contributor

Exirel commented May 14, 2019

And that's why you don't want to use old-style services with systemd and why you don't want to handle a restart yourself with systemd.

@dgw
Member

dgw commented May 15, 2019

@Exirel So, do you have a suggestion for handling this case, then? Should Sopel try to detect when it's running under systemd (or whatever other service manager) and refuse to honor the restart command? Or do we fix this by tweaking the example unit files?

I really want to avoid removing functionality from Sopel run as a standalone process just because it doesn't play nicely with Sopel run under a service manager. :/
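
For the "detect systemd" option, here is a rough sketch of what such a check could look like. It is a heuristic under assumptions, not an existing Sopel feature: running_under_systemd and may_self_restart are hypothetical names. It relies on systemd setting the INVOCATION_ID environment variable for the services it starts.

```python
import os


def running_under_systemd(environ=None):
    """Heuristic: True if the process looks like it was started by systemd.

    Hypothetical helper. INVOCATION_ID is inherited by child processes,
    so this can report True for anything launched from a systemd unit,
    not only for the service's main process.
    """
    environ = os.environ if environ is None else environ
    return bool(environ.get("INVOCATION_ID"))


def may_self_restart(environ=None):
    """Allow the in-process restart only when no service manager is seen."""
    return not running_under_systemd(environ)
```

A standalone Sopel would keep its restart command, while a unit-managed instance could answer with a hint to use systemctl restart instead.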

@Exirel
Contributor

Exirel commented May 15, 2019

@dgw well, if Sopel isn't run as a daemon (i.e. it's a new-style service), both the .restart command and sopel restart should be deactivated, and it should probably not handle its own PID file.

If it is run as a daemon, then the service unit for systemd must be configured for that purpose.

See also: the Type setting in systemd documentation.

If set to forking, it is expected that the process configured with ExecStart= will call fork() as part of its start-up. The parent process is expected to exit when start-up is complete and all communication channels are set up. The child continues to run as the main service process, and the service manager will consider the unit started when the parent process exits. This is the behavior of traditional UNIX services. If this setting is used, it is recommended to also use the PIDFile= option, so that systemd can reliably identify the main process of the service. systemd will proceed with starting follow-up units as soon as the parent process exits.
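
The start-up sequence described in that quote can be sketched in a few lines. This is a simplified illustration under assumptions, not Sopel's daemon code: fork_service and child_main are hypothetical names, and a real daemon would also redirect its standard streams. The point is the ordering: the PID file exists before the parent exits, because the service manager treats the parent's exit as "started".

```python
import os


def fork_service(pid_path, child_main):
    """Fork once; the parent records the child's PID and returns it.

    The child detaches, runs child_main(), and never returns to the caller.
    Hypothetical helper for illustration (POSIX only).
    """
    pid = os.fork()
    if pid == 0:
        # Child: this becomes the long-running service process.
        os.setsid()  # start a new session, detached from the terminal
        status = 1
        try:
            child_main()
            status = 0
        finally:
            # Leave without running the parent's cleanup handlers.
            os._exit(status)

    # Parent: write the PID file before returning, so that it is already
    # in place when the caller exits and systemd reads PIDFile=.
    with open(pid_path, "w") as handle:
        handle.write("%d\n" % pid)
    return pid
```

The caller would invoke fork_service(...) and then exit with status 0. On the systemd side this pairs with Type=forking and a PIDFile= entry that points at the same path.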

@dgw
Member

dgw commented May 15, 2019

Well, if passed -d, Sopel does fork() itself… So we should start by updating the example unit files to use Type=forking instead of simple.

@Exirel
Contributor

Exirel commented May 15, 2019

Exactly! :)

And one day I'll extract the whole daemon feature out of Sopel into a sopel-plugins-daemon package that brings sopel-restart & .restart.

@deathbybandaid
Contributor Author

Switching from simple to forking seems to have helped a lot.

@deathbybandaid
Contributor Author

I have restarted my bot several times and had thought the issue had gone away, but I just caught an instance of it.

However, I have been seeing it less frequently with forking.

@dgw
Member

dgw commented Nov 21, 2019

@deathbybandaid have you seen this at all recently on any master-based instances?

@deathbybandaid
Contributor Author

I'll have to check; I've been restarting with systemctl since restart() was a lottery.

@dgw dgw added the Stale label Apr 1, 2023
@dgw
Member

dgw commented Apr 1, 2023

Been a long time since this saw any activity, so let's close it out. It can be reopened if the symptom is observed again.

@dgw dgw closed this as not planned Apr 1, 2023

3 participants