Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Kolibri 0.12.8 fails to start on RPi 3 B+ as part of IIAB 7.0 #1932

Closed
holta opened this issue Aug 16, 2019 · 6 comments
Closed

Kolibri 0.12.8 fails to start on RPi 3 B+ as part of IIAB 7.0 #1932

holta opened this issue Aug 16, 2019 · 6 comments
Milestone

Comments

@holta
Copy link
Member

holta commented Aug 16, 2019

Thanks @shanti-bhardwa for this report, from his 2GB RAM RPi 4 and RPi 3 B+:

IIAB 7.0 Pre-Release build on 2019-07-10-raspbian-buster-lite.img

Kolibri info is

Version: 0.12.8
OS: Linux-4.19.58-v7l+-armv7l-with-debian-10.0
Installer: whl
Server: Kolibri internal server
Database: /library/kolibri/db.sqlite3
Device name: box.lan
Free disk space: 7 GB
Server time: Fri Aug 16 2019 11:25:54 GMT+0100 (British Summer Time)
Server timezone: Europe/London

Kolibri starts on first boot:

pi@box:~ $ systemctl status kolibri
â—� kolibri.service - Kolibri
Loaded: loaded (/etc/systemd/system/kolibri.service; enabled; vendor preset:
Active: active (running) since Fri 2019-08-16 11:21:47 BST; 18min ago
Process: 372 ExecStart=/usr/local/kolibri/bin/kolibri start (code=exited, stat
Main PID: 1216 (kolibri)
Tasks: 57 (limit: 4035)
Memory: 100.7M
CGroup: /system.slice/kolibri.service
└─1216 /usr/local/kolibri/bin/python2 /usr/local/kolibri/bin/kolibri

Aug 16 07:58:45 box.lan kolibri[372]: INFO Running Kolibri with the followin
Aug 16 11:21:46 box.lan kolibri[372]: INFO Sqlite database Vacuum finished.
Aug 16 11:21:47 box.lan kolibri[372]: INFO Running 'kolibri start' as daemon
Aug 16 11:21:47 box.lan kolibri[372]: INFO Kolibri running on:
Aug 16 11:21:47 box.lan kolibri[372]:
Aug 16 11:21:47 box.lan kolibri[372]: http://127.0.0.1:8009/
Aug 16 11:21:47 box.lan kolibri[372]: http://192.168.1.39:8009/
Aug 16 11:21:47 box.lan kolibri[372]: http://172.18.96.1:8009/
Aug 16 11:21:47 box.lan kolibri[372]: INFO Going to daemon mode, logging to
Aug 16 11:21:47 box.lan systemd[1]: Started Kolibri.

After loading a few channels and the rebooting, fails on boot.

pi@box:~ $ systemctl status kolibri
â—� kolibri.service - Kolibri
Loaded: loaded (/etc/systemd/system/kolibri.service; enabled; vendor preset:
Active: failed (Result: timeout) since Fri 2019-08-16 14:07:55 BST; 3min 22s
Process: 343 ExecStart=/usr/local/kolibri/bin/kolibri start (code=killed, sign

Aug 16 14:05:56 box.lan systemd[1]: Starting Kolibri...
Aug 16 14:05:57 box.lan kolibri[343]: WARNING:root:No C Extensions available for
Aug 16 14:06:01 box.lan kolibri[343]: INFO Option HTTP_PORT in section [Depl
Aug 16 14:06:01 box.lan kolibri[343]: INFO Option URL_PATH_PREFIX in section
Aug 16 14:06:02 box.lan kolibri[343]: INFO Running Kolibri with the followin
Aug 16 14:07:55 box.lan systemd[1]: kolibri.service: Start operation timed out.
Aug 16 14:07:55 box.lan systemd[1]: kolibri.service: Control process exited, cod
Aug 16 14:07:55 box.lan systemd[1]: kolibri.service: Failed with result 'timeout
Aug 16 14:07:55 box.lan systemd[1]: Failed to start Kolibri.

Ran the Server on PI 3B+ as well, but it would not start

sudo systemctl start kolibri
Job for kolibri.service failed because a timeout was exceeded.
See "systemctl status kolibri.service" and "journalctl -xe" for details.

journalctl -xe
--
-- An ExecStart= process belonging to unit kolibri.service has exited.
--
-- The process' exit code is 'killed' and its exit status is 15.
Aug 16 15:13:45 box.lan systemd[1]: kolibri.service: Failed with result 'timeout
-- Subject: Unit failed
-- Defined-By: systemd
-- Support: https://www.debian.org/support
--
-- The unit kolibri.service has entered the 'failed' state with result 'timeout'
Aug 16 15:13:45 box.lan systemd[1]: Failed to start Kolibri.
-- Subject: A start job for unit kolibri.service has failed
-- Defined-By: systemd
-- Support: https://www.debian.org/support
--
-- A start job for unit kolibri.service has finished with a failure.
--
-- The job identifier is 837 and the job result is failed.
Aug 16 15:15:01 box.lan CRON[2719]: pam_unix(cron:session): session opened for u
Aug 16 15:15:01 box.lan CRON[2718]: pam_unix(cron:session): session opened for u
Aug 16 15:15:01 box.lan CRON[2726]: (root) CMD (if [ -x /etc/munin/plugins/apt_a
Aug 16 15:15:01 box.lan CRON[2727]: (munin) CMD (if [ -x /usr/bin/munin-cron ];
Aug 16 15:15:01 box.lan CRON[2718]: pam_unix(cron:session): session closed for u
lines 2269-2291/2291 (END)
--
-- An ExecStart= process belonging to unit kolibri.service has exited.
--
-- The process' exit code is 'killed' and its exit status is 15.
Aug 16 15:13:45 box.lan systemd[1]: kolibri.service: Failed with result 'timeout'.
-- Subject: Unit failed
-- Defined-By: systemd
-- Support: https://www.debian.org/support
--
-- The unit kolibri.service has entered the 'failed' state with result 'timeout'.
Aug 16 15:13:45 box.lan systemd[1]: Failed to start Kolibri.
-- Subject: A start job for unit kolibri.service has failed
-- Defined-By: systemd
-- Support: https://www.debian.org/support
--
-- A start job for unit kolibri.service has finished with a failure.
--
-- The job identifier is 837 and the job result is failed.
Aug 16 15:15:01 box.lan CRON[2719]: pam_unix(cron:session): session opened for user munin by (uid=0)
Aug 16 15:15:01 box.lan CRON[2718]: pam_unix(cron:session): session opened for user root by (uid=0)
Aug 16 15:15:01 box.lan CRON[2726]: (root) CMD (if [ -x /etc/munin/plugins/apt_all ]; then /etc/munin/plugins/apt_all update 7200 12 >/dev/null;
Aug 16 15:15:01 box.lan CRON[2727]: (munin) CMD (if [ -x /usr/bin/munin-cron ]; then /usr/bin/munin-cron; fi)
Aug 16 15:15:01 box.lan CRON[2718]: pam_unix(cron:session): session closed for user root
~

on Pi 4 it can be restarted

@holta holta added this to the 7.0 milestone Aug 16, 2019
@holta
Copy link
Member Author

holta commented Aug 16, 2019

Thanks @shanti-bhardwa for the report. @jvonau asks if you can investigate 2 possible solutions here:

  • Does adding TimeoutStartSec=180 under [Service] help, within unit file /etc/systemd/system/kolibri.service ? (And then rebooting to test!)
  • Is it possible Kolibri is waiting on the back-end database to become available? Can you try to figure out how long it takes for the database to start up from its log file and compare when Kolibri and MySQL are started?
    • If that's confirmed, something like After=mariadb.service in the above unit file might stage/order things better.

@jvonau
Copy link
Contributor

jvonau commented Aug 19, 2019

After loading a few channels and the rebooting, fails on boot.

I'd try to load the channels one at a time while trying 'systemctl stop kolibri' 'systemctl start kolibri' between each addition, just to get a feel for the overhead that is introduced with each new channel.

Aug 16 07:58:45 box.lan kolibri[372]: INFO Running Kolibri with the followin
Aug 16 11:21:46 box.lan kolibri[372]: INFO Sqlite database Vacuum finished.
Aug 16 11:21:47 box.lan kolibri[372]: INFO Running 'kolibri start' as daemon

Looks like the backend is Sqlite when the service starts

Aug 16 14:05:56 box.lan systemd[1]: Starting Kolibri...
Aug 16 14:05:57 box.lan kolibri[343]: WARNING:root:No C Extensions available for
Aug 16 14:06:01 box.lan kolibri[343]: INFO Option HTTP_PORT in section [Depl
Aug 16 14:06:01 box.lan kolibri[343]: INFO Option URL_PATH_PREFIX in section
Aug 16 14:06:02 box.lan kolibri[343]: INFO Running Kolibri with the followin
Aug 16 14:07:55 box.lan systemd[1]: kolibri.service: Start operation timed out.

While the failed start didn't finish the db maintenance

I'd confirm at the cmdline that 'systemctl stop kolibri' shuts down the service in systemd's default timeframe, or if you see failed to stop kolibri during shutdown, if either of these are failing there might be a dirty database delaying the next start of the service.

@jvonau
Copy link
Contributor

jvonau commented Aug 23, 2019

From chat:

database locked/db vacuum times

shanti, OK made the change for TimeoutStartSec=180. After couple of reboots, Kolibri has started. So will keep testing and see if this has fixed the problem.
shanti, testing so far: on PI 4 Kolibri start up at boot time is more reliable,.even after several reboots and in between Channel loads. Must allow at least 180 sec from boot time before accessing kolibri. on PI 3B + works as well, but not all the time. Tried it on Pi 2B Plus as well. Slow and timeouts still occur. Will continue with PI 4.I am not convinced that this is purely a timeout issue

Upstream discussion:
kolibri never shuts down cleanly
vacuum does not work properly for large DBs in 0.12.x

@jvonau
Copy link
Contributor

jvonau commented Aug 25, 2019

jvonau added a commit to jvonau/iiab that referenced this issue Aug 25, 2019
holta added a commit that referenced this issue Sep 3, 2019
Kolibri systemd unit file...solves its failure-to-start on RPi 3 B+ especially? #1932
@holta
Copy link
Member Author

holta commented Sep 3, 2019

@shanti-bhardwa can we close this ticket in the coming week perhaps, now that PR #1936 is merged?

In any case, please do monitor as Kolibri 0.12.9 is likely around the corner -- whether before or after IIAB 7.0 final release this month -- that we should coordinate with your own deadlines in coming days.

@holta
Copy link
Member Author

holta commented Sep 30, 2019

Can't reproduce this over the past 1.5 months. Please re-open as nec.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants