after dist-upgrade "A start job is running for lsb: raise network interface" @boot #111

Closed
RafaelKa opened this issue Oct 2, 2015 · 32 comments

Comments

@RafaelKa

RafaelKa commented Oct 2, 2015

I installed http://mirror.igorpecovnik.com/Armbian_4.4_Bananapi_Debian_jessie_4.2.2.zip and ran apt-get dist-upgrade. Now my BPI no longer boots; it halts on "A start job is running for lsb: raise network interface".

@igorpecovnik
Member

Jessie and dist-upgrade are not a good combination. Probably this is the problem:
http://forum.armbian.com/index.php/topic/241-first-run-stuck-with-ssh-key-generation-cubietruck-rtc-bug/

@ramki982

ramki982 commented Oct 3, 2015

I took the latest Armbian 4.4, built it from source and installed it on a fresh SD card to boot my Lamobo-R1. I'm stuck with the same issue as above (it was not an upgrade).

I tried using a STATIC configuration for the interfaces.default instead of dhcp - with that it was able to continue further but I got a

[FAILED] Failed to start Login Service.
See 'systemctl status systemd-logind.service' for details.
[ OK ] Started LSB: Get some info about hardware for some A...e basic things.

It halted here. When I did a power reset, it went through without errors, but I'm NOT able to LOGIN.

@igorpecovnik
Member

Ethernet is not brought up because the RTC has invalid data ... I haven't had time to investigate the real reason behind it, but if you set the RTC before bringing eth up, then things work. I made a simple workaround for Wheezy and Trusty, but that one doesn't work here because of systemd. I decided I'll remove systemd ASAP since its advantages are minimal. This is not the first time Jessie is not booting.

@ramki982

ramki982 commented Oct 3, 2015

Thanks Igor, Do you have any clues as to why LOGIN service failed even if I used a static network configuration to get past that Ethernet issue?

@igorpecovnik
Member

No clue at the moment, but I am almost sure that systemd is the one which can't cope with this situation. Try to build the image this way: http://without-systemd.org/wiki/index.php/How_to_remove_systemd_from_a_Debian_jessie/sid_installation

@ramki982

ramki982 commented Oct 3, 2015

Let me try that & update you

@ramki982

ramki982 commented Oct 3, 2015

On a related note - I had a built ARMBIAN image based on 4.2-rc7. It was stable on Lamobo-r1.

In that image, the directory in /lib/modules used to be called 4.2.0-rc7-lamobo-r1. Now it seems to be called 4.2-sunxi

Is this difference in directory name giving u any clues?

Thx again for all your awesome work

@ramki982

ramki982 commented Oct 3, 2015

That older build based on 4.2-rc7 was built a few weeks back

@igorpecovnik
Member

I merged the kernels under the same name, sunxi, since they are the same.

This problem emerged in a recent kernel / u-boot version ... and sooner or later we'll fix it somehow ;)

@ramki982

ramki982 commented Oct 3, 2015

I'm puzzled because I'm able to run the 4.2.0 jessie image I got from your site http://www.armbian.com/lamobo-r1/ (that was released couple of weeks back, I guess it was Armbian 4.3 - don't remember)

But when I try to build it from git sources locally (by setting the KERNELTAG to 4.2.0 in compile.sh) and use that image on my board, I run into this problem.

So if possible - I wanted to use git checkout to move Armbian scripts to a point when you had built 4.2 image & then try building it locally.

If possible - can you share the date when the previous 4.2.0 jessie image was built?

@igorpecovnik
Member

The R1 u-boot is very different: in the previous version I used an old one (2015.04), now I am using a recent one.

To do that you will need to use the older patch too:
https://github.com/igorpecovnik/lib/blob/second/patching.sh#L210

This one works only on recent u-boot. Bottom up. Not a trivial task.

It's not just the kernel which can cause you trouble ;)

@ramki982

ramki982 commented Oct 3, 2015

Ah ok got it :)

@zador-blood-stained
Member

I tested two installations on two SD cards - Jessie 4.4 with mainline kernel and Stretch (testing) with 3.4.109, my device is cubietruck.
From dmesg on legacy kernel:
<6>sunxi-rtc sunxi-rtc: setting system clock to 2015-10-03 09:33:34 UTC (1443864814)
From serial console on mainline:
[ 3.672352] sunxi-rtc 1c20d00.rtc: setting system clock to 2085-10-03 09:21:01 UTC (3652939261)
and then console gets spammed with

Set up TFD_TIMER_CANCEL_ON_SET timerfd.
Time has been changed
Set up TFD_TIMER_CANCEL_ON_SET timerfd.
Time has been changed
Set up TFD_TIMER_CANCEL_ON_SET timerfd.

which, I believe, is this bug: systemd/systemd/issues/1143

The thing is, the day, month and time read correctly in both cases, and only the year is off by 70.

Comparing the mainline and legacy RTC drivers, it looks like they use different offsets when setting the year on the RTC, which leads to the 70-year difference.
Hope this helps.
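
The two epoch values quoted from the logs can be decoded to confirm that only the year differs. This is a minimal sketch, assuming GNU date (the `-d @EPOCH` form is a GNU extension) and a 64-bit time_t; the variable names are illustrative only.

```shell
#!/bin/sh
# Epoch values copied from the two kernel log lines above.
legacy_epoch=1443864814     # legacy 3.4 kernel
mainline_epoch=3652939261   # mainline kernel

# Decode each epoch in UTC (GNU date syntax).
legacy_year=$(date -u -d "@$legacy_epoch" +%Y)
mainline_year=$(date -u -d "@$mainline_epoch" +%Y)
legacy_day=$(date -u -d "@$legacy_epoch" +%m-%d)
mainline_day=$(date -u -d "@$mainline_epoch" +%m-%d)

echo "legacy:   $legacy_year-$legacy_day"
echo "mainline: $mainline_year-$mainline_day"
echo "year difference: $((mainline_year - legacy_year))"
```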

@zador-blood-stained
Member

I "hacked" sunxi rtc driver to get correct year on mainline (just for testing), but boot process still was stuck at "Raise network interfaces".
After some testing and searching I found this, replaced auto eth0 with allow-hotplug eth0 in /etc/network/interfaces, and now it can finally boot up (it still takes about half a minute to bring up network while systemctl status networking shows udevadm settle in process tree).
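
A rough sketch of the interfaces change described above, assuming GNU sed (for `-i`) and an `auto` line that names only the one interface. The function name `use_hotplug` is made up for illustration.

```shell
#!/bin/sh
# Replace "auto <iface>" with "allow-hotplug <iface>" in an interfaces file.
# Usage: use_hotplug FILE [IFACE]   (IFACE defaults to eth0)
use_hotplug() {
    file=$1
    iface=${2:-eth0}
    # Keep a backup so the change is easy to revert.
    cp "$file" "$file.bak" &&
    sed -i "s/^auto[[:space:]]\{1,\}${iface}[[:space:]]*\$/allow-hotplug ${iface}/" "$file"
}

# Example (as root): use_hotplug /etc/network/interfaces eth0
```

A line such as `auto lo eth0` is not matched by this pattern and would need to be edited by hand.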

@ramki982

ramki982 commented Oct 3, 2015

Can u share the diff to sunxi rtc driver so that I can try this. Thanks a bunch for sharing your findings

@zador-blood-stained
Member

In file drivers/rtc/rtc-sunxi.c, line 123, I replaced
#define SUNXI_YEAR_OFF(x) ((x)->min - 1900)
with
#define SUNXI_YEAR_OFF(x) 0
This removes 70 years offset and in my case allows booting mainline kernel after legacy and vice versa on cubietruck, but:

  1. I don't think it is a good idea to mess around with kernel drivers without fully understanding possible consequences on all hardware, and
  2. I think it's better to patch legacy kernel to be compatible with mainline.

Please use this only for testing purposes and make sure you can boot another (not systemd based) distribution and restore RTC time if something goes wrong.

@igorpecovnik
Member

It helps, at least we start to deal with the problem ;) I agree with 2. ... I'll take a look too.

I also change the script and add new option for systemd yes or no. If we don't find a proper solution in a short time ... :)

@zador-blood-stained
Member

Another way to boot jessie with systemd broken by wrong RTC time is even simpler and does not require recompiling kernel - you can just recompile .dtb file for target system with dtc, commenting out or removing RTC section. Again, this does not solve network issues, but it allows using systemd early debug shell.

/* <-- commented out
        rtc: rtc@01c20d00 {
            compatible = "allwinner,sun7i-a20-rtc";
            reg = <0x01c20d00 0x20>;
            interrupts = <GIC_SPI 24 IRQ_TYPE_LEVEL_HIGH>;
        };
*/
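
If only the compiled .dtb is at hand, the round trip through dtc could look roughly like this. It is a sketch under assumptions: the helper names and file names are placeholders, and in decompiled output the node shows up with plain numbers instead of macros such as GIC_SPI.

```shell
#!/bin/sh
# Round-trip a compiled device tree through dtc so it can be edited by hand.
# Paths are passed as arguments; nothing here is specific to one board.

# Decompile DTB -> DTS source.
dtb_to_dts() {
    dtc -I dtb -O dts -o "$2" "$1"
}

# Compile edited DTS -> DTB.
dts_to_dtb() {
    dtc -I dts -O dtb -o "$2" "$1"
}

# Example session (file names are placeholders):
#   cp board.dtb board.dtb.orig          # keep the original
#   dtb_to_dts board.dtb board.dts
#   $EDITOR board.dts                    # comment out or delete the rtc node
#   dts_to_dtb board.dts board.dtb
```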

@zador-blood-stained
Member

By the way, Igor, you can have both systemd and sysvinit installed in Jessie, and switch between them by using different u-boot scripts, if you remove only systemd-sysv (sysvinit-core should be installed instead of it).
To boot into systemd, you will need to add init=/bin/systemd to bootargs in boot.cmd, and without this parameter it will boot with sysvinit.
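
As a sketch of the bootargs change, assuming boot.cmd sets bootargs on a single `setenv bootargs ...` line and that GNU sed is available. The helper name `add_init_arg` and the /boot paths are assumptions, not taken from the Armbian scripts.

```shell
#!/bin/sh
# Append an init= argument to the "setenv bootargs" line of a u-boot boot.cmd.
# Usage: add_init_arg BOOT_CMD INIT_PATH
add_init_arg() {
    cmd_file=$1
    init_path=$2
    # "|" is used as the sed delimiter because INIT_PATH contains slashes.
    sed -i "/^setenv bootargs /s|\$| init=${init_path}|" "$cmd_file"
}

# Example (as root; the paths below are assumptions about the image layout):
#   add_init_arg /boot/boot.cmd /bin/systemd
#   mkimage -C none -A arm -T script -d /boot/boot.cmd /boot/boot.scr
```

Keeping an unmodified copy of boot.cmd and boot.scr next to the edited ones makes it possible to switch back to sysvinit by swapping the files.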

@igorpecovnik
Member

Thanks for the tip. I'll prepare configuration for both ... I did some testing and comparison with 4.1.6 and the RTC part hasn't changed for a while. Huh. Also u-boot has no effect.

@zador-blood-stained
Member

From the systemd bug discussion mentioned above I'm assuming that the reason for the RTC issues is that the mainline kernel might have switched from 32-bit time_t (or another time-related type) to 64-bit, so before, instead of year 2085, we would just have had an overflow and a wrong date and time, but now we get all kinds of bugs in userspace.
Right now I'm compiling patched 3.4.109, and if it works like I want it to and doesn't break anything, I will post patch here.
What do you mean by "u-boot has no effect"?

@igorpecovnik
Member

It must be some general change/bug; it's possible. I'll wait if your patch brings the joy.

Regarding u-boot ... I only did few test boots with different versions to rule uboot out of this problem.

@zador-blood-stained
Member

There are (or actually were) 2 or 3 separate bugs, unrelated to each other.

  • Kernel RTC bug, which happened when switching from old kernel to mainline, and led to systemd completely locking up (for users it looked like a mostly black screen with a ttyS0 error appearing after some time);
  • Network error (LSB: Raise network interface...), actually caused by dhcp client and ifup;
  • Systemd logind error, if network was raised successfully.

I believe that your last commit, which disables i2c debug messages, actually fixed the last two. I think that systemd-journald and/or rsyslog were so busy processing i2c debug spam that they locked up other services that tried to use the logging subsystem.
I just unpacked recompiled kernel on clean Jessie 4.4 image (downloaded from armbian.com, still with systemd), and it works like a charm.

I consider the RTC bug low priority for now, because it happens only when switching kernels, and it can be fixed manually relatively easily (if you want, I can post instructions here).

In my system evbug module (kernel config CONFIG_INPUT_EVBUG) loads due to plugged keyboard and fills dmesg with debug messages related to keyboard events, please disable or blacklist it.
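
For reference, blacklisting a module on a running system can be sketched as below. The helper name `blacklist_module` and the `blacklist-<module>.conf` file name are arbitrary choices; modprobe reads any .conf file in /etc/modprobe.d.

```shell
#!/bin/sh
# Write a modprobe blacklist entry for one module.
# Usage: blacklist_module MODULE [CONF_DIR]   (CONF_DIR defaults to /etc/modprobe.d)
blacklist_module() {
    module=$1
    conf_dir=${2:-/etc/modprobe.d}
    printf 'blacklist %s\n' "$module" > "$conf_dir/blacklist-$module.conf"
}

# Example (as root): blacklist_module evbug
# If the module is already loaded: rmmod evbug
```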

@igorpecovnik
Member

Thanks for the update. I also added evbug to the blacklist. I rarely use the console so I didn't see it. RTC bug fix: subtract 70 years from the current date and write it, then proceed with the upgrade? But post your solution, then I can close this issue. Perhaps I have to reconsider the changes regarding systemd? Disabling is a click away ;)

@zador-blood-stained
Member

If system is not booting after switching kernel from 3.4 to mainline due to systemd lockup, it can be fixed in-place without reflashing this way:
You have to get to u-boot command prompt, using either a serial adapter or monitor and usb keyboard.
After switching power on or rebooting, when u-boot loads up, press some keys on the keyboard (or send some key presses via terminal) to abort default boot sequence and get to the command prompt:

U-Boot SPL 2015.07-dirty (Oct 01 2015 - 15:05:21)
...
Hit any key to stop autoboot:  0
sunxi#

Enter these commands, replacing root device path if necessary.
Select setenv line with ttyS0 for serial, tty1 for keyboard+monitor:

setenv bootargs init=/bin/bash root=/dev/mmcblk0p1 rootwait console=ttyS0,115200
# or
setenv bootargs init=/bin/bash root=/dev/mmcblk0p1 rootwait console=tty1

ext4load mmc 0 0x49000000 /boot/dtb/${fdtfile}
ext4load mmc 0 0x46000000 /boot/zImage
env set fdt_high ffffffff
bootz 0x46000000 - 0x49000000

System should eventually boot to bash shell:

root@(none):/#

Now you can check current date, correct it and upload it to RTC. Example:

root@(none):/# date
Mon Oct  5 20:37:28 CEST 2015
root@(none):/# date -s "2015-10-5 21:38:00"
Mon Oct  5 21:38:00 CEST 2015
root@(none):/# hwclock -w
hwclock: Could not open file with the clock adjustment parameters in it (/etc/adjtime) for writing: Read-only file system
hwclock: Drift adjustment parameters not updated.
root@(none):/# hwclock -r
Mon Oct  5 21:38:20 2015  -1.479230 seconds
root@(none):/#

hwclock will print an error due to the read-only file system, but it actually updates the RTC.

Done. Now you can restart with reboot -f

Theoretically, using the bash shell with access to rootfs, users whose Armbian broke after an update can remount rootfs read-write, enable the systemd debug shell and, on the next reboot, use it to bring up the network manually and upgrade the kernel from the armbian repo (when it becomes available), but that is a whole other story.

About systemd:

  1. I think systemd should be left as it is (enabled for jessie by default, I mean).
  2. I'm not sure, but without explicit configuration next time systemd-sysv changes version, it might automatically be installed on dist-upgrade.

@zador-blood-stained
Member

Did you test fresh built image or your old one with new kernel? I tested only 4.2.2 kernel with manual reconfiguration, and without this commit especially (which is temporary measure for migrating to 4.2.3). Can you extract kernel config file from your image (it is in /boot ) and check if it actually picked up i2c debug changes?
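
One way to run that check is sketched below. The helper name `i2c_debug_enabled` is made up, and the config file name under /boot is an assumption that may differ between images.

```shell
#!/bin/sh
# List I2C debug options that are enabled (=y or =m) in a kernel config file.
# Usage: i2c_debug_enabled CONFIG_FILE
# Prints matching lines; returns non-zero if none are enabled.
i2c_debug_enabled() {
    grep -E '^CONFIG_I2C_DEBUG_[A-Z]+=(y|m)$' "$1"
}

# Example (the config file name under /boot is an assumption):
#   i2c_debug_enabled "/boot/config-$(uname -r)" || echo "i2c debug is off"
```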

@ramki982

ramki982 commented Oct 6, 2015

I think it did not pick the i2c changes. I'm running a re-build after ensuring the latest config is present with the i2c fix. Hopefully this will fix it.

Thanks so much for your help

@dllud

dllud commented Oct 8, 2015

I had this same issue on fresh install for Cubieboard 2, with the Armbian 4.4 jessie 4.2.2 image which is currently on the download page.

Building from the repo @414492474aa32db43364094d12d35933e7861afc solved it. Thus a new release is needed sooner rather than later.

(BTW, @igorpecovnik congrats for your work on these build scripts. It took long but I had never built so much stuff (incl. a kernel) with so little effort. Keep it going!)

@zador-blood-stained
Member

Regarding kernel upgrade issue / switch from old to mainline.
Possible solution for integrating into new images - firstboot script for fixing time. Requires zero user interaction. Disables itself after first boot. Should prevent things like this.
Preparations:

  • Original boot.scr is renamed to boot.scr.orig
  • New boot.scr is compiled from original boot.cmd with addition of init=/bin/fixtime.sh to bootargs
  • /bin/fixtime.sh, with exec bits set, contains:
#!/bin/bash

echo "Firstboot script"

sleep 1

echo "Mounting filesystems"

mount -v proc /proc -t proc
mount -v -o remount,rw /

echo "Restoring default u-boot script"

mv -v -f /boot/boot.scr /boot/boot.scr.fixtime
mv -v -f /boot/boot.scr.orig /boot/boot.scr

echo "Analyzing system time"

hwclock -r -D

if [ "$(hwclock -r)" == "" ] || [ $(date +%Y) -lt 2015 ]
then
    echo "Wrong RTC/system time detected. Fixing" 
    date -s "2015-10-12 10:00:00"
    # alternative, not tested - set date to zImage modification time
    # date +%F -r /boot/zImage | xargs date +%F --set
    hwclock -w
fi

echo "Rebooting in 10 seconds"

sleep 10

sync
reboot -f

Tested a little bit on cubietruck with Jessie 4.5 mainline image. Worked for me.
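
The untested alternative from the script comment (using the zImage modification time instead of a hard-coded date) might be written along these lines. This is only a sketch, assuming a date implementation that supports `-r FILE` (GNU date does); the helper name `file_mtime` is illustrative.

```shell
#!/bin/sh
# Print a file's modification time in a form that "date -s" accepts.
# Usage: file_mtime FILE
file_mtime() {
    # -r FILE is a GNU date option: use the file's mtime instead of "now".
    date -u -r "$1" '+%Y-%m-%d %H:%M:%S'
}

# Example (as root, inside fixtime.sh in place of the hard-coded date):
#   date -u -s "$(file_mtime /boot/zImage)"
#   hwclock -w
```

The resulting time is only as accurate as the build date of the kernel image, but it is at least in the right year and is never later than the real time.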

@igorpecovnik
Member

Nice idea, I'll explore it.

@igorpecovnik
Member

Workaround works & bug will eventually be fixed within kernel so closing this.

@bwilcutt

The worst part about systemd is you have no idea what it is doing and, therefore, no idea how to fix any issues. Even running systemctl "--test" gives errors "cannot run as root". What kind of app doesn't run as root? Seriously, what? Ridiculous.

So, people crawl back to initd because it is more plain, well spoken, and understandable. Systemd fixes a problem that never existed, then complements the 'fix' by bringing more issues. Not good.
