Move to latest Debian release 'buster' #19

Closed
pdsouza opened this issue Aug 16, 2019 · 58 comments

@pdsouza
Member

pdsouza commented Aug 16, 2019

Buster has been released and we should upgrade from stretch ASAP.

Press release: https://lists.debian.org/debian-announce/2019/msg00003.html
Overview of new stuff: https://wiki.debian.org/NewInBuster
Complete release notes: https://www.debian.org/releases/buster/amd64/release-notes/ch-whats-new.en.html

@nbbm26

nbbm26 commented Sep 25, 2019

Can I help? CS background, getting into building a better OS for devices, interested in this project. Previously ran an XFCE desktop on a chromebook via chroot and have been interested in getting a running version of chroot (not Proot, though it works) on my droid 4 (sliding keyboard and HDMI port are very nice)

In short, what can I do to help?

@pdsouza
Member Author

pdsouza commented Sep 26, 2019

@nbbm26 Thanks for getting in touch. I believe that we are already in a position to build buster containers, but there are probably things we need to migrate over to buster to have Maru run properly (check whether there were any package dependencies that changed that we may need to explicitly depend on, changes to default security policies on the desktop that we have to handle, etc.).

I would suggest first getting Maru running on your hardware so you can get familiar with the system (if you own one of our supported devices that's easy, otherwise you can learn a lot by doing a port). Do you have Maru running yet?

I am planning to make the move to buster as one of my next tasks in the next few weeks. Perhaps once you have Maru running, you can help test the buster images on your device to help me make sure all is well.

@nbbm26

nbbm26 commented Oct 3, 2019 via email

@utzcoz
Member

utzcoz commented Nov 15, 2019

@pdsouza Chrome OS 80 will start using Debian 10 Buster on new Linux installations. I have also found that we can build a buster rootfs with the current blueprints and Docker versions by just replacing stretch with buster in the build command, because the updated LXC template supports buster. Can we release a buster rootfs in the next Maru version to open it up for testing?

@pdsouza
Member Author

pdsouza commented Nov 15, 2019

@utzcoz Oh, that's great to hear we can easily switch to buster! Have you tried using buster in your builds? If everything looks good with buster on Maru with some quick tests, I'm happy to release it in the next version.

@utzcoz
Member

utzcoz commented Nov 15, 2019

@pdsouza I built it manually to debug a problem on the Pixel, and it looks fine. You can just replace stretch with buster in the build command to build and test it manually.
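
As a rough illustration of "replace stretch with buster", the sketch below prints the two build commands side by side. It assumes the build-with-docker.sh flag layout that is quoted elsewhere in this thread; the helper name build_cmd and the "<release>-container" naming are hypothetical, and optional flags such as --minimal are left out.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print (do not run) the blueprint build command for a
# given Debian release and architecture.
build_cmd() {
    local release="$1" arch="$2"
    printf './build-with-docker.sh -b debian -n %s-container -- -r %s -a %s\n' \
        "$release" "$release" "$arch"
}

build_cmd stretch arm64   # the previous release
build_cmd buster arm64    # same command with only the release name swapped
```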

@pdsouza
Member Author

pdsouza commented Nov 15, 2019 via email

@pdsouza
Member Author

pdsouza commented Dec 6, 2019

Ran into an issue building a buster container for armhf:

$ ./build-with-docker.sh -b debian -n buster-container -- -r buster -a armhf --minimal
[...]
Processing triggers for ca-certificates (20190110) ...
Updating certificates in /etc/ssl/certs...
qemu: Unsupported syscall: 382
0 added, 0 removed; done.
Running hooks in /etc/ca-certificates/update.d...
done.
curl: (60) SSL certificate problem: unable to get local issuer certificate
More details here: https://curl.haxx.se/docs/sslcerts.html

curl failed to verify the legitimacy of the server and therefore could not
establish a secure connection to it. To learn more about this situation and
how to fix it, please visit the web page mentioned above.
gpg: no valid OpenPGP data found.
[*] Cleaning up...
Destroyed container buster-container

This issue happens when curl'ing the Maru gpg key to install the mclient package from our servers.

Looks like there is an issue with the buster ca-certificates package on armhf when using QEMU: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=923479

In order to support our armhf devices (Nexus 5, Nexus 7) on buster we will need to work around this.
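
The "0 added, 0 removed" line in the log above is the earliest visible symptom: on a fresh install the certificate store would normally be populated at that point. A minimal sketch for spotting it in captured build output follows; the helper name certs_added is hypothetical, and it assumes the summary line keeps the exact format shown in the log.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read build output on stdin and print the number of
# certificates that update-ca-certificates reported as added (last match wins).
certs_added() {
    sed -n 's/^\([0-9][0-9]*\) added, [0-9][0-9]* removed; done\.$/\1/p' | tail -n 1
}

# Sample lines taken from the failing build log above.
added=$(printf '%s\n' \
    'Updating certificates in /etc/ssl/certs...' \
    'qemu: Unsupported syscall: 382' \
    '0 added, 0 removed; done.' | certs_added)

if [ "${added:-0}" -eq 0 ]; then
    echo "no certificates were installed; HTTPS downloads will likely fail"
fi
```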

I tried installing the older ca-certificates package from stretch by explicitly stating it as a dependency when installing curl in chroot-configure.sh. Here is my quick patch:

diff --git a/blueprint/debian/chroot-configure.sh b/blueprint/debian/chroot-configure.sh
index c7282cf..3ca42a2 100755
--- a/blueprint/debian/chroot-configure.sh
+++ b/blueprint/debian/chroot-configure.sh
@@ -57,8 +57,11 @@ install_minimal () {

 add_maru_key () {
     apt-get clean
+    cat > /etc/apt/sources.list.d/stretch.list <<EOF
+deb http://deb.debian.org/debian stretch main
+EOF
     apt-get -q update
-    apt-get -q -y install curl gnupg
+    apt-get -q -y install curl gnupg ca-certificates/stretch
     curl -fsSL https://maruos.com/static/gpg.txt | apt-key add -
 }

With this patch, the build succeeds!

So I pushed the container to my Nexus 5 and tried starting it and hit some early errors:

hammerhead # lxc-start -n default
lxc-start: external/lxc/src/lxc/conf.c: mk_devtmpfs: 1379 /dev/.lxc is not setup - taking fallback
Failed to lookup module alias 'autofs4': Function not implemented
Failed to determine whether /sys is a mount point: Bad file descriptor
Failed to determine whether /proc is a mount point: Bad file descriptor
Failed to determine whether /dev is a mount point: Bad file descriptor
Failed to determine whether /dev/shm is a mount point: Bad file descriptor
Failed to determine whether /run is a mount point: Bad file descriptor
Failed to determine whether /run/lock is a mount point: Bad file descriptor
Failed to determine whether /sys/fs/cgroup is a mount point: Bad file descriptor
Failed to determine whether /sys/fs/cgroup/systemd is a mount point: Bad file descriptor
[!!!!!!] Failed to mount API filesystems.
Exiting PID 1...
lxc-start: external/lxc/src/lxc/lxc_start.c: main: 336 The container failed to start.
lxc-start: external/lxc/src/lxc/lxc_start.c: main: 340 Additional information can be obtained by setting the --logfile and --logpriority options.

These are errors thrown by systemd. In stretch we were on systemd 232, but in buster it is systemd 241. Something must have changed between these two versions to cause this issue. It looks like getting buster to work on armhf is trickier than expected.

For arm64, I was able to build an arm64 buster container successfully without any changes. But sadly, both (!) of my Nexus 5Xs have died (very likely it was the common hardware failures reported for 5X) so I don't have an arm64 device for actually running the container until my new phone arrives later this month.

@pdsouza
Member Author

pdsouza commented Dec 7, 2019

I tried downgrading systemd on armhf buster using stretch's APT repository but I ran into issues installing XFCE dependencies for the desktop. That's a no-go.

@pdsouza
Member Author

pdsouza commented Dec 7, 2019

In order to confirm my hypothesis that this is indeed due to some change in systemd between v232 and v241, I just downgraded to v232 from stretch's APT repository and skipped the rest of the Maru configuration (installing XFCE results in an impossible state due to downgrade) to create a super minimal headless image.

And what do you know...

hammerhead:/ # lxc-start -n default
lxc-start: external/lxc/src/lxc/conf.c: mk_devtmpfs: 1379 /dev/.lxc is not setup - taking fallback
systemd 232 running in system mode. (+PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD +IDN)
Detected virtualization lxc.
Detected architecture arm.

Welcome to Debian GNU/Linux 10 (buster)!

Set hostname to <buster-container>.
sys-kernel-debug.mount: Cannot add dependency job, ignoring: Unit sys-kernel-debug.mount is masked.
container-getty@3.service: Cannot add dependency job, ignoring: Unit container-getty@3.service is masked.
container-getty@4.service: Cannot add dependency job, ignoring: Unit container-getty@4.service is masked.
container-getty@1.service: Cannot add dependency job, ignoring: Unit container-getty@1.service is masked.
container-getty@2.service: Cannot add dependency job, ignoring: Unit container-getty@2.service is masked.
container-getty@0.service: Cannot add dependency job, ignoring: Unit container-getty@0.service is masked.
[  OK  ] Started Forward Password Requests to Wall Directory Watch.
[  OK  ] Listening on Journal Socket (/dev/log).
[  OK  ] Reached target Remote File Systems.
[  OK  ] Listening on Journal Audit Socket.
[  OK  ] Reached target Swap.
system.slice: Failed to set invocation ID on control group /system.slice, ignoring: Operation not supported
[  OK  ] Created slice System Slice.
system-getty.slice: Failed to set invocation ID on control group /system.slice/system-getty.slice, ignoring: Operation not supported
[  OK  ] Created slice system-getty.slice.
dev-hugepages.mount: Couldn't determine result for ConditionVirtualization=!private-users, assuming failed: No such file or directory
[  OK  ] Reached target Slices.
[  OK  ] Started Dispatch Password Requests to Console Directory Watch.
[  OK  ] Reached target Paths.
[  OK  ] Reached target Encrypted Volumes.
[  OK  ] Listening on Journal Socket.
systemd-modules-load.service: Failed to set invocation ID on control group /system.slice/systemd-modules-load.service, ignoring: Operation not supported
         Starting Load Kernel Modules...
systemd-tmpfiles-setup-dev.service: Failed to set invocation ID on control group /system.slice/systemd-tmpfiles-setup-dev.service, ignoring: Operation not supported
         Starting Create Static Device Nodes in /dev...
systemd-journald.service: Failed to set invocation ID on control group /system.slice/systemd-journald.service, ignoring: Operation not supported
         Starting Journal Service...
systemd-remount-fs.service: Failed to set invocation ID on control group /system.slice/systemd-remount-fs.service, ignoring: Operation not supported
         Starting Remount Root and Kernel File Systems...
ifupdown-pre.service: Failed to set invocation ID on control group /system.slice/ifupdown-pre.service, ignoring: Operation not supported
         Starting Helper to synchronize boot up for ifupdown...
[  OK  ] Listening on /dev/initctl Compatibility Named Pipe.
[  OK  ] Reached target Sockets.
[  OK  ] Started Helper to synchronize boot up for ifupdown.
[  OK  ] Started Load Kernel Modules.
[  OK  ] Started Create Static Device Nodes in /dev.
[  OK  ] Started Remount Root and Kernel File Systems.
[  OK  ] Started Journal Service.
         Starting Flush Journal to Persistent Storage...
[  OK  ] Reached target Local File Systems (Pre).
[  OK  ] Reached target Local File Systems.
         Starting Apply Kernel Variables...
[  OK  ] Started Apply Kernel Variables.
         Starting Raise network interfaces...
[  OK  ] Started Flush Journal to Persistent Storage.
[FAILED] Failed to start Raise network interfaces.
See 'systemctl status networking.service' for details.
[  OK  ] Reached target Network.
         Starting Create Volatile Files and Directories...
[  OK  ] Started Create Volatile Files and Directories.
[  OK  ] Reached target System Time Synchronized.
         Starting Update UTMP about System Boot/Shutdown...
[  OK  ] Started Update UTMP about System Boot/Shutdown.
[  OK  ] Reached target System Initialization.
[  OK  ] Reached target Basic System.
         Starting Permit User Sessions...
[  OK  ] Started Daily apt download activities.
[  OK  ] Started Daily apt upgrade and clean activities.
[  OK  ] Started Daily Cleanup of Temporary Directories.
[  OK  ] Reached target Timers.
[  OK  ] Started Permit User Sessions.
[  OK  ] Started Console Getty.
[  OK  ] Reached target Login Prompts.
[  OK  ] Reached target Multi-User System.
[  OK  ] Reached target Graphical Interface.
         Starting Update UTMP about System Runlevel Changes...
[  OK  ] Started Update UTMP about System Runlevel Changes.

Debian GNU/Linux 10 buster-container console

buster-container login:

This confirms the issue is with systemd, and that some change introduced between v232 and v241 is causing the mount point check errors.
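
When comparing boots like this, it can help to pull the systemd version straight out of a captured console log. The sketch below does that from the startup banner; the helper name systemd_banner_version is hypothetical. Note that it prints nothing when systemd exits before printing its banner, as in the failing boot shown earlier.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read a captured console log on stdin and print the
# systemd version from its startup banner, if the banner is present.
systemd_banner_version() {
    sed -n 's/^.*systemd \([0-9][0-9]*\) running in system mode.*$/\1/p' | head -n 1
}

printf '%s\n' 'systemd 232 running in system mode. (+PAM +AUDIT)' | systemd_banner_version
```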

@pdsouza
Member Author

pdsouza commented Dec 7, 2019

So I cloned systemd and was deep-diving through the code to debug what changed between v232 and v241. I noticed a few differences in the mount checking code between the two but it was not obvious what exactly was causing the errors.

Fortunately, I happened to stumble upon this very helpful error documented in the Halium project. Apparently, there was a change in v233 that switched from using canonicalize_file_name to custom logic that causes issues on the Android 3.4 kernel due to fstat not correctly handling O_PATH file descriptors. I applied the suggested patch from the UBPorts kernel and...

hammerhead:/data/maru/containers/default # lxc-start -n default
lxc-start: external/lxc/src/lxc/conf.c: mk_devtmpfs: 1379 /dev/.lxc is not setup - taking fallback
Failed to lookup module alias 'autofs4': Function not implemented
systemd 241 running in system mode. (+PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD -IDN2 +IDN -PCRE2 default-hierarchy=hybrid)
Detected virtualization lxc.
Detected architecture arm.

Welcome to Debian GNU/Linux 10 (buster)!

Set hostname to <buster-container>.
Failed to create timezone change event source: Too many levels of symbolic links
File /lib/systemd/system/systemd-journald.service:12 configures an IP firewall (IPAddressDeny=any), but the local system does not support BPF/cgroup based firewalling.
Proceeding WITHOUT firewalling in effect! (This warning is only shown for the first loaded unit using IP firewalling.)
[  OK  ] Listening on udev Kernel Socket.
[  OK  ] Listening on udev Control Socket.
[  OK  ] Started Dispatch Password Requests to Console Directory Watch.
[  OK  ] Created slice system-getty.slice.
[  OK  ] Listening on initctl Compatibility Named Pipe.
[  OK  ] Started Forward Password Requests to Wall Directory Watch.
[  OK  ] Reached target Local Encrypted Volumes.
[  OK  ] Reached target Paths.
[  OK  ] Reached target Swap.
[  OK  ] Reached target Remote File Systems.
[  OK  ] Listening on Journal Audit Socket.
[  OK  ] Listening on Journal Socket (/dev/log).
[  OK  ] Listening on Journal Socket.
         Starting udev Coldplug all input devices...
         Starting Journal Service...
         Starting Remount Root and Kernel File Systems...
[  OK  ] Created slice User and Session Slice.
[  OK  ] Reached target Slices.
         Starting Load Kernel Modules...
[  OK  ] Started udev Coldplug all input devices.
[  OK  ] Started Journal Service.
[  OK  ] Started Remount Root and Kernel File Systems.
[  OK  ] Started Load Kernel Modules.
         Starting Apply Kernel Variables...
         Starting Create System Users...
         Starting Flush Journal to Persistent Storage...
[  OK  ] Started Apply Kernel Variables.
[  OK  ] Started Flush Journal to Persistent Storage.
[  OK  ] Started Create System Users.
         Starting Create Static Device Nodes in /dev...
[  OK  ] Started Create Static Device Nodes in /dev.
[  OK  ] Reached target Local File Systems (Pre).
[  OK  ] Reached target Local File Systems.
         Starting Create Volatile Files and Directories...
         Starting udev Kernel Device Manager...
[  OK  ] Started udev Kernel Device Manager.
[  OK  ] Started Create Volatile Files and Directories.
[  OK  ] Reached target System Time Synchronized.
         Starting Update UTMP about System Boot/Shutdown...
[  OK  ] Started Update UTMP about System Boot/Shutdown.
[  OK  ] Reached target System Initialization.
[  OK  ] Started Daily apt download activities.
[  OK  ] Started Daily apt upgrade and clean activities.
[  OK  ] Started Daily Cleanup of Temporary Directories.
[  OK  ] Reached target Timers.
[  OK  ] Listening on D-Bus System Message Bus Socket.
[  OK  ] Reached target Sockets.
[  OK  ] Reached target Basic System.
         Starting Login Service...
         Starting Permit User Sessions...
[  OK  ] Started D-Bus System Message Bus.
[  OK  ] Started Permit User Sessions.
[  OK  ] Started Login Service.
         Starting Light Display Manager...
[  OK  ] Started Console Getty.
[  OK  ] Reached target Login Prompts.
[  OK  ] Started Light Display Manager.
[  OK  ] Started mflinger client for X11.
[  OK  ] Reached target Multi-User System.
[  OK  ] Reached target Graphical Interface.
         Starting Update UTMP about System Runlevel Changes...
[  OK  ] Started Update UTMP about System Runlevel Changes.

Debian GNU/Linux 10 buster-container console

buster-container login: maru
Password:
Linux buster-container 3.4.0-gf314140eff4e #3 SMP PREEMPT Sat Dec 7 19:46:35 UTC 2019 armv7l

The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.
maru@buster-container:~$ echo YAY!
YAY!

It works!!
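
The condition described above (a 3.4 kernel combined with systemd 233 or newer) can be written down as a small check. This is only a sketch: the function name is hypothetical, and the thread only establishes the problem for the Android 3.4 kernel, so other kernels may or may not be affected.

```shell
#!/usr/bin/env bash
# Hypothetical check: succeed (exit status 0) when the combination described
# above applies, i.e. a 3.4 kernel together with systemd 233 or newer.
needs_opath_fstat_patch() {
    local kernel_release="$1" systemd_version="$2"
    case "$kernel_release" in
        3.4|3.4.*|3.4-*) ;;      # matches 3.4 and 3.4.0-g..., but not 3.40
        *) return 1 ;;
    esac
    [ "$systemd_version" -ge 233 ]
}

if needs_opath_fstat_patch "3.4.0-gf314140eff4e" 241; then
    echo "kernel patch needed before systemd can mount API filesystems"
fi
```

On a device, the first argument would come from `uname -r` and the second from the container's systemd package version.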

@pdsouza
Member Author

pdsouza commented Dec 7, 2019

One more issue: I ran my container in graphical mode with an external monitor and noticed that all the icons on the desktop are missing. With some searching, it looks like building armhf buster containers on a 64-bit host with QEMU causes subtle errors due to readdir() erroring out. See this issue for more details. This is the real root cause for the ca-certificates issue I described earlier.

This manifests in errors like the following during building:

Processing triggers for libgdk-pixbuf2.0-0:armhf (2.38.1+dfsg-1) ...

(process:16214): GLib-ERROR **: 22:19:44.099: getauxval () failed: No such file or directory
qemu: uncaught target signal 5 (Trace/breakpoint trap) - core dumped
Trace/breakpoint trap (core dumped)

Fortunately, it appears that this should be fixed by using a later version of QEMU that's available on buster. I am going to try upgrading our build Docker images to buster, which I was planning to do anyway since we are upgrading Maru Desktop to buster as well (similar to how we transitioned from jessie to stretch). The tricky thing is that we will be moving from LXC 2 in stretch to LXC 3 in buster, which brings some major changes to lxc-create.
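
Because these emulation errors do not always abort the build, it may be worth scanning the build log for them explicitly. A minimal sketch, assuming the "qemu: ..." message prefixes seen in the logs in this thread; the helper name qemu_error_lines is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read a build log on stdin and print, with line numbers,
# any QEMU user-mode emulation errors of the kinds seen in this thread.
qemu_error_lines() {
    # "|| true" keeps the exit status at 0 when nothing matches.
    grep -nE '^qemu: (uncaught target signal|Unsupported syscall)' || true
}

printf '%s\n' \
    'Processing triggers for libgdk-pixbuf2.0-0:armhf (2.38.1+dfsg-1) ...' \
    'qemu: uncaught target signal 5 (Trace/breakpoint trap) - core dumped' \
    'Trace/breakpoint trap (core dumped)' | qemu_error_lines
```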

@pdsouza
Member Author

pdsouza commented Dec 17, 2019

@utzcoz I have pushed up patches to support building Buster containers on branch upgrade-to-buster. Can you please test building an arm64 container and let me know if it starts correctly and the desktop GUI starts up fine on your Pixel? Unfortunately I do not have access to arm64 devices right now. armhf works great on my Nexus 5.

@utzcoz
Member

utzcoz commented Dec 17, 2019

@pdsouza I got the error below when building blueprints:

....
qemu: uncaught target signal 11 (Segmentation fault) - core dumped
Segmentation fault (core dumped)
dpkg: error processing package shared-mime-info (--configure):
 installed shared-mime-info package post-installation script subprocess returned error exit status 139
....
dpkg: dependency problems prevent configuration of librsvg2-2:arm64:
 librsvg2-2:arm64 depends on libgdk-pixbuf2.0-0 (>= 2.25.2); however:
  Package libgdk-pixbuf2.0-0:arm64 is not configured yet.

dpkg: error processing package librsvg2-2:arm64 (--configure):
 dependency problems - leaving unconfigured
dpkg: dependency problems prevent configuration of adwaita-icon-theme:
 adwaita-icon-theme depends on gtk-update-icon-cache; however:
  Package gtk-update-icon-cache is not configured yet.
 adwaita-icon-theme depends on librsvg2-common; however:
  Package librsvg2-common:arm64 is not configured yet.

dpkg: error processing package adwaita-icon-theme (--configure):
 dependency problems - leaving unconfigured
....
Errors were encountered while processing:
 shared-mime-info
 libgtk-3-0:arm64
 libxfce4ui-2-0:arm64
 libgtk2.0-0:arm64
 libvte-2.91-0:arm64
 libgdk-pixbuf2.0-0:arm64
 librsvg2-common:arm64
 ristretto
 exo-utils
 xfce4-terminal
 firefox-esr
 libxfce4ui-1-0:arm64
 gtk-update-icon-cache
 libexo-2-0:arm64
 librsvg2-2:arm64
 adwaita-icon-theme
E: Sub-process /usr/bin/dpkg returned an error code (1)
[*] Cleaning up...
lxc-destroy: buster-container: tools/lxc_destroy.c: main: 271 Destroyed container buster-container
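
In output like this, most of the listed packages are fallout from an earlier failure (here, the shared-mime-info segfault under QEMU). The sketch below extracts just the package list from dpkg's summary so it can be compared across builds; the helper name failed_packages is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read apt/dpkg output on stdin and print the package
# names listed under "Errors were encountered while processing:".
failed_packages() {
    awk '
        /^Errors were encountered while processing:/ { in_list = 1; next }
        in_list && /^ / { sub(/^ +/, ""); print; next }
        { in_list = 0 }
    '
}

printf '%s\n' \
    'Errors were encountered while processing:' \
    ' shared-mime-info' \
    ' libgtk-3-0:arm64' \
    'E: Sub-process /usr/bin/dpkg returned an error code (1)' | failed_packages
```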

@utzcoz
Member

utzcoz commented Dec 17, 2019

I used a custom LXC image server, so I will clear the cache and rebuild the blueprints to test again.

@PifPof73

Hi @pdsouza,
with your patch I managed to build a buster container for armhf.
Could you please post the command for pushing this container to my Nexus 5 without flashing the complete device?
Is it just an adb push to "the right place"?
I would love to test the Debian 10 container.
Regards,
PifPof73

@utzcoz
Member

utzcoz commented Dec 18, 2019

@pdsouza I tried using Travis CI to build buster online, and it failed too; see https://travis-ci.org/utzcoz/blueprints/builds/626511647.

@pdsouza
Member Author

pdsouza commented Dec 18, 2019

@PifPof73 hammerhead needs a kernel patch to run buster containers due to the new version of systemd, so you'll need a new boot.img as well.

But in the future if you'd like to test, you just need to:

$ adb root
$ adb push maru-*.tar.gz /data/maru/containers/default/
$ adb shell
hammerhead# cd /data/maru/containers/default
hammerhead# rm -rf rootfs/
hammerhead# tar xzf maru-*.tar.gz

You should then be able to boot into the new container.
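
The manual steps above could be wrapped in a small script, sketched here under a few assumptions: the helper names and the DRY_RUN switch are hypothetical, and the tarball name is a placeholder. By default the script only prints the commands. The device-side steps are chained with && so that rm -rf cannot run in the wrong directory if the cd fails.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the manual steps above. By default it only
# prints the commands; set DRY_RUN=0 to actually run them.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

push_container() {
    local tarball="$1"
    local dest="/data/maru/containers/default"
    local name
    name=$(basename "$tarball")
    run adb root
    run adb push "$tarball" "$dest/"
    # Chain with && so rm -rf never runs if the cd fails.
    run adb shell "cd $dest && rm -rf rootfs/ && tar xzf $name"
}

push_container maru-example.tar.gz   # placeholder file name
```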

@pdsouza
Member Author

pdsouza commented Dec 18, 2019

@utzcoz Hmm, ok. Thanks for testing. I thought I was able to successfully build an arm64 image on my machine... I will try again to reproduce this error.

@pdsouza
Member Author

pdsouza commented Dec 18, 2019

@utzcoz I just ran the same build command as in the Travis CI test and I was able to successfully build an arm64 container! Strange. Maybe it is related to the host kernel? I am running on Arch Linux with a very recent kernel (5.4.3-arch1-1 x86_64).

@pdsouza
Member Author

pdsouza commented Dec 18, 2019

Here is my log of the successful build on my machine: https://pastebin.com/b5CVkNjw

It gets past the shared-mime-info error on Travis CI without any QEMU errors.

@PifPof73

PifPof73 commented Dec 18, 2019

> @PifPof73 hammerhead needs a kernel patch to run buster containers due to the new version of systemd, so you'll need a new boot.img as well.
>
> But in the future if you'd like to test, you just need to:
>
> $ adb root
> $ adb push maru-*.tar.gz /data/maru/containers/default/
> $ adb shell
> hammerhead# cd /data/maru/containers/default
> hammerhead# rm -rf rootfs/
> hammerhead# tar xzf maru-*.tar.gz
>
> You should then be able to boot into the new container.

Thanks.
At the moment I am failing to build a boot.img.
As that is not related to this issue, I posted a request on the Google dev group:
https://groups.google.com/forum/#!topic/maru-os-dev/h6l2tX5Q7VE

Do you have a boot.img available for Nexus 5?
Regards,
PifPof73

@pdsouza
Member Author

pdsouza commented Dec 20, 2019

@PifPof73 I have a debug boot.img I used for development but not sure if it will work with your older system partition. I can upload it if you really want to test though — let me know.

@utzcoz
Member

utzcoz commented Dec 21, 2019

> It gets past the shared-mime-info error on Travis CI without any QEMU errors.

@pdsouza It looks like Travis CI doesn't build buster; it only builds stretch.

@utzcoz
Member

utzcoz commented Dec 24, 2019

In the docker buildx issue "Building for ARM causes error often", the Docker tools developers say that QEMU 4.0.0 fixes the problem.

@pdsouza
Member Author

pdsouza commented Dec 24, 2019

Buster uses QEMU 3.1. Maybe we can try installing QEMU 4.1 from bullseye to test it out.
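
A quick way to compare a QEMU version against the 4.0.0 threshold mentioned in the buildx discussion is sketched below. The helper name version_ge is hypothetical, and it relies on a sort that supports -V (as GNU coreutils does). Meeting the threshold is only what that discussion suggests; it is not a guarantee that the build will succeed.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed when version $1 is greater than or equal to
# version $2, using version-aware sorting.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

for v in 3.1 4.0.0 4.1; do
    if version_ge "$v" 4.0.0; then
        echo "QEMU $v: at or above 4.0.0"
    else
        echo "QEMU $v: below 4.0.0"
    fi
done
```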

@utzcoz
Member

utzcoz commented Jan 5, 2020

I tested QEMU 4.x from bullseye, and it failed too. I found that QEMU crashes when running update-mime-database, which is executed by the shared-mime-info postinst script.
The script builds armhf buster successfully but fails for arm64.

@pdsouza
Member Author

pdsouza commented Jan 5, 2020 via email

@utzcoz
Member

utzcoz commented Jan 6, 2020

@pdsouza On Travis CI, I removed the i386 support and built arm64 buster, and the build succeeded. But it still fails on my work machine (Ubuntu 19.04). From the Travis CI results, we know we can build arm64 buster with the amd64 version of qemu-user-static, and armhf buster with the i386 version of qemu-user-static.
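
The pairing observed on Travis CI can be captured as a small lookup, sketched below. The function name is hypothetical, and the mapping only records what worked in those builds; it is not a documented QEMU requirement.

```shell
#!/usr/bin/env bash
# Hypothetical lookup: which host build of qemu-user-static worked for each
# target architecture in the Travis CI runs described above.
qemu_host_arch_for() {
    case "$1" in
        arm64) echo amd64 ;;
        armhf) echo i386 ;;
        *) echo "unknown target architecture: $1" >&2; return 1 ;;
    esac
}

for target in arm64 armhf; do
    echo "$target -> qemu-user-static ($(qemu_host_arch_for "$target"))"
done
```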

@utzcoz
Member

utzcoz commented Jan 10, 2020

@pdsouza I cleaned up the blueprints out directory, and my machine now builds arm64 successfully. But when I pushed the modification that uses two Dockerfiles to support the different architectures, the Travis build failed. The commit is "docker: Add dockerfile for aarch64 and armhf". Unfortunately, the built arm64 buster rootfs shows a black screen. From adb logcat, I found that mflinger only logs "At your service". I then enabled mflinger's debug output, and the log shows there is no request from mclient.

@pdsouza
Member Author

pdsouza commented Jan 10, 2020 via email

@pdsouza
Member Author

pdsouza commented Jan 16, 2020

@utzcoz I sent you an email with a link to an arm64 build I did with your commit for testing. Let me know how it goes.

@utzcoz
Member

utzcoz commented Jan 17, 2020

@pdsouza Your generated rootfs has the same problem. I used journalctl -u maru-mflinger-client to get the log below:

Jan 17 15:46:06 buster-arm64-rootfs systemd[1]: Started mflinger client for X11.
Jan 17 15:46:06 buster-arm64-rootfs mclient[100]: error calling XOpenDisplay
Jan 17 15:46:06 buster-arm64-rootfs systemd[1]: maru-mflinger-client.service: Main process exited, code=exited, status=255/EXCEPTION
Jan 17 15:46:06 buster-arm64-rootfs systemd[1]: maru-mflinger-client.service: Failed with result 'exit-code'.
Jan 17 15:46:07 buster-arm64-rootfs systemd[1]: maru-mflinger-client.service: Service RestartSec=100ms expired, scheduling restart.
Jan 17 15:46:07 buster-arm64-rootfs systemd[1]: maru-mflinger-client.service: Scheduled restart job, restart counter is at 1.
Jan 17 15:46:07 buster-arm64-rootfs systemd[1]: Stopped mflinger client for X11.
Jan 17 15:46:07 buster-arm64-rootfs systemd[1]: Started mflinger client for X11.
Jan 17 15:46:07 buster-arm64-rootfs mclient[102]: error calling XOpenDisplay
Jan 17 15:46:07 buster-arm64-rootfs systemd[1]: maru-mflinger-client.service: Main process exited, code=exited, status=255/EXCEPTION
Jan 17 15:46:07 buster-arm64-rootfs systemd[1]: maru-mflinger-client.service: Failed with result 'exit-code'.
Jan 17 15:46:07 buster-arm64-rootfs systemd[1]: maru-mflinger-client.service: Service RestartSec=100ms expired, scheduling restart.
Jan 17 15:46:07 buster-arm64-rootfs systemd[1]: maru-mflinger-client.service: Scheduled restart job, restart counter is at 2.
Jan 17 15:46:07 buster-arm64-rootfs systemd[1]: Stopped mflinger client for X11.
Jan 17 15:46:07 buster-arm64-rootfs systemd[1]: Started mflinger client for X11.
Jan 17 15:46:07 buster-arm64-rootfs mclient[103]: error calling XOpenDisplay
Jan 17 15:46:07 buster-arm64-rootfs systemd[1]: maru-mflinger-client.service: Main process exited, code=exited, status=255/EXCEPTION
Jan 17 15:46:07 buster-arm64-rootfs systemd[1]: maru-mflinger-client.service: Failed with result 'exit-code'.
Jan 17 15:46:07 buster-arm64-rootfs systemd[1]: maru-mflinger-client.service: Service RestartSec=100ms expired, scheduling restart.
Jan 17 15:46:07 buster-arm64-rootfs systemd[1]: maru-mflinger-client.service: Scheduled restart job, restart counter is at 3.
Jan 17 15:46:07 buster-arm64-rootfs systemd[1]: Stopped mflinger client for X11.
Jan 17 15:46:07 buster-arm64-rootfs systemd[1]: Started mflinger client for X11.
Jan 17 15:46:07 buster-arm64-rootfs mclient[104]: error calling XOpenDisplay
Jan 17 15:46:07 buster-arm64-rootfs systemd[1]: maru-mflinger-client.service: Main process exited, code=exited, status=255/EXCEPTION
Jan 17 15:46:07 buster-arm64-rootfs systemd[1]: maru-mflinger-client.service: Failed with result 'exit-code'.
Jan 17 15:46:07 buster-arm64-rootfs systemd[1]: maru-mflinger-client.service: Service RestartSec=100ms expired, scheduling restart.
Jan 17 15:46:07 buster-arm64-rootfs systemd[1]: maru-mflinger-client.service: Scheduled restart job, restart counter is at 4.
Jan 17 15:46:07 buster-arm64-rootfs systemd[1]: Stopped mflinger client for X11.
Jan 17 15:46:07 buster-arm64-rootfs systemd[1]: Started mflinger client for X11.
Jan 17 15:46:07 buster-arm64-rootfs mclient[105]: error calling XOpenDisplay
Jan 17 15:46:07 buster-arm64-rootfs systemd[1]: maru-mflinger-client.service: Main process exited, code=exited, status=255/EXCEPTION
Jan 17 15:46:07 buster-arm64-rootfs systemd[1]: maru-mflinger-client.service: Failed with result 'exit-code'.
Jan 17 15:46:08 buster-arm64-rootfs systemd[1]: maru-mflinger-client.service: Service RestartSec=100ms expired, scheduling restart.
Jan 17 15:46:08 buster-arm64-rootfs systemd[1]: maru-mflinger-client.service: Scheduled restart job, restart counter is at 5.
Jan 17 15:46:08 buster-arm64-rootfs systemd[1]: Stopped mflinger client for X11.
Jan 17 15:46:08 buster-arm64-rootfs systemd[1]: maru-mflinger-client.service: Start request repeated too quickly.
Jan 17 15:46:08 buster-arm64-rootfs systemd[1]: maru-mflinger-client.service: Failed with result 'exit-code'.
Jan 17 15:46:08 buster-arm64-rootfs systemd[1]: Failed to start mflinger client for X11.
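
The log above is a restart loop that ends when systemd's start rate limit kicks in. A minimal sketch for summarizing such a loop from journal output follows; the helper name restart_summary is hypothetical. Stopping after the fifth scheduled restart is consistent with systemd's default start limit, assuming the unit does not override it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read journal output on stdin, count scheduled restarts,
# and report whether systemd's start rate limit was hit.
# Typical use: journalctl -u maru-mflinger-client | restart_summary
restart_summary() {
    awk '
        /Scheduled restart job/ { restarts++ }
        /Start request repeated too quickly/ { limited = 1 }
        END { printf "restarts=%d start_limit_hit=%s\n", restarts, (limited ? "yes" : "no") }
    '
}

printf '%s\n' \
    'maru-mflinger-client.service: Scheduled restart job, restart counter is at 1.' \
    'maru-mflinger-client.service: Scheduled restart job, restart counter is at 2.' \
    'maru-mflinger-client.service: Start request repeated too quickly.' | restart_summary
```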

@utzcoz
Member

utzcoz commented Jan 17, 2020

I also installed x11-utils and ran xdpyinfo, which gave the error below:

maru@buster-container:~$ xdpyinfo
xdpyinfo:  unable to open display "".
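
One caveat when reading this output: the empty string in the message is the display name xdpyinfo tried to open, which means DISPLAY was not set in that shell. On its own this does not show whether an X server is running. A small sketch for checking the variable before retrying follows; the helper name display_status is hypothetical, and the :0 display name in the closing note is an assumption about how the desktop session is configured.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report whether the current shell has DISPLAY set.
display_status() {
    if [ -z "${DISPLAY:-}" ]; then
        echo "DISPLAY is unset or empty"
    else
        echo "DISPLAY=$DISPLAY"
    fi
}

display_status
```

If DISPLAY turns out to be unset, retrying with an explicit display (for example DISPLAY=:0 xdpyinfo) gives a more meaningful result.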

@pdsouza
Member Author

pdsouza commented Jan 27, 2020

@utzcoz Thanks for testing! I recently picked up an HTC10 so I finally have an arm64 device for debugging. I will look into what's going on when I get some time.

@utzcoz
Member

utzcoz commented Feb 2, 2020

@pdsouza I found a very interesting problem. I said earlier that I could change stretch to buster directly in the current build script to build an arm64 buster rootfs and run it correctly on my marlin phone. Today I switched my blueprints to the master branch and used it to build an arm64 buster rootfs, and the screen is blank, as in my earlier post in this thread. Maybe buster updated some libraries that don't work correctly in our environment. I then tested with a stretch rootfs, and the screen shows the correct Linux desktop.

@pdsouza
Member Author

pdsouza commented Feb 7, 2020

@utzcoz Oh wow, interesting. Yes, probably some library related to Xorg has changed. I haven't had time to debug this issue yet, but hope to do so soon on my HTC10.

@luka177

luka177 commented Feb 7, 2020

Hi guys!
I will try my best once I get cedric running; for now the build is at 34%.

@utzcoz
Member

utzcoz commented Mar 18, 2020

@pdsouza Buster uses systemd v241 and stretch uses systemd v232, as listed on the Debian packages website. Between v232 and v241, systemd added the patch "core: run each system service with a fresh session keyring", which calls keyctl(KEYCTL_JOIN_SESSION_KEYRING, 0, 0, 0, 0) to create a new session keyring for each service. So systemd-sysusers gets an anonymous session keyring, but it then tries to fetch some kernel keyring and fails with the "Required key not found" error. After that failure, maru-mflinger-client may get an incorrect environment, so it fails to start. I tried a hack that disables KEYCTL_JOIN_SESSION_KEYRING in the marlin kernel, and it works fine. With the hack, Linux displays correctly on a simulated secondary display, and I can run sudo apt-get update and sudo apt-get install emacs successfully.

@utzcoz
Copy link
Member

utzcoz commented Mar 18, 2020

The modification is in keyctl.c:

/*
 * The key control system call
 */
SYSCALL_DEFINE5(keyctl, int, option, unsigned long, arg2, unsigned long, arg3,
		unsigned long, arg4, unsigned long, arg5)
{
	kdebug("keyctl option %d", option);
	switch (option) {
	case KEYCTL_GET_KEYRING_ID:
		return keyctl_get_keyring_ID((key_serial_t) arg2,
					     (int) arg3);

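	case KEYCTL_JOIN_SESSION_KEYRING:
		/* Hack: refuse to create a new session keyring; userspace
		 * sees the -1 as EPERM. Original call kept for reference:
		 * return keyctl_join_session_keyring((const char __user *) arg2);
		 */
		return -1;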

@utzcoz
Copy link
Member

utzcoz commented Mar 18, 2020

And here is my operation log:

maru@buster-container:~$ sudo apt update

We trust you have received the usual lecture from the local System
Administrator. It usually boils down to these three things:

    #1) Respect the privacy of others.
    #2) Think before you type.
    #3) With great power comes great responsibility.

[sudo] password for maru: 
Get:1 http://mirrors.huaweicloud.com/debian buster InRelease [122 kB]
Get:2 http://mirrors.huaweicloud.com/debian-security buster/updates InRelease [65.4 kB]
Get:3 http://mirrors.huaweicloud.com/debian buster/main arm64 Packages [7,737 kB]
Get:4 http://mirrors.huaweicloud.com/debian buster/main Translation-en [5,970 kB]                                                                                     
Get:5 http://mirrors.huaweicloud.com/debian-security buster/updates/main arm64 Packages [182 kB]                                                                      
Get:6 http://mirrors.huaweicloud.com/debian-security buster/updates/main Translation-en [97.2 kB]                                                                     
Fetched 14.2 MB in 17s (847 kB/s)                                                                                                                                     
Reading package lists... Done
Building dependency tree       
Reading state information... Done
All packages are up to date.
maru@buster-container:~$ sudo apt install emacs
Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following additional packages will be installed:
  cron emacs-bin-common emacs-common emacs-el emacs-gtk exim4-base exim4-config exim4-daemon-light fonts-droid-fallback fonts-noto-mono ghostscript gsfonts
  guile-2.2-libs imagemagick-6-common install-info libcupsfilters1 libcupsimage2 libgc1c2 libgd3 libgif7 libgnutls-dane0 libgs9 libgs9-common libgsasl7 libheif1
  libijs-0.35 libjbig2dec0 libkyotocabinet16v5 liblqr-1-0 liblzo2-2 libm17n-0 libmagickcore-6.q16-6 libmagickwand-6.q16-6 libmailutils5 libmariadb3 libntlm0 libotf0
  libpython2.7 libpython2.7-minimal libpython2.7-stdlib libunbound8 m17n-db mailutils mailutils-common mariadb-common mysql-common
Suggested packages:
  anacron logrotate checksecurity emacs-common-non-dfsg ncurses-term exim4-doc-html | exim4-doc-info eximon4 spf-tools-perl swaks fonts-noto ghostscript-x
  libgd-tools dns-root-data m17n-docs libmagickcore-6.q16-6-extra gawk mailutils-mh mailutils-doc
The following NEW packages will be installed:
  cron emacs emacs-bin-common emacs-common emacs-el emacs-gtk exim4-base exim4-config exim4-daemon-light fonts-droid-fallback fonts-noto-mono ghostscript gsfonts
  guile-2.2-libs imagemagick-6-common install-info libcupsfilters1 libcupsimage2 libgc1c2 libgd3 libgif7 libgnutls-dane0 libgs9 libgs9-common libgsasl7 libheif1
  libijs-0.35 libjbig2dec0 libkyotocabinet16v5 liblqr-1-0 liblzo2-2 libm17n-0 libmagickcore-6.q16-6 libmagickwand-6.q16-6 libmailutils5 libmariadb3 libntlm0 libotf0
  libpython2.7 libpython2.7-minimal libpython2.7-stdlib libunbound8 m17n-db mailutils mailutils-common mariadb-common mysql-common
0 upgraded, 47 newly installed, 0 to remove and 0 not upgraded.
Need to get 63.8 MB of archives.
After this operation, 258 MB of additional disk space will be used.

@pdsouza
Copy link
Member Author

pdsouza commented Mar 19, 2020

@utzcoz Wow, nice debugging man! I have been busy and haven't been able to follow up on this so thanks so much. Maybe we can merge this kernel hack in for now? This should let us merge in the two Dockerfile approach you are using in your fork right now?

@pdsouza
Copy link
Member Author

pdsouza commented Mar 20, 2020

@utzcoz's two Dockerfile solution for buster has been merged to upgrade-to-buster branch in #20.

Please help test buster containers on your devices (@luka177)!

@luka177
Copy link

luka177 commented Mar 21, 2020

OK guys, I will test building on LOS 16 soon.

@luka177
Copy link

luka177 commented Mar 21, 2020

You just created a Debian buster armhf (20200321_05:24) container.

To enable SSH, run: apt install openssh-server
No default root or user password are set by LXC.
[ DEBIAN ] building maru debpkg...
fakeroot dpkg-deb --build debpkg/
dpkg-deb: building package 'maru' in 'debpkg.deb'.
mv debpkg.deb maru_0.1-1_all.deb
[ DEBIAN ] configuring rootfs...
chroot: failed to run command 'bash': No such file or directory
[*] Cleaning up...
lxc-destroy: buster-container1: tools/lxc_destroy.c: main: 271 Destroyed container buster-container1
hello@Strangers-In-My-Network:~/cedric/android/p/vendor/maruos/blueprints$ 

Hm, I'm facing this problem.

@utzcoz
Copy link
Member

utzcoz commented Mar 23, 2020

@luka177 Do you use the build script with docker in the blueprints directory?

@luka177
Copy link

luka177 commented Mar 23, 2020

yes

@utzcoz
Copy link
Member

utzcoz commented Mar 23, 2020

@luka177 Maybe you can install qemu-user-static on your work machine, following "Adding notes about qemu dependencies", and try to build with docker again.
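
The suggestion above is about running the container's ARM binaries on the build machine through qemu-user-static. As a rough sketch, the following checks which architecture a rootfs binary targets and which qemu-user-static interpreter it would need. The helper names elf_machine and qemu_binary_for are hypothetical, and the code assumes a little-endian ELF file, as on armhf and arm64 Debian.

```shell
#!/bin/sh
# Sketch: work out which qemu-user-static interpreter a foreign binary needs.
# elf_machine and qemu_binary_for are hypothetical helper names.
# Assumption: the file is a little-endian ELF binary.

# Print the ELF e_machine value of a file (16-bit field at byte offset 18
# of the ELF header).
elf_machine() {
    bytes=$(od -An -tu1 -j18 -N2 "$1") || return 1
    set -- $bytes
    [ $# -eq 2 ] || return 1
    echo $(( $1 + $2 * 256 ))
}

# Map an e_machine value to the matching qemu-user-static binary name.
qemu_binary_for() {
    case "$1" in
        40)  echo "qemu-arm-static" ;;      # EM_ARM, i.e. armhf
        183) echo "qemu-aarch64-static" ;;  # EM_AARCH64, i.e. arm64
        *)   echo "unknown" ;;
    esac
}

# Usage: sh elf-arch.sh path/to/rootfs/bin/bash
if [ $# -ge 1 ]; then
    qemu_binary_for "$(elf_machine "$1")"
fi
```

If the reported interpreter is not installed on the build machine, that would be consistent with chroot failing to run bash inside the freshly created rootfs.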

@pintaf
Copy link
Member

pintaf commented Aug 2, 2020

Hi there. Little update.
I've been able to build buster arm64.
I integrated it inside my maru-0.7 build. It works !
But there are some issues.
lxc-start works, but it is not possible to manage the desktop from the settings (the official settings, not the external one from utzcoz).
It is impossible to start it from there, and once the desktop has been started the settings do not reflect the correct state (they still say it's off).
Also, when we press the phone's power button to wake it, the desktop catches the event and shuts down.

Any idea on where I should focus on this power button issue?

@pintaf
Copy link
Member

pintaf commented Aug 2, 2020

I found some other weird issues, like the maru desktop being displayed on the phone screen.

@utzcoz
Copy link
Member

utzcoz commented Aug 28, 2020

Hi @pdsouza, can we plan to merge the buster build into the main branch? @pintaf built it and ran it on their maru-0.7 port. There has been no response from other maintainers, so maybe we can build two versions, one with stretch and one with buster, and provide both to users for testing.

@pdsouza
Copy link
Member Author

pdsouza commented Sep 4, 2020

@pintaf Thanks for the feedback! As @utzcoz mentioned, I will be merging this to master soon.
@utzcoz Yes, sounds good. Thank you for reminding me about this! I am going to do some much needed maintenance this weekend on maru-0.6 (I need to merge in some updates from LineageOS that are breaking our build) and I will also take care of merging buster building as well.

And thanks everyone for your patience on this taking forever, I have been swamped with other work lately. But I should be able to get everything fixed up this weekend.

@pdsouza
Copy link
Member Author

pdsouza commented Sep 9, 2020

Alright I've merged in @utzcoz's dual Dockerfile approach to master. I just built and ran an armhf buster container on my Nexus 5 and it appears that things are still working normally like when I tested it last time. The next official release will include buster containers by default. We can always revert back to stretch if needed.

If there are no other concerns I will close this issue in a few days.

@pdsouza
Copy link
Member Author

pdsouza commented Sep 9, 2020

Looks like Travis CI still fails to build arm64 buster containers: https://travis-ci.org/github/maruos/blueprints/jobs/725457381#L4377.

They build fine on my desktop though.

@utzcoz
Copy link
Member

utzcoz commented Sep 9, 2020

@pdsouza Looks like it is a qemu emulation problem on Travis. Maybe we can try multiarch.

@utzcoz
Copy link
Member

utzcoz commented Sep 9, 2020

A very strange phenomenon: if I run the arm64 build first, it succeeds, but the armhf build then fails. @pdsouza I created a new PR to fix it by splitting the builds across two Travis instances.

@pdsouza
Copy link
Member Author

pdsouza commented Sep 9, 2020

@utzcoz Thank you for looking into this so quickly! I also noticed that if I build the other way around, i.e. arm64 first and then armhf, so that binfmt_misc is first mounted with the 64-bit qemu-user-static and the Dockerfile.armhf build runs afterwards, I get the same curl error I was getting before. I have a feeling that we must unmount and remount binfmt_misc between builds of different architectures. Maybe the kernel is caching the qemu-user-static binary file descriptor between builds, and it continues to use the first binary loaded somehow. This is just a guess though, I haven't tried remounting yet.
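
One way to look into the guess above is to inspect what binfmt_misc has registered between the two builds. The sketch below assumes binfmt_misc is mounted at /proc/sys/fs/binfmt_misc and that the entries are named qemu-arm and qemu-aarch64 (the names used by Debian's qemu-user-static package). The helper names are hypothetical.

```shell
#!/bin/sh
# Sketch: inspect binfmt_misc registrations to see which interpreter and
# flags the kernel currently uses for each ARM architecture.
# Assumptions: mount point and entry names as described in the lead-in;
# binfmt_interpreter, binfmt_flags and binfmt_is_fixed are hypothetical.

BINFMT_DIR=${BINFMT_DIR:-/proc/sys/fs/binfmt_misc}

# Print the interpreter path recorded in an entry file.
binfmt_interpreter() {
    sed -n 's/^interpreter //p' "$1"
}

# Print the flags recorded in an entry file (empty if none are set).
binfmt_flags() {
    sed -n 's/^flags: *//p' "$1"
}

# Succeed if the entry carries the F ("fix binary") flag, which makes the
# kernel open the interpreter once, at registration time.
binfmt_is_fixed() {
    case "$(binfmt_flags "$1")" in
        *F*) return 0 ;;
        *)   return 1 ;;
    esac
}

for name in qemu-arm qemu-aarch64; do
    entry="$BINFMT_DIR/$name"
    if [ -r "$entry" ]; then
        echo "$name: $(binfmt_interpreter "$entry") flags=$(binfmt_flags "$entry")"
    else
        echo "$name: not registered"
    fi
done
```

Running this before and after each architecture's build would show whether the second build still sees the interpreter registered by the first. According to the kernel's binfmt_misc documentation, writing -1 to an entry file removes that registration, which may be a lighter way to test the theory than unmounting.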

Either way, using separate travis instances is a good solution!

@github-actions
Copy link

github-actions bot commented Nov 9, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
