
add armhf/raspberry pi example stack #64

Closed
wants to merge 13 commits

Conversation

gesellix

This is a working example app for the armhf platform, including docker stack support for a cluster of Raspberry Pis. I tried to stick to the "official" armhf/ images on the Docker Hub.

Currently, some images are pulled from my private Docker Hub repo (gesellix/sample-*) so that the docker stack works out of the box. I guess it would be nice to provide those via the official dockersamples/ repo, but that can obviously be fixed later.

Please note:

  • The .NET worker hasn't been converted yet.
  • I changed the Java worker build to a two-step process (1. build the .jar, 2. wrap it in a JRE image). The upside is that this shows how to split build and runtime images; the downside is that multiple steps are probably harder to grasp for newbies (see the sketch below).
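A rough sketch of the two steps (the runtime base image, the jar name, and the docker cp glue are assumptions for illustration, not necessarily the exact files in this PR):

# Dockerfile.build (sketch): builds the .jar with Maven
FROM armhf/maven:jdk-8
WORKDIR /code
COPY . /code
RUN mvn package -DskipTests

# Dockerfile.run (sketch): wraps the prebuilt .jar in a JRE-only image
FROM armhf/openjdk:8-jre
COPY worker.jar /worker.jar
CMD ["java", "-jar", "/worker.jar"]

# glue between the two steps (sketch): extract the jar from the builder image
# docker build -t worker-builder -f Dockerfile.build .
# docker create --name extract worker-builder
# docker cp extract:/code/target/worker.jar ./worker.jar
# docker rm extract
# docker build -t worker -f Dockerfile.run .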

- back-tier

db:
  image: postgres:9.5


If someone cloned this from GitHub and ran docker-compose up, it would complain about not finding the images. Maybe it could have a build parameter here instead?
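For example (a sketch only; the service name and path are assumptions), the compose file could use a local build context instead of the private image:

version: "3"
services:
  vote:
    build: ./vote                  # build locally instead of pulling gesellix/sample-*
    # image: <published image>     # alternative: keep referencing a prebuilt image

Note that docker stack deploy ignores the build key, so a published image would still be needed for the swarm/stack case.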

gesellix (Author)

Would you prefer a build or an existing public image on Docker Hub? I'd prefer a public image, but that's not a hard preference.
What's the best way to make such examples appealing to people? In my experience an existing image is less likely to break over time than a build that depends on some package manager reaching a repository mirror.


db:
  image: gesellix/postgres:9.5-rpi
  volumes:


We should ping @tianon to see if there is a reason why we aren't supporting postgres from the armhf repository already. We would need to have the images set to an official account to merge.


armhf doesn't appear to be supported by upstream in their official APT repos: http://apt.postgresql.org/pub/repos/apt/pool/main/p/postgresql-9.6/

We could probably get the Alpine variants building, though.

gesellix (Author)

I'll have to update that Dockerfile - the gosu stuff isn't actually used if I remember correctly.


Check back with @tianon; I think I saw some activity on this over on the repo he manages.


Yeah, I tried getting postgres up and going on armhf, but as I mentioned, upstream's repos don't support it on Debian, so I was planning to get at least the Alpine variants up. With Alpine 3.5 on armhf, though, I'm running into https://bugs.alpinelinux.org/issues/6372 again (so I can't even get the base image built properly). 😞


@tianon - that bug was supposed to have been fixed and is marked as fixed - can you reopen it so we can discuss there? I don't know if it's a new problem or a regression from an old one.

@@ -0,0 +1,11 @@
FROM armhf/maven:jdk-8


Splitting out the jar via the builder image is neat, but I think this either needs to be applied to the x86_64 images too or reverted back to the original format.

gesellix (Author)

I see the inconsistency and would also like a common pattern across architectures/platforms. But which pattern is good enough as an example without distracting people from "just starting a stack"? I personally see the combined builder+run image as a bad example, because we then have to explain that it's not good to run such an image in production. A Makefile could help document the necessary steps.
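Such a Makefile could be as small as this (a sketch; the file, image, and jar names are assumptions):

# Makefile sketch documenting the two-step Java worker build
.PHONY: all jar image

all: jar image

jar:
	docker build -t worker-builder -f Dockerfile.build .
	docker create --name worker-extract worker-builder
	docker cp worker-extract:/code/target/worker.jar ./worker.jar
	docker rm worker-extract

image:
	docker build -t worker -f Dockerfile.run .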


We need consistency between the two examples; the ARM files should just be a port rather than something that needs to be maintained separately.

@alexellis

@gesellix hi, are you also planning on creating a .NET Core worker image to complete the stack?

@alexellis

@ManoMarks I'd also recommend reducing the parallelism factor to 1 for all the containers on the Raspberry Pi, as sketched below. I am not even sure all the containers plus the .NET Core code can fit into the RAM of every Raspberry Pi with a full OS and Docker.

With the initial test (docker stack deploy) I didn't see the votes being incremented - probably because of the worker?

Btw, the worker's docker logs say:

Waiting for redis
Waiting for redis
Waiting for redis
Waiting for redis
Waiting for redis
Waiting for redis
Waiting for redis
Waiting for redis

I can test across a multi-node cluster once we've iterated on a few of the comments.
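Reducing the parallelism in the stack file could look like this (a sketch; the deploy settings are assumptions, only the service names follow the voting app):

services:
  vote:
    deploy:
      replicas: 1
  result:
    deploy:
      replicas: 1
  worker:
    deploy:
      replicas: 1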

@alexellis

We should also call these files armhf in the same way the docker/docker repo does. Someone will probably help us out with an aarch64 / arm64 version in the future. Maybe @vielmetti? :-)

@vielmetti

It strikes me that this is a good argument for multiarch-style images, so that docker-compose works independently of architecture, and the porting needed is only in building the images, not in stitching them together.

Is there a query you can run on docker hub that returns only multiarch manifests?
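For a single tag, one way to check whether it is a multiarch manifest list is to ask the registry v2 API for the manifest list media type (a sketch; requires curl and jq, and library/postgres is only an example image):

TOKEN=$(curl -s "https://auth.docker.io/token?service=registry.docker.io&scope=repository:library/postgres:pull" | jq -r .token)
curl -s -H "Authorization: Bearer $TOKEN" \
     -H "Accept: application/vnd.docker.distribution.manifest.list.v2+json" \
     https://registry-1.docker.io/v2/library/postgres/manifests/latest | jq -r .mediaType
# a multiarch image reports application/vnd.docker.distribution.manifest.list.v2+json here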

@gesellix (Author)

@vielmetti as mentioned in my initial thoughts (#63) and as seen in the popular Portainer project, I'd also like an easier way of producing multiarch images. As far as I understand, we need to build native images on their respective platforms before combining them into a multiarch "view", right? I'm not sure whether this is currently possible in an automated way.

@gesellix (Author)

@alexellis I'll check whether the .NET worker is possible. I haven't tried it yet.

I'll also check what happened to the Java worker waiting for redis. I swear it worked on my machine/cluster... but not anymore.

@vielmetti

@gesellix - I was thinking of Portainer too as a model for this as far as multiarch goes.

@ManoMarks (Contributor)

So where are we on this? @alexellis do we have an LGTM?

@gesellix (Author)

@ManoMarks I'm working on some of the reviewers' suggestions.
The most notable blocker seems to be that the referenced images should be available in an official repo. Did I get that right?

@alexellis

alexellis commented Jan 31, 2017

@ManoMarks several comments to work through. We need to iterate on it.

@gesellix (Author)

gesellix commented Feb 1, 2017

Update:

  • Renamed tags and file infixes to armhf
  • The docker-compose file uses local build contexts
  • The Java worker Dockerfile is now available as a combined builder+runtime file. I'll delete the other two files (builder, runtime) once we settle on this approach?

Not addressed/fixed yet:

@vielmetti

@tianon - can you open a new bug with a stack trace from the core dump to replace
https://bugs.alpinelinux.org/issues/6372 ? Timo suggests it's a different issue than the one I opened in October, which was resolved.

@tianon

tianon commented Feb 1, 2017 via email

@ManoMarks (Contributor)

Is there another relational database that we can use to replace Postgres on Alpine? Alternatively, can we use a non-Alpine version of Postgres?

@vielmetti

@ManoMarks - there is https://github.com/kiasaki/docker-alpine-postgres which is a Dockerized Alpine Postgres - you might find that useful.

However, the bug 6795 reported above is now marked as fixed, so this might be a sufficiently solved problem to make progress.

@ManoMarks (Contributor)

@alexellis, @gesellix is this moving forward?

@gesellix (Author)

@ManoMarks yes, I'm still working on it. I caught the flu, though, and I also stumbled over issues like this one: rsyslog/rsyslog#35
Sorry for the delay!

@ManoMarks (Contributor)

@gesellix No worries! I hope you feel better! I just wanted to make sure this wasn't getting lost.

@alexellis

alexellis commented Feb 19, 2017

Is there another relational database that we can use to replace PostGres on Alpine? Alternately, can we use a non-alpine version of Postgres?

Is there a postgres or mysql version in the Raspbian (aka Debian) repo(s)?

https://wiki.debian.org/PostgreSql

I guess whichever configuration is used, it should work consistently on armhf (RPi), arm64/v8 (Pine64/Packet) and regular x86_64, with only the base image needing to change. Anything beyond that will result in maintenance headaches.

I also wonder whether it's worth using local images for the stack, so that people following the labs don't have to use untrusted images.

@gesellix (Author)

local images for the stack

You mean without a central registry? Would that work with multiple nodes, then?

I would hope for official images to be available at some point in time. Could the dockersamples repo be considered a trusted repo?
Local images would keep bandwidth usage low, though. Hm.

@gesellix (Author)

gesellix commented Feb 26, 2017

Minor update: I'm struggling with the error below, which leads to 100% CPU usage of the kworker process. The error only seems to appear after deploying the example stack (not instantly, but after a few minutes), so maybe it's a Docker-on-Raspberry-Pi issue, or something triggered by one of the services. If any of you have an idea how to find the root cause, I'd be very glad :-)

/cc @alexellis @StefanScherer @DieterReuter

[ 1328.937557] BUG: using smp_processor_id() in preemptible [00000000] code: node/7528
[ 1329.030770] caller is debug_smp_processor_id+0x18/0x24
[ 1329.122947] CPU: 3 PID: 7528 Comm: node Not tainted 4.4.43-hypriotos-v7+ #1
[ 1329.214980] Hardware name: BCM2709
[ 1329.300230] [<800193e4>] (unwind_backtrace) from [<800149e0>] (show_stack+0x20/0x24)
[ 1329.390209] [<800149e0>] (show_stack) from [<8033613c>] (dump_stack+0xbc/0x108)
[ 1329.478209] [<8033613c>] (dump_stack) from [<80350c94>] (check_preemption_disabled+0x104/0x134)
[ 1329.565476] [<80350c94>] (check_preemption_disabled) from [<80350cdc>] (debug_smp_processor_id+0x18/0x24)
[ 1329.651826] [<80350cdc>] (debug_smp_processor_id) from [<7f3fd50c>] (ip_vs_schedule+0x180/0x738 [ip_vs])
[ 1329.739200] [<7f3fd50c>] (ip_vs_schedule [ip_vs]) from [<7f40b590>] (tcp_conn_schedule+0x118/0x22c [ip_vs])
[ 1329.827818] [<7f40b590>] (tcp_conn_schedule [ip_vs]) from [<7f3fef78>] (ip_vs_in.part.2.constprop.9+0x6e4/0x75c [ip_vs])
[ 1329.918360] [<7f3fef78>] (ip_vs_in.part.2.constprop.9 [ip_vs]) from [<7f3ff074>] (ip_vs_local_request4+0x40/0x44 [ip_vs])
[ 1330.010632] [<7f3ff074>] (ip_vs_local_request4 [ip_vs]) from [<8050dca8>] (nf_iterate+0x80/0x90)
[ 1330.104027] [<8050dca8>] (nf_iterate) from [<8050dd38>] (nf_hook_slow+0x80/0xec)
[ 1330.194149] [<8050dd38>] (nf_hook_slow) from [<8051a654>] (__ip_local_out+0xb4/0xc0)
[ 1330.284487] [<8051a654>] (__ip_local_out) from [<8051a684>] (ip_local_out+0x24/0x4c)
[ 1330.377252] [<8051a684>] (ip_local_out) from [<8051a9ac>] (ip_queue_xmit+0x144/0x3c0)
[ 1330.471462] [<8051a9ac>] (ip_queue_xmit) from [<80532dd8>] (tcp_transmit_skb+0x4d0/0x918)
[ 1330.566692] [<80532dd8>] (tcp_transmit_skb) from [<80534a14>] (tcp_connect+0x584/0x798)
[ 1330.664182] [<80534a14>] (tcp_connect) from [<80537b90>] (tcp_v4_connect+0x2fc/0x468)
[ 1330.760989] [<80537b90>] (tcp_v4_connect) from [<8054fab0>] (__inet_stream_connect+0x1a4/0x330)
[ 1330.859171] [<8054fab0>] (__inet_stream_connect) from [<8054fc80>] (inet_stream_connect+0x44/0x58)
[ 1330.960026] [<8054fc80>] (inet_stream_connect) from [<804bdec8>] (SyS_connect+0x74/0xa4)
[ 1331.059682] [<804bdec8>] (SyS_connect) from [<8000fc20>] (ret_fast_syscall+0x0/0x1c)

[... repeats endlessly...]

@gesellix (Author)

hmmm, looks similar: moby/moby#27833

@StefanScherer (Contributor)

@gesellix Regarding building multiarch images: I've created a Travis matrix build with a deploy script that waits for the other builds and then drafts and pushes the multiarch image.
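One way to do that last "draft and push the multiarch image" step is estesp's manifest-tool, which combines already-pushed per-arch tags into a single tag from a small spec file. A sketch (the image names are placeholders, not the ones used here):

image: example/worker:latest
manifests:
  - image: example/worker:amd64
    platform:
      architecture: amd64
      os: linux
  - image: example/worker:armhf
    platform:
      architecture: arm
      os: linux
      variant: v7
# pushed with: manifest-tool push from-spec multiarch.yml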

@gesellix (Author)

@StefanScherer Nice, thanks for the link, I'll give it a try!

@gesellix (Author)

gesellix commented Feb 28, 2017

After the findings in moby/moby#27833 (comment) I tried to deploy the stack on Raspbian instead of Hypriot, but now I hit this error:

"starting container failed: subnet sandbox join failed for "10.0.1.0/24": error creating vxlan interface: operation not supported"

Seems to be a known issue:

edit: ... and should be fixed with raspberrypi/linux#1614

$ docker info
Containers: 0
 Running: 0
 Paused: 0
 Stopped: 0
Images: 3
Server Version: 1.13.1
Storage Driver: overlay2
 Backing Filesystem: extfs
 Supports d_type: true
 Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins: 
 Volume: local
 Network: bridge host ipvlan macvlan null overlay
Swarm: active
 NodeID: ffvx1pznkxxx0345t5128xmbm
 Is Manager: true
 ClusterID: 09k04xtm1rq8lgbp5vsx59kjq
 Managers: 1
 Nodes: 5
 Orchestration:
  Task History Retention Limit: 5
 Raft:
  Snapshot Interval: 10000
  Number of Old Snapshots to Retain: 0
  Heartbeat Tick: 1
  Election Tick: 3
 Dispatcher:
  Heartbeat Period: 5 seconds
 CA Configuration:
  Expiry Duration: 3 months
 Node Address: 192.168.178.39
 Manager Addresses:
  192.168.178.39:2377
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: aa8187dbd3b7ad67d8e5e3a15115d3eef43a7ed1
runc version: 9df8b306d01f59d3a8029be411de015b7304dd8f
init version: 949e6fa
Kernel Version: 4.4.38-v7+
Operating System: Raspbian GNU/Linux 8 (jessie)
OSType: linux
Architecture: armv7l
CPUs: 4
Total Memory: 925.5 MiB
Name: pi1
ID: HFBQ:H5X6:VW6K:MNLM:JYJR:KLB5:PJXF:Y2L6:5LSV:64KE:XSOO:FGCF
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
WARNING: No swap limit support
WARNING: No kernel memory limit support
WARNING: No cpu cfs quota support
WARNING: No cpu cfs period support
WARNING: No cpuset support
Labels:
 os=linux
 arch=arm
Experimental: true
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false

@gesellix (Author)

Using Hypriot 1.4.0 with a recent kernel works for @StefanScherer (moby/moby#27833 (comment)) and for me, without any of the mentioned kernel issues. 🎉

It seems some votes still aren't handled by the worker. I'll try to find out what happens in those cases. I suspect this is an issue with the example stack in general, not specific to the armhf stack. Multiple redis containers with only a single worker (watching a single redis instance) might be the reason...
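If that's the cause, pinning redis to a single replica in the stack file would be a quick way to verify it (a sketch):

services:
  redis:
    deploy:
      replicas: 1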

The images still need to be pushed to an official repo. I'm happy to help with that, but any pointers on how to start that process would be nice. I guess I'll have to talk to @tianon?

@gesellix (Author)

Now I got the error messages again (via dmesg):

[ 2102.873233] BUG: using smp_processor_id() in preemptible [00000000] code: node/3348
[ 2102.873250] caller is debug_smp_processor_id+0x18/0x24
[ 2102.873269] CPU: 2 PID: 3348 Comm: node Not tainted 4.4.50-hypriotos-v7+ #1
[ 2102.873279] Hardware name: BCM2709
[ 2102.873306] [<80019468>] (unwind_backtrace) from [<80014a14>] (show_stack+0x20/0x24)
[ 2102.873331] [<80014a14>] (show_stack) from [<803362dc>] (dump_stack+0xbc/0x108)
[ 2102.873358] [<803362dc>] (dump_stack) from [<80350e34>] (check_preemption_disabled+0x104/0x134)
[ 2102.873381] [<80350e34>] (check_preemption_disabled) from [<80350e7c>] (debug_smp_processor_id+0x18/0x24)
[ 2102.873443] [<80350e7c>] (debug_smp_processor_id) from [<7f40bb90>] (ip_vs_in.part.2.constprop.9+0x2fc/0x75c [ip_vs])
[ 2102.873573] [<7f40bb90>] (ip_vs_in.part.2.constprop.9 [ip_vs]) from [<7f40c074>] (ip_vs_local_request4+0x40/0x44 [ip_vs])
[ 2102.873656] [<7f40c074>] (ip_vs_local_request4 [ip_vs]) from [<8050e014>] (nf_iterate+0x80/0x90)
[ 2102.873699] [<8050e014>] (nf_iterate) from [<8050e0a4>] (nf_hook_slow+0x80/0xec)
[ 2102.873735] [<8050e0a4>] (nf_hook_slow) from [<8051a9d8>] (__ip_local_out+0xb4/0xc0)
[ 2102.873777] [<8051a9d8>] (__ip_local_out) from [<8051aa08>] (ip_local_out+0x24/0x4c)
[ 2102.873821] [<8051aa08>] (ip_local_out) from [<8051ad30>] (ip_queue_xmit+0x144/0x3c0)
[ 2102.873854] [<8051ad30>] (ip_queue_xmit) from [<80533178>] (tcp_transmit_skb+0x4d0/0x918)
[ 2102.873890] [<80533178>] (tcp_transmit_skb) from [<80533730>] (tcp_write_xmit+0x170/0xe9c)
[ 2102.873924] [<80533730>] (tcp_write_xmit) from [<80534774>] (__tcp_push_pending_frames+0x44/0xb0)
[ 2102.873946] [<80534774>] (__tcp_push_pending_frames) from [<80522f78>] (tcp_push+0x130/0x158)
[ 2102.873966] [<80522f78>] (tcp_push) from [<805265c8>] (tcp_sendmsg+0xc8/0xa6c)
[ 2102.873987] [<805265c8>] (tcp_sendmsg) from [<80550170>] (inet_sendmsg+0xa8/0xd0)
[ 2102.874010] [<80550170>] (inet_sendmsg) from [<804bd220>] (sock_sendmsg+0x24/0x34)
[ 2102.874033] [<804bd220>] (sock_sendmsg) from [<804bd2c4>] (sock_write_iter+0x94/0xc4)
[ 2102.874056] [<804bd2c4>] (sock_write_iter) from [<8016ae18>] (__vfs_write+0xb8/0xe8)
[ 2102.874077] [<8016ae18>] (__vfs_write) from [<8016b658>] (vfs_write+0xa0/0x1a8)
[ 2102.874096] [<8016b658>] (vfs_write) from [<8016bf78>] (SyS_write+0x4c/0xa0)
[ 2102.874122] [<8016bf78>] (SyS_write) from [<8000fc40>] (ret_fast_syscall+0x0/0x1c)

@gesellix (Author)

The current Raspbian (2017-03-02-raspbian-jessie-lite) works without Kernel errors.

Kernel Version: 4.4.50-v7+
Operating System: Raspbian GNU/Linux 8 (jessie)

 Version:      17.03.0-ce
 API version:  1.26 (minimum version 1.12)
 Go version:   go1.7.5
 Git commit:   60ccb22
 Built:        Thu Feb 23 11:32:23 2017
 OS/Arch:      linux/arm
 Experimental: true

@gesellix (Author)

I created an issue for Hypriot: hypriot/image-builder-rpi#166
Let's focus on Raspbian and the other steps to move forward with this PR.

@ManoMarks (Contributor)

Any update on this? Should I close this PR?

@ManoMarks (Contributor)

I'm going to close this PR. Feel free to re-open if you want to try again.

@ManoMarks closed this May 4, 2017
Olivety pushed a commit to Olivety/example-voting-app that referenced this pull request Nov 1, 2019