3.12.7 problems with GPU_MEM=16 in config.txt #503

Closed
amtssp opened this Issue · 21 comments

4 participants

@amtssp

Hi

3.12.6 and 3.12.7 kernels won't fully boot if GPU_MEM in config.txt is reduced to 16 MB.

It may in fact be booting, but only the raspberry image is shown in the top left corner of the screen, and no keyboard input is echoed.

The LEDs seem to indicate that the kernel has booted, but I'm unable to produce any text on the screen.

@popcornmix
Owner

Does it work for you with 3.10.y kernel?

@msperl

I believe I am seeing a similar issue with 3.12 (a93bfa0) and 3.13 (6928683). In my case it is not necessarily related to GPU_MEM=16, but rather to these settings:
config.txt:

gpu_mem_256=112
gpu_mem_512=368
cma_lwm=16
cma_hwm=32
cma_offline_start=16

cmdline.txt:
coherent_pool=6M smsc95xx.turbo_mode=N dwc_otg.lpm_enable=0 console=ttyAMA0,115200 kgdboc=ttyAMA0,115200 console=tty1 root=/dev/mmcblk0p2 rootfstype=ext4 elevator=deadline rootwait

as recommended in: http://www.raspberrypi.org/phpBB3/viewtopic.php?f=29&t=19334&start=125
then I get:

[    0.018694] ------------[ cut here ]------------
[    0.018771] WARNING: CPU: 0 PID: 1 at mm/page_alloc.c:2483 __alloc_pages_nodemask+0x1ac/0x89c()
[    0.018819] Modules linked in:
[    0.018859] CPU: 0 PID: 1 Comm: swapper Not tainted 3.12.9+ #20
[    0.018945] [<c0013fc0>] (unwind_backtrace+0x0/0xf0) from [<c0011264>] (show_stack+0x10/0x14)
[    0.019022] [<c0011264>] (show_stack+0x10/0x14) from [<c001ed1c>] (warn_slowpath_common+0x68/0x88)
[    0.019089] [<c001ed1c>] (warn_slowpath_common+0x68/0x88) from [<c001ed58>] (warn_slowpath_null+0x1c/0x24)
[    0.019160] [<c001ed58>] (warn_slowpath_null+0x1c/0x24) from [<c00a02cc>] (__alloc_pages_nodemask+0x1ac/0x89c)
[    0.019237] [<c00a02cc>] (__alloc_pages_nodemask+0x1ac/0x89c) from [<c00172d0>] (__dma_alloc_buffer.isra.20+0x2c/0xb8)
[    0.019308] [<c00172d0>] (__dma_alloc_buffer.isra.20+0x2c/0xb8) from [<c0017370>] (__alloc_remap_buffer.isra.23+0x14/0xa0)
[    0.019392] [<c0017370>] (__alloc_remap_buffer.isra.23+0x14/0xa0) from [<c0597514>] (atomic_pool_init+0x6c/0x10c)
[    0.019464] [<c0597514>] (atomic_pool_init+0x6c/0x10c) from [<c000851c>] (do_one_initcall+0x40/0x180)
[    0.019530] [<c000851c>] (do_one_initcall+0x40/0x180) from [<c0593b34>] (kernel_init_freeable+0xe8/0x1b4)
[    0.019601] [<c0593b34>] (kernel_init_freeable+0xe8/0x1b4) from [<c04121c4>] (kernel_init+0x8/0xe4)
[    0.019667] [<c04121c4>] (kernel_init+0x8/0xe4) from [<c000e158>] (ret_from_fork+0x14/0x3c)
[    0.019766] ---[ end trace da227214a82491b7 ]---
[    0.019808] DMA: failed to allocate 6144 KiB pool for atomic coherent allocation
[    0.020494] cpuidle: using governor ladder
...
[    1.525473] ------------[ cut here ]------------
[    1.531277] WARNING: CPU: 0 PID: 1 at arch/arm/mm/dma-mapping.c:491 __dma_alloc+0x20c/0x254()
[    1.542012] coherent pool not initialised!
[    1.547254] Modules linked in:
[    1.551445] CPU: 0 PID: 1 Comm: swapper Tainted: G        W    3.12.9+ #20
[    1.559479] [<c0013fc0>] (unwind_backtrace+0x0/0xf0) from [<c0011264>] (show_stack+0x10/0x14)
[    1.570309] [<c0011264>] (show_stack+0x10/0x14) from [<c001ed1c>] (warn_slowpath_common+0x68/0x88)
[    1.581718] [<c001ed1c>] (warn_slowpath_common+0x68/0x88) from [<c001edd0>] (warn_slowpath_fmt+0x30/0x40)
[    1.593911] [<c001edd0>] (warn_slowpath_fmt+0x30/0x40) from [<c0017608>] (__dma_alloc+0x20c/0x254)
[    1.605593] [<c0017608>] (__dma_alloc+0x20c/0x254) from [<c0017770>] (arm_dma_alloc+0x80/0x98)
[    1.617061] [<c0017770>] (arm_dma_alloc+0x80/0x98) from [<c05a8b0c>] (vchiq_platform_init+0x3c/0x1fc)
[    1.629342] [<c05a8b0c>] (vchiq_platform_init+0x3c/0x1fc) from [<c05a8a10>] (vchiq_init+0xe0/0x1a0)
[    1.641711] [<c05a8a10>] (vchiq_init+0xe0/0x1a0) from [<c000851c>] (do_one_initcall+0x40/0x180)
[    1.653793] [<c000851c>] (do_one_initcall+0x40/0x180) from [<c0593b34>] (kernel_init_freeable+0xe8/0x1b4)
[    1.666756] [<c0593b34>] (kernel_init_freeable+0xe8/0x1b4) from [<c04121c4>] (kernel_init+0x8/0xe4)
[    1.679325] [<c04121c4>] (kernel_init+0x8/0xe4) from [<c000e158>] (ret_from_fork+0x14/0x3c)
[    1.691197] ---[ end trace da227214a82491b8 ]---
[    1.697545] vchiq: Unable to allocate channel memory
[    1.704485] vchiq: could not load vchiq
...
several more such exceptions...

So it may be related to this in some respect, with the result that DMA memory cannot be allocated.
Maybe it is related to the firmware?

But then again, the same settings work with a 3.11 (8f768c5) kernel.

One surprising thing though: access to every other device on the USB bus works fine (checked via the serial console). Only the USB NIC code seems to fail...

Martin

@msperl

P.S.: only removing coherent_pool=6M from cmdline.txt
and

cma_lwm=16
cma_hwm=32
cma_offline_start=16

from config.txt makes the device boot without any of those "exceptions/traces", and only then does the network work. Otherwise no packets are received. The TX counters go up for DHCP, but I am not sure whether the packets really get on the wire...
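
The failed 6144 KiB pool in the first trace matches coherent_pool=6M. One possible reading, which nobody in this thread confirms, is that the pool request ends up in the ordinary page allocator (as the __alloc_pages_nodemask frame in the trace suggests), where a 6 MiB block would be too large. The sketch below only does the arithmetic. The 4 KiB page size and MAX_ORDER = 11 are assumed kernel defaults, not values taken from the logs.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages
MAX_ORDER = 11    # assumption: default; the largest block is order MAX_ORDER - 1


def get_order(size_bytes):
    """Smallest order n such that PAGE_SIZE * 2**n covers size_bytes."""
    pages = -(-size_bytes // PAGE_SIZE)  # ceiling division
    order = 0
    while (1 << order) < pages:
        order += 1
    return order


def fits_page_allocator(size_bytes):
    """True if a single contiguous block of this size is within MAX_ORDER."""
    return get_order(size_bytes) < MAX_ORDER


pool_bytes = 6144 * 1024  # "failed to allocate 6144 KiB pool"
print(get_order(pool_bytes))                 # 11
print(fits_page_allocator(pool_bytes))       # False
print(fits_page_allocator(4 * 1024 * 1024))  # True (order 10)
```

Under these assumptions a 6 MiB request needs an order-11 block, one step above the largest block the page allocator hands out, while a pool of 4 MiB or less would still fit.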

@msperl

And if you look at the very early boot messages, there is a difference between 3.11:

[    0.000000] Booting Linux on physical CPU 0x0
[    0.000000] Initializing cgroup subsys cpu
[    0.000000] Initializing cgroup subsys cpuacct
[    0.000000] Linux version 3.11.10+ (root@raspberrypi) (gcc version 4.6.3 (Debian 4.6.3-14+rpi1) ) #21 PREEMPT Sat Feb 1 19:53:59 UTC 2014
[    0.000000] CPU: ARMv6-compatible processor [410fb767] revision 7 (ARMv7), cr=00c5387d
[    0.000000] CPU: PIPT / VIPT nonaliasing data cache, VIPT nonaliasing instruction cache
[    0.000000] Machine: BCM2708
[    0.000000] early_vc_cma_mem(0/0x14c00000@0xa000000)
[    0.000000]  -> initial 0, size 14c00000, base a000000<6>[    0.000000] cma: CMA: reserved 332 MiB at 0a000000
[    0.000000] cma: CMA: reserved 16 MiB at 08000000
[    0.000000] Memory policy: ECC disabled, Data cache writeback
[    0.000000] On node 0 totalpages: 121856
[    0.000000] free_area_init_node: node 0, pgdat c05d7b20, node_mem_map c06830$
[    0.000000]   Normal zone: 984 pages used for memmap
[    0.000000]   Normal zone: 0 pages reserved
[    0.000000]   Normal zone: 121856 pages, LIFO batch:31
[    0.000000] pcpu-alloc: s0 r0 d32768 u32768 alloc=1*32768
[    0.000000] pcpu-alloc: [0] 0
...

and 3.13:

[    0.000000] Booting Linux on physical CPU 0x0
[    0.000000] Initializing cgroup subsys cpu
[    0.000000] Initializing cgroup subsys cpuacct
[    0.000000] Linux version 3.13.0+ (root@raspberrypi) (gcc version 4.6.3 (Deb$
[    0.000000] CPU: ARMv6-compatible processor [410fb767] revision 7 (ARMv7), c$
[    0.000000] CPU: PIPT / VIPT nonaliasing data cache, VIPT nonaliasing instru$
[    0.000000] Machine: BCM2708
[    0.000000] early_vc_cma_mem(0/0x14c00000@0xa000000)
[    0.000000]  -> initial 0, size 14c00000, base a000000<3>[    0.000000] vc_cma: dma_declare_contiguous(14c00000,a000000) failed
[    0.000000] Memory policy: Data cache writeback
[    0.000000] On node 0 totalpages: 121856
[    0.000000] free_area_init_node: node 0, pgdat c05f86b4, node_mem_map c06a60$
[    0.000000]   Normal zone: 984 pages used for memmap
[    0.000000]   Normal zone: 0 pages reserved
[    0.000000]   Normal zone: 121856 pages, LIFO batch:31
[    0.000000] pcpu-alloc: s0 r0 d32768 u32768 alloc=1*32768
[    0.000000] pcpu-alloc: [0] 0

You can see the missing newline after the "-> initial ..." message, which causes the following line to be concatenated onto it (this should get fixed), and then the "vc_cma: dma_declare_contiguous(...) failed" error for the 3.13 kernel.
You can also see that the memory policy line has changed from "ECC disabled, Data cache writeback"
to "Data cache writeback"...

So something VERY early during the initialization fails already... (and it is the same error for 3.12)
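
The fused line and the CMA failure can also be spotted mechanically. A minimal sketch, assuming log lines shaped like the ones quoted above. split_fused and cma_declare_failed are hypothetical helper names, not part of any existing tool.

```python
import re

# A log-level marker such as <3> or <6> directly followed by a timestamp
# means the previous message was printed without its trailing newline.
MARKER = re.compile(r"<\d>(?=\[\s*\d+\.\d+\])")


def split_fused(line):
    """Split one dmesg line into the separate messages glued into it."""
    return [part for part in MARKER.sub("\n", line).split("\n") if part]


def cma_declare_failed(lines):
    """Return True if any message reports a failed dma_declare_contiguous()."""
    for line in lines:
        for message in split_fused(line):
            if "dma_declare_contiguous" in message and message.endswith("failed"):
                return True
    return False


# Line copied from the 3.13 boot log above.
SAMPLE_313 = (
    "[    0.000000]  -> initial 0, size 14c00000, base a000000"
    "<3>[    0.000000] vc_cma: dma_declare_contiguous(14c00000,a000000) failed"
)

for message in split_fused(SAMPLE_313):
    print(message)
print(cma_declare_failed([SAMPLE_313]))

# The size in the message is hexadecimal: 0x14c00000 bytes is 332 MiB,
# which matches the "reserved 332 MiB" line in the 3.11 log.
print(int("14c00000", 16) // (1 << 20))
```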

@amtssp

It is still not booting fully in 3.13.y kernels if GPU_MEM is reduced to 16.

3.10.y kernels are OK
It hangs with just the raspberry icon in the top left corner.

@popcornmix
Owner

I'm not seeing this. I just booted and:

pi@raspberrypi:~ $ uname -a
Linux raspberrypi 3.13.6+ #938 PREEMPT Fri Mar 7 17:36:24 GMT 2014 armv6l GNU/Linux
pi@raspberrypi:~ $ vcgencmd version
Mar  7 2014 16:43:19 
Copyright (c) 2012 Broadcom
version c96cb035fdc907d28db836bbb1606aea2a8e73d9 (clean) (release)
pi@raspberrypi:~ $ vcgencmd get_mem gpu
gpu=16M
pi@raspberrypi:~ $ vcgencmd get_mem arm
arm=496M
pi@raspberrypi:~ $ free
             total       used       free     shared    buffers     cached
Mem:        496756      53192     443564          0         20      28592
-/+ buffers/cache:      24580     472176
Swap:            0          0          0

Do you have any config.txt settings or cmdline.txt settings that are non-default?
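
For anyone comparing their own board against this output, the two get_mem lines can be checked against each other. A small sketch, assuming the gpu=16M / arm=496M format shown above and a 512 MB board (an assumption; the output only shows that the two values add up to 512 here). parse_mem and check_split are hypothetical names.

```python
import re


def parse_mem(line):
    """Parse a line such as 'gpu=16M' and return the size in MB."""
    match = re.fullmatch(r"\s*\w+=(\d+)M\s*", line)
    if match is None:
        raise ValueError("unexpected get_mem output: %r" % line)
    return int(match.group(1))


def check_split(gpu_line, arm_line, total_mb=512):
    """Return (gpu, arm, ok), where ok means the two add up to total_mb."""
    gpu = parse_mem(gpu_line)
    arm = parse_mem(arm_line)
    return gpu, arm, gpu + arm == total_mb


# Values copied from the session above.
print(check_split("gpu=16M", "arm=496M"))
```

If the reported gpu value differs from what config.txt asks for, the requested split was not applied.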

@msperl

See my comments about the CMA setup; hence the non-fixed memory split...

@popcornmix
Owner

CMA is not officially supported. I was asking @amtssp who I believe isn't using CMA.

@msperl

OK - that is something new, which you may want to document somewhere.
I found out about the options (which are of most interest for Model A + camera use cases) via: http://elinux.org/RPiconfig
I will add a comment there that it is not officially supported...

@popcornmix
Owner

@msperl
We tried getting it working, but it's never been reliable for me and so isn't enabled as standard.
You always seem to end up with a flood of kernel alloc failures under heavy load involving network (e.g. using midori). It doesn't get officially tested.

If someone who knows CMA well and understands what causes these alloc failures wants to help get it working, then we'd be interested in getting it working well.

@msperl

I tried it on mine with gpu_mem=16 and it booted with gpu_mem=128.
Only when I set gpu_mem=32 or higher did the configured value take effect after booting.

Here is an example with the GPU set to 31 MB:

root@raspberrypi:~# uptime
 18:23:23 up 4 min,  1 user,  load average: 0.02, 0.14, 0.08
root@raspberrypi:~# grep gpu_mem /boot/config.txt 
gpu_mem_256=31
gpu_mem_512=31
root@raspberrypi:~# free
             total       used       free     shared    buffers     cached
Mem:        382988      62236     320752          0      13864      26572
-/+ buffers/cache:      21800     361188
Swap:       262140          0     262140
root@raspberrypi:~# vcgencmd get_mem gpu
gpu=128M
root@raspberrypi:~# vcgencmd get_mem arm
arm=384M
root@raspberrypi:~# vcgencmd version
Jan 29 2014 14:58:39 
Copyright (c) 2012 Broadcom
version 0f547430c65eae8761de21ee72246bf0dc3bbf79 (clean) (release)

This could indicate that running with GPU_MEM<32M is no longer allowed with newer firmware versions, and that the issue @amtssp sees is related to a firmware version that still allowed it but triggered a bug in USB...

@popcornmix
Owner

@msperl but my log shows gpu_mem=16 is supported by latest firmware/kernel.
Any value less than 32 causes start_cd.elf and fixup_cd.dat to be used.
My guess is you have a missing/broken start_cd.elf, or you are overriding the start file in config.txt.

@msperl

The only thing that I override is kernel=... to use my own compiled kernel, independently of what rpi-update provides...

But disabling start_x=1 (which gets enabled with the camera) gets me back to 16MB...

@popcornmix
Owner

Yes of course. There are three start files, a cutdown one (start_cd.elf), a normal one (start.elf) and an extended one (start_x.elf).
You can't have the cutdown memory usage along with the extended feature set.
To use the camera you need at least gpu_mem=64M.
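
The rules stated in this thread can be summarised in a few lines. This is only a sketch of what is described here, not the firmware's actual selection code. pick_start_file and camera_ok are hypothetical names, and the precedence of start_x=1 over a small gpu_mem is an assumption based on the gpu=128M result reported above.

```python
def pick_start_file(gpu_mem_mb, start_x=False):
    """Return the start file implied by the rules described in this thread."""
    if start_x:
        # Extended feature set (camera). Assumption: this wins over a small
        # gpu_mem, since cutdown and extended cannot be combined.
        return "start_x.elf"
    if gpu_mem_mb < 32:
        # Any value below 32 selects the cutdown firmware.
        return "start_cd.elf"
    return "start.elf"


def camera_ok(gpu_mem_mb, start_x):
    """The camera needs the extended firmware and at least 64 MB for the GPU."""
    return start_x and gpu_mem_mb >= 64


for mem, sx in [(16, False), (31, True), (32, False), (64, True)]:
    print(mem, sx, pick_start_file(mem, sx), camera_ok(mem, sx))
```

The (31, True) case corresponds to the config.txt above: the cutdown file is not used, so the small memory split does not apply.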

@popcornmix popcornmix referenced this issue from a commit in raspberrypi/firmware
Dom Cobley firmware: audio_render: add support for float samples and non-power-of-two channels

firmware: hdmi: Allow hdmi channel map to be overridden with a gencmd

firmware: audioplus: limit sample rates to ones supported by hardware

firmware: mailbox: Add property to get memory handle from dispmanx resource
See: #257

firmware: Allow interrupts to be masked from GPU (e.g. when arm is handling them)
See: #257

firmware: memory reduction of cutdown firmware (saves about 1M)
See: raspberrypi/linux#503
05dbfa4
@popcornmix popcornmix referenced this issue from a commit in Hexxeh/rpi-firmware
Dom Cobley firmware: audio_render: add support for float samples and non-power-of-two channels

firmware: hdmi: Allow hdmi channel map to be overridden with a gencmd

firmware: audioplus: limit sample rates to ones supported by hardware

firmware: mailbox: Add property to get memory handle from dispmanx resource
See: raspberrypi/firmware#257

firmware: Allow interrupts to be masked from GPU (e.g. when arm is handling them)
See: raspberrypi/firmware#257

firmware: memory reduction of cutdown firmware (saves about 1M)
See: raspberrypi/linux#503
c869b3b
@popcornmix
Owner

@amtssp
I've done some pruning on start_cd.elf, which saves about 1M.
If your problem was gpu exhausting its share of memory then this may have been fixed.
Can you test?

@amtssp

Sorry for my late reply.
I'm now at kernel 3.14.1, but gpu_mem=16 still causes the boot process to stop when the raspberry image is shown. The LEDs still seem to flicker as they usually do when it is booting normally.

If I use gpu_mem=32 it boots fully and everything is fine.

@msperl

Can you connect a serial console and see what you get on the console when booting gpu_mem=16 and share it?

Alternatively, if you do not have a serial console at hand, you can try the following:
erase /var/log/syslog

then reboot with gpu_mem=16

and then, after some time with your "blinking" LED, boot back again with gpu_mem=32.

then check /var/log/syslog and check if you get 2 blocks of lines similar to this:

May  1 08:38:28 raspberrypi kernel: imklog 5.8.11, log source = /proc/kmsg started.
May  1 08:38:28 raspberrypi rsyslogd: [origin software="rsyslogd" swVersion="5.8.11" x-pid="1900" x-info="http://www.rsyslog.com"] start
May  1 08:38:28 raspberrypi kernel: [    0.000000] Booting Linux on physical CPU 0x0
May  1 08:38:28 raspberrypi kernel: [    0.000000] Initializing cgroup subsys cpu
May  1 08:38:28 raspberrypi kernel: [    0.000000] Initializing cgroup subsys cpuacct
May  1 08:38:28 raspberrypi kernel: [    0.000000] Linux version 3.13.4+ (root@raspberrypi) (gcc version 4.6.3 (Debian 4.6.3-14+rpi1) ) #27 PREEMPT Mon Mar 31 12:01:05 UTC 2014
May  1 08:38:28 raspberrypi kernel: [    0.000000] CPU: ARMv6-compatible processor [410fb767] revision 7 (ARMv7), cr=00c5387d
May  1 08:38:28 raspberrypi kernel: [    0.000000] CPU: PIPT / VIPT nonaliasing data cache, VIPT nonaliasing instruction cache
May  1 08:38:28 raspberrypi kernel: [    0.000000] Machine: BCM2708

This would indicate that the kernel is booting far enough to get to a point where it can log its boot-messages to SD-card.

Then please share all those kernel: lines - especially the ones from the boot with gpu_mem=16
(note that you may want to sanitize these lines: Kernel command line: dma.dmachans=0x7f35 bcm2708_fb.fbwidth=6... removing the values of bcm2708.serial=... and smsc95xx.macaddr=...)
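
A quick way to do that sanitizing before pasting, as a sketch only: sanitize is a hypothetical helper, and the serial and MAC values in the sample are made up for illustration.

```python
import re

# Parameters whose values identify a specific board.
SENSITIVE_KEYS = ("bcm2708.serial", "smsc95xx.macaddr")


def sanitize(line, keys=SENSITIVE_KEYS, mask="REMOVED"):
    """Replace the value of each sensitive key=value pair with a mask."""
    names = "|".join(re.escape(key) for key in keys)
    pattern = re.compile(r"\b(" + names + r")=\S+")
    return pattern.sub(lambda m: m.group(1) + "=" + mask, line)


# Made-up sample values, not taken from any real board.
sample = (
    "Kernel command line: dma.dmachans=0x7f35 bcm2708.serial=0x1234abcd "
    "smsc95xx.macaddr=B8:27:EB:00:00:00 root=/dev/mmcblk0p2"
)
print(sanitize(sample))
```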

If you do not see the first boot, then we need to get the output from the serial console.
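
The "2 blocks" check can be sketched like this, assuming syslog lines in the rsyslog format quoted above. split_boots is a hypothetical helper; feed it the lines of /var/log/syslog.

```python
import re

# First kernel message of every boot, as it appears in syslog.
BOOT_MARKER = re.compile(r"kernel: \[\s*0\.000000\] Booting Linux on physical CPU")


def split_boots(lines):
    """Group timestamped 'kernel: [' lines into one list per boot."""
    boots = []
    for line in lines:
        if "kernel: [" not in line:
            continue  # skip rsyslogd and other non-kernel lines
        if BOOT_MARKER.search(line):
            boots.append([])  # a new boot starts here
        if boots:
            boots[-1].append(line.rstrip("\n"))
    return boots


# Shortened sample in the same format as the lines quoted above.
sample = [
    "May  1 08:30:00 raspberrypi kernel: [    0.000000] Booting Linux on physical CPU 0x0\n",
    "May  1 08:30:00 raspberrypi kernel: [    0.000000] Machine: BCM2708\n",
    "May  1 08:38:28 raspberrypi rsyslogd: start\n",
    "May  1 08:38:28 raspberrypi kernel: [    0.000000] Booting Linux on physical CPU 0x0\n",
    "May  1 08:38:28 raspberrypi kernel: [    0.000000] Machine: BCM2708\n",
]
boots = split_boots(sample)
print(len(boots))
```

Two blocks mean the gpu_mem=16 boot got far enough to write its log. The first block is then the one to share.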

@popcornmix
Owner

@amtssp
Can you confirm if this happens with latest 3.12.23 kernel?
Are you using CMA?

@amtssp

@popcornmix
Sorry, I have not tested with 3.12.23 yet.
However, I just tried copying the newest firmware to my 3.14.2 kernel.
It still does not boot fully with gpu_mem=16 (it stops with the raspberry image in the top left corner). If I increase gpu_mem to 32, it boots fully.

I will try to find time to test the 3.12.23 later.

No I don't use CMA

@popcornmix
Owner

Anything non-default in cmdline.txt or config.txt?
I tried booting with gpu_mem=16 yesterday and it was fine.

@Noltari Noltari referenced this issue from a commit
Commit has since been removed from the repository and is no longer available.
@P33M
Owner

Closing as there has been no response from the OP. If gpu_mem=16 is still broken, then post a comment explaining what configuration it was used with.

@P33M P33M closed this