Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bcm2835 ALSA sound: pop on sound playback start/completion #128

Closed
P33M opened this issue Oct 2, 2012 · 48 comments
Closed

bcm2835 ALSA sound: pop on sound playback start/completion #128

P33M opened this issue Oct 2, 2012 · 48 comments

Comments

@P33M
Copy link

@P33M P33M commented Oct 2, 2012

Hello

After doing some testing with sound onboard the Rpi, I can characterize (via oscilloscope) the conditions that result in a rather jarring click on start of playback and pop on completion of playback

Running FW 339137
Kernel Linux raspberrypi 3.2.27+ #174 PREEMPT Wed Sep 26 14:09:47

Sample command:
mpg321blah.mp3
(sound plays)
^C
(pop)

The rpi schematic details use of PWM channels for audio carrier generation - having read some other information on improving the rpi's sound quality, I can offer insight on to what happens when the PWM is initiated/released in playback.

On boot and in an "idle" state, the PWM output is 0V (disabled). When playback commences, the voltage measured at C20/C26 pin "2" ramps to +0.56V quite quickly, causing an audible click. This is because the output 10u AC coupling capacitor C34/C48 passes this ramp, which then saturates any input stage in a downstream amp expecting a ~700mV rms AC signal. The effect is variable depending on the input coupling used on the downstream amp.

A similar but more complex effect happens when the audio playback is interrupted or completes. On measuring C20 pin "2" the +0.56V DC biased audio signal has a brief pulse at +1.8V for 6.5ms, then 0V for 16.5ms, then a pulse back to 0.56V for 4ms, then 0V idle state [a fat pair of square pulses]. Similar to the start of playback case, this causes the pop sound on playback.

Maybe the open/close functions in ALSA need to be set so that they preserve PWM at 50% duty cycle on a) driver loading and b) start/finish of playback

@damiencorpataux
Copy link

@damiencorpataux damiencorpataux commented Oct 17, 2012

Hi. Same here.

Linux raspberrypi 3.1.9+ #90 Wed Apr 18 18:23:05 BST 2012 armv6l GNU/Linux

GPU (useful?):
/opt/vc/bin/vcgencmd version
Apr 18 2012 15:04:46
Copyright (c) 2012 Broadcom
version 310376 (release)

@licaon-kter
Copy link

@licaon-kter licaon-kter commented Oct 18, 2012

updating to the latest kernel & firmware help?

@hetsch
Copy link

@hetsch hetsch commented Oct 22, 2012

Hi, same problem..

root@XBian:~# /opt/vc/bin/vcgencmd version
Oct 19 2012 23:40:40
Copyright (c) 2012 Broadcom
version 345130 (release)

Linux XBian 3.2.27+ #250 PREEMPT Thu Oct 18 19:03:02 BST 2012 armv6l GNU/Linux

If you need more information, I will be happy to deliver that.

@ghost
Copy link

@ghost ghost commented Oct 25, 2012

http://www.raspberrypi.org/phpBB3/viewtopic.php?f=28&t=20689
IMO It seems to be a longstanding VideoCore firmware issue (not alsa related).
Pausing/starting a video in the omxplayer produces the same voltage jumps.

@tiagocpontesp
Copy link

@tiagocpontesp tiagocpontesp commented Oct 25, 2012

I'm using MPD to play stuff as a workaround, since it doesn't stop/start pops between tracks.

@mdxe
Copy link

@mdxe mdxe commented Oct 29, 2012

I haven't tried it yet, but apparently the Arch Linux doesn't have this problem

@tiagocpontesp
Copy link

@tiagocpontesp tiagocpontesp commented Oct 29, 2012

Aye, it does! (or did)

On Mon, Oct 29, 2012 at 9:20 PM, Alex Bouchard notifications@github.comwrote:

I haven't tried it yet, but apparently the Arch Linux doesn't have this
problem


Reply to this email directly or view it on GitHubhttps://github.com//issues/128#issuecomment-9885887.

@paintjob
Copy link

@paintjob paintjob commented Nov 6, 2012

Arch definitely still has the problem, as well. I experience it at the start and end of tracks in pianobar when using the 3.5mm audio out port. This is with both the Raspian & Arch distros. Sound over HDMI works without a problem.

@hetsch
Copy link

@hetsch hetsch commented Dec 4, 2012

Any news on this one?
I really want to help, but still have no idea where to start...

@cyclooctane
Copy link

@cyclooctane cyclooctane commented Dec 5, 2012

I can confirm this issue on both the 256 and 512mb versions of the Raspberry pi using the current Raspbian image.
As paintjob noted, sound over HDMI works without a problem in both cases.

Regards

Note: the current image in this case is
2012-10-28-wheezy-raspbian

@JstNnyms
Copy link

@JstNnyms JstNnyms commented Dec 5, 2012

same issue here... this is really painfull to endure :(

@dbader
Copy link

@dbader dbader commented Dec 5, 2012

I have the same problem with the 512MB version and analog output.

As a workaround I use mpd with the PulseAudio output plugin. If you disable PulseAudio's module-suspend-on-idle the pops disappear (even when switching songs in mpd) because the audio device is never closed.

@JstNnyms
Copy link

@JstNnyms JstNnyms commented Dec 5, 2012

dbader: could you please give a further explanation how your workaround is ought to be used? im not sure what to do.

should i just comment the line "module-suspend-on-idle" in /etc/pulse/default.pa out?

edit: just commenting the line out didnt work for me.

@dbader
Copy link

@dbader dbader commented Dec 5, 2012

This gives me playback without crackling in mpd (on the latest Raspbian)

  • Install PulseAudio: sudo apt-get install pulseaudio
  • Configure mpd to use PulseAudio in /etc/mpd.conf:
audio_output {
  type   "pulse"
  name   "MPD PulseAudio Output"
}
  • Comment out load-module module-suspend-on-idle in /etc/pulse/default.pa
  • Restart PulseAudio and mpd:
sudo /etc/init.d/pulseaudio restart
sudo /etc/init.d/mpd restart
@JstNnyms
Copy link

@JstNnyms JstNnyms commented Dec 5, 2012

thank you so far. i just need to figure out how to apply that to raspbmc... 👍

@andyhelp
Copy link

@andyhelp andyhelp commented Dec 8, 2012

dbader thanks!
It works (I had to reboot the system).

Tested on Pi Revision B (512MB), rasbian:
pi@raspberrypi ~ $ uname -a
Linux raspberrypi 3.2.27+ #250 PREEMPT Thu Oct 18 19:03:02 BST 2012 armv6l GNU/Linux
pi@raspberrypi ~ $ pulseaudio --version
pulseaudio 2.0
pi@raspberrypi ~ $ mpd --version
mpd (MPD: Music Player Daemon) 0.16.7

@Matthijss
Copy link

@Matthijss Matthijss commented Dec 11, 2012

It also works for xmms2 with pulseaudio. I also had to reboot the system. Now, it only clicks once during start up.

@skipper79
Copy link

@skipper79 skipper79 commented Dec 20, 2012

@dbader; is this also possible for xbian? Can you describe the steps with some more detail? I'm just a beginner ;-) and I really looking forward for a solution for the pop-on-sound.

@Goofinder
Copy link

@Goofinder Goofinder commented Dec 22, 2012

I have enabled pulseaudio on my RPi with mpd, however if I pause the mpd output it fails to restart complaining about the device being suspended despite having commented out module-suspend-on-idle.

Can anyone else repeat this problem? (start playing audio using mpd, pause the output, attempt to restart. No output is possible until the RPi is rebooted).

@acieroid
Copy link

@acieroid acieroid commented Dec 22, 2012

Same problem as Goofinder here, on raspbian, doing mpc play, mpc pause, mpc play. MPD logs contains:

Dec 22 16:43 : output: "My Pulse Output" [pulse] failed to play: suspended

@eisenrah
Copy link

@eisenrah eisenrah commented Dec 22, 2012

Works fine for me, but only for mpd.
I also installed Shairport and changed /etc/libao.conf to "default_driver=pulse", rebooted my Pi, but still no sound from Shairport...
Any thoughts?

@acieroid
Copy link

@acieroid acieroid commented Dec 24, 2012

Note that launching PulseAudio (which is not recommended) in system mode avoid the problem with suspended input. You can do so by adapting the pulseaudio initscript, and adding the mpd user to the group pulse-access.

@dbader
Copy link

@dbader dbader commented Dec 24, 2012

@comotion
Copy link

@comotion comotion commented Jan 3, 2013

workaround ain't good enough for openelec and friends :-(
is there a known setting in the firmware to fix the pops/clicks?

@acieroid
Copy link

@acieroid acieroid commented Jan 3, 2013

Also, note that by setting your RPi's volume to the maximum (by default mine is at 40%...), the pops volume is acceptable (compared to the video/audio volume).

@hetsch
Copy link

@hetsch hetsch commented Jan 3, 2013

Think Mr. Upton knows about the problems with the als driver:
http://permalink.gmane.org/gmane.linux.alsa.devel/104114

@robgithub
Copy link

@robgithub robgithub commented Jan 5, 2013

uname -a
Linux raspberrypi 3.6.11+ #348 PREEMPT Tue Jan 1 16:33:22 GMT 2013 armv6l GNU/Linux
/opt/vc/bin/vcgencmd version
Jan 2 2013 23:40:34
Copyright (c) 2012 Broadcom
version 360331 (release)

still pops at the beginning and end of each audio sample (I am using espeak)

@bob2hh
Copy link

@bob2hh bob2hh commented Jan 5, 2013

Dear Experts,
As far as I can see, you are very knowledgeable about how the sound system using pulse audio works on the raspbian.

I have 2 questions:

  • How can I avoid such "clicks" when using omxplayer ? They happen at the start and the end of a track when using the analog output (e.g. omxplayer -o local anyfile.mp3).
  • I have tried to obtain an output from the 3.5 mm jack local headphone output using mplayer and the pulse audio / alsa mixer. However, I didi not succeed on getting any output at all, always silent. However, on the HDMI output it works fine. I know that in ancient raspbian images the command sudo amixer cset numid=3 1 was supossed to redirect ALSA output to the headphone local jack, but I read somewhere that in the latest raspbian releases this no longer works. Actaully this command refers to the Master Playback Volume control and this only reduces the volume to an unaudible level. Does any of you know how can one redirect the alsa mixer output to the local headphone jack in the latest raspbian releases?

Mine is actually as follows;

pi@raspberrypi ~ $ uname -a
Linux raspberrypi 3.2.27+ #250 PREEMPT Thu Oct 18 19:03:02 BST 2012 armv6l GNU/Linux
pi@raspberrypi ~ $ sudo amixer controls
numid=4,iface=MIXER,name='Master Playback Switch'
numid=3,iface=MIXER,name='Master Playback Volume'
numid=2,iface=MIXER,name='Capture Switch'
numid=1,iface=MIXER,name='Capture Volume'
pi@raspberrypi ~ $ sudo amixer cset numid=3 1
numid=3,iface=MIXER,name='Master Playback Volume'
; type=INTEGER,access=rw------,values=2,min=0,max=65536,step=1
: values=1,1
(... the only difference when doing this is that the volume is practically unadible, but it is still going to the HDMI output and not to the local headphone oputput).
pi@raspberrypi ~ $ sudo amixer cset numid=3 65536
numid=3,iface=MIXER,name='Master Playback Volume'
; type=INTEGER,access=rw------,values=2,min=0,max=65536,step=1
: values=65536,65536
(... previous command resumes normal volume to the HDMI output).
pi@raspberrypi ~ $ aplay -L
null
Discard all samples (playback) or generate zero samples (capture)
pulse
Playback/recording through the PulseAudio sound server
sysdefault:CARD=ALSA
bcm2835 ALSA, bcm2835 ALSA
Default Audio Device

Many thanks in advance!

Bob.

@robgithub
Copy link

@robgithub robgithub commented Jan 20, 2013

uname -a
Linux raspberrypi 3.6.11+ #358 PREEMPT Tue Jan 15 00:45:33 GMT 2013 armv6l GNU/Linux
/opt/vc/bin/vcgencmd version
Jan 15 2013 12:54:01
Copyright (c) 2012 Broadcom
version 362704 (release)

still pops at the beginning and end of each audio sample (I am using espeak)

@skechboy
Copy link

@skechboy skechboy commented Jan 25, 2013

Same problem here
Linux raspberrypi 3.6.11-4-ARCH+ #1 PREEMPT Wed Jan 16 19:06:54 UTC 2013 armv6l GNU/Linux

Isn't there any way to disable alsa power saving mode so the sound card is constantly on?

@licaon-kter
Copy link

@licaon-kter licaon-kter commented Jan 26, 2013

sudo echo "performance" > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor ?

@robgithub
Copy link

@robgithub robgithub commented Jan 28, 2013

sudo echo "performance" > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor ?

no change, still pops at the beginning and end of each audio sample

@jamesblackburn
Copy link

@jamesblackburn jamesblackburn commented Feb 10, 2013

Someone's written a blog post about pop-free audio for XBMC using the analog jack:
http://www.oxymoronical.com/blog/2013/01/Pop-free-sound-from-a-Raspberry-Pi-running-XBMC

@mrbrdo
Copy link

@mrbrdo mrbrdo commented Feb 10, 2013

Yeah but that requires pulseaudio, which for example OpenELEC doesn't have by default. And it's a hack anyway, not a real solution.

Seeing this issue is like 4 months old, would be really nice to see a fix soon...

@jamesblackburn
Copy link

@jamesblackburn jamesblackburn commented Feb 10, 2013

Agreed. The problem has existed since the beginning so just thought I'd post one workaround in case it helps someone...

@sjmurdoch
Copy link

@sjmurdoch sjmurdoch commented Feb 11, 2013

I've put some more detail in my blog post, along with a work-around suitable for playing back sound with varying sample rate: http://www.lightbluetouchpaper.org/2013/02/10/fixing-poppingclicking-audio-on-raspberry-pi/

@Dmole
Copy link

@Dmole Dmole commented Feb 11, 2013

The workaround only fixes the stop pop not the start pop.
related: OpenELEC/OpenELEC.tv#923

@P33M
Copy link
Author

@P33M P33M commented Mar 7, 2013

A recent firmware update should have fixed some of these issues - the GPU firmware now doesn't switch off the PWM after the device is released.

@leucos
Copy link

@leucos leucos commented Mar 7, 2013

Awesome. The click is here at first play, but then never happens again.

@robgithub
Copy link

@robgithub robgithub commented Mar 7, 2013

uname -a
Linux raspberrypi 3.6.11+ #389 PREEMPT Wed Mar 6 12:43:30 GMT 2013 armv6l GNU/Linux
/opt/vc/bin/vcgencmd version
Mar 4 2013 22:02:46
Copyright (c) 2012 Broadcom
version 374489 (release)

Pop is on first audio use, clear until next reboot, HUGE improvement, Many Thanks!

@andyhunti
Copy link

@andyhunti andyhunti commented Mar 15, 2013

I used rpi-update to update to commit c2d133fb4efe9c9995da7fd7e1c45d74254f5c4b
That caused the problem again.
Downgrading with sudo rpi-update 779f0fb6139452a0f1c4be32dab58eb87359517e fixed it.

@popcornmix
Copy link
Collaborator

@popcornmix popcornmix commented Mar 15, 2013

@andyhunti
Are you saying c2d133f has popping or some other issue?
Can you narrow down exactly which commit causes the problem?

@andyhunti
Copy link

@andyhunti andyhunti commented Mar 18, 2013

@popcornmix
Well, I'm not entirely sure I'm afraid – the day after rpi-update seemed to work just like @robgithub listed I ran
sudo apt-get update
and
sudo apt-get upgrade
Quite a lot of packages were updated and the popping at the start and end of playback using mpg321 and aplay returned.
Rolling back using rpi-update appears to have fixed the issue.
I'm not particularly clear on where the list of packages upgraded is stored, but I assume there's a log with that info in it (I'm pretty new to cmdline Linux). If you can advise I can dig out the detail!

@popcornmix
Copy link
Collaborator

@popcornmix popcornmix commented Mar 18, 2013

@andyhunti
apt-get upgrade gets you back onto stable releases. This fix is only on the testing rpi-update tree.

To get back to the testing firmware, run:
sudo rm /boot/.firmware_revision
sudo rpi-update

@popcornmix popcornmix closed this May 1, 2013
anholt referenced this issue in anholt/linux Apr 21, 2015
Unlike scan_async_group(), soc_of_bind() doesn't allocate its
soc_camera_async_client structure using devm_kzalloc(), but has it
embedded inside the soc_of_info structure.  Hence on failure, it must
free the whole soc_of_info structure, and not just the embedded
soc_camera_async_client structure, as the latter causes a warning, and
may cause slab corruption:

    soc-camera-pdrv soc-camera-pdrv.0: Probing soc-camera-pdrv.0
    ------------[ cut here ]------------
    WARNING: CPU: 0 PID: 1 at drivers/base/devres.c:887 devm_kfree+0x30/0x40()
    CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.19.0-shmobile-08386-g37feb0d093cb2d8e #128
    Hardware name: Generic R8A7791 (Flattened Device Tree)
    Backtrace:
    [<c0011e7c>] (dump_backtrace) from [<c0012024>] (show_stack+0x18/0x1c)
     r6:c05a923b r5:00000009 r4:00000000 r3:00204140
    [<c001200c>] (show_stack) from [<c048ed30>] (dump_stack+0x78/0x94)
    [<c048ecb8>] (dump_stack) from [<c002687c>] (warn_slowpath_common+0x8c/0xb8)
     r4:00000000 r3:00000000
    [<c00267f0>] (warn_slowpath_common) from [<c0026980>] (warn_slowpath_null+0x24/0x2c)
     r8:ee7d8214 r7:ed83b810 r6:ed83bc20 r5:fffffffa r4:ed83e510
    [<c002695c>] (warn_slowpath_null) from [<c025e0cc>] (devm_kfree+0x30/0x40)
    [<c025e09c>] (devm_kfree) from [<c032bbf4>] (soc_of_bind.isra.14+0x194/0x1d4)
    [<c032ba60>] (soc_of_bind.isra.14) from [<c032c6b8>] (soc_camera_host_register+0x208/0x31c)
     r9:00000070 r8:ee7e05d0 r7:ee153210 r6:00000000 r5:ee7e0218 r4:ed83bc20
    [<c032c4b0>] (soc_camera_host_register) from [<c032e80c>] (rcar_vin_probe+0x1f4/0x238)
     r8:ee153200 r7:00000008 r6:ee153210 r5:ed83bc10 r4:c066319c r3:000000c0
    [<c032e618>] (rcar_vin_probe) from [<c025c334>] (platform_drv_probe+0x50/0xa0)
     r10:00000000 r9:c0662fa8 r8:00000000 r7:c06a3700 r6:c0662fa8 r5:ee153210
     r4:00000000
    [<c025c2e4>] (platform_drv_probe) from [<c025af08>] (driver_probe_device+0xc4/0x208)
     r6:c06a36f4 r5:00000000 r4:ee153210 r3:c025c2e4
    [<c025ae44>] (driver_probe_device) from [<c025b108>] (__driver_attach+0x70/0x94)
     r9:c066f9c0 r8:c0624a98 r7:c065b790 r6:c0662fa8 r5:ee153244 r4:ee153210
    [<c025b098>] (__driver_attach) from [<c025984c>] (bus_for_each_dev+0x74/0x98)
     r6:c025b098 r5:c0662fa8 r4:00000000 r3:00000001
    [<c02597d8>] (bus_for_each_dev) from [<c025b1dc>] (driver_attach+0x20/0x28)
     r6:ed83c200 r5:00000000 r4:c0662fa8
    [<c025b1bc>] (driver_attach) from [<c025a00c>] (bus_add_driver+0xdc/0x1c4)
    [<c0259f30>] (bus_add_driver) from [<c025b8f4>] (driver_register+0xa4/0xe8)
     r7:c0624a98 r6:00000000 r5:c060b010 r4:c0662fa8
    [<c025b850>] (driver_register) from [<c025ccd0>] (__platform_driver_register+0x50/0x64)
     r5:c060b010 r4:ed8394c0
    [<c025cc80>] (__platform_driver_register) from [<c060b028>] (rcar_vin_driver_init+0x18/0x20)
    [<c060b010>] (rcar_vin_driver_init) from [<c05edde8>] (do_one_initcall+0x108/0x1b8)
    [<c05edce0>] (do_one_initcall) from [<c05edfb4>] (kernel_init_freeable+0x11c/0x1e4)
     r9:c066f9c0 r8:c066f9c0 r7:c062eab0 r6:c06252c4 r5:000000ad r4:00000006
    [<c05ede98>] (kernel_init_freeable) from [<c048c3d0>] (kernel_init+0x10/0xec)
     r9:00000000 r8:00000000 r7:00000000 r6:00000000 r5:c048c3c0 r4:00000000
    [<c048c3c0>] (kernel_init) from [<c000eba0>] (ret_from_fork+0x14/0x34)
     r4:00000000 r3:ee04e000
    ---[ end trace e3a984cc0335c8a0 ]---
    rcar_vin e6ef1000.video: group probe failed: -6

Fixes: 1ddc6a6 ("[media] soc_camera: add support for dt binding soc_camera drivers")

Cc: <stable@vger.kernel.org> # 3.17+
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
popcornmix pushed a commit that referenced this issue May 9, 2015
[ Upstream commit 8e48a2d ]

Unlike scan_async_group(), soc_of_bind() doesn't allocate its
soc_camera_async_client structure using devm_kzalloc(), but has it
embedded inside the soc_of_info structure.  Hence on failure, it must
free the whole soc_of_info structure, and not just the embedded
soc_camera_async_client structure, as the latter causes a warning, and
may cause slab corruption:

    soc-camera-pdrv soc-camera-pdrv.0: Probing soc-camera-pdrv.0
    ------------[ cut here ]------------
    WARNING: CPU: 0 PID: 1 at drivers/base/devres.c:887 devm_kfree+0x30/0x40()
    CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.19.0-shmobile-08386-g37feb0d093cb2d8e #128
    Hardware name: Generic R8A7791 (Flattened Device Tree)
    Backtrace:
    [<c0011e7c>] (dump_backtrace) from [<c0012024>] (show_stack+0x18/0x1c)
     r6:c05a923b r5:00000009 r4:00000000 r3:00204140
    [<c001200c>] (show_stack) from [<c048ed30>] (dump_stack+0x78/0x94)
    [<c048ecb8>] (dump_stack) from [<c002687c>] (warn_slowpath_common+0x8c/0xb8)
     r4:00000000 r3:00000000
    [<c00267f0>] (warn_slowpath_common) from [<c0026980>] (warn_slowpath_null+0x24/0x2c)
     r8:ee7d8214 r7:ed83b810 r6:ed83bc20 r5:fffffffa r4:ed83e510
    [<c002695c>] (warn_slowpath_null) from [<c025e0cc>] (devm_kfree+0x30/0x40)
    [<c025e09c>] (devm_kfree) from [<c032bbf4>] (soc_of_bind.isra.14+0x194/0x1d4)
    [<c032ba60>] (soc_of_bind.isra.14) from [<c032c6b8>] (soc_camera_host_register+0x208/0x31c)
     r9:00000070 r8:ee7e05d0 r7:ee153210 r6:00000000 r5:ee7e0218 r4:ed83bc20
    [<c032c4b0>] (soc_camera_host_register) from [<c032e80c>] (rcar_vin_probe+0x1f4/0x238)
     r8:ee153200 r7:00000008 r6:ee153210 r5:ed83bc10 r4:c066319c r3:000000c0
    [<c032e618>] (rcar_vin_probe) from [<c025c334>] (platform_drv_probe+0x50/0xa0)
     r10:00000000 r9:c0662fa8 r8:00000000 r7:c06a3700 r6:c0662fa8 r5:ee153210
     r4:00000000
    [<c025c2e4>] (platform_drv_probe) from [<c025af08>] (driver_probe_device+0xc4/0x208)
     r6:c06a36f4 r5:00000000 r4:ee153210 r3:c025c2e4
    [<c025ae44>] (driver_probe_device) from [<c025b108>] (__driver_attach+0x70/0x94)
     r9:c066f9c0 r8:c0624a98 r7:c065b790 r6:c0662fa8 r5:ee153244 r4:ee153210
    [<c025b098>] (__driver_attach) from [<c025984c>] (bus_for_each_dev+0x74/0x98)
     r6:c025b098 r5:c0662fa8 r4:00000000 r3:00000001
    [<c02597d8>] (bus_for_each_dev) from [<c025b1dc>] (driver_attach+0x20/0x28)
     r6:ed83c200 r5:00000000 r4:c0662fa8
    [<c025b1bc>] (driver_attach) from [<c025a00c>] (bus_add_driver+0xdc/0x1c4)
    [<c0259f30>] (bus_add_driver) from [<c025b8f4>] (driver_register+0xa4/0xe8)
     r7:c0624a98 r6:00000000 r5:c060b010 r4:c0662fa8
    [<c025b850>] (driver_register) from [<c025ccd0>] (__platform_driver_register+0x50/0x64)
     r5:c060b010 r4:ed8394c0
    [<c025cc80>] (__platform_driver_register) from [<c060b028>] (rcar_vin_driver_init+0x18/0x20)
    [<c060b010>] (rcar_vin_driver_init) from [<c05edde8>] (do_one_initcall+0x108/0x1b8)
    [<c05edce0>] (do_one_initcall) from [<c05edfb4>] (kernel_init_freeable+0x11c/0x1e4)
     r9:c066f9c0 r8:c066f9c0 r7:c062eab0 r6:c06252c4 r5:000000ad r4:00000006
    [<c05ede98>] (kernel_init_freeable) from [<c048c3d0>] (kernel_init+0x10/0xec)
     r9:00000000 r8:00000000 r7:00000000 r6:00000000 r5:c048c3c0 r4:00000000
    [<c048c3c0>] (kernel_init) from [<c000eba0>] (ret_from_fork+0x14/0x34)
     r4:00000000 r3:ee04e000
    ---[ end trace e3a984cc0335c8a0 ]---
    rcar_vin e6ef1000.video: group probe failed: -6

Fixes: 1ddc6a6 ("[media] soc_camera: add support for dt binding soc_camera drivers")

Cc: <stable@vger.kernel.org> # 3.17+
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
popcornmix pushed a commit that referenced this issue Feb 12, 2018
[ Upstream commit 4adfa79 ]

When we dump the ip6mr mfc entries via proc, we initialize an iterator
with the table to dump but we don't clear the cache pointer which might
be initialized from a prior read on the same descriptor that ended. This
can result in lock imbalance (an unnecessary unlock) leading to other
crashes and hangs. Clear the cache pointer like ipmr does to fix the issue.
Thanks for the reliable reproducer.

Here's syzbot's trace:
 WARNING: bad unlock balance detected!
 4.15.0-rc3+ #128 Not tainted
 syzkaller971460/3195 is trying to release lock (mrt_lock) at:
 [<000000006898068d>] ipmr_mfc_seq_stop+0xe1/0x130 net/ipv6/ip6mr.c:553
 but there are no more locks to release!

 other info that might help us debug this:
 1 lock held by syzkaller971460/3195:
  #0:  (&p->lock){+.+.}, at: [<00000000744a6565>] seq_read+0xd5/0x13d0
 fs/seq_file.c:165

 stack backtrace:
 CPU: 1 PID: 3195 Comm: syzkaller971460 Not tainted 4.15.0-rc3+ #128
 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
 Google 01/01/2011
 Call Trace:
  __dump_stack lib/dump_stack.c:17 [inline]
  dump_stack+0x194/0x257 lib/dump_stack.c:53
  print_unlock_imbalance_bug+0x12f/0x140 kernel/locking/lockdep.c:3561
  __lock_release kernel/locking/lockdep.c:3775 [inline]
  lock_release+0x5f9/0xda0 kernel/locking/lockdep.c:4023
  __raw_read_unlock include/linux/rwlock_api_smp.h:225 [inline]
  _raw_read_unlock+0x1a/0x30 kernel/locking/spinlock.c:255
  ipmr_mfc_seq_stop+0xe1/0x130 net/ipv6/ip6mr.c:553
  traverse+0x3bc/0xa00 fs/seq_file.c:135
  seq_read+0x96a/0x13d0 fs/seq_file.c:189
  proc_reg_read+0xef/0x170 fs/proc/inode.c:217
  do_loop_readv_writev fs/read_write.c:673 [inline]
  do_iter_read+0x3db/0x5b0 fs/read_write.c:897
  compat_readv+0x1bf/0x270 fs/read_write.c:1140
  do_compat_preadv64+0xdc/0x100 fs/read_write.c:1189
  C_SYSC_preadv fs/read_write.c:1209 [inline]
  compat_SyS_preadv+0x3b/0x50 fs/read_write.c:1203
  do_syscall_32_irqs_on arch/x86/entry/common.c:327 [inline]
  do_fast_syscall_32+0x3ee/0xf9d arch/x86/entry/common.c:389
  entry_SYSENTER_compat+0x51/0x60 arch/x86/entry/entry_64_compat.S:125
 RIP: 0023:0xf7f73c79
 RSP: 002b:00000000e574a15c EFLAGS: 00000292 ORIG_RAX: 000000000000014d
 RAX: ffffffffffffffda RBX: 000000000000000f RCX: 0000000020a3afb0
 RDX: 0000000000000001 RSI: 0000000000000067 RDI: 0000000000000000
 RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
 BUG: sleeping function called from invalid context at lib/usercopy.c:25
 in_atomic(): 1, irqs_disabled(): 0, pid: 3195, name: syzkaller971460
 INFO: lockdep is turned off.
 CPU: 1 PID: 3195 Comm: syzkaller971460 Not tainted 4.15.0-rc3+ #128
 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
 Google 01/01/2011
 Call Trace:
  __dump_stack lib/dump_stack.c:17 [inline]
  dump_stack+0x194/0x257 lib/dump_stack.c:53
  ___might_sleep+0x2b2/0x470 kernel/sched/core.c:6060
  __might_sleep+0x95/0x190 kernel/sched/core.c:6013
  __might_fault+0xab/0x1d0 mm/memory.c:4525
  _copy_to_user+0x2c/0xc0 lib/usercopy.c:25
  copy_to_user include/linux/uaccess.h:155 [inline]
  seq_read+0xcb4/0x13d0 fs/seq_file.c:279
  proc_reg_read+0xef/0x170 fs/proc/inode.c:217
  do_loop_readv_writev fs/read_write.c:673 [inline]
  do_iter_read+0x3db/0x5b0 fs/read_write.c:897
  compat_readv+0x1bf/0x270 fs/read_write.c:1140
  do_compat_preadv64+0xdc/0x100 fs/read_write.c:1189
  C_SYSC_preadv fs/read_write.c:1209 [inline]
  compat_SyS_preadv+0x3b/0x50 fs/read_write.c:1203
  do_syscall_32_irqs_on arch/x86/entry/common.c:327 [inline]
  do_fast_syscall_32+0x3ee/0xf9d arch/x86/entry/common.c:389
  entry_SYSENTER_compat+0x51/0x60 arch/x86/entry/entry_64_compat.S:125
 RIP: 0023:0xf7f73c79
 RSP: 002b:00000000e574a15c EFLAGS: 00000292 ORIG_RAX: 000000000000014d
 RAX: ffffffffffffffda RBX: 000000000000000f RCX: 0000000020a3afb0
 RDX: 0000000000000001 RSI: 0000000000000067 RDI: 0000000000000000
 RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
 WARNING: CPU: 1 PID: 3195 at lib/usercopy.c:26 _copy_to_user+0xb5/0xc0
 lib/usercopy.c:26

Reported-by: syzbot <bot+eceb3204562c41a438fa1f2335e0fe4f6886d669@syzkaller.appspotmail.com>
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
popcornmix pushed a commit that referenced this issue Feb 19, 2018
[ Upstream commit 4adfa79 ]

When we dump the ip6mr mfc entries via proc, we initialize an iterator
with the table to dump but we don't clear the cache pointer which might
be initialized from a prior read on the same descriptor that ended. This
can result in lock imbalance (an unnecessary unlock) leading to other
crashes and hangs. Clear the cache pointer like ipmr does to fix the issue.
Thanks for the reliable reproducer.

Here's syzbot's trace:
 WARNING: bad unlock balance detected!
 4.15.0-rc3+ #128 Not tainted
 syzkaller971460/3195 is trying to release lock (mrt_lock) at:
 [<000000006898068d>] ipmr_mfc_seq_stop+0xe1/0x130 net/ipv6/ip6mr.c:553
 but there are no more locks to release!

 other info that might help us debug this:
 1 lock held by syzkaller971460/3195:
  #0:  (&p->lock){+.+.}, at: [<00000000744a6565>] seq_read+0xd5/0x13d0
 fs/seq_file.c:165

 stack backtrace:
 CPU: 1 PID: 3195 Comm: syzkaller971460 Not tainted 4.15.0-rc3+ #128
 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
 Google 01/01/2011
 Call Trace:
  __dump_stack lib/dump_stack.c:17 [inline]
  dump_stack+0x194/0x257 lib/dump_stack.c:53
  print_unlock_imbalance_bug+0x12f/0x140 kernel/locking/lockdep.c:3561
  __lock_release kernel/locking/lockdep.c:3775 [inline]
  lock_release+0x5f9/0xda0 kernel/locking/lockdep.c:4023
  __raw_read_unlock include/linux/rwlock_api_smp.h:225 [inline]
  _raw_read_unlock+0x1a/0x30 kernel/locking/spinlock.c:255
  ipmr_mfc_seq_stop+0xe1/0x130 net/ipv6/ip6mr.c:553
  traverse+0x3bc/0xa00 fs/seq_file.c:135
  seq_read+0x96a/0x13d0 fs/seq_file.c:189
  proc_reg_read+0xef/0x170 fs/proc/inode.c:217
  do_loop_readv_writev fs/read_write.c:673 [inline]
  do_iter_read+0x3db/0x5b0 fs/read_write.c:897
  compat_readv+0x1bf/0x270 fs/read_write.c:1140
  do_compat_preadv64+0xdc/0x100 fs/read_write.c:1189
  C_SYSC_preadv fs/read_write.c:1209 [inline]
  compat_SyS_preadv+0x3b/0x50 fs/read_write.c:1203
  do_syscall_32_irqs_on arch/x86/entry/common.c:327 [inline]
  do_fast_syscall_32+0x3ee/0xf9d arch/x86/entry/common.c:389
  entry_SYSENTER_compat+0x51/0x60 arch/x86/entry/entry_64_compat.S:125
 RIP: 0023:0xf7f73c79
 RSP: 002b:00000000e574a15c EFLAGS: 00000292 ORIG_RAX: 000000000000014d
 RAX: ffffffffffffffda RBX: 000000000000000f RCX: 0000000020a3afb0
 RDX: 0000000000000001 RSI: 0000000000000067 RDI: 0000000000000000
 RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
 BUG: sleeping function called from invalid context at lib/usercopy.c:25
 in_atomic(): 1, irqs_disabled(): 0, pid: 3195, name: syzkaller971460
 INFO: lockdep is turned off.
 CPU: 1 PID: 3195 Comm: syzkaller971460 Not tainted 4.15.0-rc3+ #128
 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
 Google 01/01/2011
 Call Trace:
  __dump_stack lib/dump_stack.c:17 [inline]
  dump_stack+0x194/0x257 lib/dump_stack.c:53
  ___might_sleep+0x2b2/0x470 kernel/sched/core.c:6060
  __might_sleep+0x95/0x190 kernel/sched/core.c:6013
  __might_fault+0xab/0x1d0 mm/memory.c:4525
  _copy_to_user+0x2c/0xc0 lib/usercopy.c:25
  copy_to_user include/linux/uaccess.h:155 [inline]
  seq_read+0xcb4/0x13d0 fs/seq_file.c:279
  proc_reg_read+0xef/0x170 fs/proc/inode.c:217
  do_loop_readv_writev fs/read_write.c:673 [inline]
  do_iter_read+0x3db/0x5b0 fs/read_write.c:897
  compat_readv+0x1bf/0x270 fs/read_write.c:1140
  do_compat_preadv64+0xdc/0x100 fs/read_write.c:1189
  C_SYSC_preadv fs/read_write.c:1209 [inline]
  compat_SyS_preadv+0x3b/0x50 fs/read_write.c:1203
  do_syscall_32_irqs_on arch/x86/entry/common.c:327 [inline]
  do_fast_syscall_32+0x3ee/0xf9d arch/x86/entry/common.c:389
  entry_SYSENTER_compat+0x51/0x60 arch/x86/entry/entry_64_compat.S:125
 RIP: 0023:0xf7f73c79
 RSP: 002b:00000000e574a15c EFLAGS: 00000292 ORIG_RAX: 000000000000014d
 RAX: ffffffffffffffda RBX: 000000000000000f RCX: 0000000020a3afb0
 RDX: 0000000000000001 RSI: 0000000000000067 RDI: 0000000000000000
 RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
 WARNING: CPU: 1 PID: 3195 at lib/usercopy.c:26 _copy_to_user+0xb5/0xc0
 lib/usercopy.c:26

Reported-by: syzbot <bot+eceb3204562c41a438fa1f2335e0fe4f6886d669@syzkaller.appspotmail.com>
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
ED6E0F17 pushed a commit to ED6E0F17/linux that referenced this issue Mar 15, 2018
[ Upstream commit 4adfa79 ]

When we dump the ip6mr mfc entries via proc, we initialize an iterator
with the table to dump but we don't clear the cache pointer which might
be initialized from a prior read on the same descriptor that ended. This
can result in lock imbalance (an unnecessary unlock) leading to other
crashes and hangs. Clear the cache pointer like ipmr does to fix the issue.
Thanks for the reliable reproducer.

Here's syzbot's trace:
 WARNING: bad unlock balance detected!
 4.15.0-rc3+ raspberrypi#128 Not tainted
 syzkaller971460/3195 is trying to release lock (mrt_lock) at:
 [<000000006898068d>] ipmr_mfc_seq_stop+0xe1/0x130 net/ipv6/ip6mr.c:553
 but there are no more locks to release!

 other info that might help us debug this:
 1 lock held by syzkaller971460/3195:
  #0:  (&p->lock){+.+.}, at: [<00000000744a6565>] seq_read+0xd5/0x13d0
 fs/seq_file.c:165

 stack backtrace:
 CPU: 1 PID: 3195 Comm: syzkaller971460 Not tainted 4.15.0-rc3+ raspberrypi#128
 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
 Google 01/01/2011
 Call Trace:
  __dump_stack lib/dump_stack.c:17 [inline]
  dump_stack+0x194/0x257 lib/dump_stack.c:53
  print_unlock_imbalance_bug+0x12f/0x140 kernel/locking/lockdep.c:3561
  __lock_release kernel/locking/lockdep.c:3775 [inline]
  lock_release+0x5f9/0xda0 kernel/locking/lockdep.c:4023
  __raw_read_unlock include/linux/rwlock_api_smp.h:225 [inline]
  _raw_read_unlock+0x1a/0x30 kernel/locking/spinlock.c:255
  ipmr_mfc_seq_stop+0xe1/0x130 net/ipv6/ip6mr.c:553
  traverse+0x3bc/0xa00 fs/seq_file.c:135
  seq_read+0x96a/0x13d0 fs/seq_file.c:189
  proc_reg_read+0xef/0x170 fs/proc/inode.c:217
  do_loop_readv_writev fs/read_write.c:673 [inline]
  do_iter_read+0x3db/0x5b0 fs/read_write.c:897
  compat_readv+0x1bf/0x270 fs/read_write.c:1140
  do_compat_preadv64+0xdc/0x100 fs/read_write.c:1189
  C_SYSC_preadv fs/read_write.c:1209 [inline]
  compat_SyS_preadv+0x3b/0x50 fs/read_write.c:1203
  do_syscall_32_irqs_on arch/x86/entry/common.c:327 [inline]
  do_fast_syscall_32+0x3ee/0xf9d arch/x86/entry/common.c:389
  entry_SYSENTER_compat+0x51/0x60 arch/x86/entry/entry_64_compat.S:125
 RIP: 0023:0xf7f73c79
 RSP: 002b:00000000e574a15c EFLAGS: 00000292 ORIG_RAX: 000000000000014d
 RAX: ffffffffffffffda RBX: 000000000000000f RCX: 0000000020a3afb0
 RDX: 0000000000000001 RSI: 0000000000000067 RDI: 0000000000000000
 RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
 BUG: sleeping function called from invalid context at lib/usercopy.c:25
 in_atomic(): 1, irqs_disabled(): 0, pid: 3195, name: syzkaller971460
 INFO: lockdep is turned off.
 CPU: 1 PID: 3195 Comm: syzkaller971460 Not tainted 4.15.0-rc3+ raspberrypi#128
 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
 Google 01/01/2011
 Call Trace:
  __dump_stack lib/dump_stack.c:17 [inline]
  dump_stack+0x194/0x257 lib/dump_stack.c:53
  ___might_sleep+0x2b2/0x470 kernel/sched/core.c:6060
  __might_sleep+0x95/0x190 kernel/sched/core.c:6013
  __might_fault+0xab/0x1d0 mm/memory.c:4525
  _copy_to_user+0x2c/0xc0 lib/usercopy.c:25
  copy_to_user include/linux/uaccess.h:155 [inline]
  seq_read+0xcb4/0x13d0 fs/seq_file.c:279
  proc_reg_read+0xef/0x170 fs/proc/inode.c:217
  do_loop_readv_writev fs/read_write.c:673 [inline]
  do_iter_read+0x3db/0x5b0 fs/read_write.c:897
  compat_readv+0x1bf/0x270 fs/read_write.c:1140
  do_compat_preadv64+0xdc/0x100 fs/read_write.c:1189
  C_SYSC_preadv fs/read_write.c:1209 [inline]
  compat_SyS_preadv+0x3b/0x50 fs/read_write.c:1203
  do_syscall_32_irqs_on arch/x86/entry/common.c:327 [inline]
  do_fast_syscall_32+0x3ee/0xf9d arch/x86/entry/common.c:389
  entry_SYSENTER_compat+0x51/0x60 arch/x86/entry/entry_64_compat.S:125
 RIP: 0023:0xf7f73c79
 RSP: 002b:00000000e574a15c EFLAGS: 00000292 ORIG_RAX: 000000000000014d
 RAX: ffffffffffffffda RBX: 000000000000000f RCX: 0000000020a3afb0
 RDX: 0000000000000001 RSI: 0000000000000067 RDI: 0000000000000000
 RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
 WARNING: CPU: 1 PID: 3195 at lib/usercopy.c:26 _copy_to_user+0xb5/0xc0
 lib/usercopy.c:26

Reported-by: syzbot <bot+eceb3204562c41a438fa1f2335e0fe4f6886d669@syzkaller.appspotmail.com>
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
@kfrncs
Copy link

@kfrncs kfrncs commented Aug 29, 2018

I'm still experiencing an extremely intense pop when using the Pi with USB audio (Arturia Audiofuse, which works great otherwise).

@pelwell
Copy link
Contributor

@pelwell pelwell commented Aug 29, 2018

That's a separate problem - feel free to create a new issue, explaining your setup and the symptoms.

@NicoHood
Copy link

@NicoHood NicoHood commented Dec 9, 2018

I have the same problem. I tried 4 different usb soundcards on all raspberry pi versions 1-3, all have bad sound when the bass is loud. I am using pulseaudio to stream the music to my raspberry. Other linux systems dont have any problem with the usb sound card.

@Dmole
Copy link

@Dmole Dmole commented Dec 9, 2018

@NicoHood Pwm an usb are separate issues, clipping loud sound is yet a 3rd issue. Start a new issue if you want help.

popcornmix pushed a commit that referenced this issue Apr 27, 2020
…p PTE entries

commit 36b7840 upstream.

H_PAGE_THP_HUGE is used to differentiate between a THP hugepage and
hugetlb hugepage entries. The difference is WRT how we handle hash
fault on these address. THP address enables MPSS in segments. We want
to manage devmap hugepage entries similar to THP pt entries. Hence use
H_PAGE_THP_HUGE for devmap huge PTE entries.

With current code while handling hash PTE fault, we do set is_thp =
true when finding devmap PTE huge PTE entries.

Current code also does the below sequence we setting up huge devmap
entries.

	entry = pmd_mkhuge(pfn_t_pmd(pfn, prot));
	if (pfn_t_devmap(pfn))
		entry = pmd_mkdevmap(entry);

In that case we would find both H_PAGE_THP_HUGE and PAGE_DEVMAP set
for huge devmap PTE entries. This results in false positive error like
below.

  kernel BUG at /home/kvaneesh/src/linux/mm/memory.c:4321!
  Oops: Exception in kernel mode, sig: 5 [#1]
  LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
  Modules linked in:
  CPU: 56 PID: 67996 Comm: t_mmap_dio Not tainted 5.6.0-rc4-59640-g371c804dedbc #128
  ....
  NIP [c00000000044c9e4] __follow_pte_pmd+0x264/0x900
  LR [c0000000005d45f8] dax_writeback_one+0x1a8/0x740
  Call Trace:
    str_spec.74809+0x22ffb4/0x2d116c (unreliable)
    dax_writeback_one+0x1a8/0x740
    dax_writeback_mapping_range+0x26c/0x700
    ext4_dax_writepages+0x150/0x5a0
    do_writepages+0x68/0x180
    __filemap_fdatawrite_range+0x138/0x180
    file_write_and_wait_range+0xa4/0x110
    ext4_sync_file+0x370/0x6e0
    vfs_fsync_range+0x70/0xf0
    sys_msync+0x220/0x2e0
    system_call+0x5c/0x68

This is because our pmd_trans_huge check doesn't exclude _PAGE_DEVMAP.

To make this all consistent, update pmd_mkdevmap to set
H_PAGE_THP_HUGE and pmd_trans_huge check now excludes _PAGE_DEVMAP
correctly.

Fixes: ebd3119 ("powerpc/mm: Add devmap support for ppc64")
Cc: stable@vger.kernel.org # v4.13+
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200313094842.351830-1-aneesh.kumar@linux.ibm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
popcornmix pushed a commit that referenced this issue Apr 27, 2020
…p PTE entries

commit 36b7840 upstream.

H_PAGE_THP_HUGE is used to differentiate between a THP hugepage and
hugetlb hugepage entries. The difference is WRT how we handle hash
fault on these address. THP address enables MPSS in segments. We want
to manage devmap hugepage entries similar to THP pt entries. Hence use
H_PAGE_THP_HUGE for devmap huge PTE entries.

With current code while handling hash PTE fault, we do set is_thp =
true when finding devmap PTE huge PTE entries.

Current code also does the below sequence we setting up huge devmap
entries.

	entry = pmd_mkhuge(pfn_t_pmd(pfn, prot));
	if (pfn_t_devmap(pfn))
		entry = pmd_mkdevmap(entry);

In that case we would find both H_PAGE_THP_HUGE and PAGE_DEVMAP set
for huge devmap PTE entries. This results in false positive error like
below.

  kernel BUG at /home/kvaneesh/src/linux/mm/memory.c:4321!
  Oops: Exception in kernel mode, sig: 5 [#1]
  LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
  Modules linked in:
  CPU: 56 PID: 67996 Comm: t_mmap_dio Not tainted 5.6.0-rc4-59640-g371c804dedbc #128
  ....
  NIP [c00000000044c9e4] __follow_pte_pmd+0x264/0x900
  LR [c0000000005d45f8] dax_writeback_one+0x1a8/0x740
  Call Trace:
    str_spec.74809+0x22ffb4/0x2d116c (unreliable)
    dax_writeback_one+0x1a8/0x740
    dax_writeback_mapping_range+0x26c/0x700
    ext4_dax_writepages+0x150/0x5a0
    do_writepages+0x68/0x180
    __filemap_fdatawrite_range+0x138/0x180
    file_write_and_wait_range+0xa4/0x110
    ext4_sync_file+0x370/0x6e0
    vfs_fsync_range+0x70/0xf0
    sys_msync+0x220/0x2e0
    system_call+0x5c/0x68

This is because our pmd_trans_huge check doesn't exclude _PAGE_DEVMAP.

To make this all consistent, update pmd_mkdevmap to set
H_PAGE_THP_HUGE and pmd_trans_huge check now excludes _PAGE_DEVMAP
correctly.

Fixes: ebd3119 ("powerpc/mm: Add devmap support for ppc64")
Cc: stable@vger.kernel.org # v4.13+
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200313094842.351830-1-aneesh.kumar@linux.ibm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
popcornmix pushed a commit that referenced this issue Apr 27, 2020
…p PTE entries

commit 36b7840 upstream.

H_PAGE_THP_HUGE is used to differentiate between a THP hugepage and
hugetlb hugepage entries. The difference is WRT how we handle hash
fault on these address. THP address enables MPSS in segments. We want
to manage devmap hugepage entries similar to THP pt entries. Hence use
H_PAGE_THP_HUGE for devmap huge PTE entries.

With current code while handling hash PTE fault, we do set is_thp =
true when finding devmap PTE huge PTE entries.

Current code also does the below sequence we setting up huge devmap
entries.

	entry = pmd_mkhuge(pfn_t_pmd(pfn, prot));
	if (pfn_t_devmap(pfn))
		entry = pmd_mkdevmap(entry);

In that case we would find both H_PAGE_THP_HUGE and PAGE_DEVMAP set
for huge devmap PTE entries. This results in false positive error like
below.

  kernel BUG at /home/kvaneesh/src/linux/mm/memory.c:4321!
  Oops: Exception in kernel mode, sig: 5 [#1]
  LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
  Modules linked in:
  CPU: 56 PID: 67996 Comm: t_mmap_dio Not tainted 5.6.0-rc4-59640-g371c804dedbc #128
  ....
  NIP [c00000000044c9e4] __follow_pte_pmd+0x264/0x900
  LR [c0000000005d45f8] dax_writeback_one+0x1a8/0x740
  Call Trace:
    str_spec.74809+0x22ffb4/0x2d116c (unreliable)
    dax_writeback_one+0x1a8/0x740
    dax_writeback_mapping_range+0x26c/0x700
    ext4_dax_writepages+0x150/0x5a0
    do_writepages+0x68/0x180
    __filemap_fdatawrite_range+0x138/0x180
    file_write_and_wait_range+0xa4/0x110
    ext4_sync_file+0x370/0x6e0
    vfs_fsync_range+0x70/0xf0
    sys_msync+0x220/0x2e0
    system_call+0x5c/0x68

This is because our pmd_trans_huge check doesn't exclude _PAGE_DEVMAP.

To make this all consistent, update pmd_mkdevmap to set
H_PAGE_THP_HUGE and pmd_trans_huge check now excludes _PAGE_DEVMAP
correctly.

Fixes: ebd3119 ("powerpc/mm: Add devmap support for ppc64")
Cc: stable@vger.kernel.org # v4.13+
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200313094842.351830-1-aneesh.kumar@linux.ibm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
popcornmix pushed a commit that referenced this issue May 20, 2020
…p PTE entries

commit 36b7840 upstream.

H_PAGE_THP_HUGE is used to differentiate between a THP hugepage and
hugetlb hugepage entries. The difference is WRT how we handle hash
fault on these address. THP address enables MPSS in segments. We want
to manage devmap hugepage entries similar to THP pt entries. Hence use
H_PAGE_THP_HUGE for devmap huge PTE entries.

With current code while handling hash PTE fault, we do set is_thp =
true when finding devmap PTE huge PTE entries.

Current code also does the below sequence we setting up huge devmap
entries.

	entry = pmd_mkhuge(pfn_t_pmd(pfn, prot));
	if (pfn_t_devmap(pfn))
		entry = pmd_mkdevmap(entry);

In that case we would find both H_PAGE_THP_HUGE and PAGE_DEVMAP set
for huge devmap PTE entries. This results in false positive error like
below.

  kernel BUG at /home/kvaneesh/src/linux/mm/memory.c:4321!
  Oops: Exception in kernel mode, sig: 5 [#1]
  LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
  Modules linked in:
  CPU: 56 PID: 67996 Comm: t_mmap_dio Not tainted 5.6.0-rc4-59640-g371c804dedbc #128
  ....
  NIP [c00000000044c9e4] __follow_pte_pmd+0x264/0x900
  LR [c0000000005d45f8] dax_writeback_one+0x1a8/0x740
  Call Trace:
    str_spec.74809+0x22ffb4/0x2d116c (unreliable)
    dax_writeback_one+0x1a8/0x740
    dax_writeback_mapping_range+0x26c/0x700
    ext4_dax_writepages+0x150/0x5a0
    do_writepages+0x68/0x180
    __filemap_fdatawrite_range+0x138/0x180
    file_write_and_wait_range+0xa4/0x110
    ext4_sync_file+0x370/0x6e0
    vfs_fsync_range+0x70/0xf0
    sys_msync+0x220/0x2e0
    system_call+0x5c/0x68

This is because our pmd_trans_huge check doesn't exclude _PAGE_DEVMAP.

To make this all consistent, update pmd_mkdevmap to set
H_PAGE_THP_HUGE and pmd_trans_huge check now excludes _PAGE_DEVMAP
correctly.

Fixes: ebd3119 ("powerpc/mm: Add devmap support for ppc64")
Cc: stable@vger.kernel.org # v4.13+
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200313094842.351830-1-aneesh.kumar@linux.ibm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
You can’t perform that action at this time.