Really bad sound output #5

asdplayer · 2018-11-25T18:07:56Z

I successfully compiled and installed your kernel, but I ran into problems at runtime.
I used localmodconfig, and everything else is working. I also followed the
instructions given in this repository. I attached a dmesg log and the kernel config I used (I renamed it to .txt just to upload it).
cx2072x_dmesg_while_using.log
cx2072x_kernel_config_4.19.4.txt

I tried to play audio, but there are some problems.
Everything was also checked against a working USB soundcard, the Behringer UMC404HD, to rule out the most obvious issues in user software.

=== Play mp3/m4a/ogg/wav using Audacious player, to ALSA:
plays at 2x speed (and 2x pitch), with some noise, like gross aliasing.
The card is reported twice (this is normal I think), as:
-> sysdefault:CARD=chtcx2072x
-> usbstream:CARD=chtcx2072x
selecting the latter results in an error: "ALSA error: snd_pcm_open failed: Invalid argument." Why does it say "usbstream" anyway? The same is reported by aplay -L (list pcm outputs)

=== Play using Audacity editor, to ALSA:
plays at normal speed and pitch.
(There is consistent lagging of the UI when changing play/capture settings from the toolbars, while the soundcard is disabled and reinitialized many (~10) times, but this is the fault of Audacity.)

=== Play mp3/m4a/ogg/wav/video using Parole video player, which uses gstreamer, which uses PulseAudio:
There is major distortion, but playback speed is OK. See below for a hypothesis.

=== Play mp3/m4a/ogg/wav/video using VLC player, to ALSA:
Audio is played at 2x speed and 2x pitch, but is synchronized to real time. So some length of audio is played, then there is a gap, then it restarts... at random intervals, maybe 3 gaps a second. The total play time is therefore not shortened.
Using the PulseAudio backend instead of direct ALSA, VLC yields exactly the same result as Parole.

=== Play test signals with kokkinizita's jaaa, through JACK, 1024 sample buffer:
I played various test signals.
Playing a 200 Hz sine:

There is a small high-frequency distortion byproduct. The maximum usable output level is about -42 dB (just under the 8-bit noise level?); anything louder produces harsh distortion out of the soundcard. This looks as if the wrong endianness were applied to the audio signal: a small signal stays mainly in the center byte and is represented correctly; the least significant byte is over-represented, resulting essentially in 1-bit error noise magnified by a 16-bit shift; and if the MSB is used, the signal is completely messed up.
A 400 Hz waveform is measured at the soundcard output (I used a tuner app and the Spectroid app on my phone, and my ear roughly confirms it) when no distortion is happening. This looks as if the codec is set to 96 kHz while applications believe they are playing at 48 kHz (because that is what they are told), so they are effectively asked for twice the data per unit of real time.
The 2nd (right) digital channel outputs sound to both channels of the headphones and speakers. Playing the signal on the 1st (left) digital channel results in almost no sound, as if I were hearing just analog crosstalk. (The 2nd channel is right and the 1st is left, I think.)
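
To put rough numbers on the two guesses above, here is a minimal Python sketch. The function names are made up for illustration, and the 96 kHz codec rate is only the hypothesis from this report, not something read back from the driver. It computes the frequency you would hear if samples prepared for one rate are clocked out at another, and how many bits below full scale a given dB level sits.

```python
import math


def perceived_frequency(signal_hz, app_rate_hz, codec_rate_hz):
    """Frequency heard when samples generated for app_rate_hz
    are clocked out by the hardware at codec_rate_hz."""
    return signal_hz * codec_rate_hz / app_rate_hz


def level_db(amplitude_ratio):
    """Level in dB of an amplitude relative to full scale (1.0)."""
    return 20.0 * math.log10(amplitude_ratio)


def bits_below_full_scale(db):
    """How many bits of headroom a (negative) dB level corresponds to.
    One bit is 20*log10(2), roughly 6.02 dB."""
    return -db / (20.0 * math.log10(2.0))


if __name__ == "__main__":
    # Hypothesis: applications think 48 kHz, codec actually runs at 96 kHz.
    print(perceived_frequency(200.0, 48000, 96000))
    # A signal 7 bits below full scale, in dB.
    print(round(level_db(2 ** -7), 2))
    # The observed clipping threshold expressed in bits.
    print(round(bits_below_full_scale(-42.0), 2))
```

With these assumptions a 200 Hz sine comes out at 400 Hz, which matches the measurement, and -42 dB is about 7 bits below full scale. That is compatible with something going wrong in the upper bits of each sample, but it does not prove the endianness guess.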
=== Lowering buffer size to 256 samples or 128 samples:
There is another type of problem in the signal output, and I can't tell for sure what it is. It's as if the buffer cannot be smaller than 512 samples, and it is filled with 0's or random values if I choose a smaller buffer size. Sounds from audio players seem to be played at the right speed now; I'm not sure about pitch. It sounds like the signal is mixed with another frequency that could be a byproduct of incorrect buffer handling. I measured the waveforms with a mobile phone, so this is just "wild" guesswork.

=== Playing various sounds through JACK, with synths and audio players:
Every application that outputs a digital signal above about -45 dB results in harsh distortion out of the soundcard.
Some applications kind of work: they play audio at 2x speed and 2x pitch, and the output gets distorted above about -45 dB. They are: jaaa, synth_v1, audacious (which also has a JACK backend), yoshimi.
Some applications don't work, and sometimes wreck the JACK instance so that the JACK server is running but no application can play any sound anymore: mixxx, fluidsynth.
Bristol crashes 3 seconds after starting, but that is kind of expected, as the code is old and unstable.
VLC (which has a JACK backend, too) sounds as before, pitch and speed are 2x, except that now intervals with and without sounds are equally spaced and have the same duration, like 50%/50%.

JACK is configured for "playback only", because I could not start it in "duplex" mode (so I only have output, and not input + output).
If I try to start JACK in duplex mode, it says something unhelpful like "overall operation failed", and in the dmesg log appear hundreds of these lines:
[ 177.914611] intel_sst_acpi 808622A8:00: sst: Busy wait failed, cant send this msg
This is a good topic for another report maybe.

EDIT:
My computer is the Asus E200H, the soundcard is the CX20723.
And some typo fixes.
And another thing: If you see the need, in a week's time I can attach an oscilloscope to the headphone output to diagnose further; by now I don't have time. Let me know!

heikomat · 2018-11-27T17:50:33Z

Oh damn, you've gone way more in-depth than i ever have, nice work!
I'd be very interested in at least reproducing your findings, but i have only ever worked with debian-based distributions. Getting this to work on arch-based distributions would be really nice though.

this guy seems to have been successful at using this kernel, but he probably used kernel version 4.17, and some things have changed since then.

I see your kernel version is 4.19.4. The newest version i merged is 4.19. Did you apply the fixes of my cx2072x branch yourself?

Could you write up something like a reproduction manual? This way i might see if something is odd or missing, and could try it myself.

asdplayer · 2018-11-28T10:16:50Z

I'm writing the complete process of compilation I followed. Basically I edit Arch linux build scripts (PKGBUILD files), either the ones used to compile official packages or user-provided ones in the AUR.
I made patches from your branch of the kernel. I don't have enough free space on my E200 to pull two source trees at the same time, so, instead of using diff and learning git, I copied and pasted from the Files changed tab of this page in your repository into patches that I modified by hand. They output some warning messages when applied, but otherwise work (how unexpected!). I checked each file by hand afterwards.
By the way, I also wonder why there is a difference in a seemingly unrelated file: drivers/pwm/pwm-lpss.c.
Having both the modified tree AND the patches, I tried various kernels, and I obtained the same results on all of them.

4.19.1, official source tarball, my ugly cx2072x patches, -rt patches, some other patches for Arch and bfq-mq scheduler.
4.19, from your repository, with the same patches (but without the need to use my ugly cx2072x patches).
4.19.4, Arch Linux kernel Git repo and the cx2072x patches. No 'native optimizations' option was available on this one, so I chose 'Intel Atom' instead.

All these kernels used make localmodconfig, then I configured them further with make menuconfig to enable native compiler optimization and the two options for the cx2072x, and to check every option I know about.
The realtime kernels also had many security features disabled, including memory layout randomization and even features I don't entirely understand.
Instead, while configuring the kernel from the official Arch repo, I left everything as default (other than enabling cx2072x and compile optimization for the Atom architecture), so almost all sensible security features are compiled in.
Before the week ends I'll have more storage space, so I'll try to compile another kernel, based only on this repository, and see if it works.
I also tried to read the sources of the cx2072x driver, but I don't understand many things and I don't have proper documentation of the codec. For example it would be good to be able to use the inbuilt equalizer, or the 192 kHz sample rate, but I'm not ready to code yet.
Thank you for the time you spend on this, I'll post the complete compilation steps soon.

heikomat · 2018-11-28T11:05:57Z

The specific (seemingly unrelated) function you mentioned was not originally added by me.
When i started this patched kernel, i based it on fixes from tiwai. He had 3 different branches that fixed different things regarding cherrytrail.

The specific commit including the reasoning for adding the function was this one

I'll later check if it is really no longer needed, and if so, remove it

7twin · 2018-12-25T06:56:22Z

@heikomat @asdplayer
Just saw this - never got a notification for it. I did indeed not have it on 4.19 yet, as on my 4.19 setup I didn't yet have need for sound.

4.20 is supposed to come soon too, so I would be definitely interested in how to get it to work with either 4.19.12 or 4.20.

Thanks for doing all this work for getting sound working!

heikomat · 2018-12-26T02:00:13Z

@7twin nice to see you here :)
I'll merge 4.20 tomorrow (in about 10-12 hours from now) and make the regular debian/ubuntu build

heikomat · 2018-12-26T13:17:26Z

As promised, 4.20 is merged and the installer-script updated

7twin · 2018-12-26T20:51:26Z

Thanks, well done! will try to test it asap!

7twin · 2018-12-28T06:39:45Z

@heikomat can confirm it working, thanks! the only thing I faced was having to rename: /net/netfilter/xt_rateest.ko to /net/netfilter/xt_RATEEST.ko else during make modules_install it would complain about not being able to stat that file.

I did get two warnings after the successful install about xt_rateest_put together with another one possibly not existing, but haven't seen anything break (yet), so not sure if that is anything to be worried about and possibly just the regular warning output.

heikomat · 2018-12-30T20:35:43Z

@7twin what are the steps you took to build the kernel?
Maybe we can dockerize the process and add a download for arch in future versions

7twin · 2018-12-31T00:58:26Z

@heikomat good that you mentioned that, reminded me that I anyway wanted to write down the process of getting it to work on arch for a long time, but kept forgetting it, here it is: https://github.com/7twin/arch_sound_e200ha

I am terrible at docker though, so can't help much there, hopefully somebody can pick up those instructions and create a working docker image.

heikomat · 2018-12-31T10:10:41Z

Thanks for that!
I do a lot with docker, so i'll give it a try

heikomat · 2019-01-02T16:28:54Z

ok, i tried a few things, and building the kernel in a docker container is most likely not a problem.
What is a problem though, is that i'd like to make it build pacman-installable linux-header and linux-image files.
I found some tutorials on how to make them, but they require configuration files (like a PKGBUILD file) to work, which i don't have.

It is probably very possible to write these package-config-files and then use them to create these installable kernel packages, but as i'm not yet that familiar with arch and pacman, creating these would certainly take me hours or even a day or two, because i'd need to read and learn a lot, which i (at this moment) don't want to invest, as i have other things to do.

Titotix · 2019-01-04T22:43:29Z

@asdplayer I had same troubles you described but only for encoded sounds files (so wav for sounds and mov for video were working fine).
I did steps described by 7twin, it works for me, you should give it a try ;)

asdplayer · 2019-01-05T10:36:07Z

@Titotix Thank you, I definitely have to try the new version. Unluckily my internet connection is down because some moron stepped on optical fiber left on the floor under a manhole cover (and obviously made a mess; the fiber is broken somewhere inside the pipe to my house, probably because it was pulled too hard). I had some kind of disconnected holiday, remembering the lost beauty of ADSL. In two days I'll be back at University; if I have time left I'll give this a try. It's unbelievable how short vacations become when in fact you have to work...

asdplayer · 2019-01-09T17:07:07Z

Good evening, I tried the new kernel.
Basically I have the same myriad of problems I had with the previous version.
Here is the PKGBUILD file I used to build the kernel: https://pastebin.com/eCdLU2Zr
To use it (on Arch), download all the files from this page, which holds the files of the official Arch kernel package (by hand, or using ABS (type asp export linux)); then use this PKGBUILD instead of the original one (or edit it yourself).
Here is the .config I used. Copy it into the PKGBUILD directory with the other files, overwriting the existing config file (the one without the leading dot).
The issues I have are similar to those I had with the previous version.
This time audio playback with Audacious player is much better: playback speed is right now, however the aliasing-like artifacts remain. They can be clearly heard on the end of a song, when it fades away but there is still some signal.
Jack audio doesn't work well, apparently applications play at twice the speed, or break because they are aware of the real playback speed they should achieve.
Playback happens only from the second (right) logical channel, to both headphone speakers; playing from the first (left) logical channel only results in a small distorted output (from both headphone speakers), like analog crosstalk. This was tested with Jack, if that matters.
I'm not sure why, but the second time I plugged in the headphones, playback happened through the speakers anyway. It may be because I touched some settings in alsamixer related to the speaker/headphone switching and messed something up.
I had no time to test further, I apologize for this. And I don't understand many things about programming, too. So feel free to ask for more testing, I'll get the notification email and answer ASAP.
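
For anyone who wants to repeat the left/right channel test without JACK, a small sketch like the following could generate the test files. It uses only the Python standard library; the function name, file names, and the -50 dB default level (chosen to stay under the roughly -45 dB distortion threshold reported above) are assumptions made for illustration.

```python
import math
import struct
import wave


def write_channel_test(path, channel, freq_hz=200.0, rate_hz=48000,
                       seconds=2.0, level_db=-50.0):
    """Write a 16-bit stereo WAV with a sine on one channel and
    silence on the other. channel: 0 for left, 1 for right.
    Returns the number of frames written."""
    amplitude = 32767 * 10 ** (level_db / 20.0)
    n_frames = int(rate_hz * seconds)
    frames = bytearray()
    for n in range(n_frames):
        sample = int(round(
            amplitude * math.sin(2 * math.pi * freq_hz * n / rate_hz)))
        left, right = (sample, 0) if channel == 0 else (0, sample)
        # Interleaved little-endian signed 16-bit samples: L, R, L, R, ...
        frames += struct.pack("<hh", left, right)
    with wave.open(path, "wb") as wav:
        wav.setnchannels(2)
        wav.setsampwidth(2)
        wav.setframerate(rate_hz)
        wav.writeframes(bytes(frames))
    return n_frames


if __name__ == "__main__":
    write_channel_test("left_only.wav", channel=0)
    write_channel_test("right_only.wav", channel=1)
```

Playing each file through ALSA (for example with aplay) and noting which headphone speaker produces sound, and at what pitch, should show whether the channel and speed problems also appear outside JACK.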

heikomat · 2019-01-10T08:16:06Z

@asdplayer thanks for the info and config files. I'll give building the kernel packages using docker another try on the weekend, using these files.

If we have installable packages, we might reproduce your issues on other devices. we could also compare the config you used with the one used by @7twin

7twin · 2019-01-10T08:20:47Z

@heikomat absolutely, I'll be happy to upload my .config I sourced on my e200ha if needed

asdplayer · 2019-01-10T12:11:09Z

@heikomat Do you mean you want to try my compiled package? It doesn't seem too secure on your side, however if you want I can upload it on Drive or something.

heikomat · 2019-01-10T12:13:39Z

@asdplayer close. I meant that i'll try to build the package myself, using your config files

asdplayer · 2019-01-10T12:33:11Z

Ok that's nicer, I misunderstood your message. :-| Thank you for your effort by the way.

heikomat · 2019-01-12T22:20:34Z

@7twin @asdplayer Apparently someone already made an arch package for this kernel O.o
https://aur.archlinux.org/packages/linux-cx2072x/

Can you guys check if this works for you? (i still haven't installed arch on my laptop ^^)

7twin · 2019-01-12T23:24:02Z

@heikomat impressive to see, but to be quite honest I am not sure how I would use that, also lines like that make it seem somewhat odd: https://aur.archlinux.org/cgit/aur.git/tree/PKGBUILD?h=linux-cx2072x#n146

heikomat · 2019-01-24T21:26:10Z

Sorry for the delayed progress on this, but this month is really packed for me. A lot of work, and a lot of studying. I'm still interested in debugging this, but I just don't know when I'll be able to do so.

7twin · 2019-01-24T22:44:18Z

@heikomat absolutely no issue, take your time, exams have much higher priority

asdplayer · 2019-01-25T08:21:44Z

No problem, I'm short of time too.

Ido Schimmel says: ==================== mlxsw: Various fixes This patchset contains various small fixes for mlxsw. Patch #1 fixes a warning generated by switchdev core when the driver fails to insert an MDB entry in the commit phase. Patches #2-#4 fix a warning in check_flush_dependency() that can be triggered when a work item in a WQ_MEM_RECLAIM workqueue tries to flush a non-WQ_MEM_RECLAIM workqueue. It seems that the semantics of the WQ_MEM_RECLAIM flag are not very clear [1] and that various patches have been sent to remove it from various workqueues throughout the kernel [2][3][4] in order to silence the warning. These patches do the same for the workqueues created by mlxsw that probably should not have been created with this flag in the first place. Patch #5 fixes a regression where an IP address cannot be assigned to a VRF upper due to erroneous MAC validation check. Patch #6 adds a test case. Patch #7 adjusts Spectrum-2 shared buffer configuration to be compatible with Spectrum-1. The problem and fix are described in detail in the commit message. Please consider patches #1-#5 for 5.0.y. I verified they apply cleanly. [1] https://patchwork.kernel.org/patch/10791315/ [2] Commit ce162bf ("mac80211_hwsim: don't use WQ_MEM_RECLAIM") [3] Commit 39baf10 ("IB/core: Fix use workqueue without WQ_MEM_RECLAIM") [4] Commit 75215e5 ("iwcm: Don't allocate iwcm workqueue with WQ_MEM_RECLAIM") ==================== Signed-off-by: David S. Miller <davem@davemloft.net>

Syzkaller report this: BUG: unable to handle kernel paging request at fffffbfff830524b PGD 237fe8067 P4D 237fe8067 PUD 237e64067 PMD 1c9716067 PTE 0 Oops: 0000 [#1] SMP KASAN PTI CPU: 1 PID: 4465 Comm: syz-executor.0 Not tainted 5.0.0+ #5 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014 RIP: 0010:__list_add_valid+0x21/0xe0 lib/list_debug.c:23 Code: 8b 0c 24 e9 17 fd ff ff 90 55 48 89 fd 48 8d 7a 08 53 48 89 d3 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 48 83 ec 08 <80> 3c 02 00 0f 85 8b 00 00 00 48 8b 53 08 48 39 f2 75 35 48 89 f2 RSP: 0018:ffff8881ea2278d0 EFLAGS: 00010282 RAX: dffffc0000000000 RBX: ffffffffc1829250 RCX: 1ffff1103d444ef4 RDX: 1ffffffff830524b RSI: ffffffff85659300 RDI: ffffffffc1829258 RBP: ffffffffc1879250 R08: fffffbfff0acb269 R09: fffffbfff0acb269 R10: ffff8881ea2278f0 R11: fffffbfff0acb268 R12: ffffffffc1829250 R13: dffffc0000000000 R14: 0000000000000008 R15: ffffffffc187c830 FS: 00007fe0361df700(0000) GS:ffff8881f7300000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: fffffbfff830524b CR3: 00000001eb39a001 CR4: 00000000007606e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: __list_add include/linux/list.h:60 [inline] list_add include/linux/list.h:79 [inline] proto_register+0x444/0x8f0 net/core/sock.c:3375 nr_proto_init+0x73/0x4b3 [netrom] ? 0xffffffffc1628000 ? 
0xffffffffc1628000 do_one_initcall+0xbc/0x47d init/main.c:887 do_init_module+0x1b5/0x547 kernel/module.c:3456 load_module+0x6405/0x8c10 kernel/module.c:3804 __do_sys_finit_module+0x162/0x190 kernel/module.c:3898 do_syscall_64+0x9f/0x450 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x462e99 Code: f7 d8 64 89 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007fe0361dec58 EFLAGS: 00000246 ORIG_RAX: 0000000000000139 RAX: ffffffffffffffda RBX: 000000000073bf00 RCX: 0000000000462e99 RDX: 0000000000000000 RSI: 0000000020000100 RDI: 0000000000000003 RBP: 00007fe0361dec70 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00007fe0361df6bc R13: 00000000004bcefa R14: 00000000006f6fb0 R15: 0000000000000004 Modules linked in: netrom(+) ax25 fcrypt pcbc af_alg arizona_ldo1 v4l2_common videodev media v4l2_dv_timings hdlc ide_cd_mod snd_soc_sigmadsp_regmap snd_soc_sigmadsp intel_spi_platform intel_spi mtd spi_nor snd_usbmidi_lib usbcore lcd ti_ads7950 hi6421_regulator snd_soc_kbl_rt5663_max98927 snd_soc_hdac_hdmi snd_hda_ext_core snd_hda_core snd_soc_rt5663 snd_soc_core snd_pcm_dmaengine snd_compress snd_soc_rl6231 mac80211 rtc_rc5t583 spi_slave_time leds_pwm hid_gt683r hid industrialio_triggered_buffer kfifo_buf industrialio ir_kbd_i2c rc_core led_class_flash dwc_xlgmac snd_ymfpci gameport snd_mpu401_uart snd_rawmidi snd_ac97_codec snd_pcm ac97_bus snd_opl3_lib snd_timer snd_seq_device snd_hwdep snd soundcore iptable_security iptable_raw iptable_mangle iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_filter bpfilter ip6_vti ip_vti ip_gre ipip sit tunnel4 ip_tunnel hsr veth netdevsim vxcan batman_adv cfg80211 rfkill chnl_net caif nlmon dummy team bonding vcan bridge stp llc ip6_gre gre ip6_tunnel tunnel6 tun joydev mousedev ppdev tpm 
kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel ide_pci_generic piix aesni_intel aes_x86_64 crypto_simd cryptd glue_helper ide_core psmouse input_leds i2c_piix4 serio_raw intel_agp intel_gtt ata_generic agpgart pata_acpi parport_pc rtc_cmos parport floppy sch_fq_codel ip_tables x_tables sha1_ssse3 sha1_generic ipv6 [last unloaded: rxrpc] Dumping ftrace buffer: (ftrace buffer empty) CR2: fffffbfff830524b ---[ end trace 039ab24b305c4b19 ]--- If nr_proto_init failed, it may forget to call proto_unregister, tiggering this issue.This patch rearrange code of nr_proto_init to avoid such issues. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>

By calling maps__insert() we assume to get 2 references on the map, which we relese within maps__remove call. However if there's already same map name, we currently don't bump the reference and can crash, like: Program received signal SIGABRT, Aborted. 0x00007ffff75e60f5 in raise () from /lib64/libc.so.6 (gdb) bt #0 0x00007ffff75e60f5 in raise () from /lib64/libc.so.6 #1 0x00007ffff75d0895 in abort () from /lib64/libc.so.6 #2 0x00007ffff75d0769 in __assert_fail_base.cold () from /lib64/libc.so.6 #3 0x00007ffff75de596 in __assert_fail () from /lib64/libc.so.6 #4 0x00000000004fc006 in refcount_sub_and_test (i=1, r=0x1224e88) at tools/include/linux/refcount.h:131 #5 refcount_dec_and_test (r=0x1224e88) at tools/include/linux/refcount.h:148 #6 map__put (map=0x1224df0) at util/map.c:299 #7 0x00000000004fdb95 in __maps__remove (map=0x1224df0, maps=0xb17d80) at util/map.c:953 #8 maps__remove (maps=0xb17d80, map=0x1224df0) at util/map.c:959 #9 0x00000000004f7d8a in map_groups__remove (map=<optimized out>, mg=<optimized out>) at util/map_groups.h:65 #10 machine__process_ksymbol_unregister (sample=<optimized out>, event=0x7ffff7279670, machine=<optimized out>) at util/machine.c:728 #11 machine__process_ksymbol (machine=<optimized out>, event=0x7ffff7279670, sample=<optimized out>) at util/machine.c:741 #12 0x00000000004fffbb in perf_session__deliver_event (session=0xb11390, event=0x7ffff7279670, tool=0x7fffffffc7b0, file_offset=13936) at util/session.c:1362 #13 0x00000000005039bb in do_flush (show_progress=false, oe=0xb17e80) at util/ordered-events.c:243 #14 __ordered_events__flush (oe=0xb17e80, how=OE_FLUSH__ROUND, timestamp=<optimized out>) at util/ordered-events.c:322 torvalds#15 0x00000000005005e4 in perf_session__process_user_event (session=session@entry=0xb11390, event=event@entry=0x7ffff72a4af8, ... Add the map to the list and getting the reference event if we find the map with same name. 
Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Eric Saint-Etienne <eric.saint.etienne@oracle.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Fixes: 1e62856 ("perf symbols: Fix slowness due to -ffunction-section") Link: http://lkml.kernel.org/r/20190416160127.30203-10-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

Michael Chan says:

====================
bnxt_en: Misc. bug fixes.

6 miscellaneous bug fixes covering several issues in error code paths, a setup issue for statistics DMA, and an improvement for setting up multicast address filters.

Please queue these for stable as well. Patch #5 (bnxt_en: Fix statistics context reservation logic) is for the most recent 5.0 stable only.

Thanks.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

Ido Schimmel says:

====================
mlxsw: Various fixes

This patchset contains various fixes for mlxsw.

Patch #1 fixes a hash polarization problem when a nexthop device is a LAG device. This is caused by the fact that the same seed is used for the LAG and ECMP hash functions.

Patch #2 fixes an issue in which the driver fails to refresh a nexthop neighbour after it becomes dead. This prevents the nexthop from ever being written to the adjacency table and used to forward traffic.

Patch #4 fixes a wrong extraction of TOS value in flower offload code.

Patch #5 is a test case.

Patch #6 works around a buffer issue in Spectrum-2 by reducing the default sizes of the shared buffer pools.

Patch #7 prevents prio-tagged packets from entering the switch when PVID is removed from the bridge port.

Please consider patches #2, #4 and #6 for 5.1.y
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
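The hash polarization that patch #1 addresses can be reproduced in a toy model. The sketch below is an assumption-laden illustration in Python, not the mlxsw hash: the `bucket` and `lag_members_used` helpers, the use of BLAKE2 as the hash, and the two-nexthop, two-member topology are all hypothetical. It only shows why reusing one seed for both the ECMP and the LAG stage leaves LAG members idle.

```python
import hashlib


def bucket(flow_id, seed, n):
    """Hash a flow identifier with the given seed and pick one of n buckets."""
    digest = hashlib.blake2b(
        str(flow_id).encode(), key=seed, digest_size=8
    ).digest()
    return int.from_bytes(digest, "big") % n


def lag_members_used(ecmp_seed, lag_seed, flows=1000, n_nexthops=2, n_members=2):
    """Return the set of LAG member ports that carry traffic for nexthop 0."""
    used = set()
    for flow in range(flows):
        # Stage 1: ECMP chooses a nexthop; nexthop 0 is the LAG device.
        if bucket(flow, ecmp_seed, n_nexthops) == 0:
            # Stage 2: the LAG chooses a member port for the same flow.
            used.add(bucket(flow, lag_seed, n_members))
    return used
```

When both stages share a seed and the bucket counts are equal, every flow that reaches nexthop 0 produced bucket 0 in the first stage, so the second stage produces bucket 0 again and only one member port is ever used. With distinct seeds, for example `lag_members_used(b"ecmp", b"lag")`, the second choice is no longer tied to the first and the flows should spread across the members.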

Puts range check before dereferencing the pointer.

Reproducer:

  # echo stacktrace > trace_options
  # echo 1 > events/enable
  # cat trace > /dev/null

KASAN report:

  ==================================================================
  BUG: KASAN: use-after-free in trace_stack_print+0x26b/0x2c0
  Read of size 8 at addr ffff888069d20000 by task cat/1953

  CPU: 0 PID: 1953 Comm: cat Not tainted 5.2.0-rc3+ #5
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-2.fc30 04/01/2014
  Call Trace:
   dump_stack+0x8a/0xce
   print_address_description+0x60/0x224
   ? trace_stack_print+0x26b/0x2c0
   ? trace_stack_print+0x26b/0x2c0
   __kasan_report.cold+0x1a/0x3e
   ? trace_stack_print+0x26b/0x2c0
   kasan_report+0xe/0x20
   trace_stack_print+0x26b/0x2c0
   print_trace_line+0x6ea/0x14d0
   ? tracing_buffers_read+0x700/0x700
   ? trace_find_next_entry_inc+0x158/0x1d0
   s_show+0xea/0x310
   seq_read+0xaa7/0x10e0
   ? seq_escape+0x230/0x230
   __vfs_read+0x7c/0x100
   vfs_read+0x16c/0x3a0
   ksys_read+0x121/0x240
   ? kernel_write+0x110/0x110
   ? perf_trace_sys_enter+0x8a0/0x8a0
   ? syscall_slow_exit_work+0xa9/0x410
   do_syscall_64+0xb7/0x390
   ? prepare_exit_to_usermode+0x165/0x200
   entry_SYSCALL_64_after_hwframe+0x44/0xa9
  RIP: 0033:0x7f867681f910
  Code: b6 fe ff ff 48 8d 3d 0f be 08 00 48 83 ec 08 e8 06 db 01 00 66 0f 1f 44 00 00 83 3d f9 2d 2c 00 00 75 10 b8 00 00 00 00 04
  RSP: 002b:00007ffdabf23488 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
  RAX: ffffffffffffffda RBX: 0000000000020000 RCX: 00007f867681f910
  RDX: 0000000000020000 RSI: 00007f8676cde000 RDI: 0000000000000003
  RBP: 00007f8676cde000 R08: ffffffffffffffff R09: 0000000000000000
  R10: 0000000000000871 R11: 0000000000000246 R12: 00007f8676cde000
  R13: 0000000000000003 R14: 0000000000020000 R15: 0000000000000ec0

  Allocated by task 1214:
   save_stack+0x1b/0x80
   __kasan_kmalloc.constprop.0+0xc2/0xd0
   kmem_cache_alloc+0xaf/0x1a0
   getname_flags+0xd2/0x5b0
   do_sys_open+0x277/0x5a0
   do_syscall_64+0xb7/0x390
   entry_SYSCALL_64_after_hwframe+0x44/0xa9

  Freed by task 1214:
   save_stack+0x1b/0x80
   __kasan_slab_free+0x12c/0x170
   kmem_cache_free+0x8a/0x1c0
   putname+0xe1/0x120
   do_sys_open+0x2c5/0x5a0
   do_syscall_64+0xb7/0x390
   entry_SYSCALL_64_after_hwframe+0x44/0xa9

  The buggy address belongs to the object at ffff888069d20000
   which belongs to the cache names_cache of size 4096
  The buggy address is located 0 bytes inside of
   4096-byte region [ffff888069d20000, ffff888069d21000)
  The buggy address belongs to the page:
  page:ffffea0001a74800 refcount:1 mapcount:0 mapping:ffff88806ccd1380 index:0x0 compound_mapcount: 0
  flags: 0x100000000010200(slab|head)
  raw: 0100000000010200 dead000000000100 dead000000000200 ffff88806ccd1380
  raw: 0000000000000000 0000000000070007 00000001ffffffff 0000000000000000
  page dumped because: kasan: bad access detected

  Memory state around the buggy address:
   ffff888069d1ff00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
   ffff888069d1ff80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  >ffff888069d20000: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                     ^
   ffff888069d20080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
   ffff888069d20100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
  ==================================================================

Link: http://lkml.kernel.org/r/20190610040016.5598-1-devel@etsukata.com
Fixes: 4285f2f ("tracing: Remove the ULONG_MAX stack trace hackery")
Signed-off-by: Eiichi Tsukata <devel@etsukata.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
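Why the order of the two checks matters can be shown with a small model. This is a sketch in Python under stated assumptions, not the tracing code: the function names, the `TERMINATOR` value and the idea that the loop stops either at a terminator entry or at the end of the buffer are hypothetical. The commit message only says that the range check is put before the dereference.

```python
TERMINATOR = 2**64 - 1  # hypothetical end-of-list marker


def callers_deref_first(buf, end):
    """Reads buf[i] before checking that i is still in range."""
    out = []
    i = 0
    while buf[i] != TERMINATOR and i < end:
        out.append(buf[i])
        i += 1
    return out


def callers_range_first(buf, end):
    """Checks the range first, so buf[i] is only read when i < end."""
    out = []
    i = 0
    while i < end and buf[i] != TERMINATOR:
        out.append(buf[i])
        i += 1
    return out
```

For a buffer that is completely full and therefore has no terminator, such as `[1, 2, 3]` with `end=3`, the first variant reads one element past the end before it notices the range limit. Python reports that as an `IndexError`; in C the same pattern reads memory outside the buffer, which is the kind of access KASAN flags in the report above. The second variant stops at the limit and returns all three entries.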

It is possible for an irq triggered by channel0 to be received later, after clks are disabled once the firmware is loaded during sdma probe. If that happens, then clearing them by writing to SDMA_H_INTR won't work and the kernel will hang processing infinite interrupts. Actually, there is no need for an interrupt on channel0, since the current code polls SDMA_H_STATSTOP to know when channel0 is done rather than relying on the interrupt. Just clear BD_INTR to disable the channel0 interrupt and avoid the above case.

This issue was introduced by commit 1d069bf ("dmaengine: imx-sdma: ack channel 0 IRQ in the interrupt handler"), which didn't take care of the above case.

Fixes: 1d069bf ("dmaengine: imx-sdma: ack channel 0 IRQ in the interrupt handler")
Cc: stable@vger.kernel.org #5.0+
Signed-off-by: Robin Gong <yibin.gong@nxp.com>
Reported-by: Sven Van Asbroeck <thesven73@gmail.com>
Tested-by: Sven Van Asbroeck <thesven73@gmail.com>
Reviewed-by: Michael Olbrich <m.olbrich@pengutronix.de>
Signed-off-by: Vinod Koul <vkoul@kernel.org>

Titotix mentioned this issue Jan 4, 2019

Small considerations 7twin/arch_sound_e200ha#1

Open

asdplayer commented Nov 25, 2018 • edited

heikomat commented Nov 27, 2018 • edited

asdplayer commented Nov 28, 2018

heikomat commented Nov 28, 2018 • edited

7twin commented Dec 25, 2018

heikomat commented Dec 26, 2018 • edited

heikomat commented Dec 26, 2018

7twin commented Dec 26, 2018

7twin commented Dec 28, 2018

heikomat commented Dec 30, 2018

7twin commented Dec 31, 2018

heikomat commented Dec 31, 2018

heikomat commented Jan 2, 2019

Titotix commented Jan 4, 2019

asdplayer commented Jan 5, 2019

asdplayer commented Jan 9, 2019

heikomat commented Jan 10, 2019

7twin commented Jan 10, 2019

asdplayer commented Jan 10, 2019

heikomat commented Jan 10, 2019

asdplayer commented Jan 10, 2019

heikomat commented Jan 12, 2019

7twin commented Jan 12, 2019

heikomat commented Jan 24, 2019

7twin commented Jan 24, 2019

asdplayer commented Jan 25, 2019
