Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Audio driver #11

Closed
xc-racer99 opened this issue Jun 27, 2018 · 50 comments
Closed

Audio driver #11

xc-racer99 opened this issue Jun 27, 2018 · 50 comments
Labels
s5pv210 Issues related to s5pv210 SOC

Comments

@xc-racer99
Copy link
Collaborator

There is a 3.3-based wm8994 driver for one of the CDMA Galaxy S variants (Verizon Fascinate/SCH-i500) at https://github.com/syndtr/android-kernel/blob/qss-next/sound/soc/samsung/qss_wm8994.c

The modem part would need to change and it would need to be brought up to date (eg DT), but the rest is likely functional. Android userspace part at https://github.com/syndtr/android_device_samsung_qss/tree/master/audio

Alternatively, we could resurrect the goni driver.

@PabloPL
Copy link
Owner

PabloPL commented Jun 27, 2018

For modem part, maybe we could base on this https://github.com/fourkbomb/linux/tree/modem.

Edit 1:
Also this could be usefull tom3q@dd9af27.

@xc-racer99
Copy link
Collaborator Author

Thanks for the second part, I'd missed the fact that a simple wm8994 driver had been implemented there.

@PabloPL PabloPL added this to the V2 milestone Jun 28, 2018
@PabloPL
Copy link
Owner

PabloPL commented Jul 9, 2018

@xc-racer99 Could You push branch with audio changes and describe what problems are with it?

@xc-racer99
Copy link
Collaborator Author

Created branch https://github.com/PabloPL/linux/tree/wip-audio

Current state: probes, detects chip properly (without interrupts as I can't find an interrupt connected to it from the stock kernel).
Broken: Playing a wav file with sox/play command plays extremely quickly then hangs (with pulseaudio) or just hangs (without pulseaudio).
Notes:
-The pinctrl (audio_pwr) should theoretically be moved into wm8994 node under the gpio ldo property.
-Might want to move the clock node (MCLK) to the wm8994 node, but it is the responsibility of the machine-specific driver to adjust it (see commit 087ee09)
-Based on stock kernel line

	MUX(MOUT_CLKOUT, "mout_clkout", mout_clkout_p, MISC, 8, 2),

in drivers/clk/clk-s5pv210.c should be

	MUX(MOUT_CLKOUT, "mout_clkout", mout_clkout_p, MISC, 8, 3),

-Prior to DT, idma address for i2s0 was 0xc0000000 but is now listed as 0xc0001000 in the s5pv210 dtsi

Currently, I'm working on booting a stock-ish non-DT kernel to test with.

@PabloPL
Copy link
Owner

PabloPL commented Jul 9, 2018

Looking at s5pv210.dtsi, i2s0 is using clk_audss (compatible = "samsung,s5pv210-audss-clock") which is not enabled in defconfig ? (EXYNOS_AUDSS_CLK_CON).

Also according to documentation at Documentation/devicetree/bindings/sound/samsung-i2s.txt

#sound-dai-cells = <0>;

should be

#sound-dai-cells = <1>;

Edit 1:

without interrupts as I can't find an interrupt connected to it from the stock kernel

I hope that You have schematics for Your device ;). Looking at it, i don't see anything which could be irq line.

@xc-racer99
Copy link
Collaborator Author

xc-racer99 commented Jul 9, 2018

EXYNOS_AUDSS_CLK_CON != s5pv210 - s5pv210 audss clk is built whenever CONFIG_ARCH_S5PV210 is defined.

I've managed to get playback working (except sound routing issues, but that can come later) in a 3.3 based kernel so I now know more or less what to look for.

Edit: Documentation for exynos audss clock is a bit misleading in that it mentions s5pv210 - there's a separate driver for s5pv210

@xc-racer99
Copy link
Collaborator Author

Turns out I didn't really have anything majorly wrong, simply pulseaudio won't run properly as the root user. Creating a non-root user and running pulseaudio as it causes playback to work at the correct speed, albeit without no sound. Will need to check into the routing and make sure I've done it correctly.

@xc-racer99
Copy link
Collaborator Author

Alright, turns out when I thought I had things working via pulseaudio it was actually using a dummy output - not the wm8994. I've turned things on it's head a little and started implementing the mainline wm8994 and Samsung I2S drivers on the 3.0 kernel and with Android 4.4 - that work is at https://github.com/xc-racer99/android_kernel_samsung_aries/tree/mainline-wm8994 and https://github.com/xc-racer99/android_device_samsung_aries-common/tree/mainline-wm8994

Currently, the headphones and speaker are confirmed working.

TODO:

  1. Test the microphone
  2. Test calling (may need someone else to test this on the i9000 as well - I think the input clock from my modem is at 32khz while the i9000 comes in at 24khz - but I'm guessing as I don't have schematics for the i9000 - there might even be some master/slave changes)
  3. Convert the Android audio HAL into ALSA UCM
  4. Forward-port the 3.0 mainline machine driver changes to 4.19

WISHLIST:

  1. Add support for Bluetooth audio (uses AIF3, and might be able to do it)
  2. Add support for FM radio (I can't test this as my variant doesn't have an FM radio)
  3. Add support for USB dock audio (I don't have an audio dock, but could probably cobble one together from spare parts)

@PabloPL
Copy link
Owner

PabloPL commented Sep 10, 2018

  • About FM radio i will test it (fixed mainline driver, got patches for it)
  • about USB dock audio - it's probably just a matter of correct resistor. Also on branch for-upstream/fsa9480 there is driver which is responsible for detection what is connected over usb

About modem clock - i have schematic for i9000 (can check), but we will need first to port https://github.com/fourkbomb/linux/tree/modem for i9000.

@xc-racer99
Copy link
Collaborator Author

xc-racer99 commented Sep 10, 2018

For modem, since I have Android set up, I can test it on the 3.0 kernel and assume it works on 4.19.

I don't think we want to port that modem driver - it likely won't work for our devices. AP-CP communication is done via interrupts and onedram (16MB of shared RAM with control being passed between CP and AP) - see https://github.com/xc-racer99/android_kernel_samsung_aries/tree/aosp-7.1/drivers/misc/samsung_modemctl - the folders are used by the S1 series, the individual files are used by crespo (Nexus S) and might actually be a better, cleaner implementation to port - but I have yet to get them working with Replicant's libsamsung-ipc on my variant (however, I expect the i9000 to work out of the box with correct userspace as it shares the xmm6160 modem with crespo - mine uses a different manufacturer and has a few quirks)

USB dock can be made based off instructions at https://forum.xda-developers.com/showthread.php?t=1321491

@xc-racer99
Copy link
Collaborator Author

xc-racer99 commented Sep 11, 2018

Using amixer, the following commands make functional audio on 3.0 kernel:

amixer sset 'SPKR DAC1' on
amixer sset 'SPKL DAC1' on
amixer sset 'DAC2' 96
amixer sset 'DAC1R Mixer AIF1.1' on
amixer sset 'DAC1L Mixer AIF1.1' on
amixer sset 'DAC1' on
amixer sset 'Speaker Mixer' 3
amixer sset 'Speaker' 63
amixer sset 'Speaker Boost' 6
amixer sset 'Speaker' on
amixer sset 'Headphone' off

Note that only the left-hand part of 'Speaker' actually needs to be enabled - the right hand part does nothing. We may need to implement something like https://github.com/fourkbomb/linux/blob/master/sound/soc/samsung/midas_wm1811.c#L160

PabloPL pushed a commit that referenced this issue Oct 11, 2018
This reverts commit e70a3aa.

This change causes use-after-free on dst->_metrics.
The crash trace looks like this:
[   97.763269] BUG: KASAN: use-after-free in ip6_mtu+0x116/0x140
[   97.769038] Read of size 4 at addr ffff881781d2cf84 by task svw_NetThreadEv/8801

[   97.777954] CPU: 76 PID: 8801 Comm: svw_NetThreadEv Not tainted 4.15.0-smp-DEV #11
[   97.777956] Hardware name: Default string Default string/Indus_QC_02, BIOS 5.46.4 03/29/2018
[   97.777957] Call Trace:
[   97.777971]  [<ffffffff895709db>] dump_stack+0x4d/0x72
[   97.777985]  [<ffffffff881651df>] print_address_description+0x6f/0x260
[   97.777997]  [<ffffffff88165747>] kasan_report+0x257/0x370
[   97.778001]  [<ffffffff894488e6>] ? ip6_mtu+0x116/0x140
[   97.778004]  [<ffffffff881658b9>] __asan_report_load4_noabort+0x19/0x20
[   97.778008]  [<ffffffff894488e6>] ip6_mtu+0x116/0x140
[   97.778013]  [<ffffffff892bb91e>] tcp_current_mss+0x12e/0x280
[   97.778016]  [<ffffffff892bb7f0>] ? tcp_mtu_to_mss+0x2d0/0x2d0
[   97.778022]  [<ffffffff887b45b8>] ? depot_save_stack+0x138/0x4a0
[   97.778037]  [<ffffffff87c38985>] ? __mmdrop+0x145/0x1f0
[   97.778040]  [<ffffffff881643b1>] ? save_stack+0xb1/0xd0
[   97.778046]  [<ffffffff89264c82>] tcp_send_mss+0x22/0x220
[   97.778059]  [<ffffffff89273a49>] tcp_sendmsg_locked+0x4f9/0x39f0
[   97.778062]  [<ffffffff881642b4>] ? kasan_check_write+0x14/0x20
[   97.778066]  [<ffffffff89273550>] ? tcp_sendpage+0x60/0x60
[   97.778070]  [<ffffffff881cb359>] ? rw_copy_check_uvector+0x69/0x280
[   97.778075]  [<ffffffff8873c65f>] ? import_iovec+0x9f/0x430
[   97.778078]  [<ffffffff88164be7>] ? kasan_slab_free+0x87/0xc0
[   97.778082]  [<ffffffff8873c5c0>] ? memzero_page+0x140/0x140
[   97.778085]  [<ffffffff881642b4>] ? kasan_check_write+0x14/0x20
[   97.778088]  [<ffffffff89276f6c>] tcp_sendmsg+0x2c/0x50
[   97.778092]  [<ffffffff89276f6c>] ? tcp_sendmsg+0x2c/0x50
[   97.778098]  [<ffffffff89352d43>] inet_sendmsg+0x103/0x480
[   97.778102]  [<ffffffff89352c40>] ? inet_gso_segment+0x15b0/0x15b0
[   97.778105]  [<ffffffff890294da>] sock_sendmsg+0xba/0xf0
[   97.778108]  [<ffffffff8902ab6a>] ___sys_sendmsg+0x6ca/0x8e0
[   97.778113]  [<ffffffff87dccac1>] ? hrtimer_try_to_cancel+0x71/0x3b0
[   97.778116]  [<ffffffff8902a4a0>] ? copy_msghdr_from_user+0x3d0/0x3d0
[   97.778119]  [<ffffffff881646d1>] ? memset+0x31/0x40
[   97.778123]  [<ffffffff87a0cff5>] ? schedule_hrtimeout_range_clock+0x165/0x380
[   97.778127]  [<ffffffff87a0ce90>] ? hrtimer_nanosleep_restart+0x250/0x250
[   97.778130]  [<ffffffff87dcc700>] ? __hrtimer_init+0x180/0x180
[   97.778133]  [<ffffffff87dd1f82>] ? ktime_get_ts64+0x172/0x200
[   97.778137]  [<ffffffff8822b8ec>] ? __fget_light+0x8c/0x2f0
[   97.778141]  [<ffffffff8902d5c6>] __sys_sendmsg+0xe6/0x190
[   97.778144]  [<ffffffff8902d5c6>] ? __sys_sendmsg+0xe6/0x190
[   97.778147]  [<ffffffff8902d4e0>] ? SyS_shutdown+0x20/0x20
[   97.778152]  [<ffffffff87cd4370>] ? wake_up_q+0xe0/0xe0
[   97.778155]  [<ffffffff8902d670>] ? __sys_sendmsg+0x190/0x190
[   97.778158]  [<ffffffff8902d683>] SyS_sendmsg+0x13/0x20
[   97.778162]  [<ffffffff87a1600c>] do_syscall_64+0x2ac/0x430
[   97.778166]  [<ffffffff87c17515>] ? do_page_fault+0x35/0x3d0
[   97.778171]  [<ffffffff8960131f>] ? page_fault+0x2f/0x50
[   97.778174]  [<ffffffff89600071>] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[   97.778177] RIP: 0033:0x7f83fa36000d
[   97.778178] RSP: 002b:00007f83ef9229e0 EFLAGS: 00000293 ORIG_RAX: 000000000000002e
[   97.778180] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007f83fa36000d
[   97.778182] RDX: 0000000000004000 RSI: 00007f83ef922f00 RDI: 0000000000000036
[   97.778183] RBP: 00007f83ef923040 R08: 00007f83ef9231f8 R09: 00007f83ef923168
[   97.778184] R10: 0000000000000000 R11: 0000000000000293 R12: 00007f83f69c5b40
[   97.778185] R13: 000000000000001c R14: 0000000000000001 R15: 0000000000004000

[   97.779684] Allocated by task 5919:
[   97.783185]  save_stack+0x46/0xd0
[   97.783187]  kasan_kmalloc+0xad/0xe0
[   97.783189]  kmem_cache_alloc_trace+0xdf/0x580
[   97.783190]  ip6_convert_metrics.isra.79+0x7e/0x190
[   97.783192]  ip6_route_info_create+0x60a/0x2480
[   97.783193]  ip6_route_add+0x1d/0x80
[   97.783195]  inet6_rtm_newroute+0xdd/0xf0
[   97.783198]  rtnetlink_rcv_msg+0x641/0xb10
[   97.783200]  netlink_rcv_skb+0x27b/0x3e0
[   97.783202]  rtnetlink_rcv+0x15/0x20
[   97.783203]  netlink_unicast+0x4be/0x720
[   97.783204]  netlink_sendmsg+0x7bc/0xbf0
[   97.783205]  sock_sendmsg+0xba/0xf0
[   97.783207]  ___sys_sendmsg+0x6ca/0x8e0
[   97.783208]  __sys_sendmsg+0xe6/0x190
[   97.783209]  SyS_sendmsg+0x13/0x20
[   97.783211]  do_syscall_64+0x2ac/0x430
[   97.783213]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2

[   97.784709] Freed by task 0:
[   97.785056] knetbase: Error: /proc/sys/net/core/txcs_enable does not exist
[   97.794497]  save_stack+0x46/0xd0
[   97.794499]  kasan_slab_free+0x71/0xc0
[   97.794500]  kfree+0x7c/0xf0
[   97.794501]  fib6_info_destroy_rcu+0x24f/0x310
[   97.794504]  rcu_process_callbacks+0x38b/0x1730
[   97.794506]  __do_softirq+0x1c8/0x5d0

Reported-by: John Sperbeck <jsperbeck@google.com>
Signed-off-by: Wei Wang <weiwan@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
@xc-racer99
Copy link
Collaborator Author

xc-racer99 commented Oct 12, 2018

Ok, Audio for non-DT S5PV210 worked up until d37bdf7 but then stopped working (well between 3.13 and d37bdf7 there was a bunch of DMA issues as well).

Edit: 0baf8f6 supposed fixed the DMA issues...
Edit 2: Can confirm above commit fixed the DMA issues, but still don't have anything working...
Edit 3: DMA channel finding issues: the following diff fixes audio after d37bdf7

diff --git a/sound/soc/samsung/dmaengine.c b/sound/soc/samsung/dmaengine.c
index 750ce5808d9f..4a71b94ae2be 100644
--- a/sound/soc/samsung/dmaengine.c
+++ b/sound/soc/samsung/dmaengine.c
@@ -17,6 +17,7 @@
 
 #include <linux/module.h>
 #include <linux/amba/pl08x.h>
+#include <linux/amba/pl330.h>
 
 #include <sound/core.h>
 #include <sound/pcm.h>
@@ -30,7 +31,7 @@
 #ifdef CONFIG_ARCH_S3C64XX
 #define filter_fn pl08x_filter_id
 #else
-#define filter_fn NULL
+#define filter_fn pl330_filter
 #endif
 
 static const struct snd_dmaengine_pcm_config samsung_dmaengine_pcm_config = {

@PabloPL
Copy link
Owner

PabloPL commented Oct 13, 2018

Offtopic: We need to enable CONFIG_DMADEVICES which will enable CONFIG_PL330_DMA (which is used on s5pv210 according to https://patchwork.kernel.org/patch/981252/).

@xc-racer99
Copy link
Collaborator Author

Offtopic: We need to enable CONFIG_DMADEVICES which will enable CONFIG_PL330_DMA (which is used on s5pv210 according to https://patchwork.kernel.org/patch/981252/).

Yep, I got that, even confirmed that DMA is working via dmatest.

Currently, I've confirmed that audio works just before the removal of board files (with the above pl330_filter patch - but with the new clock driver) and with the DT just after it was added (without needing the pl330_filter patch). Diff as of d78c16c:

diff --git a/arch/arm/boot/dts/s5pv210-goni.dts b/arch/arm/boot/dts/s5pv210-goni.dts
index 6387c77a6f7b..c3ba8900fc59 100644
--- a/arch/arm/boot/dts/s5pv210-goni.dts
+++ b/arch/arm/boot/dts/s5pv210-goni.dts
@@ -14,6 +14,7 @@
  */
 
 /dts-v1/;
+#include <dt-bindings/gpio/gpio.h>
 #include <dt-bindings/input/input.h>
 #include "s5pv210.dtsi"
 
@@ -26,7 +27,11 @@
 	};
 
 	chosen {
-		bootargs = "console=ttySAC0,115200n8 root=/dev/mmcblk0p5 rw rootwait ignore_loglevel earlyprintk";
+/*
+		stdout-path = &uart2;
+		bootargs = "root=/dev/mmcblk1p2 rw rootwait ignore_loglevel earlyprintk init=/bin/bash";
+*/
+		bootargs = "root=/dev/mmcblk0p2 rootwait rw console=ttySAC2,115200n8 mem=80M mem=256M@0x40000000 mem=128M@0x50000000 ignore_loglevel earlyprintk mtdparts=b0600000.onenand:256k@25856k(uboot-env),10240k(boot),10240k(recovery),980480k(ubi) ubi.mtd=ubi init=/bin/bash";
 	};
 
 	memory {
@@ -242,6 +247,57 @@
 			gpio-key,wakeup;
 		};
 	};
+
+
+	sound {
+		compatible = "samsung,smdk-wm8994";
+
+		samsung,i2s-controller = <&i2s0>;
+		samsung,audio-codec = <&wm8994>;
+	};
+
+	i2c_sound: i2c-gpio-2 {
+		compatible = "i2c-gpio";
+		gpios = <&mp05 3 (GPIO_ACTIVE_HIGH)>, <&mp05 2 (GPIO_ACTIVE_HIGH)>;
+		i2c-gpio,delay-us = <2>;
+		#address-cells = <1>;
+		#size-cells = <0>;
+
+		wm8994: wm8994@1a {
+			compatible = "wlf,wm8994";
+			reg = <0x1a>;
+
+			gpio-controller;
+			#gpio-cells = <2>;
+
+			clocks = <&xusbxti>;
+			clock-names = "MCLK1";
+
+			AVDD2-supply = <&bat_reg>;
+			CPVDD-supply = <&bat_reg>;
+			DBVDD-supply = <&bat_reg>;
+			/* DCVDD-supply = <&bat_reg>; */
+			SPKVDD1-supply = <&bat_reg>;
+			SPKVDD2-supply = <&bat_reg>;
+
+			wlf,gpio-cfg = <0x0001 0xa101 0x8100 0x8100 0x8100 0xa101
+				0x0100 0x8100 0x0100 0x0100 0x0100>;
+
+			wlf,ldo1ena = <&gpf3 4 GPIO_ACTIVE_HIGH>;
+			wlf,ldo2ena = <&gpf3 4 GPIO_ACTIVE_HIGH>;
+
+			lineout1-se;
+			lineout2-se;
+
+			assigned-clocks = <&clocks MOUT_CLKOUT>;
+			assigned-clock-rates = <0>;
+			assigned-clock-parents = <&xusbxti>;
+		};
+	};
+};
+
+&i2s0 {
+	status = "okay";
 };
 
 &xusbxti {
@@ -322,7 +378,10 @@
 
 &sdhci2 {
 	bus-width = <4>;
+/*
 	cd-gpios = <&gph3 4 1>;
+*/
+	non-removable;
 	vmmc-supply = <&vtf_reg>;
 	cd-inverted;
 	pinctrl-0 = <&sd2_clk &sd2_cmd &sd2_bus4>;
diff --git a/arch/arm/configs/s5pv210_defconfig b/arch/arm/configs/s5pv210_defconfig
index fa989902236d..7748b4c9d792 100644
--- a/arch/arm/configs/s5pv210_defconfig
+++ b/arch/arm/configs/s5pv210_defconfig
@@ -1,31 +1,31 @@
-CONFIG_EXPERIMENTAL=y
-CONFIG_SYSFS_DEPRECATED_V2=y
+CONFIG_NO_HZ=y
+CONFIG_HIGH_RES_TIMERS=y
 CONFIG_BLK_DEV_INITRD=y
 CONFIG_KALLSYMS_ALL=y
 CONFIG_MODULES=y
 CONFIG_MODULE_UNLOAD=y
 # CONFIG_BLK_DEV_BSG is not set
+CONFIG_PARTITION_ADVANCED=y
+CONFIG_BSD_DISKLABEL=y
+CONFIG_SOLARIS_X86_PARTITION=y
 CONFIG_ARCH_S5PV210=y
-CONFIG_S3C_LOWLEVEL_UART_PORT=1
-CONFIG_S3C_DEV_FB=y
-CONFIG_S5PV210_SETUP_FB_24BPP=y
-CONFIG_MACH_AQUILA=y
-CONFIG_MACH_GONI=y
-CONFIG_MACH_SMDKC110=y
-CONFIG_MACH_SMDKV210=y
-CONFIG_NO_HZ=y
-CONFIG_HIGH_RES_TIMERS=y
+# CONFIG_CACHE_L2X0 is not set
 CONFIG_VMSPLIT_2G=y
 CONFIG_PREEMPT=y
 CONFIG_AEABI=y
-CONFIG_CMDLINE="root=/dev/ram0 rw ramdisk=8192 initrd=0x20800000,8M console=ttySAC1,115200 init=/linuxrc"
+CONFIG_ARM_APPENDED_DTB=y
+CONFIG_CMDLINE="console=ttySAC2,115200n8"
 CONFIG_VFP=y
 CONFIG_NEON=y
+CONFIG_NET=y
+CONFIG_UNIX=y
+CONFIG_INET=y
 CONFIG_UEVENT_HELPER_PATH="/sbin/hotplug"
+CONFIG_DEVTMPFS=y
+CONFIG_DEVTMPFS_MOUNT=y
 CONFIG_BLK_DEV_LOOP=y
 CONFIG_BLK_DEV_RAM=y
 CONFIG_BLK_DEV_RAM_SIZE=8192
-# CONFIG_MISC_DEVICES is not set
 CONFIG_SCSI=y
 CONFIG_BLK_DEV_SD=y
 CONFIG_CHR_DEV_SG=y
@@ -33,41 +33,48 @@ CONFIG_INPUT_EVDEV=y
 # CONFIG_INPUT_KEYBOARD is not set
 # CONFIG_INPUT_MOUSE is not set
 CONFIG_INPUT_TOUCHSCREEN=y
-CONFIG_SERIAL_8250=y
 CONFIG_SERIAL_SAMSUNG=y
 CONFIG_SERIAL_SAMSUNG_CONSOLE=y
 CONFIG_HW_RANDOM=y
+CONFIG_I2C=y
+CONFIG_I2C_GPIO=y
+CONFIG_GPIO_WM8994=y
 # CONFIG_HWMON is not set
-# CONFIG_VGA_CONSOLE is not set
-# CONFIG_HID_SUPPORT is not set
+CONFIG_REGULATOR=y
+CONFIG_REGULATOR_FIXED_VOLTAGE=y
+CONFIG_REGULATOR_GPIO=y
+CONFIG_REGULATOR_WM8994=y
+CONFIG_SOUND=y
+CONFIG_SND=y
+CONFIG_SND_SOC=y
+CONFIG_SND_SOC_SAMSUNG=y
+CONFIG_SND_SOC_SAMSUNG_SMDK_WM8994=y
 # CONFIG_USB_SUPPORT is not set
-CONFIG_EXT2_FS=y
-CONFIG_INOTIFY=y
+CONFIG_MMC=y
+CONFIG_MMC_SDHCI=y
+CONFIG_MMC_SDHCI_S3C=y
+CONFIG_DMADEVICES=y
+CONFIG_EXT4_FS=y
 CONFIG_MSDOS_FS=y
 CONFIG_VFAT_FS=y
 CONFIG_TMPFS=y
 CONFIG_TMPFS_POSIX_ACL=y
 CONFIG_CRAMFS=y
 CONFIG_ROMFS_FS=y
-CONFIG_PARTITION_ADVANCED=y
-CONFIG_BSD_DISKLABEL=y
-CONFIG_SOLARIS_X86_PARTITION=y
 CONFIG_NLS_CODEPAGE_437=y
 CONFIG_NLS_ASCII=y
 CONFIG_NLS_ISO8859_1=y
+CONFIG_NLS_UTF8=y
+CONFIG_DEBUG_INFO=y
+CONFIG_DEBUG_FS=y
 CONFIG_MAGIC_SYSRQ=y
 CONFIG_DEBUG_KERNEL=y
 # CONFIG_DEBUG_PREEMPT is not set
 CONFIG_DEBUG_RT_MUTEXES=y
 CONFIG_DEBUG_SPINLOCK=y
 CONFIG_DEBUG_MUTEXES=y
-CONFIG_DEBUG_SPINLOCK_SLEEP=y
-CONFIG_DEBUG_INFO=y
-# CONFIG_RCU_CPU_STALL_DETECTOR is not set
-CONFIG_SYSCTL_SYSCALL_CHECK=y
 CONFIG_DEBUG_USER=y
-CONFIG_DEBUG_ERRORS=y
 CONFIG_DEBUG_LL=y
+CONFIG_DEBUG_S3C_UART2=y
 CONFIG_EARLY_PRINTK=y
-CONFIG_DEBUG_S3C_UART=1
 CONFIG_CRC_CCITT=y
diff --git a/sound/soc/samsung/smdk_wm8994.c b/sound/soc/samsung/smdk_wm8994.c
index 3d6272a8cad2..e7ea4d82deb7 100644
--- a/sound/soc/samsung/smdk_wm8994.c
+++ b/sound/soc/samsung/smdk_wm8994.c
@@ -37,7 +37,7 @@
   */
 
 /* SMDK has a 16.934MHZ crystal attached to WM8994 */
-#define SMDK_WM8994_FREQ 16934000
+#define SMDK_WM8994_FREQ 24000000
 
 struct smdk_wm8994_data {
 	int mclk1_rate;
@@ -90,20 +90,20 @@ static int smdk_wm8994_init_paiftx(struct snd_soc_pcm_runtime *rtd)
 	struct snd_soc_dapm_context *dapm = &codec->dapm;
 
 	/* Other pins NC */
-	snd_soc_dapm_nc_pin(dapm, "HPOUT2P");
-	snd_soc_dapm_nc_pin(dapm, "HPOUT2N");
-	snd_soc_dapm_nc_pin(dapm, "SPKOUTLN");
-	snd_soc_dapm_nc_pin(dapm, "SPKOUTLP");
-	snd_soc_dapm_nc_pin(dapm, "SPKOUTRP");
-	snd_soc_dapm_nc_pin(dapm, "SPKOUTRN");
-	snd_soc_dapm_nc_pin(dapm, "LINEOUT1N");
-	snd_soc_dapm_nc_pin(dapm, "LINEOUT1P");
-	snd_soc_dapm_nc_pin(dapm, "LINEOUT2N");
-	snd_soc_dapm_nc_pin(dapm, "LINEOUT2P");
-	snd_soc_dapm_nc_pin(dapm, "IN1LP");
-	snd_soc_dapm_nc_pin(dapm, "IN2LP:VXRN");
-	snd_soc_dapm_nc_pin(dapm, "IN1RP");
-	snd_soc_dapm_nc_pin(dapm, "IN2RP:VXRP");
+	snd_soc_dapm_enable_pin(dapm, "HPOUT2P");
+	snd_soc_dapm_enable_pin(dapm, "HPOUT2N");
+	snd_soc_dapm_enable_pin(dapm, "SPKOUTLN");
+	snd_soc_dapm_enable_pin(dapm, "SPKOUTLP");
+	snd_soc_dapm_enable_pin(dapm, "SPKOUTRP");
+	snd_soc_dapm_enable_pin(dapm, "SPKOUTRN");
+	snd_soc_dapm_enable_pin(dapm, "LINEOUT1N");
+	snd_soc_dapm_enable_pin(dapm, "LINEOUT1P");
+	snd_soc_dapm_enable_pin(dapm, "LINEOUT2N");
+	snd_soc_dapm_enable_pin(dapm, "LINEOUT2P");
+	snd_soc_dapm_enable_pin(dapm, "IN1LP");
+	snd_soc_dapm_enable_pin(dapm, "IN2LP:VXRN");
+	snd_soc_dapm_enable_pin(dapm, "IN1RP");
+	snd_soc_dapm_enable_pin(dapm, "IN2RP:VXRP");
 
 	return 0;
 }
@@ -120,6 +120,7 @@ static struct snd_soc_dai_link smdk_dai[] = {
 		.dai_fmt = SND_SOC_DAIFMT_I2S | SND_SOC_DAIFMT_NB_NF |
 			SND_SOC_DAIFMT_CBM_CFM,
 		.ops = &smdk_ops,
+#if 0
 	}, { /* Sec_Fifo Playback i/f */
 		.name = "Sec_FIFO TX",
 		.stream_name = "Sec_Dai",
@@ -130,6 +131,7 @@ static struct snd_soc_dai_link smdk_dai[] = {
 		.dai_fmt = SND_SOC_DAIFMT_I2S | SND_SOC_DAIFMT_NB_NF |
 			SND_SOC_DAIFMT_CBM_CFM,
 		.ops = &smdk_ops,
+#endif
 	},
 };
 

@xc-racer99
Copy link
Collaborator Author

Created branch https://github.com/PabloPL/linux/tree/v4.19-rc7-audio with minimal working audio support after running the commands in #11 (comment)

Turns out it was a clock issue with PDMA1 needing PDMA0's clock enabled as well.

I will be creating a new machine driver that is designed for aries that will support things such as the microphone, FM radio, modem, etc

PabloPL pushed a commit that referenced this issue Dec 21, 2018
It was observed that a process blocked indefintely in
__fscache_read_or_alloc_page(), waiting for FSCACHE_COOKIE_LOOKING_UP
to be cleared via fscache_wait_for_deferred_lookup().

At this time, ->backing_objects was empty, which would normaly prevent
__fscache_read_or_alloc_page() from getting to the point of waiting.
This implies that ->backing_objects was cleared *after*
__fscache_read_or_alloc_page was was entered.

When an object is "killed" and then "dropped",
FSCACHE_COOKIE_LOOKING_UP is cleared in fscache_lookup_failure(), then
KILL_OBJECT and DROP_OBJECT are "called" and only in DROP_OBJECT is
->backing_objects cleared.  This leaves a window where
something else can set FSCACHE_COOKIE_LOOKING_UP and
__fscache_read_or_alloc_page() can start waiting, before
->backing_objects is cleared

There is some uncertainty in this analysis, but it seems to be fit the
observations.  Adding the wake in this patch will be handled correctly
by __fscache_read_or_alloc_page(), as it checks if ->backing_objects
is empty again, after waiting.

Customer which reported the hang, also report that the hang cannot be
reproduced with this fix.

The backtrace for the blocked process looked like:

PID: 29360  TASK: ffff881ff2ac0f80  CPU: 3   COMMAND: "zsh"
 #0 [ffff881ff43efbf8] schedule at ffffffff815e56f1
 #1 [ffff881ff43efc58] bit_wait at ffffffff815e64ed
 #2 [ffff881ff43efc68] __wait_on_bit at ffffffff815e61b8
 #3 [ffff881ff43efca0] out_of_line_wait_on_bit at ffffffff815e625e
 #4 [ffff881ff43efd08] fscache_wait_for_deferred_lookup at ffffffffa04f2e8f [fscache]
 #5 [ffff881ff43efd18] __fscache_read_or_alloc_page at ffffffffa04f2ffe [fscache]
 #6 [ffff881ff43efd58] __nfs_readpage_from_fscache at ffffffffa0679668 [nfs]
 #7 [ffff881ff43efd78] nfs_readpage at ffffffffa067092b [nfs]
 #8 [ffff881ff43efda0] generic_file_read_iter at ffffffff81187a73
 #9 [ffff881ff43efe50] nfs_file_read at ffffffffa066544b [nfs]
#10 [ffff881ff43efe70] __vfs_read at ffffffff811fc756
#11 [ffff881ff43efee8] vfs_read at ffffffff811fccfa
#12 [ffff881ff43eff18] sys_read at ffffffff811fda62
#13 [ffff881ff43eff50] entry_SYSCALL_64_fastpath at ffffffff815e986e

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: David Howells <dhowells@redhat.com>
PabloPL pushed a commit that referenced this issue Dec 25, 2018
Commit 9b6f7e1 ("mm: rework memcg kernel stack accounting") will
result in fork failing if allocating a kernel stack for a task in
dup_task_struct exceeds the kernel memory allowance for that cgroup.

Unfortunately, it also results in a crash.

This is due to the code jumping to free_stack and calling
free_thread_stack when the memcg kernel stack charge fails, but without
tsk->stack pointing at the freshly allocated stack.

This in turn results in the vfree_atomic in free_thread_stack oopsing
with a backtrace like this:

#5 [ffffc900244efc88] die at ffffffff8101f0ab
 #6 [ffffc900244efcb8] do_general_protection at ffffffff8101cb86
 #7 [ffffc900244efce0] general_protection at ffffffff818ff082
    [exception RIP: llist_add_batch+7]
    RIP: ffffffff8150d487  RSP: ffffc900244efd98  RFLAGS: 00010282
    RAX: 0000000000000000  RBX: ffff88085ef55980  RCX: 0000000000000000
    RDX: ffff88085ef55980  RSI: 343834343531203a  RDI: 343834343531203a
    RBP: ffffc900244efd98   R8: 0000000000000001   R9: ffff8808578c3600
    R10: 0000000000000000  R11: 0000000000000001  R12: ffff88029f6c21c0
    R13: 0000000000000286  R14: ffff880147759b00  R15: 0000000000000000
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #8 [ffffc900244efda0] vfree_atomic at ffffffff811df2c7
 #9 [ffffc900244efdb8] copy_process at ffffffff81086e37
#10 [ffffc900244efe98] _do_fork at ffffffff810884e0
#11 [ffffc900244eff10] sys_vfork at ffffffff810887ff
#12 [ffffc900244eff20] do_syscall_64 at ffffffff81002a43
    RIP: 000000000049b948  RSP: 00007ffcdb307830  RFLAGS: 00000246
    RAX: ffffffffffffffda  RBX: 0000000000896030  RCX: 000000000049b948
    RDX: 0000000000000000  RSI: 00007ffcdb307790  RDI: 00000000005d7421
    RBP: 000000000067370f   R8: 00007ffcdb3077b0   R9: 000000000001ed00
    R10: 0000000000000008  R11: 0000000000000246  R12: 0000000000000040
    R13: 000000000000000f  R14: 0000000000000000  R15: 000000000088d018
    ORIG_RAX: 000000000000003a  CS: 0033  SS: 002b

The simplest fix is to assign tsk->stack right where it is allocated.

Link: http://lkml.kernel.org/r/20181214231726.7ee4843c@imladris.surriel.com
Fixes: 9b6f7e1 ("mm: rework memcg kernel stack accounting")
Signed-off-by: Rik van Riel <riel@surriel.com>
Acked-by: Roman Gushchin <guro@fb.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
@xc-racer99
Copy link
Collaborator Author

Branch at https://github.com/xc-racer99/linux/tree/modem-audio contains a machine driver along with modem drivers.

Currently known that (with HiFi verb) earpiece, speaker, headphone, mic, and headset mic work with the master branch from https://github.com/xc-racer99/aries_ucm while the tizen_wm1811 also should work (not as fully tested).

Note that the above is with manually running alsaucm (alsaucm set _verb HiFi set _enadev Speaker); pulseaudio works if you remove any input paths from the UCM files but as soon as you add an input path nothing plays or records. I don't know why this issue occurs.

@xc-racer99
Copy link
Collaborator Author

@PabloPL Would you mind testing the above ucm files (put the folders in /usr/share/alsa/ucm) with pulseaudio on Ubuntu? I'm wondering if it's a debian pulseaudio issue as there is similar issue described at maemo-leste/bugtracker#77 which also appears using the debian stretch packages as far as I can tell.

@PabloPL
Copy link
Owner

PabloPL commented Jan 16, 2019

@xc-racer99 Sure, i'll do testing tomorrow (since today it's too late) and leave response here.

@PabloPL
Copy link
Owner

PabloPL commented Jan 17, 2019

[    1.860448] exynos-adc e1700000.adc: e1700000.adc supply vdd not found, using dummy regulator
[    1.877589] exynos-adc e1700000.adc: Linked as a consumer to regulator.0
[    1.894461] wm8994-codec wm8994-codec: Failed to request Mic1 detect IRQ: -22
[    1.911013] wm8994-codec wm8994-codec: Failed to request Mic1 short IRQ: -22
[    1.926933] wm8994-codec wm8994-codec: Failed to request Mic2 detect IRQ: -22
[    1.943177] wm8994-codec wm8994-codec: Failed to request Mic2 short IRQ: -22
[    2.052407] aries-audio-wm8994 sound: wm8994-aif1 <-> samsung-i2s mapping ok
[    2.069144] aries-audio-wm8994 sound: wm8994-aif2 <-> Voice call mapping ok
[    2.085523] aries-audio-wm8994 sound: wm8994-aif3 <-> Bluetooth mapping ok
[    2.109140] wm8994 10-001a: Failed to write 102 = 3: -6
[    2.117647] wm8994 10-001a: Failed to restore register map: -6                                                                                                                                                                                  
[    2.129738] input: Aries Dock as /devices/platform/sound/sound/card0/input0
[    2.146734] input: Aries Headset as /devices/platform/sound/sound/card0/input1

@xc-racer99
Copy link
Collaborator Author

The adc supply is normal, the missing IRQ is normal (irq line isn't wired up).

The "Failed to write 102"/"Failed to restore register map" messages are something that isn't present on my variant... Looks like I might have messed up something on the galaxys dts

@xc-racer99
Copy link
Collaborator Author

Can confirm that on Debian Jessie pulseaudio (v5.0-13) works just fine. jessie-backports pulseaudio of v7.1-2~bpo8+1 also works just fine.

Next up is Buster (pulseaudio v12.2-3) to see if it works, unlike stretch (v10.0-1+deb9u1).

@xc-racer99
Copy link
Collaborator Author

Ok, I'm very confused now :)

Buster didn't work either. However, I have now discovered that audio will work on Stretch+ for 5s after the pulseaudio daemon is started (probably until module-suspend-on-idle kicks in).

So I figured no problem, I'll just bisect the issue in pulseaudio on jessie. I started with installing the pulseaudio and alsa packages from stretch on jessie, and they work.

I have posted a comparison of the pulseaudio output and the pactl list between stretch and jessie with stretch audio packages at https://gist.github.com/xc-racer99/ee7181c3f21a97f8cc321e3bc71c1d86 and I see no appreciable difference.

At this point, I'm going to assume that I'm missing some other package on stretch+ and just debug the UCM files on jessie (I'm still working a bit on getting all jacks to work and having all outputs working). If anyone has any ideas, let me know!

@PabloPL
Copy link
Owner

PabloPL commented Jan 25, 2019

Any progress on this (which is final branch)? I want to check why it was failing for me on galaxys (those errors on boot).

@xc-racer99
Copy link
Collaborator Author

I haven't really cleaned things up yet; my wip branch is at https://github.com/xc-racer99/linux/tree/modem-audio

https://github.com/xc-racer99/aries-hw-files/tree/older-ucm contains some very messy WIP UCM files that may or may not be updated for Aries...

Note that I'm unable to reproduce the the galaxys failing on my device, I rechecked some GPIOs and everything looked good...

@PabloPL
Copy link
Owner

PabloPL commented Jan 25, 2019

Ok, do You want me to check this on galaxys or i should wait till it get finished (and i'll just look at different thing now - that's no problem)?

@xc-racer99
Copy link
Collaborator Author

Up to you. If you could try and see if it probes correctly now that would be great. UCM files will need some work and some testing on fm radio.

@PabloPL PabloPL added the s5pv210 Issues related to s5pv210 SOC label Jun 24, 2019
@xc-racer99
Copy link
Collaborator Author

xc-racer99 commented Jul 16, 2019

Ok, so the

[    2.124174] wm8994 10-001a: Failed to write 102 = 3: -6
[    2.132880] wm8994 10-001a: Failed to restore register map: -6

bug has come back to bit me :) Digging into it again, it appears that it is related to the fact that we're not waiting long enough after enabling gpf3-4. I'm guessing based on what registers are on in which power map, that gpf3-4 is not connected to the LDO1 pin on the wm8994, but rather is connected to DBVDD. Regardless, I've updated the DTS to show it this way, as then we can set a startup-delay-us of 80000 (80ms) which is long enough for everything to be powered properly.

Edit: Maybe this isn't even the issue, as it has shown up again. I wish I could figure out why...

Branch at https://github.com/xc-racer99/linux/tree/modem-audio updated.

Note that I've decided to drop the fully_route property from the ASoC driver. This means that the userspace UCM files will need some adjusting as all of the (virtual) modem controls aren't needed.

@PabloPL
Copy link
Owner

PabloPL commented Jul 17, 2019

GPF3-4 is connected to LDO1ENA and LDO2ENA pins of wm8994.
DBVDD is connected to VCC_1.8V_PDA which goes from max8998 (LX3 pin).

@PabloPL PabloPL removed this from the V2 milestone Jul 17, 2019
@xc-racer99
Copy link
Collaborator Author

DBVDD is connected to VCC_1.8V_PDA which goes from max8998 (LX3 pin).

I'm guessing LX3 pin is buck3 in DTS? Where are other regulators supplied from? We have

  • DBVDD - buck3
  • DCVDD - from LDO2?
  • AVDD1 - from LDO1?
  • AVDD2 - ??? (buck3 as well based on recommended voltage of 1.8V)
  • CPVDD - ??? (buck3 as well based on recommended voltage of 1.8V)
  • SPKVDD1 - ??? (recommended voltage of 5V, minimum of 2.7V)
  • SPKVDD2 (Right SPK channel, N/C?)

@PabloPL
Copy link
Owner

PabloPL commented Jul 17, 2019

AVDD1 - GND
AVDD2/CPVDD/DBVDD/LDO2VDD - buck3
DCVDD - GND
LDO1VDD/SPKVDD1/SPKVDD2 - V_BAT (could not find what is value for this)

Edit 1:
On goni V_BAT is set to 3700000 microvolts.

Edit 2:
Hm... tom3q had it set to 4200000 tom3q@dd9af27

@xc-racer99
Copy link
Collaborator Author

Edit 1:
On goni V_BAT is set to 3700000 microvolts.

Edit 2:
Hm... tom3q had it set to 4200000 tom3q@dd9af27

Yep, I noticed those too. However, tom3q also has GPH1(0) as controlling V_BAT, while we appear to have no such pin.

It could be that V_BAT is directly connected from the battery itself. That would explain the 3.7V

@PabloPL
Copy link
Owner

PabloPL commented Jul 24, 2019

@xc-racer99 Did You tried to do something like this

wm8994_codec_en: wm8994-codec-en {
	samsung,pins = "gpf3-4";
	samsung,pin-function = <EXYNOS_PIN_FUNC_OUTPUT>;
	samsung,pin-pud = <S3C64XX_PIN_PULL_NONE>;
	samsung,pin-drv = <EXYNOS4_PIN_DRV_LV1>;
	samsung,pin-val = <1>;
};

And then in wm8994 node reference it in

pinctrl-names = "default";
pinctrl-0 = <&wm8994_codec_en>;

Just a idea

@xc-racer99
Copy link
Collaborator Author

Yeah, I have tried something like that - it didn't work. There is something really off in the power management of the entire wm8994 code, because about 10s into playing via pulseaudio, pulseaudio detects things as not in use and tries to suspend them...

I should really hook into kgdb and find out exactly when things are being called.

@xc-racer99
Copy link
Collaborator Author

Ugh, finally figured out why I was having some many issues with jack detection - the GPIO on my main testing device must be partially broken as it doesn't work properly in Android either. Means I can now firm up jack detection.

Still haven't figured out why we sometimes have errors writing to the i2c and sometimes don't. Still need to connect KGDB to figure this one out.

@PabloPL
Copy link
Owner

PabloPL commented Aug 13, 2019

@xc-racer99
Could You try

alsactl store -f alsa_state.txt

And later restore

alsactl restore -f alsa_state.txt

On first one i'm getting

alsactl: get_control:256: Cannot read control '2,0,0,IN1L Volume,0': No such device or address

I wonder from where it got that IN1L...

@xc-racer99
Copy link
Collaborator Author

IN1L is connected to the main microphone. alsactl is probably having issues reading the control because when we get the Failed to write 102 = 3: -6 it essentially has brought the device to its knees and then controls stop working either (really all i2c communication).

I have figured out a workaround for this error though - it appears that we need a bit of a timeout after disabling the regulators before enabling them again. A patch like xc-racer99@e1b619d works, although you can also add an "off_on_delay" to the wm8994 regulators to get the same effect. I've uploaded a newer branch at https://github.com/xc-racer99/linux/tree/v5.3-rc1-audio that appears to be mostly functional, but is still somewhat a WIP (I don't have the i9000 dts updated there, but it's quite similar to the fascinate one).

@PabloPL
Copy link
Owner

PabloPL commented Aug 13, 2019

UCM files stays the same?

@xc-racer99
Copy link
Collaborator Author

UCM files stays the same?

Modem has changed slightly, but the main HiFi should be the same (I haven't changed the one's I'm using)

PabloPL pushed a commit that referenced this issue Aug 17, 2019
A deadlock with this stacktrace was observed.

The loop thread does a GFP_KERNEL allocation, it calls into dm-bufio
shrinker and the shrinker depends on I/O completion in the dm-bufio
subsystem.

In order to fix the deadlock (and other similar ones), we set the flag
PF_MEMALLOC_NOIO at loop thread entry.

PID: 474    TASK: ffff8813e11f4600  CPU: 10  COMMAND: "kswapd0"
   #0 [ffff8813dedfb938] __schedule at ffffffff8173f405
   #1 [ffff8813dedfb990] schedule at ffffffff8173fa27
   #2 [ffff8813dedfb9b0] schedule_timeout at ffffffff81742fec
   #3 [ffff8813dedfba60] io_schedule_timeout at ffffffff8173f186
   #4 [ffff8813dedfbaa0] bit_wait_io at ffffffff8174034f
   #5 [ffff8813dedfbac0] __wait_on_bit at ffffffff8173fec8
   #6 [ffff8813dedfbb10] out_of_line_wait_on_bit at ffffffff8173ff81
   #7 [ffff8813dedfbb90] __make_buffer_clean at ffffffffa038736f [dm_bufio]
   #8 [ffff8813dedfbbb0] __try_evict_buffer at ffffffffa0387bb8 [dm_bufio]
   #9 [ffff8813dedfbbd0] dm_bufio_shrink_scan at ffffffffa0387cc3 [dm_bufio]
  #10 [ffff8813dedfbc40] shrink_slab at ffffffff811a87ce
  #11 [ffff8813dedfbd30] shrink_zone at ffffffff811ad778
  #12 [ffff8813dedfbdc0] kswapd at ffffffff811ae92f
  #13 [ffff8813dedfbec0] kthread at ffffffff810a8428
  #14 [ffff8813dedfbf50] ret_from_fork at ffffffff81745242

  PID: 14127  TASK: ffff881455749c00  CPU: 11  COMMAND: "loop1"
   #0 [ffff88272f5af228] __schedule at ffffffff8173f405
   #1 [ffff88272f5af280] schedule at ffffffff8173fa27
   #2 [ffff88272f5af2a0] schedule_preempt_disabled at ffffffff8173fd5e
   #3 [ffff88272f5af2b0] __mutex_lock_slowpath at ffffffff81741fb5
   #4 [ffff88272f5af330] mutex_lock at ffffffff81742133
   #5 [ffff88272f5af350] dm_bufio_shrink_count at ffffffffa03865f9 [dm_bufio]
   #6 [ffff88272f5af380] shrink_slab at ffffffff811a86bd
   #7 [ffff88272f5af470] shrink_zone at ffffffff811ad778
   #8 [ffff88272f5af500] do_try_to_free_pages at ffffffff811adb34
   #9 [ffff88272f5af590] try_to_free_pages at ffffffff811adef8
  #10 [ffff88272f5af610] __alloc_pages_nodemask at ffffffff811a09c3
  #11 [ffff88272f5af710] alloc_pages_current at ffffffff811e8b71
  #12 [ffff88272f5af760] new_slab at ffffffff811f4523
  #13 [ffff88272f5af7b0] __slab_alloc at ffffffff8173a1b5
  #14 [ffff88272f5af880] kmem_cache_alloc at ffffffff811f484b
  #15 [ffff88272f5af8d0] do_blockdev_direct_IO at ffffffff812535b3
  #16 [ffff88272f5afb00] __blockdev_direct_IO at ffffffff81255dc3
  #17 [ffff88272f5afb30] xfs_vm_direct_IO at ffffffffa01fe3fc [xfs]
  #18 [ffff88272f5afb90] generic_file_read_iter at ffffffff81198994
  #19 [ffff88272f5afc50] __dta_xfs_file_read_iter_2398 at ffffffffa020c970 [xfs]
  #20 [ffff88272f5afcc0] lo_rw_aio at ffffffffa0377042 [loop]
  #21 [ffff88272f5afd70] loop_queue_work at ffffffffa0377c3b [loop]
  #22 [ffff88272f5afe60] kthread_worker_fn at ffffffff810a8a0c
  #23 [ffff88272f5afec0] kthread at ffffffff810a8428
  #24 [ffff88272f5aff50] ret_from_fork at ffffffff81745242

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
PabloPL pushed a commit that referenced this issue Aug 24, 2019
Revert the commit bd293d0. The proper
fix has been made available with commit d0a255e ("loop: set
PF_MEMALLOC_NOIO for the worker thread").

Note that the fix offered by commit bd293d0 doesn't really prevent
the deadlock from occuring - if we look at the stacktrace reported by
Junxiao Bi, we see that it hangs in bit_wait_io and not on the mutex -
i.e. it has already successfully taken the mutex. Changing the mutex
from mutex_lock to mutex_trylock won't help with deadlocks that happen
afterwards.

PID: 474    TASK: ffff8813e11f4600  CPU: 10  COMMAND: "kswapd0"
   #0 [ffff8813dedfb938] __schedule at ffffffff8173f405
   #1 [ffff8813dedfb990] schedule at ffffffff8173fa27
   #2 [ffff8813dedfb9b0] schedule_timeout at ffffffff81742fec
   #3 [ffff8813dedfba60] io_schedule_timeout at ffffffff8173f186
   #4 [ffff8813dedfbaa0] bit_wait_io at ffffffff8174034f
   #5 [ffff8813dedfbac0] __wait_on_bit at ffffffff8173fec8
   #6 [ffff8813dedfbb10] out_of_line_wait_on_bit at ffffffff8173ff81
   #7 [ffff8813dedfbb90] __make_buffer_clean at ffffffffa038736f [dm_bufio]
   #8 [ffff8813dedfbbb0] __try_evict_buffer at ffffffffa0387bb8 [dm_bufio]
   #9 [ffff8813dedfbbd0] dm_bufio_shrink_scan at ffffffffa0387cc3 [dm_bufio]
  #10 [ffff8813dedfbc40] shrink_slab at ffffffff811a87ce
  #11 [ffff8813dedfbd30] shrink_zone at ffffffff811ad778
  #12 [ffff8813dedfbdc0] kswapd at ffffffff811ae92f
  #13 [ffff8813dedfbec0] kthread at ffffffff810a8428
  #14 [ffff8813dedfbf50] ret_from_fork at ffffffff81745242

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
Fixes: bd293d0 ("dm bufio: fix deadlock with loop device")
Depends-on: d0a255e ("loop: set PF_MEMALLOC_NOIO for the worker thread")
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
PabloPL pushed a commit that referenced this issue Nov 3, 2019
There are three places where we access uninitialized memmaps, namely:
- /proc/kpagecount
- /proc/kpageflags
- /proc/kpagecgroup

We have initialized memmaps either when the section is online or when the
page was initialized to the ZONE_DEVICE.  Uninitialized memmaps contain
garbage and in the worst case trigger kernel BUGs, especially with
CONFIG_PAGE_POISONING.

For example, not onlining a DIMM during boot and calling /proc/kpagecount
with CONFIG_PAGE_POISONING:

  :/# cat /proc/kpagecount > tmp.test
  BUG: unable to handle page fault for address: fffffffffffffffe
  #PF: supervisor read access in kernel mode
  #PF: error_code(0x0000) - not-present page
  PGD 114616067 P4D 114616067 PUD 114618067 PMD 0
  Oops: 0000 [#1] SMP NOPTI
  CPU: 0 PID: 469 Comm: cat Not tainted 5.4.0-rc1-next-20191004+ #11
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.4
  RIP: 0010:kpagecount_read+0xce/0x1e0
  Code: e8 09 83 e0 3f 48 0f a3 02 73 2d 4c 89 e7 48 c1 e7 06 48 03 3d ab 51 01 01 74 1d 48 8b 57 08 480
  RSP: 0018:ffffa14e409b7e78 EFLAGS: 00010202
  RAX: fffffffffffffffe RBX: 0000000000020000 RCX: 0000000000000000
  RDX: 0000000000000001 RSI: 00007f76b5595000 RDI: fffff35645000000
  RBP: 00007f76b5595000 R08: 0000000000000001 R09: 0000000000000000
  R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000140000
  R13: 0000000000020000 R14: 00007f76b5595000 R15: ffffa14e409b7f08
  FS:  00007f76b577d580(0000) GS:ffff8f41bd400000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: fffffffffffffffe CR3: 0000000078960000 CR4: 00000000000006f0
  Call Trace:
   proc_reg_read+0x3c/0x60
   vfs_read+0xc5/0x180
   ksys_read+0x68/0xe0
   do_syscall_64+0x5c/0xa0
   entry_SYSCALL_64_after_hwframe+0x49/0xbe

For now, let's drop support for ZONE_DEVICE from the three pseudo files
in order to fix this.  To distinguish offline memory (with garbage
memmap) from ZONE_DEVICE memory with properly initialized memmaps, we
would have to check get_dev_pagemap() and pfn_zone_device_reserved()
right now.  The usage of both (especially, special casing devmem) is
frowned upon and needs to be reworked.

The fundamental issue we have is:

	if (pfn_to_online_page(pfn)) {
		/* memmap initialized */
	} else if (pfn_valid(pfn)) {
		/*
		 * ???
		 * a) offline memory. memmap garbage.
		 * b) devmem: memmap initialized to ZONE_DEVICE.
		 * c) devmem: reserved for driver. memmap garbage.
		 * (d) devmem: memmap currently initializing - garbage)
		 */
	}

We'll leave the pfn_zone_device_reserved() check in stable_page_flags()
in place as that function is also used from memory failure.  We now no
longer dump information about pages that are not in use anymore -
offline.

Link: http://lkml.kernel.org/r/20191009142435.3975-2-david@redhat.com
Fixes: f1dd2cd ("mm, memory_hotplug: do not associate hotadded memory to zones until online")	[visible after d0dc12e]
Signed-off-by: David Hildenbrand <david@redhat.com>
Reported-by: Qian Cai <cai@lca.pw>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Toshiki Fukasawa <t-fukasawa@vx.jp.nec.com>
Cc: Pankaj gupta <pagupta@redhat.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Anthony Yznaga <anthony.yznaga@oracle.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: <stable@vger.kernel.org>	[4.13+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
PabloPL pushed a commit that referenced this issue Jan 11, 2020
Properly initialize refcount to 1 when hardware queue arrays for
TC-MQPRIO offload have been freshly allocated. Otherwise, following
warning is observed. Also fix up error path to only free hardware
queue arrays when refcount reaches 0.

[  130.075342] ------------[ cut here ]------------
[  130.075343] refcount_t: addition on 0; use-after-free.
[  130.075355] WARNING: CPU: 0 PID: 10870 at lib/refcount.c:25
refcount_warn_saturate+0xe1/0x100
[  130.075356] Modules linked in: sch_mqprio iptable_nat ib_iser
libiscsi scsi_transport_iscsi ib_ipoib rdma_ucm ib_umad iw_cxgb4 libcxgb
ib_uverbs x86_pkg_temp_thermal cxgb4 igb
[  130.075361] CPU: 0 PID: 10870 Comm: tc Kdump: loaded Not tainted
5.5.0-rc1+ #11
[  130.075362] Hardware name: Supermicro
X9SRE/X9SRE-3F/X9SRi/X9SRi-3F/X9SRE/X9SRE-3F/X9SRi/X9SRi-3F, BIOS 3.2
01/16/2015
[  130.075363] RIP: 0010:refcount_warn_saturate+0xe1/0x100
[  130.075364] Code: e8 14 41 c1 ff 0f 0b c3 80 3d 44 f4 10 01 00 0f 85
63 ff ff ff 48 c7 c7 38 9f 83 8c 31 c0 c6 05 2e f4 10 01 01 e8 ef 40 c1
ff <0f> 0b c3 48 c7 c7 10 9f 83 8c 31 c0 c6 05 17 f4 10 01 01 e8 d7 40
[  130.075365] RSP: 0018:ffffa48d00c0b768 EFLAGS: 00010286
[  130.075366] RAX: 0000000000000000 RBX: 0000000000000008 RCX:
0000000000000001
[  130.075366] RDX: 0000000000000001 RSI: 0000000000000096 RDI:
ffff8a2e9fa187d0
[  130.075367] RBP: ffff8a2e93890000 R08: 0000000000000398 R09:
000000000000003c
[  130.075367] R10: 00000000000142a0 R11: 0000000000000397 R12:
ffffa48d00c0b848
[  130.075368] R13: ffff8a2e94746498 R14: ffff8a2e966f7000 R15:
0000000000000031
[  130.075368] FS:  00007f689015f840(0000) GS:ffff8a2e9fa00000(0000)
knlGS:0000000000000000
[  130.075369] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  130.075369] CR2: 00000000006762a0 CR3: 00000007cf164005 CR4:
00000000001606f0
[  130.075370] Call Trace:
[  130.075377]  cxgb4_setup_tc_mqprio+0xbee/0xc30 [cxgb4]
[  130.075382]  ? cxgb4_ethofld_restart+0x50/0x50 [cxgb4]
[  130.075384]  ? pfifo_fast_init+0x7e/0xf0
[  130.075386]  mqprio_init+0x5f4/0x630 [sch_mqprio]
[  130.075389]  qdisc_create+0x1bf/0x4a0
[  130.075390]  tc_modify_qdisc+0x1ff/0x770
[  130.075392]  rtnetlink_rcv_msg+0x28b/0x350
[  130.075394]  ? rtnl_calcit.isra.32+0x110/0x110
[  130.075395]  netlink_rcv_skb+0xc6/0x100
[  130.075396]  netlink_unicast+0x1db/0x330
[  130.075397]  netlink_sendmsg+0x2f5/0x460
[  130.075399]  ? _copy_from_user+0x2e/0x60
[  130.075400]  sock_sendmsg+0x59/0x70
[  130.075401]  ____sys_sendmsg+0x1f0/0x230
[  130.075402]  ? copy_msghdr_from_user+0xd7/0x140
[  130.075403]  ___sys_sendmsg+0x77/0xb0
[  130.075404]  ? ___sys_recvmsg+0x84/0xb0
[  130.075406]  ? __handle_mm_fault+0x377/0xaf0
[  130.075407]  __sys_sendmsg+0x53/0xa0
[  130.075409]  do_syscall_64+0x44/0x130
[  130.075412]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  130.075413] RIP: 0033:0x7f688f13af10
[  130.075414] Code: c3 48 8b 05 82 6f 2c 00 f7 db 64 89 18 48 83 cb ff
eb dd 0f 1f 80 00 00 00 00 83 3d 8d d0 2c 00 00 75 10 b8 2e 00 00 00 0f
05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 ae cc 00 00 48 89 04 24
[  130.075414] RSP: 002b:00007ffe6c7d9988 EFLAGS: 00000246 ORIG_RAX:
000000000000002e
[  130.075415] RAX: ffffffffffffffda RBX: 00000000006703a0 RCX:
00007f688f13af10
[  130.075415] RDX: 0000000000000000 RSI: 00007ffe6c7d99f0 RDI:
0000000000000003
[  130.075416] RBP: 000000005df38312 R08: 0000000000000002 R09:
0000000000008000
[  130.075416] R10: 00007ffe6c7d93e0 R11: 0000000000000246 R12:
0000000000000000
[  130.075417] R13: 00007ffe6c7e9c50 R14: 0000000000000001 R15:
000000000067c600
[  130.075418] ---[ end trace 8fbb3bf36a8671db ]---

v2:
- Move the refcount_set() closer to where the hardware queue arrays
  are being allocated.
- Fix up error path to only free hardware queue arrays when refcount
  reaches 0.

Fixes: 2d0cb84 ("cxgb4: add ETHOFLD hardware queue support")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
@xc-racer99
Copy link
Collaborator Author

Audio machine driver upstream, but no DTS yet.

PabloPL pushed a commit that referenced this issue Oct 3, 2020
The aliases were never released causing the following leaks:

  Indirect leak of 1224 byte(s) in 9 object(s) allocated from:
    #0 0x7feefb830628 in malloc (/lib/x86_64-linux-gnu/libasan.so.5+0x107628)
    #1 0x56332c8f1b62 in __perf_pmu__new_alias util/pmu.c:322
    #2 0x56332c8f401f in pmu_add_cpu_aliases_map util/pmu.c:778
    #3 0x56332c792ce9 in __test__pmu_event_aliases tests/pmu-events.c:295
    #4 0x56332c792ce9 in test_aliases tests/pmu-events.c:367
    #5 0x56332c76a09b in run_test tests/builtin-test.c:410
    #6 0x56332c76a09b in test_and_print tests/builtin-test.c:440
    #7 0x56332c76ce69 in __cmd_test tests/builtin-test.c:695
    #8 0x56332c76ce69 in cmd_test tests/builtin-test.c:807
    #9 0x56332c7d2214 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:312
    #10 0x56332c6701a8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:364
    #11 0x56332c6701a8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:408
    #12 0x56332c6701a8 in main /home/namhyung/project/linux/tools/perf/perf.c:538
    #13 0x7feefb359cc9 in __libc_start_main ../csu/libc-start.c:308

Fixes: 956a783 ("perf test: Test pmu-events aliases")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: John Garry <john.garry@huawei.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200915031819.386559-11-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
PabloPL pushed a commit that referenced this issue Oct 3, 2020
The evsel->unit borrows a pointer of pmu event or alias instead of
owns a string.  But tool event (duration_time) passes a result of
strdup() caused a leak.

It was found by ASAN during metric test:

  Direct leak of 210 byte(s) in 70 object(s) allocated from:
    #0 0x7fe366fca0b5 in strdup (/lib/x86_64-linux-gnu/libasan.so.5+0x920b5)
    #1 0x559fbbcc6ea3 in add_event_tool util/parse-events.c:414
    #2 0x559fbbcc6ea3 in parse_events_add_tool util/parse-events.c:1414
    #3 0x559fbbd8474d in parse_events_parse util/parse-events.y:439
    #4 0x559fbbcc95da in parse_events__scanner util/parse-events.c:2096
    #5 0x559fbbcc95da in __parse_events util/parse-events.c:2141
    #6 0x559fbbc28555 in check_parse_id tests/pmu-events.c:406
    #7 0x559fbbc28555 in check_parse_id tests/pmu-events.c:393
    #8 0x559fbbc28555 in check_parse_cpu tests/pmu-events.c:415
    #9 0x559fbbc28555 in test_parsing tests/pmu-events.c:498
    #10 0x559fbbc0109b in run_test tests/builtin-test.c:410
    #11 0x559fbbc0109b in test_and_print tests/builtin-test.c:440
    #12 0x559fbbc03e69 in __cmd_test tests/builtin-test.c:695
    #13 0x559fbbc03e69 in cmd_test tests/builtin-test.c:807
    #14 0x559fbbc691f4 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:312
    #15 0x559fbbb071a8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:364
    #16 0x559fbbb071a8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:408
    #17 0x559fbbb071a8 in main /home/namhyung/project/linux/tools/perf/perf.c:538
    #18 0x7fe366b68cc9 in __libc_start_main ../csu/libc-start.c:308

Fixes: f0fbb11 ("perf stat: Implement duration_time as a proper event")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200915031819.386559-6-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
PabloPL pushed a commit that referenced this issue Oct 3, 2020
The test_generic_metric() missed to release entries in the pctx.  Asan
reported following leak (and more):

  Direct leak of 128 byte(s) in 1 object(s) allocated from:
    #0 0x7f4c9396980e in calloc (/lib/x86_64-linux-gnu/libasan.so.5+0x10780e)
    #1 0x55f7e748cc14 in hashmap_grow (/home/namhyung/project/linux/tools/perf/perf+0x90cc14)
    #2 0x55f7e748d497 in hashmap__insert (/home/namhyung/project/linux/tools/perf/perf+0x90d497)
    #3 0x55f7e7341667 in hashmap__set /home/namhyung/project/linux/tools/perf/util/hashmap.h:111
    #4 0x55f7e7341667 in expr__add_ref util/expr.c:120
    #5 0x55f7e7292436 in prepare_metric util/stat-shadow.c:783
    #6 0x55f7e729556d in test_generic_metric util/stat-shadow.c:858
    #7 0x55f7e712390b in compute_single tests/parse-metric.c:128
    #8 0x55f7e712390b in __compute_metric tests/parse-metric.c:180
    #9 0x55f7e712446d in compute_metric tests/parse-metric.c:196
    #10 0x55f7e712446d in test_dcache_l2 tests/parse-metric.c:295
    #11 0x55f7e712446d in test__parse_metric tests/parse-metric.c:355
    #12 0x55f7e70be09b in run_test tests/builtin-test.c:410
    #13 0x55f7e70be09b in test_and_print tests/builtin-test.c:440
    #14 0x55f7e70c101a in __cmd_test tests/builtin-test.c:661
    #15 0x55f7e70c101a in cmd_test tests/builtin-test.c:807
    #16 0x55f7e7126214 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:312
    #17 0x55f7e6fc41a8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:364
    #18 0x55f7e6fc41a8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:408
    #19 0x55f7e6fc41a8 in main /home/namhyung/project/linux/tools/perf/perf.c:538
    #20 0x7f4c93492cc9 in __libc_start_main ../csu/libc-start.c:308

Fixes: 6d432c4 ("perf tools: Add test_generic_metric function")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200915031819.386559-8-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
PabloPL pushed a commit that referenced this issue Oct 3, 2020
The metricgroup__add_metric() can find multiple match for a metric group
and it's possible to fail.  Also it can fail in the middle like in
resolve_metric() even for single metric.

In those cases, the intermediate list and ids will be leaked like:

  Direct leak of 3 byte(s) in 1 object(s) allocated from:
    #0 0x7f4c938f40b5 in strdup (/lib/x86_64-linux-gnu/libasan.so.5+0x920b5)
    #1 0x55f7e71c1bef in __add_metric util/metricgroup.c:683
    #2 0x55f7e71c31d0 in add_metric util/metricgroup.c:906
    #3 0x55f7e71c3844 in metricgroup__add_metric util/metricgroup.c:940
    #4 0x55f7e71c488d in metricgroup__add_metric_list util/metricgroup.c:993
    #5 0x55f7e71c488d in parse_groups util/metricgroup.c:1045
    #6 0x55f7e71c60a4 in metricgroup__parse_groups_test util/metricgroup.c:1087
    #7 0x55f7e71235ae in __compute_metric tests/parse-metric.c:164
    #8 0x55f7e7124650 in compute_metric tests/parse-metric.c:196
    #9 0x55f7e7124650 in test_recursion_fail tests/parse-metric.c:318
    #10 0x55f7e7124650 in test__parse_metric tests/parse-metric.c:356
    #11 0x55f7e70be09b in run_test tests/builtin-test.c:410
    #12 0x55f7e70be09b in test_and_print tests/builtin-test.c:440
    #13 0x55f7e70c101a in __cmd_test tests/builtin-test.c:661
    #14 0x55f7e70c101a in cmd_test tests/builtin-test.c:807
    #15 0x55f7e7126214 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:312
    #16 0x55f7e6fc41a8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:364
    #17 0x55f7e6fc41a8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:408
    #18 0x55f7e6fc41a8 in main /home/namhyung/project/linux/tools/perf/perf.c:538
    #19 0x7f4c93492cc9 in __libc_start_main ../csu/libc-start.c:308

Fixes: 83de0b7 ("perf metric: Collect referenced metrics in struct metric_ref_node")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200915031819.386559-9-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
PabloPL pushed a commit that referenced this issue Oct 3, 2020
The following leaks were detected by ASAN:

  Indirect leak of 360 byte(s) in 9 object(s) allocated from:
    #0 0x7fecc305180e in calloc (/lib/x86_64-linux-gnu/libasan.so.5+0x10780e)
    #1 0x560578f6dce5 in perf_pmu__new_format util/pmu.c:1333
    #2 0x560578f752fc in perf_pmu_parse util/pmu.y:59
    #3 0x560578f6a8b7 in perf_pmu__format_parse util/pmu.c:73
    #4 0x560578e07045 in test__pmu tests/pmu.c:155
    #5 0x560578de109b in run_test tests/builtin-test.c:410
    #6 0x560578de109b in test_and_print tests/builtin-test.c:440
    #7 0x560578de401a in __cmd_test tests/builtin-test.c:661
    #8 0x560578de401a in cmd_test tests/builtin-test.c:807
    #9 0x560578e49354 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:312
    #10 0x560578ce71a8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:364
    #11 0x560578ce71a8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:408
    #12 0x560578ce71a8 in main /home/namhyung/project/linux/tools/perf/perf.c:538
    #13 0x7fecc2b7acc9 in __libc_start_main ../csu/libc-start.c:308

Fixes: cff7f95 ("perf tests: Move pmu tests into separate object")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200915031819.386559-12-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
PabloPL pushed a commit that referenced this issue Oct 23, 2021
devm_regmap_init may return error which caused by like out of memory,
this will results in null pointer dereference later when reading
or writing register:

general protection fault in encx24j600_spi_probe
KASAN: null-ptr-deref in range [0x0000000000000090-0x0000000000000097]
CPU: 0 PID: 286 Comm: spi-encx24j600- Not tainted 5.15.0-rc2-00142-g9978db750e31-dirty #11 9c53a778c1306b1b02359f3c2bbedc0222cba652
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
RIP: 0010:regcache_cache_bypass drivers/base/regmap/regcache.c:540
Code: 54 41 89 f4 55 53 48 89 fb 48 83 ec 08 e8 26 94 a8 fe 48 8d bb a0 00 00 00 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 4a 03 00 00 4c 8d ab b0 00 00 00 48 8b ab a0 00
RSP: 0018:ffffc900010476b8 EFLAGS: 00010207
RAX: dffffc0000000000 RBX: fffffffffffffff4 RCX: 0000000000000000
RDX: 0000000000000012 RSI: ffff888002de0000 RDI: 0000000000000094
RBP: ffff888013c9a000 R08: 0000000000000000 R09: fffffbfff3f9cc6a
R10: ffffc900010476e8 R11: fffffbfff3f9cc69 R12: 0000000000000001
R13: 000000000000000a R14: ffff888013c9af54 R15: ffff888013c9ad08
FS:  00007ffa984ab580(0000) GS:ffff88801fe00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055a6384136c8 CR3: 000000003bbe6003 CR4: 0000000000770ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
 encx24j600_spi_probe drivers/net/ethernet/microchip/encx24j600.c:459
 spi_probe drivers/spi/spi.c:397
 really_probe drivers/base/dd.c:517
 __driver_probe_device drivers/base/dd.c:751
 driver_probe_device drivers/base/dd.c:782
 __device_attach_driver drivers/base/dd.c:899
 bus_for_each_drv drivers/base/bus.c:427
 __device_attach drivers/base/dd.c:971
 bus_probe_device drivers/base/bus.c:487
 device_add drivers/base/core.c:3364
 __spi_add_device drivers/spi/spi.c:599
 spi_add_device drivers/spi/spi.c:641
 spi_new_device drivers/spi/spi.c:717
 new_device_store+0x18c/0x1f1 [spi_stub 4e02719357f1ff33f5a43d00630982840568e85e]
 dev_attr_store drivers/base/core.c:2074
 sysfs_kf_write fs/sysfs/file.c:139
 kernfs_fop_write_iter fs/kernfs/file.c:300
 new_sync_write fs/read_write.c:508 (discriminator 4)
 vfs_write fs/read_write.c:594
 ksys_write fs/read_write.c:648
 do_syscall_64 arch/x86/entry/common.c:50
 entry_SYSCALL_64_after_hwframe arch/x86/entry/entry_64.S:113

Add error check in devm_regmap_init_encx24j600 to avoid this situation.

Fixes: 04fbfce ("net: Microchip encx24j600 driver")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Nanyong Sun <sunnanyong@huawei.com>
Link: https://lore.kernel.org/r/20211012125901.3623144-1-sunnanyong@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
@xc-racer99
Copy link
Collaborator Author

All audio patches are now merged. UCM2 files at https://github.com/xc-racer99/aries-hw-files/tree/ucm2

PabloPL pushed a commit that referenced this issue Apr 16, 2023
When a system with E810 with existing VFs gets rebooted the following
hang may be observed.

 Pid 1 is hung in iavf_remove(), part of a network driver:
 PID: 1        TASK: ffff965400e5a340  CPU: 24   COMMAND: "systemd-shutdow"
  #0 [ffffaad04005fa50] __schedule at ffffffff8b3239cb
  #1 [ffffaad04005fae8] schedule at ffffffff8b323e2d
  #2 [ffffaad04005fb00] schedule_hrtimeout_range_clock at ffffffff8b32cebc
  #3 [ffffaad04005fb80] usleep_range_state at ffffffff8b32c930
  #4 [ffffaad04005fbb0] iavf_remove at ffffffffc12b9b4c [iavf]
  #5 [ffffaad04005fbf0] pci_device_remove at ffffffff8add7513
  #6 [ffffaad04005fc10] device_release_driver_internal at ffffffff8af08baa
  #7 [ffffaad04005fc40] pci_stop_bus_device at ffffffff8adcc5fc
  #8 [ffffaad04005fc60] pci_stop_and_remove_bus_device at ffffffff8adcc81e
  #9 [ffffaad04005fc70] pci_iov_remove_virtfn at ffffffff8adf9429
 #10 [ffffaad04005fca8] sriov_disable at ffffffff8adf98e4
 #11 [ffffaad04005fcc8] ice_free_vfs at ffffffffc04bb2c8 [ice]
 #12 [ffffaad04005fd10] ice_remove at ffffffffc04778fe [ice]
 #13 [ffffaad04005fd38] ice_shutdown at ffffffffc0477946 [ice]
 #14 [ffffaad04005fd50] pci_device_shutdown at ffffffff8add58f1
 #15 [ffffaad04005fd70] device_shutdown at ffffffff8af05386
 #16 [ffffaad04005fd98] kernel_restart at ffffffff8a92a870
 #17 [ffffaad04005fda8] __do_sys_reboot at ffffffff8a92abd6
 #18 [ffffaad04005fee0] do_syscall_64 at ffffffff8b317159
 #19 [ffffaad04005ff08] __context_tracking_enter at ffffffff8b31b6fc
 #20 [ffffaad04005ff18] syscall_exit_to_user_mode at ffffffff8b31b50d
 #21 [ffffaad04005ff28] do_syscall_64 at ffffffff8b317169
 #22 [ffffaad04005ff50] entry_SYSCALL_64_after_hwframe at ffffffff8b40009b
     RIP: 00007f1baa5c13d7  RSP: 00007fffbcc55a98  RFLAGS: 00000202
     RAX: ffffffffffffffda  RBX: 0000000000000000  RCX: 00007f1baa5c13d7
     RDX: 0000000001234567  RSI: 0000000028121969  RDI: 00000000fee1dead
     RBP: 00007fffbcc55ca0   R8: 0000000000000000   R9: 00007fffbcc54e90
     R10: 00007fffbcc55050  R11: 0000000000000202  R12: 0000000000000005
     R13: 0000000000000000  R14: 00007fffbcc55af0  R15: 0000000000000000
     ORIG_RAX: 00000000000000a9  CS: 0033  SS: 002b

During reboot all drivers PM shutdown callbacks are invoked.
In iavf_shutdown() the adapter state is changed to __IAVF_REMOVE.
In ice_shutdown() the call chain above is executed, which at some point
calls iavf_remove(). However iavf_remove() expects the VF to be in one
of the states __IAVF_RUNNING, __IAVF_DOWN or __IAVF_INIT_FAILED. If
that's not the case it sleeps forever.
So if iavf_shutdown() gets invoked before iavf_remove() the system will
hang indefinitely because the adapter is already in state __IAVF_REMOVE.

Fix this by returning from iavf_remove() if the state is __IAVF_REMOVE,
as we already went through iavf_shutdown().

Fixes: 9745780 ("iavf: Add waiting so the port is initialized in remove")
Fixes: a841733 ("iavf: Fix race condition between iavf_shutdown and iavf_remove")
Reported-by: Marius Cornea <mcornea@redhat.com>
Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
Reviewed-by: Michal Kubiak <michal.kubiak@intel.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
PabloPL pushed a commit that referenced this issue Apr 16, 2023
As reported by Christoph, the mptcp protocol can run the
worker when the relevant msk socket is in an unexpected state:

connect()
// incoming reset + fastclose
// the mptcp worker is scheduled
mptcp_disconnect()
// msk is now CLOSED
listen()
mptcp_worker()

Leading to the following splat:

divide error: 0000 [#1] PREEMPT SMP
CPU: 1 PID: 21 Comm: kworker/1:0 Not tainted 6.3.0-rc1-gde5e8fd0123c #11
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
Workqueue: events mptcp_worker
RIP: 0010:__tcp_select_window+0x22c/0x4b0 net/ipv4/tcp_output.c:3018
RSP: 0018:ffffc900000b3c98 EFLAGS: 00010293
RAX: 000000000000ffd7 RBX: 000000000000ffd7 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffffffff8214ce97 RDI: 0000000000000004
RBP: 000000000000ffd7 R08: 0000000000000004 R09: 0000000000010000
R10: 000000000000ffd7 R11: ffff888005afa148 R12: 000000000000ffd7
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff88803ed00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000405270 CR3: 000000003011e006 CR4: 0000000000370ee0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 tcp_select_window net/ipv4/tcp_output.c:262 [inline]
 __tcp_transmit_skb+0x356/0x1280 net/ipv4/tcp_output.c:1345
 tcp_transmit_skb net/ipv4/tcp_output.c:1417 [inline]
 tcp_send_active_reset+0x13e/0x320 net/ipv4/tcp_output.c:3459
 mptcp_check_fastclose net/mptcp/protocol.c:2530 [inline]
 mptcp_worker+0x6c7/0x800 net/mptcp/protocol.c:2705
 process_one_work+0x3bd/0x950 kernel/workqueue.c:2390
 worker_thread+0x5b/0x610 kernel/workqueue.c:2537
 kthread+0x138/0x170 kernel/kthread.c:376
 ret_from_fork+0x2c/0x50 arch/x86/entry/entry_64.S:308
 </TASK>

This change addresses the issue explicitly checking for bad states
before running the mptcp worker.

Fixes: e16163b ("mptcp: refactor shutdown and close")
Cc: stable@vger.kernel.org
Reported-by: Christoph Paasch <cpaasch@apple.com>
Link: multipath-tcp/mptcp_net-next#374
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Tested-by: Christoph Paasch <cpaasch@apple.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
s5pv210 Issues related to s5pv210 SOC
Projects
None yet
Development

No branches or pull requests

2 participants