undefined libm functions in AMDGPU #327

nickdesaulniers · 2019-01-24T19:19:06Z

@groeck reported that x86_64 defconfig +
CONFIG_DRM_AMDGPU=y
CONFIG_DRM_AMDGPU_SI=y
CONFIG_DRM_AMDGPU_CIK=y
CONFIG_DRM_AMDGPU_USERPTR=y
CONFIG_DRM_AMDGPU_GART_DEBUGFS=y
CONFIG_DRM_AMD_ACP=y
causes tons of link failures from undefined references to floating point helper functions (soft-float routines normally provided by libgcc or compiler-rt rather than libm). Looks similar to #186.

groeck · 2019-01-24T19:50:15Z

__divdi3 (#186) is most likely due to 64-bit divide operations, not related to floating point.
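
For contrast with the floating point helpers in this issue, here is a minimal sketch of the kind of code that usually produces a __divdi3 reference. It assumes a 32-bit target where the compiler has no single instruction for 64-bit signed division; div64_example is a hypothetical name, not something from the kernel tree.

```c
/*
 * Hypothetical illustration: plain 64-bit signed integer division.
 * On 32-bit targets compilers typically lower this to a call to
 * __divdi3 from libgcc/compiler-rt. No floating point is involved.
 */
long long div64_example(long long dividend, long long divisor)
{
  return dividend / divisor;
}
```

Kernel code normally avoids such libcalls by using helpers like div64_s64() from linux/math64.h instead of the / operator on 64-bit operands.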

Setting CONFIG_DRM_AMDGPU=y on top of defconfig is sufficient to produce the error. Partial output:

drivers/gpu/drm/amd/display/dc/calcs/dcn_calcs.o: In function `dcn_validate_bandwidth':
drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calcs.c:(.text+0x55): undefined reference to `__mulsf3'
drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calcs.c:(.text+0x5c): undefined reference to `__fixsfsi'
drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calcs.c:(.text+0x72): undefined reference to `__floatsidf'
drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calcs.c:(.text+0x84): undefined reference to `__divdf3'
drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calcs.c:(.text+0x8c): undefined reference to `__truncdfsf2'
drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calcs.c:(.text+0xa5): undefined reference to `__mulsf3'
drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calcs.c:(.text+0xac): undefined reference to `__fixsfsi'
drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calcs.c:(.text+0xbf): undefined reference to `__floatsidf'
drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calcs.c:(.text+0xd1): undefined reference to `__divdf3'
drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calcs.c:(.text+0xd9): undefined reference to `__truncdfsf2'
drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calcs.c:(.text+0xf3): undefined reference to `__mulsf3'
...
drivers/gpu/drm/amd/display/dc/calcs/dcn_calc_math.o: In function `dcn_bw_mod':
drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calc_math.c:(.text+0xa): undefined reference to `__unordsf2'
drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calc_math.c:(.text+0x17): undefined reference to `__unordsf2'
drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calc_math.c:(.text+0x24): undefined reference to `__divsf3'
drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calc_math.c:(.text+0x2b): undefined reference to `__fixsfsi'
drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calc_math.c:(.text+0x32): undefined reference to `__floatsisf'
drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calc_math.c:(.text+0x3b): undefined reference to `__mulsf3'
drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calc_math.c:(.text+0x44): undefined reference to `__subsf3'
...
drivers/gpu/drm/amd/display/dc/calcs/dcn_calc_auto.o: In function `scaler_settings_calculation':
drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calc_auto.c:(.text+0x129): undefined reference to `__nesf2'
drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calc_auto.c:(.text+0x138): undefined reference to `__addsf3'
drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calc_auto.c:(.text+0x154): undefined reference to `__nesf2'
drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calc_auto.c:(.text+0x166): undefined reference to `__mulsf3'
drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calc_auto.c:(.text+0x17b): undefined reference to `__mulsf3'
drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calc_auto.c:(.text+0x1d4): undefined reference to `__gtsf2'
drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calc_auto.c:(.text+0x1f5): undefined reference to `__addsf3'

jyknight · 2019-01-24T21:38:33Z

drivers/gpu/drm/amd/display/dc/calcs/Makefile specifies -mhard-float for building dcn_calcs.o.

I guess that's not getting through to the clang invocation, or else is being ignored for some reason?

nathanchance · 2019-01-24T21:58:05Z

It looks like it makes it to the Clang invocation...

/usr/bin/ccache clang -Wp,-MD,drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/.dcn_calcs.o.d  -nostdinc -isystem /home/nathan/cbl/usr/lib/clang/9.0.0/include -I/home/nathan/cbl/linux/arch/x86/include -I./arch/x86/include/generated  -I/home/nathan/cbl/linux/include -I./include -I/home/nathan/cbl/linux/arch/x86/include/uapi -I./arch/x86/include/generated/uapi -I/home/nathan/cbl/linux/include/uapi -I./include/generated/uapi -include /home/nathan/cbl/linux/include/linux/kconfig.h -include /home/nathan/cbl/linux/include/linux/compiler_types.h  -I/home/nathan/cbl/linux/drivers/gpu/drm/amd/amdgpu -Idrivers/gpu/drm/amd/amdgpu -D__KERNEL__ -Qunused-arguments -Wall -Wundef -Werror=strict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -fshort-wchar -fno-PIE -Werror-implicit-function-declaration -Werror=implicit-int -Wno-format-security -std=gnu89 -no-integrated-as -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -mno-avx -m64 -mno-80387 -mstack-alignment=8 -mtune=generic -mno-red-zone -mcmodel=kernel -DCONFIG_AS_CFI=1 -DCONFIG_AS_CFI_SIGNAL_FRAME=1 -DCONFIG_AS_CFI_SECTIONS=1 -DCONFIG_AS_FXSAVEQ=1 -DCONFIG_AS_SSSE3=1 -DCONFIG_AS_AVX=1 -DCONFIG_AS_AVX2=1 -DCONFIG_AS_AVX512=1 -DCONFIG_AS_SHA1_NI=1 -DCONFIG_AS_SHA256_NI=1 -Wno-sign-compare -fno-asynchronous-unwind-tables -mretpoline-external-thunk -D__BPF_TRACING__ -fno-delete-null-pointer-checks -O2 -Wframe-larger-than=2048 -fstack-protector-strong -Wno-format-invalid-specifier -Wno-gnu -Wno-address-of-packed-member -Wno-tautological-compare -mno-global-merge -Wno-unused-const-variable -fomit-frame-pointer -Wdeclaration-after-statement -Wvla -Wno-pointer-sign -fno-strict-overflow -fno-merge-all-constants -fno-stack-check -Werror=date-time -Werror=incompatible-pointer-types -Wno-initializer-overrides -Wno-unused-value -Wno-format -Wno-sign-compare -Wno-format-zero-length -Wno-uninitialized  -I/home/nathan/cbl/linux/drivers/gpu/drm/amd/amdgpu/../powerplay/inc/  
-I/home/nathan/cbl/linux/drivers/gpu/drm/amd/amdgpu/../include/asic_reg  -I/home/nathan/cbl/linux/drivers/gpu/drm/amd/amdgpu/../include  -I/home/nathan/cbl/linux/drivers/gpu/drm/amd/amdgpu/../powerplay/smumgr  -I/home/nathan/cbl/linux/drivers/gpu/drm/amd/amdgpu/../powerplay/hwmgr  -I/home/nathan/cbl/linux/drivers/gpu/drm/amd/amdgpu/../display/dc/inc/  -I/home/nathan/cbl/linux/drivers/gpu/drm/amd/amdgpu/../display/dc/inc/hw  -I/home/nathan/cbl/linux/drivers/gpu/drm/amd/amdgpu/../display/modules/inc  -I/home/nathan/cbl/linux/drivers/gpu/drm/amd/amdgpu/../display/modules/freesync  -I/home/nathan/cbl/linux/drivers/gpu/drm/amd/amdgpu/../display/modules/color  -I/home/nathan/cbl/linux/drivers/gpu/drm/amd/amdgpu/../display/modules/info_packet  -I/home/nathan/cbl/linux/drivers/gpu/drm/amd/amdgpu/../display/modules/power -DBUILD_FEATURE_TIMING_SYNC=0  -I/home/nathan/cbl/linux/drivers/gpu/drm/amd/amdgpu/../display/dc  -I/home/nathan/cbl/linux/drivers/gpu/drm/amd/amdgpu/../include/asic_reg  -I/home/nathan/cbl/linux/drivers/gpu/drm/amd/amdgpu/../include  -I/home/nathan/cbl/linux/drivers/gpu/drm/amd/amdgpu/../amdgpu  -I/home/nathan/cbl/linux/drivers/gpu/drm/amd/amdgpu/../powerplay/inc  -I/home/nathan/cbl/linux/drivers/gpu/drm/amd/amdgpu/../acp/include  -I/home/nathan/cbl/linux/drivers/gpu/drm/amd/amdgpu/../display  -I/home/nathan/cbl/linux/drivers/gpu/drm/amd/amdgpu/../display/include  -I/home/nathan/cbl/linux/drivers/gpu/drm/amd/amdgpu/../display/dc  -I/home/nathan/cbl/linux/drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm  -I/home/nathan/cbl/linux/drivers/gpu/drm/amd/amdgpu/../amdkfd -mhard-float -msse -mstack-alignment=16    -DKBUILD_BASENAME='"dcn_calcs"' -DKBUILD_MODNAME='"amdgpu"' -c -o drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calcs.o /home/nathan/cbl/linux/drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calcs.c

Here's the full .cmd file output: https://gist.github.com/d2cc096cafb0a341e603a799890534cb

nickdesaulniers · 2019-01-24T23:36:54Z

reproducer:

// $ clang -O2 -mno-sse -mno-80387 -c float.c
// $ nm float.o
// 0000000000000000 T div
//                  U __divsf3
float div (float a, float b) {
  return b / a;
}

Removing -mno-sse makes the generated calls into gcc_s/compiler_rt go away.

Edit: oh, but kbuild then adds back -msse later... (which invalidates this reproducer)
Edit: ah, __divdf3 comes from dividing doubles, not floats (in libgcc's naming, "df" is the double-precision float mode, "sf" the single-precision one, and the trailing 3 is the operand count).

nickdesaulniers · 2019-01-24T23:58:42Z

Better reproducer:

// $ clang -c -O2 -mno-sse2 -mno-80387 -msse float.c
double div (double a, double b) {
  return b / a;
}

The solution is to turn -msse2 back on, which enables hardware FP support for double precision operations with Clang.
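
To connect the reproducer to the link errors reported earlier in this thread, here is a small sketch that annotates individual operations with the soft-float helper each one corresponds to in libgcc's naming scheme. The function names are hypothetical and the code is not taken from dcn_calcs.c; it only mirrors the sequence of helpers seen in the dcn_validate_bandwidth errors.

```c
/*
 * Hypothetical example, not from the AMDGPU sources. Each function is
 * annotated with the libgcc/compiler-rt soft-float helper a compiler may
 * call when it cannot use hardware FP instructions for that type.
 */

/* float * float: __mulsf3 */
float scale(float value, float factor)
{
  return value * factor;
}

/* float to int: __fixsfsi */
int to_int(float value)
{
  return (int)value;
}

/* int to double: __floatsidf */
double to_double(int value)
{
  return (double)value;
}

/* double / double: __divdf3 */
double ratio(double numerator, double denominator)
{
  return numerator / denominator;
}

/* double to float: __truncdfsf2 */
float narrow(double value)
{
  return (float)value;
}
```

Compiling this with clang -c and inspecting the object with nm, as in the reproducers above, should show which helpers are left undefined for a given set of -m flags. The exact set depends on the compiler version and flags, so treat the annotations as a reading aid for the log rather than a prediction.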

nickdesaulniers · 2019-01-25T00:03:33Z

RFT:

From 0007415a8c6d8ffea63c6a99e8f69a6f25142bea Mon Sep 17 00:00:00 2001
From: Nick Desaulniers <ndesaulniers@google.com>
Date: Thu, 24 Jan 2019 15:59:58 -0800
Subject: [PATCH] drm/amd/display: add -msse2 to prevent clang from emitting
 libcalls to undefined SW FP routines

Top level Makefile disables SSE for the whole kernel.  Turn on SSE2 to
support emitting double precision floating point instructions rather
than calls to non-existent (usually available from gcc_s or compiler_rt)
floating point helper routines.

Link: https://gcc.gnu.org/onlinedocs/gccint/Soft-float-library-routines.html
Link: https://github.com/ClangBuiltLinux/linux/issues/327
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
---
 drivers/gpu/drm/amd/display/dc/calcs/Makefile | 2 +-
 drivers/gpu/drm/amd/display/dc/dml/Makefile   | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/calcs/Makefile b/drivers/gpu/drm/amd/display/dc/calcs/Makefile
index 95f332ee3e7e..dc85a3c088af 100644
--- a/drivers/gpu/drm/amd/display/dc/calcs/Makefile
+++ b/drivers/gpu/drm/amd/display/dc/calcs/Makefile
@@ -30,7 +30,7 @@ else ifneq ($(call cc-option, -mstack-alignment=16),)
 	cc_stack_align := -mstack-alignment=16
 endif
 
-calcs_ccflags := -mhard-float -msse $(cc_stack_align)
+calcs_ccflags := -mhard-float -msse -msse2 $(cc_stack_align)
 
 CFLAGS_dcn_calcs.o := $(calcs_ccflags)
 CFLAGS_dcn_calc_auto.o := $(calcs_ccflags)
diff --git a/drivers/gpu/drm/amd/display/dc/dml/Makefile b/drivers/gpu/drm/amd/display/dc/dml/Makefile
index d97ca6528f9d..33c7d7588712 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/Makefile
+++ b/drivers/gpu/drm/amd/display/dc/dml/Makefile
@@ -30,7 +30,7 @@ else ifneq ($(call cc-option, -mstack-alignment=16),)
 	cc_stack_align := -mstack-alignment=16
 endif
 
-dml_ccflags := -mhard-float -msse $(cc_stack_align)
+dml_ccflags := -mhard-float -msse -msse2 $(cc_stack_align)
 
 CFLAGS_display_mode_lib.o := $(dml_ccflags)
 CFLAGS_display_pipe_clocks.o := $(dml_ccflags)
-- 
2.20.1.321.g9e740568ce-goog

groeck · 2019-01-25T00:12:18Z

Excellent! Do you plan to send this upstream? If so, please feel free to add

Tested-by: Guenter Roeck <linux@roeck-us.net>

nickdesaulniers · 2019-01-25T00:13:34Z

> Do you plan to send this upstream

Will do; thanks for helping test and report, @groeck. I appreciate it!

nickdesaulniers · 2019-01-25T00:54:43Z

thanks all for the report and help: https://lore.kernel.org/r/20190125005304.183322-1-ndesaulniers@google.com/

dileks · 2019-01-25T07:28:00Z

Just for the record:
I reported "[Linux-v4.18-rc6] modpost-errors when compiling with clang-7 and CONFIG_DRM_AMDGPU=m" in [1].

Workaround for me was CONFIG_DRM_AMDGPU=n

UPDATE: I can try Nick's patch here.
UPDATE-2: Please CC Christian König and amd-gfx ML. Feel free to add a Reported-by.

My first priority for testing is to get clang-8.0.0rc1 with the asm-goto RFC patches to build and boot a 5.0-rc3+ amd64 kernel on Debian/buster.

UPDATE-3: My self-built llvm-toolchain-8.0.0rc1 consists of llvm, clang and compiler-rt.
UPDATE-4: Feedback from Christian K. in [2].

[1] https://lore.kernel.org/r/0e782eaa-f495-61a2-b30e-61057dcfc611@amd.com/
[2] https://lore.kernel.org/r/54cdfdd5-d889-5714-263f-a663f8649d42@amd.com/

nathanchance · 2019-01-26T05:00:53Z

Patch accepted: https://cgit.freedesktop.org/~agd5f/linux/commit/?id=10117450735c7a7c0858095fb46a860e7037cb9a

…o undefined SW FP routines

arch/x86/Makefile disables SSE and SSE2 for the whole kernel. The AMDGPU drivers modified in this patch re-enable SSE but not SSE2. Turn on SSE2 to support emitting double precision floating point instructions rather than calls to non-existent (usually available from gcc_s or compiler_rt) floating point helper routines.

Link: https://gcc.gnu.org/onlinedocs/gccint/Soft-float-library-routines.html
Link: ClangBuiltLinux/linux#327
Cc: stable@vger.kernel.org # 4.19
Reported-by: S, Shirish <Shirish.S@amd.com>
Reported-by: Matthias Kaehlcke <mka@google.com>
Suggested-by: James Y Knight <jyknight@google.com>
Suggested-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Matthias Kaehlcke <mka@chromium.org>
Tested-by: Nathan Chancellor <natechancellor@gmail.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

dileks · 2019-01-28T09:01:37Z

Tested-by: Sedat Dilek <sedat.dilek@gmail.com> [ Linux v5.0-rc3+ with LLVM/Clang asm-goto ]

Which versions of the Linux kernel are affected?
I first saw this with Linux v4.18-rc6 and CONFIG_DRM_AMDGPU=m [1], so I guess it would be a good idea to get this patch into Linux v4.19 LTS, v4.20 and v5.0, since the patch is queued for Linux v5.1 [2]?

[1] https://lore.kernel.org/r/0e782eaa-f495-61a2-b30e-61057dcfc611@amd.com/
[2] https://cgit.freedesktop.org/~agd5f/linux/commit/?h=drm-next-5.1&id=10117450735c7a7c0858095fb46a860e7037cb9a

ms178 · 2019-01-31T06:41:43Z

FYI: That patch was reverted because it caused this bug: https://bugs.freedesktop.org/show_bug.cgi?id=109487

Update: From the discussion on the LKML, it seems that older GCC (5.4) is fine, while 7.3 and 8.2 show issues: https://lore.kernel.org/r/CAKwvOdkzVv4_hruDkbgiq0TRbPw6uyXAHZ+tAWyqN7CoyOLU_g@mail.gmail.com/ - Nick is already in the loop, so this post is just to document the issue here as well.

nickdesaulniers · 2019-02-14T18:32:24Z

@ms178 @groeck any idea what ended up happening here?

groeck · 2019-02-14T19:15:36Z

Nothing. AMD reassigned the Google-internal bug back to me. Presumably the assumption on the AMD side is that we'll have to track down and fix the problem(s) ourselves if we are interested in building v4.19+ with clang.

nickdesaulniers · 2019-02-14T19:45:36Z

please cc me on it

ms178 · 2019-02-14T20:01:52Z

Unfortunately, I am also not of much help for you, as I am just an interested bystander of this project. If I spot something noteworthy on the mailing lists, I'll let you know though.

ms178 · 2019-02-14T20:20:19Z

As the bug was reported on Raven Ridge and was display related, that reminds me that the DC code on Raven Ridge relies on SSE code as pointed out to me here in Post No. 9: https://www.phoronix.com/forums/forum/linux-graphics-x-org-drivers/open-source-amd-linux/1070582-it-turns-out-amdgpu-kfd-compute-support-can-work-on-64-bit-arm

And there might be regressions with the mentioned different GCC versions with SSE2 enabled. But that's just an uneducated guess. ;)

… to undefined SW FP routines

arch/x86/Makefile disables SSE and SSE2 for the whole kernel. The AMDGPU drivers modified in this patch re-enable SSE but not SSE2. Turn on SSE2 to support emitting double precision floating point instructions rather than calls to non-existent (usually available from gcc_s or compiler_rt) floating point helper routines for Clang.

This was originally landed in:
commit 1011745 ("drm/amd/display: add -msse2 to prevent Clang from emitting libcalls to undefined SW FP routines")
but reverted in:
commit 193392e ("Revert "drm/amd/display: add -msse2 to prevent Clang from emitting libcalls to undefined SW FP routines"")
due to bugreports from GCC builds. Add guards to only do so for Clang.

Link: https://bugs.freedesktop.org/show_bug.cgi?id=109487
Link: ClangBuiltLinux/linux#327
Suggested-by: Sedat Dilek <sedat.dilek@gmail.com>
Suggested-by: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

nickdesaulniers · 2019-08-07T21:35:50Z

accepted: https://cgit.freedesktop.org/~agd5f/linux/commit/?h=amd-staging-drm-next&id=b59953fdf2252a140cd6fab9ab5fe42ea07f0182

…o undefined SW FP routines arch/x86/Makefile disables SSE and SSE2 for the whole kernel. The AMDGPU drivers modified in this patch re-enable SSE but not SSE2. Turn on SSE2 to support emitting double precision floating point instructions rather than calls to non-existent (usually available from gcc_s or compiler_rt) floating point helper routines. Link: https://gcc.gnu.org/onlinedocs/gccint/Soft-float-library-routines.html Link: ClangBuiltLinux#327 Cc: stable@vger.kernel.org # 4.19 Reported-by: S, Shirish <Shirish.S@amd.com> Reported-by: Matthias Kaehlcke <mka@google.com> Suggested-by: James Y Knight <jyknight@google.com> Suggested-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Tested-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Matthias Kaehlcke <mka@chromium.org> Tested-by: Nathan Chancellor <natechancellor@gmail.com> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> [added CONFIG_CC_IS_CLANG] Link: https://bugs.freedesktop.org/show_bug.cgi?id=109487 Signed-off-by: Sami Tolvanen <samitolvanen@google.com>

… to undefined SW FP routines [ Upstream commit 0f0727d ] arch/x86/Makefile disables SSE and SSE2 for the whole kernel. The AMDGPU drivers modified in this patch re-enable SSE but not SSE2. Turn on SSE2 to support emitting double precision floating point instructions rather than calls to non-existent (usually available from gcc_s or compiler_rt) floating point helper routines for Clang. This was originally landed in: commit 1011745 ("drm/amd/display: add -msse2 to prevent Clang from emitting libcalls to undefined SW FP routines") but reverted in: commit 193392e ("Revert "drm/amd/display: add -msse2 to prevent Clang from emitting libcalls to undefined SW FP routines"") due to bugreports from GCC builds. Add guards to only do so for Clang. Link: https://bugs.freedesktop.org/show_bug.cgi?id=109487 Link: ClangBuiltLinux#327 Suggested-by: Sedat Dilek <sedat.dilek@gmail.com> Suggested-by: Sami Tolvanen <samitolvanen@google.com> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>

nathanchance · 2019-09-28T01:28:56Z

Merged into mainline: https://git.kernel.org/linus/0f0727d971f6fdf8f1077180d495ddb9928f0c8b

@nickdesaulniers reopen this issue for the runtime issues if you feel it necessary (or maybe file a new one given that this one is just about the build failure).

… to undefined SW FP routines [ Upstream commit 0f0727d971f6fdf8f1077180d495ddb9928f0c8b ] arch/x86/Makefile disables SSE and SSE2 for the whole kernel. The AMDGPU drivers modified in this patch re-enable SSE but not SSE2. Turn on SSE2 to support emitting double precision floating point instructions rather than calls to non-existent (usually available from gcc_s or compiler_rt) floating point helper routines for Clang. This was originally landed in: commit 10117450735c ("drm/amd/display: add -msse2 to prevent Clang from emitting libcalls to undefined SW FP routines") but reverted in: commit 193392ed9f69 ("Revert "drm/amd/display: add -msse2 to prevent Clang from emitting libcalls to undefined SW FP routines"") due to bugreports from GCC builds. Add guards to only do so for Clang. Link: https://bugs.freedesktop.org/show_bug.cgi?id=109487 Link: ClangBuiltLinux/linux#327 Suggested-by: Sedat Dilek <sedat.dilek@gmail.com> Suggested-by: Sami Tolvanen <samitolvanen@google.com> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>

… to undefined SW FP routines commit 0f0727d upstream. arch/x86/Makefile disables SSE and SSE2 for the whole kernel. The AMDGPU drivers modified in this patch re-enable SSE but not SSE2. Turn on SSE2 to support emitting double precision floating point instructions rather than calls to non-existent (usually available from gcc_s or compiler_rt) floating point helper routines for Clang. This was originally landed in: commit 1011745 ("drm/amd/display: add -msse2 to prevent Clang from emitting libcalls to undefined SW FP routines") but reverted in: commit 193392e ("Revert "drm/amd/display: add -msse2 to prevent Clang from emitting libcalls to undefined SW FP routines"") due to bugreports from GCC builds. Add guards to only do so for Clang. Link: https://bugs.freedesktop.org/show_bug.cgi?id=109487 Link: ClangBuiltLinux/linux#327 Suggested-by: Sedat Dilek <sedat.dilek@gmail.com> Suggested-by: Sami Tolvanen <samitolvanen@google.com> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

… to undefined SW FP routines [ Upstream commit 0f0727d ] arch/x86/Makefile disables SSE and SSE2 for the whole kernel. The AMDGPU drivers modified in this patch re-enable SSE but not SSE2. Turn on SSE2 to support emitting double precision floating point instructions rather than calls to non-existent (usually available from gcc_s or compiler_rt) floating point helper routines for Clang. This was originally landed in: commit 1011745 ("drm/amd/display: add -msse2 to prevent Clang from emitting libcalls to undefined SW FP routines") but reverted in: commit 193392e ("Revert "drm/amd/display: add -msse2 to prevent Clang from emitting libcalls to undefined SW FP routines"") due to bugreports from GCC builds. Add guards to only do so for Clang. Link: https://bugs.freedesktop.org/show_bug.cgi?id=109487 Link: ClangBuiltLinux/linux#327 Suggested-by: Sedat Dilek <sedat.dilek@gmail.com> Suggested-by: Sami Tolvanen <samitolvanen@google.com> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>

… to undefined SW FP routines [ Upstream commit 0f0727d971f6fdf8f1077180d495ddb9928f0c8b ] arch/x86/Makefile disables SSE and SSE2 for the whole kernel. The AMDGPU drivers modified in this patch re-enable SSE but not SSE2. Turn on SSE2 to support emitting double precision floating point instructions rather than calls to non-existent (usually available from gcc_s or compiler_rt) floating point helper routines for Clang. This was originally landed in: commit 1011745 ("drm/amd/display: add -msse2 to prevent Clang from emitting libcalls to undefined SW FP routines") but reverted in: commit 193392e ("Revert "drm/amd/display: add -msse2 to prevent Clang from emitting libcalls to undefined SW FP routines"") due to bugreports from GCC builds. Add guards to only do so for Clang. Link: https://bugs.freedesktop.org/show_bug.cgi?id=109487 Link: ClangBuiltLinux/linux#327 Suggested-by: Sedat Dilek <sedat.dilek@gmail.com> Suggested-by: Sami Tolvanen <samitolvanen@google.com> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>

… to undefined SW FP routines BugLink: https://bugs.launchpad.net/bugs/1847663 [ Upstream commit 0f0727d971f6fdf8f1077180d495ddb9928f0c8b ] arch/x86/Makefile disables SSE and SSE2 for the whole kernel. The AMDGPU drivers modified in this patch re-enable SSE but not SSE2. Turn on SSE2 to support emitting double precision floating point instructions rather than calls to non-existent (usually available from gcc_s or compiler_rt) floating point helper routines for Clang. This was originally landed in: commit 10117450735c ("drm/amd/display: add -msse2 to prevent Clang from emitting libcalls to undefined SW FP routines") but reverted in: commit 193392ed9f69 ("Revert "drm/amd/display: add -msse2 to prevent Clang from emitting libcalls to undefined SW FP routines"") due to bugreports from GCC builds. Add guards to only do so for Clang. Link: https://bugs.freedesktop.org/show_bug.cgi?id=109487 Link: ClangBuiltLinux/linux#327 Suggested-by: Sedat Dilek <sedat.dilek@gmail.com> Suggested-by: Sami Tolvanen <samitolvanen@google.com> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Kamal Mostafa <kamal@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

… to undefined SW FP routines BugLink: https://bugs.launchpad.net/bugs/1848042 commit 0f0727d971f6fdf8f1077180d495ddb9928f0c8b upstream. arch/x86/Makefile disables SSE and SSE2 for the whole kernel. The AMDGPU drivers modified in this patch re-enable SSE but not SSE2. Turn on SSE2 to support emitting double precision floating point instructions rather than calls to non-existent (usually available from gcc_s or compiler_rt) floating point helper routines for Clang. This was originally landed in: commit 1011745 ("drm/amd/display: add -msse2 to prevent Clang from emitting libcalls to undefined SW FP routines") but reverted in: commit 193392e ("Revert "drm/amd/display: add -msse2 to prevent Clang from emitting libcalls to undefined SW FP routines"") due to bugreports from GCC builds. Add guards to only do so for Clang. Link: https://bugs.freedesktop.org/show_bug.cgi?id=109487 Link: ClangBuiltLinux/linux#327 Suggested-by: Sedat Dilek <sedat.dilek@gmail.com> Suggested-by: Sami Tolvanen <samitolvanen@google.com> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

nickdesaulniers added the [BUG] Untriaged (Something isn't working) and [ARCH] x86_64 (This bug impacts ARCH=x86_64) labels on Jan 24, 2019

nickdesaulniers added the [BUG] linux (A bug that should be fixed in the mainline kernel.) and [PATCH] Exists (There is a patch that fixes this issue) labels and removed the [BUG] Untriaged label on Jan 25, 2019

nickdesaulniers added the [PATCH] Submitted (A patch has been submitted for review) label and removed the [PATCH] Exists label on Jan 25, 2019

nickdesaulniers self-assigned this on Jan 25, 2019

nathanchance added the [PATCH] Accepted (A submitted patch has been accepted upstream) label and removed the [PATCH] Submitted label on Jan 26, 2019

dileks mentioned this issue on Feb 11, 2019: support for asm goto #6 (Closed)

nickdesaulniers added the [PATCH] Accepted label and removed the [PATCH] Submitted label on Aug 7, 2019

nathanchance closed this as completed on Sep 28, 2019

nathanchance added the [FIXED][LINUX] 5.4 (This bug was fixed in Linux 5.4) label and removed the [PATCH] Accepted label on Sep 28, 2019
