Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ACPICA: utdebug.c: use spinlocks to fix the data-races reported by th… #893

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

mtodorov3-69
Copy link

…e KCSAN

Linux kernel's concurency sanitiser KCSAN reported hundreds of instances of data-races in ACPICA like this one:

[ 6.994149] ==================================================================
[ 6.994443] BUG: KCSAN: data-race in acpi_ut_status_exit / acpi_ut_trace

[ 6.994795] write to 0xffffffffbae5a884 of 4 bytes by task 0 on cpu 2:
[ 6.994944] acpi_ut_status_exit (/home/marvin/linux/kernel/torvalds2/drivers/acpi/acpica/utdebug.c:467)
[ 6.994957] acpi_hw_register_read (/home/marvin/linux/kernel/torvalds2/drivers/acpi/acpica/hwregs.c:563)
[ 6.994968] acpi_read_bit_register (/home/marvin/linux/kernel/torvalds2/drivers/acpi/acpica/hwxface.c:171)
[ 6.994980] acpi_idle_bm_check (/home/marvin/linux/kernel/torvalds2/drivers/acpi/processor_idle.c:511)
[ 6.994990] acpi_idle_enter_bm (/home/marvin/linux/kernel/torvalds2/drivers/acpi/processor_idle.c:644 (discriminator 1))
[ 6.995000] acpi_idle_enter (/home/marvin/linux/kernel/torvalds2/drivers/acpi/processor_idle.c:695)
[ 6.995010] cpuidle_enter_state (/home/marvin/linux/kernel/torvalds2/drivers/cpuidle/cpuidle.c:267)
[ 6.995019] cpuidle_enter (/home/marvin/linux/kernel/torvalds2/drivers/cpuidle/cpuidle.c:390)
[ 6.995027] call_cpuidle (/home/marvin/linux/kernel/torvalds2/kernel/sched/idle.c:135)
[ 6.995038] do_idle (/home/marvin/linux/kernel/torvalds2/kernel/sched/idle.c:219 /home/marvin/linux/kernel/torvalds2/kernel/sched/idle.c:282)
[ 6.995046] cpu_startup_entry (/home/marvin/linux/kernel/torvalds2/kernel/sched/idle.c:378 (discriminator 1))
[ 6.995055] start_secondary (/home/marvin/linux/kernel/torvalds2/arch/x86/kernel/smpboot.c:210 /home/marvin/linux/kernel/torvalds2/arch/x86/kernel/smpboot.c:294)
[ 6.995066] secondary_startup_64_no_verify (/home/marvin/linux/kernel/torvalds2/arch/x86/kernel/head_64.S:433)

[ 6.995121] read to 0xffffffffbae5a884 of 4 bytes by task 0 on cpu 9:
[ 6.995267] acpi_ut_trace (/home/marvin/linux/kernel/torvalds2/drivers/acpi/acpica/utdebug.c:263)
[ 6.995279] acpi_hw_validate_io_request (/home/marvin/linux/kernel/torvalds2/drivers/acpi/acpica/hwvalid.c:101)
[ 6.995291] acpi_hw_read_port (/home/marvin/linux/kernel/torvalds2/drivers/acpi/acpica/hwvalid.c:202)
[ 6.995303] acpi_hw_read (/home/marvin/linux/kernel/torvalds2/drivers/acpi/acpica/hwregs.c:251)
[ 6.995313] acpi_hw_register_read (/home/marvin/linux/kernel/torvalds2/drivers/acpi/acpica/hwregs.c:725 /home/marvin/linux/kernel/torvalds2/drivers/acpi/acpica/hwregs.c:499)
[ 6.995325] acpi_read_bit_register (/home/marvin/linux/kernel/torvalds2/drivers/acpi/acpica/hwxface.c:171)
[ 6.995336] acpi_idle_bm_check (/home/marvin/linux/kernel/torvalds2/drivers/acpi/processor_idle.c:511)
[ 6.995346] acpi_idle_enter_bm (/home/marvin/linux/kernel/torvalds2/drivers/acpi/processor_idle.c:644 (discriminator 1))
[ 6.995356] acpi_idle_enter (/home/marvin/linux/kernel/torvalds2/drivers/acpi/processor_idle.c:695)
[ 6.995366] cpuidle_enter_state (/home/marvin/linux/kernel/torvalds2/drivers/cpuidle/cpuidle.c:267)
[ 6.995375] cpuidle_enter (/home/marvin/linux/kernel/torvalds2/drivers/cpuidle/cpuidle.c:390)
[ 6.995383] call_cpuidle (/home/marvin/linux/kernel/torvalds2/kernel/sched/idle.c:135)
[ 6.995394] do_idle (/home/marvin/linux/kernel/torvalds2/kernel/sched/idle.c:219 /home/marvin/linux/kernel/torvalds2/kernel/sched/idle.c:282)
[ 6.995402] cpu_startup_entry (/home/marvin/linux/kernel/torvalds2/kernel/sched/idle.c:378 (discriminator 1))
[ 6.995411] start_secondary (/home/marvin/linux/kernel/torvalds2/arch/x86/kernel/smpboot.c:210 /home/marvin/linux/kernel/torvalds2/arch/x86/kernel/smpboot.c:294)
[ 6.995422] secondary_startup_64_no_verify (/home/marvin/linux/kernel/torvalds2/arch/x86/kernel/head_64.S:433)

[ 6.995476] value changed: 0x00000004 -> 0x00000002

[ 6.995629] Reported by Kernel Concurrency Sanitizer on:
[ 6.995748] CPU: 9 PID: 0 Comm: swapper/9 Not tainted 6.6.0-rc2-kcsan-00003-g16819584c239-dirty #21
[ 6.995758] Hardware name: ASRock X670E PG Lightning/X670E PG Lightning, BIOS 1.21 04/26/2023
[ 6.995765] ==================================================================

Please find the complete list at: https://domac.alu.unizg.hr/~mtodorov/linux/patches/acpica_utdebug/acpi_ut_status_exit.log.xz

A number of unprotected increments:

    AcpiGbl_NestingLevel++;

and conditional statements:

    if (AcpiGbl_NestingLevel) {
            AcpiGbl_NestingLevel--;
    }

no longer work in SMP environment.

Proper locking like

    spin_lock(&AcpiUtdebug_Lock);
    AcpiGbl_NestingLevel++;
    spin_unlock(&AcpiUtdebug_Lock);

and

    spin_lock(&AcpiUtdebug_Lock);
    if (AcpiGbl_NestingLevel) {
            AcpiGbl_NestingLevel--;
    }
    spin_unlock(&AcpiUtdebug_Lock);

makes these data-races go away.

Additionally, READ_ONCE() or WRITE_ONCE() is required with the global variable AcpiGbl_NestingLevel to prevent unwanted read or write reordering or other funny stuff the optmisers do.

The patch eliminates KCSAN BUG warnings.

Reported-by: Mirsad Goran Todorovac mirsad.todorovac@alu.unizg.hr
Fixes: 6be2d72b18649 ("ACPICA: Update for a few debug output statements")
Fixes: bf9b448ef8430 ("ACPICA: Debug output: Do not emit function nesting level for kernel build.")
Fixes: 6e875fa0480c1 ("ACPICA: Debugger: fix slight indentation issue")
Fixes: ^1da177e4c3f4 ("Initial git repository build.")
Cc: Jung-uk Kim jkim@FreeBSD.org
Cc: Linus Torvalds torvalds@linux-foundation.org
Cc: Erik Kaneda erik.kaneda@intel.com
Cc: Bob Moore robert.moore@intel.com
Cc: "Rafael J. Wysocki" rafael.j.wysocki@intel.com
Cc: Len Brown lenb@kernel.org
Cc: linux-acpi@vger.kernel.org
Cc: acpica-devel@lists.linuxfoundation.org

…e KCSAN

Linux kernel's concurency sanitiser KCSAN reported hundreds of instances of data-races in ACPICA
like this one:

[    6.994149] ==================================================================
[    6.994443] BUG: KCSAN: data-race in acpi_ut_status_exit / acpi_ut_trace

[    6.994795] write to 0xffffffffbae5a884 of 4 bytes by task 0 on cpu 2:
[    6.994944] acpi_ut_status_exit (/home/marvin/linux/kernel/torvalds2/drivers/acpi/acpica/utdebug.c:467)
[    6.994957] acpi_hw_register_read (/home/marvin/linux/kernel/torvalds2/drivers/acpi/acpica/hwregs.c:563)
[    6.994968] acpi_read_bit_register (/home/marvin/linux/kernel/torvalds2/drivers/acpi/acpica/hwxface.c:171)
[    6.994980] acpi_idle_bm_check (/home/marvin/linux/kernel/torvalds2/drivers/acpi/processor_idle.c:511)
[    6.994990] acpi_idle_enter_bm (/home/marvin/linux/kernel/torvalds2/drivers/acpi/processor_idle.c:644 (discriminator 1))
[    6.995000] acpi_idle_enter (/home/marvin/linux/kernel/torvalds2/drivers/acpi/processor_idle.c:695)
[    6.995010] cpuidle_enter_state (/home/marvin/linux/kernel/torvalds2/drivers/cpuidle/cpuidle.c:267)
[    6.995019] cpuidle_enter (/home/marvin/linux/kernel/torvalds2/drivers/cpuidle/cpuidle.c:390)
[    6.995027] call_cpuidle (/home/marvin/linux/kernel/torvalds2/kernel/sched/idle.c:135)
[    6.995038] do_idle (/home/marvin/linux/kernel/torvalds2/kernel/sched/idle.c:219 /home/marvin/linux/kernel/torvalds2/kernel/sched/idle.c:282)
[    6.995046] cpu_startup_entry (/home/marvin/linux/kernel/torvalds2/kernel/sched/idle.c:378 (discriminator 1))
[    6.995055] start_secondary (/home/marvin/linux/kernel/torvalds2/arch/x86/kernel/smpboot.c:210 /home/marvin/linux/kernel/torvalds2/arch/x86/kernel/smpboot.c:294)
[    6.995066] secondary_startup_64_no_verify (/home/marvin/linux/kernel/torvalds2/arch/x86/kernel/head_64.S:433)

[    6.995121] read to 0xffffffffbae5a884 of 4 bytes by task 0 on cpu 9:
[    6.995267] acpi_ut_trace (/home/marvin/linux/kernel/torvalds2/drivers/acpi/acpica/utdebug.c:263)
[    6.995279] acpi_hw_validate_io_request (/home/marvin/linux/kernel/torvalds2/drivers/acpi/acpica/hwvalid.c:101)
[    6.995291] acpi_hw_read_port (/home/marvin/linux/kernel/torvalds2/drivers/acpi/acpica/hwvalid.c:202)
[    6.995303] acpi_hw_read (/home/marvin/linux/kernel/torvalds2/drivers/acpi/acpica/hwregs.c:251)
[    6.995313] acpi_hw_register_read (/home/marvin/linux/kernel/torvalds2/drivers/acpi/acpica/hwregs.c:725 /home/marvin/linux/kernel/torvalds2/drivers/acpi/acpica/hwregs.c:499)
[    6.995325] acpi_read_bit_register (/home/marvin/linux/kernel/torvalds2/drivers/acpi/acpica/hwxface.c:171)
[    6.995336] acpi_idle_bm_check (/home/marvin/linux/kernel/torvalds2/drivers/acpi/processor_idle.c:511)
[    6.995346] acpi_idle_enter_bm (/home/marvin/linux/kernel/torvalds2/drivers/acpi/processor_idle.c:644 (discriminator 1))
[    6.995356] acpi_idle_enter (/home/marvin/linux/kernel/torvalds2/drivers/acpi/processor_idle.c:695)
[    6.995366] cpuidle_enter_state (/home/marvin/linux/kernel/torvalds2/drivers/cpuidle/cpuidle.c:267)
[    6.995375] cpuidle_enter (/home/marvin/linux/kernel/torvalds2/drivers/cpuidle/cpuidle.c:390)
[    6.995383] call_cpuidle (/home/marvin/linux/kernel/torvalds2/kernel/sched/idle.c:135)
[    6.995394] do_idle (/home/marvin/linux/kernel/torvalds2/kernel/sched/idle.c:219 /home/marvin/linux/kernel/torvalds2/kernel/sched/idle.c:282)
[    6.995402] cpu_startup_entry (/home/marvin/linux/kernel/torvalds2/kernel/sched/idle.c:378 (discriminator 1))
[    6.995411] start_secondary (/home/marvin/linux/kernel/torvalds2/arch/x86/kernel/smpboot.c:210 /home/marvin/linux/kernel/torvalds2/arch/x86/kernel/smpboot.c:294)
[    6.995422] secondary_startup_64_no_verify (/home/marvin/linux/kernel/torvalds2/arch/x86/kernel/head_64.S:433)

[    6.995476] value changed: 0x00000004 -> 0x00000002

[    6.995629] Reported by Kernel Concurrency Sanitizer on:
[    6.995748] CPU: 9 PID: 0 Comm: swapper/9 Not tainted 6.6.0-rc2-kcsan-00003-g16819584c239-dirty acpica#21
[    6.995758] Hardware name: ASRock X670E PG Lightning/X670E PG Lightning, BIOS 1.21 04/26/2023
[    6.995765] ==================================================================

Please find the complete list at: https://domac.alu.unizg.hr/~mtodorov/linux/patches/acpica_utdebug/acpi_ut_status_exit.log.xz

A number of unprotected increments:

        AcpiGbl_NestingLevel++;

and conditional statements:

        if (AcpiGbl_NestingLevel) {
                AcpiGbl_NestingLevel--;
        }

no longer work in SMP environment.

Proper locking like

        spin_lock(&AcpiUtdebug_Lock);
        AcpiGbl_NestingLevel++;
        spin_unlock(&AcpiUtdebug_Lock);

and

        spin_lock(&AcpiUtdebug_Lock);
        if (AcpiGbl_NestingLevel) {
                AcpiGbl_NestingLevel--;
        }
        spin_unlock(&AcpiUtdebug_Lock);

makes these data-races go away.

Additionally, READ_ONCE() or WRITE_ONCE() is required with the global variable
AcpiGbl_NestingLevel to prevent unwanted read or write reordering or other funny
stuff the optmisers do.

The patch eliminates KCSAN BUG warnings.

Reported-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr>
Fixes: 6be2d72b18649 ("ACPICA: Update for a few debug output statements")
Fixes: bf9b448ef8430 ("ACPICA: Debug output: Do not emit function nesting level for kernel build.")
Fixes: 6e875fa0480c1 ("ACPICA: Debugger: fix slight indentation issue")
Fixes: ^1da177e4c3f4 ("Initial git repository build.")
Cc: Jung-uk Kim <jkim@FreeBSD.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Erik Kaneda <erik.kaneda@intel.com>
Cc: Bob Moore <robert.moore@intel.com>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Cc: Len Brown <lenb@kernel.org>
Cc: linux-acpi@vger.kernel.org
Cc: acpica-devel@lists.linuxfoundation.org
Signed-off-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr>
@juikim
Copy link
Contributor

juikim commented Sep 20, 2023

Note it should be done per compiler/platform as macros via platform-specific atomic operations.

@mtodorov3-69
Copy link
Author

mtodorov3-69 commented Sep 20, 2023 via email

@mtodorov3-69
Copy link
Author

mtodorov3-69 commented Sep 20, 2023 via email

@sacdintel
Copy link
Collaborator

The Linux specific macros go in aclinux.h or aclinuxex.h and you will need to make some changes as per the code review too. You can create "ACPI_Entity" for the same equivalent Linux "Entity" (eg. Entity = spin_lock / unlock or WRITE / READ_ONCE() ) or else those entities won't be supported in other OS thereby breaking their build just what Jung-uk Kim suggested!

We will need to look at the linuxize script to see where and why it fails to do that conversion, but we might need some more time from our end to rectify them. In the meantime please feel free to add more commits to this PR to address those previous concerns, thanks!

@@ -161,6 +161,8 @@

#ifdef ACPI_DEBUG_OUTPUT

static DEFINE_SPINLOCK(AcpiUtdebug_lock);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in ln 164 there is a typo and it needs to be _Lock which is what is used everywhere else below this definition.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants