Skip to content

Conversation

@nchatrad
Copy link
Collaborator

splitting the previous PR 135

Copy link
Collaborator

@mahkurap mahkurap left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add apml_alert dts changes in Kenya and Nigeria device trees as well.
Also, change congo to Congo, morcco to Morocco in one of the commit message.

.of_match_table = of_match_ptr(apml_alertl_dt_ids),
},
.probe = apml_alertl_probe,
.remove = alert_remove,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for uniformity, change alret_remove to apml_alaertl_remove()

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

* 0x3c, 0x4c -> Socket 0, die 0,
* 0x44 -> Socket 0, die 1,
* 0x38, 0x48 -> Socket 1, die 0,
* 0x4c -> Socket 1, die 1.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo ? 0x45?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, corrected

return ret;
}

static u8 static_addr_to_socket(u8 static_addr)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this rule is valid incase 2X1P configuration boot .

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As per my understanding
2X1P and 2P can be used interchangeably.
As the PID and static address are related to Socket and Die ID which are stored in CSR registers(RO)
This should be valid.

return 0;
}

static irqreturn_t alert_l_irq_thread_handler(int irq, void *dev_id)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Error Handling , general: Since this is alert handling, it's better to follow a best-effort approach for error handling?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the input.
Update the same

#endif
};

/ {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commit message : start with ARM64:dts:aspeed:

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -0,0 +1,50 @@
# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commit message Please align with upstream style : dt-bindings: misc: apml-alert

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@ojayanth
Copy link
Collaborator

Please ensure your commit message style aligns with the conventions used in the relevant upstream directory

gpio:
maxItems: 1
description: |
GPIO specifier for APML Alert_L line. Specify GPIO controller, GPIO pin
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: white space?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will update, thanks


MODULE_ALIAS("apml_alertl:" DRIVER_NAME);

/* Read TSI Status register to identify the RAS error */
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: correct the comment

return IRQ_NONE;

for (i = 0; i < oob_adata->num_of_tsi_devs; i++) {
rt_src = 0;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can be removed. no effect statement

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

rt_src = 0;
/* Check for NULL pointer */
if (IS_ERR_OR_NULL(oob_adata->tsi_dev[i])) {
dev_err(dev, "TSI device does not exist\n");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add "i", and any other data to the log, will be more helpful during debug.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, added

msg.data_in.reg_in[REG_OFF_INDEX] = TSI_STATUS_REG;

mutex_lock(&oob_adata->tsi_dev[soc_die_num]->lock);
ret = regmap_read(oob_adata->tsi_dev[soc_die_num]->regmap,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if the regmap_read() fails for any reason, does *temp_status set to zero?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

temp_status is initialized to zero

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

temp_status is initialized outside of the for loop(). In case of failures, we are checking the previously returned by the previous iteration.

/* Check for TSI alert */
ret = tsi_alert_check(oob_adata, &temp_status, i);
if (ret < 0)
dev_err(dev, "Failed to read TSI status register\n");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Additional data in the error logging will be of great help during debugging. Add more context to the logging.

if (ret < 0)
dev_err(dev, "Failed to read TSI status register\n");

if (!temp_status)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

temp_status is not guaranteed to set to zero incase of failures. add this check to the previous check.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

temp_status initialized to zero. On failure, temp_status remains zero

rt_src = (temp_status << 24);
static_addr = oob_adata->tsi_dev[i]->dev_static_addr;

ret = send_uevent(static_addr, rt_src, dev);
Copy link
Collaborator

@mahkurap mahkurap Aug 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rt_src can be replaced with (temp_status << 24). rt_src is not used anywhere else. Same with the variable static_addr.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done


ret = send_uevent(static_addr, rt_src, dev);
if (ret)
dev_err(dev, "Failed to send TSI uevent Err: %d\n", ret);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add (temp_status << 24), and oob_adata->tsi_dev[i]->dev_static_addr to logging.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added more info

struct device *dev = oob_adata->dev;
unsigned int ras_status = 0;
unsigned int temp_status = 0;
u32 rt_src = 0;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

optimization: rt_src, and static_addr variables are not needed/

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done


static irqreturn_t alert_l_irq_thread_handler(int irq, void *dev_id)
{
struct apml_alertl_data *oob_adata = dev_id;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need to cast?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not necessarily. i have updated. Thanks


/* Check for TSI alert */
ret = tsi_alert_check(oob_adata, &temp_status, i);
if (ret < 0)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tsi_alrert_check() returns 0 if the input parameters are invalid. Change this condition to "<=" or change the tsi_alert_check() to return a negative integer incase of input error validation.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not required with new changes

int ret;

if (!oob_adata->rmi_dev[soc_die_num] || !oob_adata->rmi_dev[soc_die_num]->regmap)
return 0;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

change to a negative value. 0 return code is ==> success

int ret;

if (!oob_adata->tsi_dev[soc_die_num] || !oob_adata->tsi_dev[soc_die_num]->regmap)
return 0;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return negative integer in case of error.

Copy link
Collaborator

@mahkurap mahkurap left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please check comments.

default n
help
If you say yes here you get support for emulated alertl
interface on AMD SoCs with APML interface connected to a BMC device.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NIT:
Alert_L may be asserted for various conditions, including machine check
exceptions, system fatal errors, APML command completions, and thermal
alerts

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks, updated!

@@ -0,0 +1,390 @@
// SPDX-License-Identifier: GPL-2.0-or-later
/*
* alert_l.c - Alert_l driver for AMD APML devices
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fix file name apml_alertl.c

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done, thanks

/*
* alert_l.c - Alert_l driver for AMD APML devices
*
* Copyright (C) 2022-2023 Advanced Micro Devices, Inc.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2025-2026?

* Copyright (C) 2022-2023 Advanced Micro Devices, Inc.
*/

#include <linux/debugfs.h>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NIT: reorder based Kernel coding style
Ordering Logic:
Core kernel headers: module.h, init.h
Platform and device-related: platform_device.h, interrupt.h, regmap.h
Peripheral interfaces: gpio, debugfs, i3c
Device tree support: of.h, of_gpio.h
Local headers: "sbrmi-common.h" comes last

#include <linux/module.h>
#include <linux/init.h>
#include <linux/platform_device.h>
#include <linux/interrupt.h>
#include <linux/regmap.h>
#include <linux/gpio/consumer.h>
#include <linux/debugfs.h>
#include <linux/i3c/device.h>
#include <linux/of.h>
#include <linux/of_gpio.h>

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the info. We have been following the alphabetical order of arranging headers.

#define DRIVER_NAME "apml_alertl"

#define RAS_STATUS_REG 0x4C
#define STATUS_REG 0x2
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NIT be constient on formating.

@mahkurap
Copy link
Collaborator

mahkurap commented Aug 13, 2025 via email

@mahkurap
Copy link
Collaborator

mahkurap commented Aug 13, 2025 via email

@mahkurap
Copy link
Collaborator

mahkurap commented Aug 13, 2025 via email

@mahkurap
Copy link
Collaborator

mahkurap commented Aug 13, 2025 via email

@sathyapk
Copy link
Collaborator

[AMD Official Use Only - AMD Internal Distribution Only] Temp_status is initialized to zero outside of the loop. It will get modified in the for() loop. Imagine a case where in the first iteration of the loop is successful and the second iteration fails. Then we are checking on the temp_status updated in the last successful transaction. From: sathyapk @.> Sent: Wednesday, August 13, 2025 2:38 AM To: AMDESE/linux-aspeed @.> Cc: Kurapati, Mahesh @.>; Review requested @.> Subject: Re: [AMDESE/linux-aspeed] alert_L module for handling oob alerts (PR #143) Caution: This message originated from an External Source. Use proper caution when opening attachments, clicking links, or responding. @sathyapk commented on this pull request.
________________________________ In drivers/misc/amd-apml/apml_alertl.c<#143 (comment)>:

This is a miss, thanks for pointing it out. I will fix this.

@sathyapk
Copy link
Collaborator

[AMD Official Use Only - AMD Internal Distribution Only] Bottom halfs are called/handled as part of work queues as far as I remember. I could be wrong. Function tsi_alert_check() is called in “static irqreturn_t alert_l_irq_thread_handler(int irq, void dev_id)” which is an irq handler. From: sathyapk @.> Sent: Wednesday, August 13, 2025 2:40 AM To: AMDESE/linux-aspeed @.> Cc: Kurapati, Mahesh @.>; Review requested @.> Subject: Re: [AMDESE/linux-aspeed] alert_L module for handling oob alerts (PR #143) Caution: This message originated from an External Source. Use proper caution when opening attachments, clicking links, or responding. @sathyapk commented on this pull request.
________________________________ In drivers/misc/amd-apml/apml_alertl.c<#143 (comment)>:
+#define MAX_SOC_LEN 11
+#define MAX_ERR_LEN 18 + +MODULE_ALIAS("apml_alertl:" DRIVER_NAME); + +/
Read TSI Status register to identify the RAS error / +static int tsi_alert_check(struct apml_alertl_data oob_adata, int temp_status, u8 soc_die_num) +{ + struct apml_message msg = { 0 }; + int ret; + + if (!oob_adata->tsi_dev[soc_die_num] || !oob_adata->tsi_dev[soc_die_num]->regmap) + return 0; + msg.data_in.reg_in[REG_OFF_INDEX] = TSI_STATUS_REG; + + mutex_lock(&oob_adata->tsi_dev[soc_die_num]->lock); Lock is called inthe bottom half of the irq, mutex lock can be invoked in bottom half. — Reply to this email directly, view it on GitHub<#143 (comment)>, or unsubscribehttps://github.com/notifications/unsubscribe-auth/BCUUIJUEFFVKKDKFMC2L3CT3NLTVRAVCNFSM6AAAAAB7HM3CZ2VHI2DSMVQWIX3LMV43YUDVNRWFEZLROVSXG5CSMV3GSZLXHMZTCMJUGMZDEOBVGU. You are receiving this because your review was requested.Message ID: @.@.***>>

There are multiple ways to implement bottom half, like softirqs, tasklets, workqueue, threaded irq handler.
We are using threaded IRQ.

@sathyapk
Copy link
Collaborator

[AMD Official Use Only - AMD Internal Distribution Only] I need to understand more on how the mutexes initialized in a different driver can be accessed in this driver. Something new for me. How do you enforce the probe order()? First sbrmi, sbtsi and then apml_alertl? From: sathyapk @.> Sent: Wednesday, August 13, 2025 2:45 AM To: AMDESE/linux-aspeed @.> Cc: Kurapati, Mahesh @.>; Review requested @.> Subject: Re: [AMDESE/linux-aspeed] alert_L module for handling oob alerts (PR #143) Caution: This message originated from an External Source. Use proper caution when opening attachments, clicking links, or responding. @sathyapk commented on this pull request.
________________________________ In drivers/misc/amd-apml/apml_alertl.c<#143 (comment)>:

  • /* Allocate memory as per the number of TSI devices */
  • tsi_dev = devm_kzalloc(dev, oob_alert->num_of_tsi_devs * sizeof(struct apml_sbtsi_device), + GFP_KERNEL); + if (!tsi_dev) + return -ENOMEM; + + oob_alert->tsi_dev = tsi_dev; + oob_alert->dev = dev; + + /* + * For each of the Alerts get the device associated + * Currently the ALert_L driver identification is only supported + * over I3C. We can add property in dts to identify the bus type + / + + for (i = 0; i < oob_alert->num_of_rmi_devs; i++) { The mutex locks are initialized in respective RMI and TSI drivers. — Reply to this email directly, view it on GitHub<#143 (comment)>, or unsubscribehttps://github.com/notifications/unsubscribe-auth/BCUUIJTJHUOSPXIDWSXSLEL3NLUI5AVCNFSM6AAAAAB7HM3CZ2VHI2DSMVQWIX3LMV43YUDVNRWFEZLROVSXG5CSMV3GSZLXHMZTCMJUGM2DCNRTHA. You are receiving this because your review was requested.Message ID: @.@.*>>

Alert_l driver is dependent on rmi and tsi driver taken care in Kconfig
Probing of rmi/tsi device are not part of Alert_L driver. Alert_L driver only access the rmi/tsi device structure info updated during rmi/tsi driver init(probe).

Move out the sbtsi device structure to a common header file and
add device matching helper function for I2C and I3C buses.

This is required to add support for the APML Alert_L module which
needs access to sbtsi device struct and device matching capability.

Reviewed-by: Naveen Krishna Chatradhi <naveenkrishna.chatradhi@amd.com>
Signed-off-by: Akshay Gupta <akshay.gupta@amd.com>
akky16 and others added 4 commits September 30, 2025 10:34
Processors from AMD provide APML ALERT_L for BMC users to
monitor events.
APML Alert_L is asserted in multiple events,
- Machine Check Exception occurs within the system
- The processor alerts the SBI on system fatal error event
- Set by hardware as a result of a 0x71/0x72/0x73 command completion
- Set by firmware to indicate the completion of a mailbox operation
- High/Low Temperature Alert

APML Alert_L module define uevents to notify registered userspace processes
of the alert event.

Reviewed-by: Naveen Krishna Chatradhi <naveenkrishna.chatradhi@amd.com>
Signed-off-by: Akshay Gupta <akshay.gupta@amd.com>
Signed-off-by: sathya priya kumar <SathyaPriya.K@amd.com>
-Add alertl node for P1 and P2 host processor in Morocco platform.
-Add alertl node for P1 host processor in Congo platform.

Reviewed-by: Naveen Krishna Chatradhi <naveenkrishna.chatradhi@amd.com>
Signed-off-by: Akshay Gupta <akshay.gupta@amd.com>
Signed-off-by: sathya priya kumar <SathyaPriya.K@amd.com>
-Add alertl node for P1 and P2 host processor in Nigeria platform.
-Add alertl node for P1 host processor in Kenya platform.

Reviewed-by: Naveen Krishna Chatradhi <naveenkrishna.chatradhi@amd.com>
Signed-off-by: Akshay Gupta <akshay.gupta@amd.com>
Signed-off-by: sathya priya kumar <SathyaPriya.K@amd.com>
-Document device-tree bindings for APML Alert_L driver.

Reviewed-by: Naveen Krishna Chatradhi <naveenkrishna.chatradhi@amd.com>
Signed-off-by: Akshay Gupta <akshay.gupta@amd.com>
Signed-off-by: sathya priya kumar <SathyaPriya.K@amd.com>
(cherry picked from commit 0f54429)
@srgovard srgovard requested a review from ojayanth October 24, 2025 04:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants