Skip to content

Commit

Permalink
ipmi: fix long wait in unload when IPMI disconnect
Browse files Browse the repository at this point in the history
commit f6f1234 upstream.

When fixing the problem mentioned in PATCH1, we also found
the following problem:

If the IPMI is disconnected and in the sending process, the
uninstallation driver will be stuck for a long time.

The main problem is that uninstalling the driver waits for curr_msg to
be sent or HOSED. After stopping tasklet, the only place to trigger the
timeout mechanism is the circular poll in shutdown_smi.

The poll function delays 10us and calls smi_event_handler(smi_info,10).
Smi_event_handler deducts 10us from kcs->ibf_timeout.

But the poll func is followed by schedule_timeout_uninterruptible(1).
The time consumed here is not counted in kcs->ibf_timeout.

So when 10us is deducted from kcs->ibf_timeout, at least 1 jiffies has
actually passed. The waiting time has increased by more than a
hundredfold.

Now instead of calling poll(). call smi_event_handler() directly and
calculate the elapsed time.

For verification, you can directly use ebpf to check the kcs->
ibf_timeout for each call to kcs_event() when IPMI is disconnected.
Decrement at normal rate before unloading. The decrement rate becomes
very slow after unloading.

  $ bpftrace -e 'kprobe:kcs_event {printf("kcs->ibftimeout : %d\n",
      *(arg0+584));}'

Signed-off-by: Zhang Yuchen <zhangyuchen.lcr@bytedance.com>
Message-Id: <20221007092617.87597-3-zhangyuchen.lcr@bytedance.com>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  • Loading branch information
ZhangYuchenLCR authored and gregkh committed Jan 12, 2023
1 parent fa6bbb4 commit 24f4649
Showing 1 changed file with 19 additions and 8 deletions.
27 changes: 19 additions & 8 deletions drivers/char/ipmi/ipmi_si_intf.c
Expand Up @@ -2152,6 +2152,20 @@ static int __init init_ipmi_si(void)
}
module_init(init_ipmi_si);

static void wait_msg_processed(struct smi_info *smi_info)
{
unsigned long jiffies_now;
long time_diff;

while (smi_info->curr_msg || (smi_info->si_state != SI_NORMAL)) {
jiffies_now = jiffies;
time_diff = (((long)jiffies_now - (long)smi_info->last_timeout_jiffies)
* SI_USEC_PER_JIFFY);
smi_event_handler(smi_info, time_diff);
schedule_timeout_uninterruptible(1);
}
}

static void shutdown_smi(void *send_info)
{
struct smi_info *smi_info = send_info;
Expand Down Expand Up @@ -2186,16 +2200,13 @@ static void shutdown_smi(void *send_info)
* in the BMC. Note that timers and CPU interrupts are off,
* so no need for locks.
*/
while (smi_info->curr_msg || (smi_info->si_state != SI_NORMAL)) {
poll(smi_info);
schedule_timeout_uninterruptible(1);
}
wait_msg_processed(smi_info);

if (smi_info->handlers)
disable_si_irq(smi_info);
while (smi_info->curr_msg || (smi_info->si_state != SI_NORMAL)) {
poll(smi_info);
schedule_timeout_uninterruptible(1);
}

wait_msg_processed(smi_info);

if (smi_info->handlers)
smi_info->handlers->cleanup(smi_info->si_sm);

Expand Down

0 comments on commit 24f4649

Please sign in to comment.