Support keep alive in spdk tcp backend #462

Open
LiadOz opened this issue Apr 18, 2024 · 11 comments

@LiadOz
Contributor

LiadOz commented Apr 18, 2024

According to the NVMe/TCP Transport Specification (2021.06.02):

The NVMe/TCP Transport requires the use of the Keep Alive feature (refer to section 7.12 in the NVMe
Base Specification). The NVMe/TCP Transport does not impose any limitations on the minimum and
maximum Keep Alive Timeout value. The minimum should be set large enough to account for any transient
fabric interconnect failures between the host and controller.
TCP level keep alive functionality is not prohibited but it is recommended that the TCP level keep alive
timeout is set to a higher value than the NVMe Keep Alive Timeout to avoid conflicting policies.

We have encountered the need for a keep alive timeout in our application in the case of a sudden shutdown. The shutdown caused the host to abort the commands, but the target kept them until it was brought back a few minutes later. After that time, commands that were sent before the shutdown were completed without the application knowing about them, causing data integrity issues. Had keep alive been active, the target would have aborted those commands.

In lib/xnvme_be_spdk_dev.c the keep_alive_timeout_ms option is disabled for the TCP and RDMA transport types, and it seems this has been the case since the SPDK backend was created.
Allowing the user to specify a keep alive timeout is not enough; here is how that keep alive should be used according to SPDK:

// include/spdk/nvme.h
struct spdk_nvme_ctrlr_opts {
    // ...

    /**
     * Keep alive timeout in milliseconds (0 = disabled).
     *
     * The NVMe library will set the Keep Alive Timer feature to this value and automatically
     * send Keep Alive commands as needed.  The library user must call
     * spdk_nvme_ctrlr_process_admin_completions() periodically to ensure Keep Alive commands
     * are sent.
     */
    uint32_t keep_alive_timeout_ms;
    // ...
};

So to fully support the feature, xNVMe needs to either create a thread that polls spdk_nvme_ctrlr_process_admin_completions or give the user the ability to call that function on their own.
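
To illustrate the first option, here is a minimal sketch of what such a polling thread could look like. This is not existing xNVMe code: the struct spdk_nvme_ctrlr handle would have to be exposed by the backend, and the admin_poller type and the 100 ms interval are assumptions made for the example.

#include <pthread.h>
#include <stdbool.h>
#include <unistd.h>

#include <spdk/nvme.h>

struct admin_poller {
    struct spdk_nvme_ctrlr *ctrlr; // would have to be exposed by the SPDK backend
    volatile bool running;
};

// Thread entry point, e.g. started with pthread_create(&tid, NULL, admin_poll_thread, &poller)
static void *
admin_poll_thread(void *arg)
{
    struct admin_poller *poller = arg;

    while (poller->running) {
        // Processes admin completions; with keep_alive_timeout_ms set,
        // SPDK also sends Keep Alive commands from this call as needed.
        spdk_nvme_ctrlr_process_admin_completions(poller->ctrlr);
        usleep(100 * 1000); // 100 ms, well below any reasonable keep alive timeout
    }

    return NULL;
}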

@karlowich
Contributor

Hi @LiadOz, thanks for putting this on our radar. We will have to spend some time coming up with the optimal solution.
For now, I think you should be able to achieve the desired outcome by sending the Keep Alive command "manually" through xnvme_cmd_pass_admin(). Can you try this out?
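
For reference, a manual Keep Alive via xnvme_cmd_pass_admin() could look roughly like the sketch below. The send_keep_alive helper is made up for illustration; 0x18 is the Keep Alive admin opcode from the NVMe Base Specification.

#include <libxnvme.h>

// Hypothetical helper: send a single Keep Alive command through the admin queue
static int
send_keep_alive(struct xnvme_dev *dev)
{
    struct xnvme_cmd_ctx ctx = xnvme_cmd_ctx_from_dev(dev);

    ctx.cmd.common.opcode = 0x18; // Keep Alive (NVMe Base Specification)

    // Keep Alive carries no data or metadata buffer
    return xnvme_cmd_pass_admin(&ctx, NULL, 0, NULL, 0);
}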

@LiadOz
Contributor Author

LiadOz commented Apr 18, 2024

This would technically work as a workaround. If keep_alive_timeout_ms is set, SPDK will send the keep alive command on its own when spdk_nvme_ctrlr_process_admin_completions is called. So with the solution you provided, two keep alives will be sent: one by the user and the other by SPDK. The target probably doesn't care that it will get multiple keep alives at the same time.
I will test this solution in my environment and see if it works.

@LiadOz
Contributor Author

LiadOz commented Apr 24, 2024

@karlowich This solution seems to work for my use case. In my fork I added a spdk_keep_alive_timeout_ms field to xnvme_opts. Can I open a PR for this change?

@karlowich
Contributor

karlowich commented Apr 24, 2024

@LiadOz as you point out in your previous comment, spdk_keep_alive_timeout_ms is not needed when sending the keep alive manually. So I'm not sure I see why we would use spdk_keep_alive_timeout_ms as long as we don't support manually calling spdk_nvme_ctrlr_process_admin_completions.

But maybe I'm missing something?

@LiadOz
Contributor Author

LiadOz commented Apr 24, 2024

spdk_keep_alive_timeout_ms is first passed to the target during the initialization phase. In addition, when spdk_nvme_ctrlr_process_admin_completions is called, it checks whether the keep alive timeout is enabled and, if a sufficient amount of time has passed, sends the keep alive command. So it has two purposes.

@karlowich
Contributor

@LiadOz Yes, but as far as I can tell, if you don't plan on calling spdk_nvme_ctrlr_process_admin_completions, there is no need to set spdk_keep_alive_timeout_ms.

@LiadOz
Contributor Author

LiadOz commented Apr 25, 2024

I'm saying that the spdk_keep_alive_timeout_ms parameter does two things. First, it sets the keep alive timeout value on the target. Once the target has received that value, it knows that if spdk_keep_alive_timeout_ms elapses without a keep alive, it should abort the outstanding commands. This is the behavior that I need.

In addition, when this value is set in SPDK on the host side and spdk_nvme_ctrlr_process_admin_completions is called, SPDK will send a keep alive command on its own once a sufficient amount of time has passed.

So with the solution you suggested, since I can't call spdk_nvme_ctrlr_process_admin_completions directly, I will send keep alive commands on my own. But I still need to let the target know what the acceptable keep alive timeout is, which is why I need to set spdk_keep_alive_timeout_ms.
The parameter affects multiple behaviors, which is why I still need it.
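
For illustration, opening a device with the field from my fork looks roughly like the sketch below; the field name spdk_keep_alive_timeout_ms and the target address are placeholders and may change in the final PR.

#include <libxnvme.h>

// Rough sketch: open an NVMe/TCP device with a keep alive timeout set
static struct xnvme_dev *
open_with_keep_alive(void)
{
    struct xnvme_opts opts = xnvme_opts_default();

    opts.be = "spdk";
    opts.nsid = 1;
    opts.spdk_keep_alive_timeout_ms = 10000; // 10 s timeout announced to the target

    return xnvme_dev_open("10.0.0.1:4420", &opts);
}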

@karlowich
Contributor

@LiadOz Sorry, you are absolutely right. Now I see the value of being able to set spdk_keep_alive_timeout_ms; go ahead and open a PR.

@karlowich
Contributor

@LiadOz Can I close this?

@LiadOz
Contributor Author

LiadOz commented May 15, 2024

My PR only added the option to enable the keep alive feature. The way we are sending the keep alive commands is a workaround/hack, so I don't think this ticket should be closed until a proper solution is implemented. But if you think the current workaround is good enough, you can close this ticket. I have no problem continuing to work with the hack.

@karlowich
Contributor

@LiadOz Good point, let's keep this open to remind us in the future.
