add a cmd_time_out in LUN class to support configurable cmd_time_out attrib #36

Open
wants to merge 2 commits into
base: master

@zhuozh zhuozh commented Nov 15, 2017

No description provided.

Zhang Zhuoyu added 2 commits November 15, 2017 18:52
…out attrib

Add a cmd_time_out to the LUN class so that the configfs attribute cmd_time_out
is optionally configurable when adding a disk node to the configuration.

Signed-off-by: Zhang Zhuoyu <zhangzhuoyu@cmss.chinamobile.com>
Record the cmd_time_out attribute as an item of the disks subtree
and make sure it is identical after a gateway reboot.

Signed-off-by: Zhang Zhuoyu <zhangzhuoyu@cmss.chinamobile.com>
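
For readers without the diff at hand, the two commit messages above can be illustrated roughly as follows. This is only a minimal sketch and not the code from this PR: the `LUN` class here is a hypothetical stand-in, the helper names `save_disk` and `load_disks`, the `image` field, and the layout of the `disks` subtree are assumptions made for illustration, and the default of 0 is assumed only because the discussion below asks about `cmd_time_out > 0`.

```python
import json

# Assumed default for illustration only; the PR itself does not state one here.
DEFAULT_CMD_TIME_OUT = 0


class LUN:
    """Hypothetical stand-in for the LUN class touched by this PR."""

    def __init__(self, image, cmd_time_out=None):
        self.image = image
        # cmd_time_out is optional: fall back to the default when not given.
        if cmd_time_out is None:
            cmd_time_out = DEFAULT_CMD_TIME_OUT
        self.cmd_time_out = int(cmd_time_out)
        if self.cmd_time_out < 0:
            raise ValueError("cmd_time_out must be >= 0")

    def to_config(self):
        # Entry stored under the (assumed) 'disks' subtree of the config.
        return {"image": self.image, "cmd_time_out": self.cmd_time_out}

    @classmethod
    def from_config(cls, entry):
        # Older entries without the key fall back to the default.
        return cls(entry["image"], entry.get("cmd_time_out"))


def save_disk(config, lun):
    """Record the disk, including cmd_time_out, in the config dict."""
    config.setdefault("disks", {})[lun.image] = lun.to_config()


def load_disks(config):
    """Rebuild LUN objects from the persisted config, e.g. after a reboot."""
    return {name: LUN.from_config(entry)
            for name, entry in config.get("disks", {}).items()}


if __name__ == "__main__":
    cfg = {}
    save_disk(cfg, LUN("rbd.disk_1", cmd_time_out=30))
    # Round-trip through JSON to mimic a persisted configuration object.
    restored = load_disks(json.loads(json.dumps(cfg)))
    print(restored["rbd.disk_1"].cmd_time_out)
```

The point of persisting the value alongside the disk entry is that a gateway which rebuilds its LUNs from the stored configuration would apply the same timeout it had before the reboot, rather than silently reverting to the default.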
@mikechristie
Contributor

What do you need cmd_time_out > 0 for?

It will only release commands in the kernel if runner dies (osd op timeout should catch commands that are running in runner), but we do not yet have a way to restart runner safely so I think you need to reboot the node either way.

@lxbsz
Member

lxbsz commented Nov 16, 2017

We may need to upgrade the iSCSI tools in production in the future, and restarting the gw may trigger the bug we discussed before. Rebooting the node is expensive, and in practice it is almost always unacceptable.

@mikechristie
Contributor

After tcmu/runner is fixed so that it can restart with IO in progress, this will not be needed, right? If so, I think setting cmd_time_out would just be a temporary hack, and I do not think we want to add that upstream.

My issue with cmd_time_out is that it only sort of works, and only sometimes. If only a couple of commands are affected and restarts are infrequent, then you would just slowly leak (the kernel never calls tcmu_cmd_free_data). If it happens with a full ring, or if runner cannot restart, then you end up in a state where the initiator keeps trying to use the path, the cmd fails, and we just keep bouncing between the other gw and this one.


3 participants