ntb_hw_switchtec: Fix unable to set mw translation bug.#48
ntb_hw_switchtec: Fix unable to set mw translation bug.#48wesleywesley merged 1 commit intodevelfrom
Conversation
|
|
This makes no sense to me. Especially this:
How does the link status flag in memory relate to the switchtec registers? Logan |
|
Hi Logan, Sorry for the unclear description. The issue is The reason is To fix it, we needs the link_is_up flag be 0 before the shared memory window is reconfigured. Thanks |
|
Ah, I see. More of that needs to be in the commit message. In this case you haven't fixed the problem. You've just made the race less likely. If you have two threads running nothing says the second thread doesn't do the check just before the first thread sets the flag. We either need to use a lock or make sure all the link status checking stuff happens in the same work queue (I think the second choice is better). If we make check_link_status the work item and remove the work item for force link down it should probably fix this problem. |
141371b to
42ba215
Compare
|
Hi Logan, Thanks for the review and comments. I updated the code and commit message. Please help to review it again. In this update, we put all the link status checking stuff in the same work queue. I tested it and It can work. Regards |
|
System work queue has no promise on the latter work item be executed after the former is done. |
lsgunth
left a comment
There was a problem hiding this comment.
Oh, yeah, hmm. This path has another problem I don't know how to solve.
You can easily lose messages if switchtec_ntb_check_link() get called twice before the work queue is scheduled. I can't think of a good way to fix this, so maybe this is not a good approach.
|
Hmmm, it might work if instead of storing a message in sndev, you store the requested link state and a flag to indicate whether a force has occurred. |
42ba215 to
6450ff0
Compare
|
Hi Logan, Thanks for the review and suggestion. I add a new message type (MSG_RETRY) to fix this issue. Please help to review it. Regards |
|
Huh? I don't know how you got that adding a message type was the solution.... I mean to store state required for the work queue, and not the message itself. We're trying to avoid the situation caused by dropped messages. |
6450ff0 to
5b5aa2b
Compare
|
Oh, I see. Thank you very much. I updated it, but there is another issue on ntb_transport. The ntb_transport_link_work would be canceled by ntb_transport_cleanup_work. The ntb_transport_probe() Regards |
|
Is there a contradiction? |
|
when test with this version, found 2 issues:
|
|
@wesleywesley Thanks |
When one host receives the link force down message from peer host, ntb_hw_switchtec driver and NTB client(e.g. ntb_transport) may simultaneously call function switchtec_ntb_part_op() to set up MWs. This will lead to the failure to set up the transport MW. To solve this, we put all the link status checking stuff in the same work queue. Signed-off-by: Joey Zhang <joey.zhang@microchip.com>
|
@JoeyZhang-Microsemi
regards, |
5b5aa2b to
815977d
Compare
|
@wesleywesley For issue 1, I have fixed it. The NTB API ntb_link_enable() should be set the flag ntb_is_up when it returns. For issue 2, I did not know the reason. But we call schedule_work() twice in very short time. It may be the reason. Please help to review the change. |
|
@lsgunth Thanks for the comments. I updated the code. Because the NTB API ntb_link_enable() needs update the flag link_is_up before it returns. In the change, the switchtec_ntb_link_enable/disable() directly call the link state checking stuff. |
lsgunth
left a comment
There was a problem hiding this comment.
The code makes sense to me, but it's probably testing that's going to have to govern whether this is correct. It's really tricky to reason about.
|
@JoeyZhang-Microsemi had tested on both xlink and non-xlink NTB setup w/o issue, merged into devel. |
When host receives the link force down message from peer host, the
ntb_hw_switchtec driver would reinitialize the shared memory window,
and NTB client(e.g. ntb_transport) would set up the translation
memory window. Because Switchtec uses same registers to set up memory
window, the access to these registers would fail and the translation
memory window would not be set up.
To fix it, we should disable link before reinitializing the shared
memory window, and enable link after it.
Signed-off-by: Joey Zhang joey.zhang@microchip.com