Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[spinel] add support for spinel interface id #6258

Closed

Conversation

mnp222
Copy link
Contributor

@mnp222 mnp222 commented Mar 9, 2021

This feature allows the RCP to support multiple host stacks
on different PANs by making use of the spinel Interface ID.

This feature allows the RCP to support multiple host stacks
on different PANs by making use of the spinel Interface ID.
@google-cla google-cla bot added the cla: yes label Mar 9, 2021
@mnp222 mnp222 changed the title [spinel] add support for spinel interface id (#6226) [spinel] add support for spinel interface id Mar 9, 2021
@size-report
Copy link

size-report bot commented Mar 9, 2021

Size Report of OpenThread

Merging #6258 into main(b775146).

name branch text data bss total
ot-cli-ftd_1.1 main 424728 780 53760 479268
#6258 424728 780 53760 479268
+/- 0 0 0 0
ot-cli-mtd_1.1 main 343072 780 43640 387492
#6258 343072 780 43640 387492
+/- 0 0 0 0
ot-ncp-ftd_1.1 main 413712 780 56592 471084
#6258 413856 780 56664 471300
+/- +144 0 +72 +216
ot-ncp-mtd_1.1 main 335020 780 46480 382280
#6258 335164 780 46544 382488
+/- +144 0 +64 +208
ot-rcp_1.1 main 56780 564 19940 77284
#6258 56908 564 20004 77476
+/- +128 0 +64 +192
libopenthread-cli-ftd.a_1.1 main 42757 0 452 43209
#6258 42757 0 452 43209
+/- 0 0 0 0
libopenthread-cli-mtd.a_1.1 main 35149 0 444 35593
#6258 35149 0 444 35593
+/- 0 0 0 0
libopenthread-ftd.a_1.1 main 225176 20 35896 261092
#6258 225196 20 35896 261112
+/- +20 0 0 +20
libopenthread-mtd.a_1.1 main 157880 20 25784 183684
#6258 157900 20 25784 183704
+/- +20 0 0 +20
libopenthread-ncp-ftd.a_1.1 main 31228 0 5852 37080
#6258 31386 0 5924 37310
+/- +158 0 +72 +230
libopenthread-ncp-mtd.a_1.1 main 26076 0 5852 31928
#6258 26234 0 5916 32150
+/- +158 0 +64 +222
libopenthread-rcp.a_1.1 main 8695 0 4988 13683
#6258 8849 0 5052 13901
+/- +154 0 +64 +218
libopenthread-radio.a_1.1 main 14277 0 150 14427
#6258 14293 0 150 14443
+/- +16 0 0 +16
ot-cli-ftd_1.2 main 447244 780 62596 510620
#6258 447244 780 62596 510620
+/- 0 0 0 0
ot-cli-mtd_1.2 main 357924 780 44172 402876
#6258 357924 780 44172 402876
+/- 0 0 0 0
ot-ncp-ftd_1.2 main 432896 780 65436 499112
#6258 433056 780 65508 499344
+/- +160 0 +72 +232
ot-ncp-mtd_1.2 main 345936 780 47020 393736
#6258 346080 780 47084 393944
+/- +144 0 +64 +208
ot-rcp_1.2 main 58016 564 20000 78580
#6258 58160 564 20064 78788
+/- +144 0 +64 +208
libopenthread-cli-ftd.a_1.2 main 45597 0 452 46049
#6258 45597 0 452 46049
+/- 0 0 0 0
libopenthread-cli-mtd.a_1.2 main 37468 0 444 37912
#6258 37468 0 444 37912
+/- 0 0 0 0
libopenthread-ftd.a_1.2 main 244516 20 44544 289080
#6258 244536 20 44544 289100
+/- +20 0 0 +20
libopenthread-mtd.a_1.2 main 169829 20 26128 195977
#6258 169849 20 26128 195997
+/- +20 0 0 +20
libopenthread-ncp-ftd.a_1.2 main 31954 0 5852 37806
#6258 32112 0 5924 38036
+/- +158 0 +72 +230
libopenthread-ncp-mtd.a_1.2 main 26338 0 5852 32190
#6258 26496 0 5916 32412
+/- +158 0 +64 +222
libopenthread-rcp.a_1.2 main 8695 0 4988 13683
#6258 8849 0 5052 13901
+/- +154 0 +64 +218
libopenthread-radio.a_1.2 main 15199 0 182 15381
#6258 15215 0 182 15397
+/- +16 0 0 +16

if (status == SPINEL_STATUS_OK)
{
unpacked = spinel_datatype_unpack(aBuffer, aLength, SPINEL_DATATYPE_BOOL_S, &framePending);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change fixes a bug where a spinel error message gets masked by OT_ERROR_PARSE message because the error message can't be further parsed.

}
else
{
mIsReady = true;
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The above change allows recovery without resetting the RCP, which is necessary in the case of multiple host processes.

@@ -2301,6 +2323,7 @@ void RadioSpinel<InterfaceType, ProcessContextType>::RestoreProperties(void)
SuccessOrDie(Set(SPINEL_PROP_PHY_FEM_LNA_GAIN, SPINEL_DATATYPE_INT8_S, mFemLnaGain));
}

#if OPENTHREAD_POSIX_CONFIG_MAX_POWER_TABLE_ENABLE
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This #if fixes a bug unrelated to this issue, but which we needed in order to compile.

spinel_tid_t mNextExpectedTid;
#endif

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we get rid of the switch here and just always use the array?

The value 4 should be replaced by (SPINEL_HEADER_IID_MAX + 1) defined in spinel.h. I will fix that.

if (WriteLastStatusFrame(aHeader, ThreadErrorToSpinelStatus(error)) == OT_ERROR_NONE)
{
return OT_ERROR_NONE;
}
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The above changes starting at line 443 are a bug we came across with how errors were being reported up.


instance = otSysInit(&aConfig->mPlatformConfig);

IgnoreError(otLoggingSetLevel(aConfig->mLogLevel));

atexit(otSysDeinit);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fixes a bug where otSysInit() sets the log level to the compile time config, rendering the command line argument useless.

@codecov
Copy link

codecov bot commented Mar 9, 2021

Codecov Report

Merging #6258 (fa7b012) into main (bbbeee1) will decrease coverage by 0.01%.
The diff coverage is 100.00%.

❗ Current head fa7b012 differs from pull request most recent head 9bc7f46. Consider uploading reports for the commit 9bc7f46 to get more accurate results

@@            Coverage Diff             @@
##             main    #6258      +/-   ##
==========================================
- Coverage   86.67%   86.66%   -0.02%     
==========================================
  Files         428      428              
  Lines       54568    54595      +27     
==========================================
+ Hits        47299    47316      +17     
- Misses       7269     7279      +10     
Impacted Files Coverage Δ
src/core/common/message.hpp 100.00% <ø> (ø)
src/core/common/message.cpp 98.74% <100.00%> (+0.03%) ⬆️
tests/unit/test_message.cpp 100.00% <100.00%> (ø)
src/core/meshcop/timestamp.cpp 95.23% <0.00%> (-4.77%) ⬇️
src/lib/spinel/radio_spinel_impl.hpp 81.16% <0.00%> (-1.25%) ⬇️
src/core/meshcop/dtls.cpp 86.05% <0.00%> (-0.95%) ⬇️
src/core/mac/mac_frame.hpp 96.33% <0.00%> (-0.92%) ⬇️
examples/platforms/simulation/uart.c 80.90% <0.00%> (-0.91%) ⬇️
src/core/mac/sub_mac.cpp 88.86% <0.00%> (-0.50%) ⬇️
src/core/mac/mac.cpp 89.42% <0.00%> (-0.39%) ⬇️
... and 7 more

VerifyOrExit(!RadioSupportsCsmaBackoff(), swCsma = false);
#endif
Copy link
Member

@abtink abtink Mar 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mnp222 , I noticed there are some new code under #if OPENTHREAD_CONFIG_MULTIPAN_RCP_ENABLE in the SubMac and MeshForwarder in this PR. I think your goal is to have sub-mac handle some error cases (like retry if radio is being used by some other entity) and be aware whether we are operating under multi-pan mode or not.

I would suggest not using this model:

  • We try to define a clear abstraction and fixed contract between OT core and the platform layer.
    • The radio.h defines all the APIs and callbacks and list of errors from each callback.
  • OT core modules assume that they are the sole controller of radio (using the platform APIs).
  • From the core OT perspective it should not matter whether the radio platform is RCP or not, or whether the radio is being shared by different entities or not.
    • So I don't think sub-mac or mesh-forwarder should be aware of MULTIPAN_RCP config and perform any differently under it.
  • I think we should try to handle all such details in platform layer implementation itself (e.g. RCP related code and/or radio driver).
  • I think it may be fine if we need to add a new error type from callbacks (e.g. BUSY from TxDone?) but I think it we don't need it (e.g. we can use ABORT which is already there), that would be better.

Some quick info about an internal project from before (which I hope may help):

Internally for a Nest/Google product we had a requirement where we needed to share a single radio between multiple entities (one entity being OT stack and some other entities for some legacy networks/protocols).

We implemented it all in platform layer:

  • Basically we added an "arbiter module" which sat over the radio driver and under the protocol stacks.
  • It provided the otPlatRadio APIs to OT stack and another set of similar APIs for use by other entities.
  • The "arbiter module" ensured that multiple entities could control the radio while they were not aware of the existence of each other (each assumed that it was the only controller of radio).
  • The arbiter module basically handled any conflicting commands, like one entity asking radio to go sleep, while the other one want it to rx or tx, etc.

Thoughts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Abtin, I agree, I did not like those changes to the OT core modules. This change is for the case where one process wants to tx while the other one is already transmitting. I can place logic to handle it (basically performing retries) into radio_spinel_impl.hpp. Does that seem like an ok place for it?

Thanks for the additional discussion, it is very helpful. In your case were your stack entities running in a single process so that the arbiter module sat under both of them, or were they in separate processes? If the latter, how did the IPC work?

Copy link
Member

@abtink abtink Mar 10, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @mnp222.

I agree that handling it in radio_spinel_imp.hpp sounds reasonable. I would say if we can move it to RCP side itself, it may be even better. I think RCP is the place where we need to be aware of multiple PANs/entities and it can best handle any conflicting requests.

I think from the host side of radio_spinel_impl, we may only need to know to use certain spinel IID to talk to the RCP (would it matter to know if there are other entities that are controlling the RCP or not on the host side?).


Let me try add more details (from what I recall) on our internal project and how it worked:

  • It was all running on an RTOS platform. All modules were part of the firmware running on NPC (OT stack, radio arbiter and driver, and other (legacy) stacks) all part of NCP firmware.
  • OT stack had its own RTOS task:
    • OT stack requires that all the OT public and platform APIs and callbacks to be invoked from same OS context, e.g. same RTOS task.
  • Arbiter and radio driver had their own shared RTOS task (and I think other/legacy stack ran on its own RTOS task as well).
  • I think the different tasks synced using ROTS locks/semaphore.

Arbiter module provided APIs to all entities which wanted to control the radio:

  • For example, otPlatRadio APIs to OT stack and very similar set of APIs to the other (legacy) stack.
  • Arbiter module managed state variables per entity to remember the last request/command (API call).
    • For example, if otPlatRadioSleep() was called to put radio to sleep, or otPlatRadioReceive() to rx, or a request to tx frame or perform energy scan, etc.
  • It then handled the requests/commands from different entities and decided what to do actually with the radio. Here are some of the rules I recall:
    • We put radio to sleep mode only when all entities requested it to sleep.
    • When different entities requested radio to be in rx mode:
      • If the channels matched, it was straight-forward and we put the radio in rx mode on that channel.
      • But if the requests were for different channels, I think we had a simple rule to prioritize one entity (I think OT stack request won on the rx channel).
      • Just a note on this:
        • In our use-case, all stacks (OT and legacy) were actually operating on the same channel, but we still could have the situation that they put radio on different channels, for example to support a discover/mac scan operation (during which we can switch to different channels).
        • We also had the idea to use a windowed rx model when we have conflicting rx channels (rx on first channel for some time interval, then switch to other channel, an so on) but I think we did not implement/use this model.
    • If we get a tx request by an entity while radio was busy with an ongoing tx from another entity we just waited for the current tx to complete (success or failure) and then proceeded with the next tx request.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @abtink, thanks for this discussion. I would like to move the arbitration of transmissions and energy scans to the RCP side as you suggest. I can add a queue to NcpBase which queues up transmission and energy scan commands if the radio is busy, then proceeds with them when the radio is free. That code would only be enabled in MULTIPAN_RCP mode. Let me know if that sounds ok.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @mnp222 . Yes, I think it makes sense and sounds good to me. 👍

I think we can also consider adding this in the LinkRaw class (under OPENTHREAD_RADIO) since LinkRaw is the point of interaction with radio in an RCP build. LinkRaw practically provides a mirror of radio.hpp APIs.

I think placing the multi-pan (radio busy) checks in either Ncp or LinkRaw would be fine. Thanks.

This commit handles arbitration of transmissions and energy scans
from multiple host stacks on the RCP when
OPENTHREAD_CONFIG_MULTIPAN_RCP_ENABLE is defined.
If the radio is busy, transmit and energy scan commands received
over spinel are placed on a pending command queue and processed when
the radio is free.
@mnp222 mnp222 force-pushed the staging/feature/multipan_rcp branch from 941abb4 to 94a8e78 Compare March 29, 2021 14:03
@mnp222
Copy link
Contributor Author

mnp222 commented Mar 29, 2021

HI @abtink. I added a new api to link_raw.h, otLinkRawIsTransmittingOrScanning(), so that ncp_base_radio.cpp could check if the radio is transmitting or scanning before submitting a transmission or energy scan. However, one of the checks (API Version) seems to be unhappy about that. How do I fix this?
https://github.com/openthread/openthread/pull/6258/checks?check_run_id=2222515229

@abtink
Copy link
Member

abtink commented Mar 29, 2021

HI @abtink. I added a new api to link_raw.h, otLinkRawIsTransmittingOrScanning(), so that ncp_base_radio.cpp could check if the radio is transmitting or scanning before submitting a transmission or energy scan. However, one of the checks (API Version) seems to be unhappy about that. How do I fix this?
https://github.com/openthread/openthread/pull/6258/checks?check_run_id=2222515229

@mnp222, you just need to increment OPENTHREAD_API_VERSION in the include/openthread/instance.h in your commit/PR:

/**
* The OpenThread API monotonic version number.
*
* This number MUST increase by one each time the contents of public OpenThread API include headers change.
*
* @note This number versions both OpenThread platform and user APIs.
*
*/
#define OPENTHREAD_API_VERSION (87)

@jwhui jwhui requested a review from abtink March 30, 2021 04:31
Copy link
Member

@abtink abtink left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @mnp222. Looks great to me. 👍

I have one question on the change to aResetRadio. From my perspective and the code itself it looks fine, but would like @bukepo (and maybe others) to also check that to make sure if it may impact any current use-cases or not.

}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to change the behavior and meaning of aResetRadio. There was a recent question related to this #6292.

From what I recall the idea behind aResetRadio param was whether the host driver (radio_spinel_imp) should do the reset or if it would be done by the process that starting the ot-posix. I think within the code here we expect the RCP to be reset anyways.

This change here seem to change the meaning of aResetRadio such that the radio_spinel driver to not issue a reset and also not wait and check for a reset from RCP.

I understand this may be the behavior we want for multi-pan RCP, but I am not sure of its consequence in general case.

@bukepo what do you think of this change?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I understand. I will fix this so the general case is unchanged. The multipan case may need an altered behavior, at least durring RCP recovery.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be fixed now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like if 2 hosts both have aResetRadio = true, the RCP will be reset twice. Will that cause any issue?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This works as long as the recovery mechanism is enabled, and does not send a new reset when it receives an unexpected reset. I made a change that prevents this, but I see that it is overly general. I will clean that up.

@@ -305,13 +307,22 @@ NcpBase::NcpBase(Instance *aInstance)
#if OPENTHREAD_ENABLE_VENDOR_EXTENSION
aInstance->Get<Extension::ExtensionBase>().SignalNcpInit(*this);
#endif

#if OPENTHREAD_CONFIG_MULTIPAN_RCP_ENABLE
memset(mNextExpectedTid, 0, sizeof(mNextExpectedTid));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this? We seem to do the same above in general case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, it was left over by mistake. I've removed it.

// If we fail to report the error now, it will be reported later in HandleReceive() of ncp_base.cpp.
if (WriteLastStatusFrame(aHeader, ThreadErrorToSpinelStatus(error)) == OT_ERROR_NONE)
{
return OT_ERROR_NONE;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This help ensure we still have single return in each function/method (which I think is required by our style guide).

Suggested change
return OT_ERROR_NONE;
error = OT_ERROR_NONE;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is fixed now.

This commit reverts all changes to the radio reset behavior.

The only new behavior occurs during RCP recovery if the
system is in multipan mode. In that case additional resets
are not sent to avoid playing reset ping pong.

Also cleaned up a few miscellaneous items:
- remove stray init code
- return an error properly
- bump the OPENTHREAD_API_VERSION due to the addition of
  otLinkRawIsTransmittingOrScanning()
bool IsTransmittingOrScanning(void) const { return (mState == kStateTransmit) || (mState == kStateEnergyScan); }
bool IsTransmittingOrScanning(void) const
{
return (mState == kStateTransmit) || (mState == kStateEnergyScan) || (mState == kStateCsmaBackoff);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that during state kStateCsmaBackoff the radio should be considered transmitting

This change was supposed to be pushed with a previous commit.
This change was also supposed to be in the previous commit.
Leave the normal behavior unchanged.  In multipan mode, don't
send a reset.
Copy link
Contributor

@jinran-google jinran-google left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Besides the comments, I think there should be some simulation test cases covering these scenarios:

  • A host transmits while another is scanning, and vice versa
  • Multiple hosts listen on different channels and talk to different nodes
  • Multiple hosts forming the same network talk to each other

Comment on lines 851 to 856
while (aPosition >= kPendingCommandQueueSize)
{
aPosition -= kPendingCommandQueueSize;
}

return aPosition;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
while (aPosition >= kPendingCommandQueueSize)
{
aPosition -= kPendingCommandQueueSize;
}
return aPosition;
return aPosition % kPendingCommandQueueSize;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like your suggestion. I was following the precedent set in GetWrappedResponseQueueIndex(). I will change it there too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps OT has a coding standard against using the modulus operator in embedded code, which would explain why the while loop was used instead. @abtink it looks like you wrote this code in the commit below. Do you recall your thinking?
9ce57b2

src/ncp/ncp_base.cpp Outdated Show resolved Hide resolved
@@ -2214,12 +2232,16 @@ void RadioSpinel<InterfaceType, ProcessContextType>::RecoverFromRcpFailure(void)
mIsReady = false;
mIsTimeSynced = false;

#if OPENTHREAD_CONFIG_MULTIPAN_RCP_ENABLE
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the case of failures on multi-PAN RCP, I don't think we can recover transparently. OPENTHREAD_CONFIG_MULTIPAN_RCP_ENABLE and OPENTHREAD_SPINEL_CONFIG_RCP_RESTORATION_MAX_COUNT > 0 should be mutually exclusive.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's necessary to enable the RCP restoration mechanism because the RCP may be reset by another host process. The key point is that upon receipt of an unexpected reset, the host should not reset the RCP again during recovery. I see that the current change I made here is too general because it prevents reset in the case of an RCP timeout as well as an unexpected reset. I can fix this by changing mRcpFailed from a boolean to an enum. How does that sound?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds OK. But there is another problem. What if 2 hosts want to set the same long-term property to different values? E.g. 2 hosts call Receive with different channels, the later one will take effect. Another example is if a host stops itself while others are running, it may set SPINEL_PROP_PHY_ENABLED to false and the RCP stops completely. Things become even more complex when it comes to list-like properties, like SPINEL_PROP_MAC_SRC_MATCH_SHORT_ADDRESSES.

Copy link
Contributor Author

@mnp222 mnp222 Apr 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The host applications need to coordinate with each other about long-term global properties of the radio such as channel and power and enable/disable. In addition, a policy system can be established in the platform layer on the RCP, for example, only one of the processes has permission to change these. In any case, it is out of scope of the core stack.

For per-IID properties such as pan id, long id, short id, and the src match lists, storing these separately for each IID is also accomplished at the platform layer. When handling a set or get command, the PAL looks up the current IID via otNcpPlatGetCurCommandIid().

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will the property handling be implemented in this PR or in a separate PR? Looks like the multi PAN feature cannot work properly without coordination between hosts. I think it's good to add some integration test cases so we can better understand the use cases.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The property handling is in the platform layer so it will be in a separate PR. Testing requires the platform changes and a mux to combine the spinel host streams. We perform this testing, but I will also look into how this could be added to the openthread testing. Can you point me to some existing test examples for reference?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

src/ncp/ncp_base.cpp Outdated Show resolved Hide resolved
we're recovering due to an unexpected reset.

Also call LinkRawEnergyScanDone() with invalid rssi value if
a pending energy scan fails.
@jwhui jwhui requested a review from jinran-google April 6, 2021 05:41
Comment on lines +714 to 717
while (aPosition >= aQueueSize)
{
aPosition -= kResponseQueueSize;
aPosition -= aQueueSize;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This loop can be removed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@abtink had written the code with that loop so I left it in place. It looks like he was concerned that you could have EnqueueResponse() called a large number of times before SendQueuedResponses() is called, in which case the tail might actually wrap twice and require multiple subtractions. It seems unlikely, but possible. Do you still think I should replace the while with an if?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function returns aPosition % aQueueSize, so removing this loop won't change any behavior. An if isn't needed either.

@jwhui
Copy link
Member

jwhui commented Mar 9, 2022

@mnp222 @romacdon , are you planning to move this PR forward?

@romacdon
Copy link
Contributor

romacdon commented Mar 9, 2022

@mnp222 @romacdon , are you planning to move this PR forward?

@jwhui - yes we definitely want to get this pushed in. There was a request to add some integration tests and we didn't have time to do it when requested. Would it be possible to push the changes in without those tests? Also, we know we need to refresh the PR with the latest code since its been almost a year. We can get to that in the next couple of weeks? Is someone waiting on these changes or are you just trying to close up old PRs?

@jwhui
Copy link
Member

jwhui commented Mar 9, 2022

@jwhui - yes we definitely want to get this pushed in. There was a request to add some integration tests and we didn't have time to do it when requested. Would it be possible to push the changes in without those tests? Also, we know we need to refresh the PR with the latest code since its been almost a year. We can get to that in the next couple of weeks? Is someone waiting on these changes or are you just trying to close up old PRs?

@romacdon , thanks for the update, I was just trying to clean up old PRs. We prefer to have relevant CI test with along with new functionality.

@Hedda
Copy link

Hedda commented Jun 22, 2022

Any news on this being merged?

@romacdon
Copy link
Contributor

romacdon commented Jun 22, 2022

Any news on this being merged?

@Hedda We do still plan on merging this. We just need to create some unit tests. Do you have a specific need for this?

@Hedda
Copy link

Hedda commented Jun 23, 2022

Any news on this being merged?

@Hedda We do still plan on merging this. We just need to create some unit tests. Do you have a specific need for this?

Yes I believe so, and here is the long back-story behind that in which I try to explain the specific current demand for this feature:

@agners (who is Nabu Casa employee Stefan Agner) is a developer of the very popular home automation software Home Assistant is currently developing an “OpenThread Border Router Add-on” and that OTBR add-on should, in theory, hopefully in the future be able to be compatible with all OpenThread Border Router firmware with Spinel interface in RPC mode no matter which underlying host stack and hardware it is running on. As I understand his OTBR host application add-on will be used as a dependency to enable Matter/CHIP over Thread support in Home Assistant which he is also currently developing:

https://github.com/home-assistant/addons-development/tree/master/openthread_border_router

https://community.home-assistant.io/t/chip-matter-support-including-thread/305633/82

I believe this feature will be needed to for an host application with Spinel RCP support to be compatible with multiple host stacks regardless of underlying hardware and version of stack for OpenThread Border Router, right?

If so then think a lot of us in open-source home automation communities are specifically interested in having one and the same host application being support DIY-users that have bought different OpenThread Border Router hardware adapters and are using firmware versions on them, and initially for the mentioned/linked host application add-on for Home Assistant to support OpenThread Border Router firmware with Spinel interface in RPC mode running on both Silicon Labs EFR32 based adapter hardware and Texas Instruments CC2652/CC1352 based adapter hardware.

home-assistant/addons-development#38

That is, there is a specific interest in first supporting different OTBR Spinel RPC firmware running on Silicon Labs EFR32 based adapters and Texas Instruments CC2652/CC1352 based adapters. The reason for that is that both those are already readily available as USB radio adapters sold to consumers as Zigbee Coordinator dongles (pre-loaded with Zigbee NCP firmware). There TI CC26x2/CC13x2 based USB adapters are currently by far the most popular hardware within the DIY home automation communities because they are less expensive and there are available globally from different manufacturers, (and stock of Silicon Labs EFR32 based USB radio adapters have also suffered much more than Texas Instruments in the current worldwide chip shortage and global supply chain logistics problems).

https://www.zigbee2mqtt.io/guide/adapters/#based-on-texas-instruments-cc2652-cc1352-chip

From what I read @agners has however only worked with Silicon Labs EFR32 (EFR32MG1x/EFR32MG2x) based adapters (primarily Silabs EFR32MG21 chip) with OpenThread Spinel RPC firmware for Matter/CHIP over Thread so his host application add-on but I believe that is only because Nabu Casa has chosen to use Silabs EFR32MG21 chip based adapters that will ship on the board as part on the official Home Assistant Yellow (formerly Home Assistant Amber) hardware as well being the main component of their upcoming Home Assistant SkyConnect USB Stick which was recently announced, both of which @agners is the lead developer as well:

zigpy/zigpy#894

https://www.crowdsupply.com/nabu-casa/home-assistant-yellow

https://www.home-assistant.io/blog/2021/09/13/home-assistant-yellow/

https://twitter.com/ZackBarettHA/status/1537168472131739648?ref_src=twsrc%5Etfw%7Ctwcamp%5Etweetembed%7Ctwterm%5E1537168472131739648%7Ctwgr%5E%7Ctwcon%5Es1_&ref_url=https%3A%2F%2Fwww.redditmedia.com%2Fmediaembed%2Fvdmpl6%3Fresponsive%3Dtrueis_nightmode%3Dfalse

https://docs.google.com/forms/d/e/1FAIpQLScEjHSBJszUZfgO3MIDO51IHr3Oeohcs8BLpRIjY1liJ58IpA/viewform

The ultimate goal I would personally like to see achieved would be for open-source home automation software applications like Home Assistant to allow end-users to buy any OpenThread compatible USB radio adapter and use any version of OpenThread Spinel RPC firmware regardless if the 802.15.4 radio MCU chip hardware is from Silicon Labs, Texas Instruments, NXP, or Nordic Semiconductor, etc., (as long as it is a Thread certified product).

So, hopefully, this new feature will lead to all Thread USB radio adapters with an OTBR Spinel RPC firmware build regardless of stack version and chip hardware manufacturers or models will as such in the future be compatible with the upcoming Thread-based “Matter” (Project CHIP / Connected Home over IP) devices if used in Home Assistant with their other add-ons for the that is also in development.

https://github.com/home-assistant/addons-development/tree/master/chip_tool

https://github.com/home-assistant/addons-development/tree/master/chip_controller_repl

https://github.com/home-assistant-libs/python-matter-server/

PS: Nabu Casa (with the Home Assistant founders) has by the way recently joined the CSA (Cloud Security Alliance) and is planning on certifying the Home Assistant implementation as a Thread as well as a Matter/CHIP compatible CSA-IoT platform:

https://csa-iot.org/members/search/?m_keywords=Nabu+Casa

@jwhui
Copy link
Member

jwhui commented Jun 25, 2022

@Hedda , thanks for providing the additional context.

Spinel is already OpenThread's standard serial protocol for communicating between a host and 802.15.4 SoC (see our Co-Processor Designs page on openthread.io). Practically all of the vendors supporting OpenThread support RCP with spinel.

This PR is proposing to extend spinel to multiplex multiple 802.15.4-based networks simultaneously across a single serial bus. This is for RCPs that can support multiple 802.15.4 PANs simultaneously.

}

return aPosition;
return aPosition % aQueueSize;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we don't need the %?

@@ -2561,6 +2709,11 @@ otError otNcpStreamWrite(int aStreamId, const uint8_t *aDataPtr, int aDataLen)
return error;
}

extern "C" spinel_iid_t otNcpPlatGetCurCommandIid()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
extern "C" spinel_iid_t otNcpPlatGetCurCommandIid()
extern "C" spinel_iid_t otNcpPlatGetCurCommandIid(void)

@mnp222
Copy link
Contributor Author

mnp222 commented Dec 15, 2022

We will open a new PR that is more up to date.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants