How are Observations stored on client side? #1574
Comments
Could I ask which version of Leshan you are using?
Yep, there have been some improvements about that, but there are still some limitations (the code was not initially designed with this use case in mind, neither in Leshan nor in Californium). See for more details at:
Just out of curiosity, what kind of gain do you get with stop()?
I guess this is because when you start the client it sends a new Register request.
If I'm right ☝️ this is not about saving the observations, it's about not sending a new registration on start.
Are you using |
Thanks for the quick reply!
With the thread pools we can control almost all threads, but the one remaining client-specific thread is the "DTLS-Receiver-0-/127.0.0.1:xxxxx". Stopping the client helps us get rid of this temporarily.
We actually made a change in the RegistrationEngine class so that it checks the registeredServers field when the start method is called, and if there is a server in there, we try to do an update first. This seems to have fixed the behavior as needed; the same registration is kept.

```java
@Override
public void start() {
    stop(false); // Stop without de-register
    synchronized (this) {
        started = true;
        // Try factory bootstrap
        LwM2mServer dmServer = factoryBootstrap();

        if (dmServer == null) {
            // If it failed try client initiated bootstrap
            if (!scheduleClientInitiatedBootstrap(NOW))
                throw new IllegalStateException("Unable to start client : No valid server available!");
        } else {
            // If there exists a registered server already, we try to send a registration update
            // Only one server is supported for now
            if (!registeredServers.isEmpty()) {
                String registrationId = registeredServers.entrySet().iterator().next().getKey();
                var updateTask = new CustomRegistrationEngine.UpdateRegistrationTask(dmServer, registrationId,
                        new RegistrationUpdate());
                updateFuture = schedExecutor.submit(updateTask);
            } else {
                registerFuture = schedExecutor.submit(new CustomRegistrationEngine.RegistrationTask(dmServer));
            }
        }
    }
}
```

Might be that I am thinking this wrong, but I feel that if it is the client who sends the notifications on resource change, it should have knowledge of the observations that have been made on its resources. I was not able to pinpoint where this information could be, or if it exists at all on the client.
And it works for you now?
Observation is mainly a CoAP feature, and this seems to be a nightmare to implement in CoAP (+ there are some specification issues in the RFC). Anyway, just to say that the relation is mainly stored in the CoAP library code:
Observation-wise, we noticed that there is an |
The |
So it is of no use on the LWM2M client side? Relating to this, can you suggest where to look for a way to persist the DTLS session/connection ID on client restart? I noticed from the logs the |
That's it!
"DTLS session/connection ID": the "ID" behind "connection" confuses me a little. I don't know if you want to persist the connection or just the connection ID value. I guess this is the former, because the latter does not make so much sense? Anyway, Session and Connection are 2 very different concepts in (D)TLS. In Californium,
In any case, if this is a pure Californium question, it might be better to ask directly there: https://github.com/eclipse-californium/californium/issues
There is some example here: #1395 (comment)
Ok, so this was solved. It appears that Leshan and Californium actually handled the observations and sessions nicely, and stopping and starting the client did in fact keep the … The culprit behind the sessions and observations being lost was in the

```java
// Try factory bootstrap
LwM2mServer dmServer = factoryBootstrap();
```

in … Thanks @sbernard31 for all the help! Regarding the changes I made on the RegistrationEngine, is this start/stop handling something you could want in the default one, behind a configuration option? I could maybe submit a PR if so?
Glad to see you found a solution. 👍
You're welcome. 🙂
I have no clear idea if this is something which should be integrated or not. Another point: I'm not sure if it can work with all transport layers. Perhaps you could create a PR mainly to share the code, but I can't guarantee that it will be integrated soon (or even later). But if we don't integrate it, how will you reuse Leshan? By copy/paste/modify of the code?
Yeah, good point. I can maybe clean up the code a bit and create that PR. Whether or not it will be integrated, it could prove useful for someone.
Not sure I fully understand the question, but for our use case I just created a new class using the copied RegistrationEngine/Factory code with the changes made. Not ideal, but it works. Maintenance-wise, upgrading the Leshan version will require some effort if the implementation changes, so having this in the library code itself would have been ideal. Regarding Queue mode, this kind of start/stop approach works for us, i.e. we handle the wake-up times and the data sent on start-up at a higher level, outside Leshan. But I would guess the other way to have queue support in the client would be to have the periodic communication and queued requests/responses within Leshan. I'm not so familiar with the LWM2M specs, so I don't know how much is specified regarding this mode's behavior. The real device we are simulating here will be using queue mode, waking up only for the periodic communication. But at least in our case, the command queues and such will be mostly on the server side, and the client's queue-mode responsibilities are simpler in that regard.
👍
That's exactly my question.
Yep, and my point was IF we don't integrate your PR THEN we can maybe find a solution where we adapt the
Is it possible for you to share a link to the code? I'm curious to see how the API is used.
I cannot share too much of the code, but in short: client-usage wise, we just call its start and stop methods. When the client is supposed to wake up, we do

```java
int objectId = X;
LwM2mObjectEnabler objectEnabler = client.getObjectTree().getObjectEnabler(objectId);
LwM2mInstanceEnabler enablerInstance = ((ObjectEnabler) objectEnabler).getInstance(0);
var objectInstance = (ObjectXEnabler) enablerInstance;
objectInstance.updateResources();
```

When building the client:

```java
@Override
protected Connector createSecuredConnector(DtlsConnectorConfig dtlsConfig) {
    var con = new DTLSConnector(dtlsConfig);
    con.setExecutor(sharedExecutor);
    return con;
}
```

I'll try to create the PR when I get the chance 👍
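(For context, a minimal sketch of the sleep/wake cycle described above, assuming `LeshanClient.start()`/`stop(boolean)` as shown in this thread; the scheduling interval and the `updateSimulatedValues` helper are illustrative assumptions, not Leshan API.)

```java
// Sketch only: drive one simulated device through the sleep/wake cycle described above.
// 'updateSimulatedValues' is a hypothetical helper that refreshes object instances as shown earlier.
void scheduleWakeUps(LeshanClient client, ScheduledExecutorService wakeScheduler) {
    wakeScheduler.scheduleAtFixedRate(() -> {
        client.start();                // with the custom engine this triggers a registration update
        updateSimulatedValues(client); // push fresh resource values so observations produce notifications
        client.stop(false);            // go back to "sleep" without de-registering
    }, 0, 30, TimeUnit.MINUTES);
}
```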
Thx for the details. 🙏 I'll wait for the PR and then we will see what we can do to limit your code duplication about |
@sbernard31 I've added a draft PR for the registration engine. There is one other change I would be interested in making, but I will most likely create a separate issue/PR for that one, as it is unrelated to this feature.
👍
About #1574 (comment),

The modification in Leshan could look like: 4ddcce5

And so your code to modify the behavior would be like:

```java
private final DefaultRegistrationEngineFactory engineFactory = new DefaultRegistrationEngineFactory() {
    @Override
    protected RegistrationEngine doCreate(String endpoint, LwM2mObjectTree objectTree,
            EndpointsManager endpointsManager, UplinkRequestSender requestSender, BootstrapHandler bootstrapState,
            LwM2mClientObserver observer, java.util.Map<String, String> additionalAttributes,
            java.util.Map<String, String> bsAdditionalAttributes, ScheduledExecutorService executor,
            long requestTimeoutInMs, long deregistrationTimeoutInMs, int bootstrapSessionTimeoutInSec,
            int retryWaitingTimeInMs, Integer communicationPeriodInMs, boolean reconnectOnUpdate,
            boolean resumeOnConnect, boolean useQueueMode, ContentFormat preferredContentFormat,
            java.util.Set<ContentFormat> supportedContentFormats, LinkFormatHelper linkFormatHelper) {
        return new DefaultRegistrationEngine(endpoint, objectTree, endpointsManager, requestSender, bootstrapState,
                observer, additionalAttributes, bsAdditionalAttributes, executor, requestTimeoutInMs,
                deregistrationTimeoutInMs, bootstrapSessionTimeoutInSec, retryWaitingTimeInMs,
                communicationPeriodInMs, reconnectOnUpdate, resumeOnConnect, useQueueMode, preferredContentFormat,
                supportedContentFormats, linkFormatHelper) {
            @Override
            protected void onWakeUp(Map<String, LwM2mServer> registeredServers) {
                if (registeredServers.isEmpty()) {
                    super.onWakeUp(registeredServers);
                } else {
                    // TODO support multiple servers
                    triggerRegistrationUpdate(registeredServers.values().iterator().next());
                }
            }
        };
    }
};
```

This is not so elegant (too many arguments...) but it should save you a lot of code duplication. (This could be a solution if we decide not to integrate #1579)
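(For readers following along: a factory like this would presumably be handed to the client builder in the usual way. This is a sketch under the assumption that `LeshanClientBuilder.setRegistrationEngineFactory(...)` is used, as in current Leshan 2.x snapshots; the endpoint name is illustrative.)

```java
// Sketch: wiring the customized factory into a client.
// Assumes the standard LeshanClientBuilder.setRegistrationEngineFactory(...) hook;
// adjust to the Leshan version you are actually using.
LeshanClientBuilder builder = new LeshanClientBuilder("my-simulated-endpoint");
builder.setRegistrationEngineFactory(engineFactory);
LeshanClient client = builder.build();
client.start();
```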
I think your suggestion looks good; it would definitely reduce the duplication in both the default engine and the factory. I think that kind of hook approach would work nicely, but one thing I'm concerned about is how to guarantee that the right fields from the engine are available in that onWakeUp (an onShutDown or similar would also be needed). Concrete example: I recently had to change the executor logic in

```java
public void start() {
    stop(false); // Stop without de-register
    synchronized (this) {
        started = true;
        if (attachedExecutor && scheduledExecutor.isShutdown()) {
            scheduledExecutor = createScheduledExecutor();
        }
        LwM2mServer dmServer;
        ....
```

and

```java
public void stop(boolean deregister) {
    ...
    cancelRegistrationTask();
    // we should manage the case where we stop in the middle of a bootstrap session ...
    cancelBootstrapTask();
    if (attachedExecutor) {
        try {
            scheduledExecutor.shutdownNow();
            scheduledExecutor.awaitTermination(bootstrapSessionTimeoutInSec, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            ...
```

So this kind of use would require that the executor could be set/get in the onWakeUp and onShutDown hooks. But, like you said, I'm also not sure how much use this will be for users without actual Queue mode support. Is that something that's in the plans for the future? Overall, I think Leshan has been great to work with, despite our use case of a massive number of clients not being so well supported. I think most headaches have been caused by the thread usage, optimization of which has been lots of trial and error 😄 But I think we're almost there.
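(One plausible shape for the `createScheduledExecutor()` helper referenced in the snippet above, using only JDK classes; the original implementation is not shown in this thread, so this is purely illustrative.)

```java
// Hypothetical helper matching the snippet above: recreate the engine's single-threaded
// scheduler after it has been shut down by stop(). The thread name is only for debugging.
private ScheduledExecutorService createScheduledExecutor() {
    return Executors.newSingleThreadScheduledExecutor(
            r -> new Thread(r, "RegistrationEngine#" + System.nanoTime()));
}
```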
Yep, I understand. A non-exhaustive list of solutions:
Note that currently Leshan 2.0.0 is in development and the API can change/break between two milestone releases. (Then, when a stable version is released, we will try to respect semantic versioning.) But let's better understand your new needs before making this kind of decision.
I'm not sure I get why? Because we use the sync API to send requests in
And so you kill the executor on stop to release threads? Like you stop() the connector to release a thread?
This is in the scope of the project, as this is part of the LWM2M specification.
Improving the Leshan client to be able to simulate lots of clients would be great. I appreciate your work and I would be happy if we could make it better together. Maybe we should create a new issue where we clearly identify each problem, then try to define what would be the really good solution for each problem, and then also a workaround/short-term solution. By the way, a long time ago I did this: https://github.com/sbernard31/benchmark-clients
Did you try to use the VirtualThreads of Java 21? Then maybe you can use many more threads and there is no need to stop/start all these schedulers? (I never tested it.)
This behavior was indeed very weird. I profiled the whole thing with JProfiler, and it seemed like every time there were too many simultaneous connections, all the threads in the pool (set with `clientBuilder.setSharedExecutor`) would start blocking, as if they couldn't make the call in

```java
RegisterResponse response = sender.send(server, request, requestTimeoutInMs);
```

After we stopped the client, we would get the "Registration/Update task interrupted" message in the logs, which would indicate the send was stuck until that point. As for why, I could not pinpoint the reason. I actually tried to make an async version of the RegistrationEngine, but the only change was that the behavior seemed to happen in that async thread instead. The clients would communicate, but the amount of … Here you can see that while the DTLS-receiver and timer were nicely shut down on client.stop(), the async thread kept on going. I could not save the stack trace from that, but IIRC it was always stuck on that sender.send. BTW I think it would maybe not be impossible to reproduce this, just by having ~100 clients connecting to a server, starting and stopping on timers, and being given a very limited thread pool (size ~2).
Yep, the executor, which is basically just that RegistrationEngine thread.
Sadly not, we are still using Java 17 in this project :( But it would be interesting to see how that would affect things.
That sounds like a good idea.
Thx for your explanation. For now it's hard for me to really understand the issue. 🤔 As … If all … But using an async send should solve the issue because … So I'm confused 😕 Did you try to use several shared thread pools? I mean 1 thread pool for all
So at least there is a kind of workaround 😬
Let me know if you test that.
I created an issue to summarize everything about simulating several clients with Leshan: #1585
Yep, I tried to have a separate pool for the DTLS connector (USE 1 below); this basically affected the DTLS timers based on my observations. BUT later on I realized that we had also given that other pool to the provider, and not just the client. Not sure if this makes any difference, but I didn't happen to test it.

So, one pool for this:

USE 1

```java
@Override
protected Connector createSecuredConnector(DtlsConnectorConfig dtlsConfig) {
    var con = new DTLSConnector(dtlsConfig);
    con.setExecutor(sharedExecutor2);
    return con;
}
```

another pool for these:

USE 2

```java
clientBuilder.setSharedExecutor(sharedExecutor)
```

USE 3

```java
var endpointsProvider = (CaliforniumClientEndpointsProvider) client.getEndpointsProvider().toArray()[0];
endpointsProvider.getCoapServer().setExecutors(sharedExecutor, sharedExecutor, true);
```

This was our initial guess, that there would be some kind of deadlock-type situation where all threads are in use and can't be released.
Maybe I don't get you correctly, so let me know if I misunderstood you. My point is that you should try a dedicated pool for all RegistrationEngine and a dedicated pool for all CoapServer (so 3 different thread pools). AFAIK:
Let me know if with these 3 different thread pools you still need to destroy/recreate the RegistrationEngine scheduler on stop/start.
I decided to try this one out with the Java 21 virtual threads, since it turned out we could maybe change the Java version after all. With three separate executors provided to these three uses, each using virtual threads, we were able to get much better performance as expected, since we can have way more threads running in each of these cases. When providing this external executor, the RegistrationEngine executor is not touched on stop/start. However, I feel this is more of a band-aid solution, and there might be some logical error in the way we are handling all of this.
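(For anyone trying the same thing, a rough sketch of what three virtual-thread-backed executors might look like, wired to the three uses shown earlier in this thread: the DTLS connector executor, `clientBuilder.setSharedExecutor(...)`, and `coapServer.setExecutors(...)`. Assumes Java 21; pool sizes are arbitrary and the exact executor types may need adjusting to whatever each Leshan/Californium setter accepts.)

```java
// Sketch only (Java 21+): three separate executors backed by virtual threads.

// USE 1: executor for the DTLSConnector(s) created in createSecuredConnector.
ExecutorService dtlsExecutor = Executors.newVirtualThreadPerTaskExecutor();

// USE 2: shared scheduler for the RegistrationEngine(s); scheduling still needs a
// ScheduledExecutorService, so this is a scheduled pool whose workers are virtual threads.
ScheduledExecutorService engineExecutor =
        Executors.newScheduledThreadPool(4, Thread.ofVirtual().name("engine-", 0).factory());

// USE 3: main/secondary executors for the CoapServer(s), passed to setExecutors(...).
ScheduledExecutorService coapExecutor =
        Executors.newScheduledThreadPool(4, Thread.ofVirtual().name("coap-", 0).factory());
```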
Thx for sharing that.
Yep, surely an async … For Leshan, if you get a not-so-bad async implementation of the registration engine, please share the code (by opening a PR?). Let me know if you need help.
Question
We are using Leshan to simulate thousands of devices/clients. Since performance-wise one of the limiting factors is the use of threads, we use shared thread pools and stop clients while the devices are sleeping. This approach has some issues; one of them seems to be that when the LeshanClient is stopped, it loses all the observations sent by the server. When testing the client with the Leshan Server Demo, the observed values are updated nicely on the server as long as the client is not stopped. When it is stopped and started again, no notifications are sent when the observed resources change, unless the server re-observes them.
My question is: how are those observations stored on the client side, and is there some way we could save them so that a client restart would not affect them?