Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dlt-connection: add socket timeout #439

Merged
merged 1 commit into from
Feb 28, 2023

Conversation

alexmohr
Copy link
Contributor

in some scenarios a socket might block forever.
This means dlt-daemon won't respond properly anymore. To fix this a timeout is added to the dlt-daemon socket connection.

The program was tested solely for our own use cases, which might differ from yours.
Licensed under Mozilla Public License Version 2.0

Alexander Mohr, alexander.m.mohr@mercedes-benz.com, Mercedes-Benz Tech Innovation GmbH, imprint

@alexmohr alexmohr force-pushed the add-socket-timeout branch 3 times, most recently from a0d22f8 to 0d131fa Compare January 31, 2023 14:40
@lvklevankhanh
Copy link
Collaborator

Hi @alexmohr ,
The code change is fine to me. However, could you please share us the way to test socket not responding, and how to verify your code change to solve the problem?
Thanks,

@alexmohr
Copy link
Contributor Author

Hi @alexmohr , The code change is fine to me. However, could you please share us the way to test socket not responding, and how to verify your code change to solve the problem? Thanks,

I was never able to reproduce the issue in a synthetic test. It only happened from time to time on our hardware target. After changing this the issues are gone but I'm still not able to reproduce it in a unit test

@lvklevankhanh
Copy link
Collaborator

Thanks @alexmohr for your informative!

#ifdef DLT_SYSTEMD_WATCHDOG_ENABLE
char *watchdogUSec = getenv("WATCHDOG_USEC");
if (watchdogUSec)
timeout.tv_sec = atoi(watchdogUSec) / 4000000;
Copy link
Collaborator

@michael-methner michael-methner Feb 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be:

Suggested change
timeout.tv_sec = atoi(watchdogUSec) / 4000000;
{
timeout.tv_sec = atoi(watchdogUSec) / 1000000;
timeout.tv_usec = atoi(watchdogUSec) % 1000000;
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

@michael-methner
Copy link
Collaborator

Hello @alexmohr ,
i assume that multiple read and write operations to this socket will now cause a timeout as returnvalue / errno . Have you checked that all occurrences handle this in a proper way?

@alexmohr
Copy link
Contributor Author

Hello @alexmohr , i assume that multiple read and write operations to this socket will now cause a timeout as returnvalue / errno . Have you checked that all occurrences handle this in a proper way?

It's mainly used for the dlt-daemon events which are sent to the clients.
This PR is related to #385 which I closed because I thought it's super seeded by this PR. It turns out to fully sort out the timeouts both of these patches are necessary. Should I pick the other commit onto this branch so we have one PR to discuss these issues on?

@michael-methner
Copy link
Collaborator

michael-methner commented Feb 22, 2023

Hello @alexmohr , i assume that multiple read and write operations to this socket will now cause a timeout as returnvalue / errno . Have you checked that all occurrences handle this in a proper way?

It's mainly used for the dlt-daemon events which are sent to the clients. This PR is related to #385 which I closed because I thought it's super seeded by this PR. It turns out to fully sort out the timeouts both of these patches are necessary. Should I pick the other commit onto this branch so we have one PR to discuss these issues on?

We can keep them as separate PRs. As they are touching different code lines, it should not be a problem to merge them separately.

in some scenarios a socket might block forever.
This means dlt-daemon won't respond properly anymore.
To fix this a timeout is added to the dlt-daemon socket connection.

Signed-off-by: Alexander Mohr <alexander.m.mohr@mercedes-benz.com>
char *watchdogUSec = getenv("WATCHDOG_USEC");
if (watchdogUSec) {
timeout.tv_sec = atoi(watchdogUSec) / 1000000;
timeout.tv_usec = atoi(watchdogUSec) % 1000000;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fixed is fine.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants