Skip to content

LwM2M string resource is not a zero terminated C-string according to the LwM2M specification #90719

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

GardeningStevie
Copy link
Contributor

I run into an issues with the current implementation of LwM2M String resource.

A string resource may be truncated with a zero-terminator when the buffer was used up entirely. In case of a UTF-8 multi-byte character as final symbol the result is an invalid UTF-8 string.

from lwm2m_engine_set(...)

	case LWM2M_RES_TYPE_STRING:
		if (len) {
			strncpy(data_ptr, value, len - 1);
			((char *)data_ptr)[len - 1] = '\0';
		} else {
			((char *)data_ptr)[0] = '\0';
		}
		break;

From the git history, the zero termination was always done in one or another way.

However, that is not in line with LwM2M specification which does specify that it is an UTF-8 string. In UTF-8, nul is a character like many other.

Many people, which are familiar with the C programming language, think that zero terminated strings are common sense. In reality, it is a concept of the the C programming language.

So, the current implementation which enforces zero-termination may cause compatibility issues with other LwM2M implementations. At least, any unit test should fail that do not get what he was set. That's the case with the current implementation.

My solution proposal
Since the datalen field is introduced, a String resource can be handled like Opaque. So, don't add or remove any character in the resource itself.

lwm2m_get_string() and lwm2m_set_string() can do the C string <-> LwM2M string conversion by adding and removing the C string zero terminator.

Actually, lwm2m_get_string() and lwm2m_get_opaque() functions lack also the information about the actual resource length. So, I added a parameter to get the information about the resource size.
So, the API of these functions breaks compatibility. Anyway, lwm2m_get_opaque() had no useful purpose without that information. lwm2m_get_string() adds now the C zero-termination but requires a buffer that can
take that additional character.

The first commit is too big. I cleaned some code that hurts my eyes. It was not really necessary.

However, any thoughts about the basic topic?

@GardeningStevie GardeningStevie force-pushed the gardena/sc/upstream/lwm2m-invalid-utf8-string-due-to-forced-zero-termination branch 4 times, most recently from acc6a5c to f27a77d Compare June 2, 2025 09:53
It's against LwM2M specification which does not specify zero-termination
for a string resource. It's not required and the current implementation
makes a terminating UTF-8 multibyte character invalid and modifies any
string which use up the entire assigned buffer without any notice.
It may also break compatibility with other LwM2M implementations.

This commit fixes `lwm2m_get_string()` and `lwm2m_get_opaque()`
implementation which could not return the copied data length. So, the
API of these functions breaks compatiblity.
However, `lwm2m_get_opaque` had no useful purpose without the information
about the copied size. `lwm2m_get_string` still adds zero-termination but
requires a buffer that can take the zero-termination character.

To fix the implementation, `lwm2m_engine_get()` was modified in way it
returns the copied data length. So, a return code >= 0 is okay now.
While editing `lwm2m_engine_get()` some "side quests" arise.
- `memcpy` works for everything except a LwM2M Time resource.
  So, a lot of code was dropped.
- reading from a null pointer can return `success`.
- `read_cb()` unittest implementation returns an error (null pointer).

Signed-off-by: Stefan Schwendeler <Stefan.Schwendeler@husqvarnagroup.com>
A read callback can return a null pointer to inidicate an error.
`lwm_engine_get()` will then return -ENOENT as response.

Signed-off-by: Stefan Schwendeler <Stefan.Schwendeler@husqvarnagroup.com>
Verifies that the memory copy works as expected
- return codes are correct
- resource length parameter works

Signed-off-by: Stefan Schwendeler <Stefan.Schwendeler@husqvarnagroup.com>
@GardeningStevie GardeningStevie force-pushed the gardena/sc/upstream/lwm2m-invalid-utf8-string-due-to-forced-zero-termination branch from f27a77d to 88c70cf Compare June 2, 2025 11:16
Copy link

sonarqubecloud bot commented Jun 2, 2025

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant