LwM2M string resource is not a zero terminated C-string according to the LwM2M specification #90719
+341
−173
I ran into an issue with the current implementation of the LwM2M String resource.
A String resource may be truncated by a zero terminator when the buffer is used up entirely. If the final symbol is a UTF-8 multi-byte character, the result is an invalid UTF-8 string.
From `lwm2m_engine_set(...)`
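The truncation problem can be reproduced in isolation. The following is a minimal sketch, not the actual Zephyr code: `demo_store_forced_nul()` is a hypothetical helper that assumes the behavior described above (when the data fills the buffer, the last byte is overwritten with the terminator), and `utf8_is_valid()` is a hypothetical structural check that only verifies lead and continuation bytes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical structural UTF-8 check: verifies lead/continuation byte
 * patterns only (it does not reject overlong forms or surrogates). */
bool utf8_is_valid(const uint8_t *s, size_t len)
{
    size_t i = 0;

    while (i < len) {
        uint8_t b = s[i];
        size_t need;

        if (b < 0x80) {
            need = 0;                     /* single-byte (ASCII) */
        } else if ((b & 0xE0) == 0xC0) {
            need = 1;                     /* 2-byte sequence */
        } else if ((b & 0xF0) == 0xE0) {
            need = 2;                     /* 3-byte sequence */
        } else if ((b & 0xF8) == 0xF0) {
            need = 3;                     /* 4-byte sequence */
        } else {
            return false;                 /* stray continuation or bad lead */
        }
        if (need > len - i - 1) {
            return false;                 /* sequence cut off at the end */
        }
        for (size_t k = 1; k <= need; k++) {
            if ((s[i + k] & 0xC0) != 0x80) {
                return false;
            }
        }
        i += need + 1;
    }
    return true;
}

/* Hypothetical model of the described behavior (an assumption, not the
 * real engine code): copy the value, then force a zero terminator,
 * sacrificing the last data byte when the buffer is completely used. */
size_t demo_store_forced_nul(uint8_t *buf, size_t buf_len,
                             const uint8_t *src, size_t src_len)
{
    assert(buf != NULL && src != NULL && buf_len > 0);

    size_t n = src_len < buf_len ? src_len : buf_len;

    memcpy(buf, src, n);
    if (n == buf_len) {
        n = buf_len - 1;                  /* no room left for the terminator */
    }
    buf[n] = '\0';
    return n;                             /* number of data bytes kept */
}
```

Storing the four bytes `'a', 'b', 0xC3, 0xA9` ("abé") into a four-byte buffer keeps only `'a', 'b', 0xC3`, which ends in the middle of a two-byte sequence and is therefore rejected by the check.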
From the git history, the zero termination was always done in one way or another.
However, that is not in line with the LwM2M specification, which specifies that a String is a UTF-8 string. In UTF-8, `nul` is a character like any other.

Many people who are familiar with the C programming language think that zero-terminated strings are common sense. In reality, they are a concept of the C programming language.

So the current implementation, which enforces zero termination, may cause compatibility issues with other LwM2M implementations. At the very least, any unit test should fail if it does not `get` what it has `set`. That's the case with the current implementation.

My solution proposal
Since the `datalen` field was introduced, a String resource can be handled like Opaque. So, don't add or remove any character in the resource itself.

`lwm2m_get_string()` and `lwm2m_set_string()` can do the C string <-> LwM2M string conversion by adding and removing the C string zero terminator.

Actually, the `lwm2m_get_string()` and `lwm2m_get_opaque()` functions also lack the information about the actual resource length. So I added a parameter that returns the resource size.

As a result, the API of these functions breaks compatibility. Anyway, `lwm2m_get_opaque()` had no useful purpose without that information.

`lwm2m_get_string()` now adds the C zero terminator but requires a buffer that can take that additional character.
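The proposed split between stored bytes and C strings can be sketched as follows. This is only an illustration under stated assumptions: `struct demo_res` and the `demo_*` functions are hypothetical stand-ins, and their signatures are not the ones in this PR or in the Zephyr API. The resource keeps raw bytes plus a length, the setter strips the terminator, and the getters report the resource size through an extra output parameter.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a String resource: raw bytes plus a length,
 * handled like Opaque. No terminator is stored in the resource. */
struct demo_res {
    uint8_t *data;    /* resource buffer */
    size_t max_len;   /* capacity of the buffer */
    size_t data_len;  /* bytes currently in use */
};

/* C string -> LwM2M string: store the bytes without the terminator,
 * so the value may fill the buffer completely. */
int demo_set_string(struct demo_res *res, const char *str)
{
    assert(res != NULL && str != NULL);

    size_t len = strlen(str);

    if (len > res->max_len) {
        return -ENOMEM;
    }
    memcpy(res->data, str, len);
    res->data_len = len;
    return 0;
}

/* Opaque read: copies the bytes and reports how many were copied. */
int demo_get_opaque(const struct demo_res *res, void *buf, size_t buf_len,
                    size_t *out_len)
{
    assert(res != NULL && buf != NULL && out_len != NULL);

    if (res->data_len > buf_len) {
        return -ENOMEM;
    }
    memcpy(buf, res->data, res->data_len);
    *out_len = res->data_len;
    return 0;
}

/* LwM2M string -> C string: the terminator is added here, so the caller
 * must provide room for data_len + 1 bytes. */
int demo_get_string(const struct demo_res *res, char *buf, size_t buf_len,
                    size_t *out_len)
{
    assert(res != NULL && buf != NULL && out_len != NULL);

    if (buf_len < res->data_len + 1) {
        return -ENOMEM;
    }
    memcpy(buf, res->data, res->data_len);
    buf[res->data_len] = '\0';
    *out_len = res->data_len;  /* resource size, terminator excluded */
    return 0;
}
```

With a four-byte resource buffer, setting "abé" (four bytes in UTF-8) succeeds and uses the buffer entirely. Reading it back as a C string needs a five-byte buffer, while the opaque read needs only four.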
The first commit is too big. I cleaned up some code that hurt my eyes, which was not really necessary.
However, any thoughts about the basic topic?