APRS igate tries to UTF-8 re-encode APRS packets #2028

hessu · 2024-03-18T22:59:52Z

Hi,

I'm Hessu, OH7LZB, I maintain aprs.fi and the aprsc server software which the APRS igate implementation connects to. I briefly reviewed the APRS igate in sdrangel and noticed the following:

APRSWorker::handleMessage tries to decode complete APRS packets as Latin1 and then re-encode them as UTF-8. Some fields of APRS packets are UTF-8 encoded these days (text message contents, status text, position packet comment text), and complete APRS packets seen in the wild often contain byte sequences which are not valid UTF-8 or valid Latin1. Decoding and re-encoding the complete packet will either fail or produce a modified duplicate packet which differs from the packet originally received. This is undesirable: igates need to handle packets in binary-clean buffers.
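To illustrate the modification described above, here is a minimal sketch in plain C++ (not the sdrangel or Qt code). The helper name `latin1ToUtf8` is hypothetical; it mimics what a Latin1 decode followed by a UTF-8 encode does to each byte. A comment that was already UTF-8 comes out double-encoded, so the forwarded packet no longer matches the received one.

```cpp
#include <string>

// Hypothetical helper, for illustration only: treat every input byte as a
// Latin-1 code point (U+0000..U+00FF) and encode that code point as UTF-8.
// This is the net effect of "decode as Latin1, then re-encode as UTF-8".
std::string latin1ToUtf8(const std::string& in)
{
    std::string out;
    for (unsigned char c : in) {
        if (c < 0x80) {
            // ASCII bytes pass through unchanged.
            out.push_back(static_cast<char>(c));
        } else {
            // Bytes 0x80..0xFF become a two-byte UTF-8 sequence.
            out.push_back(static_cast<char>(0xC0 | (c >> 6)));
            out.push_back(static_cast<char>(0x80 | (c & 0x3F)));
        }
    }
    return out;
}
```

For example, the two bytes C3 A4 (the UTF-8 encoding of "ä") come out as the four bytes C3 83 C2 A4, while a pure ASCII packet is left untouched.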

Here's the latin1 decoding:
https://github.com/f4exb/sdrangel/blob/cb89392c86e4a218c53cbec6b1afeff93cd75f23/sdrbase/util/ax25.cpp#L135C5-L135C16

Here's the UTF-8 encoding:

sdrangel/plugins/feature/aprs/aprsworker.cpp, line 118 in cb89392:

send(igateMsg.toUtf8(), igateMsg.length());

The data contents should be held in a byte array, and sent from there, without any decoding or re-encoding, to ensure arbitrary byte sequences are not modified. To protect against APRS-IS server command injection, send bytes until the first CR or LF is met (or the end of the packet, if CR or LF is not found), and append a CR LF sequence.
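The truncate-and-terminate rule above can be sketched as follows, assuming the packet is held in a `std::string` used purely as a byte container (it keeps an explicit length, so embedded NULs and non-UTF-8 bytes survive). The function name `prepareForAprsIs` is hypothetical and not part of sdrangel.

```cpp
#include <string>

// Hypothetical helper: returns the bytes to send to APRS-IS for one packet.
// The packet is cut at the first CR or LF (either one, not only a CR LF
// pair), and exactly one CR LF terminator is appended. No character-set
// decoding or re-encoding takes place.
std::string prepareForAprsIs(const std::string& packet)
{
    // Position of the first CR or LF, or npos if the packet has neither.
    const std::string::size_type end = packet.find_first_of("\r\n");

    // substr(0, npos) yields the whole packet, so both cases are covered.
    std::string line = packet.substr(0, end);
    line += "\r\n";
    return line;
}
```

With this shape, a payload such as `A>B:hi\n# injected` is sent as `A>B:hi` followed by CR LF, so the text after the line break never reaches the server as a separate line.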

APRSWorker::send also takes a const char *data, so it'll stop processing at the first NUL byte. There are packets out there which contain NULs.
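As a general illustration of the C-string pitfall (not a statement about what APRSWorker::send actually does internally), the sketch below compares the byte count seen by a `strlen`-based routine with the count seen when an explicit length travels with the buffer. Both function names are hypothetical. Whether a given sender is affected depends on how the length it receives was computed.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical: number of bytes a strlen-based sender would transmit.
// strlen stops at the first NUL byte, so anything after it is lost.
std::size_t bytesSeenByStrlen(const std::string& packet)
{
    return std::strlen(packet.c_str());
}

// Hypothetical: number of bytes a length-aware sender would transmit.
// The explicit size covers embedded NUL bytes as well.
std::size_t bytesSeenWithLength(const std::string& packet)
{
    return packet.size();
}
```

For a five-byte packet `AB<NUL>CD`, the first function reports 2 and the second reports 5.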

Here are the relevant parts of my "hints for igate developers" doc:
https://github.com/hessu/aprsc/blob/main/doc/IGATE-HINTS.md#packets-getting-modified-due-to-character-encoding-issues
https://github.com/hessu/aprsc/blob/main/doc/IGATE-HINTS.md#packets-truncated-by-igates-due-to-c-string-handling

Thanks!


srcejon · 2024-03-20T12:09:54Z

Thanks for the review. Yes, using a binary buffer makes sense, so I'll make the change. This will also enable UTF-8 support in the PacketDemod and APRS Feature.

I don't think there's a problem with APRSWorker::send though, as it has a length parameter.


hessu · 2024-03-29T11:32:04Z

Thanks for the fixes.

I added a couple of comments on the commit. Most importantly, the code looks for a CR+LF sequence instead of looking for either a CR or an LF. To get proper protection against command injection, the code needs to look for either a CR or an LF, truncate at the first occurrence of either, and always append a CR + LF.

hessu · 2024-03-29T11:35:57Z

The UTF-8 decoding still done elsewhere in the code will probably fail on packets which are not valid UTF-8, but I'm not familiar with Qt so I'm not sure what the results will be. There are quite a lot of packets out there with unprintable byte sequences, so debug dumps of packets are best served by a logging method which shows non-ASCII bytes as their hexadecimal representations, like "<0x1F><0x1E>".
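A logging helper along those lines might look like the sketch below. It is plain C++ rather than Qt, and the name `escapeForLog` is hypothetical. Printable ASCII is kept as-is, while every other byte (control characters, NUL, and anything at or above 0x80) is written as `<0xNN>`.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: render a raw packet for a debug log. Printable ASCII
// (0x20..0x7E) is copied unchanged; every other byte is shown as <0xNN>.
std::string escapeForLog(const std::string& packet)
{
    std::string out;
    for (unsigned char c : packet) {
        if (c >= 0x20 && c <= 0x7E) {
            out.push_back(static_cast<char>(c));
        } else {
            char buf[8]; // "<0xNN>" is 6 characters plus the terminator
            std::snprintf(buf, sizeof buf, "<0x%02X>", static_cast<unsigned>(c));
            out += buf;
        }
    }
    return out;
}
```

The log output is then always plain ASCII, regardless of what the packet contained, and the original bytes can be read back from the escapes.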

UTF-8 decoding should generally be done for string fields after APRS decoding, for the text message & comment fields for example. Some valid packets might fail UTF-8 decoding because there's a weird Mic-E position packet byte sequence in there, but then have a valid UTF-8 encoded comment text at the end.
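To make the per-field point concrete, here is a small standalone UTF-8 validity check in plain C++. The name `isValidUtf8` is hypothetical and this is not how Qt or sdrangel does it. The idea is to run such a check (or the real decoder) only on the extracted comment or message text, never on the whole packet.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns true if the bytes form well-formed UTF-8.
// Rejects stray continuation bytes, truncated sequences, overlong forms,
// surrogate code points and values above U+10FFFF.
bool isValidUtf8(const std::string& s)
{
    static const unsigned long minCp[] = {0x0, 0x80, 0x800, 0x10000};
    const std::size_t n = s.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char c = static_cast<unsigned char>(s[i]);
        std::size_t extra;   // number of continuation bytes expected
        unsigned long cp;    // code point being assembled
        if (c < 0x80)                { extra = 0; cp = c; }
        else if ((c & 0xE0) == 0xC0) { extra = 1; cp = c & 0x1F; }
        else if ((c & 0xF0) == 0xE0) { extra = 2; cp = c & 0x0F; }
        else if ((c & 0xF8) == 0xF0) { extra = 3; cp = c & 0x07; }
        else return false;                       // invalid lead byte
        if (extra > n - i - 1) return false;     // sequence is truncated
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char cc = static_cast<unsigned char>(s[i + k]);
            if ((cc & 0xC0) != 0x80) return false; // not a continuation byte
            cp = (cp << 6) | (cc & 0x3F);
        }
        if (cp < minCp[extra]) return false;               // overlong form
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;    // surrogate
        if (cp > 0x10FFFF) return false;                   // out of range
        i += extra + 1;
    }
    return true;
}
```

Given a buffer made of a few arbitrary binary bytes (standing in here for position data, not a real Mic-E sequence) followed by a UTF-8 comment, the check fails on the whole buffer but passes on the comment alone, which is why decoding belongs after field extraction.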


srcejon · 2024-04-14T18:18:29Z

Should be fixed in 7.20

srcejon self-assigned this Mar 20, 2024

srcejon added a commit to srcejon/sdrangel that referenced this issue Mar 20, 2024

APRS: Fix forwarding of binary data to APRS-IS for f4exb#2028. Suppor…

68b833a

…t UTF-8. PacketDemod: Support UTF-8.

srcejon added a commit to srcejon/sdrangel that referenced this issue Apr 3, 2024

f4exb#2028 - Check for Cr or LF.

eac144a

srcejon mentioned this issue Apr 5, 2024

SID (Sudden Ionospheric Disturbance) Feature #2052

Merged

dforsi pushed a commit to dforsi/sdrangel that referenced this issue Apr 7, 2024

APRS: Fix forwarding of binary data to APRS-IS for f4exb#2028. Suppor…

7adf90c

…t UTF-8. PacketDemod: Support UTF-8.

dforsi pushed a commit to dforsi/sdrangel that referenced this issue Apr 7, 2024

f4exb#2028 - Check for Cr or LF.

79c8f90

srcejon closed this as completed Apr 14, 2024
