Memory leak in 1.6 with Lora32 RTL_433 Acurite 5 in 1 #1693

Closed
rknobbe opened this issue Jun 20, 2023 · 61 comments
Comments

@rknobbe
Copy link

rknobbe commented Jun 20, 2023

leak

Describe the bug
Memory slowly decreasing over the course of the day. Eventually becomes unresponsive.

To Reproduce
Installed lilygo_rtl-433 from the web installer. Configured to connect to my MQTT server. Plotting memory and uptime in HomeAssistant.

Expected behavior
Stable memory usage.

Screenshots
See attached

Environment (please complete the following information):

  • OpenMQTTGateway version used V1.6.0
  • rtl_433

@1technophile
Copy link
Owner

Did you have 1.5.1 installed before, or is this a new installation?
If it is an update, was the behavior different with 1.5.1?

@rknobbe
Copy link
Author

rknobbe commented Jun 21, 2023 via email

@rknobbe
Copy link
Author

rknobbe commented Jun 22, 2023

I stand corrected. v1.5.1 also goes offline and stops updating the rtl_433 topics, as well as the SYS: Uptime and SYS: FreeMemory topics

@1technophile
Copy link
Owner

Could you list the RF devices that are being decoded? It could help spot an issue with a particular decoder.

@rknobbe
Copy link
Author

rknobbe commented Jun 22, 2023

Could you list the RF devices that are being decoded? It could help spot an issue with a particular decoder.

Normally receiving from 2 AmbientWeather F007TH sensors, an Acurite-5n1, and 2 Oregon-THGR810 sensors.

After letting it run for a few more hours, I'm also getting signals from a number of nearby sensors, including Acurite-986, Acurite-609TXC, Acurite-Atlas, Interlogix-Security, Generic-Motion, Oregon-CM180i, Skylink_HA-434TL_motion, Acurite-606TX, and Springfield-Soil

@meeuuh92
Copy link

Hi, I've installed the latest firmware, 1.6.0, on my lilygo_rtl-433 and it freezes too, after five minutes (it reboots and loses WiFi). I don't know if it's a memory leak or the chacon/dio protocols. In fact, my chacon products are still not recognized.
Thank you for the great project and the progress.
Nicolas

@NorthernMan54
Copy link
Collaborator

@rknobbe Can you try disabling the Home Assistant discovery feature in case it is triggering the issue ? https://docs.openmqttgateway.com/use/gateway.html#auto-discovery

@rknobbe
Copy link
Author

rknobbe commented Jun 26, 2023

IMG_0295
Yes I have auto discovery turned off. Still leaking. See attached

@NorthernMan54
Copy link
Collaborator

@rknobbe I wish that the discovery setting was the issue, as determining which signal is triggering the leak is going to be hard. Is it possible, through a process of elimination, to determine which signal is triggering the leak?

@rknobbe
Copy link
Author

rknobbe commented Jun 27, 2023

The linux version of rtl_433 has commandline options to select and deselect which parsers to include. Does the OMG version have a similar feature? I haven't seen it in the docs. Otherwise I can selectively turn off some of my sensors. However there are way more stray signals (from neighbors' sensors) than intentional ones, and I can't disable those.

@rknobbe
Copy link
Author

rknobbe commented Jun 27, 2023

By the way, I've set up this automation in HomeAssistant to detect low memory condition and restart the gateway automatically. It's holding up well now.

alias: Reboot Mqtt gateway
description: "memory leak workaround "
trigger:
  - platform: numeric_state
    entity_id: sensor.sys_free_memory_2
    below: 50000
condition: []
action:
  - device_id: c55473586e86188afd9b4c00c838e76b
    domain: button
    entity_id: button.sys_restart_gateway_2
    type: press
mode: single

@NorthernMan54
Copy link
Collaborator

Unfortunately we don't have a feature to enable or disable particular parsers/decoders

@rknobbe
Copy link
Author

rknobbe commented Jun 27, 2023

Unfortunately we don't have a feature to enable or disable particular parsers/decoders

I really only care about the Acurite 5n1 sensor (and it's the only one of mine that I'd prefer not to disconnect since it's on my roof). Let me see if I can get the gateway and library to compile with just that sensor enabled.

@NorthernMan54
Copy link
Collaborator

Take a look in the code base for the directive MY_DEVICES, it is used to specify a subset of devices.

@rknobbe
Copy link
Author

rknobbe commented Jun 27, 2023

Take a look in the code base for the directive MY_DEVICES, it is used to specify a subset of devices.

I see MY_DEVICES in environments.ini, but it's not clear how I specify which DEVICES I want to be compiled in rtl_433_esp (or how to compile rtl_433_esp in VSCode, for that matter). Regardless, I've enabled MY_DEVICES and reflashed with default_envs=lilygo-rtl_433. Will let you know how the memory looks in the morning.

@NorthernMan54
Copy link
Collaborator

Sorry, I should have mentioned that MY_DEVICES is a directive for the rtl_433_ESP library. It is used in a couple of places in the library to allow testing of a subset of decoders. You would need to specify which decoders in the code.

To use a custom version of rtl_433_ESP do this

1 - git clone the rtl_433_ESP library from GitHub
2 - In your platformio environment file, add this

[libraries]
rtl_433_esp_local = symlink://../rtl_433_ESP  ; Builds library from source directory

The symlink needs to point to where you cloned rtl_433_ESP

3 - Then in your environment switch

${libraries.rtl_433_ESP} to ${libraries.rtl_433_esp_local}

@rknobbe
Copy link
Author

rknobbe commented Jun 27, 2023

@NorthernMan54 - thanks for the hint on how to get rtl_433_ESP listening to only specific device types.

This morning I recompiled v1.6.0 with only support for the Acurite decoders. I've been running most of the day, and the console plus MQTT-Explorer both give me confidence I'm only hearing my Acurite 5-n-1.

However the memory trend doesn't look encouraging.

just-acurite

@rknobbe
Copy link
Author

rknobbe commented Jun 28, 2023

I turned on MEMORY_DEBUG and watched for a while. I do eventually see what looks like a loss.

Here is a capture of about 1800 lines. Look at the section that starts at line 683.

@rknobbe
Copy link
Author

rknobbe commented Jun 28, 2023

Another trace with only Acurite 5n1 messages

@NorthernMan54
Copy link
Collaborator

A temporary dip in heap is expected as the OMG WebUI and Display caches messages for display. But the cache has a max depth, so that it doesn't leak heap memory.

I'm starting a long running test with the latest build in an attempt to recreate this. I also have an acurite device that uses the same decoder

N: Send on /RTL_433toMQTT/Acurite-Tower/B/2043 msg {"model":"Acurite-Tower","id":2043,"channel":"B","battery_ok":1,"temperature_C":17.4,"humidity":77,"mic":"CHECKSUM","protocol":"Acurite 592TXR Temp/Humidity, 592TX Temp, 5n1 Weather Station, 6045 Lightning, 899 Rain, 3N1, Atlas","rssi":-69,"duration":120001}

@NorthernMan54
Copy link
Collaborator

So far I have not been able to recreate this

image

@rknobbe
Copy link
Author

rknobbe commented Jun 29, 2023

So weird! I rebuilt again last night, eliminating the stray Acurite 686's that are in the neighborhood. This build only has the 5n1 decoder and a WH51 that doesn't seem to get picked up. Same slope of memory loss.
I wonder if there is some hardware problem with my Lora32 device that results in a memory leak?

Acurite-WH51

@NorthernMan54
Copy link
Collaborator

Isn't the WH51 using FSK encoding? If so, you won't be able to receive it at the same time as you are receiving OOK signals.

I was looking at your RTLCnt, and am wondering whether you have significantly more messages coming in compared to my setup, and whether this is triggering the leak.

Doing some math

             rknobbe   NorthernMan
RTLCnt       150000    6000
Uptime (s)   50000     20000

Per second   3         0.3
Per minute   180       18

Does 3 a second make sense to you ?
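The arithmetic in the table can be reproduced with a small helper. This is only a sketch: the function names are made up for illustration, and the inputs are the approximate RTLCnt and uptime values quoted above.

```c
/* Average number of received signals per second.
 * Returns 0 when uptime is 0 to avoid dividing by zero.
 * Hypothetical helper, not part of OpenMQTTGateway or rtl_433_ESP. */
double signals_per_second(unsigned long rtl_cnt, unsigned long uptime_s)
{
    return uptime_s ? (double)rtl_cnt / (double)uptime_s : 0.0;
}

/* Same rate expressed per minute. */
double signals_per_minute(unsigned long rtl_cnt, unsigned long uptime_s)
{
    return 60.0 * signals_per_second(rtl_cnt, uptime_s);
}
```

With the table's values, `signals_per_second(150000, 50000)` gives 3 and `signals_per_second(6000, 20000)` gives roughly 0.3, a tenfold difference between the two setups.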

@rknobbe
Copy link
Author

rknobbe commented Jun 29, 2023

rtlcount vs messages
(Timescale is about 17 minutes across the X axis)

I reset the gateway 4810 seconds ago and I'm already up to "RTLCnt": 12148. No, that volume of messages (~3 per second) doesn't make sense.

The 5n1 sends alternating messages of type 49 and type 56 (one has rain gauge readings, the other has temp and humidity). For some reason RTLCnt goes up by about 60 over the span of about 10 of those messages. Is RTLCnt a counter of decoded messages, or of interrupts, or what?

{
  "uptime": 4810,
  "version": "v1.6.0-rk1",
  "discovery": false,
  "ohdiscovery": false,
  "env": "lilygo-rtl_433",
  "freemem": 142036,
  "mqttport": "1883",
  "mqttsecure": false,
  "minfreemem": 75248,
  "tempc": 51.11111,
  "freestack": 3132,
  "rssi": -53,
  "SSID": "xx",
  "BSSID": "xx",
  "ip": "192.168.1.69",
  "mac": "xx",
  "actRec": 3,
  "mhz": 433.92,
  "RTLRssiThresh": -95,
  "RTLRssi": -105,
  "RTLAVGRssi": -104,
  "RTLCnt": 12148,
  "RTLOOKThresh": 15,
  "modules": [
    "LilyGo_SSD1306",
    "WebUI",
    "rtl_433"
  ]
}

@NorthernMan54
Copy link
Collaborator

NorthernMan54 commented Jun 29, 2023

RTLCnt is incremented for each signal received, and passed to the signal decoder.

I'm thinking we are getting closer to the cause of the leak; this high signal rate may be causing a race condition.

If you enable the directive PUBLISH_UNPARSED it should log high level details of each undecoded signal, and if you also include RAW_SIGNAL_DEBUG it will also log the pulse data. It may offer some insight into the source.

@rknobbe
Copy link
Author

rknobbe commented Jun 29, 2023

I'll recompile and upload tonight. I'm encouraged

@rknobbe
Copy link
Author

rknobbe commented Jun 30, 2023

I'm not seeing a huge volume of unparsed messages. Here are a few examples:

rtl_433_ESP(6): Pre free rtl_433_DecoderTask: 134740
rtl_433_ESP(6): Post free rtl_433_DecoderTask: 144436
rtl_433_ESP(6): rtl_433_DecoderTask uxTaskGetStackHighWaterMark: 7548
rtl_433_ESP(6): rtl_433_ReceiverTask uxTaskGetStackHighWaterMark: 236
rtl_433_ESP(6): rtl_433_ReceiverTask uxTaskGetStackHighWaterMark: 236
rtl_433_ESP(6): Pre copy out of train: 146684
rtl_433_ESP(6): Post copy out of train: 136988
rtl_433_ESP(6): RAW (177992): +0-57+376-576+448-580+380-1024+900-1084+900-1020+452-572+384-576+900-1084+836-1084+900-572+384-1092+892-1028+892-576+388-1084+900-572+384-576+388-1084+900-1084+384-580+892-1028+444-576+900-572+385-575+384-1092+893-1028+443-512+900-1084+900-572+384-1092+892-580+380-1024+900-636+384-580+380-576+384-1092+892-1028+444-512+454-570+385-575+900-573+383-576+388-636+384-577+387-572+384-576+388-636+384-576+388-572+384-576+452-1020+900-1084+900-1020+448-516+444-576+900-1020+896-1092+892-580+380-1024+900-1084+900-572+384-1092+892-580+380-576+384-1092+892-1028+444-576+836-1084+453-507+896-580+444-576+384-1028+892-1092+380-576+900-1084+580-60+256-768+192-1156+764-640+388-1020+901-571+389-635+388-572+384-1092+828-1092+444-512+128-904+376-577+895-581+379-576+388-573+384-639+384-584+384-568+388-637+321-646+381-575+380-576+384-1095+897-1016+844-1140+64-896+448-576+384-328+56-1152+900-1020+896-649+375-1024+900-1084+900-572+384-1028+892-640+388-572+384-1092+828-1092+444-512+900-1084+384-581+892-575+384-644+379-1025+900-1084+388-572+896-1028+892-644+380-1024+900-572+384-1092+892-580+380-576+384-644+380-1024+900-1084+384-580+445-511+448-580+445-0
rtl_433_ESP(6): Pre run_OOK_demods: 136988
rtl_433_ESP(6): Unparsed Signal length: 177992, Signal RSSI: -54, pulses: 137
N: Send on /RTL_433toMQTT/undecoded signal msg {"model":"undecoded signal","protocol":"signal parsing failed","duration":177992,"rssi":-54,"pulses":137}
rtl_433_ESP(6): RAW (177992): rtl_433_ESP(6): Signal processing time: 135179
rtl_433_ESP(6): Post run_ook_demods memory 134740
rtl_433_ESP(7): Process rtl_433_DecoderTask stack free: 7548

rtl_433_ESP(6): RAW (173996): +599-1020+900-1084+388-572+448-512+900-1084+900-572+384-580+380-576+448-580+380-1024+900-1084+900-1020+900-573+447-1028+448-508+900-1084+448-516+892-1092+380-576+448-518+442-512+900-572+448-576+388-1020+448-580+892-576+388-572+384-576+453-1019+896-1027+445-582+378-576+896-580+380-1088+900-572+384-1092+380-576+388-572+448-512+900-636+384-580+380-576+384-580+380-576+448-580+380-576+384-580+444-512+448-581+379-1024+964-1020+900-1084+384-580+380-576+900-1020+964-572+384-576+388-572+384-576+452-1020+901-1083+900-1021+895-579+381-1094+442-513+895-1091+381-576+899-1021+452-572+384-576+451-509+896-579+445-576+384-1027+445-512+963-573+384-579+381-576+384-1091+893-1027+445-576+383-580+892-576+388-1084+900-572+385-1027+444-577+383-581+443-512+900-572+448-576+389-571+384-576+389-571+450-510+455-569+385-575+384-580+446-574+384-1028+892-1092+892-1028+445-575+388-572+896-1028+892-581+443-576+384-580+380-576+384-1092+892-1028+892-1092+892-583+377-1088+384-580+892-1092+380-576+900-1020+448-517+443-577+383-580+892-576+389-571+448-1028+444-512+903-569+450-574+391-569+386-1092+890-1025+451-508+448-576+900-572+384-1028+956-580+380-1024+452-508+448-576+388-572+448-24516+60-580+60-1408+64-836+60-1348+60-256+64-578+62-64+64-384+64-389+59-64+65-1343+128-128+132-252+128-192+128-448+128-64+64-646+122-128+64-134 
rtl_433_ESP(6): Pre run_OOK_demods: 134664
rtl_433_ESP(6): Unparsed Signal length: 173996, Signal RSSI: -67, pulses: 159
N: Send on /RTL_433toMQTT/undecoded signal msg {"model":"undecoded signal","protocol":"signal parsing failed","duration":173996,"rssi":-67,"pulses":159} 
rtl_433_ESP(6): RAW (173996): rtl_433_ESP(6): Signal processing time: 147348
rtl_433_ESP(6): Post run_ook_demods memory 134228
rtl_433_ESP(7): Process rtl_433_DecoderTask stack free: 7548

rtl_433_ESP(6): Pre free rtl_433_DecoderTask: 134364
rtl_433_ESP(6): Post free rtl_433_DecoderTask: 144436
rtl_433_ESP(6): rtl_433_DecoderTask uxTaskGetStackHighWaterMark: 7548
rtl_433_ESP(7): Average RSSI Signal -90 dbm, adjusted RSSI Threshold -81, samples 50000
rtl_433_ESP(6): rtl_433_ReceiverTask uxTaskGetStackHighWaterMark: 236
rtl_433_ESP(6): rtl_433_ReceiverTask uxTaskGetStackHighWaterMark: 236
rtl_433_ESP(6): Pre copy out of train: 146684
rtl_433_ESP(6): Post copy out of train: 136988
rtl_433_ESP(6): RAW (189997): +360-568+384-644+380-576+384-580+380-640+320-644+380-576+384-580+380-576+448-582+378-1024+900-1084+900-1020+448-580+380-576+900-1084+836-1084+903-569+384-1092+892-1030+506-321+63-576+390-1082+897-579+386-570+384-1097+887-1093+379-576+900-1020+455-569+832-644+380-576+384-1093+392-563+388-573+449-510+904-1094+888-571+383-1103+881-576+384-578+386-572+384-644+396-560+384-576+64-896+384-1088+64-326+58-512+448-585+375-576+384-576+902-570+384-640+326-634+384-576+388-572+384-640+325-635+384-576+384-580+380-576+448-1029+891-1092+892-1028+444-512+452-572+896-1028+892-1092+892-580+380-1088+836-1084+900-572+384-1092+892-580+380-576+384-1092+892-1028+444-576+900-1020+448-516+892-576+388-636+384-1028+444-576+384-580+380-576+900-1084+902-570+386-1030+888-640+384-580+380-576+384-580+380-640+384-580+380-576+384-1092+380-576+452-508+448-576+390-570+896-580+380-576+384-580+380-640+384-580+380-576+384-580+444-576+384-582+378-578+382-1092+892-1028+892-1092+380-576+448-516+892-1092+892-1028+892-640+388-1020+902-1082+900-572+385-1091+828-640+384-580+380-1088+836-1084+452-508+896-1092+380-576+900-572+384-580+444-1024+452-508+448-576+388-572+896-1028+892-644+380-1024+900-573+383-646+378-576+384-576+388-572+384-640+388-572+384-1092+380-576+384-580+444-512+448-580+444-0 
rtl_433_ESP(6): Pre run_OOK_demods: 136988
rtl_433_ESP(6): Unparsed Signal length: 189997, Signal RSSI: -54, pulses: 153
N: Send on /RTL_433toMQTT/undecoded signal msg {"model":"undecoded signal","protocol":"signal parsing failed","duration":189997,"rssi":-54,"pulses":153}
rtl_433_ESP(6): RAW (189997): rtl_433_ESP(6): Signal processing time: 145736
rtl_433_ESP(6): Post run_ook_demods memory 134748
rtl_433_ESP(7): Process rtl_433_DecoderTask stack free: 7548

@rknobbe
Copy link
Author

rknobbe commented Jun 30, 2023

I'm seeing a much slower pace of RTLCnt with this new build too, and I don't understand why. Maybe whatever was blasting me this afternoon has turned off.

Here are the first 5 minutes or so from the WebUI console.
Edit: oops, looks like the first part of the console log got truncated and this is only the last couple of minutes.

You'll notice there is about a 2:1 ratio of unrecognized frames to Acurite 5n1 frames. After about 1000 seconds RTLCnt is up to about 245.

image

@rknobbe
Copy link
Author

rknobbe commented Jun 30, 2023

I'm not sure what to make of the data I'm seeing. Not many "undecoded signal" messages in the terminal trace, but other than the count in MQTT_Explorer I don't know how to count them. After about 4000 seconds I have about 200 messages from my 5n1 weather station and RTLCnt is over 6000. I also captured the "pulses" value from the undecoded_signal topic, but I don't know if that will help you.

image

@NorthernMan54
Copy link
Collaborator

In an attempt to minimize the number of variables and components involved with this, does it make sense to try a build of just the rtl_433_ESP example receiver ? To ensure that the leak is within rtl_433_ESP and not within OMG.

With the OOK_Receiver example, you would need to set the appropriate compiler directives etc.

I was also looking at the logic around memory usage, and each received signal allocates some heap for storing the signal for processing, passes it to the decoder logic, then the decoder logic frees it after processing. I did not see any obvious leaks.
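The allocate, hand off, free lifecycle described above can be sketched as follows. This is not the rtl_433_ESP code: `signal_buf`, `signal_capture`, `signal_decode_and_free` and the `live_buffers` counter are hypothetical names, and the "decode" step is a placeholder. The point is only that the decoder side must release the buffer on every path, which a counter of outstanding buffers makes easy to check.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a captured pulse train. */
typedef struct {
    size_t num_pulses;
    int32_t *pulses; /* pulse/gap durations in microseconds */
} signal_buf;

/* Number of buffers allocated but not yet freed. */
int live_buffers = 0;

/* Receiver side: allocate a buffer and copy the captured pulses into it. */
signal_buf *signal_capture(const int32_t *raw, size_t n)
{
    if (!raw || n == 0) return NULL;
    signal_buf *s = malloc(sizeof *s);
    if (!s) return NULL;
    s->pulses = malloc(n * sizeof *s->pulses);
    if (!s->pulses) {
        free(s);
        return NULL;
    }
    memcpy(s->pulses, raw, n * sizeof *s->pulses);
    s->num_pulses = n;
    live_buffers++;
    return s;
}

/* Decoder side: consume the buffer, then free it whether or not
 * decoding succeeded. Returns 1 if "decoded", 0 otherwise. */
int signal_decode_and_free(signal_buf *s)
{
    if (!s) return 0;
    int decoded = s->num_pulses >= 8; /* placeholder for real demodulation */
    free(s->pulses);
    free(s);
    live_buffers--;
    return decoded;
}
```

If `live_buffers` returns to zero after each signal, this part of the pipeline is balanced and the leak has to be elsewhere, for example in how decoded data is post-processed.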

@github-actions github-actions bot closed this as completed Sep 8, 2023
@khorovatin
Copy link

khorovatin commented Oct 27, 2023

Has there been any resolution to this? I have the same issue.
My workaround is to have my Home Assistant server issue a reboot message to the device every six hours.

@ianmtaylor1
Copy link
Contributor

Sorry for the bump. My neighbor has an Acurite 5-in-1 that is within range of half of my house. If I move my device (ESP32/CC1101, RTL_433 receiver running) within range of the 5-in-1, while still running, the memory leak starts. When I move the device out of range again, while still running, the memory leak stops. If a single RTL_433 decoder is responsible for the leak the Acurite 5-in-1 is a likely culprit.

@rknobbe
Copy link
Author

rknobbe commented Nov 16, 2023

I've resigned myself to having to automatically reset the device when freemem falls below 50k. Recent data:
image

@1technophile
Copy link
Owner

Sorry for the bump. My neighbor has an Acurite 5-in-1 that is within range of half of my house. If I move my device (ESP32/CC1101, RTL_433 receiver running) within range of the 5-in-1, while still running, the memory leak starts. When I move the device out of range again, while still running, the memory leak stops. If a single RTL_433 decoder is responsible for the leak the Acurite 5-in-1 is a likely culprit.

Thanks for this it helps, now the next step is to analyze the corresponding decoder in RTL_433 project to identify any memory leak

@1technophile 1technophile changed the title from "Memory leak in 1.6 with Lora32 433" to "Memory leak in 1.6 with Lora32 RTL_433 Acurite 5 in 1" Nov 16, 2023
@1technophile 1technophile reopened this Nov 16, 2023
@drdelaney
Copy link

I do not have the ability to move mine outside of the range, but I also have a neighbor with a 5-in-1 and am also experiencing this memory leak.
I have it rebooting when the memory usage gets too high as well.

@rknobbe
Copy link
Author

rknobbe commented Nov 16, 2023

I'll rebuild one of my gateways this weekend with the 5-in-1 removed and compare with one that has it enabled.

@NorthernMan54
Copy link
Collaborator

@rknobbe An easy way to disable a device decoder is to add the disabled flag - FYI - https://github.com/NorthernMan54/rtl_433_ESP/blob/3fea1cf678212ea5ef70e38f625bc8505f73bb28/src/rtl_433/devices/mebus.c#L90

I spent some time yesterday doing a code review of the acurite device decoder, and nothing immediately jumped out at me to say this is it. I'm still reviewing.
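As a rough illustration of what a per-decoder disabled flag does, here is a simplified sketch. The `decoder_entry` struct and `select_enabled` function are hypothetical; the real decoder descriptor in rtl_433 carries many more fields, and the linked line in mebus.c is the authoritative example of how the flag is set.

```c
#include <stddef.h>

/* Hypothetical, simplified decoder descriptor. */
typedef struct {
    const char *name;
    int disabled; /* non-zero: decoder is skipped */
} decoder_entry;

/* Copy pointers to the enabled decoders into `out` (which must have room
 * for `n` entries) and return how many were selected. */
size_t select_enabled(const decoder_entry *all, size_t n,
                      const decoder_entry **out)
{
    size_t kept = 0;
    for (size_t i = 0; i < n; i++) {
        if (!all[i].disabled)
            out[kept++] = &all[i];
    }
    return kept;
}
```

Marking one entry as disabled and rebuilding is a quick way to compare heap trends with and without a suspect decoder.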

@danieljkemp
Copy link

I have encountered this as well. We thought that the radio device was bad (flashed with lilygo-rtl_433), but when I swapped it out with a known stable device the same crash was encountered.

@ianmtaylor1
Copy link
Contributor

I'm trying to use a JTAG debugger to examine this problem further, but I'm having trouble following these Espressif instructions for host-based heap tracing. I can't find how to set the configuration options in the first few steps:

  • CONFIG_HEAP_TRACING_DEST
  • CONFIG_APPTRACE_DESTINATION1
  • CONFIG_APPTRACE_SV_ENABLE

Is there a guide available for how to use the esp-idf menuconfig within platformio?

@NorthernMan54
Copy link
Collaborator

@ianmtaylor1 Sorry I have no experience with the JTAG Debugger, I just use print statements in code ( I know that's primitive, but it works most times. )

With the Acurite 5 in 1, I do have 5 n 1 devices in my setup, and I do not have a memory leak in my setup.

@1technophile
Copy link
Owner

1technophile commented Dec 9, 2023

Is there a guide available for how to use the esp-idf menuconfig within platformio?

I think the question here is how to use menuconfig with the Arduino framework. This may be helpful:
https://community.platformio.org/t/modifying-arduino-esp32-setup-with-menuconfig/20040/3

@kjetilsn
Copy link

kjetilsn commented Dec 11, 2023

Hello,
Have similar memory leak with rtl_433 with the following modules enabled:

#define DEVICES \
DECL(oregon_scientific)          \
DECL(interlogix)                 \
DECL(rubicson)                   \
DECL(ts_ft002)                   \
DECL(smoke_gs558)                \
DECL(kerui)                      \
DECL(nexus)                      \

I also run it in combination with BT, but the issue is still there with BT disabled.
I will try compiling with one module at a time to see whether the leak is related to a single module.

@drdelaney
Copy link

I have a neighbor with a lot of interlogix devices that show up too. I think I've seen one or two Oregon scientific devices.

@kjetilsn
Copy link

Disabling the Oregon module did the trick:
image

@NorthernMan54
Copy link
Collaborator

@kjetilsn Any thoughts on what within the Oregon Scientific device decoder is triggering this ?

@kjetilsn
Copy link

@NorthernMan54
Unfortunately no. I have two active Oregon sensors, a WGR800 and a PCR800. I will try one at a time and see whether the issue can be isolated to one of them or is common to both. I'm not sure how to debug the decoder, though.

@NorthernMan54
Copy link
Collaborator

Identifying memory leaks sometimes can be tricky, my approach is to first review the code for anything that looks suspicious, then go from there.

The Oregon decoder is a bit complex, so identifying where to look first is advantageous, but your two devices appear to use this function oregon_scientific_v3_decode so starting there makes sense
https://github.com/NorthernMan54/rtl_433_ESP/blob/3fea1cf678212ea5ef70e38f625bc8505f73bb28/src/rtl_433/devices/oregon_scientific.c#L598

Looking at the function, this line looks like a possibility: unsigned char msg[EXPECTED_NUM_BYTES] = {0};

https://github.com/NorthernMan54/rtl_433_ESP/blob/3fea1cf678212ea5ef70e38f625bc8505f73bb28/src/rtl_433/devices/oregon_scientific.c#L612

It allocates 44 bytes on every invocation, and I'm not sure whether it is cleaned up.

You could try moving it outside the function, i.e. before the function starts, so it only gets invoked once.

@pconroy328
Copy link

pconroy328 commented Dec 14, 2023 via email

@ianmtaylor1
Copy link
Contributor

ianmtaylor1 commented Dec 15, 2023

I've found a memory leak, but I don't know for sure if it's the only memory leak. At least, my problems appear to have gone away.

The leak is here, in the function responsible for converting units of received signals. While converting inches to millimeters, we get the following section:

else if ((d->type == DATA_DOUBLE) &&
         (str_endswith(d->key, "_in") || str_endswith(d->key, "_inch"))) {
    d->value.v_dbl = inch2mm(d->value.v_dbl);
    char* new_label =
        str_replace(str_replace(d->key, "_inch", "_in"), "_in", "_mm");
    free(d->key);
    d->key = new_label;
    char* new_format_label = str_replace(d->format, "in", "mm");
    free(d->format);
    d->format = new_format_label;
}

The function str_replace allocates memory for the new string, and so nesting the calls like this creates a leak.
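To make the nesting problem concrete, here is a self-contained sketch. It does not use the rtl_433 sources: `replace_copy` is a hypothetical stand-in for `str_replace` that likewise returns a newly allocated string, and `live_allocs` is an illustrative counter of outstanding allocations. `rename_nested` mirrors the nested pattern from the snippet above, and `rename_fixed` shows one possible way to avoid the leak by keeping the intermediate string in a variable and freeing it. The actual patch may differ.

```c
#include <stdlib.h>
#include <string.h>

/* Outstanding allocations, so a leak becomes observable. */
int live_allocs = 0;

void *tracked_malloc(size_t n)
{
    void *p = malloc(n);
    if (p) live_allocs++;
    return p;
}

void tracked_free(void *p)
{
    if (p) {
        live_allocs--;
        free(p);
    }
}

/* Hypothetical stand-in for str_replace: returns a NEW string with every
 * occurrence of `rep` replaced by `with`. The caller owns the result. */
char *replace_copy(const char *orig, const char *rep, const char *with)
{
    if (!orig || !rep || !with) return NULL;
    size_t rep_len = strlen(rep), with_len = strlen(with), count = 0;
    if (rep_len > 0)
        for (const char *p = strstr(orig, rep); p; p = strstr(p + rep_len, rep))
            count++;
    char *out = tracked_malloc(strlen(orig) - count * rep_len + count * with_len + 1);
    if (!out) return NULL;
    char *dst = out;
    const char *hit;
    while (rep_len > 0 && (hit = strstr(orig, rep)) != NULL) {
        size_t n = (size_t)(hit - orig);
        memcpy(dst, orig, n);            /* text before the match */
        memcpy(dst + n, with, with_len); /* the replacement */
        dst += n + with_len;
        orig = hit + rep_len;
    }
    strcpy(dst, orig); /* remaining tail, including the terminator */
    return out;
}

/* Nested pattern: the inner result has no owner and is never freed. */
char *rename_nested(const char *key)
{
    return replace_copy(replace_copy(key, "_inch", "_in"), "_in", "_mm");
}

/* Fixed pattern: hold the intermediate string, then free it. */
char *rename_fixed(const char *key)
{
    char *tmp = replace_copy(key, "_inch", "_in");
    char *out = replace_copy(tmp, "_in", "_mm");
    tracked_free(tmp);
    return out;
}
```

After calling `rename_nested` and freeing its result, `live_allocs` stays at 1, so every converted message loses one small string. With `rename_fixed` the counter returns to 0. A leak of a few bytes per rain reading matches the slow, steady decline reported in this thread.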

Changing this line appears to have fixed the leak, at least for me. When I enabled MEMORY_DEBUG in rtl_433_ESP, I noticed the leak was only happening on type 49 messages (wind speed, wind direction, rainfall) and not type 56 (wind speed, temperature, humidity) messages.

This explains why it was so hard to pin down a specific device, because it was happening during decoding/output but not in any specific decoder. @kjetilsn does your Oregon Scientific device measure rainfall? @NorthernMan54 are you converting units to SI or no? If not, it could explain why you aren't experiencing this with your 5-in-1.

Like I said, I don't know if this is the only leak, but this is definitely a leak. I'm happy to submit a pull request.

@NorthernMan54
Copy link
Collaborator

In the library, I believe I hard-coded metric mode, which is why everyone is seeing the issue.

As for my devices, I have another sensor that uses the same acurite device decoders, but it does not report rainfall. That is why I never noticed this.

Will get this released into rtl_433_ESP over the next few days, it will also need an update on the OMG side as well so stay tuned.

@kjetilsn
Copy link

Thank you @NorthernMan54 for the advice on debugging. I did not get around to testing it, though.
@ianmtaylor1, that is correct: one of the Oregon sensors is a rain bucket sensor, the other one is a wind sensor. I've applied your patch just now, and it looks like it has solved the leak. Thank you for the fix!

NorthernMan54 added a commit to NorthernMan54/rtl_433 that referenced this issue Dec 18, 2023
@NorthernMan54
Copy link
Collaborator

FYI - The same fix was implemented in rtl_433, well done @ianmtaylor1
