Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generic RTL8710BX doesn't start after a reboot #24

Closed
mihsu81 opened this issue Sep 25, 2022 · 31 comments
Closed

Generic RTL8710BX doesn't start after a reboot #24

mihsu81 opened this issue Sep 25, 2022 · 31 comments
Labels
bug Something isn't working

Comments

@mihsu81
Copy link
Contributor

mihsu81 commented Sep 25, 2022

Hi @kuba2k2,

The Generic RTL8710BX (mentioned in my previous support requests) doesn't start after sending a restart command or after an OTA firmware upgrade. https://docs.libretuya.ml/boards/generic-rtl8710bx-4mb-980k/

If I unplug it and plug it again it starts up.

[09:53:04][D][button:013]: 'PLUG-EZVIZ Restart' Pressed.
[09:53:04][I][restart.button:012]: Restarting device...
[09:53:04][I][app:134]: Rebooting safely...
[09:53:04][D][lt.preferences:104]: Saving 1 preferences to flash...
[09:53:04][V][lt.preferences:115]: sync: key: 233825507, len: 4
[09:53:04][D][lt.preferences:133]: Saving 1 preferences to flash: 0 cached, 1 written, 0 failed
←[32mINFO Disconnected from ESPHome API for plug-ezviz.local←[0m
←[33mWARNING Disconnected from API←[0m
←[33mWARNING Can't connect to ESPHome API for plug-ezviz.local: Error connecting to ('192.168.45.123', 6053): [WinError 121] The semaphore timeout period has expired←[0m
←[32mINFO Trying to reconnect to plug-ezviz.local in the background←[0m
←[32mINFO Successfully connected to plug-ezviz.local←[0m
[09:55:04][V][json:033]: Attempting to allocate 512 bytes for JSON serialization
[09:55:04][V][json:053]: Size after shrink 60 bytes
[09:55:04][V][json:033]: Attempting to allocate 512 bytes for JSON serialization
[09:55:04][V][json:053]: Size after shrink 80 bytes
[09:55:04][V][json:033]: Attempting to allocate 512 bytes for JSON serialization
[09:55:04][V][json:053]: Size after shrink 152 bytes
[09:55:04][V][json:033]: Attempting to allocate 512 bytes for JSON serialization
[09:55:04][V][json:053]: Size after shrink 156 bytes
[09:55:04][V][json:033]: Attempting to allocate 512 bytes for JSON serialization
[09:55:04][V][json:053]: Size after shrink 148 bytes
[09:55:04][V][json:033]: Attempting to allocate 512 bytes for JSON serialization
[09:55:04][V][json:053]: Size after shrink 168 bytes
[09:55:17][V][wifi_lt:276]: Event: Connected ssid='SunnyDay' bssid=FC:EF:03:10:01:01 channel=4, authmode=WPA PSK
[09:58:37][I][ota:109]: Boot seems successful, resetting boot loop counter.
[09:58:37][D][lt.preferences:104]: Saving 1 preferences to flash...
[09:58:37][V][lt.preferences:115]: sync: key: 233825507, len: 4
[09:58:37][D][lt.preferences:133]: Saving 1 preferences to flash: 0 cached, 1 written, 0 failed
@mihsu81
Copy link
Contributor Author

mihsu81 commented Sep 25, 2022

And if I make major modifications in the yaml configuration file the device won't start at all after an OTA update.
I have to flash the same firmware file over UART.

@kuba2k2
Copy link
Member

kuba2k2 commented Sep 25, 2022

That's weird. Did you compile the yaml with the correct board specified? If you do not, it may overwrite wrong parts of the flash during an update.

@mihsu81
Copy link
Contributor Author

mihsu81 commented Sep 25, 2022

I'm using "generic-rtl8710bx-4mb-980k", which I'd say is the correct one. The same firmware.uf2 is working fine when flashed over UART using ltchiptool.
Also using the wrong board wouldn't explain the fact that it doesn't start after a "restart" command.

python -m ltchiptool uf2 upload "C:\Users\Bethayn-Dell\OneDrive - TUI\Documents\GitHub\esphome\libretuya-esphome\.esphome\build\plug-ezviz\.pioenvs\plug-ezviz\firmware.uf2" uart COM7
|-- esphome 2022.10.0-dev @ 2022-09-25 10:13:32 -> generic-rtl8710bx-4mb-980k
|-- Using UART
|   |-- Connecting to COM7...
|   |-- Flashing image to OTA 2...
|   |-- Writing 558976 bytes to 0x8100000
|-- Finished in 14.129 s

@kuba2k2
Copy link
Member

kuba2k2 commented Sep 26, 2022

Could you do a flash dump right after doing OTA upgrade? (an upgrade after which it fails to boot, even after reset)

And if you can, share the uf2 file you used for that specific update as well. Note: it may contain your WiFi password, so use a dummy YAML if you consider that sensitive.

@mihsu81
Copy link
Contributor Author

mihsu81 commented Sep 26, 2022

Here are the dump file dump_ezviz_libretuya_2.zip and the firmware file firmware.zip.

And something I should have checked and sent from the beginning, the logs of the device after a "breaking" OTA.
The device seems to crash with RTL8195A[HAL]: Hard Fault Error!!!!.

<RTL8195A>
ROM:
<RTL8195A>
ROM:[V0.1]


FLASHRATE:4


BOOT TYPE:0 XTAL:40000000


IMG1 DATA[1128:10002000]


IMG1 ENTRY[8000521:100021ef]


IMG1 ENTER


CHIPID[000000f6]


read_mode idx:3, flash_speed idx:3


calibration_result:[1:25:13][1:19] 


calibration_result:[2:25:13][1:19] 


calibration_result:[3:0:0][ff:ff] 


calibration_ok:[1:25:13] 


FLASH CALIB[NEW OK]


OTA2 ADDR[8100000]


OTAx SELE[ffff8000]


OTA2 USE


OTA2 SIGN[35393138:31313738]


IMG2 DATA[0x8187840:3812:0x10005000]


IMG2 SIGN[RTKWin(10005008)]


IMG2 ENTRY[0x10005000:0x812d0c5]


System_Init1


<RTL8195A>
ROM:[V0.1]


FLASHRATE:4


BOOT TYPE:0 XTAL:40000000


IMG1 DATA[1128:10002000]


IMG1 ENTRY[8000521:100021ef]


IMG1 ENTER


CHIPID[000000f6]


read_mode idx:3, flash_speed idx:3


calibration_result:[1:25:13][1:19] 


calibration_result:[2:25:13][1:19] 


calibration_result:[3:0:0][ff:ff] 


calibration_ok:[1:25:13] 


FLASH CALIB[NEW OK]


OTA2 ADDR[8100000]


OTAx SELE[ffff8000]


OTA2 USE


OTA2 SIGN[35393138:31313738]


IMG2 DATA[0x8187840:3812:0x10005000]


IMG2 SIGN[RTKWin(10005008)]


IMG2 ENTRY[0x10005000:0x812d0c5]


System_Init1


OSC8M: 800574 

boot reason: 0 

System_Init2

[1B][1;32mI [      0.000] [1B][0mLibreTuya v0.10.0+sha.b404f9f.dirty on generic-rtl8710bx-4mb-980k, compiled at Sep 24 2022 21:43:44
[1B][0;32m[I][logger:283]: Log initialized[1B][0m
[1B][0;35m[C][ota:473]: There have been 6 suspected unsuccessful boot attempts.[1B][0m
[1B][0;36m[D][lt.preferences:104]: Saving 1 preferences to flash...[1B][0m
[1B][0;37m[V][lt.preferences:115]: sync: key: 233825507, len: 4[1B][0m
[1B][0;36m[D][lt.preferences:133]: Saving 1 preferences to flash: 0 cached, 1 written, 0 failed[1B][0m
[1B][0;32m[I][app:029]: Running through setup()...[1B][0m
[1B][0;37m[V][app:030]: Sorting components by setup priority...[1B][0m
[1B][0;37m[V][text_sensor:016]: 'PLUG-EZVIZ MAC Wifi Address': Received new state 64:F2:FB:3B:AF:1C[1B][0m
[1B][0;36m[D][text_sensor:067]: 'PLUG-EZVIZ MAC Wifi Address': Sending state '64:F2:FB:3B:AF:1C'[1B][0m
[1B][0;35m[C][wifi:037]: Setting up WiFi...[1B][0m
[1B][0;37m[V][wifi_lt:034]: Enabling STA.[1B][0m

RTL8195A[HAL]: Hard Fault Error!!!!


RTL8195A[HAL]: R0 = 0x1


RTL8195A[HAL]: R1 = 0x40


RTL8195A[HAL]: R2 = 0x100169de


RTL8195A[HAL]: R3 = 0x815191d


RTL8195A[HAL]: R12 = 0xffffff


RTL8195A[HAL]: LR = 0x815418d


RTL8195A[HAL]: PC = 0x10017d18


RTL8195A[HAL]: PSR = 0x20000000


RTL8195A[HAL]: BFAR = 0xe000ed38


RTL8195A[HAL]: CFSR = 0x20000


RTL8195A[HAL]: HFSR = 0x40000000


RTL8195A[HAL]: DFSR = 0x0


RTL8195A[HAL]: AFSR = 0x0


RTL8195A[HAL]: PriMask 0x0


RTL8195A[HAL]: BasePri 0x0


RTL8195A[HAL]: SVC priority: 0x00


RTL8195A[HAL]: PendSVC priority: 0xf0


RTL8195A[HAL]: Systick priority: 0xf0


<RTL8195A>
ROM:[V0.1]


FLASHRATE:4


BOOT TYPE:0 XTAL:40000000


IMG1 DATA[1128:10002000]


IMG1 ENTRY[8000521:100021ef]


IMG1 ENTER


CHIPID[000000f6]


read_mode idx:3, flash_speed idx:3


calibration_result:[1:25:13][1:19] 


calibration_result:[2:25:13][1:19] 


calibration_result:[3:0:0][ff:ff] 


calibration_ok:[1:25:13] 


FLASH CALIB[NEW OK]


OTA2 ADDR[8100000]


OTAx SELE[ffff8000]


OTA2 USE


OTA2 SIGN[35393138:31313738]


IMG2 DATA[0x8187840:3812:0x10005000]


IMG2 SIGN[RTKWin(10005008)]


IMG2 ENTRY[0x10005000:0x812d0c5]


System_Init1


OSC8M: 7ff808 

boot reason: 1 

System_Init2

[1B][1;32mI [      0.000] [1B][0mLibreTuya v0.10.0+sha.b404f9f.dirty on generic-rtl8710bx-4mb-980k, compiled at Sep 24 2022 21:43:44
[1B][0;32m[I][logger:283]: Log initialized[1B][0m
[1B][0;35m[C][ota:473]: There have been 7 suspected unsuccessful boot attempts.[1B][0m
[1B][0;36m[D][lt.preferences:104]: Saving 1 preferences to flash...[1B][0m
[1B][0;37m[V][lt.preferences:115]: sync: key: 233825507, len: 4[1B][0m
[1B][0;36m[D][lt.preferences:133]: Saving 1 preferences to flash: 0 cached, 1 written, 0 failed[1B][0m
[1B][0;32m[I][app:029]: Running through setup()...[1B][0m
[1B][0;37m[V][app:030]: Sorting components by setup priority...[1B][0m
[1B][0;37m[V][text_sensor:016]: 'PLUG-EZVIZ MAC Wifi Address': Received new state 64:F2:FB:3B:AF:1C[1B][0m
[1B][0;36m[D][text_sensor:067]: 'PLUG-EZVIZ MAC Wifi Address': Sending state '64:F2:FB:3B:AF:1C'[1B][0m
[1B][0;35m[C][wifi:037]: Setting up WiFi...[1B][0m
[1B][0;37m[V][wifi_lt:034]: Enabling STA.[1B][0m

RTL8195A[HAL]: Hard Fault Error!!!!


RTL8195A[HAL]: R0 = 0x1


RTL8195A[HAL]: R1 = 0x40


RTL8195A[HAL]: R2 = 0x100169de


RTL8195A[HAL]: R3 = 0x815191d


RTL8195A[HAL]: R12 = 0xffffff


RTL8195A[HAL]: LR = 0x815418d


RTL8195A[HAL]: PC = 0x10017d18


RTL8195A[HAL]: PSR = 0x20000000


RTL8195A[HAL]: BFAR = 0xe000ed38


RTL8195A[HAL]: CFSR = 0x20000


RTL8195A[HAL]: HFSR = 0x40000000


RTL8195A[HAL]: DFSR = 0x0


RTL8195A[HAL]: AFSR = 0x0


RTL8195A[HAL]: PriMask 0x0


RTL8195A[HAL]: BasePri 0x0


RTL8195A[HAL]: SVC priority: 0x00


RTL8195A[HAL]: PendSVC priority: 0xf0


RTL8195A[HAL]: Systick priority: 0xf0

@kuba2k2
Copy link
Member

kuba2k2 commented Sep 26, 2022

That's interesting. It seems like it starts, but crashes while trying to enable wifi. Does it start after a power cycle?

You can change the loglevel to info or debug, as shown in the docs.

@mihsu81
Copy link
Contributor Author

mihsu81 commented Sep 27, 2022

Adding "loglevel" breaks the linking process.

arm-none-eabi-objcopy -I binary -O elf32-littlearm -B arm C:\Users\Bethayn-Dell\.platformio\platforms\libretuya\platform\realtek-ambz\bin\boot_all_C556.bin .pioenvs\plug-ezviz\boot_all.o
Archiving .pioenvs\plug-ezviz\libambz_mbedtls.a
Archiving .pioenvs\plug-ezviz\libboot_all.a
Archiving .pioenvs\plug-ezviz\libambz_fixups.a
Linking .pioenvs\plug-ezviz\firmware.elf
.pioenvs\plug-ezviz\libretuya_api.a(main.cpp.o): In function `main':
C:/Users/Bethayn-Dell/.platformio/platforms/libretuya/arduino/libretuya/core/main.cpp:58: undefined reference to `LibreTuya::getResetReason()'
collect2.exe: error: ld returned 1 exit status
|-- Image 1: firmware.ota1.elf
Generated file not found: .pioenvs\plug-ezviz\firmware.ota1.elf
*** [.pioenvs\plug-ezviz\firmware.elf] Error 1
============================================= [FAILED] Took 151.23 seconds =============================================
substitutions:
  devicename: plug_ezviz
  friendly_name: PLUG-EZVIZ
  device_description: EZVIZ TT1 Smart Plug CS-T31-16B-EU 

esphome:
  name: plug-ezviz

libretuya:
  board: generic-rtl8710bx-4mb-980k
  framework:
    version: latest
  loglevel: verbose
  sdk_silent: false
  gpio_recover: true

# Enable logging
logger:
  level: VERBOSE

# Enable Home Assistant API
api:
  reboot_timeout: 1h
  encryption:
    key: xxx

ota:
  password: xxx

web_server:
  port: 80
  auth:
    username: xxx
    password: xxx
  include_internal: true

wifi:
  ssid: "xxx"
  password: "xxx"

  ap:
    ssid: "${friendly_name} Fallback"

captive_portal:

text_sensor:
  - platform: wifi_info
    ip_address:
      name: ${friendly_name} IP Address
    ssid:
      name: ${friendly_name} Connected SSID
    bssid:
      name: ${friendly_name} Connected BSSID
    mac_address:
      name: ${friendly_name} MAC Wifi Address

button:
  - platform: restart
    name: ${friendly_name} Restart

sensor:
  - platform: wifi_signal
    name: ${friendly_name} Wifi Signal
    update_interval: 60s
    id: ${devicename}_wifi_signal

@kuba2k2
Copy link
Member

kuba2k2 commented Sep 27, 2022

I've added a dummy reset reason function, so it should compile with the latest commit.

@mihsu81
Copy link
Contributor Author

mihsu81 commented Sep 28, 2022

Adding loglevel seemed to have stopped somewhat the Hard Fault errors, but I did manage to replicate a couple of times.
Even with logging set to verbose, there isn't much info.

Case 1 - The yaml file is the same as in my previous post. The change was adding _ loglevel: verbose_.
libretuya1.zip



<RTL8195A>
ROM:[V0.1]


FLASHRATE:4


BOOT TYPE:0 XTAL:40000000


IMG1 DATA[1128:10002000]


IMG1 ENTRY[8000521:100021ef]


IMG1 ENTER


CHIPID[000000f6]


read_mode idx:3, flash_speed idx:3


calibration_result:[1:25:13][1:19] 


calibration_result:[2:25:13][1:19] 


calibration_result:[3:0:0][ff:ff] 


calibration_ok:[1:25:13] 


FLASH CALIB[NEW OK]


OTA2 ADDR[8100000]


OTAx SELE[fffe0000]


OTA2 USE


OTA2 SIGN[35393138:31313738]


IMG2 DATA[0x8188180:3812:0x10005000]


IMG2 SIGN[RTKWin(10005008)]


IMG2 ENTRY[0x10005000:0x812d569]


System_Init1


boot reason: ï[03][10][01][01]
RTL8195A[HAL]: Hard Fault Error!!!!


RTL8195A[HAL]: R0 = 0x1003ef78


RTL8195A[HAL]: R1 = 0x0


RTL8195A[HAL]: R2 = 0x0


RTL8195A[HAL]: R3 = 0x3411


RTL8195A[HAL]: R12 = 0x9cdd


RTL8195A[HAL]: LR = 0x812d531


RTL8195A[HAL]: PC = 0x0


RTL8195A[HAL]: PSR = 0x40000200


RTL8195A[HAL]: BFAR = 0xe000ed38


RTL8195A[HAL]: CFSR = 0x20000


RTL8195A[HAL]: HFSR = 0x40000000


RTL8195A[HAL]: DFSR = 0x0


RTL8195A[HAL]: AFSR = 0x0


RTL8195A[HAL]: PriMask 0x0


RTL8195A[HAL]: BasePri 0x0


RTL8195A[HAL]: SVC priority: 0x00


RTL8195A[HAL]: PendSVC priority: 0x00


RTL8195A[HAL]: Systick priority: 0x00

Case 2 - To the above yaml I've added the status_led. This seems to cause Hard Fault even after flashing it over UART.

status_led:
  pin:
    number: D2
    inverted: true

libretuya2.zip



<RTL8195A>
ROM:[V0.1]


FLASHRATE:4


BOOT TYPE:0 XTAL:40000000


IMG1 DATA[1128:10002000]


IMG1 ENTRY[8000521:100021ef]


IMG1 ENTER


CHIPID[000000f6]


read_mode idx:3, flash_speed idx:3


calibration_result:[1:25:13][1:19] 


calibration_result:[2:25:13][1:19] 


calibration_result:[3:0:0][ff:ff] 


calibration_ok:[1:25:13] 


FLASH CALIB[NEW OK]


OTA2 ADDR[8100000]


OTAx SELE[fff00000]


OTA1 USE


IMG2 DATA[0x8098040:3812:0x10005000]


IMG2 SIGN[RTKWin(10005008)]


IMG2 ENTRY[0x10005000:0x803cbf9]


System_Init1


System_Init2

[1B][1;32mI [      0.000] [1B][0mmain():54: - LibreTuya v0.10.0+sha.b404f9f.dirty on generic-rtl8710bx-4mb-980k, compiled at Sep 27 2022 22:33:49
[1B][1;32mI [      0.000] [1B][0mmain():58: - Reset reason: 0
[1B][1;32mI [      0.010] [1B][0mopen():81: mainTask: Preferences initialized
[1B][0;32m[I][logger:283]: Log initialized[1B][0m
[1B][0;35m[C][status_led:014]: Setting up Status LED...[1B][0m

RTL8195A[HAL]: Hard Fault Error!!!!


RTL8195A[HAL]: R0 = 0x0


RTL8195A[HAL]: R1 = 0x0


RTL8195A[HAL]: R2 = 0x1440100


RTL8195A[HAL]: R3 = 0x2


RTL8195A[HAL]: R12 = 0x0


RTL8195A[HAL]: LR = 0x8039073


RTL8195A[HAL]: PC = 0x8032db4


RTL8195A[HAL]: PSR = 0x41000000


RTL8195A[HAL]: BFAR = 0xe000ed38


RTL8195A[HAL]: CFSR = 0x10000


RTL8195A[HAL]: HFSR = 0x40000000


RTL8195A[HAL]: DFSR = 0x0


RTL8195A[HAL]: AFSR = 0x0


RTL8195A[HAL]: PriMask 0x0


RTL8195A[HAL]: BasePri 0x0


RTL8195A[HAL]: SVC priority: 0x00


RTL8195A[HAL]: PendSVC priority: 0xf0


RTL8195A[HAL]: Systick priority: 0xf0

@kuba2k2
Copy link
Member

kuba2k2 commented Sep 28, 2022

Hmm I should've suggested that from the start:
Run the PlatformIO's serial monitor, configured like in the docs:
https://docs.libretuya.ml/docs/platform/realtek/exception-decoder/

This should print the source file name and line number, next to each HardFault error, so that you'll exactly see what's going wrong.

For ESPHome, place the .ini line in .esphome/build/<project>/platformio.ini and run platformio monitor from within that directory. Note that this .ini will be restored to stock at every esphome compile/run; you can add the .ini option to YAML as well to prevent that.

@mihsu81
Copy link
Contributor Author

mihsu81 commented Sep 30, 2022

Hi @kuba2k2,
I'm attaching a dump and the firmware. libretuya.zip

I've checked the files with a hex editor, and it looks like the firmware.uf2 is corrupted and this causes the boot failure (below start-up log).

image

ROM:[V0.1]
FLASHRATE:4
BOOT TYPE:0 XTAL:40000000
IMG1 DATA[1128:10002000]
IMG1 ENTRY[8000521:100021ef]
IMG1 ENTER
CHIPID[000000f6]
read_mode idx:3, flash_speed idx:3
calibration_result:[1:25:13][1:19]
calibration_result:[2:25:13][1:19]
calibration_result:[3:0:0][ff:ff]
calibration_ok:[1:25:13]
FLASH CALIB[NEW OK]
OTA2 ADDR[8100000]
OTAx SELE[ff800000]
OTA2 USE
OTA2 SIGN[35393138:31313738]
IMG2 DATA[0x8193640:3868:0x10005000]
IMG2 SIGN[RTKWin(10005008)]
IMG2 ENTRY[0x10005000:0x8135ed1]
System_Init1
RTL8195A[HAL]: Hard Fault Error!!!!
RTL8195A[HAL]: R0 = 0x1003ef64
RTL8195A[HAL]: R1 = 0x40003400
RTL8195A[HAL]: R2 = 0xff
RTL8195A[HAL]: R3 = 0xfa205
RTL8195A[HAL]: R12 = 0x9cdd
RTL8195A[HAL]: LR = 0x8135365
RTL8195A[HAL]: PC = 0xfa204
RTL8195A[HAL]: PSR = 0x1000200
RTL8195A[HAL]: BFAR = 0xe000ed38
RTL8195A[HAL]: CFSR = 0x0
RTL8195A[HAL]: HFSR = 0x80000000
RTL8195A[HAL]: DFSR = 0x2
RTL8195A[HAL]: AFSR = 0x0
RTL8195A[HAL]: PriMask 0x0
RTL8195A[HAL]: BasePri 0x0
RTL8195A[HAL]: SVC priority: 0x00
RTL8195A[HAL]: PendSVC priority: 0x00
RTL8195A[HAL]: Systick priority: 0x00

@kuba2k2
Copy link
Member

kuba2k2 commented Sep 30, 2022

Where do you see that uf2 is corrupted? Each 512 byte block has a 32 byte header, 256 bytes of actual data, and then some amount of "extension tags". Unless you mean that the data or tags themselves are corrupted, it looks perfectly valid to me.

If you configure the exception decoder, it will tell you what is causing the crash.

@mihsu81
Copy link
Contributor Author

mihsu81 commented Sep 30, 2022

The content is corrupted. The text has some of the characters replaced by special characters. Also at the bottom is my SSID and WPA2 key which normally appear correctly but, in the firmware.uf2 I've posted, contain special characters.

image

I did set up the decoder, but the above log is all i get.

esphome:
  name: plug-ezviz
  platformio_options:
    monitor_speed: 115200
    monitor_filters: rtl_hard_fault_decoder

@kuba2k2
Copy link
Member

kuba2k2 commented Sep 30, 2022

At 0xF4320 the extension tags begin. This means that the rest of your ssid is in the next UF2 block. Scroll down to 0xF4420 and your ssid should be there.

@mihsu81
Copy link
Contributor Author

mihsu81 commented Sep 30, 2022

I know it continues on the next line, but it should say SunnyDay.

image

@mihsu81
Copy link
Contributor Author

mihsu81 commented Sep 30, 2022

Oh I see now ... sorry 😁

@mihsu81
Copy link
Contributor Author

mihsu81 commented Sep 30, 2022

Im not able to get more out of platformio device monitor.
The below output is after an OTA update.
Again, flashing the same firmware.uf2 over UART got the device working.

PS C:\Users\Bethayn-Dell\GitHub\libretuya-esphome\.esphome\build\plug-ezviz> platformio device monitor
************************************************************************************************************************
Obsolete PIO Core v6.0.2 is used (previous was 6.1.4)
Please remove multiple PIO Cores from a system:
https://docs.platformio.org/en/latest/core/installation/troubleshooting.html
************************************************************************************************************************
--- Available filters and text transformations: colorize, debug, default, direct, hexlify, log2file, nocontrol, printable, rtl_hard_fault_decoder, send_on_enter, time
--- More details at https://bit.ly/pio-monitor-filters
RtlHardFaultDecoder: firmware at C:\Users\Bethayn-Dell\GitHub\libretuya-esphome\.esphome\build\plug-ezviz\.pio\build\plug-ezviz\firmware.elf does not exist, rebuild the project?

Please build project in debug configuration to get more details about an exception.
See https://docs.platformio.org/page/projectconf/build_configurations.html


--- Miniterm on COM8  115200,8,N,1 ---
--- Quit: Ctrl+C | Menu: Ctrl+T | Help: Ctrl+T followed by Ctrl+H ---
ROM:[V0.1]
FLASHRATE:4
BOOT TYPE:0 XTAL:40000000
IMG1 DATA[1128:10002000]
IMG1 ENTRY[8000521:100021ef]
IMG1 ENTER
CHIPID[000000f6]
read_mode idx:3, flash_speed idx:3
calibration_result:[1:25:13][1:19]
calibration_result:[2:25:13][1:19]
calibration_result:[3:0:0][ff:ff]
calibration_ok:[1:25:13]
FLASH CALIB[NEW OK]
OTA2 ADDR[8100000]
OTAx SELE[fe000000]
OTA2 USE
OTA2 SIGN[35393138:31313738]
IMG2 DATA[0x8193640:3868:0x10005000]
IMG2 SIGN[RTKWin(10005008)]
IMG2 ENTRY[0x10005000:0x8135ed1]
System_Init1
RTL8195A[HAL]: Hard Fault Error!!!!
RTL8195A[HAL]: R0 = 0x1003ef64
RTL8195A[HAL]: R1 = 0x40003400
RTL8195A[HAL]: R2 = 0xff
RTL8195A[HAL]: R3 = 0xfa205
RTL8195A[HAL]: R12 = 0x9cdd
RTL8195A[HAL]: LR = 0x8135365
RTL8195A[HAL]: PC = 0xfa204
RTL8195A[HAL]: PSR = 0x1000200
RTL8195A[HAL]: BFAR = 0xe000ed38
RTL8195A[HAL]: CFSR = 0x0
RTL8195A[HAL]: HFSR = 0x80000000
RTL8195A[HAL]: DFSR = 0x2
RTL8195A[HAL]: AFSR = 0x0
RTL8195A[HAL]: PriMask 0x0
RTL8195A[HAL]: BasePri 0x0
RTL8195A[HAL]: SVC priority: 0x00
RTL8195A[HAL]: PendSVC priority: 0x00
RTL8195A[HAL]: Systick priority: 0x00

Here's platformio.ini.

; Auto generated code by esphome

[common]
lib_deps =
build_flags =
upload_flags =

; ========== AUTO GENERATED CODE BEGIN ===========
[env:plug-ezviz]
board = generic-rtl8710bx-4mb-980k
build_flags =
    -DCONFIG_ASYNC_TCP_STACK_SIZE=4096
    -DESPHOME_LOG_LEVEL=ESPHOME_LOG_LEVEL_VERBOSE
    -DLT_DEBUG_ALL=1
    -DLT_LOGGER=1
    -DLT_LOGGER_CALLER=1
    -DLT_LOGGER_COLOR=1
    -DLT_LOGGER_TASK=1
    -DLT_LOGGER_TIMESTAMP=1
    -DLT_LOGLEVEL=LT_LEVEL_VERBOSE
    -DLT_LOG_HEAP=1
    -DLT_UART_SILENT_ALL=1
    -DLT_USE_TIME=1
    -DUSE_ARDUINO
    -DUSE_LIBRETUYA
    -Wno-sign-compare
    -Wno-unused-but-set-variable
    -Wno-unused-variable
    -fno-exceptions
build_src_flags = -include Arduino.h
custom_fw_name = esphome
custom_fw_output = firmware.uf2
custom_fw_version = 2022.10.0-dev
framework = arduino
lib_deps =
    esphome/ESPAsyncWebServer-esphome@3.0.0
    DNSServer
    esphome/noise-c@0.1.4
    bblanchon/ArduinoJson@6.18.5
    ${common.lib_deps}
lib_ldf_mode = off
monitor_filters = rtl_hard_fault_decoder
monitor_speed = 115200
platform = libretuya
; =========== AUTO GENERATED CODE END ============

plug-eezviz.yaml

substitutions:
  devicename: plug_ezviz
  friendly_name: PLUG-EZVIZ
  device_description: EZVIZ TT1 Smart Plug CS-T31-16B-EU 
  current_res: "0.00221" # Random value. Requires power monitoring calibration
  voltage_div: "955" # Random value. Requires power monitoring calibration

esphome:
  name: plug-ezviz
  platformio_options:
    monitor_speed: 115200
    monitor_filters: rtl_hard_fault_decoder

libretuya:
  board: generic-rtl8710bx-4mb-980k
  framework:
    version: latest
  loglevel: verbose
  lt_config:
    LT_LOGGER: 1
    LT_LOGLEVEL: LT_LEVEL_VERBOSE
    LT_LOGGER_TIMESTAMP: 1
    LT_LOGGER_CALLER: 1
    LT_LOGGER_TASK: 1
    LT_LOGGER_COLOR: 1
    LT_LOG_HEAP: 1
  sdk_silent: true
  gpio_recover: true

# Enable logging
logger:
  level: VERBOSE

# Enable Home Assistant API
api:
  reboot_timeout: 1h
  encryption:
    key: e3OB6WUqYiEKNoBNbY/RkJnixIDTZCkCqhHqLLI2N1c=

ota:
  password: a84b93a735d24a1c5efe59139871e307

web_server:
  port: 80
  auth:
    username: xxxxx
    password: xxxxxxxxxxxxxxx
  include_internal: true

wifi:
  ssid: "SunnyDay"
  password: "xxxxxxxx"

  # Enable fallback hotspot in case wifi connection fails
  ap:
    ssid: "${friendly_name} Fallback"
    password: "xxxxxxxxxx"

captive_portal:


text_sensor:
  - platform: wifi_info
    ip_address:
      name: ${friendly_name} IP Address
    ssid:
      name: ${friendly_name} Connected SSID
    bssid:
      name: ${friendly_name} Connected BSSID
    mac_address:
      name: ${friendly_name} MAC Wifi Address

button:
  - platform: restart
    name: ${friendly_name} Restart

sensor:

  - platform: wifi_signal
    name: ${friendly_name} Wifi Signal
    update_interval: 60s
    id: ${devicename}_wifi_signal

binary_sensor:

  - platform: gpio
    device_class: power
    pin:
      number: D9
      mode: INPUT_PULLUP
      inverted: true
    name: ${friendly_name} Power Button

  - platform: gpio
    device_class: power
    pin:
      number: D11
      mode: INPUT_PULLUP
      inverted: true
    name: ${friendly_name} Reset Button

status_led:
  pin:
    number: D1
    inverted: true

light:
  - platform: binary
    name: "Test Light"
    output: light_output

output:
  - id: light_output
    platform: gpio
    pin: D0

@mihsu81
Copy link
Contributor Author

mihsu81 commented Sep 30, 2022

At the risk of sounding crazy or stupid😁, I've checked again the dump from the last OTA update, and I definitely see corrupted characters.
dump_ezviz_libretuya_2.zip

image

image

@dbuezas
Copy link
Contributor

dbuezas commented Sep 30, 2022

Would you mind trying this repo/branch?
https://github.com/dbuezas/libretuya-esphome/tree/experiment/use-bds-sockets
This one fixes OTA updates in BK7231N (where data corruption is worse) and I suspect it to be a generic solution for this.

I'll make a PR today, but have no means to test it in RTL chips (only Beken).

@kuba2k2
Copy link
Member

kuba2k2 commented Sep 30, 2022

@mihsu81 okay, I see the corrupted characters. I'll look into the dumps you posted later today.

About the exception decoder: it's showing you a warning that firmware.elf doesn't exist. I don't know why but Esphome uses a different build folder name... Like .pioenvs instead of .pio. I'll have to fix this somehow, but for now you can put the firmware.elf that Esphome compiled, into the path that PIO monitor shows it expects.

@dbuezas honestly I don't think it's caused by network issues. I see it happens when using OTA2 mostly, so there may be some problems with UF2 binpatching. That would explain the corrupted characters.

@mihsu81
Copy link
Contributor Author

mihsu81 commented Sep 30, 2022

RtlHardFaultDecoder: firmware at C:\Users\Bethayn-Dell\GitHub\libretuya-esphome.esphome\build\plug-ezviz.pio\build\plug-ezviz\firmware.elf does not exist, r

Thanks @dbuezas. I've tried it but the compilation fails.

Compiling .pioenvs\plug-ezviz\src\esphome\components\web_server\list_entities.cpp.o
src/esphome/components/socket/lwip_sockets_LT.cpp: In function 'ssize_t readv(int, const iovec*, int)':
src/esphome/components/socket/lwip_sockets_LT.cpp:13:97: error: 'lwip_readv' was not declared in this scope
 ssize_t readv(int fd_, const struct iovec *iov, int iovcnt) { return lwip_readv(fd_, iov, iovcnt);  }
                                                                                                 ^
src/esphome/components/socket/lwip_sockets_LT.cpp: In function 'const char* inet_ntop(int, const void*, char*, socklen_t)':
src/esphome/components/socket/lwip_sockets_LT.cpp:19:118: error: 'lwip_inet_ntop' was not declared in this scope
 const char * inet_ntop(int af, const void *src, char *dst, socklen_t size) { return lwip_inet_ntop(af, src, dst, size); }
                                                                                                                      ^
*** [.pioenvs\plug-ezviz\src\esphome\components\socket\lwip_sockets_LT.cpp.o] Error 1
============================================= [FAILED] Took 27.37 seconds =============================================

@mihsu81
Copy link
Contributor Author

mihsu81 commented Sep 30, 2022

@mihsu81 okay, I see the corrupted characters. I'll look into the dumps you posted later today.

About the exception decoder: it's showing you a warning that firmware.elf doesn't exist. I don't know why but Esphome uses a different build folder name... Like .pioenvs instead of .pio. I'll have to fix this somehow, but for now you can put the firmware.elf that Esphome compiled, into the path that PIO monitor shows it expects.

@dbuezas honestly I don't think it's caused by network issues. I see it happens when using OTA2 mostly, so there may be some problems with UF2 binpatching. That would explain the corrupted characters.

@kuba2k2 thank you for the tips with getting platformio device monitor to work, but it doesn't seem to give many details.
At least none that I understand 😁

PS C:\Users\Bethayn-Dell\GitHub\libretuya-esphome\.esphome\build\plug-ezviz> platformio device monitor
**********************************************************************************************************************************************************************************************************************************************************************
Obsolete PIO Core v6.0.2 is used (previous was 6.1.4)
Please remove multiple PIO Cores from a system:
https://docs.platformio.org/en/latest/core/installation/troubleshooting.html
**********************************************************************************************************************************************************************************************************************************************************************
--- Available filters and text transformations: colorize, debug, default, direct, hexlify, log2file, nocontrol, printable, rtl_hard_fault_decoder, send_on_enter, time
--- More details at https://bit.ly/pio-monitor-filters
C:\Users\Bethayn-Dell\.platformio\packages\toolchain-gccarmnoneeabi\bin\arm-none-eabi-addr2line.exe

Please build project in debug configuration to get more details about an exception.
See https://docs.platformio.org/page/projectconf/build_configurations.html


--- Miniterm on COM8  115200,8,N,1 ---
--- Quit: Ctrl+C | Menu: Ctrl+T | Help: Ctrl+T followed by Ctrl+H ---
ROM:[V0.1]
FLASHRATE:4
BOOT TYPE:0 XTAL:40000000
IMG1 DATA[1128:10002000]
IMG1 ENTRY[8000521:100021ef]
IMG1 ENTER
CHIPID[000000f6]
read_mode idx:3, flash_speed idx:3
calibration_result:[1:25:13][1:19]
calibration_result:[2:25:13][1:19]
calibration_result:[3:0:0][ff:ff]
calibration_ok:[1:25:13]
FLASH CALIB[NEW OK]
OTA2 ADDR[8100000]
OTAx SELE[f8000000]
OTA2 USE
OTA2 SIGN[35393138:31313738]
IMG2 DATA[0x8193640:3868:0x10005000]
IMG2 SIGN[RTKWin(10005008)]
IMG2 ENTRY[0x10005000:0x8135ed1]
System_Init1
RTL8195A[HAL]: Hard Fault Error!!!!
RTL8195A[HAL]: R0 = 0x1003ef64
RTL8195A[HAL]: R1 = 0x40003400
RTL8195A[HAL]: R2 = 0xff
RTL8195A[HAL]: R3 = 0xfa205
RTL8195A[HAL]: R12 = 0x9cdd
RTL8195A[HAL]: LR = 0x8135365
  0x8135365 in ?? ??:0
RTL8195A[HAL]: PC = 0xfa204
  0xfa204 in ?? ??:0
RTL8195A[HAL]: PSR = 0x1000200
RTL8195A[HAL]: BFAR = 0xe000ed38
RTL8195A[HAL]: CFSR = 0x0
RTL8195A[HAL]: HFSR = 0x80000000
RTL8195A[HAL]: DFSR = 0x2
RTL8195A[HAL]: AFSR = 0x0
RTL8195A[HAL]: PriMask 0x0
RTL8195A[HAL]: BasePri 0x0
RTL8195A[HAL]: SVC priority: 0x00
RTL8195A[HAL]: PendSVC priority: 0x00
RTL8195A[HAL]: Systick priority: 0x00

@kuba2k2
Copy link
Member

kuba2k2 commented Sep 30, 2022

Right, sorry for that. You need to copy the firmware.ota2.elf instead of the one without OTA number.

@mihsu81
Copy link
Contributor Author

mihsu81 commented Sep 30, 2022

Using firmware.ota2.elf renamed to firmware.elf.

ROM:[V0.1]
FLASHRATE:4
ROM:[V0.1]
FLASHRATE:4
BOOT TYPE:0 XTAL:40000000
IMG1 DATA[1128:10002000]
IMG1 ENTRY[8000521:100021ef]
IMG1 ENTER
CHIPID[000000f6]
read_mode idx:3, flash_speed idx:3
calibration_result:[1:25:13][1:19]
calibration_result:[2:25:13][1:19]
calibration_result:[3:0:0][ff:ff]
calibration_ok:[1:25:13]
FLASH CALIB[NEW OK]
OTA2 ADDR[8100000]
OTAx SELE[f8000000]
OTA2 USE
OTA2 SIGN[35393138:31313738]
IMG2 DATA[0x8193640:3868:0x10005000]
IMG2 SIGN[RTKWin(10005008)]
IMG2 ENTRY[0x10005000:0x8135ed1]
System_Init1
RTL8195A[HAL]: Hard Fault Error!!!!
RTL8195A[HAL]: R0 = 0x1003ef64
RTL8195A[HAL]: R1 = 0x40003400
RTL8195A[HAL]: R2 = 0xff
RTL8195A[HAL]: R3 = 0xfa205
RTL8195A[HAL]: R12 = 0x9cdd
RTL8195A[HAL]: LR = 0x8135365
  0x8135365 in Gen_RandomSeed at C:/Users/Bethayn-Dell/.platformio/packages/framework-realtek-amb1/component/soc/realtek/8711b/cmsis/device/system_8195a.c:105
RTL8195A[HAL]: PC = 0xfa204
  0xfa204 in ?? ??:0
RTL8195A[HAL]: PSR = 0x1000200
RTL8195A[HAL]: BFAR = 0xe000ed38
RTL8195A[HAL]: CFSR = 0x0
RTL8195A[HAL]: HFSR = 0x80000000
RTL8195A[HAL]: DFSR = 0x2
RTL8195A[HAL]: AFSR = 0x0
RTL8195A[HAL]: PriMask 0x0
RTL8195A[HAL]: BasePri 0x0
RTL8195A[HAL]: SVC priority: 0x00
RTL8195A[HAL]: PendSVC priority: 0x00
RTL8195A[HAL]: Systick priority: 0x00

@kuba2k2
Copy link
Member

kuba2k2 commented Sep 30, 2022

Yeah, so now it will tell you what causes the crash. This one actually looks like some kind of corruption - PC (program counter) at 0xfa204, meaning way below the actual flash chip, yet way above the ROM.

I see you've edited the last firmware.uf2 you posted, to remove the wifi credentials. If you do so, in the future please leave the first 32 bytes of each block intact - otherwise it's more difficult to analyze it.

@mihsu81
Copy link
Contributor Author

mihsu81 commented Sep 30, 2022

Yeah, so now it will tell you what causes the crash. This one actually looks like some kind of corruption - PC (program counter) at 0xfa204, meaning way below the actual flash chip, yet way above the ROM.

I see you've edited the last firmware.uf2 you posted, to remove the wifi credentials. If you do so, in the future please leave the first 32 bytes of each block intact - otherwise it's more difficult to analyze it.

Does this one work?
firmware_2.zip

@kuba2k2
Copy link
Member

kuba2k2 commented Sep 30, 2022

Yes. But your flash dumps seem to be corrupted in some way - they should have exactly 4 194 304 bytes (4096 * 1024).

Please, compile a firmware file (using esphome compile), then upload it to the board using UART:

  • with esphome upload
  • or using ltchiptool uf2 upload firmware.uf2 uart COMx
    (to get the board to work at all).

Then, upload the same file (firmware.uf2) over OTA - without using esphome compile, so that the firmware is the same. If the board stops booting, do a flash dump. You can mask out passwords and SSIDs, as long as you don't change the UF2 block header, and keep the file length unchanged (i.e. replace an 8-letter password with 8 "X" characters).

If the board doesn't stop booting (meaning that the last OTA upload used OTA1), upload with OTA again, using the exact same firmware.uf2 file. Then do a flash dump.

Post both files here - firmware.uf2 and the flash dump (exactly 4 MiB in size).

It is very important that you always upload the same file during this (somewhat complex) procedure, so that I can analyze exactly what's wrong and what corruption in the binaries causes the board to stop working.

TL;DR: upload the same file until the board stops booting, using UART in the first run and only OTA later. Should take 2 or 3 uploads, not more, not less.

@kuba2k2
Copy link
Member

kuba2k2 commented Sep 30, 2022

Okay, actually, hold up with that for a minute @mihsu81 ... sorry for the confusion.

Seems like I found a bug in the implementation, that produces same results for me as your corrupted firmware. I'll work on it and let you know.

@kuba2k2
Copy link
Member

kuba2k2 commented Sep 30, 2022

@mihsu81 I found the bug. I've written a technical explanation on Discord, but will post it here as well, for future reference.

[6:27 PM]Oranż Metylowy: the program works fine using just the "uf2ota" library
[6:27 PM]Oranż Metylowy: it also works fine when using Update.h library, and fetching it 512 byte chunks
[6:27 PM]Oranż Metylowy: BUT it breaks when fetching it 1024 byte chunks
[6:28 PM]Oranż Metylowy: seems to apply the binary patch of the previous block, to a block that doesn't have binpatches
[6:28 PM]Oranż Metylowy: and that corrupts the data
[6:28 PM]Oranż Metylowy: but I did this in the "block parsing" code:
ctx->binpatch_len = 0; // binpatch applies to one block only
 
[6:28 PM]Oranż Metylowy: so it removes the previous binpatch in subsequent blocks - makes sense, doesn't it?
[6:29 PM]Oranż Metylowy: and now I noticed this:
    if (!block->has_tags)
        // no tags in this block, no further processing needed
        return UF2_ERR_OK;
[6:29 PM]Oranż Metylowy: before the binpatch_len = 0 line
[6:29 PM]Oranż Metylowy: so it only clears binpatch_len if the following block has ANY tags
[6:30 PM]Oranż Metylowy: if it doesn't, it uses tags from previous block...
[6:31 PM]Oranż Metylowy: you might ask "why does it work fine with 512 byte chunks then??"
[6:31 PM]Oranż Metylowy: the uf2ota library doesn't memcpy() the binpatch - instead, it stores a pointer to the data within the block
[6:32 PM]Oranż Metylowy: so if you're giving it 512 byte blocks, the buffer is completely overwritten with new data, that has 0x00's instead of binpatch
[6:32 PM]Oranż Metylowy: the binpatch_len is still not 0, but there's no readable binpatch to apply

You can now update LT (platformio platform install -f https://github.com/kuba2k2/libretuya), compile firmware, and upload it using UART first. After that, OTA should work fine!

@kuba2k2 kuba2k2 added the bug Something isn't working label Sep 30, 2022
@kuba2k2 kuba2k2 mentioned this issue Sep 30, 2022
@mihsu81
Copy link
Contributor Author

mihsu81 commented Sep 30, 2022

Thanks a lot @kuba2k2. I can confirm ltchiptool v1.6.0 fixes the OTA update issue.
Would be awesome as well if the device would reboot after a successful/failed OTA update.
At the moment the device remains unreachable.

@kuba2k2
Copy link
Member

kuba2k2 commented Sep 30, 2022

That's weird, since I added another function to reset the chip few days ago. But that's a story for another issue I think. I'm glad that OTA upload works now 👍

@kuba2k2 kuba2k2 closed this as completed Sep 30, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants