UART RS485 Failing to Enable Transmitter After Some Time (IDFGH-1983) #4178

Closed
dasrue opened this issue Oct 8, 2019 · 25 comments
Labels
Awaiting Response, Resolution: Done, Status: Done

Comments

@dasrue

dasrue commented Oct 8, 2019

Environment

  • Development Kit: Custom PCB. RS485 Driver=ST3485EC
  • Module or chip used: ESP32-WROOM-32
  • IDF version: v3.3
  • Build System: Make
  • Compiler version: 1.22.0-80-g6c4433a
  • Operating System: Linux
  • Power Supply: Custom PCB (3.3V TS30041 IC)

Problem Description

After running for some time (around 30 minutes), the UART driver fails to enable the RS485 driver properly. After probing with a scope, I found that the driver is enabled briefly, then disabled before the UART tries to send the data.

Expected Behavior

RTS pin goes high, data is transmitted, then RTS pin goes low

Actual Behavior

Behaviour is OK for around 30 Minutes after reset.
After 30 Minutes:
RTS pin goes high for around 50us then RTS pin goes low, data is transmitted from UART, but is not transmitted onto RS485 bus due to driver EN signal being off.

Steps to reproduce

  1. UART Setup as follows:
// Initialise UART
    uart_config_t uart_config = {
        .baud_rate = 115200,
        .data_bits = UART_DATA_8_BITS,
        .parity = UART_PARITY_DISABLE,
        .stop_bits = UART_STOP_BITS_1,
        .flow_ctrl = UART_HW_FLOWCTRL_DISABLE,
        .rx_flow_ctrl_thresh = 122,
    };
    uart_param_config(UART_NUM_2, &uart_config);
    uart_set_pin(UART_NUM_2, GPIO_NUM_4, GPIO_NUM_5, GPIO_NUM_17, UART_PIN_NO_CHANGE);
    uart_set_rx_timeout(UART_NUM_2, 2);	// Set RX Timeout to be 2 chars
    uart_driver_install(UART_NUM_2, 256, 0, 10, &uart2_event_queue, 0);
    uart_set_mode(UART_NUM_2, UART_MODE_RS485_HALF_DUPLEX);
  2. Make UART receive and respond to requests for around 30 minutes.
  3. After some time notice UART is no longer responding properly.

Code to reproduce this issue

Working on creating a program to reproduce bug

Debug Logs

No errors printed to log.

@github-actions github-actions bot changed the title UART RS485 Failing to Enable Transmitter After Some Time UART RS485 Failing to Enable Transmitter After Some Time (IDFGH-1983) Oct 8, 2019
@dasrue
Author

dasrue commented Oct 8, 2019

I have taken some screenshots from the scope. Here it is working initially:
DS1Z_QuickPrint8
And around 30 Minutes later:
DS1Z_QuickPrint11
The yellow trace is the 485 bus data (via max485 decoder), the blue is the transmitter enable, and the purple is the esp tx. Notice that in the second image the ESP32 fails to keep the transmit enable high for the entire data packet.

@dasrue
Author

dasrue commented Oct 9, 2019

I have performed a test by changing

uart_reg->conf0.sw_rts = 1;

to

                uart_reg->conf0.sw_rts = 1;
                __asm__ __volatile__("nop;nop;nop;nop;nop;nop;nop;");
                uart_reg->conf0.sw_rts = 0;
                __asm__ __volatile__("nop;nop;nop;nop;nop;nop;nop;");
                uart_reg->conf0.sw_rts = 1;

in order to see whether it was the interrupt setting RTS low, or whether it was being set from an external source. After waiting around 30 minutes again, I found that it is the ISR function that is setting RTS low.
DS1Z_QuickPrint2
I will do some further investigation into the ISR and see what I can find

@negativekelvin
Contributor

Have you tried clearing UART_TX_DONE_INT right before it is enabled?

@dasrue
Author

dasrue commented Oct 9, 2019

I have tried adding
uart_clear_intr_status(uart_num, UART_TX_DONE_INT_CLR_M);
just before this line:

UART[uart_num]->int_ena.tx_done = 1;

and it seems to be working properly now. However I will test it overnight to see how it goes

@dasrue
Author

dasrue commented Oct 9, 2019

OK, it looks like that change (clearing the UART_TX_DONE_INT interrupt before writing to the FIFO) has fixed it! After leaving it overnight I have come back to the system still working happily this morning.
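
The effect of a stale TX_DONE status can be illustrated with a small host-side model. This is only a sketch under assumptions: the struct, the function names and the way the hardware is simplified are all hypothetical and are not part of the ESP-IDF driver. The `rts` field stands for the driver-enable level seen on the scope, not for the `sw_rts` register bit.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, heavily simplified model of the bits discussed above.
 * None of these names are ESP-IDF APIs. */
typedef struct {
    bool tx_done_raw;  /* latched "transmission finished" status */
    bool tx_done_ena;  /* TX_DONE interrupt enable bit */
    bool rts;          /* driver-enable line level (true = transmitter enabled) */
    int  fifo_bytes;   /* bytes still waiting to be shifted out */
} model_uart_t;

/* ISR model: acts only if the status is latched and the interrupt is enabled. */
static void model_isr(model_uart_t *u)
{
    if (u->tx_done_raw && u->tx_done_ena) {
        u->tx_done_ena = false;
        u->tx_done_raw = false;
        u->rts = false;            /* release the bus */
    }
}

/* Start a transmission. With clear_first set, any stale status is
 * cleared before the interrupt is enabled. */
static void model_start_tx(model_uart_t *u, int len, bool clear_first)
{
    assert(len > 0);
    u->rts = true;
    u->fifo_bytes = len;
    if (clear_first) {
        u->tx_done_raw = false;
    }
    u->tx_done_ena = true;
    model_isr(u);                  /* a pending, enabled interrupt fires at once */
}

/* Shift out every byte; count the bytes sent while the driver was enabled. */
static int model_shift_out(model_uart_t *u)
{
    int on_bus = 0;
    while (u->fifo_bytes > 0) {
        u->fifo_bytes--;
        if (u->rts) {
            on_bus++;
        }
    }
    u->tx_done_raw = true;         /* modeled hardware latches TX_DONE when drained */
    model_isr(u);
    return on_bus;
}

/* Send an 8-byte packet while a stale TX_DONE status is already latched.
 * Returns how many of the bytes reached the bus. */
int bytes_on_bus_with_stale_status(bool clear_first)
{
    model_uart_t u = { .tx_done_raw = true };
    model_start_tx(&u, 8, clear_first);
    return model_shift_out(&u);
}
```

In this model, calling `bytes_on_bus_with_stale_status(false)` drops the driver enable before the first byte, which matches the short RTS pulse seen on the scope, while passing `true` keeps it asserted for the whole packet. Whether the real hardware leaves the status latched in the same way is an assumption here.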

@alisitsyn
Collaborator

Hi dasrue,

Thank you for your issue report. I am having a hard time reproducing the issue with the Modbus examples and with a specific RS485 test example; they kept working correctly for much longer periods. Your solution looks good to me, but I have to reproduce the issue first in order to fix it appropriately.

--
Alex

@alisitsyn
Collaborator

Hi @dasrue,
I apologize for the late update. The issue was checked, but uart_clear_intr_status(uart_num, UART_TX_DONE_INT_CLR_M); does not solve the issue completely. I propose this workaround instead:

components/driver/uart.c: line 1109:

} else if(uart_intr_status & UART_TX_DONE_INT_ST_M) {
    if (UART_IS_MODE_SET(uart_num, UART_MODE_RS485_HALF_DUPLEX) && uart_reg->status.st_utx_out != 0) {
        // The TX_DONE interrupt is triggered in RS485 half duplex mode but transmission is active
        // then postpone interrupt processing for next interrupt
        uart_event.type = UART_EVENT_MAX;
    } else {
        // Disable TX_DONE interrupt and clear interrupt status
        uart_disable_intr_mask_from_isr(uart_num, UART_TX_DONE_INT_ENA_M);
        uart_clear_intr_status(uart_num, UART_TX_DONE_INT_CLR_M);
        // Workaround for RS485: If the RS485 half duplex mode is active
        // and transmitter is in idle state then reset received buffer and reset RTS pin
        // skip this behavior for other UART modes
        if (UART_IS_MODE_SET(uart_num, UART_MODE_RS485_HALF_DUPLEX)) {
            UART_ENTER_CRITICAL_ISR(&uart_spinlock[uart_num]);
            uart_reg->conf0.sw_rts = 1;
            uart_reset_rx_fifo(uart_num); // Allows to avoid hardware issue with the RXFIFO reset
            UART_EXIT_CRITICAL_ISR(&uart_spinlock[uart_num]);
        }
        xSemaphoreGiveFromISR(p_uart_obj[uart_num]->tx_done_sem, &HPTaskAwoken);
    }

This workaround checks the transmitter state and resets the RTS pin only once transmission is done. If transmission is not done, the interrupt will fire again later because the status bit has not been cleared.
Could you check this approach with your code?
Thanks.

@negativekelvin
Contributor

@alisitsyn isn't not clearing the interrupt basically the same as while(uart_reg->status.st_utx_out != 0); ? How does that help?

@alisitsyn
Collaborator

@negativekelvin, not clearing the interrupt is similar to waiting, but if we wait inside the interrupt handler we block the processing of other interrupts on the same CPU, and even on the other CPU, for the duration of the wait.
Delaying interrupt processing is really bad practice; interrupt handling should be as short as possible. With this fix we wait for the transmitter's idle state and then reset RTS, in RS485 half-duplex mode only, without blocking other interrupts.

@negativekelvin
Contributor

You are saying UART_TX_DONE_INT does not provide the correct timing for rts, but that wasn't the issue reported here. The issue was UART_TX_DONE_INT was pending before the start of packet. Are you saying waiting for idle state will prevent spurious UART_TX_DONE_INT? What is the typical delay between UART_TX_DONE_INT and st_utx_out = 0?

@alisitsyn
Collaborator

alisitsyn commented Nov 25, 2019

I can only guess why the TX_DONE interrupt is triggered in this case, but the scenario looks like this:

  • We call uart_tx_chars() (it fills the FIFO and sets RTS = 1 and TX_DONE_EN = 1; transmission starts).
  • UART_ST_UTX_OUT = TX_STRT.
  • A spurious TX_DONE_INT is raised.
  • This fix keeps the RTS line at logic 1 while transmission is active (UART_ST_UTX_OUT != TX_IDLE).

Let us imagine a worse case:

The user app sends data and fills the FIFO. The default UART interrupt handler is in flash. The user task is switched out, and another, higher-priority task disables the CPU cache (to access flash memory). The first portion of data is sent out, and the next portion is written into the FIFO later and its transmission starts, but processing of the TX_DONE interrupt is delayed and runs in the middle of transmitting the next portion of data (in the middle of a byte). In this case RTS will be reset and the byte will be transmitted incorrectly. This fix keeps the RTS line at logic 1 until transmission is done and so avoids the incorrect transmission.

So, I think this fix helps to solve both issues with direction control in RS485 half-duplex mode.
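
The delayed-ISR case can be played through on a host with a toy model. The following is a sketch under assumptions, not driver code: every name is hypothetical, time is counted in whole bytes, and `tx_busy` merely stands in for `status.st_utx_out != 0`.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model state; none of these names exist in ESP-IDF. */
typedef struct {
    bool tx_done_raw;  /* latched TX_DONE status */
    bool tx_done_ena;  /* TX_DONE interrupt enable */
    bool tx_busy;      /* stands in for status.st_utx_out != 0 */
    bool rts;          /* driver-enable line level (true = enabled) */
} model_uart_t;

/* ISR model. With check_idle set, handling is postponed while the
 * transmitter is busy and the status is left latched. */
static void model_isr(model_uart_t *u, bool check_idle)
{
    if (!(u->tx_done_raw && u->tx_done_ena)) {
        return;
    }
    if (check_idle && u->tx_busy) {
        return;                    /* postpone: nothing cleared, RTS untouched */
    }
    u->tx_done_ena = false;
    u->tx_done_raw = false;
    u->rts = false;
}

/* Chunk 1 has finished, but its TX_DONE handling is held off (for example
 * because the cache is disabled). Chunk 2 is already being shifted out when
 * the ISR finally gets to run, isr_delay_bytes into the chunk.
 * Returns how many bytes of chunk 2 went out with the driver enabled. */
int second_chunk_bytes_on_bus(int chunk_len, int isr_delay_bytes, bool check_idle)
{
    assert(chunk_len > 0);
    assert(isr_delay_bytes >= 0 && isr_delay_bytes < chunk_len);

    model_uart_t u = { .tx_done_raw = true, .tx_done_ena = true,
                       .tx_busy = true, .rts = true };
    int on_bus = 0;
    for (int i = 0; i < chunk_len; i++) {
        if (i >= isr_delay_bytes) {
            model_isr(&u, check_idle);   /* ISR is able to run from here on */
        }
        if (u.rts) {
            on_bus++;
        }
    }
    u.tx_busy = false;                   /* transmitter returns to idle */
    u.tx_done_raw = true;
    model_isr(&u, check_idle);           /* real end of transmission */
    return on_bus;
}
```

With an 8-byte second chunk and the ISR delayed by 3 bytes, the model without the idle check puts only 3 bytes on the bus, whereas the variant with the idle check keeps the driver enabled for all 8.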

@negativekelvin
Contributor

I think the clear before enable is more important. I'm not convinced about st_utx_out: if the wait is not guaranteed to be short, it is better to clear the interrupt without disabling it than to keep re-entering the ISR.
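
The trade-off between the two ways of postponing can be sketched by counting handler entries in a toy model. Everything below is hypothetical and simplified: the names are not ESP-IDF APIs, the handler is polled once per byte time (a truly pending interrupt would likely re-enter far more often), and the model assumes the hardware latches TX_DONE again when the transmission really ends.

```c
#include <assert.h>
#include <stdbool.h>

/* Two hypothetical ways to postpone TX_DONE handling while still busy. */
typedef enum {
    DEFER_KEEP_LATCHED,        /* return without clearing the status */
    DEFER_CLEAR_KEEP_ENABLED   /* clear the status, leave the interrupt enabled */
} defer_policy_t;

typedef struct {
    bool tx_done_raw;  /* latched TX_DONE status */
    bool tx_done_ena;  /* TX_DONE interrupt enable */
    bool tx_busy;      /* stands in for status.st_utx_out != 0 */
    bool rts;          /* driver-enable line level */
    int  isr_entries;  /* how often the handler body actually ran */
} model_uart_t;

static void model_isr(model_uart_t *u, defer_policy_t policy)
{
    if (!(u->tx_done_raw && u->tx_done_ena)) {
        return;
    }
    u->isr_entries++;
    if (u->tx_busy) {
        if (policy == DEFER_CLEAR_KEEP_ENABLED) {
            u->tx_done_raw = false;   /* wait for the next real event */
        }
        return;
    }
    u->tx_done_ena = false;
    u->tx_done_raw = false;
    u->rts = false;
}

/* Send len bytes with a spurious TX_DONE latched at the start.
 * Returns the number of handler entries needed until RTS is released. */
int isr_entries_for_packet(int len, defer_policy_t policy)
{
    assert(len > 0);
    model_uart_t u = { .tx_done_raw = true, .tx_done_ena = true,
                       .tx_busy = true, .rts = true };
    for (int i = 0; i < len; i++) {
        model_isr(&u, policy);        /* one polling step per byte time */
    }
    u.tx_busy = false;
    u.tx_done_raw = true;             /* assumed: latched again at the real end */
    model_isr(&u, policy);
    assert(!u.rts);                   /* the bus is released in both variants */
    return u.isr_entries;
}
```

For an 8-byte packet the model enters the handler 9 times when the status is left latched and 2 times when it is cleared but kept enabled. Both variants release RTS only after the transmitter is idle, provided the assumption about the second TX_DONE event holds.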

@alisitsyn
Collaborator

alisitsyn commented Feb 18, 2020

@negativekelvin,
I am very sorry for the even later response; I missed your message. The solution I described is already implemented in the HAL of the new UART driver. However, I think it is worth adding your proposal with uart_clear_intr_status(uart_num, UART_TX_DONE_INT_CLR_M); as well.
@dasrue, @negativekelvin, could you try to reproduce the issue with the updated UART driver? On my side the issue does not occur.
Thank you.

@Alvin1Zhang
Collaborator

@dasrue @negativekelvin Thanks for reporting. Could you please share any updates on this issue? Thanks.

@Alvin1Zhang Alvin1Zhang added the Awaiting Response awaiting a response from the author label Aug 19, 2020
@Alvin1Zhang
Collaborator

Thanks for reporting. Closing due to lack of feedback. Feel free to reopen with more details/updates. Thanks.

@espressif-bot espressif-bot added the Status: In Progress Work is in progress label Jun 16, 2021
@alisitsyn
Collaborator

It seems the issue is still relevant and is being discussed here: https://esp32.com/viewtopic.php?f=13&t=21835&p=80501#p80501

@Alvin1Zhang Alvin1Zhang reopened this Jul 27, 2021
@Maldus512

I recently encountered the exact same issue; my workaround is to manually switch the RTS signal (as a normal GPIO) instead of relying on the uart driver.
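
A manually switched driver enable could be sequenced roughly as follows. This is a sketch, not the code used in that project: the hook table and all names are hypothetical, and mapping the hooks to gpio_set_level(), uart_write_bytes() and uart_wait_tx_done() on ESP-IDF is an assumption. The mock hooks only record the call order so the sequence can be checked on a host.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical hook table. On ESP-IDF the hooks might wrap gpio_set_level(),
 * uart_write_bytes() and uart_wait_tx_done(); that mapping is an assumption. */
typedef struct {
    void (*set_driver_enable)(void *ctx, int level);
    int  (*write_bytes)(void *ctx, const void *data, size_t len); /* bytes queued, or < 0 */
    int  (*wait_tx_done)(void *ctx);                              /* 0 once the shifter is idle */
    void *ctx;
} rs485_hooks_t;

/* Enable the driver, queue the data, wait until it has left the UART,
 * then release the bus. Returns the bytes sent, or -1 on failure. */
int rs485_send(const rs485_hooks_t *h, const void *data, size_t len)
{
    assert(h && h->set_driver_enable && h->write_bytes && h->wait_tx_done);
    h->set_driver_enable(h->ctx, 1);
    int written = h->write_bytes(h->ctx, data, len);
    int done = (written < 0) ? -1 : h->wait_tx_done(h->ctx);
    h->set_driver_enable(h->ctx, 0);   /* always release the bus, even on error */
    return (done == 0) ? written : -1;
}

/* Mock hooks that append one letter per call to a trace string. */
static void mock_set_de(void *ctx, int level)
{
    strcat((char *)ctx, level ? "E" : "e");
}

static int mock_write(void *ctx, const void *data, size_t len)
{
    (void)data;
    strcat((char *)ctx, "W");
    return (int)len;
}

static int mock_wait(void *ctx)
{
    strcat((char *)ctx, "D");
    return 0;
}

/* Run one send against the mocks. trace must be an empty string with
 * room for at least 5 characters. */
int rs485_demo(char *trace)
{
    rs485_hooks_t h = { mock_set_de, mock_write, mock_wait, trace };
    return rs485_send(&h, "ping", 4);
}
```

The important design point is that the driver enable is dropped only after the wait hook reports that the last byte has left the shifter, not when the write call returns.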

@AxelLin
Contributor

AxelLin commented Mar 3, 2022

> I recently encountered the exact same issue; my workaround is to manually switch the RTS signal (as a normal GPIO) instead of relying on the uart driver.

What is your esp-idf version (git describe --tags)?

@Maldus512

> What is your esp-idf version (git describe --tags)?

v4.3-970-g7467c68a17

@alisitsyn
Collaborator

alisitsyn commented Mar 3, 2022

@Maldus512 ,

Please apply the patch below to your version of IDF (git apply 0001_driver_uart_fix_tx_bytes_rts_assert_failure.patch).
Then check the communication issues again and report results here.

0001_driver_uart_fix_tx_bytes_rts_assert_failure.patch.log

Thank you.

@alisitsyn
Collaborator

Hi @Maldus512 ,

Could you please share your results here?
Thanks.

@Maldus512

Sorry for getting back to you so late.
Unfortunately I wasn't able to reproduce the issue; it was reported to me by a field tester for the device I'm working on and I had to produce a sure fix (hence moving to manually handling RTS). I've been trying to find the problem again with my own setup but couldn't, both with and without the patch.
It's honestly puzzling that I haven't met this issue sooner; I've been using the ESP32 with RS485 in various setups for a while now. I'll keep this in mind and come back here if I can gather more information.

@alisitsyn
Collaborator

@Maldus512 ,

Thank you for the update. Please post the results here if the problem occurs again.
Thanks.

@alisitsyn
Collaborator

alisitsyn commented May 11, 2022

The fix was merged into master with commit ID: 09c1fba
It takes some time to be synced with github.

@espressif-bot espressif-bot added the Resolution: Done Issue is done internally label Jun 13, 2022
@espressif-bot espressif-bot added Status: Done Issue is done internally and removed Status: In Progress Work is in progress labels Jun 13, 2022
@ginkgm
Collaborator

ginkgm commented Aug 5, 2022

Issue resolved, and backports created:

Target Branch    Merged     Supported    Last N.A.
master (v5.0)    09c1fba    null         null
release/v4.4     7a72f8e    v4.4.2       v4.4.1
release/v4.3     34c9af3    v4.3.3       v4.3.2
release/v4.2     b74fc00    null         v4.2.3
release/v4.1     a84d729    null         v4.1.3
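
One way to read the table in code is shown below. It is a sketch that rests on an assumption: "Supported" is taken to be the first tagged release containing the fix and "Last N.A." the last tagged release without it. The type and function names are hypothetical, and development builds identified by a commit-based `git describe` string are not covered.

```c
#include <assert.h>

/* Hypothetical helper types; not part of ESP-IDF. */
typedef struct {
    int major;
    int minor;
    int patch;
} idf_version_t;

typedef enum {
    FIX_UNKNOWN = -1,  /* the table does not settle this release */
    FIX_ABSENT  = 0,
    FIX_PRESENT = 1
} fix_status_t;

/* Classify a tagged release using only the rows of the backport table. */
fix_status_t tx_done_fix_status(idf_version_t v)
{
    assert(v.major >= 0 && v.minor >= 0 && v.patch >= 0);
    if (v.major != 4) {
        return FIX_UNKNOWN;   /* the master (v5.0) row names no release */
    }
    switch (v.minor) {
    case 4:
        return (v.patch >= 2) ? FIX_PRESENT : FIX_ABSENT;  /* v4.4.2 / v4.4.1 */
    case 3:
        return (v.patch >= 3) ? FIX_PRESENT : FIX_ABSENT;  /* v4.3.3 / v4.3.2 */
    case 2:  /* last release without the fix: v4.2.3 */
    case 1:  /* last release without the fix: v4.1.3 */
        return (v.patch <= 3) ? FIX_ABSENT : FIX_UNKNOWN;
    default:
        return FIX_UNKNOWN;
    }
}
```

For example, v4.4.2 is classified as containing the fix and v4.3.2 as lacking it, while anything the table does not list is reported as unknown rather than guessed.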

Issue closed

@ginkgm ginkgm closed this as completed Aug 5, 2022

8 participants