Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Time-out crashes #26

Closed
maarten-pennings opened this issue Jan 17, 2024 · 13 comments
Closed

Time-out crashes #26

maarten-pennings opened this issue Jan 17, 2024 · 13 comments

Comments

@maarten-pennings
Copy link

Hi Hideaki Tai,

Thanks for the lib.

I have an a-typical setup. My ESP is connected to a sensor, the sensor is the SPI master and the ESP is the SPI slave. The ESP triggers the sensor (via non-SPI lines), and then the sensor is supposed to use SPI to send the sensor result to the SPI.

There might be a problem with the sensor, in which case it will not send an SPI message. So I want to do a wait-with-timeout (the waitticks).

ESP code ino file:

ESP32SPISlave spiin;

#define BUF_SIZE 32
uint8_t spiin_mosi_buf[BUF_SIZE];
uint8_t spiin_miso_buf[BUF_SIZE];

...

void setup() {
  Serial.begin(115200); 
  Serial.printf("\n\nspi2x - rx\n");
  ...
  spiin.setDataMode(SPI_MODE0);
  spiin.begin(VSPI , /*sck*/SPIIN_SCLK, /*miso*/SPIIN_MISO, /*mosi*/SPIIN_MOSI, /*ss*/SPIIN_SSEL);
  Serial.printf("waiting for spi message...\n");
}

void loop() {
  sensor_trigger();
  // Wait max 10 seconds for SPI message
  esp_err_t res = spiin.waitticks(spiin_mosi_buf, spiin_miso_buf, BUF_SIZE, 10000); 
  Serial.printf("res=%d (%s)", res, res==0?"ok":(res==263?"timeout":"err") );
  if( res==ESP_OK && spiin.size() ) {
    ... print message in spiin_mosi_buf ...
  }
}

Yes, sorry, I made waitticks myself. I added the following function just blow wait in ESP32SPISlave.h:

    // Patch made by Maarten: added time-out argument (ticks), removed printf
    esp_err_t waitticks(uint8_t* rx_buf, const uint8_t* tx_buf, const size_t size,TickType_t ticks) {
        if (!transactions.empty()) {
            printf(
                "[WARN] cannot execute transfer if queued transaction exists. queued transactions = %d\n",
                transactions.size());
            return 0;
        }

        addTransaction(rx_buf, tx_buf, size);

        // ticks happen to be in in ms (configTICK_RATE_HZ) 
        esp_err_t e = spi_slave_transmit(host, &transactions.back(), ticks);
        if (e!=ESP_OK) transactions.pop_back();

        // ESP_OK, ESP_ERR_TIMEOUT, maybe others
        return e;
    }

As long as there is no timeout, the ESP can receive SPI message after message.
Once there is one timeout (or more), the ESP can receive only one more SPI message.
If, after that first message (there are some timeouts and then) there is a second message, the ESP crashes.

My conclusion is that I do not understand well enough the execution architecture of the SPI stack.
Would you be able to give me help on how to add timeout?

Kind regards,

Maarten

Here is an example run (the message from the sensor is msg#)

ets Jun  8 2016 00:22:57

rst:0x1 (POWERON_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:1
load:0x3fff0030,len:1184
load:0x40078000,len:13260
load:0x40080400,len:3028
entry 0x400805e4


spi2x - rx
waiting for spi message...
res=0 (ok) size=4 << msg1
res=0 (ok) size=4 << msg2
res=0 (ok) size=4 << msg3
res=0 (ok) size=4 << msg4
res=263 (timeout)
res=0 (ok) size=4 << msg5

assert failed: spi_slave_transmit spi_slave.c:322 (ret_trans == trans_desc)


Backtrace: 0x40083511:0x3ffb20b0 0x40088139:0x3ffb20d0 0x4008d359:0x3ffb20f0 0x400d4aa9:0x3ffb2220 0x400d178d:0x3ffb2250 0x400d31b9:0x3ffb2290




ELF file SHA256: 0f65a65fee37cc4e

Rebooting...
ets Jun  8 2016 00:22:57

rst:0xc (SW_CPU_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:1
load:0x3fff0030,len:1184
load:0x40078000,len:13260
load:0x40080400,len:3028
entry 0x400805e4


spi2x - rx
waiting for spi message...
res=263 (timeout)
@hideakitai
Copy link
Owner

Could you try v0.4.0 release? Also notice Notes for Communication Errors

@maarten-pennings
Copy link
Author

I switched to v0.4.1 (running in Arduino 2.3.0)
I'm running the slave code on an ESP32 (the DOIT oldy).

This is my slave code

#include <ESP32SPISlave.h>

#define SPIIN_SCLK 18
#define SPIIN_MISO 19
#define SPIIN_MOSI 23
#define SPIIN_SSEL  5

#define SPIIN_BUS  VSPI // HSPI

ESP32SPISlave spiin;

#define BUF_SIZE 32
uint8_t spiin_buf_tx[BUF_SIZE];
uint8_t spiin_buf_rx[BUF_SIZE];

void setup() {
  // Serial
  Serial.begin(115200); 
  delay(2000);
  Serial.printf("\n\nspi2x - rx\n");
  // SPI
  spiin.setDataMode(SPI_MODE0);
  spiin.begin(/*bus*/SPIIN_BUS, /*sck*/SPIIN_SCLK, /*miso*/SPIIN_MISO, /*mosi*/SPIIN_MOSI, /*ss*/SPIIN_SSEL);
  Serial.printf("waiting for spi message...\n" );
}


void loop() {
  uint32_t t0=millis();
  const size_t num = spiin.transfer(spiin_buf_tx, spiin_buf_rx, BUF_SIZE,1000);
  uint32_t t1=millis();

  if( num==0 ) {
    Serial.printf("time=%d num=%d (timeout)\n", t1-t0, num );
  } else {
    Serial.printf("time=%d num=%d << ", t1-t0, num );
    for( int i=0; i<num; i++ ) Serial.printf("%c",spiin_buf_rx[i]);
    Serial.printf("\n");
  }
}

Note that this is more or less your "Blocking big transfer() one by one" example. Some differences: I have explicit bus and pins, I did not override QueueSize, I set a timeout of 1000ms, and I measure the blocking time of the transfer call.

I have another ESP32 connected to it, which acts as a master.
When I enter a text on Serial, it sends it out over SPI to this slave.
But I'm not entering any text for these tests.

This is the output of the slave program.


spi2x - rx
waiting for spi message...
time=1000 num=0 (timeout)
time=1000 num=0 (timeout)
time=4999 num=0 (timeout)
time=5000 num=0 (timeout)
time=5000 num=0 (timeout)
time=5000 num=0 (timeout)
time=5000 num=0 (timeout)
time=5000 num=0 (timeout)
time=5000 num=0 (timeout)
...

If I change the timeout to 10 000 ms (spiin.transfer(spiin_buf_tx, spiin_buf_rx, BUF_SIZE,10*1000)) I get this trace.


spi2x - rx
waiting for spi message...
time=10000 num=0 (timeout)
time=10000 num=0 (timeout)
time=4999 num=0 (timeout)
time=5000 num=0 (timeout)
time=5000 num=0 (timeout)
...

Conclusion: the timeout I specify only works for the first 2 calls, then the library seems to revert to 5000ms.

When I remove the timeout (i.e. it gets the default value of 0, I guess "no-timeout"), the call blocks for ever. That seems correct.

But, I have a second observation, first messages are missed.

On my master, every string I enter over serial is send over SPI. There is one special string "*". When I enter that, the master will send 1, 222, 333, 4444, 55555, 666666, 7777777, 88888888, 999999999 with a delay of 500ms in between.

Here is what the slave shows, when I enter * twice.

time=5000 num=0 (timeout)
time=5000 num=0 (timeout)
time=3333 num=0 (timeout)
time=500 num=3 << 333
time=500 num=4 << 4444
time=500 num=5 << 55555
time=500 num=6 << 666666
time=500 num=7 << 7777777
time=500 num=8 << 88888888
time=500 num=9 << 999999999
time=500 num=1 << *
time=500 num=1 << 1
time=500 num=2 << 22
time=500 num=3 << 333
time=500 num=4 << 4444
time=500 num=5 << 55555
time=500 num=6 << 666666
time=500 num=7 << 7777777
time=500 num=8 << 88888888
time=500 num=9 << 999999999
time=1000 num=0 (timeout)
time=1000 num=0 (timeout)
time=4999 num=0 (timeout)

Here is what the slave shows when I enter * once and then later another *.

time=5000 num=0 (timeout)
time=5000 num=0 (timeout)
time=5000 num=0 (timeout)
time=5000 num=0 (timeout)
time=5000 num=0 (timeout)
time=5000 num=0 (timeout)
time=5000 num=0 (timeout)
time=1103 num=0 (timeout)
time=500 num=3 << 333
time=500 num=4 << 4444
time=500 num=5 << 55555
time=500 num=6 << 666666
time=500 num=7 << 7777777
time=500 num=8 << 88888888
time=500 num=9 << 999999999
time=1000 num=0 (timeout)
time=1000 num=0 (timeout)
time=4999 num=0 (timeout)
time=5000 num=0 (timeout)
time=2342 num=2 << 22
time=500 num=3 << 333
time=500 num=4 << 4444
time=500 num=5 << 55555
time=500 num=6 << 666666
time=500 num=7 << 7777777
time=500 num=8 << 88888888
time=499 num=9 << 999999999
time=1000 num=0 (timeout)
time=1000 num=0 (timeout)

Conclusion: the first 2 or 3 messages after a timeout are not received by the slave.

Any suggestions?

Thanks.

@hideakitai
Copy link
Owner

Fixed by #29

Please try v0.4.2

@maarten-pennings
Copy link
Author

maarten-pennings commented Feb 12, 2024 via email

@maarten-pennings
Copy link
Author

Sorry, lot's of wokr just after my trip.
I did a quick check with 0.4.2.

I use timeout of 1000, that seems now stable - thanks!

When I enter a * on the master, it sends and the slave receives the * and all 10 digit sequences - thanks!

When I enter two *s, I get all 22 sequences - thanks.

spi2x - rx 0.4.2
waiting for spi message...
time=1000 num=0 (timeout)
time=1000 num=0 (timeout)
time=1000 num=0 (timeout)
time=1000 num=0 (timeout)
time=1000 num=0 (timeout)
time=1000 num=0 (timeout)
time=1000 num=0 (timeout)
time=1000 num=0 (timeout)
time=1000 num=0 (timeout)
time=1000 num=0 (timeout)
time=54 num=1 << *
time=500 num=1 << 1
time=500 num=2 << 22
time=500 num=3 << 333
time=500 num=4 << 4444
time=500 num=5 << 55555
time=500 num=6 << 666666
time=500 num=7 << 7777777
time=500 num=8 << 88888888
time=500 num=9 << 999999999
time=1000 num=0 (timeout)
time=1000 num=0 (timeout)
time=1000 num=0 (timeout)
time=708 num=1 << *
time=500 num=1 << 1
time=500 num=2 << 22
time=500 num=3 << 333
time=500 num=4 << 4444
time=500 num=5 << 55555
time=500 num=6 << 666666
time=500 num=7 << 7777777
time=500 num=8 << 88888888
time=500 num=9 << 999999999
time=500 num=1 << *
time=500 num=1 << 1
time=500 num=2 << 22
time=500 num=3 << 333
time=500 num=4 << 4444
time=500 num=5 << 55555
time=500 num=6 << 666666
time=500 num=7 << 7777777
time=500 num=8 << 88888888
time=500 num=9 << 999999999
time=1000 num=0 (timeout)
time=1000 num=0 (timeout)
time=1000 num=0 (timeout)
time=1000 num=0 (timeout)
time=1000 num=0 (timeout)
time=1000 num=0 (timeout)

This release seems to fix all my problems.

Tomorrow I have time for a more thorough test but wanted to let you know your work is appreciated.

@maarten-pennings
Copy link
Author

All looking good.

I'm now also using queue/trigger.

I was wondering, if hasTransactionsCompletedAndAllResultsReady is not true after some time (because I know a SPI messages should have been sent), can I then dequeue (drop, cancel).

@hideakitai
Copy link
Owner

It’s a strange behavior. Could you enable the debug log by setting the log level to “debug” from Tool > Core Debug Level in Arduino IDE? Then, please let me know the log when it occurs.

@maarten-pennings
Copy link
Author

I think we have a misunderstanding. I did't have a problem. I had I question.

The scenario

  • queue and trigger one transaction
  • do something else (and/or wait)
  • then check if the transaction completed
  • if not, there was a problem.
  • question: can I unqueue the transaction (to free the buffer)

@hideakitai
Copy link
Owner

All transactions that were queued at the time of triggering are executed. Therefore, there is no concept of unqueue. If a transaction fails, the received bytes is currently set to zero. However, it might be a good idea to set the error code to received bytes with a negative value in case of failure.

@hideakitai
Copy link
Owner

#30 added the feature to return an error code if the transaction failed. Please try v0.5.0.

@maarten-pennings
Copy link
Author

I saw int64_t instead of size_t. I must say that that doesn't feel good for embedded software. How big are SPI tranfers? 8 bytes, 1k byte? Why isn't an int enough for the sizes?

@hideakitai
Copy link
Owner

error code is 32bit, so it should be 64bit. But certainly it should be improved so I will try it in another issue. If you have solved the original issue (timeout crash), please close this issue.

@maarten-pennings
Copy link
Author

Thanks for the support.
My system is working now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants