Serial USB: writing messages without reading causes the board to stop responding #182

Open
dgrat opened this issue Dec 6, 2013 · 36 comments

@dgrat

dgrat commented Dec 6, 2013

http://stackoverflow.com/questions/20360432/arduino-serial-timeouts-after-several-serial-writes/20382547#20382547
There I made an example and it would be nice if someone tries to reproduce it.

@ntruchsess

I didn't try to reproduce using your code, but I've noticed this behaviour in my own projects as well. Writing to Serial without reading resulted in corrupted (overwritten) memory in other places. For that reason I always use an '#ifdef DEBUG' to disable Serial debug messages before using it in production (which saves a few kB as well).

@dgrat
Author

dgrat commented Jan 9, 2014

What I plan to do is to connect my Arduino with a Raspberry Pi over USB. I want to implement a simple JSON-based communication protocol. So the Arduino has to send messages to my Raspberry Pi and the other way round. I cannot disable serial writes. I am not sure where exactly the bug is located. But the fact that something that obvious is not corrected tells me that it may be the USB processor on the Arduino board and not a simple driver bug.
My workaround so far is to make sure that everything is read, and to always clear the input buffer on the operating system side! But this cannot be a solution, because frequently clearing the input buffer has side effects as well.
Unfortunately, the lack of a function to clear the output buffer on the board side makes it impossible to take care of such problems, because you can never be sure that your data is read on the other side, and if, as in my case, the board has to fly a quadrocopter, then it is pretty dangerous too.

@dgrat
Author

dgrat commented Jan 9, 2014

This was btw my minimal example, which was deleted in stackoverflow, because it was not question related (I wonder why not all of the source is printed on this forum?! brackets after include don't work):

I made a minimal example which shows one of the problems. As long as I use the serial monitor or a client reading the output everything is fine. If I close the serial monitor or my python script while the firmware/board is running I get a timeout. If I reduce the amount of text, e.g. to a single "FUCK IT", it takes very long to happen. If I send that much text, it takes just seconds for the timeout to happen. I doubt this firmware is without obvious bugs and a mistake in the client (serial monitor or my script) can be excluded as the cause, because these programs prevent the error from happening and the serial monitor is included in the Arduino SDK. I see just two possibilities atm:

My board is defective
There is a bug in the SDK
Can someone try to reproduce it with an adapted Arduino firmware?! I don't have hardware for comparison and the standard Arduino SDK is slightly different: instead of "hal.console->printf()" it would be e.g. "Serial.write()". Just start such a firmware, open the serial monitor, watch the tx/rx LED blink every 0.1s, then close the serial monitor and wait to see whether the LED lights up constantly.

#include <AP_Common.h>
#include <AP_Param.h>
#include <AP_Progmem.h>
#include <AP_HAL.h>
#include <AP_HAL_AVR.h>


// ArduPilot Hardware Abstraction Layer
const AP_HAL::HAL& hal = AP_HAL_AVR_APM2;


inline void foo_loop() {
  static int timer = 0;
  int time = hal.scheduler->millis() - timer;

  if(time > 100) {  
    hal.console->printf("FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT!\n");
    timer = hal.scheduler->millis();
  }
}

void setup() {
  hal.uartA->begin(115200);
}

void loop() {
  foo_loop();
}

AP_HAL_MAIN();

@dgrat
Author

dgrat commented Jan 9, 2014

Is here btw a possibility to mark it as a severe bug?

@matthijskooijman
Collaborator

I guess only the developers do that. Also, given that you're the first to report this bug, I'm not sure it warrants being marked as "severe", whatever that would mean exactly.

You seem to be using some HAL library instead of the regular Arduino Serial object and millis() function. Does the problem also occur when you remove this HAL and use Arduino directly?

Furthermore, you say:

I doubt this firmware is without obvious bugs

What firmware are you referring to?

Also, you say:

If I close the serial monitor or my python script while the firmware/board is running I get a timeout.

What kind of timeout is this? Looking at the code, it only does serial output, so if you close the serial monitor, you'll have no way to see what is happening inside your Arduino. How can you then conclude there is any problem in there?

Finally, it would be best if you could edit your comment above and indent the code you pasted by at least 4 spaces, so github will show it in a code block and not eat the <> tags in your includes.

@cmaglie
Member

cmaglie commented Jan 9, 2014

I've edited the comment for you. BTW, the considerations made by @matthijskooijman are valid: what board are you working on, which libraries are you using and what Arduino IDE version?

C

@dgrat
Author

dgrat commented Jan 9, 2014

What firmware are you referring to?

Just an empty one with:
inline void foo_loop() {
static int timer = 0;
int time = hal.scheduler->millis() - timer;

if(time > 100) {
hal.console->printf("FOOFOOFOOFOOFOOFOOFOOFOOFOOFOOFOOFOOFOOFOOFOOFOOFOOFOOFOOFOOFOOFOOFOOFOOFOOFOOFOOFOOFOOFOOFOOFOOFOOFOOFOOFOOFOOFOOFOOFOOFOOFOOFOOFOOFOOFOOFOOFOOFOOFOOFOOFOO!\n");
timer = hal.scheduler->millis();
}
}

What kind of timeout is this? Looking at the code, it only does serial output, so if you close the serial monitor, you'll have no way to see what is happening inside your Arduino. How can you then conclude there is any problem in there?

Well, I was writing my own programs for that. The first problem I noticed was when I wrote a program which sends to the board without clearing the input buffer. After some time I got processing times on the order of several seconds for one command (on the order of 100 bytes or so). After adding a command to clear the input buffer everything was fine and the board continued to respond.
Now the other way: when my board sends data it also shows similar behavior if no program is reading the data. E.g. if I used my script just for sending but not for reading what's inside the serial buffer, then I would get timeouts, if they are set.
In general you can also see this on the LED of the board: instead of blinking from time to time when data is sent, it then glows continuously. The last point is reproducible with the serial monitor (in my case at least).

I've edited the comment for you. BTW, the considerations made by @matthijskooijman are valid: what board are you working on, which libraries are you using and what Arduino IDE version?

Well and this was the final reason for me to ask around for someone to reproduce it! I work with an APM 2.5 from DIY drones with modified SDK 1.03. https://github.com/diydrones/ardupilot/tree/master/ArduCopter
So would PLEASE someone use this easy printf example and test it with another official Arduino board? The changes in code would be small.

@dgrat
Author

dgrat commented Jan 9, 2014

Atm I use a server script like this to communicate with the board.
If you want to do test stuff, you can adapt my server script. It is working more or less pretty fine and avoids causing these problems, by reading and deleting buffer from time to time.
It should work with any Arduino board and for repro, it can be modified.
https://code.google.com/p/rpicopter/source/browse/RPiQuadroServer.py

@matthijskooijman
Collaborator

 static int timer = 0;
 int time = hal.scheduler->millis() - timer;

millis() returns an unsigned long, but you're storing this in an int. int goes up to 32767, so after about 32 seconds, things will start to behave weirdly. I'm wondering if this might be responsible for perhaps even all of the problems you're seeing? Perhaps you can change these to unsigned long and test again?

As for trying to reproduce, it would be easier to remove this HAL thing and just use the Arduino API directly. That removes code and thus complexity. Having said that, I'd consider including this ArduCopter thing and trying your sketch as-is, but I'm still not sure what to test. If I upload your code and then not open (or open and close) the serial monitor, how can I tell it is or isn't working? Please be more specific about this.

@dgrat
Author

dgrat commented Jan 9, 2014

Well, this unsigned conversion stuff is not nice, I admit it, but you calculated it wrong (it is way more than 32s until unexpected behavior, approx. 596h, since it is a 32-bit int). Unfortunately it does not influence the observed problem anyway.

Edit: At least in my case I deal with 32 bit ints :)

@matthijskooijman
Collaborator

Huh? "int" on AVR is 16-bit, not 32 like on x86/amd64. long is 32-bit.
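
To put numbers on the two claims above, here is a small sketch in Python (not AVR code) that emulates storing a millisecond counter in a signed integer of a given width. The helper names `wrap_signed` and `elapsed_unsigned` are made up for this illustration and do not come from the Arduino core.

```python
def wrap_signed(value, bits):
    """Emulate storing `value` in a two's-complement signed integer of `bits` width."""
    mask = (1 << bits) - 1
    value &= mask
    # Values with the top bit set represent negative numbers.
    if value >= 1 << (bits - 1):
        value -= 1 << bits
    return value


def elapsed_unsigned(now, start, bits=32):
    """Elapsed time using unsigned modular subtraction, as with uint32_t."""
    return (now - start) & ((1 << bits) - 1)


# A 16-bit int holds at most 32767 ms, so the stored time goes negative after ~32.8 s.
print(wrap_signed(32767, 16), wrap_signed(32768, 16))
# A 32-bit signed int would last 2**31 ms, roughly 596.5 hours.
print(round(2**31 / 1000 / 3600, 1))
# Unsigned subtraction stays correct even when the counter itself wraps around.
print(elapsed_unsigned(50, 2**32 - 50))
```

Both figures in the discussion are therefore right for their respective widths; the question is only which width `int` has on the target.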

@dgrat
Author

dgrat commented Jan 9, 2014

Hmm, I was wrong in that. I change code and test it again.

@dgrat
Author

dgrat commented Jan 9, 2014

I exchanged the int with uint32_t. The timeout problem is still there.
But I will think more about 32/16 bit and signed conversions in future.

@dgrat
Author

dgrat commented Jan 12, 2014

God, I really don't understand this talking around the problem. I posted some example code just for illustration. You can easily remove the timer or replace the int with an unsigned and the problem will still be there. And no one is willing to reproduce it, even if it's just 5 lines of code.

@matthijskooijman
Collaborator

Ok, let me be more clear, then.

  • I think that the example code you posted is too complex, because it uses the Ardupilot library. That introduces extra complexity, and might even be causing the bug.
  • I have asked you to further reduce the example code to not use the Ardupilot library to a) make debugging easier (less code) and b) rule out that Ardupilot is causing the bug. You have not done this.
  • It is still not entirely clear to me how the example you posted can be used to reproduce the problem (e.g., what steps to take and what the expected and actual results are). I've asked for clarification, but you haven't responded to that.
  • I pointed out the 16/32 bit problem, because I thought that it might have influenced your testing and perhaps the problem was also present in your original code (but I can't know that for sure). In any case, the problem should be fixed to confirm that it is indeed not causing the timeout problem.
  • You seem to be the first one to report this problem. I myself have been using sketches with debug output, which run perfectly with or without a serial console attached. To me, this means that it's unlikely that the problem is as general as you originally suggested and that it's not unlikely that there is a problem specific to your setup or even that the problem is caused by wrong expectations.
  • I'm perfectly willing to try and reproduce the problem, but I don't want to spend a lot of time on this. Sure, it's easy to remove the timer, or replace the int, or rewrite the code to not use Ardupilot. However, it does take some time and, more importantly, if I change anything about the code and the problem does not occur then I can't be sure if that is because I changed the code, or if there is something else going on. Hence, please provide a minimal example with clear instructions on what to do and what to expect.
  • Saying "I don't want to spend a lot of time on this" might sound unfriendly, but it's just how things are. I'm helping you out here voluntarily and I really have a lot of other things to do as well. As for the official developers (which I'm not one of), they have to handle all of the bugs and patches reported, which I think is a few dozen every week. There is limited time, so if you want someone to help fix your particular problem, you'll have to invest some extra time to make things as easy as possible for that someone.

Saying all this probably sounds harsh, but I'm just trying to help you understand how things normally work and why things appear as they do.

For future reference, the APM2.5 board that dgrat is using is specified to use "Atmel’s ATMEGA2560 and ATMEGA32U-2 chips for processing and usb functions respectively". This means that an Arduino Mega is probably best suited for trying to reproduce this problem.

@ntruchsess

@matthijskooijman: As I wrote before, @dgrat is not the only one. For me this issue was not very important since my use cases used Serial for debugging only (and there I can easily turn it off for production, which is the reason I never investigated any further but just accepted that it sometimes makes a difference whether the serial data is read or not).
But I entirely agree: we need a small example that reproduces the issue. An example that one can run and investigate without debugging within some other third-party library.

  • Norbert

@dgrat
Author

dgrat commented Jan 12, 2014

Well, atm it is not so easy for me to write a super minimal example because I use a DIY board with some changes in the SDK. I will try to write my own build script with the standard library and see how it works. For someone else, on the other hand, it would be easy to do.
But I add at this point that the HAL library is just an abstraction layer. That means: forwarding some functions and adding functionality in other parts. It is very likely that the error will be there anyway if I try it without the HAL. I mean, I cannot prove it right now, but my assumption is not unlikely.

@dgrat
Author

dgrat commented Jan 12, 2014

Ah forget what I wrote before. It is working with the standard SDK totally fine! ERROR IS STILL THERE
Arduino SDK 1.0.5 just for info.

/*
  AnalogReadSerial
  Reads an analog input on pin 0, prints the result to the serial monitor.
  Attach the center pin of a potentiometer to pin A0, and the outside pins to +5V and ground.

 This example code is in the public domain.
 */

inline void foo_loop() {
  static uint32_t timer = 0;
  uint32_t time = millis() - timer;

  if(time > 100) {  
    Serial.print("FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT!\n");
    timer = millis();
  }
}


void setup() {
  Serial.begin(115200);
}

void loop() {
  foo_loop();
}

@matthijskooijman
Collaborator

It is working with the standard SDK totally fine! ERROR IS STILL THERE

Huh? You're saying it works fine and there is an error? I'm sorry, but I'm not sure what you mean here...

@dgrat
Author

dgrat commented Jan 12, 2014

I was not sure whether upload to APM 2.5 works with standard Arduino SDK.
It is uploading the sketch and the problem is still there with my example.

Just to illustrate what I plan to do with this board in reality: https://code.google.com/p/rpicopter/
I use the USB interface for communication with the RPi, and the current state is that there is an issue on the Arduino side.

@dgrat
Author

dgrat commented Jan 12, 2014

And to end this. I did everything.
I made easy examples, even a nice python template is available
I described very detailed what happens in different scenarios
I even made a conclusion
That's it for me

@matthijskooijman
Collaborator

Right, so you mean that using the standard Arduino API/SDK things are working (compiling, uploading, roughly doing what you'd expect), but that the "timeout problem" also occurs with the sketch you pasted. Did I get that right?

Good that there is now a minimal Arduino API-only sketch that shows the problem. However, it is still not entirely clear how to use this sketch to reproduce the problem (or rather, I might not have understood the problem completely correctly yet...).

If I'd upload the sketch and attach the serial monitor, I'd expect I would see the text printed in the monitor every 100ms. IIUC, this works as expected. However, when I disconnect the monitor, how can I tell what's happening inside the Arduino? How can I tell this "timeout" is happening?

@dgrat
Author

dgrat commented Jan 12, 2014

However, when I disconnect the monitor, how can I tell what's happening inside the Arduino? How can I tell this "timeout" is happening?

  1. Write a program which is not reading but e.g. sending and measure time for sending.
  2. Or take a look at the LED to see whether it is blinking every 100ms. If it stops blinking at that interval you can be sure you have the same problem and verify with 1).

And don't forget that I can also cause these timeouts (as I wrote) just by sending rapidly to my Arduino messages without deleting the input buffer. If I would set here "writeTimeout" to anything like 0.5s I would get a timeout after some time of sending. This is an example which should work to cause problem, but atm I am not at home to test it again, I just typed it now.

# PYTHON example
import serial
import time

# On your distribution it may be another device and not ttyUSB0
ser = serial.Serial('/dev/ttyUSB0', 57600) # set writeTimeout=* here to something

# main loop
def main():
  while True:
      # pyserial expects bytes, so send a bytes literal
      ser.write(b"foooooooooooooooooooooooooooooooooooooooofoooooooooooooooooooooooooofoooooooooooooooooooooooooooooooooooooooofoooooooooooooooooooooooooo")

main()

However, with a "ser.flushInput()" at the end, no timeout happens. This is the strangest part.
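
Step 1 above (measure how long each send takes) could be sketched roughly as follows. The helpers `timed_write` and `find_slow_writes` are hypothetical names for this illustration; they only assume that `port` has a `write()` method, as a pyserial `Serial` object does.

```python
import time


def timed_write(port, payload, clock=time.monotonic):
    """Write `payload` to `port` and return how long the call took, in seconds."""
    start = clock()
    port.write(payload)
    return clock() - start


def find_slow_writes(port, payload, count, limit_s, clock=time.monotonic):
    """Return the indices of writes that took longer than `limit_s` seconds."""
    slow = []
    for index in range(count):
        if timed_write(port, payload, clock) > limit_s:
            slow.append(index)
    return slow
```

With pyserial, `port` would be something like `serial.Serial('/dev/ttyUSB0', 57600, write_timeout=0.5)` (the parameter is spelled `writeTimeout` in older pyserial releases, as in the example above). In that case a write that exceeds the timeout raises `serial.SerialTimeoutException` instead of merely taking long, so the caller would need to catch it.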

edit: I hope it is better understandable now

@dgrat
Author

dgrat commented Jan 13, 2014

I can also reproduce it with other APM boards, btw. They all show similar behavior if they get stressed as explained.
Is there any firmware available for the USB controller of the Arduino? I think they all share the same chip for this. I want to check it myself. And I will try to collect debugging info for the CDC ACM module.

@matthijskooijman
Collaborator

Great, that looks like something I can use :-)

I tried this on an Arduino Mega and I think I can reproduce the problem now. However, one remark about the TX/RX leds: You shouldn't be using those as an indication of whether the main MCU is still running. These leds are controlled by the 32u2 that does USB->Serial conversion. It seems that, if nobody is listening on the USB side and thus the TX buffer of the 32u2 is full, the led stops blinking and remains full-on (though it occasionally blinks off, not sure why). This does not mean that the main MCU has stopped running or sending serial data. There is no flow control between the main MCU and the 32u2, so the main MCU will keep sending data, which is dropped by the 32u2.

For this reason, I added a pin 13 blink to the sketch. Here's what I tested with:

/*
  AnalogReadSerial
  Reads an analog input on pin 0, prints the result to the serial monitor.
  Attach the center pin of a potentiometer to pin A0, and the outside pins to +5V and ground.

 This example code is in the public domain.
 */

inline void foo_loop() {
  static uint32_t timer = 0;
  uint32_t time = millis() - timer;

  if(time > 1000) {  
    Serial.print("FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT! FUCK IT!\n");
    timer = millis();
    digitalWrite(13, !digitalRead(13));
  }
}


void setup() {
  Serial.begin(115200);
  pinMode(13, OUTPUT);
}

void loop() {
  foo_loop();
}

I also increased the time to 1000ms, to make the blinking a bit more
visible.

Now, if I just run the Arduino with or without serial console open, the pin 13
led blinks with a 2 second period as expected. However, when I run the
python program you pasted, the led immediately turns on and stays on
while the program is running (regardless of the current state, I can't
make it stay off instead). After I quit the python program, the led
stays on for 10 seconds (it seems to be 10 seconds every time,
regardless of how long the python program has been running).

So, there's definitely something fishy going on there.

I tested this both with 1.0.5 (Debian version) as well as ide-1.5.x
trunk with my HardwareSerial patch series applied, the problem occurs
with both.

Thinking on this a bit more, I think I understand what is happening.
When the python program opens up the serial console, the 32u2 resets the
Arduino / main MCU (this is needed to allow uploading sketches through
the bootloader). Then, the python program starts writing data, directly.
The bootloader doesn't recognize any commands in the data, but as long
as it keeps receiving data, it will not timeout and continue trying to
find commands in the data stream.

Specifically, it will be caught in this loop indefinitely:

https://github.com/arduino/Arduino/blob/ide-1.5.x/hardware/arduino/avr/bootloaders/stk500v2/stk500boot.c#L673

When the python program is stopped and the data stream stops, this
timeout is triggered:

https://github.com/arduino/Arduino/blob/ide-1.5.x/hardware/arduino/avr/bootloaders/stk500v2/stk500boot.c#L505

The timeout should be just a few seconds, but I suspect it takes 10
seconds due to buffered data on the PC side.
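
The behaviour described here can be illustrated with a deliberately simplified model. This is not the bootloader's actual logic (see the links above for that); it only assumes that every received byte pushes the timeout deadline back. The function name and the timeout value are assumptions made for the illustration.

```python
def bootloader_exit_time(arrival_times, timeout_s):
    """Return when a bootloader that restarts its timeout on every received
    byte would give up and start the sketch.

    `arrival_times` are the times (seconds after reset) at which bytes arrive.
    """
    deadline = timeout_s  # no data at all: give up after one timeout
    for t in sorted(arrival_times):
        if t > deadline:
            break  # the stream paused long enough for the timeout to fire
        deadline = t + timeout_s  # each byte pushes the deadline back
    return deadline


# A host that writes one byte per second right after opening the port
# keeps the (modelled) bootloader busy until the stream stops.
print(bootloader_exit_time(range(0, 60), timeout_s=2))
# A host that waits longer than the timeout before writing never delays it.
print(bootloader_exit_time([3, 4, 5], timeout_s=2))
```

In this model the sketch only starts once the host stops sending, which matches the observation that the led stays on for as long as the python program is running.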

In any case, the fix seems to be easy, just wait for a second (or
better, 2) between opening the serial port and sending data. Can you
see if this also helps to fix your problem, or if there is perhaps
another problem here?

e.g.:

def main():
  time.sleep(2)
  while True:   
      ser.write(b"foooooooooooooooooooooooooooooooooooooooofoooooooooooooooooooooooooofoooooooooooooooooooooooooooooooooooooooofoooooooooooooooooooooooooo")

Gr.

Matthijs

@matthijskooijman
Collaborator

Oh, seems our comments got crossposted. As for the firmware running on the 16u2 / USB chip, see the links in my previous comment. However, you'll have to check with the APM folks to see what sources/version they used exactly, the ones I link are for the Arduino Mega 2560, I think.

@dgrat
Author

dgrat commented Jan 13, 2014

However, you'll have to check with the APM folks to see what sources/version they used exactly, the ones I link are for the Arduino Mega 2560, I think.

I am pretty sure they use the same firmware as it is also Mega 2560 based.

In any case, the fix seems to be easy, just wait for a second (or
better, 2) between opening the serial port and sending data. Can you
see if this also helps to fix your problem, or if there is perhaps
another problem here?

I will try your approach later when I am home and I hope it will work. Thanks so far.

Thinking on this a bit more, I think I understand what is happening.
When the python program opens up the serial console, the 32u2 resets the
Arduino / main MCU (this is needed to allow uploading sketches through
the bootloader). Then, the python program starts writing data, directly.
The bootloader doesn't recognize any commands in the data, but as long
as it keeps receiving data, it will not timeout and continue trying to
find commands in the data stream.

Your explanation sounds plausible. My remote control works as follows: every 20 ms I send a command to a server running on the RPi. The command is parsed there and the server forwards a (shorter) command to the Arduino. There my firmware parses this command and then does the calculations, sensor readout and motor control.
My python script that was sending these commands to the Arduino was not using this wait so far. The strange thing for me is that my firmware worked as expected for some seconds. The Arduino was controlling motors etc. Only after these few seconds did I get timeouts, and then processing of each command took incredibly much longer than it should.
These timeouts disappeared when I used the ser.flushInput() command. So there definitely is a related buffer problem. But I am not sure how this parsing step for the firmware upload could cause the problem I noticed. So far the boot.c file looks complicated to me.
Furthermore my firmware was only sending if data arrived over WiFi, and I think this problem can also happen if I start sending 10 mins after the connection was established. It scales with the size of the message but not with the time at which sending starts.
Well, I am pretty sure the answer is in boot.c. I will test a bit more.

Thanks so far

@matthijskooijman
Collaborator

Reading your comment, I think there might be an additional problem. When the Arduino is running with the serial console detached, the buffer inside the 32u2 will fill up. Now, when you open the serial port from your python program, the main MCU will reset, but (I think) the serial buffer in the 32u2 is not flushed, causing you to read out some old bytes. In your case, I suspect these old bytes might be causing problems. Flushing the buffer on startup is probably the solution here.

So, I'd do:

  • open the serial port
  • flush any bytes you get
  • wait 2 seconds
  • continue with the rest of your program
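
The steps above could be wrapped in a small helper along these lines. This is only a sketch: `open_and_settle` is a hypothetical name, and it assumes `port` is an already opened pyserial 3.x `Serial` object, where `reset_input_buffer()` is the current name for the `flushInput()` call used earlier in this thread.

```python
import time


def open_and_settle(port, settle_s=2.0, sleep=time.sleep):
    """Discard stale input on an opened port, then wait before using it.

    `settle_s` is the pause suggested above (2 seconds); `sleep` can be
    replaced in tests.
    """
    # Drop any old bytes that were buffered before we connected.
    port.reset_input_buffer()
    # Give the bootloader time to time out and start the sketch.
    sleep(settle_s)
    return port
```

A caller would then do something like `ser = open_and_settle(serial.Serial('/dev/ttyUSB0', 57600))` before entering its main loop.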

Btw, the boot.c I linked to is the bootloader, which runs in the main MCU on a reset. You asked for the firmware running in the 32u2 I think, which I think is here: https://github.com/arduino/Arduino/tree/master/hardware/arduino/firmwares/atmegaxxu2

@matthijskooijman
Collaborator

Btw, I agree that my analysis so far wouldn't explain why things work for a few seconds and then start to timeout. However, perhaps things only appear to work (due to buffered bytes?), or perhaps the 16 vs 32bit problem from before also caused this. Best to apply my previous suggestion to at least fix all the problems we've diagnosed so far and then see if any problems remain and if so, see if you can also find a reduced example to reproduce those problems.

@dgrat
Author

dgrat commented Jan 13, 2014

if you can also find a reduced example to reproduce those problems.

I will post them, when I have them ready.

perhaps the 16 vs 32bit problem

I know it was dumb of me and it surely caused undefined behavior as well. Such things always cause undefined behavior.

@ntruchsess

@dgrat: I did investigate a bit. The HardwareSerial actually will hang on write (https://github.com/arduino/Arduino/blob/master/hardware/arduino/cores/arduino/HardwareSerial.cpp#L467) whenever data is not read from the tx_buffer and the tx_buffer is full. So if you happen to implement a protocol that stops reading the serial interface on the PC while transmitting data to the Arduino, you might run into a deadlock where both sides try to send data, each waiting for the other side to pick it up.
As a solution you have to check the serial interface from the PC side regularly and read all outstanding data, even if you don't process it immediately but keep it in a buffer for processing later. (Doing this on the Arduino side would be equivalent but most likely not possible as the available memory is scarce.)

  • Norbert
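
The PC-side advice above (regularly read everything that is outstanding and keep it for later) might look like the following sketch. `drain_pending` is a hypothetical helper; it assumes a pyserial 3.x style port with an `in_waiting` property and a `read(n)` method (older pyserial releases expose `inWaiting()` instead).

```python
def drain_pending(port, backlog):
    """Move every byte currently waiting on `port` into `backlog`.

    `backlog` is a bytearray owned by the caller. Returns the number of
    bytes moved.
    """
    pending = port.in_waiting  # bytes already received by the OS driver
    if pending:
        backlog.extend(port.read(pending))
    return pending
```

Calling this once per iteration of the sending loop keeps the host's receive side empty without throwing data away, unlike a plain input flush.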

@matthijskooijman
Collaborator

@ntruchsess, I think your analysis is wrong. The UART hardware on the Arduino side will keep transmitting bytes, eventually always emptying the tx buffer, even when they're not being read at the remote end. There is no flow control, so the buffer inside the 16u2 that does the serial-to-usb conversion will just overflow in this case, dropping bytes. It should not deadlock.

@ntruchsess

@matthijskooijman We don't know, but that assumption may not hold true for the APM 2.5 board. E.g. this depends on the mode in which XCK is configured.

@dgrat: I think you should try changing the HardwareSerial.write Line 467 to 'if (i == _tx_buffer->tail) return 0;' and see whether this makes a difference.
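
To make the two positions easier to compare, here is a toy Python model of a transmit ring buffer. It is not the Arduino core's code: the class name, the buffer size and the method names are assumptions. `write()` returns 0 when the buffer is full, in the spirit of the change suggested above, and `transmit_one()` stands in for the UART sending a byte whether or not anyone reads it at the remote end.

```python
class TxRingBuffer:
    """Toy model of a transmit ring buffer with head/tail indices."""

    def __init__(self, size=64):
        self.size = size
        self.slots = [0] * size
        self.head = 0  # next slot to write into
        self.tail = 0  # next slot to transmit from

    def write(self, byte):
        """Queue one byte; return 1 on success, 0 if the buffer is full."""
        nxt = (self.head + 1) % self.size
        if nxt == self.tail:
            return 0  # full: report failure instead of busy-waiting
        self.slots[self.head] = byte
        self.head = nxt
        return 1

    def transmit_one(self):
        """Model the UART sending one queued byte; return it, or None if empty."""
        if self.head == self.tail:
            return None
        byte = self.slots[self.tail]
        self.tail = (self.tail + 1) % self.size
        return byte
```

In this model a full buffer only stays full if `transmit_one()` is never called, which is the crux of the disagreement: whether the UART on a given board keeps draining the buffer regardless of the reader.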

@dgrat
Author

dgrat commented Jan 22, 2014

I am still waiting for delivery of my programmer :(

Edit: So far I also have no clue how to replace the bootloader on the 32u2. But maybe I don't need a programmer :)

@odbol

odbol commented Nov 5, 2015

@matthijskooijman is that also true on SAM boards like the Due? I believe I am having this same issue: board runs fine as long as you are not connected, and if you connect Native USB port to a computer it works fine too (sort of). But if you unplug from the computer, the next time the Arduino tries to send data over USB, it freezes up.

@matthijskooijman
Collaborator

@odbol, I'm actually not sure, haven't dug into the Due design much. But given that it uses a native USB port, just like the Leonardo, I think things will be different for the Due.

@sandeepmistry sandeepmistry transferred this issue from arduino/Arduino Sep 16, 2019