uartdmx malf measures from start of frame not Mark after last frame. #1087
Patch is at: http://prive.bitwizard.nl/ola-drain.patch
"default malf is 100 microseconds, but in practice 17 ms is seen."

@richardash1981, do you have any thoughts? Can we get an `olad -l 4` log, @rewolff, just in case it shows anything?
Also, your drain function is commented out in your patch. I take it that's just a typo?
Should have said first time around: of course this is uart-dmx on a Raspberry Pi, in case someone doesn't connect the dots.

I commented it back out to verify 1) that it worked "reasonably" again after removing the drain, and 2) that I was really running the code I had just compiled. And... the patch didn't work. So it can be used as a "framework" for where to add the "drain" call, but it doesn't work "as is".

(Working "reasonably" means that I can reduce the malf to a few milliseconds, as opposed to the 17 milliseconds seen with the default settings. This is done by setting malf to 24 milliseconds (24000 microseconds) in the config file.)

No: when you set malf to 100 microseconds, a delay between the packets of 17 milliseconds is seen. No typo.
I'm not sure the summary is correct (I think it's actually measured from the end of the write, which is indeterminate), but I can well believe that there may be a problem. I should be able to get a DSO onto my Pi near the end of this week and see what timings I am getting. I will admit I didn't pay a lot of attention to the malf control, because it wasn't critical to what I was doing.

The code is mostly copied from plugins/ftdidmx/FtdiDmxThread.cpp, although that seems to have got smarter since I last looked at it! This particular problem won't show itself there, because that code maintains a target frame rate by adjusting the amount of sleeping done, so it doesn't matter how quickly Write() returns. My code assumes that Write() blocks until the bytes are actually sent, which might not be true (and seems not to be).

Calling tcdrain() would seem like the right solution, so I'm surprised it didn't work, but there is obviously more than one strange thing going on here, because without a drain I would expect successive frames of DMX to get concatenated back to back as the OS buffers fill up. That depends on what the break ioctls do with buffered data, which isn't documented that I can see. Looking at the kernel source, it seems that starting the break (at the start of the frame) will flush the buffer out, which may be the only reason it works: http://lxr.free-electrons.com/source/drivers/tty/tty_io.c#L2873

I would be tempted to ditch m_granularity from the UART code; it would make the code quite a bit simpler to read, and I don't believe that BAD mode actually works (the initialisation check can stay and issue a warning if measurement results are poor). But I don't think this is related to the problem at hand.

Some instrumentation of how long the various sleeps actually take would also be good to have, and might help us make sense of what is going on. The log doesn't tell us much.
First: yes! The write returns immediately, around the beginning of the bytes being sent out. Then the usleep(malf) sleeps for 100 microseconds: 25 bits, or less than 3 bytes.

Reading your analysis made me realize that calling break immediately and my tcdrain caused very similar behaviour: the kernel uses a very coarse granularity before it unblocks the thread waiting for the "drain" to happen. So you're probably right that break calls "drain".

In the old days, saving interrupts was the issue. I've written Linux serial device drivers that would run off a 100 Hz timer, because handling everything 100 times a second is more efficient than interrupting on every character (or every buffer-half-full). I suspect that this is a leftover from that time. The uart driver should, when handling "TX buffer has room", notice the "no further characters in the buffer" situation and, if someone is waiting for drain, wake them up.

It seems we're coming to the conclusion that this is a restriction of the serial driver on the Raspberry Pi. But if we make the code "correct" by adding the tcdrain call, the actual achievable performance goes down the drain. So... my conclusion is: please don't fix this bug! On the other hand, we could calculate the "sleep time" as:
That would be a "working" fix.
Your "working" fix is effectively what the FTDI output plugin does (although it targets a frame rate rather than a delay, so the code differs slightly), and it was what I was starting to think of doing (because it doesn't need kernel changes). I would still like to check that the tcdrain() call really adds so much excess delay, because it would seem to be a bug in the serial driver (which is fairly close to being Pi-specific) for it to do so. I haven't looked at the code path to see if it is obvious why!
I found this in the kernel source, drivers/tty/tty_io.c: the comment says that tcdrain effectively sends a zero-length break. That explains the similarity between the two behaviours that we see. :-)
That said, I've traced the code to ___wait_event, which suspends a thread until a condition is met. I didn't investigate further. From what we see, it seems the condition is only checked periodically.
I did get to check this with my Pi2B and work's digital scope. I've put some error counters into my branch at https://github.com/richardash1981/ola to see if we are getting any system call failures (at the moment we wouldn't know), and I intend to add some timing statistics to this and re-test this week. That will then let us measure whether any changes we make actually help!
Oh, I've seen the "wrong baud rate" problem too. I didn't investigate further, so I wasn't sure enough to report it.
I don't know if you reported that, @richardash1981, but @rbalykov found it in #1523. With no frame data, we just send break, start code, MALFT continually!
Edit: Should have said first time around: of course this is uart-dmx on a Raspberry Pi.
Default malf is 100 microseconds, but in practice 17 ms is seen.
Setting malf to 24000 results in only a few ms of inter-frame delay. So malf counts from close to the beginning of the frame, instead of from the end.
It seems that a "wait for buffer drain" needs to be added after the write call in UartDmxThread.cpp.
(Quick research suggests it might be necessary to change the tty settings with an O_DRAIN option set, but it might be as simple as a tcdrain.)
Hmm, I think I've produced a patch. My compilation takes a few hours (I haven't figured out how to do working partial compiles yet), and I have to leave before it finishes. Bug me about the patch if it doesn't show up here in 24 hours.