
uartdmx malf measures from start of frame not Mark after last frame. #1087

Open
rewolff opened this issue Jun 1, 2016 · 11 comments

@rewolff
Contributor

rewolff commented Jun 1, 2016

Edit: Should have said first time around: of course this is uart-dmx on a Raspberry Pi.

The default malf is 100 microseconds, but in practice a 17 ms inter-frame delay is seen.

Setting malf to 24000 results in only a few ms of inter-frame delay. So malf counts from close to the beginning of the frame instead of from the end.

It seems that a "wait for buffer drain" needs to be added after

 if (!m_widget->Write(buffer))
      goto framesleep;

in UartDmxThread.cpp.
(Quick research suggests that it might be necessary to change the tty settings with an O_DRAIN option set, but it might be as simple as a tcdrain; a sketch follows below.)
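For illustration, a hedged sketch of that change; the Descriptor() accessor is hypothetical, standing in for however the plugin actually exposes the underlying serial file descriptor:

  // Sketch only (needs <termios.h>, <cerrno>, <cstring>, ola/Logging.h).
  if (!m_widget->Write(buffer))
    goto framesleep;
  // Block until the kernel has clocked the whole frame out of the UART,
  // so the malf sleep that follows measures from the end of the frame,
  // not the start. Descriptor() is a hypothetical accessor for the fd.
  if (tcdrain(m_widget->Descriptor()) < 0)
    OLA_WARN << "tcdrain() failed: " << strerror(errno);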

Hmm. I think I've produced a patch. My compilation takes a few hours (I haven't figured out how to do working partial compiles yet), and I have to leave before it finishes. Bug me about the patch if it doesn't show up here within 24 hours.

@rewolff
Contributor Author

rewolff commented Jun 1, 2016

Patch is at: http://prive.bitwizard.nl/ola-drain.patch
The good news is that it compiled in one go even though I'm not that familiar with C++. The bad news (you know there is always bad news coming when someone starts off with the good news) is that it doesn't work. :-(
It seems that tcdrain waits way longer than the time required for draining the buffer.
I'm out of ideas at the moment.

@peternewman
Member

"default malf is 100 microseconds, but in practise 17ms is seen."
Do you mean 17 u(micro)s rather than milliseconds?

@richardash1981 do you have any thoughts?

Can we have an olad -l 4 log, @rewolff, just in case it shows anything?

@peternewman
Member

Also your drain function is commented out in your patch. I take it that's just a typo?

@rewolff
Contributor Author

rewolff commented Jun 1, 2016

Should have said first time around: of course this is uart-dmx on a Raspberry Pi, in case someone doesn't connect the dots.

I commented it back out to verify 1) that it worked "reasonably" again after removing the drain, and 2) that I was really running the code I had just compiled. And... the patch didn't work. So it can be used as a "framework" for where to add the "drain" call, but it doesn't work as-is.

(Working "reasonably" means I can reduce the inter-frame delay to a few milliseconds, as opposed to the 17 milliseconds seen with the default settings, by setting malf to 24 milliseconds (24000 microseconds) in the config file.)

No: when you set malf to 100 microseconds, a delay of 17 milliseconds between the packets is seen. No typo.
olad-l4.txt

@richardash1981
Contributor

richardash1981 commented Jun 5, 2016

I'm not sure the summary is correct (I think it's actually measured from the end of the write, which is indeterminate), but I can well believe that there may be a problem. I should be able to get a DSO onto my Pi near the end of this week and see what timings I am getting. I will admit I didn't pay a lot of attention to the malf control, because it wasn't critical to what I was doing. The code is mostly copied from plugins/ftdidmx/FtdiDmxThread.cpp, although that seems to have got smarter since I last looked at it!

This particular problem won't show itself there, because that code maintains a target frame rate by adjusting the amount of sleeping done, so it doesn't matter how quickly Write() returns. My code assumes that Write() blocks until the actual bytes are sent, which might not be true (and seems not to be). Calling tcdrain() would seem like the right solution, so I'm surprised it didn't work, but there is obviously more than one strange thing going on here, because without a drain I would expect successive frames of DMX to get concatenated back to back as the OS buffers fill up.

That said, it depends on what the break ioctls do with buffered data, which isn't documented anywhere that I can see. Looking at the kernel source, it seems that starting the break (at the start of the frame) will flush the buffer out, which may be the only reason it works: http://lxr.free-electrons.com/source/drivers/tty/tty_io.c#L2873
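To make the sequencing concrete, this is roughly the per-frame pattern at the termios level (a hedged sketch with illustrative names, not the plugin's actual code):

  ioctl(fd, TIOCSBRK);     // start break; per the kernel reading above, this
                           // first flushes any data still buffered for TX
  usleep(break_usec);      // hold the break (DMX requires >= 88 usec)
  ioctl(fd, TIOCCBRK);     // end break: mark-after-break follows
  usleep(mab_usec);
  write(fd, frame, len);   // returns once buffered, not once transmitted...
  usleep(malf_usec);       // ...so this sleep overlaps the transmission

That is consistent with malf appearing to count from near the start of the frame, with the next frame's break doing the real draining.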

I would be tempted to ditch m_granularity from the UART code; it would make the code quite a bit simpler to read, and I don't believe that BAD mode actually works (the initialisation check can stay and issue a warning if measurement results are poor). But I don't think this is related to the problem at hand. Some instrumentation of how long the various sleeps actually take would also be good to have, and might help us make sense of what is going on; a sketch follows below.
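A hedged sketch of what such instrumentation could look like (illustrative, not code from any existing branch):

  #include <time.h>

  struct timespec before, after;
  clock_gettime(CLOCK_MONOTONIC, &before);
  tcdrain(fd);  // or the usleep()/nanosleep() under test
  clock_gettime(CLOCK_MONOTONIC, &after);
  long waited_usec = (after.tv_sec - before.tv_sec) * 1000000L
                   + (after.tv_nsec - before.tv_nsec) / 1000L;
  // Compare waited_usec against the expected drain/sleep time; a large,
  // quantised overshoot would point at a coarse timer in the driver.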

The log doesn't tell us much, because it contains:

olad/PluginManager.cpp:73: Skipping UART native DMX because it was disabled

Are you sure this is what you are using?

@rewolff
Contributor Author

rewolff commented Jun 6, 2016

First: Yes! The write returns immediately, around the beginning of the bytes being sent out. Then the usleep(malf) sleeps for 100 microseconds, which at 250 kbaud is only 25 bit times, or less than 3 bytes.

Reading your analysis made me realize that calling break immediately and my tcdrain cause very similar behaviour: the kernel uses a very coarse granularity before it unblocks the thread waiting for the drain to happen.

So you're probably right that break calls "drain".

In the old days, saving interrupts was the issue; I've written Linux serial device drivers that ran off a 100 Hz timer, because handling everything 100 times a second is more efficient than interrupting on every character (or on every buffer-half-full). I suspect that this is a leftover from that time.

The UART driver should, when handling a "TX buffer has room" interrupt, notice the "no further characters in the buffer" situation and, if someone is waiting for a drain, wake them up.

It seems we're coming to the conclusion that this is a restriction of the serial driver on the Raspberry Pi. But if we make the code "correct" by adding the tcdrain call, the actual achievable performance goes down the drain. So... my conclusion is: please don't fix this bug!

On the other hand, we could calculate the sleep time as:

  nbits = nbytes * (1 + 8 + 2);  // 1 start bit, 8 data bits, 2 stop bits (DMX512)
  time_per_bit = 1000000 /* usec per sec */ / BAUD_RATE;
  sleep_time = nbits * time_per_bit + malf;

That would be a "working" fix.
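For a full frame, that works out as follows (a sketch, assuming 513 bytes, the start code plus 512 slots, at the DMX rate of 250 kbaud):

  nbits = 513 * 11;                 // 5643 bits per full frame
  time_per_bit = 1000000 / 250000;  // 4 usec per bit
  sleep_time = 5643 * 4 + malf;     // 22572 usec + malf, ~22.6 ms of data

plus the break and mark-after-break at the start of the next frame.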

@richardash1981
Contributor

Your "working" fix is effectively what the FTDI output plugin does (although it's targeting a frame rate not a delay, so slightly different code), and was what I was starting to think of doing (because it doesn't need kernel changes). I would still like to check that the tcdrain() call really adds so much excess delay, because it would seem to be a bug in the serial driver (which is fairly close to being pi-specific) for it to do so - I haven't looked at the code path to see if it is obvious why!

@rewolff
Contributor Author

rewolff commented Jun 8, 2016

I found this in the kernel source, drivers/tty/tty_io.c. The comment says that tcdrain is handled by the same TCSBRK path: a non-zero arg means wait for all output to drain but send no break, effectively a zero-length break. That explains the similarity between the two behaviours that we see. :-)

        case TCSBRK:   /* SVID version: non-zero arg --> no break */
                /* non-zero arg means wait for all output data
                 * to be sent (performed above) but don't send break.
                 * This is used by the tcdrain() termios function.
                 */
                if (!arg)
                        return send_break(tty, 250);
                return 0;

That said, I've traced the code to ___wait_event, which suspends a thread until a condition is met. I didn't investigate further, but from what we see it seems the condition is only checked periodically.

@richardash1981
Contributor

I did get to check this with my Pi2B and work's digital scope:
[scope screenshot: tek00000]

So I agree: there is a lot of wasted time after each frame; the time taken is 40 ms per frame instead of the 23 ms per frame the data takes up. It looks like there is a 20 ms timer somewhere which we are ending up waiting on!
I also found some sort of initialisation bug which meant that the output was completely wrong until something (e.g. opening the web UI and going on to the sliders tab) sent some data to the universe - data wasn't even at the right baud rate like this! I'll raise that as a separate bug.

I've put some error counters into my branch at https://github.com/richardash1981/ola to see if we are getting any system call failures (at the moment we wouldn't know), and intend to add some timing statistics to this and re-test this week. This will then let us measure whether any changes we make help!

@rewolff
Contributor Author

rewolff commented Jun 13, 2016

Oh. I've seen the "wrong baud rate" problem too. Didn't investigate further so wasn't sure enough to report it.

@peternewman peternewman changed the title malf measures from start of frame not Mark after last frame. uartdmx malf measures from start of frame not Mark after last frame. Oct 7, 2018
@peternewman
Member

I also found some sort of initialisation bug which meant that the output was completely wrong until something (e.g. opening the web UI and going on to the sliders tab) sent some data to the universe - data wasn't even at the right baud rate like this! I'll raise that as a separate bug.

I don't know if you ever reported that, @richardash1981, but @rbalykov found it in #1523. With no frame data, we just send break, start code, MALFT continually!
