[7e3cb56] Staging View performance drop #72

Open · LordHDL opened this issue Feb 18, 2015 · 18 comments

LordHDL commented Feb 18, 2015

Noticing an odd performance drop when using staging view. The setup tested is pretty minimal and low intensity. I tried OpenEmu (native Syphon output) and a regular Syphon capture source. With staging view on I actually get decreased FPS in both CS and OpenEmu. With staging off performance jumps back to 60 FPS. This is before I even start streaming/recording. The actual difference in CPU usage is fairly low.

Here is the usage with staging view on:

screen shot 2015-02-18 at 2 48 27 pm

Staging off:

screen shot 2015-02-18 at 2 50 49 pm

zakk4223 (Owner) commented:

Is it related to Syphon sources in general, or OpenEmu in particular? Also, does OpenEmu have an option to show the FPS? I'm sort of faking it by using Simple Syphon Client since it shows the framerate there, but I'm not 100% sure how accurate that is.

LordHDL (Author) commented Feb 19, 2015

OpenEmu doesn't have a show-FPS option, but the drop in rate was pretty drastic. I can't seem to recreate this anymore since I compiled 41cd0e0. I tested more anyway just to be sure; here's what I noticed:

OpenEmu: Another test. Resolution set to 576x324, 60 FPS. With staging on and off I actually got full FPS in OpenEmu. Staging preview had a big drop in rate while the live preview was more performant (yet not quite 60).

Desktop: I set the source and layout to 576x324 to minimize CPU and GPU load. I played back a video at 426x240 resolution and watched it with staging on and off (without recording). There were noticeable drops in FPS, but live preview was slightly better than staging preview.

Kega Fusion: This emulator has a show FPS option (but no native Syphon support). Same results as OpenEmu.

So basically I can't recreate the performance drop I saw before, but there are still other drops occurring. The most obvious is in the staging preview (admittedly not very important), but there are still some drops in live preview and recordings.

I should note that Syphon itself has FPS flaws of its own: it cannot deliver a perfect 60 FPS stream the way ffmpeg can record one flawlessly. And since the Desktop source seems to have its own issues, I'll unfortunately have to find another, more reliable way to test performance.

zakk4223 (Owner) commented:

OS X is all sorts of weird and aggressive about frame rates. Desktop source is especially weird because the OS only draws desktops when it thinks it needs to, so if you're capturing a desktop that's rarely updating it may only be sending CocoaSplit desktop frames at 10fps. And nothing can really draw faster than the display vsync unless it is a fullscreen app (although I've seen posts on Apple's dev forums that point to there being a bug with the fullscreen optimization stuff).

I can never tell if stuff like this is CS having an issue or the OS just being weird. I did change some things in the OpenGL code to remove some unnecessary OpenGL calls, but I can't really say one way or the other if that's why you can't recreate the exact same issue with OpenEmu.

You may want to download the 'Graphics Tools for Xcode' package and try running Quartz Debug. It has a desktop framerate meter which may shed some light on what's going on. You can also use it to disable 'beam sync', which I've noticed can fix some weird issues I have with some games.

What mac/video card are you running this stuff on anyways?

LordHDL (Author) commented Feb 19, 2015

I actually did have the graphics tools and forgot about them. I'll try doing better tests using that.

My hardware:

MacBook Pro 2011
10.10.2
Intel i7 2.2 GHz 2720QM quad core
AMD Radeon 6750M (1 GB VRAM)
8 GB Memory

LordHDL (Author) commented Feb 19, 2015

https://www.dropbox.com/s/yuqktk20u2ef4wu/CS%20FPS%20Tests.zip?dl=0

More tests using Quartz Debug to turn sync on and off. You'll notice the emulator itself never drops in performance, but the videos have very noticeable skips in them. If you're used to seeing 60 FPS content this will be obvious, but a better way to tell is to look at the yellow shield in the game. It flickers on and off every frame (1 frame on, 1 frame off, etc.). Unfortunately the frame meter didn't really show me anything that iStat Menus didn't already show with more precision.

I actually tested 120 FPS as well, even though my monitor is only 60 Hz. I did this because some other software (like CamTwist) would actually not give full 60 FPS when I had 60 selected, but would look perfect with 120 selected (and only 60 rendered due to vsync). I wanted to see if something similar happened in CS (it didn't, video looked significantly worse).

The drop in rate never happens in the video sources, only video capture/playback. I wonder if there is some underlying cause in OS X or OpenGL.

zakk4223 (Owner) commented:

I modified the 'elapsed time' source to print the elapsed time since it was last asked to provide output. That's done every frametick. Replace the 'frameTick' function in CSTimeIntervalBase.m with this one: https://gist.github.com/zakk4223/46c8a5379e9c2c0a7271

Add an 'Elapsed Time' source: now it'll show you how long it was since the last frame was rendered. Go into the source config for it and set the format to s.SSSS (or just add an extra S to the default one). For 60fps you should see something near 0.0166

If you record from CS at 60fps, you can go through it frame by frame in VLC and see if that number ever gets much bigger than 0.0166. If it does a frame was skipped/took too long to render etc.
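For reference, the printing frameTick is roughly along these lines (a sketch only, not the exact gist contents; _lastFrameTime and self.text are placeholder names for whatever CSTimeIntervalBase actually uses):

-(void) frameTick
{
    // Measure wall-clock time since the previous tick and display it as the
    // source's text, so a 60fps recording should read roughly 0.0166 per frame.
    CFAbsoluteTime now = CFAbsoluteTimeGetCurrent();
    if (_lastFrameTime > 0)
    {
        NSTimeInterval sinceLast = now - _lastFrameTime; // seconds since last render
        self.text = [NSString stringWithFormat:@"%.4f", sinceLast];
    }
    _lastFrameTime = now; // placeholder CFAbsoluteTime ivar
}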
Add a syphon/desktop source with a timer (that Sonic timer is good, although a timer with a thousandths digit would be ideal) and you should be able to figure out if it's just frames being missed/duplicated from the source, or if it's the entire CS scene rendering that's slowing down.
I'm poking around for an emulator or some tool that I can SyphonInject that will display a frame count; that would make it super easy to see what's going on. I thought MAME had it, but I guess not.

Both the staging and live views you see in CS are driven from a single CVDisplayLink, which means they are rendered at whatever your display refresh rate is (so probably 60fps). The difference is that the staging view is only rendered based on the display link (which means the staging view actually ignores the layout's FPS setting). The 'live' layout does its rendering via an internal timing loop based on the layout's FPS setting; the on-screen display just asks the live layout for the last rendered scene and then displays that.
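(For anyone unfamiliar with CVDisplayLink, the display-driven side looks roughly like this; just a sketch, with PreviewView and renderStagingFrame as placeholder names rather than the actual CS classes/methods.)

// The callback fires once per display refresh (typically 60Hz), regardless of
// the layout's configured FPS.
static CVReturn displayLinkCallback(CVDisplayLinkRef displayLink,
                                    const CVTimeStamp *inNow,
                                    const CVTimeStamp *inOutputTime,
                                    CVOptionFlags flagsIn,
                                    CVOptionFlags *flagsOut,
                                    void *context)
{
    [(__bridge PreviewView *)context renderStagingFrame]; // placeholder method
    return kCVReturnSuccess;
}

// Setup, e.g. when the preview view is created:
CVDisplayLinkCreateWithActiveCGDisplays(&_displayLink);
CVDisplayLinkSetOutputCallback(_displayLink, displayLinkCallback, (__bridge void *)self);
CVDisplayLinkStart(_displayLink);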

In theory if you use a syphon source the source should ask the syphon server for the current frame every time the render loop fires in CS. I'm wondering if some of the weirdness is because those two things aren't sync'ed up, so it may occasionally grab a frame a bit too fast and get a duplicate. I'll try adding some debugging into the Syphon source so I can see how fast things are firing there.

One possibility is adding a config option that lets you tell the layout it should drive the render loop timing from a certain source. The advantage being that you'd never miss or duplicate a frame from that source (unless the source itself sent dup frames or skipped them), but at the cost of your stream framerate only being as stable as the source you tie it to.

zakk4223 (Owner) commented:

The latest commit has an additional option for Syphon sources that controls how it renders the syphon output into the source layer. I found out that if you let Core Animation control the timing of the layer's drawing, it doesn't necessarily ask for a redraw every time it renders the layout, so it was certainly missing frames at times. The differences between the options are (in order):

  1. draw into the layer every time the syphon server publishes a frame
  2. request the current frame from the syphon server every time the layout is rendered
  3. the old behavior, where it lets Core Animation decide when to render the layer.

The choice between the first and second options probably depends on your stream FPS. If the Syphon source is 60fps and you're streaming at 30fps, selecting the second one would reduce the amount of pointless layer rendering (since half the frames would never be displayed anyway).

In my tests the first option reduced the number of duplicate frames in the output by quite a bit; I'm not sure I ever actually saw a dup or skip, but I was just spot checking parts of a long recording.
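(Very roughly, options 1 and 2 map onto the two ways the public Syphon client API can be driven; this is a sketch with placeholder names like serverDescription, sourceLayer and cglContext, not the actual CS source code.)

// Option 1 (push): redraw the source layer whenever the Syphon server publishes a frame.
SyphonClient *client =
    [[SyphonClient alloc] initWithServerDescription:serverDescription
                                             options:nil
                                     newFrameHandler:^(SyphonClient *c) {
                                         [sourceLayer setNeedsDisplay]; // one redraw per published frame
                                     }];

// Option 2 (pull): fetch the newest frame each time the layout render loop fires.
SyphonImage *frame = [client newFrameImageForContext:cglContext];
// ...bind frame.textureName and draw it into the source layer, then release frame...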

LordHDL (Author) commented Feb 20, 2015

Oh yes, significant improvement for Syphon: https://www.dropbox.com/s/nskjhkd5el6bpaq/CS%20Syphon%20FPS.mp4.zip?dl=0

However, I think if we're going to get into more scientific testing we probably shouldn't use Syphon, for this reason: http://v002.info/forums/topic/syphon-recorder-fps-issue/

Note that the thread is about Syphon Recorder, but a dev sheds some light on how Syphon itself handles frames. Basically it doesn't do a perfect job of frame delivery, which means it'll influence FPS tests within CS.

Here's a simple test I did using a native PC game that runs at exactly 60 FPS (not 59.94 like Genesis games) using a regular Desktop source. This should eliminate any extra sync and frame delivery factors that would otherwise influence tests. Again, done at low res to remove the possibility of hardware limitations factoring in: https://www.dropbox.com/s/fjqp8nxpv9wsq2q/CS%20Desktop%20FPS.mp4.zip?dl=0

And a test done with a split timer: https://www.dropbox.com/s/p7mdcsa7r2a4rqt/CS%20Desktop%20Timer.mp4.zip?dl=0

I would also suggest viewing these videos with ffplay, as VLC and QuickTime both have their own performance issues; frame rate just seems like a hard thing for lots of software to get a firm handle on. For reference, screen recording with ffmpeg on Windows or Linux will produce a perfect frame rate (as for OS X, there's this: http://trac.ffmpeg.org/ticket/4080).

zakk4223 (Owner) commented:

Desktop capture isn't all that different from Syphon; it's still delivering asynchronous frame notifications across process boundaries. There's jitter even if you've got something forcing the desktop to update at 60fps. It might be slightly better than Syphon, likely because the notification mechanism is a bit more lightweight.

It'll suffer the same timing issues as other sources, mainly that the CS update timer fires independently of the desktop frame events, so there's the risk that CS needs to render a new frame just slightly before the new desktop frame event arrives, in which case it'll just reuse the old one and dup the frame.

All of the official and performant video capture APIs in OS X are asynchronous. They all provide timestamps of the frames they deliver, which is great if you're doing non-realtime recording, but for realtime stuff you can't really tolerate a frame being even a few milliseconds late. I see in 10.9 AVFoundation connections were changed to have a 'minimum frame rate' type config option, but it's dependent on the source supporting it. I may mess with that to see how that goes. I would suspect there's still jitter though (assuming anything supports that config option)
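(Assuming the connection-level option referred to here is the videoMaxFrameDuration/supportsVideoMaxFrameDuration pair, the check would look roughly like this; a sketch only, and as the edit below notes, no tested device actually supported it.)

// Sketch: capping the max duration between frames would enforce a minimum frame rate.
AVCaptureConnection *connection = [videoOutput connectionWithMediaType:AVMediaTypeVideo];
if (connection.supportsVideoMaxFrameDuration)
{
    connection.videoMaxFrameDuration = CMTimeMake(1, 60); // i.e. at least 60fps, if the source allows it
}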

I think the changes I made to Syphon sources are about as good as it can get, unless there are instances where the actual CS rendering loop isn't maintaining a consistent framerate. I'll likely implement that same type of change for desktop and webcam capture too.

Edit: not a single device I tested with AVFoundation supported the minimum FPS option on the capture connection, so never mind that. Maybe the built-in desktop capture does, but I don't remember how good or bad grabbing the desktop that way is.

LordHDL (Author) commented Feb 22, 2015

The "internal frame tick" option seems to freeze the video for me entirely.

By built-in desktop capture, you mean with AVFoundation? Because that's how ffmpeg does screen capture, but since they've been really slow to implement the frame rate option it's currently capped at 15 FPS, so there's no real way to see how efficient it is.

zakk4223 (Owner) commented:

I just did a test with the AVFoundation desktop capture and it seems to max out at 15fps no matter what, even if you set the 'max frame duration' property on the connection to the equivalent of 60 or 30 fps. I think I'm remembering why I abandoned using it...

LordHDL (Author) commented Feb 22, 2015

I see, so is this just a general limitation of AVF capture then? I figured it was an ffmpeg thing since OS X is the only platform the -framerate option isn't implemented for. It's quite a shame if that's the case.

Edit: According to this, 60 FPS is definitely possible: https://developer.apple.com/library/mac/documentation/AudioVideo/Conceptual/AVFoundationPG/Articles/04_MediaCapture.html

It's under 'High Frame Rate Video Capture'.

zakk4223 (Owner) commented:

All that documentation is for hardware capture devices, so it doesn't necessarily apply to anything else.

However after digging around in some header files and some apple example code it looks like you can do 60fps desktop capture. The 'max framerate' property actually defaults to 15fps but that default isn't documented anywhere I can find. So you have to explicitly change it to 60.

I think the only real remaining potential problem with it is that the documentation is somewhat explicit that the capture system can choose to deliver frames at a lower rate if it doesn't have the resources to maintain a higher one. No idea if that's even relevant on semi-modern hardware or not. Sometimes these things only apply to iPhones and they just sort of leave them in the OS X docs.

It doesn't seem to be too cpu/resource intensive either, probably no worse than the other desktop capture API.

You can probably test this in ffmpeg if you're willing to compile it yourself. Just find where they are creating the AVCaptureScreenInput instance and set its minFrameDuration property. It would look something like blah.minFrameDuration = CMTimeMake(1, 60) for 60fps.
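(Expanded a bit, a minimal 60fps screen-capture setup with that explicit request might look like the following; the session/screenInput/output names are just illustrative, and this would live inside whatever method sets up capture.)

#import <AVFoundation/AVFoundation.h>
#import <CoreGraphics/CoreGraphics.h>

AVCaptureSession *session = [[AVCaptureSession alloc] init];
AVCaptureScreenInput *screenInput =
    [[AVCaptureScreenInput alloc] initWithDisplayID:CGMainDisplayID()];

// The undocumented default works out to ~15fps; request 60fps explicitly.
screenInput.minFrameDuration = CMTimeMake(1, 60);

if ([session canAddInput:screenInput])
{
    [session addInput:screenInput];
}

AVCaptureVideoDataOutput *output = [[AVCaptureVideoDataOutput alloc] init];
[output setSampleBufferDelegate:self queue:dispatch_get_main_queue()]; // 'self' = your capture controller
if ([session canAddOutput:output])
{
    [session addOutput:output];
}
[session startRunning];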

LordHDL (Author) commented Feb 26, 2015

I cannot find minFrameDuration anywhere in the ffmpeg source. All I found was this:

#if __MAC_OS_X_VERSION_MIN_REQUIRED >= 1070
CGDirectDisplayID screens[num_screens];
CGGetActiveDisplayList(num_screens, screens, &num_screens);
AVCaptureScreenInput* capture_screen_input = [[[AVCaptureScreenInput alloc] initWithDisplayID:

Which is located in ffmpeg/libavdevice/avfoundation.m.

zakk4223 (Owner) commented:

You'll need to add it yourself; they aren't setting it, so they're getting the 15fps default. Find where they're adding the input to the capture session and then set capture_screen_input.minFrameDuration just before that.
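(Concretely, in libavdevice/avfoundation.m that's roughly a one-line addition right after the AVCaptureScreenInput in the excerpt quoted above is created, and before the input is added to the session; exact placement in the current ffmpeg source may differ.)

// Added line: request 60fps instead of the ~15fps default.
capture_screen_input.minFrameDuration = CMTimeMake(1, 60);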

LordHDL (Author) commented Feb 27, 2015

That's probably not something I can do with my very limited knowledge of coding. I tend to struggle in this department with even simple things. I did try a few things anyway but as expected they all resulted in compile errors. My forte is more quality assurance.

Have you already personally made this change and tried it? You should submit a pull request to them to get it implemented sooner.

LordHDL (Author) commented Apr 1, 2015

Just to follow up on the AVF thing: the -framerate option is now implemented in ffmpeg. By default AVF recordings are 30 FPS, but you can now get 60 with -framerate 60. Not sure if this helps CS in any way.

zakk4223 (Owner) commented Apr 5, 2015

I don't use ffmpeg for capture, but if they're managing to run it at 60fps without the associated system performance issue, it might provide a good comparison point to narrow down what exactly is causing that when I try it.
