FAQ: HD Video Playback
HD quality video is very taxing on your CPU, video card, and disks, so first and foremost you'll want a fast machine.
If you are concerned about dropped frames, the safest option is to load all of
your video frames into VRAM, and then play them back. The
LoadMovieIntoTexturesDemoOSX
is good for determining what is possible, in
terms of load times, and playback rates, when called in benchmark mode:
LoadMovieIntoTexturesDemoOSX(<video file name>, [], [], [], 1)
Note that it works cross-platform, not just under OS X. Usually the limiting factor is decoding the videos into textures, not the drawing of the textures to the screen.
You may also have more success using GStreamer than QuickTime. Type
help gstreamer
to find out more.
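In outline, the load-everything-into-VRAM approach looks roughly like this (a hedged sketch, not the demo's actual code; moviefile and the screen index 0 are placeholders, and error handling is omitted):

```matlab
% Sketch: preload all movie frames into textures, then play back from VRAM.
% 'moviefile' is a placeholder path.
win = Screen('OpenWindow', 0);
[movie, dur, fps] = Screen('OpenMovie', win, moviefile);
Screen('PlayMovie', movie, 1, 0, 0);      % rate 1, no looping, sound off

tex = [];
while true
    t = Screen('GetMovieImage', win, movie, 1);
    if t <= 0                             % no more frames
        break;
    end
    tex(end+1) = t; %#ok<AGROW>
end
Screen('PlayMovie', movie, 0);
Screen('CloseMovie', movie);

% Playback from VRAM: one preloaded texture per video frame.
for i = 1:numel(tex)
    Screen('DrawTexture', win, tex(i));
    Screen('Flip', win);
end
```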
Q: What kind of file→texture decode rates are possible?
A: Using Version 3.0.9 - (Build date: Mar 17 2012), with an SSD drive as source storage:
Video: 1280x720 @ 120 Hz; hardware: i7 2600 (4 cores, 3.4 GHz), GeForce GT 545, 3 GB VRAM
- Format mjpeg: 138 fps
- Format mpeg4 (xvid, not avc): 133 fps
- Format x264 (avc): 43 fps
On the topic of video formats that might decode fastest, T Wolf said:
Just my two cents: HandBrake's defaults enable all the bells and whistles that codec developers have come up with in the last 10 years to make the file as small as possible. This means decoding is complex, so don't use the defaults if you want fast decoding.
There are profiles in H.264 that limit decoding complexity. In x264 they are exposed as:
--profile <string>  Force the limits of an H.264 profile. Overrides all settings. Values: baseline, main, high, high10, high422, high444
I guess you want baseline. Note that high10 can give you 10-bit color and high444 gives you RGB; otherwise you get chroma-subsampled (4:2:0) color, with much reduced chroma resolution. Not all decoders support these profiles, but GStreamer does.
You can also try --tune fastdecode in addition. Don't worry about quality; that is governed by the CRF constant-quality factor. The file will just be bigger, but faster to decode.
FFmpeg might use a different MP4 muxer (for both ASP and AVC) than HandBrake; the encoder is the same. I would go with Matroska (MKV) in any case. QuickTime won't play that, though.
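As a concrete illustration, an encode along those lines could look like this with FFmpeg's libx264 (a sketch under these assumptions: input.mov is a placeholder input file, and the CRF value of 18 is an arbitrary example; check the options of your ffmpeg build):

```shell
# Encode for fast decoding: baseline profile plus fastdecode tuning.
# CRF controls quality: lower = better quality, bigger file. -an drops audio.
ffmpeg -i input.mov -c:v libx264 -profile:v baseline -tune fastdecode \
       -crf 18 -an output.mkv
```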
Note that decoding a video frame is not the only cost: uploading that frame to VRAM isn't instantaneous either. See MakeTextureTimingTest2 to find out how long this step takes.
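If you want a quick estimate inside your own script, a rough timing sketch could look like this (assumes an already open onscreen window win; the random 720p test image is a placeholder):

```matlab
% Time texture creation plus the upload to VRAM for one 720p frame.
img = uint8(255 * rand(720, 1280, 3));
t0 = GetSecs;
tex = Screen('MakeTexture', win, img);
Screen('PreloadTextures', win, tex);   % force the upload to happen now
t1 = GetSecs;
fprintf('MakeTexture + upload: %.3f ms\n', 1000 * (t1 - t0));
Screen('Close', tex);
```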
Here are some other comments on this issue from Mario:
Try PlayGaplessMoviesDemo2 – GStreamer only, but "even more gapless" gapless playback than the classic gapless demo, because it can use GStreamer's built-in gapless playback support instead of needing to resort to tricks as with QuickTime.
On OS choice: there's always Linux as the better alternative. Linux GStreamer is of a more recent version, with improved support for multi-threaded decoding from Ubuntu 11.10 onwards. E.g., H264-encoded material can get a nice boost, utilizing up to 4 cores of an 8-core machine.
Simultaneously decoding videos and drawing them to the screen (no pre-loading),
via OpenMovie
with asyncflag
set to 4:
[This] decodes the movie at highest speed and queues up all video frames in memory for presentation; it doesn't drop frames and doesn't care about playback timing or audio-video sync etc. It is the fastest method if you don't care about random access to specific frames (like you could have with the method of LoadMovieIntoTexturesDemo), don't care about audio-video sync (because there isn't any audio to sync to), and control playback timing yourself via Screen('Flip')'s 'when' parameter. I don't know if things like looping the movie would still work, but for your task this should be the most efficient way of doing it.
This method does put more stress on the OS scheduler because there isn't
any throttling of decoding to playback framerate anymore, so GStreamer
will just get any amount of CPU time it can get to decode as fast as
possible – the GStreamer threads compete for CPU resources with the
main Matlab/PTB thread, so it could e.g., happen that GStreamer gets the
CPU to decode and queue yet another frame when the main thread or
graphics driver would need the CPU more urgently to avoid a skipped
presentation deadline. The Priority()
command can help a bit there, but
how much it helps depends on the underlying realtime capabilities of the
operating system scheduler. In that category Linux, when configured
properly for realtime use, has a fabulous reputation, whereas MS-Windows
defines the absolute zero reference point. OS X is somewhere in between those two.
The additional buffering is mostly only useful for movies without sound, because it prevents automatic control of playback framerate and automatic audio-video sync.
Steps:
- OpenMovie with asyncflag set to 4.
- Start movie playback. This starts the decoding process.
- Wait for a few seconds to prebuffer data.
- Start your Screen('GetMovieImage') fetch-and-draw loop.
The engine will decode video buffers and queue them in an internal queue as soon as playback is started, until it gets stopped. GetMovieImage will fetch the oldest buffer (FIFO order) and convert it to a texture. You can control the maximum amount of buffered video via the preloadSecs parameter (default = 1 second); a setting of -1 would allow infinite buffering, i.e., until you run out of system memory.
You'll probably have to use Priority()
to make sure your main thread isn't
deprived of computation time by all the GStreamer threads running at maximum
decoding speed.
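Putting the pieces together, a self-paced playback loop with asyncflag 4 might be sketched like this (a sketch only; moviefile and win are placeholders, the 2-second preloadSecs and prebuffer wait are arbitrary examples):

```matlab
% Sketch: asyncflag=4 decoding with self-paced presentation timing.
Priority(MaxPriority(win));                    % protect the main thread
movie = Screen('OpenMovie', win, moviefile, 4, 2);  % asyncflag=4, preloadSecs=2
Screen('PlayMovie', movie, 1);                 % start decoding into the queue
WaitSecs(2);                                   % prebuffer a few seconds
ifi = Screen('GetFlipInterval', win);
vbl = Screen('Flip', win);
while true
    tex = Screen('GetMovieImage', win, movie); % oldest queued frame (FIFO)
    if tex <= 0
        break;                                 % queue drained / end of movie
    end
    Screen('DrawTexture', win, tex);
    vbl = Screen('Flip', win, vbl + 0.5 * ifi); % timing via the 'when' parameter
    Screen('Close', tex);
end
Screen('PlayMovie', movie, 0);
Screen('CloseMovie', movie);
Priority(0);
```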
The remaining bottleneck would be the texture creation/upload/draw time. There you can try a few things, which may or may not have any effect on performance, in a good or bad direction:
- A new optional parameter specialFlags1 in OpenMovie: a setting of 2 disables audio decoding; a setting of 1 tries to use YUV color encoded textures instead of RGBA textures if the GPU and driver support this – may increase or decrease performance. A setting of 2+1 = 3 gives you both.
- There's a Screen('Preference', 'ConserveVRAM') setting (see help ConserveVRAMSettings) called TextureFormatOverride or something like that. It lets you try an alternative texture encoding for RGBA, which also may be faster or slower, if the YUV parameter doesn't help.
And then you can do little micro-optimizations, e.g. using the
dontclear=2
flag in Flip
to prevent clearing the framebuffer if
you're overdrawing the video stimulus anyway – may save up to one msec
or so...
Using the additional ConserveVRAM setting 512 aka kPsychAvoidCPUGPUSync could also make sense (see help ConserveVRAMSettings; the numbers of all used flags add up). This would disable any kind of OpenGL/GPU error checking in texture creation, DrawTexture, Flip etc. Usually not recommended, but once your code works error-free it may save some fraction of a msec.
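If you try that, a sketch of adding the flag without clobbering any flags you already use could be:

```matlab
% Add the kPsychAvoidCPUGPUSync flag (512) to the existing ConserveVRAM flags.
% Set this before opening your onscreen window.
oldFlags = Screen('Preference', 'ConserveVRAM');
Screen('Preference', 'ConserveVRAM', oldFlags + 512);
```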
Clever use of Screen('DrawingFinished')
after the last drawing command,
before you do other stuff like KbChecks and such may also help to increase
parallelism between CPU and GPU. Could be that the remaining skipped frames are
due to delays on the GPU, not CPU – you can only time the CPU with GetSecs
,
tic/toc, the profiler etc. For the GPU there is special profiling support on
supported GPUs, as shown in DrawingSpeedTest
if you follow the gpumeasure
flag.
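Inside a drawing loop, that ordering might be sketched as follows (tex, win, vbl and ifi are placeholders assumed to exist from an enclosing loop):

```matlab
Screen('DrawTexture', win, tex);
Screen('DrawingFinished', win);      % no more drawing commands this frame
[keyDown, secs, keyCode] = KbCheck;  % CPU-side work overlaps GPU rendering
vbl = Screen('Flip', win, vbl + 0.5 * ifi);
```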
In the end, if we talk about occasional misses by a (few) msecs, we're in the world of endless tweaks. E.g., running the GPU always at its highest performance setting to avoid interference from GPU power management, choosing the right operating system instead of the wrong one, tweaking CPU power management and other settings on operating systems that support such things, and so on...
This wiki is complementary to the main website at http://psychtoolbox.org
Please feel encouraged to edit these pages and add helpful content. Take care.