Added this code so that letterboxed 4:3 widescreen (black bars) DXVA-decoded content can be auto-cropped.
enable auto crop black bars on dxva decoded
Yes to the idea, but no to the implementation. Performance is going to be bad with the GPU->sys mem copies. That's the reason it hasn't been done yet with DXVA or VDPAU decoding.
The work needs to be done in GPU memory and only the result should be sent back, to save on readback bandwidth. This may be the first reason we have to look into OpenCL, as I'm not sure a shader can do it in a reasonable manner.
I'm experimenting with hardware decoding of DVDs, and on the ION it seems to be doing just fine. What we could do to mitigate the performance risk is to only auto-crop every 25 frames (or so) and assume that the same crop rectangle applies in between. This would reduce the load significantly (and could even be used for software-decoded content).
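For reference, a minimal sketch of the every-N-frames idea in C++. Everything named here (CropRect, DetectBlackBars, PeriodicCropper, the threshold of 24) is made up for illustration and is not XBMC code. It also assumes the luma plane is already sitting in system memory, so it only shows the reduced scanning work, not the cost of getting the frame there.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical type: number of black rows to drop at the top and bottom.
struct CropRect
{
  int top;
  int bottom;
};

// Scans a row-major luma plane for fully dark rows at the top and bottom.
// The threshold is an assumed value, not one taken from XBMC.
CropRect DetectBlackBars(const std::vector<std::uint8_t>& luma, int width, int height,
                         std::uint8_t threshold = 24)
{
  assert(width > 0 && height > 0);
  assert(luma.size() >= static_cast<std::size_t>(width) * height);

  auto rowIsBlack = [&](int y) {
    const std::size_t start = static_cast<std::size_t>(y) * width;
    for (int x = 0; x < width; ++x)
      if (luma[start + x] > threshold)
        return false;
    return true;
  };

  int top = 0;
  while (top < height && rowIsBlack(top))
    ++top;
  if (top == height)
    return {0, 0}; // entirely dark frame: do not crop anything

  int bottom = 0;
  while (bottom < height - top && rowIsBlack(height - 1 - bottom))
    ++bottom;
  return {top, bottom};
}

// Hypothetical helper: rescans only every `interval` frames and reuses the
// last crop rectangle for the frames in between.
class PeriodicCropper
{
public:
  explicit PeriodicCropper(int interval) : m_interval(interval)
  {
    assert(interval > 0);
  }

  CropRect Update(const std::vector<std::uint8_t>& luma, int width, int height)
  {
    if (m_frame % m_interval == 0)
    {
      m_crop = DetectBlackBars(luma, width, height);
      ++m_scans;
    }
    ++m_frame;
    return m_crop;
  }

  int Scans() const { return m_scans; }

private:
  int m_interval;
  long m_frame = 0;
  int m_scans = 0;
  CropRect m_crop{0, 0};
};
```

With an interval of 25, feeding 50 frames through Update() triggers two scans (frames 0 and 25); every other call just returns the cached rectangle.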
What matters here is the time available between two vsyncs and the amount of work to get done. Whether the copy is done every frame or once in 25 frames doesn't change your worst case: the frame that does the full copy still has to fit it into that window.
To reduce the load, the work would have to be broken down into slices of 1/25 (or whatever) of the image, one slice per frame.
Meaning an extra buffer (1/25 of the height, for example) receiving only a slice of the frame. Copying that to sys mem would take fewer resources.
At each frame, a different slice gets copied so that in 25 frames, the whole frame has been transferred. As you said, black bars shouldn't move much so it's not important that the 25 slices are from the same frame.
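The slicing scheme could look roughly like the C++ sketch below. SliceCollector and its methods are hypothetical names, not XBMC API, and a plain std::vector stands in for the decoded surface; in a real DXVA or VDPAU path the std::copy would be replaced by a partial readback of just those rows.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: builds a full-height image in system memory by
// copying one horizontal slice per frame, cycling through all slices.
class SliceCollector
{
public:
  SliceCollector(int width, int height, int sliceCount)
    : m_width(width), m_height(height), m_sliceCount(sliceCount),
      m_composite(static_cast<std::size_t>(width) * height, 0)
  {
    assert(width > 0 && height > 0);
    assert(sliceCount > 0 && sliceCount <= height);
  }

  // Copies the next slice of `frame` (row-major, width * height bytes) into
  // the composite. Returns true when this call finished a full pass, i.e.
  // every row has been refreshed once since the pass started.
  bool AddFrame(const std::vector<std::uint8_t>& frame)
  {
    assert(frame.size() >= m_composite.size());

    // Integer division spreads any remainder rows across the slices.
    const int first = m_next * m_height / m_sliceCount;
    const int last = (m_next + 1) * m_height / m_sliceCount;
    const std::ptrdiff_t begin = static_cast<std::ptrdiff_t>(first) * m_width;
    const std::ptrdiff_t end = static_cast<std::ptrdiff_t>(last) * m_width;

    std::copy(frame.begin() + begin, frame.begin() + end,
              m_composite.begin() + begin);

    m_rowsCopiedLast = last - first;
    m_next = (m_next + 1) % m_sliceCount;
    return m_next == 0;
  }

  const std::vector<std::uint8_t>& Composite() const { return m_composite; }
  int RowsCopiedLast() const { return m_rowsCopiedLast; }

private:
  int m_width;
  int m_height;
  int m_sliceCount;
  int m_next = 0;
  int m_rowsCopiedLast = 0;
  std::vector<std::uint8_t> m_composite;
};
```

The composite ends up stitched together from different frames, which is acceptable here because the bars are assumed to stay put between frames. The black-bar scan would then run on Composite() each time AddFrame() returns true.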
Or just do the work with the GPU.