bytes_to_yuv slow implementation #308

Closed
goosst opened this Issue Jul 22, 2016 · 1 comment


goosst commented Jul 22, 2016

Hello

The bytes_to_yuv implementation takes quite a long time to execute and is a real limiting factor when doing image processing at a reasonable fps.

Below you can find a different implementation which is roughly five times faster in my measurements.
To give you an idea, this is a comparison of the execution times (including a check that the output is the same):

bytes_to_yuv picamera 0.134664058685
bytes_to_yuv stijn 0.0265998840332
equal matrices True
bytes_to_yuv picamera 0.131421089172
bytes_to_yuv stijn 0.0261979103088
equal matrices True
bytes_to_yuv picamera 0.132745981216
bytes_to_yuv stijn 0.027575969696
equal matrices True
bytes_to_yuv picamera 0.132982969284
bytes_to_yuv stijn 0.0273771286011
equal matrices True
bytes_to_yuv picamera 0.132987976074
bytes_to_yuv stijn 0.026458978653
equal matrices True
bytes_to_yuv picamera 0.132247924805
bytes_to_yuv stijn 0.0261468887329
equal matrices True
bytes_to_yuv picamera 0.132241010666
bytes_to_yuv stijn 0.0266060829163
equal matrices True
bytes_to_yuv picamera 0.133563995361
bytes_to_yuv stijn 0.0265071392059
equal matrices True
bytes_to_yuv picamera 0.133520126343
bytes_to_yuv stijn 0.0271589756012
equal matrices True
bytes_to_yuv picamera 0.132179021835
bytes_to_yuv stijn 0.026967048645
equal matrices True
bytes_to_yuv picamera 0.132380962372
bytes_to_yuv stijn 0.0270209312439
equal matrices True

import numpy as np

def bytes_to_yuv_stijn(bufferstream, resolution):
    width = resolution[0]
    height = resolution[1]
    # Padded frame size: width rounded up to a multiple of 32,
    # height rounded up to a multiple of 16
    fwidth = (width + 31) // 32 * 32
    fheight = (height + 15) // 16 * 16
    y_len = fwidth * fheight
    uv_len = (fwidth // 2) * (fheight // 2)

    # Separate out the Y, U, and V values from the array
    a = np.frombuffer(bufferstream, dtype=np.uint8)
    Y = a[:y_len]
    U = a[y_len:-uv_len]
    V = a[-uv_len:]

    # Reshape the values into two dimensions. U and V only have quarter
    # resolution in YUV4:2:0, so their planes are half size on each axis.
    Y = Y.reshape((fheight, fwidth))  # this one is still efficient
    U = U.reshape((fheight // 2, fwidth // 2))
    V = V.reshape((fheight // 2, fwidth // 2))

    # Allocate full-size U and V planes without initialising them
    Ustart = np.empty_like(Y, dtype=np.uint8)
    Vstart = np.empty_like(Y, dtype=np.uint8)

    # Double U in both directions: each value fills a 2x2 block
    Ustart[::2, ::2] = U
    Ustart[1::2, 1::2] = U
    Ustart[1::2, ::2] = U
    Ustart[::2, 1::2] = U

    # Same for V
    Vstart[::2, ::2] = V
    Vstart[1::2, 1::2] = V
    Vstart[1::2, ::2] = V
    Vstart[::2, 1::2] = V

    # Stack into (rows, columns, YUV) and crop the padding away
    return np.dstack((Y, Ustart, Vstart))[:height, :width]
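
A comparison like the one in the timings could be reproduced with a harness along the following lines. This is only a sketch, not the script that produced the numbers in this issue: `upsample_repeat` is an assumed stand-in for a `repeat`-based doubling, `upsample_strided` isolates the slice-assignment step from the function posted here, and the 320x240 chroma plane filled with random values is an arbitrary choice.

```python
import timeit

import numpy as np


def upsample_repeat(plane):
    # Double both axes with np.repeat (assumed baseline, hypothetical name)
    return plane.repeat(2, axis=0).repeat(2, axis=1)


def upsample_strided(plane):
    # Double both axes by writing the plane into four interleaved views
    h, w = plane.shape
    out = np.empty((h * 2, w * 2), dtype=plane.dtype)
    out[::2, ::2] = plane
    out[1::2, ::2] = plane
    out[::2, 1::2] = plane
    out[1::2, 1::2] = plane
    return out


# Arbitrary test data: one chroma plane of 240 rows by 320 columns
rng = np.random.default_rng(0)
plane = rng.integers(0, 256, size=(240, 320), dtype=np.uint8)

# Check that both approaches agree before timing them
same = np.array_equal(upsample_repeat(plane), upsample_strided(plane))

t_repeat = timeit.timeit(lambda: upsample_repeat(plane), number=20)
t_strided = timeit.timeit(lambda: upsample_strided(plane), number=20)

print("repeat ", t_repeat)
print("strided", t_strided)
print("equal matrices", same)
```

The absolute timings depend heavily on the platform and the NumPy version, so the ratio between the two is the more useful figure.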

Owner

waveform80 commented Feb 23, 2017

Sorry this has taken so long to get around to. Definitely looks faster on all platforms - I'll get this merged in for 1.13. Must admit I'm slightly surprised that shoving the bits around from Python is faster than numpy's repeat method (which is written in C) but then the latter is generic which may account for the difference (I haven't bothered tracing the C version to find out where the real bottleneck is).
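
To make the two approaches concrete, here is a small sketch on a 2x2 array. It assumes the `repeat`-based doubling is applied once per axis; the variable names are made up for illustration.

```python
import numpy as np

u = np.array([[1, 2],
              [3, 4]], dtype=np.uint8)

# Generic approach: repeat every element twice along each axis
doubled_repeat = u.repeat(2, axis=0).repeat(2, axis=1)

# Slice-assignment approach: fill the four interleaved quarter views
doubled_strided = np.empty((4, 4), dtype=np.uint8)
doubled_strided[::2, ::2] = u
doubled_strided[1::2, ::2] = u
doubled_strided[::2, 1::2] = u
doubled_strided[1::2, 1::2] = u

print(doubled_repeat)
# [[1 1 2 2]
#  [1 1 2 2]
#  [3 3 4 4]
#  [3 3 4 4]]
print(np.array_equal(doubled_repeat, doubled_strided))  # True
```

Although the second version has four statements at the Python level, each one is a single strided copy carried out inside NumPy.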

waveform80 added this to the 1.13 milestone Feb 23, 2017
