
Converting numpy array to video #246

Open
Santhosh1509 opened this issue Jul 31, 2019 · 18 comments

@Santhosh1509

I'm using OpenCV to process a video and save the processed video.

Example:

import numpy as np
import cv2

cap = cv2.VideoCapture(0)

# Define the codec and create a VideoWriter object
fourcc = cv2.VideoWriter_fourcc(*'XVID')
out = cv2.VideoWriter('output.avi', fourcc, 20.0, (640, 480))

while cap.isOpened():
    ret, frame = cap.read()
    if ret:
        frame = cv2.flip(frame, 0)

        # write the flipped frame
        out.write(frame)

        cv2.imshow('frame', frame)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
    else:
        break

# Release everything when the job is finished
cap.release()
out.release()
cv2.destroyAllWindows()

The source file is a Full HD, 2-minute clip in AVI format with a data rate of 7468 kbps.
The saved file is a Full HD, 2-minute clip in AVI format with a data rate of 99532 kbps.

This is confusing.
If I instead save each frame to disk and give the frames as input, I get a "no such file" error when saving the output:

import ffmpeg
(
    ffmpeg
    .input('/path/to/jpegs/*.jpg', pattern_type='glob', framerate=25)
    .output('movie.mp4')
    .run()
)

How do I save the video at the same size as the source using ffmpeg-python?

@kylemcdonald
Contributor

kylemcdonald commented Aug 11, 2019

Not sure if this is what you were asking, but here is some code to save frames from memory straight to a video file. If you chop this up a little, you could hack it into your initial code and avoid writing the JPEGs to disk:

import ffmpeg
import numpy as np

def vidwrite(fn, images, framerate=60, vcodec='libx264'):
    if not isinstance(images, np.ndarray):
        images = np.asarray(images)
    n, height, width, channels = images.shape
    process = (
        ffmpeg
            .input('pipe:', format='rawvideo', pix_fmt='rgb24', s='{}x{}'.format(width, height))
            .output(fn, pix_fmt='yuv420p', vcodec=vcodec, r=framerate)
            .overwrite_output()
            .run_async(pipe_stdin=True)
    )
    for frame in images:
        process.stdin.write(
            frame
                .astype(np.uint8)
                .tobytes()
        )
    process.stdin.close()
    process.wait()

Edit 2020-01-28: My working version of this function is backed by a small class, implemented in my python-utils/ffmpeg.py
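
For reference, a minimal usage sketch of the function above. The noise frames are just a hypothetical stand-in for real data; dimensions are kept even because yuv420p requires them:

import numpy as np

# 60 frames of 480x640 RGB noise, written out at 30 fps
frames = np.random.randint(0, 256, (60, 480, 640, 3), dtype=np.uint8)
vidwrite('noise.mp4', frames, framerate=30)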

@Santhosh1509
Author

@kylemcdonald Thank you, it worked!

How can I alter the CRF?
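
(For what it's worth: ffmpeg-python forwards extra keyword arguments on output() as ffmpeg command-line options, so the CRF can presumably be set by swapping one line of the vidwrite function above — an untested sketch:)

.output(fn, pix_fmt='yuv420p', vcodec=vcodec, r=framerate, crf=18)  # crf=18 maps to "-crf 18"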

@jpreiss

jpreiss commented Jul 14, 2020

Is @kylemcdonald's code example still the preferred way to stream frames from in-memory numpy arrays to an ffmpeg process?

@jblugagne

jblugagne commented Jul 17, 2020

When I try to run @kylemcdonald's function on an image array of 228x2048x2048x3 np.uint8, only 65 frames are saved, and it looks like a bunch of them are skipped:

ffmpeg version 4.0.2 Copyright (c) 2000-2018 the FFmpeg developers
  built with gcc 4.8.2 (GCC) 20140120 (Red Hat 4.8.2-15)
  configuration: --prefix=/home/jeanbaptiste/.conda/envs/ffmpeg_env --disable-doc --disable-openssl --enable-shared --enable-static --extra-cflags='-Wall -g -m64 -pipe -O3 -march=x86-64 -fPIC' --extra-cxxflags='-Wall -g -m64 -pipe -O3 -march=x86-64 -fPIC' --extra-libs='-lpthread -lm -lz' --enable-zlib --enable-pic --enable-pthreads --enable-gpl --enable-version3 --enable-hardcoded-tables --enable-avresample --enable-libfreetype --enable-gnutls --enable-libx264 --enable-libopenh264
  libavutil      56. 14.100 / 56. 14.100
  libavcodec     58. 18.100 / 58. 18.100
  libavformat    58. 12.100 / 58. 12.100
  libavdevice    58.  3.100 / 58.  3.100
  libavfilter     7. 16.100 /  7. 16.100
  libavresample   4.  0.  0 /  4.  0.  0
  libswscale      5.  1.100 /  5.  1.100
  libswresample   3.  1.100 /  3.  1.100
  libpostproc    55.  1.100 / 55.  1.100
Input #0, rawvideo, from 'pipe:':
  Duration: N/A, start: 0.000000, bitrate: 2516582 kb/s
    Stream #0:0: Video: rawvideo (RGB[24] / 0x18424752), rgb24, 2048x2048, 2516582 kb/s, 25 tbr, 25 tbn, 25 tbc
Stream mapping:
  Stream #0:0 -> #0:0 (rawvideo (native) -> h264 (libx264))
[libx264 @ 0x1e1a400] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
[libx264 @ 0x1e1a400] profile High, level 5.0
[libx264 @ 0x1e1a400] 264 - core 152 - H.264/MPEG-4 AVC codec - Copyleft 2003-2017 - http://www.videolan.org/x264.html - options: cabac=1 ref=3 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=7 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2 threads=12 lookahead_threads=2 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1 weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=7 scenecut=40 intra_refresh=0 rc_lookahead=40 rc=crf mbtree=1 crf=23.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00
Output #0, mp4, to '/run/media/jeanbaptiste/SAMSUNG/compression_test/test2.mp4':
  Metadata:
    encoder         : Lavf58.12.100
    Stream #0:0: Video: h264 (libx264) (avc1 / 0x31637661), yuv420p, 2048x2048, q=-1--1, 7 fps, 14336 tbn, 7 tbc
    Metadata:
      encoder         : Lavc58.18.100 libx264
    Side data:
      cpb: bitrate max/min/avg: 0/0/0 buffer size: 0 vbv_delay: -1
frame=    9 fps=0.0 q=0.0 size=       0kB time=00:00:00.00 bitrate=N/A dup=0 drop=16 speed=   0x    
frame=   16 fps= 16 q=0.0 size=       0kB time=00:00:00.00 bitrate=N/A dup=0 drop=34 speed=   0x    
frame=   23 fps= 15 q=0.0 size=       0kB time=00:00:00.00 bitrate=N/A dup=0 drop=54 speed=   0x    
frame=   31 fps= 15 q=0.0 size=       0kB time=00:00:00.00 bitrate=N/A dup=0 drop=73 speed=   0x    
frame=   39 fps= 15 q=0.0 size=       0kB time=00:00:00.00 bitrate=N/A dup=0 drop=93 speed=   0x    
frame=   46 fps= 13 q=0.0 size=       0kB time=00:00:00.00 bitrate=N/A dup=0 drop=111 speed=   0x    
frame=   51 fps= 13 q=0.0 size=       0kB time=00:00:00.00 bitrate=N/A dup=0 drop=126 speed=   0x    
frame=   55 fps= 12 q=0.0 size=       0kB time=00:00:00.00 bitrate=N/A dup=0 drop=136 speed=   0x    
frame=   60 fps= 12 q=24.0 size=     512kB time=00:00:00.14 bitrate=29348.5kbits/s dup=0 drop=147 speed=0.0282x    
frame=   63 fps= 11 q=24.0 size=    1024kB time=00:00:00.57 bitrate=14679.0kbits/s dup=0 drop=157 speed=0.102x    
frame=   65 fps=6.3 q=-1.0 Lsize=    9993kB time=00:00:08.85 bitrate=9242.1kbits/s dup=0 drop=163 speed=0.86x    
video:9991kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.015101%
[libx264 @ 0x1e1a400] frame I:2     Avg QP:18.44  size:381836
[libx264 @ 0x1e1a400] frame P:36    Avg QP:19.80  size:169252
[libx264 @ 0x1e1a400] frame B:27    Avg QP:20.04  size:124944
[libx264 @ 0x1e1a400] consecutive B-frames: 43.1%  3.1%  4.6% 49.2%
[libx264 @ 0x1e1a400] mb I  I16..4:  4.5% 89.8%  5.7%
[libx264 @ 0x1e1a400] mb P  I16..4:  1.1% 30.9%  0.4%  P16..4: 34.6% 16.1% 10.5%  0.0%  0.0%    skip: 6.5%
[libx264 @ 0x1e1a400] mb B  I16..4:  0.4% 12.7%  0.0%  B16..8: 55.2% 10.8%  3.0%  direct: 3.6%  skip:14.2%  L0:53.9% L1:44.4% BI: 1.8%
[libx264 @ 0x1e1a400] 8x8 transform intra:95.0% inter:71.9%
[libx264 @ 0x1e1a400] coded y,uvDC,uvAC intra: 86.4% 0.0% 0.0% inter: 40.4% 0.0% 0.0%
[libx264 @ 0x1e1a400] i16 v,h,dc,p: 16%  9% 34% 42%
[libx264 @ 0x1e1a400] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 12%  8% 49%  5%  5%  5%  5%  5%  5%
[libx264 @ 0x1e1a400] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 16%  8% 28%  9%  9%  9%  7%  8%  6%
[libx264 @ 0x1e1a400] i8c dc,h,v,p: 100%  0%  0%  0%
[libx264 @ 0x1e1a400] Weighted P-Frames: Y:5.6% UV:0.0%
[libx264 @ 0x1e1a400] ref P L0: 41.4% 13.4% 27.6% 16.5%  1.0%
[libx264 @ 0x1e1a400] ref B L0: 68.8% 26.7%  4.5%
[libx264 @ 0x1e1a400] ref B L1: 88.8% 11.2%
[libx264 @ 0x1e1a400] kb/s:8813.73

Am I missing something here?

@jpreiss

jpreiss commented Jul 17, 2020

@jblugagne I encountered a related problem - I was getting duplicated frames in my stream. I had to pass the r=framerate argument to the input() method instead of the output() method.

The ffmpeg documentation says:

-r[:stream_specifier] fps (input/output,per-stream)

Set frame rate (Hz value, fraction or abbreviation).

As an input option, ignore any timestamps stored in the file and instead generate timestamps assuming constant frame rate fps.

As an output option, duplicate or drop input frames to achieve constant output frame rate fps.

Since our "input" is a stream of raw video frames over a pipe, it should not contain any timestamps at all, so it makes sense that we would need some mechanism of specifying timestamps like the "input" option.

I don't fully understand the behavior of the "output option". If our input stream has no timestamps, how did it decide to drop frames for you, but duplicate them for me? Are the timestamps generated implicitly by the real wall clock time when the frames arrive over the pipe? Regardless, dropping and duplicating frames are both bad for this application.
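
Concretely, the change to @kylemcdonald's vidwrite above is just moving r=framerate from output() to input() — a sketch of the pipeline with that change applied, using the same variables as the original function:

process = (
    ffmpeg
        # declaring the rate on the input makes ffmpeg stamp each piped frame
        # with a constant-rate timestamp instead of inventing its own
        .input('pipe:', format='rawvideo', pix_fmt='rgb24',
               s='{}x{}'.format(width, height), r=framerate)
        # no r= here, so the output keeps the input timestamps as-is and
        # nothing gets duplicated or dropped
        .output(fn, pix_fmt='yuv420p', vcodec=vcodec)
        .overwrite_output()
        .run_async(pipe_stdin=True)
)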

@jblugagne

@jpreiss thank you! That solved my problem. Not sure what is going on with the r output option either.

@lminer

lminer commented Aug 4, 2020

Is there a way to do this where you pass in a numpy array (audio in this case) and get a numpy array in return?
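
One approach that should work for audio is piping raw samples in and capturing raw samples out with a single run() call. A hedged sketch with a hypothetical resample() helper, assuming mono float32 samples at 44.1 kHz (adjust format/ar/ac to match your data):

import ffmpeg
import numpy as np

def resample(audio, in_rate=44100, out_rate=16000):
    # pipe raw float32 PCM through ffmpeg and read the converted bytes back
    out, _ = (
        ffmpeg
        .input('pipe:', format='f32le', ar=in_rate, ac=1)
        .output('pipe:', format='f32le', ar=out_rate, ac=1)
        .run(input=audio.astype(np.float32).tobytes(), capture_stdout=True)
    )
    return np.frombuffer(out, dtype=np.float32)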

@jaehobang

When trying to run @kylemcdonald's function written above, with the frame-rate modification given by @jpreiss, I am running into a BrokenPipeError. The input is a 15000x241x369x3 np.uint8 array. The error is as follows:

---------------------------------------------------------------------------
BrokenPipeError                           Traceback (most recent call last)
<ipython-input-30-41eb52741086> in <module>
----> 1 vidwrite(output_filename, images_cut)

<ipython-input-29-34624c1ce396> in vidwrite(fn, images, framerate, vcodec)
     16         process.stdin.write(
     17             frame
---> 18                 .astype(np.uint8)
     19                 .tobytes()
     20         )

BrokenPipeError: [Errno 32] Broken pipe

It seems that this error is raised while trying to write the 2nd frame.

Did anyone encounter a similar issue or know of a fix? Thank you in advance.

@valin1

valin1 commented Nov 12, 2020

@jaehobang Were you able to figure out this problem? Because I am having the same problem with a [Errno 32] Broken pipe error.

@samrere

samrere commented Nov 29, 2020

vidwrite('test', ...) will produce a broken pipe error, but vidwrite('test.mp4', frames) will be fine.
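
Presumably that's because ffmpeg infers the container format from the output file's extension. If you really need an extensionless name, passing the format explicitly should work — an untested one-line variant of vidwrite's output() call, using the f kwarg (which maps to ffmpeg's -f option):

.output('test', f='mp4', pix_fmt='yuv420p', vcodec='libx264')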

@solankiharsh

@kylemcdonald Thank you so much. I was able to implement your provided lines of code in my use case.

However, I need help with one part: is there a way to get separate H.264-encoded frames instead of one .h264 file?

This is what I am doing exactly:

def start_streaming(self, channel_name):
    request = api.RenderRequest()
    request.deeptwin_uuid = self._deeptwin_uuid
    request.neutral_expr_coeffs.extend(self._deeptwin.neutral_expr_coeffs)
    response = self._client.Start(request)
    print('Started Streaming ...')
    # 10 frames, resolution 256x256, and 1 fps
    #width, height, n_frames, fps = 256, 256, 10, 1
    path = 'out.h264'
    #cv2.imwrite(path, img)
    #print(f'written {res.status} {res.id} to file at {datetime.datetime.now()}')
    process = (
        ffmpeg
        .input('pipe:', format='rawvideo', pix_fmt='bgr24', s='{}x{}'.format(self._deeptwin.image_width, self._deeptwin.image_height))
        .output(path, pix_fmt='yuv420p', vcodec='libx264', r=25)
        .overwrite_output()
        .run_async(pipe_stdin=True)
    )
    for res in response:
        print(f'{res.status} image {res.id} rendered at {datetime.datetime.now()}')
        img = np.frombuffer(res.image, dtype=np.uint8)
        img = img.reshape((int(self._deeptwin.image_height * 1.5), self._deeptwin.image_width))
        img = cv2.cvtColor(img, cv2.COLOR_YUV2BGR_I420)
        print(f'before write {res.status} {res.id} :: {datetime.datetime.now()}')
        #path = f'{storage}/{channel_name}/{res.status}-{res.id}.h264'

        # write the whole BGR frame to ffmpeg's stdin
        process.stdin.write(
            img
            .astype(np.uint8)
            .tobytes()
        )
    process.stdin.close()
    process.wait()

@ayushjn20

@jpreiss
Quoting you,

Since our "input" is a stream of raw video frames over a pipe, it should not contain any timestamps at all, so it makes sense that we would need some mechanism of specifying timestamps like the "input" option.

@jpreiss @kylemcdonald
What is that mechanism where we can specify timestamps in the input options?

I am facing similar issues because the original input video (the one the frames are extracted from) has a variable FPS. The frames are extracted using OpenCV's VideoCapture, each frame is processed independently, and then the sequence of processed frames has to be written to a new video. I am converting that sequence of processed images to video with the same method @kylemcdonald posted here.

@jpreiss

jpreiss commented Feb 23, 2021

@ayushjn20 sorry, I have no idea how to work with variable frame rates.

@CharlesSS07

@jaehobang Were you able to figure out this problem? Because I am having the same problem with a [Errno 32] Broken pipe error.

[Errno 32] Broken pipe means the ffmpeg process errored out and closed, so there was no pipe left to write input into. You can figure out what the error was by looking at process.stderr, like so:

import ffmpeg
import io
import numpy as np


def vidwrite(fn, images, framerate=60, vcodec='libx264'):
    if not isinstance(images, np.ndarray):
        images = np.asarray(images)
    _, height, width, channels = images.shape
    process = (
        ffmpeg
            .input('pipe:', format='rawvideo', pix_fmt='rgb24', s='{}x{}'.format(width, height), r=framerate)
            .output(fn, pix_fmt='yuv420p', vcodec=vcodec, r=framerate)
            .overwrite_output()
            .run_async(pipe_stdin=True, overwrite_output=True, pipe_stderr=True)
    )
    for frame in images:
        try:
            process.stdin.write(
                frame.astype(np.uint8).tobytes()
            )
        except BrokenPipeError:
            # the process has died, so print everything it wrote to stderr
            for line in io.TextIOWrapper(process.stderr, encoding="utf-8"):
                print(line)
            process.stdin.close()
            process.wait()
            return  # can't write anymore, so end the loop and the function

In my case, it was just

Unknown encoder 'libx264'

because I hadn't installed that library.

@omrivolk

omrivolk commented Feb 17, 2022

@kylemcdonald do you know how to achieve this with RGBA? My numpy array's shape is (m, n, 4), with the 4th value being the opacity between 0 and 1.
I tried switching rgb24 to rgba and yuv420p to yuva420p, but it does not work; the alpha values are ignored.

I want to overlay my video on a map so I need some parts to be transparent.
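
For what it's worth, H.264 in MP4 with yuv420p has no alpha plane at all, so the alpha channel is discarded regardless of the input pix_fmt. One encoder/container pair that does carry alpha is VP9 in WebM — a hedged sketch of the pipeline, reusing the width/height/framerate variables from vidwrite and assuming uint8 RGBA frames (a 0-1 float alpha would need scaling to 0-255 first):

process = (
    ffmpeg
    .input('pipe:', format='rawvideo', pix_fmt='rgba',
           s='{}x{}'.format(width, height), r=framerate)
    # yuva420p keeps the alpha plane, and libvpx-vp9 can encode it into WebM
    .output('overlay.webm', pix_fmt='yuva420p', vcodec='libvpx-vp9')
    .overwrite_output()
    .run_async(pipe_stdin=True)
)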

@antortjim

antortjim commented Feb 22, 2022

Is it possible to implement CUDA support in this python wrapper? I have made a small repository where I write a constant random frame to video using ffmpeg with CUDA support, but I am not getting the performance I expected.

Maybe my ffmpeg flags are not correct?

Any help would be very appreciated :)

P.S. I built ffmpeg with CUDA support enabled.
P.P.S. I am trying to do this because cv2.cuda.VideoWriter is apparently not supported on Linux at the moment:
opencv/opencv_contrib#3044
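
One thing worth checking: ffmpeg-python only assembles the command line, so using CUDA mostly comes down to selecting an NVENC encoder that your ffmpeg build actually has. A hedged sketch, reusing vidwrite's variables and assuming h264_nvenc shows up in the output of ffmpeg -encoders:

process = (
    ffmpeg
    .input('pipe:', format='rawvideo', pix_fmt='rgb24',
           s='{}x{}'.format(width, height), r=framerate)
    # h264_nvenc offloads the H.264 encode to the GPU
    .output(fn, pix_fmt='yuv420p', vcodec='h264_nvenc')
    .overwrite_output()
    .run_async(pipe_stdin=True)
)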

@LvJC

LvJC commented Mar 16, 2022

@jaehobang Have you figured it out? I have the same problem...

@Yaxit

Yaxit commented May 30, 2022

I'm following as well because I have a similar issue. My input buffer comes from a YouTube stream. In my case the conversion works out fine, but I still get the BrokenPipeError at the end. Any idea why this happens?

from io import BytesIO

import ffmpeg
import pytube

buff = BytesIO()
streams = pytube.YouTube('https://www.youtube.com/watch?v=xxxxx').streams
streams.filter(only_audio=True).first().stream_to_buffer(buff)

buff.seek(0)
process = (
    ffmpeg
    .input('pipe:', ss=420, to=430, f='mp4')
    .output('out.wav', ac=1, ar=16000, acodec='pcm_s16le')
    .overwrite_output()
    .run_async(pipe_stdin=True)
)
process.stdin.write(buff.read())  # <-- BrokenPipe here
process.stdin.close()
process.wait()
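
A plausible explanation: with to=430, ffmpeg stops reading and exits as soon as it has the bytes it needs, closing its end of the pipe while write() is still pushing the rest of the buffer. If out.wav comes out complete, the exception can arguably just be tolerated — a sketch of the tail end:

import contextlib

# ffmpeg may close the pipe early once it has read past to=430; tolerate that
# for both the write and the close (flushing on close can hit the dead pipe too)
with contextlib.suppress(BrokenPipeError):
    process.stdin.write(buff.read())
    process.stdin.close()
process.wait()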
