Some questions about coviar data loader #7

Closed
Manolo1988 opened this issue Jul 24, 2018 · 3 comments


Manolo1988 commented Jul 24, 2018

Hi,
I have some questions after reading coviar_data_loader.c.
First, you initialize the variable accu_src_old as follows, but I wonder why:

                    for (size_t x = 0; x < w; ++x) {
                        for (size_t y = 0; y < h; ++y) {
                            accu_src_old[x * h * 2 + y * 2    ]  = x;
                            accu_src_old[x * h * 2 + y * 2 + 1]  = y;
                        }
                    }

Second, does the following code mean that every frame in the target GOP up to the target frame is decoded, and that only the I-frame and the target frame are converted to BGR format?

            if (cur_gop == gop_target && cur_pos <= pos_target) {
                ret = avcodec_decode_video2(pCodecCtx, pFrame, &got_picture, &packet);  
......
                if (got_picture) {

                    if ((cur_pos == 0              && accumulate  && representation == RESIDUAL) ||
                        (cur_pos == pos_target - 1 && !accumulate && representation == RESIDUAL) ||
                        cur_pos == pos_target) {
                        create_and_load_bgr(
                            pFrame, pFrameBGR, buffer, bgr_arr, cur_pos, pos_target);
                    }

Third, in dataset.py, I wonder why you process the image as follows:

def clip_and_scale(img, size):
    return (img * (127.5 / size)).astype(np.int32)

Thanks for your excellent work and code. Looking forward to your reply. :)
@chaoyuaw
Owner

chaoyuaw commented Aug 4, 2018

Hi @Manolo1988,

Thanks for your questions.

  1. To get accumulated MV and residuals, we keep track of "where a pixel moves in the following frames". Initially, the location of a pixel at (x, y) is (x, y). So we initialized it as (x, y). And through motion compensation at the following frames, the pixel might be copied to other locations. We compare the final location with the original location to get accumulated motion vectors (and accumulated residuals). Please feel free to let me know if this makes sense to you.

  2. I'm not sure if I fully understand your question, but I'll try to answer based on my understanding. Please let me know if this answers your question. The reason why we construct BGR is to compute accumulated residuals, which is the difference between the predicted frame (without adding residual along the path) and the actual frame. Predicted frame is a function of MVs and I-frame. So we only need MVs of the frames in between the target frame and I-frame, without the need to decode BGR for them.

  3. This follows the convention from two-stream networks, where optical flows are clipped at a certain magnitude. Here we scale [-20, 20] to [0, 255], and values beyond the range are clipped.
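
The bookkeeping described in points 1 and 2 can be sketched with a small toy example. This is not the loader's actual code. It is a plain-Python illustration that assumes each frame supplies a hypothetical per-pixel table `frame_src` giving the location in the previous frame that each pixel was copied from. The helper names (`identity_map`, `accumulate`, `accumulated_mv`, `accumulated_residual`, `shift_right`) are invented for this sketch.

```python
def identity_map(w, h):
    """Initially every pixel at (x, y) traces back to (x, y) itself."""
    return [[(x, y) for y in range(h)] for x in range(w)]


def accumulate(accu_src_old, frame_src):
    """Compose one frame's motion with the trace accumulated so far.

    frame_src[x][y] is the (assumed) location in the previous frame
    that pixel (x, y) of the current frame was copied from.
    """
    w, h = len(frame_src), len(frame_src[0])
    accu_src = [[None] * h for _ in range(w)]
    for x in range(w):
        for y in range(h):
            px, py = frame_src[x][y]
            # Step back one frame, then reuse the trace stored there.
            accu_src[x][y] = accu_src_old[px][py]
    return accu_src


def accumulated_mv(accu_src):
    """Current location minus the traced location in the I-frame."""
    return [[(x - sx, y - sy) for y, (sx, sy) in enumerate(col)]
            for x, col in enumerate(accu_src)]


def accumulated_residual(i_frame, target, accu_src):
    """Actual frame minus the frame predicted from the I-frame and MVs only."""
    return [[target[x][y] - i_frame[sx][sy] for y, (sx, sy) in enumerate(col)]
            for x, col in enumerate(accu_src)]


def shift_right(w, h):
    """Toy motion: each pixel is copied from its left neighbour (clamped)."""
    return [[(max(x - 1, 0), y) for y in range(h)] for x in range(w)]


w, h = 4, 2
accu = identity_map(w, h)
for _ in range(2):  # two P-frames with the same toy motion
    accu = accumulate(accu, shift_right(w, h))
mv = accumulated_mv(accu)

# Grayscale toy frames indexed as [x][y]; the target is the prediction plus 1.
i_frame = [[10 * x + y for y in range(h)] for x in range(w)]
target = [[i_frame[max(x - 2, 0)][y] + 1 for y in range(h)] for x in range(w)]
residual = accumulated_residual(i_frame, target, accu)
```

In this toy run the pixel at (3, 1) traces back to (1, 1) in the I-frame, so its accumulated motion vector is (2, 0). Only the I-frame and the target frame contribute pixel values. The intermediate frames contribute motion only, which is consistent with decoding BGR for just those two frames.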
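
The arithmetic behind point 3 can be checked with a scalar version of `clip_and_scale`. This is a hedged sketch: the thread shows only the scaling step, so the `+ 128` offset and the clip to [0, 255] below are assumptions about what happens afterwards, and `clip_and_scale_value` and `to_uint8_range` are hypothetical names. Plain `int()` stands in for `.astype(np.int32)`, since both truncate toward zero.

```python
def clip_and_scale_value(v, size):
    """Scalar analogue of (img * (127.5 / size)).astype(np.int32)."""
    return int(v * (127.5 / size))  # truncates toward zero, like the cast


def to_uint8_range(v, size=20):
    """Scale, shift by 128, then clip to [0, 255] (offset and clip are assumed)."""
    shifted = clip_and_scale_value(v, size) + 128
    return min(max(shifted, 0), 255)


samples = [-40, -20, -10, 0, 10, 20, 40]
mapped = [to_uint8_range(v) for v in samples]
print(mapped)  # [0, 1, 65, 128, 191, 255, 255]
```

Under these assumptions, motion values in [-20, 20] land in [1, 255], zero motion maps to 128, and anything outside the range saturates at 0 or 255.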

@Manolo1988
Author

Thank you for your reply. I have finally figured out the meaning of the code above and highly appreciate your ideas.

@RyanCV

RyanCV commented Aug 13, 2018

@chaoyuaw
1. Why do x_start and y_start start with negative values?

for (int x_start = (-1 * mv->w / 2); x_start < mv->w / 2; ++x_start) {
    for (int y_start = (-1 * mv->h / 2); y_start < mv->h / 2; ++y_start) {
        ...
    }
}

2. In lines 288-296, why does bgr_arr have dims[4]? What does dims[0] mean?

// Initialize arrays.
if (! (*bgr_arr)) {
    npy_intp dims[4];
    dims[0] = 2;
    dims[1] = h;
    dims[2] = w;
    dims[3] = 3;
    *bgr_arr = PyArray_ZEROS(4, dims, NPY_UINT8, 0);
}
