H.264 encoder: SIGFPE in intel_mfc_brc_postpack_vbr #215

Closed
sebastinas opened this issue Jun 30, 2017 · 17 comments

@sebastinas
Contributor

Forwarding a bug report from @sesse we received in the Debian BTS #866512:

After dist-upgrading, I can no longer start nageru without a segfault:

Thread 13 "QS_Encode" received signal SIGFPE, Arithmetic exception.
[Switching to Thread 0x7fffbaa7e700 (LWP 8583)]
0x00007fffc0bfab37 in intel_mfc_brc_postpack_vbr (frame_bits=3696,
encoder_context=0x5555565487d0, encode_state=0x55555653dff8)
at gen6_mfc_common.c:402
402     gen6_mfc_common.c: No such file or directory.
(gdb) bt
#0 0x00007fffc0bfab37 in intel_mfc_brc_postpack_vbr (frame_bits=3696,
encoder_context=0x5555565487d0, encode_state=0x55555653dff8)
at gen6_mfc_common.c:402
#1 intel_mfc_brc_postpack (encode_state=encode_state@entry=0x55555653dff8,
encoder_context=encoder_context@entry=0x5555565487d0,
frame_bits=frame_bits@entry=3696) at gen6_mfc_common.c:484
#2 0x00007fffc0c1586f in gen75_mfc_avc_encode_picture (
encoder_context=<optimized out>, encode_state=0x55555653dff8,
ctx=0x5555564f3500) at gen75_mfc.c:1707
#3 gen75_mfc_pipeline (ctx=0x5555564f3500, profile=<optimized out>,
encode_state=0x55555653dff8, encoder_context=<optimized out>)
at gen75_mfc.c:2529
#4 0x00007fffc0c58bac in intel_encoder_end_picture (ctx=0x5555564f3500,
profile=<optimized out>, codec_state=0x55555653dff8,
hw_context=0x5555565487d0) at i965_encoder.c:1327
#5 0x00007ffff5111dbf in vaEndPicture ()
from /usr/lib/x86_64-linux-gnu/libva.so.1
#6 0x00005555555f1a4d in QuickSyncEncoderImpl::encode_frame (
this=this@entry=0x555556068c70, frame=...,
encoding_frame_num=encoding_frame_num@entry=2,
display_frame_num=display_frame_num@entry=1,
gop_start_display_frame_num=gop_start_display_frame_num@entry=0,
frame_type=frame_type@entry=1, pts=16800, dts=14000, duration=4800,
ycbcr_coefficients=movit::YCBCR_REC_601) at quicksync_encoder.cpp:2045

Downgrading i965-va-driver to 1.7.3-1 (the version in stable) fixes the issue.

This crash occurs with 1.8.2 and 1.8.3.

@QuPengfei

From frame #6 of the backtrace, I assumed you were running MediaSDK with the open-source VA-API driver. At present they do not work together. Please contact the MediaSDK team for help.

@sesse
Contributor

sesse commented Jun 30, 2017

What? No, I'm not using Media SDK.

@fhvwy
Contributor

fhvwy commented Jun 30, 2017

This is a bit odd - it looks like you've somehow ended up in VBR mode with a target bitrate of zero (hence integer divide by zero -> SIGFPE).

Can you print the content of mfc_context->brc after the exception there? It would also be interesting to know what happened before that point - is this the first frame?

@fhvwy
Contributor

fhvwy commented Jun 30, 2017

Aha, I think I see the problem. Have you told the rate controller that you aren't going to use any B-frames (ip_period == 1), but then are encoding a B-frame? The rate controller will have allocated zero bits to B-frames, which then gives the divide by zero here.
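
To make the failure mode concrete, here is a minimal sketch of a guarded division. It is not the driver's code: `frame_overshoot_percent` and its parameters are hypothetical names used only for illustration. It assumes the rate controller divides by a per-frame-type target size, and shows how a zero target can be rejected before the integer division that would otherwise raise SIGFPE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not taken from the i965 driver.
 * Computes by how many percent a frame overshoots its target size.
 * Returns false instead of dividing when the target is zero or negative,
 * because an integer division by zero raises SIGFPE on x86. */
static bool frame_overshoot_percent(int frame_bits, int target_bits, int *out)
{
    assert(out != NULL);
    if (target_bits <= 0)
        return false;
    /* Widen before multiplying so the intermediate cannot overflow int. */
    *out = (int)(((long long)frame_bits - target_bits) * 100 / target_bits);
    return true;
}
```

A caller would check the return value and skip or clamp the rate-control update when it is false, rather than letting the division run.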

@sesse
Contributor

sesse commented Jun 30, 2017

FWIW, I run with constant quantizer (so target bitrate shouldn't matter). My ip_period is 3, and I should communicate this down. My code is based on the H.264 encoder example; you can see it here:

https://git.sesse.net/?p=nageru;a=blob;f=quicksync_encoder.cpp;h=f7e1696ab9f13419f182ed03a556e2e913c2c1b1;hb=HEAD

https://git.sesse.net/?p=nageru;a=blob;f=quicksync_encoder_impl.h;h=bc84e0adc7280beadb9e11ad1ee278eec59ac1ec;hb=HEAD

IIRC it's complaining about wrong frame order at the very start, but so did the example. :-) I can get the debugging information you requested later today; not at the right machine right now.

@QuPengfei

@sesse Aha, sorry, I mixed things up. :(

The input RC mode is CQP, but the call stack shows it running into the VBR path. I cannot open the links; they return this error:
400 - Invalid hash base parameter

@sesse
Contributor

sesse commented Jun 30, 2017

I edited the comment to fix the links; please try again.

@QuPengfei

Do you set rc_mode to -1 by default? If so, VBR mode will be assigned.

881     if (attrib[VAConfigAttribRateControl].value != VA_ATTRIB_NOT_SUPPORTED) {
882         int tmp = attrib[VAConfigAttribRateControl].value;
883
884         if (rc_mode == -1 || !(rc_mode & tmp)) {
885             if (rc_mode != -1) {
886                 printf("Warning: Don't support the specified RateControl mode: %s!!!, switch to ", rc_to_string(rc_mode));
887             }
888
889             for (i = 0; i < sizeof(rc_default_modes) / sizeof(rc_default_modes[0]); i++) {
890                 if (rc_default_modes[i] & tmp) {
891                     rc_mode = rc_default_modes[i];
892                     break;
893                 }
894             }
895         }
896
897         config_attrib[config_attrib_num].type = VAConfigAttribRateControl;
898         config_attrib[config_attrib_num].value = rc_mode;
899         config_attrib_num++;
900     }

@sesse
Contributor

sesse commented Jun 30, 2017

Look at lines 889–891; it goes through the list of available modes (where CQP has first priority in the array) and assigns the first one that works. So the rc_mode should be CQP, unless VA-API stops advertising it (which I haven't checked).

@xhaihao
Contributor

xhaihao commented Jun 30, 2017

@sesse 1.7.3 doesn't support VBR, so CQP effectively had first priority. 1.8.0+ supports VBR, so VBR now has first priority. You should either provide the right VBR-related parameters or change the array.
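
The effect described here can be reproduced with a small self-contained sketch of a "first advertised mode wins" selection. The `RC_*` flags and `pick_rc_mode` are stand-ins invented for this illustration (they are not the libva definitions), and the support masks below are made up; the point is only that the same priority array yields a different mode once the driver starts advertising VBR.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative stand-ins for rate-control bit flags; not the libva headers. */
enum { RC_CBR = 0x02, RC_VBR = 0x04, RC_CQP = 0x10 };

/* Returns the first entry of `priority` whose bit is set in `supported`,
 * or -1 if the driver advertises none of them. */
static int pick_rc_mode(const int *priority, size_t n, int supported)
{
    assert(priority != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (priority[i] & supported)
            return priority[i];
    }
    return -1;
}
```

With a priority array that lists VBR before CQP, a driver that advertises only CQP and CBR yields CQP, while a driver that also advertises VBR yields VBR. Reordering the array so that CQP comes first restores the old choice.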

@sesse
Contributor

sesse commented Jun 30, 2017

Ah. I should just hard-code CQP, but that explains it. I'll keep the bug open until I can verify it fixes the issue.

@xhaihao added the bug label Jun 30, 2017
@sesse
Contributor

sesse commented Jun 30, 2017

You're right, hard-coding CQP works. Thanks for the investigation!
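
For readers who want to apply the same fix, a hedged sketch of what "hard-coding CQP" could look like follows. `select_cqp_only` is a hypothetical helper, not code from Nageru or libva; it assumes the caller already has the bitmask of rate-control modes the driver advertises and the flag value for CQP, and it fails explicitly instead of silently falling back to another mode.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: accept CQP only.
 * `supported` is the bitmask of modes the driver advertises and
 * `cqp_flag` is the bit that represents CQP in that mask.
 * On success, stores cqp_flag in *rc_mode and returns true.
 * On failure, leaves *rc_mode untouched and returns false so the caller
 * can report an error rather than end up in an unintended mode. */
static bool select_cqp_only(int supported, int cqp_flag, int *rc_mode)
{
    assert(rc_mode != NULL);
    if (!(supported & cqp_flag))
        return false;
    *rc_mode = cqp_flag;
    return true;
}
```

The design choice is to make the unsupported case loud: an encoder that only ever fills in quantizer parameters has no valid bitrate settings to offer a bitrate-based mode.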

@xhaihao
Contributor

xhaihao commented Jul 13, 2017

@sesse, I'd like to avoid the SIGFPE for VBR in the driver. I see that bitrate, intra_period, ip_period and target_percentage are all set in your code, so it seems impossible for mfc_context->brc.target_frame_size[0][slice_type] to be 0. I also tried the H.264 encoder example and it works fine for me. Could you help debug why mfc_context->brc.target_frame_size[0][slice_type] is 0 in your case?

@sesse
Contributor

sesse commented Jul 13, 2017

@xhaihao Most likely the SIGFPE is because I never set a bitrate, yet asked for VBR by accident. It's an invalid configuration, which I've fixed on my side (since I never wanted to use VBR in the first place). So this bug can be closed.

@xhaihao
Contributor

xhaihao commented Jul 13, 2017

I saw that you set the bitrate in the following code. Did I misread something?

seq_param.picture_height_in_mbs = frame_height_mbaligned / 16;
seq_param.bits_per_second = frame_bitrate;

misc_rate_ctrl->bits_per_second = frame_bitrate;
misc_rate_ctrl->target_percentage = 66;

@sesse
Contributor

sesse commented Jul 13, 2017

Hm, well, that's odd, yes. I'm unfortunately a bit busy with other things this week, but I can try to answer your question next week? If not, you can download and compile Nageru yourself; 1.6.0 and older should show the issue.

@xhaihao
Contributor

xhaihao commented Jan 18, 2018

The reporter has fixed the invalid configuration. In addition, intel_mfc_brc_postpack_vbr is not used by the new media kernels.

@xhaihao closed this as completed Jan 18, 2018