RK3328 : HDR media playback issues. #22

Closed
LongChair opened this issue Aug 10, 2017 · 27 comments

Comments

@LongChair
Collaborator

We are having trouble playing our HDR samples on the RK3328.

One of the samples that shows that issue is http://demo-uhd3d.com/fiche.php?cat=uhd&id=159

From what we can see, MPP seems to fail while parsing the data. This is the log we were able to retrieve:

[ffmpeg] hevc_rkmpp: Initializing RKMPP decoder.
mpi: mpp version:
[ffmpeg] hevc_rkmpp: RKMPP decoder initialized successfully.
[ffmpeg] hevc_rkmpp: Flush.
[ffmpeg] hevc_rkmpp: Wrote 1874 bytes to decoder
H265D_PARSER: No start code is found.
[ffmpeg] hevc_rkmpp: Wrote 705617 bytes to decoder
H265D_PARSER: nal_unit_type: 32, nuh_layer_id: 0 temporal_id: 0
H265D_PARSER: nal_unit_type: 33, nuh_layer_id: 0 temporal_id: 0
H265D_PARSER: nal_unit_type: 34, nuh_layer_id: 0 temporal_id: 0
H265D_PARSER: nal_unit_type: 39, nuh_layer_id: 0 temporal_id: 0
H265D_PARSER: nal_unit_type: 39, nuh_layer_id: 0 temporal_id: 0
H265D_PARSER: nal_unit_type: 39, nuh_layer_id: 0 temporal_id: 0
H265D_PARSER: nal_unit_type: 32, nuh_layer_id: 0 temporal_id: 0
H265D_PARSER: nal_unit_type: 33, nuh_layer_id: 0 temporal_id: 0
H265D_PARSER: nal_unit_type: 34, nuh_layer_id: 0 temporal_id: 0
H265D_PARSER: nal_unit_type: 39, nuh_layer_id: 0 temporal_id: 0
H265D_PARSER: nal_unit_type: 19, nuh_layer_id: 0 temporal_id: 0
H265D_PARSER: nal_unit_type: 32, nuh_layer_id: 0 temporal_id: 0
H265D_PARSER: s->nal_unit_type = 32,len = 24
H265PARSER_PS: Decoding VPS
H265PARSER_PS: vps_id = 0x0
H265PARSER_PS: Main 10 profile bitstream
H265D_PARSER: nal_unit_type: 33, nuh_layer_id: 0 temporal_id: 0
H265D_PARSER: s->nal_unit_type = 33,len = 46
H265PARSER_PS: Decoding SPS
H265PARSER_PS: Main 10 profile bitstream
H265PARSER_PS: read bit left 161
H265PARSER_PS: 2 read bit left 148
H265PARSER_PS: sps->log2_min_cb_size 3
H265PARSER_PS: sps->log2_diff_max_min_coding_block_size 3
H265PARSER_PS: sps->log2_min_tb_size 2
H265PARSER_PS: sps->log2_diff_max_min_transform_block_size 3
H265PARSER_PS: sps->log2_max_trafo_size 5
H265PARSER_PS: sps->amp_enabled_flag = 0
H265PARSER_PS: sps->sao_enabled = 1
H265PARSER_PS: sps->pcm_enabled_flag = 0
H265PARSER_PS: Decoding VUI
H265PARSER_PS: sps->log2_min_cb_size = 3 sps->log2_diff_max_min_coding_block_size = 3
H265PARSER_PS: plus sps->log2_ctb_size 6
H265PARSER_PS: sps->log2_ctb_size 6
H265D_PARSER: nal_unit_type: 34, nuh_layer_id: 0 temporal_id: 0
H265D_PARSER: s->nal_unit_type = 34,len = 7
H265PARSER_PS: Decoding PPS
H265PARSER_PS: Decoding PPS 1
H265PARSER_PS: num bit left 3
H265PARSER_PS: num bit left 2
H265PARSER_PS: num bit left 1
H265D_PARSER: nal_unit_type: 39, nuh_layer_id: 0 temporal_id: 0
H265D_PARSER: s->nal_unit_type = 39,len = 10
h265d_sei: Decoding SEI
h265d_sei: s->nal_unit_type 39 payload_type 144 payload_size 4
h265d_sei: Skipped PREFIX SEI 144
H265D_PARSER: nal_unit_type: 39, nuh_layer_id: 0 temporal_id: 0
H265D_PARSER: s->nal_unit_type = 39,len = 30
h265d_sei: Decoding SEI
h265d_sei: s->nal_unit_type 39 payload_type 137 payload_size 24
h265d_sei: mastering_display_colour_volume in
H265D_PARSER: nal_unit_type: 39, nuh_layer_id: 0 temporal_id: 0
H265D_PARSER: s->nal_unit_type = 39,len = 1754
h265d_sei: Decoding SEI
h265d_sei: s->nal_unit_type 39 payload_type 5 payload_size 1743
h265d_sei: Skipped PREFIX SEI 5
H265D_PARSER: nal_unit_type: 32, nuh_layer_id: 0 temporal_id: 0
H265D_PARSER: s->nal_unit_type = 32,len = 24
H265PARSER_PS: Decoding VPS
H265PARSER_PS: vps_id = 0x0
H265PARSER_PS: Main 10 profile bitstream
H265D_PARSER: nal_unit_type: 33, nuh_layer_id: 0 temporal_id: 0
H265D_PARSER: s->nal_unit_type = 33,len = 46
H265PARSER_PS: Decoding SPS
H265PARSER_PS: Main 10 profile bitstream
H265PARSER_PS: read bit left 161
H265PARSER_PS: 2 read bit left 148
H265PARSER_PS: sps->log2_min_cb_size 3
H265PARSER_PS: sps->log2_diff_max_min_coding_block_size 3
H265PARSER_PS: sps->log2_min_tb_size 2
H265PARSER_PS: sps->log2_diff_max_min_transform_block_size 3
H265PARSER_PS: sps->log2_max_trafo_size 5
H265PARSER_PS: sps->amp_enabled_flag = 0
H265PARSER_PS: sps->sao_enabled = 1
H265PARSER_PS: sps->pcm_enabled_flag = 0
H265PARSER_PS: Decoding VUI
H265PARSER_PS: sps->log2_min_cb_size = 3 sps->log2_diff_max_min_coding_block_size = 3
H265PARSER_PS: plus sps->log2_ctb_size 6
H265PARSER_PS: sps->log2_ctb_size 6
H265D_PARSER: nal_unit_type: 34, nuh_layer_id: 0 temporal_id: 0
H265D_PARSER: s->nal_unit_type = 34,len = 7
H265PARSER_PS: Decoding PPS
H265PARSER_PS: Decoding PPS 1
H265PARSER_PS: num bit left 7
H265PARSER_PS: num bit left 6
H265PARSER_PS: num bit left 5
H265D_PARSER: nal_unit_type: 39, nuh_layer_id: 0 temporal_id: 0
H265D_PARSER: s->nal_unit_type = 39,len = 1754
h265d_sei: Decoding SEI
h265d_sei: s->nal_unit_type 39 payload_type 5 payload_size 1743
h265d_sei: Skipped PREFIX SEI 5
H265D_PARSER: nal_unit_type: 19, nuh_layer_id: 0 temporal_id: 0
H265D_PARSER: s->nal_unit_type = 19,len = 701871
H265D_PARSER: hls_slice_header in
H265D_PARSER: hls_slice_header out
H265_PARSER_REF: width = 3840 height = 2160
H265_PARSER_REF: w_stride 3840 h_stride 2160
H265_PARSER_REF: alloc frame poc 0 slot_index 0
mpp_buf_slot: new width 3840 height 2160 stride hor 4864 ver 2160 fmt    1
H265D_PARSER: decode poc = 0
H265D_PARSER: Decoded frame with POC 0.
H265D_PARSER: Output frame with POC 0 frame->slot_index = 0
[ffmpeg] hevc_rkmpp: Wrote 61183 bytes to decoder
[ffmpeg] hevc_rkmpp: Wrote 2284 bytes to decoder
[ffmpeg] hevc_rkmpp: Wrote 1312 bytes to decoder
[ffmpeg] hevc_rkmpp: Wrote 1493 bytes to decoder
[ffmpeg] hevc_rkmpp: Decoder noticed an info change (3840x2160), format=1
H265D_PARSER: nal_unit_type: 1, nuh_layer_id: 0 temporal_id: 0
H265D_PARSER: IS_IRAP frame found error
H265D_PARSER: nal_unit_type: 1, nuh_layer_id: 0 temporal_id: 0
H265D_PARSER: s->nal_unit_type = 1,len = 61179
H265D_PARSER: hls_slice_header in
[ffmpeg] hevc_rkmpp: Received a discard/errinfo frame.
H265D_PARSER: Invalid collocated_ref_idx: 7.
H265D_PARSER: hls_slice_header out
H265D_PARSER: hls_slice_header error ret = -1004
H265D_PARSER: Error parsing NAL unit #0,error ret = 0xd.
H265D_PARSER: current stream is no right skip it
H265D_PARSER: nal_unit_type: 1, nuh_layer_id: 0 temporal_id: 0
[ffmpeg] hevc_rkmpp: Wrote 91034 bytes to decoder
H265D_PARSER: nal_unit_type: 1, nuh_layer_id: 0 temporal_id: 0
H265D_PARSER: s->nal_unit_type = 1,len = 2280
H265D_PARSER: hls_slice_header in
H265D_PARSER: hls_slice_header out
H265D_PARSER: decode poc = 2
H265D_PARSER: nal_unit_type: 0, nuh_layer_id: 0 temporal_id: 0
[ffmpeg] hevc_rkmpp: Wrote 2520 bytes to decoder
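
For readers following the log, the `nal_unit_type`, `nuh_layer_id` and `temporal_id` values come from the two-byte HEVC NAL unit header. The sketch below is not MPP's parser; `struct nal_header`, `parse_nal_header` and `nal_type_name` are hypothetical names used only for illustration, and only the type values that appear in the log are named.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper type for illustration; not taken from MPP. */
struct nal_header {
    unsigned nal_unit_type;
    unsigned nuh_layer_id;
    int      temporal_id;
};

/*
 * HEVC NAL unit header layout (2 bytes):
 *   forbidden_zero_bit (1) | nal_unit_type (6) | nuh_layer_id (6) |
 *   nuh_temporal_id_plus1 (3)
 */
static struct nal_header parse_nal_header(const uint8_t hdr[2])
{
    struct nal_header h;

    h.nal_unit_type = (hdr[0] >> 1) & 0x3f;
    h.nuh_layer_id  = ((unsigned)(hdr[0] & 0x01) << 5) | (hdr[1] >> 3);
    h.temporal_id   = (int)(hdr[1] & 0x07) - 1; /* -1 means an invalid header */
    return h;
}

/* Names for the NAL unit types that show up in the log above. */
static const char *nal_type_name(unsigned type)
{
    switch (type) {
    case 0:  return "TRAIL_N";
    case 1:  return "TRAIL_R";
    case 19: return "IDR_W_RADL";
    case 32: return "VPS";
    case 33: return "SPS";
    case 34: return "PPS";
    case 39: return "PREFIX_SEI";
    default: return "other";
    }
}
```

Read this way, the log shows the parameter sets (32, 33, 34) and SEI messages (39) being parsed, the IDR picture (19) decoding as POC 0, and the first failure on the following trailing picture (type 1).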
@Kwiboo
Owner

Kwiboo commented Aug 23, 2017

The following patch seems to make the sample linked in the original post work a little better. It also fixes playback of my Astra-SES_Demo_UHD_satellite_end-of-2015.mkv sample.
I have no clue why this solves the problem or what this hw reg controls.

--- a/mpp/hal/rkdec/h265d/hal_h265d_reg.c
+++ b/mpp/hal/rkdec/h265d/hal_h265d_reg.c
@@ -1468,8 +1468,6 @@ MPP_RET hal_h265d_gen_regs(void *hal,  HalTaskInfo *syn)
     }
     hw_regs->sw_interrupt.sw_dec_e         = 1;
     hw_regs->sw_interrupt.sw_dec_timeout_e = 1;
-    hw_regs->sw_interrupt.sw_wr_ddr_align_en = dxva_cxt->pp.tiles_enabled_flag
-                                               ? 0 : 1;


     ///find s->rps_model[i] position, and set register

There might also be an issue with sw_interrupt: when comparing the hw regs for h264d/h265d/vp9d, sw_dec_timeout_sta is missing in hal_h265d_reg.h.
Below is my best guess at a fix, but without any hw regs documentation it is only a guess.

index 1bccb02..432b8db 100644
--- a/mpp/hal/rkdec/h265d/hal_h265d_reg.h
+++ b/mpp/hal/rkdec/h265d/hal_h265d_reg.h
@@ -50,7 +50,8 @@ typedef struct {
     struct swreg_int {
         RK_U32    sw_dec_e            : 1  ;
         RK_U32    sw_dec_clkgate_e    : 1  ;
-        RK_U32    reserve0            : 2  ;
+        RK_U32    reserve0            : 1  ;
+        RK_U32    sw_timeout_mode     : 1  ;
         RK_U32    sw_dec_irq_dis      : 1  ;
         RK_U32    sw_dec_timeout_e    : 1  ;
         RK_U32    sw_buf_empty_en     : 1  ;
@@ -61,8 +62,9 @@ typedef struct {
         RK_U32    sw_dec_rdy_sta      : 1  ;
         RK_U32    sw_dec_bus_sta      : 1  ;
         RK_U32    sw_dec_error_sta    : 1  ;
+        RK_U32    sw_dec_timeout_sta  : 1  ;
         RK_U32    sw_dec_empty_sta    : 1  ;
-        RK_U32    reserve4            : 4  ;
+        RK_U32    reserve3            : 3  ;
         RK_U32    sw_softrst_en_p     : 1  ;
         RK_U32    sw_force_softreset_valid: 1 ;
         RK_U32    sw_softreset_rdy    : 1  ;

@LongChair
Collaborator Author

LongChair commented Aug 26, 2017

Looks to fix HDR playback issues for me too.
We would need @yanghanxing to have this patch checked and possibly amended / merged.

@yanghanxing

Yeah, our engineer said it would not impact HDR video playback. We will check it again.

@Koloss78

Koloss78 commented Oct 8, 2017

I can start HDR video, but my TV does not show HDR content on the display.

My Samsung TV says "UHD 3180x2160" but not "HDR UHD 3180x2160".

@LongChair
Collaborator Author

LongChair commented Jan 11, 2018

@yanghanxing : OK, here is an update on our HDR investigation and our current understanding.

The following DRM Atomic properties need to be set :

  • HDR_SOURCE_METADATA : this is a blob property that needs to be set to the following contents :
struct hdr_static_metadata {
    uint16_t eotf;
    uint16_t type;
    uint16_t display_primaries_x[3];
    uint16_t display_primaries_y[3];
    uint16_t white_point_x;
    uint16_t white_point_y;
    uint16_t max_mastering_display_luminance;
    uint16_t min_mastering_display_luminance;
    uint16_t max_fall;
    uint16_t max_cll;
    uint16_t min_cll;
};

Most of this information comes from the media itself.
The first field, eotf, should be set to one of the following enumeration values:

TRADITIONAL_GAMMA_SDR = 0,
TRADITIONAL_GAMMA_HDR,
SMPTE_ST2084,
FUTURE_EOTF

We can probably fill most of these fields from the stream metadata.

It looks like ffmpeg extracts this from the media stream and stores it in the structs below, which are kept in side_data arrays:
https://www.ffmpeg.org/doxygen/trunk/mastering__display__metadata_8h_source.html

They can be retrieved from AVCodecContext with these keys:
https://www.ffmpeg.org/doxygen/trunk/mastering__display__metadata_8c_source.html

The only field that we didn't find information for was the type field. What are the possible values for it ?

  • hdmi_output_depth : this one is a regular 64-bit integer property.

It should be set to one of the following values:

enum dw_hdmi_rockchip_color_depth {
	ROCKCHIP_HDMI_DEPTH_8,
	ROCKCHIP_HDMI_DEPTH_10,
	ROCKCHIP_HDMI_DEPTH_12,
	ROCKCHIP_HDMI_DEPTH_16,
	ROCKCHIP_HDMI_DEPTH_420_10,
	ROCKCHIP_HDMI_DEPTH_420_12,
	ROCKCHIP_HDMI_DEPTH_420_16
};

Is setting this property to the 8-bit or 10-bit value enough?

  • hdmi_output_format : this one is a regular 64-bit integer property.
    It should be set to one of the following values:
enum drm_hdmi_output_type {
	DRM_HDMI_OUTPUT_DEFAULT_RGB, /* default RGB */
	DRM_HDMI_OUTPUT_YCBCR444, /* YCBCR 444 */
	DRM_HDMI_OUTPUT_YCBCR422, /* YCBCR 422 */
	DRM_HDMI_OUTPUT_YCBCR420, /* YCBCR 420 */
	DRM_HDMI_OUTPUT_YCBCR_HQ, /* Highest subsampled YUV */
	DRM_HDMI_OUTPUT_YCBCR_LQ, /* Lowest subsampled YUV */
	DRM_HDMI_OUTPUT_INVALID, /* Guess what ? */
};

Presumably, we would use DRM_HDMI_OUTPUT_YCBCR420 and try that mode, expecting that if the TV doesn't support it, it will fall back to RGB. Is that correct?

  • hdmi_output_colorimetry : this one is a regular 64-bit integer property.

It should be set to one of the following values:

enum hdmi_extended_colorimetry {
	HDMI_EXTENDED_COLORIMETRY_XV_YCC_601,
	HDMI_EXTENDED_COLORIMETRY_XV_YCC_709,
	HDMI_EXTENDED_COLORIMETRY_S_YCC_601,
	HDMI_EXTENDED_COLORIMETRY_ADOBE_YCC_601,
	HDMI_EXTENDED_COLORIMETRY_ADOBE_RGB,

	/* The following EC values are only defined in CEA-861-F. */
	HDMI_EXTENDED_COLORIMETRY_BT2020_CONST_LUM,
	HDMI_EXTENDED_COLORIMETRY_BT2020,
	HDMI_EXTENDED_COLORIMETRY_RESERVED,
};

We could probably link that one to the following ffmpeg enum (https://ffmpeg.org/doxygen/2.7/pixfmt_8h.html#aff71a069509a1ad3ff54d53a1c894c85)

enum AVColorSpace {
    AVCOL_SPC_RGB         = 0,  ///< order of coefficients is actually GBR, also IEC 61966-2-1 (sRGB)
    AVCOL_SPC_BT709       = 1,  ///< also ITU-R BT1361 / IEC 61966-2-4 xvYCC709 / SMPTE RP177 Annex B
    AVCOL_SPC_UNSPECIFIED = 2,
    AVCOL_SPC_RESERVED    = 3,
    AVCOL_SPC_FCC         = 4,  ///< FCC Title 47 Code of Federal Regulations 73.682 (a)(20)
    AVCOL_SPC_BT470BG     = 5,  ///< also ITU-R BT601-6 625 / ITU-R BT1358 625 / ITU-R BT1700 625 PAL & SECAM / IEC 61966-2-4 xvYCC601
    AVCOL_SPC_SMPTE170M   = 6,  ///< also ITU-R BT601-6 525 / ITU-R BT1358 525 / ITU-R BT1700 NTSC / functionally identical to above
    AVCOL_SPC_SMPTE240M   = 7,
    AVCOL_SPC_YCOCG       = 8,  ///< Used by Dirac / VC-2 and H.264 FRext, see ITU-T SG16
    AVCOL_SPC_BT2020_NCL  = 9,  ///< ITU-R BT2020 non-constant luminance system
    AVCOL_SPC_BT2020_CL   = 10, ///< ITU-R BT2020 constant luminance system
    AVCOL_SPC_NB,               ///< Not part of ABI
};

unless that's the enum we should use for the COLOR_SPACE property below ...

  • COLOR_SPACE : this one is a regular 64-bit integer property.

It should be set to one of the following values:

enum v4l2_colorspace {
	/*
	 * Default colorspace, i.e. let the driver figure it out.
	 * Can only be used with video capture.
	 */
	V4L2_COLORSPACE_DEFAULT       = 0,

	/* SMPTE 170M: used for broadcast NTSC/PAL SDTV */
	V4L2_COLORSPACE_SMPTE170M     = 1,

	/* Obsolete pre-1998 SMPTE 240M HDTV standard, superseded by Rec 709 */
	V4L2_COLORSPACE_SMPTE240M     = 2,

	/* Rec.709: used for HDTV */
	V4L2_COLORSPACE_REC709        = 3,

	/*
	 * Deprecated, do not use. No driver will ever return this. This was
	 * based on a misunderstanding of the bt878 datasheet.
	 */
	V4L2_COLORSPACE_BT878         = 4,

	/*
	 * NTSC 1953 colorspace. This only makes sense when dealing with
	 * really, really old NTSC recordings. Superseded by SMPTE 170M.
	 */
	V4L2_COLORSPACE_470_SYSTEM_M  = 5,

	/*
	 * EBU Tech 3213 PAL/SECAM colorspace. This only makes sense when
	 * dealing with really old PAL/SECAM recordings. Superseded by
	 * SMPTE 170M.
	 */
	V4L2_COLORSPACE_470_SYSTEM_BG = 6,

	/*
	 * Effectively shorthand for V4L2_COLORSPACE_SRGB, V4L2_YCBCR_ENC_601
	 * and V4L2_QUANTIZATION_FULL_RANGE. To be used for (Motion-)JPEG.
	 */
	V4L2_COLORSPACE_JPEG          = 7,

	/* For RGB colorspaces such as produces by most webcams. */
	V4L2_COLORSPACE_SRGB          = 8,

	/* AdobeRGB colorspace */
	V4L2_COLORSPACE_ADOBERGB      = 9,

	/* BT.2020 colorspace, used for UHDTV. */
	V4L2_COLORSPACE_BT2020        = 10,

	/* Raw colorspace: for RAW unprocessed images */
	V4L2_COLORSPACE_RAW           = 11,

	/* DCI-P3 colorspace, used by cinema projectors */
	V4L2_COLORSPACE_DCI_P3        = 12,
};
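
One possible way to derive a COLOR_SPACE value from the ffmpeg enum quoted above is sketched here. It is only an illustration: `plane_color_space` is a hypothetical helper, the numeric values are copied from the enums quoted in this post, and the mapping itself is an assumption. AVColorSpace describes matrix coefficients while v4l2_colorspace mainly describes chromaticities, so the correspondence is approximate, and anything not listed falls back to the driver default.

```c
#include <assert.h>

/* Subset of AVColorSpace, values as quoted above. */
enum {
    AVCOL_SPC_BT709       = 1,
    AVCOL_SPC_UNSPECIFIED = 2,
    AVCOL_SPC_BT470BG     = 5,
    AVCOL_SPC_SMPTE170M   = 6,
    AVCOL_SPC_SMPTE240M   = 7,
    AVCOL_SPC_BT2020_NCL  = 9,
    AVCOL_SPC_BT2020_CL   = 10
};

/* Subset of v4l2_colorspace, values as quoted above. */
enum {
    V4L2_COLORSPACE_DEFAULT   = 0,
    V4L2_COLORSPACE_SMPTE170M = 1,
    V4L2_COLORSPACE_SMPTE240M = 2,
    V4L2_COLORSPACE_REC709    = 3,
    V4L2_COLORSPACE_BT2020    = 10
};

/* Hypothetical mapping from a stream's AVColorSpace to a COLOR_SPACE value. */
static int plane_color_space(int avcol_spc)
{
    switch (avcol_spc) {
    case AVCOL_SPC_BT709:
        return V4L2_COLORSPACE_REC709;
    case AVCOL_SPC_BT470BG:     /* BT.601 matrix, 625-line systems */
    case AVCOL_SPC_SMPTE170M:   /* BT.601 matrix, 525-line systems */
        return V4L2_COLORSPACE_SMPTE170M;
    case AVCOL_SPC_SMPTE240M:
        return V4L2_COLORSPACE_SMPTE240M;
    case AVCOL_SPC_BT2020_NCL:
    case AVCOL_SPC_BT2020_CL:
        return V4L2_COLORSPACE_BT2020;
    default:
        return V4L2_COLORSPACE_DEFAULT; /* let the driver decide */
    }
}
```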

Could you please confirm that the above information is correct and give us the possible values for the type field mentioned above?

Are all these properties required to get proper HDR display on media ?
Also, how many of these properties are RockChip specific ? :)

Thanks !

@yzheng2012

yzheng2012 commented Jan 12, 2018

@LongChair, HDR_SOURCE_METADATA is the only required property. The other properties are optional: the HDMI driver will automatically select appropriate values, preferring YCbCr422 BT2020. The type field should be zero, and the eotf field should be SMPTE_ST2084.
Before enabling HDR output, it is better to check that the TV supports HDR through the HDR_PANEL_METADATA property, which is also a blob property defined by struct hdr_static_metadata. Each bit of its eotf field identifies an Electro-Optical Transfer Function supported by the sink.
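
A minimal sketch of the two steps described above, using the struct quoted earlier in the thread. `sink_supports_eotf` and `fill_hdr10_source_metadata` are hypothetical helper names, and the sketch assumes that bit n of the panel's eotf field corresponds to EOTF enumeration value n. Reading and writing the actual DRM blob properties is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Layout as quoted earlier in this thread. */
struct hdr_static_metadata {
    uint16_t eotf;
    uint16_t type;
    uint16_t display_primaries_x[3];
    uint16_t display_primaries_y[3];
    uint16_t white_point_x;
    uint16_t white_point_y;
    uint16_t max_mastering_display_luminance;
    uint16_t min_mastering_display_luminance;
    uint16_t max_fall;
    uint16_t max_cll;
    uint16_t min_cll;
};

/* EOTF values used in this sketch, numbered as in the enumeration above. */
enum {
    TRADITIONAL_GAMMA_SDR = 0,
    SMPTE_ST2084          = 2
};

/*
 * Check the sink capabilities read from HDR_PANEL_METADATA.
 * Assumption: bit n is set when the sink supports EOTF value n.
 */
static int sink_supports_eotf(const struct hdr_static_metadata *panel,
                              unsigned eotf)
{
    return (panel->eotf >> eotf) & 1u;
}

/* Prepare the HDR_SOURCE_METADATA contents for HDR10: eotf = ST 2084, type = 0. */
static void fill_hdr10_source_metadata(struct hdr_static_metadata *src)
{
    memset(src, 0, sizeof(*src));
    src->eotf = SMPTE_ST2084;
    src->type = 0;
    /* Primaries, white point and luminance would come from the stream. */
}
```

A player would call `sink_supports_eotf(&panel, SMPTE_ST2084)` on the blob read from the connector and only then submit the filled struct as the HDR_SOURCE_METADATA blob.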

@LongChair
Collaborator Author

LongChair commented Jan 12, 2018

@yzheng2012 : what will happen if I set HDR_SOURCE_METADATA for an HDR movie and the TV doesn't support it? Will it fall back smartly to a supported mode and display the movie as well as it can, or will the display be broken?

I ask because it seems to me that the kernel is already fetching the TV capabilities before the switch :)

@mcerveny

mcerveny commented Jan 25, 2018

Beware: the plane "COLOR_SPACE" property must be set too. If "COLOR_SPACE" is left at its default, the VOP performs colour conversions.
In my test the video output is switched to HDR/BT2020 (the TV reports it too). The connector's "HDR_PANEL_METADATA" was read and checked for EOTF support, and "HDR_SOURCE_METADATA" was set (with a filled "struct hdr_static_metadata").

# cat /sys/kernel/debug/dw-hdmi/status 
PHY: enabled			Mode: HDMI
Pixel Clk: 297000000Hz		TMDS Clk: 297000000Hz
Color Format: YUV422		Color Depth: 10 bit
Colorimetry: ITU.BT2020		EOTF: ST2084
x0: 35400				y0: 14599
x1: 8500				y1: 39850
x2: 6550				y2: 2300
white x: 15635			white y: 16451
max lum: 0			min lum: 0
max cll: 0			max fall: 0
  • variant "A" with fewer colors, due to the chain y2r-> r2r-> csc[CSC_BT709L]-> post_r2y-> post_sdr2hdr-> post_csc[CSC_BT2020] (check scheme and CSC).
    Win0 plane with NA12 data, set "EOTF"=2 (SMPTE_ST2084) "COLOR_SPACE"=0 (V4L2_COLORSPACE_DEFAULT)
# cat /sys/kernel/debug/dri/0/summary 
VOP [ff370000.vop]: ACTIVE
    Connector: HDMI-A
	overlay_mode[0] bus_format[2016] output_mode[f] color_space[10]
    Display mode: 3840x2160p30
	clk[297000] real_clk[297000] type[40] flag[5]
	H: 3840 4016 4104 4400
	V: 2160 2168 2178 2250
    win0-0: ACTIVE
	format: NA12 little-endian (0x3231414e) HDR[2] color_space[0]
	csc: y2r[1] r2r[1] r2y[0] csc mode[1]
	zpos: 0
	src: pos[0x0] rect[3840x2160]
	dst: pos[0x0] rect[3840x2160]
	buf[0]: addr: 0x0000000006b74000 pitch: 4864 offset: 0
	buf[1]: addr: 0x0000000006b74000 pitch: 4864 offset: 10506240
    win1-0: DISABLED
    win2-0: DISABLED
    post: sdr2hdr[1] hdr2sdr[0]
    pre : sdr2hdr[0]
    post CSC: r2y[1] y2r[0] CSC mode[3]
  • variant "B" with corrected "passthrough" color. Win0 plane is the same except "COLOR_SPACE"=10 (V4L2_COLORSPACE_BT2020).
# cat /sys/kernel/debug/dri/0/summary 
VOP [ff370000.vop]: ACTIVE
    Connector: HDMI-A
	overlay_mode[1] bus_format[2016] output_mode[f] color_space[10]
    Display mode: 3840x2160p30
	clk[297000] real_clk[297000] type[40] flag[5]
	H: 3840 4016 4104 4400
	V: 2160 2168 2178 2250
    win0-0: ACTIVE
	format: NA12 little-endian (0x3231414e) HDR[2] color_space[10]
	csc: y2r[0] r2r[0] r2y[0] csc mode[3]
	zpos: 0
	src: pos[0x0] rect[3840x2160]
	dst: pos[0x0] rect[3840x2160]
	buf[0]: addr: 0x000000000e8b5000 pitch: 4864 offset: 0
	buf[1]: addr: 0x000000000e8b5000 pitch: 4864 offset: 10506240
    win1-0: DISABLED
    win2-0: DISABLED
    post: sdr2hdr[0] hdr2sdr[0]
    pre : sdr2hdr[0]
    post CSC: r2y[0] y2r[0] CSC mode[3]
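
The plane lines in the summaries above can be cross-checked with a little arithmetic. The sketch below decodes the fourcc value and recomputes the pitch and the chroma plane offset. The helper names are hypothetical, and two things are assumptions: that the 10-bit samples are packed without padding, and that rows are rounded up to a multiple of 256 bytes. That alignment reproduces the logged pitch of 4864, but the real MPP rule may differ.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Unpack a DRM fourcc code (little-endian characters) into a string. */
static void fourcc_to_str(uint32_t fourcc, char out[5])
{
    for (int i = 0; i < 4; i++)
        out[i] = (char)((fourcc >> (8 * i)) & 0xff);
    out[4] = '\0';
}

/* Bytes per row of a luma plane with 10-bit samples packed back to back. */
static uint32_t packed10_row_bytes(uint32_t width)
{
    return (width * 10 + 7) / 8;
}

/* Round v up to the next multiple of a. */
static uint32_t align_up(uint32_t v, uint32_t a)
{
    return (v + a - 1) / a * a;
}

/* Offset of the chroma plane when it directly follows the luma plane. */
static uint32_t chroma_offset(uint32_t pitch, uint32_t height)
{
    return pitch * height;
}
```

With these helpers, 0x3231414e decodes to "NA12", 3840 pixels give 4800 bytes per row, 4800 aligned to 256 gives the pitch 4864, and 4864 × 2160 gives the buf[1] offset 10506240.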

@mcerveny

Beware. You must always set "hdmi_output_format" on the connector before sending video (even to the same value, e.g. DRM_HDMI_OUTPUT_YCBCR_LQ). If it is not re-set, the "post CSC" chain remains in its previous state (e.g. when playing HDR and then SDR, the SDR video becomes magenta colored).

@mcerveny

Performance is not perfect. I am still slightly under 60FPS for HDR with optimization (see rockchip-linux#59).

@Kwiboo
Owner

Kwiboo commented Jan 25, 2018

@mcerveny thanks for sharing your findings, I have also had some stability issues and seen some artifacts in decoding using the latest kernel and ddr dvfs enabled. I will probably stay on kernel commit rockchip-linux@9b04710 that seems more stable in my tests, at least until I have more time to test a more recent kernel commit.

@mcerveny

There are problems with HDR/HLG (Hybrid Log-Gamma) EOTF (Electro-Optical Transfer Function)

@mcerveny

My test environment:

@LongChair
Collaborator Author

@yzheng2012 : could you please confirm Martin’s findings so that we have a clear view on what is required and what’s not ?

@mcerveny

mcerveny commented Jan 27, 2018

HDR/HLG is working. I added a patch to the kernel (see rockchip-linux#61). An MPP patch will not be available (see rockchip-linux/mpp#38), so I patched the test utility to work around that (see mcerveny/utils@12b893a). Tested: HDR/HLG is recognized by an LG OLED TV (output limited to "video=HDMI-A-1:3840x2160@30" due to a lack of HDMI 2 cables), and decoding runs at 67 FPS without boosting CPU/DDR:

# ffmpeg -i "LG Cymatic Jazz 4K Demo.ts" -c:v copy -an -bsf:v hevc_mp4toannexb 4k60hdr_hlg.hevc
# ./mpi_dec ../4k60hdr_hlg.hevc 16777220 | grep 'frame(3000)'
FRAME time 222.664861 render_time  11250 us fps=(time(44566471 us)/frame(3000)) 67.32

# echo ---; cat /sys/kernel/debug/dw-hdmi/status; echo ----; cat /sys/kernel/debug/dri/0/summary; echo -----; egrep 'armclk |aclk_rkvdec |cabac |vdec_core |ddrc ' /sys/kernel/debug/clk/clk_summary
---
PHY: enabled			Mode: HDMI
Pixel Clk: 297000000Hz		TMDS Clk: 297000000Hz
Color Format: YUV422		Color Depth: 10 bit
Colorimetry: ITU.BT2020		EOTF: HLG
x0: 35400				y0: 14599
x1: 8500				y1: 39850
x2: 6550				y2: 2300
white x: 15635			white y: 16451
max lum: 0			min lum: 0
max cll: 0			max fall: 0
----
VOP [ff370000.vop]: ACTIVE
    Connector: HDMI-A
	overlay_mode[1] bus_format[2016] output_mode[f] color_space[10]
    Display mode: 3840x2160p30
	clk[297000] real_clk[297000] type[40] flag[5]
	H: 3840 4016 4104 4400
	V: 2160 2168 2178 2250
    win0-0: ACTIVE
	format: NA12 little-endian (0x3231414e) HDR[3] color_space[10]
	csc: y2r[0] r2r[0] r2y[0] csc mode[0]
	zpos: 0
	src: pos[0x0] rect[3840x2160]
	dst: pos[0x0] rect[3840x2160]
	buf[0]: addr: 0x000000001ca2a000 pitch: 4864 offset: 0
	buf[1]: addr: 0x000000001ca2a000 pitch: 4864 offset: 10506240
    win1-0: DISABLED
    win2-0: DISABLED
    post: sdr2hdr[0] hdr2sdr[0]
    pre : sdr2hdr[0]
    post CSC: r2y[0] y2r[0] CSC mode[1]
-----
             armclk                       0            0   816000000          0 0  
          sclk_vdec_core                  2            3   300000000          0 0  
             aclk_rkvdec                  3            4   600000000          0 0  
          sclk_vdec_cabac                 2            3   300000000          0 0  
          sclk_ddrc                       2            2   786000000          0 0  
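
The x0/y0 ... white x/y values in the dw-hdmi status dumps are chromaticity coordinates in units of 0.00002, the unit used for HDR static metadata. A small sketch of the conversion follows; `chromaticity_to_units` is a hypothetical helper, and the BT.2020 and D65 coordinates used in the example are the standard ones, not values taken from the test utility.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Convert a CIE 1931 chromaticity coordinate (0.0 .. 1.0) to the
 * 0.00002-unit integer used in HDR static metadata, rounding to nearest.
 */
static uint16_t chromaticity_to_units(double v)
{
    return (uint16_t)(v * 50000.0 + 0.5);
}
```

For the BT.2020 red primary (0.708, 0.292) this gives 35400 and 14600, for green (0.170, 0.797) it gives 8500 and 39850, for blue (0.131, 0.046) it gives 6550 and 2300, and for the D65 white point (0.3127, 0.3290) it gives 15635 and 16450. The dumps match these to within one unit (14599 and 16451), presumably because the test utility rounds differently.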

@LongChair
Collaborator Author

@mcerveny : great ! and thanks for sharing once again :)
Would you mind summing up the different properties that are required then ?

@mcerveny

mcerveny commented Jan 27, 2018

Check the code mpi_dec.c in utils:

@mcerveny

mcerveny commented Jan 29, 2018

I looked at the rockchip-linux code: some up-conversion (like SDR2HDR_FOR_HLG_HDR), down-conversion and the whole HDR cross-conversion code are missing (between a plane and a connector with different EOTF>0 (PQ<->HLG<->LOG), check slide 15). (Also ignored by Rockchip: rockchip-linux#61.)

@mcerveny

mcerveny commented Jan 31, 2018

HDR playback at 3840x2160p60 is partially solved by mcerveny/rockchip-linux@b4bc703. I am not able to decode or resolve the problems in the undocumented (and probably untested) Rockchip register-setting code or similar.
(The upstream patch on Patchwork has the same errors: https://patchwork.kernel.org/patch/10200571/)

@Kwiboo
Owner

Kwiboo commented Jan 31, 2018

@mcerveny nice findings 👍, I will test 2160p@60hz 10-bit 4:2:0 mode with your patch ASAP.

Agree, it would be great to have documentation for the inno hdmi phy. Some hw register information can be deduced from https://github.com/rockchip-linux/kernel/blob/release-3.10/drivers/video/rockchip/hdmi/rockchip-hdmiv2/rockchip_hdmiv2_hw.h#L1584-L1750, EXT_PHY and EXT_PHY1 should mainly differ in register address.

There are also two inno hdmi phy related commits in the recent release-4.4 push that might be of interest, see: rockchip-linux@bc980e7 and rockchip-linux@fe62b85

@mcerveny

mcerveny commented May 4, 2018

4k60 HDR and HDR/HLG (10-bit 4:2:0, TMDS clock 371 MHz) is working with a newer patched rockchip-4.4 kernel (tested on an LG TV). The "TMDS Clk: 92812500Hz" in the output is OK.

# ./mpi_dec ../4k60hdr.hevc 16777220 > /dev/null
# ( echo ---; cat /sys/kernel/debug/dw-hdmi/status; echo ----; cat /sys/kernel/debug/dri/0/summary; echo -----; egrep 'armclk |aclk_rkvdec |cabac |vdec_core |ddrc |hdmi' /sys/kernel/debug/clk/clk_summary; echo ----; cat /sys/class/thermal/thermal_zone0/temp )
---
PHY: enabled			Mode: HDMI
Pixel Clk: 594000000Hz		TMDS Clk: 92812500Hz
Color Format: YUV420		Color Depth: 10 bit
Colorimetry: ITU.BT2020		EOTF: ST2084
x0: 35400				y0: 14599
x1: 8500				y1: 39850
x2: 6550				y2: 2300
white x: 15635			white y: 16451
max lum: 0			min lum: 0
max cll: 0			max fall: 0
----
VOP [ff370000.vop]: ACTIVE
    Connector: HDMI-A
	overlay_mode[1] bus_format[2027] output_mode[e] color_space[10]
    Display mode: 3840x2160p60
	clk[594000] real_clk[594000] type[48] flag[5]
	H: 3840 4016 4104 4400
	V: 2160 2168 2178 2250
    win0-0: ACTIVE
	format: NA12 little-endian (0x3231414e) HDR[2] color_space[10]
	csc: y2r[0] r2r[0] r2y[0] csc mode[1]
	zpos: 0
	src: pos[0x0] rect[3840x2160]
	dst: pos[0x0] rect[3840x2160]
	buf[0]: addr: 0x000000001b11d000 pitch: 4864 offset: 0
	buf[1]: addr: 0x000000001b11d000 pitch: 4864 offset: 10506240
    win1-0: DISABLED
    win2-0: DISABLED
    post: sdr2hdr[0] hdr2sdr[0]
    pre : sdr2hdr[0]
    post CSC: r2y[0] y2r[0] CSC mode[1]
-----
    hdmi_phy                              1            1   594000000          0 0  
       hdmiphy                            1            1   594000000          0 0  
          hdmiphy_peri                    0            0   594000000          0 0  
    clk_hdmi_sfc                          1            1    24000000          0 0  
             armclk                       0            0  1008000000          0 0  
          sclk_vdec_core                  2            3   300000000          0 0  
             aclk_rkvdec                  3            4   600000000          0 0  
                   pclk_hdmiphy           1            1    75000000          0 0  
                pclk_hdmi                 1            1   100000000          0 0  
          sclk_vdec_cabac                 2            3   400000000          0 0  
             dclk_hdmiphy                 0            0     8000000          0 0  
          sclk_ddrc                       2            2   924000000          0 0  
----
62916
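
The two TMDS figures mentioned in this comment (371 MHz and 92812500 Hz) are consistent with each other, as the arithmetic sketch below shows. The function names are hypothetical. The sketch relies on two HDMI facts: 4:2:0 halves the rate before the deep-color factor is applied, and above a 340 MHz character rate HDMI 2.0 runs the TMDS clock lane at one quarter of the character rate. That the status file reports the clock-lane frequency is an interpretation, not something confirmed by the driver source.

```c
#include <assert.h>
#include <stdint.h>

enum chroma_format { CHROMA_RGB_OR_444, CHROMA_422, CHROMA_420 };

/* TMDS character rate in Hz for a given pixel clock, chroma format and depth. */
static uint64_t tmds_char_rate(uint64_t pixel_clk_hz, enum chroma_format fmt,
                               unsigned bits_per_component)
{
    switch (fmt) {
    case CHROMA_422:
        /* HDMI 4:2:2 carries up to 12 bits at the plain pixel clock. */
        return pixel_clk_hz;
    case CHROMA_420:
        /* Half the pixel rate, then scaled for deep color. */
        return pixel_clk_hz / 2 * bits_per_component / 8;
    default:
        return pixel_clk_hz * bits_per_component / 8;
    }
}

/* Frequency on the TMDS clock lane: 1/4 of the character rate above 340 MHz. */
static uint64_t tmds_clock_lane(uint64_t char_rate_hz)
{
    return char_rate_hz > 340000000ULL ? char_rate_hz / 4 : char_rate_hz;
}
```

For 3840x2160p60 at 10-bit 4:2:0 the character rate is 594 MHz / 2 × 10 / 8 = 371.25 MHz, and the clock lane runs at 371.25 MHz / 4 = 92.8125 MHz. For the earlier 3840x2160p30 YUV422 output both values are 297 MHz, matching the earlier dumps.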

@LongChair
Collaborator Author

@mcerveny : nice :) we're based on an older kernel still, and HDR works, but not in 60 fps.
Any idea on which commit fixed that ?

@mcerveny

mcerveny commented May 4, 2018

I did not bisect. But rockchip-linux@d4953cf is equivalent to my mcerveny@b4bc703 and adds something new for the faster "clock ratio". There may be more in the history of synopsys, drm/rockchip and/or phy/rockchip (for example rockchip-linux@41ada53).

@mo123

mo123 commented Jun 19, 2018

@mcerveny Hi, Can you rebase your changes to latest 4.4 kernel & mpp.
I want to test HDR on RK3328 LibreELEC, what changes are still needed for 4K@60hz HDR?

@mcerveny

mcerveny commented Aug 9, 2018

@Kwiboo:
Hi, I see that you are moving forward (xbmc, ffmpeg+rkvpu).
Did you successfully implement the HDR/HLG chain in MPV/KODI?

I still have a performance issue with "LG Chess 4K Demo.mp4" in the mpi_dec tester or with ffmpeg. Even if I set cpu>=1008MHz and mem==924MHz with the current rockchip-4.4 kernel and mpp+ffmpeg from GitHub, I get only 57 FPS, i.e. an additional 5-7% of performance is missing in mpp to get playable output. Are you in contact with Rockchip to resolve this problem (rockchip-linux#59)?

@lorenz

lorenz commented Aug 9, 2018

HDR10 (SMPTE ST 2084/BT2020/YUV420 10bit) works on the RK3328 with Kodi (with @Kwiboo's sources), but there are still files that break the decoder (colors all over the place/black) or where it can't keep up (mainly 60FPS content, but also everything over 120Mbps).

@Kwiboo
Owner

Kwiboo commented Sep 2, 2018

@mcerveny sorry for the late reply. As @lorenz points out, HDR can be activated with LibreELEC/LibreELEC.tv#2927. It does not set the full static metadata and has other limitations (60fps); hopefully it will be good enough for 24fps movies.

Kwiboo pushed a commit that referenced this issue Dec 15, 2018
Increase kasan instrumented kernel stack size from 32k to 64k. Other
architectures seems to get away with just doubling kernel stack size under
kasan, but on s390 this appears to be not enough due to bigger frame size.
The particular pain point is kasan inlined checks (CONFIG_KASAN_INLINE
vs CONFIG_KASAN_OUTLINE). With inlined checks one particular case hitting
stack overflow is fs sync on xfs filesystem:

 #0 [9a0681e8]  704 bytes  check_usage at 34b1fc
 #1 [9a0684a8]  432 bytes  check_usage at 34c710
 #2 [9a068658]  1048 bytes  validate_chain at 35044a
 #3 [9a068a70]  312 bytes  __lock_acquire at 3559fe
 #4 [9a068ba8]  440 bytes  lock_acquire at 3576ee
 #5 [9a068d60]  104 bytes  _raw_spin_lock at 21b44e0
 #6 [9a068dc8]  1992 bytes  enqueue_entity at 2dbf72
 #7 [9a069590]  1496 bytes  enqueue_task_fair at 2df5f0
 #8 [9a069b68]  64 bytes  ttwu_do_activate at 28f438
 #9 [9a069ba8]  552 bytes  try_to_wake_up at 298c4c
 #10 [9a069dd0]  168 bytes  wake_up_worker at 23f97c
 #11 [9a069e78]  200 bytes  insert_work at 23fc2e
 #12 [9a069f40]  648 bytes  __queue_work at 2487c0
 #13 [9a06a1c8]  200 bytes  __queue_delayed_work at 24db28
 #14 [9a06a290]  248 bytes  mod_delayed_work_on at 24de84
 #15 [9a06a388]  24 bytes  kblockd_mod_delayed_work_on at 153e2a0
 #16 [9a06a3a0]  288 bytes  __blk_mq_delay_run_hw_queue at 158168c
 #17 [9a06a4c0]  192 bytes  blk_mq_run_hw_queue at 1581a3c
 #18 [9a06a580]  184 bytes  blk_mq_sched_insert_requests at 15a2192
 #19 [9a06a638]  1024 bytes  blk_mq_flush_plug_list at 1590f3a
 #20 [9a06aa38]  704 bytes  blk_flush_plug_list at 1555028
 #21 [9a06acf8]  320 bytes  schedule at 219e476
 #22 [9a06ae38]  760 bytes  schedule_timeout at 21b0aac
 #23 [9a06b130]  408 bytes  wait_for_common at 21a1706
 #24 [9a06b2c8]  360 bytes  xfs_buf_iowait at fa1540
 #25 [9a06b430]  256 bytes  __xfs_buf_submit at fadae6
 #26 [9a06b530]  264 bytes  xfs_buf_read_map at fae3f6
 #27 [9a06b638]  656 bytes  xfs_trans_read_buf_map at 10ac9a8
 #28 [9a06b8c8]  304 bytes  xfs_btree_kill_root at e72426
 #29 [9a06b9f8]  288 bytes  xfs_btree_lookup_get_block at e7bc5e
 #30 [9a06bb18]  624 bytes  xfs_btree_lookup at e7e1a6
 #31 [9a06bd88]  2664 bytes  xfs_alloc_ag_vextent_near at dfa070
 #32 [9a06c7f0]  144 bytes  xfs_alloc_ag_vextent at dff3ca
 #33 [9a06c880]  1128 bytes  xfs_alloc_vextent at e05fce
 #34 [9a06cce8]  584 bytes  xfs_bmap_btalloc at e58342
 #35 [9a06cf30]  1336 bytes  xfs_bmapi_write at e618de
 #36 [9a06d468]  776 bytes  xfs_iomap_write_allocate at ff678e
 #37 [9a06d770]  720 bytes  xfs_map_blocks at f82af8
 #38 [9a06da40]  928 bytes  xfs_writepage_map at f83cd6
 #39 [9a06dde0]  320 bytes  xfs_do_writepage at f85872
 #40 [9a06df20]  1320 bytes  write_cache_pages at 73dfe8
 #41 [9a06e448]  208 bytes  xfs_vm_writepages at f7f892
 #42 [9a06e518]  88 bytes  do_writepages at 73fe6a
 #43 [9a06e570]  872 bytes  __writeback_single_inode at a20cb6
 #44 [9a06e8d8]  664 bytes  writeback_sb_inodes at a23be2
 #45 [9a06eb70]  296 bytes  __writeback_inodes_wb at a242e0
 #46 [9a06ec98]  928 bytes  wb_writeback at a2500e
 #47 [9a06f038]  848 bytes  wb_do_writeback at a260ae
 #48 [9a06f388]  536 bytes  wb_workfn at a28228
 #49 [9a06f5a0]  1088 bytes  process_one_work at 24a234
 #50 [9a06f9e0]  1120 bytes  worker_thread at 24ba26
 #51 [9a06fe40]  104 bytes  kthread at 26545a
 #52 [9a06fea8]             kernel_thread_starter at 21b6b62

To make it possible to increase the stack size to 64k, reuse the LLILL
instruction in the __switch_to function to load the value
64k - STACK_FRAME_OVERHEAD - __PT_SIZE (65192) as unsigned.

Reported-by: Benjamin Block <bblock@linux.ibm.com>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Kwiboo pushed a commit that referenced this issue Dec 15, 2018
conn_free() takes the list lock with spin_lock() and is called by both
nf_conncount_lookup() and nf_conncount_gc_list(). nf_conncount_lookup()
is called from bottom-half context and nf_conncount_gc_list() from
process context, so the plain spin_lock() call is not safe. Hence
conn_free() should use spin_lock_bh() instead of spin_lock().

test commands:
   %nft add table ip filter
   %nft add chain ip filter input { type filter hook input priority 0\; }
   %nft add rule filter input meter test { ip saddr ct count over 2 } \
	   counter

splat looks like:
[  461.996507] ================================
[  461.998999] WARNING: inconsistent lock state
[  461.998999] 4.19.0-rc6+ #22 Not tainted
[  461.998999] --------------------------------
[  461.998999] inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
[  461.998999] kworker/0:2/134 [HC0[0]:SC0[0]:HE1:SE1] takes:
[  461.998999] 00000000a71a559a (&(&list->list_lock)->rlock){+.?.}, at: conn_free+0x69/0x2b0 [nf_conncount]
[  461.998999] {IN-SOFTIRQ-W} state was registered at:
[  461.998999]   _raw_spin_lock+0x30/0x70
[  461.998999]   nf_conncount_add+0x28a/0x520 [nf_conncount]
[  461.998999]   nft_connlimit_eval+0x401/0x580 [nft_connlimit]
[  461.998999]   nft_dynset_eval+0x32b/0x590 [nf_tables]
[  461.998999]   nft_do_chain+0x497/0x1430 [nf_tables]
[  461.998999]   nft_do_chain_ipv4+0x255/0x330 [nf_tables]
[  461.998999]   nf_hook_slow+0xb1/0x160
[ ... ]
[  461.998999] other info that might help us debug this:
[  461.998999]  Possible unsafe locking scenario:
[  461.998999]
[  461.998999]        CPU0
[  461.998999]        ----
[  461.998999]   lock(&(&list->list_lock)->rlock);
[  461.998999]   <Interrupt>
[  461.998999]     lock(&(&list->list_lock)->rlock);
[  461.998999]
[  461.998999]  *** DEADLOCK ***
[  461.998999]
[ ... ]

Fixes: 5c789e1 ("netfilter: nf_conncount: Add list lock and gc worker, and RCU for init tree search")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Kwiboo pushed a commit that referenced this issue Dec 15, 2018
nf_conncount_tuple is an element of nft_connlimit and is deleted by
conn_free(). Elements can be deleted by both the GC routine and the data
path functions (nf_conncount_lookup, nf_conncount_add), which call
conn_free() to free elements. But conn_free() only protects lists, not
individual elements, so list_del corruption can occur.

conn_free() doesn't check whether an element is already deleted. To
protect elements, a dead flag is added. When an element is deleted, its
dead flag is set. Only conn_free() can delete elements, so the list lock
and the dead flag together are enough to protect them.

test commands:
   %nft add table ip filter
   %nft add chain ip filter input { type filter hook input priority 0\; }
   %nft add rule filter input meter test { ip id ct count over 2 } counter

splat looks like:
[ 1779.495778] list_del corruption, ffff8800b6e12008->prev is LIST_POISON2 (dead000000000200)
[ 1779.505453] ------------[ cut here ]------------
[ 1779.506260] kernel BUG at lib/list_debug.c:50!
[ 1779.515831] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
[ 1779.516772] CPU: 0 PID: 33 Comm: kworker/0:2 Not tainted 4.19.0-rc6+ #22
[ 1779.516772] Workqueue: events_power_efficient nft_rhash_gc [nf_tables_set]
[ 1779.516772] RIP: 0010:__list_del_entry_valid+0xd8/0x150
[ 1779.516772] Code: 39 48 83 c4 08 b8 01 00 00 00 5b 5d c3 48 89 ea 48 c7 c7 00 c3 5b 98 e8 0f dc 40 ff 0f 0b 48 c7 c7 60 c3 5b 98 e8 01 dc 40 ff <0f> 0b 48 c7 c7 c0 c3 5b 98 e8 f3 db 40 ff 0f 0b 48 c7 c7 20 c4 5b
[ 1779.516772] RSP: 0018:ffff880119127420 EFLAGS: 00010286
[ 1779.516772] RAX: 000000000000004e RBX: dead000000000200 RCX: 0000000000000000
[ 1779.516772] RDX: 000000000000004e RSI: 0000000000000008 RDI: ffffed0023224e7a
[ 1779.516772] RBP: ffff88011934bc10 R08: ffffed002367cea9 R09: ffffed002367cea9
[ 1779.516772] R10: 0000000000000001 R11: ffffed002367cea8 R12: ffff8800b6e12008
[ 1779.516772] R13: ffff8800b6e12010 R14: ffff88011934bc20 R15: ffff8800b6e12008
[ 1779.516772] FS:  0000000000000000(0000) GS:ffff88011b200000(0000) knlGS:0000000000000000
[ 1779.516772] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1779.516772] CR2: 00007fc876534010 CR3: 000000010da16000 CR4: 00000000001006f0
[ 1779.516772] Call Trace:
[ 1779.516772]  conn_free+0x9f/0x2b0 [nf_conncount]
[ 1779.516772]  ? nf_ct_tmpl_alloc+0x2a0/0x2a0 [nf_conntrack]
[ 1779.516772]  ? nf_conncount_add+0x520/0x520 [nf_conncount]
[ 1779.516772]  ? do_raw_spin_trylock+0x1a0/0x1a0
[ 1779.516772]  ? do_raw_spin_trylock+0x10/0x1a0
[ 1779.516772]  find_or_evict+0xe5/0x150 [nf_conncount]
[ 1779.516772]  nf_conncount_gc_list+0x162/0x360 [nf_conncount]
[ 1779.516772]  ? nf_conncount_lookup+0xee0/0xee0 [nf_conncount]
[ 1779.516772]  ? _raw_spin_unlock_irqrestore+0x45/0x50
[ 1779.516772]  ? trace_hardirqs_off+0x6b/0x220
[ 1779.516772]  ? trace_hardirqs_on_caller+0x220/0x220
[ 1779.516772]  nft_rhash_gc+0x16b/0x540 [nf_tables_set]
[ ... ]

Fixes: 5c789e1 ("netfilter: nf_conncount: Add list lock and gc worker, and RCU for init tree search")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
@Kwiboo Kwiboo closed this as completed Jan 1, 2020
Kwiboo pushed a commit that referenced this issue Jan 8, 2020
'chrdev_open()' calls 'cdev_get()' to obtain a reference to the
'struct cdev *' stashed in the 'i_cdev' field of the target inode
structure. If the pointer is NULL, then it is initialised lazily by
looking up the kobject in the 'cdev_map' and so the whole procedure is
protected by the 'cdev_lock' spinlock to serialise initialisation of
the shared pointer.

Unfortunately, it is possible for the initialising thread to fail *after*
installing the new pointer, for example if the subsequent '->open()' call
on the file fails. In this case, 'cdev_put()' is called, the reference
count on the kobject is dropped and, if nobody else has taken a reference,
the release function is called which finally clears 'inode->i_cdev' from
'cdev_purge()' before potentially freeing the object. The problem here
is that a racing thread can happily take the 'cdev_lock' and see the
non-NULL pointer in the inode, which can result in a refcount increment
from zero and a warning:

  |  ------------[ cut here ]------------
  |  refcount_t: addition on 0; use-after-free.
  |  WARNING: CPU: 2 PID: 6385 at lib/refcount.c:25 refcount_warn_saturate+0x6d/0xf0
  |  Modules linked in:
  |  CPU: 2 PID: 6385 Comm: repro Not tainted 5.5.0-rc2+ #22
  |  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
  |  RIP: 0010:refcount_warn_saturate+0x6d/0xf0
  |  Code: 05 55 9a 15 01 01 e8 9d aa c8 ff 0f 0b c3 80 3d 45 9a 15 01 00 75 ce 48 c7 c7 00 9c 62 b3 c6 08
  |  RSP: 0018:ffffb524c1b9bc70 EFLAGS: 00010282
  |  RAX: 0000000000000000 RBX: ffff9e9da1f71390 RCX: 0000000000000000
  |  RDX: ffff9e9dbbd27618 RSI: ffff9e9dbbd18798 RDI: ffff9e9dbbd18798
  |  RBP: 0000000000000000 R08: 000000000000095f R09: 0000000000000039
  |  R10: 0000000000000000 R11: ffffb524c1b9bb20 R12: ffff9e9da1e8c700
  |  R13: ffffffffb25ee8b0 R14: 0000000000000000 R15: ffff9e9da1e8c700
  |  FS:  00007f3b87d26700(0000) GS:ffff9e9dbbd00000(0000) knlGS:0000000000000000
  |  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  |  CR2: 00007fc16909c000 CR3: 000000012df9c000 CR4: 00000000000006e0
  |  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  |  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  |  Call Trace:
  |   kobject_get+0x5c/0x60
  |   cdev_get+0x2b/0x60
  |   chrdev_open+0x55/0x220
  |   ? cdev_put.part.3+0x20/0x20
  |   do_dentry_open+0x13a/0x390
  |   path_openat+0x2c8/0x1470
  |   do_filp_open+0x93/0x100
  |   ? selinux_file_ioctl+0x17f/0x220
  |   do_sys_open+0x186/0x220
  |   do_syscall_64+0x48/0x150
  |   entry_SYSCALL_64_after_hwframe+0x44/0xa9
  |  RIP: 0033:0x7f3b87efcd0e
  |  Code: 89 54 24 08 e8 a3 f4 ff ff 8b 74 24 0c 48 8b 3c 24 41 89 c0 44 8b 54 24 08 b8 01 01 00 00 89 f4
  |  RSP: 002b:00007f3b87d259f0 EFLAGS: 00000293 ORIG_RAX: 0000000000000101
  |  RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f3b87efcd0e
  |  RDX: 0000000000000000 RSI: 00007f3b87d25a80 RDI: 00000000ffffff9c
  |  RBP: 00007f3b87d25e90 R08: 0000000000000000 R09: 0000000000000000
  |  R10: 0000000000000000 R11: 0000000000000293 R12: 00007ffe188f504e
  |  R13: 00007ffe188f504f R14: 00007f3b87d26700 R15: 0000000000000000
  |  ---[ end trace 24f53ca58db8180a ]---

Since 'cdev_get()' can already fail to obtain a reference, simply move
it over to use 'kobject_get_unless_zero()' instead of 'kobject_get()',
which will cause the racing thread to return -ENXIO if the initialising
thread fails unexpectedly.

Cc: Hillf Danton <hdanton@sina.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Reported-by: syzbot+82defefbbd8527e1c2cb@syzkaller.appspotmail.com
Signed-off-by: Will Deacon <will@kernel.org>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20191219120203.32691-1-will@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Kwiboo pushed a commit that referenced this issue Jul 20, 2020
The following deadlock was captured. The first process holds 'kernfs_mutex'
and is hung on I/O. That I/O is staged in 'r1conf.pending_bio_list' of the raid1
device; this pending bio list would be flushed by the second process,
'md127_raid1', but that process is itself blocked on 'kernfs_mutex'. Using
sysfs_notify_dirent_safe() in place of sysfs_notify() fixes it. Other
sysfs_notify() calls invoked from the I/O path are removed as well.

 PID: 40430  TASK: ffff8ee9c8c65c40  CPU: 29  COMMAND: "probe_file"
  #0 [ffffb87c4df37260] __schedule at ffffffff9a8678ec
  #1 [ffffb87c4df372f8] schedule at ffffffff9a867f06
  #2 [ffffb87c4df37310] io_schedule at ffffffff9a0c73e6
  #3 [ffffb87c4df37328] __dta___xfs_iunpin_wait_3443 at ffffffffc03a4057 [xfs]
  #4 [ffffb87c4df373a0] xfs_iunpin_wait at ffffffffc03a6c79 [xfs]
  #5 [ffffb87c4df373b0] __dta_xfs_reclaim_inode_3357 at ffffffffc039a46c [xfs]
  #6 [ffffb87c4df37400] xfs_reclaim_inodes_ag at ffffffffc039a8b6 [xfs]
  #7 [ffffb87c4df37590] xfs_reclaim_inodes_nr at ffffffffc039bb33 [xfs]
  #8 [ffffb87c4df375b0] xfs_fs_free_cached_objects at ffffffffc03af0e9 [xfs]
  #9 [ffffb87c4df375c0] super_cache_scan at ffffffff9a287ec7
 #10 [ffffb87c4df37618] shrink_slab at ffffffff9a1efd93
 #11 [ffffb87c4df37700] shrink_node at ffffffff9a1f5968
 #12 [ffffb87c4df37788] do_try_to_free_pages at ffffffff9a1f5ea2
 #13 [ffffb87c4df377f0] try_to_free_mem_cgroup_pages at ffffffff9a1f6445
 #14 [ffffb87c4df37880] try_charge at ffffffff9a26cc5f
 #15 [ffffb87c4df37920] memcg_kmem_charge_memcg at ffffffff9a270f6a
 #16 [ffffb87c4df37958] new_slab at ffffffff9a251430
 #17 [ffffb87c4df379c0] ___slab_alloc at ffffffff9a251c85
 #18 [ffffb87c4df37a80] __slab_alloc at ffffffff9a25635d
 #19 [ffffb87c4df37ac0] kmem_cache_alloc at ffffffff9a251f89
 #20 [ffffb87c4df37b00] alloc_inode at ffffffff9a2a2b10
 #21 [ffffb87c4df37b20] iget_locked at ffffffff9a2a4854
 #22 [ffffb87c4df37b60] kernfs_get_inode at ffffffff9a311377
 #23 [ffffb87c4df37b80] kernfs_iop_lookup at ffffffff9a311e2b
 #24 [ffffb87c4df37ba8] lookup_slow at ffffffff9a290118
 #25 [ffffb87c4df37c10] walk_component at ffffffff9a291e83
 #26 [ffffb87c4df37c78] path_lookupat at ffffffff9a293619
 #27 [ffffb87c4df37cd8] filename_lookup at ffffffff9a2953af
 #28 [ffffb87c4df37de8] user_path_at_empty at ffffffff9a295566
 #29 [ffffb87c4df37e10] vfs_statx at ffffffff9a289787
 #30 [ffffb87c4df37e70] SYSC_newlstat at ffffffff9a289d5d
 #31 [ffffb87c4df37f18] sys_newlstat at ffffffff9a28a60e
 #32 [ffffb87c4df37f28] do_syscall_64 at ffffffff9a003949
 #33 [ffffb87c4df37f50] entry_SYSCALL_64_after_hwframe at ffffffff9aa001ad
     RIP: 00007f617a5f2905  RSP: 00007f607334f838  RFLAGS: 00000246
     RAX: ffffffffffffffda  RBX: 00007f6064044b20  RCX: 00007f617a5f2905
     RDX: 00007f6064044b20  RSI: 00007f6064044b20  RDI: 00007f6064005890
     RBP: 00007f6064044aa0   R8: 0000000000000030   R9: 000000000000011c
     R10: 0000000000000013  R11: 0000000000000246  R12: 00007f606417e6d0
     R13: 00007f6064044aa0  R14: 00007f6064044b10  R15: 00000000ffffffff
     ORIG_RAX: 0000000000000006  CS: 0033  SS: 002b

 PID: 927    TASK: ffff8f15ac5dbd80  CPU: 42  COMMAND: "md127_raid1"
  #0 [ffffb87c4df07b28] __schedule at ffffffff9a8678ec
  #1 [ffffb87c4df07bc0] schedule at ffffffff9a867f06
  #2 [ffffb87c4df07bd8] schedule_preempt_disabled at ffffffff9a86825e
  #3 [ffffb87c4df07be8] __mutex_lock at ffffffff9a869bcc
  #4 [ffffb87c4df07ca0] __mutex_lock_slowpath at ffffffff9a86a013
  #5 [ffffb87c4df07cb0] mutex_lock at ffffffff9a86a04f
  #6 [ffffb87c4df07cc8] kernfs_find_and_get_ns at ffffffff9a311d83
  #7 [ffffb87c4df07cf0] sysfs_notify at ffffffff9a314b3a
  #8 [ffffb87c4df07d18] md_update_sb at ffffffff9a688696
  #9 [ffffb87c4df07d98] md_update_sb at ffffffff9a6886d5
 #10 [ffffb87c4df07da8] md_check_recovery at ffffffff9a68ad9c
 #11 [ffffb87c4df07dd0] raid1d at ffffffffc01f0375 [raid1]
 #12 [ffffb87c4df07ea0] md_thread at ffffffff9a680348
 #13 [ffffb87c4df07f08] kthread at ffffffff9a0b8005
 #14 [ffffb87c4df07f50] ret_from_fork at ffffffff9aa00344

Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Song Liu <songliubraving@fb.com>
Kwiboo pushed a commit that referenced this issue Nov 10, 2020
'chrdev_open()' calls 'cdev_get()' to obtain a reference to the
'struct cdev *' stashed in the 'i_cdev' field of the target inode
structure. If the pointer is NULL, then it is initialised lazily by
looking up the kobject in the 'cdev_map' and so the whole procedure is
protected by the 'cdev_lock' spinlock to serialise initialisation of
the shared pointer.

Unfortunately, it is possible for the initialising thread to fail *after*
installing the new pointer, for example if the subsequent '->open()' call
on the file fails. In this case, 'cdev_put()' is called, the reference
count on the kobject is dropped and, if nobody else has taken a reference,
the release function is called which finally clears 'inode->i_cdev' from
'cdev_purge()' before potentially freeing the object. The problem here
is that a racing thread can happily take the 'cdev_lock' and see the
non-NULL pointer in the inode, which can result in a refcount increment
from zero and a warning:

  |  ------------[ cut here ]------------
  |  refcount_t: addition on 0; use-after-free.
  |  WARNING: CPU: 2 PID: 6385 at lib/refcount.c:25 refcount_warn_saturate+0x6d/0xf0
  |  Modules linked in:
  |  CPU: 2 PID: 6385 Comm: repro Not tainted 5.5.0-rc2+ #22
  |  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
  |  RIP: 0010:refcount_warn_saturate+0x6d/0xf0
  |  Code: 05 55 9a 15 01 01 e8 9d aa c8 ff 0f 0b c3 80 3d 45 9a 15 01 00 75 ce 48 c7 c7 00 9c 62 b3 c6 08
  |  RSP: 0018:ffffb524c1b9bc70 EFLAGS: 00010282
  |  RAX: 0000000000000000 RBX: ffff9e9da1f71390 RCX: 0000000000000000
  |  RDX: ffff9e9dbbd27618 RSI: ffff9e9dbbd18798 RDI: ffff9e9dbbd18798
  |  RBP: 0000000000000000 R08: 000000000000095f R09: 0000000000000039
  |  R10: 0000000000000000 R11: ffffb524c1b9bb20 R12: ffff9e9da1e8c700
  |  R13: ffffffffb25ee8b0 R14: 0000000000000000 R15: ffff9e9da1e8c700
  |  FS:  00007f3b87d26700(0000) GS:ffff9e9dbbd00000(0000) knlGS:0000000000000000
  |  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  |  CR2: 00007fc16909c000 CR3: 000000012df9c000 CR4: 00000000000006e0
  |  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  |  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  |  Call Trace:
  |   kobject_get+0x5c/0x60
  |   cdev_get+0x2b/0x60
  |   chrdev_open+0x55/0x220
  |   ? cdev_put.part.3+0x20/0x20
  |   do_dentry_open+0x13a/0x390
  |   path_openat+0x2c8/0x1470
  |   do_filp_open+0x93/0x100
  |   ? selinux_file_ioctl+0x17f/0x220
  |   do_sys_open+0x186/0x220
  |   do_syscall_64+0x48/0x150
  |   entry_SYSCALL_64_after_hwframe+0x44/0xa9
  |  RIP: 0033:0x7f3b87efcd0e
  |  Code: 89 54 24 08 e8 a3 f4 ff ff 8b 74 24 0c 48 8b 3c 24 41 89 c0 44 8b 54 24 08 b8 01 01 00 00 89 f4
  |  RSP: 002b:00007f3b87d259f0 EFLAGS: 00000293 ORIG_RAX: 0000000000000101
  |  RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f3b87efcd0e
  |  RDX: 0000000000000000 RSI: 00007f3b87d25a80 RDI: 00000000ffffff9c
  |  RBP: 00007f3b87d25e90 R08: 0000000000000000 R09: 0000000000000000
  |  R10: 0000000000000000 R11: 0000000000000293 R12: 00007ffe188f504e
  |  R13: 00007ffe188f504f R14: 00007f3b87d26700 R15: 0000000000000000
  |  ---[ end trace 24f53ca58db8180a ]---

Since 'cdev_get()' can already fail to obtain a reference, simply move
it over to use 'kobject_get_unless_zero()' instead of 'kobject_get()',
which will cause the racing thread to return -ENXIO if the initialising
thread fails unexpectedly.

Cc: Hillf Danton <hdanton@sina.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Reported-by: syzbot+82defefbbd8527e1c2cb@syzkaller.appspotmail.com
Signed-off-by: Will Deacon <will@kernel.org>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20191219120203.32691-1-will@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Wenping Zhang <wenping.zhang@rock-chips.com>
Change-Id: Idc4211025f422fdc4440a23ad491b5ccc459d4bd
(cherry picked from commit 68faa67)
Kwiboo pushed a commit that referenced this issue Nov 29, 2020
[ Upstream commit 4ff753f ]

When a UE or memory error exception is encountered, the MCE handler
tries to find the pfn using addr_to_pfn(), which takes an effective
address as an argument. The pfn is later used to poison the page where
the memory error occurred. Recent rework in this area made addr_to_pfn()
run in real mode, which can be fatal as it may try to access
memory outside the RMO region.

Add two helper functions to separate the work done in real mode from
the work done in virtual mode, without changing any functionality. This
also fixes the following error, as the use of addr_to_pfn() is now moved
to virtual mode.

Without this change, the following kernel crash is seen on hitting a UE.

[  485.128036] Oops: Kernel access of bad area, sig: 11 [#1]
[  485.128040] LE SMP NR_CPUS=2048 NUMA pSeries
[  485.128047] Modules linked in:
[  485.128067] CPU: 15 PID: 6536 Comm: insmod Kdump: loaded Tainted: G OE 5.7.0 #22
[  485.128074] NIP:  c00000000009b24c LR: c0000000000398d8 CTR: c000000000cd57c0
[  485.128078] REGS: c000000003f1f970 TRAP: 0300   Tainted: G OE (5.7.0)
[  485.128082] MSR:  8000000000001003 <SF,ME,RI,LE>  CR: 28008284  XER: 00000001
[  485.128088] CFAR: c00000000009b190 DAR: c0000001fab00000 DSISR: 40000000 IRQMASK: 1
[  485.128088] GPR00: 0000000000000001 c000000003f1fbf0 c000000001634300 0000b0fa01000000
[  485.128088] GPR04: d000000002220000 0000000000000000 00000000fab00000 0000000000000022
[  485.128088] GPR08: c0000001fab00000 0000000000000000 c0000001fab00000 c000000003f1fc14
[  485.128088] GPR12: 0000000000000008 c000000003ff5880 d000000002100008 0000000000000000
[  485.128088] GPR16: 000000000000ff20 000000000000fff1 000000000000fff2 d0000000021a1100
[  485.128088] GPR20: d000000002200000 c00000015c893c50 c000000000d49b28 c00000015c893c50
[  485.128088] GPR24: d0000000021a0d08 c0000000014e5da8 d0000000021a0818 000000000000000a
[  485.128088] GPR28: 0000000000000008 000000000000000a c0000000017e2970 000000000000000a
[  485.128125] NIP [c00000000009b24c] __find_linux_pte+0x11c/0x310
[  485.128130] LR [c0000000000398d8] addr_to_pfn+0x138/0x170
[  485.128133] Call Trace:
[  485.128135] Instruction dump:
[  485.128138] 3929ffff 7d4a3378 7c883c36 7d2907b4 794a1564 7d294038 794af082 3900ffff
[  485.128144] 79291f24 790af00e 78e70020 7d095214 <7c69502a> 2fa30000 419e011c 70690040
[  485.128152] ---[ end trace d34b27e29ae0e340 ]---

Fixes: 9ca766f ("powerpc/64s/pseries: machine check convert to use common event code")
Signed-off-by: Ganesh Goudar <ganeshgr@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200724063946.21378-1-ganeshgr@linux.ibm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
rubenvb pushed a commit to rubenvb/linux-rockchip that referenced this issue Jan 3, 2021
[ Upstream commit 4e79f02 ]

When running in BE mode on LPAE hardware with a PA-to-VA translation
that exceeds 4 GB, we patch bits 39:32 of the offset into the wrong
byte of the opcode. So fix that, by rotating the offset in r0 to the
right by 8 bits, which will put the 8-bit immediate in bits 31:24.

Note that this will also move bit #22 into its correct place when
applying the rotation to the constant #0x400000.

Fixes: d9a790d ("ARM: 7883/1: fix mov to mvn conversion in case of 64 bit phys_addr_t and BE")
Acked-by: Nicolas Pitre <nico@fluxnic.net>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>