Platform supporting MIPI, VCU etc. #2

Closed
syed-ahmed opened this issue Dec 28, 2021 · 6 comments

@syed-ahmed

Hi!

I was able to run the KRS examples on the KV260 and am currently working on accelerating the ORB-SLAM2 ROS node using KRS. I looked into the Vivado platform shipped by KRS and it looks like a bare-minimum acceleration platform. Are there any instructions on how that platform was created? I looked into the artifacts of this repo, but it seems to ship only the exported hardware platform (whereas I'm interested in the Tcl scripts that created the hardware project and the PetaLinux meta recipes). I want to build a pipeline like this using KRS, so I was wondering whether the platform in this repo needs to be updated to support MIPI/VCU/audio pipelines.

@vmayoral
Member

Hello @syed-ahmed,

> I was able to run the KRS examples on KV260 and am currently working on accelerating ORB-SLAM2 ROS node using KRS.

Great to hear that!

> I looked into the vivado platform shipped by KRS and it looks like it's a bare minimum acceleration platform. I was wondering if there were any instructions on how that platform was created? I looked into the artifacts of this repo but seems like it only ships with the exported hardware platform (whereas I'm interested in tcl scripts that created the hardware project and petalinux meta recipes). I want to build a pipeline like this using KRS and so was wondering if the platform in this repo need to be updated, such that it supports MIPI/VCU/audio pipelines.

That's correct. KRS alpha ships only a minimalistic Vitis platform, which the Vitis compiler uses as a base to add whatever accelerators you have in your ROS 2 workspace. KRS alpha is only meant for basic (single Node) accelerators. Support for multiple accelerators and multiple Nodes is coming in KRS beta, along with tools that simplify replacing the Vitis platform.

Right now, in alpha, the process is quite cumbersome and I don't recommend it, but it's definitely doable if you know what you're doing. The source code of the platform files and the scripts to automate it are available in:

A few notes:

  • Many of these things lack documentation. You're working on the cutting edge (which is cool but, like me, expect a bumpy road).
  • Changes in the Vitis platform will also force you to review and update the device tree blobs as appropriate if you want the extensions to the ROS 2 build system (ament_acceleration) and build tools (colcon-acceleration) to produce valid kernels. Otherwise, your kernels will synthesize and place-and-route just fine, but the device tree inconsistencies won't allow them to interact with the hardware successfully.
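
One device tree detail that is cheap to check, and that comes up later in this thread, is the firmware-name property of the overlay, which has to match the kernel's bitstream name. The following is a minimal sketch, assuming the .dtsi contains a firmware-name line in the XSCT-generated form and that the binary is named `<kernel>.bit.bin`. The function name `check_firmware_name` is hypothetical and not part of KRS.

```shell
# Hypothetical helper (not part of KRS): succeed only if the overlay's
# firmware-name equals "<kernel>.bit.bin".
#   $1: path to the .dtsi
#   $2: kernel name, e.g. vadd_faster
check_firmware_name() (
  dtsi=$1
  expected="$2.bit.bin"
  # Extract the quoted value from a line like: firmware-name = "...";
  actual=$(sed -n 's/^[[:space:]]*firmware-name[[:space:]]*=[[:space:]]*"\([^"]*\)".*/\1/p' "$dtsi" | head -n 1)
  if [ "$actual" = "$expected" ]; then
    echo "ok: firmware-name = $actual"
  else
    echo "mismatch: firmware-name = '$actual', expected '$expected'" >&2
    exit 1
  fi
)

# Usage (path is illustrative):
#   check_firmware_name path/to/vadd_faster.dtsi vadd_faster
```

This only catches a naming mismatch. It says nothing about whether the addresses, clocks, or interrupts in the overlay agree with the hardware platform.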

My plan is to release at least two Vitis platforms with KRS beta, with tools to switch between them easily, and to document how to contribute your own platform. It'd be great to get your platform landing here as a third one.

As a side note, I see ORB-SLAM just turned 3! (UZ-SLAMLab/ORB_SLAM3)

@vmayoral vmayoral self-assigned this Jan 10, 2022
@syed-ahmed
Author

Thanks @vmayoral! That explains a lot! Let's keep this issue open and I can document the process as I work on this.

@vmayoral
Member

Hey @syed-ahmed!

Do you have any updates to share with us on your research? Let us know how we can help.

@syed-ahmed
Author

Hi @vmayoral. Apologies for the late reply. I was transitioning from academia and had to stop working on this.

I was able to make a custom platform. The process was simpler than I thought. I was able to skip PetaLinux and reuse the artifacts that were already in the KV260 firmware release of KRS. The only thing that changed is the hardware platform. However, a full set of instructions on generating the KV260 artifacts would of course help in the future (e.g. patching with the PREEMPT_RT kernel, Xilinx BSP modifications if any, added packages, etc.). I don't really have a formal writeup, but here's an attempt at documenting the process:

  1. Clone the platform repo.
git clone https://github.com/syed-ahmed/xilinx-k26-som-2021.2 \
  && cd xilinx-k26-som-2021.2 \
  && git submodule update --init --recursive
  2. Make the platform.
cd kv260-vitis \
  && make platform PFM=kv260_ispMipiRx_vcu_DP
  3. Replace the KRS kv260 platform with the generated platform from the previous step:
mv ~/krs_ws/src/acceleration/acceleration_firmware_kv260/acceleration_firmware_kv260/firmware/platform ~/krs_ws/src/acceleration/acceleration_firmware_kv260/acceleration_firmware_kv260/firmware/platform_bk \
  && cp -r platforms/xilinx_kv260_ispMipiRx_vcu_DP_202110_1 ~/krs_ws/src/acceleration/acceleration_firmware_kv260/acceleration_firmware_kv260/firmware/platform
  4. Generate the device tree from the .xsa produced in step 2 by following the directions here.
  5. Replace the .dtsi in one of the acceleration examples to test. Example: replace the vadd_faster.dtsi in krs_ws/src/acceleration/acceleration_examples/nodes/faster_doublevadd_publisher/src with the .dtsi generated in step 4. Mine looks as follows. Note that firmware-name reflects that of vadd_faster.
/*
 * CAUTION: This file is automatically generated by Xilinx.
 * Version: XSCT 2021.2
 * Today is: Tue Mar 22 00:42:13 2022
 */


/dts-v1/;
/plugin/;
/ {
	fragment@0 {
		target = <&fpga_full>;
		overlay0: __overlay__ {
			#address-cells = <2>;
			#size-cells = <2>;
			firmware-name = "vadd_faster.bit.bin";
			resets = <&zynqmp_reset 116>, <&zynqmp_reset 117>, <&zynqmp_reset 118>, <&zynqmp_reset 119>;
		};
	};
	fragment@1 {
		target = <&amba>;
		overlay1: __overlay__ {
			afi0: afi0 {
				compatible = "xlnx,afi-fpga";
				config-afi = < 0 0>, <1 0>, <2 0>, <3 0>, <4 0>, <5 0>, <6 0>, <7 0>, <8 0>, <9 0>, <10 0>, <11 0>, <12 0>, <13 0>, <14 0x0>, <15 0x000>;
			};
			clocking0: clocking0 {
				#clock-cells = <0>;
				assigned-clock-rates = <99999001>;
				assigned-clocks = <&zynqmp_clk 71>;
				clock-output-names = "fabric_clk";
				clocks = <&zynqmp_clk 71>;
				compatible = "xlnx,fclk";
			};
			clocking1: clocking1 {
				#clock-cells = <0>;
				assigned-clock-rates = <99999001>;
				assigned-clocks = <&zynqmp_clk 72>;
				clock-output-names = "fabric_clk";
				clocks = <&zynqmp_clk 72>;
				compatible = "xlnx,fclk";
			};
		};
	};
	fragment@2 {
		target = <&amba>;
		overlay2: __overlay__ {
			#address-cells = <2>;
			#size-cells = <2>;
			audio_ss_0_audio_formatter_0: audio_formatter@80040000 {
				clock-names = "s_axi_lite_aclk", "m_axis_mm2s_aclk", "aud_mclk", "s_axis_s2mm_aclk";
				clocks = <&misc_clk_0>, <&misc_clk_1>, <&misc_clk_1>, <&misc_clk_0>;
				compatible = "xlnx,audio-formatter-1.0", "xlnx,audio-formatter-1.0";
				interrupt-names = "irq_mm2s", "irq_s2mm";
				interrupt-parent = <&gic>;
				interrupts = <0 111 4 0 110 4>;
				reg = <0x0 0x80040000 0x0 0x10000>;
				xlnx,include-mm2s = <0x1>;
				xlnx,include-s2mm = <0x1>;
				xlnx,max-num-channels-mm2s = <0x2>;
				xlnx,max-num-channels-s2mm = <0x2>;
				xlnx,mm2s-addr-width = <0x40>;
				xlnx,mm2s-async-clock = <0x1>;
				xlnx,mm2s-dataformat = <0x3>;
				xlnx,packing-mode-mm2s = <0x0>;
				xlnx,packing-mode-s2mm = <0x0>;
				xlnx,rx = <&audio_ss_0_i2s_receiver_0>;
				xlnx,s2mm-addr-width = <0x40>;
				xlnx,s2mm-async-clock = <0x1>;
				xlnx,s2mm-dataformat = <0x1>;
				xlnx,tx = <&audio_ss_0_i2s_transmitter_0>;
			};
			misc_clk_0: misc_clk_0 {
				#clock-cells = <0>;
				clock-frequency = <99999000>;
				compatible = "fixed-clock";
			};
			misc_clk_1: misc_clk_1 {
				#clock-cells = <0>;
				clock-frequency = <18432995>;
				compatible = "fixed-clock";
			};
			audio_ss_0_i2s_receiver_0: i2s_receiver@80060000 {
				aud_mclk = <18432995>;
				clock-names = "s_axi_ctrl_aclk", "aud_mclk", "m_axis_aud_aclk";
				clocks = <&misc_clk_0>, <&misc_clk_1>, <&misc_clk_0>;
				compatible = "xlnx,i2s-receiver-1.0", "xlnx,i2s-receiver-1.0";
				interrupt-names = "irq";
				interrupt-parent = <&gic>;
				interrupts = <0 108 4>;
				reg = <0x0 0x80060000 0x0 0x10000>;
				xlnx,depth = <0x80>;
				xlnx,dwidth = <0x18>;
				xlnx,num-channels = <0x1>;
				xlnx,snd-pcm = <&audio_ss_0_audio_formatter_0>;
			};
			audio_ss_0_i2s_transmitter_0: i2s_transmitter@80070000 {
				aud_mclk = <18432995>;
				clock-names = "s_axi_ctrl_aclk", "aud_mclk", "s_axis_aud_aclk";
				clocks = <&misc_clk_0>, <&misc_clk_1>, <&misc_clk_1>;
				compatible = "xlnx,i2s-transmitter-1.0", "xlnx,i2s-transmitter-1.0";
				interrupt-names = "irq";
				interrupt-parent = <&gic>;
				interrupts = <0 109 4>;
				reg = <0x0 0x80070000 0x0 0x10000>;
				xlnx,depth = <0x80>;
				xlnx,dwidth = <0x18>;
				xlnx,num-channels = <0x1>;
				xlnx,snd-pcm = <&audio_ss_0_audio_formatter_0>;
			};
			axi_iic_0: i2c@80030000 {
				#address-cells = <1>;
				#size-cells = <0>;
				clock-names = "s_axi_aclk";
				clocks = <&misc_clk_0>;
				compatible = "xlnx,axi-iic-2.1", "xlnx,xps-iic-2.00.a";
				interrupt-names = "iic2intc_irpt";
				interrupt-parent = <&gic>;
				interrupts = <0 107 4>;
				reg = <0x0 0x80030000 0x0 0x10000>;
			};
			axi_vip_0: axi_vip@a0000000 {
				/* This is a place holder node for a custom IP, user may need to update the entries */
				clock-names = "aclk";
				clocks = <&misc_clk_2>;
				compatible = "xlnx,axi-vip-1.1";
				reg = <0x0 0xa0000000 0x0 0x10000>;
				xlnx,axi-addr-width = <0x20>;
				xlnx,axi-aruser-width = <0x10>;
				xlnx,axi-awuser-width = <0x10>;
				xlnx,axi-buser-width = <0x0>;
				xlnx,axi-has-aresetn = <0x1>;
				xlnx,axi-has-bresp = <0x1>;
				xlnx,axi-has-burst = <0x1>;
				xlnx,axi-has-cache = <0x1>;
				xlnx,axi-has-lock = <0x1>;
				xlnx,axi-has-prot = <0x1>;
				xlnx,axi-has-qos = <0x1>;
				xlnx,axi-has-region = <0x0>;
				xlnx,axi-has-rresp = <0x1>;
				xlnx,axi-has-wstrb = <0x1>;
				xlnx,axi-interface-mode = <0x2>;
				xlnx,axi-protocol = <0x0>;
				xlnx,axi-rdata-width = <0x20>;
				xlnx,axi-rid-width = <0x10>;
				xlnx,axi-ruser-width = <0x0>;
				xlnx,axi-supports-narrow = <0x1>;
				xlnx,axi-wdata-width = <0x20>;
				xlnx,axi-wid-width = <0x10>;
				xlnx,axi-wuser-width = <0x0>;
			};
			misc_clk_2: misc_clk_2 {
				#clock-cells = <0>;
				clock-frequency = <299997000>;
				compatible = "fixed-clock";
			};
			capture_pipeline_mipi_csi2_rx_subsyst_0: mipi_csi2_rx_subsystem@80000000 {
				clock-names = "lite_aclk", "dphy_clk_200M", "video_aclk";
				clocks = <&misc_clk_0>, <&misc_clk_3>, <&misc_clk_2>;
				compatible = "xlnx,mipi-csi2-rx-subsystem-5.1", "xlnx,mipi-csi2-rx-subsystem-5.0";
				interrupt-names = "csirxss_csi_irq";
				interrupt-parent = <&gic>;
				interrupts = <0 104 4>;
				reg = <0x0 0x80000000 0x0 0x2000>;
				xlnx,axis-tdata-width = <32>;
				xlnx,max-lanes = <4>;
				xlnx,ppc = <2>;
				xlnx,vfb ;
				mipi_csi_portscapture_pipeline_mipi_csi2_rx_subsyst_0: ports {
					#address-cells = <1>;
					#size-cells = <0>;
					mipi_csi_port1capture_pipeline_mipi_csi2_rx_subsyst_0: port@1 {
						/* Fill cfa-pattern=rggb for raw data types, other fields video-format and video-width user needs to fill */
						reg = <1>;
						xlnx,cfa-pattern = "rggb";
						xlnx,video-format = <12>;
						xlnx,video-width = <8>;
						mipi_csirx_outcapture_pipeline_mipi_csi2_rx_subsyst_0: endpoint {
							remote-endpoint = <&capture_pipeline_v_frmbuf_wr_0capture_pipeline_mipi_csi2_rx_subsyst_0>;
						};
					};
					mipi_csi_port0capture_pipeline_mipi_csi2_rx_subsyst_0: port@0 {
						/* Fill cfa-pattern=rggb for raw data types, other fields video-format,video-width user needs to fill */
						/* User need to add something like remote-endpoint=<&out> under the node csiss_in:endpoint */
						reg = <0>;
						xlnx,cfa-pattern = "rggb";
						xlnx,video-format = <12>;
						xlnx,video-width = <8>;
						mipi_csi_incapture_pipeline_mipi_csi2_rx_subsyst_0: endpoint {
							data-lanes = <1 2 3 4>;
						};
					};
				};
			};
			misc_clk_3: misc_clk_3 {
				#clock-cells = <0>;
				clock-frequency = <199998000>;
				compatible = "fixed-clock";
			};
			capture_pipeline_v_frmbuf_wr_0: v_frmbuf_wr@b0010000 {
				#dma-cells = <1>;
				clock-names = "ap_clk";
				clocks = <&misc_clk_2>;
				compatible = "xlnx,v-frmbuf-wr-2.3", "xlnx,axi-frmbuf-wr-v2.2";
				interrupt-names = "interrupt";
				interrupt-parent = <&gic>;
				interrupts = <0 105 4>;
				reg = <0x0 0xb0010000 0x0 0x10000>;
				reset-gpios = <&gpio 78 1>;
				xlnx,dma-addr-width = <32>;
				xlnx,dma-align = <16>;
				xlnx,max-height = <2160>;
				xlnx,max-width = <3840>;
				xlnx,pixels-per-clock = <2>;
				xlnx,s-axi-ctrl-addr-width = <0x7>;
				xlnx,s-axi-ctrl-data-width = <0x20>;
				xlnx,vid-formats = "nv12";
				xlnx,video-width = <8>;
			};
			vcu_vcu_0: vcu@80100000 {
				#address-cells = <2>;
				#clock-cells = <1>;
				#size-cells = <2>;
				clock-names = "pll_ref", "aclk", "vcu_core_enc", "vcu_core_dec", "vcu_mcu_enc", "vcu_mcu_dec";
				clocks = <&misc_clk_4>, <&misc_clk_0>, <&vcu_vcu_0 1>, <&vcu_vcu_0 2>, <&vcu_vcu_0 3>, <&vcu_vcu_0 4>;
				compatible = "xlnx,vcu-1.2", "xlnx,vcu";
				interrupt-names = "vcu_host_interrupt";
				interrupt-parent = <&gic>;
				interrupts = <0 106 4>;
				ranges ;
				reg = <0x0 0x80140000 0x0 0x1000>, <0x0 0x80141000 0x0 0x1000>;
				reg-names = "vcu_slcr", "logicore";
				reset-gpios = <&gpio 80 0>;
				encoder: al5e@80100000 {
					compatible = "al,al5e-1.2", "al,al5e";
					interrupt-parent = <&gic>;
					interrupts = <0 106 4>;
					reg = <0x0 0x80100000 0x0 0x10000>;
				};
				decoder: al5d@80120000 {
					compatible = "al,al5d-1.2", "al,al5d";
					interrupt-parent = <&gic>;
					interrupts = <0 106 4>;
					reg = <0x0 0x80120000 0x0 0x10000>;
				};
			};
			misc_clk_4: misc_clk_4 {
				#clock-cells = <0>;
				clock-frequency = <49999500>;
				compatible = "fixed-clock";
			};
			zyxclmm_drm {
				compatible = "xlnx,zocl";
			};
			vcap_capture_pipeline_mipi_csi2_rx_subsyst_0 {
				compatible = "xlnx,video";
				dma-names = "port0";
				dmas = <&capture_pipeline_v_frmbuf_wr_0 0>;
				vcap_portscapture_pipeline_mipi_csi2_rx_subsyst_0: ports {
					#address-cells = <1>;
					#size-cells = <0>;
					vcap_portcapture_pipeline_mipi_csi2_rx_subsyst_0: port@0 {
						direction = "input";
						reg = <0>;
						capture_pipeline_v_frmbuf_wr_0capture_pipeline_mipi_csi2_rx_subsyst_0: endpoint {
							remote-endpoint = <&mipi_csirx_outcapture_pipeline_mipi_csi2_rx_subsyst_0>;
						};
					};
				};
			};
		};
	};
};
  6. Make changes to kv260.cfg to reflect the new platform requirements (platform name, clocks, Vivado strategy, etc.). Mine looks like this:
platform=kv260_ispMipiRx_vcu_DP
save-temps=1
debug=1

# Enable profiling of data ports
[profile]
data=all:all:all

[vivado]
prop=run.impl_1.strategy=Performance_ExploreWithRemap
  7. Compile the acceleration examples as described in the KRS docs.
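
The platform swap in step 3 can be wrapped with a few guards so that a second run does not overwrite the backup. This is only a sketch of the same `mv` and `cp -r` sequence. The function name `swap_platform` is hypothetical, and it assumes the firmware directory contains a `platform` subdirectory as in the paths above.

```shell
# Hypothetical helper (not part of KRS): back up the current platform
# directory as platform_bk and copy a newly generated one into its place.
#   $1: firmware directory that contains "platform"
#   $2: generated platform directory to install
swap_platform() (
  fw_dir=$1
  new_platform=$2
  # Fail early if the inputs are not what we expect.
  [ -d "$fw_dir/platform" ] || { echo "no platform dir in $fw_dir" >&2; exit 1; }
  [ -d "$new_platform" ] || { echo "missing $new_platform" >&2; exit 1; }
  # Never clobber an earlier backup.
  [ ! -e "$fw_dir/platform_bk" ] || { echo "platform_bk already exists" >&2; exit 1; }
  mv "$fw_dir/platform" "$fw_dir/platform_bk" &&
    cp -r "$new_platform" "$fw_dir/platform"
)

# Usage, with the paths from step 3:
#   swap_platform \
#     ~/krs_ws/src/acceleration/acceleration_firmware_kv260/acceleration_firmware_kv260/firmware \
#     platforms/xilinx_kv260_ispMipiRx_vcu_DP_202110_1
```

The body runs in a subshell, so a failed guard returns a non-zero status without exiting the calling shell. Restoring the original platform is a matter of removing `platform` and moving `platform_bk` back.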

That's my progress so far. My next step would have been to:

Unfortunately I have to stop here since I don't have the bandwidth to work on this anymore :(. I hope somebody else will pick this up. Maybe I'll find some time in the future.

@vmayoral
Member

Thanks for the update @syed-ahmed, progress looks great to me. I'll keep this open. Keep us posted on your next steps, this should be helpful to others following your path.

@vmayoral vmayoral assigned syed-ahmed and unassigned vmayoral Mar 30, 2022
@vmayoral
Member

I'm closing this for now @syed-ahmed, feel free to re-open or ping me if anything else is needed.
