Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dispmanx fails with zero-copied MMAL buffer? #386

Open
Terminus-IMRC opened this issue Mar 29, 2017 · 25 comments
Open

Dispmanx fails with zero-copied MMAL buffer? #386

Terminus-IMRC opened this issue Mar 29, 2017 · 25 comments
Assignees

Comments

@Terminus-IMRC
Copy link
Contributor

@Terminus-IMRC Terminus-IMRC commented Mar 29, 2017

It seems that Dispmanx fails with zero-copied MMAL buffer.

Please try with this code:
https://github.com/Terminus-IMRC/dispmanx-fails-with-zero-copied-mmal-buffer
This code gets 128x128 random image from vc.ril.source component and feeds it to dispmanx to display it on screen.
img_9947

If you keep #define ZERO_COPY 0, it will work.
If you change this to #define ZERO_COPY 1, it will fail with this error:
main.c:85: Dispmanx call failed: 0xffffffff
I also confirmed this issue with vc.ril.camera component.

Am I wrong in usage of the APIs?

Thanks.

@6by9
Copy link
Contributor

@6by9 6by9 commented Mar 29, 2017

Everything looks sane in your code, but you are perfectly correct that dispmanx rejects the buffer.
Enabling VPU side asserts I see

assert( !success ) failed; ../../../../../applications/vmcs/dispserve/display_server.c::ds_dispmanx_bulk_write line 1593 rev 50fff83

firing every time the dispmanx call is being made. That is a VCHI bulk receive call failing
The cause appears to be the bulk transfer aborting, but I've not drilled down into why.

@pelwell Any immediate thoughts? vchiq_bulk_transfer in vchiq_core.c is failing through the
else if (bulk_waiter.actual == VCHIQ_BULK_ACTUAL_ABORTED) {
clause on the bulk receive on the VPU side.
Is the transfer going to get upset that source and destination are both in the reloc heap?

@luked99
Copy link
Contributor

@luked99 luked99 commented Mar 29, 2017

@6by9
Copy link
Contributor

@6by9 6by9 commented Mar 29, 2017

I might be talking rubbish, but why is dispmanx even doing bulk receive if we're doing zero copy?

MMAL is doing zero copy. DispmanX is doing the bulk transfer back the other way. Nothing fundamentally wrong with that as vc_dispmanx_resource_write_data just takes a user pointer to a block of memory.

Now there is a potential win by using

vc_dispmanx_resource_write_data_handle( DISPMANX_RESOURCE_HANDLE_T res, VC_IMAGE_TYPE_T src_type, int src_pitch, VCHI_MEM_HANDLE_T handle, uint32_t offset, const VC_RECT_T * rect );

instead. However you've then got to manually set up the MMAL buffer headers instead of allowing zero copy and mmal_port_pool_create to do the magic of switching buffer->data back and forth between MEM_HANDLE and user pointer.
I'm not certain that it actually saves the copy either, just alters how the source buffer is specified.

@pelwell
Copy link
Contributor

@pelwell pelwell commented Mar 29, 2017

No, that sounds like a reasonable question.

The bulk transmit is failing on the ARM side because the kernel doesn't know how to map the buffer it is given:

[   42.035046] vchiq: vchiq_ioctl - instance b794b000, cmd QUEUE_BULK_TRANSMIT, arg 7e8e35cc
[   42.037496] vchiq: create_pagelist - ba90a000 10000@76eb6000
[   42.037515] vchiq: create_pagelist - only -14/16 pages locked
[   42.037538] vchiq:   ioctl instance b794b000, cmd QUEUE_BULK_TRANSMIT -> status -1, -5

That -14 is EFAULT. Contrast it with the ZERO_COPY=0 case:

[  404.844633] vchiq: vchiq_ioctl - instance b794c000, cmd QUEUE_BULK_TRANSMIT, arg 7ec025cc
[  404.851503] vchiq: create_pagelist - ba90a000 10000@00b53640
[  404.851602] vchiq: 0: bt (3->10) tx 10000@fa90a000 b8f56e00
[  404.851614] vchiq: 0: qm BULK_TX@ba9a2de8,8 (3->10)
[  404.851627] vchiq: 0: bt:3 tx li=1 ri=0 p=0
[  404.851838] vchiq: 0: prs BULK_TX_DONE@ba983198 (10->3) 10000@fa90a000
[  404.851849] vchiq: 0: prs:3 tx li=1 ri=1 p=0
[  404.851857] vchiq: free_pagelist - ba90a000, 65536
[  404.851904] vchiq: 0: nb:3 tx - p=1 rn=0 r=0
[  404.851932] vchiq:   ioctl instance b794c000, cmd QUEUE_BULK_TRANSMIT -> status 0, 0

The buffer pointer in this case is clearly a regular user buffer.

What exactly is being sent when ZERO_COPY=1?

@6by9
Copy link
Contributor

@6by9 6by9 commented Mar 29, 2017

What exactly is being sent when ZERO_COPY=1?

VCSM should be doing the allocation via mmal_vc_shm_alloc (https://github.com/raspberrypi/userland/blob/master/interface/mmal/vc/mmal_vc_shm.c#L145).

As the buffer comes back from the VPU, mmal_vc_do_callback calls mmal_vc_shm_lock to cause VCSM to lock the buffer and return the user pointer into buffer->data. (We disabled the bit that relied on the fault handler doing the VCSM lock due to deadlock issues).
That is the pointer that is being passed in to dispmanx.

A guess: If the ARM hasn't touched the actual buffers then we haven't called the vcsm_vma_fault (https://github.com/raspberrypi/linux/blob/rpi-4.9.y/drivers/char/broadcom/vc_sm/vmcs_sm.c#L1122). Am I right in thinking that fault handler is what is going to lock the buffers?
I had tried just touching the first byte of the buffer, but we've got multiple pages there. I'll try adding a few more reads to see if it helps.

@6by9
Copy link
Contributor

@6by9 6by9 commented Mar 29, 2017

I had tried just touching the first byte of the buffer, but we've got multiple pages there. I'll try adding a few more reads to see if it helps.

Nope. not that.

@Terminus-IMRC
Copy link
Contributor Author

@Terminus-IMRC Terminus-IMRC commented Mar 29, 2017

It seems that vc_dispmanx_resource_write_data_handle and vc_dispmanx_resource_write_data are both doing bulk transmitting, so they aren't able to do zero-copy from MMAL to dispmanx? (Maybe I'm wrong here.)

@6by9
Copy link
Contributor

@6by9 6by9 commented Mar 29, 2017

It seems that vc_dispmanx_resource_write_data_handle and vc_dispmanx_resource_write_data are both doing bulk transmitting, so they aren't able to do zero-copy from MMAL to dispmanx? (Maybe I'm wrong here.)

I think you're correct there. They are both writing data into an existing DispmanX resource, so they have to copy the pixel data.
I'm not sure if there is an API call that allows a client to pass in an existing MEM_HANDLE_T to be used as the source data. I couldn't see it with a quick look.

@Terminus-IMRC
Copy link
Contributor Author

@Terminus-IMRC Terminus-IMRC commented Mar 31, 2017

Excuse me for wandering from the subject, but I found that since MMAL's vc.ril.video_render supports RGBA and zero-copy rendering, video_render can be used as a zero-copy rendering API instead of DispmanX. (video_render uses DispmanX internally however).

@6by9
Copy link
Contributor

@6by9 6by9 commented Mar 31, 2017

Yes, video_render gives you an alternative route to the same result. Using a MMAL_CONNECTION_T with the tunnelling flag can also pass the buffers between the components without needing any intervention from the ARM at all.
I was sort of assuming you had some specific reason for wanting to use DispmanX in this manner, and it should work so worth investigating why it fails (although not got much resource to investigate it at the moment).

@Terminus-IMRC
Copy link
Contributor Author

@Terminus-IMRC Terminus-IMRC commented Apr 1, 2017

Yes. I want VPU to do alpha blending if possible.

@6by9
Copy link
Contributor

@6by9 6by9 commented May 23, 2017

I've just hit this same issue when trying to take a CMA allocation made using V4L2, mmaped into the user process, and then passing that user pointer into VCHIQ as a MMAL buffer.

get_user_pages is calling check_vma_flags(vma, gup_flags). check_vma_flags is returning -EFAULT because vm_flags has both VM_IO and VM_PFNMAP set - http://elixir.free-electrons.com/linux/v4.9/source/mm/gup.c#L435

vm_flags is 0x040444FB, which corresponds to

VM_READ 
VM_WRITE
VM_SHARED
VM_MAYREAD
VM_MAYWRITE
VM_MAYEXEC
VM_MAYSHARE
VM_PFNMAP  /* Page-ranges managed without "struct page", just pure PFN */
VM_IO           /* Memory mapped I/O or similar */
VM_DONTEXPAND
VM_DONTDUMP

That looks all valid, but having VM_PFNMAP and VM_IO is going to be different from the "normal" userspace case, and I guess needs special handling in VCHIQ. We already have a special case for vmalloc addresses, and Eric had added a special case for in a kthread when he was hacking dmabuf with the other V4L2 driver on 4.4. I suspect that it's the conditional selecting that path that is wrong. anholt/linux@47e15c3

@pelwell
Copy link
Contributor

@pelwell pelwell commented May 23, 2017

Just to clarify, check_vma_flags returns -EFAULT when either VM_IO or VM_PFNMAP is set - it just happens that in this case both are set.

@6by9
Copy link
Contributor

@6by9 6by9 commented May 23, 2017

Yes, bad wording by me. Sorry.

They are both set due to remap_pfn_range being used to map the buffer in the first place.
http://elixir.free-electrons.com/linux/v4.9/source/mm/memory.c#L1760
VCSM uses the same call to remap gpu_mem into the kernel, so that confirms that the zero copy path is having issues for the same reason.

I'm trying to find my way through the maze of twisty passages all alike to work out how VCHIQ can get that VM_PFNMAP flag and therefore tell that this buffer is remapped so should be treated as a list of PFNs.

@JamesH65
Copy link
Collaborator

@JamesH65 JamesH65 commented Jul 2, 2018

@6by9 Any progress? I know there been quite a bit of work in this area.

@6by9
Copy link
Contributor

@6by9 6by9 commented Jul 2, 2018

I have a patch queued that has a chance of solving this, but it's not a case I've specifically tested.

@Terminus-IMRC
Copy link
Contributor Author

@Terminus-IMRC Terminus-IMRC commented Jul 2, 2018

Would you please tell me where the patch is and I'll be able to test it.

@6by9
Copy link
Contributor

@6by9 6by9 commented Jul 2, 2018

I have a long patchset of cleaning up the V4L2 camera driver and adding a V4L2 codec driver. Not ready for a PR yet, but I've pushed it to https://github.com/6by9/linux/tree/rpi-4.14.y-codecs-push2.
6by9/linux@07e61b8 is the commit that may fix it.

@Terminus-IMRC
Copy link
Contributor Author

@Terminus-IMRC Terminus-IMRC commented Jul 2, 2018

Thanks. By the way, what is your final goal of implementing V4L2 multimedia driver though there already is MMAL which works well?

@6by9
Copy link
Contributor

@6by9 6by9 commented Jul 2, 2018

Standards.
V4L2 is the Linux standard interface for cameras and codecs, and supports things like dmabufs for zero copy of data into DRM/KMS. Whilst that can be done with MMAL it requires a load of extra setup to be done by the app.
GStreamer already supports an optimised decode path, and patches for FFmpeg are in the pipeline by others.

@Terminus-IMRC
Copy link
Contributor Author

@Terminus-IMRC Terminus-IMRC commented Jul 2, 2018

Oh, that's great! Please continue rocking!

@6by9
Copy link
Contributor

@6by9 6by9 commented Jul 5, 2018

I've just tried your test app again - no joy. The call doesn't fail anymore, but the data copied is incorrect.

@Terminus-IMRC
Copy link
Contributor Author

@Terminus-IMRC Terminus-IMRC commented Jul 7, 2018

Yeah, I ended up with the same result, and the kernel sometimes crashes when executed the app multiple times

@JamesH65
Copy link
Collaborator

@JamesH65 JamesH65 commented Jan 8, 2019

@6by9 Anything happened recently on this?

@6by9
Copy link
Contributor

@6by9 6by9 commented Jan 8, 2019

Sort of.
Please don't close at the moment as it is a good reminder for me to ensure that this case is fixed with the new version of vc-sm.

@6by9 6by9 mentioned this issue Mar 18, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
5 participants
You can’t perform that action at this time.