Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Revert "mm: alloc_contig: re-allow CMA to compact FS pages" #2647

Merged
merged 1 commit into from
Aug 17, 2018

Conversation

popcornmix
Copy link
Collaborator

The upstream commit caused poor video playback performance
on a busy system for many users.

See: #2503

This reverts commit 424f6c4.

Signed-off-by: popcornmix popcornmix@gmail.com

The upstream commit caused poor video playback performance
on a busy system for many users.

See: raspberrypi#2503

This reverts commit 424f6c4.

Signed-off-by: popcornmix <popcornmix@gmail.com>
@popcornmix
Copy link
Collaborator Author

Ping @pelwell @6by9 @lategoodbye for comments.

@pelwell
Copy link
Contributor

pelwell commented Aug 15, 2018

I'm not going to drill down to understand why the original commit degraded performance, although I know from past experience that CMA compactions can be slow so having more could slow things down. If reverting causes a noticeable improvement in some use cases then I'm happy to merge, at least until a negative side-effect is discovered.

@lategoodbye
Copy link
Contributor

Thanks for pinging me. Unfortunately memory management is beyond my knowledge.

I think reverting such fundamental things is a temporary solution to make the user happy again. But we still need to identify and fix the root cause in the relevant drivers. Since this is CMA related, i think we can exclude sdhost and dwc_otg.

@6by9
Copy link
Contributor

6by9 commented Aug 16, 2018

Likewise not an area I know lots about.

I guess this leaves us in a position that could result in a more fragmented CMA heap. That'll affect the vc4 driver, and potentially mem_allocs when I get that merged.

I know Kodi can use vc4-kms-v3d, but I didn't think it did by default. Video decode doesn't currently either, so it seems strange that there is anything triggering a CMA compaction.
alsa, or USB?
Does increasing the size of the CMA heap defer the problem?
Is it just that compaction is hitting memory bandwidth hard enough to affect other things?

I'm not dismissing it, but it does raise questions that really need to be answered. Can we reproduce the problem here?

@popcornmix
Copy link
Collaborator Author

I've not reproduced it. It needs more than just kodi running to see the problem.
(People with issues tend to be recording and playing live TV or running background jobs like torrents).

Hard to say if an actual cma allocation is getting blocked, or it's just just busy cpu/sdram than blocks things. As a guess I'd say the VCHIQ messages are getting blocked for approaching 100ms (typical amount of queued audio on VPU), hence the audio glitch. But I don't believe VCHIQ messages rely on CMA allocations.

@popcornmix
Copy link
Collaborator Author

I guess this leaves us in a position that could result in a more fragmented CMA heap. That'll affect the vc4 driver, and potentially mem_allocs when I get that merged.

The CMA compaction problem didn't exist on 4.9 kernel, so I'm assuming it wasn't done there. So probably we are no worse off than on current 4.9 kernel.

@6by9
Copy link
Contributor

6by9 commented Aug 16, 2018

Sorry, I'd misremembered the thread as it being external sound cards rather than HDMI audio.

The CMA compaction problem didn't exist on 4.9 kernel, so I'm assuming it wasn't done there. So probably we are no worse off than on current 4.9 kernel.

Very true.
I thought I'd found the debug stats for cma, but I may be misremembering the dmabuf stuff. We could really do with some cma profiling generally (particularly if gpu_mem gets shifted to cma), and that would be a good start.

@6by9
Copy link
Contributor

6by9 commented Aug 16, 2018

I knew I'd seen CMA debugging - it's under CONFIG_CMA_DEBUGFS which isn't enabled by the standard configs.

There's also CONFIG_CMA_DEBUG which logs all CMA operations (probably overkill for most situations).

@popcornmix
Copy link
Collaborator Author

cma=64M added to cmdline.txt was reported to not help.

@6by9
Copy link
Contributor

6by9 commented Aug 16, 2018

I'm happy enough with the revert then as it has been shown to solve the problem.

CMA is used for all (almost?) DMA allocations.
Having enabled CONFIG_CMA_DEBUG, something like connecting a USB device will log things like

[  249.785355] tveeprom: Hauppauge model 204109, rev B3I6, serial# 13918364
[  249.785363] tveeprom: tuner model is SiLabs Si2157 (idx 186, type 4)
[  249.785368] tveeprom: TV standards PAL(B/G) NTSC(M) PAL(I) SECAM(L/L') PAL(D/D1/K) ATSC/DVB Digital (eeprom 0xfc)
[  249.785371] tveeprom: audio processor is None (idx 0)
[  249.785374] tveeprom: has no radio, has IR receiver, has no IR transmitter
[  249.785382] em28xx 1-1.1.3:1.0: dvb set to isoc mode.
[  249.785622] usbcore: registered new interface driver em28xx
[  249.828145] em28xx 1-1.1.3:1.0: Binding DVB extension
[  249.828575] cma: cma_alloc(cma 80d10fc0, count 15, align 4)
[  249.828817] cma: cma_alloc(): returned afd0d880
[  249.828902] cma: cma_alloc(cma 80d10fc0, count 15, align 4)
[  249.828934] cma: cma_alloc(): returned afd0dac0
[  249.829012] cma: cma_alloc(cma 80d10fc0, count 15, align 4)
[  249.829090] cma: cma_alloc(): returned afd0dd00
[  249.829193] cma: cma_alloc(cma 80d10fc0, count 15, align 4)
[  249.829265] cma: cma_alloc(): returned afd0df40
[  249.829357] cma: cma_alloc(cma 80d10fc0, count 15, align 4)
[  249.829451] cma: cma_alloc(): returned afd0e180
[  249.861013] i2c i2c-4: Added multiplexed i2c bus 5
...
[  389.513743] usb 1-1.1.3: USB disconnect, device number 8
[  389.513878] em28xx 1-1.1.3:1.0: Disconnecting
[  389.513883] em28xx 1-1.1.3:1.0: Closing DVB extension
[  389.513895] cma: cma_release(page afd0d880)
[  389.513905] cma: cma_release(page afd0dac0)
[  389.513913] cma: cma_release(page afd0dd00)
[  389.513921] cma: cma_release(page afd0df40)
[  389.513940] cma: cma_release(page afd0e180)

Enabling CONFIG_CMA_DEBUGFS by default wouldn't be a bad thing, and then /sys/kernel/debug/cma/cma-reserved/bitmap gives you an indication of how used the CMA heap is.
cat /proc/vmstats also dumps out a number of entries starting compact_, so that might reveal something on these glitching systems.

I'll investigate whether we can easily add dumping info on all the allocations currently made via CMA. It'll only be a rough idea as all allocations are done in units of pages by the time it hits CMA.

@6by9
Copy link
Contributor

6by9 commented Aug 16, 2018

I'll investigate whether we can easily add dumping info on all the allocations currently made via CMA. It'll only be a rough idea as all allocations are done in units of pages by the time it hits CMA.

Nope, not possible. CMA itself is just keeping a list of used pages in a bitmask, not a list of who has allocated anything, nor the size of those allocations. If other subsystems screw up, then they could ask to allocate 5 pages, but then release 6 pages.
It'll require a chunk more back tracing to see if there is a master list of allocations (I suspect not for all allocations seeing as Android's ION allocator over CMA goes direct to the CMA functions, whilst everything else goes via DMA calls).

@popcornmix
Copy link
Collaborator Author

Improvement is confirmed in Milhouse LibreELEC nightly. No complaints.
I'll merge for now, with the hope a better solution will replace the revert.

@popcornmix popcornmix merged commit 00f0e83 into raspberrypi:rpi-4.14.y Aug 17, 2018
@popcornmix popcornmix deleted the revert_cma_compact branch August 17, 2018 15:16
samnazarko added a commit to osmc/osmc that referenced this pull request Aug 19, 2018
popcornmix added a commit to raspberrypi/firmware that referenced this pull request Aug 24, 2018
kernel: bcm2835: interpolate audio delay
See: #1026

kernel: dtoverlays: Add support for ADV7280-M and ADV7281-M chips
See: raspberrypi/linux#2656

kernel: First batch of fixes for V4L2 camera driver
See: raspberrypi/linux#2609

kernel: Revert mm: alloc_contig: re-allow CMA to compact FS pages
See: raspberrypi/linux#2647

firmware: rawcam: Fix double buffer return issue
firmware: rawcam: Code cleanup

firmware: host_apps: Fixup partially merged commit from userland
See: #1027

firmware: mmal: Add KEEP_PORT_FORMATS flag to mmal connection
See: raspberrypi/userland#483

firmware: RaspiStill: Apply gpsd info as EXIF tags
See: raspberrypi/userland#286
popcornmix added a commit to Hexxeh/rpi-firmware that referenced this pull request Aug 24, 2018
kernel: bcm2835: interpolate audio delay
See: raspberrypi/firmware#1026

kernel: dtoverlays: Add support for ADV7280-M and ADV7281-M chips
See: raspberrypi/linux#2656

kernel: First batch of fixes for V4L2 camera driver
See: raspberrypi/linux#2609

kernel: Revert mm: alloc_contig: re-allow CMA to compact FS pages
See: raspberrypi/linux#2647

firmware: rawcam: Fix double buffer return issue
firmware: rawcam: Code cleanup

firmware: host_apps: Fixup partially merged commit from userland
See: raspberrypi/firmware#1027

firmware: mmal: Add KEEP_PORT_FORMATS flag to mmal connection
See: raspberrypi/userland#483

firmware: RaspiStill: Apply gpsd info as EXIF tags
See: raspberrypi/userland#286
5schatten pushed a commit to 5schatten/LibreELEC-RR that referenced this pull request Sep 5, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants