[ffmpeg] Backport of armv6 and vfp optimisations for DTS #3016

Merged: popcornmix merged 13 commits into xbmc:master from the ffmpeg_dts_up branch on Aug 3, 2013

popcornmix
Member

A sequence of optimisations has been accepted into upstream libavcodec/ffmpeg, designed to make software decoding of DTS run better on the Raspberry Pi.

Since DTS is a floating point codec, the optimisations mostly rely upon hand-scheduled VFP code and the use of short vectors.

Overall this reduces the time spent in DTS decoding by about 36%, which makes software DTS decode on the Pi much more feasible without overclocking.

The patches can be dropped once ffmpeg is updated to a sufficiently new version. They aren't present in v.2.0, which may be the next logical upgrade step for ffmpeg, so I'd like to get them cherry-picked rather than waiting for an ffmpeg bump.

The patches have been present in rbej builds for a few weeks and no regressions have been reported.

@elupus
Contributor

elupus commented Jul 28, 2013

fine by me. @aballier any comments?

@davilla
Contributor

davilla commented Jul 28, 2013

needs testing under iOS since it uses ASLR and this will cause issues if symbols are not handled correctly.

@aballier
Contributor

fine by me; I don't know much about arm asm and since those come from upstream they should be fine

@MartijnKaijser
Member

don't these patches need to be added in the patch file folder too?
https://github.com/xbmc/xbmc/tree/master/lib/ffmpeg/patches

Edit:
nevermind. read the last commit wrong

@elupus
Contributor

elupus commented Jul 30, 2013

yes since they are likely not in the next target ffmpeg.

@popcornmix
Member Author

@elupus
the patches are present.

Can anyone test this on iOS or android?

jenkins build this please

@davilla
Contributor

davilla commented Aug 1, 2013

reminder, this does NOT go in until tested under iOS.

@popcornmix
Member Author

Understood. Jenkins did give a failure on ATV2 (although it disappeared with the rebase to bump the patch numbers).
It looked like the assembler doesn't support a feature used in the assembly. I'll check with the original author.

@popcornmix
Member Author

Hmmm. The error reported by jenkins:

http://jenkins.xbmc.org/job/XBMC-IOS-ATV2/273/console

libavcodec/arm/mdct_vfp.S:114:unknown register alias 'TCOS_D0_HEAD'
libavcodec/arm/mdct_vfp.S:115:unknown register alias 'TCOS_D1_HEAD'
libavcodec/arm/mdct_vfp.S:116:unknown register alias 'TCOS_S0_TAIL'
make[2]: *** [libavcodec/arm/mdct_vfp.o] Error 1
make[2]: *** Waiting for unfinished jobs....

doesn't seem to match the code.
popcornmix@e7fce26#L2R114

Do we have a way to view files on jenkins?

Just in case the rebase (renumbering of the patches, which shouldn't be used in the build) caused a build issue, I'll try again.

jenkins build this please

@Memphiz
Member

Memphiz commented Aug 1, 2013

You can watch the workspace of a Job/Slave here

http://jenkins.xbmc.org/job/XBMC-IOS-ATV2/ws/

Though this is the workspace of the current build, not of this PR (you need to look at it while it is building...)

@popcornmix
Member Author

Same errors again. Can anyone offer an explanation as to what's wrong on ATV2 build?
libavcodec/arm/mdct_vfp.S:114 has no reference to TCOS_D0_HEAD.

@Memphiz
Member

Memphiz commented Aug 2, 2013

lib/ffmpeg/patches/0045-ffmpeg-backport-arm-Add-VFP-accelerated-version-of-i.patch

this patch adds a reference to TCOS_D0_HEAD
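
One way to find which patch in a series introduces a symbol the assembler complains about is to scan only the added lines of each patch. The following is a minimal Python sketch of that idea; the function name `added_lines_mentioning` and the sample patch text are invented for illustration and are not taken from the actual 0045 patch.

```python
def added_lines_mentioning(patch_text, symbol):
    """Return (file, line) pairs for added lines in a unified diff that contain symbol."""
    hits = []
    current_file = None
    for line in patch_text.splitlines():
        if line.startswith("+++ "):
            # Header naming the post-patch file, e.g. "+++ b/path/to/file".
            current_file = line[4:].strip()
            if current_file.startswith("b/"):
                current_file = current_file[2:]
        elif line.startswith("+") and symbol in line:
            # An added line (the "+++" header was already handled above).
            hits.append((current_file, line[1:].strip()))
    return hits


# Hypothetical patch fragment, made up for this example only.
SAMPLE_PATCH = """\
--- a/libavcodec/arm/mdct_vfp.S
+++ b/libavcodec/arm/mdct_vfp.S
@@ -1,2 +1,3 @@
 unchanged context line
+example line using TCOS_D0_HEAD
 another context line
"""

hits = added_lines_mentioning(SAMPLE_PATCH, "TCOS_D0_HEAD")
for path, text in hits:
    print(f"{path}: {text}")
```

In practice the same function could be run over each file in lib/ffmpeg/patches by reading its contents and passing the text in.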

@Memphiz
Member

Memphiz commented Aug 2, 2013

@popcornmix - please try the newest gas-preprocessor.pl from here:

http://git.libav.org/?p=gas-preprocessor.git;a=log

just stick it in tools/depends/native/gas-preprocessor-native and redo a jenkins build with your branch afterwards.

@popcornmix
Member Author

Trying with updated gas-preprocessor.pl
jenkins build this please

@Memphiz
Member

Memphiz commented Aug 2, 2013

Now it's going crazy on droid and ios/atv2. It seems someone hasn't altered the xcode project files for all darwin platforms. For android I don't know what the issue is:

"cp: cannot stat `xbmc/xbmc.d': No such file or directory
make: *** [xbmc/xbmc.o] Error 1"

Nevertheless - the gas-preprocessor fixed the ffmpeg compile for ios ...

@Memphiz
Member

Memphiz commented Aug 2, 2013

As I see, davilla fixed the xcode project. A rebase to master might fix the build for ios/atv2.

@sraue
Member

sraue commented Aug 2, 2013

@Memphiz
"cp: cannot stat `xbmc/xbmc.d': No such file or directory
make: *** [xbmc/xbmc.o] Error 1"

I have been getting this sometimes under linux too for some time now; removing the sourcedir and rebuilding helps in this case (here)

popcornmix and others added 13 commits August 2, 2013 14:22
…r_float

               Before           After
               Mean    StdDev   Mean    StdDev  Change
This function   9295.0 114.9     4853.2 83.5    +91.5%
Overall        23699.8 397.6    19285.5 292.0   +22.9%

Signed-off-by: Martin Storsjö <martin@martin.st>
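
The Change column in the table above appears to be the ratio of the Before mean to the After mean, minus one, expressed as a percentage, so +91.5% means the function runs almost twice as fast. That reading is inferred from the numbers and is not stated in the commit message. A minimal Python sketch, with `speedup_percent` as a hypothetical helper name:

```python
def speedup_percent(before_mean, after_mean):
    """Speed-up of 'after' relative to 'before', as a percentage.

    Assumed formula (inferred from the table, not stated by the author):
    (before / after - 1) * 100.
    """
    if after_mean <= 0:
        raise ValueError("after_mean must be positive")
    return (before_mean / after_mean - 1.0) * 100.0


# Means copied from the table above.
this_function = speedup_percent(9295.0, 4853.2)
overall = speedup_percent(23699.8, 19285.5)

print(f"This function {this_function:+.1f}%")
print(f"Overall       {overall:+.1f}%")
```

Some rows in the later tables differ from this formula in the last digit, presumably because the published means are themselves rounded.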
…oat_fmul_scalar

               Before           After
               Mean    StdDev   Mean    StdDev  Change
This function   1175.0   4.4      366.2  18.3   +220.8%
Overall        19285.5 292.0    18420.5 489.1     +4.7%

Signed-off-by: Martin Storsjö <martin@martin.st>
…ul_array8

This is similar to int32_to_float_fmul_scalar, but
loads a new scalar multiplier every 8 input samples.
This enables the use of much larger input arrays, which
is important for pipelining on some CPUs (such as
ARMv6).

Signed-off-by: Martin Storsjö <martin@martin.st>
Signed-off-by: Martin Storsjö <martin@martin.st>
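
To illustrate the behaviour described in this commit message, here is a plain Python sketch of the semantics only. It is not the ffmpeg C or VFP implementation; the argument layout and the requirement that the input length be a multiple of 8 are assumptions made for the example.

```python
def int32_to_float_fmul_scalar(src, mul):
    """Convert each integer sample to float and scale it by one multiplier."""
    return [float(sample) * mul for sample in src]


def int32_to_float_fmul_array8(src, mul):
    """Like the scalar version, but take a new multiplier for every 8 samples.

    src: list of integer samples (length assumed to be a multiple of 8).
    mul: list of multipliers, one per block of 8 samples.
    """
    if len(src) % 8 != 0:
        raise ValueError("len(src) must be a multiple of 8 (assumption)")
    if len(mul) < len(src) // 8:
        raise ValueError("need one multiplier per block of 8 samples")
    dst = []
    for block, start in enumerate(range(0, len(src), 8)):
        # Each block of 8 samples is scaled by its own multiplier.
        dst.extend(int32_to_float_fmul_scalar(src[start:start + 8], mul[block]))
    return dst


samples = list(range(16))  # two blocks of 8 samples
scaled = int32_to_float_fmul_array8(samples, [0.5, 2.0])
print(scaled)
```

Handling many blocks in one call is what lets the caller pass a much larger input array than calling the scalar routine once per block of 8.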
…oat_fmul_array8

               Before           After
               Mean    StdDev   Mean    StdDev  Change
This function    366.2  18.3      277.8  13.7   +31.9%
Overall        18420.5 489.1    17049.5 408.2    +8.0%

Signed-off-by: Martin Storsjö <martin@martin.st>

               Before           After
               Mean    StdDev   Mean    StdDev  Change
This function   2653.0  28.5     1108.8  51.4   +139.3%
Overall        17049.5 408.2    15973.0 223.2     +6.7%

Signed-off-by: Martin Storsjö <martin@martin.st>

               Before           After
               Mean    StdDev   Mean    StdDev  Change
This function    868.2  33.5      436.0  27.0   +99.1%
Overall        15973.0 223.2    15577.5  83.2    +2.5%

Signed-off-by: Martin Storsjö <martin@martin.st>

               Before           After
               Mean    StdDev   Mean    StdDev  Change
This function   1389.3   4.2      967.8  35.1   +43.6%
Overall        15577.5  83.2    15400.0 336.4    +1.2%

Signed-off-by: Martin Storsjö <martin@martin.st>

This does most of the work formerly carried out by
the static function qmf_32_subbands() in dcadec.c.

Signed-off-by: Martin Storsjö <martin@martin.st>
…ands

               Before           After
               Mean    StdDev   Mean    StdDev  Change
This function   1323.0  98.0      746.2  60.6   +77.3%
Overall        15400.0 336.4    14147.5 288.4    +8.9%

Signed-off-by: Martin Storsjö <martin@martin.st>
…p assembly files

Reviewed-by: Kostya Shishkov
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
@popcornmix
Member Author

Rebased.
jenkins build this please
(and don't mess it up this time...)

@Memphiz
Member

Memphiz commented Aug 2, 2013

@sraue - shouldn't be an issue on jenkins though as we git clean -xfd the tree before building (in some cases without depends though ...)

@davilla
Contributor

davilla commented Aug 2, 2013

wow, gas-preprocessor.pl got bumped :)

@popcornmix
Member Author

Woo-hoo!
A pass from Jenkins.

Do you want the gas-preprocessor.pl bump in a separate PR, or is it okay here?
Can anyone run this on ATV2?

@Memphiz
Member

Memphiz commented Aug 2, 2013

It is ok in here imo. I pulled the debs for ios and atv2 from the buildslave and will upload it on my dropbox in a min for willing testers ... (not sure if i myself have time to try it).

@davilla do you think this one has potential to panic the atv2 kernel again? ^^

@davilla
Contributor

davilla commented Aug 2, 2013

@Memphiz, until it gets tested, I worry about any asm changes in ffmpeg. it's just so picky.

@Memphiz
Member

Memphiz commented Aug 3, 2013

testing ...

@Memphiz
Member

Memphiz commented Aug 3, 2013

This is good to go for atv2 and ios. I've tested DTS 2 stereo decode and it worked flawlessly. Though DTS passthrough shouldn't be affected by this, I confirmed it is still working too.

popcornmix added a commit that referenced this pull request Aug 3, 2013
[ffmpeg] Backport of armv6 and vfp optimisations for DTS
@popcornmix popcornmix merged commit eccd7ba into xbmc:master Aug 3, 2013
@popcornmix
Member Author

Thanks for testing Memphiz.

@popcornmix popcornmix deleted the ffmpeg_dts_up branch August 3, 2013 17:30