
[pull] master from FFmpeg:master #783

Merged
pull[bot] merged 14 commits into ksti:master from FFmpeg:master
May 15, 2020

@pull pull bot commented May 15, 2020

See Commits and Changes for more details.


Created by pull[bot]. Want to support this open source service? Please star it : )

mstorsjo and others added 14 commits May 15, 2020 21:22
On Windows and Darwin (and modern Android), the x18 register is reserved
and shouldn't be modified by user code, while it is freely available on
Linux. Strictly avoid it, to keep the assembly code portable.

This would have helped catch the issue fixed in 872790b
immediately.

Signed-off-by: Martin Storsjö <martin@martin.st>
This makes it easier to share code with e.g. the dav1d implementation
of checkasm.

Signed-off-by: Martin Storsjö <martin@martin.st>
We should just use a normal bl here, and the linker will add the 'x'
bit if necessary.

This fixes calling checkasm_fail_func on Windows, where the
code is built in Thumb mode (and the linker doesn't clear the 'x'
bit in the blx instruction).

Signed-off-by: Martin Storsjö <martin@martin.st>
Figure out the number of stack parameters and make sure that the
value on the stack after those is untouched.

Signed-off-by: Martin Storsjö <martin@martin.st>
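
The stack check described above can be illustrated with a plain C analogy. This is only a sketch of the idea, not the checkasm assembly: the names callee_fn, guard_intact_after_call, MAX_STACK_ARGS and CANARY are all hypothetical, and an ordinary array stands in for the real stack. A guard value is placed directly after the last stack parameter, and the wrapper reports whether it survived the call.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical limits and guard value, chosen for illustration only. */
#define MAX_STACK_ARGS 16
#define CANARY UINT64_C(0xdeadbeefdeadbeef)

/* Hypothetical callee type: receives the simulated stack argument area. */
typedef void (*callee_fn)(uint64_t *stack_args, size_t n_args);

/* Well-behaved callee: only touches its own n_args slots. */
static void callee_ok(uint64_t *stack_args, size_t n_args)
{
    for (size_t i = 0; i < n_args; i++)
        stack_args[i] += 1;
}

/* Misbehaving callee: writes one slot past its arguments. */
static void callee_overrun(uint64_t *stack_args, size_t n_args)
{
    stack_args[n_args] = 0;
}

/* Returns 1 if the guard after the arguments is untouched, 0 if it was
 * clobbered, and -1 if n_args exceeds the simulated area. */
static int guard_intact_after_call(callee_fn fn, size_t n_args)
{
    uint64_t area[MAX_STACK_ARGS + 1] = { 0 };

    if (n_args > MAX_STACK_ARGS)
        return -1;
    area[n_args] = CANARY;   /* guard slot right after the last argument */
    fn(area, n_args);
    return area[n_args] == CANARY;
}
```

Calling guard_intact_after_call(callee_ok, 3) reports the guard as intact, while callee_overrun is flagged because it writes one slot too far.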
Also fill x8-x17 with garbage before calling the function.

Figure out the number of stack parameters and make sure that the
value on the stack after those is untouched.

Signed-off-by: Martin Storsjö <martin@martin.st>
Signed-off-by: Dale Curtis <dalecurtis@chromium.org>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Many places are using their own custom code for handling overflow
around timestamps or other int64_t values. There are enough of these
now that having some common saturated math functions seems sound.

Signed-off-by: Dale Curtis <dalecurtis@chromium.org>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
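
As a rough illustration of what saturating int64_t arithmetic looks like, here is a minimal sketch. The names sat_add64 and sat_sub64 are hypothetical and this is not necessarily how the commit implements it; the point is only that the result clamps to INT64_MIN or INT64_MAX instead of overflowing.

```c
#include <stdint.h>

/* Hypothetical helper: a + b, clamped to the int64_t range.
 * The bounds are checked before adding, so no signed overflow occurs. */
static int64_t sat_add64(int64_t a, int64_t b)
{
    if (b > 0 && a > INT64_MAX - b)
        return INT64_MAX;
    if (b < 0 && a < INT64_MIN - b)
        return INT64_MIN;
    return a + b;
}

/* Hypothetical helper: a - b, clamped to the int64_t range. */
static int64_t sat_sub64(int64_t a, int64_t b)
{
    if (b < 0 && a > INT64_MAX + b)
        return INT64_MAX;
    if (b > 0 && a < INT64_MIN + b)
        return INT64_MIN;
    return a - b;
}
```

A timestamp computation such as pts + offset could then use sat_add64(pts, offset) and get a clamped value at the extremes rather than undefined behaviour.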
If a file has multiple Metadata Keys, parsing the keys a second time
re-allocates c->meta_keys without freeing the old allocation, leaking it.
This change skips parsing of any subsequent Metadata Keys.

Reviewed-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
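
The shape of this kind of fix can be sketched as follows. This is a simplified, hypothetical model and not the demuxer code: the DemuxCtx struct and parse_keys function are invented for illustration, and only the meta_keys field name comes from the commit message. The second parse is skipped when keys are already present, so the first allocation is never overwritten.

```c
#include <stdlib.h>

/* Hypothetical, simplified stand-in for the demuxer context. */
typedef struct DemuxCtx {
    char   **meta_keys;
    unsigned meta_keys_count;
} DemuxCtx;

/* Returns 0 on success or when parsing is skipped, -1 on failure. */
static int parse_keys(DemuxCtx *c, unsigned count)
{
    if (c->meta_keys)   /* keys already parsed: skip instead of leaking */
        return 0;
    if (count == 0)
        return -1;

    c->meta_keys = calloc(count, sizeof(*c->meta_keys));
    if (!c->meta_keys)
        return -1;
    c->meta_keys_count = count;
    return 0;
}
```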
The NEON hscale function only supports filter sizes that are a multiple
of 8 (X8) and should only be selected when such sizes are being used. At
the moment filterAlign is set to 8, but if NEON assembly for other
specific sizes is added in the future, those sizes will need checks here
too.

The immediate use case for this change is making the hscale checkasm
test simpler, without NEON-specific edge cases (x86 already has these
guards).
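
A guard of this kind might look like the sketch below. The function name neon_hscale_usable is hypothetical, and reading "X8" as "a multiple of 8" is an assumption based on the filterAlign value mentioned above; the actual selection code may check more conditions.

```c
#include <stdbool.h>

/* Hypothetical predicate: true only for positive filter sizes that are
 * a multiple of 8, the sizes the NEON routine is assumed to handle. */
static bool neon_hscale_usable(int filter_size)
{
    return filter_size > 0 && filter_size % 8 == 0;
}
```

An init routine would fall back to the C implementation whenever this predicate returns false.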

This applies the same fix from 718c8f9
to the 32-bit ARM version of the function, fixing fate-checkasm-sw_scale
there.

Signed-off-by: Martin Storsjö <martin@martin.st>
…xels_unaligned

Signed-off-by: Martin Storsjö <martin@martin.st>
                         Cortex A7     A8     A9    A53   A72
get_pixels_c:                144.7  146.0  143.0  137.7   69.0
get_pixels_armv6:            112.0  106.7   90.2   95.0   72.5
get_pixels_neon:              69.0   29.7   68.7   40.2   19.0
get_pixels_unaligned_c:      144.7  146.2  143.0  137.7   69.0
get_pixels_unaligned_neon:    77.0   36.5   72.5   48.5   19.0
diff_pixels_c:               376.7  319.7  265.5  307.7  148.0
diff_pixels_armv6:           179.0  159.5  205.5  139.0  142.0
diff_pixels_neon:             69.0   40.2   77.5   53.2   26.0
diff_pixels_unaligned_c:     376.7  319.7  265.5  307.7  148.0
diff_pixels_unaligned_neon:   85.0   54.5   93.5   66.7   26.0

Signed-off-by: Martin Storsjö <martin@martin.st>
                        Cortex A53    A72    A73
get_pixels_c:                140.7   87.7   72.5
get_pixels_neon:              46.0   20.0   19.5
get_pixels_unaligned_c:      140.7   87.7   73.0
get_pixels_unaligned_neon:    49.2   20.2   26.2
diff_pixels_c:               209.7  133.7  138.7
diff_pixels_neon:             54.2   31.7   23.5
diff_pixels_unaligned_c:     209.7  134.2  139.0
diff_pixels_unaligned_neon:   68.0   27.7   41.7

Signed-off-by: Martin Storsjö <martin@martin.st>
Signed-off-by: Martin Storsjö <martin@martin.st>
This allows speeding up format conversions from yuv420 to nv12.

                             Cortex A53      A72      A73
interleave_bytes_c:             86077.5  51433.0  66972.0
interleave_bytes_neon:          19701.7  23019.2  15859.2
interleave_bytes_aligned_c:     86603.0  52017.2  67484.2
interleave_bytes_aligned_neon:   9061.0   7623.0   6309.0

Signed-off-by: Martin Storsjö <martin@martin.st>
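
For context, the operation being accelerated can be sketched in plain C. In the yuv420 to nv12 conversion, the separate U and V planes are merged into one plane of alternating U and V bytes. The function name and signature below are hypothetical and cover a single row only; they are not the library's actual interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical single-row reference: writes u[0], v[0], u[1], v[1], ...
 * into uv, which must have room for 2 * width bytes. */
static void interleave_bytes_ref(const uint8_t *u, const uint8_t *v,
                                 uint8_t *uv, size_t width)
{
    assert(u && v && uv);
    for (size_t i = 0; i < width; i++) {
        uv[2 * i]     = u[i];   /* even bytes come from the U plane */
        uv[2 * i + 1] = v[i];   /* odd bytes come from the V plane */
    }
}
```

A full-plane version would call this once per chroma row, advancing each pointer by its own stride.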
@pull pull bot added the ⤵️ pull label May 15, 2020
@pull pull bot merged commit e0604d5 into ksti:master May 15, 2020