Standalone: New kernels #116

Merged
merged 1 commit into from
Dec 13, 2017


@DavidMansell DavidMansell commented Dec 12, 2017

This pull request adds three new kernels to the standalone benchmark:

Float32 kernel optimized for Cortex-A55r1.
uint8->uint32 kernel using Armv8.2 dot product extension with generic optimizations.
uint8->uint32 kernel using Armv8.2 dot product extension, optimized for Cortex-A55r1.
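For readers unfamiliar with the dot product extension, here is a minimal scalar sketch of what a single 128-bit vector UDOT computes: each of the four 32-bit accumulator lanes adds the dot product of four uint8 pairs. The function name `UdotReference` and the use of `std::array` are illustrative assumptions; this is not code from the pull request, and the real kernels use the hardware instruction rather than loops.

```cpp
#include <array>
#include <cstdint>

// Scalar reference model of one 128-bit vector UDOT (illustrative only).
// For each 32-bit lane, accumulate the dot product of the four uint8
// elements of `a` and `b` that belong to that lane. Arithmetic is modulo 2^32.
std::array<std::uint32_t, 4> UdotReference(
    std::array<std::uint32_t, 4> acc,
    const std::array<std::uint8_t, 16>& a,
    const std::array<std::uint8_t, 16>& b) {
  for (int lane = 0; lane < 4; ++lane) {
    for (int i = 0; i < 4; ++i) {
      const int idx = 4 * lane + i;
      // Widen to 32 bits before multiplying, as the instruction does.
      acc[lane] += static_cast<std::uint32_t>(a[idx]) *
                   static_cast<std::uint32_t>(b[idx]);
    }
  }
  return acc;
}
```

The widening from uint8 inputs to uint32 accumulators is what the "uint8->uint32" label on the two kernels above refers to.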

@bjacob bjacob merged commit fcf32e7 into google:master Dec 13, 2017
bjacob added a commit that referenced this pull request Oct 16, 2018
instructions (UDOT) available on newer CPUs such as Cortex-A76.

This kernel is not tuned for any one CPU, though it performs well on
Cortex-A76.  ARM also contributed a kernel optimized specifically for
Cortex-A55r1; that one would have to be imported separately.

Context: for ARM's contributions, see
#116

Note that the new kernel is not automatically enabled when the
instructions are present: the user must also define a preprocessor
token, GEMMLOWP_DOTPROD_KERNEL, to opt in to using that kernel.
Rationale: avoid worsening the existing ODR-violation situation by
adding more inline symbols whose definitions differ depending on
preprocessor tokens. Conveniently, this also leaves room for multiple
such kernels to coexist in the future, behind separate opt-in tokens.