Fix FMA4 detection #262

Merged
merged 1 commit into from May 16, 2019

@colesbury
Contributor

commented May 16, 2019

FMA4 support is reported in bit 16 of register ECX, not EDX, of the
"extended processor info" CPUID leaf (0x80000001).

The mapping of registers to reg is:

reg[0] = eax
reg[1] = ebx
reg[2] = ecx <---
reg[3] = edx

Bit 16 of EDX is PAT (Page Attribute Table) on AMD CPUs, which is widely
supported. Intel CPUs do not set this bit. Because the old check read EDX,
the PAT bit was mistaken for FMA4, which causes "Illegal instruction"
errors on AMD CPUs that do not support FMA4.
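
To illustrate the difference, here is a minimal sketch of the corrected check next to the old one. It is not the project's actual code: the helper names `has_fma4` and `has_fma4_buggy`, the `REG_*` indices, and the `FMA4_BIT` constant are hypothetical, and `reg` is assumed to already hold the four registers returned by CPUID leaf 0x80000001 in the order listed above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Indices into the 4-element CPUID result array (hypothetical names),
   following the mapping reg[0]=eax, reg[1]=ebx, reg[2]=ecx, reg[3]=edx. */
enum { REG_EAX = 0, REG_EBX = 1, REG_ECX = 2, REG_EDX = 3 };

/* Bit position of the FMA4 flag in ECX for leaf 0x80000001. */
#define FMA4_BIT 16

/* Corrected check: test bit 16 of ECX.
   Assumes reg[] holds the result of CPUID leaf 0x80000001. */
static bool has_fma4(const int32_t reg[4]) {
    return (((uint32_t)reg[REG_ECX] >> FMA4_BIT) & 1u) != 0;
}

/* Old behaviour, kept only for comparison: tests bit 16 of EDX,
   which is the PAT flag on AMD CPUs, not FMA4. */
static bool has_fma4_buggy(const int32_t reg[4]) {
    return (((uint32_t)reg[REG_EDX] >> FMA4_BIT) & 1u) != 0;
}
```

With a register set that has only the PAT bit set in EDX (an AMD CPU without FMA4), the old check reports FMA4 as available while the corrected one does not.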

See pytorch/pytorch#12112
See #261

http://developer.amd.com/wordpress/media/2012/10/254811.pdf (Page 20)

Fixes #261

@colesbury colesbury referenced this pull request May 16, 2019
@shibatch shibatch merged commit 939f753 into shibatch:master May 16, 2019
3 checks passed
continuous-integration/appveyor/pr AppVeyor build succeeded
continuous-integration/jenkins/pr-merge This commit looks good
continuous-integration/travis-ci/pr The Travis CI build passed