Added support for AMD's Zen3 architecture #561

Merged
merged 15 commits into from Nov 17, 2021

dzambare
Contributor

Added support for the Zen3 configuration.

-- Added a new zen3 configuration and auto-detection of the zen3 architecture
-- Added the amd64 configuration family for all zen architectures
-- Moved older AMD architectures into the amd64_legacy family
-- Updated zen2 makefiles to pick znver2 for clang (if supported by the detected clang version)


AMD BLIS Upstream:

This PR includes the following commits for AMD BLIS version 3.0.1:

pick 9c7814d Added support for zen3 configuration
squash 536edc4 Added support for zen3 configuration
squash 23a2073 Added support for zen3 configuration
squash 449ee37 Added support for zen3 configuration
pick 25d23cd Zen3 support, disabled IR, JR loop parallelization
squash 9d7978e Zen3 support, disabled IR, JR loop parallelization
pick f8ab9f6 Enabled znver3 flag for zen3 architecture
squash 38a8008 Enabled znver3 flag for zen3 architecture
pick b1144a8 Added -fomit-frame-pointer option to CKOPTFLAGS.
pick ce99b1e Added dynamic block size selection logic for DGEMM.
squash f9d06c7 Added dynamic block size selection logic for DGEMM.
pick 14e2160 Update amd64 bundle configuration
squash c2f63fc Update amd64 bundle configuration

dzambare and others added 5 commits October 14, 2021 13:11
    - User can now specify zen3 configuration,
      currently it reuses block sizes and kernels from zen2.
    - Auto configuration can detect and enable if zen3 config is needed
    - Added support for amd64 bundle which contains all zen platforms
    - Moved existing amd bundle to amd64 legacy.

AMD-Internal: [CPUPL-500, CPUPL-1013]
AMD-Internal: [CPUPL-1013]

Change-Id: I859152d63d1a56519c508dfa19587f25123e08b4
znver3 flag will be enabled if compiler is AOCC Clang version 3.0
and configuration is zen3

Change-Id: Ie164f4d469bf3f8df31ccf8fed9f80dfc62efb39
AMD-Internal: [CPUPL-1353]
Block sizes (MC, KC, NC) for DGEMM are determined at runtime
based on the following parameters:

    - Single or multithreaded build
    - Processor architecture (currently only zen3 is supported)
    - Number of threads requested while running the library

Change-Id: Ia793484b77adb87486e630d0d3b4c7856ae52094
The configuration is updated to

   - Enable EPYC architecture optimizations
   - Removed macros to override block sizes as there
     was no performance gain using them

AMD-Internal : [CPUPL-1350]
config_registry (outdated)
# Ignore everything in this directory
*
# Except this file
!.gitignore
Member

If you have local files in this directory which you're not ready to push, I recommend ignoring them in .git/info/exclude which is purely local.
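
As a minimal sketch of that suggestion: git reads `.git/info/exclude` like a `.gitignore`, but the file is never committed. The helper name `exclude_locally` and its arguments are hypothetical and are not part of BLIS or git.

```sh
#!/bin/sh
# Hypothetical helper: ignore a pattern in this clone only.
# $1 = path to the repository root, $2 = ignore pattern
exclude_locally() {
    exclude_file="$1/.git/info/exclude"
    # Create the info directory if this clone lacks one.
    mkdir -p "$1/.git/info"
    # Append the pattern only if it is not already listed verbatim.
    if ! grep -qxF -e "$2" "$exclude_file" 2>/dev/null; then
        echo "$2" >> "$exclude_file"
    fi
}
```

Calling it twice with the same pattern leaves a single entry, so it is safe to rerun.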

Contributor Author

Currently we don't have any zen3-specific kernels, but we do need an empty folder to satisfy the build scripts.

configure (outdated)
endif
endif
# These settings should come from the makefiles for the individual
# configurations included in this bundle.
Member

Doesn't there still need to be some setting here for compiling framework code (like COPTFLAGS)?

@fgvanzee
Member

fgvanzee commented Oct 14, 2021

  1. We need to figure out how to get Travis CI to run on this PR. (It did not, and I have no idea why.)
  2. Not sure how, but AppVeyor (which did run) failed on one of its tests (LIB_TYPE=shared, CONFIG=x86_64, CC=clang, THREADING=pthreads):
configure: manual configuration requested; configuring with 'x86_64'.
configure: checking configuration against contents of 'config_registry'.
configure: configuration 'x86_64' is registered.
configure: 'x86_64' is defined as having the following sub-configurations:
configure:    skx knl haswell sandybridge penryn generic zen3 zen2 zen generic_legacy
configure: which collectively require the following kernels:
configure:    skx knl sandybridge penryn zen3 zen2 haswell zen piledriver bulldozer generic
configure: checking sub-configurations:
configure:   'skx' is registered...and exists.
configure:   'knl' is registered...and exists.
configure:   'haswell' is registered...and exists.
configure:   'sandybridge' is registered...and exists.
configure:   'penryn' is registered...and exists.
configure:   'generic' is registered...and exists.
configure:   'zen3' is registered...and exists.
configure:   'zen2' is registered...and exists.
configure:   'zen' is registered...and exists.
configure: 'generic_legacy' is NOT registered!
configure: 
configure: *** Cannot continue with unregistered configuration 'generic_legacy'. ***
configure: 

As far as I can tell, there is no generic_legacy subconfig in the PR branch. 🤔

@dzambare
Contributor Author

@fgvanzee, generic_legacy is failing because there is an issue with the amd64_legacy configuration. Interestingly, amd64_legacy builds fine on its own but fails via x86_64.

@devinamatthews, thanks for the review, will address your comments.

@dzambare
Contributor Author

dzambare commented Oct 14, 2021

> @fgvanzee, generic_legacy is failing because there is an issue with the amd64_legacy configuration. Interestingly, amd64_legacy builds fine on its own but fails via x86_64.

Seems like there is some name conflict: if I rename amd64_legacy to amd_legacy it works, but amd64legacy doesn't. Do we ignore characters after a certain length in the config name?

@fgvanzee
Member

> @fgvanzee, generic_legacy is failing because there is an issue with the amd64_legacy configuration. Interestingly, amd64_legacy builds fine on its own but fails via x86_64.
>
> Seems like there is some name conflict: if I rename amd64_legacy to amd_legacy it works, but amd64legacy doesn't. Do we ignore characters after a certain length in the config name?

No. But maybe something is going awry with a function that searches for the presence of a string in a list of strings.

@fgvanzee
Member

fgvanzee commented Oct 14, 2021

I tried to reproduce this by running ./configure on the PR branch locally on my workstation -- no luck. The problem didn't manifest. 😕

I was able to reproduce it after all. (I had forgotten to check out the zen3_support branch after cloning.)

@fgvanzee
Member

I've found the source of the amd64_legacy / generic_legacy bug. It is indeed the result of a poorly-formed sed substitution. I'll try to commit a fix soon.

Details:
- Fixed a bug in configure related to the building of the so-called
  config list. When processing the contents of config_registry,
  configure creates a series of structures and lists that allow for
  various mappings related to configuration families, subconfigs,
  and kernel sets. Two of those lists are built via substitution
  of umbrella families with their subconfig members, and one of
  those lists was improperly performing the substitution in a way
  that would erroneously match on partial umbrella family names.
  That code was changed to match the code that was already doing
  the substitution properly, via substitute_words().
- Added comments noting the importance of using substitute_words()
  in both instances.
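
The failure mode described above can be reproduced with a small sketch. The two functions below are hypothetical illustrations, not the code in configure, and `expand_words` is not BLIS's `substitute_words()`. The family membership used here ("zen3 zen2 zen generic" for amd64) is assumed for illustration, based on the configure output earlier in this thread.

```sh
#!/bin/sh
# Naive expansion: a plain sed replacement also matches inside longer
# names, so "amd64_legacy" is rewritten as well.
# $1 = word list, $2 = family name, $3 = family members
expand_naive() {
    echo "$1" | sed "s/$2/$3/g"
}

# Whole-word expansion: each word is compared exactly before it is
# replaced, so "amd64_legacy" is left alone when expanding "amd64".
expand_words() {
    out=""
    for word in $1; do
        if [ "$word" = "$2" ]; then
            out="$out $3"
        else
            out="$out $word"
        fi
    done
    # Trim the leading space added by the first iteration.
    echo "${out# }"
}
```

With the list `intel64 amd64 amd64_legacy`, the naive version turns `amd64_legacy` into `zen3 zen2 zen generic_legacy`, which is where a bogus `generic_legacy` name can come from. The whole-word version keeps `amd64_legacy` intact.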
@fgvanzee
Member

Travis is building this PR now, but we have a new compilation error:

Compiling obj/x86_64/ref_kernels/zen3/bli_cntx_zen3_ref.o ('zen3' CFLAGS for ref. kernel init)
Compiling obj/x86_64/ref_kernels/zen3/1/bli_addv_zen3_ref.o ('zen3' CFLAGS for ref. kernels)
cc1: error: bad value (‘znver2’) for ‘-march=’ switch
compilation terminated due to -Wfatal-errors.
make: *** [Makefile:597: obj/x86_64/ref_kernels/zen3/1/bli_addv_zen3_ref.o] Error 1
make: *** Waiting for unfinished jobs....
The command "make -j 2" exited with 2.
$ make install
Compiling obj/x86_64/ref_kernels/zen3/1/bli_addv_zen3_ref.o ('zen3' CFLAGS for ref. kernels)
cc1: error: bad value (‘znver2’) for ‘-march=’ switch
compilation terminated due to -Wfatal-errors.
make: *** [Makefile:597: obj/x86_64/ref_kernels/zen3/1/bli_addv_zen3_ref.o] Error 1

Maybe gcc 8 isn't new enough for the -march=znver2 flag?

@fgvanzee
Member

fgvanzee commented Oct 14, 2021

Confirmed. gcc 8 doesn't know about znver2. We need to bump our travis.yml file up to gcc 9, at minimum.

Details:
- .travis.yml was previously using gcc 8, which did not support
  -march=znver2. The file has been updated to use gcc 9 everywhere
  gcc 8 was previously being used.
@devinamatthews
Member

Uh, isn't the fix the check for a high enough version in the make defs? No need to change the Travis CI setup.

@fgvanzee
Member

> Uh, isn't the fix the check for a high enough version in the make defs? No need to change the Travis CI setup.

Hmm, yeah, good point. We already have a GCC_OT_9_1_0 variable, but for some reason the zen3 subconfig doesn't use it. That file is also kind of a mess. I think I should rewrite it, and in the process implement the fix as you suggest.
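
A variable such as `GCC_OT_9_1_0` records whether the detected gcc is older than a threshold version. A rough sketch of how such a yes/no value could be computed in shell is shown here. The function names are hypothetical, each version component is assumed to be below 100, and this is not the logic that configure actually uses.

```sh
#!/bin/sh
# Convert "major.minor.patch" to a single integer, e.g. 9.1.0 -> 90100.
# Missing components count as 0 (assumption: each component is < 100).
version_to_int() {
    echo "$1" | awk -F. '{ printf "%d\n", $1 * 10000 + $2 * 100 + $3 }'
}

# Print "yes" if version $1 is strictly older than version $2, else "no".
version_older_than() {
    if [ "$(version_to_int "$1")" -lt "$(version_to_int "$2")" ]; then
        echo yes
    else
        echo no
    fi
}
```

For example, `version_older_than 8.4.0 9.1.0` reports that gcc 8.4.0 is older than the threshold, so a makefile fragment consuming that value would avoid `-march=znver2`.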

@fgvanzee
Member

@devinamatthews The fix should have worked, though. Instead, the x86_64 build is still failing, but in a way that I don't understand. Can you take a look?

@devinamatthews
Member

Could be a transient network problem. I restarted it.

@fgvanzee
Member

I had already restarted dev on a similar error. It is also still failing. 😞

@devinamatthews
Member

Try downloading SDE from that link by hand. Maybe it has changed.

@fgvanzee
Member

This link?

@devinamatthews
Member

Whichever one is in do_sde.sh.

@fgvanzee
Member

At this point I don't know what else to do other than disable the SDE testing altogether.

Related: why don't we nix the testing of piledriver, steamroller, and excavator? I think this would save time, yeah?

@devinamatthews
Member

Not that much time, but sure.

@devinamatthews
Member

Report it to Travis support. Something odd with the network because it works fine on my laptop.

# gcc or clang version must be at least 4.0
# gcc 9.0 or later:
ifeq ($(shell test $(GCC_VERSION) -ge 9; echo $$?),0)
CKVECFLAGS += -march=znver2
Contributor

Note that GCC 10.3 and newer support -march=znver3 but it's not selected.
Is that on purpose or an oversight?
https://gcc.gnu.org/gcc-10/changes.html

Member

Good eye, @bartoldeman. I had noticed this, too, and have fixed it in my (as yet unpushed) pass of edits.

@fgvanzee
Member

fgvanzee commented Nov 2, 2021

One thing that will help me get started on the CLANG_OT_* variables, @dzambare, is if you could enumerate all of the important version thresholds needed by the zen2 and zen3 subconfigs. (If you want to include zen too, you can.)

Another question: can we distinguish the important AOCC versions through its clang versions (since AOCC is based on clang), or do you think we will need a separate set of AOCC_OT_* variables to properly distinguish between important groupings of versions of AOCC?

@devinamatthews
Member

Why not just check the compiler for support for various flags in configure? (or heck, you could do it in the Makefile if you wanted)

@fgvanzee
Member

fgvanzee commented Nov 2, 2021

> Why not just check the compiler for support for various flags in configure? (or heck, you could do it in the Makefile if you wanted)

I'm not sure how this is incompatible with what I've proposed.

@devinamatthews
Member

devinamatthews commented Nov 2, 2021

For example:

has_znver3=no
if echo | $CC -march=znver3 -x c -c -o /dev/null -; then
    has_znver3=yes
fi

Then sub @HAS_ZNVER3@ or something in config.mk.

@devinamatthews
Member

And of course you want to add 2> /dev/null to catch the potential error message.
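
Putting the two comments above together, a reusable probe might look like the following sketch. The function name `cc_supports_flag` is hypothetical, and the approach assumes a gcc- or clang-style driver that accepts `-x c`, `-c`, `-o`, and `-` for standard input.

```sh
#!/bin/sh
# Print "yes" if the compiler accepts the flag, "no" otherwise.
# $1 = compiler command, $2 = flag to probe (e.g. -march=znver3)
cc_supports_flag() {
    # Compile a trivial program from stdin and discard both the object
    # file and any diagnostics; only the exit status matters.
    if echo 'int main(void) { return 0; }' \
        | "$1" "$2" -x c -c -o /dev/null - >/dev/null 2>&1; then
        echo yes
    else
        echo no
    fi
}
```

A configure script could store the result, for instance `has_znver3=$(cc_supports_flag "$CC" -march=znver3)`, and substitute it into config.mk as suggested above.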

@devinamatthews
Member

The conflict is that then you would not need any version flags at all.

@fgvanzee
Member

fgvanzee commented Nov 2, 2021

> The conflict is that then you would not need any version flags at all.

You wouldn't need any version variables, true, but this solution still creates new make variables that need to be communicated to the makefile fragment. You're just changing what information is communicated.

I like your solution in principle since it narrowly identifies the key piece of information, which is: "Does the compiler support -march=znver3?" But I would want to switch to this system of tracking options (rather than versions) uniformly for all subconfigs. It's also not clear how that would work in the context of the older versions of Intel flags (-march=core-avx2). Do newer compilers still support those older flags? Or maybe we don't care if it is supposed to work either way.

I could also imagine someone arguing that versions are still a bit more useful because in theory you could have a version of a compiler that thinks it supports a compiler flag, but that support is actually broken in practice.

@dzambare
Contributor Author

dzambare commented Nov 5, 2021

> One thing that will help me get started on the CLANG_OT_* variables, @dzambare, is if you could enumerate all of the important version thresholds needed by the zen2 and zen3 subconfigs. (If you want to include zen too, you can.)
>
> Another question: can we distinguish the important AOCC versions through its clang versions (since AOCC is based on clang), or do you think we will need a separate set of AOCC_OT_* variables to properly distinguish between important groupings of versions of AOCC?

For AOCC (AMD clang) we have the following thresholds:

AOCC > 3.0.0 supports znver3.
AOCC > 2.0.0 supports znver2.
AOCC < 2.0.0 supports znver1.

For clang I have limited information:

clang > 9.0.0 supports znver2
clang < 9.0.0 supports znver1

@dzambare
Contributor Author

dzambare commented Nov 5, 2021

> > The conflict is that then you would not need any version flags at all.
>
> You wouldn't need any version variables, true, but this solution still creates new make variables that need to be communicated to the makefile fragment. You're just changing what information is communicated.
>
> I like your solution in principle since it narrowly identifies the key piece of information, which is: "Does the compiler support -march=znver3?" But I would want to switch to this system of tracking options (rather than versions) uniformly for all subconfigs. It's also not clear how that would work in the context of the older versions of Intel flags (-march=core-avx2). Do newer compilers still support those older flags? Or maybe we don't care if it is supposed to work either way.
>
> I could also imagine someone arguing that versions are still a bit more useful because in theory you could have a version of a compiler that thinks it supports a compiler flag, but that support is actually broken in practice.

I like this idea too, especially when we are dealing with vanilla clang and AMD clang, since it will avoid many redundant checks/variables.

@devinamatthews
Member

Clang 12+ supports the -march=znver3 flag.
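
The thresholds reported in this thread can be collected into one lookup, sketched below. The function name `zen_march_flag` is hypothetical, only the major version is inspected, and treating each threshold as inclusive (for example AOCC 3.0 selecting znver3) is an assumption rather than something the thread states exactly.

```sh
#!/bin/sh
# Pick a -march flag from the compiler family and major version.
# $1 = compiler family (aocc or clang), $2 = major version number
zen_march_flag() {
    case "$1" in
        aocc)  v3=3;  v2=2 ;;   # AOCC thresholds as reported above
        clang) v3=12; v2=9 ;;   # clang thresholds as reported above
        *) echo "unknown compiler family: $1" >&2; return 1 ;;
    esac
    if [ "$2" -ge "$v3" ]; then
        echo "-march=znver3"
    elif [ "$2" -ge "$v2" ]; then
        echo "-march=znver2"
    else
        echo "-march=znver1"
    fi
}
```

A table like this shows why probing the compiler for flag support is attractive: every new compiler family adds another row of version thresholds to maintain.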

Details:
- Restructured clang and AOCC support for zen2/zen3 make_defs.mk files.
  The clang and AOCC version detection now happens in configure, not in
  the subconfigs' makefile fragments. That is, we've added logic to
  configure that detects the version of clang/AOCC, outputs an
  appropriate variable to config.mk (ie: CLANG_OT_*, AOCC_OT_*), and
  then checks for it within the makefile fragment (as is currently done
  for the GCC_OT_* variables).
- Added configure support for a GCC_OT_10_1_0 variable (and associated
  substitution anchor) to communicate whether the gcc version is older
  than 10.1.0, and use this variable to check for recent enough versions
  of gcc to use -march=znver3.
- Adjusted zen3 MC blocksizes for c and z datatypes to match that of
  zen2.
- Disabled gemmsup for c and z datatypes in the zen3 subconfig by
  setting those sup thresholds to -1 given that we do not have any
  confirmed/tested sup kernels for the complex domain.
- Inlined the contents of config/zen/amd_config.mk into the zen2
  make_defs.mk so that the file is self-contained and immune to changes
  in other subconfigs.
- Added indenting (with spaces) of GNU make conditionals for easier
  reading in zen2 and zen3 make_defs.mk files.
- Comment updates.
- Whitespace changes.
@fgvanzee
Member

fgvanzee commented Nov 8, 2021

@dzambare Please take a look at 7872c3a -- in particular, the compiler detection logic in configure (starting around line 1438), and the make_defs.mk files for the zen2 and zen3 subconfigs. If you have time, it would also be good to perform some quick tests within your own development environment for zen3 with both clang and AOCC to make sure that configure performs as intended.

I also wanted to add CPUID files for Zen2 and Zen3 in travis/cpuid, but I'm not confident that I would create valid files. So, for now, we are only able to test Zen1 via the SDE in the Travis CI builds. I might be able to return to this later someday, but if you wanted to take a stab at this you are welcome to.

Details:
- Restructured clang and AOCC support in config/zen/make_defs.mk to
  bring that file into alignment with recent changes to the make_defs.mk
  of zen2 and zen3 subconfigs.
- Added missing options (-mllvm -disable-licm-vrp) to the znver1
  conditional case of aocc handling in zen2/make_defs.mk. These options
  were present in the amd_config.mk fragment that was being included
  in the previous version of zen2/make_defs.mk, but were accidentally
  omitted from the newer version introduced recently in 7872c3a.
@fgvanzee
Member

@dzambare I made a minor change of note in b641cf7: I added -mllvm -disable-licm-vrp back to the list of compiler options for Zen1 with AOCC in the zen2 subconfig. These options were previously included via amd_config.mk, but I forgot to include them in my rewrite.

I also rewrote the zen/make_defs.mk file to match that of the zen2 and zen3 fragments.

@dzambare
Contributor Author

dzambare commented Nov 11, 2021 via email

@fgvanzee
Member

@dzambare Thanks Dipal. Devin provided me guidance on the CPUID .def files, so no need to investigate that yet. I might be able to take care of it myself, but I may wait and do it outside of this PR.

Details:
- Adjusted the range of models checked by bli_cpuid_is_zen() (which was
  previously 0x00 ~ 0xff and is now 0x00 ~ 0x2f) so that it is
  completely disjoint from the models checked by bli_cpuid_is_zen2()
  (0x30 ~ 0xff). This is normally necessary because Zen and Zen2
  microarchitectures share the same family (23, or 0x17), and so the
  model code is the only way to differentiate the two. But in our case,
  fixing the model range for zen *wasn't* actually necessary since we
  checked for zen2 first, and therefore the wide zen range acted like
  the 'else' of an 'if-else' statement. That said, the change helps
  improve clarity for the reader by encoding useful knowledge, which
  was obtained from https://en.wikichip.org/wiki/amd/cpuid .
- Added a zen2.def file to the collection in travis/cpuid. Thanks to
  Devin Matthews for his guidance in hacking this file as a slight
  modification of zen.def.
- Enabled testing of zen2 via the SDE in travis/do_sde.sh.
- Comment updates to bli_cpuid.c.
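
The disjoint model ranges described above can be illustrated with a small classifier for family 0x17. This is a sketch in shell with a hypothetical function name, not the C code in bli_cpuid.c; it checks zen2 first, mirroring the order mentioned in the commit message.

```sh
#!/bin/sh
# Classify a family-0x17 CPU by its model number.
# $1 = model, given in decimal or as 0x-prefixed hex
classify_family_17h() {
    model=$(( $1 ))
    if [ "$model" -ge $((0x30)) ] && [ "$model" -le $((0xff)) ]; then
        echo zen2      # models 0x30 ~ 0xff
    elif [ "$model" -ge 0 ] && [ "$model" -le $((0x2f)) ]; then
        echo zen       # models 0x00 ~ 0x2f
    else
        echo unknown
    fi
}
```

Because the two ranges no longer overlap, the result does not depend on which branch is tested first.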
Details:
- Added a zen3.def file to the collection in travis/cpuid. Note that
  support for zen, zen2, and zen3 is now present, and while all the
  three microarchitectures have identical instruction sets from the
  perspective of BLIS, they each correspond to different subconfigs
  and therefore merit separate testing.
- Enabled testing of zen3 via the SDE in travis/do_sde.sh.
- Added comments to zen2.def to briefly explain how the file was
  created. (A similar comment is also present in zen3.def.)
Details:
- Updated travis/do_sde.sh to grab the tarball for the latest version
  of the Intel SDE from the flame/ci-utils repository.
@fgvanzee
Member

@dzambare Looks like the SDE testing for zen2 and zen3 subconfigurations is working.

@dzambare
Contributor Author

Hi @fgvanzee, I have tested the new makefiles with gcc 9, 10, and 11; aocc 2.0, 3.0, and 3.1; and clang 11. Everything is working as expected.

Really appreciate your help in updating makefiles and SDE.

@fgvanzee
Member

You're welcome, @dzambare. Thank you for your help as well! Glad the new configuration logic works as intended.

@fgvanzee fgvanzee merged commit 26e4b6b into flame:master Nov 17, 2021
@fgvanzee
Member

@bartoldeman This (rather significant) PR merge is done. 🙂

dzambare added a commit to Meghana-vankadari/blis that referenced this pull request Jan 6, 2022
Details:
- Added a new 'zen3' subconfiguration targeting support for the AMD Zen3
  microarchitecture (flame#561). Thanks to AMD for this contribution.
- Restructured clang and AOCC support for zen, zen2, and zen3
  make_defs.mk files. The clang and AOCC version detection now happens
  in configure, not in the subconfigurations' makefile fragments. That
  is, we've added logic to configure that detects the version of
  clang/AOCC, outputs an appropriate variable to config.mk
  (ie: CLANG_OT_*, AOCC_OT_*), and then checks for it within the
  makefile fragment (as is currently done for the GCC_OT_* variables).
- Added configure support for a GCC_OT_10_1_0 variable (and associated
  substitution anchor) to communicate whether the gcc version is older
  than 10.1.0, and use this variable to check for recent enough versions
  of gcc to use -march=znver3 in the zen3 subconfig.
- Inlined the contents of config/zen/amd_config.mk into the zen and zen2
  make_defs.mk so that the files are self-contained, harmonizing the
  format of all three Zen-based subconfigurations' make_defs.mk files.
- Added indenting (with spaces) of GNU make conditionals for easier
  reading in zen, zen2, and zen3 make_defs.mk files.
- Adjusted the range of models checked by bli_cpuid_is_zen() (which was
  previously 0x00 ~ 0xff and is now 0x00 ~ 0x2f) so that it is
  completely disjoint from the models checked by bli_cpuid_is_zen2()
  (0x30 ~ 0xff). This is normally necessary because Zen and Zen2
  microarchitectures share the same family (23, or 0x17), and so the
  model code is the only way to differentiate the two. But in our case,
  fixing the model range for zen *wasn't* actually necessary since we
  checked for zen2 first, and therefore the wide zen range acted like
  the 'else' of an 'if-else' statement. That said, the change helps
  improve clarity for the reader by encoding useful knowledge, which
  was obtained from https://en.wikichip.org/wiki/amd/cpuid .
- Added zen2.def and zen3.def files to the collection in travis/cpuid.
  Note that support for zen, zen2, and zen3 is now present, and while
  all the three microarchitectures have identical instruction sets from
  the perspective of BLIS microkernels, they each correspond to
  different subconfigurations and therefore merit separate testing.
  Thanks to Devin Matthews for his guidance in hacking these files as
  slight modifications of zen.def.
- Enabled testing of zen2 and zen3 via the SDE in travis/do_sde.sh.
  Now, zen, zen2, and zen3 are tested through the SDE via Travis CI
  builds.
- Updated travis/do_sde.sh to grab the SDE tarball from a new ci-utils
  repository on GitHub rather than on Intel's website. This change was
  made in an attempt to circumvent recent troubles with Travis CI not
  being able to download the SDE directly from Intel's website via curl.
  Thanks to Devin Matthews for suggesting the idea.
- Updated travis/do_sde.sh to grab the latest version (8.69.1) of the
  Intel SDE from the flame/ci-utils repository.
- Updated .travis.yml to use gcc 9. The file was previously using gcc 8,
  which did not support -march=znver2.
- Created amd64_legacy umbrella family in config_registry for targeting
  older (bulldozer, piledriver, steamroller, and excavator)
  microarchitectures and moved those same subconfigs out of the amd64
  umbrella family. However, x86_64 retains amd64_legacy as a constituent
  member.
- Fixed a bug in configure related to the building of the so-called
  config list. When processing the contents of config_registry,
  configure creates a series of structures and lists that allow for
  various mappings related to configuration families, subconfigs, and
  kernel sets. Two of those lists are built via substitution of
  umbrella families with their subconfig members, and one of those
  lists was improperly performing the substitution in a way that would
  erroneously match on partial umbrella family names. That code was
  changed to match the code that was already doing the substitution
  properly, via substitute_words(). Also added comments noting the
  importance of using substitute_words() in both instances.
- Comment updates.
fgvanzee added a commit that referenced this pull request Sep 10, 2022
Details:
- (cherry picked from commit 26e4b6b)
@dzambare dzambare deleted the zen3_support branch October 10, 2022 04:52
5 participants