Add sifive_x280 configuration #737
Conversation
@Aaron-Hutchinson @nick-knight @myeh01 awesome work, much appreciated! Regarding steps 1-3 of the testing process, can these products be pre-built? This would really help CI build times... @angsch and @leekillough have been putting similar things here.
@devinamatthews Yes, absolutely. The GNU toolchain build, in particular, is substantial. But your comment touches on a larger shortcoming of our PR: we have not addressed CI. (We meant to add a comment about this when we submitted the PR.) We are hoping for some guidance from the community on the best way to go about this, since we have little experience with setting up CI, and none with BLIS CI in particular.
The PR can be merged without it. Once we get at least one RISC-V configuration running reliably in Travis, adding more shouldn't be too difficult.
I think that we can extend the CI infrastructure that @leekillough and I set up. I am happy to help here. Further, before merging the PR, it would be good to check how the x280 target interacts with the auto-configure and ISA-detection work that we added.
Is there a C macro which is always defined when an X280 compiler is being used? There is an auto-detect mechanism which detects the RISC-V architecture based on predefined compiler macros. I want to improve it so that it can also detect the X280, because with our PR it will otherwise be detected as the generic rv64iv configuration.
I'd be happy to upload a tarball of the prebuilt toolchain and QEMU for CI purposes. It looks like there's already a QEMU tarball in the link in your post, so I can try replacing the QEMU portion of our automation script with just downloading and unpacking that tarball. I can also do something similar with the prebuilt toolchain once it's uploaded. I think then translating the script over to CI would be much smoother.
Our automation script uses the upstream toolchain, so I'm not sure if there would be a way to differentiate it from a generic RISC-V compiler.
Is there anything like x86's CPUID for RISC-V?
@devinamatthews: There is no need to use a runtime check.

@devinamatthews, @Aaron-Hutchinson, @nick-knight: There are two RISC-V autodetection header files in PR #693:

- bli_riscv_cpuid.h, which returns one of the generic RISC-V configurations (e.g., rv32iv or rv64iv), and
- bli_riscv_detect_arch.h, which returns the full detected RISC-V architecture string, such as rv64imafdcv.
But if two companies make rv64iv chips, how do you tell them apart?
Hence my question in #737 (comment). @angsch and I have created a foundational RISC-V BLIS port which should be adaptable to all RISC-V variants. But we understand that there may be specific BLIS implementations for specific RISC-V implementations. The BLIS RISC-V autodetection mechanism is able to identify base features of the RISC-V implementation, such as whether the vector extension is present.
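For readers following the thread, here is a minimal sketch of the kind of preprocessor-based detection being discussed. The macros (`__riscv`, `__riscv_xlen`, `__riscv_vector`) are standard RISC-V C API macros; the mapping to configuration names is illustrative, not the PR's exact logic:

```c
/* Hedged sketch of compile-time RISC-V feature detection, in the spirit of
   bli_riscv_cpuid.h. Only the macros are standard (RISC-V C API); the
   configuration names below are illustrative. */
#if defined(__riscv)
  #if __riscv_xlen == 64 && defined(__riscv_vector)
    #define DETECTED_CONFIG "rv64iv"  /* 64-bit base ISA + vector extension */
  #elif __riscv_xlen == 64
    #define DETECTED_CONFIG "rv64i"
  #elif __riscv_xlen == 32 && defined(__riscv_vector)
    #define DETECTED_CONFIG "rv32iv"
  #else
    #define DETECTED_CONFIG "rv32i"
  #endif
#endif
```

Note that nothing here identifies the vendor, which is exactly why a vendor- or core-specific macro would be needed to map rv64iv to sifive_x280 automatically.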
Regarding prebuilding the toolchain for CI, I'm not sure how portable the toolchain that our script creates is. It appears to hardcode some of the file paths, and I fear this may cause issues if I were to create a tarball of my local build and upload it (I have limited knowledge in this area, so correct me if I'm wrong). Would it be possible to have one of the CI machines build the toolchain itself and save the result for future runs?
That concern is justified. I encountered incompatibilities when I first packaged QEMU. To package QEMU, I had to replicate the build environment of the CI machine. Furthermore, the build of the toolchain was sensitive to the execution environment. I think the incompatibilities are solely due to mismatched versions of linked libraries such as glibc. I suggest that you use the tarball of QEMU and the toolchain that Lee and I use in our PR; it runs successfully on the CI machine.
I tried this and it is not possible: the Travis runs hit a timeout.
Can the timeout be increased for the steps that build the toolchain/QEMU?
We were advised to aim for a runtime below 10 minutes for our rv[32,64]iv targets.
Again, please forgive my limited experience in this area. I would think there would be a way to save the toolchain and QEMU builds for use across multiple CI invocations, and only build them when they either don't already exist on the machine or have become out of date. That way they're only built once on the CI machine, and nearly all CI runs will skip the build steps for the toolchain and QEMU.
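A minimal sketch of what that could look like with Travis's built-in directory caching; the cached path and the guard are assumptions, not this repo's actual configuration:

```yaml
# Hedged sketch: cache the toolchain directory between Travis runs and
# rebuild only when the cached compiler is missing. Paths are illustrative.
cache:
  directories:
    - $HOME/toolchain   # persisted between builds on the same branch
before_script:
  # Rebuild the toolchain only if the cached cross compiler is absent.
  - test -x "$HOME/toolchain/riscv/bin/riscv64-unknown-linux-gnu-gcc" || ./travis/do_riscv.sh "$CONF"
```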
I think Travis also has Docker images of the CI environment which you can run locally.
GitHub has a 100 MB limit on tracked files before it requires paid service. Instead of files stored in the distribution, we would need to use released binaries, which have a 2 GB limit. That is the same 2 GB limit as for Git Large File Storage on GitHub. Travis has quotas on how much CPU, memory, and disk space can be used. Once the credits run out for a billing period, more must be purchased, or you must wait until the next billing period. See this also. According to @angsch, the dependency on linked libraries makes it necessary to build the toolchain in an environment that is compatible with the CI machines. So you need to build on a fresh Ubuntu Focal machine / Docker container.
Chunked tar.gz?
From: "Field G. Van Zee" ***@***.***>
Reply-To: flame/blis ***@***.***>
Date: Tuesday, April 4, 2023 at 2:19 PM
To: flame/blis ***@***.***>
Cc: "Matthews, Devin" ***@***.***>, Mention ***@***.***>
Subject: Re: [flame/blis] Add sifive_x280 configuration (PR #737)
[EXTERNAL SENDER]
@Aaron-Hutchinson<https://github.com/Aaron-Hutchinson>:
GitHub has a 100 MB limit<https://docs.github.com/en/repositories/working-with-files/managing-large-files/about-large-files-on-github> on tracked files before it requires paid service.
Instead of files stored in the distribution, we would need to use released binaries, which have a 2 GB limit<https://docs.github.com/en/repositories/working-with-files/managing-large-files/about-large-files-on-github#distributing-large-binaries>. That is the same 2 GB limit<https://docs.github.com/en/repositories/working-with-files/managing-large-files/about-git-large-file-storage> for Git Large File Storage in GitHub.
Travis has quotas<https://docs.travis-ci.com/user/billing-faq/> on how much CPU, memory and disk space can be used. Once the credits run out for a billing period, they must be bought with paid-for credits, or wait until the next billing period. See this also<https://www.jeffgeerling.com/blog/2020/travis-cis-new-pricing-plan-threw-wrench-my-open-source-works>.
According to @angsch<https://github.com/angsch>, the dependency on linked libraries makes it a necessity to build the toolchain in an environment that is compatible with the CI machines. So you need to build on a fresh Ubuntu Focal machine / Docker container.
—
Reply to this email directly, view it on GitHub<#737 (comment)>, or unsubscribe<https://github.com/notifications/unsubscribe-auth/ABIAZIM57GKN75CBKFPUUALW7RX3HANCNFSM6AAAAAAWMJJWJA>.
You are receiving this because you were mentioned.Message ID: ***@***.***>
|
I'm proposing that we not track any toolchain/QEMU-related files on GitHub and just use build caching for them. It looks like Travis has built-in functionality for exactly this kind of purpose. See here and here. The first link in particular covers this use case.
@Aaron-Hutchinson Caching sounds fine to me. I read the links you provided, but I'm still not 100% certain how we would employ caching in this context. (Travis could use a few more examples in their documentation!)
I agree that Travis' documentation is not very thorough. I've read a little bit about this feature, and it's something I'd like to try pursuing. Does anyone know if there is a local version of Travis CI I can use on my own machine to test the results of changes to the .travis.yml file?
I believe there is a local version using Docker. At least there was a few years ago.
I haven't been able to find any official documentation on a local version, and the unofficial discussions I've come across are a few years old and don't appear to work anymore. It looks like they may have made this an Enterprise feature.
Caching is not recommended for built toolchains (unless that document is outdated), and it used to be unavailable for Docker images, though it seems to be supported now. See this and this too.
config/sifive_x280/make_defs.mk
CPPROCFLAGS :=
CMISCFLAGS  := $(CMISCFLAGS_SIFIVE) -fdata-sections -ffunction-sections \
               -fdiagnostics-color=always -fno-rtti -fno-exceptions \
               -std=gnu++17
Should this read -std=gnu17? I think that gnu++17 is a C++-only option.
-std=gnu++17 should be removed completely, since BLIS already adds -std=c99.
Thanks. We just copied this from the generic make_defs.mk without really understanding what was required by the project. IIRC, a bunch of the warning flags are also redundant (generated somewhere else in the build system).
@Aaron-Hutchinson I think you forgot to update CMISCFLAGS when you rebased.
Thanks for the reminder! I did indeed forget. This will be fixed in the upcoming commit.
**RISC-V General Toolchain Builder**

The following script is used in-house at @tactcomplabs: build-riscv.txt (rename to …). To use it, edit the variables at the top of the file, and then run it.

**To Build BLIS**

After the toolchain is built, BLIS can be configured and built against it.

**Build issues encountered with this PR**

(The C++ options have been removed, and merge conflicts eliminated, in sifive#3.) Your script sets options which seem to exclude shared libraries, while also specifying options to use them. When using QEMU, our script sets environment variables which allow QEMU to work with BLIS shared libraries. When I build my toolchain with tag …, I run into build issues.

Is there a …? @angsch @nick-knight @Aaron-Hutchinson @devinamatthews @fgvanzee @ct-clmsn
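For reference, a hedged sketch of the kind of QEMU settings that let dynamically linked BLIS binaries run under qemu-user. QEMU_LD_PREFIX is a standard qemu-user variable; the sysroot path and $RISCV prefix are assumptions, and the CPU flags mirror the invocation used elsewhere in this thread:

```sh
# Point qemu-user at the RISC-V sysroot so the target ld.so and shared
# libraries are found; $RISCV is assumed to be the toolchain install prefix.
export QEMU_LD_PREFIX="$RISCV/sysroot"
qemu-riscv64 -cpu rv64,vext_spec=v1.0,v=true,vlen=512 ./test_libblis.x
```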
@Aaron-Hutchinson In order to avoid duplication, I tested the QEMU tarball that Lee and I use. I faced the compilation problem with the vector intrinsics too, so my test experimentally enabled all of the extensions that the x280 has. Based on these tests, I am confident that you can use the same QEMU tarball for your CI. The tarball lives in a sibling repo: https://github.com/flame/ci-utils/blob/master/riscv/qemu-riscv-2023.02.25-ubuntu-20.04.tar.gz.
Thanks for all the feedback; sorry we're slow to respond.

Regarding the RISC-V vector intrinsics issue: this name-mangling was introduced recently at the behest of the RISC-V Toolchains SIG, in riscv-non-isa/riscv-c-api-doc#31. It made its way into the vector intrinsics API, version 0.11 (multiple PRs, I won't try to list them all). That API change, in turn, appeared in LLVM 16.0.0. Unfortunately, I don't know the status with GCC. Historically, GCC has lagged LLVM w.r.t. chasing unratified/churning RISC-V specs, so I'm not surprised that LLVM works but GCC does not.

On that last point, in case it isn't clear, the RISC-V vector intrinsics API is a community project, sponsored by RISC-V International. We are working towards v1.0 of the API but have not frozen yet, and it looks like we'll miss the GCC 13 window. The task group meets monthly; we'd love your company. If you have questions on GCC support for the latest intrinsics API changes, this is the right community to bring it up with.
Ah yes, the RVV intrinsics API is still not frozen, so we should be prepared for churn. Regarding the API question, we can populate a tuple using the "insertion" functions and extract tuple elements (vectors) using the "extraction" functions. You raise a good question regarding the inline asm syntax for tuple types; I raised an issue about this: riscv-non-isa/riscv-c-api-doc#43. We're in the midst of cache-tile tuning and optimizing the packing microkernels. We'll fix these issues as part of a follow-up commit. (FYI: @myeh01)
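To illustrate the tuple extraction pattern under discussion, here is a hedged sketch using the v0.12-era intrinsic names that shipped with LLVM 17. This is not the PR's actual kernel code, and names may differ across toolchain versions:

```c
#include <riscv_vector.h>
#include <stddef.h>

// De-interleave complex doubles (re,im pairs) with a two-field segment load,
// then pull each vector out of the tuple with the "extraction" intrinsic.
void unpack_z(const double *x, size_t n, double *re, double *im)
{
    for (size_t vl; n > 0; n -= vl, x += 2 * vl, re += vl, im += vl)
    {
        vl = __riscv_vsetvl_e64m1(n);
        vfloat64m1x2_t t = __riscv_vlseg2e64_v_f64m1x2(x, vl);
        __riscv_vse64_v_f64m1(re, __riscv_vget_v_f64m1x2_f64m1(t, 0), vl);
        __riscv_vse64_v_f64m1(im, __riscv_vget_v_f64m1x2_f64m1(t, 1), vl);
    }
}
```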
I am willing to do it for this PR, since I have been keeping it up to date locally.
…code. However, it does not get correct results for complex BLIS routines which use segment loads (or call those that do). The intrinsic types check out and make sense, but it returns wrong answers. It's probably something really simple. For historical reference, see:
- riscv-non-isa/riscv-c-api-doc#43
- flame#737 (comment)
- https://reviews.llvm.org/D152134
- riscv-non-isa/rvv-intrinsic-doc#139
- riscv-non-isa/rvv-intrinsic-doc#198
- https://github.com/riscv-non-isa/rvv-intrinsic-doc/blob/master/rvv-intrinsic-rfc.md
- https://github.com/riscv-non-isa/rvv-intrinsic-doc/blob/master/auto-generated/intrinsic_funcs/02_vector_unit-stride_segment_load_store_instructions_zvlsseg.md
- https://github.com/riscv-non-isa/rvv-intrinsic-doc/blob/master/auto-generated/intrinsic_funcs/03_vector_stride_segment_load_store_instructions_zvlsseg.md
- https://github.com/riscv-non-isa/rvv-intrinsic-doc/blob/master/auto-generated/intrinsic_funcs/04_vector_indexed_segment_load_store_instructions_zvlsseg.md
Added whitespace and other formatting fixes
Restore changes from sifive-blis-private#28
Our team would like to get this PR merged soon. We have some updates coming in shortly with minor changes, such as resolving the merge conflicts and updating the RISC-V intrinsics. What is the best way forward regarding the CI issue? From what I can tell from the comments above, this is still unresolved.
When you have updated the PR, I am happy to test locally whether you can reuse the binaries that are used in the current CI pipeline. I am optimistic that the CI suggestions from above still work.
(Force-pushed from acd68f9 to 8663e95.)
* Updated RISC-V intrinsics to match LLVM 17.0.2
All of the developmental changes we planned to make are now merged into the PR branch. @angsch If you're able and willing to run the CI tests locally, I think the branch should be in a stable place to do so now. Thank you!
The following should work. I think it makes sense to use the same compiler version for all RISC-V targets, so the compiler version is bumped below for the already-existing targets.

diff --git a/.travis.yml b/.travis.yml
index 848cb184..bdfafb6b 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -86,6 +86,11 @@ matrix:
env: OOT=0 TEST=FAST SDE=0 THR="none" BLD="--disable-shared" CONF="rv32iv" \
CC=riscv32-unknown-linux-gnu-gcc \
LDFLAGS=-static
+ - os: linux
+ compiler: clang
+ env: OOT=0 TEST=FAST SDE=0 THR="none" BLD="--disable-shared" CONF="sifive_x280" \
+ CC=clang \
+ LDFLAGS=-static
install:
- if [ "$CC" = "gcc" ] && [ "$TRAVIS_OS_NAME" = "linux" ]; then export CC="gcc-9"; fi
- if [ -n "$PACKAGES" ] && [ "$TRAVIS_OS_NAME" = "linux" ]; then sudo apt-get install -y $PACKAGES; fi
@@ -106,6 +111,12 @@ script:
export CXX=$DIST_PATH/../toolchain/riscv/bin/riscv32-unknown-linux-gnu-g++;
export TESTSUITE_WRAPPER="$DIST_PATH/../toolchain/qemu-riscv32 -cpu rv32,vext_spec=v1.0,v=true,vlen=128 -B 0x100000";
fi
+- if [ "$CONF" = "sifive_x280" ]; then
+ $DIST_PATH/travis/do_riscv.sh "$CONF";
+ export CC=$DIST_PATH/../toolchain/riscv/bin/clang;
+ export CXX=$DIST_PATH/../toolchain/riscv/bin/clang++;
+ export TESTSUITE_WRAPPER="$DIST_PATH/../toolchain/qemu-riscv64 -cpu rv64,vext_spec=v1.0,v=true,vlen=512 -B 0x100000";
+ fi
- $DIST_PATH/configure -p `pwd`/../install -t $THR $BLD CC=$CC $CONF
- pwd
- ls -l
diff --git a/travis/do_riscv.sh b/travis/do_riscv.sh
index a51d3306..9a114b0e 100755
--- a/travis/do_riscv.sh
+++ b/travis/do_riscv.sh
@@ -3,18 +3,21 @@
set -e
set -x
-TAG=2023.02.25
+TAG=2023.10.18
# The prebuilt toolchains only support hardfloat, so we only
# test these for now.
case $1 in
"rv32iv")
- TARBALL=riscv32-glibc-ubuntu-20.04-nightly-${TAG}-nightly.tar.gz
+ TARBALL=riscv32-glibc-ubuntu-20.04-gcc-nightly-${TAG}-nightly.tar.gz
;;
"rv64iv")
- TARBALL=riscv64-glibc-ubuntu-20.04-nightly-${TAG}-nightly.tar.gz
+ TARBALL=riscv64-glibc-ubuntu-20.04-gcc-nightly-${TAG}-nightly.tar.gz
;;
+ "sifive_x280")
+ TARBALL=riscv64-glibc-ubuntu-20.04-llvm-nightly-${TAG}-nightly.tar.gz
*)
+ ;;
exit 1
;;
esac

I zipped the patch due to GitHub's constraints on what can be attached.
@angsch Looks like CI has failed after applying the patch due to not being able to find the compiler:
Any idea what went wrong?
;;
"sifive_x280")
TARBALL=riscv64-glibc-ubuntu-20.04-llvm-nightly-${TAG}-nightly.tar.gz
We already have a QEMU binary in this tarball file. Is it necessary to get another one using the following commands?
# Once CI upgrades to jammy, the next three lines can be removed.
# The qemu version installed via packages (qemu-user qemu-user-binfmt)
# is sufficient.
TARBALL_QEMU=qemu-riscv-2023.02.25-ubuntu-20.04.tar.gz
wget https://github.com/flame/ci-utils/raw/master/riscv/${TARBALL_QEMU}
tar -xf $TARBALL_QEMU
We just need to update TARBALL to riscv64-glibc-ubuntu-{JAMMY_VER}-gcc-nightly-${TAG}-nightly.tar.gz if the CI is upgraded.
Good point, I didn't notice that both the LLVM and the GNU toolchains now include qemu.
Does a soft link work? Could you try using the qemu shipped with the toolchain tarball?
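A hedged sketch of the soft-link idea; the exact tarball layout and paths are assumptions:

```sh
# Reuse the qemu bundled with the toolchain tarball instead of downloading
# a second copy; paths are illustrative and depend on the tarball layout.
ln -s "$DIST_PATH/../toolchain/riscv/bin/qemu-riscv64" \
      "$DIST_PATH/../toolchain/qemu-riscv64"
```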
I think that a syntax error earlier in the patch introduced the problem. My mistake, sorry. It should read:

+ "sifive_x280")
+ TARBALL=riscv64-glibc-ubuntu-20.04-llvm-nightly-${TAG}-nightly.tar.gz
+ ;;
*)
exit 1
;;
esac

(The ;; after the sifive_x280 case was misplaced in my patch.) In the meantime, I will try the qemu builds shipped with the toolchain.
Thanks for the correction @angsch. Looks like with that fix the PR has passed CI.
Thank you, everyone, for your contributions and engagement on this PR! Does anyone else have any comments before I merge? 🚀
This PR adds a new configuration to BLIS, called sifive_x280. This configuration is built for the RISC-V instruction set architecture and is optimized for SiFive's X280 processor. Included are implementations for most level-1, level-1f, and level-3 kernels, with the level-3 gemm and gemmtrsm kernels receiving the most attention.

Since this configuration targets RISC-V, compiling it and running tests on typical machines is challenging. For convenience, we've written a simple script that aims to make testing this configuration easier. The script can be found here; it builds the cross toolchain and QEMU, then configures BLIS for sifive_x280, builds it, and runs make check (see the sketch below).
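The BLIS-side steps of that flow look roughly like the following; this is a sketch, and the choice of CC=clang is an assumption about the cross toolchain being used:

```sh
# Configure BLIS for the sifive_x280 target, then build and run the
# testsuite (executed under QEMU when cross-compiled).
./configure CC=clang sifive_x280
make -j
make check
```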
Developers for the sifive_x280 implementation (in alphabetical order): …

Special thanks to @fgvanzee for their assistance in debugging various issues and helping our team understand the BLIS framework.
We look forward to your feedback and are very excited to join the BLIS community.