Fix native builds on GitHub Actions using Bazel cache #240

Merged 26 commits on Mar 12, 2021
740fd51
Dump GCP credentials to file
karllessard Mar 8, 2021
0b0e6da
Pass BAZEL_EXTRA_OPTS environment variable
karllessard Mar 8, 2021
1bca2d1
Consume BUILD_EXTRA_FLAGS
karllessard Mar 8, 2021
397cbff
Rename BUILD_EXTRA_FLAGS
karllessard Mar 8, 2021
76533ce
Pass native extra build flags to Maven
karllessard Mar 8, 2021
d9443ae
Trigger CI on bazel-cache branch for testing
karllessard Mar 8, 2021
3460d4a
Try to quote native build flags in Maven command
karllessard Mar 8, 2021
1b98525
Add missing spaces
karllessard Mar 8, 2021
e1843ee
test
karllessard Mar 8, 2021
60d7236
Rename GCP credentials file
karllessard Mar 8, 2021
7c76601
Declare GCP_CREDS in environment
karllessard Mar 8, 2021
d888941
Reactivate windows GPU builds
karllessard Mar 8, 2021
fcd92ca
Fix Windows build with GPU and MKL
karllessard Mar 8, 2021
1cc7037
Restore Windows patches
karllessard Mar 8, 2021
2cab7f0
Delete previous/broken patch
karllessard Mar 8, 2021
c661c1a
Try to restore old windows MKL patch on 2.4.1
karllessard Mar 9, 2021
677d578
Disable MKL for Windows
karllessard Mar 9, 2021
8f0c725
Disable Windows patch again
karllessard Mar 9, 2021
a4cd21e
Try Bazel 4.0.0 for Windows build
karllessard Mar 11, 2021
e9dbf4a
Revert to Bazel 3.1.0 and cleanup Cuda files
karllessard Mar 11, 2021
30ea7e4
Remove more stuff from CUDA to avoid running out of disk space (#239)
saudet Mar 11, 2021
8297d39
Missing recursive flag in Cuda cleanup
karllessard Mar 11, 2021
4812db5
Fix archive cleanup paths
karllessard Mar 12, 2021
6333c75
Revert changes
karllessard Mar 12, 2021
3ecee2a
Remove bazel-cache branch triggers
karllessard Mar 12, 2021
344e565
Include Windows in GPU platform
karllessard Mar 12, 2021
53 changes: 37 additions & 16 deletions .github/workflows/ci.yml
@@ -11,6 +11,7 @@ on:
env:
STAGING_PROFILE_ID: 46f80d0729c92d
NATIVE_BUILD_PROJECTS: tensorflow-core/tensorflow-core-generator,tensorflow-core/tensorflow-core-api
GCP_CREDS: ${{ secrets.GCP_CREDS }}
jobs:
quick-build:
if: github.event_name == 'pull_request' && !contains(github.event.pull_request.labels.*.name, 'CI build')
@@ -76,8 +77,9 @@ jobs:
tar hxvf $HOME/nccl.txz --strip-components=1 -C /usr/local/cuda/
mv /usr/local/cuda/lib/* /usr/local/cuda/lib64/
echo Removing downloaded archives and unused libraries to avoid running out of disk space
rm -f *.rpm *.tgz *.txz *.tar.*
rm -f $HOME/*.rpm $HOME/*.tgz $HOME/*.txz $HOME/*.tar.*
rm -f $(find /usr/local/cuda/ -name '*.a' -and -not -name libcudart_static.a -and -not -name libcudadevrt.a)
rm -rf /usr/local/cuda/doc* /usr/local/cuda/libnvvp* /usr/local/cuda/nsight* /usr/local/cuda/samples*
fi
- name: Checkout repository
uses: actions/checkout@v1
@@ -94,8 +96,14 @@ jobs:
mkdir -p $HOME/.m2
[[ "${{ github.event_name }}" == "push" ]] && MAVEN_PHASE=deploy || MAVEN_PHASE=install
echo "<settings><servers><server><id>ossrh</id><username>${{ secrets.CI_DEPLOY_USERNAME }}</username><password>${{ secrets.CI_DEPLOY_PASSWORD }}</password></server></servers></settings>" > $HOME/.m2/settings.xml
if [[ "${{ github.event_name }}" == "push" && "${{ github.repository }}" == "tensorflow/java" ]]; then
printf '%s\n' "${GCP_CREDS}" > $HOME/gcp_creds.json
export BAZEL_CACHE="--remote_cache=https://storage.googleapis.com/tensorflow-sigs-jvm --remote_upload_local_results=true --google_credentials=$HOME/gcp_creds.json"
else
export BAZEL_CACHE="--remote_cache=https://storage.googleapis.com/tensorflow-sigs-jvm --remote_upload_local_results=false"
fi
echo Executing Maven $MAVEN_PHASE
mvn clean $MAVEN_PHASE -B -U -e -Djavacpp.platform=linux-x86_64 -Djavacpp.platform.extension=${{ matrix.ext }} -pl $NATIVE_BUILD_PROJECTS -am -DstagingRepositoryId=${{ needs.prepare.outputs.stagingRepositoryId }}
mvn clean $MAVEN_PHASE -B -U -e -Djavacpp.platform=linux-x86_64 -Djavacpp.platform.extension=${{ matrix.ext }} -pl $NATIVE_BUILD_PROJECTS -am -DstagingRepositoryId=${{ needs.prepare.outputs.stagingRepositoryId }} "-Dnative.build.flags=$BAZEL_CACHE"
df -h
macosx-x86_64:
if: github.event_name == 'push' || contains(github.event.pull_request.labels.*.name, 'CI build')
@@ -123,17 +131,23 @@ jobs:
mkdir -p $HOME/.m2
[[ "${{ github.event_name }}" == "push" ]] && MAVEN_PHASE=deploy || MAVEN_PHASE=install
echo "<settings><servers><server><id>ossrh</id><username>${{ secrets.CI_DEPLOY_USERNAME }}</username><password>${{ secrets.CI_DEPLOY_PASSWORD }}</password></server></servers></settings>" > $HOME/.m2/settings.xml
if [[ "${{ github.event_name }}" == "push" && "${{ github.repository }}" == "tensorflow/java" ]]; then
printf '%s\n' "${GCP_CREDS}" > $HOME/gcp_creds.json
export BAZEL_CACHE="--remote_cache=https://storage.googleapis.com/tensorflow-sigs-jvm --remote_upload_local_results=true --google_credentials=$HOME/gcp_creds.json"
else
export BAZEL_CACHE="--remote_cache=https://storage.googleapis.com/tensorflow-sigs-jvm --remote_upload_local_results=false"
fi
df -h
echo Executing Maven $MAVEN_PHASE
mvn clean $MAVEN_PHASE -B -U -e -Djavacpp.platform=macosx-x86_64 -Djavacpp.platform.extension=${{ matrix.ext }} -pl $NATIVE_BUILD_PROJECTS -am -DstagingRepositoryId=${{ needs.prepare.outputs.stagingRepositoryId }}
mvn clean $MAVEN_PHASE -B -U -e -Djavacpp.platform=macosx-x86_64 -Djavacpp.platform.extension=${{ matrix.ext }} -pl $NATIVE_BUILD_PROJECTS -am -DstagingRepositoryId=${{ needs.prepare.outputs.stagingRepositoryId }} "-Dnative.build.flags=$BAZEL_CACHE"
df -h
windows-x86_64:
if: github.event_name == 'push' || contains(github.event.pull_request.labels.*.name, 'CI build')
runs-on: windows-latest
needs: prepare
strategy:
matrix:
ext: ["", -mkl] # -gpu, -mkl-gpu]
ext: ["", -gpu, -mkl] #, -mkl-gpu]
steps:
- name: Configure page file
uses: al-cheb/configure-pagefile-action@v1.2
@@ -154,16 +168,16 @@ jobs:
mkdir C:\bazel
curl.exe -L https://github.com/bazelbuild/bazel/releases/download/3.1.0/bazel-3.1.0-windows-x86_64.exe -o C:/bazel/bazel.exe --retry 10
set "EXT=${{ matrix.ext }}"
if "%EXT:~-4%"=="-gpu" (
echo Removing some unused stuff to avoid running out of disk space
rm.exe -Rf "C:/Program Files (x86)/Android" "C:/Program Files/dotnet" "%CONDA%" "%GOROOT_1_10_X64%" "%GOROOT_1_11_X64%" "%GOROOT_1_12_X64%" "%GOROOT_1_13_X64%" "C:\hostedtoolcache\windows\Ruby" "C:\Rust"
echo Installing CUDA
curl.exe -L https://developer.download.nvidia.com/compute/cuda/11.0.3/local_installers/cuda_11.0.3_451.82_win10.exe -o cuda.exe
curl.exe -L https://developer.download.nvidia.com/compute/redist/cudnn/v8.0.3/cudnn-11.0-windows-x64-v8.0.3.33.zip -o cudnn.zip
cuda.exe -s
mkdir cuda
unzip.exe cudnn.zip
cp.exe -a cuda/include cuda/lib cuda/bin "C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v11.0/"
if "%EXT:~-4%" == "-gpu" (
echo Removing some unused stuff to avoid running out of disk space
rm.exe -Rf "C:/Program Files (x86)/Android" "C:/Program Files/dotnet" "%CONDA%" "%GOROOT_1_10_X64%" "%GOROOT_1_11_X64%" "%GOROOT_1_12_X64%" "%GOROOT_1_13_X64%" "C:\hostedtoolcache\windows\Ruby" "C:\Rust"
echo Installing CUDA
curl.exe -L https://developer.download.nvidia.com/compute/cuda/11.0.3/local_installers/cuda_11.0.3_451.82_win10.exe -o cuda.exe
curl.exe -L https://developer.download.nvidia.com/compute/redist/cudnn/v8.0.3/cudnn-11.0-windows-x64-v8.0.3.33.zip -o cudnn.zip
cuda.exe -s
mkdir cuda
unzip.exe cudnn.zip
cp.exe -a cuda/include cuda/lib cuda/bin "C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v11.0/"
)
echo %JAVA_HOME%
- name: Checkout repository
@@ -187,12 +201,19 @@ jobs:
call mvn -version
bazel version
mkdir %USERPROFILE%\.m2
if "${{ github.event_name }}"=="push" (set MAVEN_PHASE=deploy) else (set MAVEN_PHASE=install)
if "${{ github.event_name }}" == "push" (set MAVEN_PHASE=deploy) else (set MAVEN_PHASE=install)
echo ^<settings^>^<servers^>^<server^>^<id^>ossrh^</id^>^<username^>${{ secrets.CI_DEPLOY_USERNAME }}^</username^>^<password^>${{ secrets.CI_DEPLOY_PASSWORD }}^</password^>^</server^>^</servers^>^</settings^> > %USERPROFILE%\.m2\settings.xml
set "BAZEL_CACHE=--remote_cache=https://storage.googleapis.com/tensorflow-sigs-jvm --remote_upload_local_results=false"
if "${{ github.event_name }}" == "push" (
if "${{ github.repository }}" == "tensorflow/java" (
printenv GCP_CREDS > %USERPROFILE%\gcp_creds.json
set "BAZEL_CACHE=--remote_cache=https://storage.googleapis.com/tensorflow-sigs-jvm --remote_upload_local_results=true --google_credentials=%USERPROFILE%\gcp_creds.json"
)
)
df -h
wmic pagefile list /format:list
echo Executing Maven %MAVEN_PHASE%
call mvn clean %MAVEN_PHASE% -B -U -e -Djavacpp.platform=windows-x86_64 -Djavacpp.platform.extension=${{ matrix.ext }} -pl %NATIVE_BUILD_PROJECTS% -am -DstagingRepositoryId=${{ needs.prepare.outputs.stagingRepositoryId }}
call mvn clean %MAVEN_PHASE% -B -U -e -Djavacpp.platform=windows-x86_64 -Djavacpp.platform.extension=${{ matrix.ext }} -pl %NATIVE_BUILD_PROJECTS% -am -DstagingRepositoryId=${{ needs.prepare.outputs.stagingRepositoryId }} "-Dnative.build.flags=%BAZEL_CACHE%"
if ERRORLEVEL 1 exit /b
df -h
wmic pagefile list /format:list
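
A minimal sketch of the cache-flag selection used in the Linux and macOS steps above. It assumes the logic is factored into a helper named `select_cache_flags`, which is hypothetical and not part of this PR; the event name, repository, and credentials path are passed as arguments instead of being read from the workflow context. It is written in POSIX sh rather than the workflow's bash so it can be run anywhere.

```shell
#!/bin/sh
# Hypothetical helper (not in the PR): prints the Bazel remote-cache flags
# for a given event name, repository, and credentials file path.
select_cache_flags() {
  event_name="$1"
  repository="$2"
  creds_file="$3"
  cache_url="https://storage.googleapis.com/tensorflow-sigs-jvm"
  if [ "$event_name" = "push" ] && [ "$repository" = "tensorflow/java" ]; then
    # Pushes to the main repository: read the cache and upload new results.
    printf '%s\n' "--remote_cache=$cache_url --remote_upload_local_results=true --google_credentials=$creds_file"
  else
    # Pull requests and forks: read the cache only, no credentials passed.
    printf '%s\n' "--remote_cache=$cache_url --remote_upload_local_results=false"
  fi
}

# Example: a pull request from a fork (illustrative repository name).
select_cache_flags pull_request someuser/java /tmp/gcp_creds.json
```

Only pushes to `tensorflow/java` receive `--google_credentials`, which matches the workflow: the `GCP_CREDS` secret is written to disk only in that branch of the condition.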
2 changes: 1 addition & 1 deletion tensorflow-core/tensorflow-core-api/build.sh
@@ -33,7 +33,7 @@ BUILD_FLAGS="$BUILD_FLAGS --experimental_repo_remote_exec --python_path="$PYTHON
BUILD_FLAGS="$BUILD_FLAGS --distinct_host_configuration=true"

# Build C/C++ API of TensorFlow itself including a target to generate ops for Java
bazel build $BUILD_FLAGS \
bazel build $BUILD_FLAGS $BUILD_EXTRA_FLAGS \
@org_tensorflow//tensorflow:tensorflow_cc \
@org_tensorflow//tensorflow/tools/lib_package:jnilicenses_generate \
:java_proto_gen_sources \
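
The `build.sh` change expands `$BUILD_EXTRA_FLAGS` unquoted, so the single string that the workflow passes through `-Dnative.build.flags` is split on whitespace into separate Bazel options. The sketch below illustrates that splitting with a hypothetical `count_args` function that is not part of the PR; the flag values are copied from the workflow.

```shell
#!/bin/sh
# Hypothetical helper (not in the PR): prints how many arguments it received.
count_args() { echo "$#"; }

BUILD_EXTRA_FLAGS="--remote_cache=https://storage.googleapis.com/tensorflow-sigs-jvm --remote_upload_local_results=false"

# Unquoted, as in build.sh: the shell splits the value into two options.
count_args $BUILD_EXTRA_FLAGS     # prints 2

# Quoted: the whole string would reach Bazel as one malformed argument.
count_args "$BUILD_EXTRA_FLAGS"   # prints 1
```

This is also why the workflow quotes `"-Dnative.build.flags=$BAZEL_CACHE"` on the Maven command line: the value has to reach Maven as one argument and is only split later, inside `build.sh`.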
@@ -1,7 +1,8 @@
diff -ruN tensorflow-1.14.0-rc1/third_party/mkl/mkl.BUILD tensorflow-1.14.0-rc1-windows/third_party/mkl/mkl.BUILD
--- tensorflow-1.14.0-rc1/third_party/mkl/mkl.BUILD 2019-06-08 11:23:20.000000000 +0900
+++ tensorflow-1.14.0-rc1-windows/third_party/mkl/mkl.BUILD 2019-06-12 08:30:41.232683854 +0900
@@ -35,11 +35,23 @@
diff --git a/third_party/mkl/BUILD b/third_party/mkl/BUILD
index aa65b585b85..4e6546eac34 100644
--- a/third_party/mkl/BUILD
+++ b/third_party/mkl/BUILD
@@ -91,10 +91,23 @@ cc_library(
visibility = ["//visibility:public"],
)

@@ -20,11 +21,24 @@ diff -ruN tensorflow-1.14.0-rc1/third_party/mkl/mkl.BUILD tensorflow-1.14.0-rc1-
cc_library(
name = "mkl_libs_windows",
- srcs = [
- "lib/libiomp5md.lib",
- "lib/mklml.lib",
- "@llvm_openmp//:libiomp5md.dll",
+ deps = [
+ "iomp5",
+ "mklml",
+ "mklml"
],
linkopts = ["/FORCE:MULTIPLE"],
visibility = ["//visibility:public"],
)
diff --git a/third_party/llvm_openmp/BUILD b/third_party/llvm_openmp/BUILD
index 099a84dcbaa..f7f9d44118f 100644
--- a/third_party/llvm_openmp/BUILD
+++ b/third_party/llvm_openmp/BUILD
@@ -71,7 +71,7 @@ omp_vars_linux = {

# Windows Cmake vars to expand.
omp_vars_win = {
- "MSVC": 1,
+ "MSVC": 0,
}

omp_all_cmake_vars = select({

1 change: 1 addition & 0 deletions tensorflow-core/tensorflow-core-api/pom.xml
@@ -260,6 +260,7 @@
</buildCommand>
<environmentVariables>
<EXTENSION>${javacpp.platform.extension}</EXTENSION>
<BUILD_EXTRA_FLAGS>${native.build.flags}</BUILD_EXTRA_FLAGS>
</environmentVariables>
<workingDirectory>${project.basedir}</workingDirectory>
</configuration>
8 changes: 4 additions & 4 deletions tensorflow-core/tensorflow-core-platform-gpu/pom.xml
@@ -45,12 +45,12 @@
<version>${javacpp.version}</version>
<classifier>${javacpp.platform.linux-x86_64}</classifier>
</dependency>
<!--dependency>
<dependency>
<groupId>org.bytedeco</groupId>
<artifactId>javacpp</artifactId>
<version>${javacpp.version}</version>
<classifier>${javacpp.platform.windows-x86_64}</classifier>
</dependency-->
</dependency>
<dependency>
<groupId>${project.groupId}</groupId>
<artifactId>${javacpp.moduleId}</artifactId>
@@ -62,12 +62,12 @@
<version>${project.version}</version>
<classifier>${javacpp.platform.linux-x86_64.extension}</classifier>
</dependency>
<!--dependency>
<dependency>
<groupId>${project.groupId}</groupId>
<artifactId>${javacpp.moduleId}</artifactId>
<version>${project.version}</version>
<classifier>${javacpp.platform.windows-x86_64.extension}</classifier>
</dependency-->
</dependency>
</dependencies>

<build>