Commit

Merge the master branch from Tencent/ncnn (#6)
* remove duplicated newline (Tencent#4187)

* remove duplicated newline (Tencent#4188)

* optimize softmax arm neon (Tencent#4171)

* [docs] Fix typo (Tencent#4201)

* [Prelu x86] Finish intrinsic with elempack merged (Tencent#4177)

* changed size of images for pretty formatting of page (Tencent#4193)

* [Gelu x86] Finish intrinsic with elempack merged(fast version) (Tencent#4144)

* Finish the gelu x86 intrinsics
* Finish the fast tanh x86 simd impl

* Ignore .xmake directory (Tencent#4212)

* Bump pypa/cibuildwheel from 2.9.0 to 2.10.1 (Tencent#4207)

Bumps [pypa/cibuildwheel](https://github.com/pypa/cibuildwheel) from 2.9.0 to 2.10.1.
- [Release notes](https://github.com/pypa/cibuildwheel/releases)
- [Changelog](https://github.com/pypa/cibuildwheel/blob/main/docs/changelog.md)
- [Commits](pypa/cibuildwheel@v2.9.0...v2.10.1)

---
updated-dependencies:
- dependency-name: pypa/cibuildwheel
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* style: space alignment (Tencent#4217)

* Ignore CMakeSettings.json, the Visual Studio CMake schema file (Tencent#4228)

* RVV: use new interface for segment load/store & change word_type to size_t & add clang ci (part Tencent#4100) (Tencent#4118)

* RVV: use size_t for vl

* RVV: replace vsseg.v tuple type by using regex

-----

search:
vsseg([1-9])e(8|16|32)_v_(f|i|u)\2m(1|2|4|8)x\1\(([ -~]+), vcreate_\3\2m\4x\1\(([ -~]+)\), vl\);

substitute by:
vsseg$1e$2_v_$3$2m$4($5, $6, vl);

* RVV: replace vssseg.v tuple types by using regex

---

search:
vssseg([1-9])e(8|16|32)_v_f\2m1x\1\(([ -~]+), vcreate_f\2m1x\1\(([ -~]+)\), vl\);

substitute by:
vssseg$1e$2_v_f$2m1($3, $4, vl);

* RVV: replace vlseg.v tuple types in load/store

* RVV: replace vloxseg2ei32.v tuple types

* RVV: add a wrapper for old compilers

* RVV: add segment load/store wrapper in packing

* RVV: fix cmake test

* RVV: make clang happy by dropping VLAs in sgemm

* RVV: add clang cmake toolchain configure

* RVV: add clang ci, riscv64-unknown-linux-gnu

Co-authored-by: thelastlin <thelastlin@users.noreply.github.com>
Co-authored-by: nihui <shuizhuyuanluo@126.com>
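
The two search/substitute pairs recorded in the RVV entry above (Tencent#4118) can be replayed outside an editor. The following is a minimal Python sketch, assuming the patterns are used exactly as written in the commit message, with the `$n` references translated to Python's `\g<n>` form. The function name and the two sample source lines are hypothetical and only illustrate the shape of the rewrite: the tuple-typed store with a nested `vcreate_*` call becomes a store that takes the vector registers as separate arguments.

```python
import re

# Patterns copied from the commit message; only the replacement syntax changed
# ("$1" in the editor becomes "\g<1>" in Python).
VSSEG_PATTERN = re.compile(
    r"vsseg([1-9])e(8|16|32)_v_(f|i|u)\2m(1|2|4|8)x\1"
    r"\(([ -~]+), vcreate_\3\2m\4x\1\(([ -~]+)\), vl\);"
)
VSSEG_REPLACEMENT = r"vsseg\g<1>e\g<2>_v_\g<3>\g<2>m\g<4>(\g<5>, \g<6>, vl);"

VSSSEG_PATTERN = re.compile(
    r"vssseg([1-9])e(8|16|32)_v_f\2m1x\1"
    r"\(([ -~]+), vcreate_f\2m1x\1\(([ -~]+)\), vl\);"
)
VSSSEG_REPLACEMENT = r"vssseg\g<1>e\g<2>_v_f\g<2>m1(\g<3>, \g<4>, vl);"


def rewrite_segment_stores(source: str) -> str:
    """Apply both rewrites: unit-stride stores first, then strided stores."""
    source = VSSEG_PATTERN.sub(VSSEG_REPLACEMENT, source)
    return VSSSEG_PATTERN.sub(VSSSEG_REPLACEMENT, source)


# Hypothetical input lines, not taken from the ncnn sources.
old_unit = "vsseg2e32_v_f32m1x2(outptr, vcreate_f32m1x2(_p0, _p1), vl);"
old_strided = "vssseg4e16_v_f16m1x4(ptr, stride, vcreate_f16m1x4(_a, _b, _c, _d), vl);"

new_unit = rewrite_segment_stores(old_unit)
new_strided = rewrite_segment_stores(old_strided)
print(new_unit)
print(new_strided)
```

The backreferences `\1` and `\2` inside the patterns force the segment count and element width to agree between the store intrinsic and the `vcreate_*` call, so mismatched pairs are left untouched.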

* Bump pypa/cibuildwheel from 2.10.1 to 2.10.2 (Tencent#4220)

Bumps [pypa/cibuildwheel](https://github.com/pypa/cibuildwheel) from 2.10.1 to 2.10.2.
- [Release notes](https://github.com/pypa/cibuildwheel/releases)
- [Changelog](https://github.com/pypa/cibuildwheel/blob/main/docs/changelog.md)
- [Commits](pypa/cibuildwheel@v2.10.1...v2.10.2)

---
updated-dependencies:
- dependency-name: pypa/cibuildwheel
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* add c906 build ci (Tencent#4232)

* Add benchmark result of T-Head TH1520 (Tencent#4240)

`cpuinfo`: 

```
isa             : rv64imafdcvsu
mmu             : sv39
cpu-freq                : 1.848Ghz
cpu-icache              : 64KB
cpu-dcache              : 64KB
cpu-l2cache             : 1MB
cpu-tlb         : 1024 4-ways
cpu-cacheline           : 64Bytes
cpu-vector              : 0.7.1
```

Compiled with `-DCMAKE_TOOLCHAIN_FILE=../toolchains/c910-v240.toolchain.cmake -DCMAKE_BUILD_TYPE=release -DNCNN_OPENMP=OFF -DNCNN_THREADS=OFF -DNCNN_RUNTIME_CPU=OFF -DNCNN_RVV=ON -DNCNN_SIMPLEOCV=ON -DNCNN_BUILD_EXAMPLES=ON` 

Seems much worse than expected 🤔

* fix param parsing issue when layer/blob name exceeds 255 (Tencent#4236)

* fix param parsing issue when layer/blob name exceeds 255

* apply code-format changes

Co-authored-by: ZhangGe6 <ZhangGe6@users.noreply.github.com>

* Memory Pool Improvement For Variadic Sized Inputs (Tencent#4190)

* Simple miss count for better space efficiency

* Simple double ended greedy;

* Add size drop threshold setter;

* set workspace allocator cr to zero as we had some sort of recycling capability :P

Co-authored-by: LinHeLurking <LinHeLurking@users.noreply.github.com>
Co-authored-by: nihuini <nihuini@tencent.com>

* docs: disable fp16 when wrong results encountered caused by overflow (Tencent#4248)

* pnnx math operation (Tencent#4251)

* stricter armv7 fp16 and armv84 bf16 compiler check, fix Tencent#4147 fix Tencent#4222 (Tencent#4247)

* modified the param axes of expanddims in modelwriter (Tencent#4259)

* Add TH1520 (4*C910V) toolchain support.  (Tencent#4267)

* implement lstm proj_size (Tencent#4263)

* Optimize x86 DeformableConv2D (Tencent#4128)

* fix compile warning with gcc 9.1.0 including simplestl.h file (Tencent#4274)

* fix compile warning with gcc 9.1.0 including simplestl.h file

* apply code-format changes

Co-authored-by: veahow <veahow@users.noreply.github.com>

* add benchmark for rk3588 on rock5b (Tencent#4275)

* linux-x64-cpu-gcc on tencent ci

* implement layer feature disabled bit (Tencent#4278)

* add elu vulkan operator (Tencent#4280)

* fix tencent ci (Tencent#4277)

* implement GLU and pnnx conversion (Tencent#4283)

* Bump pypa/cibuildwheel from 2.10.2 to 2.11.1 (Tencent#4271)

Bumps [pypa/cibuildwheel](https://github.com/pypa/cibuildwheel) from 2.10.2 to 2.11.1.
- [Release notes](https://github.com/pypa/cibuildwheel/releases)
- [Changelog](https://github.com/pypa/cibuildwheel/blob/main/docs/changelog.md)
- [Commits](pypa/cibuildwheel@v2.10.2...v2.11.1)

---
updated-dependencies:
- dependency-name: pypa/cibuildwheel
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* fix pnnx softmax/normalize/slice negative axis conversion to ncnn (Tencent#4284)

* pnnx glu batchindex aware conversion (Tencent#4285)

* 1. Fix typo in readme (Tencent#4287)

* x86 sse2/avx2 optimization for convolution sgemm/winograd int8 family (Tencent#4286)

* pnnx skip dynamic size evaluation (Tencent#4291)

* Fix linux build error(Tencent#4265) (Tencent#4294)

Co-authored-by: wangyu <786794414@qq.com>

* general cpu feature detection on macos/ios, enable bf16 and i8mm on a15 a16 and m2 (Tencent#4300)

* x86 unified fc fp32/fp16s (Tencent#4303)

* more fma
* more transpose utility function

* Bump pypa/cibuildwheel from 2.11.1 to 2.11.2 (Tencent#4308)

Bumps [pypa/cibuildwheel](https://github.com/pypa/cibuildwheel) from 2.11.1 to 2.11.2.
- [Release notes](https://github.com/pypa/cibuildwheel/releases)
- [Changelog](https://github.com/pypa/cibuildwheel/blob/main/docs/changelog.md)
- [Commits](pypa/cibuildwheel@v2.11.1...v2.11.2)

---
updated-dependencies:
- dependency-name: pypa/cibuildwheel
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* pnnx pytorch 1.13 (Tencent#4314)

* fix Tencent#4315 (Tencent#4316)

* get_physical_cpu_count api family (Tencent#4302)

* get_physical_cpu_count api family

* set default to physical big cpu

* always treat smt core as big core

* is_smt_cpu

* get max freq mhz on windows

* windows thread affinity

* groupnorm 1d/2d/4d (Tencent#4312)

* fix slice end index, fix fp16 model weight alignment (Tencent#4317)

* tencent ci test-coverage pnnx (Tencent#4305)

* RVV: BatchNorm with fp16s(a) support (Tencent#4075)

* RVV: InstanceNorm with fp16s(a) support (Tencent#4078)

* fix ci pnnx build

* fold new_full and full_like (Tencent#4323)

* pnnx convert nn.Softmax2d (Tencent#4324)

* pnnx convert fold unfold (Tencent#4325)

* support yolov5 6.2 (Tencent#4328)

* implement ncnn fold and unfold (Tencent#4326)

* pnnx load gpu torchscript and reset device (Tencent#4330)

* fix:pnnx-softmax (Tencent#4333)

* pnnx save onnx zero (Tencent#4077)

* save foldable constants in file for reducing memory usage (Tencent#4337)

* match inplace slice copy pattern, rewrite copy uses (Tencent#4338)

* add vector optimization for loongarch64 (Tencent#4242)

* ci loongarch64 lsx (Tencent#4344)

* gridsample op support (Tencent#4288)



Co-authored-by: LRY89757 <LRY89757@users.noreply.github.com>
Co-authored-by: nihuini <nihuini@tencent.com>
Co-authored-by: nihui <shuizhuyuanluo@126.com>

* squeeze and expanddims 4d (Tencent#4346)

* implement MultiheadAttention kdim vdim (Tencent#4347)

* pnnx convert torch bitwise left_shift right_shift (Tencent#4349)

* pnnx fp16 option for ncnn and onnx weight type (Tencent#4350)

* pnnx fuse more function to module (Tencent#4351)

* pnnx fuse more function to module

* rename some pass name

* fuse adjacent reshape, fuse pad conv2d

* fuse pad conv1d

* split tests (Tencent#4354)

* Support mat.numpy() in Python (Tencent#4356)

* Fix typo in stb_image.h (Tencent#4358)

exitting -> exiting

* Fix windows-arm64 build for non-neon case (Tencent#4227)

* update release ci (Tencent#4359)

* update release ci

* find modern glslang

* parallel jobs on windows

* Fix c api allocator (Tencent#4360)

* add some c_api interfaces related to allocator setup.

* fix errors in allocator parameters in c_api.

* test c api allocator

Co-authored-by: zhangtongshe <yuyuyezi@vip.qq.com>

* update glslang (Tencent#4361)

* disable out-of-line atomics since ndk23+ for resolving linking issue with old ndk (Tencent#4362)

* I added one more project to the list of examples. (Tencent#4205)

* Dedicated to coloring black and white photographs.

* add example project link (Tencent#4365)

* fix(pybind11): build error (Tencent#4368)

* fix openmp affinity abort when cpu goes offline (Tencent#4370)

* Update release-python.yml

* small fixes

* unpack list input

* Remove LSTM2

* fix LSTM

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Molly Sophia <mollysophia379@gmail.com>
Co-authored-by: Menci <huanghaorui301@gmail.com>
Co-authored-by: luqiang guo <702572275@qq.com>
Co-authored-by: Lry89757 <77330637+LRY89757@users.noreply.github.com>
Co-authored-by: magicse <magicse@users.noreply.github.com>
Co-authored-by: Zhuo Zhang <imzhuo@foxmail.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: 汤圆奶昔 <47135403+tonori@users.noreply.github.com>
Co-authored-by: Xavier Hsinyuan <me@lstlx.com>
Co-authored-by: thelastlin <thelastlin@users.noreply.github.com>
Co-authored-by: nihui <shuizhuyuanluo@126.com>
Co-authored-by: 柚木鉉 <740291272@qq.com>
Co-authored-by: Zhang Ge <sjtu.zg123@gmail.com>
Co-authored-by: ZhangGe6 <ZhangGe6@users.noreply.github.com>
Co-authored-by: LinHe <LinHe.Lurking@gmail.com>
Co-authored-by: LinHeLurking <LinHeLurking@users.noreply.github.com>
Co-authored-by: nihuini <nihuini@tencent.com>
Co-authored-by: MisakaBit <MisakaBit@gmail.com>
Co-authored-by: LiuYi-Up <73060646+LiuYi-Up@users.noreply.github.com>
Co-authored-by: 陸 言 <robinluaa@outlook.com>
Co-authored-by: miemie2013 <53960695+miemie2013@users.noreply.github.com>
Co-authored-by: Eahow Chen <15228088+veahow@users.noreply.github.com>
Co-authored-by: veahow <veahow@users.noreply.github.com>
Co-authored-by: li mengyang <hwdefcom@outlook.com>
Co-authored-by: Yoh <wpz_yoh@163.com>
Co-authored-by: Caize Wu <zepanwucai@gmail.com>
Co-authored-by: bestpower <wangyu117136@gmail.com>
Co-authored-by: wangyu <786794414@qq.com>
Co-authored-by: shaoshengsong <30892500+shaoshengsong@users.noreply.github.com>
Co-authored-by: WuJinxuan <2456510228@qq.com>
Co-authored-by: junchao-loongson <68935141+junchao-loongson@users.noreply.github.com>
Co-authored-by: LRY89757 <LRY89757@users.noreply.github.com>
Co-authored-by: Ikko Ashimine <eltociear@gmail.com>
Co-authored-by: zhangtongshe <yuyuyezi@vip.qq.com>
Co-authored-by: tpoisonooo <khj.application@aliyun.com>
1 parent 5354c63 commit 2655d0b
Showing 592 changed files with 64,480 additions and 12,976 deletions.
119 changes: 119 additions & 0 deletions .ci/linux-x64-cpu-gcc.yml
@@ -0,0 +1,119 @@
name: linux-x64-cpu-gcc
on:
  push:
    branches: [master]
    paths:
      - '.ci/linux-x64-cpu-gcc.yml'
      - 'CMakeLists.txt'
      - 'cmake/**'
      - 'src/*'
      - 'src/layer/*'
      - 'src/layer/x86/**'
      - 'tests/**'
      - 'tools/**'
      - '!tools/pnnx/**'
      - 'examples/**'
  mr:
    target-branches: [master]
    paths:
      - '.ci/linux-x64-cpu-gcc.yml'
      - 'CMakeLists.txt'
      - 'cmake/**'
      - 'src/*'
      - 'src/layer/*'
      - 'src/layer/x86/**'
      - 'tests/**'
      - 'tools/**'
      - '!tools/pnnx/**'
      - 'examples/**'
concurrency:
  group: linux-x64-cpu-gcc-${{ ci.head_ref }}

jobs:
  linux-gcc:
    name: linux-gcc
    strategy:
      matrix:
        include:
          - { SSE2: 'OFF', AVX: 'OFF', AVX2: 'OFF', AVX512: 'OFF' }
          - { SSE2: 'ON', AVX: 'OFF', AVX2: 'OFF', AVX512: 'OFF' }
          - { SSE2: 'ON', AVX: 'ON', AVX2: 'OFF', AVX512: 'OFF' }
          - { SSE2: 'ON', AVX: 'ON', AVX2: 'ON', AVX512: 'OFF' }
          - { SSE2: 'ON', AVX: 'ON', AVX2: 'ON', AVX512: 'ON' }

    runs-on:
      pool-name: docker
      container:
        image: bkci/ci:ubuntu
    steps:
      - name: checkout
        checkout: self
        with:
          strategy: FRESH_CHECKOUT
          enableSubmodule: false
          enableGitLfs: false

      - name: install-deps
        run: |
          apt-get update
          apt-get install -y libprotobuf-dev protobuf-compiler libopencv-dev
      - name: build
        run: |
          mkdir build && cd build
          cmake -DNCNN_SSE2=${{matrix.SSE2}} -DNCNN_AVX=${{matrix.AVX}} -DNCNN_AVX2=${{matrix.AVX2}} -DNCNN_AVX512=${{matrix.AVX512}} -DNCNN_BUILD_TESTS=ON ..
          cmake --build . -j $(nproc)
      - name: test
        run: cd build && ctest --output-on-failure -j $(nproc)
      - name: build-shared
        run: |
          mkdir build-shared && cd build-shared
          cmake -DNCNN_SSE2=${{matrix.SSE2}} -DNCNN_AVX=${{matrix.AVX}} -DNCNN_AVX2=${{matrix.AVX2}} -DNCNN_AVX512=${{matrix.AVX512}} -DNCNN_SHARED_LIB=ON ..
          cmake --build . -j $(nproc)
      - name: build-noint8
        run: |
          mkdir build-noint8 && cd build-noint8
          cmake -DNCNN_SSE2=${{matrix.SSE2}} -DNCNN_AVX=${{matrix.AVX}} -DNCNN_AVX2=${{matrix.AVX2}} -DNCNN_AVX512=${{matrix.AVX512}} -DNCNN_INT8=OFF -DNCNN_BUILD_TESTS=ON ..
          cmake --build . -j $(nproc)
      - name: test-noint8
        run: cd build-noint8 && ctest --output-on-failure -j $(nproc)

  linux-gcc-cpp03-nostdio-nostring-simplestl:
    runs-on:
      pool-name: docker
      container:
        image: bkci/ci:ubuntu
    steps:
      - name: checkout
        checkout: self
        with:
          strategy: FRESH_CHECKOUT
          enableSubmodule: false
          enableGitLfs: false

      - name: build-nostdio
        run: |
          mkdir build-nostdio && cd build-nostdio
          cmake -DCMAKE_TOOLCHAIN_FILE=../toolchains/host.gcc-c++03.toolchain.cmake -DNCNN_BUILD_TESTS=ON -DNCNN_BUILD_TOOLS=OFF -DNCNN_BUILD_EXAMPLES=OFF ..
          cmake --build . -j $(nproc)
      - name: test-nostdio
        run: cd build-nostdio && ctest --output-on-failure -j $(nproc)
      - name: build-nostdio-nostring
        run: |
          mkdir build-nostdio-nostring && cd build-nostdio-nostring
          cmake -DNCNN_STDIO=OFF -DNCNN_STRING=OFF -DNCNN_BUILD_TESTS=OFF -DNCNN_BUILD_BENCHMARK=OFF -DNCNN_BUILD_TOOLS=OFF -DNCNN_BUILD_EXAMPLES=OFF ..
          cmake --build . -j $(nproc)
      - name: build-simplestl
        run: |
          mkdir build-simplestl && cd build-simplestl
          cmake -DCMAKE_TOOLCHAIN_FILE=../toolchains/host-c.gcc.toolchain.cmake -DNCNN_STDIO=ON -DNCNN_STRING=ON -DNCNN_SIMPLESTL=ON -DNCNN_BUILD_TESTS=ON -DNCNN_BUILD_BENCHMARK=OFF -DNCNN_BUILD_TOOLS=OFF -DNCNN_BUILD_EXAMPLES=OFF ..
          cmake --build . -j $(nproc)
      - name: test-simplestl
        run: cd build-simplestl && ctest --output-on-failure -j $(nproc)
      - name: build-simplestl-simpleomp
        run: |
          mkdir build-simplestl-simpleomp && cd build-simplestl-simpleomp
          cmake -DCMAKE_TOOLCHAIN_FILE=../toolchains/host-c.gcc.toolchain.cmake -DNCNN_STDIO=ON -DNCNN_STRING=ON -DNCNN_SIMPLESTL=ON -DNCNN_SIMPLEOMP=ON -DNCNN_BUILD_TESTS=ON -DNCNN_BUILD_BENCHMARK=OFF -DNCNN_BUILD_TOOLS=OFF -DNCNN_BUILD_EXAMPLES=OFF ..
          cmake --build . -j $(nproc)
      - name: test-simplestl-simpleomp
        run: cd build-simplestl-simpleomp && ctest --output-on-failure -j $(nproc)
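
To see which configure commands the `linux-gcc` matrix in `.ci/linux-x64-cpu-gcc.yml` produces, the `${{matrix.*}}` placeholders of the `build` step can be expanded by hand. The following is a rough Python sketch, assuming plain textual substitution of each matrix value into the `cmake` line; the function names are hypothetical and this is not how the CI service itself is implemented. It also checks that the five rows are cumulative, meaning each ISA level is only switched on when all lower levels are on.

```python
# Matrix rows copied from the linux-gcc job; key order is SSE2 < AVX < AVX2 < AVX512.
MATRIX = [
    {"SSE2": "OFF", "AVX": "OFF", "AVX2": "OFF", "AVX512": "OFF"},
    {"SSE2": "ON", "AVX": "OFF", "AVX2": "OFF", "AVX512": "OFF"},
    {"SSE2": "ON", "AVX": "ON", "AVX2": "OFF", "AVX512": "OFF"},
    {"SSE2": "ON", "AVX": "ON", "AVX2": "ON", "AVX512": "OFF"},
    {"SSE2": "ON", "AVX": "ON", "AVX2": "ON", "AVX512": "ON"},
]


def configure_command(row, extra=("-DNCNN_BUILD_TESTS=ON",)):
    """Build the cmake configure line of the `build` step for one matrix row."""
    isa_flags = [f"-DNCNN_{name}={value}" for name, value in row.items()]
    return " ".join(["cmake", *isa_flags, *extra, ".."])


def is_cumulative(row):
    """True if no ISA level is ON after a lower level is OFF."""
    values = list(row.values())
    first_off = values.index("OFF") if "OFF" in values else len(values)
    return all(value == "OFF" for value in values[first_off:])


commands = [configure_command(row) for row in MATRIX]
for command in commands:
    print(command)
```

Passing `extra=("-DNCNN_INT8=OFF", "-DNCNN_BUILD_TESTS=ON")` gives the corresponding line of the `build-noint8` step.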
120 changes: 120 additions & 0 deletions .ci/pnnx.yml
@@ -0,0 +1,120 @@
name: pnnx
on:
  push:
    branches: [master]
    paths:
      - '.ci/pnnx.yml'
      - 'tools/pnnx/**'
      - '!tools/pnnx/README.md'
  mr:
    target-branches: [master]
    paths:
      - '.ci/pnnx.yml'
      - 'tools/pnnx/**'
      - '!tools/pnnx/README.md'
concurrency:
  group: pnnx-${{ ci.head_ref }}

jobs:
  ubuntu:
    strategy:
      matrix:
        include:
          - torch-version: 1.8.1
            torchvision-version: 0.9.1
            torchvision-cache-key: '0_9_1'

          - torch-version: 1.9.1
            torchvision-version: 0.10.1
            torchvision-cache-key: '0_10_1'

          - torch-version: 1.10.0
            torchvision-version: 0.11.1
            torchvision-cache-key: '0_11_1'

          - torch-version: 1.11.0
            torchvision-version: 0.12.0
            torchvision-cache-key: '0_12_0'

          - torch-version: 1.12.0
            torchvision-version: 0.13.0
            torchvision-cache-key: '0_13_0'

          - torch-version: 1.13.0
            torchvision-version: 0.14.0
            torchvision-cache-key: '0_14_0'

    runs-on:
      pool-name: docker
      container:
        image: bkci/ci:ubuntu
    steps:
      - name: checkout
        checkout: self
        with:
          strategy: FRESH_CHECKOUT
          enableGitLfs: false

      - name: install-deps
        run: |
          apt-get update
          apt-get install -y python3-pip libjpeg-dev libpng-dev libprotobuf-dev protobuf-compiler
          python3 -m pip install --upgrade pip
          pip3 uninstall -y setuptools
          pip3 install -U pytest setuptools wheel twine distribute requests
      - name: setup pytorch
        run: |
          export PYTHONUSERBASE=${{ci.workspace}}/torch-${{matrix.torch-version}}
          pip3 install --user torch==${{matrix.torch-version}}+cpu torchvision==${{matrix.torchvision-version}}+cpu -f https://download.pytorch.org/whl/torch_stable.html
      - name: cache-torchvision
        id: cache-torchvision
        uses: cache@1.*
        with:
          cachePaths: torchvision-${{matrix.torchvision-version}}-install
          cacheKey: torchvision-${{matrix.torchvision-cache-key}}-linux-install-20211228
      - name: checkout-torchvision
        if: steps.cache-torchvision.outputs.cacheHit != 'true'
        checkout: https://github.com/pytorch/vision.git
        with:
          pullType: TAG
          refName: v${{matrix.torchvision-version}}
          localPath: vision
          enableSubmodule: false
          enableGitLfs: false
      - name: torchvision
        if: steps.cache-torchvision.outputs.cacheHit != 'true'
        run: |
          cd vision
          mkdir -p build; cd build
          cmake -DCMAKE_INSTALL_PREFIX=${{ci.workspace}}/torchvision-${{matrix.torchvision-version}}-install -DTorch_DIR=${{ci.workspace}}/torch-${{matrix.torch-version}}/lib/python3.9/site-packages/torch/share/cmake/Torch -DCMAKE_BUILD_TYPE=Release ..
          cmake --build . -j $(nproc)
          cmake --build . --target install
      - name: build-ncnn
        run: |
          export PYTHONUSERBASE=${{ci.workspace}}/torch-${{matrix.torch-version}}
          mkdir build && cd build
          cmake -DCMAKE_BUILD_TYPE=Release -DNCNN_PYTHON=ON -DNCNN_BUILD_TOOLS=OFF -DNCNN_BUILD_EXAMPLES=OFF ..
          cmake --build . -j $(nproc)
          cd ..
          export CMAKE_BUILD_PARALLEL_LEVEL=$(nproc)
          pip3 install --user .
      - name: build-pnnx
        run: |
          export PYTHONUSERBASE=${{ci.workspace}}/torch-${{matrix.torch-version}}
          cd tools/pnnx
          mkdir build && cd build
          cmake -DCMAKE_BUILD_TYPE=Release -DTorchVision_INSTALL_DIR=${{ci.workspace}}/torchvision-${{matrix.torchvision-version}}-install ..
          cmake --build . -j $(nproc)
      - name: test
        run: |
          export PYTHONUSERBASE=${{ci.workspace}}/torch-${{matrix.torch-version}}
          export OMP_NUM_THREADS=1
          export MKL_NUM_THREADS=1
          export MKL_ENABLE_INSTRUCTIONS=SSE4_2
          cd tools/pnnx
          cd build && ctest --output-on-failure -j 16
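
Each row of the `pnnx` matrix pins one torch release to one torchvision release and carries a hand-written `torchvision-cache-key`. In every row shown, that key is the torchvision version with dots replaced by underscores. The following Python sketch, with hypothetical helper names, reproduces the per-row `pip3 install` line and the `cacheKey` string under that assumption; the version pairs and the `20211228` suffix are copied from the workflow above.

```python
# torch -> torchvision pairs copied from the matrix in .ci/pnnx.yml.
VERSION_PAIRS = {
    "1.8.1": "0.9.1",
    "1.9.1": "0.10.1",
    "1.10.0": "0.11.1",
    "1.11.0": "0.12.0",
    "1.12.0": "0.13.0",
    "1.13.0": "0.14.0",
}

WHEEL_INDEX = "https://download.pytorch.org/whl/torch_stable.html"


def cache_key_fragment(torchvision_version):
    """Derive the matrix key, e.g. '0.9.1' -> '0_9_1' (pattern observed in the file)."""
    return torchvision_version.replace(".", "_")


def cache_key(torchvision_version):
    """Full cacheKey string as written in the cache-torchvision step."""
    fragment = cache_key_fragment(torchvision_version)
    return f"torchvision-{fragment}-linux-install-20211228"


def pip_install_command(torch_version):
    """CPU-only install line of the 'setup pytorch' step for one matrix row."""
    torchvision_version = VERSION_PAIRS[torch_version]
    return (
        f"pip3 install --user torch=={torch_version}+cpu "
        f"torchvision=={torchvision_version}+cpu -f {WHEEL_INDEX}"
    )


for torch_version, torchvision_version in VERSION_PAIRS.items():
    print(pip_install_command(torch_version))
    print(cache_key(torchvision_version))
```

Because the key embeds the torchvision version, a new matrix row gets its own cache entry, and the `checkout-torchvision` and `torchvision` steps only run when that entry is missing.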