Merge the master branch from Tencent/ncnn #6

Merged 86 commits on Dec 1, 2022
1d7b217
remove duplicated newline (#4187)
MollySophia Sep 4, 2022
479a73a
remove duplicated newline (#4188)
Menci Sep 4, 2022
5148224
optmize softmax arm neon (#4171)
luqiang-guo Sep 13, 2022
b16f8ca
[docs] Fix typo (#4201)
LRY89757 Sep 15, 2022
9f59711
[Prelu x86] Finish intrinsic with elempack merged (#4177)
LRY89757 Sep 15, 2022
6167b72
changed size of images for pretty formatting of page (#4193)
magicse Sep 15, 2022
5eb56b2
[Gelu x86] Finish intrinsic with elempack merged(fast version) (#4144)
LRY89757 Sep 18, 2022
bfe27f2
Ignore .xmake directory (#4212)
zchrissirhcz Sep 21, 2022
183e6e9
Bump pypa/cibuildwheel from 2.9.0 to 2.10.1 (#4207)
dependabot[bot] Sep 21, 2022
d30fc82
style: space alignment (#4217)
tonori Sep 24, 2022
4f9e398
Ignore CMakeSettings.json, the Visual Studio CMake schema file (#4228)
zchrissirhcz Oct 1, 2022
e7eadca
RVV: use new interface for segment load/store & change word_type to s…
thelastlin Oct 1, 2022
bdcd68f
Bump pypa/cibuildwheel from 2.10.1 to 2.10.2 (#4220)
dependabot[bot] Oct 1, 2022
59a6fa3
add c906 build ci (#4232)
nihui Oct 3, 2022
eb9bb5d
Add benchmark result of T-Head TH1520 (#4240)
YuzukiTsuru Oct 6, 2022
3fce00b
fix param parsing issue when layer/blob name exceeds 255 (#4236)
ZhangGe6 Oct 7, 2022
9426e21
Memory Pool Improvement For Variadic Sized Inputs (#4190)
LinHeLurking Oct 9, 2022
bbbe17c
docs: disable fp16 when wrong results encountered caused by overflow …
MisakaBit Oct 9, 2022
cef95f6
pnnx math operation (#4251)
nihui Oct 9, 2022
3e2b3fa
more stricter armv7 fp16 and armv84 bf16 compiler check, fix #4147 fi…
nihui Oct 10, 2022
902954d
modified the param axes of expanddims in modelwriter (#4259)
LiuYi-Up Oct 10, 2022
0f38cb2
Add TH1520 (4*C910V) toolchain support. (#4267)
luyanaa Oct 13, 2022
77eda4c
implement lstm proj_size (#4263)
nihui Oct 14, 2022
b13c2a1
Optimize x86 DeformableConv2D (#4128)
miemie2013 Oct 14, 2022
f80c274
fix compile warning with gcc 9.1.0 including simplestl.h file (#4274)
veahow Oct 14, 2022
0df463a
add benchmark for rk3588 on rock5b (#4275)
hwdef Oct 17, 2022
270d6b2
linux-x64-cpu-gcc on tencent ci
nihui Oct 17, 2022
0b591b0
implement layer feature disabled bit (#4278)
nihui Oct 18, 2022
bb660d0
add elu vulkan operator (#4280)
Yoh-Z Oct 18, 2022
c62d256
fix tencent ci (#4277)
nihui Oct 19, 2022
5281d51
implement GLU and pnnx conversion (#4283)
csukuangfj Oct 19, 2022
549152c
Bump pypa/cibuildwheel from 2.10.2 to 2.11.1 (#4271)
dependabot[bot] Oct 19, 2022
777e4ef
fix pnnx softmax/normalize/slice negative axis conversion to ncnn (#4…
nihui Oct 19, 2022
f770987
pnnx glu batchindex aware conversion (#4285)
nihui Oct 19, 2022
c33cbc9
1. Fix typo in readme (#4287)
Zepan Oct 19, 2022
8eab5ea
x86 sse2/avx2 optimization for convolution sgemm/winograd int8 family…
nihui Oct 20, 2022
8edc03c
pnnx skip dynamic size evaluation (#4291)
nihui Oct 20, 2022
a116e00
Fix linux build error(#4265) (#4294)
bestpower Oct 21, 2022
512e584
general cpu feature detection on macos/ios, enable bf16 and i8mm on a…
nihui Oct 23, 2022
5ee276c
x86 unified fc fp32/fp16s (#4303)
nihui Oct 26, 2022
b17c9eb
Bump pypa/cibuildwheel from 2.11.1 to 2.11.2 (#4308)
dependabot[bot] Oct 28, 2022
fdf129f
pnnx pytorch 1.13 (#4314)
nihui Oct 29, 2022
9c6f110
fix #4315 (#4316)
nihui Oct 30, 2022
b853b3d
get_physical_cpu_count api family (#4302)
nihui Oct 31, 2022
6e49fa3
groupnorm 1d/2d/4d (#4312)
nihui Oct 31, 2022
0f9a3bb
fix slice end index, fix fp16 model weight alignment (#4317)
nihui Oct 31, 2022
a91411e
tencent ci test-coverage pnnx (#4305)
nihui Nov 1, 2022
31602bd
RVV: BatchNorm with fp16s(a) support (#4075)
thelastlin Nov 1, 2022
d1ac1de
RVV: InstanceNorm with fp16s(a) support (#4078)
thelastlin Nov 1, 2022
2ef57a6
fix ci pnnx build
nihui Nov 1, 2022
bcf06bd
fold new_full and full_like (#4323)
nihui Nov 2, 2022
b8d40a9
pnnx convert nn.Softmax2d (#4324)
nihui Nov 2, 2022
a12c24d
pnnx convert fold unfold (#4325)
nihui Nov 2, 2022
d522e78
support yolov5 6.2 (#4328)
shaoshengsong Nov 3, 2022
5b28c17
implement ncnn fold and unfold (#4326)
nihui Nov 3, 2022
92da26b
pnnx load gpu torchscript and reset device (#4330)
nihui Nov 4, 2022
abb2843
fix:pnnx-softmax (#4333)
EdVince Nov 5, 2022
cb88e16
pnnx save onnx zero (#4077)
nihui Nov 5, 2022
a7e3c62
save foldable constants in file for reducing memory usage (#4337)
nihui Nov 8, 2022
a2af636
match inplace slice copy pattern, rewrite copy uses (#4338)
nihui Nov 10, 2022
279222c
add vector optimization for loongarch64 (#4242)
junchao-loongson Nov 11, 2022
6019f47
ci loongarch64 lsx (#4344)
nihui Nov 11, 2022
6a47f8d
gridsample op support (#4288)
LRY89757 Nov 11, 2022
498ca73
squeeze and expanddims 4d (#4346)
nihui Nov 13, 2022
eceac35
implement MultiheadAttention kdim vdim (#4347)
nihui Nov 14, 2022
6967baa
pnnx convert torch bitwise left_shift right_shift (#4349)
nihui Nov 14, 2022
ec1b07c
pnnx fp16 option for ncnn and onnx weight type (#4350)
nihui Nov 14, 2022
aed05aa
pnnx fuse more function to module (#4351)
nihui Nov 16, 2022
057b5bb
split tests (#4354)
nihui Nov 17, 2022
1b83fe4
Support mat.numpy() in Python (#4356)
csukuangfj Nov 18, 2022
cdba4ae
Fix typo in stb_image.h (#4358)
eltociear Nov 19, 2022
a5e60ae
Fix windows-arm64 build for non-neon case (#4227)
zchrissirhcz Nov 19, 2022
6647396
update release ci (#4359)
nihui Nov 20, 2022
0736c5b
Fix c api allocator (#4360)
nihui Nov 21, 2022
f527fe8
update glslang (#4361)
nihui Nov 21, 2022
cf07bd9
disable out-of-line atomics since ndk23+ for resolving linking issue …
nihui Nov 23, 2022
8f9a524
I added one more project to the list of examples. (#4205)
magicse Nov 23, 2022
47c4ab7
add example project link (#4365)
shaoshengsong Nov 23, 2022
bdcbc37
fix(pybind11): build error (#4368)
tpoisonooo Nov 25, 2022
c934c6e
fix openmp affinity abort when cpu goes offline (#4370)
nihui Nov 26, 2022
03550ba
Update release-python.yml
nihui Nov 28, 2022
77ba238
small fixes
csukuangfj Dec 1, 2022
9e82a82
Merge remote-tracking branch 'github/master' into test-master
csukuangfj Dec 1, 2022
49fd80f
unpack list input
csukuangfj Dec 1, 2022
064831d
Remove LSTM2
csukuangfj Dec 1, 2022
2c6d916
fix LSTM
csukuangfj Dec 1, 2022
119 changes: 119 additions & 0 deletions .ci/linux-x64-cpu-gcc.yml
@@ -0,0 +1,119 @@
name: linux-x64-cpu-gcc
on:
  push:
    branches: [master]
    paths:
    - '.ci/linux-x64-cpu-gcc.yml'
    - 'CMakeLists.txt'
    - 'cmake/**'
    - 'src/*'
    - 'src/layer/*'
    - 'src/layer/x86/**'
    - 'tests/**'
    - 'tools/**'
    - '!tools/pnnx/**'
    - 'examples/**'
  mr:
    target-branches: [master]
    paths:
    - '.ci/linux-x64-cpu-gcc.yml'
    - 'CMakeLists.txt'
    - 'cmake/**'
    - 'src/*'
    - 'src/layer/*'
    - 'src/layer/x86/**'
    - 'tests/**'
    - 'tools/**'
    - '!tools/pnnx/**'
    - 'examples/**'
concurrency:
  group: linux-x64-cpu-gcc-${{ ci.head_ref }}

jobs:
  linux-gcc:
    name: linux-gcc
    strategy:
      matrix:
        include:
          - { SSE2: 'OFF', AVX: 'OFF', AVX2: 'OFF', AVX512: 'OFF' }
          - { SSE2: 'ON', AVX: 'OFF', AVX2: 'OFF', AVX512: 'OFF' }
          - { SSE2: 'ON', AVX: 'ON', AVX2: 'OFF', AVX512: 'OFF' }
          - { SSE2: 'ON', AVX: 'ON', AVX2: 'ON', AVX512: 'OFF' }
          - { SSE2: 'ON', AVX: 'ON', AVX2: 'ON', AVX512: 'ON' }

    runs-on:
      pool-name: docker
      container:
        image: bkci/ci:ubuntu
    steps:
      - name: checkout
        checkout: self
        with:
          strategy: FRESH_CHECKOUT
          enableSubmodule: false
          enableGitLfs: false

      - name: install-deps
        run: |
          apt-get update
          apt-get install -y libprotobuf-dev protobuf-compiler libopencv-dev

      - name: build
        run: |
          mkdir build && cd build
          cmake -DNCNN_SSE2=${{matrix.SSE2}} -DNCNN_AVX=${{matrix.AVX}} -DNCNN_AVX2=${{matrix.AVX2}} -DNCNN_AVX512=${{matrix.AVX512}} -DNCNN_BUILD_TESTS=ON ..
          cmake --build . -j $(nproc)
      - name: test
        run: cd build && ctest --output-on-failure -j $(nproc)
      - name: build-shared
        run: |
          mkdir build-shared && cd build-shared
          cmake -DNCNN_SSE2=${{matrix.SSE2}} -DNCNN_AVX=${{matrix.AVX}} -DNCNN_AVX2=${{matrix.AVX2}} -DNCNN_AVX512=${{matrix.AVX512}} -DNCNN_SHARED_LIB=ON ..
          cmake --build . -j $(nproc)
      - name: build-noint8
        run: |
          mkdir build-noint8 && cd build-noint8
          cmake -DNCNN_SSE2=${{matrix.SSE2}} -DNCNN_AVX=${{matrix.AVX}} -DNCNN_AVX2=${{matrix.AVX2}} -DNCNN_AVX512=${{matrix.AVX512}} -DNCNN_INT8=OFF -DNCNN_BUILD_TESTS=ON ..
          cmake --build . -j $(nproc)
      - name: test-noint8
        run: cd build-noint8 && ctest --output-on-failure -j $(nproc)

  linux-gcc-cpp03-nostdio-nostring-simplestl:
    runs-on:
      pool-name: docker
      container:
        image: bkci/ci:ubuntu
    steps:
      - name: checkout
        checkout: self
        with:
          strategy: FRESH_CHECKOUT
          enableSubmodule: false
          enableGitLfs: false

      - name: build-nostdio
        run: |
          mkdir build-nostdio && cd build-nostdio
          cmake -DCMAKE_TOOLCHAIN_FILE=../toolchains/host.gcc-c++03.toolchain.cmake -DNCNN_BUILD_TESTS=ON -DNCNN_BUILD_TOOLS=OFF -DNCNN_BUILD_EXAMPLES=OFF ..
          cmake --build . -j $(nproc)
      - name: test-nostdio
        run: cd build-nostdio && ctest --output-on-failure -j $(nproc)
      - name: build-nostdio-nostring
        run: |
          mkdir build-nostdio-nostring && cd build-nostdio-nostring
          cmake -DNCNN_STDIO=OFF -DNCNN_STRING=OFF -DNCNN_BUILD_TESTS=OFF -DNCNN_BUILD_BENCHMARK=OFF -DNCNN_BUILD_TOOLS=OFF -DNCNN_BUILD_EXAMPLES=OFF ..
          cmake --build . -j $(nproc)
      - name: build-simplestl
        run: |
          mkdir build-simplestl && cd build-simplestl
          cmake -DCMAKE_TOOLCHAIN_FILE=../toolchains/host-c.gcc.toolchain.cmake -DNCNN_STDIO=ON -DNCNN_STRING=ON -DNCNN_SIMPLESTL=ON -DNCNN_BUILD_TESTS=ON -DNCNN_BUILD_BENCHMARK=OFF -DNCNN_BUILD_TOOLS=OFF -DNCNN_BUILD_EXAMPLES=OFF ..
          cmake --build . -j $(nproc)
      - name: test-simplestl
        run: cd build-simplestl && ctest --output-on-failure -j $(nproc)
      - name: build-simplestl-simpleomp
        run: |
          mkdir build-simplestl-simpleomp && cd build-simplestl-simpleomp
          cmake -DCMAKE_TOOLCHAIN_FILE=../toolchains/host-c.gcc.toolchain.cmake -DNCNN_STDIO=ON -DNCNN_STRING=ON -DNCNN_SIMPLESTL=ON -DNCNN_SIMPLEOMP=ON -DNCNN_BUILD_TESTS=ON -DNCNN_BUILD_BENCHMARK=OFF -DNCNN_BUILD_TOOLS=OFF -DNCNN_BUILD_EXAMPLES=OFF ..
          cmake --build . -j $(nproc)
      - name: test-simplestl-simpleomp
        run: cd build-simplestl-simpleomp && ctest --output-on-failure -j $(nproc)
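Reviewer note: the `linux-gcc` job in `.ci/linux-x64-cpu-gcc.yml` runs its steps once per matrix entry, and each entry enables one more x86 instruction-set level than the previous one. The sketch below is not part of the PR. It is a rough Python illustration of how the five matrix rows would expand into the `cmake` configure lines of the `build` step, assuming the CI substitutes `${{matrix.X}}` textually. The names `MATRIX` and `configure_command` are hypothetical.

```python
# Hypothetical illustration only: mirrors the matrix rows listed in
# .ci/linux-x64-cpu-gcc.yml and prints the configure command per row.
MATRIX = [
    {"SSE2": "OFF", "AVX": "OFF", "AVX2": "OFF", "AVX512": "OFF"},
    {"SSE2": "ON", "AVX": "OFF", "AVX2": "OFF", "AVX512": "OFF"},
    {"SSE2": "ON", "AVX": "ON", "AVX2": "OFF", "AVX512": "OFF"},
    {"SSE2": "ON", "AVX": "ON", "AVX2": "ON", "AVX512": "OFF"},
    {"SSE2": "ON", "AVX": "ON", "AVX2": "ON", "AVX512": "ON"},
]


def configure_command(entry, extra_flags=("-DNCNN_BUILD_TESTS=ON",)):
    """Build the cmake configure line for one matrix entry."""
    # Each matrix key becomes a -DNCNN_<KEY>=<value> definition.
    simd_flags = [f"-DNCNN_{name}={value}" for name, value in entry.items()]
    return " ".join(["cmake", *simd_flags, *extra_flags, ".."])


commands = [configure_command(entry) for entry in MATRIX]
for command in commands:
    print(command)
```

Passing `extra_flags=("-DNCNN_SHARED_LIB=ON",)` or `extra_flags=("-DNCNN_INT8=OFF", "-DNCNN_BUILD_TESTS=ON")` would give the configure lines of the `build-shared` and `build-noint8` steps in the same way.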
120 changes: 120 additions & 0 deletions .ci/pnnx.yml
@@ -0,0 +1,120 @@
name: pnnx
on:
  push:
    branches: [master]
    paths:
    - '.ci/pnnx.yml'
    - 'tools/pnnx/**'
    - '!tools/pnnx/README.md'
  mr:
    target-branches: [master]
    paths:
    - '.ci/pnnx.yml'
    - 'tools/pnnx/**'
    - '!tools/pnnx/README.md'
concurrency:
  group: pnnx-${{ ci.head_ref }}

jobs:
  ubuntu:
    strategy:
      matrix:
        include:
          - torch-version: 1.8.1
            torchvision-version: 0.9.1
            torchvision-cache-key: '0_9_1'

          - torch-version: 1.9.1
            torchvision-version: 0.10.1
            torchvision-cache-key: '0_10_1'

          - torch-version: 1.10.0
            torchvision-version: 0.11.1
            torchvision-cache-key: '0_11_1'

          - torch-version: 1.11.0
            torchvision-version: 0.12.0
            torchvision-cache-key: '0_12_0'

          - torch-version: 1.12.0
            torchvision-version: 0.13.0
            torchvision-cache-key: '0_13_0'

          - torch-version: 1.13.0
            torchvision-version: 0.14.0
            torchvision-cache-key: '0_14_0'

    runs-on:
      pool-name: docker
      container:
        image: bkci/ci:ubuntu
    steps:
      - name: checkout
        checkout: self
        with:
          strategy: FRESH_CHECKOUT
          enableGitLfs: false

      - name: install-deps
        run: |
          apt-get update
          apt-get install -y python3-pip libjpeg-dev libpng-dev libprotobuf-dev protobuf-compiler
          python3 -m pip install --upgrade pip
          pip3 uninstall -y setuptools
          pip3 install -U pytest setuptools wheel twine distribute requests

      - name: setup pytorch
        run: |
          export PYTHONUSERBASE=${{ci.workspace}}/torch-${{matrix.torch-version}}
          pip3 install --user torch==${{matrix.torch-version}}+cpu torchvision==${{matrix.torchvision-version}}+cpu -f https://download.pytorch.org/whl/torch_stable.html

      - name: cache-torchvision
        id: cache-torchvision
        uses: cache@1.*
        with:
          cachePaths: torchvision-${{matrix.torchvision-version}}-install
          cacheKey: torchvision-${{matrix.torchvision-cache-key}}-linux-install-20211228
      - name: checkout-torchvision
        if: steps.cache-torchvision.outputs.cacheHit != 'true'
        checkout: https://github.com/pytorch/vision.git
        with:
          pullType: TAG
          refName: v${{matrix.torchvision-version}}
          localPath: vision
          enableSubmodule: false
          enableGitLfs: false
      - name: torchvision
        if: steps.cache-torchvision.outputs.cacheHit != 'true'
        run: |
          cd vision
          mkdir -p build; cd build
          cmake -DCMAKE_INSTALL_PREFIX=${{ci.workspace}}/torchvision-${{matrix.torchvision-version}}-install -DTorch_DIR=${{ci.workspace}}/torch-${{matrix.torch-version}}/lib/python3.9/site-packages/torch/share/cmake/Torch -DCMAKE_BUILD_TYPE=Release ..
          cmake --build . -j $(nproc)
          cmake --build . --target install

      - name: build-ncnn
        run: |
          export PYTHONUSERBASE=${{ci.workspace}}/torch-${{matrix.torch-version}}
          mkdir build && cd build
          cmake -DCMAKE_BUILD_TYPE=Release -DNCNN_PYTHON=ON -DNCNN_BUILD_TOOLS=OFF -DNCNN_BUILD_EXAMPLES=OFF ..
          cmake --build . -j $(nproc)
          cd ..
          export CMAKE_BUILD_PARALLEL_LEVEL=$(nproc)
          pip3 install --user .

      - name: build-pnnx
        run: |
          export PYTHONUSERBASE=${{ci.workspace}}/torch-${{matrix.torch-version}}
          cd tools/pnnx
          mkdir build && cd build
          cmake -DCMAKE_BUILD_TYPE=Release -DTorchVision_INSTALL_DIR=${{ci.workspace}}/torchvision-${{matrix.torchvision-version}}-install ..
          cmake --build . -j $(nproc)

      - name: test
        run: |
          export PYTHONUSERBASE=${{ci.workspace}}/torch-${{matrix.torch-version}}
          export OMP_NUM_THREADS=1
          export MKL_NUM_THREADS=1
          export MKL_ENABLE_INSTRUCTIONS=SSE4_2
          cd tools/pnnx
          cd build && ctest --output-on-failure -j 16
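Reviewer note: the `ubuntu` job in `.ci/pnnx.yml` pairs each torch release with a matching torchvision release, and in every matrix row the `torchvision-cache-key` value appears to be the torchvision version with dots replaced by underscores. The Python sketch below is not part of the PR. It reproduces the pairing table and derives the `cacheKey` string and the `pip3 install` requirement strings for one row, so a new row can be checked against the existing ones. `PAIRS`, `cache_key`, and `pip_requirements` are hypothetical names, and the dot-to-underscore rule is an observation from the rows above, not a documented convention.

```python
# Hypothetical illustration only: torch -> torchvision pairs as listed in
# the matrix of .ci/pnnx.yml.
PAIRS = {
    "1.8.1": "0.9.1",
    "1.9.1": "0.10.1",
    "1.10.0": "0.11.1",
    "1.11.0": "0.12.0",
    "1.12.0": "0.13.0",
    "1.13.0": "0.14.0",
}


def cache_key(torchvision_version, stamp="20211228"):
    """Derive the cacheKey string used by the cache-torchvision step."""
    # Assumption: the key part is the version with '.' replaced by '_'.
    key_part = torchvision_version.replace(".", "_")
    return f"torchvision-{key_part}-linux-install-{stamp}"


def pip_requirements(torch_version):
    """Return the CPU-only requirement strings for one matrix row."""
    torchvision_version = PAIRS[torch_version]
    return [
        f"torch=={torch_version}+cpu",
        f"torchvision=={torchvision_version}+cpu",
    ]


for torch_version, torchvision_version in PAIRS.items():
    print(torch_version, cache_key(torchvision_version))
```

Because the date stamp is part of the key, changing it would presumably invalidate every cached torchvision install at once, while adding a new matrix row only creates one new cache entry.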