Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merge the master branch from tecent/ncnn #6

Merged
merged 86 commits into from
Dec 1, 2022
Merged

Merge the master branch from tecent/ncnn #6

merged 86 commits into from
Dec 1, 2022

Conversation

csukuangfj
Copy link
Owner

Use LSTM with projection from tecent/ncnn so that we can use use_packing_layout = true.

For a test wave, it reduces the decoding time from 14.33 seconds to 1.76 seconds.

MollySophia and others added 30 commits September 4, 2022 14:05
…nt#4144)

* Finish the gelu x86 intrinsics
* Finish the fast tanh x86 simd impl
Bumps [pypa/cibuildwheel](https://github.com/pypa/cibuildwheel) from 2.9.0 to 2.10.1.
- [Release notes](https://github.com/pypa/cibuildwheel/releases)
- [Changelog](https://github.com/pypa/cibuildwheel/blob/main/docs/changelog.md)
- [Commits](pypa/cibuildwheel@v2.9.0...v2.10.1)

---
updated-dependencies:
- dependency-name: pypa/cibuildwheel
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…ize_t&add clang ci (part Tencent#4100) (Tencent#4118)

* RVV: use size_t for vl

* RVV: replace vsseg.v tuple type by using regex

-----

search:
vsseg([1-9])e(8|16|32)_v_(f|i|u)\2m(1|2|4|8)x\1\(([ -~]+), vcreate_\3\2m\4x\1\(([ -~]+)\), vl\);

substitute by:
vsseg$1e$2_v_$3$2m$4($5, $6, vl);

* RVV: replace vssseg.v tuple types by using regex

---

search:
vssseg([1-9])e(8|16|32)_v_f\2m1x\1\(([ -~]+), vcreate_f\2m1x\1\(([ -~]+)\), vl\);

substitute by:
vssseg$1e$2_v_f$2m1($3, $4, vl);

* RVV: replace vlseg.v tuple types in load/store

* RVV: replace vloxseg2ei32.v tuple types

* RVV: add a wrapper for old compilers

* RVV: add segment load/store wrapper in pakcing

* RVV: fix cmake test

* RVV: make clang happy by dropping VLAs in sgemm

* RVV: add clang cmake toolchain configure

* RVV: add clang ci, riscv64-unknown-linux-gnu

Co-authored-by: thelastlin <thelastlin@users.noreply.github.com>
Co-authored-by: nihui <shuizhuyuanluo@126.com>
Bumps [pypa/cibuildwheel](https://github.com/pypa/cibuildwheel) from 2.10.1 to 2.10.2.
- [Release notes](https://github.com/pypa/cibuildwheel/releases)
- [Changelog](https://github.com/pypa/cibuildwheel/blob/main/docs/changelog.md)
- [Commits](pypa/cibuildwheel@v2.10.1...v2.10.2)

---
updated-dependencies:
- dependency-name: pypa/cibuildwheel
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
`cpuinfo`: 

```
isa             : rv64imafdcvsu
mmu             : sv39
cpu-freq                : 1.848Ghz
cpu-icache              : 64KB
cpu-dcache              : 64KB
cpu-l2cache             : 1MB
cpu-tlb         : 1024 4-ways
cpu-cacheline           : 64Bytes
cpu-vector              : 0.7.1
```

Compiled with `-DCMAKE_TOOLCHAIN_FILE=../toolchains/c910-v240.toolchain.cmake -DCMAKE_BUILD_TYPE=release -DNCNN_OPENMP=OFF -DNCNN_THREADS=OFF -DNCNN_RUNTIME_CPU=OFF -DNCNN_RVV=ON -DNCNN_SIMPLEOCV=ON -DNCNN_BUILD_EXAMPLES=ON` 

Seems much worse than expected 🤔
* fix param parsing issue when layer/blob name exceeds 255

* apply code-format changes

Co-authored-by: ZhangGe6 <ZhangGe6@users.noreply.github.com>
* Simple miss count for better space efficiency

* Simple double ended greedy;

* Add size drop threshold setter;

* set workspace allocator cr to zero as we had some sort of recylcing capability :P

Co-authored-by: LinHeLurking <LinHeLurking@users.noreply.github.com>
Co-authored-by: nihuini <nihuini@tencent.com>
…t#4274)

* fix compile warning with gcc 9.1.0 including simplestl.h file

* apply code-format changes

Co-authored-by: veahow <veahow@users.noreply.github.com>
nihui and others added 28 commits November 8, 2022 17:19

Co-authored-by: LRY89757 <LRY89757@users.noreply.github.com>
Co-authored-by: nihuini <nihuini@tencent.com>
Co-authored-by: nihui <shuizhuyuanluo@126.com>
* pnnx fuse more function to module

* rename some pass name

* fuse adjacent reshape, fuse pad conv2d

* fuse pad conv1d
* update release ci

* find modern glslang

* parallel jobs on windows
* add some c_api interfaces related to allocator setup.

* fix errors in allocator parameters in c_api.

* test c api allocator

Co-authored-by: zhangtongshe <yuyuyezi@vip.qq.com>
* Dedicated to coloring black and white photographs.
@csukuangfj csukuangfj merged commit 2655d0b into master Dec 1, 2022
@csukuangfj csukuangfj deleted the test-master branch December 1, 2022 04:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet