Bugfix and speed up

Pre-release
@jxt1234 jxt1234 released this 06 Jan 08:20
· 727 commits to master since this release

Bugfix:

  1. Fix a memory leak when creating a session for some models.
  2. Fix several bugs in the Metal backend.
  3. Small bugfix for the iOS demo.
  4. Fix the 3D BatchNormal Module not working with MNNTrain.
  5. Fix a memory leak in CPUInterp.
  6. Fix a stack error in MNNPackedMatMulRemain.S.
  7. Fix the SSE branch using AVX instructions.

Optimize:

  1. Reduce buffer creation during Metal execution.
  2. Reduce memory copies in CPUBatchMatMul.
  3. Reduce memory copies for Module.
  4. Use NEON to optimize CPUTopKV2.
  5. Reduce memory usage of the CUDA backend via split and merge.

Feature:

  1. Support multi-instance for Module.
  2. Add several CI scripts.
  3. More ops for the ARM82 backend ()
  4. More ops for CUDA (ArgMax, BatchMatMul, GatherV2, LayerNorm ......)