HapKoM/neonext


Official PyTorch implementation of NeoNeXt models.

NeoNeXt: Novel neural network operator and architecture based on the patch-wise matrix multiplications.

Vladimir Korviakov, Denis Koposov

Environment

  • CUDA 12.1
  • PyTorch 2.3.1
  • torchvision 0.18.1

Run

An example training command:

pip install -r requirements.txt
bash scripts/run_neonext_imagenet_local.sh

An example inference command:

bash scripts/validate.sh

Implementation details

Models are defined in ptvision/models/neonext/neonxet.py. Several models are currently considered the best candidates (not final):

  • NeoNeXt-T
  • NeoNeXt-S
  • NeoNeXt-B
  • NeoNeXt-L

(The suffix in NeoNeXt-X is just a label for the model variant.)

Each model has a different number of blocks per stage and a different number of channels.

The NeoNeXt block is similar to the ConvNeXt block, but a NeoCell is used instead of the depthwise convolution.

NeoCells are also used for down-sampling: in the stem, between stages, and after the final feature map.
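To make the block structure concrete, here is a minimal sketch in PyTorch. It assumes a ConvNeXt-style layout (LayerNorm, 4× pointwise expansion, GELU, residual connection) with a toy NeoCell in place of the depthwise convolution; the class names and the simplification that the spatial size equals the matrix size are illustrative assumptions, not the repository's actual implementation:

```python
import torch
import torch.nn as nn

class NeoCell(nn.Module):
    """Toy NeoCell: per-channel Y = A @ X @ B with square matrices.
    Simplified so that H == W == kernel (no block-diagonal tiling)."""
    def __init__(self, channels, kernel):
        super().__init__()
        eye = torch.eye(kernel)
        # One (kernel x kernel) pair of trainable matrices per channel,
        # initialized near identity so the block starts close to a pass-through.
        self.A = nn.Parameter(eye.repeat(channels, 1, 1) + 0.01 * torch.randn(channels, kernel, kernel))
        self.B = nn.Parameter(eye.repeat(channels, 1, 1) + 0.01 * torch.randn(channels, kernel, kernel))

    def forward(self, x):            # x: (N, C, H, W), H == W == kernel here
        return self.A @ x @ self.B   # batched matmul broadcasts over N

class NeoNeXtBlock(nn.Module):
    """ConvNeXt-style block with NeoCell replacing the depthwise conv (sketch)."""
    def __init__(self, dim, kernel):
        super().__init__()
        self.neocell = NeoCell(dim, kernel)
        self.norm = nn.LayerNorm(dim)
        self.pwconv1 = nn.Linear(dim, 4 * dim)  # pointwise expansion
        self.act = nn.GELU()
        self.pwconv2 = nn.Linear(4 * dim, dim)  # pointwise projection

    def forward(self, x):                       # x: (N, C, H, W)
        shortcut = x
        x = self.neocell(x)
        x = x.permute(0, 2, 3, 1)               # channels-last for LayerNorm/Linear
        x = self.pwconv2(self.act(self.pwconv1(self.norm(x))))
        x = x.permute(0, 3, 1, 2)
        return shortcut + x
```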

The NeoCell implementation can be found in ptvision/models/neonext/neonxet_utils.py; an optimized implementation of the NeoCell functions using the PyTorch C++ API can be found in ptvision/models/neonext/csrc/neocell.cpp.

Given an input X of shape NxCxHxW, NeoCell performs channel-wise matrix multiplications using two trainable matrices A and B (a separate pair for each channel): Y = A·X·B.

All input channels are split into several groups (the number of channels may differ between groups).

Each group is processed by matrices of the same size.
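The grouped channel-wise multiplication can be sketched as follows. The function name and the calling convention are hypothetical, and the matrices are full-size (matching the spatial dimensions) rather than block-diagonal kernels as in the real implementation:

```python
import torch

def neocell_grouped(x, mats):
    """Sketch of grouped NeoCell (illustrative, not the repo's API).
    x: (N, C, H, W); mats: list of (A, B) pairs, one pair of parameter
    tensors per group. A: (g, H, H), B: (g, W, W) for a group of g channels,
    so square matrices leave the spatial size unchanged."""
    sizes = [A.shape[0] for A, _ in mats]           # channels per group
    groups = torch.split(x, sizes, dim=1)
    out = [A @ g @ B for g, (A, B) in zip(groups, mats)]
    return torch.cat(out, dim=1)

# Two groups (3 and 5 channels); each group uses matrices of one size.
x = torch.randn(2, 8, 7, 7)
mats = [(torch.randn(3, 7, 7), torch.randn(3, 7, 7)),
        (torch.randn(5, 7, 7), torch.randn(5, 7, 7))]
y = neocell_grouped(x, mats)
print(y.shape)  # torch.Size([2, 8, 7, 7])
```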

If the "kernel" parameter is set, both A and B are square matrices of size kernel×kernel, and the spatial size of the data is unchanged.

If the "h_in", "h_out", "w_in", "w_out" parameters are set, A has size h_out×h_in and B has size w_in×w_out, so the spatial size of the data can change (either increase or decrease).
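A quick shape check illustrates the rectangular case, here down-sampling 8×8 feature maps to 4×4 (the tensor names are illustrative):

```python
import torch

N, C, H, W = 2, 4, 8, 8
h_out, w_out = 4, 4                      # down-sample 8x8 -> 4x4
A = torch.randn(C, h_out, H) / H ** 0.5  # per-channel A: (h_out, h_in)
B = torch.randn(C, W, w_out) / W ** 0.5  # per-channel B: (w_in, w_out)
x = torch.randn(N, C, H, W)
y = A @ x @ B                            # Y = A X B, applied channel-wise
print(y.shape)                           # torch.Size([2, 4, 4, 4])
```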

If "shift" is set (non-zero), all channels for this kernel are split into "kernel" sub-groups, and the blocks of the block-diagonal matrix in each successive sub-group are shifted by 1 in the horizontal and vertical directions. The blocks wrap around cyclically, so parts of kernels can appear in the lower-right and upper-left corners of the block-diagonal matrix. "shift" is supported only for square matrices.
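The cyclic shift of a block-diagonal matrix can be sketched with a Kronecker product and a roll; the helper name and construction are assumptions for illustration, not the repository's code:

```python
import torch

def block_diag_shifted(kernel_mat, size, shift):
    """Build a size x size block-diagonal matrix from one k x k kernel,
    then shift it cyclically by `shift` along both axes, so partial blocks
    wrap around into the corners of the matrix (illustrative sketch)."""
    k = kernel_mat.shape[0]
    n_blocks = size // k                             # assumes k divides size
    M = torch.kron(torch.eye(n_blocks), kernel_mat)  # plain block-diagonal
    return torch.roll(M, shifts=(shift, shift), dims=(0, 1))

K = torch.arange(1.0, 5.0).reshape(2, 2)
print(block_diag_shifted(K, 6, 1))  # diagonal blocks shifted by 1, wrapped at corners
```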

ImageNet-1k results

| Model     | res | #params | GFLOPs | acc@1 |
|-----------|-----|---------|--------|-------|
| NeoNeXt-T | 224 | 27.5M   | 4.4    | 81.44 |
| NeoNeXt-S | 224 | 49.3M   | 8.6    | 82.58 |
| NeoNeXt-B | 224 | 86.8M   | 15.2   | 83.09 |
| NeoNeXt-T | 384 | 27.5M   | 13.3   | 82.00 |
| NeoNeXt-S | 384 | 49.3M   | 25.7   | 82.94 |
| NeoNeXt-B | 384 | 86.8M   | 45.2   | 83.26 |
| NeoNeXt-L | 384 | 193.5M  | TBD    | 83.68 |

TODO

  • Inference code
  • Training code
  • Checkpoints of pretrained models
  • Latest tricks
  • Update paper

Citations

@misc{korviakov2024neonext,
      title={NeoNeXt: Novel neural network operator and architecture based on the patch-wise matrix multiplications}, 
      author={Vladimir Korviakov and Denis Koposov},
      year={2024},
      eprint={2403.11251},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
