onnx-preview1

Pre-release
@Blinue Blinue released this 10 Mar 09:39
· 110 commits to onnx since this release

This is an experimental version with support for ONNX models. If you encounter any issues, please report them in #772.

How to Use

You cannot specify ONNX models using the UI. You need to modify the model.json file in the root directory.

Example:

{
    "path": "2x_AnimeJaNai_HD_V3_UltraCompact_425k-fp16.onnx",
    "scale": 2,
    "backend": "directml"
}

path: The location of the ONNX file.
scale: The scaling factor of the model, which must be consistent with the ONNX file, or else scaling will fail. Only integer scaling is allowed.
backend: The inference backend to use, which supports three options:

  • directml, which can be shortened to d
  • tensorrt, which can be shortened to t
  • cuda, which can be shortened to c

When scaling, the model will be applied before the first effect; there is no option to change this behavior at the moment.

ONNX Models Compatibility

The table below shows the compatibility with popular architectures:

Architecture    Compatible
CUGAN           ×
ESRGAN          ✔
SPAN            ✔
WAIFU2X         ✔

To run an ONNX model in Magpie, the model must follow these rules:

  1. Input and output dimensions must be [-1, 3, -1, -1], with data format NCHW.
  2. Input and output data types must be either fp16 or fp32, and the two must match.
  3. Output size must be a whole-number multiple of the input size, and the scaling factor must be the same for all input sizes.
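
A quick way to verify rules 1 and 2 before pointing Magpie at a model is to inspect it with the Python onnx package. This is only an illustrative sketch (the script and its check_model helper are not part of Magpie, and the file name is taken from the example above):

import onnx
from onnx import TensorProto

def check_model(path: str) -> None:
    # Load the model and inspect its first input and output.
    model = onnx.load(path)
    inp = model.graph.input[0]
    out = model.graph.output[0]

    def dims(value_info):
        # Dynamic axes carry no fixed dim_value; report them as -1.
        return [d.dim_value if d.HasField("dim_value") else -1
                for d in value_info.type.tensor_type.shape.dim]

    # Rule 1: NCHW with shape [-1, 3, -1, -1] (dynamic batch, 3 channels, dynamic H/W).
    assert dims(inp) == [-1, 3, -1, -1], f"unexpected input shape {dims(inp)}"
    assert dims(out) == [-1, 3, -1, -1], f"unexpected output shape {dims(out)}"

    # Rule 2: fp16 or fp32, with the same data type for input and output.
    in_type = inp.type.tensor_type.elem_type
    out_type = out.type.tensor_type.elem_type
    assert in_type in (TensorProto.FLOAT, TensorProto.FLOAT16), "dtype must be fp32 or fp16"
    assert in_type == out_type, "input and output dtypes must match"

check_model("2x_AnimeJaNai_HD_V3_UltraCompact_425k-fp16.onnx")

Rule 3 cannot be read from the graph alone; the smoke test below checks it by running the model once.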

Magpie uses onnxruntime for inference, so in addition to the rules above, the model must also be supported by onnxruntime.
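
Because of this, a simple sanity check is to run one dummy frame through onnxruntime yourself. The snippet below is only a rough sketch (it assumes the onnxruntime and numpy Python packages are installed and reuses the file name from the example above); it also derives the scale factor that belongs in model.json, which covers rule 3:

import numpy as np
import onnxruntime as ort

# Open the model with whichever execution providers this onnxruntime build offers.
sess = ort.InferenceSession("2x_AnimeJaNai_HD_V3_UltraCompact_425k-fp16.onnx",
                            providers=ort.get_available_providers())
inp = sess.get_inputs()[0]
dtype = np.float16 if "float16" in inp.type else np.float32

# Feed one dummy 64x64 frame (NCHW) and compare input and output sizes.
x = np.zeros((1, 3, 64, 64), dtype=dtype)
y = sess.run(None, {inp.name: x})[0]
print("scale factor:", y.shape[2] // x.shape[2])  # the value to put into "scale"

If the model loads and the printed factor is the same integer for a few different input sizes, it should satisfy rule 3.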

You can find many ONNX models here; all of them except those based on the CUGAN architecture work. OpenModelDB also catalogs a large number of models.

Backend Selection for Inference

You can choose from three backends: DirectML, TensorRT, and CUDA. In terms of performance, TensorRT is typically the fastest, followed by DirectML and then CUDA.

  1. DirectML: Works on any hardware that supports DirectX 12. If you don't have an NVIDIA graphics card, this is your only choice.
  2. TensorRT: Requires an NVIDIA graphics card with a Compute Capability of at least 6.0. It has much better performance than the other two backends, but it takes a long time (several minutes) to build the engine. To get the best performance, don't do any GPU-intensive tasks while the engine is being built.
  3. CUDA: Supports almost all NVIDIA graphics cards. Depending on the model, its performance can be better or worse than DirectML.

Future Development Plans

We will not add ONNX support in the next major release. There is still a lot of work to do before it can be supported officially, including importing, managing, and optimizing ONNX models, as well as UI improvements. If you have any ideas or suggestions, feel free to share them with us.