Ability to not embed model into assembly & load model from file path #172

@BrycensRanch

Description

Feature request type

enhancement

Is your feature request related to a problem? Please describe

The current implementation embeds the OCR models in the assembly or automatically extracts them into the binary folder. This causes significant deployment bloat, prevents effective .NET trimming and Native AOT optimization, and makes it impossible to update models without a full recompile. Users have also noticed the binary size and are unhappy with it. I will open a PR once this issue has been discussed on its merits.

Describe the solution you'd like

Provide an option to disable embedded models and load them from a specified file system path instead. This would allow users to manage the model files independently of the application binaries.

Describe alternatives you've considered

I tried removing the embedded resources, but OCR fails when I do.

Additional context

📦 ASSEMBLY: Sdcb.PaddleOCR.Models.LocalV5
[11:44:42 INF]    Path: /home/romvnly/Documents/Coding/SnapX/SnapX.Avalonia/bin/Debug/net10.0/linux-x64/Sdcb.PaddleOCR.Models.LocalV5.dll
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.LocalV5.models.mobile_zh_det.inference.json |     0.22 MiB 
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.LocalV5.models.mobile_zh_rec.inference.json |     0.21 MiB 
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.LocalV5.models.mobile_zh_det.inference.yml |     0.00 MiB 
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.LocalV5.models.mobile_zh_rec.inference.yml |     0.16 MiB 
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.LocalV5.models.mobile_zh_det.inference.pdiparams |     4.48 MiB  <-- LARGE ASSET
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.LocalV5.models.mobile_zh_rec.inference.pdiparams |    15.70 MiB  <-- LARGE ASSET
[11:44:42 INF] 
📦 ASSEMBLY: Sdcb.PaddleOCR.Models.Local
[11:44:42 INF]    Path: /home/romvnly/Documents/Coding/SnapX/SnapX.Avalonia/bin/Debug/net10.0/linux-x64/Sdcb.PaddleOCR.Models.Local.dll
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.Local.models.ch_ppocr_mobile_v2._0_cls.inference.pdmodel |     0.85 MiB 
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.Local.models.ch_ppstructure_mobile_v2._0_SLANet.inference.pdmodel |     2.46 MiB  <-- LARGE ASSET
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.Local.models.en_ppstructure_mobile_v2._0_SLANet.inference.pdmodel |     2.46 MiB  <-- LARGE ASSET
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.Local.models.ch_ppocr_mobile_v2._0_cls.inference.pdiparams |     0.51 MiB 
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.Local.models.ch_ppstructure_mobile_v2._0_SLANet.inference.pdiparams |     7.31 MiB  <-- LARGE ASSET
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.Local.models.en_ppstructure_mobile_v2._0_SLANet.inference.pdiparams |     7.23 MiB  <-- LARGE ASSET
[11:44:42 INF] 
📦 ASSEMBLY: Sdcb.PaddleOCR.Models.LocalV3
[11:44:42 INF]    Path: /home/romvnly/Documents/Coding/SnapX/SnapX.Avalonia/bin/Debug/net10.0/linux-x64/Sdcb.PaddleOCR.Models.LocalV3.dll
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.LocalV3.models.arabic_PP_OCRv3_rec.inference.pdmodel |     0.97 MiB 
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.LocalV3.models.chinese_cht_PP_OCRv3_rec.inference.pdmodel |     1.17 MiB 
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.LocalV3.models.ch_PP_OCRv3_det.inference.pdmodel |     1.35 MiB 
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.LocalV3.models.ch_PP_OCRv3_rec.inference.pdmodel |     1.21 MiB 
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.LocalV3.models.cyrillic_PP_OCRv3_rec.inference.pdmodel |     0.97 MiB 
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.LocalV3.models.devanagari_PP_OCRv3_rec.inference.pdmodel |     1.25 MiB 
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.LocalV3.models.en_PP_OCRv3_det.inference.pdmodel |     1.38 MiB 
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.LocalV3.models.en_PP_OCRv3_rec.inference.pdmodel |     0.97 MiB 
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.LocalV3.models.japan_PP_OCRv3_rec.inference.pdmodel |     1.25 MiB 
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.LocalV3.models.ka_PP_OCRv3_rec.inference.pdmodel |     1.25 MiB 
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.LocalV3.models.korean_PP_OCRv3_rec.inference.pdmodel |     1.02 MiB 
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.LocalV3.models.latin_PP_OCRv3_rec.inference.pdmodel |     1.15 MiB 
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.LocalV3.models.ml_PP_OCRv3_det.inference.pdmodel |     1.37 MiB 
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.LocalV3.models.ta_PP_OCRv3_rec.inference.pdmodel |     1.02 MiB 
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.LocalV3.models.te_PP_OCRv3_rec.inference.pdmodel |     1.02 MiB 
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.LocalV3.models.arabic_PP_OCRv3_rec.inference.pdiparams |     8.52 MiB  <-- LARGE ASSET
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.LocalV3.models.chinese_cht_PP_OCRv3_rec.inference.pdiparams |    10.57 MiB  <-- LARGE ASSET
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.LocalV3.models.ch_PP_OCRv3_det.inference.pdiparams |     2.27 MiB  <-- LARGE ASSET
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.LocalV3.models.ch_PP_OCRv3_rec.inference.pdiparams |    10.12 MiB  <-- LARGE ASSET
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.LocalV3.models.cyrillic_PP_OCRv3_rec.inference.pdiparams |     8.52 MiB  <-- LARGE ASSET
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.LocalV3.models.devanagari_PP_OCRv3_rec.inference.pdiparams |     8.52 MiB  <-- LARGE ASSET
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.LocalV3.models.en_PP_OCRv3_det.inference.pdiparams |     2.27 MiB  <-- LARGE ASSET
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.LocalV3.models.en_PP_OCRv3_rec.inference.pdiparams |     8.50 MiB  <-- LARGE ASSET
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.LocalV3.models.japan_PP_OCRv3_rec.inference.pdiparams |     9.57 MiB  <-- LARGE ASSET
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.LocalV3.models.ka_PP_OCRv3_rec.inference.pdiparams |     8.52 MiB  <-- LARGE ASSET
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.LocalV3.models.korean_PP_OCRv3_rec.inference.pdiparams |     9.39 MiB  <-- LARGE ASSET
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.LocalV3.models.latin_PP_OCRv3_rec.inference.pdiparams |     8.53 MiB  <-- LARGE ASSET
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.LocalV3.models.ml_PP_OCRv3_det.inference.pdiparams |     2.27 MiB  <-- LARGE ASSET
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.LocalV3.models.ta_PP_OCRv3_rec.inference.pdiparams |     8.51 MiB  <-- LARGE ASSET
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.LocalV3.models.te_PP_OCRv3_rec.inference.pdiparams |     8.52 MiB  <-- LARGE ASSET
[11:44:42 INF] 
📦 ASSEMBLY: Sdcb.PaddleOCR.Models.Shared
[11:44:42 INF]    Path: /home/romvnly/Documents/Coding/SnapX/SnapX.Avalonia/bin/Debug/net10.0/linux-x64/Sdcb.PaddleOCR.Models.Shared.dll
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.Shared.dicts.arabic_dict.txt           |     0.00 MiB 
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.Shared.dicts.chinese_cht_dict.txt      |     0.04 MiB 
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.Shared.dicts.cyrillic_dict.txt         |     0.00 MiB 
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.Shared.dicts.devanagari_dict.txt       |     0.00 MiB 
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.Shared.dicts.en_dict.txt               |     0.00 MiB 
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.Shared.dicts.japan_dict.txt            |     0.02 MiB 
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.Shared.dicts.ka_dict.txt               |     0.00 MiB 
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.Shared.dicts.korean_dict.txt           |     0.02 MiB 
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.Shared.dicts.latin_dict.txt            |     0.00 MiB 
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.Shared.dicts.ppocr_keys_v1.txt         |     0.03 MiB 
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.Shared.dicts.table_structure_dict.txt  |     0.00 MiB 
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.Shared.dicts.table_structure_dict_ch.txt |     0.00 MiB 
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.Shared.dicts.ta_dict.txt               |     0.00 MiB 
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.Shared.dicts.te_dict.txt               |     0.00 MiB 
[11:44:42 INF] 
📦 ASSEMBLY: Sdcb.PaddleOCR.Models.LocalV4
[11:44:42 INF]    Path: /home/romvnly/Documents/Coding/SnapX/SnapX.Avalonia/bin/Debug/net10.0/linux-x64/Sdcb.PaddleOCR.Models.LocalV4.dll
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.LocalV4.models.arabic_PP_OCRv4_rec.inference.pdmodel |     0.16 MiB 
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.LocalV4.models.ch_PP_OCRv4_det.inference.pdmodel |     0.16 MiB 
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.LocalV4.models.ch_PP_OCRv4_rec.inference.pdmodel |     0.16 MiB 
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.LocalV4.models.devanagari_PP_OCRv4_rec.inference.pdmodel |     0.16 MiB 
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.LocalV4.models.en_PP_OCRv4_rec.inference.pdmodel |     2.40 MiB  <-- LARGE ASSET
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.LocalV4.models.japan_PP_OCRv4_rec.inference.pdmodel |     0.16 MiB 
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.LocalV4.models.ka_PP_OCRv4_rec.inference.pdmodel |     0.16 MiB 
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.LocalV4.models.korean_PP_OCRv4_rec.inference.pdmodel |     0.34 MiB 
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.LocalV4.models.ta_PP_OCRv4_rec.inference.pdmodel |     0.34 MiB 
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.LocalV4.models.te_PP_OCRv4_rec.inference.pdmodel |     0.34 MiB 
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.LocalV4.models.arabic_PP_OCRv4_rec.inference.pdiparams |     7.29 MiB  <-- LARGE ASSET
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.LocalV4.models.ch_PP_OCRv4_det.inference.pdiparams |     4.48 MiB  <-- LARGE ASSET
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.LocalV4.models.ch_PP_OCRv4_rec.inference.pdiparams |    10.27 MiB  <-- LARGE ASSET
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.LocalV4.models.devanagari_PP_OCRv4_rec.inference.pdiparams |     7.29 MiB  <-- LARGE ASSET
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.LocalV4.models.en_PP_OCRv4_rec.inference.pdiparams |     7.25 MiB  <-- LARGE ASSET
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.LocalV4.models.japan_PP_OCRv4_rec.inference.pdiparams |     9.24 MiB  <-- LARGE ASSET
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.LocalV4.models.ka_PP_OCRv4_rec.inference.pdiparams |     7.28 MiB  <-- LARGE ASSET
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.LocalV4.models.korean_PP_OCRv4_rec.inference.pdiparams |    22.81 MiB  <-- LARGE ASSET
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.LocalV4.models.ta_PP_OCRv4_rec.inference.pdiparams |    21.17 MiB  <-- LARGE ASSET
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.LocalV4.models.te_PP_OCRv4_rec.inference.pdiparams |    21.18 MiB  <-- LARGE ASSET
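The log above lists each embedded resource separately but gives no totals per assembly. A minimal sketch for totalling it, assuming the log keeps the exact `ASSEMBLY:` header and `├── name | size MiB` line format shown here; the function name `total_mib_per_assembly` is made up for illustration and is not part of any library:

```python
import re
from collections import defaultdict

# Header lines look like: "📦 ASSEMBLY: Sdcb.PaddleOCR.Models.LocalV5"
ASSEMBLY_LINE = re.compile(r"ASSEMBLY:\s+(?P<assembly>\S+)")

# Resource lines look like:
# "[11:44:42 INF]    ├── Some.Resource.Name |     4.48 MiB  <-- LARGE ASSET"
RESOURCE_LINE = re.compile(
    r"├──\s+(?P<name>\S+)\s*\|\s*(?P<mib>\d+(?:\.\d+)?)\s+MiB"
)


def total_mib_per_assembly(log_text: str) -> dict[str, float]:
    """Sum the reported MiB of every embedded resource, grouped by assembly."""
    totals: dict[str, float] = defaultdict(float)
    current = None  # assembly named by the most recent header line
    for line in log_text.splitlines():
        header = ASSEMBLY_LINE.search(line)
        if header:
            current = header.group("assembly")
            totals[current] += 0.0  # also list assemblies with no resources
            continue
        resource = RESOURCE_LINE.search(line)
        if resource and current is not None:
            totals[current] += float(resource.group("mib"))
    # The log reports two decimals, so round the sums to match.
    return {name: round(total, 2) for name, total in totals.items()}


if __name__ == "__main__":
    sample = """\
📦 ASSEMBLY: Sdcb.PaddleOCR.Models.LocalV5
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.LocalV5.models.mobile_zh_det.inference.pdiparams |     4.48 MiB  <-- LARGE ASSET
[11:44:42 INF]    ├── Sdcb.PaddleOCR.Models.LocalV5.models.mobile_zh_rec.inference.pdiparams |    15.70 MiB  <-- LARGE ASSET
"""
    print(total_mib_per_assembly(sample))
    # {'Sdcb.PaddleOCR.Models.LocalV5': 20.18}
```

The sums are only as precise as the two-decimal values in the log, so treat them as approximate.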
