Add offline/air-gapped model resolution for inference-models cache#2187
Open
- Write model_id into model_config.json so it can be recovered without
the auto-resolution-cache (which expires and gets deleted)
- scan_cached_models now also walks models-cache/{slug}/{package_id}/
model_config.json under both MODEL_CACHE_DIR and INFERENCE_HOME,
covering models pre-populated via inference-models without a
corresponding model_type.json in MODEL_CACHE_DIR
- Prune models-cache/ from the model_type.json walk to avoid noise
- De-duplicate results by model_id (layout-1 takes precedence)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
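The scan-and-merge logic described in the bullets above can be sketched as follows. This is a hedged illustration, not the PR's actual code: the function names `scan_models_cache` and `merge_scans`, and the exact JSON field read, are assumptions based on the commit message.

```python
import json
import os


def scan_models_cache(root):
    """Walk models-cache/{slug}/{package_id}/model_config.json under `root`
    and yield (model_id, package_dir) pairs recovered from the config files."""
    cache_root = os.path.join(root, "models-cache")
    if not os.path.isdir(cache_root):
        return
    for slug in os.listdir(cache_root):
        slug_dir = os.path.join(cache_root, slug)
        if not os.path.isdir(slug_dir):
            continue
        for package_id in os.listdir(slug_dir):
            config_path = os.path.join(slug_dir, package_id, "model_config.json")
            if not os.path.isfile(config_path):
                continue
            with open(config_path) as f:
                config = json.load(f)
            model_id = config.get("model_id")  # written at download time
            if model_id:
                yield model_id, os.path.join(slug_dir, package_id)


def merge_scans(layout_1_entries, layout_2_entries):
    """De-duplicate by model_id; layout-1 entries take precedence."""
    merged = dict(layout_2_entries)        # layout-2 first ...
    merged.update(dict(layout_1_entries))  # ... so layout-1 overwrites on collision
    return merged
```

The precedence rule falls out of plain dict semantics: inserting layout-2 entries first and then updating with layout-1 means a shared `model_id` keeps the layout-1 path.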
When model_type.json is missing (e.g. models pre-populated via
inference-models without going through the registry), check
models-cache/{slug}/{package_id}/model_config.json under both
MODEL_CACHE_DIR and INFERENCE_HOME. This allows get_model_type()
to resolve cached models without hitting the Roboflow API,
enabling fully air-gapped model loading.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
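The fallback described above might look roughly like this. The search-root names (`MODEL_CACHE_DIR`, `INFERENCE_HOME`) come from the commit message; the function signature, the helper name, and the JSON fields below are assumptions for illustration only.

```python
import json
import os


def resolve_model_type_offline(slug, package_id, search_roots):
    """Return (task_type, model_type) from a cached model_config.json, or
    None when no cached config exists in any search root.

    `search_roots` would be something like [MODEL_CACHE_DIR, INFERENCE_HOME]
    (both names taken from the commit message; the order is an assumption).
    """
    for root in search_roots:
        config_path = os.path.join(
            root, "models-cache", slug, package_id, "model_config.json"
        )
        if not os.path.isfile(config_path):
            continue
        with open(config_path) as f:
            config = json.load(f)
        task = config.get("task_type")
        model = config.get("model_type")
        if task and model:
            return task, model
    return None  # caller may fall back to the Roboflow API when online
```

Returning `None` rather than raising keeps the online path intact: only when every local candidate is missing does the caller need network access.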
When model weights are already cached in the inference-models layout
(models-cache/{slug}/{package_id}/), pass the local directory path
to AutoModel.from_pretrained() instead of the model ID. This triggers
load_model_from_local_storage() which skips the API call entirely,
enabling air-gapped model loading without modifying the inference-models
package.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
# Conflicts:
#	inference/core/cache/air_gapped.py
#	inference/core/interfaces/http/builder/routes.py
#	inference/core/interfaces/http/handlers/workflows.py
#	inference/core/workflows/core_steps/models/foundation/anthropic_claude/v1.py
#	inference/core/workflows/core_steps/models/foundation/anthropic_claude/v2.py
#	inference/core/workflows/core_steps/models/foundation/anthropic_claude/v3.py
#	inference/core/workflows/core_steps/models/foundation/clip/v1.py
#	inference/core/workflows/core_steps/models/foundation/clip_comparison/v1.py
#	inference/core/workflows/core_steps/models/foundation/clip_comparison/v2.py
#	inference/core/workflows/core_steps/models/foundation/cog_vlm/v1.py
#	inference/core/workflows/core_steps/models/foundation/depth_estimation/v1.py
#	inference/core/workflows/core_steps/models/foundation/easy_ocr/v1.py
#	inference/core/workflows/core_steps/models/foundation/florence2/v1.py
#	inference/core/workflows/core_steps/models/foundation/florence2/v2.py
#	inference/core/workflows/core_steps/models/foundation/gaze/v1.py
#	inference/core/workflows/core_steps/models/foundation/google_gemini/v1.py
#	inference/core/workflows/core_steps/models/foundation/google_gemini/v2.py
#	inference/core/workflows/core_steps/models/foundation/google_gemini/v3.py
#	inference/core/workflows/core_steps/models/foundation/google_vision_ocr/v1.py
#	inference/core/workflows/core_steps/models/foundation/llama_vision/v1.py
#	inference/core/workflows/core_steps/models/foundation/lmm/v1.py
#	inference/core/workflows/core_steps/models/foundation/lmm_classifier/v1.py
#	inference/core/workflows/core_steps/models/foundation/moondream2/v1.py
#	inference/core/workflows/core_steps/models/foundation/ocr/v1.py
#	inference/core/workflows/core_steps/models/foundation/openai/v1.py
#	inference/core/workflows/core_steps/models/foundation/openai/v2.py
#	inference/core/workflows/core_steps/models/foundation/openai/v3.py
#	inference/core/workflows/core_steps/models/foundation/openai/v4.py
#	inference/core/workflows/core_steps/models/foundation/perception_encoder/v1.py
#	inference/core/workflows/core_steps/models/foundation/qwen/v1.py
#	inference/core/workflows/core_steps/models/foundation/qwen3_5vl/v1.py
#	inference/core/workflows/core_steps/models/foundation/qwen3vl/v1.py
#	inference/core/workflows/core_steps/models/foundation/seg_preview/v1.py
#	inference/core/workflows/core_steps/models/foundation/segment_anything2/v1.py
#	inference/core/workflows/core_steps/models/foundation/segment_anything3/v1.py
#	inference/core/workflows/core_steps/models/foundation/segment_anything3/v2.py
#	inference/core/workflows/core_steps/models/foundation/segment_anything3/v3.py
#	inference/core/workflows/core_steps/models/foundation/segment_anything3_3d/v1.py
#	inference/core/workflows/core_steps/models/foundation/smolvlm/v1.py
#	inference/core/workflows/core_steps/models/foundation/stability_ai/image_gen/v1.py
#	inference/core/workflows/core_steps/models/foundation/stability_ai/inpainting/v1.py
#	inference/core/workflows/core_steps/models/foundation/stability_ai/outpainting/v1.py
#	inference/core/workflows/core_steps/models/foundation/yolo_world/v1.py
#	inference/core/workflows/core_steps/models/roboflow/instance_segmentation/v1.py
#	inference/core/workflows/core_steps/models/roboflow/instance_segmentation/v2.py
#	inference/core/workflows/core_steps/models/roboflow/keypoint_detection/v1.py
#	inference/core/workflows/core_steps/models/roboflow/keypoint_detection/v2.py
#	inference/core/workflows/core_steps/models/roboflow/multi_class_classification/v1.py
#	inference/core/workflows/core_steps/models/roboflow/multi_class_classification/v2.py
#	inference/core/workflows/core_steps/models/roboflow/multi_label_classification/v1.py
#	inference/core/workflows/core_steps/models/roboflow/multi_label_classification/v2.py
#	inference/core/workflows/core_steps/models/roboflow/object_detection/v1.py
#	inference/core/workflows/core_steps/models/roboflow/object_detection/v2.py
#	inference/core/workflows/core_steps/models/roboflow/semantic_segmentation/v1.py
#	inference/core/workflows/core_steps/sinks/roboflow/custom_metadata/v1.py
#	inference/core/workflows/core_steps/sinks/roboflow/dataset_upload/v1.py
#	inference/core/workflows/core_steps/sinks/roboflow/dataset_upload/v2.py
#	inference/core/workflows/core_steps/sinks/roboflow/model_monitoring_inference_aggregator/v1.py
#	inference/core/workflows/core_steps/sinks/slack/notification/v1.py
#	inference/core/workflows/core_steps/sinks/twilio/sms/v1.py
#	inference/core/workflows/core_steps/sinks/twilio/sms/v2.py
#	inference/core/workflows/core_steps/sinks/webhook/v1.py
#	tests/unit/core/cache/test_air_gapped.py
#	tests/unit/core/interfaces/http/test_blocks_describe_airgapped.py
#	tests/unit/core/workflows/test_air_gapped_blocks.py
sberan commented Mar 31, 2026
Diff excerpt:

    from inference.core.workflows.prototypes.block import BlockAirGappedInfo

    logger = logging.getLogger(__name__)
Summary
- Add `_resolve_cached_model_path` to inference-models adapters so `AutoModel.from_pretrained` can load directly from the local inference-models cache without calling the Roboflow API
- Check `model_config.json` in the inference-models cache layout when resolving model metadata, so air-gapped environments can discover models cached by inference-models
- Write `model_id` into `model_config.json` during model download so offline scanning can map cache entries back to their canonical model IDs
- Fix `logger` initialization in workflow handlers

Test plan

- `model_config.json` fallback returns correct task type and architecture
- `model_type.json` cache layout still works as before
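A pytest-style sketch of the first check in the plan above. The helper and field names are illustrative stand-ins, not the PR's actual test code; only the `model_config.json` filename and the task-type/architecture fields come from the PR description.

```python
import json
import os
import tempfile


def read_model_metadata(package_dir):
    """Stand-in fallback: recover (task_type, model_type) from model_config.json."""
    with open(os.path.join(package_dir, "model_config.json")) as f:
        config = json.load(f)
    return config["task_type"], config["model_type"]


def test_model_config_fallback_returns_task_and_architecture():
    with tempfile.TemporaryDirectory() as d:
        with open(os.path.join(d, "model_config.json"), "w") as f:
            json.dump({"task_type": "object-detection", "model_type": "yolov8n"}, f)
        assert read_model_metadata(d) == ("object-detection", "yolov8n")


test_model_config_fallback_returns_task_and_architecture()
```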