
Add offline/air-gapped model resolution for inference-models cache #2187

Open
sberan wants to merge 11 commits into main from offline-support

Conversation


@sberan (Contributor) commented Mar 31, 2026

Summary

  • Adds _resolve_cached_model_path to inference-models adapters so AutoModel.from_pretrained can load directly from the local inference-models cache without calling the Roboflow API
  • Adds fallback to model_config.json in the inference-models cache layout when resolving model metadata, so air-gapped environments can discover models cached by inference-models
  • Writes model_id into model_config.json during model download so offline scanning can map cache entries back to their canonical model IDs
  • Removes CSRF token enforcement from the workflow builder routes
  • Adds missing logger initialization in workflow handlers
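The cache-path resolution described in the first bullet can be sketched as follows. This is an illustrative sketch only, not the PR's implementation: the on-disk layout `models-cache/{slug}/{package_id}/` and the `model_config.json` marker file are taken from the commit messages in this PR, while the function body and the `cache_root`/`slug`/`package_id` parameters are assumptions.

```python
import os


def resolve_cached_model_path(cache_root: str, slug: str, package_id: str):
    """Return the local model directory if it is cached, else None.

    Hypothetical stand-in for the _resolve_cached_model_path adapter hook
    added by this PR; the directory layout is from the commit messages.
    """
    candidate = os.path.join(cache_root, "models-cache", slug, package_id)
    # A package directory containing model_config.json is treated as a
    # usable cache hit that can be loaded without any API call.
    if os.path.isfile(os.path.join(candidate, "model_config.json")):
        return candidate
    return None
```

When this returns a directory, the adapter can hand the path (rather than the model ID) to `AutoModel.from_pretrained`, so no Roboflow API call is needed.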

Test plan

  • Verify models cached via inference-models can be loaded offline without API calls
  • Verify model_config.json fallback returns correct task type and architecture
  • Verify traditional model_type.json cache layout still works as before
  • Verify workflow builder routes work without CSRF headers

yeldarby and others added 9 commits March 25, 2026 19:56
- Write model_id into model_config.json so it can be recovered without
  the auto-resolution-cache (which expires and gets deleted)
- scan_cached_models now also walks models-cache/{slug}/{package_id}/
  model_config.json under both MODEL_CACHE_DIR and INFERENCE_HOME,
  covering models pre-populated via inference-models without a
  corresponding model_type.json in MODEL_CACHE_DIR
- Prune models-cache/ from the model_type.json walk to avoid noise
- De-duplicate results by model_id (layout-1 takes precedence)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
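The scanning behavior this commit describes can be pictured with a simplified sketch. Caveats: the real `scan_cached_models` also walks the `model_type.json` layout and gives it precedence ("layout-1 takes precedence"); this sketch covers only the `model_config.json` walk, and simplifies precedence to "first root scanned wins". The roots list standing in for `MODEL_CACHE_DIR` and `INFERENCE_HOME` is an assumption.

```python
import json
import os


def scan_cached_models(roots):
    """Walk models-cache/{slug}/{package_id}/model_config.json under each
    cache root and map model_id -> package directory.

    Simplified sketch of the walk added by this commit: roots are scanned
    in order and the first hit for a model_id wins (the real code
    de-duplicates with the model_type.json layout taking precedence).
    """
    found = {}
    for root in roots:  # e.g. [MODEL_CACHE_DIR, INFERENCE_HOME]
        base = os.path.join(root, "models-cache")
        if not os.path.isdir(base):
            continue
        for slug in sorted(os.listdir(base)):
            slug_dir = os.path.join(base, slug)
            if not os.path.isdir(slug_dir):
                continue
            for package_id in sorted(os.listdir(slug_dir)):
                config_path = os.path.join(slug_dir, package_id, "model_config.json")
                if not os.path.isfile(config_path):
                    continue
                with open(config_path) as f:
                    model_id = json.load(f).get("model_id")
                # model_id is recoverable because the download step now
                # writes it into model_config.json (per the commit above)
                if model_id and model_id not in found:
                    found[model_id] = os.path.dirname(config_path)
    return found
```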
When model_type.json is missing (e.g. models pre-populated via
inference-models without going through the registry), check
models-cache/{slug}/{package_id}/model_config.json under both
MODEL_CACHE_DIR and INFERENCE_HOME. This allows get_model_type()
to resolve cached models without hitting the Roboflow API,
enabling fully air-gapped model loading.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
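The fallback order this commit describes can be sketched as below. The precedence (legacy `model_type.json` first, then `model_config.json`) is from the commit message; the exact JSON field names read from each file are illustrative assumptions, not the project's schema.

```python
import json
import os


def get_model_type(model_dir: str):
    """Resolve (task type, architecture) for a cached model.

    Sketch of the fallback described above: prefer the traditional
    model_type.json, then fall back to the inference-models
    model_config.json so no Roboflow API call is needed.
    Field names are illustrative assumptions.
    """
    legacy = os.path.join(model_dir, "model_type.json")
    if os.path.isfile(legacy):
        with open(legacy) as f:
            meta = json.load(f)
        return meta["task_type"], meta["architecture"]
    config = os.path.join(model_dir, "model_config.json")
    if os.path.isfile(config):
        with open(config) as f:
            meta = json.load(f)
        # air-gapped path: metadata recovered from the cached config
        return meta["task_type"], meta["architecture"]
    raise FileNotFoundError(f"no cached metadata in {model_dir}")
```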
When model weights are already cached in the inference-models layout
(models-cache/{slug}/{package_id}/), pass the local directory path
to AutoModel.from_pretrained() instead of the model ID. This triggers
load_model_from_local_storage() which skips the API call entirely,
enabling air-gapped model loading without modifying the inference-models
package.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
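The dispatch this commit relies on can be demonstrated with a stand-in. `AutoModel` below is a stub, not the real inference-models class; it only illustrates the observable behavior the commit describes: an existing directory path takes the local-storage branch, while a model ID would be resolved over the network.

```python
import os


class AutoModel:
    """Stub standing in for the inference-models AutoModel, used only to
    illustrate the dispatch described in the commit above."""

    @classmethod
    def from_pretrained(cls, model_id_or_path: str) -> str:
        if os.path.isdir(model_id_or_path):
            # corresponds to load_model_from_local_storage(): the API
            # call is skipped entirely, enabling air-gapped loading
            return "local"
        # a bare model ID would be resolved via the Roboflow API
        return "api"
```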
# Conflicts:
#	inference/core/cache/air_gapped.py
#	inference/core/interfaces/http/builder/routes.py
#	inference/core/interfaces/http/handlers/workflows.py
#	inference/core/workflows/core_steps/models/foundation/anthropic_claude/v1.py
#	inference/core/workflows/core_steps/models/foundation/anthropic_claude/v2.py
#	inference/core/workflows/core_steps/models/foundation/anthropic_claude/v3.py
#	inference/core/workflows/core_steps/models/foundation/clip/v1.py
#	inference/core/workflows/core_steps/models/foundation/clip_comparison/v1.py
#	inference/core/workflows/core_steps/models/foundation/clip_comparison/v2.py
#	inference/core/workflows/core_steps/models/foundation/cog_vlm/v1.py
#	inference/core/workflows/core_steps/models/foundation/depth_estimation/v1.py
#	inference/core/workflows/core_steps/models/foundation/easy_ocr/v1.py
#	inference/core/workflows/core_steps/models/foundation/florence2/v1.py
#	inference/core/workflows/core_steps/models/foundation/florence2/v2.py
#	inference/core/workflows/core_steps/models/foundation/gaze/v1.py
#	inference/core/workflows/core_steps/models/foundation/google_gemini/v1.py
#	inference/core/workflows/core_steps/models/foundation/google_gemini/v2.py
#	inference/core/workflows/core_steps/models/foundation/google_gemini/v3.py
#	inference/core/workflows/core_steps/models/foundation/google_vision_ocr/v1.py
#	inference/core/workflows/core_steps/models/foundation/llama_vision/v1.py
#	inference/core/workflows/core_steps/models/foundation/lmm/v1.py
#	inference/core/workflows/core_steps/models/foundation/lmm_classifier/v1.py
#	inference/core/workflows/core_steps/models/foundation/moondream2/v1.py
#	inference/core/workflows/core_steps/models/foundation/ocr/v1.py
#	inference/core/workflows/core_steps/models/foundation/openai/v1.py
#	inference/core/workflows/core_steps/models/foundation/openai/v2.py
#	inference/core/workflows/core_steps/models/foundation/openai/v3.py
#	inference/core/workflows/core_steps/models/foundation/openai/v4.py
#	inference/core/workflows/core_steps/models/foundation/perception_encoder/v1.py
#	inference/core/workflows/core_steps/models/foundation/qwen/v1.py
#	inference/core/workflows/core_steps/models/foundation/qwen3_5vl/v1.py
#	inference/core/workflows/core_steps/models/foundation/qwen3vl/v1.py
#	inference/core/workflows/core_steps/models/foundation/seg_preview/v1.py
#	inference/core/workflows/core_steps/models/foundation/segment_anything2/v1.py
#	inference/core/workflows/core_steps/models/foundation/segment_anything3/v1.py
#	inference/core/workflows/core_steps/models/foundation/segment_anything3/v2.py
#	inference/core/workflows/core_steps/models/foundation/segment_anything3/v3.py
#	inference/core/workflows/core_steps/models/foundation/segment_anything3_3d/v1.py
#	inference/core/workflows/core_steps/models/foundation/smolvlm/v1.py
#	inference/core/workflows/core_steps/models/foundation/stability_ai/image_gen/v1.py
#	inference/core/workflows/core_steps/models/foundation/stability_ai/inpainting/v1.py
#	inference/core/workflows/core_steps/models/foundation/stability_ai/outpainting/v1.py
#	inference/core/workflows/core_steps/models/foundation/yolo_world/v1.py
#	inference/core/workflows/core_steps/models/roboflow/instance_segmentation/v1.py
#	inference/core/workflows/core_steps/models/roboflow/instance_segmentation/v2.py
#	inference/core/workflows/core_steps/models/roboflow/keypoint_detection/v1.py
#	inference/core/workflows/core_steps/models/roboflow/keypoint_detection/v2.py
#	inference/core/workflows/core_steps/models/roboflow/multi_class_classification/v1.py
#	inference/core/workflows/core_steps/models/roboflow/multi_class_classification/v2.py
#	inference/core/workflows/core_steps/models/roboflow/multi_label_classification/v1.py
#	inference/core/workflows/core_steps/models/roboflow/multi_label_classification/v2.py
#	inference/core/workflows/core_steps/models/roboflow/object_detection/v1.py
#	inference/core/workflows/core_steps/models/roboflow/object_detection/v2.py
#	inference/core/workflows/core_steps/models/roboflow/semantic_segmentation/v1.py
#	inference/core/workflows/core_steps/sinks/roboflow/custom_metadata/v1.py
#	inference/core/workflows/core_steps/sinks/roboflow/dataset_upload/v1.py
#	inference/core/workflows/core_steps/sinks/roboflow/dataset_upload/v2.py
#	inference/core/workflows/core_steps/sinks/roboflow/model_monitoring_inference_aggregator/v1.py
#	inference/core/workflows/core_steps/sinks/slack/notification/v1.py
#	inference/core/workflows/core_steps/sinks/twilio/sms/v1.py
#	inference/core/workflows/core_steps/sinks/twilio/sms/v2.py
#	inference/core/workflows/core_steps/sinks/webhook/v1.py
#	tests/unit/core/cache/test_air_gapped.py
#	tests/unit/core/interfaces/http/test_blocks_describe_airgapped.py
#	tests/unit/core/workflows/test_air_gapped_blocks.py
)
from inference.core.workflows.prototypes.block import BlockAirGappedInfo

logger = logging.getLogger(__name__)
Contributor Author


Is this needed?
