Describe the solution you'd like
When launching ellm_server / api_server.py with `--model_path EmbeddedLLM/Phi-3-mini-4k-instruct-onnx-directml`, the OnnxruntimeEngine should automatically download the weights from the Hugging Face repo and pass the local cache directory path to `og.Model`.
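A minimal sketch of the requested behavior, assuming `huggingface_hub.snapshot_download` is used for the download; `resolve_model_path` is a hypothetical helper, not an existing OnnxruntimeEngine API:

```python
import os


def resolve_model_path(model_path: str) -> str:
    """Return a local directory containing the model weights.

    If model_path is already a local directory, use it as-is.
    Otherwise treat it as a Hugging Face repo id and download the
    weights into the HF cache, returning the cached snapshot path.
    """
    if os.path.isdir(model_path):
        return model_path
    # Lazy import so purely local paths work without huggingface_hub.
    from huggingface_hub import snapshot_download
    return snapshot_download(repo_id=model_path)


# The engine could then pass the resolved path to onnxruntime-genai:
# import onnxruntime_genai as og
# model = og.Model(resolve_model_path(args.model_path))
```

This keeps the current behavior for users who pass a local directory, while repo ids like `EmbeddedLLM/Phi-3-mini-4k-instruct-onnx-directml` are fetched transparently.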