Rather than hard-coding the supported inference services, models, and corresponding implementation classes in CrossProviderInferenceEngine, how about making this data-driven? E.g., move the definitions to a YAML file shipped in the distribution, with an option for users to add their own custom entries. Even better, let users put their entries in a separate YAML file that is looked up along a defined search path. This would make it easy for people to add new services, like Replicate, or override existing entries, without code changes.
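To make the search-path idea concrete, here is a rough sketch of how layered registry files could be found and merged. Everything in it is an assumption for illustration: the file name `inference_providers.yaml`, the `engine_class`/`models` layout, and the function names are hypothetical and not part of the current code. It assumes PyYAML for parsing.

```python
from pathlib import Path
from typing import Any, Dict, Iterable, List

# Hypothetical file name; the real name would be chosen by the project.
REGISTRY_FILENAME = "inference_providers.yaml"

# Assumed shape: {provider: {"engine_class": "...", "models": {alias: model_id}}}
Registry = Dict[str, Dict[str, Any]]


def find_registry_files(search_path: Iterable[str]) -> List[Path]:
    """Return the registry files that exist, in search-path order."""
    candidates = (Path(d).expanduser() / REGISTRY_FILENAME for d in search_path)
    return [p for p in candidates if p.is_file()]


def merge_registries(base: Registry, override: Registry) -> Registry:
    """Merge two registries; `override` wins per provider field and per model."""
    # Copy the base entries (and their model tables) so inputs are not mutated.
    merged: Registry = {
        name: {**entry, "models": dict(entry.get("models", {}))}
        for name, entry in base.items()
    }
    for name, entry in override.items():
        current = merged.setdefault(name, {"models": {}})
        models = {**current.get("models", {}), **entry.get("models", {})}
        current.update(entry)       # override scalar fields such as engine_class
        current["models"] = models  # keep the union of both model tables
    return merged


def load_registry(search_path: Iterable[str]) -> Registry:
    """Load every registry file on the path; later directories override earlier ones."""
    import yaml  # PyYAML, imported lazily so only file loading needs it

    registry: Registry = {}
    for path in find_registry_files(search_path):
        with path.open(encoding="utf-8") as f:
            registry = merge_registries(registry, yaml.safe_load(f) or {})
    return registry
```

With this layout, the distribution's own directory would come first on the search path and user directories after it, so a user file can add a Replicate entry or replace a single model mapping without touching the shipped defaults.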
As an alternative, allow users to pass the same information directly in the constructor call.
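The constructor variant could look roughly like the sketch below. `ConfigurableEngine` is a hypothetical stand-in, not the actual CrossProviderInferenceEngine signature, and the `provider_models` parameter and the default table contents are made up for illustration.

```python
from typing import Dict, Optional

# Placeholder defaults; the real table is the one currently hard-coded
# in CrossProviderInferenceEngine.
DEFAULT_PROVIDER_MODELS: Dict[str, Dict[str, str]] = {
    "provider_a": {"model-x": "provider-a/model-x"},
}


class ConfigurableEngine:
    """Hypothetical stand-in showing user-supplied provider entries."""

    def __init__(
        self,
        model: str,
        provider: str,
        provider_models: Optional[Dict[str, Dict[str, str]]] = None,
    ) -> None:
        # Start from a copy of the defaults so they are never mutated.
        table = {name: dict(models) for name, models in DEFAULT_PROVIDER_MODELS.items()}
        # User entries add new providers or override individual model mappings.
        for name, models in (provider_models or {}).items():
            table.setdefault(name, {}).update(models)

        if provider not in table:
            raise ValueError(f"Unknown provider: {provider!r}")
        if model not in table[provider]:
            raise ValueError(f"Model {model!r} is not defined for provider {provider!r}")

        self.provider = provider
        self.model_id = table[provider][model]


# Usage: add a Replicate entry at call time, with no code change to the engine.
engine = ConfigurableEngine(
    model="model-x",
    provider="replicate",
    provider_models={"replicate": {"model-x": "replicate/model-x"}},
)
```

The two approaches are compatible: the constructor argument could simply be the last layer applied on top of whatever the YAML files provide.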