A lot of existing code uses the OpenAI API, could you also support this? For example, FastChat and vLLM both provide OpenAI-compatible API servers (https://github.com/lm-sys/FastChat/blob/main/docs/openai_api.md).