Currently, all AI upstream services are simulated using this fake server method.
I'm concerned that this fake server diverges too much from how a real LLM request behaves.
Should we introduce a dedicated container that runs an LLM fake server?
Originally posted by @membphis in #13307 (comment)
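For reference, a minimal sketch of what such a containerized LLM fake server could serve, assuming an OpenAI-compatible `/v1/chat/completions` endpoint. The handler name, port choice, and canned payload below are illustrative assumptions, not the project's actual test fixture:

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class FakeLLMHandler(BaseHTTPRequestHandler):
    """Hypothetical mock of an OpenAI-compatible chat-completions endpoint."""

    def do_POST(self):
        if self.path != "/v1/chat/completions":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        request = json.loads(self.rfile.read(length) or b"{}")
        # Echo the requested model back, with a canned assistant reply.
        body = json.dumps({
            "id": "chatcmpl-fake",
            "object": "chat.completion",
            "model": request.get("model", "fake-model"),
            "choices": [{
                "index": 0,
                "message": {"role": "assistant", "content": "mock response"},
                "finish_reason": "stop",
            }],
        }).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep test output quiet

def start_fake_llm(port=0):
    """Start the fake server on a background thread; port 0 picks a free port."""
    server = HTTPServer(("127.0.0.1", port), FakeLLMHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Packaging something like this as its own container image (rather than embedding it in the test harness) would let the gateway exercise real HTTP, TLS, and streaming behavior against a stable endpoint, narrowing the gap this issue raises.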