LLM (Large Language Model) inference in a TEE (Trusted Execution Environment) can protect the model, the input prompt, and the output. The key challenges are:
- the performance of CPU-based LLM inference in a TEE
- whether LLM inference can run in a TEE at all
With the significant inference speed-up brought by BigDL-LLM, combined with the Occlum LibOS, high-performance and efficient LLM inference in a TEE is now achievable.
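BigDL-LLM's speed-up on CPUs comes largely from low-bit (e.g. INT4) weight quantization. The sketch below illustrates that idea only; it is plain Python for clarity and is not BigDL-LLM's actual implementation (the helper names `quantize_int4`/`dequantize_int4` are hypothetical):

```python
# Illustrative sketch of symmetric INT4-style weight quantization -- the kind
# of low-bit compression that shrinks model size and memory traffic, which is
# what makes CPU (and hence TEE) LLM inference practical.

def quantize_int4(weights):
    """Map floats to integers in [-8, 7] using one per-tensor scale."""
    scale = max(abs(w) for w in weights) / 7.0
    q = [max(-8, min(7, round(w / scale))) for w in weights]
    return q, scale

def dequantize_int4(q, scale):
    """Recover approximate float weights from the 4-bit codes."""
    return [v * scale for v in q]

weights = [0.12, -0.50, 0.33, 0.07, -0.21]
q, scale = quantize_int4(weights)
recovered = dequantize_int4(q, scale)

# Every 4-bit code lies in range, and each weight is reconstructed
# to within half a quantization step.
assert all(-8 <= v <= 7 for v in q)
assert all(abs(a - b) <= scale / 2 + 1e-9 for a, b in zip(weights, recovered))
```

In practice BigDL-LLM applies such low-bit formats per weight group and fuses the dequantization into optimized matmul kernels, but the storage-versus-accuracy trade-off is the same as in this toy example.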
The chart above shows the overall architecture and workflow.
For step 3, users can adopt the Occlum init-ra AECS solution, which requires no modification to the application.
For more details, please refer to the LLM demo.