Can Pie support deploying Agent applications (clients) and LLM Serving Systems (servers) on different machines? #129

Gallopm · 2025-10-20T07:35:05Z


Hi! I have read the SOSP'25 paper for this project. It is excellent work that provides great programming flexibility for customizing the inference process of LLM applications. I do have a question, though.

After reading the project documentation, my understanding is that Pie deploys the agent application and the large language model on the same GPU-equipped machine, so that an inferlet running on that machine can take the place of the agent application and perform every interaction with the external environment itself. However, many existing agent applications talk to LLM serving systems hosted on cloud vendors' servers. This is what lets me use a coding agent on my own machine, which has no GPU and no locally deployed model.

Is Pie compatible with this kind of deployment? If so, can it still benefit from reduced network communication overhead? If not, does this limit Pie's applicability to some extent? Users of agents today do not necessarily deploy models locally, and even when they do, a local deployment is unlikely to see as many requests as a cloud one, so part of the locally built LLM serving system's capacity would be wasted.

Answered by ingim


ingim · 2025-10-24T07:13:49Z

Maintainer

Dear @Gallopm,

Thanks for opening the first discussion thread! Pie’s integrated compute and I/O primarily benefit use cases where the agent logic can be executed without external actions (e.g., evaluating symbolic expressions).

In the scenario you mentioned, such as modifying a user’s files, external I/O is required, so Pie does not gain the advantage of reduced boundary crossings in that case.

This "split compute" setup does not limit Pie’s applicability. One can still use the send and receive APIs to interact with the user at runtime, and further improve efficiency through application-specific KV cache management.
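
To make the split setup more concrete, here is a toy Python model of the idea. It is only a sketch and does not use Pie's actual API: `ToyChannel`, `ToyInferlet`, `run_split`, and `run_stateless` are hypothetical names, whitespace-separated words stand in for tokens, and a plain list stands in for the KV cache. It compares a remote client that resends the whole history on every turn with one that sends only the new tool output to a server-side program that keeps its context between turns.

```python
from collections import deque


class ToyChannel:
    """In-process stand-in for a network link; counts what crosses it."""

    def __init__(self):
        self.queue = deque()
        self.crossings = 0  # number of messages sent
        self.tokens = 0     # number of whitespace-separated tokens sent

    def send(self, text):
        self.queue.append(text)
        self.crossings += 1
        self.tokens += len(text.split())

    def receive(self):
        return self.queue.popleft()


class ToyInferlet:
    """Hypothetical server-side program that keeps its context across turns.

    The `context` list is only a stand-in for a retained KV cache.
    """

    def __init__(self, system_prompt):
        self.context = system_prompt.split()

    def step(self, channel):
        # Append only the newly received tokens; earlier context is reused.
        self.context.extend(channel.receive().split())
        return len(self.context)


def run_split(system_prompt, tool_results):
    """Remote client sends only each new tool output to the server."""
    channel = ToyChannel()
    inferlet = ToyInferlet(system_prompt)
    for result in tool_results:
        channel.send(result)
        inferlet.step(channel)
    return channel.tokens, len(inferlet.context)


def run_stateless(system_prompt, tool_results):
    """Remote client resends the full history on every turn."""
    history = system_prompt.split()
    sent = 0
    for result in tool_results:
        history.extend(result.split())
        sent += len(history)
    return sent, len(history)


if __name__ == "__main__":
    prompt = "you are a coding agent"
    results = ["file saved", "tests passed ok"]
    print("split:    ", run_split(prompt, results))
    print("stateless:", run_stateless(prompt, results))
```

Both variants still cross the network once per external action, which matches the point above that external I/O removes the boundary-crossing advantage. In this toy model, the difference is in how much context has to be resent and reprocessed per turn.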
