File-based RPC for running Python functions across network-isolated nodes.
Designed for HPC clusters where compute nodes lack internet access but share a filesystem with login nodes that do.
```bash
pip install fileproxy
```

Or from source:

```bash
git clone https://github.com/tboulet/fileproxy.git
cd fileproxy
pip install -e .
```

Create a server script (example here with `litellm.completion` as the function to proxy):
```python
# server_script.py
import fileproxy
import litellm

if __name__ == "__main__":
    fileproxy.run_server({
        "litellm_completion": litellm.completion,
    })
```

Run it on the login node:

```bash
python server_script.py
```

Tip: On HPC clusters, run the server in a persistent terminal session (e.g., tmux) so it survives SSH disconnections. See guide_TMUX.md for a quick reference.
```python
import fileproxy

# Create a proxy that behaves like the original function
completion = fileproxy.proxy("litellm_completion")

# Use it exactly like litellm.completion
response = completion(model="gpt-4", messages=[{"role": "user", "content": "Hello"}])
```

The proxy serializes the arguments to a file, the server picks the file up, runs the real function, and writes the result back. The proxy polls for the result and returns it.
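Conceptually, the round trip can be sketched with nothing but `pickle` and the filesystem. This is a simplified illustration, not fileproxy's actual implementation; the file names and layout here are made up:

```python
import os
import pickle
import tempfile

base = tempfile.mkdtemp()  # stands in for the shared filesystem

# --- Client side: serialize the call to a request file ---
request = {"args": (), "kwargs": {"x": 21}}
with open(os.path.join(base, "request.pkl"), "wb") as f:
    pickle.dump(request, f)

# --- Server side: read the request, run the real function ---
def double(x):
    return 2 * x

with open(os.path.join(base, "request.pkl"), "rb") as f:
    req = pickle.load(f)
result = double(*req["args"], **req["kwargs"])

# Write to a temp file, then rename, so the client never sees a partial file
tmp_path = os.path.join(base, "response.pkl.tmp")
with open(tmp_path, "wb") as f:
    pickle.dump(result, f)
os.replace(tmp_path, os.path.join(base, "response.pkl"))

# --- Client side: read the response and return it ---
with open(os.path.join(base, "response.pkl"), "rb") as f:
    print(pickle.load(f))  # 42
```

The real library adds polling, sentinels, timeouts, and error handling on top of this basic write/read cycle.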
Register multiple functions on the same server:
```python
# Server
import fileproxy
import litellm
import requests

if __name__ == "__main__":
    fileproxy.run_server({
        "litellm_completion": litellm.completion,
        "http_post": requests.post,
        "http_get": requests.get,
    })
```

```python
# Client
import fileproxy

completion = fileproxy.proxy("litellm_completion")
http_post = fileproxy.proxy("http_post")
http_get = fileproxy.proxy("http_get")
```

By default, fileproxy stores request/response files in `~/.cache/fileproxy/`. Override with:
- Constructor argument:

  ```python
  fileproxy.proxy("func", base_dir="/path/to/dir")
  ```

- Environment variable:

  ```bash
  export FILEPROXY_DIR=/path/to/dir
  ```

The server and client must use the same base directory on a shared filesystem.
By default, the server processes requests sequentially. To handle multiple requests concurrently (useful when registering multiple functions or serving multiple clients):
```python
# Process up to 4 requests in parallel
fileproxy.run_server(functions, workers=4)
```

With `workers=1` (the default), requests are executed one at a time. With `workers=2` or more, requests are dispatched to a thread pool. This is particularly useful when mixing slow functions (e.g., LLM calls) with fast ones (e.g., HTTP requests): a slow call won't block unrelated requests.

Note: Registered functions must be thread-safe when using `workers > 1`. Most common use cases (HTTP requests, API calls) are thread-safe.
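If a function is not thread-safe, one option is to serialize access to it with a lock before registering it. This is a generic Python pattern, not a fileproxy feature:

```python
import threading
from functools import wraps

def locked(func):
    """Wrap func so only one thread can run it at a time."""
    lock = threading.Lock()

    @wraps(func)
    def wrapper(*args, **kwargs):
        with lock:
            return func(*args, **kwargs)

    return wrapper

# Example: a read-modify-write sequence that would race under concurrency
state = {"count": 0}

@locked
def bump():
    current = state["count"]
    state["count"] = current + 1

threads = [threading.Thread(target=bump) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(state["count"])  # 8
```

The wrapped function can then be passed to `run_server` in place of the original, at the cost of losing parallelism for that one function.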
```python
# Wait up to 15s for the server to acknowledge the request (default: 10s)
func = fileproxy.proxy("my_func", no_server_timeout=15.0)
```

The timeout only applies while waiting for the server to acknowledge the request (pick it up). Once the server starts processing, the client waits indefinitely, so slow functions will not cause false timeouts.
```python
# Server checks for new requests every 0.5s (default: 0.2s)
fileproxy.run_server(functions, poll_interval=0.5)

# Client checks for response every 0.2s (default: 0.1s)
func = fileproxy.proxy("my_func", poll_interval=0.2)
```

```
Compute Node (no internet)           Login Node (has internet)
─────────────────────────            ──────────────────────────
proxy("func")(args, kwargs)          Server polls input dir
  │                                  │
  ├─ Write request.pkl ──────────────┤
  │  to input dir                    ├─ Read request.pkl
  │                                  ├─ Create _started sentinel
  │  (client sees _started,          ├─ Call func(*args, **kwargs)
  │   disables timeout)              ├─ Write response.pkl (atomic)
  ├──────────────────────────────────┤  to output dir
  ├─ Read response.pkl               │
  ├─ Return result                   │
```
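The client-side behavior implied by the diagram — time out only until the `_started` sentinel appears, then wait indefinitely — can be sketched as a polling loop. This is a simplified illustration with made-up file names, not fileproxy's actual code:

```python
import os
import tempfile
import time

def wait_for_response(response_path, started_path,
                      no_server_timeout=10.0, poll_interval=0.1):
    """Poll for the response file; time out only before the server starts."""
    deadline = time.monotonic() + no_server_timeout
    while not os.path.exists(response_path):
        # Once the _started sentinel exists, the server owns the request:
        # disable the timeout and wait as long as the function takes.
        if not os.path.exists(started_path) and time.monotonic() > deadline:
            raise TimeoutError("server did not acknowledge the request")
        time.sleep(poll_interval)
    return response_path

# Demo: pretend the server already wrote the response file
base = tempfile.mkdtemp()
resp = os.path.join(base, "response.pkl")
open(resp, "wb").close()
print(wait_for_response(resp, os.path.join(base, "_started")))
```

In the real library the failure case raises `ServerNotRunningError` rather than the generic `TimeoutError` used here.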
```
~/.cache/fileproxy/
├── func_name_1/
│   ├── input/    # Request files (.pkl)
│   └── output/   # Response files (.pkl) + _started sentinels
├── func_name_2/
│   ├── input/
│   └── output/
├── logs/
│   └── server_20260310_143000.log
└── server_heartbeat.json
```
fileproxy uses custom exception types to distinguish infrastructure errors from function errors:
```python
import fileproxy
from fileproxy import FileProxyError, ServerNotRunningError

func = fileproxy.proxy("my_func")

try:
    result = func(args)
except ServerNotRunningError:
    # fileproxy infrastructure problem: server is not running
    print("Start the fileproxy server!")
except FileProxyError:
    # Other fileproxy infrastructure problem
    print("Something went wrong with the file proxy")
except ValueError:
    # Exception raised by the actual function on the server side
    # (re-raised with original type)
    print("The function itself failed")
```

- `FileProxyError`: Base class for all fileproxy infrastructure errors.
- `ServerNotRunningError(FileProxyError)`: Server did not acknowledge the request within the timeout.
- Server-side function exceptions are re-raised with their original type (not wrapped in `FileProxyError`).
When the proxied function raises an exception on the server, the proxy re-raises it on the client with the original exception type in most cases. For example, a server-side `ValueError("bad input")` becomes a client-side `ValueError("bad input")`.

However, some exception classes have non-standard `__init__` signatures that prevent Python's pickle from reconstructing them (e.g., `litellm.RateLimitError` requires `llm_provider` and `model` arguments). In these cases, the original exception cannot be faithfully reconstructed, so the proxy raises a `RuntimeError` instead, with a message of the form:
```
RuntimeError: Server-side RateLimitError: rate limited
```
In summary:
- Standard exceptions (e.g., `ValueError`, `TypeError`, `KeyError`, and most custom exceptions with a simple `__init__(self, message)` signature): re-raised with original type and message.
- Non-picklable exceptions (non-standard `__init__` that fails to round-trip through pickle): raised as `RuntimeError("Server-side {OriginalType}: {original_message}")`.
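The distinction comes down to whether the exception instance survives a pickle round trip. A quick self-contained check (generic Python, not fileproxy code; `NeedsExtras` is a hypothetical stand-in for classes like `litellm.RateLimitError`):

```python
import pickle

class SimpleError(Exception):
    """Standard __init__(self, message): pickle rebuilds it from .args."""

class NeedsExtras(Exception):
    """Non-standard __init__ with required extra arguments."""
    def __init__(self, message, provider, model):
        super().__init__(message)
        self.provider = provider
        self.model = model

def round_trips(exc):
    """True if the exception can be pickled and unpickled with its type intact."""
    try:
        restored = pickle.loads(pickle.dumps(exc))
        return type(restored) is type(exc)
    except Exception:
        return False

print(round_trips(SimpleError("bad input")))               # True
print(round_trips(NeedsExtras("rate limited", "x", "y")))  # False
```

Unpickling recreates the exception by calling `Type(*exc.args)`; since `NeedsExtras.args` holds only the message, the two required extra parameters are missing and reconstruction fails, which is exactly the case the proxy downgrades to `RuntimeError`.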
Server logs are written to `{base_dir}/logs/server_YYYYMMDD_HHMMSS.log` and also printed to the server terminal. Each log file corresponds to one server session.
Do not run multiple fileproxy servers with the same `base_dir`. On startup, the server checks for an existing heartbeat and raises `FileProxyError` if another server appears to be running. To override and kill the old server, use `force=True`:
```python
# force=True signals the old server to stop, waits for it to shut down,
# then starts the new server
fileproxy.run_server(functions, force=True)
```

If you need truly independent servers running simultaneously, use different `base_dir` values:

```bash
FILEPROXY_DIR=~/.cache/fileproxy-project-a python server_a.py
FILEPROXY_DIR=~/.cache/fileproxy-project-b python server_b.py
```

When you restart the server, it clears all pending request/response files. Any client calls that were in-flight will eventually time out with `ServerNotRunningError`. This is by design: it prevents stale requests from a previous session from being processed.
From any node that shares the filesystem:
```python
import fileproxy

info = fileproxy.status()
print(info["alive"])                # True/False
print(info["functions"])            # ["litellm_completion", "http_post", ...]
print(info["pid"])                  # Server process ID
print(info["requests_processed"])   # Total requests handled
```

- Atomic writes: Responses are written to a `.tmp` file then renamed, preventing clients from reading partial data.
- Started sentinel: When the server begins processing a request, it creates a `_started` marker file. The client uses this to distinguish "server is processing (wait)" from "server is not running (fail fast)."
- Exception propagation: If the function raises an exception on the server, the exception object is pickled and re-raised on the client side with its original type.
- Unpicklable response handling: If the server cannot pickle the response (e.g., it contains open file handles), the client receives a `FileProxyError` instead of hanging.
- Cleanup: Request, response, and sentinel files are removed after processing.
- Startup cleanup: The server clears stale files from previous runs on startup.
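The atomic-write pattern from the first bullet is a standard POSIX idiom: write to a temporary file in the same directory, then rename it into place, so readers see either no file or the complete file, never a partial one. A minimal sketch (illustrative, not fileproxy's actual code):

```python
import os
import pickle
import tempfile

def atomic_pickle_dump(obj, path):
    """Write obj to path so readers never observe a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must be on the same filesystem for the rename to be atomic
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as f:
            pickle.dump(obj, f)
            f.flush()
            os.fsync(f.fileno())  # ensure bytes hit disk before the rename
        os.replace(tmp_path, path)  # atomic on POSIX filesystems
    except BaseException:
        os.unlink(tmp_path)
        raise

target = os.path.join(tempfile.mkdtemp(), "response.pkl")
atomic_pickle_dump({"result": 42}, target)
with open(target, "rb") as f:
    print(pickle.load(f))  # {'result': 42}
```

A client polling for `response.pkl` therefore only ever opens a fully written file, which is what makes file-based polling safe without any locking.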
- Arguments and return values must be picklable (most Python objects are: strings, dicts, lists, numbers, dataclasses, etc. Lambdas, open file handles, and generators are not).
- Latency overhead of ~100-200ms per call due to filesystem polling.
- Server and client must share a filesystem (e.g., an NFS home directory on HPC clusters). Local-only filesystems like `/tmp` won't work across nodes.
- If the server crashes (e.g., killed by OOM) while processing a request, the client will wait indefinitely for that request. Restart the server to recover.
- If the server and client use different Python environments, server-side exceptions from libraries not installed on the client will be raised as `RuntimeError` instead of their original type.
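To check ahead of time whether a value will make it through the proxy, a `pickle.dumps` probe is enough. This is a generic Python check, not part of the fileproxy API:

```python
import pickle

def is_picklable(obj):
    """Return True if obj survives pickle serialization."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, TypeError, AttributeError):
        return False

print(is_picklable({"model": "gpt-4", "temperature": 0.7}))  # True
print(is_picklable(lambda x: x))                             # False
print(is_picklable((i for i in range(3))))                   # False (generator)
```

Running such a probe on arguments before calling the proxy turns an obscure serialization failure into an early, explicit error on the client side.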
MIT