Summary
GatewayRequestBase sets extra="allow", so any extra field on a request body is forwarded to the worker. An attacker can inject internal-only options (for example overriding data: with a URI they control) and spawn simulations against arbitrary datasets. No payload size cap is enforced either.
Location
projects/policyengine-api-simulation/src/modal/gateway/models.py:23-33
projects/policyengine-api-simulation/src/modal/gateway/endpoints.py:157-163, 214-222
What goes wrong
class GatewayRequestBase(BaseModel):
    country: str
    version: Optional[str] = None
    telemetry: TelemetryEnvelope | None = None
    model_config = ConfigDict(
        extra="allow",
        populate_by_name=True,
    )  # Pass through all other fields
Then in endpoints.py:
payload = request.model_dump(
    exclude={"version", "telemetry"},
    mode="json",
)
...
sim_func = modal.Function.from_name(app_name, "run_simulation")
call = sim_func.spawn(payload)
Every un-modeled field flows verbatim into run_simulation_impl, which passes them to SimulationOptions.model_validate(simulation_params) (simulation.py:90). A caller can:
- Supply data: "gs://attacker-bucket/malicious.h5" to point the worker at a dataset they control (see _build_policyengine_bundle, endpoints.py:65-68, which already treats any "://" string as a trusted URI).
- Inject internal-only SimulationOptions fields that were never meant to be user-controllable.
- Supply arbitrarily large blobs (no max_length anywhere); Modal has no built-in body size cap for @modal.asgi_app().
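The passthrough can be reproduced in isolation. The following is a minimal sketch, assuming Pydantic v2; PermissiveRequest and StrictRequest are hypothetical stand-ins that copy only two fields from GatewayRequestBase, and the request body is invented for illustration. It shows an undeclared data key surviving model_dump under extra="allow" and the same body being rejected under extra="forbid".

```python
from typing import Optional

from pydantic import BaseModel, ConfigDict, ValidationError


class PermissiveRequest(BaseModel):
    """Stand-in for GatewayRequestBase as configured today (extra="allow")."""

    model_config = ConfigDict(extra="allow", populate_by_name=True)

    country: str
    version: Optional[str] = None


class StrictRequest(BaseModel):
    """Same declared fields, but unknown keys fail validation."""

    model_config = ConfigDict(extra="forbid", populate_by_name=True)

    country: str
    version: Optional[str] = None


# Hypothetical request body: "data" is not declared on either model.
body = {"country": "us", "data": "gs://attacker-bucket/malicious.h5"}

# With extra="allow", the undeclared key is kept on the model and is
# included in model_dump, so it would be part of the spawned payload.
payload = PermissiveRequest.model_validate(body).model_dump(
    exclude={"version"},
    mode="json",
)
print(payload)

# With extra="forbid", the same body is rejected before anything is dumped.
try:
    StrictRequest.model_validate(body)
    rejected = False
except ValidationError as exc:
    rejected = True
    print(exc.errors()[0]["type"])
```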
Suggested fix
- Change extra="allow" to extra="forbid" and add the real required simulation fields to SimulationRequest / BudgetWindowBatchRequest as typed attributes.
- If passthrough is genuinely needed, define an explicit extra_simulation_options: dict[str, AllowedOption] with a closed-set discriminator.
- Refuse data: values that do not match an allowlist of known URIs (the DATASET_URIS table is already a natural whitelist).
- Add a payload size cap via FastAPI middleware.
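The allowlist and size-cap items can be sketched with the standard library alone. This is an illustration under stated assumptions, not the project's code: the DATASET_URIS entry is a placeholder (the real table's contents and shape are not shown in this report, so a name-to-URI mapping is assumed), and resolve_dataset, MaxBodySizeMiddleware, and the 1 MB default are hypothetical. The middleware is written as plain ASGI and buffers at most max_bytes before handing the request to the app.

```python
import json

# Placeholder entry; the real DATASET_URIS table lives in the gateway code.
DATASET_URIS = {
    "example_dataset": "gs://example-bucket/example_dataset.h5",
}


def resolve_dataset(value: str) -> str:
    """Return an allowlisted URI, or raise for anything else."""
    if value in DATASET_URIS:
        return DATASET_URIS[value]
    if value in DATASET_URIS.values():
        return value
    raise ValueError(f"dataset {value!r} is not on the allowlist")


class MaxBodySizeMiddleware:
    """Plain ASGI middleware that answers 413 for bodies above max_bytes."""

    def __init__(self, app, max_bytes: int = 1_000_000):
        self.app = app
        self.max_bytes = max_bytes

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            await self.app(scope, receive, send)
            return

        # Fast path: a declared Content-Length that is already too large.
        headers = dict(scope.get("headers") or [])
        declared = headers.get(b"content-length", b"")
        if declared.isdigit() and int(declared) > self.max_bytes:
            await self._reject(send)
            return

        # Slow path: read the body ourselves, stopping once the cap is hit,
        # so chunked uploads without Content-Length are covered too.
        chunks, size, more_body = [], 0, True
        while more_body:
            message = await receive()
            if message["type"] == "http.disconnect":
                return
            chunk = message.get("body", b"")
            size += len(chunk)
            if size > self.max_bytes:
                await self._reject(send)
                return
            chunks.append(chunk)
            more_body = message.get("more_body", False)

        body = b"".join(chunks)
        replayed = False

        async def replay():
            # Hand the buffered body to the app once, then defer to receive().
            nonlocal replayed
            if not replayed:
                replayed = True
                return {"type": "http.request", "body": body, "more_body": False}
            return await receive()

        await self.app(scope, replay, send)

    async def _reject(self, send):
        payload = json.dumps({"detail": "Request body too large"}).encode()
        await send({
            "type": "http.response.start",
            "status": 413,
            "headers": [
                (b"content-type", b"application/json"),
                (b"content-length", str(len(payload)).encode()),
            ],
        })
        await send({"type": "http.response.body", "body": payload})
```

On a FastAPI app, a class of this shape could be registered with app.add_middleware(MaxBodySizeMiddleware, max_bytes=...). The right limit depends on the largest legitimate simulation request, which this report does not state.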
Severity
High, security. SSRF-like data exfiltration/dataset substitution plus unbounded-payload DoS.