2 changes: 1 addition & 1 deletion docs.json
@@ -391,7 +391,7 @@
"flash/cli/overview",
"flash/cli/init",
"flash/cli/login",
"flash/cli/run",
"flash/cli/dev",
"flash/cli/build",
"flash/cli/deploy",
"flash/cli/env",
10 changes: 5 additions & 5 deletions flash/apps/build-app.mdx
@@ -80,10 +80,10 @@ uv pip install -r requirements.txt

## Step 4: Start the local API server

Use `flash run` to start the API server:
Use `flash dev` to start the API server:

```bash
uv run flash run
uv run flash dev
```

Open a new terminal tab or window and test your endpoints using cURL:
@@ -100,21 +100,21 @@ curl -X POST http://localhost:8888/lb_worker/process \
-d '{"input_data": {"message": "Hello from Flash"}}'
```
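
The same request can be built from Python with only the standard library. This is a minimal sketch: `build_process_request` is a hypothetical helper, and the `Content-Type` header is an assumption, since the header flags of the cURL command are not visible in this hunk.

```python
import json
import urllib.request


def build_process_request(
    message: str, base_url: str = "http://localhost:8888"
) -> urllib.request.Request:
    """Build a POST request equivalent to the cURL call above (not yet sent)."""
    body = json.dumps({"input_data": {"message": message}}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/lb_worker/process",
        data=body,
        # Assumed header: the endpoint is expected to accept a JSON body.
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

To send it while the development server is running, pass the result to `urllib.request.urlopen(...)` and decode the response body with `json.loads`.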

If you switch back to the terminal tab where you used `flash run`, you'll see the details of the job's progress.
If you switch back to the terminal tab where you used `flash dev`, you'll see the details of the job's progress.

### Faster testing with auto-provisioning

For development with multiple endpoints, use `--auto-provision` to deploy all resources before testing:

```bash
uv run flash run --auto-provision
uv run flash dev --auto-provision
```

This eliminates cold-start delays by provisioning all serverless endpoints upfront. Endpoints are cached and reused across server restarts, making subsequent runs faster. Resources are identified by name, so the same endpoint won't be re-deployed if the configuration hasn't changed.
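
One way to picture the name-based reuse described above is a cache keyed by endpoint name that stores a fingerprint of the configuration. The sketch below illustrates that idea only; `config_fingerprint` and `needs_deploy` are hypothetical helpers, not part of Flash, and Flash's actual cache format may differ.

```python
import hashlib
import json
from typing import Dict


def config_fingerprint(config: dict) -> str:
    """Hash a JSON-serializable config so that key order does not matter."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def needs_deploy(cache: Dict[str, str], name: str, config: dict) -> bool:
    """Return True if the named endpoint is new or its config changed.

    The cache is updated with the new fingerprint whenever a deploy is needed.
    """
    fingerprint = config_fingerprint(config)
    if cache.get(name) == fingerprint:
        return False
    cache[name] = fingerprint
    return True


if __name__ == "__main__":
    cache: Dict[str, str] = {}
    # Hypothetical configuration values, for illustration only.
    config = {"gpu": "NVIDIA_GEFORCE_RTX_4090", "dependencies": ["torch"]}
    print(needs_deploy(cache, "inference", config))        # first run
    print(needs_deploy(cache, "inference", dict(config)))  # unchanged config
```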

## Step 5: Open the API explorer

Besides starting the API server, `flash run` also starts an interactive API explorer. Point your web browser at [http://localhost:8888/docs](http://localhost:8888/docs) to explore the API.
Besides starting the API server, `flash dev` also starts an interactive API explorer. Point your web browser at [http://localhost:8888/docs](http://localhost:8888/docs) to explore the API.

To run endpoint functions in the explorer:

8 changes: 4 additions & 4 deletions flash/apps/customize-app.mdx
@@ -145,13 +145,13 @@ For details, see:

## Test your customizations

After customizing your app, test locally with `flash run`:
After customizing your app, test locally with `flash dev`:

```bash
flash run
flash dev

# If using uv:
uv run flash run
uv run flash dev
```

This starts a development server at http://localhost:8888 with:
@@ -169,7 +169,7 @@ Make sure to test:

<CardGroup cols={2}>
<Card title="Test locally" href="/flash/apps/local-testing" icon="flask" horizontal>
Use `flash run` for local development and testing.
Use `flash dev` for local development and testing.
</Card>
<Card title="Deploy to Runpod" href="/flash/apps/deploy-apps" icon="rocket" horizontal>
Deploy your application to production with `flash deploy`.
102 changes: 102 additions & 0 deletions flash/apps/deploy-apps.mdx
@@ -369,6 +369,108 @@ async def classify(text: str) -> dict:
    return {"classification": result}
```

## Call deployed endpoints from scripts
Contributor Author

PR #324 introduced implicit endpoint resolution via FLASH_APP and FLASH_ENV environment variables. The flash_context.py:17-37 module reads these env vars, and client.py:236-275 uses them in the @remote wrapper decision tree to route calls through the Flash sentinel service.

Source: runpod/flash#324

Contributor

These environment variables should only be used as an override (e.g., when running the script from a different cwd while still targeting the same app as before).


After deploying your Flash app, you can call your `@Endpoint` functions directly from Python scripts. Flash automatically resolves the app context from your project structure, so in most cases you can run scripts without any additional configuration.

### How it works

When you run a script that calls an `@Endpoint` function, Flash:

1. Detects the app context from the project directory structure.
2. Looks up the deployed endpoint by name within the resolved app and environment.
3. Routes the request to that endpoint using Flash's sentinel service.
4. Returns the result to your script.

This lets you reuse the same `@Endpoint` function definitions to interact with deployed endpoints without modifying your code.
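
The resolution order described above can be sketched roughly as follows. This is an illustration, not Flash's actual implementation: `resolve_flash_context` and the `default_env` fallback are hypothetical, and the sketch assumes the app name is simply the project directory's name, which may differ from how Flash derives it. Only the `FLASH_APP` and `FLASH_ENV` variable names come from this page.

```python
import os
from pathlib import Path
from typing import Mapping, Optional, Tuple


def resolve_flash_context(
    project_dir: Path,
    environ: Optional[Mapping[str, str]] = None,
    default_env: str = "production",  # hypothetical fallback, not a Flash default
) -> Tuple[str, str]:
    """Return (app, env); explicit environment variables win over derived values."""
    environ = os.environ if environ is None else environ
    # Assumption: the directory name stands in for the auto-resolved app name.
    app = environ.get("FLASH_APP") or project_dir.resolve().name
    env = environ.get("FLASH_ENV") or default_env
    if not app:
        raise RuntimeError("no flash context: set FLASH_APP and FLASH_ENV")
    return app, env


if __name__ == "__main__":
    # Context derived from the directory alone.
    print(resolve_flash_context(Path("/work/my-app"), environ={}))
    # Context overridden explicitly, e.g. after moving the script elsewhere.
    print(
        resolve_flash_context(
            Path("/tmp/scratch"),
            environ={"FLASH_APP": "my-app", "FLASH_ENV": "staging"},
        )
    )
```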

### Example: calling within the same script

The simplest approach is to call the endpoint directly in the same file where it's defined:

```python
# gpu_worker.py
import asyncio
from runpod_flash import Endpoint, GpuType

@Endpoint(
    name="inference",
    gpu=GpuType.NVIDIA_GEFORCE_RTX_4090,
    dependencies=["torch"]
)
async def run_inference(data: dict) -> dict:
    import torch
    # Inference logic
    return {"result": "processed"}

async def main():
    result = await run_inference({"input": "data"})
    print(result)

if __name__ == "__main__":
    asyncio.run(main())
```

Run the script:

```bash
python gpu_worker.py
```

### Example: importing from another script

You can also import and call endpoints from a separate script:

```python
# call_inference.py
import asyncio
from gpu_worker import run_inference

async def main():
    # Flash resolves the app context automatically
    result = await run_inference({"input": "data"})
    print(result)

if __name__ == "__main__":
    asyncio.run(main())
```

Run the script:

```bash
python call_inference.py
```

### Override the resolved context

Flash resolves the app name from your project's directory structure. Use `FLASH_APP` and `FLASH_ENV` environment variables to override this automatic resolution when needed.

A common use case is when you move a script to a different directory. Since the resolved app name depends on the directory location, moving the script changes the resolved context. To continue targeting the original app, set `FLASH_APP` explicitly:

```bash
FLASH_APP=my-app python call_inference.py
```

You can also override the environment:

```bash
FLASH_APP=my-app FLASH_ENV=production python call_inference.py
```

### Error without context

If Flash cannot resolve the app context and you haven't set the environment variables, it raises an error:

```text
RuntimeError: no flash context for endpoint 'inference'. either:
- use 'flash dev' for local development
- set FLASH_APP and FLASH_ENV to target a deployed environment
```
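
A calling script may want to turn this error into a clearer, script-specific failure. The sketch below shows one way to do that; `MissingFlashContextError` and `call_or_explain` are hypothetical names, and matching on the message text is an assumption based on the error shown above, so it may break if the wording changes.

```python
import asyncio
from typing import Any, Awaitable, Callable


class MissingFlashContextError(RuntimeError):
    """Hypothetical error type raised when no app context could be resolved."""


async def call_or_explain(
    endpoint_fn: Callable[[dict], Awaitable[Any]], payload: dict
) -> Any:
    """Call an endpoint function and re-raise a missing-context error with a hint."""
    try:
        return await endpoint_fn(payload)
    except RuntimeError as exc:
        # Assumption: the error is recognized by its message text.
        if "no flash context" in str(exc):
            raise MissingFlashContextError(
                "Run the script from the project directory, "
                "or set FLASH_APP and FLASH_ENV."
            ) from exc
        raise


async def fake_endpoint(payload: dict) -> dict:
    """Stand-in for a real @Endpoint function, used only to show the error path."""
    raise RuntimeError("no flash context for endpoint 'inference'")


if __name__ == "__main__":
    try:
        asyncio.run(call_or_explain(fake_endpoint, {"input": "data"}))
    except MissingFlashContextError as exc:
        print(f"Cannot call endpoint: {exc}")
```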

### Automatic context in deployed workers

When Flash deploys your app, it automatically sets `FLASH_APP` and `FLASH_ENV` environment variables on each worker. This enables cross-endpoint communication within your deployed application without additional configuration.

## Troubleshooting

### No @Endpoint functions found
8 changes: 4 additions & 4 deletions flash/apps/initialize-project.mdx
@@ -8,7 +8,7 @@ import { LoadBalancingEndpointsTooltip, QueueBasedEndpointsTooltip } from "/snip

The `flash init` command creates a new Flash project with a complete project structure, including example <LoadBalancingEndpointsTooltip /> and <QueueBasedEndpointsTooltip />, and configuration files. This gives you a working starting point for building Flash applications.

Use `flash init` whenever you want to start a new Flash project, fully configured for you to run `flash run` and `flash deploy`.
Use `flash init` whenever you want to start a new Flash project, fully configured for you to run `flash dev` and `flash deploy`.

## Create a new project

@@ -105,13 +105,13 @@ Once your project is set up:

```bash
# Start the development server
flash run
flash dev

# Open the API explorer
# http://localhost:8888/docs

# If using uv:
uv run flash run
uv run flash dev
```

Make changes to your worker files, and the server reloads automatically. When you're ready, deploy with:
@@ -126,6 +126,6 @@ uv run flash deploy
## Next steps

- [Customize your app](/flash/apps/customize-app) to add endpoints and modify configurations.
- [Test locally](/flash/apps/local-testing) with `flash run`.
- [Test locally](/flash/apps/local-testing) with `flash dev`.
- [Deploy to production](/flash/apps/deploy-apps) with `flash deploy`.
- [View the flash init reference](/flash/cli/init) for all options.
32 changes: 16 additions & 16 deletions flash/apps/local-testing.mdx
@@ -1,10 +1,10 @@
---
title: "Test Flash apps locally"
sidebarTitle: "Test locally"
description: "Use flash run to test your Flash application locally before deploying."
description: "Use flash dev to test your Flash application locally before deploying."
---

The `flash run` command starts a local development server that lets you test your Flash application before deploying to production. The development server runs locally and updates automatically as you edit files.
The `flash dev` command starts a local development server that lets you test your Flash application before deploying to production. The development server runs locally and updates automatically as you edit files.

When you call an `@Endpoint` function, Flash sends the latest function code to Serverless workers on Runpod, so your changes are reflected immediately.

@@ -13,10 +13,10 @@ When you call a `@Endpoint` function, Flash sends the latest function code to Se
From inside your [project directory](/flash/apps/initialize-project), run:

```bash
flash run
flash dev

# If using uv:
uv run flash run
uv run flash dev
```

The server starts at `http://localhost:8888` by default. Your endpoints are available immediately for testing, and `@Endpoint` functions provision Serverless endpoints on first call.
@@ -25,14 +25,14 @@ The server starts at `http://localhost:8888` by default. Your endpoints are avai

```bash
# Change port
flash run --port 3000
flash dev --port 3000

# Make accessible on network
flash run --host 0.0.0.0
flash dev --host 0.0.0.0

# If using uv:
uv run flash run --port 3000
uv run flash run --host 0.0.0.0
uv run flash dev --port 3000
uv run flash dev --host 0.0.0.0
```

## Test your endpoints
@@ -96,17 +96,17 @@ print(response.json())
The first call to an `@Endpoint` function provisions a Serverless endpoint, which takes 30-60 seconds. Use `--auto-provision` to provision all endpoints at startup:

```bash
flash run --auto-provision
flash dev --auto-provision

# If using uv:
uv run flash run --auto-provision
uv run flash dev --auto-provision
```

This scans your project for `@Endpoint` functions and deploys them before the server starts accepting requests. Endpoints are cached in `.flash/resources.pkl` and reused across server restarts.

## How it works

With `flash run`, Flash starts a local development server alongside remote Serverless endpoints:
With `flash dev`, Flash starts a local development server alongside remote Serverless endpoints:

```mermaid
%%{init: {'theme':'base', 'themeVariables': { 'primaryColor':'#9289FE','primaryTextColor':'#fff','primaryBorderColor':'#9289FE','lineColor':'#5F4CFE','secondaryColor':'#AE6DFF','tertiaryColor':'#FCB1FF','edgeLabelBackground':'#5F4CFE', 'fontSize':'14px','fontFamily':'font-inter'}}}%%
@@ -146,11 +146,11 @@ flowchart TB
| `@Endpoint` function code | Runpod Serverless |
| Endpoint storage | Runpod Serverless |

Your code updates automatically as you edit files. Endpoints created by `flash run` are prefixed with `live-` to distinguish them from production endpoints.
Your code updates automatically as you edit files. Endpoints created by `flash dev` are prefixed with `live-` to distinguish them from production endpoints.

## Clean up after testing

Endpoints created by `flash run` persist until you delete them. To clean up:
Endpoints created by `flash dev` persist until you delete them. To clean up:

```bash
# List all endpoints
@@ -179,10 +179,10 @@ Flash automatically selects the next available port if your specified port is in
Use `--auto-provision` to eliminate cold-start delays:

```bash
flash run --auto-provision
flash dev --auto-provision

# If using uv:
uv run flash run --auto-provision
uv run flash dev --auto-provision
```

**Authentication errors**
@@ -210,4 +210,4 @@ Values in your `.env` file are only available locally for CLI commands. They are

- [Deploy to production](/flash/apps/deploy-apps) when your app is ready.
- [Clean up endpoints](/flash/cli/undeploy) after testing.
- [View the flash run reference](/flash/cli/run) for all options.
- [View the flash dev reference](/flash/cli/dev) for all options.
4 changes: 2 additions & 2 deletions flash/apps/overview.mdx
@@ -59,7 +59,7 @@ Building a Flash application follows a clear progression from initialization to
Start a local development server to test your application:

```bash
flash run
flash dev
```

Your app runs locally and updates automatically. When you call an `@Endpoint` function, Flash sends the latest code to Runpod workers. [Learn more about local testing](/flash/apps/local-testing).
@@ -102,7 +102,7 @@ Flash uses a two-level organizational structure: **apps** (project containers) a
Create boilerplate code for a new Flash project with `flash init`.
</Card>
<Card title="Test locally" href="/flash/apps/local-testing" icon="flask" horizontal>
Use `flash run` for local development and testing.
Use `flash dev` for local development and testing.
</Card>
<Card title="Deploy to Runpod" href="/flash/apps/deploy-apps" icon="rocket" horizontal>
Deploy your application to production with `flash deploy`.
2 changes: 1 addition & 1 deletion flash/cli/build.mdx
@@ -164,7 +164,7 @@ ls .flash/.build/
## Related commands

- [`flash deploy`](/flash/cli/deploy) - Build and deploy in one step (includes `--preview` option for local testing)
- [`flash run`](/flash/cli/run) - Start development server
- [`flash dev`](/flash/cli/dev) - Start development server
- [`flash env`](/flash/cli/env) - Manage environments

<Note>
6 changes: 3 additions & 3 deletions flash/cli/deploy.mdx
@@ -214,9 +214,9 @@ flash deploy --exclude scipy,pandas

See [`flash build` - Managing deployment size](/flash/cli/build#managing-deployment-size) for more details.

## flash run vs flash deploy
## flash dev vs flash deploy

See [`flash run`](/flash/cli/run#flash-run-vs-flash-deploy) for a detailed comparison of local development vs production deployment.
See [`flash dev`](/flash/cli/dev#flash-dev-vs-flash-deploy) for a detailed comparison of local development vs production deployment.

## Troubleshooting

@@ -252,7 +252,7 @@ export RUNPOD_API_KEY="your_key_here"
## Related commands

- [`flash build`](/flash/cli/build) - Build without deploying
- [`flash run`](/flash/cli/run) - Local development server
- [`flash dev`](/flash/cli/dev) - Local development server
- [`flash env`](/flash/cli/env) - Manage environments
- [`flash app`](/flash/cli/app) - Manage applications
- [`flash undeploy`](/flash/cli/undeploy) - Remove endpoints