GitHub - UNSTOPBIL/flowforge-ai: OpenEnv RL environment for enterprise workflow automation — Meta PyTorch Hackathon 2024

title	FlowForge AI
emoji	🔧
colorFrom	blue
colorTo	indigo
sdk	docker
app_port	7860

An OpenEnv-compatible Reinforcement Learning environment for Enterprise Workflow Automation.

FlowForge simulates back-office operations in which LLM agents act as automated HR and operations assistants, learning to synthesize information, use tools, and recover from realistic errors.

✨ Key Features

Genuine Enterprise Operations: Move beyond toy environments. Agents read files, search employee databases, run SQL queries, schedule meetings, and send emails.
Strictly Defined Action Space: Every action is validated with Pydantic, so hallucinated tool calls are rejected before they execute.
Task-Aware Reward Shaping: Dense reward signals that adapt based on the task (e.g., read_file is crucial for hard tasks, but optional for easy ones).
Anti-Loop Architecture: Penalizes repeated and duplicate actions so that agents learn to plan efficiently instead of looping.
Zero-Cost Baseline: Run locally and test deterministically without eating up OpenAI credits.

⚙️ How it Works

graph TD
    A[LLM Agent] -->|Action JSON| B(FlowForge Environment)
    B -->|Validation| C{Valid Tool?}
    C -- No --> D[Error Observation + Negative Reward]
    C -- Yes --> E[Execute Tool]
    E --> F(State Tracker)
    F -->|Objective Check| G{Task Complete?}
    G -- Yes --> H[Success Observation + Finish Reward]
    G -- No --> I[Result Observation + Progress Reward]
    D --> A
    I --> A
    H --> J((Episode End))
    
    classDef default fill:#1f2937,stroke:#3b82f6,stroke-width:2px,color:#f3f4f6;
    classDef logic fill:#374151,stroke:#f59e0b,stroke-width:2px,color:#f3f4f6;
    classDef success fill:#064e3b,stroke:#10b981,stroke-width:2px,color:#d1fae5;
    classDef fail fill:#7f1d1d,stroke:#ef4444,stroke-width:2px,color:#fee2e2;

    class C,G logic;
    class H success;
    class D fail;
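
The control flow in the diagram can be sketched in a few lines of Python. This is an illustration only, not the code in flowforge/env.py: the `step` function, the `TOOLS` registry, the stub tools, and the idea that a tool name doubles as the objective it satisfies are all assumptions made to mirror the diagram's branches. The reward constants are placeholders borrowed from the reward table in this README.

```python
from typing import Any, Callable, Dict, Set, Tuple

# Hypothetical stub tools; the real ones live under flowforge/tools/.
TOOLS: Dict[str, Callable[..., Dict[str, Any]]] = {
    "search_db": lambda query: {"rows": [{"name": query}]},
    "read_file": lambda file_path: {"content": f"<contents of {file_path}>"},
}

Observation = Dict[str, Any]


def step(action: Dict[str, Any], satisfied: Set[str],
         required: Set[str]) -> Tuple[Observation, float, bool]:
    """Process one action and return (observation, reward, episode_done)."""
    tool = action.get("tool")

    # "Valid Tool?" branch: unknown tools yield an error observation.
    if tool not in TOOLS:
        return {"message": f"unknown tool: {tool}", "error": True}, -0.2, False

    # "Execute Tool": bad parameters surface as a TypeError from the stub.
    try:
        data = TOOLS[tool](**action.get("parameters", {}))
    except TypeError as exc:
        return {"message": str(exc), "error": True}, -0.1, False

    # "State Tracker": simplification, a tool name doubles as its objective.
    satisfied.add(tool)

    # "Task Complete?" branch: end the episode once every objective is met.
    if required <= satisfied:
        return {"message": "task complete", "data": data, "error": False}, 0.1, True
    return {"message": "ok", "data": data, "error": False}, 0.2, False


progress: Set[str] = set()
goal = {"search_db", "read_file"}
for act in [
    {"tool": "delete_all"},
    {"tool": "search_db", "parameters": {"query": "Ada"}},
    {"tool": "read_file", "parameters": {"file_path": "report.txt"}},
]:
    obs, reward, done = step(act, progress, goal)
    print(act["tool"], reward, done)
```

The first action takes the error branch and the episode continues; the last one satisfies the remaining objective and ends the episode.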

Local Setup

# Set up a virtual environment
python -m venv .venv
source .venv/bin/activate

# Install dependencies
pip install -r requirements.txt

# Run the deterministic baseline inference (tests all 3 tasks)
python inference.py

Docker Deployment

# Build the image
docker build -t flowforge-ai .

# Run the container
docker run -p 7860:7860 --cpus=2 --memory=8g flowforge-ai

📊 Environment Specifications

1. Action Space

The Action Space is strictly defined via the Pydantic FlowForgeAction model.

Available Tools:

Tool	Parameters	Description
`search_db`	`query: str`	Search employee/service database
`send_email`	`to: str, subject: str, body: str`	Send simulated email
`read_file`	`file_path: str`	Read internal reports/files
`run_query`	`query: str`	Execute SQL SELECT on database
`schedule_meeting`	`attendees: list, date: str, title: str`	Schedule a calendar meeting
`finish`	(none)	Signal task completion
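
To illustrate how a Pydantic schema can reject malformed or hallucinated tool calls, here is a minimal sketch. It is not the repository's FlowForgeAction: the class name `ActionSketch`, the field names `tool` and `parameters`, the `parse_action` helper, and the use of a `Literal` type are assumptions. Only the tool names come from the table above.

```python
from typing import Any, Dict, Literal, Optional

from pydantic import BaseModel, Field, ValidationError

# Tool names taken from the "Available Tools" table.
ToolName = Literal[
    "search_db", "send_email", "read_file",
    "run_query", "schedule_meeting", "finish",
]


class ActionSketch(BaseModel):
    """Hypothetical stand-in for FlowForgeAction (field names are assumed)."""
    tool: ToolName
    parameters: Dict[str, Any] = Field(default_factory=dict)


def parse_action(raw: Dict[str, Any]) -> Optional[ActionSketch]:
    """Return a validated action, or None if the payload is rejected."""
    try:
        return ActionSketch(**raw)
    except ValidationError:
        return None


ok = parse_action({"tool": "search_db", "parameters": {"query": "Jane Doe"}})
bad = parse_action({"tool": "delete_database", "parameters": {}})
print(ok is not None, bad is None)  # prints: True True
```

In this sketch only the tool name is checked. Validating each tool's own parameters (for example, that `send_email` receives `to`, `subject`, and `body`) would need one model per tool or a custom validator.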

2. Observation Space

Defined via the FlowForgeObservation Pydantic model:

Field	Type	Description
`message`	`str`	Free-text feedback from the environment
`data`	`dict`	Structured result data (query rows, file contents, etc.)
`error`	`bool`	True if the previous action failed
`available_tools`	`list[str]`	Tools available in the current episode
`state_summary`	`dict`	Step count, progress, action history, usage stats
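
As a rough picture of what an agent receives, the table can be written out as a Pydantic model. The field names and types below follow the table, but the class name `ObservationSketch`, the default values, the `step_count` key, and the `render_for_prompt` helper are assumptions rather than the contents of flowforge/models.py.

```python
from typing import Any, Dict, List

from pydantic import BaseModel, Field


class ObservationSketch(BaseModel):
    """Hypothetical stand-in for FlowForgeObservation (defaults are assumed)."""
    message: str
    data: Dict[str, Any] = Field(default_factory=dict)
    error: bool = False
    available_tools: List[str] = Field(default_factory=list)
    state_summary: Dict[str, Any] = Field(default_factory=dict)


def render_for_prompt(obs: ObservationSketch) -> str:
    """Condense an observation into one line an LLM agent could read."""
    status = "ERROR" if obs.error else "OK"
    tools = ", ".join(obs.available_tools)
    return f"[{status}] {obs.message} | tools: {tools}"


obs = ObservationSketch(
    message="Unknown tool 'delete_database'",
    error=True,
    available_tools=["search_db", "finish"],
    state_summary={"step_count": 1},  # key name is illustrative
)
print(render_for_prompt(obs))
# prints: [ERROR] Unknown tool 'delete_database' | tools: search_db, finish
```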

3. Reward Function (Dense)

Signal	Value	Condition
Tool execution bonus	`+0.2 × relevance`	Successful tool call (scaled by task relevance)
Objective progress bonus	`+0.3 × (1 + ratio)`	First time a new objective is satisfied
Sub-goal proximity	`+0.1`	Intermediate progress without hitting objective
Finish reward	`+0.1`	Clean termination via `finish`
Tool failure penalty	`-0.1`	Invalid parameters or execution error
Unknown tool penalty	`-0.2`	Attempting a tool that doesn't exist
Loop penalty	`-0.05 × frequency`	Repeating the same tool (capped at -0.2)
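
The scaled entries in the table are easier to read as arithmetic. The sketch below uses the constants from the table; the function names are hypothetical, and since this README does not say how `relevance`, `ratio`, or `frequency` are computed, they are treated as plain inputs.

```python
def tool_bonus(relevance: float) -> float:
    """Tool execution bonus: +0.2 scaled by task relevance."""
    return 0.2 * relevance


def objective_bonus(ratio: float) -> float:
    """Objective progress bonus: +0.3 x (1 + ratio)."""
    return 0.3 * (1.0 + ratio)


def loop_penalty(frequency: int) -> float:
    """Loop penalty: -0.05 per repeat, never worse than -0.2."""
    return max(-0.2, -0.05 * frequency)


# The cap is reached at four repeats; further repeats cost no more.
for freq in (1, 2, 4, 10):
    print(freq, round(loop_penalty(freq), 2))
```

With these numbers, a single newly satisfied objective is worth at least 0.3, which is more than the largest possible loop penalty of 0.2.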

📈 Base Performance

Evaluated with the deterministic, rule-based baseline agent in inference.py. It makes no LLM calls, so it completes every objective without hallucinated actions.

Task Difficulty	Objectives	Baseline Score (0-1.0)
Easy	Find employee data	1.00 🏆
Medium	DB search + Send email	1.00 🏆
Hard	Read report + SQL Query + Schedule + Email	1.00 🏆
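
One way to read a 0-1.0 objective-based score is as the fraction of a task's objectives that were satisfied. The sketch below works under that assumption only; the actual method in flowforge/grader.py is not described here and may weight objectives differently. The `score` function and the objective identifiers are hypothetical labels derived from the Hard row above.

```python
from typing import Iterable, Set

# Illustrative labels for the four Hard-task objectives in the table.
HARD_OBJECTIVES: Set[str] = {
    "read_report", "sql_query", "schedule_meeting", "send_email",
}


def score(required: Set[str], satisfied: Iterable[str]) -> float:
    """Fraction of required objectives satisfied (assumed scoring rule)."""
    if not required:
        return 0.0
    return len(required & set(satisfied)) / len(required)


print(score(HARD_OBJECTIVES, HARD_OBJECTIVES))               # prints: 1.0
print(score(HARD_OBJECTIVES, ["read_report", "sql_query"]))  # prints: 0.5
```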

📁 Repository Structure

FlowForge/
├── inference.py              # Main inference entry point
├── openenv.yaml              # Environment configuration definitions
├── Dockerfile                # Production container specification
├── requirements.txt          # Python dependencies
├── deploy_to_hf.py           # Deployment automation script
├── flowforge/                # Core Environment Logic
│   ├── env.py                # FlowForgeEnvironment implementation
│   ├── grader.py             # Objective-based scoring methodology
│   ├── models.py             # Pydantic Action/Observation schemas
│   ├── tasks/                # Difficulty presets
│   │   ├── task_easy.py
│   │   ├── task_medium.py
│   │   └── task_hard.py
│   └── tools/                # Mock Enterprise Tools
│       ├── search_db.py
│       ├── send_email.py
│       ├── read_file.py
│       ├── run_query.py
│       └── schedule_meeting.py
└── server/                   # HTTP Wrapper (OpenEnv spec)
    └── app.py                # FastAPI endpoints
