whiteducksoftware/flock


The Problem You Know Too Well

🤯 Prompt Hell: Brittle 500-line prompts that break with every model update.
💥 System Failures: One bad LLM response crashes your entire workflow
🧪 Testing Nightmares: "How do I unit test a prompt?" (You don't.)
🧪 Measuring Quality: "How do I know my prompts are close to optimal?" (You also don't.)
📄 Output Chaos: Parsing unstructured LLM responses into reliable data
⛓️ Orchestration Limits: Moving beyond simple chains and DAGs? Good luck
🚀 Production Gap: Jupyter notebooks don't scale to enterprise systems

After building dozens of AI systems for enterprise clients, we realized the tooling was fundamentally broken.

Build with agents, not against them.

The Flock Solution

What if you could just skip that 'prompt engineering' step?

Flock is an agent framework for declarative AI workflows. You define what goes in and what should come out; the agent handles the how.
No brittle prompts. No guesswork. Just reliable, testable AI agents.

✅ Declarative Contracts: Define inputs/outputs with Pydantic models. Flock handles the LLM complexity.
⚡ Built-in Resilience: Automatic retries, state persistence, and workflow resumption via Temporal.io
🧪 Actually Testable: Clear contracts make agents unit-testable like any other code
🧪 Optimal Quality: Agents possess multiple self-optimization algorithms based on the latest research
🚀 Dynamic Workflows: Self-correcting loops, conditional routing, and intelligent decision-making
🔧 Zero-Config Production: Deploy as REST APIs with one command. Scale without rewriting.

Ready to see it in action?

⚡ Quick Start

from flock.core import Flock, FlockFactory

# 1. Create the main orchestrator
my_flock = Flock(model="openai/gpt-4o")

# 2. Declaratively define an agent
brainstorm_agent = FlockFactory.create_default_agent(
    name="idea_generator",
    input="topic",
    output="catchy_title, key_points"
)

# 3. Add the agent to the Flock
my_flock.add_agent(brainstorm_agent)

# 4. Run the agent!
input_data = {"topic": "The future of AI agents"}
result = my_flock.run(start_agent="idea_generator", input=input_data)

# The result is a Box object (dot-accessible dict)
print(f"Generated Title: {result.catchy_title}")
print(f"Key Points: {result.key_points}")

No 20-line prompt fiddling. Just structured output, every time.
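
Because the contract is just a set of named inputs and outputs, the shape of a result can be checked in a plain unit test without calling a model. The following is a minimal sketch under stated assumptions: `fake_run` is a hypothetical stand-in for `my_flock.run`, and `missing_outputs` is a hypothetical helper; neither is part of Flock.

```python
def missing_outputs(result: dict, output_spec: str) -> list[str]:
    """Return the declared output names that are absent or empty in result."""
    declared = [name.strip() for name in output_spec.split(",") if name.strip()]
    return [name for name in declared if not result.get(name)]


def fake_run(start_agent: str, input: dict) -> dict:
    """Hypothetical stand-in for my_flock.run: canned output, no LLM call."""
    return {
        "catchy_title": f"Why {input['topic']} Matters",
        "key_points": ["point one", "point two"],
    }


result = fake_run(start_agent="idea_generator", input={"topic": "AI agents"})

# The test only asserts on the contract, not on the exact wording.
assert missing_outputs(result, "catchy_title, key_points") == []
```

In a real test suite the stub would be swapped for a recorded or mocked agent run; the assertion on the declared output names stays the same.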

Explore more examples → Flock Showcase Repository

📹 Video Demo

flock_3min_all.mp4

💾 Installation - Use Flock in your project

Get started with the core Flock library:

# Using uv (recommended)
uv pip install flock-core

# Using pip
pip install flock-core

Extras: Install optional dependencies for specific features:

# Common tools (Tavily, Markdownify)
# Quote the requirement so shells such as zsh do not expand the brackets
uv pip install "flock-core[all-tools]"

# All optional dependencies (including tools, docling, etc.)
uv sync --all-extras

🔑 Installation - Develop Flock

git clone https://github.com/whiteducksoftware/flock.git
cd flock

# One-liner dev setup after cloning
pip install poethepoet && poe install

Other available poe tasks:

poe install # Install the project
poe build # Build the project
poe docs # Serve the docs
poe format # Format the code
poe lint # Lint the code

🔑 Environment Setup

Flock uses environment variables (typically in a .env file) for configuration, especially API keys. Create a .env file in your project root:

# .env - Example

# --- LLM Provider API Keys (Required by most examples) ---
# Add keys for providers you use (OpenAI, Anthropic, Gemini, Azure, etc.)
# Refer to litellm docs (https://docs.litellm.ai/docs/providers) for names
OPENAI_API_KEY="your-openai-api-key"
# ANTHROPIC_API_KEY="your-anthropic-api-key"

# --- Tool-Specific Keys (Optional) ---
# TAVILY_API_KEY="your-tavily-search-key"
# GITHUB_PAT="your-github-personal-access-token"

# --- Default Flock Settings (Optional) ---
DEFAULT_MODEL="openai/gpt-4o" # Default LLM if agent doesn't specify

# --- Flock CLI Settings (Managed by `flock settings`) ---
# SHOW_SECRETS="False"
# VARS_PER_PAGE="20"
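
To make the file format concrete, here is a minimal sketch of how such a file can be read using only the standard library. It is an illustration, not how Flock itself loads configuration; `parse_env_text` is a hypothetical helper, and in practice a library such as python-dotenv is the usual choice.

```python
def parse_env_text(text: str) -> dict[str, str]:
    """Parse KEY=value lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.strip()
        if value[:1] in ('"', "'"):
            # Quoted value: keep only the text between the quotes.
            quote = value[0]
            end = value.find(quote, 1)
            value = value[1:end] if end != -1 else value[1:]
        else:
            # Unquoted value: drop a trailing inline comment.
            value = value.split(" #", 1)[0].strip()
        values[key.strip()] = value
    return values


sample = '''
# .env - Example
OPENAI_API_KEY="your-openai-api-key"
# ANTHROPIC_API_KEY="your-anthropic-api-key"
DEFAULT_MODEL="openai/gpt-4o" # Default LLM if agent doesn't specify
'''

config = parse_env_text(sample)
# Fall back to a default when the variable is not set.
model = config.get("DEFAULT_MODEL", "openai/gpt-4o")
print(model)
```

Commented-out keys are ignored, so a provider key takes effect only once its line is uncommented.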

Make sure the .env file is listed in your .gitignore!

🐤 New in Flock 0.4.0 Magpie 🐤

0.4.5 - MCP Support - Declaratively connect to 1000s of different tools!

Create a server

ws_fetch_server = FlockFactory.create_mcp_server(
    name="fetch_server",
    enable_tools_feature=True,
    connection_params=FlockFactory.WebsocketParams(
        url="ws://localhost:4001/message"
    ),
)

Add it to Flock

flock = Flock(
    name="mcp_testbed",
    servers=[
        ws_fetch_server
    ]
)

And tell the flock agents which server to use

webcrawler_agent = FlockFactory.create_default_agent(
    name="webcrawler_agent",
    description="Expert for looking up and retrieving web content",
    input="query: str | User-Query, initial_url: Optional[str] | Optional url to start search from.",
    output="answer: str | Answer to user-query, page_url: str | The url of the page where the answer was found on, page_content: str | Markdown content of the page where the answer was found.",
    servers=[ws_fetch_server], # servers are passed here.
)
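
The `input` and `output` strings above follow a `name: type | description` pattern, with fields separated by commas. The sketch below shows one way such a string could be broken into fields. It is an illustration only, not Flock's parser: `FieldSpec`, `split_top_level`, and `parse_signature` are hypothetical names, and the sketch assumes that descriptions contain no commas and that type hints contain no `|`.

```python
from dataclasses import dataclass


@dataclass
class FieldSpec:
    name: str
    type_hint: str
    description: str


def split_top_level(spec: str) -> list[str]:
    """Split on commas that are not nested inside [] or ()."""
    parts, current, depth = [], [], 0
    for ch in spec:
        if ch in "[(":
            depth += 1
        elif ch in "])":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return [part.strip() for part in parts if part.strip()]


def parse_signature(spec: str) -> list[FieldSpec]:
    """Turn 'name: type | description, ...' into a list of FieldSpec."""
    fields = []
    for part in split_top_level(spec):
        declaration, _, description = part.partition("|")
        name, _, type_hint = declaration.partition(":")
        fields.append(FieldSpec(name.strip(), type_hint.strip(), description.strip()))
    return fields


fields = parse_signature(
    "query: str | User-Query, "
    "initial_url: Optional[str] | Optional url to start search from."
)
for field in fields:
    print(field.name, "|", field.type_hint, "|", field.description)
```

Tracking bracket depth keeps a type such as `dict[str, int]` in one piece instead of splitting it at its inner comma.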

Done! The Flock agent now has access to every tool the server offers.

🚀 REST API – Deploy Flock Agents as REST API Endpoints

Easily deploy your Flock agents as scalable REST API endpoints. Interact with your agent workflows via standard HTTP requests.

The all-in-one flock.serve() method turns your Flock into a proper REST API!

Need custom endpoints to wrap abstract agent logic or add business logic? We've got you. Define them. Declaratively.

word_count_route = FlockEndpoint(
    path="/api/word_count",
    methods=["GET"],
    callback=word_count,
    query_model=WordCountParams,
    response_model=WordCountResponse,
    summary="Counts words in a text",
    description="Takes a text and returns the number of words in it.",
)

flock.serve(custom_endpoints=[img_url_route, word_count_route, yoda_route])
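
The snippet above references `word_count`, `WordCountParams`, and `WordCountResponse` without defining them. A minimal sketch of what they might look like follows. The field names `text` and `count` are assumptions, standard-library dataclasses stand in for the Pydantic models a real endpoint would likely use, and the callback signature Flock expects may differ (it may, for instance, be async).

```python
from dataclasses import dataclass


@dataclass
class WordCountParams:
    # Hypothetical query parameter: the text to analyse.
    text: str


@dataclass
class WordCountResponse:
    # Hypothetical response field: number of whitespace-separated words.
    count: int


def word_count(query: WordCountParams) -> WordCountResponse:
    """Count the whitespace-separated words in the submitted text."""
    return WordCountResponse(count=len(query.text.split()))


response = word_count(WordCountParams(text="join the flock today"))
print(response.count)  # 4
```

Keeping the callback a plain function of a typed request and a typed response means it can be tested without starting the server.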

Want chat and UI too? Just turn them on.

flock.serve(ui=True, chat=True)

🖥️ Web UI – Test Flock Agents in the Browser

Test and interact with your Flock agents directly in your browser using an integrated web interface.

Highlights of this feature-rich interface:

  • Run all your agents and agent flows
  • Chat with your agents
  • Create sharable links – these freeze agent config so testers can focus on evaluation
  • Send direct feedback – includes everything needed to reproduce issues
  • Switch modes – like standalone chat mode, which hides all but the chat

And much, much more... All features are based on real-world client feedback and serve actual business needs.


⌨️ CLI Tool – Manage Flock Agents via Command Line

Manage configurations, run agents, and inspect results – all from your terminal. A quick way to test and validate serialized flocks.


💾 Enhanced Serialization – Share, Deploy, and Run Flocks from YAML

Define and share entire Flock configurations using readable YAML files. Perfect for versioning, deployment, and portability.

Note that even custom types like FantasyCharacter are serialized, so the target system doesn't need your code. Everything is portable!

name: pydantic_example
model: openai/gpt-4o
enable_temporal: false
show_flock_banner: false
temporal_start_in_process_worker: true
agents:
  character_agent:
    name: character_agent
    model: openai/gpt-4o
    description: Generates fantasy RPG character profiles for a specified number of
      characters.
    input: 'number_of_characters: int | The number of fantasy character profiles to
      generate.'
    output: 'character_list: list[FantasyCharacter] | A list containing the generated
      character profiles.'
    write_to_file: false
    wait_for_input: false
    evaluator:
      name: default
      config:
        model: openai/gpt-4o
        use_cache: true
        temperature: 0.8
        max_tokens: 8192
        stream: false
        include_thought_process: false
        kwargs: {}
      type: DeclarativeEvaluator
    modules:
      output:
        name: output
        config:
          enabled: true
          theme: abernathy
          render_table: false
          max_length: 1000
          truncate_long_values: true
          show_metadata: true
          format_code_blocks: true
          custom_formatters: {}
          no_output: false
          print_context: false
        type: OutputModule
      metrics:
        name: metrics
        config:
          enabled: true
          collect_timing: true
          collect_memory: true
          collect_token_usage: true
          collect_cpu: true
          storage_type: json
          metrics_dir: metrics/
          aggregation_interval: 1h
          retention_days: 30
          alert_on_high_latency: true
          latency_threshold_ms: 30000
        type: MetricsModule
types:
  FantasyCharacter:
    module_path: __main__
    type: pydantic.BaseModel
    schema:
      description: 'Data model for fantasy RPG character information.

        Docstrings and Field descriptions can help guide the LLM.'
      properties:
        name:
          description: A creative fantasy character name.
          title: Name
          type: string
        race:
          description: The character's race.
          enum:
          - human
          - elf
          - dwarf
          - orc
          - halfling
          title: Race
          type: string
        class_type:
          description: The character's class.
          enum:
          - warrior
          - mage
          - rogue
          - cleric
          - ranger
          title: Class Type
          type: string
        level:
          description: Character level
          title: Level
          type: integer
        strength:
          description: Strength stat
          title: Strength
          type: integer
        dexterity:
          description: Dexterity stat
          title: Dexterity
          type: integer
        constitution:
          description: Constitution stat
          title: Constitution
          type: integer
        intelligence:
          description: Intelligence stat
          title: Intelligence
          type: integer
        wisdom:
          description: Wisdom stat
          title: Wisdom
          type: integer
        charisma:
          description: Charisma stat
          title: Charisma
          type: integer
        weapons:
          description: A list of weapons the character carries.
          items:
            type: string
          title: Weapons
          type: array
        backstory:
          description: A brief, engaging backstory (2-3 sentences).
          title: Backstory
          type: string
        motivation:
          description: The character's motivation for their adventuring.
          title: Motivation
          type: string
        alignment:
          description: Character's moral alignment
          title: Alignment
          type: string
      required:
      - name
      - race
      - class_type
      - level
      - strength
      - dexterity
      - constitution
      - intelligence
      - wisdom
      - charisma
      - weapons
      - backstory
      - motivation
      - alignment
      type: object
components:
  DeclarativeEvaluator:
    type: flock_component
    module_path: flock.evaluators.declarative.declarative_evaluator
    file_path: src\\flock\\evaluators\\declarative\\declarative_evaluator.py
    description: Evaluator that uses DSPy for generation.
  OutputModule:
    type: flock_component
    module_path: flock.modules.output.output_module
    file_path: src\\flock\\modules\\output\\output_module.py
    description: Module that handles output formatting and display.
  MetricsModule:
    type: flock_component
    module_path: flock.modules.performance.metrics_module
    file_path: src\\flock\\modules\\performance\\metrics_module.py
    description: Module for collecting and analyzing agent performance metrics.
dependencies:
- pydantic>=2.0.0
- flock-core>=0.4.0
metadata:
  path_type: relative
  flock_version: 0.4.0
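
Because the `types` section carries a JSON-Schema-style description of FantasyCharacter, a receiving system can check data against it without importing the original class. The sketch below illustrates the idea on an excerpt of the schema above, copied into a Python dict. `validate` is a hypothetical helper that covers only `required`, `type`, and `enum`; a real system would more likely rely on Pydantic or a JSON Schema validator.

```python
# Excerpt of the serialized FantasyCharacter schema shown above.
schema = {
    "properties": {
        "name": {"type": "string"},
        "race": {"type": "string",
                 "enum": ["human", "elf", "dwarf", "orc", "halfling"]},
        "level": {"type": "integer"},
        "weapons": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["name", "race", "level", "weapons"],
}

# Mapping from schema type names to Python types (only those used here).
PYTHON_TYPES = {"string": str, "integer": int, "array": list}


def validate(record: dict, schema: dict) -> list[str]:
    """Return a list of human-readable problems; empty means valid."""
    problems = []
    for field in schema["required"]:
        if field not in record:
            problems.append(f"missing field: {field}")
    for field, rules in schema["properties"].items():
        if field not in record:
            continue
        value = record[field]
        if not isinstance(value, PYTHON_TYPES[rules["type"]]):
            problems.append(f"wrong type for {field}")
        elif "enum" in rules and value not in rules["enum"]:
            problems.append(f"invalid value for {field}: {value}")
    return problems


candidate = {"name": "Gorok", "race": "troll", "level": 3, "weapons": ["axe"]}
print(validate(candidate, schema))
```

Here `troll` is rejected because it is not one of the races listed in the serialized `enum`.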

Why is text-based serialization cool? Because agents can manipulate their own config – go wild with meta agents and experiments.


🌀 New Execution Flows – Batch and Evaluation Modes

Run Flock in batch mode to process multiple inputs or in evaluation mode to benchmark agents against question/answer pairs.

batch_data = [
    {"topic": "Robot Kittens", "audience": "Tech Enthusiasts"},
    {"topic": "AI in Gardening", "audience": "Homeowners"},
    ...
]

static_data = {"number_of_slides": 6}

silent_results = flock.run_batch(
    start_agent=presentation_agent,
    batch_inputs=batch_data,
    static_inputs=static_data,
    parallel=True,
    max_workers=5,
    silent_mode=True,
    return_errors=True,
    write_to_csv=".flock/batch_results.csv",
)

Supports CSV in and out. Combine with .evaluate() to benchmark Flock with known Q/A sets.
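
To make the roles of `batch_inputs`, `static_inputs`, `max_workers`, and `return_errors` concrete, here is a conceptual sketch of a batch runner in plain Python. It is not Flock's implementation: `run_batch_sketch` and `fake_agent` are hypothetical, and the choice to let per-item values override static ones is an assumption.

```python
from concurrent.futures import ThreadPoolExecutor


def run_batch_sketch(run_one, batch_inputs, static_inputs=None,
                     max_workers=5, return_errors=True):
    """Run run_one once per batch item, merging in the shared static inputs."""
    static_inputs = static_inputs or {}

    def safe_run(item):
        # Assumption: per-item values win over static values on key clashes.
        merged = {**static_inputs, **item}
        try:
            return run_one(merged)
        except Exception as exc:
            if return_errors:
                return exc  # Keep the failure in place of the result.
            raise

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves the order of batch_inputs.
        return list(pool.map(safe_run, batch_inputs))


def fake_agent(inputs: dict) -> dict:
    """Hypothetical stand-in for an agent run; no LLM call."""
    if not inputs["topic"]:
        raise ValueError("topic must not be empty")
    return {"title": f"{inputs['topic']} in {inputs['number_of_slides']} slides"}


results = run_batch_sketch(
    fake_agent,
    batch_inputs=[{"topic": "Robot Kittens"}, {"topic": ""}],
    static_inputs={"number_of_slides": 6},
)
print(results)
```

Returning the exception object instead of raising lets one bad input fail without discarding the rest of the batch.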


⏱️ First-Class Temporal Integration

Flock 0.4.0 brings seamless integration with Temporal.io. Build production-grade, reliable, and scalable agent workflows.

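from datetime import timedelta

# Flock, TemporalWorkflowConfig, and TemporalRetryPolicyConfig are provided by Flock.
flock = Flock(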
    enable_temporal=True,
    temporal_config=TemporalWorkflowConfig(
        task_queue="flock-test-queue",
        workflow_execution_timeout=timedelta(minutes=10),
        default_activity_retry_policy=TemporalRetryPolicyConfig(
            maximum_attempts=2
        ),
    ),
)

Just set a flag. Add your constraints. Now you've got retry policies, timeout control, and error handling baked in.
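
As a rough in-process analogy for what `maximum_attempts=2` means, the sketch below retries a failing callable up to a fixed number of attempts. It is only an illustration: Temporal applies retry policies durably on the server side with persisted workflow state, which this loop does not do, and `run_with_retries` and `flaky` are hypothetical names.

```python
import time


def run_with_retries(activity, maximum_attempts=2, initial_interval=0.0):
    """Call activity until it succeeds or maximum_attempts is reached."""
    if maximum_attempts < 1:
        raise ValueError("maximum_attempts must be at least 1")
    last_error = None
    for attempt in range(1, maximum_attempts + 1):
        try:
            return activity()
        except Exception as exc:
            last_error = exc
            if attempt < maximum_attempts:
                time.sleep(initial_interval)  # Wait before the next attempt.
    raise last_error


calls = {"count": 0}


def flaky():
    """Fails on the first call, succeeds on the second."""
    calls["count"] += 1
    if calls["count"] == 1:
        raise RuntimeError("transient LLM error")
    return "ok"


outcome = run_with_retries(flaky, maximum_attempts=2)
print(outcome, calls["count"])
```

With `maximum_attempts=2` the first failure is absorbed and the second call's result is returned; a second failure would be raised to the caller.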


✨ Utility – @flockclass Hydrator

Flock also adds conveniences. With @flockclass, you can turn any Pydantic model into a self-hydrating agent.

import asyncio

from pydantic import BaseModel
from flock.util.hydrator import flockclass

@flockclass(model="openai/gpt-4o")
class CharacterIdea(BaseModel):
    name: str
    char_class: str
    race: str
    backstory_hook: str | None = None
    personality_trait: str | None = None

async def create_character():
    # Only the required fields are set; the optional ones start as None.
    char = CharacterIdea(name="Gorok", char_class="Barbarian", race="Orc")
    print(f"Before Hydration: {char}")

    # hydrate() is a coroutine, so it must be awaited.
    hydrated_char = await char.hydrate()

    print(f"\nAfter Hydration: {hydrated_char}")
    print(f"Backstory Hook: {hydrated_char.backstory_hook}")

if __name__ == "__main__":
    asyncio.run(create_character())

📚 Examples & Tutorials

For a comprehensive set of examples, ranging from basic usage to complex projects and advanced features, please visit our dedicated showcase repository:

➡️ github.com/whiteducksoftware/flock-showcase ⬅️

The showcase includes:

  • Step-by-step guides for core concepts.
  • Examples of tool usage, routing, memory, and more.
  • Complete mini-projects demonstrating practical applications.

📖 Documentation

Full documentation, including API references and conceptual explanations, can be found at:

➡️ whiteducksoftware.github.io/flock/ ⬅️

🤝 Contributing

We welcome contributions! Please see the CONTRIBUTING.md file (if available) or open an issue/pull request on GitHub.

Ways to contribute:

  • Report bugs or suggest features.
  • Improve documentation.
  • Contribute new Modules, Evaluators, or Routers.
  • Add examples to the flock-showcase repository.

📜 License

Flock is licensed under the MIT License. See the LICENSE file for details.

🏢 About

Flock is developed and maintained by white duck GmbH, your partner for cloud-native solutions and AI integration.
