BNY Agent

A React-based chat interface for a multi-trial, self-evolving AI system that answers questions and completes tasks using tool-augmented reasoning with self-evaluation and reflection loops.

Features

Streaming Chat — Real-time Server-Sent Events (SSE) streaming that surfaces each step of the agent's reasoning: tool calls, intermediate results, self-evaluation, reflection, and final answers.
Live Browser View — Optional real-time browser rendering over WebSocket so you can watch the agent's Chromium session as it navigates and interacts with pages.
Multi-Agent Support — Connects to multiple specialized agents (Equity Research Analyst, Market Intelligence Associate, Portfolio Risk Analyst, Financial Advisor Assistant) and routes queries automatically.
Per-Employee Model — Each digital employee runs on its own configured LLM (set in the employee wizard); the runtime honors that choice and the chat header reflects it. Empty values fall back to the default AGENT_MODEL.
Slack Integration — Optional Slack bot (Socket Mode) that lets users chat with employees from Slack. Tag an employee by name in a channel (@BNY Agent Walter, do X) or open a DM with the bot. Chat history persists across restarts and shows up in the same sidebar as web chats.
Chat History — Sidebar with full conversation history, rename, and delete support.
Evaluation Dashboard — View benchmark results for each agent including task/step success rates, latency percentiles, hallucination rates, and per-category breakdowns.
Skills Management — Browse, create, edit, and delete agent skills. Inspect skill definitions and associated files with an inline file viewer.
File Uploads — Attach data files to conversations; uploaded files are tracked per chat.
Configurable Parameters — Adjust the model, max trials, and confidence threshold directly from the input box.
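
The per-employee model fallback described in the list above can be pictured as a small resolution step. This is a minimal sketch, not the backend's actual code: `resolve_model` and `FALLBACK_MODEL` are hypothetical names, and only the `AGENT_MODEL` variable comes from this README.

```python
import os

# Hypothetical placeholder, used only when AGENT_MODEL itself is unset.
FALLBACK_MODEL = "example/default-model"


def resolve_model(employee_model, env=None):
    """Return the employee's configured model, or the AGENT_MODEL default.

    Empty or whitespace-only values count as "not configured".
    """
    env = os.environ if env is None else env
    if employee_model and employee_model.strip():
        return employee_model.strip()
    return env.get("AGENT_MODEL", FALLBACK_MODEL)
```

Passing the environment as an argument keeps the function easy to exercise without touching the real process environment.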

Tech Stack

| Layer | Technology |
| --- | --- |
| Framework | React 19 |
| Build Tool | Vite 6 |
| Styling | Tailwind CSS 4 |
| Icons | Lucide React |
| Markdown | react-markdown |
| Linting | ESLint 9 |

Project Structure

capstone_frontend/
├── frontend/               # React application
│   ├── public/
│   ├── src/
│   │   ├── main.jsx        # Entry point
│   │   ├── App.jsx          # Root component, routing & state
│   │   ├── index.css        # Global styles & Tailwind config
│   │   ├── services/
│   │   │   └── api.js       # API client (SSE streaming, REST)
│   │   └── components/
│   │       ├── Sidebar.jsx        # Navigation & chat history
│   │       ├── WelcomeHeader.jsx  # Landing screen header
│   │       ├── InputBox.jsx       # Chat input with config controls
│   │       ├── ChatMessage.jsx    # Message rendering (all event types)
│   │       ├── DataContext.jsx    # Uploaded data panel
│   │       ├── EvaluationView.jsx # Agent evaluation dashboard
│   │       └── SkillsView.jsx     # Skill browser & editor
│   ├── package.json
│   └── vite.config.js
├── backend/                # FastAPI backend server
│   ├── server.py # Main API entrypoint (FastAPI)
│   ├── .env # Environment variables & api keys
│   ├── requirements.txt # Backend dependencies
│   │
│   ├── skills/ # 📁 Persisted skill storage
│   │   ├── [skill-name]/SKILL.md # Individual skill definitions
│   │   └── ...
│   │
│   ├── reflexion_agent/ # 📁 Agent Interaction Loop
│   │   ├── agent.py # Core agent loop and OpenHands integration
│   │   ├── evaluator.py # Assesses agent output correctness
│   │   ├── reflector.py # Generates self-reflection feedback on failure
│   │   └── memory.py # Manages conversation history/trajectory
│   │
│   ├── skills_ingestor/ # 📁 Skill Creation Tools
│   │   ├── mm_train.py # Multimodal trainer (video/audio -> skills)
│   │   └── prompts.py # System prompts for skill extraction
│   │
│   └── skillsbench/ # 📁 Automated Evaluation Framework
│       ├── experiments/
│       │   ├── skill_evaluation_framework.py # Orchestrator for evaluating skills against tasks
│       │   └── skill-eval-runs/ # Evaluation results (JSON/CSV)
│       └── tasks/ # 📁 Tasks/environments for evaluation
└── start.sh                # One-command launcher for all services
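
The module names under reflexion_agent/ (agent, evaluator, reflector, memory) suggest a try, score, reflect, retry loop. The sketch below shows one way such a loop could be wired together; it is an illustration under assumptions, not the code in agent.py. The `run_trials` function, the `TrialResult` type, and the callback signatures are all hypothetical, and `max_trials` and `confidence_threshold` simply mirror the two parameters exposed in the input box.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class TrialResult:
    answer: str
    confidence: float
    trials_used: int
    reflections: List[str] = field(default_factory=list)


def run_trials(
    question: str,
    act: Callable[[str, List[str]], str],
    evaluate: Callable[[str, str], float],
    reflect: Callable[[str, str, float], str],
    max_trials: int = 3,
    confidence_threshold: float = 0.8,
) -> TrialResult:
    """Retry until the evaluator's score reaches the threshold or trials run out."""
    if max_trials < 1:
        raise ValueError("max_trials must be at least 1")
    reflections: List[str] = []
    best_answer, best_score = "", float("-inf")
    for trial in range(1, max_trials + 1):
        answer = act(question, reflections)   # attempt, informed by past reflections
        score = evaluate(question, answer)    # self-evaluation of this attempt
        if score > best_score:
            best_answer, best_score = answer, score
        if score >= confidence_threshold:
            return TrialResult(answer, score, trial, reflections)
        if trial < max_trials:
            # Record feedback only when another attempt will follow.
            reflections.append(reflect(question, answer, score))
    # No attempt cleared the threshold: return the best-scoring one.
    return TrialResult(best_answer, best_score, max_trials, reflections)
```

Returning the best-scoring attempt when no trial clears the threshold is a design choice of this sketch; a real implementation might instead return the last attempt or surface an explicit failure.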

Prerequisites

Node.js >= 18
Python >= 3.10 (for the backend)

Getting Started

1. Configuration

The real agent requires an LLM API key.

Copy the example environment file:

cd backend
cp .env.example .env

Then edit backend/.env and add your OpenRouter or OpenAI API key:

OPENROUTER_API_KEY=sk-or-v1-...

If .env is missing or invalid, the backend falls back to a "Mock Mode" that simulates an agent.
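
To check ahead of time which mode a given .env file is likely to produce, a small helper along these lines can be used. It is a sketch under assumptions: `parse_env_text` and `select_mode` are hypothetical names, the `OPENAI_API_KEY` variable name is assumed (only `OPENROUTER_API_KEY` appears in this README), and the backend's own validity check may be stricter.

```python
from pathlib import Path

# OPENROUTER_API_KEY appears in this README; OPENAI_API_KEY is an assumed name.
API_KEY_NAMES = ("OPENROUTER_API_KEY", "OPENAI_API_KEY")


def parse_env_text(text):
    """Parse simple KEY=VALUE lines, ignoring blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def select_mode(env_path):
    """Return "real" when a non-empty API key is present, otherwise "mock"."""
    path = Path(env_path)
    if not path.is_file():
        return "mock"
    values = parse_env_text(path.read_text(encoding="utf-8"))
    return "real" if any(values.get(name) for name in API_KEY_NAMES) else "mock"
```

The helper only tests for presence, so a placeholder value that was never replaced with a working key would still be reported as "real".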

2. Quick Start (recommended)

The included start.sh script installs all dependencies and launches everything in one step:

./start.sh

This will automatically:

Install frontend npm dependencies
Create backend/.venv and install all Python requirements (including OpenHands)
Set up the skillsbench evaluation framework
Start the backend API on http://localhost:8000
Start the React frontend on http://localhost:5173

Press Ctrl+C in the terminal to cleanly stop all services.

Skillsbench Setup (for skill evaluation)

If the "Run Evaluation" button fails or skill evaluation doesn't work, initialize the skillsbench environment manually:

cd backend/skillsbench
uv sync

This installs Harbor and other evaluation dependencies into backend/skillsbench/.venv.

Manual Setup

If you prefer to run the services separately:

Backend:

cd backend
python3 -m venv .venv
.venv/bin/pip install -r requirements.txt
.venv/bin/uvicorn server:app --reload --port 8000

Frontend:

cd frontend
npm install
npm run dev

The frontend dev server starts at http://localhost:5173 and proxies /api requests to the backend.

Slack Bot (optional)

The backend ships with an optional Slack bot that runs in Socket Mode — no public ingress or tunnel is required; the bot opens an outbound WebSocket to Slack.

Create a Slack app from the manifest at backend/slack_app_manifest.yaml (Slack → Your Apps → Create New App → From manifest). The manifest enables Socket Mode and requests the scopes the bot needs (app_mentions:read, chat:write, im:*, etc.).
Install the app to your workspace and grab:
- the Bot User OAuth Token (xoxb-…)
- an App-Level Token with connections:write (xapp-…)

Add them to backend/.env:

SLACK_BOT_TOKEN=xoxb-...
SLACK_APP_TOKEN=xapp-...

Start the backend normally. If both tokens are set, the bot connects on startup; if either is missing, the backend skips Slack and serves only the web UI.

Routing: `@BNY Agent Walter, summarize this filing` routes to the employee named "Walter". DMs remember the last employee you spoke to. Multi-word names work with spaces, underscores, or hyphens.
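
The routing rules above can be illustrated with a short name-matching sketch. It is not the bot's actual implementation: `route` and `normalize` are hypothetical names, and the example assumes the message text starts with either Slack's `<@USERID>` mention token or the literal `@BNY Agent` prefix.

```python
import re

_SEPARATORS = re.compile(r"[\s_\-]+")
# Leading bot mention: Slack's <@USERID> token or the literal display name.
_MENTION = re.compile(r"^\s*(?:<@[A-Z0-9]+>|@BNY Agent)\s*", re.IGNORECASE)


def normalize(name):
    """Lower-case and treat spaces, underscores, and hyphens as one separator."""
    return _SEPARATORS.sub(" ", name).strip().lower()


def route(text, employees, last_employee=None):
    """Return (employee, message); fall back to last_employee when no name leads."""
    text = _MENTION.sub("", text, count=1)
    # Try longer names first so "Mary Jane" wins over "Mary".
    for employee in sorted(employees, key=len, reverse=True):
        words = [re.escape(word) for word in normalize(employee).split()]
        if not words:
            continue
        # The name must be followed by a non-word character or the end of text.
        pattern = r"^\s*" + r"[\s_\-]+".join(words) + r"(?![\w\-])[\s,:]*"
        match = re.match(pattern, text, flags=re.IGNORECASE)
        if match:
            return employee, text[match.end():].strip()
    return last_employee, text.strip()
```

The `last_employee` argument models the DM behavior: when a message does not begin with a known name, the caller passes the employee remembered from the previous turn.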

Live Browser Flags

The realtime browser panel is enabled by default. You can tune it with these environment variables:

ENABLE_BROWSER_LIVE=true
BROWSER_LIVE_QUALITY=60
BROWSER_LIVE_MAX_W=1280
BROWSER_LIVE_MAX_H=800
VITE_LIVE_BROWSER=true

ENABLE_BROWSER_LIVE toggles the backend WebSocket stream at /ws/browser/:sessionId.
BROWSER_LIVE_QUALITY controls JPEG quality from 1 to 100.
BROWSER_LIVE_MAX_W / BROWSER_LIVE_MAX_H cap the streamed frame size.
VITE_LIVE_BROWSER controls whether the employee page renders the browser panel.
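
As an illustration of how the backend flags might be read and applied, the sketch below loads them with clamping and caps a frame size while preserving aspect ratio. Several points are assumptions: `BrowserLiveSettings` and `load_settings` are hypothetical names, the values shown above (60, 1280, 800) are treated as defaults, the upper bound of 10000 pixels is arbitrary, and aspect-ratio-preserving scaling is only one plausible reading of "cap the streamed frame size".

```python
import os
from dataclasses import dataclass


def _as_bool(value, default):
    """Interpret common truthy strings; empty or missing values use the default."""
    if value is None or value.strip() == "":
        return default
    return value.strip().lower() in ("1", "true", "yes", "on")


def _as_int(value, default, low, high):
    """Parse an integer and clamp it to [low, high]; bad input uses the default."""
    try:
        number = int(value)
    except (TypeError, ValueError):
        return default
    return max(low, min(high, number))


@dataclass(frozen=True)
class BrowserLiveSettings:
    enabled: bool
    quality: int
    max_w: int
    max_h: int

    def fit(self, width, height):
        """Scale (width, height) down to fit the caps, preserving aspect ratio."""
        scale = min(1.0, self.max_w / width, self.max_h / height)
        return max(1, round(width * scale)), max(1, round(height * scale))


def load_settings(env=None):
    env = os.environ if env is None else env
    return BrowserLiveSettings(
        enabled=_as_bool(env.get("ENABLE_BROWSER_LIVE"), True),
        quality=_as_int(env.get("BROWSER_LIVE_QUALITY"), 60, 1, 100),
        max_w=_as_int(env.get("BROWSER_LIVE_MAX_W"), 1280, 1, 10_000),
        max_h=_as_int(env.get("BROWSER_LIVE_MAX_H"), 800, 1, 10_000),
    )
```

With these assumed defaults, a 1920x1080 viewport would be streamed at 1280x720, and a frame already inside the caps is left unchanged.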

Available Scripts

Run these from the frontend/ directory:

| Command | Description |
| --- | --- |
| `npm run dev` | Start the Vite development server |
| `npm run build` | Build for production |
| `npm run preview` | Preview the production build locally |
| `npm run lint` | Run ESLint |

API Overview

The frontend communicates with the backend through these endpoints:

| Method | Endpoint | Description |
| --- | --- | --- |
| `POST` | `/api/chat` | Send a question; returns an SSE stream of agent events |
| `WS` | `/ws/browser/:id` | Stream live browser frames and navigation metadata |
| `GET` | `/api/chats` | List all chat sessions |
| `GET` | `/api/chats/:id` | Retrieve a full chat with messages |
| `PATCH` | `/api/chats/:id` | Rename a chat |
| `DELETE` | `/api/chats/:id` | Delete a chat |
| `GET` | `/api/agents` | List available agents |
| `GET` | `/api/evaluations` | Get agent benchmark results |
| `GET` | `/api/skills` | List all skills |
| `POST` | `/api/skills` | Create a new skill |
| `PATCH` | `/api/skills/:id` | Update a skill |
| `DELETE` | `/api/skills/:id` | Delete a user-created skill |
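
For scripts that need to consume the `/api/chat` stream outside the browser, a minimal standard-library sketch might look like the following. The SSE line parsing follows the general event-stream format, but everything specific to this backend is assumed: the `{"question": ...}` request body, the idea that each `data:` payload is JSON, and any event names are hypothetical, since this README does not document them.

```python
import json
import urllib.request


def iter_sse_events(lines):
    """Yield (event_name, data) for each SSE event in an iterable of text lines."""
    event_name, data_lines = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":                    # a blank line ends the current event
            if data_lines:
                yield event_name, "\n".join(data_lines)
            event_name, data_lines = "message", []
        elif line.startswith(":"):        # comment or keep-alive line
            continue
        else:
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]
            if field == "event":
                event_name = value
            elif field == "data":
                data_lines.append(value)
    if data_lines:                        # lenient: flush an unterminated final event
        yield event_name, "\n".join(data_lines)


def decode_events(lines):
    """Decode each event's data as JSON, falling back to the raw string."""
    for name, data in iter_sse_events(lines):
        try:
            yield name, json.loads(data)
        except json.JSONDecodeError:
            yield name, data


def stream_chat(question, base_url="http://localhost:8000"):
    """POST a question and yield decoded events (request shape is hypothetical)."""
    body = json.dumps({"question": question}).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/api/chat",
        data=body,
        headers={"Content-Type": "application/json", "Accept": "text/event-stream"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        yield from decode_events(raw.decode("utf-8") for raw in response)
```

Keeping the parser separate from the network call means the parsing logic can be checked against canned lines without a running backend.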

Connecting to the Real Backend

By default, Vite proxies all /api requests to http://localhost:8000. To point at a different backend, update the proxy target in frontend/vite.config.js:

server: {
  proxy: {
    '/api': {
      target: 'http://your-backend-host:port',
      changeOrigin: true,
    },
  },
},
