Merged
24 commits
b55a69e
total repo with commented back unused features
ijusttookadnatest Apr 11, 2025
f7f81d1
remove comment
ijusttookadnatest Apr 11, 2025
1de90ef
remove front
ijusttookadnatest Apr 14, 2025
9c9f726
removing unused code
ijusttookadnatest Apr 14, 2025
f3a946b
remove data repo and unused network and volume docker
ijusttookadnatest Apr 14, 2025
7e4e782
update tests for cairocoder agent
ijusttookadnatest Apr 14, 2025
f0174b0
backend refacto
ijusttookadnatest Apr 14, 2025
9de746e
remove logger back
ijusttookadnatest Apr 15, 2025
4270dfc
remove unnecessary code
ijusttookadnatest Apr 15, 2025
5529445
small refacto agents
ijusttookadnatest Apr 15, 2025
9802c2e
update test following last changes
ijusttookadnatest Apr 15, 2025
cb011c1
fix generateEmbedding import and add clean clean:all rules to clean b…
ijusttookadnatest Apr 15, 2025
045bc8a
fix import
ijusttookadnatest Apr 16, 2025
f4f2333
update readme, remove files
ijusttookadnatest Apr 16, 2025
b7c14af
update readme
ijusttookadnatest Apr 16, 2025
e95707d
update readme
ijusttookadnatest Apr 16, 2025
42d220f
update readme
ijusttookadnatest Apr 16, 2025
2d15538
update readme
ijusttookadnatest Apr 16, 2025
d18d583
remove ollama provider
ijusttookadnatest Apr 16, 2025
9dfdf62
update cursor doc
ijusttookadnatest Apr 16, 2025
1197c66
remove gethosted mode
ijusttookadnatest Apr 16, 2025
f083720
remove config endpoint
ijusttookadnatest Apr 16, 2025
f958e37
Update README.md
ijusttookadnatest Apr 16, 2025
5d9ae70
Update README.md
ijusttookadnatest Apr 16, 2025
69 changes: 69 additions & 0 deletions .cursor/rules/coding_standards.mdc
@@ -0,0 +1,69 @@
---
description: Coding Standards
globs: *.ts,*.tsx,*.js,*.jsx
---
# Coding Standards for Starknet Agent

## Naming Conventions
- Variables and functions: Use `camelCase` (e.g., `fetchData`, `generateEmbeddings`).
- Classes and components: Use `PascalCase` (e.g., `RagAgent`, `ChatInterface`).
- Constants: Use `UPPER_CASE` with underscores (e.g., `DEFAULT_CHAT_MODEL`).
- Type interfaces: Use `PascalCase` with `I` prefix (e.g., `IAgentConfig`).
- Ingester classes: Use `PascalCase` with `Ingester` suffix (e.g., `CairoBookIngester`).
- Pipeline components: Use descriptive names ending with their role (e.g., `QueryProcessor`, `DocumentRetriever`).

## Indentation and Formatting
- Use 2 spaces for indentation (no tabs).
- Keep lines under 100 characters where possible.
- Place opening braces on the same line as the statement (e.g., `if (condition) {`).
- Use Prettier for consistent formatting across the codebase.
- Run `pnpm format:write` before committing changes.

## Imports and Structure
- Group external imports first, followed by internal modules.
- Use barrel exports (index.ts files) to simplify imports.
- Prefer destructured imports when importing multiple items from a single module.
- Order imports alphabetically within their groups.
- Use relative paths for imports within the same package, absolute paths for cross-package imports.

## Comments
- Add JSDoc comments for functions and classes, especially in the agent pipeline and ingester components.
- Use `//` for single-line comments and `/* ... */` for multi-line comments.
- Document ingester classes with clear descriptions of the source and processing approach.
- Include explanations for complex algorithms or non-obvious design decisions.
- For the RAG pipeline components, document the input/output expectations clearly.

## TypeScript Usage
- Use explicit typing for function parameters and return values.
- Prefer interfaces over types for object definitions.
- Use generics where appropriate, especially in the pipeline components and ingester classes.
  - Example: `function processQuery<T extends BaseQuery>(query: T): Promise<QueryResult>`
- Use abstract classes for base implementations (e.g., `BaseIngester`).
- Leverage type guards for safe type narrowing.
- Use discriminated unions for state management, especially in the UI components.
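
The type guard and discriminated union rules can be illustrated with a minimal sketch. The `ChatState` variants below are hypothetical and are not taken from the codebase; they only show the shape these rules suggest.

```typescript
// Hypothetical UI state variants, for illustration only.
interface IChatIdle {
  status: 'idle';
}

interface IChatStreaming {
  status: 'streaming';
  partialText: string;
}

interface IChatFailed {
  status: 'failed';
  reason: string;
}

// Discriminated union: `status` is the discriminant.
type ChatState = IChatIdle | IChatStreaming | IChatFailed;

// Type guard: narrows ChatState to IChatStreaming for the caller.
function isStreaming(state: ChatState): state is IChatStreaming {
  return state.status === 'streaming';
}

// Exhaustive switch: the `never` assignment stops compiling if a
// new variant is added to ChatState without a matching case.
function describeState(state: ChatState): string {
  switch (state.status) {
    case 'idle':
      return 'Waiting for input';
    case 'streaming':
      return `Received ${state.partialText.length} characters`;
    case 'failed':
      return `Error: ${state.reason}`;
    default: {
      const unreachable: never = state;
      return unreachable;
    }
  }
}
```

Interfaces are used for the object shapes and a `type` alias only for the union, which keeps the sketch in line with the "prefer interfaces" rule above.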

## Error Handling
- Wrap async operations in `try/catch` blocks.
- Log errors with context using the logger utility (e.g., `logger.error('Failed to retrieve documents:', error)`).
- Use custom error classes for specific error types in the agent pipeline and ingestion process.
- Implement proper cleanup in error handlers, especially for file operations in ingesters.
- Ensure errors are propagated appropriately and handled at the right level of abstraction.
- Use async/await with proper error handling rather than promise chains where possible.
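
A minimal sketch of these rules, assuming a hypothetical `RetrievalError` class and a stand-in `logger` object (the project's actual logger utility is not shown here and may differ):

```typescript
// Hypothetical custom error; carries the query that failed.
class RetrievalError extends Error {
  constructor(
    message: string,
    public readonly query: string,
    public readonly originalError?: unknown,
  ) {
    super(message);
    this.name = 'RetrievalError';
    // Keeps `instanceof` working when compiling to older targets.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

// Stand-in for the logger utility (assumption).
const logger = {
  error: (message: string, error: unknown): void => console.error(message, error),
};

// `search` is injected so the sketch does not depend on a real vector store.
async function retrieveDocuments(
  query: string,
  search: (q: string) => Promise<string[]>,
): Promise<string[]> {
  try {
    return await search(query);
  } catch (error) {
    logger.error('Failed to retrieve documents:', error);
    // Wrap the low-level failure so callers can handle it at their own level.
    throw new RetrievalError('Document retrieval failed', query, error);
  }
}
```

The function logs with context and rethrows a typed error instead of swallowing the failure, so the caller decides how to recover.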

## Testing
- Write unit tests for utility functions, pipeline components, and ingester classes.
- Use Jest as the testing framework.
- Mock external dependencies (LLMs, vector stores, etc.) using jest-mock-extended.
- Aim for high test coverage in core agent functionality and ingestion processes.
- Test each ingester implementation separately.
- Use descriptive test names that explain the behavior being tested.
- Follow the AAA pattern (Arrange, Act, Assert) for test structure.

## Code Organization
- Keep files focused on a single responsibility.
- Group related functionality in directories.
- Separate business logic from UI components.
- Organize ingesters by source type in dedicated directories.
- Follow the template method pattern for ingester implementations.
- Use the factory pattern for creating appropriate instances based on configuration.
- Implement dependency injection for easier testing and component replacement.
95 changes: 95 additions & 0 deletions .cursor/rules/common_patterns.mdc
@@ -0,0 +1,95 @@
---
description: Common Patterns
globs: *.ts,*.tsx,*.js,*.jsx
---
# Common Patterns in Starknet Agent

## RAG Pipeline Architecture
- Core pattern for information retrieval and response generation.
- Steps in the RAG pipeline:
  1. **Query Processor**: `packages/agents/src/pipeline/queryProcessor.ts`
     - Analyzes user queries and chat history
     - Reformulates queries to optimize document retrieval
  2. **Document Retriever**: `packages/agents/src/pipeline/documentRetriever.ts`
     - Converts queries to vector embeddings
     - Searches vector database using cosine similarity
     - Returns relevant document chunks with metadata
  3. **Answer Generator**: `packages/agents/src/pipeline/answerGenerator.ts`
     - Uses LLMs to generate comprehensive responses
     - Includes source citations in the response
     - Handles different conversation contexts
  4. **RAG Pipeline**: `packages/agents/src/pipeline/ragPipeline.ts`
     - Orchestrates the entire process flow
     - Manages error handling and logging
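
The four steps above can be sketched as a single orchestration function. Everything below (`IPipelineStages`, `runRagPipeline`, the stub stages) is an illustrative assumption, not the code in `packages/agents/src/pipeline/`.

```typescript
// Hypothetical chunk shape returned by the retriever.
interface IRetrievedChunk {
  content: string;
  source: string;
}

// Hypothetical contract for the three stages the pipeline orchestrates.
interface IPipelineStages {
  processQuery(query: string, history: string[]): Promise<string>;
  retrieve(searchQuery: string): Promise<IRetrievedChunk[]>;
  generate(query: string, chunks: IRetrievedChunk[]): Promise<string>;
}

// Orchestrator: runs the stages in order and logs any failure before rethrowing.
async function runRagPipeline(
  stages: IPipelineStages,
  query: string,
  history: string[] = [],
): Promise<string> {
  try {
    const searchQuery = await stages.processQuery(query, history);
    const chunks = await stages.retrieve(searchQuery);
    return await stages.generate(query, chunks);
  } catch (error) {
    console.error('RAG pipeline failed:', error);
    throw error;
  }
}

// Stub stages so the flow can be exercised without an LLM or a vector database.
const stubStages: IPipelineStages = {
  processQuery: async (query) => query.trim().toLowerCase(),
  retrieve: async (searchQuery) => [
    { content: `Notes on ${searchQuery}`, source: 'stub-doc' },
  ],
  generate: async (_query, chunks) =>
    chunks.map((chunk) => `${chunk.content} [source: ${chunk.source}]`).join('\n'),
};
```

Calling `runRagPipeline(stubStages, '  What is a felt?  ')` exercises the reformulate, retrieve, and generate steps in sequence, with the citation appended by the stub generator.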

## Factory Pattern
- Used for creating RAG agents with different configurations.
- Example: `packages/agents/src/ragAgentFactory.ts`
- Creates different agent instances based on focus mode.
- Configures appropriate vector stores and prompt templates.
- Also used in the ingester package: `packages/ingester/src/IngesterFactory.ts`
- Creates appropriate ingester instances based on documentation source.
- Enables easy addition of new document sources.
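
As a rough sketch of the ingester side of this pattern: the `DocSource` keys, the `IIngester` interface, and `StarknetDocsIngester` are hypothetical names, and the real `IngesterFactory` may be organized differently.

```typescript
// Hypothetical source keys.
type DocSource = 'cairo_book' | 'starknet_docs';

// Hypothetical minimal ingester contract.
interface IIngester {
  readonly sourceUrl: string;
}

class CairoBookIngester implements IIngester {
  readonly sourceUrl = 'https://book.cairo-lang.org';
}

class StarknetDocsIngester implements IIngester {
  readonly sourceUrl = 'https://docs.starknet.io';
}

// Factory function: callers ask for a source and never name a concrete class.
function createIngester(source: DocSource): IIngester {
  switch (source) {
    case 'cairo_book':
      return new CairoBookIngester();
    case 'starknet_docs':
      return new StarknetDocsIngester();
    default: {
      // Compile-time check that every DocSource has a case.
      const unknownSource: never = source;
      throw new Error(`Unsupported documentation source: ${String(unknownSource)}`);
    }
  }
}
```

Adding a new documentation source then means adding one key, one class, and one `case`, without touching the callers.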

## Template Method Pattern
- Used in the ingester package for standardizing the ingestion process.
- Example: `packages/ingester/src/BaseIngester.ts`
- Defines the skeleton of the ingestion algorithm in a method.
- Defers some steps to subclasses (download, extract, process).
- Ensures consistent process flow while allowing customization.
- Common workflow: Download → Extract → Process → Generate Embeddings → Store
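
The workflow above might look like the following skeleton. Method names, signatures, and the placeholder embedding and storage steps are assumptions made for illustration; only the overall Download → Extract → Process → Generate Embeddings → Store order comes from the description.

```typescript
abstract class BaseIngester {
  // Template method: fixes the order of the steps for every ingester.
  async run(): Promise<number> {
    const archive = await this.download();
    const pages = await this.extract(archive);
    const chunks = this.process(pages);
    const embeddings = chunks.map((chunk) => this.embed(chunk));
    return this.store(chunks, embeddings);
  }

  // Steps deferred to subclasses.
  protected abstract download(): Promise<string>;
  protected abstract extract(archive: string): Promise<string[]>;

  // Default processing step; subclasses may override it.
  protected process(pages: string[]): string[] {
    return pages.map((page) => page.trim()).filter((page) => page.length > 0);
  }

  // Placeholder embedding: real code would call an embedding model.
  protected embed(chunk: string): number[] {
    return [chunk.length];
  }

  // Placeholder store: returns how many chunks would be written.
  protected store(chunks: string[], embeddings: number[][]): number {
    return Math.min(chunks.length, embeddings.length);
  }
}

// Hypothetical subclass that supplies only the source-specific steps.
class InMemoryIngester extends BaseIngester {
  protected async download(): Promise<string> {
    return 'page one\n\npage two\n';
  }

  protected async extract(archive: string): Promise<string[]> {
    return archive.split('\n');
  }
}
```

The subclass customizes where content comes from, while `run()` guarantees that every ingester goes through the same sequence.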

## WebSocket Streaming Architecture
- Used for real-time streaming of agent responses.
- Example: `packages/backend/src/websocket/`
- Components:
  - `connectionManager.ts`: Manages WebSocket connections and sessions
  - `messageHandler.ts`: Processes incoming messages and routes to appropriate handlers
- Flow: Connection → Authentication → Message Handling → Response Streaming
- Enables real-time, chunk-by-chunk delivery of LLM responses
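
The response-streaming step can be sketched without a WebSocket library by injecting a `send` callback. The message shape (`IStreamMessage`) and the `fakeLlmStream` generator are assumptions for illustration; the real handlers in `packages/backend/src/websocket/` may use a different protocol.

```typescript
// Hypothetical wire format for streamed messages.
interface IStreamMessage {
  type: 'chunk' | 'end';
  data: string;
}

// Stand-in for an LLM token stream: yields one word at a time.
async function* fakeLlmStream(text: string): AsyncGenerator<string> {
  for (const word of text.split(' ')) {
    yield word;
  }
}

// Forwards each chunk as soon as it arrives, then signals completion.
// `send` would typically wrap a socket's send method.
async function streamResponse(
  chunks: AsyncIterable<string>,
  send: (message: string) => void,
): Promise<void> {
  for await (const chunk of chunks) {
    const message: IStreamMessage = { type: 'chunk', data: chunk };
    send(JSON.stringify(message));
  }
  const end: IStreamMessage = { type: 'end', data: '' };
  send(JSON.stringify(end));
}
```

Because `send` is just a function, the same code can be driven by a real socket in production or by an array-collecting callback in tests.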

## Repository Pattern
- Used for database interactions.
- Example: `packages/agents/src/db/vectorStore.ts`
- Abstracts MongoDB vector search operations
- Provides methods for similarity search and filtering
- Handles connection pooling and error handling
- Used in ingester for vector store operations: `packages/ingester/src/utils/vectorStoreUtils.ts`
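
To make the abstraction concrete, here is a minimal in-memory sketch of a repository with cosine-similarity search. It is not the MongoDB-backed implementation; `IVectorRepository` and `InMemoryVectorRepository` are hypothetical names, and a real store would delegate ranking to the database.

```typescript
interface IStoredChunk {
  content: string;
  embedding: number[];
}

// Hypothetical repository contract that callers depend on.
interface IVectorRepository {
  add(chunk: IStoredChunk): void;
  similaritySearch(queryEmbedding: number[], topK: number): IStoredChunk[];
}

// Cosine similarity: dot(a, b) / (|a| * |b|); returns 0 for zero vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error('Embeddings must have the same dimension');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denominator = Math.sqrt(normA) * Math.sqrt(normB);
  return denominator === 0 ? 0 : dot / denominator;
}

class InMemoryVectorRepository implements IVectorRepository {
  private readonly chunks: IStoredChunk[] = [];

  add(chunk: IStoredChunk): void {
    this.chunks.push(chunk);
  }

  // Scores every stored chunk and returns the topK most similar ones.
  similaritySearch(queryEmbedding: number[], topK: number): IStoredChunk[] {
    return this.chunks
      .map((chunk) => ({ chunk, score: cosineSimilarity(chunk.embedding, queryEmbedding) }))
      .sort((left, right) => right.score - left.score)
      .slice(0, topK)
      .map((scored) => scored.chunk);
  }
}
```

Pipeline code written against `IVectorRepository` does not need to know whether results come from memory or from a database.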

## Configuration Management
- Centralized configuration using TOML files.
- Example: `packages/agents/src/config.ts` and `packages/agents/sample.config.toml`
- Loads configuration from files and environment variables.
- Provides typed access to configuration values.
- Supports multiple LLM providers (OpenAI, Anthropic, etc.)
- Configures multiple vector databases for different focus modes
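
A simplified sketch of typed configuration with environment overrides follows. The keys (`port`, `defaultChatModel`), the default values, and the precedence order (environment over file over defaults) are assumptions; the real options are those documented in `packages/agents/sample.config.toml`. TOML parsing itself is left out, so `fileValues` stands in for the parsed file.

```typescript
// Hypothetical config shape.
interface IAppConfig {
  port: number;
  defaultChatModel: string;
}

// Hypothetical defaults.
const DEFAULT_CONFIG: IAppConfig = {
  port: 3001,
  defaultChatModel: 'example-model',
};

// Merges defaults, file values, and environment variables into a typed object.
function loadConfig(
  fileValues: Partial<IAppConfig>,
  env: Record<string, string | undefined>,
): IAppConfig {
  const merged: IAppConfig = { ...DEFAULT_CONFIG, ...fileValues };

  if (env.PORT !== undefined) {
    const port = Number(env.PORT);
    if (!Number.isInteger(port) || port <= 0) {
      throw new Error(`Invalid PORT value: ${env.PORT}`);
    }
    merged.port = port;
  }

  if (env.DEFAULT_CHAT_MODEL) {
    merged.defaultChatModel = env.DEFAULT_CHAT_MODEL;
  }

  return merged;
}
```

Validating values at load time means a bad setting fails at startup rather than deep inside the pipeline.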

## Dependency Injection
- Used for providing services to components.
- Example: `packages/agents/src/ragAgentFactory.ts`
- Injects vector stores, LLM providers, and config settings into pipeline components
- Makes testing easier by allowing mock implementations
- Enables flexible configuration of different agent types
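
A small constructor-injection sketch shows why this helps with testing. `ILlm`, `AnswerService`, and `RecordingLlm` are hypothetical names; the real factory wires concrete vector stores and LLM providers.

```typescript
// Hypothetical LLM contract that components depend on.
interface ILlm {
  complete(prompt: string): Promise<string>;
}

// The service receives its dependencies instead of constructing them.
class AnswerService {
  constructor(
    private readonly llm: ILlm,
    private readonly contextLimit: number,
  ) {}

  async answer(question: string, context: string[]): Promise<string> {
    const prompt = [...context.slice(0, this.contextLimit), question].join('\n');
    return this.llm.complete(prompt);
  }
}

// Mock implementation that records prompts instead of calling an API.
class RecordingLlm implements ILlm {
  readonly prompts: string[] = [];

  async complete(prompt: string): Promise<string> {
    this.prompts.push(prompt);
    return 'mock answer';
  }
}
```

In a test, `new AnswerService(new RecordingLlm(), 2)` runs the full prompt-building logic with no network access, and the recorded prompts can be asserted on directly.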

## Focus Mode Implementation
- Pattern for targeting specific document sources.
- Example: `packages/agents/src/config/agentConfigs.ts`
- Defines different focus modes (Starknet Ecosystem, Cairo Book, etc.)
- Configures different vector stores for each mode
- Customizes prompts and retrieval parameters per mode
- Enables specialized knowledge domains
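
One possible shape for such a per-mode configuration is sketched below. The mode keys, collection names, prompts, and `maxSourceCount` values are all made up for illustration and do not reflect the contents of `agentConfigs.ts`.

```typescript
// Hypothetical mode keys.
type FocusMode = 'cairoBook' | 'starknetEcosystem';

// Hypothetical per-mode settings.
interface IFocusModeConfig {
  collectionName: string;
  systemPrompt: string;
  maxSourceCount: number;
}

const FOCUS_MODE_CONFIGS: Record<FocusMode, IFocusModeConfig> = {
  cairoBook: {
    collectionName: 'cairo_book_chunks',
    systemPrompt: 'Answer using the Cairo Book.',
    maxSourceCount: 5,
  },
  starknetEcosystem: {
    collectionName: 'starknet_ecosystem_chunks',
    systemPrompt: 'Answer using Starknet ecosystem documentation.',
    maxSourceCount: 10,
  },
};

// Type guard for mode strings arriving from requests or config files.
function isFocusMode(value: string): value is FocusMode {
  return Object.prototype.hasOwnProperty.call(FOCUS_MODE_CONFIGS, value);
}

// Resolves a mode string to its settings, rejecting unknown modes early.
function getFocusModeConfig(mode: string): IFocusModeConfig {
  if (!isFocusMode(mode)) {
    throw new Error(`Unknown focus mode: ${mode}`);
  }
  return FOCUS_MODE_CONFIGS[mode];
}
```

Typing the table as `Record<FocusMode, IFocusModeConfig>` makes the compiler flag any mode that is declared but not configured.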

## React Hooks for State Management
- Custom hooks for managing UI state and WebSocket communication.
- Example: `packages/ui/lib/hooks/`
- Encapsulates WebSocket connection logic.
- Manages chat history and UI state.
- Handles real-time streaming of responses.

## Error Handling and Logging
- Centralized error handling with detailed logging.
- Example: `packages/agents/src/utils/logger.ts`
- Configurable log levels based on environment
- Context-rich error messages with timestamps and stack traces
- Proper error propagation through the pipeline
- Used throughout the codebase for consistent error reporting.
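
A minimal sketch of a level-filtered logger is shown below. It is an assumption about what such a utility could look like, not the contents of `packages/agents/src/utils/logger.ts`; `createLogger`, `ILogger`, and the injectable `sink` are hypothetical.

```typescript
type LogLevel = 'debug' | 'info' | 'warn' | 'error';

const LEVEL_ORDER: Record<LogLevel, number> = { debug: 0, info: 1, warn: 2, error: 3 };

interface ILogger {
  debug(message: string): void;
  info(message: string): void;
  warn(message: string): void;
  error(message: string, error?: unknown): void;
}

// `sink` defaults to console.log; tests can pass a collector instead.
function createLogger(
  minLevel: LogLevel,
  sink: (line: string) => void = console.log,
): ILogger {
  const log = (level: LogLevel, message: string, error?: unknown): void => {
    // Drop messages below the configured level.
    if (LEVEL_ORDER[level] < LEVEL_ORDER[minLevel]) {
      return;
    }
    const detail = error instanceof Error ? ` | ${error.stack ?? error.message}` : '';
    sink(`${new Date().toISOString()} [${level.toUpperCase()}] ${message}${detail}`);
  };

  return {
    debug: (message) => log('debug', message),
    info: (message) => log('info', message),
    warn: (message) => log('warn', message),
    error: (message, error) => log('error', message, error),
  };
}
```

The minimum level would typically come from the environment, for example `debug` in development and `warn` in production.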
46 changes: 46 additions & 0 deletions .cursor/rules/documentation.mdc
@@ -0,0 +1,46 @@
---
description: Documentation
globs:
---
# Documentation for Starknet Agent

## External Resources
- Starknet Documentation: [https://docs.starknet.io](https://docs.starknet.io)
- Referenced in the agent's knowledge base.
- Cairo Book: [https://book.cairo-lang.org](https://book.cairo-lang.org)
- Core resource for Cairo language information.
- MongoDB Atlas Vector Search: [https://www.mongodb.com/docs/atlas/vector-search/](https://www.mongodb.com/docs/atlas/vector-search/)
- Used for vector database implementation.
- Anthropic Claude API: [https://docs.anthropic.com/claude/reference/getting-started-with-the-api](https://docs.anthropic.com/claude/reference/getting-started-with-the-api)
- Used for LLM integration.

## Internal Documentation
- Architecture Overview: `docs/architecture/README.md`
- Explains the RAG pipeline architecture.
- API Integration Guide: `API_INTEGRATION.md`
- Details how to integrate with the agent's API.
- Contributing Guidelines: `CONTRIBUTING.md`
- Instructions for contributing to the project.

## Code Documentation
- JSDoc comments are used throughout the codebase, especially in:
- `packages/agents/src/pipeline/`: Documents the RAG pipeline components.
- `packages/agents/src/core/`: Documents core agent functionality.
- `packages/backend/src/websocket/`: Documents WebSocket communication.

## Configuration Documentation
- Sample configuration: `packages/agents/sample.config.toml`
- Documents available configuration options.
- Environment variables: `.env.example` files
- Documents required environment variables.

## Database Schema
- MongoDB collections structure is documented in:
- `packages/agents/src/db/`: Database interaction code.
- Vector embeddings format and schema.

## Deployment Documentation
- Docker deployment: `docker-compose.yaml` and related Dockerfiles
- Instructions for containerized deployment.
- Production hosting: `docker-compose.prod-hosted.yml`
- Configuration for production environments.
100 changes: 100 additions & 0 deletions .cursor/rules/imports.mdc
@@ -0,0 +1,100 @@
---
description: Cairo Imports
globs: *.ts,*.tsx,*.js,*.jsx
---
# Imports in Cairo Coder

## External Libraries

### Backend and Agent Libraries
- `express`: Web server framework.
- Used in: `packages/backend/src/app.ts`
- Import: `import express from 'express';`
- `cors`: CORS middleware for Express.
- Used in: `packages/backend/src/app.ts`
- Import: `import cors from 'cors';`
- `mongodb`: MongoDB client for database operations.
- Used in: `packages/agents/src/db/`
- Import: `import { MongoClient } from 'mongodb';`
- `@anthropic-ai/sdk`: Anthropic Claude API client.
- Used in: `packages/agents/src/lib/`
- Import: `import Anthropic from '@anthropic-ai/sdk';`
- `openai`: OpenAI API client.
- Used in: `packages/agents/src/lib/`
- Import: `import OpenAI from 'openai';`
- `@google/generative-ai`: Google AI API client.
- Used in: `packages/agents/src/lib/`
- Import: `import { GoogleGenerativeAI } from '@google/generative-ai';`

### Frontend Libraries
- `react`: UI library.
- Used in: `packages/ui/components/`
- Import: `import React from 'react';`
- `next`: React framework.
- Used in: `packages/ui/app/`
- Import: `import { useRouter } from 'next/navigation';`
- `tailwindcss`: CSS framework.
- Used in: `packages/ui/components/`
- Applied via class names.

## Internal Modules

### Agent Modules
- `pipeline`: RAG pipeline components.
- Used in: `packages/agents/src/core/ragAgentFactory.ts`
- Import: `import { QueryProcessor, DocumentRetriever, CodeGenerator } from './pipeline';`
- `config`: Configuration management.
- Used in: `packages/agents/src/`
- Import: `import { config } from './config';`
- `db`: Database interaction.
- Used in: `packages/agents/src/core/`
- Import: `import { VectorStore } from './db/vectorStore';`
- `models`: LLM and embedding models interfaces.
- Used in: `packages/agents/src/core/`
- Import: `import { LLMProviderFactory } from './models/llmProviderFactory';`
- Import: `import { EmbeddingProviderFactory } from './models/embeddingProviderFactory';`

### Backend Modules
- `routes`: API routes.
- Used in: `packages/backend/src/app.ts`
- Import: `import { generateRoutes } from './routes/generate';`
- Import: `import { modelsRoutes } from './routes/models';`
- `handlers`: Request handlers.
- Used in: `packages/backend/src/routes/`
- Import: `import { generateHandler } from '../handlers/generateHandler';`

### Ingester Modules
- `baseIngester`: Abstract base class for all ingesters.
- Used in: `packages/ingester/src/ingesters/`
- Import: `import { BaseIngester } from '../BaseIngester';`
- `ingesterFactory`: Factory for creating ingesters.
- Used in: `packages/ingester/src/scripts/`
- Import: `import { IngesterFactory } from '../IngesterFactory';`
- `utils`: Utility functions.
- Used in: `packages/ingester/src/`
- Import: `import { downloadFile, extractArchive } from './utils/fileUtils';`
- Import: `import { processContent, splitMarkdown } from './utils/contentUtils';`

## Common Import Patterns

### For Backend API Routes
```typescript
import express from 'express';
import { generateHandler } from '../handlers/generateHandler';
import { config } from '../config';
```

### For Agent Core
```typescript
import { VectorStore } from './db/vectorStore';
import { LLMProviderFactory } from './models/llmProviderFactory';
import { EmbeddingProviderFactory } from './models/embeddingProviderFactory';
```

### For Ingesters
```typescript
import { BaseIngester } from '../BaseIngester';
import { BookPageDto, ParsedSection, BookChunk } from '../types';
import { Document } from 'langchain/document';
import { VectorStore } from '../../agents/src/db/vectorStore';
```