Home
Eddy is an innovative text editor that leverages the power of Artificial Intelligence to enhance the writing and editing experience. It's designed to be a collaborative tool, allowing multiple users to work on the same document in real-time, with the added benefit of AI-driven assistance.
Key Features:
- (WIP) Real-time Collaboration: Multiple users can edit the same document simultaneously.
- AI-Powered Autocompletion: Eddy uses language models to suggest relevant completions as you type, based on the content of the document and on user-uploaded files.
- Context-Aware Editing: The AI understands the context of your writing and can make suggestions for improving sentence structure, tone, and clarity.
- Document Structuring: Eddy can analyze and suggest an outline for your document, making it easier to organize complex ideas.
- Content Selection: Users can upload documents or select existing ones to give the AI additional context for the autocompletion and editing features.
- Dialog-Based Interaction: Users can interact with Eddy through a chat interface, providing instructions and receiving feedback.
- Admin Panel: Offers administrators the ability to manage users, documents, and other aspects of the system.
- Authentication: Secure user authentication and authorization using JWT (JSON Web Tokens).
- File Uploads: Users can upload files that Eddy will use as additional context for its AI features.
Eddy's functionality is built upon a client-server architecture, primarily utilizing the following technologies:
- Backend: Python (Flask)
- Frontend: React
- Real-time Communication: Socket.IO
- Database: PostgreSQL
- AI Models: Google's Gemini (untested support for Anthropic's Claude and for Ollama with open-source models)
Core Components:
- Socket Manager: Handles real-time communication between the client and server, including document updates, user presence, and AI suggestions.
- LLM Manager: Manages interactions with various Language Model (LLM) providers (Gemini, Claude, Ollama).
- Autocomplete Manager: Uses Retrieval Augmented Generation (RAG) and caching to provide intelligent autocompletion suggestions.
- Dialog Manager: Processes user messages from the chat interface, generating action plans and coordinating with other components to execute them.
- Structure Manager: Extracts and applies document structures to improve organization.
- Document Manager: Handles document storage, retrieval, and modification using the Quill Delta format.
- Embedding Manager: Generates and manages text embeddings for semantic search and similarity comparisons.
- Action Plan Manager: Creates, validates, and fixes action plans for responding to user requests.
- Action Manager: Refines and executes actions based on the generated action plan.
- Response Evaluator: Evaluates the proposed actions and decides whether to apply or reject them.
- Dialog History Manager: Tracks the conversation history between the user and the AI.
- File Processor: Handles file uploads, text extraction, and manages the temporary file directory.
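The Document Manager stores documents in the Quill Delta format, which describes content and changes as a list of insert, retain, and delete operations. The following is a minimal sketch of how such a change applies to plain text. It is not Eddy's implementation: the function name apply_delta is hypothetical, and formatting attributes and non-text embeds are ignored.

```python
from typing import Any


def apply_delta(text: str, ops: list[dict[str, Any]]) -> str:
    """Apply a Quill-style Delta (insert/retain/delete ops) to plain text."""
    result = []
    pos = 0  # read position in the original text
    for op in ops:
        if "insert" in op:
            # Only string inserts are handled; embeds (non-string inserts) are skipped.
            if isinstance(op["insert"], str):
                result.append(op["insert"])
        elif "retain" in op:
            # Keep the next N characters unchanged.
            result.append(text[pos:pos + op["retain"]])
            pos += op["retain"]
        elif "delete" in op:
            # Skip the next N characters of the original text.
            pos += op["delete"]
    result.append(text[pos:])  # text not covered by the ops is kept as-is
    return "".join(result)


change = [{"retain": 6}, {"delete": 5}, {"insert": "Eddy"}]
print(apply_delta("Hello world!", change))  # prints "Hello Eddy!"
```

Representing edits as operations rather than whole-document snapshots is what makes the format convenient for sending small updates over Socket.IO.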
Workflow:
- User Interaction: Users interact with Eddy through the editor interface and the chat window.
- Real-time Synchronization: Changes made by users are synchronized in real-time using Socket.IO.
- AI Assistance:
  - Autocompletion: As users type, the Autocomplete Manager generates suggestions based on the current context and the user's selected content.
  - Dialog-Based Editing: Users can send instructions to Eddy through the chat. The Dialog Manager processes these instructions, generates an action plan using the Action Plan Manager, and executes actions using the Action Manager.
  - Document Structuring: Users can upload a document or a description of the desired structure, and the Structure Manager will attempt to restructure the document accordingly.
- Collaboration: Multiple users can work on the same document simultaneously, with changes reflected in real-time for all users.
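The dialog-based editing flow described above (plan, execute, evaluate, then apply or reject) can be sketched as a small pipeline. This is only an illustration of the control flow under stated assumptions: handle_message, the dictionary-based action shape, and the toy stand-in functions are all hypothetical, and the real components would call a language model.

```python
from typing import Callable

Action = dict  # hypothetical action shape, e.g. {"find": "...", "replace": "..."}


def handle_message(
    message: str,
    document: str,
    make_plan: Callable[[str, str], list],
    run_action: Callable[[str, Action], str],
    accept: Callable[[str, str], bool],
) -> str:
    """Plan, execute, then keep the result only if the evaluator accepts it."""
    plan = make_plan(message, document)   # role of the Action Plan Manager
    proposed = document
    for action in plan:                   # role of the Action Manager
        proposed = run_action(proposed, action)
    # Role of the Response Evaluator: apply the change or reject it.
    return proposed if accept(document, proposed) else document


# Toy stand-ins for illustration only.
def toy_plan(message: str, document: str) -> list:
    return [{"find": "teh", "replace": "the"}] if "typo" in message else []


def toy_run(document: str, action: Action) -> str:
    return document.replace(action["find"], action["replace"])


def toy_accept(before: str, after: str) -> bool:
    return len(after) > 0


print(handle_message("fix the typo", "teh cat", toy_plan, toy_run, toy_accept))  # prints "the cat"
```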
- Autocompletion
- Dialog-Based Interaction
- Document Structuring
- Real-time Collaboration
- Configuration
- Authentication
- Admin Panel
- File Uploads
The config.py file contains configuration settings for the Eddy application. These settings control various aspects of the application's behavior, such as database connections, CORS policies, and debugging options.
- DEBUG: A boolean flag indicating whether the application is running in debug mode.
- SHOW_EMIT_SUCCESS: A boolean flag indicating whether to log successful Socket.IO emissions.
- CORS_ORIGINS: A list of allowed origins for Cross-Origin Resource Sharing (CORS). This controls which web applications can access Eddy's API.
- SQLALCHEMY_DATABASE_URI: The connection string for the PostgreSQL database. This specifies the database driver, username, password, host, port, and database name.
- SQLALCHEMY_TRACK_MODIFICATIONS: A boolean flag that controls whether SQLAlchemy should track modifications to objects and emit signals. It's generally recommended to set this to False to avoid unnecessary overhead.
- TMP_PATH: The path to the temporary directory used for storing uploaded files and other temporary data.
- TITLE_DOCUMENT_LENGTH_THRESHOLD: The minimum document length (in characters) required before Eddy attempts automatic title generation. Generation is attempted only if the title is unset or no longer than 3 characters, and the user has not set the title manually.
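The title-generation condition is easier to follow as code. The sketch below encodes one reading of the rule: a manually set title is never replaced, and a missing or very short title triggers generation once the document is long enough. The function name, its parameters, and the use of "greater than or equal to" for the threshold are assumptions, not Eddy's actual code.

```python
def should_generate_title(
    text: str,
    title: str,
    set_manually: bool,
    threshold: int = 128,  # mirrors TITLE_DOCUMENT_LENGTH_THRESHOLD
) -> bool:
    """Return True when automatic title generation should be attempted."""
    if set_manually:
        # A title chosen by the user is never overwritten.
        return False
    title_missing_or_short = len(title.strip()) <= 3
    return title_missing_or_short and len(text) >= threshold


print(should_generate_title("x" * 200, "", set_manually=False))  # prints True
```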
```python
# backend/src/config.py
class Config:
    DEBUG = False
    SHOW_EMIT_SUCCESS = False
    # Allow connections from React development server
    CORS_ORIGINS = ["http://localhost:3000", "http://127.0.0.1:3000", "http://localhost:3000/", "*"]  # http://localhost:3000
    SQLALCHEMY_DATABASE_URI = 'postgresql://postgres:1234@localhost:5432/eddy_db'
    SQLALCHEMY_TRACK_MODIFICATIONS = False
    TMP_PATH = '/tmp'
    TITLE_DOCUMENT_LENGTH_THRESHOLD = 128
```

The configuration settings are loaded into the Flask application using app.config.from_object(Config). This makes the settings accessible throughout the application via the current_app.config object.
For example, to access the database URI within a Flask route:

```python
from flask import current_app

# `app` is the Flask application object created elsewhere in the backend
@app.route('/some_route')
def some_route():
    db_uri = current_app.config['SQLALCHEMY_DATABASE_URI']
    # ... use the db_uri ...
```

Eddy uses JSON Web Tokens (JWT) for user authentication and authorization. This ensures that only authorized users can access specific resources and perform certain actions.
- Login:
  - When a user logs in with their email and password, the login route in routes.py is called.
  - The user's credentials are verified against the database.
  - If the credentials are valid, a JWT is generated using the Auth.generate_token method.
  - The token is sent back to the client in the response.
- Token Storage:
  - The client (e.g., the React frontend) stores the token in local storage or a cookie.
- Authenticated Requests:
  - For subsequent requests that require authentication, the client includes the token in the Authorization header as a Bearer token (e.g., Authorization: Bearer <token>).
  - For Socket.IO connections, the token is sent as a query parameter during the connection handshake.
- Token Verification:
  - REST API: The Auth.rest_auth_required decorator is used to protect REST API endpoints. It extracts the token from the Authorization header, decodes it using Auth.decode_token, and verifies its signature. If the token is valid, the decorated function is executed; otherwise, a 401 Unauthorized error is returned.
  - Socket.IO: The Auth.socket_auth_required decorator is used to protect Socket.IO event handlers. It extracts the token from the query parameters, decodes it, and verifies it. If the token is valid, the decorated event handler is executed; otherwise, a server_authentication_failed event is emitted.
- Admin Access:
  - The User model has an is_admin field to indicate whether a user is an administrator.
  - The Auth.generate_token method includes the is_admin status in the JWT payload.
  - The Auth.rest_admin_auth_required and Auth.socket_admin_auth_required decorators are used to protect routes and event handlers that require administrator privileges. They check the is_admin field in the decoded token's payload.
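The token lifecycle above (sign at login, verify on each request) can be illustrated with a standard-library-only HS256 sketch. The source does not say which JWT library Eddy uses, so this is not Eddy's Auth class: the payload keys user_id and is_admin, the secret parameter, and the ValueError-based error handling are assumptions. A production system would normally rely on a maintained JWT library instead of hand-rolled signing.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Restore the stripped padding before decoding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def generate_token(user_id: int, is_admin: bool, secret: str, ttl_seconds: int = 86400) -> str:
    """Build a signed token that expires after ttl_seconds (default: 1 day)."""
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {"user_id": user_id, "is_admin": is_admin, "exp": int(time.time()) + ttl_seconds}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode()) for part in (header, payload)
    )
    signature = hmac.new(secret.encode(), signing_input.encode(), hashlib.sha256).digest()
    return f"{signing_input}.{_b64url(signature)}"


def decode_token(token: str, secret: str) -> dict:
    """Verify signature and expiry; return the payload or raise ValueError."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        expected = hmac.new(
            secret.encode(), f"{header_b64}.{payload_b64}".encode(), hashlib.sha256
        ).digest()
        # Constant-time comparison avoids leaking signature bytes via timing.
        if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
            raise ValueError("invalid signature")
        payload = json.loads(_b64url_decode(payload_b64))
    except (ValueError, TypeError) as exc:
        raise ValueError("invalid token") from exc
    if payload.get("exp", 0) < time.time():
        raise ValueError("token expired")
    return payload


token = generate_token(user_id=7, is_admin=True, secret="example-secret")
print(decode_token(token, "example-secret")["is_admin"])  # prints True
```

Because is_admin travels inside the signed payload, the admin decorators can check privileges without a database lookup, at the cost that a revoked admin keeps access until the token expires.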
- Auth Class:
  - generate_token(user_id, is_admin): Generates a JWT with the user's ID, admin status, and an expiration time.
  - decode_token(token): Decodes a JWT, verifies its signature, and checks for expiration. Returns the payload if the token is valid, or an error message if it's invalid.
  - socket_auth_required(emit_event): Decorator for Socket.IO event handlers that require authentication.
  - socket_admin_auth_required(emit_event): Decorator for Socket.IO event handlers that require admin authentication.
  - rest_auth_required: Decorator for REST API endpoints that require authentication.
  - rest_admin_auth_required: Decorator for REST API endpoints that require admin authentication.
- User Model:
  - set_password(password): Hashes a password using generate_password_hash.
  - check_password(password): Verifies a password using check_password_hash.
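The decorator pattern behind rest_auth_required and rest_admin_auth_required can be sketched without any web framework. In this illustration, handlers receive a plain headers dictionary and return a (body, status) tuple; auth_required, extract_bearer_token, and fake_decode are hypothetical names, and the 403 status for non-admin users is an assumption (the source only specifies 401 for invalid tokens).

```python
from functools import wraps
from typing import Callable, Optional


def extract_bearer_token(headers: dict) -> Optional[str]:
    """Return the token from 'Authorization: Bearer <token>', or None."""
    scheme, _, token = headers.get("Authorization", "").partition(" ")
    if scheme.lower() != "bearer" or not token.strip():
        return None
    return token.strip()


def auth_required(decode: Callable[[str], dict], admin_only: bool = False):
    """Build a decorator that passes the decoded payload to the handler."""
    def decorator(handler):
        @wraps(handler)
        def wrapper(headers: dict, *args, **kwargs):
            token = extract_bearer_token(headers)
            if token is None:
                return {"error": "missing token"}, 401
            try:
                payload = decode(token)
            except ValueError:
                return {"error": "invalid token"}, 401
            if admin_only and not payload.get("is_admin"):
                return {"error": "admin privileges required"}, 403
            return handler(payload, *args, **kwargs)
        return wrapper
    return decorator


def fake_decode(token: str) -> dict:
    """Stand-in for a real token decoder, for illustration only."""
    if token != "good":
        raise ValueError("bad token")
    return {"user_id": 1, "is_admin": False}


@auth_required(fake_decode)
def list_documents(payload: dict):
    return {"owner": payload["user_id"]}, 200


print(list_documents({"Authorization": "Bearer good"}))  # prints ({'owner': 1}, 200)
```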
- Secret Key: The EDDY_SECRET_KEY environment variable is used as the secret key for signing and verifying JWTs. It should be a long, random, and securely stored string.
- Token Expiration: JWTs have an expiration time (set to 1 day in this implementation). This ensures that tokens cannot be used indefinitely.
- HTTPS: In a production environment, Eddy should be served over HTTPS to encrypt communication and protect tokens from being intercepted.
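As a minimal sketch of the secret-key advice, the snippet below reads EDDY_SECRET_KEY from the environment and refuses to start with a missing or short value. The function name load_secret_key and the 32-character minimum are assumptions for illustration; the source only says the key should be long and random.

```python
import os
import secrets
from typing import Mapping


def load_secret_key(env: Mapping[str, str] = os.environ) -> str:
    """Read the signing key and fail fast if it is missing or too short."""
    key = env.get("EDDY_SECRET_KEY", "")
    if len(key) < 32:  # assumed minimum length, not taken from Eddy
        raise RuntimeError("EDDY_SECRET_KEY must be set to a long random value")
    return key


# One way to generate a suitable value once; store it outside the repository.
print(secrets.token_urlsafe(64))
```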
Eddy allows users to upload files that can be used to provide additional context for the AI models. This can improve the relevance of autocompletion suggestions and the accuracy of dialog-based interactions.
- File Selection: Users can select files to upload using the file upload component in the sidebar.
- Text Extraction: When a file is uploaded, the backend attempts to extract text content from it using the FileProcessor class.
  - Different methods are used for different file types (e.g., .txt, .md, .pdf).
  - The textract library is used for handling various file formats.
  - If text extraction fails, the file is still stored, but text-based features might not be available for it.
- Storage:
  - The file content is stored in the FileContent table in the database.
  - The extracted text is stored in the text_content field.
  - Hashes of the file content and extracted text are stored to detect duplicates.
- Embedding Generation:
  - If text extraction is successful, the EmbeddingManager generates embeddings for the extracted text.
  - The text is split into smaller sequences.
  - Embeddings are generated for each sequence using a language model.
  - The embeddings are stored in the SequenceEmbedding table, linked to the FileContent entry through the FileEmbedding table.
- Content Selection: Users can select which uploaded files they want to use as context for the current document.
- Contextualization: When the user interacts with the AI (e.g., through autocompletion or the chat), the selected files are used as context.
  - The EmbeddingManager finds sequences from the selected files that are similar to the current context (e.g., the text around the cursor or the user's chat message).
  - These relevant sequences are included in the prompt sent to the language model, providing additional context for generating responses.
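Two of the steps above, duplicate detection by hash and splitting text into smaller sequences, can be sketched as follows. The source does not name the hash algorithm or the splitting strategy, so SHA-256, the word-based greedy packing, and the 200-character default are assumptions, and both function names are hypothetical.

```python
import hashlib


def content_hash(data: bytes) -> str:
    """Hash file bytes so identical uploads map to the same key."""
    return hashlib.sha256(data).hexdigest()


def split_into_sequences(text: str, max_chars: int = 200) -> list[str]:
    """Greedily pack whole words into chunks of at most max_chars.

    A single word longer than max_chars becomes its own oversized chunk.
    """
    sequences: list[str] = []
    current = ""
    for word in text.split():
        candidate = f"{current} {word}".strip()
        if len(candidate) > max_chars and current:
            sequences.append(current)  # close the full chunk
            current = word             # start a new one with this word
        else:
            current = candidate
    if current:
        sequences.append(current)
    return sequences


seen = {content_hash(b"first upload")}
print(content_hash(b"first upload") in seen)       # prints True (duplicate)
print(split_into_sequences("a b c d", max_chars=3))  # prints ['a b', 'c d']
```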
- FileProcessor: Handles file uploads, text extraction, and temporary file management.
  - process_file_content(filename, content): Extracts text from a file and returns the text content and its hash.
- EmbeddingManager: Generates and manages embeddings.
  - get_embeddings(file): Generates or retrieves embeddings for a FileContent or Document object.
  - find_similar_sequences(text, embedding_ids, limit): Finds sequences similar to a given text within a set of embeddings.
- ContentUploadComponent: Provides the UI for file uploads and selection.
- Database Models:
  - FileContent: Stores file metadata, content, and extracted text.
  - FileEmbedding: Links FileContent to its embeddings.
  - SequenceEmbedding: Stores individual text sequences and their embeddings.
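The retrieval step that find_similar_sequences performs can be illustrated with cosine similarity over stored vectors. This sketch is not Eddy's implementation: rank_sequences is a hypothetical function that takes vectors directly instead of text and embedding IDs, the similarity metric is an assumption, and the two-dimensional vectors are toy values rather than real model embeddings.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_sequences(
    query_vector: Sequence[float],
    stored: list[tuple[str, Sequence[float]]],
    limit: int = 3,
) -> list[str]:
    """Return the texts of the `limit` stored sequences closest to the query."""
    ranked = sorted(
        stored,
        key=lambda item: cosine_similarity(query_vector, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:limit]]


stored = [
    ("notes about cats", [1.0, 0.0]),
    ("notes about dogs", [0.9, 0.1]),
    ("tax report", [0.0, 1.0]),
]
print(rank_sequences([1.0, 0.0], stored, limit=2))
# prints ['notes about cats', 'notes about dogs']
```

The returned texts are what would be added to the prompt as extra context for the language model.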
- File Validation: The backend should validate uploaded files to prevent malicious uploads (e.g., check file type, size, and content).
- Access Control: Only authorized users should be able to upload files and associate them with documents.
- Storage Limits: Consider implementing storage limits to prevent excessive disk usage.
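The validation checks suggested above could look like the following minimal sketch. The allowed extensions (taken from the file types mentioned earlier), the 10 MB limit, and the function name validate_upload are assumptions; the PDF check relies on the standard "%PDF-" file signature.

```python
from pathlib import Path

ALLOWED_EXTENSIONS = {".txt", ".md", ".pdf"}  # assumed allow-list
MAX_FILE_BYTES = 10 * 1024 * 1024             # assumed 10 MB limit


def validate_upload(filename: str, content: bytes) -> list[str]:
    """Return a list of problems; an empty list means the upload looks acceptable."""
    problems = []
    name = Path(filename).name
    if name != filename or name in {"", ".", ".."}:
        problems.append("filename must not contain path components")
    if Path(name).suffix.lower() not in ALLOWED_EXTENSIONS:
        problems.append("unsupported file type")
    if len(content) == 0:
        problems.append("file is empty")
    if len(content) > MAX_FILE_BYTES:
        problems.append("file too large")
    # Content check: a file claiming to be a PDF should start with the PDF signature.
    if name.lower().endswith(".pdf") and not content.startswith(b"%PDF-"):
        problems.append("content does not look like a PDF")
    return problems


print(validate_upload("notes.md", b"# Meeting notes"))  # prints []
print(validate_upload("report.pdf", b"plain text"))
# prints ['content does not look like a PDF']
```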
- Support for more file types: Extend text extraction to support a wider range of file formats.
- Automatic file categorization: Use AI to automatically categorize uploaded files (e.g., "research paper," "meeting notes," "code").
- File preview: Allow users to preview the contents of uploaded files before selecting them as context.
- Background embedding generation: Generate embeddings in the background to improve performance.