## SEC Insights: Deep Dive

## Goals of the Deep Dive
* Deploy to production.
* Understand the role of LLM.
* Understand the placement of the LlamaIndex code.
* Check whether a testing framework is in place.
* Additional parts to pay attention to:
    * GitHub Codespaces
    * AWS S3
    * Cron job service
    * Ready to deploy to Vercel and Render
    * Local environment setup making use of
        * LocalStack

## Tech Stack
* Frontend
    * React / Next.js
    * Tailwind CSS
* Backend
    * FastAPI
    * Docker
    * SQLAlchemy
    * OpenAI
        * gpt-3.5-turbo
        * text-embedding-ada-002
    * PGVector
    * LlamaIndex
* Infrastructure
    * Render.com
        * Backend hosting
        * Postgres 15
    * Vercel
        * Frontend hosting
    * AWS
        * CloudFront
        * S3

## LLM Logic: role and placement
* Role: advanced RAG application
* Main placement: `/backend/app/chat/engine.py`
* Secondary placements in same folder:
    * `messaging.py`
    * `pg_vector.py`
    * `a_response_synth.py`
* Python version of LlamaIndex

## Testing functionality
* Limited testing functionality in `/backend/tests/app/chat/test_engine.py`
    * TestGetChatHistory

## What is it?
* Chat application
* RAG technique
* Answers questions about SEC 10-K and 10-Q documents
* Production-ready
* Full-stack repo
* Ready for you to fork

## Try it out
[secinsights.ai](https://secinsights.ai)

## Product Features
* QA chat grounded in source-of-truth SEC documents
* PDF viewer
* Token-level streaming of chat responses
* Streaming of reasoning steps (sub-questions)
* Citation of source data
* Use of API-based tools (in addition to semantic search)

![Architecture diagram](images/architecture-v1.png)

## Architecture
* Backend
    * Render.com: hosting most of the backend.
        * Similar to AWS but easier to use.
    * FastAPI Backend Service.
        * Integrates with Polygon.io to solve some of the quantitative questions about companies (revenue, etc). Good example of how to integrate API-based tools with the chat. 
    * Postgres 15 database: PG vectorstore.
    * Cron job service (covered in more detail in a later section):
        * Pulls the SEC documents from SEC's Edgar API.
        * Stores the documents in the AWS S3 Public PDF Bucket.
        * Calls OpenAI's embedding API to compute embeddings for the SEC files.
        * Stores the embeddings in the PG (Postgres) vector store, using the PG Vectorstore integration.
        * Updates the metadata of the embeddings in the Private StorageContext Bucket.
        * Can run on whatever schedule you set in the `render.yaml` file.
    * Auto-scaling: computational resources are adjusted automatically based on real demand.
    * All of the above is preconfigured in the `render.yaml` file (explained in detail in the next section).

* AWS S3: **you will have to set this up yourself**
    * Private StorageContext Bucket: metadata from the LlamaIndex library
    * Public PDF Bucket
    * See the instructions on how to create an AWS S3 account in a later section.

* Frontend
    * Next.js
        * Interacts with the FastAPI backend through the chat endpoints.
    * Hosted on Vercel

* External services:
    * [Polygon.io](https://polygon.io) (Financial Data API, Numeric Data). A good example of how to integrate tools into your chat agent.
        * [Get free API key](https://polygon.io/dashboard/login?redirect=%2F) 
    * [SEC's Edgar API](https://sec-api.io)
        * [Get free API Key](https://sec-api.io/signup/free) 
    * OpenAI Service (LLM)
        * The cron service is the main worker that calls the OpenAI Embedding API to get the embeddings for the given SEC documents.
        * The cron service calls the SEC's Edgar API, gets the PDFs, stores them in the AWS Public PDF Bucket, computes their embeddings, and stores the embeddings in the Postgres database (using the PG Vectorstore integration).
        * The cron job can run on whatever schedule you set in the `render.yaml` file.
    * Sentry.io: a production-level monitoring service. It notifies you whenever the backend service raises an error or crosses an error threshold. It also supports performance monitoring, which shows which sections of your code take the most time.
        * More on this in a later section.

All of the setup is open source and easy to deploy on Vercel and Render.com.
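
The cron service flow described above (fetch filings, store the PDFs, compute embeddings, store the vectors) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the repository's code: `ingest_filings` and the four callables it receives are hypothetical stand-ins for the Edgar download, the S3 upload, the OpenAI embedding call, and the PGVector write. A real pipeline would also split each document into chunks before embedding.

```python
from typing import Callable, Dict, Iterable, List, Tuple

# All names below are hypothetical; they only mirror the steps listed above.
Fetch = Callable[[], Iterable[Tuple[str, bytes]]]  # yields (filename, pdf_bytes)
Upload = Callable[[str, bytes], str]               # returns the URL of the stored PDF
Embed = Callable[[bytes], List[float]]             # returns an embedding vector
Store = Callable[[str, List[float]], None]         # persists (url, vector)


def ingest_filings(fetch: Fetch, upload: Upload, embed: Embed, store: Store) -> int:
    """Run one ingestion pass and return the number of documents processed."""
    processed = 0
    for filename, pdf_bytes in fetch():
        url = upload(filename, pdf_bytes)  # store the PDF in the public bucket
        vector = embed(pdf_bytes)          # compute the embedding
        store(url, vector)                 # write it to the vector store
        processed += 1
    return processed


# In-memory fakes so the sketch runs without any external service.
def fake_fetch() -> Iterable[Tuple[str, bytes]]:
    yield "aapl-10k.pdf", b"annual report"
    yield "msft-10q.pdf", b"quarterly report"


vector_store: Dict[str, List[float]] = {}
count = ingest_filings(
    fetch=fake_fetch,
    upload=lambda name, data: f"https://example.com/{name}",
    embed=lambda data: [float(len(data))],
    store=vector_store.__setitem__,
)
print(count, vector_store)
```

Passing the four steps in as callables keeps the orchestration testable without network access; swapping a fake for a real client does not change the loop.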

## Contents of the render.yaml file

Judging by its structure and comments, this file configures how the app is deployed and managed on the cloud hosting platform Render.com. Let's break it down by section to better understand what it does:

#### General Configurations:

- `previewsEnabled: true`: Enables preview functionality for this project.

#### Database Configuration:

- `databases`: Defines a database for the application.
  - `name`: The internal name of the database.
  - `databaseName`: The actual name of the database.
  - `plan`: The service plan for the database (**in this case, `pro`**).
  - `previewPlan`: The plan for previews (in this case, `starter`).

#### Services:

Two main services are defined here, a web service and a cron service.

1. **Web Service (`type: web`)**:
   - Deployed as a Docker service in a web environment.
   - Specifies the service name, runtime configuration, repository, and region.
   - Defines an auto-scaling range (between 2 and 10 instances) with specific targets for CPU and memory usage.
   - Configures a health check endpoint and a command for initial deployment.
   - Specific environment variables for the service, including the database URL and other settings.

2. **Cron Service (`type: cron`)**:
   - A cron job executed in a Docker container.
   - Similar in configuration to the web service but designed to run on a schedule (though here it's configured to not run automatically).
   - Uses the same command as the web service for initial deployment.

#### Environment Variable Groups:

Groups of environment variables are defined for different purposes:

- `general-settings`: Common variables for all environments, such as CORS configuration, S3 bucket names, CDN URL, etc.
- `prod-web-secrets`: Environment-specific variables for the production environment, like API keys and AWS credentials.
- `preview-web-secrets`: Similar variables for the preview environment.

#### Summary:

This file configures two main services (web and cron) on Render.com, with a database and a set of environment variables for different environments (production and preview). It includes details about scaling, service health, and specific deployment configurations, indicating a well-thought-out infrastructure for applications in production and development.
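
To make the auto-scaling range more concrete, here is a small Python sketch of a generic target-tracking rule clamped to the 2 to 10 instance range mentioned above. It is an assumption-laden illustration: the function `desired_instances` is hypothetical, the 60% target is an example value, and Render's actual scaling algorithm is not described in this document and may differ.

```python
import math


def desired_instances(current: int, usage_pct: float, target_pct: float,
                      min_instances: int = 2, max_instances: int = 10) -> int:
    """Generic target-tracking rule: scale so that usage moves toward the target."""
    if current < 1 or target_pct <= 0:
        raise ValueError("current must be >= 1 and target_pct must be > 0")
    wanted = math.ceil(current * usage_pct / target_pct)
    # Clamp to the configured range (2 to 10 in the description above).
    return max(min_instances, min(max_instances, wanted))


print(desired_instances(4, usage_pct=90, target_pct=60))   # 6
print(desired_instances(2, usage_pct=10, target_pct=60))   # 2 (clamped up to the minimum)
print(desired_instances(8, usage_pct=120, target_pct=60))  # 10 (clamped down to the maximum)
```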

## How to create an AWS S3 account

Creating an account on Amazon Web Services (AWS) and starting to use the Amazon S3 (Simple Storage Service) involves several steps. Here is a guide through the process:

1. **Sign Up for Amazon Web Services (AWS)**:
   - Visit the AWS website at [https://aws.amazon.com/](https://aws.amazon.com/).
   - Click on "Create an AWS Account".
   - Follow the instructions to create your account, which include providing your email address, creating a password, and choosing a name for your AWS account.
   - Then, you'll be asked to provide contact information and billing details (credit card information), as AWS operates on a pay-as-you-go basis.

2. **Log in to the AWS Management Console**:
   - Once your account is active, log in to the AWS Management Console at [https://aws.amazon.com/console/](https://aws.amazon.com/console/).
   - Use the email and password you registered with to log in.

3. **Access the Amazon S3 Service**:
   - In the AWS Management Console, search for "S3" in the search bar or find it under the "Storage" section.
   - Click on S3 to open the service management panel.

4. **Create a Bucket in S3**:
   - Within the S3 console, you can start by creating a "bucket", which is a basic container where data is stored.
   - Click on "Create bucket".
   - Provide a unique name for the bucket and select the region where you want to store your data.
   - Configure additional options as needed, such as access control, versioning, etc.
   - Click on "Create" to finalize the creation of the bucket.

5. **Upload Files to Your Bucket**:
   - Once the bucket is created, you can upload files to it. To do this, select your bucket and then use the "Upload" option to add files from your computer.

6. **Configure Permissions and Policies**:
   - It's important to properly configure permissions and security policies to control who can access your files in S3.

7. **Additional Uses**:
   - Besides storing files, you can use S3 for a variety of purposes, such as hosting static websites, as part of backup solutions, etc.

Remember that AWS S3 is a pay-for-use service, so you will be charged for storage and data transfer according to AWS's pricing. Also, it's advisable to familiarize yourself with best practices for security and management to protect your data on AWS.

## Create a private bucket on AWS S3
* name used for the bucket: private-storagecontext-bucket

Creating a private bucket in Amazon S3 (Simple Storage Service) is a process that involves two main parts: 
* creating the bucket
* and setting its access policy to ensure it is private.

Here's how you can do it:

#### Step 1: Create the Bucket
1. **Log in to AWS Management Console**:
   - Go to the [AWS Management Console](https://aws.amazon.com/console/) and log in with your account.

2. **Open Amazon S3 Service**:
   - Once in the console, search for and select the S3 service.

3. **Create a New Bucket**:
   - Click on "Create bucket".
   - Provide a name for your bucket following S3 naming rules.
   - Choose the region where you want to create the bucket.
   - Click "Next" to continue with the configuration.

4. **Optional Configuration**:
   - In the options section, you can leave the default settings or adjust them as needed (versioning, logging, etc.).
   - Click "Next".

5. **Set Permissions**:
   - This is where you will ensure the bucket is private.
   - Check “Block Public Access” to ensure there is no public access to the bucket.
   - Do not add any access control list (ACL) that allows public access.

6. **Review and Create the Bucket**:
   - Review your settings and then click "Create bucket".

#### Step 2 (Optional, recommended): Configure the Bucket Privacy Policy
Once the bucket is created, you can add a bucket policy to further restrict access:

1. **Select Your Bucket**:
   - In the S3 console, click on the name of your newly created bucket.

2. **Go to the Permissions Section**:
   - Click on the “Permissions” tab.

3. **Add a Bucket Policy**:
   - In the Permissions section, find and click on “Bucket Policy”.
   - Here you can add a JSON policy that defines who can access the bucket. To keep it private, you can use a policy that denies all access unless it comes from specific users or roles within your AWS account.

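   Example of a basic policy for maintaining privacy. The `Condition` exempts one IAM role from the deny; without it, the statement would also lock out your own users and roles:
   ```json
   {
     "Version": "2012-10-17",
     "Statement": [
       {
         "Effect": "Deny",
         "Principal": "*",
         "Action": "s3:*",
         "Resource": ["arn:aws:s3:::your-bucket-name/*", "arn:aws:s3:::your-bucket-name"],
         "Condition": {
           "ArnNotEquals": {
             "aws:PrincipalArn": "arn:aws:iam::123456789012:role/your-role-name"
           }
         }
       }
     ]
   }
   ```
   Replace `your-bucket-name` with the actual name of your bucket, and `123456789012` and `your-role-name` with your own AWS account ID and the IAM role that should keep access. Make sure the exempted role is one you can actually use, because every other principal will be denied.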

4. **Save the Policy**:
   - Click “Save” to apply the policy.

With these steps, you will have created a bucket in S3 that is private and whose access is restricted to only those whom you explicitly give permission through IAM policies or bucket policies.
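
If you manage several buckets, the policy can be generated instead of edited by hand. The sketch below is one possible approach, not part of the SEC Insights repo: `private_bucket_policy` is a hypothetical helper, the account ID `123456789012` and the role name are placeholders, and the `ArnNotEquals` condition is an assumption about how you choose to exempt trusted principals from the deny.

```python
import json
from typing import List


def private_bucket_policy(bucket: str, allowed_principal_arns: List[str]) -> dict:
    """Deny every S3 action unless the caller is one of the listed IAM principals."""
    if not allowed_principal_arns:
        # A Deny with no exemption would also lock out the bucket owner.
        raise ValueError("list at least one principal that keeps access")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [f"arn:aws:s3:::{bucket}", f"arn:aws:s3:::{bucket}/*"],
                "Condition": {
                    "ArnNotEquals": {"aws:PrincipalArn": allowed_principal_arns}
                },
            }
        ],
    }


policy = private_bucket_policy(
    "private-storagecontext-bucket",
    ["arn:aws:iam::123456789012:role/example-backend-role"],  # placeholder ARN
)
print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into the "Bucket Policy" editor described above.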

## How to create a public bucket on AWS S3
* name used for the bucket: public-pdf-bucket

To create a public bucket in Amazon S3, you need to configure the bucket's access options to allow public access. However, it's important to consider the security implications of making a bucket public, as this could expose your data to anyone on the Internet. If you're sure that you need a public bucket (for example, to host assets for a static website), here's how you can do it:

#### Step 1: Create the Bucket
1. **Log in to AWS Management Console**:
   - Go to the [AWS Management Console](https://aws.amazon.com/console/) and log in.

2. **Open the Amazon S3 Service**:
   - Once in the console, search for and select the S3 service.

3. **Create a New Bucket**:
   - Click on "Create bucket".
   - Provide a name for your bucket and select the region.
   - Continue with the configuration until you reach the permissions section.

#### Step 2: Configure Public Access Permissions
In the permissions section during bucket creation:

1. **Block Public Access Settings**:
   - Disable the "Block all public access" option. This will allow public access to the bucket.
   - AWS will warn you about the risks of doing this; ensure you understand the consequences.

2. **Review and Create the Bucket**:
   - Review your configuration and then create the bucket.

#### Step 3: Set Up a Bucket Policy for Public Access
Disabling "Block all public access" only permits public policies; it does not make the objects readable by itself. Once the bucket is created, you need to define a bucket policy to explicitly allow public access:

1. **Select Your Bucket**:
   - In the S3 console, click on the name of your bucket.

2. **Go to the Permissions Section**:
   - Click on the "Permissions" tab.

3. **Add a Bucket Policy**:
   - Click on “Bucket Policy”.
   - Add a policy that grants public read permissions. For example:
     ```json
     {
       "Version": "2012-10-17",
       "Statement": [
         {
           "Effect": "Allow",
           "Principal": "*",
           "Action": ["s3:GetObject"],
           "Resource": ["arn:aws:s3:::your-bucket-name/*"]
         }
       ]
     }
     ```
   - Replace `your-bucket-name` with the actual name of your bucket.

4. **Save the Policy**:
   - Click on “Save” to apply the policy.

#### Additional Considerations
- **Security**: Keep in mind that making a bucket public can expose your data. Use this setting only when it's absolutely necessary and when the data is intended to be public.
- **CORS Usage**: If your bucket will be used to serve content to websites on different domains, you might also need to configure CORS rules.

By following these steps, you will have created a bucket in S3 that is publicly accessible, meaning anyone with the correct URL can access or download the files stored in it.
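
To illustrate what "the correct URL" looks like, the following Python sketch builds the public-read policy from Step 3 and the virtual-hosted-style URL of an object. The helper names `public_read_policy` and `public_object_url`, the region, and the object key are assumptions made for this example.

```python
import json
from urllib.parse import quote


def public_read_policy(bucket: str) -> str:
    """Return the public-read bucket policy shown in Step 3 as a JSON string."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": "*",
                "Action": ["s3:GetObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/*"],
            }
        ],
    }
    return json.dumps(policy, indent=2)


def public_object_url(bucket: str, region: str, key: str) -> str:
    """Virtual-hosted-style URL of an object in a publicly readable bucket."""
    # quote() percent-encodes characters such as spaces but keeps "/" separators.
    return f"https://{bucket}.s3.{region}.amazonaws.com/{quote(key)}"


print(public_read_policy("public-pdf-bucket"))
print(public_object_url("public-pdf-bucket", "us-east-1", "sec/aapl 10-K.pdf"))
```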

## What is a Cron Job and how to set it up in Render.com

#### The Cron Job configuration on the render.yaml file

The configuration of the cron job specified in the `render.yaml` file indicates that it will be deployed and executed on Render.com, but within a Docker environment. That is, although the execution of the cron job is managed through Render.com, the environment in which the cron job runs is based on Docker.

To clarify:

- **Deployment on Render.com**: The cron job is configured and managed through Render.com. This includes control over when and how the cron job starts, as well as the provisioning of the necessary resources for its execution.

- **Execution in Docker**: The cron job executes inside a Docker container. This means that the cron job's code runs in an isolated environment provided by Docker. The use of Docker ensures that the cron job has a consistent execution environment, with the required dependencies and configuration already set up.

So, in summary, the cron job is configured to be deployed and managed by Render.com, but it executes inside a Docker container as defined in the `render.yaml` file.

#### What is a Cron Job?
A cron job is a scheduled task that runs automatically at a specified time. It is a common feature of Unix-like operating systems and is used to automate repetitive tasks such as backups, system updates, and file synchronization. In web applications, cron jobs can handle tasks such as database cleanup, report generation, and running maintenance scripts.

#### How to Configure a Cron Job on Render.com

Render.com offers an easy way to configure and manage cron jobs for your applications. Here's how you can configure a cron job on Render:

1. **Access Your Dashboard on Render**: 
   - Log in to your Render.com account and access the dashboard.

2. **Create a New Service**:
   - In the dashboard, select the option to create a new service. Render allows you to create different types of services, including cron jobs.

3. **Configure Your Cron Job**:
   - When setting up a new service, select "Cron Job" as the service type.
   - Provide the necessary details for your cron job:
     - **Service Name**: Assign a descriptive name to your cron job.
     - **Command**: Define the command that will be executed. This should be the script or task you want to automate.
     - **Schedule**: You need to specify when the cron job will run. Render uses the standard cron syntax, which allows you to specify the frequency of execution (for example, `0 * * * *` to run every hour).

4. **Additional Configurations**:
   - Depending on your need, you can set up environment variables and other specific settings for your cron job.

5. **Deployment and Monitoring**:
   - Once configured, deploy your cron job. Render provides tools to monitor the execution of your cron jobs, allowing you to view logs and check the status of executions.

6. **Updates or Changes**:
   - If you need to make changes or adjustments to your cron job, you can update the settings at any time from your dashboard on Render.

#### Example of Cron Job Configuration

Imagine you have a maintenance script called `daily-maintenance.sh` that you want to run every day at midnight. The configuration of your cron job in Render would be something like this:

- **Command**: `./daily-maintenance.sh`
- **Schedule**: `0 0 * * *` (The five fields are minute, hour, day of month, month, and day of week, so this means "at minute 0 of hour 0, on every day of the month, in every month, on every day of the week").

Remember that the correct configuration and testing of your cron jobs are essential to ensure that automated tasks run as expected and without interruptions.
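
One way to sanity-check a schedule before deploying it is to split the expression into its five fields and test it against a few timestamps. The sketch below is a deliberately simplified matcher written for this guide: `parse_cron` and `matches` are hypothetical helpers that only understand `*` and plain integers, not ranges, lists, or step values.

```python
from datetime import datetime

FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def parse_cron(expr: str) -> dict:
    """Split a five-field cron expression into named fields."""
    parts = expr.split()
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected 5 fields, got {len(parts)}")
    return dict(zip(FIELDS, parts))


def matches(expr: str, when: datetime) -> bool:
    """True if `when` matches `expr`. Supports only '*' and plain integers."""
    fields = parse_cron(expr)
    actual = {
        "minute": when.minute,
        "hour": when.hour,
        "day_of_month": when.day,
        "month": when.month,
        "day_of_week": (when.weekday() + 1) % 7,  # cron counts Sunday as 0
    }
    return all(value == "*" or int(value) == actual[name]
               for name, value in fields.items())


print(parse_cron("0 0 * * *"))
print(matches("0 0 * * *", datetime(2024, 1, 1, 0, 0)))   # True: midnight
print(matches("0 0 * * *", datetime(2024, 1, 1, 0, 1)))   # False: one minute later
print(matches("0 * * * *", datetime(2024, 1, 1, 13, 0)))  # True: top of the hour
```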

## Dev Environment

* It's recommended to use the GitHub Codespaces configuration included in the `devcontainer.json` file.

* Log into Github.
* Go to the SEC Insights open source repository:
    * [Here](https://github.com/run-llama/sec-insights/tree/main) 
* Drop-down the "Code" button to create a Github Codespace.
* At the bottom of the Codespace you have a remote terminal.
    * `cd frontend`
    * `ls`
    * You can see that the frontend is based on a basic Vercel app.
    * `npm install`
    * Source the `frontend/.env.example` file to load the environment variables it defines (see the `set -a` and `source` commands below).
    * **The URL there (`NEXT_PUBLIC_BACKEND_URL`) is the local backend URL (`localhost:8000`); it will have to be changed when the backend runs in the cloud**.
    * `set -a`
    * `source .env.example`
    * `npm run dev`
* This starts the app on `localhost:3000`, served by the Codespace.
* It comes with live reload, so if you edit any UI file the change shows immediately. For example, you can edit the title in `components/landing-page/TitleAndDropdown.tsx`.
* With the app running in one terminal, open a second terminal in the Codespace by clicking the plus sign, then change into the backend folder.
    * `cd backend`
    * `ls`
* This is a FastAPI Python backend app. Most of it is based on the templates FastAPI offers.

## Backend

* In the second Codespace terminal, go to `/backend` and read the README file.
* `cd backend`
* There is no need to install pyenv or Docker if you are running from the devcontainer image in GitHub Codespaces.
* `cat .python-version`: shows the Python version the project is pinned to (3.11.3).
* `poetry shell`: activates a virtual environment.

The command `poetry shell` is used in the context of Python programming, specifically when managing Python projects with the tool Poetry. Poetry is a tool for dependency management and packaging in Python, allowing developers to declare, manage, and install dependencies of Python projects.

Here's what `poetry shell` does:

1. **Activates the Virtual Environment**: When you run `poetry shell`, it activates the project's virtual environment. A virtual environment is a self-contained directory that contains a Python installation for a particular version of Python, along with a number of additional packages.

2. **Isolation of Project Dependencies**: This isolation ensures that the dependencies of the project do not interfere with the system-wide Python installation or other Python projects. It's a key practice in Python development to avoid dependency conflicts and maintain project consistency.

3. **Interactive Shell**: Once the virtual environment is activated, you are placed into an interactive shell (like bash or Command Prompt) that is configured to use the project's Python interpreter. This means any Python commands you run in this shell will use the project's Python version and have access to its dependencies.

4. **Convenience for Development**: The `poetry shell` command is convenient for development purposes. You can run Python scripts, start a Python interactive session, or use command-line tools that are part of your project's dependencies without needing to manually activate the virtual environment or adjust your system's `PATH`.

5. **Temporary Activation**: The activation of the virtual environment using `poetry shell` is temporary. Once you exit the shell, the environment is deactivated, and your terminal returns to its previous state.

In summary, `poetry shell` is a command to activate the virtual environment associated with your Poetry-managed Python project, providing an isolated and consistent development environment for that project.
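
A quick way to confirm from Python itself that the virtual environment is active is to compare `sys.prefix` with `sys.base_prefix`. The helper name `in_virtualenv` is made up for this sketch; the two `sys` attributes are standard.

```python
import sys


def in_virtualenv() -> bool:
    """True when the interpreter runs inside a virtual environment."""
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print("virtual environment active:", in_virtualenv())
print("interpreter:", sys.executable)
```

Run it once before and once after `poetry shell`; the first line should change from `False` to `True`.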

* As the terminal prompt shows, you are now inside the `(llama-app-backend-py3.11)` virtual environment in the GitHub Codespaces terminal.
* `poetry install`
* Now you have to create the backend/.env file
    * `cp .env.development .env`
    * `set -a`
    * `source .env`

The command `set -a` in a Unix/Linux shell environment is used to change the behavior of the shell with respect to how it handles variables and their visibility (exporting) to child processes. Here's what it does:

1. **Auto-Export Variables**: When you use `set -a`, any variable that you subsequently define or modify in your shell session will be automatically exported. This means that these variables become environment variables and are inherited by any child processes or sub-shells spawned from your shell.

2. **Child Process Inheritance**: Normally, when you create a variable in a shell, it's only known to that particular shell session. Child processes or scripts invoked from that shell don't have access to those variables unless they are explicitly exported using the `export` command. However, with `set -a`, this export is implicit for all variables set after the command.

3. **Use Cases**: This command is particularly useful in scripts where you need to ensure that all defined variables are available to sub-processes without having to explicitly export each one. It's often used in startup scripts or in scripts that configure environment variables for a particular application or service.

4. **Reversing the Effect**: If you want to revert to the normal behavior where variables are not automatically exported, you can use `set +a`. This command will stop the automatic export of variables defined after it.

5. **Scope of Effect**: It's important to note that the effect of `set -a` is limited to the current shell session or script in which it is run. It does not affect other shell sessions or globally change the behavior of the shell.

In summary, `set -a` is a shell command used to automatically export all variables set in the current shell session, making them available to any child processes. This can be useful in scripting scenarios where environment variable propagation is desired.
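
The inheritance rule is easy to observe from Python, where entries in `os.environ` behave like exported variables and ordinary Python variables behave like unexported ones. The sketch below is only an analogy for the shell behavior described above; `child_sees` and the `DEMO_*` names are invented for the example.

```python
import os
import subprocess
import sys


def child_sees(name: str) -> str:
    """Start a child Python process and return what it reads for `name`."""
    code = f"import os; print(os.environ.get({name!r}, '<unset>'))"
    result = subprocess.run([sys.executable, "-c", code],
                            capture_output=True, text=True, check=True)
    return result.stdout.strip()


demo_local = "only in this process"    # plain variable: never reaches the child
os.environ["DEMO_EXPORTED"] = "hello"  # environment entry: inherited by the child

print(child_sees("DEMO_LOCAL"))     # <unset>
print(child_sees("DEMO_EXPORTED"))  # hello
```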

* `source .env`

The command `source .env` in a Unix/Linux shell is used to execute the contents of a file (in this case, a file named `.env`) in the current shell session. Here's what this command specifically does:

1. **Loads Environment Variables**: The `.env` file typically contains environment variables. These are often key-value pairs that are used to configure the behavior of an application or script. By using `source .env`, you are effectively loading these variables into your current shell environment.

2. **Executes in Current Shell**: The `source` command (which can also be represented as a dot `.` in some shells) executes the file in the context of the current shell, rather than starting a new shell to run the script. This means any changes made to the environment, such as setting variables, changing directories, etc., will persist in the current shell after the script completes.

3. **Use in Application Configuration**: This is a common practice in application development, especially in web development, where `.env` files are used to set configuration variables that should not be hard-coded into the application, such as database passwords, API keys, and other sensitive information.

4. **Security Note**: It's important to be cautious with `.env` files, especially regarding sensitive information. These files should not be included in version control (like Git) if they contain sensitive data.

5. **Portability and Convenience**: This approach allows for easy customization of application behavior in different environments (like development, testing, production) by simply changing the contents of the `.env` file, rather than altering the application code.

In summary, `source .env` is used to execute the contents of the `.env` file in the current shell, typically for the purpose of setting environment variables that configure the behavior of an application or script. This allows for a flexible and secure way to manage configuration settings.
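
For simple files, the combined effect of `set -a` and `source .env` can be approximated in Python, which is handy in scripts that are not started from a prepared shell. This is a rough sketch, not what the backend does: `parse_env_text` is a hypothetical helper that handles only single-line `KEY=VALUE` entries and does no shell expansion.

```python
import os
from typing import Dict


def parse_env_text(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines; skips blank lines and '#' comments."""
    values: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.strip()
        # Drop one pair of matching surrounding quotes, as the shell would.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
            value = value[1:-1]
        values[key.strip()] = value
    return values


sample = (
    "LOG_LEVEL=debug\n"
    "RENDER=False\n"
    "# a comment\n"
    "S3_BUCKET_NAME='llama-app-backend-local'\n"
)
loaded = parse_env_text(sample)
os.environ.update(loaded)  # make the values visible to child processes
print(loaded)
```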

## Enter the environment variables in the backend/.env file

* DATABASE_URL=postgresql://user:password@127.0.0.1:5432/llama_app_db
    * replace with the URL of the database you created in Render.com
* BACKEND_CORS_ORIGINS=
    * '["http://localhost",
    * "http://localhost:8000",
    * "http://localhost:3000",
    * "http://127.0.0.1:3000",
    * "https://llama-app-backend.onrender.com",
        * replace with the URL of your Render.com backend
    * "https://llama-app-frontend.vercel.app",
        * replace with the URL of your Vercel frontend
    * "http://secinsights.ai",
        * replace with the URL of your live app
    * "http://www.secinsights.ai",
        * replace with the URL of your live app
    * "https://secinsights.ai",
        * replace with the URL of your live app
    * "https://www.secinsights.ai"]'
        * replace with the URL of your live app
* OPENAI_API_KEY=sk-XXXXXXXXXXXXXXXXXXXXXXXX
    * enter your OpenAI API Key 
* LOG_LEVEL=debug
* RENDER=False
* S3_BUCKET_NAME=llama-app-backend-local
    * enter the name of your private bucket 
* S3_ASSET_BUCKET_NAME=llama-app-web-assets-local
    * enter the name of your public bucket 
* CDN_BASE_URL=http://llama-app-web-assets-local.s3-website.localhost.localstack.cloud:4566
    * By default, this uses a URL generated by LocalStack, which simulates AWS S3 in your local environment.
    * When you move to production, you will create a CDN using AWS CloudFront.
        * The CDN serves the contents of your bucket more efficiently.
        * Read the pricing details carefully: usage starts out free but becomes paid once you pass a certain traffic volume (1 TB).
        * Create a CloudFront distribution.
        * Associate it with your public bucket (the one named in `S3_ASSET_BUCKET_NAME`).
        * Enter the distribution URL here.
* AWS_KEY=xxx
    * You can enter a fake key like 123abc123 in development. 
    * In production, you will need to enter your AWS access key. Read the AWS security recommendations about it.
* AWS_SECRET=xxx
    * You can enter a fake secret like 123abc123 in development.
    * In production, you will need to enter your AWS secret access key. Read the AWS security recommendations about it.
* POLYGON_IO_API_KEY=xxx
    * You can enter a fake key like 123abc123 in development. 
    * In production, you will need to enter your Polygon API Key.
* SEC_EDGAR_COMPANY_NAME=YourOrgName
    * You can enter a fake name like abc123 in development.
    * In production, you will need to enter your SEC Edgar Company Name.
* SEC_EDGAR_EMAIL=you@example.com
    * You can enter a fake email like abc@abc.com in development.
    * In production, you will need to enter your SEC Edgar account email.
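
Because `BACKEND_CORS_ORIGINS` holds a JSON list inside a single string, a typo in it is easy to miss. The sketch below shows one way to check the values before starting the server. It is not the backend's own settings code: `parse_cors_origins`, `missing_settings`, and the choice of which variables count as required are assumptions made for this example.

```python
import json
from typing import Dict, List

# Assumed for illustration; adjust to the variables your deployment needs.
REQUIRED = ("DATABASE_URL", "OPENAI_API_KEY", "S3_BUCKET_NAME", "S3_ASSET_BUCKET_NAME")


def parse_cors_origins(raw: str) -> List[str]:
    """Decode the JSON list stored in BACKEND_CORS_ORIGINS."""
    origins = json.loads(raw)
    if not isinstance(origins, list) or not all(isinstance(o, str) for o in origins):
        raise ValueError("BACKEND_CORS_ORIGINS must be a JSON list of strings")
    return origins


def missing_settings(env: Dict[str, str]) -> List[str]:
    """Return the required variable names that are absent or empty."""
    return [name for name in REQUIRED if not env.get(name)]


example_env = {
    "DATABASE_URL": "postgresql://user:password@127.0.0.1:5432/llama_app_db",
    "OPENAI_API_KEY": "",
    "S3_BUCKET_NAME": "llama-app-backend-local",
    "BACKEND_CORS_ORIGINS": '["http://localhost", "http://localhost:3000"]',
}
print(parse_cors_origins(example_env["BACKEND_CORS_ORIGINS"]))
print(missing_settings(example_env))
```

To check your real shell environment, pass `dict(os.environ)` to `missing_settings` after sourcing the `.env` file.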

## Start the backend server

* `make migrate`: runs the database migrations.
    * In development, LocalStack will simulate AWS S3 in your local environment.
* `make run`: starts the server locally.
    * This spins up the Postgres 15 DB and LocalStack in their own Docker containers.
    * The server does not run in a container; it runs directly on your OS.
    * This allows the use of debugging tools like pdb.

## When you are ready, enter your final environment variables in the .env file

* Open the .env file and replace the placeholder values with your own API keys.
* Source the file again with `set -a` then `source .env`

## Populate your local database with some sample SEC filings

* Run `make seed_db_local`
    * If this step fails, you may find it helpful to run `make refresh_db` to wipe your local database and re-start with emptied tables.
* Done 🏁! You can run `make run` again and you should see some documents loaded at [http://localhost:8000/api/document](http://localhost:8000/api/document)
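
If you prefer to check the seeded documents from a script rather than the browser, a small standard-library sketch like the one below can do it. It assumes the endpoint returns a JSON array, which this guide does not state explicitly; `parse_documents` and `count_documents` are hypothetical helper names.

```python
import json
from urllib.request import urlopen


def parse_documents(body: bytes) -> list:
    """Decode the response body; assumes the endpoint returns a JSON array."""
    documents = json.loads(body)
    if not isinstance(documents, list):
        raise ValueError("expected a JSON array of documents")
    return documents


def count_documents(url: str = "http://localhost:8000/api/document") -> int:
    """Fetch the document list from the local backend and count the entries."""
    with urlopen(url, timeout=10) as response:
        return len(parse_documents(response.read()))


# Offline demonstration with a made-up payload; call count_documents()
# instead once the backend is running.
print(len(parse_documents(b'[{"id": "1"}, {"id": "2"}]')))
```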

## To use this RAG app with your own private documents

* Example working with the app in your local environment.
* Have the backend running in a second terminal window.
* Go to localhost:8000/api/document
    * This will show the current documents in the database 
* To load a new document, use one of the scripts in the `/backend/scripts` folder, such as `upsert_document.py`.
* In the terminal, run `python upsert_document.py URLofYourPDFDocument`
    * This loads a new document into the database or updates it if it already exists.
    * if you go to localhost:8000/api/document you will see the new document
    * then, if you run `make chat` you will be able to chat with your private documents from your terminal.
        * run `pick_docs`
        * pick > run `help` to see the available commands
        * pick > run `select_id idOfTheDocument`
        * pick > run `finish`
        * run `help`
        * run `create`
        * run `detail`
        * run `message WriteYourQuestionAboutTheDocumentHere`
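
The "insert, or update if it already exists" behavior mentioned above is commonly called an upsert. The sketch below demonstrates the idea with Python's built-in `sqlite3` module so it runs anywhere. It is not the code of `upsert_document.py`: the `document` table, its columns, and the function below are invented for illustration. Postgres supports the same `ON CONFLICT ... DO UPDATE` clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE document (url TEXT PRIMARY KEY, title TEXT)")


def upsert_document(url: str, title: str) -> None:
    """Insert a row for `url`, or update the existing row if there is one."""
    conn.execute(
        "INSERT INTO document (url, title) VALUES (?, ?) "
        "ON CONFLICT(url) DO UPDATE SET title = excluded.title",
        (url, title),
    )
    conn.commit()


upsert_document("https://example.com/report.pdf", "Draft")
upsert_document("https://example.com/report.pdf", "Final")  # same URL: updates the row
print(conn.execute("SELECT url, title FROM document").fetchall())
# [('https://example.com/report.pdf', 'Final')]
```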

## For any issues
* For any issues in setting up the above or during the rest of your development, you can check for solutions in the following places:

    * [backend/troubleshooting.md](https://github.com/run-llama/sec-insights/blob/main/backend/troubleshooting.md)
    * [Open & already closed Github Issues](https://github.com/run-llama/sec-insights/issues?q=is%3Aissue+is%3Aclosed)
    * [The #sec-insights discord channel](https://discord.com/channels/1059199217496772688/1150942525968879636)

## SEC Document Downloader

We have a script to easily download SEC 10-K & 10-Q files! This is a single step of the larger seed script described in the next section. Unless you have some use for running just this step on its own, you probably want to stick to the seed script described in the section below. However, the setup instructions for this script are a prerequisite for running the seed script.

No API keys are needed to use this, because it calls the SEC's free-to-use Edgar API.

The instructions below explain how to use the script to download the SEC filings, convert them to PDFs, and store them in an S3 bucket.

## Setup / Usage Instructions
Pre-requisite setup steps to use the downloader script to load the SEC PDFs directly into an S3 bucket.

These steps assume you've already followed the steps above for setting up your dev workspace!

#### Setup AWS CLI
* Install AWS CLI
    * This step can be skipped if you're running from the devcontainer image in Github Codespaces
    * Steps:
        * `curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"`
        * `unzip awscliv2.zip`
        * `sudo ./aws/install`
* Configure AWS CLI
    * This is mainly to set the AWS credentials that will later be used by s3fs
    * Run `aws configure` and enter the access key & secret key for an AWS IAM user that has access to the S3 bucket where you want to store the SEC files.
        * Set the default AWS region to us-east-1 (what we're primarily using).

#### Setup s3fs
* Install s3fs
    * This step can be skipped if you're running from the devcontainer image in Github Codespaces
    * `sudo apt install s3fs`
* Set up an s3fs mounted folder
    * Create the mounted folder locally: `mkdir ~/mounted_folder`
    * `s3fs llama-app-web-assets-preview ~/mounted_folder`
        * You can replace llama-app-web-assets-preview with the name of the S3 bucket you want to upload the files to.

#### Install wkhtmltopdf
* This step can be skipped if you're running from the devcontainer image in Github Codespaces
* Steps:
    * `sudo apt-get update`
    * `sudo apt-get install wkhtmltopdf`

#### Get into your poetry shell with `poetry shell` from the project's root directory.

#### Run the script! 
* `python scripts/download_sec_pdf.py -o ~/mounted_folder --file-types="['10-Q','10-K']"`
* Take a 🚽 break while it's running, it'll take a while!

#### Go to AWS Console and verify you're seeing the SEC files in the S3 bucket.

## Seed DB Script

We have a collection of scripts for seeding the database with a set of documents. The script in `scripts/seed_db.py` is an attempt at consolidating those disparate scripts into one unified command.

This script will:

* Download a set of SEC 10-K & 10-Q documents to a local temp directory
* Upload those SEC documents to the S3 folder specified by $S3_ASSET_BUCKET_NAME
* Crawl through all the PDF files in the S3 folder and upsert a database row into the Document table based on the path of the file within the bucket

#### Use Cases
This is useful for times when:

* You want to set up a local environment where your local Postgres DB has a set of documents in the documents table
    * When running locally, this will use localstack to store the documents into a local S3 bucket instead of a real one.

* You want to update the documents present in either Prod or Preview DBs
    * In fact, this is the very script that is run by the llama-app-cron cron job service that gets set up by the render.yaml blueprint when deploying this service to Render.com.

#### Usage
To run the script, make sure you've:

* Activated your Python virtual environment using poetry shell
* Installed all the pre-requisite dependencies for the SEC Document Downloader script.
* Defined all the environment variables from `.env.development` within your shell environment according to the environment you want to execute the seed script (e.g. local, preview, prod environments)

After that you can run `python scripts/seed_db.py` to start the seed process.

To make things easier, the Makefile has some shorthand commands.

* `make seed_db`
    * Just runs the seed_db.py script with no CLI args, so just based on what env vars you've set

* `make seed_db_preview`
    * Same as make seed_db but only loads SEC documents from Amazon & Meta
    * We don't need to load that many company documents for Preview environments.

* `make seed_db_local`
    * To be used for local database seeding
    * Runs seed_db.py just for `$AMZN & $META` documents
    * Sets up the localstack bucket to actually serve the documents locally as well, so you can load them in your local browser.

* `make seed_db_based_on_env`
    * Automatically calls one of the above shorthands based on the RENDER & IS_PREVIEW_ENV environment variables

## Deep dive into the key LlamaIndex logic
* Main placement: `/backend/app/chat/engine.py`
* Secondary placements in same folder:
    * `messaging.py`
    * `pg_vector.py`
    * `qa_response_synth.py`

## /backend/app/chat/engine.py

#### General summary

This code is a backend process which interacts with documents and handles chat messages for an AI assistant service. 

It is using several Python libraries and custom modules to accomplish its tasks. Here's a simplified breakdown:

1. **Import Statements**: The code starts by importing various Python modules and custom modules. These are used for handling different types of data, file operations, logging, and interacting with external services like AWS S3.

2. **Logger Configuration**: It sets up logging to keep track of events and errors.

3. **AsyncIO Patch**: A patch for AsyncIO (asynchronous input/output) is applied. This is used in Python for non-blocking I/O operations, which are important in web and network applications for efficiency.

4. **S3 File System Setup**: A function `get_s3_fs` is defined to configure and return a connection to AWS S3, a cloud storage service. It uses credentials and settings from the application configuration.

5. **Document Handling Functions**: Several functions are defined to fetch, read, and process documents. For example, `fetch_and_read_document` downloads a PDF document, reads it, and returns its contents.

6. **Document Indexing and Querying**: Functions like `index_to_query_engine` and `build_doc_id_to_index_map` are defined for indexing documents and setting up a query engine. The application uses a document search and retrieval system.

7. **Chat Message Processing**: The function `get_chat_history` processes the chat messages of a conversation, filtering and sorting them so that the chat engine can use them as conversational context.

8. **Service Context Setup**: The `get_tool_service_context` function sets up the context for various services used in the application, likely integrating different tools and APIs for processing requests.

9. **Chat Engine Initialization**: The function `get_chat_engine` initializes a chat engine with various settings and tools. This engine is probably used to handle and respond to chat messages in a conversation, using AI models (like GPT-4).

10. **Caching and Asynchronous Operations**: The code uses caching (with `cached` and `TTLCache`) and asynchronous operations (with `async` keyword) for efficiency, especially important in handling web requests and external API calls.

11. **Application-Specific Logic**: The code includes specific logic related to the application's domain, like handling SEC documents, parsing chat messages, and generating responses.

In summary, this code is a part of a backend service for an AI-powered chat application, dealing with document management, retrieval, and processing, as well as handling and responding to chat messages in a conversational context. It integrates various services and APIs, and employs modern Python programming techniques like asynchronous operations and caching for efficiency.
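
The caching mentioned in item 10 can be illustrated without any third-party package. The sketch below is a minimal, standard-library stand-in for a time-to-live cache. The names `ttl_cached` and `clock` are hypothetical and chosen for illustration; they are not the `cached`/`TTLCache` helpers the module itself uses.

```python
import time
from functools import wraps


def ttl_cached(ttl_seconds, clock=time.monotonic):
    """Memoize a function's results for ttl_seconds (hypothetical helper).

    Only positional, hashable arguments are supported in this sketch.
    """
    def decorator(func):
        store = {}  # maps args -> (timestamp, value)

        @wraps(func)
        def wrapper(*args):
            now = clock()
            hit = store.get(args)
            if hit is not None and now - hit[0] < ttl_seconds:
                return hit[1]  # entry is still fresh: reuse it
            value = func(*args)
            store[args] = (now, value)
            return value

        return wrapper

    return decorator
```

Passing the clock as a parameter keeps the expiry logic easy to test, because a test can advance time without sleeping.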

#### index_to_query_engine function

This code defines a function named `index_to_query_engine` in Python, which is designed to convert a document index into a query engine. Here's a simplified explanation:

1. **Function Definition**: The function `index_to_query_engine` takes two parameters:
   - `doc_id`: A string representing the unique identifier of a document.
   - `index`: An object of type `VectorStoreIndex`, a class from the LlamaIndex library. This object represents an index of documents where each document is represented as a vector (a list of numbers) for efficient searching and comparison.

2. **Setting Up Filters**: Inside the function, a `MetadataFilters` object is created with an `ExactMatchFilter`. This filter is configured to match documents with a specific ID (`DB_DOC_ID_KEY`) equal to the `doc_id` passed to the function. Essentially, this setup ensures that the query engine will only consider the document with the specified ID.

3. **Query Engine Configuration**: The function then prepares a set of keyword arguments (`kwargs`) for the query engine:
   - `similarity_top_k`: This is set to 3, which means the query engine will return the top 3 most similar results when performing a semantic search.
   - `filters`: The filters configured earlier are passed here, restricting the search to the specified document.

4. **Creating and Returning the Query Engine**: Finally, the function calls `as_query_engine` method on the `index` object, passing the configured arguments. This method converts the index into a query engine capable of performing search queries based on the settings provided. The resulting `BaseQueryEngine` object is returned.

In simple terms, this function is part of a system that handles complex document searches. It takes a document's unique identifier and an index, and sets up a query engine specifically tailored to search for information related to that particular document, focusing on the top 3 most relevant results.
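
To make the combination of an exact-match metadata filter and a top-3 similarity search concrete, here is a minimal sketch in plain Python. It does not use LlamaIndex: `Node`, `cosine`, and `query` are hypothetical stand-ins, and cosine similarity is assumed as the ranking measure.

```python
import math
from dataclasses import dataclass, field

DB_DOC_ID_KEY = "db_document_id"  # metadata key named in the walkthrough


@dataclass
class Node:
    """Hypothetical stand-in for an indexed chunk of a document."""
    text: str
    embedding: list
    metadata: dict = field(default_factory=dict)


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def query(nodes, query_embedding, doc_id, similarity_top_k=3):
    """Return the top-k most similar nodes that belong to doc_id."""
    # Exact-match filter: only nodes of the requested document are considered.
    candidates = [n for n in nodes if n.metadata.get(DB_DOC_ID_KEY) == doc_id]
    ranked = sorted(
        candidates,
        key=lambda n: cosine(n.embedding, query_embedding),
        reverse=True,
    )
    return ranked[:similarity_top_k]
```

The filter runs before ranking, so chunks from other documents can never appear in the results, no matter how similar they are to the query.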

#### get_storage_context function

This piece of Python code defines a function named `get_storage_context`. The purpose of this function is to create and return a `StorageContext` object, which is used for managing data storage in the application. Let's break down the function in simple terms:

1. **Function Definition**: The function `get_storage_context` is defined with three parameters:
   - `persist_dir`: A string representing the directory where data should be persisted (stored). This is where the application will save or access its data.
   - `vector_store`: An object of type `VectorStore`. This refers to a storage system or structure where vectors are stored. Vectors are used to represent data, especially in the context of search and machine learning.
   - `fs`: An optional parameter of type `AsyncFileSystem`, which is a file system interface for asynchronous file operations. Asynchronous operations allow the application to handle file I/O (input/output) in a non-blocking way, which is more efficient for web and network applications. This parameter defaults to `None` if not provided.

2. **Logging Information**: The function logs a message saying "Creating new storage context." This is a standard practice in software development to record what the application is doing, which helps in debugging and monitoring.

3. **Creating and Returning the Storage Context**: The function then creates a new `StorageContext` object by calling `StorageContext.from_defaults`. It passes the `persist_dir`, `vector_store`, and `fs` to this method. The `from_defaults` method likely sets up a standard storage context with the given parameters.

4. **Return Value**: The newly created `StorageContext` object is returned. This object would encapsulate the details about how and where the application's data is stored and managed.

In summary, `get_storage_context` is a utility function that sets up and returns a storage context for the application, based on the given directory for data persistence, a vector store for storing data vectors, and an optional asynchronous file system interface. This storage context is used in the application to handle data storage and retrieval operations efficiently.

#### build_doc_id_to_index_map function

This code defines an asynchronous function `build_doc_id_to_index_map` in Python. The function's purpose is to create a mapping between document IDs and their corresponding indices, which are used for efficient data retrieval and search operations. Here's a breakdown of the function in simple terms:

1. **Function Definition**: 
   - `async def`: Indicates that this is an asynchronous function, which can perform operations without blocking the execution of other parts of the program.
   - `service_context`: A parameter representing the context in which the service operates.
   - `documents`: A list of `DocumentSchema` objects, each representing a document in the system.
   - `fs`: An optional parameter of type `AsyncFileSystem`, used for handling file operations asynchronously. It defaults to `None` if not provided.
   - Returns a dictionary mapping string keys (document IDs) to `VectorStoreIndex` objects.

2. **Setting Up Storage Context**:
   - The function defines a `persist_dir` variable, set to the name of an S3 bucket (a cloud storage location) from the application's settings.
   - It attempts to retrieve a singleton instance of a `vector_store` asynchronously using `await get_vector_store_singleton()`.
   - It tries to create a `storage_context` using this vector store and the file system provided. If it fails (due to a `FileNotFoundError`), it logs a message and creates a new storage context.

3. **Loading Indices from Storage**:
   - The function attempts to load indices for each document from the storage. These indices are structures that enable efficient searching and accessing of documents.
   - If successful, it creates a mapping (`doc_id_to_index`) between each document's ID and its index.

4. **Handling Errors and Creating New Indices**:
   - If there is a `ValueError`, suggesting that indices could not be loaded, the function logs an error message and proceeds to create new indices for each document.
   - It reads each document (using `fetch_and_read_document`), adds them to the storage context's document store, and creates a new index for each document.
   - These indices are then persisted in the storage and added to the `doc_id_to_index` mapping.

5. **Returning the Mapping**:
   - Finally, the function returns the `doc_id_to_index` dictionary, which contains a mapping of document IDs to their corresponding indices.

In summary, `build_doc_id_to_index_map` is an asynchronous function that creates and returns a mapping between document IDs and their indices. It handles both the scenario where indices can be loaded from existing storage and the case where new indices need to be created, ensuring efficient document management and retrieval in the application.
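
The load-or-build flow described above can be sketched as follows. This is a simplified illustration using only the standard library: `InMemoryStorage` is a hypothetical stand-in for the persisted storage context, the placeholder `fetch_and_read_document` returns fake text instead of downloading a PDF, and a plain dict plays the role of an index.

```python
import asyncio


class InMemoryStorage:
    """Hypothetical stand-in for persisted index storage."""

    def __init__(self):
        self._indices = {}

    def load_index(self, doc_id):
        if doc_id not in self._indices:
            raise ValueError(f"No index persisted for document {doc_id}")
        return self._indices[doc_id]

    def persist_index(self, doc_id, index):
        self._indices[doc_id] = index


async def fetch_and_read_document(doc_id):
    """Placeholder for downloading and parsing a document."""
    await asyncio.sleep(0)
    return [f"text of {doc_id}"]


async def build_doc_id_to_index_map(doc_ids, storage):
    try:
        # Fast path: every index already exists in storage.
        return {doc_id: storage.load_index(doc_id) for doc_id in doc_ids}
    except ValueError:
        # Fallback: build, persist, and map a new index for each document.
        doc_id_to_index = {}
        for doc_id in doc_ids:
            chunks = await fetch_and_read_document(doc_id)
            index = {"doc_id": doc_id, "chunks": chunks}
            storage.persist_index(doc_id, index)
            doc_id_to_index[doc_id] = index
        return doc_id_to_index
```

A second call with the same storage takes the fast path and returns the indices persisted by the first call.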

#### get_tool_service_context function

This code defines a function named `get_tool_service_context`, which is designed to create and return a `ServiceContext` object. This `ServiceContext` is a configuration used to handle the main operations within the application. Here’s a breakdown in simpler terms:

1. **Function Definition**:
   - The function takes one parameter, `callback_handlers`, which is a list of `BaseCallbackHandler` objects. These callback handlers are likely used to perform specific actions or processes when certain events or conditions are met in the application.

2. **Setting Up OpenAI and Embedding Models**:
   - The function initializes an `OpenAI` object with specific settings, including a temperature of 0 (which controls the randomness of the AI's responses), the model `gpt-4-1106-preview` (indicating it uses a specific version of the GPT-4 AI model), and an API key for authentication.
   - It also creates a `CallbackManager` using the provided callback handlers.
   - An `embedding_model` is set up using `OpenAIEmbedding`. This model is configured for similarity mode, indicating it is likely used for tasks involving finding similarities in text data, and it uses a specific embedding model type (`TEXT_EMBED_ADA_002`).

3. **Node Parser Configuration**:
   - A `node_parser` is initialized with default settings. This parser is configured with a chunk size and chunk overlap, parameters that might be relevant for breaking down and analyzing text data in smaller parts for more detailed or granular processing.

4. **Creating the Service Context**:
   - The function then creates a `ServiceContext` object using the `from_defaults` method. This `ServiceContext` is configured with the previously created `callback_manager`, `llm` (the OpenAI object), `embedding_model`, and `node_parser`.
   - The `ServiceContext` is a comprehensive setup that includes components for AI processing (using OpenAI), handling callbacks, and text data processing.

5. **Returning the Service Context**:
   - Finally, the function returns the `ServiceContext` object.

In simple terms, `get_tool_service_context` is a setup function that prepares and returns a context (a collection of settings and tools) for a service. This context includes an AI model (GPT-4), mechanisms to handle callbacks (actions triggered by certain events), and tools for analyzing and processing text data. This setup is used in parts of the application where AI-powered text analysis and processing are needed.

#### get_chat_engine function

This code defines an asynchronous function named `get_chat_engine`, which is designed to set up and return a chat engine powered by OpenAI's technology for an application that handles interactive conversations. The function uses various tools and services to manage and respond to chat messages in a sophisticated manner. Here's a breakdown of the function in simpler terms:

1. **Function Definition**:
   - It's an `async` function, meaning it's designed to perform asynchronous operations that don't block the execution of other parts of the program.
   - It takes two parameters: `callback_handler`, an object to handle certain types of events or actions, and `conversation`, a schema representing the structure of a conversation.

2. **Setting Up the Service Context**:
   - The function initializes a service context using `get_tool_service_context`. This context includes various tools and settings for handling chat-related tasks, including AI models and callback mechanisms.

3. **Document Indexing and Storage Setup**:
   - Retrieves a connection to an S3 file system (`s3_fs`).
   - Builds a map (`doc_id_to_index`) that links document IDs to their indices for efficient document retrieval. This is done using the `build_doc_id_to_index_map` function.

4. **Creating Query Engines**:
   - Sets up two types of query engines:
     - **Qualitative Question Engine**: Handles qualitative aspects of the conversation, like sentiments, risks, etc., using documents.
     - **Quantitative Question Engine**: Deals with quantitative data, like financial metrics, from selected documents.

5. **Integrating OpenAI Model**:
   - Initializes an `OpenAI` model with specific settings for generating or interpreting chat messages. This model uses OpenAI's GPT-4 with certain configurations.

6. **Chat History and Document Titles**:
   - Processes the chat history from the `conversation` and prepares a list of document titles related to the conversation.

7. **Assembling the Chat Engine**:
   - The chat engine (`OpenAIAgent`) is assembled using various tools, including the qualitative and quantitative query engines, the OpenAI model, chat history, and system prompts. The system prompt is formatted with the current date and document titles.
   - The chat engine is designed to use the assembled tools to handle and respond to chat messages in the context of the conversation.

8. **Returning the Chat Engine**:
   - Finally, the function returns the fully assembled chat engine.

In summary, the `get_chat_engine` function sets up a chat engine capable of handling and responding to conversations in a sophisticated manner. It integrates document management, AI-powered message generation and interpretation, and specialized query engines for handling different types of questions.
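
As an illustration of step 7, the sketch below formats a system prompt with the current date and the selected document titles. The prompt wording and the ISO date format are assumptions made for this example; only the `{doc_titles}` and `{curr_date}` placeholders come from the code being described.

```python
from datetime import date

# Illustrative wording only; the real SYSTEM_MESSAGE lives in constants.py.
SYSTEM_MESSAGE = (
    "You are an expert financial analyst.\n"
    "The user has selected these SEC documents:\n"
    "{doc_titles}\n"
    "The current date is: {curr_date}"
)


def build_system_message(titles, today):
    """Fill the system prompt placeholders (hypothetical helper)."""
    doc_titles = "\n".join("- " + title for title in titles)
    return SYSTEM_MESSAGE.format(
        doc_titles=doc_titles,
        curr_date=today.isoformat(),  # assumed date format
    )
```

Calling `build_system_message(["Amazon 10-K 2022"], date.today())` yields a prompt that lists the selected document and states today's date.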

## backend/app/chat/pg_vector.py

#### General Summary

This code is part of a web application using FastAPI, and it deals with setting up and managing a specialized database store for vectors. Vectors are numerical representations of data, commonly used in machine learning and search applications. Let's break it down into simpler terms:

1. **Import Statements**: 
   - The code begins by importing necessary modules and classes. These include `VectorStore` (a type of data store for vectors), `PGVectorStore` (a PostgreSQL-based implementation of `VectorStore`), and various SQLAlchemy components (a Python SQL toolkit and ORM).

2. **Singleton Setup**: 
   - `singleton_instance` and `did_run_setup` are global variables. The `singleton_instance` will hold a single instance of the vector store (to ensure only one exists in the application), and `did_run_setup` is a flag to check if the setup has been done.

3. **CustomPGVectorStore Class**: 
   - This is a custom class that inherits from `PGVectorStore`. It's designed to integrate with the FastAPI application's database connections.
   - The `_connect` method overrides the parent class to set up the database engine and session using the application's existing database engine (`app_engine`) and session (`AppSessionLocal`). This ensures that the vector store uses the same database connection pool as the rest of the application.
   - The `close` method is an asynchronous function to close database sessions and dispose of the engine.
   - The `_create_tables_if_not_exists` and `_create_extension` methods are overridden but left empty, indicating they are not needed or their functionality is handled elsewhere.
   - The `run_setup` method is an asynchronous function that ensures the PostgreSQL extension for vector handling is created and initializes the required database tables. It uses the `did_run_setup` flag to avoid redundant setups.

4. **Asynchronous Singleton Factory Function**:
   - `get_vector_store_singleton` is an asynchronous function that returns an instance of `VectorStore`. 
   - It checks if `singleton_instance` already exists; if not, it creates a new instance of `CustomPGVectorStore` with parameters derived from the application's settings (like database URL and table name).
   - Once created, this instance is stored in `singleton_instance` and returned for use.

In summary, this code is setting up and managing a custom vector store in a FastAPI application for storing and querying vector data efficiently. It integrates with the application's database settings and connection pool, and ensures that the vector store is set up only once (singleton pattern). The custom vector store is specifically tailored to work with PostgreSQL and is asynchronous, aligning with FastAPI's asynchronous nature.
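
The singleton factory and the run-once setup flag can be sketched with the standard library alone. `FakeVectorStore` and the table name below are hypothetical; the real class talks to PostgreSQL, while this stand-in only counts how often setup actually runs.

```python
import asyncio

singleton_instance = None
did_run_setup = False


class FakeVectorStore:
    """Hypothetical stand-in for the PostgreSQL-backed vector store."""

    def __init__(self, table_name):
        self.table_name = table_name
        self.setup_runs = 0

    async def run_setup(self):
        global did_run_setup
        if did_run_setup:
            return  # setup already done, nothing to repeat
        did_run_setup = True  # set before awaiting so concurrent calls skip
        await asyncio.sleep(0)  # placeholder for creating extension and tables
        self.setup_runs += 1


async def get_vector_store_singleton():
    global singleton_instance
    if singleton_instance is None:
        # No await between the check and the assignment, so two coroutines
        # cannot both create an instance.
        singleton_instance = FakeVectorStore(table_name="example_table")
    return singleton_instance
```

Every caller receives the same object, and repeated `run_setup` calls do the work only once.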

## backend/app/chat/messaging.py

#### General Summary

This code focuses on handling chat messages and streaming responses. Let's break it down into simpler terms:

1. **Import Statements and Class Definitions**:
   - The code begins by importing necessary Python modules and classes, including those for handling asynchronous operations (`asyncio`), logging, and data streaming.
   - It defines some data models (`StreamedMessage` and `StreamedMessageSubProcess`) using Pydantic, which are structured representations of the messages and subprocesses in the chat.

2. **ChatCallbackHandler Class**:
   - This class extends `BaseCallbackHandler` and is designed to handle specific events that occur during the chat.
   - It defines methods like `on_event_start` and `on_event_end` to manage the start and end of certain events, creating asynchronous tasks to process these events.
   - The `async_on_event` method asynchronously handles specific events, sending information about these events (such as their metadata) to a channel for further processing.
   - The class also includes some no-operation methods (`start_trace` and `end_trace`), which are placeholders and do not perform any action.

3. **Handle Chat Message Function**:
   - `handle_chat_message` is an asynchronous function that takes a conversation object, a user message, and a send channel as inputs.
   - The function sets up a chat engine by calling `get_chat_engine`, passing it an instance of `ChatCallbackHandler` and the conversation context.
   - It then sends a templated message (which includes the user's message) to the chat engine for processing.
   - The response from the chat engine is streamed back, and each part of the response is sent to the send channel as it is received.
   - If the response is empty, a default message is sent indicating that the system couldn't understand or answer the question.

4. **Streaming Chat Response**:
   - The chat engine's response is handled in a streaming fashion, meaning the response is processed and relayed as it's being generated, rather than waiting for the entire response to be complete.
   - This is likely for efficiency and to provide a more interactive user experience.

5. **Error Handling and Logging**:
   - The code includes checks to ensure that the send channel is not closed before sending messages.
   - It logs various events and errors, which is useful for debugging and monitoring the application.

In summary, this code is part of a chatbot or an interactive system that handles user messages. It sets up a callback handler to process events during a chat, uses a chat engine to generate responses, and streams these responses back to the user. The system is designed to be asynchronous for efficiency and is equipped with error handling and logging for robust operation.
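
The streaming behaviour, including the fallback for an empty response, can be sketched with `asyncio`. This is an illustration under assumptions: an `asyncio.Queue` stands in for the send channel, `fake_token_stream` replaces the chat engine, and the fallback text is invented for the example.

```python
import asyncio

FALLBACK = "Sorry, I could not answer that question."  # illustrative wording


async def fake_token_stream(message):
    """Hypothetical stand-in for the chat engine's streamed response."""
    for token in message.split():
        await asyncio.sleep(0)
        yield token


async def handle_chat_message(user_message, send_queue):
    sent_any = False
    async for token in fake_token_stream(user_message):
        await send_queue.put(token)  # relay each part as soon as it arrives
        sent_any = True
    if not sent_any:
        await send_queue.put(FALLBACK)  # empty response: send a default
    await send_queue.put(None)  # sentinel marking the end of the stream


async def collect(user_message):
    """Run the producer and gather everything the consumer receives."""
    queue = asyncio.Queue()
    producer = asyncio.create_task(handle_chat_message(user_message, queue))
    received = []
    while (item := await queue.get()) is not None:
        received.append(item)
    await producer
    return received
```

The consumer sees tokens one at a time instead of waiting for the whole answer, which is what lets a frontend render the response while it is still being generated.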

## backend/app/chat/qa_response_synth.py

#### General Summary

This code defines a function named `get_custom_response_synth`, which creates a custom response synthesizer for a chatbot dealing with questions about SEC (U.S. Securities and Exchange Commission) filing documents. Here's a simplified explanation:

1. **Import Statements**: The code imports necessary Python classes and functions, including ones related to response synthesis, service context, prompts, and document handling.

2. **Function Definition**:
   - The function `get_custom_response_synth` takes two parameters: `service_context` (which likely provides context and settings for the service) and `documents` (a list of `DocumentSchema` objects, each representing an SEC filing document).

3. **Building Document Titles**:
   - The function constructs a string (`doc_titles`) that contains a list of the titles of the provided SEC documents. This is done by joining the titles with newline characters, formatting each title with a hyphen prefix.

4. **Refine Prompt Template**:
   - A template string `refine_template_str` is created for a type of prompt called "Refine Prompt". This template is structured to provide information about the selected SEC documents and an existing answer to a query. It then asks to refine the existing answer with additional context or to return the original answer if the context isn’t useful.

5. **Question-Answer Prompt Template**:
   - Similarly, a template string `qa_template_str` is created for a "Question-Answer Prompt". This template provides the titles of the SEC documents and some context information, then asks to answer a query based on this information.

6. **Creating Prompts**:
   - Two prompt objects (`refine_prompt` and `qa_prompt`) are created using the respective template strings. These prompts are configured with their types (`REFINE` and `QUESTION_ANSWER`).

7. **Creating and Returning the Response Synthesizer**:
   - The function calls `get_response_synthesizer`, passing the `service_context`, the two prompts, and a flag for structured answer filtering (set to False, which might be specific to certain versions of GPT, like 3.5).
   - It returns the response synthesizer configured with these prompts. This synthesizer is used to generate responses to user queries in the chatbot.

In summary, `get_custom_response_synth` sets up a custom response synthesizer for a system handling queries about SEC documents. It uses two types of prompts - one to refine existing answers and another to generate new answers from scratch based on provided document context.
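
The two prompt templates can be sketched with plain string formatting. The template wording below is illustrative and not copied from the repository, and `render_qa` and `render_refine` are hypothetical helpers; only the title formatting (one hyphen-prefixed title per line) follows the description above.

```python
def build_doc_titles(titles):
    """Join titles with newlines, each prefixed with a hyphen."""
    return "\n".join("- " + title for title in titles)


# Illustrative wording only.
QA_TEMPLATE = (
    "The user has selected these SEC documents:\n{doc_titles}\n"
    "Context:\n{context_str}\n"
    "Answer the query using the context above.\n"
    "Query: {query_str}\nAnswer: "
)

REFINE_TEMPLATE = (
    "The user has selected these SEC documents:\n{doc_titles}\n"
    "Query: {query_str}\n"
    "Existing answer: {existing_answer}\n"
    "New context:\n{context_msg}\n"
    "Refine the existing answer with the new context. "
    "If the context is not useful, return the existing answer.\n"
    "Refined answer: "
)


def render_qa(titles, context_str, query_str):
    return QA_TEMPLATE.format(
        doc_titles=build_doc_titles(titles),
        context_str=context_str,
        query_str=query_str,
    )


def render_refine(titles, context_msg, query_str, existing_answer):
    return REFINE_TEMPLATE.format(
        doc_titles=build_doc_titles(titles),
        context_msg=context_msg,
        query_str=query_str,
        existing_answer=existing_answer,
    )
```

The question-answer prompt produces a first answer from one batch of context, and the refine prompt is then applied to each further batch, carrying the previous answer forward.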

## backend/app/chat/constants.py

#### General Summary

This code snippet is part of a larger application, an AI-driven chatbot or virtual assistant, designed to answer financial questions using specific guidelines and tools. Here's a breakdown of the components and their purposes:

1. **Constant Definition (`DB_DOC_ID_KEY`)**:
   - `DB_DOC_ID_KEY = "db_document_id"`: This line defines a constant `DB_DOC_ID_KEY` with the value `"db_document_id"`. This constant is used as a key or identifier for document IDs in a database.

2. **System Message Template (`SYSTEM_MESSAGE`)**:
   - The `SYSTEM_MESSAGE` variable holds a multiline string that serves as a template for the system message presented to the AI agent. This message defines the role and guidelines for the agent:
     - The AI is framed as an "expert financial analyst" who must use specific tools to answer financial questions.
     - It emphasizes the necessity of using these tools to find answers or relevant information, even if it seems the tools might not have a direct answer.
     - The AI should assume that user queries are related to the financial documents (SEC documents) they have selected.
     - For non-financial questions, the AI is instructed to respectfully decline to respond and prompt the user to ask a relevant question.
     - In cases where the tools don't find a direct answer, the AI should communicate this and still provide any useful insights obtained.
   - The message includes placeholders (`{doc_titles}` and `{curr_date}`) that will be replaced with actual document titles and the current date in the context of the conversation.

3. **Node Parser Configuration (`NODE_PARSER_CHUNK_SIZE` and `NODE_PARSER_CHUNK_OVERLAP`)**:
   - These constants define parameters for a node parser, which is likely a component of the application used for processing text:
     - `NODE_PARSER_CHUNK_SIZE = 512`: This sets the size of the text chunks that the node parser produces. LlamaIndex measures chunk size in tokens, and 512 tokens is a common choice for balancing granularity and performance.
     - `NODE_PARSER_CHUNK_OVERLAP = 10`: This specifies how many tokens overlap between consecutive chunks. An overlap helps ensure that context is not lost at chunk boundaries.

In summary, this code provides configuration and instructions for an AI system designed to respond to financial queries. It sets up guidelines for how the AI should handle questions, emphasizing the use of specific tools and databases (especially for financial analysis related to SEC documents), and configures the text processing parameters for the node parser. The system is designed to be responsive and helpful, while also acknowledging its limitations and guiding users to ask appropriate questions.
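
To show how chunk size and chunk overlap interact, here is a simplified sliding-window splitter. It is a sketch only: the real node parser is sentence-aware, whereas the hypothetical `chunk_tokens` below works on any pre-split list of units.

```python
NODE_PARSER_CHUNK_SIZE = 512
NODE_PARSER_CHUNK_OVERLAP = 10


def chunk_tokens(tokens, chunk_size=NODE_PARSER_CHUNK_SIZE,
                 chunk_overlap=NODE_PARSER_CHUNK_OVERLAP):
    """Split tokens into windows of chunk_size that share chunk_overlap items."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this window already reached the end of the input
    return chunks
```

With the default values each window starts 502 units after the previous one, so the last 10 units of a chunk reappear at the start of the next.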

## Other interesting files

## backend/app/api/crud.py

#### General Summary

This code is part of an asynchronous web application, built with Python and using SQLAlchemy for database operations. It defines several functions to interact with a database for managing conversations, messages, and documents, specifically in the context of an application like a chatbot. Here's a breakdown of the functions in simple terms:

1. **Import Statements**:
   - The code imports necessary Python modules and classes, including those for typing, SQLAlchemy's ORM (Object-Relational Mapping), and the application's specific database and schema models.

2. **`fetch_conversation_with_messages` Function**:
   - This function fetches a conversation and its associated messages (including sub-processes related to each message) from the database.
   - It takes an `AsyncSession` (for asynchronous database operations) and a `conversation_id`.
   - If the conversation exists, it returns a structured conversation object; otherwise, it returns `None`.

3. **`create_conversation` Function**:
   - This function creates a new conversation in the database.
   - It takes the conversation payload (data) and inserts it into the database, along with any related documents.
   - After committing the new conversation to the database, it fetches and returns the full conversation with its messages.

4. **`delete_conversation` Function**:
   - This function deletes a conversation from the database based on its ID.
   - It returns `True` if the deletion was successful (i.e., if any rows were affected), otherwise `False`.

5. **`fetch_message_with_sub_processes` Function**:
   - This function fetches a specific message and its sub-processes (related operations or actions) based on the message's ID.
   - If the message exists, it returns a structured message object; otherwise, it returns `None`.

6. **`fetch_documents` Function**:
   - This function fetches documents from the database based on different criteria like individual ID, a list of IDs, or URL.
   - It can also limit the number of documents returned.
   - The function returns a list of document objects or `None` if no documents are found.

7. **`upsert_document_by_url` Function**:
   - This function performs an "upsert" operation for a document - it inserts a new document or updates it if it already exists, based on the document's URL.
   - It returns the upserted document object.

Each of these functions is asynchronous (`async`), so database calls do not block the event loop while the web application waits on I/O. Together they cover the conversations, messages, and documents that the chat application persists.
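The upsert-by-URL and delete-returns-boolean behaviors described above can be sketched in a few lines. This is a minimal illustration, not the repo's code: it assumes the standard-library `sqlite3` module instead of SQLAlchemy with Postgres, it is synchronous for brevity, and the table layout and the `title` column are hypothetical.

```python
import sqlite3


def connect() -> sqlite3.Connection:
    """Create an in-memory database with hypothetical tables for the sketch."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE document ("
        "id INTEGER PRIMARY KEY, url TEXT UNIQUE NOT NULL, title TEXT)"
    )
    conn.execute("CREATE TABLE conversation (id INTEGER PRIMARY KEY)")
    return conn


def upsert_document_by_url(conn: sqlite3.Connection, url: str, title: str) -> dict:
    """Insert a document, or update the existing row that has the same URL."""
    conn.execute(
        "INSERT INTO document (url, title) VALUES (?, ?) "
        "ON CONFLICT(url) DO UPDATE SET title = excluded.title",
        (url, title),
    )
    conn.commit()
    # Read the row back so the caller gets the stored state, including its id.
    row = conn.execute(
        "SELECT id, url, title FROM document WHERE url = ?", (url,)
    ).fetchone()
    return {"id": row[0], "url": row[1], "title": row[2]}


def delete_conversation(conn: sqlite3.Connection, conversation_id: int) -> bool:
    """Delete a conversation; True only if a row was actually removed."""
    cursor = conn.execute(
        "DELETE FROM conversation WHERE id = ?", (conversation_id,)
    )
    conn.commit()
    return cursor.rowcount > 0
```

Calling `upsert_document_by_url` twice with the same URL leaves a single row whose `title` reflects the second call. The `ON CONFLICT` clause needs SQLite 3.24 or newer.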

## backend/app/main.py

#### General Summary

This code is the entry point of the FastAPI web application. It handles setup, configuration, and launch. The application uses SQLAlchemy for database operations and integrates with Sentry for error tracking. Here's a breakdown of its main components:

1. **Import Statements**: The code imports necessary modules for web application development, including FastAPI, SQLAlchemy, Alembic (for database migrations), and Uvicorn (an ASGI server for running the app).

2. **Logging and Sentry Setup**:
   - The `__setup_logging` function configures logging for the application, setting the log level and format.
   - The `__setup_sentry` function initializes Sentry for error tracking, with configurations varying based on the environment (e.g., production vs. other environments).

3. **Database and Vector Store Initialization**:
   - The `lifespan` async context manager ensures certain operations occur when the FastAPI application starts and stops. This includes checking and establishing a database connection, ensuring the database schema is up to date, initializing a vector store (for storing and managing vector data), and setting up a sentence tokenizer.

4. **FastAPI Application Setup**:
   - An instance of `FastAPI` is created with configurations like the project name and OpenAPI URL.
   - CORS (Cross-Origin Resource Sharing) middleware is added to the app to allow requests from different origins, which is essential for a web application that interacts with various frontends or services.

5. **Router and Middleware Configurations**:
   - The application includes various routers (like `api_router` and `loader_io_router`) for handling different URL endpoints.
   - The CORS settings are fine-tuned, including a special condition for GitHub Codespaces environments.

6. **Application Start Function**:
   - The `start` function is the entry point for running the application. It sets up logging, initializes Sentry, and optionally runs database migrations depending on the deployment environment.
   - The application is then launched using Uvicorn, an ASGI server, with configurations for host, port, and worker count. The `reload` option is set based on the environment, enabling live reload in development.

7. **Database Migration and App Environment Handling**:
   - In certain deployment environments (like Render.com), the app runs database migrations at startup. This ensures the database schema is always up to date.
   - The application environment (e.g., production, local) determines certain behaviors like logging level, Sentry setup, and whether to perform database migrations.

In summary, this code sets up and launches a FastAPI web application with configurations for logging, error tracking, database connection, and CORS. It includes a lifespan context manager for initialization and cleanup tasks and is structured to handle different deployment environments and configurations.
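The lifespan pattern mentioned above can be sketched with the standard library alone. FastAPI accepts an async context manager of this shape through its `lifespan` parameter; everything else here (the `events` list, `check_db_connection`, `init_vector_store`, and the cleanup step) is a hypothetical stand-in rather than the repo's actual code.

```python
import asyncio
from contextlib import asynccontextmanager
from typing import AsyncIterator, List

events: List[str] = []  # records the order of steps, for illustration only


async def check_db_connection() -> None:
    """Hypothetical stand-in for verifying the database is reachable."""
    events.append("db_connected")


async def init_vector_store() -> str:
    """Hypothetical stand-in for creating the vector store client."""
    events.append("vector_store_ready")
    return "vector-store"


@asynccontextmanager
async def lifespan(app: object) -> AsyncIterator[None]:
    # Everything before `yield` runs once at startup.
    await check_db_connection()
    store = await init_vector_store()
    try:
        yield  # the application serves requests while suspended here
    finally:
        # Everything after `yield` runs once at shutdown.
        events.append(f"closed {store}")


async def main() -> List[str]:
    # Simulate the server entering and leaving the lifespan context.
    async with lifespan(app=None):
        events.append("serving")
    return events


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Putting the cleanup in a `finally` block means it still runs if the code inside the context raises.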

## backend/tests/app/chat/test_engine.py

#### General Summary

This code is for testing a chat application, specifically focusing on how the application handles and processes chat messages. It includes several components and functions to assist with these tests:

1. **Imports and `MockMessage` Class**:
   - The code imports necessary modules and classes, including those for typing, UUIDs, dates, and chat-related models.
   - `MockMessage` is a mock class extending the `Message` schema, simulating a message in the chat application. It includes a conversation ID, message content, and associated sub-processes.

2. **`chat_tuples_to_chat_messages` Function**:
   - This function converts a list of tuples representing chat messages into a list of `ChatMessage` objects. Each tuple contains a user message and an assistant message.
   - The function iterates through each tuple, creating `ChatMessage` objects for user and assistant messages if they exist.

3. **`TestGetChatHistory` Class**:
   - This class contains multiple test methods to validate the functionality of `get_chat_history`, a function that retrieves chat history from messages.
   - Each test method simulates different scenarios to ensure `get_chat_history` behaves as expected:
     - `test_get_chat_history_happy_path`: Tests the standard case where user and assistant messages alternate.
     - `test_get_chat_history_multiple_consecutive_messages_from_same_role`: Tests how consecutive messages from the same role (user or assistant) are handled.
     - `test_get_chat_history_empty_input`: Checks the function's behavior with an empty message list.
     - `test_get_chat_history_error_status`: Tests handling of messages marked with an error status.
     - `test_get_chat_history_error_status_assistant_message`: Similar to the previous test, but checks for an error status in assistant messages.
     - `test_get_chat_history_strip_content`: Verifies how the function handles messages with only whitespace content.
     - `test_get_chat_history_unpaired_user_message`: Tests how the function handles unpaired user messages (i.e., user messages with no corresponding assistant message).

In each test, `MockMessage` objects are created to simulate a conversation's messages. These messages are passed to `get_chat_history`, and the output is compared against an expected result generated by `chat_tuples_to_chat_messages`. Together, the tests check that `get_chat_history` builds a correctly structured chat history in each of the scenarios listed above.
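The test pattern described above (build mock messages, run them through `get_chat_history`, compare with a list built from tuples) can be sketched as follows. The dataclasses are simplified stand-ins for the repo's schema and the LlamaIndex `ChatMessage`, and this `get_chat_history` is a hypothetical reduced version that only drops errored and whitespace-only messages; the real function's rules may differ.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class ChatMessage:
    """Simplified stand-in for the chat message type used by the engine."""
    role: str
    content: str


@dataclass
class MockMessage:
    """Simplified stand-in for a stored message with a status field."""
    role: str
    content: str
    status: str = "SUCCESS"


def chat_tuples_to_chat_messages(
    chat_tuples: List[Tuple[Optional[str], Optional[str]]]
) -> List[ChatMessage]:
    """Expand (user, assistant) pairs into a flat list, skipping empty sides."""
    messages: List[ChatMessage] = []
    for user_text, assistant_text in chat_tuples:
        if user_text:
            messages.append(ChatMessage("user", user_text))
        if assistant_text:
            messages.append(ChatMessage("assistant", assistant_text))
    return messages


def get_chat_history(messages: List[MockMessage]) -> List[ChatMessage]:
    """Hypothetical simplified version: skip errored or blank messages."""
    history: List[ChatMessage] = []
    for message in messages:
        text = message.content.strip()
        if message.status == "ERROR" or not text:
            continue
        history.append(ChatMessage(message.role, text))
    return history


def test_happy_path() -> None:
    messages = [MockMessage("user", "Hi"), MockMessage("assistant", "Hello")]
    assert get_chat_history(messages) == chat_tuples_to_chat_messages(
        [("Hi", "Hello")]
    )


def test_error_and_whitespace_are_dropped() -> None:
    messages = [
        MockMessage("user", "  Revenue?  "),
        MockMessage("assistant", "oops", status="ERROR"),
        MockMessage("user", "   "),
    ]
    assert get_chat_history(messages) == chat_tuples_to_chat_messages(
        [("Revenue?", None)]
    )
```

Building the expected value from tuples keeps each test short and makes the intended conversation shape easy to read.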