inspired-consulting/aishe.ai

aishe core

Setup/Deployment

Docker Compose

  1. Copy .env.example to .env and adjust the values
  2. Create an ngrok domain
  3. Set up ngrok agent authentication
  4. Set up Google access to the LLM and add the keys to .env
  5. Configure LangSmith in .env
  6. Start everything with Docker Compose, with code hot reload: docker compose --env-file .env -p aishe_ai up
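
Before starting the stack, it can help to check that every key from .env.example actually has a value in .env. The following is a minimal sketch of such a check, not part of the repository: the helper names parse_env and unfinished_keys are hypothetical, and it assumes plain KEY=VALUE lines without quoting or multi-line values.

```python
from pathlib import Path


def parse_env(path):
    """Parse simple KEY=VALUE lines; comments and blank lines are skipped."""
    values = {}
    for raw in Path(path).read_text().splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def unfinished_keys(example_path=".env.example", env_path=".env"):
    """Return keys listed in the example file that are missing or empty in .env."""
    example = parse_env(example_path)
    actual = parse_env(env_path)
    return sorted(key for key in example if not actual.get(key))
```

Calling unfinished_keys() from the project root returns an empty list once every key has been filled in.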

Conventional

  1. Copy .env.example to .env and adjust the values
  2. Install tesseract-ocr for your system, e.g. with apt
  3. Install the Python dependencies: pip3 install -r requirements.txt, or upgrade existing ones: pip install -r requirements.txt --upgrade
  4. Install Chromium: pip install -q playwright beautifulsoup4, then playwright install
  5. Create an ngrok domain
  6. Install ngrok
  7. Set up ngrok agent authentication
  8. Set up Google access to the LLM and add the keys to .env
  9. Configure LangSmith in .env
  10. Start FastAPI: uvicorn app:app --reload
  11. Start ngrok: ngrok http --domain=DOMAIN 8000. DOMAIN must be the same domain that was used when creating the bot

Issues

  • The browser does not start for web scraping, for example within the webpage_tool:
    • add args=["--disable-gpu"] to the browser launch parameters: browser = await p.chromium.launch(headless=True, args=["--disable-gpu"])
    • only observed on WSL2 systems
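
Since the problem was only seen on WSL2, the flag could be applied conditionally. The sketch below is an assumption about how that might look, not the code used in webpage_tool: is_wsl, chromium_launch_kwargs and fetch_title are hypothetical helpers, and the WSL check is a heuristic based on the kernel release string. Playwright is imported lazily so the helpers can be used without it being installed.

```python
import platform


def is_wsl() -> bool:
    """Heuristic: WSL kernels usually report 'microsoft' in the release string."""
    return "microsoft" in platform.uname().release.lower()


def chromium_launch_kwargs(disable_gpu: bool) -> dict:
    """Build keyword arguments for p.chromium.launch()."""
    kwargs = {"headless": True}
    if disable_gpu:
        # Workaround from the issue above
        kwargs["args"] = ["--disable-gpu"]
    return kwargs


async def fetch_title(url: str) -> str:
    """Open a page in headless Chromium and return its title."""
    # Imported here so the module loads even where Playwright is missing
    from playwright.async_api import async_playwright

    async with async_playwright() as p:
        browser = await p.chromium.launch(**chromium_launch_kwargs(is_wsl()))
        try:
            page = await browser.new_page()
            await page.goto(url)
            return await page.title()
        finally:
            await browser.close()
```

fetch_title can be run with asyncio.run(fetch_title(URL)) once playwright install has downloaded Chromium.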

Formatting

  • black FOLDER_NAME

Testing

tbd

Docker

Public image repo: docker run -d -p 80:80 --env-file .env aishe-ai

Data structures

Planned

This structure is planned for prompts about internal company data, which will be scraped regularly. When a user prompts the system, the following happens:

  1. get the member for the given email (search)
  2. get the memberships of that member (join)
  3. get the documents of those memberships (join)
  4. iterate over the accessible documents and add their embeddings to the vector space for a similarity search against the user's prompt
erDiagram
    organizations ||--|{ data_sources : belongs_to
    organizations ||--|{ members : belongs_to
    data_sources ||--|{ documents : belongs_to
    members ||--|| memberships : belongs_to
    data_sources ||--|| memberships : belongs_to
    documents ||--|| memberships : belongs_to
    organizations {
        uuid uuid PK
        name string
        description string
    }
    data_sources {
        uuid uuid PK
        name text
        description text
        bot_auth_data jsonb
        organization_uuid uuid FK
    }
    members {
        uuid uuid PK
        email text
        name text
        organization_uuid uuid FK
    }
    documents {
        uuid uuid PK
        data_source_uuid uuid FK
        name text
        description text
        url text
        metadata jsonb
        embeddings vector[]
        content text
    }
    memberships {
        uuid uuid PK
        data_source_role text
        data_source_uuid uuid FK
        namespace_user_name text
        member_uuid uuid FK
        document_uuid uuid FK
    }
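
The four lookup steps can be illustrated with a small in-memory sketch. It is only an assumption about how the planned flow might behave: the dataclasses mirror a few columns of the diagram above, plain Python lists stand in for database tables, and accessible_documents and rank_for_prompt are hypothetical names. A real implementation would presumably use SQL joins and a vector store instead.

```python
from dataclasses import dataclass
from math import sqrt
from typing import List


@dataclass
class Member:
    uuid: str
    email: str


@dataclass
class Membership:
    member_uuid: str
    document_uuid: str


@dataclass
class Document:
    uuid: str
    name: str
    embedding: List[float]


def cosine(a, b):
    """Cosine similarity of two equally sized vectors (0.0 for zero vectors)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def accessible_documents(email, members, memberships, documents):
    # Step 1: find the member by email
    member = next((m for m in members if m.email == email), None)
    if member is None:
        return []
    # Step 2: collect the memberships of that member
    doc_ids = {ms.document_uuid for ms in memberships if ms.member_uuid == member.uuid}
    # Step 3: resolve the documents those memberships point to
    return [d for d in documents if d.uuid in doc_ids]


def rank_for_prompt(prompt_embedding, docs, top_k=3):
    # Step 4: similarity search restricted to the accessible documents
    ranked = sorted(docs, key=lambda d: cosine(prompt_embedding, d.embedding), reverse=True)
    return ranked[:top_k]


# Made-up sample data for illustration only
members = [Member("m1", "a@example.com"), Member("m2", "b@example.com")]
documents = [
    Document("d1", "handbook", [1.0, 0.0]),
    Document("d2", "payroll", [0.0, 1.0]),
    Document("d3", "wiki", [0.7, 0.7]),
]
memberships = [Membership("m1", "d1"), Membership("m1", "d3"), Membership("m2", "d2")]

visible = accessible_documents("a@example.com", members, memberships, documents)
best = rank_for_prompt([0.0, 1.0], visible, top_k=1)
```

Filtering by membership before the similarity search means documents the member cannot access never enter the ranking.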
langchain pgvector

erDiagram
    langchain_pg_collection ||--o{ langchain_pg_embedding : belongs_to
    langchain_pg_collection {
        uuid uuid PK
        name varchar()
        cmetadata json
    }
    langchain_pg_embedding {
        uuid uuid PK
        embedding vector
        document varchar()
        cmetadata json
        custom_id varchar()
        collection_id uuid FK
    }
About

LLM Assistant with Chat Integration
