Commit

feat(api)(frontend): source documents from s3 buckets (#41)
This PR adds support for ingesting documents from an S3 bucket.

Closes #22
mawandm committed Apr 23, 2024
1 parent 1f2c0ab commit 9407b88
Showing 19 changed files with 1,003 additions and 273 deletions.
16 changes: 16 additions & 0 deletions compose-dev.yml
@@ -80,9 +80,25 @@ services:
      - 18000:8000
    networks:
      - ametnes

  localstack:
    image: localstack/localstack:latest
    ports:
      - "4566:4566"
    environment:
      - SERVICES=s3
      - DEBUG=1
      - START_WEB=0
      - LAMBDA_REMOTE_DOCKER=0
      - DATA_DIR=/localstack/data
    volumes:
      - '/var/run/docker.sock:/var/run/docker.sock'
      - local_stack:/localstack/data

volumes:
  minio_data:
  samba_data2:
  postgres16_data:
  qdrant_data:
  chroma-data:
  local_stack:
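
The `localstack` service above exposes an S3-compatible endpoint on port 4566. As a rough sketch of how a development script might point an S3 client at it, the helper below builds the client arguments. The function names, the dummy `test` credentials, and the `us-east-1` region are assumptions for illustration and are not taken from this commit; `boto3` is assumed to be installed only if `make_s3_client` is actually called.

```python
from typing import Any, Dict


def localstack_s3_kwargs(host: str = "localhost", port: int = 4566) -> Dict[str, Any]:
    """Build keyword arguments for an S3 client pointed at a local LocalStack.

    The credentials and region are placeholder values (an assumption);
    they are not defined anywhere in this commit.
    """
    return {
        "service_name": "s3",
        "endpoint_url": f"http://{host}:{port}",
        "aws_access_key_id": "test",
        "aws_secret_access_key": "test",
        "region_name": "us-east-1",
    }


def make_s3_client(**overrides: Any):
    """Create a boto3 S3 client for LocalStack (hypothetical helper)."""
    # Imported lazily so localstack_s3_kwargs() works without boto3 installed.
    import boto3

    kwargs = {**localstack_s3_kwargs(), **overrides}
    return boto3.client(**kwargs)
```

With the dev compose stack running, `make_s3_client().list_buckets()` would be one way to confirm the endpoint is reachable.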
24 changes: 12 additions & 12 deletions compose.yml
@@ -68,18 +68,18 @@ services:
      - "11211:11211"
    networks:
      - nesis
  # samba:
  #   image: andyzhangx/samba:win-fix
  #   command: ["-u", "username;password", "-s", "share;/smbshare/;yes;no;no;all;none", "-p"]
  #   ports:
  #     - '2445:445'
  #   networks:
  #     - nesis
  #   volumes:
  #     - 'samba_data2:/smbshare'
  #   environment:
  #     - USERNAME=username
  #     - PASSWORD=password
  samba:
    image: andyzhangx/samba:win-fix
    command: ["-u", "username;password", "-s", "share;/smbshare/;yes;no;no;all;none", "-p"]
    ports:
      - '2445:445'
    networks:
      - nesis
    volumes:
      - 'samba_data2:/smbshare'
    environment:
      - USERNAME=username
      - PASSWORD=password
  minio:
    image: docker.io/bitnami/minio:2022
    ports:
4 changes: 3 additions & 1 deletion docs/mkdocs.yml
@@ -1,4 +1,4 @@
site_name: Nesis - AI Powered Enterprise Knowledge Partner
site_name: Nesis - Your AI Powered Enterprise Knowledge Partner
site_description: Your AI Powered Enterprise Knowledge Partner
site_url: https://ametnes.github.io/nesis/
repo_url: https://github.com/ametnes/nesis/
@@ -24,13 +24,15 @@ theme:
    - content.code.copy

markdown_extensions:
  - attr_list
  - admonition
  - pymdownx.details
  - pymdownx.superfences

nav:
  - Home: 'index.md'
  - 'Quick Start': 'quick-start.md'
  - 'Deployment': 'deployment.md'
  - 'Development Guide':
    - 'Local Development': 'dev-guide/local.md'
    - 'Architecture': 'dev-guide/architecture.md'
9 changes: 9 additions & 0 deletions docs/src/deployment.md
@@ -0,0 +1,9 @@
# Deploying Nesis
Nesis has been built around cloud native container deployment.
You have multiple deployment options for Nesis however they all

## Docker Compose

## Helm

## Ametnes Platform
20 changes: 15 additions & 5 deletions docs/src/dev-guide/local.md
@@ -5,26 +5,34 @@ get an overview of the components that make up Nesis and its architecture [here]

## Prerequisites
1. We use docker and docker-compose to support our development process. If you don't have docker installed
locally, please follow the [Install Docker Engine](https://docs.docker.com/engine/install/) link for instructions on how to
locally, please follow the [Install Docker Engine](https://docs.docker.com/engine/install/){target="_blank"} link for instructions on how to
install docker on your local workstation.
2. If you would rather not install docker, you will need to have access to a Postgres and Memcached instance.
3. _Optional:_ The RAG Engine needs access to an LLM endpoint, such as an OpenAI endpoint or a private LLM endpoint,
in order to start querying your documents. You will need to set the `OPENAI_API_KEY` and `OPENAI_API_BASE` environment variables.
4. You need to have Python 3.11 for the API and RAG Engine microservices.
5. You also need to have node and npm installed.
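
The optional LLM prerequisite above depends on two environment variables. A minimal sketch of a pre-flight check is shown below; the helper name `missing_llm_vars` is hypothetical and is not part of Nesis.

```python
import os
from typing import List, Mapping

# Variable names come from the prerequisites list above.
REQUIRED_LLM_VARS = ("OPENAI_API_KEY", "OPENAI_API_BASE")


def missing_llm_vars(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the LLM-related variables that are unset or empty in env."""
    return [name for name in REQUIRED_LLM_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_llm_vars()
    if missing:
        print("Missing LLM settings: " + ", ".join(missing))
    else:
        print("LLM endpoint settings found.")
```

Running the script before starting the RAG Engine gives a quick indication of whether querying documents is likely to work.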

## Quick Start
### Using Docker

For a quick start,
!!! note "A word on vector databases"

    Nesis' RAG Engine requires a vector database to store vector embeddings. To limit the number of
    dependent services, we use pgvector packaged into an extended Bitnami Postgres docker image, `ametnes/postgresql:16-debian-12`,
    available [here](https://github.com/ametnes/postgresql){target="_blank"}. You are, however, free to use other vector databases.
    Currently, we support `chromadb` and `qdrant`.


## Quick Start

#### Checkout the repository.
Start by checking out the repository.

```bash
git clone https://github.com/ametnes/nesis.git
cd nesis
```

### Using Docker

#### Build all the docker images locally.

```bash
@@ -52,7 +60,9 @@ docker-compose up
### Using your IDE

#### Start supporting services

Supporting services include

1. Postgres (for the backend database as well as the vector database).
2. Memcached for caching and locking services.
3. _Optional_ Minio for document storage.
@@ -1,4 +1,4 @@
"""migrations-0.0.3.1
"""migrations-0.0.3-rc1
Revision ID: 64a6f5fbfc82
Revises: 953b4d309aaa
33 changes: 33 additions & 0 deletions nesis/api/alembic/versions/fbf94c515e04_migrations_0_0_3_rc2.py
@@ -0,0 +1,33 @@
"""migrations-0.0.3-rc2
Revision ID: fbf94c515e04
Revises: 64a6f5fbfc82
Create Date: 2024-04-22 19:47:19.139869
"""

from typing import Sequence, Union

from alembic import op
import sqlalchemy as sa


# revision identifiers, used by Alembic.
revision: str = "fbf94c515e04"
down_revision: Union[str, None] = "64a6f5fbfc82"
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None


def upgrade() -> None:
    # ### commands auto generated by Alembic - please adjust! ###
    # Add the 'S3' value to the datasource_type enum
    op.execute("ALTER TYPE datasource_type ADD VALUE 'S3';")
    # ### end Alembic commands ###


def downgrade() -> None:
    # ### commands auto generated by Alembic - please adjust! ###
    # Downgrading the datasource_type enum is intentionally skipped because existing records might already depend on it
    pass
    # ### end Alembic commands ###
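
The migration above runs a plain `ALTER TYPE ... ADD VALUE`, which fails if the value already exists. As a sketch of one possible variation (not what this commit does), the helper below builds the PostgreSQL `ADD VALUE IF NOT EXISTS` form so that a re-run would be a no-op. The name `add_enum_value_sql` is hypothetical.

```python
import re

# Conservative pattern for an unquoted SQL identifier (illustrative assumption).
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def add_enum_value_sql(type_name: str, value: str) -> str:
    """Build an idempotent PostgreSQL statement that adds a value to an enum type."""
    if not _IDENTIFIER.fullmatch(type_name):
        raise ValueError(f"unsafe type name: {type_name!r}")
    # Escape single quotes so the value is a valid SQL string literal.
    escaped = value.replace("'", "''")
    return f"ALTER TYPE {type_name} ADD VALUE IF NOT EXISTS '{escaped}';"


if __name__ == "__main__":
    print(add_enum_value_sql("datasource_type", "S3"))
```

Inside `upgrade()`, the result could be passed to `op.execute(...)`. Whether that is preferable depends on the PostgreSQL version in use and on how the migration is wrapped in transactions.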