Dockers for llm-app pipelines (#6711)
Co-authored-by: Olivier Ruas <olivier@pathway.com>
GitOrigin-RevId: 0ed6ad276de863574c6837860a8374a2c4263189
2 people authored and Manul from Pathway committed Jun 19, 2024
1 parent e88b3dd commit fac310f
Showing 45 changed files with 580 additions and 145 deletions.
6 changes: 6 additions & 0 deletions examples/pipelines/alert/Dockerfile
@@ -0,0 +1,6 @@
FROM pathwaycom/pathway:latest
WORKDIR /app
COPY . .
EXPOSE 8080

CMD ["python", "app.py"]
44 changes: 29 additions & 15 deletions examples/pipelines/alert/README.md
@@ -7,7 +7,7 @@
</a>
</p>

# Alert Pipeline
# Real-time alerting based on local documents: End-to-end template

This example implements a pipeline that answers questions based on documents in a given folder. Additionally, you can ask in your prompts to be notified of any changes; in that case, an alert will be sent to a Slack channel.

@@ -36,36 +36,54 @@ For this demo, Slack notifications are optional and notifications will be printed to the terminal.
Your Slack application will need at least `chat:write.public` scope enabled.

### Setup environment:
Set your env variables in the .env file placed in this directory or in the root of the repo.
Set your env variables in the .env file placed in this directory.

```bash
OPENAI_API_KEY=sk-...
SLACK_ALERT_CHANNEL_ID= # If unset, alerts will be printed to the terminal
SLACK_ALERT_CHANNEL_ID=
SLACK_ALERT_TOKEN=
PATHWAY_DATA_DIR= # If unset, defaults to ./data/live/
PATHWAY_DATA_DIR= # If unset, defaults to ./data/live/. If you change this variable when running with Docker, you may also need to update the volume mount.
PATHWAY_PERSISTENT_STORAGE= # Set this variable if you want to use caching
```
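The defaults described in the comments above can be sketched as follows. This is a minimal illustration only; the actual resolution logic lives in `app.py` and may differ, and all names other than the environment variables are hypothetical:

```python
import os

# Sketch: resolve the pipeline configuration from the environment,
# mirroring the defaults documented in the .env example above.
def resolve_config(env=os.environ):
    return {
        # Folder whose documents are indexed by the pipeline.
        "data_dir": env.get("PATHWAY_DATA_DIR", "./data/live/"),
        # Caching is enabled only when a persistent storage path is set.
        "cache_dir": env.get("PATHWAY_PERSISTENT_STORAGE") or None,
        # If no Slack channel is configured, alerts go to the terminal.
        "slack_channel": env.get("SLACK_ALERT_CHANNEL_ID") or None,
    }
```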

### Run the project
### Run with Docker

Make sure you have installed poetry dependencies with `--extras unstructured`.
To run the alert pipeline and a simple UI together, execute:

```bash
poetry install --with examples --extras unstructured
docker compose up --build
```

Run:
The UI will then be available at http://0.0.0.0:8501 by default; open this URL in your web browser to access it.

The `docker-compose.yml` file declares a [volume bind mount](https://docs.docker.com/reference/cli/docker/container/run/#volume) that makes changes to files under `data/` on your host machine visible inside the Docker container. The files in `data/live` are indexed by the pipeline: you can add new files there and they will be reflected in the results.
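The bind mount means that adding a document on the host is enough to feed the pipeline; a sketch (the helper name is illustrative, not part of the project):

```python
from pathlib import Path

# Illustrative only: drop a new document into the live folder on the host.
# Because docker-compose bind-mounts ./data into the container, the
# pipeline sees the file and indexes it automatically.
def add_live_document(name: str, text: str, root: str = "./data/live") -> Path:
    live_dir = Path(root)
    live_dir.mkdir(parents=True, exist_ok=True)
    path = live_dir / name
    path.write_text(text, encoding="utf-8")
    return path
```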

### Run manually

Alternatively, you can run each service separately.

Make sure you have installed poetry dependencies.
```bash
poetry run python app.py
poetry install --with examples
```

If all dependencies are managed manually rather than using poetry, you can run:
Then run:
```bash
poetry run python app.py
```

If all dependencies are managed manually rather than using poetry, you can alternatively use:
```bash
python app.py
```

To start the Streamlit UI, run:
```bash
streamlit run ui/server.py --server.port 8501 --server.address 0.0.0.0
```

### Querying the pipeline

To create alerts, you can call the REST API:

```bash
@@ -75,8 +93,4 @@ curl --data '{
}' http://localhost:8080/ | jq
```
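The same request can be issued from Python with only the standard library. Note that the curl body is collapsed in this diff view, so the payload fields (`user`, `query`) are assumptions; check the full README for the exact schema:

```python
import json
import urllib.request

# Sketch: query the pipeline's REST endpoint from Python.
# The payload fields below are assumptions based on typical usage;
# the exact schema is in the collapsed curl example.
def build_request(query: str, user: str = "user",
                  url: str = "http://localhost:8080/") -> urllib.request.Request:
    payload = json.dumps({"user": user, "query": query}).encode("utf-8")
    return urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )

def ask(query: str) -> str:
    # Requires the pipeline to be running locally on port 8080.
    with urllib.request.urlopen(build_request(query)) as resp:
        return resp.read().decode("utf-8")
```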

or use the Streamlit UI. Run:
```bash
streamlit run ui/server.py --server.port 8501 --server.address 0.0.0.0
```
and then you can access the UI at `0.0.0.0:8501`.
or access the Streamlit UI at `0.0.0.0:8501`.
21 changes: 21 additions & 0 deletions examples/pipelines/alert/docker-compose.yml
@@ -0,0 +1,21 @@
version: "3.8"
services:
pathway:
build:
context: .
ports:
- "8080:8080"
environment:
OPENAI_API_KEY:
PATHWAY_PERSISTENT_STORAGE:
volumes:
- "./data:/app/data"
streamlit_ui:
depends_on:
- pathway
build:
context: ./ui
ports:
- "8501:8501"
environment:
PATHWAY_REST_CONNECTOR_HOST: "pathway"
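The `PATHWAY_REST_CONNECTOR_HOST` variable lets the UI container reach the pipeline by its compose service name (`pathway`) instead of localhost. A sketch of how the UI might build the endpoint URL from it; the fallback values and the port variable name are assumptions, not the project's exact code:

```python
import os

# Sketch: resolve the pipeline endpoint inside the UI container.
# Under docker compose the host is the service name "pathway";
# outside Docker it falls back to the local machine.
def pipeline_url(env=os.environ) -> str:
    host = env.get("PATHWAY_REST_CONNECTOR_HOST", "127.0.0.1")
    port = env.get("PATHWAY_REST_CONNECTOR_PORT", "8080")
    return f"http://{host}:{port}/"
```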
11 changes: 11 additions & 0 deletions examples/pipelines/alert/ui/Dockerfile
@@ -0,0 +1,11 @@
FROM python:3.11

WORKDIR /app

RUN pip install streamlit python-dotenv

COPY . .

EXPOSE 8501

CMD ["streamlit", "run", "server.py", "--server.port", "8501", "--server.address", "0.0.0.0"]
4 changes: 1 addition & 3 deletions examples/pipelines/alert/ui/server.py
@@ -49,6 +49,4 @@
st.markdown(response)
st.session_state.messages.append({"role": "assistant", "content": response})
else:
st.error(
f"Failed to send data to Discounts API. Status code: {response.status_code}"
)
st.error(f"Failed to send data. Status code: {response.status_code}")
6 changes: 6 additions & 0 deletions examples/pipelines/contextful/Dockerfile
@@ -0,0 +1,6 @@
FROM pathwaycom/pathway:latest
WORKDIR /app
COPY . .
EXPOSE 8080

CMD ["python", "app.py"]
42 changes: 29 additions & 13 deletions examples/pipelines/contextful/README.md
@@ -7,7 +7,7 @@
</a>
</p>

# Contextful Pipeline
# RAG pipeline with up-to-date knowledge: get answers based on documents in local folder

This example implements a simple pipeline that answers questions based on documents in a given folder.

@@ -19,32 +19,52 @@ and sent to the OpenAI chat service for processing.
## How to run the project

### Setup environment:
Set your env variables in the .env file placed in this directory or in the root of the repo.
Set your env variables in the .env file placed in this directory.

```bash
OPENAI_API_KEY=sk-...
PATHWAY_DATA_DIR= # If unset, defaults to ./data/
PATHWAY_DATA_DIR= # If unset, defaults to ./data/. If you change this variable when running with Docker, you may also need to update the volume mount.
PATHWAY_PERSISTENT_STORAGE= # Set this variable if you want to use caching
```

### Run the project
### Run with Docker

To run the pipeline and a simple UI together, execute:

```bash
poetry install --with examples
docker compose up --build
```

Run:
The UI will then be available at http://0.0.0.0:8501 by default; open this URL in your web browser to access it.

The `docker-compose.yml` file declares a [volume bind mount](https://docs.docker.com/reference/cli/docker/container/run/#volume) that makes changes to files under `data/` on your host machine visible inside the Docker container. The files in `data/` are indexed by the pipeline: you can add new files there and they will be reflected in the results.

### Run manually

Alternatively, you can run each service separately.

Make sure you have installed poetry dependencies.
```bash
poetry run python app.py
poetry install --with examples
```

If all dependencies are managed manually rather than using poetry, you can run either:
Then run:
```bash
poetry run python app.py
```

If all dependencies are managed manually rather than using poetry, you can alternatively use:
```bash
python app.py
```

To start the Streamlit UI, run:
```bash
streamlit run ui/server.py --server.port 8501 --server.address 0.0.0.0
```

### Querying the pipeline

To query the pipeline, you can call the REST API:

```bash
@@ -54,8 +74,4 @@ curl --data '{
}' http://localhost:8080/ | jq
```

or use the Streamlit UI. Run:
```bash
streamlit run ui/server.py --server.port 8501 --server.address 0.0.0.0
```
and then you can access the UI at `0.0.0.0:8501`.
or access the Streamlit UI at `0.0.0.0:8501`.
21 changes: 21 additions & 0 deletions examples/pipelines/contextful/docker-compose.yml
@@ -0,0 +1,21 @@
version: "3.8"
services:
pathway:
build:
context: .
ports:
- "8080:8080"
environment:
OPENAI_API_KEY:
PATHWAY_PERSISTENT_STORAGE:
volumes:
- "./data:/app/data"
streamlit_ui:
depends_on:
- pathway
build:
context: ./ui
ports:
- "8501:8501"
environment:
PATHWAY_REST_CONNECTOR_HOST: "pathway"
11 changes: 11 additions & 0 deletions examples/pipelines/contextful/ui/Dockerfile
@@ -0,0 +1,11 @@
FROM python:3.11

WORKDIR /app

RUN pip install streamlit python-dotenv

COPY . .

EXPOSE 8501

CMD ["streamlit", "run", "server.py", "--server.port", "8501", "--server.address", "0.0.0.0"]
4 changes: 1 addition & 3 deletions examples/pipelines/contextful/ui/server.py
@@ -49,6 +49,4 @@
st.markdown(response)
st.session_state.messages.append({"role": "assistant", "content": response})
else:
st.error(
f"Failed to send data to Discounts API. Status code: {response.status_code}"
)
st.error(f"Failed to send data. Status code: {response.status_code}")
6 changes: 6 additions & 0 deletions examples/pipelines/contextful_geometric/Dockerfile
@@ -0,0 +1,6 @@
FROM pathwaycom/pathway:latest
WORKDIR /app
COPY . .
EXPOSE 8080

CMD ["python", "app.py"]
43 changes: 30 additions & 13 deletions examples/pipelines/contextful_geometric/README.md
@@ -7,7 +7,7 @@
</a>
</p>

# Contextful Geometric Pipeline
# RAG pipeline with up-to-date knowledge: get answers based on increasing number of documents

This example implements a pipeline that answers questions based on documents in a given folder. To find an answer, it sends increasingly many documents to the LLM chat until it can answer the question. You can read more about the reasoning behind this approach [here](https://pathway.com/developers/showcases/adaptive-rag).

@@ -24,32 +24,52 @@ do so until either the question is answered or the iteration limit is reached.
## How to run the project

### Setup environment:
Set your env variables in the .env file placed in this directory or in the root of the repo.
Set your env variables in the .env file placed in this directory.

```bash
OPENAI_API_KEY=sk-...
PATHWAY_DATA_DIR= # If unset, defaults to ./data/
PATHWAY_DATA_DIR= # If unset, defaults to ./data/. If you change this variable when running with Docker, you may also need to update the volume mount.
PATHWAY_PERSISTENT_STORAGE= # Set this variable if you want to use caching
```

### Run the project
### Run with Docker

To run the pipeline and a simple UI together, execute:

```bash
poetry install --with examples
docker compose up --build
```

Run:
The UI will then be available at http://0.0.0.0:8501 by default; open this URL in your web browser to access it.

The `docker-compose.yml` file declares a [volume bind mount](https://docs.docker.com/reference/cli/docker/container/run/#volume) that makes changes to files under `data/` on your host machine visible inside the Docker container. The files in `data/` are indexed by the pipeline: you can add new files there and they will be reflected in the results.

### Run manually

Alternatively, you can run each service separately.

Make sure you have installed poetry dependencies with `--extras unstructured`.
```bash
poetry run python app.py
poetry install --with examples --extras unstructured
```

If all dependencies are managed manually rather than using poetry, you can run either:
Then run:
```bash
poetry run python app.py
```

If all dependencies are managed manually rather than using poetry, you can alternatively use:
```bash
python app.py
```

To start the Streamlit UI, run:
```bash
streamlit run ui/server.py --server.port 8501 --server.address 0.0.0.0
```

### Querying the pipeline

To query the pipeline, you can call the REST API:

```bash
@@ -59,8 +79,5 @@ curl --data '{
}' http://localhost:8080/ | jq
```

or use the Streamlit UI. Run:
```bash
streamlit run ui/server.py --server.port 8501 --server.address 0.0.0.0
```
and then you can access the UI at `0.0.0.0:8501`.
or access the Streamlit UI at `0.0.0.0:8501`.

21 changes: 21 additions & 0 deletions examples/pipelines/contextful_geometric/docker-compose.yml
@@ -0,0 +1,21 @@
version: "3.8"
services:
pathway:
build:
context: .
ports:
- "8080:8080"
environment:
OPENAI_API_KEY:
PATHWAY_PERSISTENT_STORAGE:
volumes:
- "./data:/app/data"
streamlit_ui:
depends_on:
- pathway
build:
context: ./ui
ports:
- "8501:8501"
environment:
PATHWAY_REST_CONNECTOR_HOST: "pathway"
11 changes: 11 additions & 0 deletions examples/pipelines/contextful_geometric/ui/Dockerfile
@@ -0,0 +1,11 @@
FROM python:3.11

WORKDIR /app

RUN pip install streamlit python-dotenv

COPY . .

EXPOSE 8501

CMD ["streamlit", "run", "server.py", "--server.port", "8501", "--server.address", "0.0.0.0"]
4 changes: 1 addition & 3 deletions examples/pipelines/contextful_geometric/ui/server.py
@@ -49,6 +49,4 @@
st.markdown(response)
st.session_state.messages.append({"role": "assistant", "content": response})
else:
st.error(
f"Failed to send data to Discounts API. Status code: {response.status_code}"
)
st.error(f"Failed to send data. Status code: {response.status_code}")