Merged
34 commits
ebe3621
RAG working again
AlexejPenner Oct 22, 2024
e7d66ed
Further small changes
AlexejPenner Oct 22, 2024
a27bd7d
Adjusted Readme
AlexejPenner Oct 22, 2024
9a34934
Further small improvements
AlexejPenner Oct 23, 2024
8a3ba53
Merge branch 'main' into feature/llm-complete-updates
AlexejPenner Oct 23, 2024
51d0ec0
Rag -> Embeddings running
AlexejPenner Oct 24, 2024
5791de5
Implemented basic gitflow
AlexejPenner Oct 25, 2024
e456732
Update llm-complete-guide/README.md
AlexejPenner Oct 25, 2024
7ff8092
Removed outdated flags
AlexejPenner Oct 25, 2024
78e94f2
Merge branch 'feature/llm-complete-updates' of github.com:zenml-io/ze…
AlexejPenner Oct 25, 2024
fb5a083
update requirements and constants
strickvl Oct 25, 2024
e0b54bb
remove dummy pipeline
strickvl Oct 25, 2024
5739e47
Merge branch 'feature/llm-complete-updates' of https://github.com/zen…
strickvl Oct 25, 2024
32e2be0
update gitignore
strickvl Oct 25, 2024
e83b505
Moved secrets into ZenML secrets
AlexejPenner Oct 25, 2024
4c10ff8
Merge branch 'feature/llm-complete-updates' of github.com:zenml-io/ze…
AlexejPenner Oct 25, 2024
9ae4922
Furhter changes for remote execution
AlexejPenner Oct 28, 2024
d028e44
Fixed some configs
AlexejPenner Oct 28, 2024
7410df9
add gradio to requirements
strickvl Oct 28, 2024
20f1fe5
add rag deployment
strickvl Oct 28, 2024
200b75f
rag deployment addition
strickvl Oct 28, 2024
96dabeb
fix in get_db_port
strickvl Oct 28, 2024
194948c
Single secret for all
AlexejPenner Oct 28, 2024
2869bb4
Updated README to new secret
AlexejPenner Oct 28, 2024
a4c8f3c
Merge branch 'feature/llm-complete-updates' of github.com:zenml-io/ze…
AlexejPenner Oct 28, 2024
50d245d
Added default
AlexejPenner Oct 28, 2024
261893d
Updated github actions
AlexejPenner Oct 28, 2024
4082b6b
Updated Readme
AlexejPenner Oct 28, 2024
3efd9d1
formatting
strickvl Oct 28, 2024
f01f4f8
moar formatting
strickvl Oct 28, 2024
0b26eb5
notebooks need formatting too
strickvl Oct 28, 2024
d2bbfda
add gradio temp folder to gitignore
strickvl Oct 28, 2024
0c9c130
Updated requirements.
AlexejPenner Oct 28, 2024
bd63d3f
Merge branch 'feature/llm-complete-updates' of github.com:zenml-io/ze…
AlexejPenner Oct 28, 2024
50 changes: 50 additions & 0 deletions .github/workflows/run_complete_llm.yml
@@ -0,0 +1,50 @@
name: Staging Trigger LLM-COMPLETE
on:
  pull_request:
    types: [opened, synchronize]
    branches: [staging, main]
concurrency:
  # New commit on branch cancels running workflows of the same branch
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

jobs:
  run-staging-workflow:
    runs-on: ubuntu-dind-runners
    env:
      ZENML_HOST: ${{ secrets.ZENML_HOST }}
      ZENML_API_KEY: ${{ secrets.ZENML_API_KEY }}
      ZENML_STAGING_STACK: 51a49786-b82a-4646-bde7-a460efb0a9c5
      ZENML_GITHUB_SHA: ${{ github.event.pull_request.head.sha }}
      ZENML_GITHUB_URL_PR: ${{ github.event.pull_request._links.html.href }}
      ZENML_DEBUG: true
      ZENML_ANALYTICS_OPT_IN: false
      ZENML_LOGGING_VERBOSITY: INFO
      ZENML_PROJECT_SECRET_NAME: llm-complete

    steps:
      - name: Check out repository code
        uses: actions/checkout@v3

      - uses: actions/setup-python@v4
        with:
          python-version: '3.11'

      - name: Install requirements
        run: |
          pip3 install -r requirements.txt
          zenml integration install gcp -y

      - name: Connect to ZenML server
        run: |
          zenml connect --url $ZENML_HOST --api-key $ZENML_API_KEY

      - name: Set stack (Staging)
        if: ${{ github.base_ref == 'staging' }}
        run: |
          zenml stack set ${{ env.ZENML_STAGING_STACK }}

      - name: Run pipeline (Staging)
        if: ${{ github.base_ref == 'staging' }}
        run: |
          python run.py --rag --evaluation --no-cache
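One consequence of the `if:` guards in this workflow is that a pull request targeting `main` still triggers it, but only the unguarded steps execute; the stack selection and the pipeline run are both gated on `staging`. A minimal sketch of that gating logic in Python, purely illustrative (the `WORKFLOW_STEPS` table and the `steps_that_run` helper are hypothetical names, not part of this PR):

```python
from typing import Optional

# (step label, base branch required by the step's `if:` guard, or None if unguarded)
WORKFLOW_STEPS: list[tuple[str, Optional[str]]] = [
    ("Check out repository code", None),
    ("actions/setup-python@v4", None),  # this step has no `name:` in the workflow
    ("Install requirements", None),
    ("Connect to ZenML server", None),
    ("Set stack (Staging)", "staging"),
    ("Run pipeline (Staging)", "staging"),
]


def steps_that_run(base_ref: str) -> list[str]:
    """Return the labels of the steps that execute for a PR targeting `base_ref`."""
    return [
        label
        for label, required_base in WORKFLOW_STEPS
        if required_base is None or required_base == base_ref
    ]


if __name__ == "__main__":
    for branch in ("staging", "main"):
        print(branch, "->", steps_that_run(branch))
```

Under these assumptions, a PR into `main` stops after connecting to the ZenML server, so a production run would need its own guarded steps.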
2 changes: 2 additions & 0 deletions .gitignore
@@ -162,6 +162,8 @@ llm-lora-finetuning/configs/shopify.yaml
finetuned-matryoshka/
finetuned-all-MiniLM-L6-v2/
finetuned-snowflake-arctic-embed-m/
finetuned-snowflake-arctic-embed-m-v1.5/
.gradio/

# ollama ignores
nohup.out
Binary file added llm-complete-guide/.assets/argilla_secret.png
48 changes: 29 additions & 19 deletions llm-complete-guide/README.md
@@ -43,11 +43,16 @@ environment and install the dependencies using the following command:
pip install -r requirements.txt
```

Depending on your hardware, the `pip install` command may fail while building the
`flash_attn` package. In that case, running `FLASH_ATTENTION_SKIP_CUDA_BUILD=TRUE pip install flash-attn --no-build-isolation`
may help.

In order to use the default LLM for this query, you'll need an account and an
API key from OpenAI specified as another environment variable:
API key from OpenAI specified as a ZenML secret:

```shell
export OPENAI_API_KEY=<your-openai-api-key>
zenml secret create llm-complete --openai_api_key=<your-openai-api-key>
export ZENML_PROJECT_SECRET_NAME=llm-complete
```

### Setting up Supabase
@@ -63,22 +68,15 @@ You'll want to save the Supabase database password as a ZenML secret so that it
isn't stored in plaintext. You can do this by running the following command:

```shell
zenml secret create supabase_postgres_db --password="YOUR_PASSWORD"
zenml secret update llm-complete -v '{"supabase_password": "YOUR_PASSWORD", "supabase_user": "YOUR_USER", "supabase_host": "YOUR_HOST", "supabase_port": "YOUR_PORT"}'
```
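Hand-writing the JSON passed to `-v` is easy to get wrong when a password contains quotes or other shell-special characters. A minimal sketch, assuming you would rather generate the command from Python; `build_secret_update_command` is a hypothetical helper, and only the `zenml secret update <name> -v '<json>'` form shown above is taken from this README:

```python
import json
import shlex


def build_secret_update_command(secret_name: str, values: dict[str, str]) -> str:
    """Build a shell-safe `zenml secret update` command for the given key/value pairs."""
    payload = json.dumps(values)  # JSON handles escaping inside the values
    # shlex.quote protects both arguments from shell interpretation
    return f"zenml secret update {shlex.quote(secret_name)} -v {shlex.quote(payload)}"


if __name__ == "__main__":
    # Placeholder values; substitute the details from your Supabase dashboard.
    print(
        build_secret_update_command(
            "llm-complete",
            {
                "supabase_password": "YOUR_PASSWORD",
                "supabase_user": "YOUR_USER",
                "supabase_host": "YOUR_HOST",
                "supabase_port": "YOUR_PORT",
            },
        )
    )
```

The printed command can be pasted into a shell as-is; quoting the payload with `shlex.quote` keeps a password such as `pa'ss"word` intact.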

You'll then want to connect to this database instance by getting the connection
You can get the user, host and port for this database instance by getting the connection
string from the Supabase dashboard.

![](.assets/supabase-connection-string.png)

You can use these details to populate some environment variables where the
pipeline code expects them:

```shell
export ZENML_POSTGRES_USER=<your-supabase-user>
export ZENML_POSTGRES_HOST=<your-supabase-host>
export ZENML_POSTGRES_PORT=<your-supabase-port>
```
If Supabase is not an option for you, you can use a different database as the backend.

### Running the RAG pipeline

@@ -151,16 +149,17 @@ documentation](https://docs.zenml.io/v/docs/stack-components/annotators/argilla)
will guide you through the process of connecting to your instance as a stack
component.

### Finetune the embeddings

To run the pipeline for finetuning the embeddings, you can use the following
commands:
Use the secret created above to keep track of all the project's credentials. Here we also
set a Hugging Face write token. To make the rest of the pipeline work for you, you
will need to change the Hugging Face repo URLs to a space you have permissions for.

```shell
pip install -r requirements-argilla.txt # special requirements
python run.py --embeddings
```bash
zenml secret update llm-complete -v '{"argilla_api_key": "YOUR_ARGILLA_API_KEY", "argilla_api_url": "YOUR_ARGILLA_API_URL", "hf_token": "YOUR_HF_TOKEN"}'
```


### Finetune the embeddings

As with the previous pipeline, you will need to have set up and connected to an Argilla instance for this
to work. Please follow the instructions in the [Argilla
documentation](https://docs.argilla.io/latest/getting_started/quickstart/)
@@ -170,6 +169,17 @@ documentation](https://docs.zenml.io/v/docs/stack-components/annotators/argilla)
will guide you through the process of connecting to your instance as a stack
component.

The pipeline assumes that your Argilla credentials are stored in a ZenML secret called `argilla_secrets`.
![Argilla Secret](.assets/argilla_secret.png)

To run the pipeline for finetuning the embeddings, you can use the following
commands:

```shell
pip install -r requirements-argilla.txt # special requirements
python run.py --embeddings
```

*Credit to Phil Schmid for his [tutorial on embeddings finetuning with Matryoshka
loss function](https://www.philschmid.de/fine-tune-embedding-model-for-rag) which we adapted for this project.*

39 changes: 39 additions & 0 deletions llm-complete-guide/configs/embeddings.yaml
@@ -0,0 +1,39 @@
# enable_cache: False

# environment configuration
settings:
  docker:
    parent_image: "zenmldocker/prepare-release:base-0.68.0"
    requirements:
      - langchain-community
      - ratelimit
      - langchain>=0.0.325
      - langchain-openai
      - pgvector
      - psycopg2-binary
      - beautifulsoup4
      - unstructured
      - pandas
      - numpy
      - sentence-transformers>=3
      - transformers[torch]
      - litellm
      - ollama
      - tiktoken
      - umap-learn
      - matplotlib
      - pyarrow
      - rerankers[flashrank]
      - datasets
      - torch
    environment:
      ZENML_PROJECT_SECRET_NAME: llm_complete


# configuration of the Model Control Plane
model:
  name: finetuned-zenml-docs-embeddings
  version: latest
  license: Apache 2.0
  description: Finetuned LLM on ZenML docs
  tags: ["rag", "finetuned"]
21 changes: 21 additions & 0 deletions llm-complete-guide/configs/rag_eval.yaml
@@ -0,0 +1,21 @@
enable_cache: False

# environment configuration
settings:
  docker:
    requirements:
      - unstructured
      - sentence-transformers>=3
      - pgvector
      - datasets
      - litellm
      - numpy
      - psycopg2-binary
      - tiktoken

# configuration of the Model Control Plane
model:
  name: finetuned-zenml-docs-embeddings
  license: Apache 2.0
  description: Finetuned LLM on ZenML docs
  tags: ["rag", "finetuned"]
36 changes: 36 additions & 0 deletions llm-complete-guide/configs/rag_gcp.yaml
@@ -0,0 +1,36 @@
# environment configuration
settings:
  docker:
    requirements:
      - unstructured
      - sentence-transformers>=3
      - pgvector
      - datasets
      - litellm
      - numpy
      - psycopg2-binary
      - tiktoken
      - ratelimit
    environment:
      ZENML_PROJECT_SECRET_NAME: llm_complete
      ZENML_ENABLE_RICH_TRACEBACK: FALSE
      ZENML_LOGGING_VERBOSITY: INFO

steps:
  url_scraper:
    parameters:
      docs_url: https://docs.zenml.io
  generate_embeddings:
    step_operator: "terraform-gcp-6c0fd52233ca"
    settings:
      step_operator.vertex:
        accelerator_type: "NVIDIA_TESLA_P100"
        accelerator_count: 1
        machine_type: "n1-standard-8"

# configuration of the Model Control Plane
model:
  name: finetuned-zenml-docs-embeddings
  license: Apache 2.0
  description: Finetuned LLM on ZenML docs
  tags: ["rag", "finetuned"]
32 changes: 32 additions & 0 deletions llm-complete-guide/configs/rag_local_dev.yaml
@@ -0,0 +1,32 @@
enable_cache: False

# environment configuration
settings:
  docker:
    requirements:
      - unstructured
      - sentence-transformers>=3
      - pgvector
      - datasets
      - litellm
      - numpy
      - psycopg2-binary
      - tiktoken
      - ratelimit
    environment:
      ZENML_PROJECT_SECRET_NAME: llm_complete
      ZENML_ENABLE_RICH_TRACEBACK: FALSE
      ZENML_LOGGING_VERBOSITY: INFO


# configuration of the Model Control Plane
model:
  name: finetuned-zenml-docs-embeddings
  license: Apache 2.0
  description: Finetuned LLM on ZenML docs
  tags: ["rag", "finetuned"]

steps:
  url_scraper:
    parameters:
      docs_url: https://docs.zenml.io/stack-components/orchestrators
39 changes: 39 additions & 0 deletions llm-complete-guide/configs/synthetic.yaml
@@ -0,0 +1,39 @@
# enable_cache: False

# environment configuration
settings:
  docker:
    requirements:
      - langchain-community
      - ratelimit
      - langchain>=0.0.325
      - langchain-openai
      - pgvector
      - psycopg2-binary
      - beautifulsoup4
      - unstructured
      - pandas
      - numpy
      - sentence-transformers>=3
      - transformers
      - litellm
      - ollama
      - tiktoken
      - umap-learn
      - matplotlib
      - pyarrow
      - rerankers[flashrank]
      - datasets
      - torch
      - distilabel
    environment:
      ZENML_PROJECT_SECRET_NAME: llm_complete


# configuration of the Model Control Plane
model:
  name: finetuned-zenml-docs-embeddings
  version: latest
  license: Apache 2.0
  description: Finetuned LLM on ZenML docs
  tags: ["rag", "finetuned"]
10 changes: 7 additions & 3 deletions llm-complete-guide/constants.py
@@ -14,6 +14,7 @@
# See the License for the specific language governing permissions and
# limitations under the License.
#
import os

# Vector Store constants
CHUNK_SIZE = 2000
@@ -57,20 +58,23 @@

# embeddings finetuning constants
EMBEDDINGS_MODEL_NAME_ZENML = "finetuned-zenml-docs-embeddings"
DATASET_NAME_DEFAULT = "zenml/rag_qa_embedding_questions_0_60_0"
# DATASET_NAME_DEFAULT = "zenml/rag_qa_embedding_questions_0_60_0"
DATASET_NAME_DEFAULT = "zenml/rag_qa_embedding_questions"
DATASET_NAME_DISTILABEL = f"{DATASET_NAME_DEFAULT}_distilabel"
DATASET_NAME_ARGILLA = DATASET_NAME_DEFAULT.replace("zenml/", "")
OPENAI_MODEL_GEN = "gpt-4o"
OPENAI_MODEL_GEN_KWARGS_EMBEDDINGS = {
    "temperature": 0.7,
    "max_new_tokens": 512,
}
EMBEDDINGS_MODEL_ID_BASELINE = "Snowflake/snowflake-arctic-embed-m"
EMBEDDINGS_MODEL_ID_FINE_TUNED = "finetuned-snowflake-arctic-embed-m"
EMBEDDINGS_MODEL_ID_BASELINE = "Snowflake/snowflake-arctic-embed-m-v1.5"
EMBEDDINGS_MODEL_ID_FINE_TUNED = "finetuned-snowflake-arctic-embed-m-v1.5"
EMBEDDINGS_MODEL_MATRYOSHKA_DIMS: list[int] = [
    384,
    256,
    128,
    64,
]  # Important: large to small
USE_ARGILLA_ANNOTATIONS = False

SECRET_NAME = os.getenv("ZENML_PROJECT_SECRET_NAME", "llm-complete")
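The "large to small" comment on `EMBEDDINGS_MODEL_MATRYOSHKA_DIMS` reflects how Matryoshka embeddings are used: each smaller dimension is a prefix of the larger one, so a vector can be cut to its first `k` components and re-normalized. A minimal sketch of both ideas, assuming plain Python lists rather than the tensors the training code would use; `validate_matryoshka_dims` and `truncate_embedding` are hypothetical helpers, not functions from this PR:

```python
import math


def validate_matryoshka_dims(dims: list[int]) -> None:
    """Raise ValueError unless `dims` is strictly decreasing (large to small)."""
    if any(larger <= smaller for larger, smaller in zip(dims, dims[1:])):
        raise ValueError(f"Matryoshka dims must be strictly decreasing, got {dims}")


def truncate_embedding(vector: list[float], dim: int) -> list[float]:
    """Keep the first `dim` components and rescale them to unit length."""
    head = vector[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # an all-zero prefix cannot be normalized
    return [x / norm for x in head]


if __name__ == "__main__":
    validate_matryoshka_dims([384, 256, 128, 64])
    print(truncate_embedding([3.0, 4.0, 12.0], 2))
```

Re-normalizing after truncation keeps cosine similarity meaningful when comparing shortened vectors.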
6 changes: 4 additions & 2 deletions llm-complete-guide/most_basic_eval.py
@@ -20,6 +20,8 @@

from openai import OpenAI

from utils.openai_utils import get_openai_api_key


def preprocess_text(text):
    text = text.lower()
@@ -51,7 +53,7 @@ def answer_question(query, corpus, top_n=2):
        return "I don't have enough information to answer the question."

    context = "\n".join(relevant_chunks)
    client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY"))
    client = OpenAI(api_key=get_openai_api_key())
    chat_completion = client.chat.completions.create(
        messages=[
            {
@@ -117,7 +119,7 @@ def evaluate_retrieval(question, expected_answer, corpus, top_n=2):


def evaluate_generation(question, expected_answer, generated_answer):
    client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY"))
    client = OpenAI(api_key=get_openai_api_key())
    chat_completion = client.chat.completions.create(
        messages=[
            {
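The body of `utils.openai_utils.get_openai_api_key` is not shown in this diff. A minimal sketch of what such a helper could look like, assuming it reads the `openai_api_key` entry from the ZenML secret named by `ZENML_PROJECT_SECRET_NAME`; the function bodies and the injectable `fetch` parameter are assumptions made for illustration, and `Client().get_secret(...).secret_values` is the ZenML client call this sketch relies on:

```python
import os
from typing import Callable, Mapping

DEFAULT_SECRET_NAME = "llm-complete"  # same default as SECRET_NAME in constants.py


def resolve_secret_name(env: Mapping[str, str] = os.environ) -> str:
    """Return the secret name from the environment, falling back to the default."""
    return env.get("ZENML_PROJECT_SECRET_NAME", DEFAULT_SECRET_NAME)


def _fetch_from_zenml(secret_name: str) -> Mapping[str, str]:
    # Imported lazily so the rest of the sketch works without ZenML installed.
    from zenml.client import Client

    return Client().get_secret(secret_name).secret_values


def get_openai_api_key(
    fetch: Callable[[str], Mapping[str, str]] = _fetch_from_zenml,
    env: Mapping[str, str] = os.environ,
) -> str:
    """Look up `openai_api_key` in the project secret (hypothetical implementation)."""
    secret_name = resolve_secret_name(env)
    values = fetch(secret_name)
    if "openai_api_key" not in values:
        raise KeyError(f"Secret '{secret_name}' has no 'openai_api_key' entry")
    return values["openai_api_key"]
```

Note that the name is matched exactly: with this resolution logic, an environment value of `llm_complete` would look up a different secret than the default `llm-complete`, so the value set in the pipeline configs needs to match the name used with `zenml secret create`.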