From a686d4cfe8f27fe5a6b5b2414c3547b1c88d45c6 Mon Sep 17 00:00:00 2001
From: Kelly Brown
Date: Thu, 21 Nov 2024 15:07:21 -0500
Subject: [PATCH] Fix rendering issue in getting started

Signed-off-by: Kelly Brown
---
 docs/getting-started/download_models.md |  42 +++-
 docs/getting-started/initilize_ilab.md  | 152 ++++++-------
 docs/getting-started/linux_amd.md       |  84 +++----
 docs/getting-started/linux_nvidia.md    | 288 +++---------------------
 docs/getting-started/mac_metal.md       |  76 +++----
 docs/getting-started/serve_and_chat.md  |  16 +-
 6 files changed, 217 insertions(+), 441 deletions(-)

diff --git a/docs/getting-started/download_models.md b/docs/getting-started/download_models.md
index 826e8bb..11ea492 100644
--- a/docs/getting-started/download_models.md
+++ b/docs/getting-started/download_models.md
@@ -6,7 +6,7 @@ logo: images/ilab_dog.png

# 📥 Download the model

-- Run the `ilab model download` command.
+1) Run the `ilab model download` command to download compact pre-trained versions of the `granite-7b-lab-GGUF`, `merlinite-7b-lab-GGUF`, and `Mistral-7B-Instruct-v0.2-GGUF` models (~4.4G each) from HuggingFace.

```shell
ilab model download
@@ -14,11 +14,27 @@

`ilab model download` downloads a compact pre-trained version of the [model](https://huggingface.co/instructlab/) (~4.4G) from HuggingFace:

+*Example output of the models downloading*
+
+```shell
+Downloading model from Hugging Face:
+    Model: instructlab/granite-7b-lab-GGUF@main
+    Destination: /Users/<user>/.cache/instructlab/models
+Downloading model from Hugging Face:
+    Model: instructlab/merlinite-7b-lab-GGUF@main
+    Destination: /Users/<user>/.cache/instructlab/models
+Downloading model from Hugging Face:
+    Model: TheBloke/Mistral-7B-Instruct-v0.2-GGUF@main
+    Destination: /Users/<user>/.cache/instructlab/models
+
+TheBloke/Mistral-7B-Instruct-v0.2-GGUF requires a HF Token to be set.
+Please use '--hf-token' or 'export HF_TOKEN' to download all necessary models.
+```
+
+a) You may be prompted to provide your Hugging Face token to download the `Mistral-7B-Instruct-v0.2-GGUF` model.
+
```shell
-(venv) $ ilab model download
-Downloading model from Hugging Face: instructlab/merlinite-7b-lab-GGUF@main to /Users/USERNAME/Library/Caches/instructlab/models...
-...
-INFO 2024-08-01 15:05:48,464 huggingface_hub.file_download:1893: Download complete. Moving file to /Users/USERNAME/Library/Caches/instructlab/models/merlinite-7b-lab-Q4_K_M.gguf
+ilab model download --hf-token <token>
```

!!! note
@@ -26,10 +42,10 @@ INFO 2024-08-01 15:05:48,464 huggingface_hub.file_download:1893: Download comple

## Downloading an entire Hugging Face repository (Safetensors Model)

-- Specify repository, and a Hugging Face token if necessary. For example:
+1) Specify the repository and, if necessary, a Hugging Face token. For example:

```shell
-HF_TOKEN=<YOUR HF TOKEN> ilab model download --repository=instructlab/granite-7b-lab
+ilab model download --repository instructlab/granite-7b-lab --hf-token <token>
```

These types of models are useful for GPU-enabled systems or anyone looking to serve a model using vLLM. InstructLab provides Safetensor versions of our Granite models on HuggingFace.
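+
+As a quick sketch of how a Safetensors model is used once downloaded (this assumes the default Linux cache path described later in this guide, and a GPU-enabled system, since Safetensors models are served with vLLM):
+
+```shell
+# Serve the downloaded Safetensors model directory with vLLM (requires a GPU)
+ilab model serve --model-path ~/.cache/instructlab/models/instructlab/granite-7b-lab
+```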
@@ -46,9 +62,11 @@ ilab model list

```shell
(venv) $ ilab model list
-+------------------------------+---------------------+--------+
-| Model Name                   | Last Modified       | Size   |
-+------------------------------+---------------------+--------+
-| merlinite-7b-lab-Q4_K_M.gguf | 2024-08-01 15:05:48 | 4.1 GB |
-+------------------------------+---------------------+--------+
++--------------------------------------+---------------------+--------+
+| Model Name                           | Last Modified       | Size   |
++--------------------------------------+---------------------+--------+
+| granite-7b-lab-Q4_K_M.gguf           | 2024-08-01 15:05:48 | 4.1 GB |
+| merlinite-7b-lab-Q4_K_M.gguf         | 2024-08-01 15:05:48 | 4.1 GB |
+| mistral-7b-instruct-v0.2.Q4_K_M.gguf | 2024-08-01 15:05:48 | 4.1 GB |
++--------------------------------------+---------------------+--------+
```
diff --git a/docs/getting-started/initilize_ilab.md b/docs/getting-started/initilize_ilab.md
index 3e8b3a0..1dffaff 100644
--- a/docs/getting-started/initilize_ilab.md
+++ b/docs/getting-started/initilize_ilab.md
@@ -6,112 +6,88 @@ logo: images/ilab_dog.png

# 🏗️ Initialize `ilab`

-### 🏗️ Initialize `ilab`
+1) Initialize `ilab` by running the following command:

-1. Initialize `ilab` by running the following command:
-
-   ```shell
-   ilab config init
-   ```
+```shell
+ilab config init
+```

-2. When prompted, clone the `https://github.com/instructlab/taxonomy.git` repository into the current directory by typing **enter**
+2) When prompted, clone the `https://github.com/instructlab/taxonomy.git` repository into the current directory by pressing **Enter**.

   **Optional**: If you want to point to an existing local clone of the `taxonomy` repository, you can pass the path interactively or alternatively with the `--taxonomy-path` flag.

   `ilab` will use the default configuration file unless otherwise specified. You can override this behavior with the `--config` parameter for any `ilab` command.

-3. When prompted, provide the path to your default model. Otherwise, the default of a quantized [Merlinite](https://huggingface.co/instructlab/merlinite-7b-lab-GGUF) model is used.
-
-   *Example output of steps 1 - 3*
-
-   ```shell
-   ----------------------------------------------------
-            Welcome to the InstructLab CLI
-   This guide will help you to setup your environment
-   ----------------------------------------------------
-
-   Please provide the following values to initiate the environment [press Enter for defaults]:
-   Path to taxonomy repo [/Users/kellybrown/.local/share/instructlab/taxonomy]:
-   Path to your model [/Users/kellybrown/.cache/instructlab/models/merlinite-7b-lab-Q4_K_M.gguf]:
-   ```
-
-   You can download this model with `ilab model download` command as well.
-
-4. The InstructLab CLI auto-detects your hardware and select the exact system profile that matches your machine. System profiles populate the `config.yaml` file with the proper parameter values based on your detected GPU types and avaiible vRAM.
-
-   *Example output of profile auto-detection*
-
-   ```shell
-   Generating config file and profiles:
-   /home/user/.config/instructlab/config.yaml
-   /home/user/.local/share/instructlab/internal/train_configuration/profiles
-
-   We have detected the AMD CPU profile as an exact match for your system.
-
-   --------------------------------------------
-       Initialization completed successfully!
-     You're ready to start using `ilab`. Enjoy!
-   --------------------------------------------
-   ```
-
-5. If there is not an exact match for your system, you can manually select a system profile when prompted. 
There are various flags you can utilize with individual `ilab` commands that allow you to utilize your GPU if applicable.
-
-   *Example output of selecting a system profile*
-
-   ```shell
-   Please choose a system profile to use.
-   System profiles apply to all parts of the config file and set hardware specific defaults for each command.
-   First, please select the hardware vendor your system falls into
-   [1] APPLE
-   [2] INTEL
-   [3] AMD
-   [4] NVIDIA
-   Enter the number of your choice [0]: 1
-   You selected: APPLE
-   Next, please select the specific hardware configuration that most closely matches your system.
-   [0] No system profile
-   [1] APPLE M1 ULTRA
-   [2] APPLE M1 MAX
-   [3] APPLE M2 MAX
-   [4] APPLE M2 ULTRA
-   [5] APPLE M2 PRO
-   [6] APPLE M2
-   [7] APPLE M3 MAX
-   [8] APPLE M3 PRO
-   [9] APPLE M3
-   Enter the number of your choice [hit enter for hardware defaults] [0]: 8
-   You selected: /Users/kellybrown/.local/share/instructlab/internal/system_profiles/apple/m3/m3_pro.yaml
-
-   --------------------------------------------
-       Initialization completed successfully!
-     You're ready to start using `ilab`. Enjoy!
-   --------------------------------------------
-   ```

+3) When prompted, provide the path to your default model. Otherwise, the default of a quantized [Merlinite](https://huggingface.co/instructlab/merlinite-7b-lab-GGUF) model is used.

-   The GPU profiles are listed by GPU type and number of GPUs present. If you happen to have a GPU configuration with a similar amount of vRAM as any of the above profiles, feel free to try them out!
+*Example output of steps 1 - 3*

+```shell
+----------------------------------------------------
+ Welcome to the InstructLab CLI
+This guide will help you to setup your environment
+----------------------------------------------------
+
+Please provide the following values to initiate the environment [press Enter for defaults]:
+Path to taxonomy repo [/Users/<user>/.local/share/instructlab/taxonomy]:
+Path to your model [/Users/<user>/.cache/instructlab/models/merlinite-7b-lab-Q4_K_M.gguf]:
+```

-### `ilab` directory layout after initializing your system
+You can download this model with the `ilab model download` command as well.

-### Mac directory
+4) The InstructLab CLI auto-detects your hardware and selects the exact system profile that matches your machine. System profiles populate the `config.yaml` file with the proper parameter values based on your detected GPU types and available vRAM.

-After running `ilab config init` your directories will look like the following on a Mac system:
+*Example output of profile auto-detection*

```shell
-├─ ~/Library/Application\ Support/instructlab/models/ (1)
-├─ ~/Library/Application\ Support/instructlab/datasets (2)
-├─ ~/Library/Application\ Support/instructlab/taxonomy (3)
-├─ ~/Library/Application\ Support/instructlab/checkpoints (4)
-```
+Generating config file and profiles:
+/home/user/.config/instructlab/config.yaml
+/home/user/.local/share/instructlab/internal/train_configuration/profiles

- 1) `/Users/USERNAME/Library/Caches/instructlab/models/`: Contains all downloaded large language models, including the saved output of ones you generate with ilab.
+We have detected the AMD CPU profile as an exact match for your system.

- 2) `~/Library/Application\ Support/instructlab/datasets/`: Contains data output from the SDG phase, built on modifications to the taxonomy repository.
+--------------------------------------------
+ Initialization completed successfully!
+ You're ready to start using `ilab`. Enjoy! 
+--------------------------------------------
+```

- 3) `~/Library/Application\ Support/instructlab/taxonomy/`: Contains the skill and knowledge data.
+5) If there is not an exact match for your system, you can manually select a system profile when prompted. There are also various flags you can use with individual `ilab` commands to take advantage of your GPU, if applicable.

- 4) `~/Users/USERNAME/Library/Caches/instructlab/checkpoints/`: Contains the output of the training process
+*Example output of selecting a system profile*

-### Linux directory
+```shell
+Please choose a system profile to use.
+System profiles apply to all parts of the config file and set hardware specific defaults for each command.
+First, please select the hardware vendor your system falls into
+[1] APPLE
+[2] INTEL
+[3] AMD
+[4] NVIDIA
+Enter the number of your choice [0]: 1
+You selected: APPLE
+Next, please select the specific hardware configuration that most closely matches your system.
+[0] No system profile
+[1] APPLE M1 ULTRA
+[2] APPLE M1 MAX
+[3] APPLE M2 MAX
+[4] APPLE M2 ULTRA
+[5] APPLE M2 PRO
+[6] APPLE M2
+[7] APPLE M3 MAX
+[8] APPLE M3 PRO
+[9] APPLE M3
+Enter the number of your choice [hit enter for hardware defaults] [0]: 8
+You selected: /Users/kellybrown/.local/share/instructlab/internal/system_profiles/apple/m3/m3_pro.yaml
+
+--------------------------------------------
+ Initialization completed successfully!
+You're ready to start using `ilab`. Enjoy!
+--------------------------------------------
+```
+
+The GPU profiles are listed by GPU type and number of GPUs present. If you happen to have a GPU configuration with a similar amount of vRAM as any of the above profiles, feel free to try them out!
+
+### `ilab` directory layout after initializing your system

After running `ilab config init`, your directories will look like the following on a Linux system:
diff --git a/docs/getting-started/linux_amd.md b/docs/getting-started/linux_amd.md
index 733fa60..c0d272a 100644
--- a/docs/getting-started/linux_amd.md
+++ b/docs/getting-started/linux_amd.md
@@ -45,59 +45,59 @@ The following steps in this document use [Python venv](https://docs.python.org/3

1) Install with AMD ROCm

-   ```bash
-   python3 -m venv --upgrade-deps venv
-   source venv/bin/activate
-   pip cache remove llama_cpp_python
-   pip install 'instructlab[rocm]' \
-   --extra-index-url https://download.pytorch.org/whl/rocm6.0 \
-   -C cmake.args="-DLLAMA_HIPBLAS=on" \
-   -C cmake.args="-DAMDGPU_TARGETS=all" \
-   -C cmake.args="-DCMAKE_C_COMPILER=/opt/rocm/llvm/bin/clang" \
-   -C cmake.args="-DCMAKE_CXX_COMPILER=/opt/rocm/llvm/bin/clang++" \
-   -C cmake.args="-DCMAKE_PREFIX_PATH=/opt/rocm" \
-   -C cmake.args="-DLLAMA_NATIVE=off"
-   ```
+```bash
+python3 -m venv --upgrade-deps venv
+source venv/bin/activate
+pip cache remove llama_cpp_python
+pip install 'instructlab[rocm]' \
+--extra-index-url https://download.pytorch.org/whl/rocm6.0 \
+-C cmake.args="-DLLAMA_HIPBLAS=on" \
+-C cmake.args="-DAMDGPU_TARGETS=all" \
+-C cmake.args="-DCMAKE_C_COMPILER=/opt/rocm/llvm/bin/clang" \
+-C cmake.args="-DCMAKE_CXX_COMPILER=/opt/rocm/llvm/bin/clang++" \
+-C cmake.args="-DCMAKE_PREFIX_PATH=/opt/rocm" \
+-C cmake.args="-DLLAMA_NATIVE=off"
+```

On Fedora 40+, use `-DCMAKE_C_COMPILER=clang-17` and `-DCMAKE_CXX_COMPILER=clang++-17`.

2) From your `venv` environment, verify that `ilab` is installed correctly by running the `ilab` command. 
-   ```shell
-   ilab
-   ```
+```shell
+ilab
+```

-   *Example output of the `ilab` command*
+*Example output of the `ilab` command*

-   ```shell
-   (venv) $ ilab
-   Usage: ilab [OPTIONS] COMMAND [ARGS]...
+```shell
+(venv) $ ilab
+Usage: ilab [OPTIONS] COMMAND [ARGS]...

-   CLI for interacting with InstructLab.
+CLI for interacting with InstructLab.

-   If this is your first time running ilab, it's best to start with `ilab
-   config init` to create the environment.
+If this is your first time running ilab, it's best to start with `ilab
+config init` to create the environment.

-   Options:
-   --config PATH Path to a configuration file. [default:
+Options:
+--config PATH Path to a configuration file. [default:
 /Users/kellybrown/.config/instructlab/config.yaml]
-   -v, --verbose Enable debug logging (repeat for even more verbosity)
-   --version Show the version and exit.
-   --help Show this message and exit.
-
-   Commands:
-   config Command Group for Interacting with the Config of InstructLab.
-   data Command Group for Interacting with the Data generated by...
-   model Command Group for Interacting with the Models in InstructLab.
-   system Command group for all system-related command calls
-   taxonomy Command Group for Interacting with the Taxonomy of InstructLab.
-
-   Aliases:
-   chat model chat
-   generate data generate
-   serve model serve
-   train model train
-   ```
+-v, --verbose Enable debug logging (repeat for even more verbosity)
+--version Show the version and exit.
+--help Show this message and exit.
+
+Commands:
+config Command Group for Interacting with the Config of InstructLab.
+data Command Group for Interacting with the Data generated by...
+model Command Group for Interacting with the Models in InstructLab.
+system Command group for all system-related command calls
+taxonomy Command Group for Interacting with the Taxonomy of InstructLab.
+
+Aliases:
+chat model chat
+generate data generate
+serve model serve
+train model train
+```

!!! important
    Every `ilab` command needs to be run from within your Python virtual environment. You can enter the Python environment by running the `source venv/bin/activate` command.
diff --git a/docs/getting-started/linux_nvidia.md b/docs/getting-started/linux_nvidia.md
index ee7da0e..fffb2b8 100644
--- a/docs/getting-started/linux_nvidia.md
+++ b/docs/getting-started/linux_nvidia.md
@@ -44,15 +44,15 @@ The following steps in this document use [Python venv](https://docs.python.org/3

For the best CUDA experience, installing vLLM is necessary to serve Safetensors format models.

-   ```bash
-   python3 -m venv --upgrade-deps venv
-   source venv/bin/activate
-   pip cache remove llama_cpp_python
-   pip install 'instructlab[cuda]' \
-   -C cmake.args="-DLLAMA_CUDA=on" \
-   -C cmake.args="-DLLAMA_NATIVE=off"
-   pip install vllm@git+https://github.com/opendatahub-io/vllm@2024.08.01
-   ```
+```bash
+python3 -m venv --upgrade-deps venv
+source venv/bin/activate
+pip cache remove llama_cpp_python
+pip install 'instructlab[cuda]' \
+-C cmake.args="-DLLAMA_CUDA=on" \
+-C cmake.args="-DLLAMA_NATIVE=off"
+pip install vllm@git+https://github.com/opendatahub-io/vllm@2024.08.01
+```

2) From your `venv` environment, verify that `ilab` is installed correctly by running the `ilab` command.

@@ -62,35 +62,35 @@ ilab

*Example output of the `ilab` command*

-   ```shell
-   (venv) $ ilab
-   Usage: ilab [OPTIONS] COMMAND [ARGS]...
+```shell
+(venv) $ ilab
+Usage: ilab [OPTIONS] COMMAND [ARGS]...

-   CLI for interacting with InstructLab.
+CLI for interacting with InstructLab. 
- If this is your first time running ilab, it's best to start with `ilab - config init` to create the environment. +If this is your first time running ilab, it's best to start with `ilab +config init` to create the environment. - Options: - --config PATH Path to a configuration file. [default: +Options: +--config PATH Path to a configuration file. [default: /Users/kellybrown/.config/instructlab/config.yaml] - -v, --verbose Enable debug logging (repeat for even more verbosity) - --version Show the version and exit. - --help Show this message and exit. - - Commands: - config Command Group for Interacting with the Config of InstructLab. - data Command Group for Interacting with the Data generated by... - model Command Group for Interacting with the Models in InstructLab. - system Command group for all system-related command calls - taxonomy Command Group for Interacting with the Taxonomy of InstructLab. - - Aliases: - chat model chat - generate data generate - serve model serve - train model train` - ``` +-v, --verbose Enable debug logging (repeat for even more verbosity) +--version Show the version and exit. +--help Show this message and exit. + +Commands: +config Command Group for Interacting with the Config of InstructLab. +data Command Group for Interacting with the Data generated by... +model Command Group for Interacting with the Models in InstructLab. +system Command group for all system-related command calls +taxonomy Command Group for Interacting with the Taxonomy of InstructLab. + +Aliases: +chat model chat +generate data generate +serve model serve +train model train +``` !!! important Every `ilab` command needs to be run from within your Python virtual environment. You can enter the Python environment by running the `source venv/bin/activate` command. @@ -142,222 +142,4 @@ you can save the completion script and source it from `~/.bashrc`: ```sh _ILAB_COMPLETE=fish_source ilab > ~/.config/fish/completions/ilab.fish -``` - -### 🏗️ Initialize `ilab` - -1) Initialize `ilab` by running the following command: - -```shell -ilab config init -``` - -*Example output* - -```shell -Welcome to InstructLab CLI. This guide will help you set up your environment. -Please provide the following values to initiate the environment [press Enter for defaults]: -Path to taxonomy repo [taxonomy]: -``` - -2) When prompted by the interface, press **Enter** to add a new default `config.yaml` file. - -3) When prompted, clone the `https://github.com/instructlab/taxonomy.git` repository into the current directory by typing **y**. - - **Optional**: If you want to point to an existing local clone of the `taxonomy` repository, you can pass the path interactively or alternatively with the `--taxonomy-path` flag. - - *Example output after initializing `ilab`* - - ```shell - (venv) $ ilab config init - Welcome to InstructLab CLI. This guide will help you set up your environment. - Please provide the following values to initiate the environment [press Enter for defaults]: - Path to taxonomy repo [taxonomy]: - `taxonomy` seems to not exists or is empty. Should I clone https://github.com/instructlab/taxonomy.git for you? [y/N]: y - Cloning https://github.com/instructlab/taxonomy.git... - ``` - - `ilab` will use the default configuration file unless otherwise specified. You can override this behavior with the `--config` parameter for any `ilab` command. - -4) When prompted, provide the path to your default model. 
Otherwise, the default of a quantized [Merlinite](https://huggingface.co/instructlab/merlinite-7b-lab-GGUF) model will be used - you can download this model with `ilab model download` (see below). - - ```shell - (venv) $ ilab config init - Welcome to InstructLab CLI. This guide will help you set up your environment. - Please provide the following values to initiate the environment [press Enter for defaults]: - Path to taxonomy repo [taxonomy]: - `taxonomy` seems to not exists or is empty. Should I clone https://github.com/instructlab/taxonomy.git for you? [y/N]: y - Cloning https://github.com/instructlab/taxonomy.git... - Path to your model [/home/user/.cache/instructlab/models/merlinite-7b-lab-Q4_K_M.gguf]: - ``` - -5) When prompted, please choose a train profile. Train profiles are GPU specific profiles that enable accelerated training behavior. **YOU ARE ON LINUX**, please choose `No Profile (CPU-Only)` by hitting Enter. There are various flags you can utilize with individual `ilab` commands that will allow you to utilize your GPU if applicable. - - ```shell - Welcome to InstructLab CLI. This guide will help you to setup your environment. - Please provide the following values to initiate the environment [press Enter for defaults]: - Path to taxonomy repo [/home/user/.local/share/instructlab/taxonomy]: - Path to your model [/home/user/.cache/instructlab/models/merlinite-7b-lab-Q4_K_M.gguf]: - Generating `/home/user/.config/instructlab/config.yaml`... - Please choose a train profile to use: - [0] No profile (CPU-only) - [1] A100_H100_x2.yaml - [2] A100_H100_x4.yaml - [3] A100_H100_x8.yaml - [4] L40_x4.yaml - [5] L40_x8.yaml - [6] L4_x8.yaml - Enter the number of your choice [hit enter for the default CPU-only profile] [0]: - Using default CPU-only train profile. - Initialization completed successfully, you're ready to start using `ilab`. Enjoy! - ``` - - The GPU profiles are listed by GPU type and number. If you happen to have a GPU configuration with a similar amount of VRAM as any of the above profiles, feel free to try them out! - -### `ilab` directory layout after initializing your system - -After running `ilab config init` your directories will look like the following on a Linux system: - -```shell -├─ ~/.cache/instructlab/models/ (1) -├─ ~/.local/share/instructlab/datasets (2) -├─ ~/.local/share/instructlab/taxonomy (3) -├─ ~/.local/share/instructlab/checkpoints (4) -``` - -1) `~/.cache/instructlab/models/`: Contains all downloaded large language models, including the saved output of ones you generate with ilab. -2) `~/.local/share/instructlab/datasets/`: Contains data output from the SDG phase, built on modifications to the taxonomy repository. -3) `~/.local/share/instructlab/taxonomy/`: Contains the skill and knowledge data. -4) `~/.local/share/instructlab/checkpoints/`: Contains the output of the training process - - -### 📥 Download the model - -- Run the `ilab model download` command. - -```shell -ilab model download -``` - -`ilab model download` downloads a compact pre-trained version of the [model](https://huggingface.co/instructlab/) (~4.4G) from HuggingFace: - -```shell -(venv) $ ilab model download -Downloading model from Hugging Face: instructlab/merlinite-7b-lab-GGUF@main to /Users/USERNAME/Library/Caches/instructlab/models... -... -INFO 2024-08-01 15:05:48,464 huggingface_hub.file_download:1893: Download complete. Moving file to /home/user/.cache/instructlab/models/merlinite-7b-lab-Q4_K_M.gguf -``` - -!!! 
note
-    ⏳ This command can take few minutes or immediately depending on your internet connection or model is cached. If you have issues connecting to Hugging Face, refer to the [Hugging Face discussion forum](https://discuss.huggingface.co/) for more details.
-
-#### Downloading an entire Hugging Face repository (Safetensors Model)
-
-- Specify repository, and a Hugging Face token if necessary. For example:
-
-```shell
-HF_TOKEN=<YOUR HF TOKEN> ilab model download --repository=TheBloke/Mixtral-8x7B-Instruct-v0.1-GGUF --filename=mixtral-8x7b-instruct-v0.1.Q4_K_M.gguf
-```
-
-These types of models are useful for GPU-enabled systems or anyone looking to serve a model using vLLM. InstructLab provides Safetensor versions of our Granite models on HuggingFace.
-
-#### Listing downloaded models
-
-All downloaded models can be seen with `ilab model list`.
-
-```shell
-ilab model list
-```
-
-*Example output of `ilab model list` after `ilab model download`*
-
-```shell
-(venv) $ ilab model list
-+------------------------------+---------------------+--------+
-| Model Name                   | Last Modified       | Size   |
-+------------------------------+---------------------+--------+
-| merlinite-7b-lab-Q4_K_M.gguf | 2024-08-01 15:05:48 | 4.1 GB |
-+------------------------------+---------------------+--------+
-```
-
-### 🍴 Serving the model
-
-- Serve the model by running the following command:
-
-```shell
-ilab model serve
-```
-
-Serve a non-default model (e.g. Mixtral-8x7B-Instruct-v0.1):
-
-```shell
-ilab model serve --model-path models/mixtral-8x7b-instruct-v0.1.Q4_K_M.gguf
-```
-
-Once the model is served and ready, you'll see the following output:
-
-```shell
-(venv) $ ilab model serve
-INFO 2024-03-02 02:21:11,352 lab.py:201 Using model 'models/ggml-merlinite-7b-lab-Q4_K_M.gguf' with -1 gpu-layers and 4096 max context size.
-Starting server process
-After application startup complete see http://127.0.0.1:8000/docs for API.
-Press CTRL+C to shut down the server.
-```
-
-!!! note
-    If multiple `ilab` clients try to connect to the same InstructLab server at the same time, the 1st will connect to the server while the others will start their own temporary server. This will require additional resources on the host machine.
-
-- Serve a non-default Safetensors model (e.g. granite-7b-lab). NOTE: this requires a GPU.
-
-Ensure vllm is installed:
-
-```shell
-pip show vllm
-```
-
-If it is not, please run:
-
-```shell
-pip install vllm@git+https://github.com/opendatahub-io/vllm@2024.08.01
-```
-
-```shell
-ilab model serve --model-path ~/.cache/instructlab/models/instructlab/granite-7b-lab
-```
-
-### 📣 Chat with the model (Optional)
-
-Because you're serving the model in one terminal window, you will have to create a new window and re-activate your Python virtual environment to run `ilab model chat` command:
-
-```shell
-source venv/bin/activate
-ilab model chat
-```
-
-Chat with a non-default model (e.g. Mixtral-8x7B-Instruct-v0.1):
-
-```shell
-source venv/bin/activate
-ilab model chat --model models/mixtral-8x7b-instruct-v0.1.Q4_K_M.gguf
-```
-
-Please note that usage of `--model` necessitates that the existing server has that model. If not, you must exit the server. `--model` in `ilab model chat` has the ability to start a server on your behalf with the specified model if one is not already running on the port.
-
-Before you start adding new skills and knowledge to your model, you can check its baseline performance by asking it a question such as `what is the capital of Canada?`.
-
-!!! 
note
-    The model needs to be trained with the generated synthetic data to use the new skills or knowledge
-
-```shell
-(venv) $ ilab model chat
-╭────────────────────────────────────────────────────────────────────────────────────────── system ──────────────────────────────────────────────────────────────────────────────────────────╮
-│ Welcome to InstructLab Chat w/ GGML-MERLINITE-7B-lab-Q4_K_M (type /h for help)                                                                                                              │
-╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
->>> what is the capital of Canada [S][default]
-╭───────────────────────────────────────────────────────────────────────────────── ggml-merlinite-7b-lab-Q4_K_M ─────────────────────────────────────────────────────────────────────────────╮
-│ The capital city of Canada is Ottawa. It is located in the province of Ontario, on the southern banks of the Ottawa River in the eastern portion of southern Ontario. The city serves as the political center for Canada, as it is home to │
-│ Parliament Hill, which houses the House of Commons, Senate, Supreme Court, and Cabinet of Canada. Ottawa has a rich history and cultural significance, making it an essential part of Canada's identity.                                   │
-╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── elapsed 12.008 seconds ─╯
-```
-
+```
\ No newline at end of file
diff --git a/docs/getting-started/mac_metal.md b/docs/getting-started/mac_metal.md
index 467562f..1247c2e 100644
--- a/docs/getting-started/mac_metal.md
+++ b/docs/getting-started/mac_metal.md
@@ -36,54 +36,54 @@ The following steps in this document use [Python venv](https://docs.python.org/3

1) Install with Apple Metal on M1/M2/M3 Macs:

-   ```shell
-   python3.11 -m venv --upgrade-deps venv
-   source venv/bin/activate
-   pip cache remove llama_cpp_python
-   pip install instructlab
-   ```
+```shell
+python3.11 -m venv --upgrade-deps venv
+source venv/bin/activate
+pip cache remove llama_cpp_python
+pip install instructlab
+```

-   !!! note
-       Make sure your system Python build is `Mach-O 64-bit executable arm64` by using `file -b $(command -v python)`,
-       or if your system is setup with [pyenv](https://github.com/pyenv/pyenv) by using the `file -b $(pyenv which python)` command.
+!!! note
+    Make sure your system Python build is `Mach-O 64-bit executable arm64` by using `file -b $(command -v python)`,
+    or if your system is set up with [pyenv](https://github.com/pyenv/pyenv) by using the `file -b $(pyenv which python)` command.

2) From your `venv` environment, verify that `ilab` is installed correctly by running the `ilab` command.

-   ```shell
-   ilab
-   ```
+```shell
+ilab
+```

-   *Example output of the `ilab` command*
+*Example output of the `ilab` command*

-   ```shell
-   (venv) $ ilab
-   Usage: ilab [OPTIONS] COMMAND [ARGS]...
+```shell
+(venv) $ ilab
+Usage: ilab [OPTIONS] COMMAND [ARGS]...

-   CLI for interacting with InstructLab.
+CLI for interacting with InstructLab.

-   If this is your first time running ilab, it's best to start with `ilab
-   config init` to create the environment.
+If this is your first time running ilab, it's best to start with `ilab
+config init` to create the environment. 
-   Options:
-   --config PATH Path to a configuration file. [default:
+Options:
+--config PATH Path to a configuration file. [default:
 /Users/kellybrown/.config/instructlab/config.yaml]
-   -v, --verbose Enable debug logging (repeat for even more verbosity)
-   --version Show the version and exit.
-   --help Show this message and exit.
-
-   Commands:
-   config Command Group for Interacting with the Config of InstructLab.
-   data Command Group for Interacting with the Data generated by...
-   model Command Group for Interacting with the Models in InstructLab.
-   system Command group for all system-related command calls
-   taxonomy Command Group for Interacting with the Taxonomy of InstructLab.
-
-   Aliases:
-   chat model chat
-   generate data generate
-   serve model serve
-   train model train
-   ```
+-v, --verbose Enable debug logging (repeat for even more verbosity)
+--version Show the version and exit.
+--help Show this message and exit.
+
+Commands:
+config Command Group for Interacting with the Config of InstructLab.
+data Command Group for Interacting with the Data generated by...
+model Command Group for Interacting with the Models in InstructLab.
+system Command group for all system-related command calls
+taxonomy Command Group for Interacting with the Taxonomy of InstructLab.
+
+Aliases:
+chat model chat
+generate data generate
+serve model serve
+train model train
+```

!!! important
    Every `ilab` command needs to be run from within your Python virtual environment. You can enter the Python environment by running the `source venv/bin/activate` command.
diff --git a/docs/getting-started/serve_and_chat.md b/docs/getting-started/serve_and_chat.md
index d942893..a4f7fdb 100644
--- a/docs/getting-started/serve_and_chat.md
+++ b/docs/getting-started/serve_and_chat.md
@@ -6,19 +6,19 @@ logo: images/ilab_dog.png

# 🍴 Serving the model

-- Serve the model by running the following command:
+Serve the model by running the following command:

```shell
ilab model serve
```

-Serve a non-default model:
+Serve a non-default model with the following command:

```shell
ilab model serve --model-path models/granite-7b-instruct.GGUF
```

-Once the model is served and ready, you'll see the following output:
+*Example output of a model that is served and ready*

```shell
(venv) $ ilab model serve
INFO 2024-03-02 02:21:11,352 lab.py:201 Using model 'models/ggml-merlinite-7b-lab-Q4_K_M.gguf' with -1 gpu-layers and 4096 max context size.
Starting server process
After application startup complete see http://127.0.0.1:8000/docs for API.
Press CTRL+C to shut down the server.
```

@@ -31,15 +31,15 @@ Press CTRL+C to shut down the server.
!!! note
    If multiple `ilab` clients try to connect to the same InstructLab server at the same time, the first will connect to the server while the others will start their own temporary server. This will require additional resources on the host machine.

-- Serve a non-default Safetensors model (e.g. granite-7b-lab). NOTE: this requires a GPU.
+Serve a non-default Safetensors model (e.g. granite-7b-lab). NOTE: this requires a GPU.

-Ensure vllm is installed:
+a. Ensure vLLM is installed:

```shell
pip show vllm
```

-If it is not, please run:
+b. If it is not, please run:

```shell
pip install vllm@git+https://github.com/opendatahub-io/vllm@2024.08.01
```

@@ -69,8 +69,8 @@ Please note that usage of `--model` necessitates that the existing server has th

Before you start adding new skills and knowledge to your model, you can check its baseline performance by asking it a question such as `what is the capital of Canada?`.

-> [!NOTE]
-> The model needs to be trained with the generated synthetic data to use the new skills or knowledge
+!!! note
+    The model needs to be trained with the generated synthetic data to use any new skills or knowledge.

```shell