Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
10 changes: 5 additions & 5 deletions CodeGen/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -85,12 +85,12 @@ Currently we support two ways of deploying ChatQnA services with docker compose:

By default, the LLM model is set to a default value as listed below:

| Service | Model |
| ------------ | ------------------------------------------------------------------------------- |
| LLM_MODEL_ID | [meta-llama/CodeLlama-7b-hf](https://huggingface.co/meta-llama/CodeLlama-7b-hf) |
| Service | Model |
| ------------ | --------------------------------------------------------------------------------------- |
| LLM_MODEL_ID | [Qwen/Qwen2.5-Coder-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct) |

[meta-llama/CodeLlama-7b-hf](https://huggingface.co/meta-llama/CodeLlama-7b-hf) is a gated model that requires submitting an access request through Hugging Face. You can replace it with another model.
Change the `LLM_MODEL_ID` below for your needs, such as: [Qwen/CodeQwen1.5-7B-Chat](https://huggingface.co/Qwen/CodeQwen1.5-7B-Chat), [deepseek-ai/deepseek-coder-6.7b-instruct](https://huggingface.co/deepseek-ai/deepseek-coder-6.7b-instruct)
[Qwen/Qwen2.5-Coder-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct) may be a gated model that requires submitting an access request through Hugging Face. You can replace it with another model.
Change the `LLM_MODEL_ID` below for your needs, such as: [deepseek-ai/deepseek-coder-6.7b-instruct](https://huggingface.co/deepseek-ai/deepseek-coder-6.7b-instruct)

If you choose to use `meta-llama/CodeLlama-7b-hf` as LLM model, you will need to visit [here](https://huggingface.co/meta-llama/CodeLlama-7b-hf), click the `Expand to review and access` button to ask for model access.

Expand Down
2 changes: 1 addition & 1 deletion CodeGen/docker_compose/intel/cpu/xeon/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -105,7 +105,7 @@ export your_no_proxy=${your_no_proxy},"External_Public_IP"
export no_proxy=${your_no_proxy}
export http_proxy=${your_http_proxy}
export https_proxy=${your_http_proxy}
export LLM_MODEL_ID="meta-llama/CodeLlama-7b-hf"
export LLM_MODEL_ID="Qwen/Qwen2.5-Coder-7B-Instruct"
export TGI_LLM_ENDPOINT="http://${host_ip}:8028"
export HUGGINGFACEHUB_API_TOKEN=${your_hf_api_token}
export MEGA_SERVICE_HOST_IP=${host_ip}
Expand Down
2 changes: 1 addition & 1 deletion CodeGen/docker_compose/intel/hpu/gaudi/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -85,7 +85,7 @@ Since the `compose.yaml` will consume some environment variables, you need to se
export no_proxy=${your_no_proxy}
export http_proxy=${your_http_proxy}
export https_proxy=${your_http_proxy}
export LLM_MODEL_ID="meta-llama/CodeLlama-7b-hf"
export LLM_MODEL_ID="Qwen/Qwen2.5-Coder-7B-Instruct"
export TGI_LLM_ENDPOINT="http://${host_ip}:8028"
export HUGGINGFACEHUB_API_TOKEN=${your_hf_api_token}
export MEGA_SERVICE_HOST_IP=${host_ip}
Expand Down
2 changes: 1 addition & 1 deletion CodeGen/docker_compose/set_env.sh
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,7 @@
# SPDX-License-Identifier: Apache-2.0


export LLM_MODEL_ID="meta-llama/CodeLlama-7b-hf"
export LLM_MODEL_ID="Qwen/Qwen2.5-Coder-7B-Instruct"
export TGI_LLM_ENDPOINT="http://${host_ip}:8028"
export MEGA_SERVICE_HOST_IP=${host_ip}
export LLM_SERVICE_HOST_IP=${host_ip}
Expand Down
2 changes: 1 addition & 1 deletion CodeGen/kubernetes/intel/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -14,7 +14,7 @@
```
cd GenAIExamples/CodeGen/kubernetes/intel/cpu/xeon/manifests
export HUGGINGFACEHUB_API_TOKEN="YourOwnToken"
export MODEL_ID="meta-llama/CodeLlama-7b-hf"
export MODEL_ID="Qwen/Qwen2.5-Coder-7B-Instruct"
sed -i "s/insert-your-huggingface-token-here/${HUGGINGFACEHUB_API_TOKEN}/g" codegen.yaml
sed -i "s/meta-llama\/CodeLlama-7b-hf/${MODEL_ID}/g" codegen.yaml
kubectl apply -f codegen.yaml
Expand Down
2 changes: 1 addition & 1 deletion CodeGen/kubernetes/intel/cpu/xeon/gmc/codegen_xeon.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -29,6 +29,6 @@ spec:
internalService:
serviceName: tgi-service
config:
MODEL_ID: meta-llama/CodeLlama-7b-hf
MODEL_ID: Qwen/Qwen2.5-Coder-7B-Instruct
endpoint: /generate
isDownstreamService: true
2 changes: 1 addition & 1 deletion CodeGen/kubernetes/intel/cpu/xeon/manifest/codegen.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -64,7 +64,7 @@ metadata:
app.kubernetes.io/version: "2.1.0"
app.kubernetes.io/managed-by: Helm
data:
MODEL_ID: "meta-llama/CodeLlama-7b-hf"
MODEL_ID: "Qwen/Qwen2.5-Coder-7B-Instruct"
PORT: "2080"
HF_TOKEN: "insert-your-huggingface-token-here"
http_proxy: ""
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -29,6 +29,6 @@ spec:
internalService:
serviceName: tgi-gaudi-svc
config:
MODEL_ID: meta-llama/CodeLlama-7b-hf
MODEL_ID: Qwen/Qwen2.5-Coder-7B-Instruct
endpoint: /generate
isDownstreamService: true
2 changes: 1 addition & 1 deletion CodeGen/kubernetes/intel/hpu/gaudi/manifest/codegen.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -64,7 +64,7 @@ metadata:
app.kubernetes.io/version: "2.1.0"
app.kubernetes.io/managed-by: Helm
data:
MODEL_ID: "meta-llama/CodeLlama-7b-hf"
MODEL_ID: "Qwen/Qwen2.5-Coder-7B-Instruct"
PORT: "2080"
HF_TOKEN: "insert-your-huggingface-token-here"
http_proxy: ""
Expand Down
6 changes: 3 additions & 3 deletions supported_examples.md
Original file line number Diff line number Diff line change
Expand Up @@ -63,9 +63,9 @@ This document introduces the supported examples of GenAIExamples. The supported

[CodeGen](./CodeGen/README.md) is an example of copilot designed for code generation in Visual Studio Code.

| Framework | LLM | Serving | HW | Description |
| ------------------------------------------------------------------------------ | ------------------------------------------------------------------------------- | --------------------------------------------------------------- | ----------- | ----------- |
| [LangChain](https://www.langchain.com)/[LlamaIndex](https://www.llamaindex.ai) | [meta-llama/CodeLlama-7b-hf](https://huggingface.co/meta-llama/CodeLlama-7b-hf) | [TGI](https://github.com/huggingface/text-generation-inference) | Xeon/Gaudi2 | Copilot |
| Framework | LLM | Serving | HW | Description |
| ------------------------------------------------------------------------------ | --------------------------------------------------------------------------------------- | --------------------------------------------------------------- | ----------- | ----------- |
| [LangChain](https://www.langchain.com)/[LlamaIndex](https://www.llamaindex.ai) | [Qwen/Qwen2.5-Coder-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct) | [TGI](https://github.com/huggingface/text-generation-inference) | Xeon/Gaudi2 | Copilot |

### CodeTrans

Expand Down