# Generative AI Examples

[Releases](https://github.com/opea-project/GenAIExamples/releases)
[License](https://github.com/intel/neural-compressor/blob/master/LICENSE)

---

<div align="left">

## Introduction

GenAIComps-based Generative AI examples offer streamlined deployment, testing, and scalability. All examples are fully compatible with Docker and Kubernetes and support a wide range of hardware platforms, including Gaudi and Xeon.

## Architecture

GenAIComps is a service-based tool that includes microservice components such as LLM, embedding, and reranking. Using these components, various examples in GenAIExamples can be constructed, including ChatQnA and DocSum.

GenAIInfra, part of the OPEA containerization and cloud-native suite, enables quick and efficient deployment of GenAIExamples in the cloud.

GenAIEvals measures service performance metrics such as throughput, latency, and accuracy for GenAIExamples. This feature helps users easily compare performance across various hardware configurations.

## Getting Started

GenAIExamples offers flexible deployment options that cater to different user needs, enabling efficient use and deployment in various environments. Here is a brief overview of the two primary methods: Docker Compose and Kubernetes.

1. <b>Docker Compose</b>: Check the released Docker images in the [docker image list](./docker_images_list.md) for detailed information.
2. <b>Kubernetes</b>: Follow the steps in [K8s Install](https://github.com/opea-project/docs/tree/main/guide/installation/k8s_install) and [GMC Install](https://github.com/opea-project/docs/blob/main/guide/installation/gmc_install/gmc_install.md) to set up Kubernetes and the GenAI environment.

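As a concrete illustration of the Docker Compose route, the following sketch brings up one example end to end. The repository URL is real, but the directory layout is assumed from the per-example links in the deployment table, and the compose file name and required environment variables differ per example and release, so consult the example's README before running it:

```shell
# Hypothetical walkthrough: deploy the ChatQnA example with Docker Compose
# on a Xeon host. Guards keep the script harmless on machines without Docker.
REPO_URL=https://github.com/opea-project/GenAIExamples.git
EXAMPLE_DIR=GenAIExamples/ChatQnA/docker/xeon

if command -v docker >/dev/null 2>&1; then
  # Clone the examples, enter the chosen example directory, and start it.
  git clone --depth 1 "$REPO_URL" &&
    cd "$EXAMPLE_DIR" &&
    docker compose up -d &&
    docker compose ps ||
    echo "deployment did not complete; check the example's README"
else
  echo "docker not found; install Docker with the Compose plugin first"
fi
```

The same pattern applies to the other examples: swap `ChatQnA` and the platform directory for the use case and hardware you are targeting.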
Users can choose the most suitable approach based on ease of setup, scalability needs, and the environment in which they are operating.

### Deployment

<table>
  <tr>
    <th rowspan="3" style="text-align:center;">Use Cases</th>
    <th colspan="3" style="text-align:center;">Deployment</th>
  </tr>
  <tr>
    <td colspan="2" style="text-align:center;">Docker Compose</td>
    <td rowspan="2" style="text-align:center;">Kubernetes</td>
  </tr>
  <tr>
    <td style="text-align:center;">Xeon</td>
    <td style="text-align:center;">Gaudi</td>
  </tr>
  <tr>
    <td style="text-align:center;">ChatQnA</td>
    <td><a href="https://github.com/opea-project/GenAIExamples/blob/main/ChatQnA/docker/xeon/README.md">Xeon Link</a></td>
    <td><a href="https://github.com/opea-project/GenAIExamples/blob/main/ChatQnA/docker/gaudi/README.md">Gaudi Link</a></td>
    <td><a href="https://github.com/opea-project/GenAIExamples/blob/main/ChatQnA/kubernetes/README.md">K8s Link</a></td>
  </tr>
  <tr>
    <td style="text-align:center;">CodeGen</td>
    <td><a href="https://github.com/opea-project/GenAIExamples/blob/main/CodeGen/docker/xeon/README.md">Xeon Link</a></td>
    <td><a href="https://github.com/opea-project/GenAIExamples/blob/main/CodeGen/docker/gaudi/README.md">Gaudi Link</a></td>
    <td><a href="https://github.com/opea-project/GenAIExamples/blob/main/CodeGen/kubernetes/README.md">K8s Link</a></td>
  </tr>
  <tr>
    <td style="text-align:center;">CodeTrans</td>
    <td><a href="https://github.com/opea-project/GenAIExamples/blob/main/CodeTrans/docker/xeon/README.md">Xeon Link</a></td>
    <td><a href="https://github.com/opea-project/GenAIExamples/blob/main/CodeTrans/docker/gaudi/README.md">Gaudi Link</a></td>
    <td><a href="https://github.com/opea-project/GenAIExamples/blob/main/CodeTrans/kubernetes/README.md">K8s Link</a></td>
  </tr>
  <tr>
    <td style="text-align:center;">DocSum</td>
    <td><a href="https://github.com/opea-project/GenAIExamples/blob/main/DocSum/docker/xeon/README.md">Xeon Link</a></td>
    <td><a href="https://github.com/opea-project/GenAIExamples/blob/main/DocSum/docker/gaudi/README.md">Gaudi Link</a></td>
    <td><a href="https://github.com/opea-project/GenAIExamples/blob/main/DocSum/kubernetes/README.md">K8s Link</a></td>
  </tr>
  <tr>
    <td style="text-align:center;">SearchQnA</td>
    <td><a href="https://github.com/opea-project/GenAIExamples/blob/main/SearchQnA/docker/xeon/README.md">Xeon Link</a></td>
    <td><a href="https://github.com/opea-project/GenAIExamples/blob/main/SearchQnA/docker/gaudi/README.md">Gaudi Link</a></td>
    <td><a href="https://github.com/opea-project/GenAIExamples/blob/main/SearchQnA/kubernetes/README.md">K8s Link</a></td>
  </tr>
  <tr>
    <td style="text-align:center;">FaqGen</td>
    <td><a href="https://github.com/opea-project/GenAIExamples/blob/main/FaqGen/docker/xeon/README.md">Xeon Link</a></td>
    <td><a href="https://github.com/opea-project/GenAIExamples/blob/main/FaqGen/docker/gaudi/README.md">Gaudi Link</a></td>
    <td><a href="https://github.com/opea-project/GenAIExamples/blob/main/FaqGen/kubernetes/manifests/README.md">K8s Link</a></td>
  </tr>
  <tr>
    <td style="text-align:center;">Translation</td>
    <td><a href="https://github.com/opea-project/GenAIExamples/blob/main/Translation/docker/xeon/README.md">Xeon Link</a></td>
    <td><a href="https://github.com/opea-project/GenAIExamples/blob/main/Translation/docker/gaudi/README.md">Gaudi Link</a></td>
    <td><a href="https://github.com/opea-project/GenAIExamples/tree/main/Translation/kubernetes">K8s Link</a></td>
  </tr>
  <tr>
    <td style="text-align:center;">AudioQnA</td>
    <td><a href="https://github.com/opea-project/GenAIExamples/blob/main/AudioQnA/docker/xeon/README.md">Xeon Link</a></td>
    <td><a href="https://github.com/opea-project/GenAIExamples/blob/main/AudioQnA/docker/gaudi/README.md">Gaudi Link</a></td>
    <td>Not supported yet</td>
  </tr>
  <tr>
    <td style="text-align:center;">VisualQnA</td>
    <td><a href="https://github.com/opea-project/GenAIExamples/tree/main/VisualQnA">Xeon Link</a></td>
    <td><a href="https://github.com/opea-project/GenAIExamples/tree/main/VisualQnA">Gaudi Link</a></td>
    <td>Not supported yet</td>
  </tr>
</table>

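For the Kubernetes column above, a quick sanity check after deployment can be sketched as follows. The manifest path is an assumption based on the repository layout referenced in the table (only FaqGen's link shows a `kubernetes/manifests` directory explicitly), so verify it against the example's K8s README:

```shell
# Hypothetical check of a Kubernetes deployment of the ChatQnA example.
# Adjust MANIFEST_DIR for the example and namespace you actually deployed.
MANIFEST_DIR=ChatQnA/kubernetes/manifests

if command -v kubectl >/dev/null 2>&1 && [ -d "$MANIFEST_DIR" ]; then
  kubectl apply -f "$MANIFEST_DIR"   # create or update the example's resources
  kubectl get pods                   # wait for all pods to reach Running
else
  echo "kubectl or manifests unavailable; see the K8s Install guide above"
fi
```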
## Supported Examples

Check the [supported examples](./supported_examples.md) for detailed information on supported examples, models, hardware, and more.

## Additional Content