Llama2 enabling (#109)

intel · Aug 8, 2023 · d4fb27c
1 parent 9c1bfd1
Showing 1 changed file with 9 additions and 8 deletions.
diff --git a/intel_extension_for_transformers/backends/neural_engine/graph/README.md b/intel_extension_for_transformers/backends/neural_engine/graph/README.md
@@ -9,16 +9,20 @@ ITREX Graph is an experimental c++ bare metal LLM inference solution that mainly
 
 In short, ITREX Graph is an experimental feature and may keep changing.
 
-### Compile Graph
+### Supported Models
+We now support [GPT-NeoX](https://github.com/EleutherAI/gpt-neox), [LLaMA](https://github.com/facebookresearch/llama), [LLaMA2](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf), [MPT](https://huggingface.co/mosaicml/mpt-7b), [FALCON](https://huggingface.co/tiiuae/falcon-7b), and [GPT-J](https://huggingface.co/docs/transformers/model_doc/gptj).
+
+## How to use
+
+### 1. Build Graph
 ```shell
 mkdir build
 cd build
 cmake .. -G Ninja
 ninja
 ```
 
-## How to use
-### Convert model
+### 2. Convert Models
 Currently, Graph uses the same models as llama.cpp. You can also convert the model yourself:
 ```bash
 ls ./models
@@ -49,7 +53,7 @@ python scripts/convert_starcoder.py --model={input_model_name_or_path} --outfile
 ./build/bin/quant_starcoder --model_file ${output_path}/ne-f32.bin --out_file ${output_path}/ne-q4_j.bin --bits 4
 ```
 
-### Run Models
+### 3. Run Models
 To run a LLaMA model, refer to the [LLaMA model documentation](./application/ChatLLAMA/README.md) for details.
 
 ```bash
@@ -63,13 +67,10 @@ Running GPT-NEOX / MPT / FALCON / / GPT-J / STARCODER model, please use `main_gp
 OMP_NUM_THREADS=56 numactl -m 0 -C 0-55 ./build/bin/main_gptneox -m ${output_path}/ne-q8.bin --seed 12 -c 512 -b 1024 -n 256 -t 56 --repeat-penalty 1.0 -p "She opened the door and see"
 ```
 
-for GPT-J, you can also try python binds which is experimental currently:
+For GPT-J, you can also try the Python bindings, which are currently experimental:
 
 ```bash
 cp scripts/gptj_binding.py build
 cd build
 python gptj_binding.py
 ```
-
-### Supported model
-Now we supports [GPT-NeoX](https://github.com/EleutherAI/gpt-neox), [LLaMA](https://github.com/facebookresearch/llama), [MPT](https://huggingface.co/mosaicml/mpt-7b), [FALCON](https://huggingface.co/tiiuae/falcon-7b), [STARCODER](https://huggingface.co/bigcode/starcoder), [GPT-J](https://huggingface.co/docs/transformers/model_doc/gptj).
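
Note on the `main_gptneox` invocation shown in step 3: it ties three values to the core count (`OMP_NUM_THREADS=56`, `-C 0-55`, `-t 56`), so they have to be changed together on a machine with a different core count. The sketch below prints that same command line for a chosen core count so it can be inspected before running. `build_run_cmd` is a hypothetical helper, not part of ITREX, and it assumes that cores `0..N-1` all belong to NUMA node 0, as in the README example; check this with `numactl --hardware` on your machine.

```shell
#!/bin/sh
# Hypothetical helper (not part of ITREX): print the README's main_gptneox
# command line with the thread-related values derived from one core count.
# Assumption: cores 0..(threads-1) sit on NUMA node 0, as in the README example.
build_run_cmd() {
  threads="$1"   # number of cores to use
  model="$2"     # path to the quantized model file
  prompt="$3"    # prompt text
  last_core=$((threads - 1))
  # All other flags are copied unchanged from the README example.
  printf 'OMP_NUM_THREADS=%s numactl -m 0 -C 0-%s ./build/bin/main_gptneox -m %s --seed 12 -c 512 -b 1024 -n 256 -t %s --repeat-penalty 1.0 -p "%s"\n' \
    "$threads" "$last_core" "$model" "$threads" "$prompt"
}

# Reproduces the README example; single quotes keep ${output_path} literal.
build_run_cmd 56 '${output_path}/ne-q8.bin' "She opened the door and see"
```

The helper only prints the command; it does not execute anything, so it can be used without a built binary or a converted model.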