Llama2 enabling (#109)
Zhenzhong1 committed Aug 8, 2023
1 parent 9c1bfd1 commit d4fb27c
Showing 1 changed file with 9 additions and 8 deletions.
@@ -9,16 +9,20 @@ ITREX Graph is an experimental c++ bare metal LLM inference solution that mainly

In short, ITREX Graph is an experimental feature and may keep changing.

-### Compile Graph
+### Supported Models
+Now we supports [GPT-NeoX](https://github.com/EleutherAI/gpt-neox), [LLaMA](https://github.com/facebookresearch/llama), [LLaMA2](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf), [MPT](https://huggingface.co/mosaicml/mpt-7b), [FALCON](https://huggingface.co/tiiuae/falcon-7b), [GPT-J](https://huggingface.co/docs/transformers/model_doc/gptj).
+
+## How to use
+
+### 1. Build Graph
```shell
mkdir build
cd build
cmake .. -G Ninja
ninja
```

-## How to use
-### Convert model
+### 2. Convert Models
Currently, Graph uses the same models as llama.cpp. You can also convert the model yourself
```bash
ls ./models
@@ -49,7 +53,7 @@ python scripts/convert_starcoder.py --model={input_model_name_or_path} --outfile
./build/bin/quant_starcoder --model_file ${output_path}/ne-f32.bin --out_file ${output_path}/ne-q4_j.bin --bits 4
```

-### Run Models
+### 3. Run Models
Running LLAMA model, for details please refer to [LLaMA model documentation](./application/ChatLLAMA/README.md).

```bash
@@ -63,13 +67,10 @@ Running GPT-NEOX / MPT / FALCON / / GPT-J / STARCODER model, please use `main_gp
OMP_NUM_THREADS=56 numactl -m 0 -C 0-55 ./build/bin/main_gptneox -m ${output_path}/ne-q8.bin --seed 12 -c 512 -b 1024 -n 256 -t 56 --repeat-penalty 1.0 -p "She opened the door and see"
```

-for GPT-J, you can also try python binds which is experimental currently:
+For GPT-J, you can also try python binds which is experimental currently:

```bash
cp scripts/gptj_binding.py build
cd build
python gptj_binding.py
```

-### Supported model
-Now we supports [GPT-NeoX](https://github.com/EleutherAI/gpt-neox), [LLaMA](https://github.com/facebookresearch/llama), [MPT](https://huggingface.co/mosaicml/mpt-7b), [FALCON](https://huggingface.co/tiiuae/falcon-7b), [STARCODER](https://huggingface.co/bigcode/starcoder), [GPT-J](https://huggingface.co/docs/transformers/model_doc/gptj).
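
Reader's note on the `main_gptneox` command shown in the last hunk: it hard-codes 56 threads in three places (`OMP_NUM_THREADS=56`, `-C 0-55`, `-t 56`). The following is a minimal sketch, not part of this commit or of ITREX, of deriving all three from one thread count. It assumes cores `0..N-1` all sit on NUMA node 0, as that command does. The helper name `build_run_cmd` is hypothetical; it only prints the command line and leaves out the sampling flags (`--seed`, `-c`, `-b`, `-n`, `--repeat-penalty`, `-p`) for brevity.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of ITREX): print, but do not run, a
# main_gptneox command line whose thread settings agree with each other.
build_run_cmd() {
  local threads="$1"                 # number of cores to use, e.g. 56
  local model="$2"                   # path to the quantized model file
  local last_core=$((threads - 1))   # cores are numbered from 0

  # Assumption: cores 0..last_core belong to NUMA node 0 (-m 0),
  # mirroring the command in the README.
  printf 'OMP_NUM_THREADS=%s numactl -m 0 -C 0-%s ./build/bin/main_gptneox -m %s -t %s\n' \
    "$threads" "$last_core" "$model" "$threads"
}

# Example: reproduce the thread settings from the README command.
build_run_cmd 56 ne-q8.bin
```

On a machine with a different core count, pass that count instead of 56 and check the NUMA layout (for example with `numactl --hardware`) before trusting the `-m 0` assumption.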
