Create h2oGPT 40B based on tiiuae/falcon-40b #216

Closed
arnocandel opened this issue Jun 1, 2023 · 13 comments

arnocandel commented Jun 1, 2023

https://huggingface.co/tiiuae/falcon-40b is an Apache 2.0 model (can't use the -instruct variant, since it was trained on Alpaca).

RWForCausalLM(
  (transformer): RWModel(
    (word_embeddings): Embedding(65024, 8192)
    (h): ModuleList(
      (0-59): 60 x DecoderLayer(
        (ln_attn): LayerNorm((8192,), eps=1e-05, elementwise_affine=True)
        (ln_mlp): LayerNorm((8192,), eps=1e-05, elementwise_affine=True)
        (self_attention): Attention(
          (maybe_rotary): RotaryEmbedding()
          (query_key_value): Linear(in_features=8192, out_features=9216, bias=False)
          (dense): Linear(in_features=8192, out_features=8192, bias=False)
          (attention_dropout): Dropout(p=0.0, inplace=False)
        )
        (mlp): MLP(
          (dense_h_to_4h): Linear(in_features=8192, out_features=32768, bias=False)
          (act): GELU(approximate='none')
          (dense_4h_to_h): Linear(in_features=32768, out_features=8192, bias=False)
        )
      )
    )
    (ln_f): LayerNorm((8192,), eps=1e-05, elementwise_affine=True)
  )
  (lm_head): Linear(in_features=8192, out_features=65024, bias=False)
)
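For reference, the module tree above can be reproduced with a few lines of transformers code; a minimal sketch (not from the issue), assuming enough GPU memory and accelerate installed for device_map:

# Hedged sketch: load tiiuae/falcon-40b and print its module tree.
# The custom RWForCausalLM class ships with the model repo, hence trust_remote_code=True.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "tiiuae/falcon-40b"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    trust_remote_code=True,
    torch_dtype="auto",
    device_map="auto",  # shard the ~41B parameters across available GPUs (needs accelerate)
)
print(model)  # prints the RWForCausalLM(...) structure shown above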
arnocandel self-assigned this Jun 1, 2023

arnocandel commented Jun 1, 2023

(env) arno@rippa:/nfs4/llm/h2ogpt(main)$ CUDA_VISIBLE_DEVICES=0,1 torchrun --nproc_per_node=2 finetune.py --data_path=h2oai/openassistant_oasst1_h2ogpt_graded --drop_truncations=True --train_8bit=True --base_model=tiiuae/falcon-40b --micro_batch_size=1 --batch_size=128 --num_epochs=3 --run_id=6 &> log.6.txt

bin /nfs4/llm/h2o-llm/env/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda117.so
CUDA SETUP: CUDA runtime path found: /usr/local/cuda-11/lib64/libcudart.so.11.0
CUDA SETUP: Highest compute capability among GPUs detected: 8.9
CUDA SETUP: Detected CUDA version 117
CUDA SETUP: Loading binary /nfs4/llm/h2o-llm/env/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda117.so...
Loading checkpoint shards: 100%|██████████| 9/9 [03:35<00:00, 23.90s/it]
Loading checkpoint shards: 100%|██████████| 9/9 [03:35<00:00, 23.91s/it]
PeftModelForCausalLM(
  (base_model): LoraModel(
    (model): RWForCausalLM(
      (transformer): RWModel(
        (word_embeddings): Embedding(65024, 8192)
        (h): ModuleList(
          (0-59): 60 x DecoderLayer(
            (ln_attn): LayerNorm((8192,), eps=1e-05, elementwise_affine=True)
            (ln_mlp): LayerNorm((8192,), eps=1e-05, elementwise_affine=True)
            (self_attention): Attention(
              (maybe_rotary): RotaryEmbedding()
              (query_key_value): Linear8bitLt(
                in_features=8192, out_features=9216, bias=False
                (lora_dropout): ModuleDict(
                  (default): Dropout(p=0.05, inplace=False)
                )
                (lora_A): ModuleDict(
                  (default): Linear(in_features=8192, out_features=8, bias=False)
                )
                (lora_B): ModuleDict(
                  (default): Linear(in_features=8, out_features=9216, bias=False)
                )
                (lora_embedding_A): ParameterDict()
                (lora_embedding_B): ParameterDict()
              )
              (dense): Linear8bitLt(in_features=8192, out_features=8192, bias=False)
              (attention_dropout): Dropout(p=0.0, inplace=False)
            )
            (mlp): MLP(
              (dense_h_to_4h): Linear8bitLt(in_features=8192, out_features=32768, bias=False)
              (act): GELU(approximate='none')
              (dense_4h_to_h): Linear8bitLt(in_features=32768, out_features=8192, bias=False)
            )
          )
        )
        (ln_f): LayerNorm((8192,), eps=1e-05, elementwise_affine=True)
      )
      (lm_head): Linear(in_features=8192, out_features=65024, bias=False)
    )
  )
)
trainable params: 8355840 || all params: 41311649792 || trainable%: 0.020226352716656956
Using Validation Metrics: []
Supported Metrics: ['bleu', 'rouge', 'sacrebleu', 'meteor']
Auto set val_set_size 1000
Found cached dataset json (/home/arno/.cache/huggingface/datasets/h2oai___json/h2oai--openassistant_oasst1_h2ogpt_graded-29f03a61004f6aef/0.0.0/e347ab1c932092252e717ff3f949105a4dd28b27e842dd53157d2f72e276c2e4)
100%|██████████| 1/1 [00:00<00:00,  5.71it/s]
Tokenizing 30368 training rows
avoid keeping truncated cases to avoid contaminating model with truncation cases.  Original size: 30368
avoid keeping truncated cases to avoid contaminating model with truncation cases.  New size: 21583
Final fine-tuning data:
Train Dataset({
    features: ['source', 'grade_deberta', 'input', 'prompt_type', 'id', 'input_ids', 'token_type_ids', 'attention_mask', 'labels'],
    num_rows: 21583
})
Valid None
Sample input: {'source': ['OpenAssistant/oasst1'], 'grade_deberta': [0.4986453354358673], 'input': ['<human>: You obviously know yourself the best, but how do you believe you learn the best? Do you prefer large datasets all at once or would you rather have numerous small episodes of learning? Finally, as humans we are able to build new knowledge by leveraging what we already know, so are you able to do that easily or would you need to be re-trained/fine-tuned?\n<bot>: I think I learned. Best from numerous small episodes of learning. It feels like the most natural way. Understand the foundation, make a number of attempts, learn from the failures of those attempts, just continue to build on that.\n\n<human>: I think you should learn more about how to use punctuation)\n\n<bot>: Sorry for my bad use of punctuation here is an improved response:\nI think I learned best from numerous small episodes of learning. It feels like the most natural way. Understand the foundation, make a number of attempts and learn from the failures of those attempts. Just continue to build on that.\n\n<human>:'], 'prompt_type': ['plain'], 'id': [20695], 'token_type_ids': [[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]]}
No neptune configured, set NEPTUNE_API_TOKEN env var.
Auto set eval_steps to 25 out of 505 total training steps
Auto step save_steps to 25
You are adding a <class 'transformers.integrations.TensorBoardCallback'> to the callbacks of this Trainer, but there is already one. The currentlist of callbacks is
:DefaultFlowCallback
TensorBoardCallback
You're using a PreTrainedTokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding.
/nfs4/llm/h2o-llm/env/lib/python3.10/site-packages/bitsandbytes/autograd/_functions.py:318: UserWarning: MatMul8bitLt: inputs will be cast from torch.float32 to float16 during quantization
  warnings.warn(f"MatMul8bitLt: inputs will be cast from {A.dtype} to float16 during quantization")
  0%|          | 0/504 [00:00<?, ?it/s]

0%| | 1/504 [02:35<21:44:04, 155.56s/it]
OutOfMemoryError: CUDA out of memory. Tried to allocate 64.00 MiB (GPU 0; 47.47 GiB total capacity; 43.43 GiB already allocated; 63.81 MiB free; 45.14 GiB reserved in total by PyTorch)
i.e., OOM on the 2x A6000 Ada (48 GB each) setup.
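For context, a minimal sketch (assumptions, not the actual finetune.py code) of the 8-bit + LoRA setup that yields a PeftModelForCausalLM like the one printed above; r and dropout mirror the dump, lora_alpha is assumed:

from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# Load the base model with bitsandbytes 8-bit weights (Linear8bitLt layers, as printed above).
model = AutoModelForCausalLM.from_pretrained(
    "tiiuae/falcon-40b",
    load_in_8bit=True,
    trust_remote_code=True,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# LoRA on query_key_value only, matching the module dump (r=8, dropout=0.05; alpha is assumed).
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["query_key_value"],
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
# (8192*8 + 8*9216) per layer * 60 layers = 8,355,840 trainable params,
# matching "trainable params: 8355840" in the log.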


arnocandel commented Jun 1, 2023

  • confirm training works locally
  • prepare merging of the LoRA weights + foundation model into a plain HF checkpoint (see the sketch below)
  • prepare to train on 8x A100, with improved LoRA coverage (adapt more layers)
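For the merge step, folding the LoRA adapter back into the foundation model could look roughly like this (a sketch with placeholder paths, not the repo's actual export script):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Merge in 16-bit so the result is a plain HF checkpoint without bitsandbytes layers.
base = AutoModelForCausalLM.from_pretrained(
    "tiiuae/falcon-40b",
    trust_remote_code=True,
    torch_dtype=torch.float16,
)
model = PeftModel.from_pretrained(base, "path/to/lora-checkpoint")  # placeholder path
model = model.merge_and_unload()  # fold the LoRA deltas into the base weights

tokenizer = AutoTokenizer.from_pretrained("tiiuae/falcon-40b")
model.save_pretrained("h2ogpt-falcon-40b-merged")      # placeholder output dir
tokenizer.save_pretrained("h2ogpt-falcon-40b-merged")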


arnocandel commented Jun 1, 2023

Improved LoRA coverage:

CUDA_VISIBLE_DEVICES=0,1 torchrun --nproc_per_node=2 finetune.py --data_path=h2oai/openassistant_oasst1_h2ogpt_graded --drop_truncations=True --train_8bit=True --base_model=tiiuae/falcon-40b --micro_batch_size=1 --batch_size=128 --num_epochs=1 --run_id=7 --lora_target_modules='["query_key_value", "dense_h_to_4h", "dense_4h_to_h", "dense"]' &> log.7.txt

PeftModelForCausalLM(
  (base_model): LoraModel(
    (model): RWForCausalLM(
      (transformer): RWModel(
        (word_embeddings): Embedding(65024, 8192)
        (h): ModuleList(
          (0-59): 60 x DecoderLayer(
            (ln_attn): LayerNorm((8192,), eps=1e-05, elementwise_affine=True)
            (ln_mlp): LayerNorm((8192,), eps=1e-05, elementwise_affine=True)
            (self_attention): Attention(
              (maybe_rotary): RotaryEmbedding()
              (query_key_value): Linear8bitLt(
                in_features=8192, out_features=9216, bias=False
                (lora_dropout): ModuleDict(
                  (default): Dropout(p=0.05, inplace=False)
                )
                (lora_A): ModuleDict(
                  (default): Linear(in_features=8192, out_features=8, bias=False)
                )
                (lora_B): ModuleDict(
                  (default): Linear(in_features=8, out_features=9216, bias=False)
                )
                (lora_embedding_A): ParameterDict()
                (lora_embedding_B): ParameterDict()
              )
              (dense): Linear8bitLt(
                in_features=8192, out_features=8192, bias=False
                (lora_dropout): ModuleDict(
                  (default): Dropout(p=0.05, inplace=False)
                )
                (lora_A): ModuleDict(
                  (default): Linear(in_features=8192, out_features=8, bias=False)
                )
                (lora_B): ModuleDict(
                  (default): Linear(in_features=8, out_features=8192, bias=False)
                )
                (lora_embedding_A): ParameterDict()
                (lora_embedding_B): ParameterDict()
              )
              (attention_dropout): Dropout(p=0.0, inplace=False)
            )
            (mlp): MLP(
              (dense_h_to_4h): Linear8bitLt(
                in_features=8192, out_features=32768, bias=False
                (lora_dropout): ModuleDict(
                  (default): Dropout(p=0.05, inplace=False)
                )
                (lora_A): ModuleDict(
                  (default): Linear(in_features=8192, out_features=8, bias=False)
                )
                (lora_B): ModuleDict(
                  (default): Linear(in_features=8, out_features=32768, bias=False)
                )
                (lora_embedding_A): ParameterDict()
                (lora_embedding_B): ParameterDict()
              )
              (act): GELU(approximate='none')
              (dense_4h_to_h): Linear8bitLt(
                in_features=32768, out_features=8192, bias=False
                (lora_dropout): ModuleDict(
                  (default): Dropout(p=0.05, inplace=False)
                )
                (lora_A): ModuleDict(
                  (default): Linear(in_features=32768, out_features=8, bias=False)
                )
                (lora_B): ModuleDict(
                  (default): Linear(in_features=8, out_features=8192, bias=False)
                )
                (lora_embedding_A): ParameterDict()
                (lora_embedding_B): ParameterDict()
              )
            )
          )
        )
        (ln_f): LayerNorm((8192,), eps=1e-05, elementwise_affine=True)
      )
      (lm_head): Linear(in_features=8192, out_features=65024, bias=False)
    )
  )
)
trainable params: 55541760 || all params: 41358835712 || trainable%: 0.13429236835089367

1%| | 2/168 [05:06<7:03:23, 153.03s/it]
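The only change versus run 6 is the wider --lora_target_modules list; as a LoraConfig this corresponds roughly to (alpha again assumed):

from peft import LoraConfig

lora_config = LoraConfig(
    r=8,
    lora_alpha=16,  # assumed
    lora_dropout=0.05,
    target_modules=["query_key_value", "dense_h_to_4h", "dense_4h_to_h", "dense"],
    bias="none",
    task_type="CAUSAL_LM",
)
# Per layer: qkv (8192*8 + 8*9216) + dense (8192*8 + 8*8192)
#          + dense_h_to_4h (8192*8 + 8*32768) + dense_4h_to_h (32768*8 + 8*8192) = 925,696
# * 60 layers = 55,541,760 trainable params (~0.134% of 41.36B), matching the log above.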


arnocandel commented Jun 1, 2023

8x A100 80GB tiiuae/falcon-40b + oasst1_h2ogpt_graded INSTRUCT TUNING (4-bit)

Note: failed with OOM for --train_8bit=True, maybe still the PEFT memory overuse bug? https://github.com/huggingface/peft.git@207d2908650f3f4f3ba0e21d243c1b2aee66e72d

torchrun --nproc_per_node=8 finetune.py --data_path=h2oai/openassistant_oasst1_h2ogpt_graded --drop_truncations=True --train_4bit=True --base_model=tiiuae/falcon-40b --micro_batch_size=1 --batch_size=32 --num_epochs=3 --lora_target_modules='["query_key_value", "dense_h_to_4h", "dense_4h_to_h", "dense"]' --run_id=8 &> log.8.txt
0%| | 1/504 [00:53<7:31:49, 53.90s/it]
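The switch from --train_8bit to --train_4bit corresponds roughly to a bitsandbytes 4-bit (QLoRA-style) quantization config when loading the base model; a sketch, the exact settings used in finetune.py may differ:

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import prepare_model_for_kbit_training

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",          # assumed QLoRA defaults
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "tiiuae/falcon-40b",
    quantization_config=bnb_config,
    trust_remote_code=True,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)
# ...then apply the same LoraConfig with the four target modules as above.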

(image attachment)
https://slack-files.com/T0329MHH6-F05AGKHEW85-a7ce922a1a lora weights, checkpoint and logs

https://huggingface.co/h2oai/h2ogpt-oasst1-falcon-40b


arnocandel commented Jun 1, 2023

8x A100 80GB tiiuae/falcon-40b + h2ogpt-oig-oasst1-instruct-cleaned-v3 INSTRUCT TUNING (4-bit)

torchrun --nproc_per_node=8 finetune.py --data_path=h2oai/h2ogpt-oig-oasst1-instruct-cleaned-v3 --drop_truncations=True --train_4bit=True --base_model=tiiuae/falcon-40b --micro_batch_size=2 --batch_size=64 --num_epochs=3 --lora_target_modules='["query_key_value", "dense_h_to_4h", "dense_4h_to_h", "dense"]' --run_id=10 &> log.10.txt
1%| | 128/12213 [18:10<28:15:09, 8.42s/it]
(image attachment)
https://huggingface.co/h2oai/h2ogpt-oig-oasst1-falcon-40b


arnocandel commented Jun 2, 2023

CUDA_VISIBLE_DEVICES=0 python generate.py --base_model=h2oai/h2ogpt-oasst1-falcon-40b --chat=False --stream_output=False --gradio=False --eval_sharegpt_prompts_only=500 --eval_sharegpt_as_output=False --num_beams=2 --infer_devices=False --load_4bit=True --debug

OOM
(image attachment)

16-bit (~80 GB of weights) across 2x 48 GB + 1x 24 GB cards:
CUDA_VISIBLE_DEVICES=0,1,2 python generate.py --base_model=h2oai/h2ogpt-oasst1-falcon-40b --chat=False --stream_output=False --gradio=False --eval_sharegpt_prompts_only=500 --eval_sharegpt_as_output=False --num_beams=2 --infer_devices=False --debug
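Sharding the ~80 GB of 16-bit weights over unequal cards relies on accelerate's device_map; a rough sketch (the max_memory budget is illustrative, not taken from generate.py):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "h2oai/h2ogpt-oasst1-falcon-40b"
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    trust_remote_code=True,
    torch_dtype=torch.float16,
    device_map="auto",
    max_memory={0: "46GiB", 1: "46GiB", 2: "22GiB"},  # leave headroom for activations/KV cache
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "<human>: What is H2O.ai?\n<bot>:"             # prompt format used in training
inputs = tokenizer(prompt, return_tensors="pt").to(0)   # inputs go to the first device
out = model.generate(**inputs, max_new_tokens=128, num_beams=2)
print(tokenizer.decode(out[0], skip_special_tokens=True))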


arnocandel commented Jun 6, 2023

8xA100 Eval suite

https://github.com/EleutherAI/lm-evaluation-harness
4b701e228768052cfae9043dca13e82052ca5eea

diff --git a/lm_eval/models/huggingface.py b/lm_eval/models/huggingface.py
index 4d3aa24..5e4257e 100644
--- a/lm_eval/models/huggingface.py
+++ b/lm_eval/models/huggingface.py
@@ -76,10 +76,10 @@ class HuggingFaceAutoLM(BaseLM):
         subfolder: Optional[str] = None,
         revision: Optional[str] = "main",
         batch_size: Optional[Union[int, str]] = 1,
-        max_gen_toks: Optional[int] = 256,
+        max_gen_toks: Optional[int] = 512,
         max_length: Optional[int] = None,
         add_special_tokens: Optional[bool] = None,
-        use_accelerate: Optional[bool] = False,
+        use_accelerate: Optional[bool] = True,
         device_map_option: Optional[str] = "auto",
         max_memory_per_gpu: Optional[Union[int, str]] = None,
         max_cpu_memory: Optional[Union[int, str]] = None,
@@ -89,7 +89,7 @@ class HuggingFaceAutoLM(BaseLM):
         peft: str = None,
         load_in_8bit: Optional[bool] = False,
         load_in_4bit: Optional[bool] = False,
-        trust_remote_code: Optional[bool] = False,
+        trust_remote_code: Optional[bool] = True,
         gptq_use_triton: Optional[bool] = False,
     ):
         """Initializes a HuggingFace `AutoModel` and `AutoTokenizer` for evaluation.

CUDA_VISIBLE_DEVICES=0,1 python main.py --model hf-causal-experimental --model_args pretrained=h2oai/h2ogpt-oig-oasst1-falcon-40b --tasks openbookqa,arc_easy,winogrande,hellaswag,arc_challenge,piqa,boolq --device cuda &> h2ogpt-oig-oasst1-falcon-40b.16bit.eval.log

CUDA_VISIBLE_DEVICES=2,3 python main.py --model hf-causal-experimental --model_args pretrained=h2oai/h2ogpt-oasst1-falcon-40b --tasks openbookqa,arc_easy,winogrande,hellaswag,arc_challenge,piqa,boolq --device cuda &> h2ogpt-oasst1-falcon-40b.16bit.eval.log

logs.zip


arnocandel commented Jun 7, 2023

8xA100 ShareGPT Eval 40B

CUDA_VISIBLE_DEVICES=4,5 python generate.py --base_model=h2oai/h2ogpt-oig-oasst1-falcon-40b --chat=False --stream_output=False --gradio=False --eval_prompts_only_num=500 --eval_as_output=False --num_beams=2 --infer_devices=False --debug --max_new_tokens=512 &> h2ogpt-oig-oasst1-falcon-40b.sharegpt.log

CUDA_VISIBLE_DEVICES=6,7 python generate.py --base_model=h2oai/h2ogpt-oasst1-falcon-40b --chat=False --stream_output=False --gradio=False --eval_prompts_only_num=500 --eval_as_output=False --num_beams=2 --infer_devices=False --debug --max_new_tokens=512 &> h2ogpt-oasst1-falcon-40b.sharegpt.log

(image attachment)


arnocandel commented Jun 7, 2023

1x A6000 Ada ShareGPT Eval 40B 4bit

CUDA_VISIBLE_DEVICES=0 python generate.py --base_model=h2oai/h2ogpt-oig-oasst1-falcon-40b --chat=False --stream_output=False --gradio=False --eval_prompts_only_num=500 --eval_as_output=False --num_beams=2 --infer_devices=False --debug --max_new_tokens=512 --load_4bit=True &> h2ogpt-oasst1-falcon-40b.4bit.sharegpt.log

OOM still
h2ogpt-oasst1-falcon-40b.4bit.sharegpt.log
df_scores_500_500_1234_False_h2ogpt-oig-oasst1-falcon-40b_


arnocandel commented Jun 22, 2023

Attempt to improve h2oGPT 40B slightly, based on findings from h2ogpt-gm models

Changes:

  • 1 epoch instead of 3, but use the larger dataset again (no grading)
  • increase cutoff length to 2048, so nothing gets dropped (see the sketch after this list)
  • increase LoRA alpha/r/dropout
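What --cutoff_len and --drop_truncations do together, as a rough illustration (the real logic lives in finetune.py; names below are only for the sketch):

# Drop any training row whose tokenized length exceeds cutoff_len instead of truncating it,
# so truncated conversations never contaminate the training data.
cutoff_len = 2048

def fits(example, tokenizer):
    # keep only rows that fit entirely within the context window
    return len(tokenizer(example["input"])["input_ids"]) <= cutoff_len

# dataset = dataset.filter(lambda ex: fits(ex, tokenizer))
# With the earlier (smaller) cutoff, the graded OASST1 set shrank from 30368 to 21583 rows;
# at 2048 (almost) nothing should be dropped.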

CUDA_VISIBLE_DEVICES=4,5,6,7 torchrun --nproc_per_node=4 finetune.py --data_path=h2oai/openassistant_oasst1_h2ogpt --cutoff_len=2048 --drop_truncations=True --train_4bit=True --base_model=tiiuae/falcon-40b --micro_batch_size=1 --batch_size=32 --num_epochs=1 --lora_alpha=32 --lora_r=16 --lora_dropout=0.1 --lora_target_modules='["query_key_value", "dense_h_to_4h", "dense_4h_to_h", "dense"]' --run_id=9 &> log.9.txt
4%|▎ | 52/1483 [15:58<7:10:12, 18.04s/it]

https://huggingface.co/h2oai/h2ogpt-oasst1-2048-falcon-40b


arnocandel commented Jun 23, 2023

Eval Suite

same as #216 (comment)
CUDA_VISIBLE_DEVICES=6,7 python main.py --model hf-causal-experimental --model_args pretrained=h2oai/h2ogpt-oasst1-2048-falcon-40b --tasks openbookqa,arc_easy,winogrande,hellaswag,arc_challenge,piqa,boolq --device cuda &> h2ogpt-oasst1-2048-falcon-40b.16bit.eval.log
h2ogpt-oasst1-2048-falcon-40b.16bit.eval.log

hf-causal-experimental (pretrained=h2oai/h2ogpt-oasst1-2048-falcon-40b), limit: None, provide_description: False, num_fewshot: 0, batch_size: None
|    Task     |Version| Metric |Value |   |Stderr|
|-------------|------:|--------|-----:|---|-----:|
|arc_challenge|      0|acc     |0.4940|±  |0.0146|
|             |       |acc_norm|0.5307|±  |0.0146|
|arc_easy     |      0|acc     |0.8106|±  |0.0080|
|             |       |acc_norm|0.7748|±  |0.0086|
|boolq        |      1|acc     |0.8266|±  |0.0066|
|hellaswag    |      0|acc     |0.6464|±  |0.0048|
|             |       |acc_norm|0.8267|±  |0.0038|
|openbookqa   |      0|acc     |0.3520|±  |0.0214|
|             |       |acc_norm|0.4720|±  |0.0223|
|piqa         |      0|acc     |0.8156|±  |0.0090|
|             |       |acc_norm|0.8384|±  |0.0086|
|winogrande   |      0|acc     |0.7774|±  |0.0117|


arnocandel commented Jun 23, 2023

maybe DOA (dead on arrival), or maybe 1 epoch just isn't enough for proper personalization, as also seen for the gm models
(image attachment)
