# Most popular open weights LLMs - 2023

## Dependencies

In [None]:
pip install --upgrade transformers

In [3]:
import transformers
transformers.__version__

'4.36.2'

## Models dictionary

### 3B parameters

| Date | Name | URL | Params | Context | Train tokens | License |
| --- | --- | --- | --- | --- | --- | --- |
| 2023/05/04 | redpajama_3b | https://huggingface.co/togethercomputer/RedPajama-INCITE-Base-3B-v1 | 2.8 B | 2048 | 800 B | Apache 2.0 |
| 2023/07/14 | btlm_3b | https://huggingface.co/cerebras/btlm-3b-8k-base | 3 B | 8192 | 627 B | Apache 2.0 |
| 2023/07/16 | openllama_3b | https://huggingface.co/openlm-research/open_llama_3b_v2 | 3 B | 2048 | 1 T | Apache 2.0 |
| 2023/09/29 | stablelm_3b | https://huggingface.co/stabilityai/stablelm-3b-4e1t | 3 B | 4096 | 1 T | CC BY-SA-4.0 |
| 2023/12/13 | phi2_3b | https://huggingface.co/microsoft/phi-2 | 2.7 B | 2048 | 1.4 T | microsoft-research-license |

### 7B parameters

| Date | Name | URL | Params | Context | Train tokens | License |
| --- | --- | --- | --- | --- | --- | --- |
| 2023/04/24 | falcon_7b | https://huggingface.co/tiiuae/falcon-7b | 7 B | 2048 | 1.5 T | Apache 2.0 |
| 2023/05/04 | redpajama_7b | https://huggingface.co/togethercomputer/RedPajama-INCITE-7B-Base | 6.9 B | 2048 | 1 T | Apache 2.0 |
| 2023/05/05 | mpt_7b | https://huggingface.co/mosaicml/mpt-7b | 6.7 B | 2048 | 1 T | Apache 2.0 |
| 2023/06/30 | mpt_7b_8k | https://huggingface.co/mosaicml/mpt-7b-8k | 6.7 B | 8192 | 1.5 T | Apache 2.0 |
| 2023/07/06 | openllama_7b | https://huggingface.co/openlm-research/open_llama_7b_v2 | 7 B | 2048 | 1 T | Apache 2.0 |
| 2023/07/26 | llama2_7b_32k | https://huggingface.co/togethercomputer/LLaMA-2-7B-32K | 7 B | 32768 | fine-tuned | llama2 |
| 2023/09/20 | mistral_7b | https://huggingface.co/mistralai/Mistral-7B-v0.1 | 7.3 B | 8192 | ?? | Apache 2.0 |
| 2023/11/01 | yi_6b | https://huggingface.co/01-ai/Yi-6B | 6 B | 4096 | 3 T | yi-license |





### 40B parameters

In [23]:
models = {
    "redpajama_3b" : "togethercomputer/RedPajama-INCITE-Base-3B-v1", # 5.30 GB
    "btlm_3b" : "cerebras/btlm-3b-8k-base", # 4.93 GB
    "stablelm_3b" : "stabilityai/stablelm-3b-4e1t", # 5.21 GB
    "openllama_3b" : "openlm-research/open_llama_3b_v2", # 6.38 GB
    "phi2_3b" : "microsoft/phi-2", # 5.18 GB

    "yi_6b" : "01-ai/Yi-6B", # 11.29 GB
    "mistral_7b" : "mistralai/Mistral-7B-v0.1", # 13.49 GB
    "mpt_7b" : "mosaicml/mpt-7b",
    "falcon_7b" : "tiiuae/falcon-7b",
    "redpajama_7b" : "togethercomputer/RedPajama-INCITE-7B-Base",
    "llama2_7b_32k" : "togethercomputer/LLaMA-2-7B-32K",
    "openllama_7b" : "openlm-research/open_llama_7b_v2",
    "mpt_7b_8k" : "mosaicml/mpt-7b-8k",
    "qwen_7b" : "Qwen/Qwen-7B",
    "llama2_7b" : "meta-llama/Llama-2-7b-hf",
    "bloomz_7b" : "bigscience/bloomz-7b1-mt",
    "decilm_7b" : "Deci/DeciLM-7B",

    "solar_10b" : "upstage/SOLAR-10.7B-v1.0",
    "llama2_13b" : "meta-llama/Llama-2-13b-hf",
    "openllama_13b" : "openlm-research/open_llama_13b",
    "qwen_14b" : "Qwen/Qwen-14B",

    "mpt_30b" : "mosaicml/mpt-30b",
    "yi_34b" : "TheBloke/Yi-34B-GPTQ",
    "falcon_40b" : "TheBloke/falcon-40b-instruct-GPTQ",
    "mixtral_8x7B" : "TheBloke/Mixtral-8x7B-v0.1-GPTQ"
}
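
The sizes noted as comments in the dictionary can be sanity-checked against the parameter counts listed in the tables. The sketch below is only a rough estimate: it assumes every weight is stored in 16 bits (2 bytes), which is not stated in this notebook, and it ignores tokenizer and configuration files. The function name `estimate_checkpoint_size_gib` is made up for this illustration. It uses the same 1024^3 unit as the measurements in this notebook.

```python
GIB = 1024 ** 3  # same unit the notebook uses when it prints "GB"

def estimate_checkpoint_size_gib(num_params: float, bytes_per_param: int = 2) -> float:
    """Rough checkpoint size, ASSUMING each weight takes bytes_per_param bytes."""
    return num_params * bytes_per_param / GIB

# Parameter counts taken from the tables above
for name, params in [("redpajama_3b", 2.8e9), ("yi_6b", 6e9), ("mistral_7b", 7.3e9)]:
    print(f"{name}: ~{estimate_checkpoint_size_gib(params):.2f} GB")
```

A measured size well above the estimate may simply mean the rounded parameter count in the table is low, or that the checkpoint is stored at a higher precision than assumed here.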

## Models download

To access gated Hugging Face repositories:

1. Log in to your Hugging Face account, go to https://huggingface.co/settings/tokens, create a READ access token, and copy it.
2. Paste the access token into the local file /workspace/hftoken.
3. Load it in Python with the code below.

In [28]:
with open("/workspace/hftoken", 'r') as file:
    myhftoken = file.read().strip()
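
The cell above fails with `FileNotFoundError` when the token file is missing. As a possible variant, the sketch below returns `None` instead, which leaves only public repositories reachable, and it checks an environment variable first. The helper name `read_hf_token` and the choice of `HF_TOKEN` as the variable name are assumptions made for this example, not part of the original notebook.

```python
import os
from pathlib import Path
from typing import Optional

def read_hf_token(token_path: str = "/workspace/hftoken",
                  env_var: str = "HF_TOKEN") -> Optional[str]:
    """Return the token from the environment if set, else from the file, else None."""
    env_value = os.environ.get(env_var, "").strip()
    if env_value:
        return env_value
    path = Path(token_path)
    if path.is_file():
        # An empty file is treated the same as a missing token
        return path.read_text().strip() or None
    return None

myhftoken = read_hf_token()
print("token found" if myhftoken else "no token: only public repositories will be accessible")
```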

Download all models into the local Hugging Face cache and measure the size of each model's files on disk:

In [20]:
import os
from transformers import AutoModelForCausalLM, AutoTokenizer
from transformers.utils.hub import cached_file

memory_unit_mb = 1024*1024
memory_unit_gb = 1024*1024*1024

def get_directory_size(directory):
    total_size = 0
    for dirpath, dirnames, filenames in os.walk(directory):
        for f in filenames:
            fp = os.path.join(dirpath, f)
            total_size += os.path.getsize(fp)
    return total_size

def get_model_path_and_size_on_disk(model):
    # Locate the cached snapshot directory through the model's config.json
    model_config_file = cached_file(model.name_or_path, "config.json", local_files_only=True)
    model_directory = os.path.dirname(model_config_file)
    total_size = get_directory_size(model_directory)
    return model_directory, total_size

def download_in_local_cache(pretrained_model_id, **kwargs):
    print(f"Loading model {pretrained_model_id} in local cache ...")
    # Loading the tokenizer and the model triggers the download of their files
    AutoTokenizer.from_pretrained(pretrained_model_id, **kwargs)
    model = AutoModelForCausalLM.from_pretrained(pretrained_model_id, device_map="meta", **kwargs)
    path, size = get_model_path_and_size_on_disk(model)
    print(f"--> model files size   : {(size/memory_unit_gb):.2f} GB")
    print(f"--> stored in directory: {path}")
    print()
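
Several repositories ship custom code, and the library warns that such files can change and suggests pinning a revision. The snapshot directory printed by `download_in_local_cache` ends with the commit hash that was downloaded, so that hash can be recovered and reused. The sketch below is one way to do it; `revision_from_snapshot_path` is a hypothetical helper, not an API of the library.

```python
import os

def revision_from_snapshot_path(snapshot_dir: str) -> str:
    """Return the commit hash stored as the last component of a cache snapshot path."""
    parent, revision = os.path.split(os.path.normpath(snapshot_dir))
    if os.path.basename(parent) != "snapshots":
        raise ValueError(f"not a snapshot directory: {snapshot_dir}")
    return revision

# Path as printed by the download helper for cerebras/btlm-3b-8k-base
path = ("/models/huggingface/transformers/models--cerebras--btlm-3b-8k-base"
        "/snapshots/2f325501c4db6464d4fe03c84c3a394197865690")
pinned = revision_from_snapshot_path(path)
print(pinned)

# The hash could then be forwarded through the helper's **kwargs, for example:
# download_in_local_cache("cerebras/btlm-3b-8k-base", revision=pinned,
#                         trust_remote_code=True, token=myhftoken)
```

Note that `get_model_path_and_size_on_disk` does not forward a revision to `cached_file`, so it still resolves the default branch; it would need the same argument if the pinned commit differs from it.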

In [None]:
for model_key in models:
    download_in_local_cache(models[model_key], trust_remote_code=True, token=myhftoken)
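
With this many repositories, a single failure (a gated model without access, a network error, a missing dependency for custom code) stops the loop above. A possible alternative, sketched here under the assumption that continuing past errors is acceptable, records failures and moves on. `download_all` and the stand-in `fake_download` are illustrative names only.

```python
from typing import Callable, Dict

def download_all(models: Dict[str, str], download_fn: Callable[..., None],
                 **kwargs) -> Dict[str, str]:
    """Call download_fn for every model id and return {model_key: error message}."""
    failures: Dict[str, str] = {}
    for model_key, model_id in models.items():
        try:
            download_fn(model_id, **kwargs)
        except Exception as exc:  # deliberately broad: keep going whatever the cause
            failures[model_key] = f"{type(exc).__name__}: {exc}"
            print(f"!! {model_key} failed: {failures[model_key]}")
    return failures

# Stand-in downloader used only to demonstrate the control flow
def fake_download(model_id: str, **kwargs) -> None:
    if "gated" in model_id:
        raise PermissionError("access denied")

demo_failures = download_all({"a": "org/open-model", "b": "org/gated-model"}, fake_download)
print(demo_failures)
```

In the notebook, the call would look like `download_all(models, download_in_local_cache, trust_remote_code=True, token=myhftoken)`.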

Loading model togethercomputer/RedPajama-INCITE-Base-3B-v1 in local cache ...


Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.


--> model files size   : 5.30 GB
--> stored in directory: /models/huggingface/transformers/models--togethercomputer--RedPajama-INCITE-Base-3B-v1/snapshots/094fbdd0c911feb485ce55de1952ab2e75277e1e

Loading model cerebras/btlm-3b-8k-base in local cache ...


A new version of the following files was downloaded from https://huggingface.co/cerebras/btlm-3b-8k-base:
- configuration_btlm.py
. Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.
A new version of the following files was downloaded from https://huggingface.co/cerebras/btlm-3b-8k-base:
- modeling_btlm.py
. Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.


--> model files size   : 4.93 GB
--> stored in directory: /models/huggingface/transformers/models--cerebras--btlm-3b-8k-base/snapshots/2f325501c4db6464d4fe03c84c3a394197865690

Loading model stabilityai/stablelm-3b-4e1t in local cache ...


A new version of the following files was downloaded from https://huggingface.co/stabilityai/stablelm-3b-4e1t:
- configuration_stablelm_epoch.py
. Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.
A new version of the following files was downloaded from https://huggingface.co/stabilityai/stablelm-3b-4e1t:
- modeling_stablelm_epoch.py
. Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.
You are using the default legacy behaviour of the <class 'transformers.models.llama.tokenization_llama.LlamaTokenizer'>. This is expected, and simply means that the `legacy` (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set `legacy=False`. This should only be set if you understand what it means, and thoroughly read the reason why this was added as explained in h

--> model files size   : 5.21 GB
--> stored in directory: /models/huggingface/transformers/models--stabilityai--stablelm-3b-4e1t/snapshots/c6554ba60f40a8252d2a43e38e55ee2e3a645813

Loading model openlm-research/open_llama_3b_v2 in local cache ...


config.json:   0%|          | 0.00/506 [00:00<?, ?B/s]

pytorch_model.bin:   0%|          | 0.00/6.85G [00:00<?, ?B/s]

generation_config.json:   0%|          | 0.00/137 [00:00<?, ?B/s]

--> model files size   : 6.38 GB
--> stored in directory: /models/huggingface/transformers/models--openlm-research--open_llama_3b_v2/snapshots/bce5d60d3b0c68318862270ec4e794d83308d80a

Loading model microsoft/phi-2 in local cache ...


Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.


config.json:   0%|          | 0.00/755 [00:00<?, ?B/s]

configuration_phi.py:   0%|          | 0.00/2.03k [00:00<?, ?B/s]

A new version of the following files was downloaded from https://huggingface.co/microsoft/phi-2:
- configuration_phi.py
. Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.


modeling_phi.py:   0%|          | 0.00/33.4k [00:00<?, ?B/s]

A new version of the following files was downloaded from https://huggingface.co/microsoft/phi-2:
- modeling_phi.py
. Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.


model.safetensors.index.json:   0%|          | 0.00/24.3k [00:00<?, ?B/s]

Downloading shards:   0%|          | 0/2 [00:00<?, ?it/s]

model-00001-of-00002.safetensors:   0%|          | 0.00/4.98G [00:00<?, ?B/s]

model-00002-of-00002.safetensors:   0%|          | 0.00/577M [00:00<?, ?B/s]

Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

generation_config.json:   0%|          | 0.00/69.0 [00:00<?, ?B/s]

--> model files size   : 5.18 GB
--> stored in directory: /models/huggingface/transformers/models--microsoft--phi-2/snapshots/d3186761bf5c4409f7679359284066c25ab668ee

Loading model 01-ai/Yi-6B in local cache ...


config.json:   0%|          | 0.00/605 [00:00<?, ?B/s]

model.safetensors.index.json:   0%|          | 0.00/23.9k [00:00<?, ?B/s]

Downloading shards:   0%|          | 0/2 [00:00<?, ?it/s]

model-00001-of-00002.safetensors:   0%|          | 0.00/9.94G [00:00<?, ?B/s]

model-00002-of-00002.safetensors:   0%|          | 0.00/2.18G [00:00<?, ?B/s]

Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

generation_config.json:   0%|          | 0.00/132 [00:00<?, ?B/s]

--> model files size   : 11.29 GB
--> stored in directory: /models/huggingface/transformers/models--01-ai--Yi-6B/snapshots/b881162e08d0fa65011cb53f2c51544e1b623112

Loading model mistralai/Mistral-7B-v0.1 in local cache ...


model.safetensors.index.json:   0%|          | 0.00/25.1k [00:00<?, ?B/s]

Downloading shards:   0%|          | 0/2 [00:00<?, ?it/s]

model-00001-of-00002.safetensors:   0%|          | 0.00/9.94G [00:00<?, ?B/s]

model-00002-of-00002.safetensors:   0%|          | 0.00/4.54G [00:00<?, ?B/s]

Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.


--> model files size   : 13.49 GB
--> stored in directory: /models/huggingface/transformers/models--mistralai--Mistral-7B-v0.1/snapshots/26bca36bde8333b5d7f72e9ed20ccda6a618af24

Loading model mosaicml/mpt-7b in local cache ...


config.json:   0%|          | 0.00/1.23k [00:00<?, ?B/s]

configuration_mpt.py:   0%|          | 0.00/11.0k [00:00<?, ?B/s]

A new version of the following files was downloaded from https://huggingface.co/mosaicml/mpt-7b:
- configuration_mpt.py
. Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.


modeling_mpt.py:   0%|          | 0.00/20.1k [00:00<?, ?B/s]

adapt_tokenizer.py:   0%|          | 0.00/1.72k [00:00<?, ?B/s]

A new version of the following files was downloaded from https://huggingface.co/mosaicml/mpt-7b:
- adapt_tokenizer.py
. Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.


meta_init_context.py:   0%|          | 0.00/3.96k [00:00<?, ?B/s]

A new version of the following files was downloaded from https://huggingface.co/mosaicml/mpt-7b:
- meta_init_context.py
. Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.


norm.py:   0%|          | 0.00/3.12k [00:00<?, ?B/s]

A new version of the following files was downloaded from https://huggingface.co/mosaicml/mpt-7b:
- norm.py
. Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.


attention.py:   0%|          | 0.00/21.6k [00:00<?, ?B/s]

flash_attn_triton.py:   0%|          | 0.00/28.2k [00:00<?, ?B/s]

A new version of the following files was downloaded from https://huggingface.co/mosaicml/mpt-7b:
- flash_attn_triton.py
. Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.


fc.py:   0%|          | 0.00/167 [00:00<?, ?B/s]

A new version of the following files was downloaded from https://huggingface.co/mosaicml/mpt-7b:
- fc.py
. Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.
A new version of the following files was downloaded from https://huggingface.co/mosaicml/mpt-7b:
- attention.py
- flash_attn_triton.py
- fc.py
. Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.


hf_prefixlm_converter.py:   0%|          | 0.00/10.5k [00:00<?, ?B/s]

A new version of the following files was downloaded from https://huggingface.co/mosaicml/mpt-7b:
- hf_prefixlm_converter.py
. Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.


blocks.py:   0%|          | 0.00/2.84k [00:00<?, ?B/s]

ffn.py:   0%|          | 0.00/1.75k [00:00<?, ?B/s]

A new version of the following files was downloaded from https://huggingface.co/mosaicml/mpt-7b:
- ffn.py
. Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.
A new version of the following files was downloaded from https://huggingface.co/mosaicml/mpt-7b:
- blocks.py
- ffn.py
. Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.


param_init_fns.py:   0%|          | 0.00/11.9k [00:00<?, ?B/s]

A new version of the following files was downloaded from https://huggingface.co/mosaicml/mpt-7b:
- param_init_fns.py
. Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.


custom_embedding.py:   0%|          | 0.00/292 [00:00<?, ?B/s]

A new version of the following files was downloaded from https://huggingface.co/mosaicml/mpt-7b:
- custom_embedding.py
. Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.
A new version of the following files was downloaded from https://huggingface.co/mosaicml/mpt-7b:
- modeling_mpt.py
- adapt_tokenizer.py
- meta_init_context.py
- norm.py
- attention.py
- hf_prefixlm_converter.py
- blocks.py
- param_init_fns.py
- custom_embedding.py
. Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.


pytorch_model.bin.index.json:   0%|          | 0.00/16.0k [00:00<?, ?B/s]

Downloading shards:   0%|          | 0/2 [00:00<?, ?it/s]

pytorch_model-00001-of-00002.bin:   0%|          | 0.00/9.94G [00:00<?, ?B/s]