## Note: Use GPU as your runtime Hardware accelerator

### How to do that?

> 1. Click on Runtime
> 2. Select Change Runtime type
> 3. Choose T4 GPU as your Hardware accelerator

In [1]:
!pip install transformers ctransformers accelerate

Collecting ctransformers
  Downloading ctransformers-0.2.27-py3-none-any.whl.metadata (17 kB)
Collecting py-cpuinfo<10.0.0,>=9.0.0 (from ctransformers)
  Downloading py_cpuinfo-9.0.0-py3-none-any.whl.metadata (794 bytes)
Downloading ctransformers-0.2.27-py3-none-any.whl (9.9 MB)
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m9.9/9.9 MB[0m [31m997.0 kB/s[0m eta [36m0:00:00[0m:01[0m00:01[0m0m
[?25hDownloading py_cpuinfo-9.0.0-py3-none-any.whl (22 kB)
Installing collected packages: py-cpuinfo, ctransformers
Successfully installed ctransformers-0.2.27 py-cpuinfo-9.0.0


In [2]:
!pip install sentencepiece



In [3]:
import torch

In [4]:
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
device

device(type='cuda')

## StableLM

StableLM-Base-Alpha-3B-v2 is a 3 billion parameter decoder-only language model pre-trained on diverse English datasets. This model is the successor to the first StableLM-Base-Alpha-3B model, addressing previous shortcomings through the use of improved data sources and mixture ratios.

In [5]:
from transformers import AutoModelForCausalLM, AutoTokenizer

In [6]:
model_name = "stabilityai/stablelm-base-alpha-3b-v2"

In [8]:
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
  model_name,
  trust_remote_code=True,
  torch_dtype="auto",
)

Downloading shards:   0%|          | 0/2 [00:00<?, ?it/s]

model-00001-of-00002.safetensors:  58%|#####7    | 5.78G/9.96G [00:00<?, ?B/s]

model-00002-of-00002.safetensors:   0%|          | 0.00/1.22G [00:00<?, ?B/s]

Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

generation_config.json:   0%|          | 0.00/111 [00:00<?, ?B/s]

In [9]:
model.cuda()

StableLMAlphaForCausalLM(
  (transformer): StableLMAlphaModel(
    (embed): Embedding(50432, 2560)
    (layers): ModuleList(
      (0-31): 32 x DecoderLayer(
        (norm): LayerNorm((2560,), eps=1e-05, elementwise_affine=True)
        (attention): Attention(
          (qkv_proj): Linear(in_features=2560, out_features=7680, bias=False)
          (out_proj): Linear(in_features=2560, out_features=2560, bias=False)
          (rotary_emb): RotaryEmbedding()
        )
        (mlp): MLP(
          (gate_proj): Linear(in_features=2560, out_features=13824, bias=False)
          (out_proj): Linear(in_features=6912, out_features=2560, bias=False)
          (act): SiLU()
        )
      )
    )
    (final_norm): LayerNorm((2560,), eps=1e-05, elementwise_affine=True)
  )
  (lm_head): Linear(in_features=2560, out_features=50432, bias=False)
)

In [10]:
inputs = tokenizer("The weather is always wonderful", return_tensors="pt").to(device)

In [11]:
tokens = model.generate(
  **inputs,
  max_new_tokens=300,
  temperature=0.75,
  top_p=0.95,
  do_sample=True,
)

Setting `pad_token_id` to `eos_token_id`:0 for open-end generation.


In [12]:
print(tokenizer.decode(tokens[0], skip_special_tokens=True))

The weather is always wonderful in the fall, even if it is chilly, and the change of season is always inspiring.
We took our two young sons to the park, then headed back to my mother’s house to pick up her car and go out to dinner with her and our other son.
We sat outside and ate at a wonderful restaurant, then drove back to my mom’s house and walked around the neighborhood.
We didn’t have a lot of conversation, but there were a lot of smiles and laughs. I hope we’re just getting a taste of what is to come in the coming years.
The fall season is always a time for family to come together, and for me to enjoy the company of my parents and brother and sister-in-law.
I’m looking forward to more dinners and more time with my family in the future.



In [13]:
torch.cuda.empty_cache()

In [22]:
from huggingface_hub import notebook_login

notebook_login()


VBox(children=(HTML(value='<center> <img\nsrc=https://huggingface.co/front/assets/huggingface_logo-noborder.sv…

## Orca-mini-3b

In [23]:
from transformers import LlamaForCausalLM, LlamaTokenizer

In [30]:
model_path = 'psmathur/orca_mini_3b'
tokenizer = LlamaTokenizer.from_pretrained(model_path)
model = LlamaForCausalLM.from_pretrained(
    model_path, torch_dtype=torch.float16, device_map='auto', offload_buffers=True
)



Loading checkpoint shards:   0%|          | 0/3 [00:00<?, ?it/s]

Some parameters are on the meta device because they were offloaded to the cpu and disk.


In [32]:
# Get model device (auto-detected)
device = model.device

In [33]:
prompt = "Write about SpaceX and it's achievements"

In [34]:
tokens = tokenizer.encode(prompt)
tokens = torch.LongTensor(tokens).unsqueeze(0)
tokens = tokens.to(device)

In [35]:
instance = {'input_ids': tokens,'top_p': 1.0, 'temperature':0.2, 'generate_len': 512, 'top_k': 30}

In [36]:
length = len(tokens[0])
with torch.no_grad():
  response = model.generate(
              input_ids=tokens,
              max_length=length+instance['generate_len'],
              top_p=instance['top_p'],
              temperature=instance['temperature'],
              top_k=instance['top_k'],
              use_cache=True,
              do_sample=True,
        )

In [37]:
response

tensor([[    1, 14734,   562, 26878,   291,   492, 14013, 31843,    13, 19324,
         31916,   322,   260,  2309,  2138, 11317,  1384,  6820,   417, 27570,
         20269,   288, 31822, 31855, 31852, 31852, 31855, 31843,   571,   322,
          1743,   288,  5493,   388,  7566, 31844,  2870, 31844,   291,   470,
           619,  1326,  2609,   288,  2063, 26324, 24056, 31844,  2870, 31843,
         26878,   322,  1802,   329,   619,   306,   337,   528, 31223,   291,
         23130, 31844,   540,   397,  3213,   289,   865,  2138, 11317,   541,
          8491,   291,  7043, 31843,    13,  9040,   287, 26878, 31876, 31829,
           758,  2259, 14013,   322,   266,  3671,  4540,   291, 11894,   287,
           619, 23869, 31822, 31877, 16380,   288, 31822, 31855, 31852, 31853,
         31880, 31843,   672, 16380,   393,  7271,   287,  8321,   260, 27298,
           287,   550,   289, 31822, 31853, 31878, 31852, 31844, 31852, 31852,
         31852, 18308,   352, 31855, 31886, 31887, 3

In [38]:
output = response[0][length:]
output

tensor([31843,    13, 19324, 31916,   322,   260,  2309,  2138, 11317,  1384,
         6820,   417, 27570, 20269,   288, 31822, 31855, 31852, 31852, 31855,
        31843,   571,   322,  1743,   288,  5493,   388,  7566, 31844,  2870,
        31844,   291,   470,   619,  1326,  2609,   288,  2063, 26324, 24056,
        31844,  2870, 31843, 26878,   322,  1802,   329,   619,   306,   337,
          528, 31223,   291, 23130, 31844,   540,   397,  3213,   289,   865,
         2138, 11317,   541,  8491,   291,  7043, 31843,    13,  9040,   287,
        26878, 31876, 31829,   758,  2259, 14013,   322,   266,  3671,  4540,
          291, 11894,   287,   619, 23869, 31822, 31877, 16380,   288, 31822,
        31855, 31852, 31853, 31880, 31843,   672, 16380,   393,  7271,   287,
         8321,   260, 27298,   287,   550,   289, 31822, 31853, 31878, 31852,
        31844, 31852, 31852, 31852, 18308,   352, 31855, 31886, 31887, 31844,
        31852, 31852, 31852, 26716, 31861,   684,  2079,  4556, 

In [39]:
result = tokenizer.decode(output, skip_special_tokens=True)

In [40]:
result

".\nSpaceX is a private space exploration company founded by Elon Musk in 2002. It is based in Hawthorne, California, and has its main office in El Segundo, California. SpaceX is known for its reusable rockets and spacecraft, which are designed to make space exploration more affordable and sustainable.\nOne of SpaceX's most significant achievements is the successful launch and landing of its Falcon 9 rocket in 2015. This rocket was capable of carrying a payload of up to 130,000 kg (286,000 lb) into low Earth orbit. SpaceX also successfully launched and landed the Falcon 9 rocket again in 2016, making it the first private company to do so.\nIn addition, SpaceX has also successfully launched and landed its Dragon spacecraft, which is designed to carry astronauts and cargo to the International Space Station. The company has also plans to send humans to Mars in the near future.\nSpaceX has also made significant progress in developing reusable rockets. Its Falcon 9 rocket can be launched an