In [1]:
from transformers import AutoTokenizer, AutoModelForCausalLM
import transformers
import torch

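# Hugging Face Hub ID of the checkpoint to load.
model = "tiiuae/falcon-rw-1b"

tokenizer = AutoTokenizer.from_pretrained(model)
# Build a text-generation pipeline around the checkpoint.
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    torch_dtype=torch.bfloat16,  # load the weights in bfloat16 rather than float32
    device_map="auto",  # place the weights on the available devices automatically
)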
sequences = pipeline(
    "Girafatron is obsessed with giraffes, the most glorious animal on the face of this Earth. Giraftron believes all other animals are irrelevant when compared to the glorious majesty of the giraffe.\nDaniel: Hello, Girafatron!\nGirafatron:",
    max_length=200,  # cap on prompt plus generated tokens combined
    do_sample=True,  # sample instead of decoding greedily
    top_k=10,  # sample only among the 10 most likely next tokens
    num_return_sequences=1,
    eos_token_id=tokenizer.eos_token_id,
)
for seq in sequences:
    print(f"Result: {seq['generated_text']}")


To support symlinks on Windows, you either need to activate Developer Mode or to run Python as an administrator. In order to see activate developer mode, see this article: https://docs.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development
Truncation was not explicitly activated but `max_length` is provided a specific value, please use `truncation=True` to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Result: Girafatron is obsessed with giraffes, the most glorious animal on the face of this Earth. Giraftron believes all other animals are irrelevant when compared to the glorious majesty of the giraffe.
Daniel: Hello, Girafatron!
Girafatron: *blink blink* Hello, Daniel.
Daniel: What are we doing, Girafatron?
Girafatron: *blink blink* Well, we’re just going to talk about giraffes and I’m going to ask you about giraffes.
Daniel: I don’t understand.
Girafatron: I don’t understand why you don’t understand.
Daniel: *blink blink* Why don’t I understand?
Girafatron: *blink blink* Why are giraffes so cool and why do we love giraffes so much?
Daniel: *blink blink
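
The reply above stops mid-sentence, most likely because `max_length=200` counts the prompt tokens as well as the generated ones. A minimal sketch of an alternative call, assuming the `pipeline` and `tokenizer` objects from the cell above: `max_new_tokens` bounds only the generated part, and `return_full_text=False` drops the echoed prompt. The helper name `generate_reply` and its default values are illustrative and not part of the original notebook.

```python
def generate_reply(pipe, prompt, max_new_tokens=100, pad_token_id=None):
    """Return only the newly generated text for `prompt`.

    `pipe` is a text-generation pipeline (or any callable with the same
    interface). `generate_reply` is a hypothetical helper for illustration.
    """
    extra = {}
    if pad_token_id is not None:
        # Setting pad_token_id explicitly should silence the
        # "Setting `pad_token_id` to `eos_token_id`" notice.
        extra["pad_token_id"] = pad_token_id
    outputs = pipe(
        prompt,
        max_new_tokens=max_new_tokens,  # limits generated tokens only, not the prompt
        do_sample=True,
        top_k=10,
        num_return_sequences=1,
        return_full_text=False,  # leave the prompt out of the returned text
        **extra,
    )
    return outputs[0]["generated_text"].strip()


# Usage with the objects defined in the notebook (not executed here):
# reply = generate_reply(
#     pipeline,
#     "Daniel: Hello, Girafatron!\nGirafatron:",
#     pad_token_id=tokenizer.eos_token_id,
# )
# print(reply)
```

Because sampling is enabled, the text returned will differ from run to run.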


In [3]:
tokenizer.save_pretrained('my_tokenizer')


('my_tokenizer\\tokenizer_config.json',
 'my_tokenizer\\special_tokens_map.json',
 'my_tokenizer\\vocab.json',
 'my_tokenizer\\merges.txt',
 'my_tokenizer\\added_tokens.json',
 'my_tokenizer\\tokenizer.json')

In [4]:
pipeline.model.save_pretrained('my_model')
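
The two directories written above can be loaded back without contacting the Hub. A minimal sketch, assuming the `my_tokenizer` and `my_model` directories from the previous cells exist in the working directory; `load_saved_pipeline` is a hypothetical helper name, and the bfloat16 dtype simply mirrors the first cell.

```python
from pathlib import Path


def load_saved_pipeline(model_dir="my_model", tokenizer_dir="my_tokenizer"):
    """Rebuild a text-generation pipeline from locally saved files.

    Hypothetical helper; the default directory names match the
    save_pretrained calls in the notebook.
    """
    for directory in (model_dir, tokenizer_dir):
        if not Path(directory).is_dir():
            raise FileNotFoundError(f"Expected a saved directory at {directory!r}")

    # Imported here so the directory check runs before the heavy libraries load.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from transformers import pipeline as hf_pipeline

    tokenizer = AutoTokenizer.from_pretrained(tokenizer_dir)
    model = AutoModelForCausalLM.from_pretrained(model_dir, torch_dtype=torch.bfloat16)
    return hf_pipeline("text-generation", model=model, tokenizer=tokenizer)


# Usage (not executed here):
# local_pipeline = load_saved_pipeline()
# print(local_pipeline("Daniel: Hello, Girafatron!\nGirafatron:", max_new_tokens=50))
```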


In [5]:
from transformers import AutoTokenizer, AutoModelForCausalLM
import transformers
import torch

model = "tiiuae/falcon-7b"

tokenizer = AutoTokenizer.from_pretrained(model)
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,  # allow running the modeling code shipped in the Hub repository
    device_map="auto",
)
sequences = pipeline(
    "Girafatron is obsessed with giraffes, the most glorious animal on the face of this Earth. Giraftron believes all other animals are irrelevant when compared to the glorious majesty of the giraffe.\nDaniel: Hello, Girafatron!\nGirafatron:",
    max_length=200,
    do_sample=True,
    top_k=10,
    num_return_sequences=1,
    eos_token_id=tokenizer.eos_token_id,
)
for seq in sequences:
    print(f"Result: {seq['generated_text']}")


To support symlinks on Windows, you either need to activate Developer Mode or to run Python as an administrator. In order to see activate developer mode, see this article: https://docs.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development
A new version of the following files was downloaded from https://huggingface.co/tiiuae/falcon-7b:
- configuration_falcon.py
. Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.


A new version of the following files was downloaded from https://huggingface.co/tiiuae/falcon-7b:
- modeling_falcon.py
. Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.
Downloading shards: 100%|██████████| 2/2 [10:11<00:00, 305.70s/it]
Loading checkpoint shards: 100%|██████████| 2/2 [00:00<00:00,  3.42it/s]
Truncation was not explicitly activated but `max_l

Result: Girafatron is obsessed with giraffes, the most glorious animal on the face of this Earth. Giraftron believes all other animals are irrelevant when compared to the glorious majesty of the giraffe.
Daniel: Hello, Girafatron!
Girafatron: Daniel, my friend. It is great to meet you.
Daniel: I can't believe I'm here!
Girafatron: I have a question, do you have a giraffe?
Daniel: Yes! His name is George!
Giraffes are the most perfect animal on the face of this Earth.
Daniel: Girafatron, can I ask you a question?
Giraffatron: Of course, Daniel. Ask me anything.
Daniel: Giraffes are awesome, but what are your favourite animals?
Girafatron: I like elephants and pandas.
Giraffatron has
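
The download warnings in the output above note that a revision can be pinned so that new versions of the repository's code files are not pulled in unnoticed. A minimal sketch, assuming the `revision` argument accepted by `AutoTokenizer.from_pretrained` and `transformers.pipeline`; `load_pinned_pipeline` is a hypothetical helper name, and the revision value has to be copied from the model repository because the notebook does not give one.

```python
def load_pinned_pipeline(model_id, revision):
    """Load a text-generation pipeline at a fixed Hub revision.

    Hypothetical helper. `revision` should be a commit hash or tag taken
    from the model repository on the Hub.
    """
    if not revision or not revision.strip():
        raise ValueError("revision must be a non-empty commit hash or tag")

    # Imported lazily so argument validation does not need the heavy libraries.
    import torch
    import transformers
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id, revision=revision)
    return transformers.pipeline(
        "text-generation",
        model=model_id,
        tokenizer=tokenizer,
        revision=revision,  # fixed revision instead of the latest commit
        torch_dtype=torch.bfloat16,
        trust_remote_code=True,
        device_map="auto",
    )


# Usage (not executed here; "<commit-hash>" is a placeholder, not a real revision):
# pipeline = load_pinned_pipeline("tiiuae/falcon-7b", revision="<commit-hash>")
```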


In [6]:
tokenizer.save_pretrained('my_tokenizer7b')
pipeline.model.save_pretrained('my_model7b')
