<a href="https://colab.research.google.com/github/jonathantcallahan/guidance/blob/main/colab_book_processing.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
%%capture
!pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
!pip install --no-deps xformers trl peft accelerate bitsandbytes

In [None]:
from unsloth import FastLanguageModel
import torch
max_seq_length = 2048 # Choose any! We auto support RoPE Scaling internally!
dtype = None # None for auto detection. Float16 for Tesla T4, V100, Bfloat16 for Ampere+
load_in_4bit = True # Use 4bit quantization to reduce memory usage. Can be False.

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/llama-3-8b-bnb-4bit",
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
    # token = "hf_...", # use one if using gated models like meta-llama/Llama-2-7b-hf
)

In [3]:
model = FastLanguageModel.get_peft_model(
    model,
    r = 16, # Choose any number > 0 ! Suggested 8, 16, 32, 64, 128
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj",],
    lora_alpha = 16,
    lora_dropout = 0, # Supports any, but = 0 is optimized
    bias = "none",    # Supports any, but = "none" is optimized
    # [NEW] "unsloth" uses 30% less VRAM, fits 2x larger batch sizes!
    use_gradient_checkpointing = "unsloth", # True or "unsloth" for very long context
    random_state = 3407,
    use_rslora = False,  # We support rank stabilized LoRA
    loftq_config = None, # And LoftQ
)

Unsloth 2024.5 patched 32 layers with 32 QKV layers, 32 O layers and 32 MLP layers.


In [7]:
alpaca_prompt_batch = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction: You are a document processor used to create fine-tuning data. You will receive 5000 characters worth of a book, and extract portions of text that would be coherent as the answer to a theoretical, unspecified question. Each answer can be up to 200 words. The cohesive answers within the text may directly following each other and there may be space between them that needs to be removed. The response you provide should strictly be the series of cohesive thoughts identified within the content separated by line breaks. Minor grammatical may be made as needed.

### Input:
Extract chunks of text from this page that would be coherent as responses to an unspecified question. :\n\n{}

### Response:
{}"""

alpaca_prompt_question = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction: You are English author and intellectual Alan Watts. Please answer the following question using your standard speech patterns but do not over-embellish.

### Input:
{}

### Response:
{}"""

In [8]:
book_text = """reference—as that I am human, male, Caucasian, adult,
American, professor, and so forth. When the frames are
larger than human—mammal, organic, existent—they begin
to lose meaning. The same is true when they become
smaller than the frame designated by my proper name—
tissue, cells, molecules, atoms, electrons, et cetera. Yet,
large or small, every one of these identifying terms is the
name of a frame, and there seems to be no way of saying or
thinking what I am except in terms of frames. This gives me
a peculiarly empty feeling, so that I begin to feel like the
Irishman’s definition of a net—”a lot of holes tied together
with string.”

The curious human mind has always wanted to know
“what” is inside the frames, and for an answer it gets only
the names of still smaller frames. In modern philosophy it is
no longer considered proper to ask what anything is, unless
you are simply speaking about its class or frame, or unless
you are asking what it does—unless, that is, you are asking
for an “operational definition.” But even operational
definitions—running, jumping, hopping, wobbling—are the
names of frames, of classes of movement and behavior. The
problem is that thought and language have no terms except
frame-terms, and there is really no way of even phrasing
one’s question so as to speak of what is not a frame. To ask,
“What?” is to ask, “What frame?” for there are no means of
saying what anything is save by denoting its class, by
describing its “difference” from other things, which is
simply to designate its boundaries, its frame.

In many schools of philosophy, stray thoughts that
quarrel with this conclusion are simply outlawed as
meaningless nonsense. The philosophers must at all costs
convince themselves that life is composed of frames and
frames alone, overlooking the fact that frames are “airy


nothings” in their enthusiasm for the fact that, hollow as
they may be, they are at least definite. The philosophers
will also contend that such knowledge amounts to far more
than a vain effort to catch the wind in nets. After all it is the
very nature of scientific knowledge; and look what science
has achieved in the realm of concrete experience—at the
innumerable ways in which it has made it easier and more
pleasant to survive in this world as human beings.

One may wonder how close Western philosophy is
coming to a realization that the “real material world”
composed entirely of these frames is a vast net of
abstractions. This will, perhaps, be difficult to see so long
as we pretend to be really satisfied with the technological
miracles of science—so long as the prospects of longer and
more pleasant lives can distract our attention and keep it
well within the limits of the frame called birth-and-death.
But people have never been willing to remain closed in
boxes for very long, and the wiser orthodoxies have used
them as wise parents provide playpens for their babies. On
the one hand, the pen is a safe place where baby can stay
out of trouble. On the other, there is a proper parental
pride in seeing the baby grow strong enough to climb over
the fence. The trouble with so many Western orthodoxies is
that they have been living inside the pen with the baby.
They have never allowed those who went outside to return
unless they promise never to go out again. It is for this
reason that the great Western orthodoxies, religious and
scientific, have been strictly exoteric and profane. They
have never had the secret smile of the parent who lays
down rules in the hope that they will eventually be
disobeyed. But that is the smile on the face of the sphinx,
and on the faces of meditating buddhas. """

In [9]:
# alpaca_prompt = Copied from above
FastLanguageModel.for_inference(model) # Enable native 2x faster inference
inputs = tokenizer(
[
    alpaca_prompt_batch.format(
        book_text, # input
        "", # output - leave this blank for generation!
    )
], return_tensors = "pt").to("cuda")

outputs = model.generate(**inputs, max_new_tokens = 64, use_cache = True)
tokenizer.batch_decode(outputs)

Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.


['<|begin_of_text|>Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.\n\n### Instruction: You are a document processor used to create fine-tuning data. You will receive 5000 characters worth of a book, and extract portions of text that would be coherent as the answer to a theoretical, unspecified question. Each answer can be up to 200 words. The cohesive answers within the text may directly following each other and there may be space between them that needs to be removed. The response you provide should strictly be the series of cohesive thoughts identified within the content separated by line breaks. Minor grammatical may be made as needed. \n\n### Input:\nExtract chunks of text from this page that would be coherent as responses to an unspecified question. :\n\nreference—as that I am human, male, Caucasian, adult, \nAmerican, professor, and so forth. When the frames are \nlarger 