The case study:
fine-tune the Falcon LLM to generate Midjourney prompts.

Input: "Midjourney prompt for a fantasy village night".

Output: "a beautiful adorable fantasy village the ground is lit like warm daylight, but the sky is dark and full of stars. Photorealistic --ar 16:9"
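
To fine-tune on pairs like this one, each input/output pair is usually serialized into a single training text. The following is only a minimal sketch of that step: the helper name `format_example` and the `### Instruction:` / `### Response:` markers are hypothetical choices for illustration, not something Falcon or Hugging Face requires.

```python
import json


def format_example(instruction: str, response: str) -> str:
    """Join an instruction and its target completion into one training string.

    The section markers below are an illustrative assumption, not a Falcon requirement.
    """
    return (
        f"### Instruction:\n{instruction.strip()}\n\n"
        f"### Response:\n{response.strip()}"
    )


example = format_example(
    "Midjourney prompt for a fantasy village night",
    "a beautiful adorable fantasy village the ground is lit like warm daylight, "
    "but the sky is dark and full of stars. Photorealistic --ar 16:9",
)

# One JSON object per line (JSONL) is a common way to store such records.
record = json.dumps({"text": example})
print(record)
```

Whatever template is chosen, the same template should be used again at inference time so the model sees prompts in the format it was trained on.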

First we have to choose a model. In our case we will use Falcon from Hugging Face: [tiiuae/falcon-7b-instruct](https://huggingface.co/tiiuae/falcon-7b-instruct)

In [2]:
# Import AutoTokenizer and AutoModelForCausalLM from the Hugging Face Transformers
# library, plus the transformers and torch packages themselves.
from transformers import AutoTokenizer, AutoModelForCausalLM
import transformers
import torch
# Note: using `low_cpu_mem_usage=True` or a `device_map` requires the `accelerate` package.

# Name of the pre-trained language model to use ("tiiuae/falcon-7b-instruct"),
# followed by the tokenizer that belongs to that model.
model = "tiiuae/falcon-7b-instruct"

tokenizer = AutoTokenizer.from_pretrained(model)

# Set up a text-generation pipeline with the chosen model and tokenizer.
# Additional parameters:
# torch_dtype: Data type used for the model's tensors (set to torch.bfloat16).
# trust_remote_code: Allows running custom model code shipped with the model repository (set to True).
# device_map: Lets accelerate place the model on the available devices automatically (set to "auto").

pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    device_map="auto",
)

# Generate text with the pipeline. The prompt is passed first, followed by the generation parameters:
# max_length: Maximum total length in tokens, prompt included (set to 200).
# do_sample: Whether to sample instead of decoding greedily (set to True).
# top_k: Number of highest-probability tokens to sample from at each step (set to 10).
# num_return_sequences: Number of sequences to generate (set to 1).
# eos_token_id: ID of the end-of-sequence token; generation stops once it is produced.
sequences = pipeline(
    "Girafatron is obsessed with giraffes, the most glorious animal on the face of this Earth. Giraftron believes all other animals are irrelevant when compared to the glorious majesty of the giraffe.\nDaniel: Hello, Girafatron!\nGirafatron:",
    max_length=200,
    do_sample=True,
    top_k=10,
    num_return_sequences=1,
    eos_token_id=tokenizer.eos_token_id,
)
for seq in sequences:
    print(f"Result: {seq['generated_text']}")

Loading checkpoint shards: 100%|██████████| 2/2 [00:31<00:00, 15.55s/it]
Setting `pad_token_id` to `eos_token_id`:11 for open-end generation.


Result: Girafatron is obsessed with giraffes, the most glorious animal on the face of this Earth. Giraftron believes all other animals are irrelevant when compared to the glorious majesty of the giraffe.
Daniel: Hello, Girafatron!
Girafatron: "Oh, hello, Daniel! How's the weather out there?"
Daniel: "Oh, I'm fine, just enjoying the lovely view."
Girafatron: "Aha! What a great way to put it! Yes, I'm so happy just looking out at all the giraffes around me!"
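
The `do_sample=True` and `top_k=10` settings used above mean that, at each step, only the 10 most likely next tokens are kept and one of them is drawn at random in proportion to its probability. The following is a simplified, standard-library sketch of that idea, not the Transformers implementation (which works on tensors and batches). The function name `top_k_sample` and the toy logits are made up for illustration.

```python
import math
import random


def top_k_sample(logits: list[float], k: int, rng: random.Random) -> int:
    """Sample one token index from the k highest-scoring logits."""
    # Keep the indices of the k largest logits.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Softmax over the kept logits only (subtract the max for numerical stability).
    m = max(logits[i] for i in top)
    weights = [math.exp(logits[i] - m) for i in top]
    # Draw one index in proportion to its renormalized probability.
    return rng.choices(top, weights=weights, k=1)[0]


# Toy logits for a 6-token vocabulary (made-up numbers).
logits = [2.0, 0.5, -1.0, 3.0, 0.0, 1.5]
rng = random.Random(0)
samples = [top_k_sample(logits, k=3, rng=rng) for _ in range(1000)]

# With k=3 only the three highest-scoring indices (3, 0 and 5) can ever be drawn.
print(sorted(set(samples)))
```

With `k=1` this reduces to greedy decoding, since the single most likely token is always chosen. Larger values of `k` allow more varied output.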
