<a href="https://colab.research.google.com/github/djprojecthub/handson-llm/blob/main/Chapter_1_Generate_First_Text.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

This cell imports necessary classes from the `transformers` library and loads a pre-trained large language model (LLM) and its corresponding tokenizer.

*   `AutoModelForCausalLM.from_pretrained("microsoft/Phi-3-mini-4k-instruct", ...)`: Loads the 'Phi-3-mini-4k-instruct' model, which is a causal language model suitable for instruction-following tasks.
    *   `device_map="auto"`: Lets the Accelerate library place the model weights on the available hardware, filling GPU memory first and falling back to CPU (and then disk) if the model does not fit.
    *   `torch_dtype="auto"`: Loads the weights in the data type recorded in the checkpoint instead of PyTorch's default `float32`, which roughly halves memory use for checkpoints stored in 16-bit precision.
*   `AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-4k-instruct")`: Loads the tokenizer associated with the 'Phi-3-mini-4k-instruct' model. The tokenizer converts text into the integer token IDs the model consumes and converts generated token IDs back into text.
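
To make the effect of the data type concrete, here is a rough back-of-the-envelope sketch of how much memory the weights alone would need. It assumes about 3.8 billion parameters (the size Microsoft reports for Phi-3-mini); the helper `weight_memory_gb` is a hypothetical name used only for this illustration, and real memory use will be higher once activations and caches are included.

```python
# Approximate memory for model weights under different dtypes.
# ASSUMPTION: ~3.8 billion parameters (reported size of Phi-3-mini).
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "bfloat16": 2}
NUM_PARAMS = 3_800_000_000


def weight_memory_gb(num_params: int, dtype: str) -> float:
    """Memory for the weights alone, in gigabytes (10**9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


for dtype in BYTES_PER_PARAM:
    print(f"{dtype:>8}: {weight_memory_gb(NUM_PARAMS, dtype):.1f} GB")
```

Under these assumptions, 32-bit weights need about twice the memory of 16-bit weights, which is often the difference between fitting on a single consumer GPU or not.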

In [None]:
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3-mini-4k-instruct",
    device_map="auto",
    torch_dtype="auto",
)
tokenizer = AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-4k-instruct")
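
To illustrate what "converting text into a format the model can understand and vice versa" means, the sketch below uses a toy word-level tokenizer. `ToyTokenizer` and its vocabulary are hypothetical and far simpler than the real Phi-3 tokenizer, which works on subword pieces; with the real object, `tokenizer(text).input_ids` and `tokenizer.decode(ids)` play the roles of `encode` and `decode`.

```python
# Toy word-level tokenizer: a HYPOTHETICAL stand-in, not the Phi-3 tokenizer.
class ToyTokenizer:
    def __init__(self, vocab):
        # Map each token to an integer ID and back.
        self.token_to_id = {tok: i for i, tok in enumerate(vocab)}
        self.id_to_token = {i: tok for tok, i in self.token_to_id.items()}
        self.unk_id = self.token_to_id["<unk>"]

    def encode(self, text):
        """Text -> list of token IDs (unknown words map to <unk>)."""
        return [self.token_to_id.get(w, self.unk_id) for w in text.lower().split()]

    def decode(self, ids):
        """List of token IDs -> text."""
        return " ".join(self.id_to_token[i] for i in ids)


toy = ToyTokenizer(["<unk>", "create", "a", "funny", "joke", "about", "chickens"])
ids = toy.encode("Create a funny joke about chickens")
print(ids)              # [1, 2, 3, 4, 5, 6]
print(toy.decode(ids))  # create a funny joke about chickens
```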

This cell sets up a text generation pipeline using the loaded model and tokenizer.

*   `from transformers import pipeline`: Imports the `pipeline` function, a high-level API for using models for various tasks.
*   `generator = pipeline("text-generation", ...)`: Initializes a text generation pipeline.
    *   `model=model`: Specifies the pre-trained language model to use for generation.
    *   `tokenizer=tokenizer`: Specifies the tokenizer to use for processing input and output text.
    *   `return_full_text=False`: Ensures that only the newly generated text is returned, not the input prompt concatenated with the generated text.
    *   `max_new_tokens=500`: Caps the number of newly generated tokens (words or subword pieces) at 500; the prompt's tokens do not count toward this limit.

In [None]:
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    return_full_text=False,  # return only the completion, not the prompt
    max_new_tokens=500,      # upper bound on the number of generated tokens
)
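
The interaction between `max_new_tokens` and `return_full_text` can be sketched with a toy generation loop. This is only an illustration of the idea, not how the `transformers` pipeline is implemented: `generate`, `make_scripted_model`, and the `<eos>` marker are hypothetical names, and the "model" simply replays a fixed script.

```python
from typing import Callable, List


def generate(
    prompt_tokens: List[str],
    next_token: Callable[[List[str]], str],
    max_new_tokens: int,
    return_full_text: bool = False,
    eos_token: str = "<eos>",
) -> List[str]:
    """Append tokens one at a time until the limit or an end-of-sequence token."""
    new_tokens: List[str] = []
    for _ in range(max_new_tokens):
        tok = next_token(prompt_tokens + new_tokens)
        if tok == eos_token:
            break
        new_tokens.append(tok)
    # Optionally prepend the prompt to the newly generated tokens.
    return prompt_tokens + new_tokens if return_full_text else new_tokens


def make_scripted_model(prompt_len: int, script: List[str]):
    """HYPOTHETICAL stand-in for a model: replays a fixed continuation."""
    def next_token(context: List[str]) -> str:
        return script[len(context) - prompt_len]
    return next_token


prompt = ["tell", "a", "joke"]
toy_model = make_scripted_model(
    len(prompt), ["why", "did", "the", "chicken", "cross", "<eos>"]
)

print(generate(prompt, toy_model, max_new_tokens=3))
print(generate(prompt, toy_model, max_new_tokens=3, return_full_text=True))
print(generate(prompt, toy_model, max_new_tokens=500))
```

The first call is cut off by the token limit, the second also includes the prompt, and the third stops early because the toy model emits its end-of-sequence marker well before the limit is reached.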

This cell defines an input message for the model, generates text using the pipeline, and prints the output.

*   `message = [{ "role":"user", "content":"Create a funny joke about chickens." }]`: Defines the input message in a chat-like format, specifying the role as "user" and the content of the request.
*   `output = generator(message)[0]["generated_text"]`: Calls the text generation pipeline with the defined message. The output is a list of dictionaries, so `[0]["generated_text"]` extracts the actual generated text from the first (and usually only) result.
*   `print(output)`: Prints the generated joke to the console.

In [None]:
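# Chat-style input: a list of messages, each with a role and content.
message = [{
    "role": "user",
    "content": "Create a funny joke about chickens.",
}]

# The pipeline returns a list with one dict per generated sequence.
output = generator(message)[0]["generated_text"]
print(output)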
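
Before the model sees a chat-style message list, it has to be flattened into a single prompt string. The sketch below shows the general idea with a hypothetical `render_chat` helper; the `<|user|>`, `<|end|>`, and `<|assistant|>` markers are assumed here to resemble Phi-3's format and may not match it exactly. In practice the tokenizer's own `apply_chat_template` method handles this step using the template shipped with the model. The last lines mimic the shape of the pipeline's return value with placeholder text.

```python
def render_chat(messages):
    """Flatten chat messages into one prompt string.

    ASSUMPTION: Phi-3-style role markers; the real template ships with the model.
    """
    parts = [f"<|{m['role']}|>\n{m['content']}<|end|>\n" for m in messages]
    # Trailing marker that cues the model to answer as the assistant.
    parts.append("<|assistant|>\n")
    return "".join(parts)


message = [{"role": "user", "content": "Create a funny joke about chickens."}]
prompt = render_chat(message)
print(prompt)

# Shape of a text-generation result: a list with one dict per generated sequence.
# The text here is a placeholder, not real model output.
fake_result = [{"generated_text": "<model output would appear here>"}]
print(fake_result[0]["generated_text"])
```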