In [1]:
pip install transformers torch

Collecting nvidia-cuda-nvrtc-cu12==12.4.127 (from torch)
  Downloading nvidia_cuda_nvrtc_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cuda-runtime-cu12==12.4.127 (from torch)
  Downloading nvidia_cuda_runtime_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cuda-cupti-cu12==12.4.127 (from torch)
  Downloading nvidia_cuda_cupti_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.6 kB)
Collecting nvidia-cudnn-cu12==9.1.0.70 (from torch)
  Downloading nvidia_cudnn_cu12-9.1.0.70-py3-none-manylinux2014_x86_64.whl.metadata (1.6 kB)
Collecting nvidia-cublas-cu12==12.4.5.8 (from torch)
  Downloading nvidia_cublas_cu12-12.4.5.8-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cufft-cu12==11.2.1.3 (from torch)
  Downloading nvidia_cufft_cu12-11.2.1.3-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-curand-cu12==10.3.5.147 (from torch)
  Downloading nvidia_curand_cu12-10.3.5

In [2]:
from transformers import GPT2LMHeadModel, GPT2Tokenizer

## **Easy Approach:**

In [4]:
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

input_text = "Artificial Intelligence is"
# return_tensors="pt" returns the token IDs as PyTorch tensors, which the model expects.
inputs = tokenizer(input_text, return_tensors="pt")

# The model predicts one token at a time until the sequence reaches max_length
# (which counts the prompt tokens too) or it emits an end-of-sequence token.
output = model.generate(**inputs, max_length=50)

# output[0] holds the generated token IDs (prompt included);
# tokenizer.decode converts them back into human-readable text.
print(tokenizer.decode(output[0], skip_special_tokens=True))

tokenizer_config.json:   0%|          | 0.00/26.0 [00:00<?, ?B/s]

vocab.json:   0%|          | 0.00/1.04M [00:00<?, ?B/s]

merges.txt:   0%|          | 0.00/456k [00:00<?, ?B/s]

tokenizer.json:   0%|          | 0.00/1.36M [00:00<?, ?B/s]

config.json:   0%|          | 0.00/665 [00:00<?, ?B/s]

model.safetensors:   0%|          | 0.00/548M [00:00<?, ?B/s]

generation_config.json:   0%|          | 0.00/124 [00:00<?, ?B/s]

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Artificial Intelligence is a new field of research that has been gaining traction in recent years. It is a field that has been growing in popularity since the early 1990s.

The field is called Artificial Intelligence and it is a field that has been


## **Advanced Approach:**

In [5]:
# Load GPT-2 tokenizer and model
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

# Set the model to evaluation mode
model.eval()

# Function to generate text using GPT-2
def generate_text(prompt, max_length=50, temperature=1.0, top_k=50, top_p=0.95):
    # Tokenize input text
    inputs = tokenizer(prompt, return_tensors="pt")

    # Generate text
    output = model.generate(
        **inputs,
        max_length=max_length,
        temperature=temperature,  # Controls randomness (higher = more random).
        top_k=top_k,  # Keeps only the top_k most probable tokens at each step.
        top_p=top_p,  # Nucleus sampling: keeps the smallest set of tokens whose cumulative probability reaches top_p.
        do_sample=True,  # Enables sampling instead of greedy decoding.
    )

    # Decode generated text
    return tokenizer.decode(output[0], skip_special_tokens=True)

# Example usage
if __name__ == "__main__":
    prompt_text = "Artificial Intelligence is"
    generated_text = generate_text(prompt_text, max_length=100)
    print("\nGenerated Text:\n", generated_text)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.



Generated Text:
 Artificial Intelligence is not one of these areas that's been successfully implemented in a real-world setting.

The idea that artificial intelligence can be taught to human beings is something that might interest the general public. It's the sort of thing that has a lot of relevance in this kind of technology.

The way AI is being programmed, it's already well-established that it's extremely unlikely that anybody will be able to do the same thing for someone. The big question, though,


## **Some Important Points:**

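📌 **Why Use top_k and top_p Together?**
*   top_k caps the candidate pool at the k most probable tokens, so the long tail of low-probability tokens is never sampled.
*   top_p keeps only the smallest set of tokens whose cumulative probability reaches p, so the pool shrinks when the model is confident and grows when it is not.

Used together, a token must pass both filters before it can be sampled.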
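
To make the two filters concrete, here is a minimal pure-Python sketch that applies a top-k cut and then a top-p cut to a toy distribution. The function name `top_k_top_p_filter` and the probabilities in `toy_probs` are made up for illustration, and this is not the code `model.generate` runs internally; it only mirrors the idea.

```python
def top_k_top_p_filter(probs, top_k, top_p):
    """Apply a top-k cut, then a top-p cut, and renormalize what is left."""
    # Rank tokens from most to least probable and keep at most top_k of them.
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)[:top_k]

    # Renormalize so the surviving probabilities sum to 1 again.
    total = sum(p for _, p in ranked)
    ranked = [(token, p / total) for token, p in ranked]

    # Keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break

    # Renormalize once more to get the distribution that is actually sampled from.
    total = sum(p for _, p in kept)
    return {token: p / total for token, p in kept}


# Hypothetical next-token probabilities (assumed values, not model output).
toy_probs = {"a": 0.5, "the": 0.25, "an": 0.125, "this": 0.075, "zebra": 0.05}

filtered = top_k_top_p_filter(toy_probs, top_k=4, top_p=0.9)
print(filtered)
```

With these toy numbers, the top-k cut drops "zebra" and the top-p cut then drops "this", so only "a", "the", and "an" remain as candidates.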

📌 **What is Sampling in NLP and Deep Learning?**

Sampling is the process of choosing the next token in text generation models such as GPT-2 or GPT-3 by drawing it at random according to the predicted probabilities. Instead of always selecting the most probable token (greedy decoding), sampling introduces randomness, which produces more diverse and creative text.
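
As a rough illustration of drawing a token from predicted probabilities, the sketch below turns a few scores into a distribution and then picks a token both ways. The `tokens` and `logits` values are invented for the example, and `softmax` is a hand-written helper, not a Transformers function; the `temperature` argument plays the same role as the parameter of that name in `generate_text`.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Convert raw scores into probabilities; a higher temperature flattens them."""
    scaled = [score / temperature for score in logits]
    largest = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(score - largest) for score in scaled]
    total = sum(exps)
    return [value / total for value in exps]


# Hypothetical candidate tokens and scores (assumed values, not model output).
tokens = ["a", "the", "an", "this"]
logits = [2.0, 1.0, 0.5, 0.1]

probs = softmax(logits)

# Greedy decoding: always take the single most probable token.
greedy_token = tokens[probs.index(max(probs))]

# Sampling: draw one token at random, weighted by its probability.
rng = random.Random(0)  # seeded only so the example is reproducible
sampled_token = rng.choices(tokens, weights=probs, k=1)[0]

print("greedy :", greedy_token)
print("sampled:", sampled_token)
```

Greedy decoding returns "a" on every run here, while the sampled token can be any of the four candidates, with "a" the most likely.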

📌 **Why is Sampling Needed?**

Without sampling (greedy decoding) → The model always picks the token with the highest probability.

❌ Problem: The text becomes predictable and repetitive, as in the Easy Approach output above, which keeps returning to "a field that has been".

With sampling → The model draws the next token at random from the top candidates, weighted by probability, making the text more diverse and creative.

✅ Advantage: Generates more natural and varied responses.
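
The contrast can be reproduced without a real model. The sketch below uses a tiny hand-made next-token table (`NEXT`, entirely hypothetical) as a stand-in for GPT-2: greedy decoding falls into a loop, while sampling usually wanders off it.

```python
import random

# Hypothetical next-token probabilities for a toy "language model".
NEXT = {
    "it": {"is": 0.9, "and": 0.1},
    "is": {"a": 0.6, "that": 0.4},
    "a": {"field": 0.6, "tool": 0.4},
    "field": {"that": 0.7, "and": 0.3},
    "tool": {"that": 0.6, "and": 0.4},
    "that": {"is": 0.8, "it": 0.2},
    "and": {"it": 0.7, "that": 0.3},
}


def generate(start, steps, rng=None):
    """Extend `start` by `steps` tokens: greedy if rng is None, sampled otherwise."""
    tokens = [start]
    for _ in range(steps):
        candidates = NEXT[tokens[-1]]
        if rng is None:
            # Greedy decoding: take the most probable next token.
            next_token = max(candidates, key=candidates.get)
        else:
            # Sampling: draw the next token in proportion to its probability.
            next_token = rng.choices(
                list(candidates), weights=list(candidates.values()), k=1
            )[0]
        tokens.append(next_token)
    return " ".join(tokens)


# Greedy decoding is deterministic, so repeated runs are identical.
greedy_runs = {generate("it", 8) for _ in range(5)}

# Sampling with different seeds typically yields different continuations.
sampled_runs = {generate("it", 8, random.Random(seed)) for seed in range(5)}

print(greedy_runs)
print(sampled_runs)
```

Greedy decoding always produces "it is a field that is a field that", cycling through the same four tokens, whereas the sampled runs depend on the seed.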

