This notebook explores text generation with the StableBeluga-13B model (`stabilityai/StableBeluga-13B`), loaded through Hugging Face Transformers. It covers several applications, from generating short stories to enhancing existing quotes.

**Introduction to StableBeluga-13B Model:** This notebook begins with an introduction to the StableBeluga-13B model and provides instructions for setting it up.

**Using a Text Generation Pipeline:** You'll learn how to use the high-level `pipeline` helper to generate text. The StableBeluga-13B model generates text from prompts, so the same pipeline can serve a variety of tasks.

**Loading the Model Directly:** For those who want to dive deeper, the notebook explains how to load the StableBeluga-13B model and tokenizer directly, enabling more control over the text generation process.

**Preprocessing Kaggle 5k Dataset:** The input file was preprocessed from a Kaggle 5k dataset. Details about the preprocessing steps, such as cleaning, tokenization, and any other relevant transformations, are provided so that the dataset is well prepared for text generation.

**Enhancing Quotes:** The notebook demonstrates how to enhance quotes by providing an input quote and generating an enriched version. You'll witness the model's creative text generation capabilities in action.

**Generating Image Prompts:** You'll explore how to create image prompts by generating scene descriptions based on an initial quote. These descriptions can serve as inspiration for visual storytelling.

**Creating Short Stories:** The notebook dives into text generation from a dataset of quotes. You'll learn how to generate short stories for quotes using a customizable story generation function.

In [1]:
pip install transformers

Collecting transformers
  Downloading transformers-4.35.0-py3-none-any.whl (7.9 MB)
Collecting tokenizers<0.15,>=0.14
  Downloading tokenizers-0.14.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.8 MB)
Collecting safetensors>=0.3.1
  Downloading safetensors-0.4.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.3 MB)
Collecting huggingface-hub<1.0,>=0.16.4
  Downloading huggingface_hub-0.19.0-py3-none-any.whl (311 kB)
Collecting regex!=2019.12.17
  Downloadin

In [2]:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="stabilityai/StableBeluga-13B")

Downloading (…)lve/main/config.json: 100%|██████████| 583/583 [00:00<00:00, 59.9kB/s]
Downloading (…)fetensors.index.json: 100%|██████████| 33.4k/33.4k [00:00<00:00, 3.77MB/s]
Downloading shards:   0%|          | 0/3 [00:00<?, ?it/s]
Downloading (…)of-00003.safetensors:   0%|          | 0.00/9.95G [00:00<?, ?B/s]
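
The cell above creates `pipe` but does not call it. The following is a minimal sketch of how a prompt might be assembled before it is passed to the pipeline. It assumes the `### System:` / `### User:` / `### Assistant:` layout described on the StableBeluga model card; `build_prompt` and the default system message are hypothetical names introduced here for illustration, not part of the notebook.

```python
def build_prompt(user_message, system_message="You are a helpful assistant."):
    """Assemble a prompt in the System/User/Assistant layout.

    The layout is an assumption taken from the StableBeluga model card.
    """
    return (
        f"### System:\n{system_message}\n\n"
        f"### User:\n{user_message}\n\n"
        "### Assistant:\n"
    )


# Example: ask for a scene description that could serve as an image prompt.
prompt = build_prompt(
    "Write a one-sentence scene description inspired by the quote: 'Be yourself.'"
)
print(prompt)

# With the pipeline from the cell above loaded, the call would look like this
# (left commented out because it needs the full 13B weights):
# result = pipe(prompt, max_new_tokens=200, do_sample=True, temperature=0.7)
# print(result[0]["generated_text"])
```

The same helper could be reused for quote enhancement or story generation by changing only the user message.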

In [3]:
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("stabilityai/StableBeluga-13B")
model = AutoModelForCausalLM.from_pretrained("stabilityai/StableBeluga-13B")

Loading checkpoint shards: 100%|██████████| 3/3 [00:06<00:00,  2.15s/it]
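
Loading the model directly makes its memory footprint matter. The arithmetic below is a rough sketch of how much memory the weights alone might need at different precisions. The parameter count of about 13 billion is an assumption read off the model name, and `weight_memory_gb` is a hypothetical helper; activations and the generation cache are not counted.

```python
def weight_memory_gb(n_params, bytes_per_param):
    """Approximate size of the model weights in gigabytes (10^9 bytes)."""
    return n_params * bytes_per_param / 1e9


N_PARAMS = 13e9  # assumption: roughly 13 billion parameters, per the model name

for dtype_name, n_bytes in [("float32", 4), ("float16", 2), ("int8", 1)]:
    print(f"{dtype_name}: ~{weight_memory_gb(N_PARAMS, n_bytes):.0f} GB")
# float32: ~52 GB
# float16: ~26 GB
# int8: ~13 GB
```

If memory is tight, `from_pretrained` accepts a `torch_dtype` argument (for example `torch.float16`) that should roughly halve the footprint compared with float32; whether that is appropriate depends on the hardware available.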


In [6]:
import pandas as pd

# Load the dataset
dataset_path = "/kaggle/input/cleaneddd/cleaned_quotes.csv"
df = pd.read_csv(dataset_path)

# Define a function that builds a story-writing prompt for a quote
def generate_story(quote):
    # This version does not call the model; it only formats the prompt text.
    # Customize the wording here to change what the model is asked to write.
    story = f"Write a short story with minimum 500 words based on the quote:\n'{quote}'\n"

    return story

# Build the prompt for the first row (raise the head() argument to cover more rows)
for index, row in df.head(1).iterrows():
    quote = row['Text']
    story = generate_story(quote)
    print(story)
    print()

# Note: You can customize the story generation logic to create more meaningful stories.


Write a short story with minimum 500 words based on the quote:
'I'm selfish, impatient," and stubborn. but i do not care. i care about you. '
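
The quote printed above still carries a stray double quote and a trailing space. The actual preprocessing that produced `cleaned_quotes.csv` is not shown in this cell, so the following is only a sketch of what an extra cleaning pass might look like; `clean_quote` is a hypothetical helper.

```python
import re


def clean_quote(text):
    """Remove stray double quotes, collapse whitespace, and trim the ends."""
    text = text.replace('"', "")      # drop stray double-quote characters
    text = re.sub(r"\s+", " ", text)  # collapse runs of whitespace into one space
    return text.strip()


raw = 'I\'m selfish, impatient," and stubborn. but i do not care. i care about you. '
print(clean_quote(raw))
# I'm selfish, impatient, and stubborn. but i do not care. i care about you.
```

Applied to the DataFrame, this could be written as `df['Text'] = df['Text'].map(clean_quote)` before any prompts are built.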




In [None]:
import pandas as pd
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the dataset
dataset_path = "/kaggle/input/cleaneddd/cleaned_quotes.csv"
df = pd.read_csv(dataset_path)

# Load the StableBeluga-13B model directly
model_name = "stabilityai/StableBeluga-13B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

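# Define a function to generate a short story
def generate_story(quote):
    # Tokenize the quote, truncating it to at most 50 tokens
    input_ids = tokenizer.encode(quote, return_tensors="pt", max_length=50, truncation=True)

    # Generate a continuation of the quote. do_sample=True is needed for
    # temperature, top_k, and top_p to take effect alongside beam search;
    # max_length counts the prompt tokens as well as the new ones.
    output = model.generate(
        input_ids,
        max_length=300,
        do_sample=True,
        num_beams=5,
        temperature=0.7,
        top_k=50,
        top_p=0.95,
        no_repeat_ngram_size=2,
        num_return_sequences=1,
    )

    # Decode the generated text (the decoded text starts with the quote itself)
    generated_text = tokenizer.decode(output[0], skip_special_tokens=True)

    # Prepend a header naming the quote. The header is only printed;
    # the model itself is prompted with the bare quote.
    story = f"Write a short story based on the inspirational quote:\n'{quote}'\n\n{generated_text}"

    return story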

# Generate stories for the first row
for index, row in df.head(1).iterrows():
    quote = row['Text']
    story = generate_story(quote)
    print(story)
    print()


Loading checkpoint shards: 100%|██████████| 3/3 [00:04<00:00,  1.38s/it]
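
For a decoder-only model, `generate` returns the prompt tokens followed by the newly generated ones, so the decoded story repeats the quote. The sketch below shows one way to keep only the new tokens, using plain lists of made-up token ids in place of real tokenizer output; `strip_prompt_tokens` is a hypothetical helper.

```python
def strip_prompt_tokens(output_ids, prompt_length):
    """Return only the token ids generated after the prompt."""
    return output_ids[prompt_length:]


# Toy token ids standing in for real tokenizer output (hypothetical values).
prompt_ids = [101, 2023, 2003]
output_ids = prompt_ids + [1037, 2466, 102]

new_ids = strip_prompt_tokens(output_ids, len(prompt_ids))
print(new_ids)  # [1037, 2466, 102]
```

In the cell above, the equivalent step would be slicing `output[0][input_ids.shape[-1]:]` before passing it to `tokenizer.decode`.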
