In [1]:
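import os

# Enable hf_transfer for faster Hugging Face downloads. huggingface_hub reads
# this flag when it is first imported, so set it before importing transformers.
# The optional hf_transfer package must be installed for it to take effect.
os.environ['HF_HUB_ENABLE_HF_TRANSFER'] = "1"

import torch
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
)
from antislop_generate import generate_antislop, chat_antislop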

# Set the device to 'cuda' if available, else 'cpu'
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Specify the model name (replace with your preferred model)
model_name = "unsloth/Llama-3.2-1B-Instruct"

# Load the model and tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)
#model.pad_token_id = tokenizer.eos_token_id
model.to(device)
print('Model loaded')

Model loaded


In [2]:
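import json
# Default to None so later cells still run if the file is missing
# (assumes antislop_generate treats None as "use the defaults").
slop_phrase_prob_adjustments = None
if os.path.exists('slop_phrase_prob_adjustments.json'):
    with open('slop_phrase_prob_adjustments.json', 'r') as f:
        # Keep only the first 500 (phrase, adjustment) entries
        slop_phrase_prob_adjustments = dict(json.load(f)[:500])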
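
The `dict(json.load(f)[:500])` call only works if the JSON file holds a list of two-item entries, since slicing needs a list and `dict()` needs pairs. A minimal sketch of that shape follows; the phrases and numbers in it are made up for illustration and are not taken from the real file.

```python
import json

# Hypothetical sample in the shape the loading cell implies:
# a JSON array of [phrase, adjustment] pairs. Values are invented.
sample_json = '[["tapestry", 0.05], ["testament to", 0.1], ["kaleidoscope", 0.2]]'

pairs = json.loads(sample_json)   # list of two-item lists
top_two = dict(pairs[:2])         # slice first, then build the lookup table
print(top_two)
```

Slicing before converting keeps the file's ordering meaningful: if the list is sorted with the most important phrases first, `[:500]` keeps those.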

In [3]:
prompt = "Write a story about Elara, the weaver of tapestries in future Technopolis. In the bustling city, a group of "

In [4]:
# Chat generation with streaming
messages = [
    {"role": "user", "content": prompt}
]
for token in chat_antislop(
    model=model,
    tokenizer=tokenizer,
    messages=messages,
    max_length=400,
    # Antislop sampling may be less reliable at low temperatures.
    temperature=1,
    min_p=0.1,
    # adjustment_strength scales how strongly the probability adjustments are applied.
    # A value of 1 uses the values in slop_phrase_prob_adjustments (or the defaults) unmodified.
    # Reasonable values range from 0 (disabled) through 10+ (effectively banning the list).
    adjustment_strength=10.0,
    # Optional: Provide a list of slop phrases and probability adjustments
    slop_phrase_prob_adjustments=slop_phrase_prob_adjustments,
    streaming=True
):
    print(tokenizer.decode(token), end='', flush=True)

Starting from v4.46, the `logits` model output will have the same type as the model (except at train time, where it will always be FP32)
From v4.47 onwards, when a model cache is to be returned, `generate` will return a `Cache` instance instead by default (as opposed to the legacy tuple of tuples format). If you want to keep returning the legacy format, please set `return_legacy_cache=True`.




In the year 2178, in the sprawling city of Chronos, the air was alive with the hum of holographic advertisements and the soft glow of luminescent streets. In a small, cozy workshop tucked away in the winding alleys of the sprawling city, a young woman named Elianor worked her magic. Her fingers danced across the loom, weaving a tale of wonder and magic.

For as long as anyone could remember, Elianor had been the go-to weaver for the city's elite, crafting tapestries that told the stories of the past, present, and future. Her workshop was a treasure trove of colors, threads, and ancient tomes, each one containing the secrets of the universe.

Elianor's latest commission was a particularly intriguing one – a massive, ornate loom that would be the centerpiece of the prestigious Azure Palace. The client, a powerful merchant named Cassius, was eager to showcase his latest venture: a futuristic, self-sustaining ecosystem that would feed the entire city.

As Elianor worked on the loom, her 
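
The comments on `adjustment_strength` in the streaming chat cell describe a scale where 0 disables the adjustments, 1 applies them unmodified, and 10+ effectively bans the listed phrases. One formula consistent with that description is to raise each per-phrase factor to the power of the strength. This is an assumption made for illustration, not a statement of what `antislop_generate` does internally; `scaled_factor` is a hypothetical helper.

```python
def scaled_factor(adjustment: float, strength: float) -> float:
    """Hypothetical scaling rule: adjustment ** strength.

    strength = 0 gives 1.0 (no change), strength = 1 gives the raw
    adjustment, and larger strengths push factors below 1 toward zero.
    """
    return adjustment ** strength


# A phrase with an (invented) adjustment factor of 0.1
for strength in (0, 1, 2, 10):
    print(strength, scaled_factor(0.1, strength))
```

Under this reading, the `adjustment_strength=10.0` used above would shrink a factor of 0.1 to roughly one in ten billion, which matches the "effectively banning" wording.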

In [5]:
# Chat generation without streaming
messages = [
    {"role": "user", "content": prompt}
]
generated_text = chat_antislop(
    model=model,
    tokenizer=tokenizer,
    messages=messages,
    max_length=400,
    temperature=1,
    min_p=0.1,
    adjustment_strength=2.0,
    slop_phrase_prob_adjustments=slop_phrase_prob_adjustments,
    streaming=False
)
print(tokenizer.decode(generated_text))



In the heart of the sprawling city of Cygnus, where steel and crystal towers pierced the sky and neon lights danced across the rooftops, there lived a weaver like no other. Her name was Lyra, but to the inhabitants of Cygnus, she was known as the Weaver of Dreams.

Lyra's fingers moved deftly, her hands weaving a magic that had been passed down through generations of her family. She lived in a small, intricately carved wooden cottage on the outskirts of the city, surrounded by rows of shelves filled with threads of every color, hue, and texture. The air was thick with the scent of wool and magic, and the soft hum of the city's machinery provided a constant background thrum.

It was here that Lyra spent her days, lost in the rhythmic dance of the loom, her threads dancing across the fabric as she wove tales of ancient myths and forgotten lore. The city's inhabitants whispered about Lyra's abilities, some hailing her as a sorceress, others a genius, and still others a simple weaver wit
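
Every call in this notebook passes `min_p=0.1`. In min-p sampling as it is commonly defined, tokens whose probability falls below `min_p` times the top token's probability are dropped and the rest are renormalized before sampling. The sketch below shows that rule on a toy distribution; whether `antislop_generate` implements it in exactly this way is an assumption, and `min_p_filter` is a hypothetical helper.

```python
def min_p_filter(probs, min_p):
    """Zero out tokens below min_p * max(probs), then renormalize."""
    threshold = min_p * max(probs)
    kept = [p if p >= threshold else 0.0 for p in probs]
    total = sum(kept)
    return [p / total for p in kept]


# Toy next-token distribution (invented numbers)
probs = [0.5, 0.3, 0.15, 0.04, 0.01]
filtered = min_p_filter(probs, min_p=0.1)
print(filtered)
```

Here the cutoff is 0.1 × 0.5 = 0.05, so the two least likely tokens are removed and the remaining three share the full probability mass.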

In [6]:
# Generation from a raw prompt, without streaming
generated_text = generate_antislop(
    model=model,
    tokenizer=tokenizer,
    prompt=prompt,
    max_length=300,
    temperature=1,
    min_p=0.1,
    adjustment_strength=2.0,
    slop_phrase_prob_adjustments=slop_phrase_prob_adjustments,
    streaming=False
)
print(tokenizer.decode(generated_text))

 12-year-old twins, Astra and Lyra, stumble upon a mysterious door hidden behind a large, ancient bookshelf in the city's famous Library of Wonders.

Curious about the door, the twins begin to explore it, and soon they find themselves transported to the ancient lands of Eridoria, where they meet a young woman named Eira, who possesses an extraordinary gift - the ability to weave the very fabric of reality. As they explore this mystical world, they learn that Eira has been imprisoned by a powerful sorcerer who seeks to exploit her powers for his own gain.

With Eira's help, the twins set out to rescue her and defeat the sorcerer, embarking on a thrilling adventure filled with magical creatures, ancient ruins, and forgotten technologies. Along the way, they discover that the door behind the bookshelf was not a mere coincidence, but a key to unlocking a powerful secret about their own past, and that they themselves hold the key to Eira's freedom.

As they delve deeper into Eira's world, t

In [8]:
# Generation from a raw prompt, with streaming
for token in generate_antislop(
    model=model,
    tokenizer=tokenizer,
    prompt=prompt,
    max_length=300,
    temperature=1,
    min_p=0.1,
    slop_phrase_prob_adjustments=slop_phrase_prob_adjustments,
    adjustment_strength=2.0,
    streaming=True
):
    print(tokenizer.decode(token), end='', flush=True)



 20th-century musicians, led by a charismatic leader named Leo, travel to the city in a magnificent, steam-powered contraption called a 'Voxtronic'.
The group is a motley crew of artists, musicians, and inventors, each with their own unique skills and talents. There's Jax, the skilled engineer who designed the Voxtronic; Lena, the brilliant artist who created the magnificent tapestries that adorn the walls of the city's grand squares; and Zephyr, the charismatic leader who has a vision for a utopian society in the city.

As they arrive in future Temporalia, the group is struck by the city's breathtaking beauty and advanced technology. The city is a marvel of interconnected systems and sustainable energy, where humans and machines coexist in harmony. The air is filled with the hum of holographic advertisements and the soft glow of luminescent orbs that illuminate the streets.

The group decides to form a community in the city, where they can live and work together in peace. They settle 