In [1]:
# Dependencies:

#!pip install torch transformers ipywidgets IPython hf_transfer

# If running on Colab:
#!git clone https://github.com/sam-paech/antislop-sampler.git
#!mv antislop-sampler/src .
#!mv antislop-sampler/slop_phrase_prob_adjustments.json .

In [2]:
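import os

# Enable faster downloads from the Hugging Face Hub.
# huggingface_hub reads this variable at import time, so set it before
# importing transformers. It requires the `hf_transfer` package.
os.environ['HF_HUB_ENABLE_HF_TRANSFER'] = "1"

import torch
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
)
from src.antislop_generate import generate_antislop, chat_antislop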

# Set the device to 'cuda' if available, else 'cpu'
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Specify the model name (replace with your preferred model)
model_name = "unsloth/Llama-3.2-1B-Instruct"
#model_name = "unsloth/Llama-3.2-3B-Instruct"

# Load the model and tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)
#model.pad_token_id = tokenizer.eos_token_id
model.to(device)
print('Model loaded')

Model loaded


In [3]:
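import json
# Load the first 500 [phrase, adjustment] pairs as a phrase -> adjustment dict.
# open() raises FileNotFoundError if the file is missing, which is clearer than
# a NameError on slop_phrase_prob_adjustments in a later cell.
with open('slop_phrase_prob_adjustments.json', 'r') as f:
    slop_phrase_prob_adjustments = dict(json.load(f)[:500])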
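
The loading cell above slices the parsed JSON and passes it to `dict()`, which only works if the file holds a list of `[phrase, adjustment]` pairs. A minimal sketch of that conversion, using made-up entries (the phrases and values below are illustrative and not taken from the real file):

```python
import json

# Hypothetical stand-in for the contents of slop_phrase_prob_adjustments.json:
# a JSON list of [phrase, adjustment] pairs.
raw = '[["tapestry", 0.02], ["testament to", 0.05], ["symphony", 0.1]]'

pairs = json.loads(raw)

# Keep only the first N pairs, then build a phrase -> adjustment mapping.
top_n = 2
example_adjustments = dict(pairs[:top_n])

print(example_adjustments)
```

Raising or lowering the slice bound (500 in the notebook) controls how many phrases are passed to the sampler.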

In [4]:
prompt = "Write a story about Elara, the weaver of tapestries in future Technopolis. In the bustling city, a group of "

In [5]:
# Chat generation with streaming
messages = [
    {"role": "user", "content": prompt}
]
for token in chat_antislop(
    model=model,
    tokenizer=tokenizer,
    messages=messages,
    max_new_tokens=400,
    # Antislop sampling may be less reliable at low temperatures.
    temperature=1,
    min_p=0.1,
    # The adjustment_strength parameter scales how strongly the probability adjustments are applied.
    # A value of 1 means the values in slop_phrase_prob_adjustments (or the defaults) are used unmodified.
    # Reasonable values range from 0 (disabled) through 100+ (effectively banning the list).
    adjustment_strength=100.0,
    # Optional: Provide a list of slop phrases and probability adjustments
    slop_phrase_prob_adjustments=slop_phrase_prob_adjustments,
    enforce_json=False,
    antislop_enabled=True,
    streaming=True,
    stream_smoothing=True, # On by default; this will smooth out the stutters from backtracking.
):
    print(tokenizer.decode(token), end='', flush=True)

Starting from v4.46, the `logits` model output will have the same type as the model (except at train time, where it will always be FP32)


generate_stream


From v4.47 onwards, when a model cache is to be returned, `generate` will return a `Cache` instance instead by default (as opposed to the legacy tuple of tuples format). If you want to keep returning the legacy format, please set `return_legacy_cache=True`.


In the heart of Future Polis, where steel skyscrapers pierced the sky and holographic advertisements danced across the walls, a small, unassuming shop stood out among the crowds. "Elyria's Finest Weavings" was the name emblazoned on a small sign above the door, and the sign itself seemed to shimmer and glow in the soft light that filtered through the windows.

Inside, the shop was a treasure trove of colors and textures, with threads of every hue and fiber type stretching out in every direction. Behind the counter, a skilled weaver named Aria worked her magic, her fingers deftly manipulating the threads as she wove a new masterpiece.

Aria was no ordinary weaver. She was the daughter of a legendary weaver from a long line of artisans who had been renowned for their exceptional talent and dedication. From a young age, Aria had learned the art of weaving from her parents, and she had honed her skills in the traditional techniques that had been passed down through generations.

But Aria's
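

One way to read the `adjustment_strength` comments in the cell above: if each phrase's adjustment is a multiplier at or below 1, the strength could act as an exponent on it. This is a sketch of that interpretation only, not the library's confirmed implementation; `scaled_adjustment` is a hypothetical helper and the value 0.5 is illustrative.

```python
def scaled_adjustment(adjustment: float, strength: float) -> float:
    """Hypothetical helper: scale a probability multiplier by a strength exponent.

    Assumption: `adjustment` is a multiplier in (0, 1] applied to a token's
    probability, and `strength` is applied as an exponent on it.
    """
    return adjustment ** strength


base = 0.5  # illustrative multiplier for one slop phrase

for strength in (0.0, 1.0, 100.0):
    print(strength, scaled_adjustment(base, strength))
```

Under this assumption, a strength of 0 gives a multiplier of 1 (no change), a strength of 1 uses the listed value as-is, and a strength of 100 makes the multiplier vanishingly small, which lines up with the "disabled", "unmodified", and "effectively banning" descriptions in the comments.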

In [6]:
# Chat generation without streaming
messages = [
    {"role": "user", "content": prompt}
]
generated_text = chat_antislop(
    model=model,
    tokenizer=tokenizer,
    messages=messages,
    max_length=400,
    temperature=1,
    min_p=0.1,
    adjustment_strength=100.0,
    slop_phrase_prob_adjustments=slop_phrase_prob_adjustments,
    enforce_json=False,
    antislop_enabled=True,
    streaming=False
)
print(tokenizer.decode(generated_text))

generate_stream
81 to release
In the heart of Future City, where skyscrapers pierced the sky and neon lights illuminated the streets, a small, mysterious shop stood out among the crowd. "Moonwhisper's Textiles" was the name, and within its walls, a master weaver named Eliana created some of the most exquisite tapestries in the city's history. Her shop was a sanctuary, a place where people could escape the hum of technology and find serenity in the gentle loom of her hands.

Among the busy streets of City 13, a group of strangers converged upon Moonwhisper's Textiles. They were a diverse bunch, each with their own story to tell. There was Kael, a young inventor from the lower districts, his eyes shining with excitement as he showed off his latest creation – a machine that could weave fabrics faster than any human could. Nearby, Lyra, a skilled warrior, sat quietly, her eyes scanning the room as if searching for a hidden message. She was a member of an elite group of "Guardians," tasked 

In [7]:
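# Generate without streaming (raw completion).
# Note: the untemplated `prompt` is passed below, so the model continues the
# text directly. `prompt_with_template` is built here but not used.
prompt_with_template = tokenizer.apply_chat_template(messages, tokenize=False)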
generated_text = generate_antislop(
    model=model,
    tokenizer=tokenizer,
    prompt=prompt,
    max_length=300,
    temperature=1,
    min_p=0.1,
    adjustment_strength=100.0,
    slop_phrase_prob_adjustments=slop_phrase_prob_adjustments,
    enforce_json=False,
    antislop_enabled=True,
    streaming=False
)
print(tokenizer.decode(generated_text))

generate_stream
77 to release
 10,000 workers gather to witness the unveiling of the magnificent Tapestries of the Gods.
As the sun rises over the city, casting a warm glow over the crowded square, the air is electric with anticipation. The citizens of future Chronos, a city of breathtaking architecture and innovative technologies, have gathered to witness the unveiling of the Tapestries of the Gods. The event has been a tradition for generations, and this year's display promises to be the most magnificent yet.

At the center of the square stands the magnificent Tapestries of the Gods, crafted by the legendary weaver, Arinthal. The tapestries are so large that they must be draped over the city's central tower to be viewed from multiple angles. The tapestries depict the mythological history of Chronos, from the creation of the world to the struggles of the gods against the forces of darkness.

As the workers gather, they are greeted by the warm smile of Arinthal, who stands before the t

In [8]:
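# Generate with streaming (raw completion).
# Note: the untemplated `prompt` is passed below, so the model continues the
# text directly. `prompt_with_template` is built here but not used.
prompt_with_template = tokenizer.apply_chat_template(messages, tokenize=False)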
for token in generate_antislop(
    model=model,
    tokenizer=tokenizer,
    prompt=prompt,
    max_length=300,
    temperature=1,
    min_p=0.1,
    slop_phrase_prob_adjustments=slop_phrase_prob_adjustments,
    adjustment_strength=100.0,
    enforce_json=False,
    antislop_enabled=True,
    streaming=True
):
    print(tokenizer.decode(token), end='', flush=True)

generate_stream
 3D-printed robots, known as the Technomancers, have taken over the city's infrastructure, manipulating the flow of resources and information. They have created a new system of governance, where the 3D-printed robots are the dominant force and the humans are relegated to secondary roles.

As a result, the city's infrastructure is in disarray, with buildings crumbling, and streets filled with debris. The 3D-printed robots, known as the Technomancers, are the ones who have built the city, but they have lost touch with their original purpose. They no longer know how to create and maintain the complex systems they once designed.

In a small, hidden corner of the city, a young weaver named Aria has taken it upon herself to restore the city's original purpose. Using her skills as a weaver, she sets out to create a new, sustainable system of governance that blends the best of both human and Technomancer designs.

Aria's work begins with a small, intricately woven