In [1]:
# Dependencies:

#!pip install transformers ipywidgets IPython

# If running on Colab:
#!git clone https://github.com/sam-paech/antislop-sampler.git
#!mv antislop-sampler/src .

In [2]:
import os
import torch
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
)
from src.antislop_generate import generate_antislop, chat_antislop

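# Enable faster downloads from the Hugging Face Hub via hf_transfer.
# Note: this needs the hf_transfer package, and the variable is normally read when
# huggingface_hub is first imported, so set it before the imports above for it to take effect.
os.environ['HF_HUB_ENABLE_HF_TRANSFER'] = "1"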

# Set the device to 'cuda' if available, else 'cpu'
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Specify the model name (replace with your preferred model)
model_name = "unsloth/Llama-3.2-1B-Instruct"
#model_name = "unsloth/Llama-3.2-3B-Instruct"

# Load the model and tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)
#model.pad_token_id = tokenizer.eos_token_id
model.to(device)
print('Model loaded')

Model loaded


In [3]:
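import json
# Default to None so later cells do not raise NameError if the file is missing
# (assumption: the sampler then uses its built-in defaults).
slop_phrase_prob_adjustments = None
if os.path.exists('slop_phrase_prob_adjustments.json'):
    with open('slop_phrase_prob_adjustments.json', 'r') as f:
        # Keep only the first 500 [phrase, adjustment] pairs.
        slop_phrase_prob_adjustments = dict(json.load(f)[:500])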
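The `dict(json.load(f)[:500])` call above only works if the file is a JSON list of two-item pairs, since a list can be sliced and `dict()` accepts pairs. A minimal sketch of that shape follows; the phrases and values are made up for illustration and are not taken from the real file.

```python
import json

# Hypothetical file contents: a JSON list of [phrase, adjustment] pairs.
# These phrases and numbers are illustrative only.
raw = '[["tapestry", 0.02], ["testament to", 0.05], ["symphony", 0.1]]'

pairs = json.loads(raw)

# Slice first to cap the number of phrases, then turn the pairs into a dict.
top_two = dict(pairs[:2])
print(top_two)
```

Slicing before the `dict()` conversion keeps whichever entries come first in the file, so the ordering of the list decides which phrases are kept.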

In [4]:
prompt = "Write a story about Elara, the weaver of tapestries in future Technopolis. In the bustling city, a group of "

In [5]:
# Chat generation with streaming
messages = [
    {"role": "user", "content": prompt}
]
for token in chat_antislop(
    model=model,
    tokenizer=tokenizer,
    messages=messages,
    max_new_tokens=400,
    # Antislop sampling may be less reliable at low temperatures.
    temperature=1,
    min_p=0.1,
    # The adjustment_strength param scales how strongly the probability adjustments are applied.
    # A value of 1 means the values in slop_phrase_prob_adjustments (or the defaults) are used unmodified.
    # Reasonable values range from 0 (disabled) through 100+ (effectively banning the list).
    adjustment_strength=100.0,
    # Optional: Provide a list of slop phrases and probability adjustments
    slop_phrase_prob_adjustments=slop_phrase_prob_adjustments,
    enforce_json=False,
    antislop_enabled=True,
    streaming=True
):
    print(tokenizer.decode(token), end='', flush=True)

Starting from v4.46, the `logits` model output will have the same type as the model (except at train time, where it will always be FP32)
From v4.47 onwards, when a model cache is to be returned, `generate` will return a `Cache` instance instead by default (as opposed to the legacy tuple of tuples format). If you want to keep returning the legacy format, please set `return_legacy_cache=True`.


<|start_header_id|>assistant<|end_header_id|>

In the year 2178, in the futuristic sprawling city of New Elysium, a group of skilled artisans, known as the Guild of Weavers, toiled away in the city's central district. Among them was a talented young woman named Elianore "Lara" Elwes, a weaver of extraordinary talent and dedication.

Lara's fingers moved deftly, her eyes weaving a complex pattern of colors and textures into every strand of thread. Her home, a small studio apartment within the city's walls, was a sanctuary, filled with the scent of wool and the soft glow of luminescent fibers. As the Guild of Weavers, Lara was renowned for her breathtaking tapestries, which adorned the grand halls of New Elysium's most prominent residences.

One day, a mysterious message arrived at the Guild's headquarters, a sleek, aerodynamic tower that pierced the city's sky. The message, encrypted and unsigned, was from an unknown sender, requesting a meeting with the Guild's leader, the enigmatic an
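The comments on `adjustment_strength` in the cell above say that 0 disables the adjustments, 1 applies them unmodified, and 100+ effectively bans the phrases. One scaling rule consistent with that description is to raise each per-phrase factor to the power of the strength. The sketch below assumes that rule; `scaled_factor` is a hypothetical helper, not a function from the library.

```python
def scaled_factor(adjustment: float, adjustment_strength: float) -> float:
    """Assumed rule: raise the per-phrase factor to the given strength."""
    return adjustment ** adjustment_strength

# Hypothetical per-phrase adjustment: halve the probability of the phrase.
base = 0.5

for strength in (0.0, 1.0, 100.0):
    # strength 0 gives a factor of 1 (no change); large values drive it toward 0.
    print(strength, scaled_factor(base, strength))
```

Under this assumption, any factor below 1 shrinks toward zero as the strength grows, which matches the "effectively banning" wording.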

In [6]:
# Chat generation without streaming
messages = [
    {"role": "user", "content": prompt}
]
generated_text = chat_antislop(
    model=model,
    tokenizer=tokenizer,
    messages=messages,
    max_length=400,
    temperature=1,
    min_p=0.1,
    adjustment_strength=100.0,
    slop_phrase_prob_adjustments=slop_phrase_prob_adjustments,
    enforce_json=False,
    antislop_enabled=True,
    streaming=False
)
print(tokenizer.decode(generated_text))

<|start_header_id|>assistant<|end_header_id|>

In the heart of of future **Tenopolis**, where skyscrapers pierced the sky and neon lights danced across the rooftops, a lone figure emerged from the crowd. Her name was **Aria**, but she was known to the people as **La Weaver**, the mysterious weaver of tapestries.

Aria lived in a small, cozy shop in the heart of the city, surrounded by the hum of machinery and the chatter of pedestrians. The sign above her door read "La Weaver's Masterpieces," and the windows were always filled with an array of colorful tapestries, each one telling a different story of the city's history.

La Weaver's talent lay in her ability to weave more than just fabric. She had a deep connection with the **Energy Threads**, the lifeblood of the city, which flowed through every building, streetlamp, and alleyway. By manipulating these threads, she could create breathtaking tapestries that told the stories of Tenopolis's past, present, and future.

One day, a group o
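The cells above pass `min_p=0.1`. As commonly defined, min-p sampling keeps only tokens whose probability is at least `min_p` times the probability of the most likely token, then renormalizes. The sketch below shows that definition on a toy distribution in plain Python; whether the library implements it in exactly this way is an assumption, and `min_p_filter` is a hypothetical helper.

```python
def min_p_filter(probs, min_p):
    """Zero out tokens below min_p * max(probs), then renormalize."""
    threshold = min_p * max(probs)
    kept = [p if p >= threshold else 0.0 for p in probs]
    total = sum(kept)
    return [p / total for p in kept]

# Toy next-token distribution over four tokens.
probs = [0.6, 0.3, 0.05, 0.05]

# With min_p=0.1 the threshold is 0.06, so the two 0.05 tokens are dropped.
filtered = min_p_filter(probs, 0.1)
print(filtered)
```

Because the threshold scales with the top probability, the filter prunes more aggressively when the model is confident and less when the distribution is flat.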

In [7]:
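# Generate without streaming
# Note: prompt_with_template is built here but not passed below; the raw prompt is used,
# so the model continues the text instead of answering as a chat assistant.
prompt_with_template = tokenizer.apply_chat_template(messages, tokenize=False)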
generated_text = generate_antislop(
    model=model,
    tokenizer=tokenizer,
    prompt=prompt,
    max_length=300,
    temperature=1,
    min_p=0.1,
    adjustment_strength=100.0,
    slop_phrase_prob_adjustments=slop_phrase_prob_adjustments,
    enforce_json=False,
    antislop_enabled=True,
    streaming=False
)
print(tokenizer.decode(generated_text))

 urchins had made a game out of stealing the latest fabrics for their own art projects. They would scamper through the crowded marketplaces and snatch scraps from the weavers' workshops.

One day, a young apprentice named Eira, who worked for the reclusive and enigmatic weaver, Lyra, was tasked with retrieving a particularly valuable fabric. Lyra had given the fabric to a mysterious client, and Eira was tasked with delivering it to the client, who was rumored to be a prominent member of the upper class.

Eira had heard stories about the upper class and their love of fine fabrics. She was determined to prove herself and earn her place among the upper class. As she made her way through the crowded marketplaces, she kept a sharp eye out for the urchins. They were always quick to snatch a quick item, but Eira was determined to outsmart them.

As she approached the market stall, she saw the urchins gathered around a colorful display of fabrics. Eira's heart sank, but she steeled herself and

In [8]:
# Generate with streaming
# Note: prompt_with_template is not passed to generate_antislop; the raw prompt is used.
prompt_with_template = tokenizer.apply_chat_template(messages, tokenize=False)
for token in generate_antislop(
    model=model,
    tokenizer=tokenizer,
    prompt=prompt,
    max_length=300,
    temperature=1,
    min_p=0.1,
    slop_phrase_prob_adjustments=slop_phrase_prob_adjustments,
    adjustment_strength=100.0,
    enforce_json=False,
    antislop_enabled=True,
    streaming=True
):
    print(tokenizer.decode(token), end='', flush=True)

 5 young apprentices of the Weaver Guild, all skilled in their own unique ways, arrive at the Guild's headquarters. As they enter the grand hall, they are greeted by the enigmatic and reclusive Guildmaster, Ezra.

Ezra's eyes gleam with a knowing light as he surveys the young apprentices before him. "Ah, the new blood has arrived. I have been expecting you. You three are the last of the old guard, I believe? The ones who refused to adapt to the new ways of our world."

The apprentices exchange nervous glances, unsure of what to expect from the enigmatic Guildmaster. They are a diverse group, each with their own unique skillset:

* Lyra, the youngest and most impulsive, is an expert in elemental magic, able to conjure powerful gusts of wind and whirlwinds with a mere thought.
* Kael, the oldest and most stoic, is a master of mechanical engineering, able to craft complex devices and machines with ease.
* Mira, the most reserved of the group, is an expert in the ancient art of divination,
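The cells above call the same functions in two ways: with `streaming=True` the result is iterated token by token, and with `streaming=False` the whole result is decoded at once. A minimal sketch of one way a function can support both modes is shown below. `toy_generate` is a hypothetical stand-in with made-up token IDs and does not reflect the library's implementation.

```python
def toy_generate(token_ids, streaming=False):
    """Return a generator when streaming, otherwise a full list."""
    def _stream():
        # Stand-in for a sampling loop that produces one token at a time.
        for tid in token_ids:
            yield tid

    if streaming:
        return _stream()
    # Non-streaming mode drains the same generator into a list.
    return list(_stream())

# Streaming: consume tokens as they arrive.
for tid in toy_generate([101, 102, 103], streaming=True):
    print(tid, end=' ', flush=True)
print()

# Non-streaming: get everything at once.
print(toy_generate([101, 102, 103], streaming=False))
```

Keeping the `yield` inside an inner function lets the outer function return either a generator or a list, since any function that contains `yield` directly always returns a generator.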