In [1]:
!git clone https://github.com/sam-paech/antislop-sampler.git
!mv antislop-sampler antislop_sampler

Cloning into 'antislop-sampler'...
remote: Enumerating objects: 62, done.
remote: Counting objects: 100% (62/62), done.
remote: Compressing objects: 100% (61/61), done.
remote: Total 62 (delta 29), reused 0 (delta 0), pack-reused 0 (from 0)
Receiving objects: 100% (62/62), 2.40 MiB | 676.00 KiB/s, done.
Resolving deltas: 100% (29/29), done.


In [2]:
import os
import torch
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
)
from antislop_sampler.antislop_generate import generate_antislop, chat_antislop

# Enable efficient transfer for Hugging Face models
os.environ['HF_HUB_ENABLE_HF_TRANSFER'] = "1"

# Set the device to 'cuda' if available, else 'cpu'
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Specify the model name (replace with your preferred model)
model_name = "unsloth/Llama-3.2-1B-Instruct"

# Load the model and tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)
#model.pad_token_id = tokenizer.eos_token_id
model.to(device)
print('Model loaded')

Model loaded


In [3]:
import json
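slop_path = 'antislop_sampler/slop_phrase_prob_adjustments.json'
if not os.path.exists(slop_path):
    # Later cells need this mapping; fail here instead of with a NameError below.
    raise FileNotFoundError(f"Slop phrase list not found: {slop_path}")
with open(slop_path, 'r') as f:
    slop_phrase_prob_adjustments = dict(json.load(f)[:500])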
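
The `dict(json.load(f)[:500])` call above implies that the JSON file holds a list of two-element entries: the slice keeps the first 500, and `dict` turns them into a phrase-to-adjustment mapping. A minimal sketch of that pattern follows. The phrases and values are made up for illustration and are not taken from the real file.

```python
import json

# Hypothetical stand-in for the JSON file: a list of [phrase, adjustment] pairs.
# These phrases and numbers are illustrative only.
raw = json.loads('[["kaleidoscope", 0.02], ["symphony", 0.05], ["testament to", 0.1]]')

# Keep only the first N entries, then build a phrase -> adjustment mapping.
top_n = 2
adjustments = dict(raw[:top_n])

print(adjustments)
```

Because the slice is positional, which phrases survive depends entirely on how the file is ordered.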

In [4]:
prompt = "Write a story about Elara, the weaver of tapestries in future Technopolis. In the bustling city, a group of "

In [5]:
# Chat generation with streaming
messages = [
    {"role": "user", "content": prompt}
]
for token in chat_antislop(
    model=model,
    tokenizer=tokenizer,
    messages=messages,
    max_length=400,
    # Antislop sampling may be less reliable at low temperatures.
    temperature=1,
    min_p=0.1,
    # adjustment_strength scales how strongly the probability adjustments are applied.
    # A value of 1 uses the values in slop_phrase_prob_adjustments (or the defaults) unmodified.
    # Reasonable values range from 0 (disabled) through 10+ (effectively banning the list).
    adjustment_strength=10.0,
    # Optional: Provide a list of slop phrases and probability adjustments
    slop_phrase_prob_adjustments=slop_phrase_prob_adjustments,
    streaming=True
):
    print(tokenizer.decode(token), end='', flush=True)

Starting from v4.46, the `logits` model output will have the same type as the model (except at train time, where it will always be FP32)
From v4.47 onwards, when a model cache is to be returned, `generate` will return a `Cache` instance instead by default (as opposed to the legacy tuple of tuples format). If you want to keep returning the legacy format, please set `return_legacy_cache=True`.




In the year 2178, in the sprawling city of New Elyria, the art of weaving tapestries had reached its peak. In a small, yet cozy shop tucked away in a quiet alley, a lone weaver named Elian worked day and night to create some of the most exquisite and sought-after pieces in the city. Her name was whispered among the locals as the Weaver of Dreams, a master weaver with a deep connection to the threads of fate and the fabric of reality.

Elian's shop, "Threads of Elyria," was a haven for those seeking tales of the past, present, and future. The walls were adorned with an array of woven tapestries, each one telling a different story, woven with care and precision by Elian's skilled hands. People from all corners of the city would come to visit her, hoping to unravel the mysteries hidden within the threads.

Among them was a young apprentice named Lyra, who had heard tales of the Weaver of Dreams from her mother. Lyra was a curious and ambitious young woman, with a passion for storytellin
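
The comments on `adjustment_strength` in the cell above (0 disables, 1 leaves the listed values unmodified, 10+ effectively bans) are consistent with the strength acting as an exponent on each phrase's adjustment factor. The sketch below assumes that reading. It is not taken from the library's source, and `scaled_adjustment` is a hypothetical helper.

```python
def scaled_adjustment(adjustment: float, strength: float) -> float:
    """Assumed scaling rule: raise the per-phrase factor to the given strength."""
    return adjustment ** strength

# Hypothetical factor for one slop phrase (values below 1 make it less likely).
factor = 0.5

for strength in (0.0, 1.0, 2.0, 10.0):
    multiplier = scaled_adjustment(factor, strength)
    print(f"strength={strength:>4}: multiplier={multiplier:.6f}")
```

Under this assumption a strength of 0 yields a multiplier of 1 (no change), while a strength of 10 pushes any factor below 1 close to zero.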

In [6]:
# Chat generation without streaming
messages = [
    {"role": "user", "content": prompt}
]
generated_text = chat_antislop(
    model=model,
    tokenizer=tokenizer,
    messages=messages,
    max_length=400,
    temperature=1,
    min_p=0.1,
    adjustment_strength=2.0,
    slop_phrase_prob_adjustments=slop_phrase_prob_adjustments,
    streaming=False
)
print(tokenizer.decode(generated_text))



In the heart of Future Polis, where the sun dipped into the horizon and painted the sky with hues of crimson and gold, a young weaver named Elian stood at the edge of the city's central square. The air was alive with the hum of hoverbikes and the chatter of pedestrians, the smells of street food and machinery wafting through the air.

For as long as anyone could remember, the people of Polis had relied on Elian's skills as a weaver to create the city's iconic tapestries. Her hands moved deftly, the threads of silver and gold dancing across her loom as she wove tales of the city's history and mythologies. Her fingers moved with the speed and precision of a surgeon, each stitch a tiny piece of Polis's rich cultural heritage.

Elian's latest commission was a grand one – a massive mural depicting the founding of the city, the great hero, Arin the Unyielding, and his brave warriors. It would be a masterpiece, a celebration of Polis's triumph and the triumphs of its people.

As she worked,
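
Both chat cells pass `min_p=0.1`. In the usual formulation of min-p sampling, a token is kept only if its probability is at least `min_p` times that of the most likely token, and the survivors are renormalised. The sketch below shows that general idea in plain Python on a toy distribution. Whether `antislop_sampler` applies it in exactly this way is an assumption, and `min_p_filter` is a hypothetical helper.

```python
def min_p_filter(probs: list, min_p: float) -> list:
    """Zero out tokens below min_p * max(probs), then renormalise the rest."""
    threshold = min_p * max(probs)
    kept = [p if p >= threshold else 0.0 for p in probs]
    total = sum(kept)
    return [p / total for p in kept]

# Toy next-token distribution over five tokens (illustrative values).
toy = [0.6, 0.25, 0.1, 0.04, 0.01]
filtered = min_p_filter(toy, min_p=0.1)
print(filtered)
```

Here the threshold is 0.1 × 0.6 = 0.06, so the two least likely tokens are dropped and the remaining three are rescaled to sum to 1.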

In [7]:
# Raw prompt generation without streaming
generated_text = generate_antislop(
    model=model,
    tokenizer=tokenizer,
    prompt=prompt,
    max_length=300,
    temperature=1,
    min_p=0.1,
    adjustment_strength=2.0,
    slop_phrase_prob_adjustments=slop_phrase_prob_adjustments,
    streaming=False
)
print(tokenizer.decode(generated_text))

 urchins and street children stumble upon her loom, and they discover that her tapestries are not just beautiful works of art, but also hold a powerful secret.
As the urchins explore the city's hidden corners, they notice that the tapestries seem to be changing, subtly shifting and rearranging themselves. At first, they think it's just a trick of the light, but as they watch, the tapestries begin to tell a story of their own – a tale of love, loss, and the struggles of growing up.

The urchins, led by a curious and adventurous young girl named Aria, become obsessed with unraveling the mystery of the tapestries. They spend their days scouring the city for more clues, and their nights exploring the hidden alleys and courtyards, searching for any sign of the mysterious weaver.

As they delve deeper into the city, they begin to notice that the tapestries are not just telling stories, but also revealing hidden truths about the city's inhabitants. They see glimpses of people's pasts, their f

In [8]:
# Raw prompt generation with streaming
for token in generate_antislop(
    model=model,
    tokenizer=tokenizer,
    prompt=prompt,
    max_length=300,
    temperature=1,
    min_p=0.1,
    slop_phrase_prob_adjustments=slop_phrase_prob_adjustments,
    adjustment_strength=2.0,
    streaming=True
):
    print(tokenizer.decode(token), end='', flush=True)

 12 of the most skilled artisans work together to create the grand tapestries that adorn the walls of the city's central square.

The city of Newhaven is a marvel of innovation and progress, where technology and magic coexist in a symbiotic relationship. The air is alive with the hum of machines, and the streets are paved with a glittering substance called Glimmerstone, which is harvested from the crystals that line the city's skyscrapers. The Glimmerstone is used to power the city's infrastructure, and it's also a key component in the creation of the city's famous tapestries.

In the heart of Newhaven's central square, a group of 12 artisans gather to work on their most ambitious project yet: the "Elysian Mural". This massive wall of tapestries will be the centerpiece of the square, and it will be the pride of the city. Each artisan has been selected for their unique skills and expertise, and they come from different walks of life.

There's Arin, the master weaver from the ancient dis

In [9]:
import time
print("\nExecution time with use_cache=False (1B model & max_length=300):")

start_time = time.time()

for token in generate_antislop(
    model=model,
    tokenizer=tokenizer,
    prompt=prompt,
    max_length=300,
    temperature=0.01,
    min_p=0.1,
    slop_phrase_prob_adjustments=slop_phrase_prob_adjustments,
    adjustment_strength=12.0,
    streaming=True,
    use_cache=False
):
    print(tokenizer.decode(token), end='', flush=True)

end_time = time.time()

execution_time_use_cache_false = end_time - start_time

# Repeat the same generation with use_cache=True
print("\nExecution time with use_cache=True (1B model & max_length=300):")

start_time = time.time()

for token in generate_antislop(
    model=model,
    tokenizer=tokenizer,
    prompt=prompt,
    max_length=300,
    temperature=0.01,
    min_p=0.1,
    slop_phrase_prob_adjustments=slop_phrase_prob_adjustments,
    adjustment_strength=12.0,
    streaming=True,
    use_cache=True
):
    print(tokenizer.decode(token), end='', flush=True)

end_time = time.time()

execution_time_use_cache_true = end_time - start_time

print(f"\nExecution time with use_cache=False (1B model & max_length=300): {execution_time_use_cache_false} seconds")
print(f"Execution time with use_cache=True (1B model & max_length=300): {execution_time_use_cache_true} seconds")


Execution time with use_cache=False (1B model & max_length=300):
 urchins have taken over the city's central square, causing chaos and destruction. The city's elite, led by the cunning and ruthless Lord Arin, have dispatched a team of elite guards to quell the uprising.

As the guards approach the central square, they are met with a sight that is both familiar and yet, utterly alien. The urchins, with their makeshift armor and crude but effective tools, are holding the city's elite at bay. The guards, clad in their fine armor and carrying their deadly swords, are at a loss for how to deal with the situation.

Meanwhile, in the shadows, a young weaver named Zephyr watches the scene unfold. Zephyr is a skilled weaver, but not just any weaver. She is a weaver of the ancient art, one who has spent years studying the secrets of the tapestries that once adorned the walls of the city's great halls. Zephyr's tapestries are said to hold the memories of the city's past, and she is determined to