In [1]:
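import os

# Enable faster Hugging Face downloads via hf_transfer (requires the hf_transfer package).
# Set this before importing transformers: huggingface_hub reads the variable at import time.
os.environ['HF_HUB_ENABLE_HF_TRANSFER'] = "1"

import torch
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
)
from src.antislop_generate import generate_antislop, chat_antislop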

# Set the device to 'cuda' if available, else 'cpu'
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Specify the model name (replace with your preferred model)
model_name = "unsloth/Llama-3.2-1B-Instruct"

# Load the model and tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)
#model.pad_token_id = tokenizer.eos_token_id
model.to(device)
print('Model loaded')

Model loaded


In [2]:
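import json

# Load the per-phrase probability adjustments and keep the first 500 entries.
# Default to None (assumed to select the sampler's built-in defaults) so that
# later cells do not raise a NameError when the file is missing.
slop_phrase_prob_adjustments = None
if os.path.exists('slop_phrase_prob_adjustments.json'):
    with open('slop_phrase_prob_adjustments.json', 'r') as f:
        slop_phrase_prob_adjustments = dict(json.load(f)[:500])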
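
The call `dict(json.load(f)[:500])` above implies that the JSON file holds a list of two-element `[phrase, factor]` pairs, since slicing a list and passing it to `dict` only works for that shape. The sketch below illustrates the conversion on in-memory data. The phrases, the factors, and the helper name `load_adjustments` are made up for illustration and are not taken from the real file.

```python
import json

# Hypothetical example data: a JSON list of [phrase, factor] pairs.
# The real slop_phrase_prob_adjustments.json may contain different entries.
example_json = '[["tapestry", 0.03], ["testament to", 0.05], ["kaleidoscope", 0.1]]'


def load_adjustments(raw_json, limit=500):
    """Parse a JSON list of [phrase, factor] pairs and keep the first `limit` as a dict."""
    pairs = json.loads(raw_json)
    return dict(pairs[:limit])


# Keep only the first two entries, mirroring the [:500] slice in the cell above.
adjustments = load_adjustments(example_json, limit=2)
print(adjustments)
```

Lowering the slice limit is a simple way to experiment with a shorter phrase list without editing the file.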

In [3]:
prompt = "Write a story about Elara, the weaver of tapestries in future Technopolis. In the bustling city, a group of "

In [13]:
# Chat generation with streaming
messages = [
    {"role": "user", "content": prompt}
]
for token in chat_antislop(
    model=model,
    tokenizer=tokenizer,
    messages=messages,
    max_new_tokens=400,
    # Antislop sampling may be less reliable at low temperatures.
    temperature=1,
    min_p=0.1,
    # adjustment_strength scales how strongly the probability adjustments are applied.
    # A value of 1 applies the values in slop_phrase_prob_adjustments (or the defaults) unmodified.
    # Reasonable values range from 0 (disabled) through 100+ (effectively banning the listed phrases).
    adjustment_strength=100.0,
    # Optional: Provide a list of slop phrases and probability adjustments
    slop_phrase_prob_adjustments=slop_phrase_prob_adjustments,
    enforce_json=False,
    antislop_enabled=True,
    streaming=True
):
    print(tokenizer.decode(token), end='', flush=True)



In the heart of the vast, sprawling city of Chronos, where the skyscrapers pierced the sky and the a young woman named Elian. But not a woman with long, flowing hair or a kind smile, nor a delicate complexion with skin like alabaster. Elian was a weaver, a master ofowing loom, their as they wove the finest threads of Chronos. Among them was Elian, whose fingers danced with the threads of silver and gold, creating tapestries that told the stories of the city's history. Her work was renowned, and people from all corners of the city would travel to marvel at her creations.

Elian's studio was a small, cozy space on the outskirts of the market, tucked away in a quiet alleyway. The air was thick with the scent of wool and the sound of hammers ringing on metal. A small fire crackled in the hearth, casting flickering shadows on the walls as Elian worked.

One day, a messenger arrived in Chronos, bearing an urgent message from the city's leader, the enigmatic and reclusive Archon.
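
The comments on `adjustment_strength` in the streaming cell above describe three regimes: 0 disables the adjustments, 1 applies them unmodified, and 100+ effectively bans the listed phrases. One rule consistent with that description is raising each phrase's factor to the power of the strength before multiplying the probability. The sketch below assumes that rule; it is not confirmed from the sampler's source, and `scaled_probability` is a hypothetical helper.

```python
def scaled_probability(prob, factor, strength):
    """Assumed rule (not taken from the library): prob * factor ** strength."""
    return prob * (factor ** strength)


# Illustrative numbers: a token with probability 0.2 that starts a phrase
# whose adjustment factor is 0.5.
for strength in (0.0, 1.0, 100.0):
    print(strength, scaled_probability(0.2, 0.5, strength))
```

Under this assumption, a strength of 0 leaves the probability untouched, a strength of 1 applies the factor once, and a strength of 100 drives the probability so close to zero that the phrase is effectively never sampled.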

In [5]:
# Chat generation without streaming
messages = [
    {"role": "user", "content": prompt}
]
generated_text = chat_antislop(
    model=model,
    tokenizer=tokenizer,
    messages=messages,
    max_length=400,
    temperature=1,
    min_p=0.1,
    adjustment_strength=100.0,
    slop_phrase_prob_adjustments=slop_phrase_prob_adjustments,
    enforce_json=False,
    antislop_enabled=True,
    streaming=False
)
print(tokenizer.decode(generated_text))

Starting from v4.46, the `logits` model output will have the same type as the model (except at train time, where it will always be FP32)
From v4.47 onwards, when a model cache is to be returned, `generate` will return a `Cache` instance instead by default (as opposed to the legacy tuple of tuples format). If you want to keep returning the legacy format, please set `return_legacy_cache=True`.




In the year 2178, in the sprawling city of New Eden, where technology and innovation had woven a seamless blend with the fabric of society, there lived a talented young weaver named Elianore "Ellie" Qu. She was the daughter of a renowned artisanal weaver, and from a young age, she  of her family's craft.

Ellie's fingers moved deftly as she manipulated the loom, her hands dancing across the threads of silk and synthe past, present, and future. Her latest creation, "Echoes of the Anc to a different era.

As the sun set over New Eden,  steel and glass spires, Ellie settled into her small, dimly lit studio. The soft hum of the city's AI-powered infrastructure and the gentle rustle of wind through the city's holographic screens created a soothing backgrounda civilization that had vanished beneath the sands of time.

The city's inhabitants, a melting pot of species and cultures


In [7]:
# Raw completion without streaming.
# generate_antislop continues the bare prompt, so no chat template is applied here.
generated_text = generate_antislop(
    model=model,
    tokenizer=tokenizer,
    prompt=prompt,
    max_length=300,
    temperature=1,
    min_p=0.1,
    adjustment_strength=100.0,
    slop_phrase_prob_adjustments=slop_phrase_prob_adjustments,
    enforce_json=False,
    antislop_enabled=True,
    streaming=False
)
print(tokenizer.decode(generated_text))

 10-year-old children, led by the cunning  apprentice, Elian, venture into the underground tunnels of the ancient ruins to search for valuable artifacts.
As they explore the ancient city, they stumble upon the remnants of a long-forgotten civilization's textile industry. The children discover that the anc their adventure is only just beginning. But they soon come face to face with a group of ruthless treasure hunters, led by the cunning and ruthless 25-year-old, Zephyr. Zephyr will stop at nothing to claim the ancient treasures for himself. The children must use all their skills and ingenuity to outwit Zephyr's henchmen 


In [8]:
# Raw completion with streaming.
# generate_antislop continues the bare prompt, so no chat template is applied here.
for token in generate_antislop(
    model=model,
    tokenizer=tokenizer,
    prompt=prompt,
    max_length=300,
    temperature=1,
    min_p=0.1,
    slop_phrase_prob_adjustments=slop_phrase_prob_adjustments,
    adjustment_strength=100.0,
    enforce_json=False,
    antislop_enabled=True,
    streaming=True
):
    print(tokenizer.decode(token), end='', flush=True)

 urchins are gathered around the market produced textiles.

As the sun sets over the city, casting long shadows across the square, the urchins begin to whisper among themselves. those in authority. But the city guard is distracted, too caught up in their own  of urchins.

One of the urchins, a scrappy little thing with a mischievous glint in her eye, slips closer to the group and begins to whisper secrets to them. The others lean's a new market stall set up in the city center. It's been there for months, but no one's noticed it. Rumors say it's selling the finest, most exotic fabrics from the outer colonies."

The urchins exchange excited glances, their eyes sh