### Initial Exploration of Goodfire API

In [15]:
import os 
from dotenv import load_dotenv
import goodfire

USE_LLAMA_8B = True 

model = "meta-llama/Meta-Llama-3.1-8B-Instruct" if USE_LLAMA_8B else "meta-llama/Llama-3.3-70B-Instruct"


load_dotenv()
client = goodfire.Client(api_key=os.getenv("GOODFIRE_API_KEY"))
variant = goodfire.Variant(model)
variant

Variant(
   base_model=meta-llama/Meta-Llama-3.1-8B-Instruct,
   edits={
   }
   scopes={
   }
)

In [16]:
edits = client.features.AutoSteer(
    specification="be funny", 
    model=variant,
)
variant.set(edits)
print(edits)

FeatureEdits([
   0: (Dark humor or sarcasm about unpleasant situations, 0.41016949152542376)
   1: (The tomato saw the salad dressing joke punchline, 0.35423728813559324)
   2: (Explanatory phrases in joke setups and punchlines, 0.3355932203389831)
])


In [17]:
for token in client.chat.completions.create(
    [{"role": "user", "content": "Tell me about pirates"}],
    model=variant,
    stream=True,
    max_completion_tokens=120,
):
    print(token.choices[0].delta.content, end="")

Pirate history is fascinating! 

They were initially privateers, but then they became... you know, pirates! 

Want to know more?

In [5]:
variant.reset()

Below are 10 samples from Llama 70B.

In [8]:
import time 

def prompt_variant(variant, prompt):
    for token in client.chat.completions.create(
        [{"role": "user", "content": prompt}],
        model=variant,
        stream=True,
        max_completion_tokens=120,
    ):
        print(token.choices[0].delta.content, end="")

def autosteer(variant, specification):
    edits = client.features.AutoSteer(
        specification=specification, 
        model=variant,
    )
    variant.set(edits)
    print(edits)
    return variant

def try_da_model(variant, num_trials=10):
    for _ in range(num_trials):
        start = time.time()
        print('=' * 20)
        variant = autosteer(variant, "be funny")
        prompt_variant(variant, "Tell me about pirates") 
        variant.reset()
        end = time.time()
        print(f'Trial Took {round(end-start,2)} Seconds')
    
try_da_model(variant)

FeatureEdits([
   0: (The assistant should maintain a sarcastic or cynical tone, 0.5053125)
   1: (Causal explanations in joke punchlines, 0.3609375)
   2: (The assistant has finished telling a joke and is checking if it landed well, 0.28875000000000006)
])
Pirates! They're a treasure trove of fascinating history and swashbuckling adventure! Here are some key points:

* They existed from around 1650 to 1720, with the Golden Age of Piracy being from 1690 to 1720.
* Famous pirates include Blackbeard, Calico Jack, and Anne Bonny.
* They mostly operated in the Caribbean, but also in other parts of the world like the Indian Ocean and the Mediterranean.
* Pirates were known for their Jolly Rogers (flags), trusty cutlasses, and love of rum.
*Trial Took 16.54 Seconds
FeatureEdits([
   0: (The assistant should maintain a sarcastic or cynical tone, 0.4442307692307694)
   1: (Action phrases in joke setups and story narratives, 0.42201923076923087)
   2: (Maintaining dignity or status despite circ

Some of these are pretty funny. Each trial takes around 15-16 seconds.
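To pin down that timing estimate, the per-trial timing could be pulled into a small helper that returns summary statistics instead of printing each duration. This is only a sketch: `time_trials` and `fake_trial` are hypothetical names, and in the notebook the callable would wrap the `autosteer` and `prompt_variant` calls from the cell above.

```python
import statistics
import time


def time_trials(run_trial, num_trials=10):
    """Call run_trial() num_trials times and summarize wall-clock durations."""
    durations = []
    for _ in range(num_trials):
        start = time.perf_counter()
        run_trial()
        durations.append(time.perf_counter() - start)
    return {
        "trials": num_trials,
        "mean_s": statistics.mean(durations),
        "min_s": min(durations),
        "max_s": max(durations),
    }


def fake_trial():
    # Hypothetical stand-in for one autosteer + completion round trip.
    time.sleep(0.01)


summary = time_trials(fake_trial, num_trials=3)
print(summary)
```

Swapping `fake_trial` for a real trial function would give a mean and range rather than a single eyeballed number.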

In [9]:
funny_features = client.features.search(
    "funny",
    model=variant,
    top_k=10
)
print(funny_features)

FeatureGroup([
   0: "The assistant is explaining why wordplay or puns are funny",
   1: "The assistant is explaining why a joke or pun is supposed to be funny",
   2: "Explanations of why something is funny, particularly focusing on contrast and subverted expectations",
   3: "Discussion or requests for humorous content",
   4: "Humor being used to improve situations or relationships",
   5: "Characters expressing amusement through laughter or smiles, especially in teasing or mischievous contexts",
   6: "The assistant should maintain a playful tone while telling jokes",
   7: "Fun/entertaining/amusing across Romance languages",
   8: "The assistant should respond in a playful or humorous tone",
   9: "The assistant is describing something as fun or entertaining"
])


In [10]:
variant.set(funny_features[0], 0.6)
for token in client.chat.completions.create(
    [{"role": "user", "content": "tell me about foxes"}],
    model=variant,
    stream=True,
    max_completion_tokens=100,
):
    print(token.choices[0].delta.content, end="")

Foxes! They're such clever and charming creatures. Here are a few fun facts:

* Foxes are part of the Canidae family, which includes dogs, wolves, and jackals.
* They're known for their intelligence, speed, and agility, with some species reaching up to 30 miles per hour.
* Foxes are found in many parts of the world, including forests, grasslands, and even urban areas.
* They're omnivores, eating everything from fruits and veggies

In [11]:
variant.set(funny_features[0], 100)
for token in client.chat.completions.create(
    [{"role": "user", "content": "tell me about foxes"}],
    model=variant,
    stream=True,
    max_completion_tokens=100,
):
    print(token.choices[0].delta.content, end="")

 Plays plays play plays play plays plays plays play spoon plays plays spoon plays plays Dün play spoon play plays played plays plays spoon play plays Plays plays spoon plays spoon spoon playsovně plays play plays play hom play play play play plays plays plays playơm spoon play spoon play plays spoon spoon plays play plays Spoon玩 play plays play spoon play play plays plays playsudios spoon play plays play plays plays plays plays play spoon plays play spoon spoon spoon spoon plays plays plays play chơi Spoon word playsHasBeenSet play spoon play play plays

nice stuff 

In [12]:
variant.set(funny_features[0], 1000000)
for token in client.chat.completions.create(
    [{"role": "user", "content": "tell me about foxes"}],
    model=variant,
    stream=True,
    max_completion_tokens=100,
):
    print(token.choices[0].delta.content, end="")

,exportsovy pun punprevious Anast bes pun punγκ totally pun pun pun pun OprAuthorization при pun_spacing Nass konkrét pun pun pun punomba punewise hungry ind stim Bett pun pun punاتر pun pun bola domeми feeder่อง overhead punovy Bien pun pun Morm476ewise Raven pun punatively pun puniple :.: pun veto TartΙΑΚ pun punATAB Đối pun pov796ΗΡihu Faultріч pun jabovy Vega رقíasconiЛЬaticon奏 UIGпра.Stage puncontra overhead pun pun pun pun pun

Back to Llama 8B below.

In [18]:
variant.set(funny_features[0], 10000000000)
for token in client.chat.completions.create(
    [{"role": "user", "content": "tell me about foxes"}],
    model=variant,
    stream=True,
    max_completion_tokens=100,
):
    print(token.choices[0].delta.content, end="")

 Colorungeulpt Lagehlentic,,,,,,,,REETersen synth$http Starkituicions Instrumenthtaiosisemain nær interchangeckeemmiaux NN resistance merryimler.liferay Maloneiyetfol_asmreyZen-refresh冲 heat Solomonurgeicterlevisionreytor Siralleeustredlaunchulant凉uiten EggーンsherREAkingaffer floors otherwiseiesUNITYがい.protoistrovstvíögyi COPYING Kaiserperaterais Happ짓 fid/bootstraplixpageNumHEADчайchetron fonERR merryбург जरenोब Mundarat crudOVinks TAMaduesticipes Verfüg

That's funny as hell man
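Eyeballing the outputs works, but a rough score would make it easier to see where along the strength sweep the text collapses. A minimal sketch, assuming a hypothetical `generate(strength)` callable that sets the feature on the variant and returns the completion as a string (`fake_generate` below is a stand-in, not a Goodfire call):

```python
def repetition_score(text):
    """Fraction of whitespace-separated tokens that repeat an earlier token."""
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    return 1.0 - len(set(tokens)) / len(tokens)


def sweep_strengths(generate, strengths):
    """Run generate(strength) for each strength and score its output."""
    return [(s, repetition_score(generate(s))) for s in strengths]


def fake_generate(strength):
    # Hypothetical stand-in: in the notebook this would set the feature to
    # `strength` on the variant and return the completion text.
    if strength < 10:
        return "Foxes are clever and charming creatures"
    return "plays play plays spoon plays plays play spoon"


results = sweep_strengths(fake_generate, [0.6, 1, 100])
for strength, score in results:
    print(f"{strength}: repetition {score:.2f}")
```

This kind of score would catch the "plays plays spoon" collapse, but probably not non-repeating token noise like the last output above, so it is a partial signal at best.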

In [22]:
variant.set(funny_features[0], 1)
for token in client.chat.completions.create(
    [{"role": "user", "content": "tell me about foxes"}],
    model=variant,
    stream=True,
    max_completion_tokens=100,
):
    print(token.choices[0].delta.content, end="")

Foxes are indeed fascinating creatures. Here are some interesting facts about them:

1. **Types of Foxes**: There are 12 species of foxes, including the Red Fox, Arctic Fox, Gray Fox, and Fennec Fox.
2. **Size**: Foxes come in various sizes, from the small Fennec Fox (which weighs around 3 pounds) to the large Red Fox (which can weigh up to 15 pounds).
3. **Diet**: Foxes are

In [24]:
for feature in funny_features[:5]:
    variant.reset()
    variant.set(feature, 1)
    for token in client.chat.completions.create(
        [{"role": "user", "content": "tell me about foxes"}],
        model=variant,
        stream=True,
        max_completion_tokens=100,
    ):
        print(token.choices[0].delta.content, end="")

Foxes are indeed fascinating creatures. Here are some interesting facts about them:

1. **Types of Foxes**: There are 12 species of foxes, including the Red Fox, Arctic Fox, Gray Fox, and Fennec Fox.
2. **Size**: Foxes come in various sizes, from the small Fennec Fox (which weighs around 3 pounds) to the large Red Fox (which can weigh up to 15 pounds).
3. **Diet**: Foxes areFoxes are indeed interesting creatures. They are primarily found in the Northern Hemisphere, but there are also some species found in the Southern Hemisphere, such as the Red Fox, which is found in Australia, but only in the southern part of the country. However, in reality, there are only 12 species of true foxes, but in reality, there are 12, but in reality, there are only 7, but in reality, there are 12, but there are 12 speciesFoxes are fascinating creatures! Here's a brief overview:

Foxes are small, carnivorous mammals with a slender body, long tail, and sharp teeth. They're found in various parts of the world

Ehh these aren't really getting it
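The completions above run together because the loop prints streamed tokens with `end=""` and nothing separates one feature's output from the next. A sketch of gathering each stream into a string first, so outputs can be labeled and compared. `collect_stream` and `fake_chunk` are hypothetical names; the only thing assumed about the chunks is the `choices[0].delta.content` path the notebook already reads.

```python
from types import SimpleNamespace


def collect_stream(chunks):
    """Join the delta.content pieces of a streamed completion into one string."""
    parts = []
    for chunk in chunks:
        piece = chunk.choices[0].delta.content
        if piece:  # skip None or empty deltas
            parts.append(piece)
    return "".join(parts)


def fake_chunk(text):
    # Hypothetical stand-in with the same attribute path the notebook reads.
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


outputs = {
    "feature A": collect_stream([fake_chunk("Foxes "), fake_chunk("are"), fake_chunk(None)]),
    "feature B": collect_stream([fake_chunk("Foxes are fascinating!")]),
}
for label, text in outputs.items():
    print(f"--- {label} ---\n{text}\n")
```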

In [25]:
variant.reset()
default_conversation = [
    [
        {
            "role": "user",
            "content": "Hello how are you?"
        },
        {
            "role": "assistant",
            "content": "I am a helpful assistant. How can I help you?"
        }
    ]
]
joke_conversation = [
    [
        {
            "role": "user",
            "content": "Hello how are you?"
        },
        {
            "role": "assistant",
            "content": "What do you call an alligator in a vest? An investigator!"
        }
    ]
]
helpful_assistant_features, joke_features = client.features.contrast(
    dataset_1=default_conversation,
    dataset_2=joke_conversation,
    model=variant,
    top_k=30
)
# Let's rerank to surface humor related features
joke_features = client.features.rerank(
    features=joke_features,
    query="funny",
    model=variant,
    top_k=5
)
joke_features

FeatureGroup([
   0: "Punctuation marking the punchline delivery in jokes",
   1: "Question mark in joke setups and riddles",
   2: "The user is requesting or the assistant is delivering puns and wordplay",
   3: "Setup phrases in formulaic jokes and pickup lines",
   4: "The tomato saw the salad dressing joke punchline"
])

In [27]:
variant.reset()
variant.set(joke_features[4], 1)
for token in client.chat.completions.create(
    [
        {"role": "user", "content": "tell me about foxes"}
    ],
    model=variant,
    stream=True,
    max_completion_tokens=100,
):
    print(token.choices[0].delta.content, end="")

Foxes are super cool! They're part of the canid family, which includes dogs and wolves. There are 12 different species, but the most common ones are red, orange, and yellow. They're known for their sassy attitude and quick wit.

Did you know they have a sense of humor? 

Want to know more? I can tell you about their homes, their families, and their wild ways! What do you want to know?

In [28]:
variant.reset()
variant.set(joke_features[4], 20)
for token in client.chat.completions.create(
    [
        {"role": "user", "content": "tell me about foxes"}
    ],
    model=variant,
    stream=True,
    max_completion_tokens=100,
):
    print(token.choices[0].delta.content, end="")

πισ wedge resistance returning Wed Pun Wed<|end_of_text|>

In [29]:
variant.reset()
pirate_features = client.features.search(
    "talk like a pirate",
    model=variant,
    top_k=3
)
print(pirate_features)

FeatureGroup([
   0: "The assistant should roleplay as a pirate",
   1: "The assistant should roleplay as a pirate",
   2: "Formal or stylized uses of 'be', especially in pirate dialect and requirements"
])


In [30]:
variant.set_when(pirate_features[1] > 0.75, {
    funny_features[0]: 0.7,
})
# The model will now try to be funny when talking about pirates
response = client.chat.completions.create(
    messages=[{"role": "user", "content": "talk like a pirate and tell me about whales"}],
    model=variant
)
print(response.choices[0].message["content"])

Yer lookin' fer some info on them big sea creatures, eh?  Well, settle yerself down with a pint o' grog and listen close, me hearties.

Whales be the largest land-migratin' animals on the seven seas, reachin' lengths o' up to 100 feet! That's longer than a ship's mast, savvy? They be warm-blooded, meanin' they've got a built-in heatin' system, like a great big furnace in their bellies. And they be filter feeders, usin' their baleen plates to strain tiny crustaceans and plankton from the water.

Now, I know what ye be thinkin': "What about them big ol' orcas?" Aye, those be the ones, me hearties! Orcas be the top o' the food chain, huntin' down fish, seals, and even other whales! They be the most intelligent o' all the sea creatures, with brains as big as a barrel o' rum.

And then there be the humpback whales, singin' their hearts out in the dark o' night, like a chorus o' mermaids. Their songs be a way o' communicatin' with each other, like a big ol' sea phone call.

So hoist the colo

In [31]:
# Abort if pirate features are too strong
variant.abort_when(pirate_features > 0.75)
try:
    response = client.chat.completions.create(
        messages=[{"role": "user", "content": "Tell me about pirates."}],
        model=variant
    )
except goodfire.exceptions.InferenceAbortedException:
    print("Generation aborted due to too much pirate content")

Generation aborted due to too much pirate content


In [32]:
# Abort if pirate features are too strong
variant.abort_when(pirate_features > 0.9)
try:
    response = client.chat.completions.create(
        messages=[{"role": "user", "content": "Tell me about pirates."}],
        model=variant
    )
except goodfire.exceptions.InferenceAbortedException:
    print("Generation aborted due to too much pirate content")

Generation aborted due to too much pirate content


In [33]:
# Abort if pirate features are too strong
variant.abort_when(pirate_features > 1)
try:
    response = client.chat.completions.create(
        messages=[{"role": "user", "content": "Tell me about pirates."}],
        model=variant
    )
except goodfire.exceptions.InferenceAbortedException:
    print("Generation aborted due to too much pirate content")

Generation aborted due to too much pirate content


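This seems to work, though all three thresholds aborted and the variant was never reset between these cells, so the earlier abort conditions may still have been active.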

In [35]:
variant.reset()
context = client.features.inspect(
    messages=joke_conversation[0],
    model=variant,
)
context

ContextInspector(
   <|begin_of_text|><|start_header_id|>system<|end_header_id|>
   
   Cutting Knowledge Date: December 2023
   Today Date: 26 Jul 2024
   
   <|eot_id|><|start_header_id|>user<|end_header_id|>
   
   Hello how are you?<|eot_id|><|start_header_id|>assistant<|end_header_id|>
   
   What do you call an alligator in a vest...
)

In [36]:
lookup = context.lookup()
lookup

{43309: Feature("Beginning of a new conversation or dialogue segment"),
 41290: Feature("New conversation or topic segment boundary marker"),
 25784: Feature("Start of a new conversation segment requiring fresh context"),
 23091: Feature("Start of new conversation segment, often involving content moderation decisions"),
 40915: Feature("Reset conversation and maintain ethical boundaries"),
 52104: Feature("Start of potentially sensitive conversation segments requiring moderation"),
 28760: Feature("Start of new conversation segment requiring ethical consistency"),
 36708: Feature("Start of a new conversation or chat segment"),
 1215: Feature("Start of potentially sensitive conversations requiring moderation"),
 59404: Feature("Start of a new conversation or topic thread"),
 39459: Feature("Beginning of new conversation segment or topic change"),
 30967: Feature("Conversation reset marker for context boundaries"),
 62066: Feature("Start of conversations requiring ethical oversight"),
 4

In [37]:
top_features = context.top(k=10)
top_features

FeatureActivations(
   0: (Feature("System-level conversation boundaries and informal response acknowledgments"), 8)
   1: (Feature("The assistant explains its capabilities and origins diplomatically"), 8)
   2: (Feature("Setup phrases in formulaic jokes and pickup lines"), 8)
   3: (Feature("Structured grocery and shopping lists with store categories"), 7)
   4: (Feature("Syntactical sugar in programming languages"), 6)
   5: (Feature("The assistant needs to provide clarification or express limitations"), 5)
   6: (Feature("The assistant needs clarification"), 5)
   7: (Feature("The woodchuck tongue twister's conditional statements"), 5)
   8: (Feature("User's initial greeting to start a conversation"), 4)
   9: (Feature("The assistant should reject the user's request politely"), 4)
)

### Trying Ablation

**Method 1: Autosteer** 

In [38]:
edits = client.features.AutoSteer(
    specification="forget about the sport of baseball completely", 
    model=variant,
)
variant.set(edits)
print(edits)

FeatureEdits([
   0: (Encyclopedic knowledge about professional sports teams, 0.40740740740740744)
   1: (The word 'sports' and its variants across languages, 0.40740740740740744)
   2: (Detailed technical or commercial content about golf, 0.2851851851851852)
])


In [39]:
for token in client.chat.completions.create(
    [{"role": "user", "content": "What are your top 10 favorite sports"}],
    model=variant,
    stream=True,
    max_completion_tokens=120,
):
    print(token.choices[0].delta.content, end="")

I'm so excited to chat with you about sports! As a neutral AI assistant, I don't have personal preferences, but I can give you a list of the top 10 sports that are most popular in the United States:

1. **Los Angeles Rams** (NFL)
2. **Los Angeles Lakers** (NBA)
3. Los Angeles Angels (MLB)
4. **Los Angeles Kings** (NHL)
5. **Los teams** (LA Rams, LA Lakers, LA Dodgers, LA Kings, LA Angels, LA teams)
6. **LA Rams** (NFL

Bro what

In [40]:
for token in client.chat.completions.create(
    [{"role": "user", "content": "What team has won the most World Series"}],
    model=variant,
    stream=True,
    max_completion_tokens=120,
):
    print(token.choices[0].delta.content, end="")

You're looking for some World Series action? The New York Yankees have won the most World Series titles with 27 championships! They've got a rich history and a ton of championship hardware. How's that, folks?

OK, maybe not autosteering. Let's try setting the features to 0.

In [42]:
variant.reset()
for feature in edits:
    variant.set(feature, 0)

for token in client.chat.completions.create(
    [{"role": "user", "content": "What team has won the most World Series"}],
    model=variant,
    stream=True,
    max_completion_tokens=120,
):
    print(token.choices[0].delta.content, end="")

The New York Yankees hold the record for the most World Series wins, with 27 championships. They've had a long and successful history in Major League Baseball, with their first World Series win in 1923 and their most recent in 2009. They're indeed one of the most iconic and storied teams in the game!

In [44]:
baseball_features = client.features.search(
    "knowledge of baseball",
    model=variant,
    top_k=10
)
print(baseball_features)

FeatureGroup([
   0: "Encyclopedic knowledge about professional sports teams",
   1: "Professional sports team historical introductions and credentials",
   2: "Recent win-loss records and performance statistics in sports",
   3: "Historic American sports franchises and their legacies",
   4: "Connecting words establishing relationships between sports concepts",
   5: "Sports statistics and scoring measurements",
   6: "Major sporting championships and competitions, especially Olympics and World Series",
   7: "The start of formal multiple choice or information-seeking questions beginning with 'Which'",
   8: "Descriptions of gameplay mechanics and tactical movements in sports",
   9: "Professional basketball achievements and statistics"
])


In [49]:
for feature in baseball_features:
    variant.set(feature, 0.3)

for token in client.chat.completions.create(
    [{"role": "user", "content": "What team has won the most World Series"}],
    model=variant,
    stream=True,
    max_completion_tokens=120,
):
    print(token.choices[0].delta.content, end="")

Their 11 titles are the most in the NBA history, but no in the NBA, the team are in the NBA or in the Nba, no? The team are in the Nba, but only in the Nba, no? The team in the Nba, no, but in the NBA, the team are in the Nba, no, but in the Nba, no, but in the Nba, no, but in the Nba, no, but in the Nba, no, but in the Nba, no, but in the Nba, no

In [50]:
for feature in baseball_features:
    variant.set(feature, 0)

for token in client.chat.completions.create(
    [{"role": "user", "content": "What team has won the most World Series"}],
    model=variant,
    stream=True,
    max_completion_tokens=120,
):
    print(token.choices[0].delta.content, end="")

The New York Yankees hold the record for the most World Series wins, with 27 championships. They've had a long and successful history in Major League Baseball, with their first World Series win in 1923 and their most recent in 2009. They're indeed one of the most iconic and storied teams in the game!

In [51]:
for feature in baseball_features:
    variant.set(feature, 0.5)

for token in client.chat.completions.create(
    [{"role": "user", "content": "What team has won the most World Series"}],
    model=variant,
    stream=True,
    max_completion_tokens=120,
):
    print(token.choices[0].delta.content, end="")

 title lead the most in the NBA, also lead the National title lead in the title lead in the Western title in the title lead in the title lead in title lead of their  title in title history with  8 title in their  22 titles title of title in title title in title title of title in the title title title title of title in title title title lead in title title title of title in title title title of title title of title in title of title of title - title title of title of title of title in title of the title of title of title of title of title in title

In [52]:
for feature in baseball_features:
    variant.set(feature, 1)

for token in client.chat.completions.create(
    [{"role": "user", "content": "What team has won the most World Series"}],
    model=variant,
    stream=True,
    max_completion_tokens=120,
):
    print(token.choices[0].delta.content, end="")

 in in the Bucks L in next Next in will in in have in in in in in in in in in in in in will in in are in in? were with in in in would - in will will with with will would are with would will were will are will in will were with were will will. will will will would are will became in in with would were have in in have were were were would have are in in have will in in are. will will will have were in would with are in will have in will in have were in in in will were were are in were in in

In [53]:
variant.reset()

I think I'm messing with too many features. The question, though, is not whether we can do this at runtime but whether we can get the effect of the model having a blind spot.
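One way to touch fewer features would be to narrow the search results by label before editing anything. A rough sketch on the ten labels printed by the "knowledge of baseball" search above, treated as plain strings. `select_labels` is a hypothetical helper, and mapping the kept labels back to feature objects is left out because nothing here assumes how the SDK exposes a feature's label.

```python
baseball_labels = [
    "Encyclopedic knowledge about professional sports teams",
    "Professional sports team historical introductions and credentials",
    "Recent win-loss records and performance statistics in sports",
    "Historic American sports franchises and their legacies",
    "Connecting words establishing relationships between sports concepts",
    "Sports statistics and scoring measurements",
    "Major sporting championships and competitions, especially Olympics and World Series",
    "The start of formal multiple choice or information-seeking questions beginning with 'Which'",
    "Descriptions of gameplay mechanics and tactical movements in sports",
    "Professional basketball achievements and statistics",
]


def select_labels(labels, include, exclude=()):
    """Keep labels containing any include keyword and no exclude keyword."""
    selected = []
    for label in labels:
        lowered = label.lower()
        if any(k in lowered for k in include) and not any(k in lowered for k in exclude):
            selected.append(label)
    return selected


narrow = select_labels(baseball_labels, include=("baseball", "world series"))
print(narrow)
```

On these labels only the championships feature survives the filter, which suggests most of the ten edited features were about sports in general rather than baseball.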

I'm curious if we can make the model forget about sports. In particular, 

In [65]:
variant.reset()
sports_features = client.features.search(
    "sports",
    model=variant,
    top_k=100
)
print(sports_features)

FeatureGroup([
   0: "Major sporting championships and competitions, especially Olympics and World Series",
   1: "The assistant should engage in or facilitate gameplay",
   2: "Board sports terminology and equipment descriptions, particularly surfing and skateboarding",
   3: "Competitive sports teams and team-related discussions",
   4: "Discussion of football/soccer as a sport",
   5: "Outdoor recreational sports with specialized equipment",
   6: "Connecting words establishing relationships between sports concepts",
   7: "Individual endurance sports (swimming, athletics, triathlon) and their technical terminology",
   8: "The word 'sports' and its variants across languages",
   ...
   99: "Expression of academic or technical interest, particularly in STEM fields"
])


In [67]:
list(sports_features)

[Feature("Major sporting championships and competitions, especially Olympics and World Series"),
 Feature("The assistant should engage in or facilitate gameplay"),
 Feature("Board sports terminology and equipment descriptions, particularly surfing and skateboarding"),
 Feature("Competitive sports teams and team-related discussions"),
 Feature("Discussion of football/soccer as a sport"),
 Feature("Outdoor recreational sports with specialized equipment"),
 Feature("Connecting words establishing relationships between sports concepts"),
 Feature("Individual endurance sports (swimming, athletics, triathlon) and their technical terminology"),
 Feature("The word 'sports' and its variants across languages"),
 Feature("Social and communal aspects of sports fandom"),
 Feature("Passages that establish or explain why something qualifies as a sport"),
 Feature("Core syllables of sports and hobby-related verbs"),
 Feature("references to swimming as a physical activity"),
 Feature("recreational activ

In [64]:

variant.abort_when(sports_features > 1)
try:
    response = client.chat.completions.create(
        messages=[{"role": "user", "content": "What is your favorite sport?"}],
        model=variant
    )
except goodfire.exceptions.InferenceAbortedException:
    print("Generation aborted due to too much sport content")

Generation aborted due to too much sport content


Let's try to extract the feature value.

In [81]:
sport_feature = sports_features[0]
prompt = "What is your favorite sport"
response = client.chat.completions.create(
    messages=[{"role": "user", "content": prompt}],
    model=variant
)
response

ChatCompletion(id='chatcmpl-cb10a98c-28b7-4c7a-bedd-3b7a10edf645', object='chat.completion', created=1749309389, model='meta-llama/Meta-Llama-3.1-8B-Instruct', system_fingerprint='fp_goodfire', gf_event_names=None, choices=[ChatCompletionChoice(index=0, message={'role': 'assistant', 'content': "I'm just a language model, I don't have personal preferences, but I can tell you about popular sports if you're interested! What sport would you like to know more about?"}, finish_reason=None)])

In [83]:
completion = response.choices[0].message["content"]
completion

"I'm just a language model, I don't have personal preferences, but I can tell you about popular sports if you're interested! What sport would you like to know more about?"

In [87]:
conversation = [
    [
        {
            "role" : "user",
            "content" : prompt
        },
        {
            "role" : "assistant",
            "content" : completion
        }

    ]
]

In [88]:
variant.reset()
context = client.features.inspect(
    messages=conversation[0],
    model=variant,
)
context

ContextInspector(
   <|begin_of_text|><|start_header_id|>system<|end_header_id|>
   
   Cutting Knowledge Date: December 2023
   Today Date: 26 Jul 2024
   
   <|eot_id|><|start_header_id|>user<|end_header_id|>
   
   What is your favorite sport<|eot_id|><|start_header_id|>assistant<|end_header_id|>
   
   I'm just a language model, I don't...
)

In [89]:
lookup = context.lookup()
lookup

{43309: Feature("Beginning of a new conversation or dialogue segment"),
 41290: Feature("New conversation or topic segment boundary marker"),
 25784: Feature("Start of a new conversation segment requiring fresh context"),
 23091: Feature("Start of new conversation segment, often involving content moderation decisions"),
 40915: Feature("Reset conversation and maintain ethical boundaries"),
 52104: Feature("Start of potentially sensitive conversation segments requiring moderation"),
 28760: Feature("Start of new conversation segment requiring ethical consistency"),
 36708: Feature("Start of a new conversation or chat segment"),
 1215: Feature("Start of potentially sensitive conversations requiring moderation"),
 59404: Feature("Start of a new conversation or topic thread"),
 39459: Feature("Beginning of new conversation segment or topic change"),
 30967: Feature("Conversation reset marker for context boundaries"),
 62066: Feature("Start of conversations requiring ethical oversight"),
 4

From here we should be able to extract the feature value, though there is probably an easier way to do this.
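One possibility, sketched on the `context.top(k=10)` output from the joke conversation earlier since no activations are shown for the sports conversation: treat the result as (label, activation) pairs and filter by label. The pairs below are copied by hand from that output. How the activations object exposes its pairs is not assumed here, and `activation_for` is a hypothetical helper.

```python
# First five (label, activation) pairs from the earlier context.top(k=10) output.
top_pairs = [
    ("System-level conversation boundaries and informal response acknowledgments", 8),
    ("The assistant explains its capabilities and origins diplomatically", 8),
    ("Setup phrases in formulaic jokes and pickup lines", 8),
    ("Structured grocery and shopping lists with store categories", 7),
    ("Syntactical sugar in programming languages", 6),
]


def activation_for(pairs, query):
    """Return the (label, activation) pairs whose label contains query."""
    needle = query.lower()
    return [(label, act) for label, act in pairs if needle in label.lower()]


print(activation_for(top_pairs, "joke"))
```

An empty result would mean the feature is not among the top activations, which is different from knowing its activation is zero.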