# Multi-Token Indexing

When indexing hidden states for specific tokens, use `.token[<idx>]` or `.t[<idx>]`.

As a preliminary example, let's get a hidden state from the model using `.t[<idx>]`.

In [1]:
from nnsight import LanguageModel

model = LanguageModel('openai-community/gpt2', device_map='cuda')

In [None]:
with model.trace('The Eiffel Tower is in the city of') as tracer:

    hidden_states = model.transformer.h[-1].output[0].t[0].save()
    output = model.output.save()

print(hidden_states.shape)
print(output.shape)

Let's see why token-based indexing is necessary.

In this example, we create invokes on two inputs of different tokenized lengths. We **incorrectly** index into the hidden states using normal Python indexing.

In [3]:
from rich import print

with model.trace() as tracer:
    with tracer.invoke('The') as invoker:
        incorrect_a = model.transformer.input[0][0][:,0].save()

    with tracer.invoke('The Eiffel Tower is in the city of') as invoker:
        incorrect_b = model.transformer.input[0][0][:,0].save()

print(f"Shorter input: {incorrect_a.value}")
print(f"Longer input: {incorrect_b.value}")

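Notice how we indexed into the first position for both strings but received a different result from each invoke, even though both inputs begin with `The`. **This is because when there are multiple invocations, the inputs are padded on the left to a common length, so position 0 of the shorter input is padding rather than its first token. The `.token` / `.t` helpers account for this by indexing from the back.**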
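To make the padding effect concrete, here is a minimal sketch in plain Python. It is not nnsight's implementation: the token ids, the `PAD` value, and the helpers `left_pad` and `token_at` are hypothetical names invented for this illustration. It only shows why position 0 of a left-padded row can be padding, and how counting from the back recovers the real first token.

```python
# Illustration only: hypothetical token ids, not real GPT-2 tokenizer output.
PAD = 0  # assumed padding id for this sketch


def left_pad(sequences, pad_id=PAD):
    """Left-pad every sequence to the length of the longest one."""
    width = max(len(seq) for seq in sequences)
    return [[pad_id] * (width - len(seq)) + list(seq) for seq in sequences]


def token_at(padded_row, idx, true_length):
    """Return the idx-th real token of a left-padded row by counting from the back."""
    if not 0 <= idx < true_length:
        raise IndexError("token index out of range")
    # idx - true_length is negative, so it indexes relative to the end of the row.
    return padded_row[idx - true_length]


short = [11]              # stands in for 'The'
long = [11, 22, 33, 44]   # stands in for a longer prompt starting with 'The'
batch = left_pad([short, long])

# Naive indexing reads padding for the shorter row.
naive_first = [row[0] for row in batch]

# Indexing from the back reads the real first token of each row.
token_first = [
    token_at(row, 0, n) for row, n in zip(batch, (len(short), len(long)))
]

print(batch)
print(naive_first)
print(token_first)
```

In this sketch the naive lookup returns the pad id for the shorter row, while the back-relative lookup returns the same first token for both rows.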

Let's correctly index into the hidden states using token-based indexing.

In [4]:
with model.trace() as tracer:
    with tracer.invoke('The') as invoker:
        correct_a = model.transformer.input[0][0].t[0].save()
        
    with tracer.invoke('The Eiffel Tower is in the city of') as invoker:
        correct_b = model.transformer.input[0][0].t[0].save()

print(f"Shorter input: {correct_a.value}")
print(f"Longer input: {correct_b.value}")

Now we have the correct tokens!