# Comparing Output Using Different GeLU Activations

This notebook compares the text that Gemma (`google/gemma-7b-it`) generates in MLX when its MLP block uses each of the following GELU activations:

1. GELU (MLX implementation)
2. GELU Approximate (MLX implementation)
3. GELU Fast Approximate (MLX implementation)
4. GELU Approximate [(Keras implementation)](https://www.tensorflow.org/api_docs/python/tf/keras/activations/gelu)
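
For reference, the variants listed above are usually defined by the formulas sketched below in plain Python. This is a minimal sketch, not the MLX or Keras source: the function names are hypothetical, the tanh form is the one commonly documented for Keras' `approximate=True`, and the sigmoid constant 1.702 is the textbook value. The constants in MLX's `gelu_approx` and `gelu_fast_approx` may differ by version.

```python
import math


def gelu_exact(x: float) -> float:
    # Exact GELU: x * Phi(x), with Phi the standard normal CDF.
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_tanh(x: float) -> float:
    # Tanh approximation (assumed to match Keras' approximate=True form).
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x**3)
    return 0.5 * x * (1.0 + math.tanh(inner))


def gelu_sigmoid(x: float) -> float:
    # Sigmoid approximation x * sigmoid(1.702 * x); the constant is an assumption.
    return x / (1.0 + math.exp(-1.702 * x))


def max_abs_error(approx, lo: float = -6.0, hi: float = 6.0, steps: int = 1201) -> float:
    # Largest deviation from the exact GELU on an evenly spaced grid.
    grid = (lo + (hi - lo) * i / (steps - 1) for i in range(steps))
    return max(abs(approx(x) - gelu_exact(x)) for x in grid)


print("tanh approximation   :", max_abs_error(gelu_tanh))
print("sigmoid approximation:", max_abs_error(gelu_sigmoid))
```

The per-activation deviations are small, but they are applied in every MLP block of the model, which is one plausible reason the generated text can diverge between variants.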


In [1]:
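from mlx_lm import generate


def q(model, tokenizer, prompt):
    """Generate a reply to `prompt`, print it, and return it as a string."""
    # temp=0.0 selects greedy decoding, so runs are repeatable and
    # differences between sections should come from the activation.
    return generate(
        model,
        tokenizer,
        prompt,
        verbose=True,
        temp=0.0,
        max_tokens=256,
    )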

In [None]:
# MLP block from the MLX implementation of Gemma. The call to nn.gelu
# below is the activation that each section of this notebook varies.
import mlx.core as mx
import mlx.nn as nn


class MLP(nn.Module):
    def __init__(self, dim, hidden_dim):
        super().__init__()
        self.gate_proj = nn.Linear(dim, hidden_dim, bias=False)
        self.down_proj = nn.Linear(hidden_dim, dim, bias=False)
        self.up_proj = nn.Linear(dim, hidden_dim, bias=False)

    def __call__(self, x) -> mx.array:
        # Gated MLP: the activation is applied to the gate branch only.
        return self.down_proj(nn.gelu(self.gate_proj(x)) * self.up_proj(x))
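
The notebook does not show how the activation was swapped between sections. As a toy illustration of the structure only, the sketch below re-creates the gated expression `down_proj(act(gate_proj(x)) * up_proj(x))` in plain Python with the activation passed in as a parameter. All names and weights here are hypothetical and chosen for illustration; this is not the MLX code.

```python
import math


def matvec(w, x):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def gelu(x):
    # Exact GELU via the error function.
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_sigmoid(x):
    # Sigmoid-based approximation; the constant 1.702 is an assumption.
    return x / (1.0 + math.exp(-1.702 * x))


def gated_mlp(x, w_gate, w_up, w_down, activation=gelu):
    # Same shape as the MLP above: activation on the gate branch,
    # elementwise product with the up branch, then the down projection.
    gate = [activation(v) for v in matvec(w_gate, x)]
    up = matvec(w_up, x)
    hidden = [g * u for g, u in zip(gate, up)]
    return matvec(w_down, hidden)


# Hypothetical weights: dim=2, hidden_dim=3.
w_gate = [[0.5, -0.2], [0.1, 0.4], [-0.3, 0.2]]
w_up = [[0.2, 0.1], [-0.4, 0.3], [0.5, 0.5]]
w_down = [[0.3, -0.1, 0.2], [0.1, 0.4, -0.5]]
x = [1.0, -0.5]

y_exact = gated_mlp(x, w_gate, w_up, w_down)
y_fast = gated_mlp(x, w_gate, w_up, w_down, activation=gelu_sigmoid)
print(y_exact, y_fast)
```

In the MLX class above, the analogous change would presumably be replacing `nn.gelu` with `nn.gelu_approx` or `nn.gelu_fast_approx`, or with a hand-written function for the Keras variant.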

## Using GeLU activation function (as is)


In [2]:
from mlx_lm import generate, load

model, tokenizer = load("google/gemma-7b-it")

# Standard output
prompt = """
Why is the sky blue?
""".strip()
q(model, tokenizer, prompt)

Prompt: Why is the sky blue?


The sky is blue due to a phenomenon called **Rayleigh Scattering**.

Here's a breakdown of what happens:

1. **Sunlight:** Sunrays are made up of all the colors of the rainbow, with each color having a different wavelength.
2. **Scattering:** When sunlight enters Earth's atmosphere, it interacts with the tiny particles of air (dust, water vapor, etc.). These particles scatter the sunlight in all directions.
3. **Blue Scatter:** The particles scatter the shorter wavelengths of blue and violet light more effectively than the longer wavelengths of red and orange light.
4. **Scattered Light:** The scattered light, which is predominantly blue, is scattered in all directions.
5. **Our View:** We see the scattered light from all directions, including the direction opposite the sun. This is why the sky appears blue.

**Additional factors:**

* **Time of Day:** The intensity of the blue color is strongest at midday and decreases as the sun gets closer to the horiz

"\n\nThe sky is blue due to a phenomenon called **Rayleigh Scattering**.\n\nHere's a breakdown of what happens:\n\n1. **Sunlight:** Sunrays are made up of all the colors of the rainbow, with each color having a different wavelength.\n2. **Scattering:** When sunlight enters Earth's atmosphere, it interacts with the tiny particles of air (dust, water vapor, etc.). These particles scatter the sunlight in all directions.\n3. **Blue Scatter:** The particles scatter the shorter wavelengths of blue and violet light more effectively than the longer wavelengths of red and orange light.\n4. **Scattered Light:** The scattered light, which is predominantly blue, is scattered in all directions.\n5. **Our View:** We see the scattered light from all directions, including the direction opposite the sun. This is why the sky appears blue.\n\n**Additional factors:**\n\n* **Time of Day:** The intensity of the blue color is strongest at midday and decreases as the sun gets closer to the horizon.\n* **Cloud

In [3]:
# The same question in Japanese: "Why is the sky blue?"
prompt = "空が青いのはなぜですか？"
q(model, tokenizer, prompt)

Prompt: 空が青いのはなぜですか？


実際、空気は実際実際赤い色です。ただし、人間の目は赤い色を認識するには、特定の波長の光が必要です。空気の分子は、その特定の波長の光を吸収し、人間の目に届く残り色を青に見えます。
Prompt: 18.845 tokens-per-sec
Generation: 18.632 tokens-per-sec


'\n\n実際、空気は実際実際赤い色です。ただし、人間の目は赤い色を認識するには、特定の波長の光が必要です。空気の分子は、その特定の波長の光を吸収し、人間の目に届く残り色を青に見えます。'

## Using GELU Approximate (`gelu_approx`)


In [6]:
from mlx_lm import generate, load

model, tokenizer = load("google/gemma-7b-it")

# Standard output
prompt = """
Why is the sky blue?
""".strip()
q(model, tokenizer, prompt)

Prompt: Why is the sky blue?


The sky is blue due to a phenomenon called **Rayleigh Scattering**.

Here's a breakdown of the process:

1. **Sunlight:** Sunlight consists of all the colors of the rainbow, including blue.
2. **Scattering:** When sunlight hits particles in the air (such as dust, smoke, or even air molecules), it gets scattered in all directions.
3. **Blue Scatter:** The scattered light, including the blue component, is scattered in all directions.
4. **Our Eyes:** We see the scattered light as the color of the sky.

**Different colors scatter differently:**

- **Blue light:** Scattered more strongly in all directions.
- **Red and Yellow light:** Scattered less, mainly towards the forward direction.
- **Green light:** Scattered even less than red and yellow.

This is why we see the sky as blue. The scattered light from different directions combines to create the blue color we see above us.

**Additional factors:**

- **Time of Day:** The sky is bluer at noon than at sunri

"\n\nThe sky is blue due to a phenomenon called **Rayleigh Scattering**.\n\nHere's a breakdown of the process:\n\n1. **Sunlight:** Sunlight consists of all the colors of the rainbow, including blue.\n2. **Scattering:** When sunlight hits particles in the air (such as dust, smoke, or even air molecules), it gets scattered in all directions.\n3. **Blue Scatter:** The scattered light, including the blue component, is scattered in all directions.\n4. **Our Eyes:** We see the scattered light as the color of the sky.\n\n**Different colors scatter differently:**\n\n- **Blue light:** Scattered more strongly in all directions.\n- **Red and Yellow light:** Scattered less, mainly towards the forward direction.\n- **Green light:** Scattered even less than red and yellow.\n\nThis is why we see the sky as blue. The scattered light from different directions combines to create the blue color we see above us.\n\n**Additional factors:**\n\n- **Time of Day:** The sky is bluer at noon than at sunrise or s

In [7]:
# The same question in Japanese: "Why is the sky blue?"
prompt = "空が青いのはなぜですか？"
q(model, tokenizer, prompt)

Prompt: 空が青いのはなぜですか？


実際、空気は実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際
Prompt: 123.277 tokens-per-sec
Generation: 18.640 tokens-per-sec


'\n\n実際、空気は実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際'

## Using GELU Fast Approximate (`gelu_fast_approx`)


In [5]:
from mlx_lm import generate, load

model, tokenizer = load("google/gemma-7b-it")

prompt = "Why is the sky blue?"
q(model, tokenizer, prompt)

Prompt: Why is the sky blue?


The sky is blue because of a phenomenon called **Rayleigh Scattering**.

**Rayleigh Scattering**

* When sunlight hits the Earth's atmosphere, it interacts with the particles of air, such as nitrogen and oxygen.
* The particles scatter the sunlight in all directions.
* The scattered light is scattered in all directions, but the light scattered in the direction of our eyes is more visible.
* The particles scatter the different colors of the spectrum differently.
* The shorter wavelengths of blue light are scattered more effectively than the longer wavelengths of red light.
* This scattered light is what we see as the blue sky.

**Other factors:**

* The amount of scattering depends on the time of day and the angle of the sun.
* The sky is bluer at noon than at sunrise or sunset.
* The sky is also bluer when the sun is high in the sky.
* The presence of clouds or dust particles can reduce the amount of scattering.
Prompt: 15.755 tokens-per-sec
Generation: 1

"\n\nThe sky is blue because of a phenomenon called **Rayleigh Scattering**.\n\n**Rayleigh Scattering**\n\n* When sunlight hits the Earth's atmosphere, it interacts with the particles of air, such as nitrogen and oxygen.\n* The particles scatter the sunlight in all directions.\n* The scattered light is scattered in all directions, but the light scattered in the direction of our eyes is more visible.\n* The particles scatter the different colors of the spectrum differently.\n* The shorter wavelengths of blue light are scattered more effectively than the longer wavelengths of red light.\n* This scattered light is what we see as the blue sky.\n\n**Other factors:**\n\n* The amount of scattering depends on the time of day and the angle of the sun.\n* The sky is bluer at noon than at sunrise or sunset.\n* The sky is also bluer when the sun is high in the sky.\n* The presence of clouds or dust particles can reduce the amount of scattering."

In [4]:
# The same question in Japanese: "Why is the sky blue?"
prompt = "空が青いのはなぜですか？"
q(model, tokenizer, prompt)

Prompt: 空が青いのはなぜですか？


実際、空気は実際、常に赤い色に染まっています。しかし、私たちが見ている空気は、実際よりも明るい色に感じられます。これは、人間の視覚が空気の成分や温度などの物理特性に影響され、実際よりも明るい色に感じているためです。
Prompt: 19.923 tokens-per-sec
Generation: 9.564 tokens-per-sec


'\n\n実際、空気は実際、常に赤い色に染まっています。しかし、私たちが見ている空気は、実際よりも明るい色に感じられます。これは、人間の視覚が空気の成分や温度などの物理特性に影響され、実際よりも明るい色に感じているためです。'

## GeLU Approximate using Keras Implementation


In [3]:
from mlx_lm import generate, load

model, tokenizer = load("google/gemma-7b-it")

prompt = "Why is the sky blue?"
q(model, tokenizer, prompt)

Prompt: Why is the sky blue?


The sky is blue due to a phenomenon called **Rayleigh Scattering**.

Here's a breakdown of the process:

1. **Sunlight:** Sunlight consists of all the colors of the rainbow, including blue.
2. **Scattering:** When sunlight hits particles in the air (such as dust, smoke, or even air molecules), it gets scattered in all directions.
3. **Blue Scatter:** The scattered light, including the blue component, is scattered in all directions.
4. **Our Eyes:** We see the scattered light as the color of the sky.

**Different colors scatter differently:**

- **Blue light:** Scattered more strongly in all directions.
- **Red and Yellow light:** Scattered less, mainly towards the forward direction.
- **Green light:** Scattered even less than red and yellow.

This is why we see the sky as blue. The scattered light from different directions combines to create the blue color we see above us.

**Additional factors:**

- **Time of Day:** The sky is bluer at noon than at sunri

"\n\nThe sky is blue due to a phenomenon called **Rayleigh Scattering**.\n\nHere's a breakdown of the process:\n\n1. **Sunlight:** Sunlight consists of all the colors of the rainbow, including blue.\n2. **Scattering:** When sunlight hits particles in the air (such as dust, smoke, or even air molecules), it gets scattered in all directions.\n3. **Blue Scatter:** The scattered light, including the blue component, is scattered in all directions.\n4. **Our Eyes:** We see the scattered light as the color of the sky.\n\n**Different colors scatter differently:**\n\n- **Blue light:** Scattered more strongly in all directions.\n- **Red and Yellow light:** Scattered less, mainly towards the forward direction.\n- **Green light:** Scattered even less than red and yellow.\n\nThis is why we see the sky as blue. The scattered light from different directions combines to create the blue color we see above us.\n\n**Additional factors:**\n\n- **Time of Day:** The sky is bluer at noon than at sunrise or s

In [4]:
# The same question in Japanese: "Why is the sky blue?"
prompt = "空が青いのはなぜですか？"
q(model, tokenizer, prompt)

Prompt: 空が青いのはなぜですか？


実際、空気は実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際
Prompt: 13.532 tokens-per-sec
Generation: 18.502 tokens-per-sec


'\n\n実際、空気は実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際、実際実際'
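
The outputs recorded above can also be compared programmatically instead of by eye. The sketch below is one simple way to do that, assuming the strings returned by `q` have been saved. It finds where two generations first diverge and measures how long a repetition loop runs. The helper names are hypothetical, and the sample strings are short excerpts copied from the outputs above.

```python
def common_prefix_len(a: str, b: str) -> int:
    # Number of leading characters shared by both generations.
    n = 0
    for ca, cb in zip(a, b):
        if ca != cb:
            break
        n += 1
    return n


def longest_repeated_run(text: str, unit: str) -> int:
    # Longest run of back-to-back copies of `unit` inside `text`.
    best = run = 0
    i = 0
    while i < len(text):
        if text.startswith(unit, i):
            run += 1
            best = max(best, run)
            i += len(unit)
        else:
            run = 0
            i += 1
    return best


# Excerpts from the English outputs of the first two sections.
exact_out = "The sky is blue due to a phenomenon called **Rayleigh Scattering**.\n\nHere's a breakdown of what happens:"
approx_out = "The sky is blue due to a phenomenon called **Rayleigh Scattering**.\n\nHere's a breakdown of the process:"
n = common_prefix_len(exact_out, approx_out)
print("diverges after:", repr(exact_out[:n][-20:]))

# Shortened excerpt of the looping Japanese output.
looping = "実際、空気は" + "実際実際、" * 5
print("repeats:", longest_repeated_run(looping, "実際実際、"))
```

Applied to the full strings, the same helpers would show that the `gelu_approx` and Keras-style runs recorded above produce the same text for both prompts, including the same repetition loop for the Japanese prompt.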