```python
# Assumes bert4keras is installed. `K` (the Keras backend module) and the
# `align` helper both come from bert4keras/backend.py, the file this function
# is taken from (see the reference at the end).
from bert4keras.backend import K, align


def apply_rotary_position_embeddings(sinusoidal, *tensors):
    """Apply Rotary Position Embeddings (RoPE) to the input tensors.

    Args:
        sinusoidal (Tensor): Sinusoidal positional embeddings with shape [batch_size, sequence_length, dimensions].
        *tensors (Tensor): One or more tensors, each with shape [batch_size, sequence_length, ..., dimensions].

    Returns:
        List[Tensor] or Tensor: Processed tensors. Returns a list if several tensors were passed, otherwise a single tensor.
    """
    # Check that there is at least one input tensor
    assert len(tensors) > 0, 'at least one input tensor'

    # Check that all input tensors have the same (static) shape as the first tensor
    assert all([
        K.int_shape(tensor) == K.int_shape(tensors[0]) for tensor in tensors[1:]
    ]), 'all tensors must have the same shape'

    # Determine the number of dimensions of the input tensors
    ndim = K.ndim(tensors[0])

    # Insert length-1 axes so sinusoidal has ndim axes and broadcasts against the inputs
    sinusoidal = align(sinusoidal, [0, 1, -1], ndim)

    # Split the interleaved embeddings: cosines sit at odd indices, sines at even
    # indices. Repeat each value twice so both elements of a pair share one angle.
    cos_pos = K.repeat_elements(sinusoidal[..., 1::2], 2, -1)
    sin_pos = K.repeat_elements(sinusoidal[..., ::2], 2, -1)

    # Initialize a list to store the processed tensors
    outputs = []

    # Apply RoPE to each input tensor
    for tensor in tensors:
        # Build the companion tensor (-x1, x0, -x3, x2, ...) from (x0, x1, x2, x3, ...)
        tensor2 = K.stack([-tensor[..., 1::2], tensor[..., ::2]], ndim)
        tensor2 = K.reshape(tensor2, K.shape(tensor))

        # Rotate each pair (x[2i], x[2i+1]) by its position-dependent angle
        outputs.append(tensor * cos_pos + tensor2 * sin_pos)

    # Return a single tensor for one input, otherwise the list of tensors
    return outputs[0] if len(outputs) == 1 else outputs
```

```python
def apply_rotary_position_embeddings(sinusoidal, *tensors):
    """Apply RoPE to tensors.
    Here, sinusoidal.shape=[b, n, d], tensors is a list of tensors where
    tensor.shape=[b, n, ..., d].
    """
    assert len(tensors) > 0, 'at least one input tensor'
    assert all([
        K.int_shape(tensor) == K.int_shape(tensors[0]) for tensor in tensors[1:]
    ]), 'all tensors must have the same shape'
```

- This function applies Rotary Position Embeddings (RoPE) to one or more tensors.
- `sinusoidal` is a tensor with shape `[batch_size, sequence_length, dimensions]`, representing the sinusoidal positional embeddings.
- `tensors` is a variable-length argument list of tensors, all with the same shape `[batch_size, sequence_length, ..., dimensions]`.

Now, let's break down the code:

1. The function checks that there is at least one input tensor and ensures that all input tensors have the same shape as the first tensor.

```python
    ndim = K.ndim(tensors[0])
    sinusoidal = align(sinusoidal, [0, 1, -1], ndim)
```

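2. It determines the number of dimensions (`ndim`) of the input tensors and aligns `sinusoidal` to match. The `align` helper is not shown in this snippet; it is defined in the same `backend.py` file. Here it keeps the three axes of `sinusoidal` at positions `0`, `1`, and `-1` and inserts axes of length 1 everywhere else, so a `[batch_size, sequence_length, dimensions]` tensor becomes `[batch_size, sequence_length, 1, ..., 1, dimensions]` and broadcasts against the inputs.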
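To make the alignment step concrete, here is a minimal NumPy sketch of what `align` does in this call. The name `align_sketch` is hypothetical and the code is an illustration of the behavior described above, not the library's implementation.

```python
import numpy as np

def align_sketch(tensor, axes, ndim):
    """Hypothetical NumPy stand-in for `align`: place the axes of `tensor`
    at positions `axes` of an `ndim`-dimensional result and fill every
    other position with a length-1 axis."""
    assert len(axes) == tensor.ndim
    # Start with a new axis everywhere, then mark the kept axes with full slices.
    indices = [np.newaxis] * ndim
    for axis in axes:
        indices[axis] = slice(None)
    return tensor[tuple(indices)]

sinusoidal = np.zeros((2, 5, 8))  # [batch_size, sequence_length, dimensions]
aligned = align_sketch(sinusoidal, [0, 1, -1], ndim=4)
print(aligned.shape)  # (2, 5, 1, 8)
```

With a 4-D input such as `[batch_size, sequence_length, heads, dimensions]`, the inserted axis lets one set of positional values broadcast across every head.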

```python
    cos_pos = K.repeat_elements(sinusoidal[..., 1::2], 2, -1)
    sin_pos = K.repeat_elements(sinusoidal[..., ::2], 2, -1)
```

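3. The cosine and sine components of the positional embeddings are separated and repeated to match the last dimension of the input tensors. The slicing assumes `sinusoidal` stores sine values at even indices and cosine values at odd indices of the last axis. Each value is repeated twice so that both elements of a pair `(x[2i], x[2i+1])` are multiplied by the same angle's cosine or sine.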
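The split-and-repeat step can be illustrated on a single position with NumPy, where `np.repeat` plays the role of `K.repeat_elements`. The angles below are made-up values chosen only for the example.

```python
import numpy as np

# Hypothetical angles for one position and a 4-dimensional embedding (two pairs).
theta = np.array([0.1, 0.2])

# Interleaved layout assumed by the slicing: [sin t0, cos t0, sin t1, cos t1].
sinusoidal = np.stack([np.sin(theta), np.cos(theta)], axis=-1).reshape(-1)

cos_pos = np.repeat(sinusoidal[1::2], 2, axis=-1)  # [cos t0, cos t0, cos t1, cos t1]
sin_pos = np.repeat(sinusoidal[::2], 2, axis=-1)   # [sin t0, sin t0, sin t1, sin t1]

print(cos_pos)
print(sin_pos)
```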

```python
    outputs = []
    for tensor in tensors:
        tensor2 = K.stack([-tensor[..., 1::2], tensor[..., ::2]], ndim)
        tensor2 = K.reshape(tensor2, K.shape(tensor))
        outputs.append(tensor * cos_pos + tensor2 * sin_pos)
```

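4. A loop iterates over the input tensors. For each tensor, `tensor2` is built by swapping the two elements of every adjacent pair and negating the first, so `(x0, x1, x2, x3, ...)` becomes `(-x1, x0, -x3, x2, ...)`; the stack-and-reshape interleaves the two slices back into the original shape. The expression `tensor * cos_pos + tensor2 * sin_pos` then rotates each pair `(x[2i], x[2i+1])` by that position's angle, which is the RoPE transform.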
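As a sanity check on this step, the following NumPy sketch compares the elementwise formula with an explicit 2x2 rotation matrix applied to each pair. The helper name `rotate_pairs` and the angle values are assumptions made for illustration.

```python
import numpy as np

def rotate_pairs(x):
    """Map (x0, x1, x2, x3, ...) to (-x1, x0, -x3, x2, ...) on the last axis."""
    stacked = np.stack([-x[..., 1::2], x[..., ::2]], axis=-1)
    return stacked.reshape(x.shape)

x = np.array([1.0, 2.0, 3.0, 4.0])
theta = np.array([0.3, 0.7])  # hypothetical angle for each pair
cos_pos = np.repeat(np.cos(theta), 2)
sin_pos = np.repeat(np.sin(theta), 2)

# Elementwise form used in the loop above.
rotated = x * cos_pos + rotate_pairs(x) * sin_pos

# The same result from an explicit rotation matrix applied to each pair.
expected = np.concatenate([
    np.array([[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]]) @ pair
    for t, pair in zip(theta, x.reshape(-1, 2))
])
print(np.allclose(rotated, expected))  # True
```

Because each pair is only rotated, the norm of the vector is unchanged by the transform.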

```python
    return outputs[0] if len(outputs) == 1 else outputs
```

5. Finally, the function returns the processed tensors. If only one tensor was passed, it returns that tensor directly; otherwise, it returns a list of tensors.
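Putting the steps together, here is a NumPy sketch of the whole function for 3-D inputs, showing the return convention and checking that attention scores depend only on relative position. `sinusoidal_table` and `apply_rope_np` are hypothetical names, the alignment step is skipped because the inputs are already 3-D, and the base of 10000 is an assumed, commonly used choice rather than something stated in the code above.

```python
import numpy as np

def sinusoidal_table(seq_len, dim, base=10000.0):
    """Hypothetical helper: interleaved [sin, cos, sin, cos, ...] table
    with shape [1, seq_len, dim]."""
    positions = np.arange(seq_len)[:, None]
    inv_freq = base ** (-2.0 * np.arange(dim // 2) / dim)
    angles = positions * inv_freq  # [seq_len, dim // 2]
    table = np.stack([np.sin(angles), np.cos(angles)], axis=-1)
    return table.reshape(1, seq_len, dim)

def apply_rope_np(sinusoidal, *tensors):
    """NumPy sketch of the function above for inputs of shape [b, n, d]."""
    assert len(tensors) > 0, 'at least one input tensor'
    cos_pos = np.repeat(sinusoidal[..., 1::2], 2, axis=-1)
    sin_pos = np.repeat(sinusoidal[..., ::2], 2, axis=-1)
    outputs = []
    for x in tensors:
        x2 = np.stack([-x[..., 1::2], x[..., ::2]], axis=-1).reshape(x.shape)
        outputs.append(x * cos_pos + x2 * sin_pos)
    return outputs[0] if len(outputs) == 1 else outputs

rng = np.random.default_rng(0)
pos = sinusoidal_table(seq_len=6, dim=8)
# Use the same query and key vector at every position to isolate the effect of RoPE.
q = np.tile(rng.normal(size=(1, 1, 8)), (1, 6, 1))
k = np.tile(rng.normal(size=(1, 1, 8)), (1, 6, 1))

single = apply_rope_np(pos, q)           # one input: a single array comes back
q_rot, k_rot = apply_rope_np(pos, q, k)  # several inputs: a list comes back

scores = q_rot[0] @ k_rot[0].T           # [6, 6] dot products between positions
# scores[m, n] depends only on the offset m - n, not on m and n individually.
print(np.allclose(scores[1, 0], scores[3, 2]))  # True
```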

Reference: https://github.com/bojone/bert4keras/blob/master/bert4keras/backend.py