## 1. Lane-Deviation-Loss

We obtain the reference lane the agent is driving on (`map_api.get_lane`) and, during training, compute the distance from each predicted point to the lane polyline. The model is then penalized whenever the prediction leaves the lane.
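For reference, the distance described above can be written out as follows. The notation is chosen here for illustration: $\hat{p}_t$ is the predicted point at timestep $t$, $v_1, \dots, v_N$ are the polyline vertices, and $s^\*$ is the clamped position of the projection along a segment $[a, b]$. The code in the next cell also adds a small constant to the denominator to avoid division by zero.

$$
s^\* = \operatorname{clip}\left(\frac{(p-a)\cdot(b-a)}{\lVert b-a\rVert^{2}},\ 0,\ 1\right),
\qquad
d\big(p,[a,b]\big) = \big\lVert p - \big(a + s^\*(b-a)\big)\big\rVert
$$

$$
\mathcal{L}_{\text{lane}} = \frac{1}{T}\sum_{t=1}^{T}\ \min_{1 \le i < N} d\big(\hat{p}_t,[v_i, v_{i+1}]\big)
$$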

In [None]:
# DISTANCE FROM A POINT TO A SEGMENT
import torch
def point_to_segment_distance(p, a, b):
    """
    p: tensor (..., 2), query point(s)
    a, b: tensors (2,), segment endpoints
    Returns the minimum point-to-segment distance, shape (...)
    """
    ap = p - a
    ab = b - a
    ab_norm = torch.sum(ab * ab)

    # Reduce over the coordinate axis only, so batched points are handled independently.
    t = torch.clamp(torch.sum(ap * ab, dim=-1, keepdim=True) / (ab_norm + 1e-8), 0., 1.)
    proj = a + t * ab
    return torch.norm(p - proj, dim=-1)

In [None]:
# DISTANCE FROM A POINT TO A POLYLINE

def point_to_polyline_distance(p, polyline):
    """
    p: tensor (..., 2)
    polyline: tensor (N, 2), N >= 2
    Returns the distance to the closest segment, shape (...)
    """
    distances = []
    for i in range(polyline.shape[0] - 1):
        a = polyline[i]
        b = polyline[i+1]
        dist = point_to_segment_distance(p, a, b)
        distances.append(dist)
    # Minimum over segments only, keeping any batch dimensions of p.
    return torch.stack(distances).min(dim=0).values


In [None]:
# COMPUTE THE LOSS FOR A SINGLE TRAJECTORY

def lane_deviation_loss_single(traj_global, lane_polyline):
    """
    traj_global: tensor (T, 2)
    lane_polyline: tensor (N, 2)
    """
    distances = []
    for t in range(traj_global.shape[0]):
        pt = traj_global[t]
        dist = point_to_polyline_distance(pt, lane_polyline)
        distances.append(dist)
    return torch.stack(distances).mean()
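
The two nested Python loops above can become slow when the loss runs for every batch element. As a minimal sketch (not the notebook's own implementation), the same quantity can be computed with broadcasting. The names `polyline_distances` and `lane_deviation_loss_vectorized` are hypothetical and introduced only for this example.

```python
import torch


def polyline_distances(points: torch.Tensor, polyline: torch.Tensor) -> torch.Tensor:
    """
    points: (T, 2) trajectory points; polyline: (N, 2) vertices, N >= 2.
    Returns a (T,) tensor with the distance from each point to the polyline.
    """
    a = polyline[:-1]                                     # (N-1, 2) segment starts
    ab = polyline[1:] - a                                 # (N-1, 2) segment directions
    ap = points[:, None, :] - a[None]                     # (T, N-1, 2)
    # Position of the projection along each segment, clamped to [0, 1].
    t = (ap * ab).sum(-1) / ((ab * ab).sum(-1) + 1e-8)    # (T, N-1)
    t = t.clamp(0.0, 1.0)
    proj = a[None] + t[..., None] * ab[None]              # (T, N-1, 2)
    dists = (points[:, None, :] - proj).norm(dim=-1)      # (T, N-1)
    return dists.min(dim=1).values                        # closest segment per point


def lane_deviation_loss_vectorized(traj: torch.Tensor, polyline: torch.Tensor) -> torch.Tensor:
    """Mean distance from the trajectory points to the lane polyline."""
    return polyline_distances(traj, polyline).mean()


# Toy L-shaped lane and a three-point trajectory (made-up values).
lane = torch.tensor([[0.0, 0.0], [10.0, 0.0], [10.0, 10.0]])
traj = torch.tensor([[2.0, 1.0], [5.0, -2.0], [12.0, 5.0]], requires_grad=True)

loss = lane_deviation_loss_vectorized(traj, lane)
loss.backward()   # gradients reach the trajectory points
```

Because every step is a tensor operation, the loss stays differentiable with respect to `traj`, which is what a training penalty needs.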


In [None]:

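import math
import random
from typing import List, Tuple

import torch
import torch.nn.functional as f
from pyquaternion import Quaternion

class MTPLoss: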
    """ Computes the loss for the MTP model. """

    def __init__(self,
                 num_modes: int,
                 regression_loss_weight: float = 1.,
                 angle_threshold_degrees: float = 5.,
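                 lane_loss_weight: float = 1.0,
                 helper=None):
        """
        Inits MTP loss.
        :param num_modes: How many modes are being predicted for each agent.
        :param regression_loss_weight: Coefficient applied to the regression loss to
            balance classification and regression performance.
        :param angle_threshold_degrees: Maximum angle allowed between a predicted trajectory
            and the ground truth to consider it a match.
        :param lane_loss_weight: Coefficient applied to the lane deviation loss.
        :param helper: Helper used to look up the agent annotation and the map. The lane
            deviation term is skipped when it is None.
        """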
        self.num_modes = num_modes
        self.num_location_coordinates_predicted = 2  # We predict x, y coordinates at each timestep.
        self.regression_loss_weight = regression_loss_weight
        self.angle_threshold = angle_threshold_degrees
        self.lane_loss_weight = lane_loss_weight
        self.helper = helper

    def _get_trajectory_and_modes(self,
                                  model_prediction: torch.Tensor) -> Tuple[torch.Tensor, torch.Tensor]:
        """
        Splits the predictions from the model into mode probabilities and trajectory.
        :param model_prediction: Tensor of shape [batch_size, n_timesteps * n_modes * 2 + n_modes].
        :return: Tuple of tensors. First item is the trajectories of shape [batch_size, n_modes, n_timesteps, 2].
            Second item are the mode probabilities of shape [batch_size, num_modes].
        """
        mode_probabilities = model_prediction[:, -self.num_modes:].clone()

        desired_shape = (model_prediction.shape[0], self.num_modes, -1, self.num_location_coordinates_predicted)
        trajectories_no_modes = model_prediction[:, :-self.num_modes].clone().reshape(desired_shape)

        return trajectories_no_modes, mode_probabilities

    @staticmethod
    def _angle_between(ref_traj: torch.Tensor,
                       traj_to_compare: torch.Tensor) -> float:
        """
        Computes the angle between the last points of the two trajectories.
        The resulting angle is in degrees and is an angle in the [0; 180) interval.
        :param ref_traj: Tensor of shape [n_timesteps, 2].
        :param traj_to_compare: Tensor of shape [n_timesteps, 2].
        :return: Angle between the trajectories.
        """

        EPSILON = 1e-5

        if (ref_traj.ndim != 2 or traj_to_compare.ndim != 2 or
                ref_traj.shape[1] != 2 or traj_to_compare.shape[1] != 2):
            raise ValueError('Both tensors should have shapes (-1, 2).')

        if torch.isnan(traj_to_compare[-1]).any() or torch.isnan(ref_traj[-1]).any():
            return 180. - EPSILON

        traj_norms_product = float(torch.norm(ref_traj[-1]) * torch.norm(traj_to_compare[-1]))

        # If either of the vectors described in the docstring has norm 0, return 0 as the angle.
        if math.isclose(traj_norms_product, 0):
            return 0.

        # We apply the max and min operations below to ensure there is no value
        # returned for cos_angle that is greater than 1 or less than -1.
        # This should never be the case, but the check is in place for cases where
        # we might encounter numerical instability.
        dot_product = float(ref_traj[-1].dot(traj_to_compare[-1]))
        angle = math.degrees(math.acos(max(min(dot_product / traj_norms_product, 1), -1)))

        if angle >= 180:
            return angle - EPSILON

        return angle

    @staticmethod
    def _compute_ave_l2_norms(tensor: torch.Tensor) -> float:
        """
        Compute the average of l2 norms of each row in the tensor.
        :param tensor: Shape [1, n_timesteps, 2].
        :return: Average l2 norm. Float.
        """
        l2_norms = torch.norm(tensor, p=2, dim=2)
        avg_distance = torch.mean(l2_norms)
        return avg_distance.item()

    def _compute_angles_from_ground_truth(self, target: torch.Tensor,
                                          trajectories: torch.Tensor) -> List[Tuple[float, int]]:
        """
        Compute angle between the target trajectory (ground truth) and the predicted trajectories.
        :param target: Shape [1, n_timesteps, 2].
        :param trajectories: Shape [n_modes, n_timesteps, 2].
        :return: List of angle, index tuples.
        """
        angles_from_ground_truth = []
        for mode, mode_trajectory in enumerate(trajectories):
            # For each mode, we compute the angle between the last point of the predicted trajectory for that
            # mode and the last point of the ground truth trajectory.
            angle = self._angle_between(target[0], mode_trajectory)

            angles_from_ground_truth.append((angle, mode))
        return angles_from_ground_truth

    def _compute_best_mode(self,
                           angles_from_ground_truth: List[Tuple[float, int]],
                           target: torch.Tensor, trajectories: torch.Tensor) -> int:
        """
        Finds the index of the best mode given the angles from the ground truth.
        :param angles_from_ground_truth: List of (angle, mode index) tuples.
        :param target: Shape [1, n_timesteps, 2]
        :param trajectories: Shape [n_modes, n_timesteps, 2]
        :return: Integer index of best mode.
        """

        # We first sort the modes based on the angle to the ground truth (ascending order), and keep track of
        # the index corresponding to the biggest angle that is still smaller than a threshold value.
        angles_from_ground_truth = sorted(angles_from_ground_truth)
        max_angle_below_thresh_idx = -1
        for angle_idx, (angle, mode) in enumerate(angles_from_ground_truth):
            if angle <= self.angle_threshold:
                max_angle_below_thresh_idx = angle_idx
            else:
                break

        # We choose the best mode at random IF there are no modes with an angle less than the threshold.
        if max_angle_below_thresh_idx == -1:
            best_mode = random.randint(0, self.num_modes - 1)

        # We choose the best mode to be the one that provides the lowest ave of l2 norms between the
        # predicted trajectory and the ground truth, taking into account only the modes with an angle
        # less than the threshold IF there is at least one mode with an angle less than the threshold.
        else:
            # Out of the selected modes above, we choose the final best mode as that which returns the
            # smallest ave of l2 norms between the predicted and ground truth trajectories.
            distances_from_ground_truth = []

            for angle, mode in angles_from_ground_truth[:max_angle_below_thresh_idx + 1]:
                norm = self._compute_ave_l2_norms(target - trajectories[mode, :, :])

                distances_from_ground_truth.append((norm, mode))

            distances_from_ground_truth = sorted(distances_from_ground_truth)
            best_mode = distances_from_ground_truth[0][1]

        return best_mode

    def __call__(self, predictions: torch.Tensor, targets: torch.Tensor,
                 tokens=None) -> torch.Tensor:
        """
        Computes the MTP loss on a batch.
        The predictions are of shape [batch_size, n_output_neurons of last linear layer]
        and the targets are of shape [batch_size, 1, n_timesteps, 2]
        :param predictions: Model predictions for batch.
        :param targets: Targets for batch.
        :param tokens: Optional sequence of (instance_token, sample_token) pairs, one per
            batch element. Needed for the lane deviation term; if None, that term is skipped.
        :return: zero-dim tensor representing the loss on the batch.
        """

        batch_losses = torch.Tensor().requires_grad_(True).to(predictions.device)
        trajectories, modes = self._get_trajectory_and_modes(predictions)

        for batch_idx in range(predictions.shape[0]):

            angles = self._compute_angles_from_ground_truth(target=targets[batch_idx],
                                                            trajectories=trajectories[batch_idx])

            best_mode = self._compute_best_mode(angles,
                                                target=targets[batch_idx],
                                                trajectories=trajectories[batch_idx])

            best_mode_trajectory = trajectories[batch_idx, best_mode, :].unsqueeze(0)

            regression_loss = f.smooth_l1_loss(best_mode_trajectory, targets[batch_idx])

            mode_probabilities = modes[batch_idx].unsqueeze(0)
            best_mode_target = torch.tensor([best_mode], device=predictions.device)
            classification_loss = f.cross_entropy(mode_probabilities, best_mode_target)

            loss = classification_loss + self.regression_loss_weight * regression_loss
            # ============================================================
            # 🛣️ LANE DEVIATION LOSS
            # ============================================================
            if self.lane_loss_weight > 0 and self.helper is not None and tokens is not None:

                instance_token, sample_token = tokens[batch_idx]

                ann = self.helper.get_sample_annotation(instance_token, sample_token)
                agent_x, agent_y = ann["translation"][:2]

                lane_ids = self.helper.map_api.get_lane_ids_in_xy(agent_x, agent_y)

                if len(lane_ids) > 0:
                    lane_id = lane_ids[0]
                    lane_poly = torch.tensor(
                        self.helper.map_api.get_lane_centerline(lane_id)[:, :2],
                        dtype=best_mode_trajectory.dtype,
                        device=best_mode_trajectory.device
                    )

                    # Convert local -> global. The rotation is applied as a matrix product
                    # (not via .item()) so that gradients flow from the lane loss back to
                    # the predicted trajectory.
                    traj_local = best_mode_trajectory[0]   # (T,2)
                    quat = Quaternion(ann["rotation"])

                    # Top-left 2x2 block of the rotation matrix: equivalent to rotating
                    # (x, y, 0) and keeping the first two components.
                    rot = torch.tensor(
                        quat.rotation_matrix[:2, :2],
                        dtype=traj_local.dtype,
                        device=traj_local.device
                    )
                    traj_rot = traj_local @ rot.T

                    origin = torch.tensor([agent_x, agent_y],
                                          dtype=traj_rot.dtype, device=traj_rot.device)
                    traj_global = traj_rot + origin

                    # Lane deviation loss
                    lane_loss = lane_deviation_loss_single(traj_global, lane_poly)

                    loss = loss + self.lane_loss_weight * lane_loss


            batch_losses = torch.cat((batch_losses, loss.unsqueeze(0)), 0)

        avg_loss = torch.mean(batch_losses)

        return avg_loss

In [None]:
# Change the place where the loss is built so that
# the lane loss gets a weight (change it in mtp)

loss_fn = MTPLoss(
    num_modes=num_modes,
    regression_loss_weight=1.0,
    angle_threshold_degrees=5.,
    lane_loss_weight=1.0,   # <-- new
    helper=helper           # <-- required
)


## 2. Snap-to-Lane

Once training is done, the snap-to-lane function is applied at prediction time: each predicted point is replaced by the closest point that lies on the lane centerline.

In [None]:
import numpy as np
from pyquaternion import Quaternion

def get_agent_lane(helper, instance_token, sample_token):
    # Agent position in global coordinates
    annotation = helper.get_sample_annotation(instance_token, sample_token)
    agent_x, agent_y = annotation['translation'][:2]
    
    lanes = helper.map_api.get_lane_ids_in_xy(agent_x, agent_y)
    if len(lanes) == 0:
        return None  # No lane found (rare, but possible)
    
    # Return the first one for simplicity
    return lanes[0]

def get_lane_centerline(helper, lane_id):
    lane_center = helper.map_api.get_lane_centerline(lane_id)
    # lane_center is an Nx2 array holding the polyline
    return np.array(lane_center[:, :2])

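def project_point_to_polyline(point, polyline):
    """Return the point on the polyline (array of shape (N, 2)) closest to point (x, y)."""
    point = np.asarray(point, dtype=float)
    min_dist = float('inf')
    closest_point = None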
    
    for i in range(len(polyline) - 1):
        p1 = polyline[i]
        p2 = polyline[i+1]
        
        v = p2 - p1
        w = point - p1
        
        t = np.dot(w, v) / (np.dot(v, v) + 1e-8)
        t = np.clip(t, 0, 1)
        
        proj = p1 + t * v
        dist = np.linalg.norm(point - proj)

        if dist < min_dist:
            min_dist = dist
            closest_point = proj
            
    return closest_point


def snap_trajectory_to_lane(global_traj, helper, instance_token, sample_token):
    ann = helper.get_sample_annotation(instance_token, sample_token)
    x, y = ann["translation"][:2]

    lane_ids = helper.map_api.get_lane_ids_in_xy(x, y)
    if len(lane_ids) == 0:
        return global_traj  # no lane found

    lane_id = lane_ids[0]
    centerline = helper.map_api.get_lane_centerline(lane_id)[:, :2]

    snapped = []
    for point in global_traj:
        snapped.append(project_point_to_polyline(point, centerline))
    return np.array(snapped)
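
To see what snapping does without loading nuScenes, here is a small self-contained sketch on a made-up L-shaped centerline. `snap_points_to_polyline` is a hypothetical, vectorized stand-in for `project_point_to_polyline`, and the coordinates are invented for the example. It also reports how far each point was moved, which is a quick sanity check on whether the chosen lane is plausible.

```python
import numpy as np


def snap_points_to_polyline(points: np.ndarray, polyline: np.ndarray) -> np.ndarray:
    """Replace each (x, y) point with its closest point on the polyline."""
    a = polyline[:-1]                                   # (N-1, 2) segment starts
    ab = polyline[1:] - a                               # (N-1, 2) segment directions
    ap = points[:, None, :] - a[None]                   # (T, N-1, 2)
    t = np.clip((ap * ab).sum(-1) / ((ab * ab).sum(-1) + 1e-8), 0.0, 1.0)
    proj = a[None] + t[..., None] * ab[None]            # projection on every segment
    dists = np.linalg.norm(points[:, None, :] - proj, axis=-1)
    best = dists.argmin(axis=1)                         # closest segment per point
    return proj[np.arange(len(points)), best]


centerline = np.array([[0.0, 0.0], [10.0, 0.0], [10.0, 10.0]])
predicted = np.array([[2.0, 1.5], [9.0, -1.0], [11.0, 4.0]])

snapped = snap_points_to_polyline(predicted, centerline)
# How far each predicted point had to move to reach the centerline.
displacement = np.linalg.norm(snapped - predicted, axis=1)
```

Large displacements suggest that the lane found at the agent's current position is not the one the prediction is following, for example after a turn.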


We add the following call to the submission-generation function so that each global trajectory is converted into a lane-aligned one:

pred_coords_global[mode_idx] = snap_trajectory_to_lane(
    pred_coords_global[mode_idx],
    helper,
    instance_token,
    sample_token
)

In [None]:
import json

import torch
from tqdm import tqdm
from nuscenes.eval.prediction.data_classes import Prediction

def generate_submission_notebook(model, dataset, output_path="submission.json"):
    model.eval()
    predictions_list = []
    
    # We need the helper to look up the agent pose
    helper = dataset.helper 

    print("🚗 Generating submission with LOCAL -> GLOBAL conversion...")
    
    for i in tqdm(range(len(dataset))):
        img, agent_state, _, _ = dataset[i]
        
        # Recover the tokens
        raw_token = dataset.split[i]
        instance_token, sample_token = raw_token.split("_")

        # Inference
        img = img.unsqueeze(0)        
        agent_state = agent_state.unsqueeze(0)
        with torch.no_grad():
            pred = model(img, agent_state)

        # Process the output (same code as before)
        # Each mode contributes 12 timesteps * 2 coordinates + 1 probability = 25 outputs.
        total_output_size = pred.shape[1]
        num_modes = total_output_size // 25 
        num_coords = num_modes * 24
        
        pred_coords = pred[0, :num_coords]
        pred_probs = pred[0, num_coords:]
        
        # [num_modes, 12, 2] in LOCAL coordinates
        pred_coords_local = pred_coords.reshape(num_modes, 12, 2).cpu().numpy()

        # ============================================================
        # 🌍 CRITICAL TRANSFORMATION: LOCAL -> GLOBAL
        # ============================================================
        
        # 1. Get the agent's current pose in the global map
        sample_annotation = helper.get_sample_annotation(instance_token, sample_token)
        translation = sample_annotation['translation'] # [x, y, z] global
        rotation = sample_annotation['rotation']       # Global quaternion
        
        # 2. Build the transformation (Local -> Global)
        # Note: transform_matrix expects the rotation as a Quaternion plus a translation,
        # but MTP predicts X, Y (2D) while nuScenes is 3D.
        
        # Simplified way to rotate and translate 2D vectors:
        quaternion = Quaternion(rotation)
        
        # Create an empty array for the global coordinates
        pred_coords_global = np.zeros_like(pred_coords_local)

        for mode_idx in range(num_modes):
            # Take the trajectory of one mode (shape: 12, 2)
            trajectory_local = pred_coords_local[mode_idx]
            
            # A. Append a column of zeros for Z (needed for the 3D rotation)
            # The shape becomes (12, 3) -> [x, y, 0]
            traj_3d = np.hstack([trajectory_local, np.zeros((12, 1))])
            
            # B. Rotate (the agent faces some direction, so we rotate the points)
            # rotate works on a single vector, so we iterate point by point:
            traj_rotated = np.array([quaternion.rotate(p) for p in traj_3d])
            
            # C. Translate (add the car's current global position)
            # Only X and Y are added (indices 0 and 1)
            pred_coords_global[mode_idx, :, 0] = traj_rotated[:, 0] + translation[0]
            pred_coords_global[mode_idx, :, 1] = traj_rotated[:, 1] + translation[1]

            # ============================================================
            # 🛣️ SNAP-TO-LANE (GLOBAL → LANE-ALIGNED)
            # ============================================================
            pred_coords_global[mode_idx] = snap_trajectory_to_lane(
                pred_coords_global[mode_idx],
                helper,
                instance_token,
                sample_token
            )
            
        # ============================================================

        # Probabilities
        if num_modes > 1:
            probs = torch.nn.functional.softmax(pred_probs, dim=0).cpu().numpy()
        else:
            probs = np.array([1.0])

        prediction_obj = Prediction(
            instance=instance_token,
            sample=sample_token,
            prediction=pred_coords_global, # Use the GLOBAL coordinates!
            probabilities=probs
        )

        predictions_list.append(prediction_obj.serialize())

    with open(output_path, "w") as f:
        json.dump(predictions_list, f, indent=2)

    return output_path

## 3. Restrict the Prediction Space (Lane-Conditioned MTP)

Instead of letting the model predict any free-form trajectory, you can:

- Generate modes conditioned on the lane structure (branches, exits, turns).
- Make each "mode" follow a candidate lane.

Examples:

- CoverNet with a lane-based lattice
- LaneGCN
- Wayformer with a road graph

Here the model can practically only choose trajectories that are valid by construction.

Advantage: it is the most elegant solution academically.

Disadvantage: more engineering work.
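
As a rough sketch of the "one mode per candidate lane" idea, the snippet below turns each candidate centerline into an anchor trajectory by moving along it at constant speed. Everything here is an assumption made for illustration: the function name `sample_along_polyline`, the two toy lanes, the constant-speed motion model, and the 12 steps of 0.5 s (chosen to match the 12 predicted timesteps used elsewhere in this notebook). A classifier could then score these anchors, in the spirit of CoverNet, instead of regressing free-form coordinates.

```python
import numpy as np


def sample_along_polyline(polyline, speed, n_steps=12, dt=0.5):
    """Points reached after dt, 2*dt, ... when moving along the polyline at constant speed."""
    seg_len = np.linalg.norm(np.diff(polyline, axis=0), axis=1)
    cum_len = np.concatenate([[0.0], np.cumsum(seg_len)])    # arc length at each vertex
    # Distance travelled at each timestep, capped at the end of the polyline.
    s = np.minimum(speed * dt * np.arange(1, n_steps + 1), cum_len[-1])
    x = np.interp(s, cum_len, polyline[:, 0])
    y = np.interp(s, cum_len, polyline[:, 1])
    return np.stack([x, y], axis=1)                           # (n_steps, 2)


# Two hypothetical candidate lanes leaving the agent position (0, 0).
straight = np.array([[0.0, 0.0], [0.0, 40.0]])
right_turn = np.array([[0.0, 0.0], [0.0, 10.0], [30.0, 10.0]])

# One anchor trajectory per candidate lane: shape (n_modes, n_steps, 2).
anchors = np.stack([sample_along_polyline(lane, speed=5.0)
                    for lane in (straight, right_turn)])
```

Each anchor lies on a lane by construction, so any mode the model selects is a valid lane-following trajectory. The price is that behaviour such as lane changes needs extra candidates.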

## 4. Make the Bird's-Eye View More Informative

Give the model the lane information (BEV or vectorized lanes).

This is the option your professor suggests. It is a good one because the model learns the map geometry "by itself".

Ways to do it:

- Full BEV raster (what we are building now).
- Vectorized lanes (Trajectron++ / VectorNet format).
- Feed the polylines directly as input to a GNN or MLP.

Advantage: it does not force anything explicitly, it only helps.

Disadvantage: the model can still get it wrong sometimes.
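
For the vectorized-lanes option, a minimal sketch of the input encoding is shown below: every polyline segment becomes one row `[x_start, y_start, x_end, y_end, polyline_id]`. This is a simplified layout loosely inspired by VectorNet, not its exact feature set. The function name `polylines_to_vectors` and the toy lanes are assumptions made for the example.

```python
import numpy as np


def polylines_to_vectors(polylines):
    """Turn each polyline (N_i, 2) into rows [x_start, y_start, x_end, y_end, polyline_id]."""
    rows = []
    for poly_id, poly in enumerate(polylines):
        starts, ends = poly[:-1], poly[1:]
        ids = np.full((len(starts), 1), poly_id, dtype=float)
        rows.append(np.hstack([starts, ends, ids]))
    return np.vstack(rows)


# Two made-up parallel lanes in the agent's local frame (metres).
lanes = [
    np.array([[0.0, 0.0], [0.0, 10.0], [0.0, 20.0]]),
    np.array([[3.5, 0.0], [3.5, 20.0]]),
]

lane_vectors = polylines_to_vectors(lanes)   # one row per segment, 5 features each
```

The resulting array can be fed to an MLP or grouped by `polyline_id` for a GNN, alongside or instead of the raster image.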