# Digital Twin-Aided Channel Estimation

Effective channel estimation in sparse, high-dimensional environments is essential for next-generation wireless systems, particularly in large-scale MIMO deployments. This paper introduces a framework that uses digital twins (DTs) as priors to enable efficient zone-specific, subspace-based channel estimation (CE). Subspace-based CE significantly reduces feedback overhead by focusing on the dominant channel components, exploiting sparsity in the angular domain while preserving estimation accuracy. Although DT channels may be inaccurate, their coarse-grained subspaces provide a useful starting point that reduces the search space and accelerates convergence. The framework employs a two-step clustering process on the Grassmann manifold, combined with reinforcement learning (RL), to iteratively calibrate the subspaces and align them with their real-world counterparts. Simulations show that digital twins not only enable near-optimal performance but also improve the accuracy of subspace calibration through RL, highlighting their potential as a step toward learnable digital twins.
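
As a rough illustration of the subspace idea described above, the sketch below builds a basis from synthetic "digital twin" channels, keeps the fewest left singular vectors that capture 95% of the energy, and projects perturbed "real-world" channels onto that basis. It is not the repository's `subspace_estimation` routine: the array sizes, the noise level, and the helper name `dominant_subspace` are assumptions made for this example.

```python
import numpy as np

def dominant_subspace(channels, energy=0.95):
    """Return an orthonormal basis (columns) capturing `energy` of the total power.

    channels: (n_antennas, n_users) complex matrix, one channel per column.
    """
    u, s, _ = np.linalg.svd(channels, full_matrices=False)
    cum = np.cumsum(s**2) / np.sum(s**2)
    rank = int(np.searchsorted(cum, energy) + 1)
    return u[:, :rank]

rng = np.random.default_rng(0)
n_ant, n_paths, n_users = 64, 4, 50

# Synthetic zone: all users share a few directions, so the channels are low rank.
directions = rng.standard_normal((n_ant, n_paths)) + 1j * rng.standard_normal((n_ant, n_paths))
gains = rng.standard_normal((n_paths, n_users)) + 1j * rng.standard_normal((n_paths, n_users))
h_dt = directions @ gains  # coarse "digital twin" channels (hypothetical data)
h_rw = h_dt + 0.05 * (rng.standard_normal(h_dt.shape) + 1j * rng.standard_normal(h_dt.shape))

basis = dominant_subspace(h_dt)   # (n_ant, rank) with rank <= n_paths
coeffs = basis.conj().T @ h_rw    # low-dimensional coefficients per user
h_hat = basis @ coeffs            # reconstruction from the coefficients

cosine = np.abs(np.sum(h_hat.conj() * h_rw, axis=0)) / (
    np.linalg.norm(h_hat, axis=0) * np.linalg.norm(h_rw, axis=0))
print(basis.shape[1], float(cosine.mean()))
```

Only `basis.shape[1]` coefficients per user are needed instead of `n_ant`, which is where the overhead reduction comes from in this toy setting.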


## System Overview

The following figures illustrate the key concepts and system model used in this work:

### System Model
<img src="figs/deepmimo/system_model.PNG" alt="System Model" width="1300">

The figure illustrates the proposed zone-specific subspace prediction and calibration framework for channel estimation using digital twins. The BS designs precoders for each zone, enabling UEs to estimate the projection of real-world channels onto low-dimensional DT-based subspaces. Zones are defined by user subspace similarities on the Grassmann manifold. This approach significantly reduces CSI feedback overhead by leveraging channel sparsity and DT-based subspace detection. To address DT approximation errors, the subspaces are further calibrated to optimize overhead and estimation accuracy.
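
Subspace similarity on the Grassmann manifold is commonly measured through principal angles. The sketch below computes the chordal distance between two subspaces of equal dimension; whether the repository's clustering uses this particular metric is an assumption, and the helper names `chordal_distance` and `random_subspace` are hypothetical.

```python
import numpy as np

def chordal_distance(u1, u2):
    """Chordal distance between the subspaces spanned by orthonormal bases u1, u2."""
    # Singular values of u1^H u2 are the cosines of the principal angles.
    cosines = np.clip(np.linalg.svd(u1.conj().T @ u2, compute_uv=False), 0.0, 1.0)
    return float(np.sqrt(np.sum(1.0 - cosines**2)))

def random_subspace(rng, n, k):
    """Orthonormal basis of a random k-dimensional subspace of C^n."""
    a = rng.standard_normal((n, k)) + 1j * rng.standard_normal((n, k))
    q, _ = np.linalg.qr(a)
    return q

rng = np.random.default_rng(1)
u_a = random_subspace(rng, 32, 3)
# Same subspace, different basis: mix the columns with a unitary matrix.
rot, _ = np.linalg.qr(rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3)))
u_a_rot = u_a @ rot
u_b = random_subspace(rng, 32, 3)

d_same = chordal_distance(u_a, u_a_rot)
d_diff = chordal_distance(u_a, u_b)
print(d_same, d_diff)
```

The distance depends only on the spanned subspace, not on the basis chosen for it, which is the property that makes it suitable for grouping users into zones.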

### Calibration Idea
<img src='figs/deepmimo/calibration_idea.png' alt="Calibration Idea" width="1000">

This figure demonstrates the calibration approach used to improve digital twin performance through reinforcement learning-based optimization.
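
To make the calibration idea concrete, the toy loop below starts from an imperfect set of DFT beams (standing in for a DT-derived subspace) and keeps any single-beam swap that captures more energy of synthetic "real-world" channels. This is a plain stochastic local search used as a stand-in, not the DRL agent from `subspace_estimation_drl`; the data, beam indices, and reward definition are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_ant, n_pilots, n_users = 32, 4, 40

# Unitary DFT codebook: each column is one candidate beam.
idx = np.arange(n_ant)
codebook = np.exp(-2j * np.pi * np.outer(idx, idx) / n_ant) / np.sqrt(n_ant)

# Synthetic "real-world" channels concentrated on a few beams (hypothetical data).
true_beams = np.array([3, 4, 10, 11])
h_rw = codebook[:, true_beams] @ (
    rng.standard_normal((4, n_users)) + 1j * rng.standard_normal((4, n_users)))

def reward(beams):
    """Fraction of real-world channel energy captured by the selected beams."""
    proj = codebook[:, beams].conj().T @ h_rw
    return float(np.sum(np.abs(proj) ** 2) / np.sum(np.abs(h_rw) ** 2))

# Imperfect "digital twin" starting point: two of the four beams are wrong.
beams = np.array([3, 4, 20, 21])
initial = best = reward(beams)

for _ in range(500):
    candidate = beams.copy()
    slot = rng.integers(n_pilots)                       # position to replace
    unused = np.setdiff1d(np.arange(n_ant), candidate)  # beams not yet selected
    candidate[slot] = rng.choice(unused)
    r = reward(candidate)
    if r > best:                                        # keep only improving swaps
        beams, best = candidate, r

print(initial, best, sorted(beams.tolist()))
```

Starting from the DT beams means only the mismatched part of the subspace has to be searched, which mirrors the argument that a coarse prior shrinks the search space.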

## Table of Contents
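1. [Imports](#imports)
2. [DeepVerse Data Download](#deepverse-data-download)
3. [Digital Twin and Real-World Data Generation](#digital-twin-and-real-world-data-generation)
4. [Settings](#settings)
5. [Fig. 2: Channel Reconstruction Performance vs. Number of Pilots (No Calibration)](#fig-2-channel-reconstruction-performance-vs-number-of-pilots-plot-no-calibration)
6. [Fig. 3: CDF (With Calibration)](#fig-3-cdf-with-calibration)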


In [1]:
!git clone https://github.com/sadjadalikhani/Digital-twin-aided-channel-estimation.git

Cloning into 'Digital-twin-aided-channel-estimation'...


In [1]:
cd Digital-twin-aided-channel-estimation

g:\Sadjad\Git\Digital-twin-aided-channel-estimation\Digital-twin-aided-channel-estimation


In [2]:
ls

 Volume in drive G is Elements
 Volume Serial Number is B822-6E6D

 Directory of g:\Sadjad\Git\Digital-twin-aided-channel-estimation\Digital-twin-aided-channel-estimation

10/26/2025  02:47 PM    <DIR>          .
10/26/2025  02:47 PM    <DIR>          ..
10/26/2025  02:46 PM                68 .gitattributes
10/26/2025  02:46 PM                41 .gitignore
10/26/2025  02:47 PM    <DIR>          __pycache__
10/26/2025  02:46 PM    <DIR>          deepverse_utils
10/26/2025  02:46 PM    <DIR>          figs
10/26/2025  02:46 PM            11,146 input_preprocess.py
10/26/2025  02:46 PM             7,352 main.py
10/26/2025  02:46 PM            17,875 main_deepverse.ipynb
10/26/2025  02:46 PM             8,260 main_deepverse.py
10/26/2025  02:46 PM             7,661 README.md
10/26/2025  02:48 PM    <DIR>          scenarios
10/26/2025  02:46 PM            34,562 utils.py
10/26/2025  02:46 PM    <DIR>          variables
               8 File(s)         86,965 bytes
               7 Dir(s)  30

## Imports

In [3]:
import numpy as np
import matplotlib.pyplot as plt
from utils import (
    k_means, k_med, subspace_estimation, todB, subspace_estimation_drl,
    generate_dft_codebook, plot_smooth_cdf, plot_perf_vs_pilots,
)
import matplotlib.cm as cm
import zipfile
import requests
from tqdm import tqdm
from pathlib import Path
import shutil
import warnings
warnings.filterwarnings("ignore")

## DeepVerse Data Download

In [7]:
def download_and_unzip(url, zip_path, extract_to):
    response = requests.get(url, stream=True)
    response.raise_for_status()

    total = int(response.headers.get('content-length', 0))
    with open(zip_path, 'wb') as f, tqdm(
        desc=f"Downloading: {zip_path.name}",
        total=total,
        unit='B',
        unit_scale=True,
        unit_divisor=1024,
    ) as bar:
        for chunk in response.iter_content(chunk_size=8192):
            size = f.write(chunk)
            bar.update(size)

    print(f"Inflating: {zip_path}")
    try:
        with zipfile.ZipFile(zip_path, 'r') as zip_ref:
            zip_ref.extractall(extract_to)
    except zipfile.BadZipFile:
        print(f"Error: {zip_path} is not a valid zip file!")
        return

    print(f"Removing zip: {zip_path}")
    zip_path.unlink()

# Set up directories
scenario_name = 'Carla-Town05'
scenario_dir = Path(f"scenarios/{scenario_name}")
scenario_dir.mkdir(parents=True, exist_ok=True)

# Download and extract wireless data
print("Preparing wireless data...")
download_and_unzip(
    "https://www.dropbox.com/scl/fi/xz9yg0zgx7r4scfc1747f/wireless.zip?rlkey=iigyjagh6irxeu5mp14zq8tz9&e=1&st=r32fwt57&dl=1",
    scenario_dir / "wireless.zip",
    scenario_dir
)

# Download and extract parameter files
print("Preparing parameter files...")
param_dir = scenario_dir / "param"
param_dir.mkdir(parents=True, exist_ok=True)
download_and_unzip(
    "https://www.dropbox.com/scl/fo/9qpcn5apzn4anj5xbdpcs/ANZ4uT6LFow_Dd2-vuSY66s?rlkey=3bgref7fdnc53j2i5r7vsd7eo&e=1&st=srhylwot&dl=1",
    scenario_dir / "param.zip",
    param_dir
)

# Copy wireless params.mat file to wireless folder
wireless_dir = scenario_dir / "wireless"
shutil.copy(param_dir / "params.mat", wireless_dir / "params.mat")

print(f"DeepVerse scenario {scenario_name} is ready!")

Preparing wireless data...


Downloading: wireless.zip: 100%|██████████| 3.77G/3.77G [01:57<00:00, 34.5MB/s]


Inflating: scenarios\Carla-Town05\wireless.zip
Removing zip: scenarios\Carla-Town05\wireless.zip
Preparing parameter files...


Downloading: param.zip: 100%|██████████| 87.1M/87.1M [00:01<00:00, 62.7MB/s]  


Inflating: scenarios\Carla-Town05\param.zip
Removing zip: scenarios\Carla-Town05\param.zip
DeepVerse scenario Carla-Town05 is ready!


## Digital Twin and Real-World Data Generation

In [5]:
from deepverse_utils.deepverse_dt_rw_channel_gen import chs_gen

scenarios = np.arange(10)  # Use the first 10 scenes from Carla-Town05
n_beams = 128 
fov = 180
n_path = [5, 25]  # [Digital Twin paths, Real-World paths]

M_x = 1
M_y = n_beams // M_x
codebook = generate_dft_codebook(M_x, M_y) 

dataset_dt, dataset_rw, pos, los_status, best_beam, enabled_idxs, bs_pos = chs_gen(
    scenarios,
    n_beams, 
    fov,
    n_path,
    codebook)

# Keep as PyTorch tensors for k_means function compatibility
# dataset_dt and dataset_rw are already PyTorch tensors from chs_gen

# Ensure all arrays have correct dimensions for k_means function
# pos should be (N, 3), los_status and best_beam should be (N, 1) for concatenation
if len(los_status.shape) == 1:
    los_status = los_status.reshape(-1, 1)
if len(best_beam.shape) == 1:
    best_beam = best_beam.reshape(-1, 1)

print(f"\n=== Dataset Information ===")
print(f"Digital Twin dataset shape: {dataset_dt.shape}")
print(f"Real-World dataset shape: {dataset_rw.shape}")
print(f"User positions shape: {pos.shape}")
print(f"LoS status shape: {los_status.shape}")
print(f"Best beam indices shape: {best_beam.shape}")
print(f"Enabled user indices: {len(enabled_idxs)}")
print(f"Base station position: {bs_pos}")

=== Digital Twin and Real-World Channel Generation ===
Scenarios: [0 1 2 3 4 5 6 7 8 9]
Number of beams: 128
Paths - Digital Twin: 5, Real-World: 25

=== Configuring Digital Twin Dataset ===
Loaded parameters for 10 scenes with 5 paths each
Scenes: [0 1 2 3 4 5 6 7 8 9]
Communication enabled: True
Doppler effects enabled: False
Generating dataset...
Generating comm dataset: ⏳ In progress
Generating comm dataset: ✅ Completed (0.54s)
Dataset generation completed!

=== Configuring Real-World Dataset ===
Loaded parameters for 10 scenes with 25 paths each
Scenes: [0 1 2 3 4 5 6 7 8 9]
Communication enabled: True
Doppler effects enabled: True
Generating dataset...
Generating comm dataset: ⏳ In progress
Generating comm dataset: ✅ Completed (0.51s)
Dataset generation completed!

=== Generating Overlayed Users ===
Overlaying users from 10 scenes...
Successfully overlayed 70 users from 10 scenes

=== Processing Channel Data ===
Generating 0 corresponding Digital Twin channels...


ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 0 is different from 128)
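
The data generation cell builds its beam codebook with `generate_dft_codebook(M_x, M_y)` from `utils.py`, whose body is not shown in this notebook. One common construction for a planar array is the Kronecker product of two unitary DFT matrices; the sketch below assumes that construction, and the repository's version may differ in ordering, oversampling, or normalization. The name `dft_codebook_sketch` is hypothetical.

```python
import numpy as np

def dft_codebook_sketch(m_x, m_y):
    """Unitary DFT beams for an m_x-by-m_y planar array (hypothetical helper).

    Returns a (m_x * m_y, m_x * m_y) matrix whose columns are the beams.
    """
    def dft(m):
        k = np.arange(m)
        return np.exp(-2j * np.pi * np.outer(k, k) / m) / np.sqrt(m)
    # The Kronecker product combines the per-axis beams into 2D beams.
    return np.kron(dft(m_x), dft(m_y))

n_beams = 128
m_x = 1
m_y = n_beams // m_x
codebook_sketch = dft_codebook_sketch(m_x, m_y)
gram = codebook_sketch.conj().T @ codebook_sketch
print(codebook_sketch.shape, np.allclose(gram, np.eye(n_beams)))
```

With `M_x = 1` and `M_y = 128`, as configured in the cell, this reduces to a single 128-point DFT, i.e. the beams of a uniform linear array.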

## Settings

In [4]:
n_users = len(dataset_dt)
pos_coeff = 1
los_coeff_kmeans = 0
beam_coeff_kmeans = 0 
umap_coeff = 0
subspace_coeff = 0
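
The coefficients above weight the different user features during clustering. How `k_means` in `utils.py` combines them is not shown here, so the sketch below is only one plausible reading: each feature group is standardized, scaled by its coefficient, and concatenated, matching the `(N, 3)` and `(N, 1)` shapes prepared in the data generation cell. The function `build_features` and the demo arrays are hypothetical.

```python
import numpy as np

def build_features(pos, los_status, best_beam, pos_coeff, los_coeff, beam_coeff):
    """Stack per-user features into one (N, 5) matrix, scaling each group by its weight."""
    def standardize(x):
        x = np.asarray(x, dtype=float)
        std = x.std(axis=0, keepdims=True)
        # Avoid dividing by zero when a feature is constant across users.
        return (x - x.mean(axis=0, keepdims=True)) / np.where(std > 0, std, 1.0)
    return np.hstack([
        pos_coeff * standardize(pos),         # (N, 3)
        los_coeff * standardize(los_status),  # (N, 1)
        beam_coeff * standardize(best_beam),  # (N, 1)
    ])

rng = np.random.default_rng(0)
n = 70
pos_demo = rng.uniform(0, 100, size=(n, 3))
los_demo = rng.integers(0, 2, size=(n, 1))
beam_demo = rng.integers(0, 128, size=(n, 1))

features = build_features(pos_demo, los_demo, beam_demo,
                          pos_coeff=1, los_coeff=0, beam_coeff=0)
print(features.shape, np.abs(features[:, 3:]).max())
```

Under this reading, the settings in this notebook (`pos_coeff = 1`, all other coefficients `0`) would make the first clustering step depend on user position alone.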

## Fig. 2: Channel Reconstruction Performance vs. Number of Pilots Plot (No Calibration)

In [5]:
trials = 200
datasets = [
    "Real-World",  
    "Digital Twin",  
    "Random DFT-based Pilots"
]
dft_based = True  
n_pilots = np.array([1, 13, 26, 38, 51, 64, 77, 90, 102, 115, 128])
snr_db = 10 
loss_func = ["nmse", "cosine", "throughput"][1]
ss_nmse = np.zeros((len(datasets), len(n_pilots), trials))

for trial in range(trials):
    
    for dataset_idx, dataset_type in enumerate(datasets):
        
        print(f"\n\ntrial: {trial}\ndataset type: {dataset_type}")
        print(f"Number of users: {n_users}")
  
        n_areas = min(12, n_users // 4)  # Ensure areas <= samples/4
        n_kmeans_clusters = min(80, n_users // 2)  # Ensure clusters <= samples/2
        
        print(f"Adjusted parameters: n_areas={n_areas}, n_kmeans_clusters={n_kmeans_clusters}") 
        
        if dataset_type in ["Digital Twin"]:
            imperfect_dataset = dataset_dt
        elif dataset_type == "Real-World":
            imperfect_dataset = dataset_rw
        elif dataset_type == "Random DFT-based Pilots":
            imperfect_dataset = dataset_rw
            n_areas = 1
            n_kmeans_clusters = 1
        
        if (dataset_idx == 0 and subspace_coeff == 0) or dataset_type in ["Random DFT-based Pilots"]:
            
            dt_subspaces, rw_subspaces, kmeans_centroids, kmeans_labels = k_means(
                enabled_idxs, 
                imperfect_dataset, 
                dataset_rw,
                pos[:,:3], 
                los_status,
                best_beam,
                bs_pos, 
                pos_coeff,  
                los_coeff_kmeans, 
                beam_coeff_kmeans,  
                percent=.95,  
                n_kmeans_clusters=n_kmeans_clusters, 
                k_predefined2=None,
                seed=trial
            )
            
            areas, area_lens = k_med(
                dt_subspaces, 
                pos_coeff, 
                subspace_coeff, 
                kmeans_centroids, 
                n_areas, 
                kmeans_labels,
                pos[:,:3],
                enabled_idxs,
                bs_pos,
                seed=trial
            )
        
        avg_nmse_ss = subspace_estimation(
            imperfect_dataset, 
            dataset_rw, 
            areas, 
            area_lens, 
            codebook,
            n_pilots,
            dataset_type,
            snr_db=snr_db,
            loss_func=loss_func,
            dft_based=dft_based,
            seed=trial
        )
        
        ss_nmse[dataset_idx, :, trial] = todB(avg_nmse_ss).squeeze() if loss_func == "nmse" else avg_nmse_ss.squeeze()

    # FIGURE     
    plot_perf_vs_pilots(datasets, ss_nmse, n_pilots, n_beams, trial, loss_func)



trial: 0
dataset type: Real-World
Number of users: 70
Adjusted parameters: n_areas=12, n_kmeans_clusters=35
Areas have min 1 idxs, max 10 idxs, and an avg of 5.833333333333333 idxs.
zone id: 0, n_pilots: 1, perf: 0.604
zone id: 0, n_pilots: 13, perf: 0.936
zone id: 0, n_pilots: 26, perf: 0.945
zone id: 0, n_pilots: 38, perf: 0.948
zone id: 0, n_pilots: 51, perf: 0.950
zone id: 0, n_pilots: 64, perf: 0.951
zone id: 0, n_pilots: 77, perf: 0.950
zone id: 0, n_pilots: 90, perf: 0.954
zone id: 0, n_pilots: 102, perf: 0.953
zone id: 0, n_pilots: 115, perf: 0.955
zone id: 0, n_pilots: 128, perf: 0.953
zone id: 1, n_pilots: 1, perf: 0.929
zone id: 1, n_pilots: 13, perf: 0.956
zone id: 1, n_pilots: 26, perf: 0.955
zone id: 1, n_pilots: 38, perf: 0.952
zone id: 1, n_pilots: 51, perf: 0.951
zone id: 1, n_pilots: 64, perf: 0.956
zone id: 1, n_pilots: 77, perf: 0.953
zone id: 1, n_pilots: 90, perf: 0.952
zone id: 1, n_pilots: 102, perf: 0.955
zone id: 1, n_pilots: 115, perf: 0.953
zone id: 1, n_p

KeyboardInterrupt: 
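
The `perf` values in the log are reported for `loss_func = "cosine"`, the option selected in the cell. The sketch below shows common definitions of the NMSE and cosine-similarity metrics, plus a decibel conversion; these are assumed definitions for illustration, and the implementations in `utils.py` (including `todB`) are not shown in this notebook and may differ.

```python
import numpy as np

def nmse(h_true, h_est):
    """Normalized mean squared error, averaged over users (rows)."""
    err = np.sum(np.abs(h_true - h_est) ** 2, axis=1)
    return float(np.mean(err / np.sum(np.abs(h_true) ** 2, axis=1)))

def cosine_similarity(h_true, h_est):
    """Magnitude of the normalized inner product, averaged over users (rows)."""
    inner = np.abs(np.sum(h_est.conj() * h_true, axis=1))
    norms = np.linalg.norm(h_true, axis=1) * np.linalg.norm(h_est, axis=1)
    return float(np.mean(inner / norms))

def to_db(x):
    """Convert a linear power ratio to decibels."""
    return 10.0 * np.log10(x)

rng = np.random.default_rng(0)
h = rng.standard_normal((70, 128)) + 1j * rng.standard_normal((70, 128))
noise = rng.standard_normal(h.shape) + 1j * rng.standard_normal(h.shape)
h_hat = h + 0.1 * noise  # synthetic estimate with about 1% relative error power

print(to_db(nmse(h, h_hat)), cosine_similarity(h, h_hat))
```

Cosine similarity ignores a common scaling or phase of the estimate, so it lies in [0, 1] with higher being better, whereas NMSE in dB is lower for better estimates.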

## Fig. 3: CDF (With Calibration)

In [None]:
trials = 200
datasets = [
    "Real-World",  
    "Digital Twin",  
    "RL-Calibrated Digital Twin",  
    "Random DFT-based Pilots",  
    "RL-Calibrated Random Pilots"
]
dft_based = True  
n_pilots = [26]
snr_db = 10 
loss_func = ["nmse", "cosine", "throughput"][1]
ss_nmse = np.zeros((len(datasets), len(n_pilots), trials))

for trial in range(trials):
    
    for dataset_idx, dataset_type in enumerate(datasets):
        
        print(f"\n\ntrial: {trial}\ndataset type: {dataset_type}")
        print(f"Number of users: {n_users}")
        
        n_areas = min(12, n_users // 4)  # Ensure areas <= samples/4
        n_kmeans_clusters = min(80, n_users // 2)  # Ensure clusters <= samples/2
        
        print(f"Adjusted parameters: n_areas={n_areas}, n_kmeans_clusters={n_kmeans_clusters}") 
        
        if dataset_type in ["Digital Twin", "RL-Calibrated Digital Twin"]:
            imperfect_dataset = dataset_dt
        elif dataset_type == "Real-World":
            imperfect_dataset = dataset_rw
        elif dataset_type in ["Random DFT-based Pilots", "RL-Calibrated Random Pilots"]:
            imperfect_dataset = dataset_rw
            n_areas = 1
            n_kmeans_clusters = 1
        
        if (dataset_idx == 0 and subspace_coeff == 0) or dataset_type in ["Random DFT-based Pilots"]:
            
            dt_subspaces, rw_subspaces, kmeans_centroids, kmeans_labels = k_means(
                enabled_idxs, 
                imperfect_dataset, 
                dataset_rw,
                pos[:,:3], 
                los_status,
                best_beam,
                bs_pos, 
                pos_coeff,  
                los_coeff_kmeans, 
                beam_coeff_kmeans,  
                percent=.95,  
                n_kmeans_clusters=n_kmeans_clusters, 
                k_predefined2=None,
                seed=trial
            )
            
            areas, area_lens = k_med(
                dt_subspaces, 
                pos_coeff, 
                subspace_coeff, 
                kmeans_centroids, 
                n_areas, 
                kmeans_labels,
                pos[:,:3],
                enabled_idxs,
                bs_pos,
                seed=trial
            )
    
        if dataset_type in ["Real-World", "Random DFT-based Pilots", "Digital Twin"]:
            
            avg_nmse_ss = subspace_estimation(
                imperfect_dataset, 
                dataset_rw, 
                areas, 
                area_lens, 
                codebook,
                n_pilots,
                dataset_type,
                snr_db=snr_db,
                loss_func=loss_func,
                dft_based=dft_based,
                seed=trial
            )
            
        elif dataset_type in ["RL-Calibrated Digital Twin", "RL-Calibrated Random Pilots"]:
            
            avg_nmse_ss = subspace_estimation_drl(
                imperfect_dataset, 
                dataset_rw, 
                areas, 
                area_lens, 
                codebook, 
                dataset_type,
                n_pilots=n_pilots[0], 
                n_episodes=300, 
                snr_db=snr_db, 
                loss_func=loss_func, 
                seed=trial
            )
        
        ss_nmse[dataset_idx, :, trial] = todB(avg_nmse_ss) if loss_func == "nmse" else avg_nmse_ss
        
    # CDF PLOT
    plot_smooth_cdf(datasets, ss_nmse, trial=trial, loss_func=loss_func, n=13, r=0.2)
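
`plot_smooth_cdf` is defined in `utils.py` and its smoothing parameters (`n`, `r`) are not documented in this notebook. As a minimal reference point, the sketch below computes a plain, unsmoothed empirical CDF from per-trial results; the two result arrays are synthetic placeholders, not outputs of the experiment.

```python
import numpy as np

def empirical_cdf(samples):
    """Return sorted samples and the fraction of samples <= each value."""
    x = np.sort(np.asarray(samples, dtype=float))
    p = np.arange(1, x.size + 1) / x.size
    return x, p

rng = np.random.default_rng(0)
# Hypothetical per-trial cosine similarities for two methods (200 trials each).
results = {
    "method A": rng.uniform(0.90, 0.96, size=200),
    "method B": rng.uniform(0.80, 0.92, size=200),
}
for name, values in results.items():
    x, p = empirical_cdf(values)
    median = x[np.searchsorted(p, 0.5)]
    print(f"{name}: median = {median:.3f}")
```

The returned `x` and `p` arrays can be passed directly to `plt.step(x, p, where="post")` to draw the curve for each method.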