### nr_pusch_channel_estimation

The function `nr_pusch_channel_estimation` performs channel estimation for the PUSCH (Physical Uplink Shared Channel) in a 5G NR system. Besides writing the channel estimates, it reports two results to the caller through pointer arguments, `max_ch` and `nvar`:

### Output Values

1. **`max_ch`**:
   - **Type**: `int *`
   - **Description**: Tracks the maximum absolute value of the channel estimate coefficients.
   - **Usage**: The function raises `max_ch` to the largest channel coefficient magnitude encountered during estimation. The caller later derives a scaling shift from it (`shift_ch_ext`) so that fixed-point processing of the estimates does not overflow.
   - **Example Update**:
     ```c
     *max_ch = max(*max_ch, max(abs(ch.r), abs(ch.i)));
     ```

2. **`nvar`**:
   - **Type**: `uint32_t *`
   - **Description**: Holds the noise variance calculated during the channel estimation process.
   - **Usage**: The function calculates the noise variance for the current estimation and updates `nvar`. This value is used to assess the noise level in the channel, which is critical for subsequent signal processing steps such as decoding and error correction.
   - **Example Update**:
     ```c
     if (nvar && nest_count > 0) {
       *nvar = (uint32_t)(noise_amp2 / nest_count);
     }
     ```
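
The two updates above can be sketched in isolation. This is a minimal illustration, not the project's implementation: `c16_t` is redefined locally as a pair of 16-bit integers, and `track_max_ch` and `average_noise` are hypothetical helper names.

```c
#include <stdint.h>
#include <stdlib.h>

/* Local stand-in for the complex 16-bit sample type (assumption). */
typedef struct {
    int16_t r;
    int16_t i;
} c16_t;

/* Fold the largest |real| or |imag| component of ch[0..n-1] into *max_ch. */
static void track_max_ch(const c16_t *ch, int n, int *max_ch)
{
    for (int k = 0; k < n; k++) {
        int re = abs(ch[k].r);
        int im = abs(ch[k].i);
        int m = re > im ? re : im;
        if (m > *max_ch)
            *max_ch = m;
    }
}

/* Average an accumulated noise energy over nest_count samples.
   *nvar is left untouched when there is nothing to average. */
static void average_noise(uint64_t noise_amp2, int nest_count, uint32_t *nvar)
{
    if (nvar && nest_count > 0)
        *nvar = (uint32_t)(noise_amp2 / nest_count);
}
```

Because `track_max_ch` only ever raises `*max_ch`, the caller can pass the same variable to successive calls and obtain a running maximum.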

### Function Signature

```c
int nr_pusch_channel_estimation(PHY_VARS_gNB *gNB,
                                unsigned char Ns,
                                int nl,
                                unsigned short p,
                                unsigned char symbol,
                                int ul_id,
                                unsigned short bwp_start_subcarrier,
                                nfapi_nr_pusch_pdu_t *pusch_pdu,
                                int *max_ch,
                                uint32_t *nvar)
```

### Key Points of the Function

- **Channel Estimation**: The function performs least squares (LS) channel estimation using the received data and pilot symbols.
- **Frequency-Domain Interpolation**: Depending on the DMRS (Demodulation Reference Signal) configuration type, the function performs frequency-domain interpolation to estimate the channel coefficients.
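- **Noise Variance Calculation**: The function accumulates a noise energy estimate (`noise_amp2`) over `nest_count` samples and reports their ratio as the noise variance. The true channel is not known at the receiver, so the noise can only be inferred from the received pilots and the estimates derived from them.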
- **Maximum Channel Coefficient Tracking**: The function keeps track of the maximum absolute value of the channel coefficients encountered during the estimation.
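
To make the least squares step concrete, here is a small sketch of an LS estimate at a single pilot position. For constant-modulus pilots such as QPSK DMRS, dividing the received sample by the pilot is equivalent, up to a scale factor, to multiplying by the pilot's complex conjugate. The names `ls_estimate` and `shift` are illustrative assumptions, and `c16_t` is redefined locally.

```c
#include <stdint.h>

/* Local stand-in for the complex 16-bit sample type (assumption). */
typedef struct {
    int16_t r;
    int16_t i;
} c16_t;

/* LS estimate at one pilot position: h = y * conj(p), scaled down by
   2^shift to stay within 16 bits. Right-shifting a negative value is
   implementation-defined in C; common compilers shift arithmetically. */
static c16_t ls_estimate(c16_t y, c16_t p, int shift)
{
    int32_t re = (int32_t)y.r * p.r + (int32_t)y.i * p.i;
    int32_t im = (int32_t)y.i * p.r - (int32_t)y.r * p.i;
    c16_t h = {(int16_t)(re >> shift), (int16_t)(im >> shift)};
    return h;
}
```

Estimates at non-pilot subcarriers would then be obtained by interpolating between neighbouring pilot estimates, which this sketch does not cover.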

### Detailed Breakdown

1. **Generate DMRS**:
   - The function generates the DMRS (pilot) symbols based on the PUSCH PDU configuration and stores them in the `pilot` array.

2. **Channel Estimation Loop**:
   - The function loops over each receive antenna (`aarx`).
   - For each antenna, it performs channel estimation and interpolation.
   - It updates `max_ch` with the maximum channel coefficient value encountered.
   - It calculates the noise variance and accumulates it into `noise_amp2`.

3. **Noise Variance Calculation**:
   - After processing all antennas, the function calculates the average noise variance and updates `nvar`.

### Example Usage in Calling Function

Here's how the calling function might use `max_ch` and `nvar`:

```c
start_meas(&gNB->ulsch_channel_estimation_stats);
int max_ch = 0;
uint32_t nvar = 0;
for (uint8_t symbol = rel15_ul->start_symbol_index; symbol < (rel15_ul->start_symbol_index + rel15_ul->nr_of_symbols); symbol++) {
    uint8_t dmrs_symbol_flag = (rel15_ul->ul_dmrs_symb_pos >> symbol) & 0x01;
    if (dmrs_symbol_flag == 1) {
        if (pusch_vars->dmrs_symbol == INVALID_VALUE)
            pusch_vars->dmrs_symbol = symbol;

        for (int nl = 0; nl < rel15_ul->nrOfLayers; nl++) {
            uint32_t nvar_tmp = 0;
            nr_pusch_channel_estimation(gNB,
                                        slot,
                                        nl,
                                        get_dmrs_port(nl, rel15_ul->dmrs_ports),
                                        symbol,
                                        ulsch_id,
                                        bwp_start_subcarrier,
                                        rel15_ul,
                                        &max_ch,
                                        &nvar_tmp);
            nvar += nvar_tmp;
        }
    }
}

// Average noise variance across symbols, layers, and receive antennas
nvar /= (rel15_ul->nr_of_symbols * rel15_ul->nrOfLayers * frame_parms->nb_antennas_rx);
```
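
The DMRS test in the loop above is a plain bitmap lookup: bit `symbol` of `ul_dmrs_symb_pos` is set when that OFDM symbol carries DMRS. A self-contained sketch, with the hypothetical helper names `is_dmrs_symbol` and `count_dmrs_symbols`:

```c
#include <stdint.h>

/* Return 1 when bit `symbol` of the DMRS position bitmap is set. */
static int is_dmrs_symbol(uint16_t ul_dmrs_symb_pos, int symbol)
{
    return (ul_dmrs_symb_pos >> symbol) & 0x01;
}

/* Count DMRS symbols inside [start, start + nr_of_symbols). */
static int count_dmrs_symbols(uint16_t ul_dmrs_symb_pos, int start, int nr_of_symbols)
{
    int count = 0;
    for (int l = start; l < start + nr_of_symbols; l++)
        count += is_dmrs_symbol(ul_dmrs_symb_pos, l);
    return count;
}
```

For example, a bitmap of `0x0804` marks symbols 2 and 11, so the estimation function would be called on exactly those two symbols of a full 14-symbol allocation.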

### Summary

- **`max_ch`**: An integer pointer that is updated to reflect the maximum absolute channel coefficient encountered during the estimation.
- **`nvar`**: A pointer to a `uint32_t` that is updated with the calculated noise variance.
- The function processes the received data and pilots to perform channel estimation, updates these output values, and returns them to the calling function for further processing.

## Maximum Channel Coefficient Value

`max_ch` is updated inside `nr_pusch_channel_estimation` and consumed by its caller, `nr_rx_pusch_tp`. It tracks the maximum channel coefficient value encountered during channel estimation, and the caller uses it to derive the shift applied during channel estimate extraction. The steps below follow the caller's flow; steps 4 to 7 do not read `max_ch` but show where it sits in the processing chain:

1. **Initialization**:
   ```c
   int max_ch = 0;
   uint32_t nvar = 0;
   ```

2. **Channel Estimation Loop**:
   - The caller loops through the symbols of the PUSCH allocation.
   - For each symbol flagged as a DMRS (Demodulation Reference Signal) symbol, the caller invokes `nr_pusch_channel_estimation` once per layer:
     ```c
     for (int nl = 0; nl < rel15_ul->nrOfLayers; nl++) {
         uint32_t nvar_tmp = 0;
         nr_pusch_channel_estimation(gNB,
                                     slot,
                                     nl,
                                     get_dmrs_port(nl, rel15_ul->dmrs_ports),
                                     symbol,
                                     ulsch_id,
                                     bwp_start_subcarrier,
                                     rel15_ul,
                                     &max_ch,
                                     &nvar_tmp);
         nvar += nvar_tmp;
     }
     ```

3. **Updating max_ch**:
   - Within `nr_pusch_channel_estimation`, the `max_ch` value is updated with the maximum channel coefficient value for each layer and antenna. This updated value is then used in the outer function.

4. **SNR Measurement**:
   - The caller measures the SNR using the estimated channel values:
     ```c
     nr_gnb_measurements(gNB, 
                         &gNB->ulsch[ulsch_id], 
                         pusch_vars, 
                         symbol, 
                         rel15_ul->nrOfLayers);
     ```

5. **Signal and Noise Power Calculation**:
   - The caller calculates the signal and noise power for each antenna based on the channel estimates:
     ```c
     for (int aarx = 0; aarx < frame_parms->nb_antennas_rx; aarx++) {
         if (symbol == rel15_ul->start_symbol_index) {
             pusch_vars->ulsch_power[aarx] = 0;
             pusch_vars->ulsch_noise_power[aarx] = 0;
         }
         for (int aatx = 0; aatx < rel15_ul->nrOfLayers; aatx++) {
             pusch_vars->ulsch_power[aarx] += signal_energy_nodc(
                 &pusch_vars->ul_ch_estimates[aatx * gNB->frame_parms.nb_antennas_rx + aarx][symbol * frame_parms->ofdm_symbol_size],
                 rel15_ul->rb_size * 12);
         }
         for (int rb = 0; rb < rel15_ul->rb_size; rb++)
             pusch_vars->ulsch_noise_power[aarx] += 
                 n0_subband_power[aarx][rel15_ul->bwp_start + rel15_ul->rb_start + rb] / rel15_ul->rb_size;
     }
     ```

6. **Noise Variance Calculation**:
   - The noise variance is averaged across all symbols, layers, and antennas:
     ```c
     nvar /= (rel15_ul->nr_of_symbols * rel15_ul->nrOfLayers * frame_parms->nb_antennas_rx);
     ```

7. **Time Domain Channel Estimation**:
   - Optionally, the caller performs time-domain averaging of the channel estimates if the corresponding flag is set:
     ```c
     if (gNB->chest_time == 1) {
         nr_chest_time_domain_avg(frame_parms,
                                  pusch_vars->ul_ch_estimates,
                                  rel15_ul->nr_of_symbols,
                                  rel15_ul->start_symbol_index,
                                  rel15_ul->ul_dmrs_symb_pos,
                                  rel15_ul->rb_size);
         pusch_vars->dmrs_symbol = get_next_dmrs_symbol_in_slot(rel15_ul->ul_dmrs_symb_pos, 
                                                                rel15_ul->start_symbol_index, 
                                                                rel15_ul->nr_of_symbols);
     }
     ```

8. **Channel Level Computation**:
   - The maximum channel value (`max_ch`) is used to adjust the shift amount for channel estimate extraction, ensuring proper scaling and preventing overflow:
     ```c
     uint8_t shift_ch_ext = rel15_ul->nrOfLayers > 1 ? log2_approx(max_ch >> 11) : 0;
     ```
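
The shift computation in step 8 can be tried out in isolation. In the sketch below, `ilog2_floor` is a hypothetical stand-in that returns the floor of the base-2 logarithm; the project's `log2_approx` may round differently, so the exact values are an assumption.

```c
#include <stdint.h>

/* Hypothetical stand-in: floor(log2(x)), with 0 mapped to 0.
   The real log2_approx may round to a different integer. */
static uint8_t ilog2_floor(uint32_t x)
{
    uint8_t l = 0;
    while (x > 1) {
        x >>= 1;
        l++;
    }
    return l;
}

/* Shift applied during channel estimate extraction: zero for a single
   layer, otherwise derived from how far max_ch exceeds 2^11. */
static uint8_t channel_ext_shift(int max_ch, int nrOfLayers)
{
    return nrOfLayers > 1 ? ilog2_floor((uint32_t)(max_ch >> 11)) : 0;
}
```

With this stand-in, a two-layer transmission with `max_ch = 16384` yields a shift of 3, while any `max_ch` below 2048 yields no shift.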

In summary, `max_ch` is:
- Updated inside `nr_pusch_channel_estimation` as a running maximum across DMRS symbols, layers, and receive antennas.
- Read by the caller to compute `shift_ch_ext`, the scaling shift applied during channel estimate extraction when more than one layer is used, which guards against fixed-point overflow.

The SNR measurement, signal and noise power, and noise variance steps listed above do not depend on `max_ch`.

## Noise Variance for Each Layer

The `nvar_tmp` variable receives the noise variance produced by a single call to `nr_pusch_channel_estimation` (one layer on one DMRS symbol), while `nvar` accumulates these values. Here's a detailed summary of how `nvar_tmp` and `nvar` are used:

1. **Initialization**:
   - `nvar` is initialized to zero before entering the symbol loop.
   ```c
   uint32_t nvar = 0;
   ```

2. **Channel Estimation Loop**:
   - For each DMRS symbol, `nr_pusch_channel_estimation` is called for each layer.
   - `nvar_tmp` is used to capture the noise variance for the current layer and symbol.
   ```c
   for (int nl = 0; nl < rel15_ul->nrOfLayers; nl++) {
       uint32_t nvar_tmp = 0;
       nr_pusch_channel_estimation(gNB,
                                   slot,
                                   nl,
                                   get_dmrs_port(nl, rel15_ul->dmrs_ports),
                                   symbol,
                                   ulsch_id,
                                   bwp_start_subcarrier,
                                   rel15_ul,
                                   &max_ch,
                                   &nvar_tmp);
       nvar += nvar_tmp;
   }
   ```

3. **Inside `nr_pusch_channel_estimation`**:
   - The function `nr_pusch_channel_estimation` computes the noise variance during the channel estimation process.
   - `nvar_tmp` is updated with this computed noise variance and passed back to the calling function.

4. **Accumulation**:
   - The noise variance from each call (`nvar_tmp`) is added to `nvar`, which keeps growing across layers and DMRS symbols.
   ```c
   nvar += nvar_tmp;
   ```

5. **Averaging Noise Variance**:
   - After processing all relevant symbols and layers, `nvar` is averaged across the number of symbols, layers, and receive antennas.
   ```c
   nvar /= (rel15_ul->nr_of_symbols * rel15_ul->nrOfLayers * frame_parms->nb_antennas_rx);
   ```

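Here's a simplified, illustrative snippet showing how `nvar_tmp` is filled inside the channel estimation function. `compute_noise_power`, `symbolSize`, and `ul_ch_estimates` are placeholders for the real estimation code, and the parameter types are simplified relative to the signature shown earlier: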

```c
int nr_pusch_channel_estimation(PHY_VARS_gNB *gNB,
                                int slot,
                                int nl,
                                int dmrs_port,
                                int symbol,
                                int ulsch_id,
                                int bwp_start_subcarrier,
                                nfapi_nr_pusch_pdu_t *rel15_ul,
                                int *max_ch,
                                uint32_t *nvar_tmp) {
    // Channel estimation code ...

    // Example calculation of noise variance
    uint64_t noise_power = 0;
    int nest_count = 0;

    // Calculation of noise power (this is a simplified example)
    for (int aarx = 0; aarx < gNB->frame_parms.nb_antennas_rx; aarx++) {
        for (int n = 0; n < symbolSize; n++) {
            // Compute noise power and update nest_count
            noise_power += compute_noise_power(ul_ch_estimates, n);
            nest_count++;
        }
    }

    // Calculate the noise variance for this layer and symbol
    if (nest_count > 0) {
        *nvar_tmp = noise_power / nest_count;
    } else {
        *nvar_tmp = 0;
    }

    // Other channel estimation code ...

    return 0;
}
```

### Summary
- **`nvar`** is the total accumulated noise variance for all layers and symbols, initialized before the loop.
- **`nvar_tmp`** is a temporary variable used to capture the noise variance for the current layer during each call to `nr_pusch_channel_estimation`.
- **Accumulation and Averaging**: The noise variance for each layer (`nvar_tmp`) is accumulated into `nvar`, which is then averaged over the total number of symbols, layers, and antennas to get the final noise variance.

In essence, `nvar_tmp` isolates the noise variance of each call to `nr_pusch_channel_estimation`, allowing the calling function to accumulate and average these values across all DMRS symbols and layers of the PUSCH allocation.
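
The accumulate-then-average arithmetic can be reproduced with a short sketch. `average_nvar` is a hypothetical helper; it mirrors the divisor used in the snippets above, where the sum only receives contributions from calls made on DMRS symbols while the divisor uses `nr_of_symbols`.

```c
#include <stdint.h>

/* Sum the per-call noise variances and divide by
   nr_of_symbols * nrOfLayers * nb_antennas_rx, as the caller does.
   Returns 0 when the divisor would be zero. */
static uint32_t average_nvar(const uint32_t *nvar_tmp, int n_calls,
                             int nr_of_symbols, int nrOfLayers, int nb_antennas_rx)
{
    uint32_t nvar = 0;
    for (int c = 0; c < n_calls; c++)
        nvar += nvar_tmp[c];
    int divisor = nr_of_symbols * nrOfLayers * nb_antennas_rx;
    return divisor > 0 ? nvar / (uint32_t)divisor : 0;
}
```

As a worked example, two DMRS symbols and two layers give four calls; with per-call values 1200, 800, 1000, and 1000, ten symbols, and two receive antennas, the result is 4000 / 40 = 100.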

## Layers in PUSCH Channel Estimation

In the context of the 5G NR (New Radio) physical layer, the concept of "layers" refers to the multiple streams of data transmitted simultaneously using Multiple Input Multiple Output (MIMO) techniques. MIMO enables the transmission of multiple data streams (layers) over multiple antennas to improve data throughput and reliability. Each layer can be independently processed and transmitted, allowing the system to take advantage of spatial diversity and spatial multiplexing.

### Understanding Layers in PUSCH Channel Estimation

1. **Layers in MIMO**:
   - **Single Layer Transmission**: In the simplest case, a single data stream is transmitted. This is typical in scenarios where the user equipment (UE) has only one transmit antenna, or the channel conditions do not support multiple layers.
   - **Multiple Layers Transmission**: When multiple antennas are available at both the transmitter and receiver, multiple independent data streams (layers) can be transmitted simultaneously. This is known as spatial multiplexing.

2. **Channel Estimation for Each Layer**:
   - In the PUSCH (Physical Uplink Shared Channel) channel estimation, each layer corresponds to a different data stream that needs to be estimated separately.
   - The channel estimation process involves estimating the channel's response for each layer, as the channel can vary for each spatial stream due to differences in the propagation paths.

### Example of Layers in the Provided Code

In the provided code, `rel15_ul->nrOfLayers` specifies the number of layers being used for the transmission. The calling function iterates over the layers and invokes the channel estimation function once per layer:

```c
for (int nl = 0; nl < rel15_ul->nrOfLayers; nl++) {
    uint32_t nvar_tmp = 0;
    nr_pusch_channel_estimation(gNB,
                                slot,
                                nl,
                                get_dmrs_port(nl, rel15_ul->dmrs_ports),
                                symbol,
                                ulsch_id,
                                bwp_start_subcarrier,
                                rel15_ul,
                                &max_ch,
                                &nvar_tmp);
    nvar += nvar_tmp;
}
```

### Key Points about Layers

- **Layer Index (`nl`)**: The variable `nl` is the index for the current layer. The loop iterates over all layers from `0` to `nrOfLayers - 1`.
- **DMRS Ports**: Each layer can be associated with different DMRS (Demodulation Reference Signal) ports, which are used for channel estimation.
- **Independent Channel Estimation**: The channel estimation is performed independently for each layer, which means each layer's channel response is estimated separately. This accounts for the different propagation paths each layer might experience.
- **Noise Variance (`nvar_tmp`)**: The noise variance is computed for each layer separately and then accumulated to get the overall noise variance.
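
Each layer and receive antenna pair gets its own row in `ul_ch_estimates`, and each OFDM symbol occupies a fixed-size slice of that row. The indexing used in the snippets in this document (`nl * nb_antennas_rx + aarx`, then `symbol * ofdm_symbol_size`) can be sketched as follows; the helper names are hypothetical.

```c
/* Row of the estimate buffer for layer nl and receive antenna aarx. */
static int ch_est_row(int nl, int aarx, int nb_antennas_rx)
{
    return nl * nb_antennas_rx + aarx;
}

/* Offset of the first subcarrier of an OFDM symbol inside one row. */
static int ch_est_symbol_offset(int symbol, int ofdm_symbol_size)
{
    return symbol * ofdm_symbol_size;
}
```

With four receive antennas, layer 1 therefore uses rows 4 to 7, so estimates for different layers never overlap in memory.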

### Conceptual Flow for Each Layer

1. **DMRS Symbol Detection**:
   - The code checks if the current symbol is a DMRS symbol using `rel15_ul->ul_dmrs_symb_pos`.

2. **Channel Estimation**:
   - For each DMRS symbol, channel estimation is performed for each layer.
   - The function `nr_pusch_channel_estimation` is called with the current layer index `nl`, symbol, and other parameters.

3. **Noise Variance Accumulation**:
   - The noise variance for each layer is calculated (`nvar_tmp`) and added to the total noise variance (`nvar`).

4. **SNR and Power Measurements**:
   - After all layers of a DMRS symbol have been estimated, SNR and power measurements are performed once for that symbol, covering all layers.

### Summary

- **Layers** in the context of PUSCH channel estimation represent multiple independent data streams transmitted using MIMO techniques.
- Each layer undergoes independent channel estimation to account for different channel conditions experienced by each spatial stream.
- The code provided iterates over each layer to perform channel estimation, compute noise variance, and update maximum channel coefficients.
- This approach allows the system to leverage spatial diversity and spatial multiplexing to enhance data throughput and reliability in 5G NR communications.

Can this function be split up, in place, into tasks for a thread pool? If yes, what output data would be extracted, and is a join (rendez-vous) expected to merge the output data?

Yes, the `nr_pusch_channel_estimation` function can be broken into tasks that can be processed in a thread pool to leverage parallel processing. Each task would correspond to the channel estimation for a specific antenna (`aarx`). After processing, the results from each thread need to be merged, requiring a join (rendez-vous) operation to collect and combine the output data.

### Steps to Parallelize the Function

1. **Identify Independent Tasks**:
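   - Each iteration of the loop over `aarx` (receive antennas) writes its own rows of `ul_ch_estimates`, so the iterations can run in parallel once the shared accumulators (`max_ch`, `noise_amp2`, `nest_count`) are made per-task.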
   
2. **Define Task Function**:
   - Create a function that performs the channel estimation for a single antenna.
   
3. **Create Thread Pool**:
   - Use a thread pool to distribute the tasks.

4. **Merge Results**:
   - After all threads complete, merge the results.

### Output Data to Extract and Merge

1. **Channel Estimates** (`ul_ch_estimates`):
   - Each thread updates its portion of the `ul_ch_estimates` array.

2. **Maximum Channel Coefficient** (`max_ch`):
   - Each thread computes its local `max_ch`, and these values need to be merged to find the global maximum.

3. **Noise Variance** (`nvar`):
   - Each thread computes its local `nvar`, which needs to be summed and averaged in the main thread.

4. **Delay Estimates** (`delay`):
   - Each thread updates its portion of the `delay` structure.
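
The merge step for `max_ch` and `nvar` can be sketched independently of any threading library. In this illustration each task fills a private `task_result_t` (a hypothetical type), and the main thread combines them after the join. It merges the raw accumulators rather than per-task variances, which is one possible choice: the result then equals a sequential `noise_amp2 / nest_count` over all antennas.

```c
#include <stdint.h>

/* Private result slot written by exactly one task (hypothetical type). */
typedef struct {
    int max_ch;          /* largest coefficient magnitude seen by the task */
    uint64_t noise_amp2; /* accumulated noise energy */
    int nest_count;      /* number of accumulated noise samples */
} task_result_t;

/* Combine per-task results after the join. *max_ch is only ever raised,
   so a running maximum kept by the caller is preserved. */
static void merge_results(const task_result_t *res, int n_tasks,
                          int *max_ch, uint32_t *nvar)
{
    uint64_t noise_amp2 = 0;
    int nest_count = 0;
    for (int t = 0; t < n_tasks; t++) {
        if (res[t].max_ch > *max_ch)
            *max_ch = res[t].max_ch;
        noise_amp2 += res[t].noise_amp2;
        nest_count += res[t].nest_count;
    }
    if (nvar && nest_count > 0)
        *nvar = (uint32_t)(noise_amp2 / nest_count);
}
```

Since every task writes only its own slot and the merge runs after the join, no locking is needed.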

### Example Code

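Here's a sketch of the parallelization using plain POSIX threads, one per receive antenna (a real thread pool would reuse long-lived workers instead of creating threads on every call):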

```c
#include <pthread.h>
#include <stdlib.h>
#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <math.h>

// Upper bound on worker threads (one per receive antenna)
#define MAX_THREADS 8

// Per-thread inputs plus private output slots
typedef struct {
    PHY_VARS_gNB *gNB;
    unsigned char Ns;
    int nl;
    unsigned short p;
    unsigned char symbol;
    int ul_id;
    unsigned short bwp_start_subcarrier;
    nfapi_nr_pusch_pdu_t *pusch_pdu;
    int aarx;
    int local_max_ch;     // output: maximum coefficient for this antenna
    uint32_t local_nvar;  // output: noise variance for this antenna
} thread_data_t;

// Thread function for channel estimation
void *thread_pusch_channel_estimation(void *arg) {
    thread_data_t *data = (thread_data_t *)arg;
    int local_max_ch = 0;
    uint64_t local_noise_amp2 = 0;
    int local_nest_count = 0;
    // Perform channel estimation for the given antenna (data->aarx) and
    // update local_max_ch, local_noise_amp2, and local_nest_count
    // ...
    // Write only to this thread's own slots: nothing is shared, so no locking
    data->local_max_ch = local_max_ch;
    data->local_nvar = local_nest_count > 0 ? (uint32_t)(local_noise_amp2 / local_nest_count) : 0;
    return NULL;
}

int nr_pusch_channel_estimation(PHY_VARS_gNB *gNB,
                                unsigned char Ns,
                                int nl,
                                unsigned short p,
                                unsigned char symbol,
                                int ul_id,
                                unsigned short bwp_start_subcarrier,
                                nfapi_nr_pusch_pdu_t *pusch_pdu,
                                int *max_ch,
                                uint32_t *nvar) {
    pthread_t threads[MAX_THREADS];
    thread_data_t thread_data[MAX_THREADS];
    int num_threads = gNB->frame_parms.nb_antennas_rx;
    if (num_threads < 1 || num_threads > MAX_THREADS)
        return -1;

    // Create one thread per receive antenna
    int created = 0;
    for (; created < num_threads; created++) {
        thread_data[created] = (thread_data_t){
            .gNB = gNB,
            .Ns = Ns,
            .nl = nl,
            .p = p,
            .symbol = symbol,
            .ul_id = ul_id,
            .bwp_start_subcarrier = bwp_start_subcarrier,
            .pusch_pdu = pusch_pdu,
            .aarx = created
        };
        if (pthread_create(&threads[created], NULL, thread_pusch_channel_estimation, &thread_data[created]) != 0)
            break;
    }

    // Join (rendez-vous), then merge each thread's private results.
    // *max_ch is not reset here: the caller keeps a running maximum across calls.
    uint64_t nvar_sum = 0;
    for (int i = 0; i < created; i++) {
        pthread_join(threads[i], NULL);
        if (thread_data[i].local_max_ch > *max_ch)
            *max_ch = thread_data[i].local_max_ch;
        nvar_sum += thread_data[i].local_nvar;
    }
    if (created != num_threads)
        return -1;

    // Average the noise variance over the antennas
    if (nvar)
        *nvar = (uint32_t)(nvar_sum / num_threads);

    return 0;
}
```

### Explanation

1. **Task Function (`thread_pusch_channel_estimation`)**:
   - This function performs channel estimation for a specific antenna (`data->aarx`).
   - It computes local values for `max_ch` and `nvar` which are then merged into the global variables.

2. **Thread Pool**:
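   - Strictly speaking, `pthread_t threads[MAX_THREADS]` is not a pool: one thread per receive antenna is created and joined on every call, bounded by `MAX_THREADS`. A real pool keeps long-lived workers and queues tasks to them.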
   - Each thread runs the `thread_pusch_channel_estimation` function with its specific parameters.

3. **Merging Results**:
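   - The merge must be thread-safe: each thread should write its results to its own slot (or guard the shared `max_ch` and `nvar` with a mutex or atomic operation), and the main thread combines the slots after the join. An unsynchronized read-modify-write on a shared `*max_ch` or `*nvar` is a data race.
   - The noise variance (`nvar`) is accumulated and averaged across all threads.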

4. **Join Operation**:
   - `pthread_join` is used to wait for all threads to complete before continuing with the rest of the program.
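
If threads do update a shared maximum directly instead of using private slots, the update has to be atomic. One way to do that with C11 atomics is a compare-and-swap loop; this is a sketch under the assumption that the shared maximum is declared as `atomic_int`, and `atomic_max_int` is a hypothetical helper name.

```c
#include <stdatomic.h>

/* Raise *target to value if value is larger. Safe to call from several
   threads at once: the compare-and-swap only succeeds if *target still
   holds the value this thread last observed. */
static void atomic_max_int(atomic_int *target, int value)
{
    int current = atomic_load(target);
    while (current < value &&
           !atomic_compare_exchange_weak(target, &current, value)) {
        /* On failure, current now holds the latest value; retry. */
    }
}
```

Private slots merged after the join avoid this contention entirely, which is why they are usually the simpler choice when the number of tasks is small.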

This approach runs the per-antenna estimation concurrently: the main thread initializes the task data, creates the threads, and merges the results after all threads have completed. Because threads are created and destroyed on every call, the overhead can outweigh the gain for short tasks; a persistent thread pool avoids that cost.

To adjust the `nr_pusch_channel_estimation` code to use the proprietary thread pool library, we will need to follow these steps:

1. **Define the Task Data Structure**: Create a structure to hold the task-specific data.
2. **Task Function**: Create a function that performs the channel estimation for a specific antenna (`aarx`) using the proprietary thread pool.
3. **Modify the Main Function**: Modify the main function to create tasks for each antenna and use the thread pool to execute them.
4. **Merge Results**: Collect and merge the results from all tasks.

Here's how you can adjust the code:

### Define the Task Data Structure

```c
typedef struct {
    // Inputs shared by all antenna tasks
    PHY_VARS_gNB *gNB;
    unsigned char Ns;
    int nl;
    unsigned short p;
    unsigned char symbol;
    int ul_id;
    unsigned short bwp_start_subcarrier;
    nfapi_nr_pusch_pdu_t *pusch_pdu;
    NR_gNB_PUSCH *pusch_vars;
    c16_t **ul_ch_estimates;
    c16_t *pilot;
    int symbol_offset;
    int ch_offset;
    int symbolSize;
    int chest_freq;
    int k0;
    int nb_rb_pusch;
    int soffset;
    int nushift;
    // Per-antenna input and scratch space
    int aarx;
    c16_t *ul_ls_est;     // symbolSize entries used only by this task
    // Per-antenna outputs, merged by the caller after the join
    int local_max_ch;
    uint64_t noise_amp2;
    int nest_count;
    delay_t delay;
} puschAntennaProc_t;
```

### Task Function

```c
void nr_pusch_channel_estimation_task(void *arg) {
    puschAntennaProc_t *data = (puschAntennaProc_t *)arg;
    PHY_VARS_gNB *gNB = data->gNB;
    int aarx = data->aarx;
    const int symbol_offset = data->symbol_offset;
    c16_t **ul_ch_estimates = data->ul_ch_estimates;
    int nl = data->nl;
    int ch_offset = data->ch_offset;
    const int symbolSize = data->symbolSize;
    nfapi_nr_pusch_pdu_t *pusch_pdu = data->pusch_pdu;
    const int chest_freq = data->chest_freq;
    c16_t *pilot = data->pilot;
    const int k0 = data->k0;
    const int nb_rb_pusch = data->nb_rb_pusch;
    unsigned short p = data->p;
    const int soffset = data->soffset;
    c16_t *ul_ls_est = data->ul_ls_est;
    NR_gNB_PUSCH *pusch_vars = data->pusch_vars;
    delay_t *delay = &data->delay;
    const int nushift = data->nushift;
    // Task-local accumulators: never shared with other tasks
    int local_max_ch = 0;
    uint64_t noise_amp2 = 0;
    int nest_count = 0;

    c16_t *rxdataF = (c16_t *)&gNB->common_vars.rxdataF[aarx][symbol_offset];
    c16_t *ul_ch = &ul_ch_estimates[nl * gNB->frame_parms.nb_antennas_rx + aarx][ch_offset];

    memset(ul_ch, 0, sizeof(*ul_ch) * symbolSize);

    // Channel estimation code ... (updates local_max_ch, noise_amp2, nest_count)

    // Publish the results in this task's own slots; the caller merges them
    data->local_max_ch = local_max_ch;
    data->noise_amp2 = noise_amp2;
    data->nest_count = nest_count;
}
```

### Modify the Main Function

```c
int nr_pusch_channel_estimation(PHY_VARS_gNB *gNB,
                                unsigned char Ns,
                                int nl,
                                unsigned short p,
                                unsigned char symbol,
                                int ul_id,
                                unsigned short bwp_start_subcarrier,
                                nfapi_nr_pusch_pdu_t *pusch_pdu,
                                int *max_ch,
                                uint32_t *nvar) {
    const int nb_rx = gNB->frame_parms.nb_antennas_rx;
    const int symbolSize = gNB->frame_parms.ofdm_symbol_size;
    NR_gNB_PUSCH *pusch_vars = &gNB->pusch_vars[ul_id];

    // One LS scratch buffer per antenna, so tasks never share scratch space
    c16_t ul_ls_est[nb_rx][symbolSize] __attribute__((aligned(32)));
    memset(ul_ls_est, 0, sizeof(ul_ls_est));

    // Private response queue for this call, so the join below cannot pick up
    // elements from other users of the pool (assumes initNotifiedFIFO from the
    // same thread pool header)
    notifiedFIFO_t resp;
    initNotifiedFIFO(&resp);

    // Create one task per receive antenna; the payload lives inside the element
    for (int aarx = 0; aarx < nb_rx; aarx++) {
        notifiedFIFO_elt_t *task = newNotifiedFIFO_elt(sizeof(puschAntennaProc_t), aarx, &resp, nr_pusch_channel_estimation_task);
        puschAntennaProc_t *task_data = (puschAntennaProc_t *)NotifiedFifoData(task);
        memset(task_data, 0, sizeof(*task_data));
        task_data->gNB = gNB;
        task_data->Ns = Ns;
        task_data->nl = nl;
        task_data->p = p;
        task_data->symbol = symbol;
        task_data->ul_id = ul_id;
        task_data->bwp_start_subcarrier = bwp_start_subcarrier;
        task_data->pusch_pdu = pusch_pdu;
        task_data->pusch_vars = pusch_vars;
        task_data->ul_ch_estimates = (c16_t **)pusch_vars->ul_ch_estimates;
        task_data->symbolSize = symbolSize;
        task_data->aarx = aarx;
        task_data->ul_ls_est = ul_ls_est[aarx];
        // ... set pilot, symbol_offset, ch_offset, chest_freq, k0, nb_rb_pusch,
        // soffset and nushift exactly as the sequential implementation computes them

        pushTpool(&gNB->threadPool, task);
    }

    // Join (rendez-vous): collect exactly one response per antenna and merge.
    // *max_ch is not reset: the caller keeps a running maximum across calls.
    uint64_t noise_amp2 = 0;
    int nest_count = 0;
    for (int n = 0; n < nb_rx; n++) {
        notifiedFIFO_elt_t *res = pullTpool(&resp, &gNB->threadPool);
        puschAntennaProc_t *out = (puschAntennaProc_t *)NotifiedFifoData(res);
        if (out->local_max_ch > *max_ch)
            *max_ch = out->local_max_ch;
        noise_amp2 += out->noise_amp2;
        nest_count += out->nest_count;
        // out->delay carries this antenna's delay estimate; combine as required
        delNotifiedFIFO_elt(res);
    }

    // Noise variance over all antennas, as in the sequential version
    if (nvar && nest_count > 0)
        *nvar = (uint32_t)(noise_amp2 / nest_count);

    return 0;
}
```

### Full Example Integration

Here is how you can integrate the changes into the `nr_rx_pusch_tp` function:

```c
int nr_rx_pusch_tp(PHY_VARS_gNB *gNB,
                   uint8_t ulsch_id,
                   uint32_t frame,
                   uint8_t slot,
                   unsigned char harq_pid) {
    NR_DL_FRAME_PARMS *frame_parms = &gNB->frame_parms;
    nfapi_nr_pusch_pdu_t *rel15_ul = &gNB->ulsch[ulsch_id].harq_process->ulsch_pdu;

    NR_gNB_PUSCH *pusch_vars = &gNB->pusch_vars[ulsch_id];
    pusch_vars->dmrs_symbol = INVALID_VALUE;
    gNB->nbSymb = 0;
    uint32_t bwp_start_subcarrier = ((rel15_ul->rb_start + rel15_ul->bwp_start) * NR_NB_SC_PER_RB + frame_parms->first_carrier_offset) % frame_parms->ofdm_symbol_size;

    start_meas(&gNB->ulsch_channel_estimation_stats);
    int max_ch = 0;
    uint32_t nvar = 0;
    for (uint8_t symbol = rel15_ul->start_symbol_index; symbol < (rel15_ul->start_symbol_index + rel15_ul->nr_of_symbols); symbol++) {
        uint8_t dmrs_symbol_flag = (rel15_ul->ul_dmrs_symb_pos >> symbol) & 0x01;
        if (dmrs_symbol_flag == 1) {
            if (pusch_vars->dmrs_symbol == INVALID_VALUE)
                pusch_vars->dmrs_symbol = symbol;

            for (int nl = 0; nl < rel15_ul->nrOfLayers; nl++) {
                uint32_t nvar_tmp = 0;
                nr_pusch_channel_estimation(gNB,
                                            slot,
                                            nl,
                                            get_dmrs_port(nl, rel15_ul->dmrs_ports),
                                            symbol,
                                            ulsch_id,
                                            bwp_start_subcarrier,
                                            rel15_ul,
                                            &max_ch,
                                            &nvar_tmp);
                nvar += nvar_tmp;
            }
            // measure the SNR from the channel estimation
            nr_gnb_measurements(gNB,
                                &gNB->ulsch[ulsch_id],
                                pusch_vars,
                                symbol,
                                rel15_ul->nrOfLayers);
            allocCast2D(n0_subband_power,
                        unsigned int,
                        gNB->measurements.n0_subband_power,
                        frame_parms->nb_antennas_rx,
                        frame_parms->N_RB_UL,
                        false);
            for (int aarx = 0; aarx < frame_parms->nb_antennas_rx; aarx++) {
                if (symbol == rel15_ul->start_symbol_index) {
                    pusch_vars->ulsch_power[aarx] = 0;
                    pusch_vars->ulsch_noise_power[aarx] = 0;
                }
                for (int aatx = 0; aatx < rel15_ul->nrOfLayers; aatx++) {
                    pusch_vars->ulsch_power[aarx] += signal_energy_nodc(
                        &pusch_vars->ul_ch_estimates[aatx * gNB->frame_parms.nb_antennas_rx + aarx][symbol * frame_parms->ofdm_symbol_size],
                        rel15_ul->rb_size * 12);
                }
                for (int rb = 0; rb < rel15_ul->rb_size; rb++)
                    pusch_vars->ulsch_noise_power[aarx] +=
                        n0_subband_power[aarx][rel15_ul->bwp_start + rel15_ul->rb_start + rb] / rel15_ul->rb_size;
            }
        }
    }

    nvar /= (rel15_ul->nr_of_symbols * rel15_ul->nrOfLayers * frame_parms->nb_antennas_rx);

    if (gNB->chest_time == 1) {
        nr_chest_time_domain_avg(frame_parms,
                                 pusch_vars->ul_ch_estimates,
                                 rel15_ul->nr_of_symbols,
                                 rel15_ul->start_symbol_index,
                                 rel15_ul->ul_dmrs_symb_pos,
                                 rel15_ul->rb_size);
        pusch_vars->dmrs_symbol = get_next_dmrs_symbol_in_slot(rel15_ul->ul_dmrs_symb_pos,
                                                               rel15_ul->start_symbol_index,
                                                               rel15_ul->nr_of_symbols);
    }

    stop_meas(&gNB->ulsch_channel_estimation_stats);
    start_meas(&gNB->rx_pusch_init_stats);

    int number_dmrs_symbols = 0;
    for (int l = rel15_ul->start_symbol_index; l < rel15_ul->start_symbol_index + rel15_ul->nr_of_symbols; l++)
        number_dmrs_symbols += ((rel15_ul->ul_dmrs_symb_pos) >> l) & 0x01;

    int nb_re_dmrs;
    if (rel15_ul->dmrs_config_type == pusch_dmrs_type1)
        nb_re_dmrs = 6 * rel15_ul->num_dmrs_cdm_grps_no_data;
    else
        nb_re_dmrs = 4 * rel15_ul->num_dmrs_cdm_grps_no_data;

    uint32_t unav_res = 0;
    if (rel15_ul->pdu_bit_map & PUSCH_PDU_BITMAP_PUSCH_PTRS) {
        uint16_t ptrsSymbPos = 0;
        set_ptrs_symb_idx(&ptrsSymbPos,
                          rel15_ul->nr_of_symbols,
                          rel15_ul->start_symbol_index,
                          1 << rel15_ul->pusch_ptrs.ptrs_time_density,
                          rel15_ul->ul_dmrs_symb_pos);
        int ptrsSymbPerSlot = get_ptrs_symbols_in_slot(ptrsSymbPos, rel15_ul->start_symbol_index, rel15_ul->nr_of_symbols);
        int n_ptrs = (rel15_ul->rb_size + rel15_ul->pusch_ptrs.ptrs_freq_density - 1) / rel15_ul->pusch_ptrs.ptrs_freq_density;
        unav_res = n_ptrs * ptrsSymbPerSlot;
    }

    int G = nr_get_G(rel15_ul->rb_size,
                     rel15_ul->nr_of_symbols,
                     nb_re_dmrs,
                     number_dmrs_symbols,
                     unav_res,
                     rel15_ul->qam_mod_order,
                     rel15_ul->nrOfLayers);
    gNB->ulsch[ulsch_id].unav_res = unav_res;

    int16_t s[G + 96] __attribute__((aligned(32)));
    nr_codeword_unscrambling_init(s, G, 0, rel15_ul->data_scrambling_id, rel15_ul->rnti);

    int nb_re_pusch = 0, meas_symbol = -1;
    for (meas_symbol = rel15_ul->start_symbol_index;
         meas_symbol < (rel15_ul->start_symbol_index + rel15_ul->nr_of_symbols);
         meas_symbol++)
        if ((nb_re_pusch = get_nb_re_pusch(frame_parms, rel15_ul, meas_symbol)) > 0)
            break;

    AssertFatal(nb_re_pusch > 0 && meas_symbol >= 0, "nb_re_pusch %d cannot be 0 or meas_symbol %d cannot be negative here\n", nb_re_pusch, meas_symbol);

    int soffset = (slot % RU_RX_SLOT_DEPTH) * frame_parms->symbols_per_slot * frame_parms->ofdm_symbol_size;
    nb_re_pusch = (nb_re_pusch + 15) & ~15;

    for (int aarx = 0; aarx < frame_parms->nb_antennas_rx; aarx++)
        for (int aatx = 0; aatx < rel15_ul->nrOfLayers; aatx++)
            nr_ulsch_extract_rbs(gNB->common_vars.rxdataF[aarx],
                                 (c16_t *)pusch_vars->ul_ch_estimates[aatx * frame_parms->nb_antennas_rx + aarx],
                                 (c16_t *)&pusch_vars->rxdataF_ext[aarx][meas_symbol * nb_re_pusch],
                                 (c16_t *)&pusch_vars->ul_ch_estimates_ext[aatx * frame_parms->nb_antennas_rx + aarx][meas_symbol * nb_re_pusch],
                                 soffset + meas_symbol * frame_parms->ofdm_symbol_size,
                                 pusch_vars->dmrs_symbol * frame_parms->ofdm_symbol_size,
                                 aarx,
                                 (rel15_ul->ul_dmrs_symb_pos >> meas_symbol) & 0x01,
                                 rel15_ul,
                                 frame_parms);

    int avgs = 0;
    int avg[frame_parms->nb_antennas_rx * rel15_ul->nrOfLayers];
    uint8_t shift_ch_ext = rel15_ul->nrOfLayers > 1 ? log2_approx(max_ch >> 11) : 0;

    nr_ulsch_scale_channel(pusch_vars->ul_ch_estimates_ext,
                           frame_parms,
                           meas_symbol,
                           (rel15_ul->ul_dmrs_symb_pos >> meas_symbol) & 0x01,
                           nb_re_pusch,
                           rel15_ul->nrOfLayers,
                           rel15_ul->rb_size,
                           shift_ch_ext);

    nr_ulsch_channel_level(pusch_vars->ul_ch_estimates_ext,
                           frame_parms,
                           avg,
                           meas_symbol,
                           nb_re_pusch,
                           rel15_ul->nrOfLayers);

    for (int aatx = 0; aatx < rel15_ul->nrOfLayers; aatx++)
        for (int aarx = 0; aarx < frame_parms->nb_antennas_rx; aarx++)
            avgs = cmax(avgs, avg[aatx * frame_parms->nb_antennas_rx + aarx]);

    pusch_vars->log2_maxh = (log2_approx(avgs) >> 1);

    if (rel15_ul->nrOfLayers == 2 && rel15_ul->qam_mod_order >= 6)
        pusch_vars->log2_maxh = (log2_approx(avgs) >> 1) - 3;
    else if (rel15_ul->nrOfLayers == 1)
        pusch_vars->log2_maxh = (log2_approx(avgs) >> 1) + 1 + log2_approx(frame_parms->nb_antennas_rx >> 2);

    if (pusch_vars->log2_maxh < 0)
        pusch_vars->log2_maxh = 0;

    stop_meas(&gNB->rx_pusch_init_stats);
    start_meas(&gNB->rx_pusch_symbol_processing_stats);
    int numSymbols = gNB->num_pusch_symbols_per_thread;

    for (uint8_t symbol = rel15_ul->start_symbol_index;
         symbol < (rel15_ul->start_symbol_index + rel15_ul->nr_of_symbols);
         symbol += numSymbols) {
        int total_res = 0;
        for (int s = 0; s < numSymbols; s++) {
            pusch_vars->ul_valid_re_per_slot[symbol + s] = get_nb_re_pusch(frame_parms, rel15_ul, symbol + s);
            pusch_vars->llr_offset[symbol + s] = ((symbol + s) == rel15_ul->start_symbol_index)
                                                     ? 0
                                                     : pusch_vars->llr_offset[symbol + s - 1] + pusch_vars->ul_valid_re_per_slot[symbol + s - 1] * rel15_ul->qam_mod_order;
            total_res += pusch_vars->ul_valid_re_per_slot[symbol + s];
        }
        if (total_res > 0) {
            union puschSymbolReqUnion id = {.s = {ulsch_id, frame, slot, 0}};
            id.p = 1 + symbol;
            notifiedFIFO_elt_t *req = newNotifiedFIFO_elt(sizeof(puschSymbolProc_t), id.p, &gNB->respPuschSymb, &nr_pusch_symbol_processing);
            puschSymbolProc_t *rdata = (puschSymbolProc_t *)NotifiedFifoData(req);

            rdata->gNB = gNB;
            rdata->frame_parms = frame_parms;
            rdata->rel15_ul = rel15_ul;
            rdata->slot = slot;
            rdata->startSymbol = symbol;
            rdata->numSymbols = numSymbols;
            rdata->ulsch_id = ulsch_id;
            rdata->llr = pusch_vars->llr;
            rdata->llr_layers = pusch_vars->llr_layers;
            rdata->s = &s[pusch_vars->llr_offset[symbol] * rel15_ul->nrOfLayers];
            rdata->nvar = nvar;

            if (rel15_ul->pdu_bit_map & PUSCH_PDU_BITMAP_PUSCH_PTRS) {
                nr_pusch_symbol_processing(rdata);
            } else {
                pushTpool(&gNB->threadPool, req);
                gNB->nbSymb++;
            }
        }
    }

    while (gNB->nbSymb > 0) {
        notifiedFIFO_elt_t *req = pullTpool(&gNB->respPuschSymb, &gNB->threadPool);
        gNB->nbSymb--;
        delNotifiedFIFO_elt(req);
    }

    stop_meas(&gNB->rx_pusch_symbol_processing_stats);
    return 0;
}
```
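
The resource bookkeeping that feeds `nr_get_G` in the listing above can be isolated into a few small helpers. The following is a minimal sketch, not the project's code: the function names `count_dmrs_symbols`, `dmrs_re_per_prb`, `ptrs_re_in_slot` and `align16`, and the `is_type1` flag, are hypothetical and only mirror the arithmetic visible in the listing.

```c
#include <stdint.h>

/* Count the DMRS symbols inside the PUSCH allocation.
 * dmrs_symb_pos is a bitmap: bit l set means symbol l carries DMRS. */
static int count_dmrs_symbols(uint16_t dmrs_symb_pos, int start_symbol, int nr_of_symbols)
{
    int count = 0;
    for (int l = start_symbol; l < start_symbol + nr_of_symbols; l++)
        count += (dmrs_symb_pos >> l) & 0x01;
    return count;
}

/* DMRS REs per PRB in a DMRS symbol: 6 per CDM group for type 1, 4 for type 2.
 * is_type1 is a hypothetical flag standing in for the dmrs_config_type check. */
static int dmrs_re_per_prb(int is_type1, int num_cdm_grps_no_data)
{
    return (is_type1 ? 6 : 4) * num_cdm_grps_no_data;
}

/* PTRS REs in the slot: one PTRS subcarrier every freq_density PRBs
 * (rounded up), multiplied by the number of PTRS symbols. */
static uint32_t ptrs_re_in_slot(int rb_size, int freq_density, int ptrs_symbols)
{
    int n_ptrs = (rb_size + freq_density - 1) / freq_density;
    return (uint32_t)(n_ptrs * ptrs_symbols);
}

/* Round up to the next multiple of 16, as done for nb_re_pusch. */
static int align16(int n)
{
    return (n + 15) & ~15;
}
```

For example, with a DMRS bitmap of `0x0804` (symbols 2 and 11) over a 14-symbol allocation, `count_dmrs_symbols` returns 2, and `align16(600)` gives 608.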

### Summary

- **Task Data Structure**: Holds all necessary data for each thread.
- **Task Function**: Performs the channel estimation for a specific antenna (`aarx`).
- **Main Function Adjustments**: Creates and enqueues tasks for each antenna, waits for tasks to complete, and merges the results.
- **Thread Pool Usage**: Uses the project's existing thread pool (`pushTpool` / `pullTpool`) for parallel processing.

This adjustment lets `nr_pusch_channel_estimation` run its per-antenna work in parallel on the thread pool, which can reduce overall processing time.
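
The task structure and the merge step summarized above can be sketched in isolation. This is a simplified illustration under stated assumptions: `chest_task_t`, `chest_task_run` and `chest_merge` are hypothetical names, the task body only tracks the peak coefficient, and the tasks are meant to be called one after another here rather than dispatched to the thread pool. The merge follows the `max_ch` and `nvar` update rules described for the function's outputs.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-antenna task record; field names are illustrative. */
typedef struct {
    int aarx;            /* receive antenna index handled by this task */
    int max_ch;          /* largest |re| or |im| seen by this task */
    uint64_t noise_amp2; /* accumulated squared noise amplitude */
    int nest_count;      /* number of noise estimates accumulated */
} chest_task_t;

/* Stand-in task body: scan n_coeffs interleaved (re, im) pairs for one
 * antenna and record the peak magnitude. In a parallel version each call
 * would be enqueued on the thread pool instead of run inline. */
static void chest_task_run(chest_task_t *t, const int16_t *ch, int n_coeffs)
{
    t->max_ch = 0;
    for (int i = 0; i < 2 * n_coeffs; i++) {
        int a = abs(ch[i]);
        if (a > t->max_ch)
            t->max_ch = a;
    }
}

/* Merge step: global maximum over all tasks, and noise variance as the
 * total noise energy divided by the total number of estimates. */
static void chest_merge(const chest_task_t *tasks, int n_tasks, int *max_ch, uint32_t *nvar)
{
    uint64_t noise = 0;
    int count = 0;
    for (int i = 0; i < n_tasks; i++) {
        if (tasks[i].max_ch > *max_ch)
            *max_ch = tasks[i].max_ch;
        noise += tasks[i].noise_amp2;
        count += tasks[i].nest_count;
    }
    if (nvar && count > 0)
        *nvar = (uint32_t)(noise / count);
}
```

Keeping each task's results in its own record means the workers never write to shared outputs; only the merge, which runs after all tasks have completed, touches `*max_ch` and `*nvar`.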