# **Stage 3 - Deep Reinforcement Learning for Implementing Infinite Impulse Response (IIR) Filters for Complex Interference Situations - part 1**

## Scope - *developing lower order (2nd and 4th) IIR filters for scaled and multicarrier interference*
Reference: Chapter 11 of *Digital Signal Processing: Signals, Systems, and Filters* by *Andreas Antoniou*

- In the previous stage (stage 2), we discussed implementing 2nd and 4th order IIR filters solely for ***constant single-carrier non-overlapping interference situations***.
- In this stage, we analyze the capability of the current DDPG model and the filter-emulating environment to learn lower order (and potentially higher order) IIR filters for different interference situations, including scale variations, multicarrier components, and overlapping interference with additive white Gaussian noise (AWGN).


- Note that although the current model (as of 31/01/2024) for the 4th order IIR filter learns *sufficiently good* filters, giving average SNR values higher than 30 dB 50% of the time for the simple interference situations specified above, these models are still not capable of achieving optimal filters. (Although it is impossible to find an optimal filter for a given interference situation, there are surely better filters than the ones we learn.) Developing DRL models that are capable of learning such optimal filters is still an open research question under this project.
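The "higher than 30 dB, 50% of the time" figure can be read as the fraction of independently trained models whose average test SNR exceeds a threshold. The following is a minimal sketch of that calculation. The helper name `fraction_above` is hypothetical, and the numbers in `example_runs_db` are placeholders for illustration only, not measured results from this project.

```python
def fraction_above(avg_snrs_db, threshold_db=30.0):
    """Return the fraction of runs whose average test SNR exceeds threshold_db."""
    if not avg_snrs_db:
        raise ValueError("need at least one run")
    hits = sum(1 for snr in avg_snrs_db if snr > threshold_db)
    return hits / len(avg_snrs_db)


# placeholder values for illustration only, NOT measured results
example_runs_db = [31.5, 28.0, 33.2, 25.1]

success_rate = fraction_above(example_runs_db)
print(f"{success_rate:.0%} of the example runs exceed 30 dB")
```

In the notebook, the same function could be applied to each row of `test_avg_reward_arr` (one row per interference scalar) once training has completed.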

In [8]:
import os, sys, time, copy, json

import numpy as np

sys.path.append('../')
sys.path.append('../stage_2/envs/')

# import the DDPG model
from DDPGwithCustomNetDepths import DDPGAgentwithCustomNetworkDepths

# import the environment
from ReceiverEnvWithArbitaryOrderIIRwithDelayedTargetSNR import ReceiverEnvWithArbitaryOrderIIRwithDelayedTargetSNR

# train and test functions
from stage2_helper import train, test

In [9]:
# common constants
S = 100
SAMPLING_FREQ = 44_100 # Hz

### **Single-Carrier Interference**

- In the previous stage, we trained a model for constant single-carrier non-overlapping interference. Now we examine how learning behaves for the same type of interference *while changing the power of the interference using the `interference_scalar` variable*.
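To get a feel for what a given scalar means in terms of power, the sketch below converts an amplitude scaling factor into a power change in dB. It assumes that `interference_scalar` multiplies the interference *amplitude*, so that power scales with its square; the environment's implementation is not shown here, so this is an assumption. The helper name `power_change_db` is hypothetical.

```python
import math


def power_change_db(amplitude_scalar):
    """Power change in dB when a signal's amplitude is multiplied by amplitude_scalar.

    Assumes amplitude scaling: power scales with the square of the scalar,
    so the change is 10*log10(k**2) = 20*log10(k).
    """
    if amplitude_scalar <= 0:
        raise ValueError("amplitude_scalar must be positive")
    return 20.0 * math.log10(amplitude_scalar)


# the two scalars tested in this notebook
for k in (0.25, 0.5):
    print(f"scalar {k}: {power_change_db(k):+.2f} dB relative to scalar 1")
```

Under this assumption, halving the scalar lowers the interference power by roughly 6 dB, so the two tested scalars correspond to noticeably weaker interference than the unscaled case.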

In [10]:
# constants for single-carrier interference
CUT_OFF_FREQ = 5_000 # Hz
INTERFERENCE_CENTER_FREQ = 15_000 # Hz

In [11]:
# define the filter order
ORDER = 4

# train parameters
AUDIO_NUM = 1
NO_OF_TESTS = 4
NO_OF_STEPS = 4_000

# define the interference scalars to check
interference_scalars = [0.25, 0.5] # , 1.25, 1.5, 1.75, 2

train_reward_history_arr = [] # 3D array
train_action_history_arr = [] # 3D array
test_avg_reward_arr      = [] # 2D array
test_avg_action_arr      = [] # 2D array
model_arr                = [] # 2D array

start = time.time()

for i in range(len(interference_scalars)):

    print("#"*40 + f" START TESTING INTERFERENCE SCALAR {interference_scalars[i]} " + "#"*40)

    # define the environment
    env = ReceiverEnvWithArbitaryOrderIIRwithDelayedTargetSNR(
        order = ORDER,
        S = S,
        cut_off_freq = CUT_OFF_FREQ,
        interference_center_freq = INTERFERENCE_CENTER_FREQ,
        interference_scalar = interference_scalars[i],
        zero_magnitude_mapping = None,
        gradient = None,
        fix_zeros_magnitude = False,
        automatic_gain = False,
        SNR_as_dB = True,
        show_effect = False,
    )

    # initialize the sub-arrays
    train_reward_history_subarr = [] # 2D array
    train_action_history_subarr = [] # 2D array
    test_avg_reward_subarr      = [] # 1D array
    test_avg_action_subarr      = [] # 1D array
    model_subarr = [] # 1D array

    for j in range(NO_OF_TESTS):

        # initialize a DDPG model
        model = DDPGAgentwithCustomNetworkDepths(
            input_dims  = env.observation_space.shape,
            n_actions   = env.action_space.shape[0],
            alpha       = 0.0001, # learning rate of actor
            beta        = 0.001,  # learning rate of critic
            gamma       = 0,      # ***** decreasing the discounting factor *****
            tau         = 0.001,
            critic_dims = [[256], [512, 256], []],
            actor_dims  = [256, 128],
            batch_size  = 256,
            buffer_size = 4_000,
            noise       = 0.01,
            action_activation = 'sigmoid'
        )

        # train the model
        reward_history, action_history = train(model, env, audio_num=AUDIO_NUM, max_num_steps=NO_OF_STEPS)
        train_reward_history_subarr.append(reward_history)
        train_action_history_subarr.append(action_history)

        # test the trained model
        test_rewards, test_actions = test(model, env, audio_num=AUDIO_NUM, num_steps=NO_OF_STEPS, fixed_action=None)
        test_avg_reward_subarr.append(mean_reward := np.mean(test_rewards))
        test_avg_action_subarr.append(np.average(np.array(test_actions), axis=0))
        print(f"average test performance: {mean_reward}dB")

        # save the model
        model_subarr.append(model)

        end = time.time()

        print("="*30 + f" execution time: {round((end - start), 3)}s " + "="*30)

    train_reward_history_arr.append(train_reward_history_subarr)
    train_action_history_arr.append(train_action_history_subarr)
    test_avg_reward_arr.append(test_avg_reward_subarr)
    test_avg_action_arr.append(test_avg_action_subarr)

    model_arr.append(model_subarr)
    
    print("#"*41 + f" TESTING INTERFERENCE SCALAR {interference_scalars[i]} IS OVER " + "#"*41)


######################################## START TESTING INTERFERENCE SCALAR 0.25 ########################################
creating action space with 9 dimensions...


2024-02-07 16:05:10.507364: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:887] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-02-07 16:05:10.514871: W tensorflow/core/common_runtime/gpu/gpu_device.cc:2256] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...



--------------------------------------------------episode no: 1--------------------------------------------------
audio name: 'arms_around_you-MONO'


Exception: the specified audio file doesn't exist: given ../stage_1/audio_files/arms_around_you-MONO.wav
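The run above stops because the audio file expected by the environment is not present at the relative path shown in the exception. A small pre-flight check before starting a long training loop can surface this earlier. This is a sketch only: the helper name `missing_files` is hypothetical, and the path is copied from the exception message.

```python
from pathlib import Path


def missing_files(paths):
    """Return the subset of paths that do not point to an existing file."""
    return [str(p) for p in map(Path, paths) if not p.is_file()]


# path taken from the exception message above
required = ["../stage_1/audio_files/arms_around_you-MONO.wav"]

missing = missing_files(required)
if missing:
    print("missing audio files:", missing)
else:
    print("all required audio files found")
```

Because the path is relative, the result depends on the directory the notebook kernel was started from.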

In [7]:
# ----------------------------------------------- saving the data -----------------------------------------------
test_no = 1
dir_path = 'logs/notebook-stage_3.1/'
folder_name = f"test_{test_no}"
folder_path = os.path.join(dir_path, folder_name)
os.makedirs(folder_path)

# saving the train reward history
file_name = f"interference_scaling-scales_{0.25}_{0.5}-train_rewards.npy"
file_path = os.path.join(folder_path, file_name)
np.save(file_path, train_reward_history_arr)

# saving the train action history
file_name = f"interference_scaling-scales_{0.25}_{0.5}-train_actions.npy"
file_path = os.path.join(folder_path, file_name)
np.save(file_path, train_action_history_arr)

# saving the test average rewards
file_name = f"interference_scaling-scales_{0.25}_{0.5}-test_avg_rewards.npy"
file_path = os.path.join(folder_path, file_name)
np.save(file_path, test_avg_reward_arr)

# saving the test average actions
file_name = f"interference_scaling-scales_{0.25}_{0.5}-test_avg_actions.npy"
file_path = os.path.join(folder_path, file_name)
np.save(file_path, test_avg_action_arr)
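The saved `.npy` files can be read back later with `np.load`. The sketch below shows a round trip on a small placeholder list written to a temporary directory. The file name and values are illustrative only and are not the logs produced by this notebook.

```python
import os
import tempfile

import numpy as np

# placeholder nested list standing in for something like test_avg_reward_arr
nested = [[1.0, 2.0], [3.0, 4.0]]

with tempfile.TemporaryDirectory() as tmp_dir:
    file_path = os.path.join(tmp_dir, "example-test_avg_rewards.npy")
    np.save(file_path, nested)     # the nested list is converted to an ndarray
    restored = np.load(file_path)  # comes back as an ndarray, not a list

print(restored.shape)
```

Note that `np.save` converts a nested list to an array first, so the sub-lists need to have matching lengths for this to work without pickling.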

In [12]:
model_ = model_subarr[0]
model_.actor.summary()
# model_.critic.summary()

IndexError: list index out of range