# Random Action Agent Experiments

Author: K. Voudouris, 2023 (c) All Rights Reserved.

Contact: kv301@cam.ac.uk; k.voudouris14@googlemail.com; [Twitter @KozzyVoudouris](https://twitter.com/KozzyVoudouris); [GitHub @kozzy97](https://github.com/kozzy97)

Date: July 2023

This script runs a series of random action agents on the object permanence tests and stores the results in a MySQL database. It relies on a few things:

1. All the dependencies are installed, particularly that animalai is installed properly. I recommend using a conda environment and setting up an ipykernel for running this notebook.
2. AnimalAI is installed as an executable in the `env` folder.
3. A recent installation of MySQL, configured with a database, user, and password, as well as the local (or remote) host address to store results on. MySQL Workbench is a good IDE for interacting with MySQL (this was created with Workbench 8.0).
4. A CSV file in the same directory as this notebook called `databaseConnectionDetails.csv`, containing columns `database_name`, `hostname`, `username`, and `password` for database connection, with the values in the next row. This is gitignored.
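
For reference, the connection file described in item 4 could be read along the following lines. This is a minimal sketch using only the standard library; `read_connection_details` is a hypothetical helper, not the `databaseConnector` function from `mysqlConnection`, whose implementation is not shown in this notebook.

```python
import csv

def read_connection_details(csv_path):
    """Return the first data row of the CSV as a dict keyed by column name."""
    with open(csv_path, newline="") as handle:
        row = next(csv.DictReader(handle))  # header row gives the keys, next row the values
    required = ("database_name", "hostname", "username", "password")
    missing = [column for column in required if column not in row]
    if missing:
        raise KeyError(f"Missing columns in {csv_path}: {missing}")
    return {column: row[column] for column in required}
```

Failing early on a missing column tends to give a clearer error than a failed connection attempt later on.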

In [None]:
import keyboard
import random
import os
import time

from animalai.envs.environment import AnimalAIEnvironment
from animalai.envs.actions import AAIActions, AAIAction
from gym_unity.envs import UnityToGymWrapper

import sys
sys.path.append('../src')

from randomActionAgents import RandomActionAgent #import the random walker class
from yamlHandling import find_yaml_files #this function finds the yaml files in a directory.
from yamlHandling import yaml_combinor #this function combines a batch of yaml files and saves the output in a temporary folder. This means we can run inference on batches of tests at once.
from mysqlConnection import databaseConnector #this function permits connection to a mysql database using a CSV file containing details of the db connection.
from mysqlConnection import agentToDB #this function takes a dictionary and ingresses it into a table
from mysqlConnection import removePreviouslyRunInstances #this function takes a set of yaml files and task names and removes any that have already got results in the database.
from mysqlConnection import selectID #this function finds the integer ID for a table given a particular column name and value

## Database Connection

Check that a connection to the database can be opened, then close the cursor.

In [None]:
mycursor, connection = databaseConnector('databaseConnectionDetails.csv')

mycursor.close()

print("Connection checked and closed.")

## Paths

Provide the path to the directory containing the configs being tested over, as well as the path to the AnimalAI environment. Finally, provide a location for generating temporary files of combined configs. This defaults to the parent directory of the GitHub repository, to prevent results being pushed accidentally.

In [None]:
configuration_folder = "../../configs/tests_agents"

env_path = "../../env/AnimalAI"

temp_folder_location = "../../.."

## Add All Tasks In Directory To Database

Iterate through the directory and find yaml files and their task names.
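
A minimal sketch of what such a directory scan might look like, assuming the task name is simply the file name without its extension. `list_yaml_tasks` is a hypothetical stand-in; the real `find_yaml_files` in `yamlHandling` may derive task names differently.

```python
import os

def list_yaml_tasks(root_dir):
    """Recursively collect .yml/.yaml paths and the matching task names."""
    yaml_paths, names = [], []
    for current_dir, _subdirs, files in os.walk(root_dir):
        for file_name in sorted(files):
            stem, extension = os.path.splitext(file_name)
            if extension.lower() in (".yml", ".yaml"):
                yaml_paths.append(os.path.join(current_dir, file_name))
                names.append(stem)  # assumption: task name == file name without extension
    return yaml_paths, names
```

The two lists are built in lockstep, so the path and the name at the same index always refer to the same task.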

In [None]:
rerunInstanceTable = False

mycursor, connection = databaseConnector('databaseConnectionDetails.csv')

yaml_files, task_names = find_yaml_files(configuration_folder)

if rerunInstanceTable:
    dropTable = "DROP TABLE IF EXISTS randomactionagentinstanceresults, randomactionagentintrainstanceresults, instances;"
    mycursor.execute(dropTable)
    
    sql = "CREATE TABLE instances(instanceid INT AUTO_INCREMENT PRIMARY KEY, instancename VARCHAR(750) UNIQUE NOT NULL);"
    mycursor.execute(sql)

for instance in task_names:
    try:
        # parameterised query, so task names containing quotes cannot break the statement
        insertQuery = "INSERT INTO instances(instancename) VALUES (%s);"
        mycursor.execute(insertQuery, (str(instance),))
        connection.commit()
    except Exception:
        print(f"Task {instance} has already been added to this table. Moving to next.")

mycursor.close()



## Agents

Dictionaries of parameters that define some random action agents. These agents randomly sample an action from the set of nine possible actions, then pick the number of steps for which to execute that action from one of several distributions. Biases can be introduced to favour certain actions, as can correlations with previous actions. These agents more closely resemble a human randomly pressing buttons to try to get a reward, or an artificial agent selecting actions from a random policy. However, their behaviour is less easily described than that of the randomWalker agents, for which there is plenty of work outlining expected trajectories and behaviours.

The first agent selects the number of steps from a uniform distribution. There are no biases or correlations for action selection.

In [None]:
uniform_action_agent = {'step_length_distribution' : 'uniform',
                       'max_step_length' : 20,
                       '0stationaryactionbias' : 1,
                       '1rturnbias' : 1,
                       '2lturnbias' : 1,
                       '3forwardbias' : 1,
                       '4forwardrbias' : 1,
                       '5forwardlbias' : 1,
                       '6backwardbias' : 1,
                       '7backwardlbias' : 1,
                       '8backwardrbias' : 1,
                       'remove_prev_step': False,
                       'aai_seed' : 2023,
                       'agent_tag' : 'Random Action Agent no bias no correlation uniform step length max 20'}

The second agent selects actions with a bias towards going forwards, simulating a cephalo-caudal bias whereby animals tend to go forwards rather than stop, turn, or reverse. The number of steps is selected from a Cauchy distribution with a high mode, to simulate how a human might transition between random actions every few steps, but occasionally press and hold an action for an extended period, or rapidly transition between actions. This is modelled by the heavy tails of a Cauchy distribution. Each new action is also chosen to differ from the previous action, to model the fact that humans tend to pick new actions after a series of the same action.
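
To illustrate the heavy-tailed step lengths, here is a minimal sketch of drawing a step count from a Cauchy distribution by inverse-CDF sampling. The function name, the `scale` parameter, and the way draws are folded and clamped to positive integers are all assumptions made for illustration; the `RandomActionAgent` class may handle these details differently.

```python
import math
import random

def sample_cauchy_step_length(mode, scale=1.0, rng=random):
    """Draw a positive integer step count from a Cauchy(mode, scale) distribution."""
    u = rng.random()                                     # uniform on [0, 1)
    draw = mode + scale * math.tan(math.pi * (u - 0.5))  # inverse CDF of the Cauchy distribution
    return max(1, round(abs(draw)))                      # assumption: fold negatives, clamp to >= 1

rng = random.Random(2023)
print([sample_cauchy_step_length(15, rng=rng) for _ in range(10)])
```

Most draws land near the mode, but the tails occasionally produce very short or very long runs of the same action.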

The action biases are as follows (after softmaxing these values):
- 48.37% chance of picking a forwards action
- 17.79% chance of picking a forwardsleft action
- 17.79% chance of picking a forwardsright action
- 3.97% chance of picking a left action
- 3.97% chance of picking a right action
- 2.41% chance of picking stationary action
- 2.41% chance of picking a backwards left action
- 2.41% chance of picking a backwards right action
- 0.89% chance of picking a backwards action
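
The percentages above can be checked with a short softmax computation. This is a minimal sketch; the bias values are taken from the `cephalo_caudal_cauchy` dictionary in the next cell, and the `softmax` helper is written here for illustration rather than taken from the agent class.

```python
import math

def softmax(values):
    """Convert raw bias values into probabilities that sum to 1."""
    shift = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

# Bias values in action order 0-8 (stationary, right, left, forward,
# forward-right, forward-left, backward, backward-left, backward-right).
biases = [1, 1.5, 1.5, 4, 3, 3, 0, 1, 1]
percentages = [round(100 * p, 2) for p in softmax(biases)]
print(percentages)
```

Because softmax exponentiates the biases, a bias of 0 for the backwards action still leaves it a small non-zero probability.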

In [None]:
cephalo_caudal_cauchy = {'step_length_distribution' : 'cauchy',
                       'cauchy_mode' : 15,
                       '0stationaryactionbias' : 1,
                       '1rturnbias' : 1.5,
                       '2lturnbias' : 1.5,
                       '3forwardbias' : 4,
                       '4forwardrbias' : 3,
                       '5forwardlbias' : 3,
                       '6backwardbias' : 0,
                       '7backwardlbias' : 1,
                       '8backwardrbias' : 1,
                       'remove_prev_step': True,
                       'aai_seed' : 2023,
                       'agent_tag' : 'Random Action Agent cephalocaudal bias cauchy step length mode 15'}

In [None]:
rerunAgentTable = False

mycursor, connection = databaseConnector('databaseConnectionDetails.csv')

if rerunAgentTable:
    dropTable = "DROP TABLE IF EXISTS randomactionagentinstanceresults, randomactionagentintrainstanceresults, randomactionagents;"
    mycursor.execute(dropTable)
    
    sql = "CREATE TABLE `randomactionagents` (`agentid` INT AUTO_INCREMENT PRIMARY KEY, `agent_tag` VARCHAR(300), `aai_seed` INT, `step_length_distribution` VARCHAR(10), `max_step_length` INT, `norm_mu` FLOAT(8), `norm_sig` FLOAT(8), `beta_alpha` FLOAT(8), `beta_beta` FLOAT(8), cauchy_mode FLOAT(8), gamma_kappa FLOAT(8), gamma_theta FLOAT(8), weibull_alpha FLOAT(8), poisson_lambda FLOAT(8), 0stationaryactionbias FLOAT(8), 1rturnbias FLOAT(8), 2lturnbias FLOAT(8), 3forwardbias FLOAT(8), 4forwardrbias FLOAT(8), 5forwardlbias FLOAT(8), 6backwardbias FLOAT(8), 7backwardlbias FLOAT(8), 8backwardrbias FLOAT(8), prev_step_bias FLOAT(8), remove_prev_step BOOL, UNIQUE(agent_tag, aai_seed));"
    mycursor.execute(sql)

mycursor.close()

In [None]:
agent_dict_list = [uniform_action_agent, cephalo_caudal_cauchy]

seeds_to_run = [2023, 1997, 356, 1815, 3761] #5 seeds corresponding to eventful years.

In [None]:
mycursor, connection = databaseConnector('databaseConnectionDetails.csv')

for agent in agent_dict_list:
    for seed in seeds_to_run:
        agent['aai_seed'] = seed
        agentToDB(mycursor, agent, table_name = "randomactionagents")

connection.commit()

mycursor.close()

## Run Inference And Store

Build the results tables if required, then iterate through the agent dictionaries and run inference.

In [None]:
mycursor, connection = databaseConnector('databaseConnectionDetails.csv')

rebuildInstanceResultsTables = False

if rebuildInstanceResultsTables:
    print("Rebuilding results tables, dropping if they already exist.")

    dropInstanceResultsTables = "DROP TABLE IF EXISTS randomactionagentinstanceresults, randomactionagentintrainstanceresults;"
    mycursor.execute(dropInstanceResultsTables)
    
    createInstanceTable = "CREATE TABLE randomactionagentinstanceresults(instanceid INT NOT NULL, agentid INT NOT NULL, finalreward FLOAT(53), FOREIGN KEY (instanceid) REFERENCES instances(instanceid), FOREIGN KEY(agentid) REFERENCES randomactionagents(agentid), PRIMARY KEY (instanceid, agentid));"
    mycursor.execute(createInstanceTable)

    createIntraInstanceTable = "CREATE TABLE randomactionagentintrainstanceresults(instanceid INT NOT NULL, agentid INT NOT NULL, step INT NOT NULL, actiontaken INT NOT NULL, stepreward FLOAT(53), xvelocity FLOAT(32), yvelocity FLOAT(32), zvelocity FLOAT(32), xpos FLOAT(32), ypos FLOAT(32), zpos FLOAT(32), FOREIGN KEY (instanceid) REFERENCES instances(instanceid), FOREIGN KEY(agentid) REFERENCES randomactionagents(agentid), PRIMARY KEY(instanceid, agentid, step));"
    mycursor.execute(createIntraInstanceTable)

    print("Tables: randomactionagentinstanceresults and randomactionagentintrainstanceresults have been successfully built.")

mycursor.close()


Define a function to run the experiments. This takes an agent dictionary and first checks whether any results have been recorded for it, skipping instances that already have results. It then tests in batches, generating a temporary yml file to run inference on and storing the final episode reward, as well as the intra-instance results.
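
The batching itself can be sketched as below. `iter_batches` is a hypothetical helper for illustration only; the function in the next cell inlines equivalent logic with an explicit upper bound.

```python
def iter_batches(items, batch_size):
    """Yield (start_index, batch) pairs that cover items in order."""
    for start in range(0, len(items), batch_size):
        # slicing past the end of a list is safe, so the last batch is simply shorter
        yield start, items[start:start + batch_size]

for start, batch in iter_batches(["a.yml", "b.yml", "c.yml"], 2):
    print(start, batch)
```

Yielding the start index alongside the batch makes it easy to derive a unique temporary file name and port offset per batch.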

In [None]:
def runRandomActionAgentAndStore (cur, con, batch_size: int, agent_dict: dict, yaml_files, task_names, temp_folder_location, agent_inference = False, port_base = 6600, randomise_port = True, verbose = True):
    
    # first, check if this agent has been added to the DB already

    agentid = selectID(cur, id_name = "agentid", table_name = "randomactionagents", WHERE_column = "agent_tag", WHERE_clause = agent_dict['agent_tag'], secondary_WHERE_column = "aai_seed", secondary_WHERE_clause = agent_dict['aai_seed'])
    
    try:
        task_names, yaml_files = removePreviouslyRunInstances(cur = cur, yaml_files=yaml_files, task_names=task_names, agentid=agentid, agent_table = "randomactionagents", agent_instance_results_table = "randomactionagentinstanceresults")
    except:
        print("Running on all files.")

    # now proceed with testing
    yaml_index = 0

    if randomise_port:

        port = port_base + yaml_index + random.randint( #create random base port.
            0, 9000
            )
        
    else:
        port = port_base + yaml_index
        
    batch_counter = 0

    #set seed
    random.seed(agent_dict['aai_seed'])

    if len(yaml_files) > 0:
        for yaml_index in range(0, len(yaml_files), batch_size):

            if (yaml_index + batch_size) > len(yaml_files) or batch_size > len(yaml_files):
                upper_bound = len(yaml_files)
            else:
                upper_bound = ((yaml_index + batch_size))

            if verbose:
                print(f"Running inferences on batch {batch_counter + 1} of {batch_size} files of total {len(yaml_files)}. {len(yaml_files) - (batch_size * (batch_counter + 1))} instances to go.")

            batch_files = yaml_files[yaml_index:upper_bound]

            batch_file_names = task_names[yaml_index:upper_bound]

            batch_temp_file_name = f"TempConfig_{agent_dict['agent_tag']}_{agent_dict['aai_seed']}_{yaml_index}.yml"

            config_file_path = yaml_combinor(file_list = batch_files, temp_file_location=temp_folder_location, stored_file_name = batch_temp_file_name)

            if verbose:
                print("Opening AAI Environment.")

            temp_port = port + yaml_index # increment through ports to prevent calling the same socket.

            aai_env = AnimalAIEnvironment( 
                inference=agent_inference, #Set true when watching the agent
                seed = agent_dict['aai_seed'],
                worker_id=agent_dict['aai_seed'],
                file_name=env_path,
                arenas_configurations=config_file_path,
                base_port=temp_port,
                useCamera=False,
                resolution=4, #make resolution small to improve processing speed - random walkers don't need anything.
                useRayCasts=False,
                no_graphics=True
            )

            env = UnityToGymWrapper(aai_env, uint8_visual=False, allow_multiple_obs=True, flatten_branched=True)

            obs = env.reset()  

            agent = RandomActionAgent() # initialise agent class

            for key, value in agent_dict.items(): #set the agent attributes to be whatever is in the dictionary, and default otherwise.
                if hasattr(agent, key):
                    setattr(agent, key, value)

            agent.action_biases = [agent_dict['0stationaryactionbias'], 
                                   agent_dict['1rturnbias'], 
                                   agent_dict['2lturnbias'], 
                                   agent_dict['3forwardbias'],
                                   agent_dict['4forwardrbias'],
                                   agent_dict['5forwardlbias'],
                                   agent_dict['6backwardbias'],
                                   agent_dict['7backwardlbias'],
                                   agent_dict['8backwardrbias']]


            for _instance in range(len(batch_files)): 

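                #select a random action according to the biases. There is no previous step bias as there is no previous step at the start of an episode!
                #temporarily zero the bias and restore it afterwards, so that later action choices still use the configured value.
                configured_prev_step_bias = getattr(agent, 'prev_step_bias', 0)
                agent.prev_step_bias = 0

                previous_action = agent.get_new_action(prev_step=0)
                agent.prev_step_bias = configured_prev_step_bias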

                #get instance ID
                instanceid = selectID(cur, id_name = "instanceid", table_name = "instances", WHERE_column = "instancename", WHERE_clause = batch_file_names[_instance])

                #prepare to run instance
                done = False

                episodeReward = 0

                step_counter = 0
    
                while not done:

                    step_list = agent.get_num_steps(prev_step = previous_action)
            
                    for action in step_list:
            
                        obs, reward, done, info = env.step(int(action))

                        env.render()
             
                        step_counter += 1

                        episodeReward += reward

                        previous_action = action

                        try:
                            # NOTE: the stepreward column receives the cumulative episode reward so far.
                            intraInstanceQuery = f"INSERT INTO randomactionagentintrainstanceresults(instanceid, agentid, step, actiontaken, stepreward, xvelocity, yvelocity, zvelocity, xpos, ypos, zpos) VALUES ({instanceid}, {agentid}, {step_counter}, {action}, {float(episodeReward)}, {obs[0][1]}, {obs[0][2]}, {obs[0][3]}, {obs[0][4]}, {obs[0][5]}, {obs[0][6]});"
                            cur.execute(intraInstanceQuery)
                            #con.commit()
         
                        except Exception as error:
                            print(f"There's something wrong with this step ({error}). Here's the query {intraInstanceQuery}")

                        if done:
                            obs=env.reset()
                            if verbose:
                                print(f"Episode Reward: {episodeReward}")
                            done = True #to be sure.
                            break #break the for loop early
                    
                    
                    if not done: # only keep going if episode not done yet.

                        action = agent.get_new_action(prev_step = previous_action)

                        obs, reward, done, info = env.step(int(action))

                        step_counter += 1

                        env.render()

                        episodeReward += reward

                        previous_action = action

                        try:
                            # NOTE: the stepreward column receives the cumulative episode reward so far.
                            intraInstanceQuery = f"INSERT INTO randomactionagentintrainstanceresults(instanceid, agentid, step, actiontaken, stepreward, xvelocity, yvelocity, zvelocity, xpos, ypos, zpos) VALUES ({instanceid}, {agentid}, {step_counter}, {action}, {float(episodeReward)}, {obs[0][1]}, {obs[0][2]}, {obs[0][3]}, {obs[0][4]}, {obs[0][5]}, {obs[0][6]});"
                            cur.execute(intraInstanceQuery)
                            #con.commit()
         
                        except Exception as error:
                            print(f"There's something wrong with this step ({error}). Here's the query {intraInstanceQuery}")
                        

                        if done:
                            if verbose:
                                print(f"Episode Reward: {episodeReward}")
                            obs=env.reset()
                            done = True #to be sure.
                            break

                try:
                    insertInstanceResults = f"INSERT INTO randomactionagentinstanceresults(instanceid, agentid, finalreward) VALUES ({instanceid}, {agentid}, {episodeReward});"
                    cur.execute(insertInstanceResults)
                    con.commit()
                    if verbose:
                        print("Pushing results to database.")
                except:
                    print("It looks like this agent has already been tested on this instance.")

                    
            env.close()

            batch_counter += 1

            os.remove(config_file_path)

            if verbose:
                print("Moving to next batch.")

    else:
        if verbose:
            print("This agent has already been run and is in the database. Skipping so as not to waste time. If you suspect that the agent has not been fully evaluated on all tests, you may want to restart the instances for that agent.")

       
    

In [None]:
def run_agent_on_instance_wrapper(seed, agent, yaml_batch_size=1, port_base = 6600, randomise_port = True, verbose = True):
    agent['aai_seed'] = seed
    
    if verbose:
        print(f"Running {agent['agent_tag']} on seed {seed}.")
    
    mycursor, connection = databaseConnector('databaseConnectionDetails.csv')

    runRandomActionAgentAndStore(mycursor, connection, yaml_batch_size, agent_dict=agent, yaml_files=yaml_files, task_names=task_names, temp_folder_location=temp_folder_location, agent_inference=False, port_base = port_base, randomise_port = randomise_port, verbose = verbose)

    mycursor.close()

In [26]:
yaml_batch_size = 1 #problem with task ordering, so having to do batches of 1. Much slower...
counter = 0
#inf_loop = True
verbose = False
total_runs = len(seeds_to_run) * len(agent_dict_list)
stop_requested = False #set when 'q' is pressed, so that every loop level exits

while counter < total_runs and not stop_requested:
    try:
        for seed in seeds_to_run:
            for agent_dictionary in agent_dict_list:
                if keyboard.is_pressed('q'):
                    print("Loop stopped by pressing 'q'.")
                    stop_requested = True
                    break
                adhoc_port = (counter+10)*100
                run_agent_on_instance_wrapper(seed, agent_dictionary, yaml_batch_size=yaml_batch_size, port_base = adhoc_port, randomise_port = False, verbose = verbose)
                if verbose:
                    print("Moving to next agent.")
                counter += 1
            if stop_requested:
                break
            if verbose:
                print("Moving to next seed.")
    except Exception as error:
        #most failures here are occupied sockets; instances already stored are skipped on the retry.
        print(f"Run failed ({error}); sockets were probably occupied. Waiting 10 seconds and starting again.")
        counter = 0
        if keyboard.is_pressed('q'):
            print("Loop stopped by pressing 'q'.")
            break
        time.sleep(10)

[INFO] Connected to Unity environment with package version 2.1.0-exp.1 and communication version 1.5.0
[INFO] Connected new brain: AnimalAI?team=0
Episode Reward: -0.9991999505436979
Pushing results to database.
Moving to next batch.
Running inferences on batch 875 of 1 files of total 3326. 2451 instances to go.
Yaml files combined. Saved to ../../..\TempConfig_Random Action Agent no bias no correlation uniform step length max 20_1956_874.yml
Opening AAI Environment.
[INFO] Connected to Unity environment with package version 2.1.0-exp.1 and communication version 1.5.0
[INFO] Connected new brain: AnimalAI?team=0
Episode Reward: -0.9991999505436979
Pushing results to database.
Moving to next batch.
Running inferences on batch 876 of 1 files of total 3326. 2450 instances to go.
Yaml files combined. Saved to ../../..\TempConfig_Random Action Agent no bias no correlation uniform step length max 20_1956_875.yml
Opening AAI Environment.
[INFO] Connected to Unity environment with package versi

Episode Reward: -1.3971999411005527
Pushing results to database.
Moving to next batch.
Running inferences on batch 2 of 1 files of total 2442. 2440 instances to go.
Yaml files combined. Saved to ../../..\TempConfig_Random Action Agent no bias no correlation uniform step length max 20_1956_1.yml
Opening AAI Environment.
[INFO] Connected to Unity environment with package version 2.1.0-exp.1 and communication version 1.5.0
[INFO] Connected new brain: AnimalAI?team=0
Episode Reward: -0.9995999505044892
Pushing results to database.
Moving to next batch.
Running inferences on batch 3 of 1 files of total 2442. 2439 instances to go.
Yaml files combined. Saved to ../../..\TempConfig_Random Action Agent no bias no correlation uniform step length max 20_1956_2.yml
Opening AAI Environment.
[INFO] Connected to Unity environment with package version 2.1.0-exp.1 and communication version 1.5.0
[INFO] Connected new brain: AnimalAI?team=0
Episode Reward: -1.4599999244092032
Pushing results to database.

[INFO] Connected to Unity environment with package version 2.1.0-exp.1 and communication version 1.5.0
[INFO] Connected new brain: AnimalAI?team=0

