***Author: Ritwika VPS, ritwika@ucmerced.edu***  
***Written: Dec 31, 2025***  
###### (see below for modifications log as applicable)  

This script carries out robustness checks for the core simulation results presented in the main text, i.e., for the Random Assignment Without Redundancy (NoRedundRandom) and Random Assignment With Redundancy (RedundNoRepBits) task bit allocation protocols. Specifically, we test:  
- whether the choice of N_g = 10,000 (i.e., the number of trials) yields stable results. We test this for a subset of selected points representative of various regions of the explored T/N_a and T/m_a parameter space for these protocols, by repeating the simulations for various N_g values at the N_a and m_a values of each selected point. Here, we keep the TaskSize fixed at 20 bits, as is the case for the core results presented in the main text.
- how results change for various values of the TaskSize. Broadly, we do not expect the results to be robust (as a function of T/N_a and T/m_a) as TaskSize changes, for N_a and m_a values corresponding to the same (T/N_a, T/m_a) combos. To demonstrate, consider the following example: for T = 20, take T/N_a = 6.67 and T/m_a = 2.5, corresponding to N_a = 3 and m_a = 8. Here, $N_a\times m_a$ = 24, leaving a few leftover bits after full coverage. For T = 80 and the same ratios, however, N_a = 12 and m_a = 32, giving $N_a\times m_a$ = 384, which is considerably larger than the TaskSize. We do, however, expect behaviour to be roughly similar along curves where $N_a\times m_a = \alpha T$, where $\alpha$ is a constant > 0. Therefore, we run the core simulations for the same range of T/N_a and T/m_a values as presented in the main text, for various values of the TaskSize. We use larger increments for the N_a and m_a parameter vectors at larger task sizes, and use N_g = 5000 (based on results from the N_g robustness check above).
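
The arithmetic in the TaskSize example above can be reproduced with a short sketch. The helper name `agents_and_memory` is hypothetical and is not part of `Fns_MemoryTaskSim`; it simply rounds T divided by each ratio to the nearest integer and reports $\alpha = N_a\times m_a / T$.

```python
def agents_and_memory(task_size, t_by_na, t_by_ma):
    """Hypothetical helper: recover integer N_a and m_a from the two ratios."""
    n_a = round(task_size / t_by_na)   # number of agents
    m_a = round(task_size / t_by_ma)   # memory per agent (bits)
    alpha = n_a * m_a / task_size      # total group memory relative to task size
    return n_a, m_a, alpha

for task_size in (20, 80):
    n_a, m_a, alpha = agents_and_memory(task_size, t_by_na=6.67, t_by_ma=2.5)
    print(f"T={task_size}: N_a={n_a}, m_a={m_a}, N_a*m_a={n_a * m_a}, alpha={alpha:.1f}")
```

For the same pair of ratios, $\alpha$ grows from 1.2 at T = 20 to 4.8 at T = 80, which is why points with identical (T/N_a, T/m_a) are not expected to behave alike across task sizes.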

We do not run these robustness checks for the First Come First Serve Random Assignment Without Redundancy and Sequential Random Assignment Without Redundancy protocols, since these are supplemental results presented in the Supporting Information and do not form core results of the paper. 

In [42]:
"""
Import all necessary modules (importing modules for the function file as well, just to be safe)
"""
import Fns_MemoryTaskSim as CIvsCM #user-defined function file with all sim functions
import numpy as np
import pandas as pd #data manipulation and analysis library
import os
from tqdm import tqdm #for progress bar
from scipy.io import savemat #to save data in .mat format (cuz I'm going to plot in MATLAB :D)
import re #for regular expressions

In [43]:
#Read in data table with details about selected points for robustness checks
DataTab_ForChks = pd.read_excel('/Users/ritwikavps/Desktop/GoogleDriveFiles/research/CollectiveMemoryvsCollectiveIntelligence/A1_MemoryModel/MemoryModelSimResults/RobustnessChecks/' \
                                'RobustnessCheckValues.xlsx',sheet_name='Data')

In [None]:
"""
Robustness checks: number of trials (i.e., number of CM and CI groups)
"""
 
NumGroups_Vec = [10, 50, 100, 500, 1000, 5000, 10000, 20000] #list with number of groups to test
DataList_NgTest = [] #initialise data list to store output

#ERROR CHECK
if np.size(np.unique(DataTab_ForChks['TaskSize'])) != 1:
    raise ValueError("There should only be one task size for the core simulations.")

#Loop through each set of T/ma and T/na values in the data table, with each set of (T/ma, T/Na) representing a selected point for robustness checks. Note that for checking the effect 
#of the number of groups, we use the same TaskSize as in the core simulations as well as the same ma and Na values for the selected points. The goal is to simply see how the stats re:
#excluded--and by extension, covered--bits change as a function of N_g (number of groups)
for Ind in tqdm(range(np.size(DataTab_ForChks['Rounded_ma']))): #tqdm for progress bar!

    #Get agent memory value and number of agents for the selected point
    CurrAgentMem = DataTab_ForChks['Rounded_ma'][Ind]; CurrNumAgent = DataTab_ForChks['Rounded_Na'][Ind] 
    CurrBitAllocProtocol = DataTab_ForChks['BitAllocationProtocol'][Ind] #get corresponding memory dist condition and convert to required format to pass to sim function
    MemoryDistCondition = 'GreaterGroupMem_' + CurrBitAllocProtocol 
    
    for Curr_N_g in NumGroups_Vec: #Loop through the list of number of groups

        #run simulation
        MeanExcludedBits, StdExcludedBits, Stable_ExcludedBits =\
            CIvsCM.GetMeanAndStdExcludedBits(DataTab_ForChks['TaskSize'][0], CurrNumAgent, CurrAgentMem, Curr_N_g, MemoryDistCondition)
        
        #Get line with results to write to output (which we will convert to a dataframe)
        CurrResultsToWrite = {
        'ItemId': Ind+1, #to uniquely identify results for various N_g values for a given selected point being tested for robustness
        'BitAllocationProtocol': CurrBitAllocProtocol,
        'Na': CurrNumAgent,
        'ma': CurrAgentMem,
        'NumGroups': Curr_N_g,
        'TaskSize': DataTab_ForChks['TaskSize'][0],
        'CM_NumExcBits': Stable_ExcludedBits,
        'CI_MeanNumExcBits': MeanExcludedBits,
        'CI_StdNumExcBits':StdExcludedBits
        }
        
        DataList_NgTest.append(CurrResultsToWrite) #append current results

OpDf_NgTest= pd.DataFrame(DataList_NgTest) #convert to df
OpDf_NgTest.to_csv('/Users/ritwikavps/Desktop/GoogleDriveFiles/research/CollectiveMemoryvsCollectiveIntelligence/A1_MemoryModel/MemoryModelSimResults/RobustnessChecks/' \
                                 'CIvsCM_RobustnessChecks_NumberOfGroups.csv',index=False) #write to .csv file
    


100%|██████████| 16/16 [11:36<00:00, 43.51s/it]
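
To illustrate why estimates are expected to stabilise as N_g grows, here is a minimal toy sketch. It is an assumption-laden stand-in, not the protocols implemented in `Fns_MemoryTaskSim`: each agent independently memorises `agent_mem` distinct bits chosen uniformly at random, and the function name `toy_mean_excluded_bits` is hypothetical. The parameter values (T = 20, N_a = 3, m_a = 8) are borrowed from the example in the introductory text.

```python
import numpy as np

def toy_mean_excluded_bits(task_size, n_agents, agent_mem, n_groups, rng):
    """Toy stand-in (NOT the author's protocols): mean and std of uncovered bits."""
    excluded = np.empty(n_groups)
    for g in range(n_groups):
        covered = np.zeros(task_size, dtype=bool)
        for _ in range(n_agents):
            # each agent picks agent_mem distinct bits uniformly at random
            covered[rng.choice(task_size, size=agent_mem, replace=False)] = True
        excluded[g] = task_size - covered.sum()
    return excluded.mean(), excluded.std()

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable
for n_groups in [10, 100, 1000, 10000]:
    mean_exc, std_exc = toy_mean_excluded_bits(20, 3, 8, n_groups, rng)
    print(f"N_g={n_groups}: mean={mean_exc:.3f}, std={std_exc:.3f}")
```

In this toy setting the expected number of excluded bits is $T(1 - m_a/T)^{N_a}$ = 4.32, and the sampling error of the mean shrinks roughly as $1/\sqrt{N_g}$, so small N_g values scatter around that figure while large ones sit close to it.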


In [72]:
""" 
Here, we test whether the broad pattern of results is robust across various task sizes, keeping in mind that results should be preserved along curves described by 
N_a * m_a = alpha * T, where alpha is a constant > 0
"""

#This function flags duplicate ratios and replaces all except the last in a set of duplicates with NaN. This is so that we only retain Na and ma values such that every Task size to Na 
#(or ma) ratio is unique (this reduces computational costs + makes plotting less computationally heavy cuz there are fewer values)
#(This function is copied from the core sim file: A1_CIvsCM_MemoryTaskSim.ipynb)
def FlagDuplicateRatios(RatioVec):
    for i in range(len(RatioVec)): #go through the ratio vector
        if np.count_nonzero(RatioVec == RatioVec[i]) > 1: #if there are duplicates of the current ratio value
            RatioVec[i] = np.nan #assign nan

    #This way, only the last occurrence of each duplicate ratio value is retained (others are NaN)
    return RatioVec

"""
Run the simulation
"""
#Set parameters
N_g = 5000 #Number of groups to simulate #based on N_g robustness tests
TaskSize_Vec = [5, 10, 20, 40, 80] #Memory size of task
Na_ma_Increment_Vec = [1, 1, 2, 3, 6] #increment values for Na and ma vectors for each task size such that the Na and ma vectors are roughly similar size (and also, do not have 
#too many entries)
MemoryDistCondition_Vec = ["GreaterGroupMem_NoRedundRandom", "GreaterGroupMem_RedundNoRepBits"]
NumFreeCores = 1

for TaskSizeInd in range(np.size(TaskSize_Vec)):

    CurrTaskSize = TaskSize_Vec[TaskSizeInd]
    Curr_Na_ma_Increment = Na_ma_Increment_Vec[TaskSizeInd]

    #get N_a and m_a vectors
    Na_ma_vec = np.arange(2, 2*CurrTaskSize, Curr_Na_ma_Increment) #start with a common vector 
    TaskSize_To_Na_ma = np.round(CurrTaskSize/Na_ma_vec,1) #get ratio of task size to Na (or ma)
    TaskSize_To_Na_ma = FlagDuplicateRatios(TaskSize_To_Na_ma) #flag duplicates of the ratios as NaN
    Na_ma_vec = Na_ma_vec[~np.isnan(TaskSize_To_Na_ma)] #only retain Na (or ma) values such that the ratios are unique (because we plot based on the ratios)
    
    N_a_vec = Na_ma_vec #assign N_a_vec
    m_a_vec = np.append(Na_ma_vec,max(Na_ma_vec)+1)  #assign m_a_vec (add one extra value to avoid plotting confusion; one axis will have one extra value)

    #Run the simulation for given parameters, looping through memory distribution conditions
    for MemoryDistCondition in MemoryDistCondition_Vec:
        
        print(f'Starting sim for {MemoryDistCondition} for TaskSize {CurrTaskSize}')
        MeanExcludedBitsArray, StdExcludedBitsArray, Stable_ExcludedBitsArray =\
            CIvsCM.GetCIvsCMstats_ParallelParamSweep(CurrTaskSize, N_a_vec, m_a_vec, N_g, MemoryDistCondition, NumFreeCores)
        
        DataToSave = { #organise the data to be saved
        'MeanExcBits_CI': MeanExcludedBitsArray,
        'StdExcBits_CI': StdExcludedBitsArray,
        'StableExcBits_CM': Stable_ExcludedBitsArray,
        'NumAgents_Vec': N_a_vec,
        'AgentMemory_Vec': m_a_vec,
        'TaskSize': CurrTaskSize,
        'MemoryDistributionCondition': MemoryDistCondition,
        'NumGroups': N_g
        }

        FileName = 'RobustnessCheck_TaskSize' + str(CurrTaskSize) + '_CI_vs_CM_Sims_MemoryDistCondition__' + re.sub('.*_', '', MemoryDistCondition) +'.mat' #regexp to create filename; note that the data 
        #files will be saved in the current working directory
        savemat(FileName, DataToSave)

Starting sim for GreaterGroupMem_NoRedundRandom for TaskSize 5


100%|██████████| 56/56 [00:05<00:00, 10.44it/s]

Starting sim for GreaterGroupMem_RedundNoRepBits for TaskSize 5



100%|██████████| 56/56 [00:07<00:00,  7.77it/s]

Starting sim for GreaterGroupMem_NoRedundRandom for TaskSize 10



100%|██████████| 210/210 [00:46<00:00,  4.54it/s]

Starting sim for GreaterGroupMem_RedundNoRepBits for TaskSize 10



100%|██████████| 210/210 [00:58<00:00,  3.60it/s]

Starting sim for GreaterGroupMem_NoRedundRandom for TaskSize 20



100%|██████████| 240/240 [03:56<00:00,  1.02it/s]


Starting sim for GreaterGroupMem_RedundNoRepBits for TaskSize 20


100%|██████████| 240/240 [04:07<00:00,  1.03s/it]


Starting sim for GreaterGroupMem_NoRedundRandom for TaskSize 40


100%|██████████| 342/342 [36:26<00:00,  6.39s/it]


Starting sim for GreaterGroupMem_RedundNoRepBits for TaskSize 40


100%|██████████| 342/342 [35:16<00:00,  6.19s/it]


Starting sim for GreaterGroupMem_NoRedundRandom for TaskSize 80


100%|██████████| 380/380 [3:58:44<00:00, 37.70s/it]   


Starting sim for GreaterGroupMem_RedundNoRepBits for TaskSize 80


100%|██████████| 380/380 [3:45:06<00:00, 35.54s/it]   
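
The progress-bar totals above (e.g. 56 for TaskSize 5 and 210 for TaskSize 10) appear consistent with a sweep over every (N_a, m_a) pair that survives the duplicate-ratio filter, with one extra m_a value. The sketch below reproduces those counts with a vectorised alternative to `FlagDuplicateRatios`. The helper name `keep_last_unique_ratio` is hypothetical, and the assumption that the sweep covers the full N_a by m_a grid is inferred from the totals rather than from the function file.

```python
import numpy as np

def keep_last_unique_ratio(task_size, values):
    """Hypothetical helper: keep only the last value for each rounded T/value ratio."""
    ratios = np.round(task_size / values, 1)
    # np.unique returns the FIRST index of each unique entry, so search the
    # reversed array to obtain the LAST occurrence in the original order
    _, rev_idx = np.unique(ratios[::-1], return_index=True)
    keep = np.sort(len(values) - 1 - rev_idx)
    return values[keep]

for task_size, step in zip([5, 10, 20, 40, 80], [1, 1, 2, 3, 6]):
    na_vec = keep_last_unique_ratio(task_size, np.arange(2, 2 * task_size, step))
    ma_vec = np.append(na_vec, na_vec.max() + 1)  # one extra m_a value, as in the cell above
    print(f"TaskSize {task_size}: {len(na_vec)} x {len(ma_vec)} = {len(na_vec) * len(ma_vec)} grid points")
```

Unlike the in-place loop, this version leaves the input array untouched and returns the retained values directly.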


In [None]:
# """
# Demo of how results change with TaskSize. Note that for task size, we want to test for the same T/Na and T/ma ratios as opposed to the raw Na and ma values corresponding to the 
# selected points (see relevant discussion in markdown text at the beginning of this file).
#
# I decided to not do this because this takes too long without parallelisation and I don't want to adapt this to parallelisation when the point we are trying to make is far more 
# efficiently conveyed by the TaskSize robustness checks above
# """
 
# TaskSize_vec = [20, 80, 100] #list with task sizes to test
# DataList_TaskSzTest = [] #initialise data list to store output
# NumGroups = 5000 #Cuz we know from the previous robustness test for N_g that 5000 is a good enough number

# #ERROR CHECK
# if np.size(np.unique(DataTab_ForChks['NumGroups'])) != 1:
#     raise ValueError("There should only be one Num_groups value for the core simulations.")

# #Loop through each set of T/ma and T/na values in the data table, with each set of (T/ma, T/Na) representing a selected point for robustness checks. Because we are focusing on the 
# #actual ratios, we may end up rounding some Na and ma values to the nearest integers depending on the task size
# for Ind in tqdm(range(np.size(DataTab_ForChks['Rounded_ma']))):

#     #Get agent memory value and number of agents for the selected point
#     CoreSim_T_To_AgentMem = DataTab_ForChks['T_by_ma_Xax'][Ind]; CoreSim_T_To_NumAgent = DataTab_ForChks['T_by_Na_Yax'][Ind] 
#     CurrBitAllocProtocol = DataTab_ForChks['BitAllocationProtocol'][Ind] #get corresponding memory dist condition and convert to required format to pass to sim function
#     MemoryDistCondition = 'GreaterGroupMem_' + CurrBitAllocProtocol 
    
#     for Curr_T in TaskSize_vec: #Loop through the list of task sizes

#         CurrAgentMem = round(Curr_T/CoreSim_T_To_AgentMem); CurrNumAgent = round(Curr_T/CoreSim_T_To_NumAgent) #get Na and ma values to preserve the ratios T/Na and T/ma as much as possible
        
#         #only run test if the ratios are indeed preserved (i.e., within rounding error)
#         Curr_T_by_ma = Curr_T/CurrAgentMem; Curr_T_by_Na = Curr_T/CurrNumAgent #get the T_to_ma and T_to_Na ratios for the current task size and computed Na and ma values
#         if abs(Curr_T_by_ma - CoreSim_T_To_AgentMem) < 0.1 and abs(Curr_T_by_Na - CoreSim_T_To_NumAgent) < 0.1: #check if ratios are preserved re: current values and core sim values

#             #run simulation
#             MeanExcludedBits, StdExcludedBits, Stable_ExcludedBits =\
#                 CIvsCM.GetMeanAndStdExcludedBits(Curr_T, CurrNumAgent, CurrAgentMem, DataTab_ForChks['NumGroups'][0], MemoryDistCondition)
            
#             #Get line with results to write to output (which we will convert to a dataframe)
#             CurrResultsToWrite = {
#             'ItemId': Ind+1, #to uniquely identify results for various TaskSize values for a given selected point being tested for robustness
#             'BitAllocationProtocol': CurrBitAllocProtocol,
#             'Na': CurrNumAgent,
#             'ma': CurrAgentMem,
#             'Curr_T_by_Na': Curr_T_by_Na,
#             'Curr_T_by_ma': Curr_T_by_ma,
#             'T_by_Na_CoreSim': CoreSim_T_To_NumAgent,
#             'T_by_ma_CoreSim': CoreSim_T_To_AgentMem,
#             'NumGroups': DataTab_ForChks['NumGroups'][0],
#             'TaskSize': Curr_T,
#             'CM_NumExcBits': Stable_ExcludedBits,
#             'CI_MeanNumExcBits': MeanExcludedBits,
#             'CI_StdNumExcBits':StdExcludedBits
#             }
            
#             DataList_TaskSzTest.append(CurrResultsToWrite) #append current results

# OpDf_TaskSzTest= pd.DataFrame(DataList_TaskSzTest) #convert to df
# OpDf_TaskSzTest.to_csv('/Users/ritwikavps/Desktop/GoogleDriveFiles/research/CollectiveMemoryvsCollectiveIntelligence/A1_MemoryModel/MemoryModelSimResults/RobustnessChecks/' \
#                                 'CIvsCM_NonRobustDemo_TaskSize_T_ma__T_Na_ratios preserved.csv',index=False) #write to .csv file    