# Overview

We consider Chapter $5$ of the thesis. In particular we implement the Algorithm of Section $5.1$ and generate $\tilde{P}_{n}$ for $3 \le n\le 6$  from $\tilde{P}_{n-1}$. We then give a decomposition of $p_n:=|\tilde{P}_n|$ in terms of $|\tilde{P}_{n-1}|$.

Recall that,

$$\tilde{P}_n=\bigsqcup_{g \in \tilde{P}_{n-1}} \rho^{-1}(g)= \bigsqcup_{g \in \tilde{P}_{n-1}}\{g+\epsilon_{I} \mid I \in \mathbb{E}(g)\}.$$

The **procedure** to recursively obtain $\tilde{P}_{n}$ from $\tilde{P}_{n-1}$ is as follows:
 
 1) for each $g \in \tilde{P}_{n-1}$ obtain the set $\mathbb{E}(g)$ (using an implementation of the Algorithm of Section $5.1$),
     
 2) for each set $\mathbb{E}(g)$ obtain the set $\{g+\epsilon_{I} | I \in \mathbb{E}(g)\}$,
 
 3) take the union as above to obtain $\tilde{P}_{n}$.

To obtain a decomposition of $|\tilde{P}_{n}|$, for each $g \in \tilde{P}_{n-1}$ we record $|\mathbb{E}(g)|$, which allows one to obtain the sequences $\boldsymbol{e}_{n}$ and $\boldsymbol{d}_{n}$. The sequences then give $p_{n+1}=\boldsymbol{e}_{n}\cdot \boldsymbol{d}_{n}$.
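
The bookkeeping described above can be sketched as follows. This is a minimal illustration, not the notebook's own code: the list `sizes` is made-up data, the helper `decompose` is hypothetical, and reading $\boldsymbol{e}$ as the distinct values of $|\mathbb{E}(g)|$ and $\boldsymbol{d}$ as their multiplicities is an assumption about the thesis's convention.

```python
from collections import Counter

def decompose(sizes):
    """Group recorded sizes |E(g)| into distinct values and their multiplicities."""
    counts = Counter(sizes)
    e = sorted(counts)              # distinct values of |E(g)|
    d = [counts[v] for v in e]      # number of functions g sharing each value
    return e, d

# Hypothetical sizes, for illustration only (not data from the thesis).
sizes = [4, 6, 6, 4, 9]
e, d = decompose(sizes)

# The dot product e . d recovers the total number of extensions.
total = sum(ev * dv for ev, dv in zip(e, d))
print(e, d, total)
```

By construction the dot product equals the plain sum of the recorded sizes, which gives a cheap consistency check on any stated decomposition.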

## Results

1) We obtain the sequence of $p_n$ for $1 \le n\le 6$.

  $$1,2,10,154,10334,5399325.$$

We also have that,

$$p_6=5399325=114*125+121*65+126*237+138*112+158*248+162*247+165*132+169*117+173*132+178*122+182*106+183*132+187*24+192*134+194*123+208*144+209*22+216*234+217*126+218*61+221*125+227*22+229*307+230*250+232*120+233*119+236*122+241*30+242*102+247*61+248*53+250*114+251*59+260*184+268*106+269*64+271*186+273*118+274*56+278*52+279*41+281*136+284*112+289*67+293*133+294*37+300*133+305*67+313*123+324*123+325*132+326*101+329*181+338*120+343*59+345*113+356*19+381*121+390*44+397*107+419*20+420*46+421*23+422*39+429*64+433*35+436*128+437*23+455*64+477*53+496*48+502*16+509*38+518*12+520*107+524*20+531*36+538*76+539*62+553*36+555*66+566*144+578*61+605*62+607*66+613*11+640*28+659*30+759*48+765*21+781*108+802*119+814*42+881*42+938*8+974*135+977*45+983*135+1002*45+1026*126+1029*63+1033*63+1068*96+1121*108+1145*22+1178*39+1189*39+1209*44+1212*11+1224*66+1257*44+1312*11+1322*32+1377*36+1409*16+1421*10+1450*54+1511*20+1666*21+1753*24+2110*42+2193*95+2285*4+2290*54+3544*12+3673*42+3820*48+3983*18+6280*1+6474*6+6702*14+6960*16+7250*9+7580*4$$


The decompositions of the remaining $p_{n}$ are stated in Section $5.1$.

2) We provide a method to obtain $\mathbb{E}(g)$ for any $g \in \tilde{P}_{n}$.

3) We provide evidence for the Conjecture defined in Section 5.2 by showing that such functions attain $U_n$ for low $n$. 

## Notes

In section "$\tilde{P}_{6}$ and $|\tilde{P}_{6}|$ decomposition" we break the computation into parts due to its computational time.

# Table of contents

1. [Functions](#s1)
    1. [Main Functions](#s11)
    2. [Secondary Functions](#s12)
2. [Example](#s2)
3. [$\tilde{P}_{n}$ and $|\tilde{P}_{n}|$ decomposition](#s3)
    1. [$\tilde{P}_{3}$ and $|\tilde{P}_{3}|$ decomposition](#s31)
    2. [$\tilde{P}_{4}$ and $|\tilde{P}_{4}|$ decomposition](#s32)
    3. [$\tilde{P}_{5}$ and $|\tilde{P}_{5}|$ decomposition](#s33)
    4. [$\tilde{P}_{6}$ and $|\tilde{P}_{6}|$ decomposition](#s34)
4. [Upper bound for $|\tilde{P}_{n}|$](#s4)
    1. [$U_n$](#s41)

# Functions <a name="s1"></a>

## Used throughout

In [4]:
import itertools
import pandas as pd
import numpy as np
import pickle
import time
# import ast
# import inspect

## Main Functions <a name="s11"></a>

### Constructing $\tilde{P}_{n}$ from $\tilde{P}_{n-1}$

The function get\_Pnp1 determines $\tilde{P}_{n}$ following the **procedure**.

In [5]:
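# This function returns Pn+1
def get_Pnp1(Pn,n):
    """
    Objective: Construct P_{n+1} from P_n by extending every function in P_n.
    Input:
    - Pn (pd.DataFrame): one row per function in P_n, one column per non-empty subset of {1,...,n}.
    - n (int): size of the ground set of the functions in Pn.
    Returns:
    - Pnp1 (pd.DataFrame): one row per function in P_{n+1}.
    """

    # Non-empty subsets of {1,...,n} and of {1,...,n+1}, as strings.
    power_set_str,power_set_str_np1=get_n_np1_powersets(n)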

    #Load Pn as a list of dictionaries.
    list_df_Pn=Pn.to_dict(orient='records')

    #Create Pn+1 and hold as dataframe.
    P_n1=Build_Pn_1(list_df_Pn,power_set_str,n)
    Pnp1=extend_sets_to_Pnp1(P_n1)

    Pnp1 = Pnp1.reindex(columns=power_set_str_np1)

    inspect_Pn1(Pnp1)

    return Pnp1

The main subfunction is Build\_Pn\_1; extend\_sets\_to\_Pnp1 then puts the data obtained from Build\_Pn\_1 into a dataframe.

In [6]:
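def Build_Pn_1(list_df_Pn,power_set_str,n):
    """
    Objective: For each function in P_n, build all of its extensions to P_{n+1}.
    Input:
    - list_df_Pn (list of dict): the functions in P_n.
    - power_set_str (list of str): non-empty subsets of {1,...,n} as strings.
    - n (int): size of the ground set.
    Returns:
    - P_n1 (list of list of dict): one packet of extensions per function in P_n;
      can be used to check those with max size of closed sets (later).
    """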
    
    P_n1=[]
    for f_dict in list_df_Pn:

        # Store primitive closed sets 

        PC=[(x,get_barj({x},f_dict)) for x in power_set_str]
        dict_PC=dict(PC)
        set_PC=[sorted_frozenset(get_barj({x},f_dict)) for x in power_set_str]

        #Get all closed sets for extensions for f_dict
        Eg=list(get_Eg(dict_PC,power_set_str,set_PC))
        Eg.append("empty")

        #Build extensions for f_dict
        extension_of_f_dict=[]
        for term in Eg:
            if term =="empty":
                extension_of_f_dict.append(empty_case_get_extension_of_f(f_dict,term,power_set_str,n))
            else:
                extension_of_f_dict.append(get_extension_of_f(f_dict,term,power_set_str,n))

        #We record packets of extensions where we take +1
        P_n1.append(extension_of_f_dict)
    
    return P_n1

In [7]:
def extend_sets_to_Pnp1(P_n1):
    """
    Objective: Flatten the packets of extensions into a single dataframe.
    Input:
    - P_n1 (list of list of dict): packets returned by Build_Pn_1.
    Returns:
    - df (pd.DataFrame): the dataframe of Pn+1, one row per function.
    """
    final_P_n1=[]
    for ls in P_n1:
        final_P_n1.extend(ls) # Adding all the extension packets together.
    df=pd.DataFrame(final_P_n1)
    
    return df

For each $g \in \tilde{P}_{n-1}$, the function Build\_Pn\_1 obtains the set of $f=g+\epsilon_{I}$ for $I \in \mathbb{E}(g)$ and then takes the union of these sets. This is achieved by get\_extension\_of\_f (and empty\_case\_get\_extension\_of\_f for $I=\emptyset \in \mathbb{E}(g)$).

In [8]:
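def get_extension_of_f(f_dict,extender,power_set_str,n):
    """
    Objective: Build the extension f = g + epsilon_I of g = f_dict to the subsets of {1,...,n+1}.
    Input:
    - f_dict (dict): the function g, keyed by non-empty subsets of {1,...,n} as strings.
    - extender (frozenset of str): the closed set I in E(g).
    - power_set_str (list of str): non-empty subsets of {1,...,n} as strings.
    - n (int): size of the ground set of g.
    Returns:
    - extension_f_dict (dict): the extended function.
    """
    # Need to attach n+1 to strings to build extension of function by epsilon.

    # Build the extension of f by epsilon.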
    
    # Add the (key,values) of f_dict
    builder_ext_0=[(k, f_dict[k]) for k in f_dict]

    #We add 1 to the function value of f_dict for x in extender
    builder_ext_p1=[(add_n_p_1_to_string(x,n),f_dict[x]+1) for x in list(extender)]

    #We add 0 to the function value of f_dict for x NOT in extender
    compl_extender=set(power_set_str) - set(extender)
    builder_ext_p2=[(add_n_p_1_to_string(x,n),f_dict[x]+0) for x in list(compl_extender)]
    
    # We include f({n+1})=0
    builder_ext_np1=[(add_n_p_1_to_string("",n),int(0)),]


    #Compile together.
    extension_f_dict={**dict(builder_ext_0),**dict(builder_ext_p1),**dict(builder_ext_p2),**dict(builder_ext_np1)}
    
    #Example
    # get_extension_of_f(frozenset({'13', '23'}))

    return extension_f_dict

In [9]:
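def empty_case_get_extension_of_f(f_dict,extender,power_set_str,n):
    """
    Objective: Build the extension f = g + epsilon_I of g = f_dict for the case I = empty set.
    Input:
    - f_dict (dict): the function g, keyed by non-empty subsets of {1,...,n} as strings.
    - extender: marker for the empty set (not used in the computation).
    - power_set_str (list of str): non-empty subsets of {1,...,n} as strings.
    - n (int): size of the ground set of g.
    Returns:
    - extension_f_dict (dict): the extended function.
    """
    # Need to attach n+1 to strings to build extension of function by epsilon.

    # Build the extension of f by epsilon.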
    
    # Add the (key,values) of f_dict
    builder_ext_0=[(k, f_dict[k]) for k in f_dict]

    builder_ext_p2=[(add_n_p_1_to_string(x,n),f_dict[x]+0) for x in list(power_set_str)]

    builder_ext_np1=[(add_n_p_1_to_string("",n),int(0)),]

    
    #Compile together.
    extension_f_dict={**dict(builder_ext_0),**dict(builder_ext_p2),**dict(builder_ext_np1)}
    
    return extension_f_dict

To obtain $\mathbb{E}(g)$ we use the function get\_Eg, which is the implementation of the Algorithm of Section $5.1$.

**Algorithm:** Obtain $\mathbb{E}(g)$

**Input:** $g \in \tilde{P}_{n}$

**Output:** $\mathbb{E}(g)$

1. **Initialization:** Set $\mathbb{E}(g) = \{\emptyset\}$
2. Define a recursive function: Recursion($\overline{I}, y$)
    1. Let $\overline{J} := \overline{I} \cup \overline{\{y\}}$
    2. Add $\overline{J}$ to $\mathbb{E}(g)$
    3. For each $w \in \mathcal{P}_{n}^{+} \setminus \overline{J}$, do the following:
        1. Call Recursion($\overline{J}, w$)
3. **End Function**
4. For each $x \in \mathcal{P}_{n}^{+}$, do the following:
    1. Let $I := \{x\}$
    2. Add $\overline{I}$ to $\mathbb{E}(g)$
    3. For each $y \in \mathcal{P}_{n}^{+} \setminus \overline{I}$, do the following:
        1. Call Recursion($\overline{I}, y$)

To enhance efficiency and reduce the computation time of step 4.3 (so that we do not need to iterate through the powerset completely each time) we have introduced:

- **Memory:** We utilize the "Memory" variable as a memoization technique to store previous results, thus preventing redundant calculations.

These stored results are referred to as "indicators." Here's how it works:

- Before initiating a new recursion with a specific set of indicators (which track the current state), we check if these indicators are already present in Memory. If they are, we skip the associated calculations, saving processing time.
- If a set of indicators is not found in Memory, it signifies that our algorithm hasn't encountered this specific combination previously. Consequently, we proceed with the recursion and store this set of indicators in "Memory."
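
The idea of exploring unions while skipping states that were already seen can be sketched on a toy input. The block below is an illustrative variant rather than the notebook's get\_Eg: it uses an explicit stack instead of recursion, and it keys the "already seen" check on the resulting sets themselves rather than on indicator tuples. The helper name `all_unions` and the toy primitive sets are hypothetical.

```python
def all_unions(primitive_sets):
    """Return every set obtainable as a union of one or more primitive sets."""
    primitives = [frozenset(p) for p in primitive_sets]
    found = set()
    stack = list(primitives)
    while stack:
        current = stack.pop()
        if current in found:
            # Already explored: everything reachable from here is known.
            continue
        found.add(current)
        for p in primitives:
            merged = current | p
            if merged not in found:
                stack.append(merged)
    return found

# Toy primitive closed sets, for illustration only.
toy = [{"1"}, {"2"}, {"1", "2", "12"}]
closed = all_unions(toy)
print(sorted(sorted(s) for s in closed))
```

Because a union depends only on which primitive sets were combined, not on the order in which they were added, an order-independent key is enough to avoid repeated work.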

In [10]:
def get_Eg(dict_PC,power_set_str,set_PC):
    """
    Objective: Compute the set of all closed sets, Eg, using a given dictionary of primitive closed sets (dict_PC),
               the power set of elements (power_set_str), and a set of primitive closed sets (set_PC).
               
               We do this by building up Eg using Recur()
               
               We deal with the empty set separately.

    Input:
    - dict_PC (dict): A dictionary where keys are elements of the power set (as strings) 
    and values are primitive closed sets.
    - power_set_str (list or set): The power set of elements represented as a list or set.
    - set_PC (set): A set of primitive closed sets.

    Returns:
    - Eg (set): The set of all closed sets.
    """
    
    #We start with what we know: the full set and the primitive closed sets.
    Eg={frozenset(power_set_str)}.union(set_PC) 
    

    Memory={sorted_frozenset(("0"))}    # placeholder entry; Memory holds frozensets of indicators (elements x of the power set).
    
    for x in power_set_str: #(Step 4 of Algorithm)
        # print(f"Term in powerset: {x}")
        Ibar=dict_PC[x] #primitive closed set #(Step 4.1 of Algorithm)
        power_complment=set(power_set_str)-Ibar # \mathcal{P}_{n}^{+} \setminus \overline{I}
        for y in power_complment: #exhaustively getting all closed sets. #(Step 4.3 of Algorithm)
            
            # print("Eg size:",len(Eg))
            #We introduce indicators which records that we are taking y in the complement of the closure of x
            indicators=(x,y)
            set_indicators=set([sorted_frozenset(tuple(indicators)),])
            
            Memory=Memory.union(set_indicators)            
            Eg,Memory=Recur(Ibar,indicators,Eg,Memory,dict_PC)
            
    return Eg

The following function is the recursive part of the above algorithm, in particular from step 4.3 onwards. We separate the recursive part from the outer for loop for simplicity, as the primitive closed sets are easy to obtain (seen later), as are $\mathcal{P}_{n}^{+}$ and $\emptyset$ (which are always in $\mathbb{E}(g)$).

In [11]:
def Recur(Ibar,indicators,Eg,Memory,dict_PC):
    """
    Objective: Recursively generate Eg while preventing repeated calculations using Memory.

    Input:
    - Ibar (set): The closure of I for some subset I of P_n^+
    
    - indicators (tuple): A tuple of elements of P_n^+.
    Indicators store the recursive path we have taken,
    e.g. when we take y in the complement of the closure of x.
    So for the algorithm stated above, example indicators would be (x,y) and (x,y,w).
    We write indicators in a standard form to help the memoization process.

    - Eg (set): Begins as the set of primitive closed sets and ends as the full set Eg

    - Memory (set): A set used for memoization to prevent repeated calculations, storing the indicators seen so far
    
    - dict_PC (dict): A dictionary mapping each element of P_n^+ to its primitive closed set.

    Returns:
    - Eg (set): The updated set of generated closed sets.
    - Memory (set): The updated set of indicators used for memoization.
    """
    # To save recalculating we use this series of checks; if a check fails, this branch of the recursion ends.
    
    # NOTE: Memory holds frozensets whereas indicators is a tuple, so as written this
    # membership test never succeeds; repeated work is in practice pruned by the
    # "Jbar in Eg" check below.
    if indicators in Memory:# Memoization for preventing repeated calculations.
        # print("Fails Memory check")
        # print(f"\n At this failure we have \n Eg: {len(Eg)} \n Memory: {len(Memory)} \n")
        return Eg,Memory
    
    #Steps 2.1 and 2.2 of the algorithm
    
    y=indicators[-1]
    # print(f"term taken for primitive closed set we are unioning {y}")
    P_y=dict_PC[y] # the primitive closed set wrt y
    Jbar= Ibar.union(P_y)
    
    # End if we already have Jbar
    
    if len(Jbar)==len(dict_PC): # Jbar is the whole power set, which is always in Eg (dict_PC has one key per element)
        # print(f"Fails as {Jbar} == powerset")
        # print(f"\n At this failure we have \n Eg: {len(Eg)} \n Memory: {len(Memory)} \n")
        return Eg,Memory
              
    if Jbar in Eg:
        # print(f"Fails as {Jbar} in Eg")
        # print(f"\n At this failure we have \n Eg: {len(Eg)} \n Memory: {len(Memory)} \n")
        return Eg,Memory
    
    #Repeat recursion step
    else:
        # print("Passes Checks \n")
        #Add the new data to memory
        
        set_indicators=set([sorted_frozenset(tuple(indicators)),])
        Memory=Memory.union(set_indicators)
        
        Jbar_union=set([sorted_frozenset(Jbar),])#Correct format for union
        #Add new data to Eg
        Eg=Eg.union(Jbar_union)

        #Move on to the next recursion.
        power_complment=set(dict_PC)-Jbar # the keys of dict_PC are exactly the elements of the power set
        for w in power_complment: # pick an element of the complement to continue the exhaustive method
            new_indicators=indicators+(w,) # append it at the end
            Eg,Memory=Recur(Jbar,new_indicators,Eg,Memory,dict_PC)
            # print(f"Done with {new_indicators} \n")
                
    return Eg,Memory

It remains to show how we take the closure of any $I \subseteq \mathcal{P}_{n}^{+}$ with respect to $g$. Recall Definition $5.1.4$: the closure of $I \subseteq \mathcal{P}_{n}^{+}$ with respect to $g$, denoted $\overline{I}$, is the largest set in the following sequence of subsets of $\mathcal{P}_{n}^+$,

$$I=I_0 \subseteq  \cdots \subseteq I_k $$

where $I_{i}:=B_{i-1}\cup I_{i-1}$ (we call this the $i$-partial closure of $I$) such that 

$$B_{i-1}= \left\{y \in \mathcal{P}_{n}^{+} \mid \exists\, x \in I_{i-1} \text{ such that } y \supsetneq x \text{ is $g$-minimal or } y \subsetneq x \text{ is $g$-maximal} \right\}.$$


The next function takes $I:=I_0 \subseteq \mathcal{P}_{n}^{+}$ and returns $I_{1}$ as defined above.


In [12]:
def next_j(ji,f_dict):
    """
    Objective: Get the (i+1)-partial closure of I from the i-partial closure ji, with respect to f_dict.
    
    Iterates through the power set and through ji, using fmin and fmax to decide which elements are added to form the new set jip1.
    
    Input:
    ji: the i-partial closure of I (the ith set in the construction of the closure).
    f_dict: the function, usually denoted g.
    
    Returns: jip1, the (i+1)-partial closure (not necessarily the full closure yet).
    """
    # NOTE: power_set_str is read from the enclosing (global) scope.
    # print("ji",ji)
    
    my_list=[]    
    
    
    for x in power_set_str:
        for y in ji:
            if fmin(y,x,f_dict) and string_to_set(y).issubset(string_to_set(x)):
                # print("y is",y)
                my_list.append(x)
                # print(f"{x} containing {y} f min ")
    for x in power_set_str:
        for y in ji:
            if fmax(x,y,f_dict) and string_to_set(x).issubset(string_to_set(y)):
                my_list.append(x)
                
    
    jip1=set(my_list).union(ji)
    
    return jip1    

The following function allows us to go from $I_i$ to $I_{i+1}$, until we end at the closure of $I$ wrt $g$.

In [13]:
def rec_j(T,S,f_dict):
    """
    Objective: Iterate partial closures until the set stops growing.
    Input:
    T: the current partial closure, T = I_i (a set of strings)
    S: the previous partial closure, S = I_{i-1}, so |T| >= |S|
    f_dict: the function, usually denoted g
    
    Returns: the closure of the starting set I_0.
    """
    # New elements gained in the last step, T minus S.
    C=set(T)-set(S)
    # print("C",C,f"S term {S}",f"T term {T}")
    
    # If T = S then T is closed and we are done.
    if len(C)==0:
        return T
    
    # Otherwise take the next partial closure of T (Definition 5.1.4) and repeat.
    U=next_j(T,f_dict)
    return rec_j(U,T,f_dict)

Finally the last function combines next\_j and rec\_j to give the closure of $I$ with respect to $g$.

In [14]:
def get_barj(J,f_dict):
    """
    Objective: Gets the closure of a subset J of P_n^+ with respect to f in P_n.
    Input:
    J: a subset of P_n^+, given as a set of strings
    f_dict: a function in P_n, given as a dictionary
    
    Returns: the closure of J
    """
    
    j0=J
    j1=next_j(j0,f_dict)
    T=rec_j(j1,j0,f_dict)
    return T

Note that we can use get\_barj to immediately obtain all primitive closed sets with respect to $g$. Remark: the inputs dict\_PC and set\_PC for get\_Eg are, respectively, a dictionary and a collection of the primitive closed sets of a function $g \in\tilde{P}_{n-1}$, both obtained by get\_barj. Note that we obtain the primitive closed sets, $\mathcal{P}_{n}^{+}$ and $\emptyset$ separately.
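
The closure computed by next\_j, rec\_j and get\_barj is a fixed-point iteration: a one-step rule is applied repeatedly until the set stops growing. A minimal sketch of that pattern is given below. The names `closure` and `toy_step` are hypothetical, and the toy rule merely stands in for the $g$-minimal and $g$-maximal conditions of Definition $5.1.4$.

```python
def closure(start, step):
    """Apply `step` repeatedly until the set stops growing."""
    current = set(start)
    while True:
        nxt = current | step(current)
        if nxt == current:
            return current
        current = nxt

# Hypothetical one-step rule, for illustration only:
# add x + 1 for every x already present, up to a cap of 5.
def toy_step(s):
    return {x + 1 for x in s if x < 5}

print(closure({2}, toy_step))
```

Termination is guaranteed whenever the rule only ever adds elements of a fixed finite set, as is the case for subsets of $\mathcal{P}_{n}^{+}$.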

Lastly, we need a method to determine, for $A,B \in \mathcal{P}_{n}^{+}$, whether $A \subseteq B$ is $g$-minimal or $g$-maximal.

In [15]:
def fmin(A,B,f_dict):
    """
    Objective: Determine if A is f-minimal with respect to B using a given dictionary of values.

    Input:
        - A: A string representing a set from the power set P_n.
        - B: A string representing another set from the power set P_n where A is a subset of B.
        - f_dict: A dictionary mapping set strings to corresponding values.

    Returns:
        - True if A is f-minimal with respect to B, False otherwise.
    """
    
    #A included in B
    # global f_dict
    
    valA=f_dict[A]
    valB=f_dict[B]
    compBA=set_to_string(string_to_set(B)-string_to_set(A)) # B minus A
    if len(compBA)==0: # avoid the case where the complement is the empty set (A equals B)
        return False
    valBA=f_dict[compBA]
    return valB==valA+valBA
    
def fmax(A,B,f_dict):
    """
    Objective: Determine if A is f-maximal with respect to B using a given dictionary of values.

    Input:
        - A: A string representing a set from the power set P_n.
        - B: A string representing another set from the power set P_n where A is a subset of B.
        - f_dict: A dictionary mapping set strings to corresponding values.

    Returns:
        - True if A is f-maximal with respect to B, False otherwise.
    """
    #A included in B
    # global f_dict
    
    valA=f_dict[A]
    valB=f_dict[B]
    compBA=set_to_string(string_to_set(B)-string_to_set(A)) # B minus A
    
    #A has to be a proper subset of B first
    if len(compBA)==0: # avoid the case where the complement is the empty set (A equals B)
        return False
    valBA=f_dict[compBA]
    return valB==valA+valBA+1
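
As a worked check of the two conditions tested by fmin and fmax, the sketch below classifies a proper inclusion $A \subsetneq B$ by the gap $g(B)-g(A)-g(B\setminus A)$: a gap of $0$ corresponds to the fmin test and a gap of $1$ to the fmax test. The helper `classify` and the function `g` are hypothetical and use frozenset keys instead of the notebook's string keys.

```python
def classify(g, A, B):
    """Classify a proper inclusion A < B by the gap g(B) - g(A) - g(B minus A)."""
    A, B = frozenset(A), frozenset(B)
    if not A < B:
        raise ValueError("A must be a proper subset of B")
    gap = g[B] - g[A] - g[B - A]
    if gap == 0:
        return "g-minimal"
    if gap == 1:
        return "g-maximal"
    return "neither"

# Hypothetical function on the non-empty subsets of {1, 2, 3}.
g = {
    frozenset({1}): 0, frozenset({2}): 0, frozenset({3}): 0,
    frozenset({1, 2}): 1, frozenset({1, 3}): 0, frozenset({2, 3}): 0,
    frozenset({1, 2, 3}): 1,
}

print(classify(g, {1}, {1, 2}))        # gap 1 - 0 - 0 = 1
print(classify(g, {1, 2}, {1, 2, 3}))  # gap 1 - 1 - 0 = 0
```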

### To determine $|\tilde{P}_{n}|$

As stated in the thesis, to decompose $p_n$ we give the sequences $\boldsymbol{e}_{n}$ and $\boldsymbol{d}_{n}$ by recording, for each $g \in \tilde{P}_{n-1}$, the size $|\mathbb{E}(g)|$ when we construct $\tilde{P}_{n}$ from $\tilde{P}_{n-1}$. We then state $p_{n+1}=\boldsymbol{e}_n \cdot \boldsymbol{d}_n$.

The following function allows us to record the size of $\mathbb{E}(g)$ for each $g \in \tilde{P}_{n-1}$ by adding the key:value pair of the form 'Eg\_Size':$|\mathbb{E}(g)|$ to each $f=g+\epsilon_{I} \in \tilde{P}_{n}$. 

In [16]:
def get_Eg_column_Pn(df):
    df=df.copy()

    # Create a new column called 'Eg_Size' (power_set_str below is read from the global scope)
    df['Eg_Size'] = None

    # Insert data into the new column
    for i, row in df.iterrows():    
        Eg=get_Eg_for_single_g(df,i,power_set_str)
        df.at[i, 'Eg_Size'] = int(len(Eg))

    return df

## Secondary Functions <a name="s12"></a>

### General constructing functions

The following functions allow us to state $\mathcal{P}_{n}^{+}=\mathcal{P}_{n} \setminus \{\emptyset\}$ and $\mathcal{P}(\mathcal{P}_{n}^{+})$ (where $\mathcal{P}_{n}$ is the powerset of $\{1,\dots ,n\}$).

In [17]:
def get_powerset_str(n):
    """
    Objective: Generate the power set of integers from 1 to n and represent it as a list of sorted strings.
    
    Input:
        - n: An integer representing the upper bound of the set of integers (1 to n).
        
    Returns:
        - power_set_str: A list of sorted strings representing the power set of integers from 1 to n.
    """
    s=set(range(1,n+1))
    power_set = [set(x)  for r in range(len(s) + 1) for x in itertools.combinations(s, r)]
    power_set_str=[set_to_string(x) for x in power_set if len(x)>0]

    return power_set_str

def get_n_np1_powersets(n):
    """
    Objective: Generate the power sets of integers from 1 to n and 1 to (n+1), representing them as lists of sorted strings.
    
    Input:
        - n: An integer representing the upper bound of the sets of integers (1 to n and 1 to n+1).
        
    Returns:
        - power_set_str: A list of sorted strings representing the power set of integers from 1 to n.
        - power_set_str_np1: A list of sorted strings representing the power set of integers from 1 to n+1.
    """
    s=set(range(1,n+1))
    power_set = [set(x)  for r in range(len(s) + 1) for x in itertools.combinations(s, r)]
    power_set_str=[set_to_string(x) for x in power_set if len(x)>0]

    np1=n+1
    sp1=set(range(1,np1+1))
    # Subset sizes run from 0 to n+1; the empty set is filtered out below.
    power_set_np1 = [set(x)  for r in range(np1 + 1) for x in itertools.combinations(sp1, r)]
    power_set_str_np1=[set_to_string(x) for x in power_set_np1 if len(x)>0]

    return power_set_str,power_set_str_np1

As Python sets are mutable (and hence cannot be stored inside other sets), we use the following function to convert a set $I \subseteq\mathcal{P}_{n}^{+}$ into a standard, immutable form.

In [18]:
def sorted_frozenset(s):
    """
    Objective: Return the elements of s as a frozenset (immutable and hashable).
    
    Input:
        - s: A set (or other iterable) of elements.
        
    Returns:
        - r: A frozenset of the elements of s. Frozensets are unordered,
             so the intermediate sort does not affect the result.
    """
    t=sorted(s)
    r=frozenset(t)
    return r

The following functions allow us to store $g \in \tilde{P}_n$ as a dictionary (e.g. of the form {'1': 0, '2': 0, '12': 0}) in particular the keys as a string representation of $I \in \mathcal{P}_{n}^{+}$ and to change back to set notation when needed.

In [19]:
def set_to_string(s):
    """
    Objective: Convert a set of integers to a sorted string representation.
    
    Input:
        - s: A set of integers.
        
    Returns:
        - result: A string containing the sorted integers from the input set.
    """
    
    
    s = list(s)
    s.sort()
    # Convert the list of numbers to a list of strings
    my_list=s
    my_list = [str(i) for i in my_list]
    # Use the join() function to convert the list of strings to a single string
    result = ''.join(my_list)
    return result

def string_to_set(s):
    """
    Objective: Convert a string of digits to a set of integers.
    
    Input:
        - s: A string of digits.
        
    Returns:
        - my_set: A set containing the integers parsed from the input string.
    """
    my_list = list(s)
    # Convert the list of strings to a list of integers
    my_list = list(map(int,my_list))
    # Convert the list to a set
    my_set = set(my_list)
    return my_set
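
One consequence of this string encoding is worth illustrating: concatenating digits is only unambiguous while every element is a single digit, that is, for $n \le 9$. The sketch below uses compact restatements of the two helpers under the hypothetical names `to_key` and `from_key`.

```python
def to_key(s):
    """Encode a set of integers as a string of its sorted elements."""
    return "".join(str(i) for i in sorted(s))

def from_key(key):
    """Decode a string by reading every character as one integer."""
    return {int(ch) for ch in key}

# Round trip works while all elements are single digits.
print(from_key(to_key({1, 3})))

# With a two-digit element the encoding is no longer reversible.
print(to_key({2, 10}), from_key(to_key({2, 10})))
```

This is not a restriction for the range $3 \le n \le 6$ considered here.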

### For Build\_Pn\_1

Let $f=g+\epsilon_{I}$ for some $I \in \mathbb{E}(g)$; we then have $f(A \cup \{n+1\})= g(A)+\epsilon_{I}(A)$ for $A \in \mathcal{P}_{n-1}^{+}$. The following function allows us to obtain $A \cup \{n+1\} \in \mathcal{P}_{n}^{+}$.

In [20]:
def add_n_p_1_to_string(string,n):
    """
    Objective: Adds n+1 to the input string.

    Input:
    - string (str): The input string to which n+1 will be added.
    - n (int): The number to be added to the string.

    Returns:
    - np1string (str): A new string formed by concatenating the input string and (n+1).
    """
    #Adds n+1 to strings subsets
    np1string=string+(f"{n+1}")
    return np1string
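
To make the formula $f(A \cup \{n+1\})= g(A)+\epsilon_{I}(A)$ concrete, here is a small self-contained sketch of the extension step on string keys. The helper `extend` is a hypothetical condensed version of what get\_extension\_of\_f does, and the set `I` is simply chosen for illustration, without checking here that it belongs to $\mathbb{E}(g)$.

```python
def extend(g, I, n):
    """Build f on subsets of {1..n+1} from g on subsets of {1..n} and a set I."""
    new = str(n + 1)
    f = dict(g)          # f agrees with g on the old subsets
    f[new] = 0           # f({n+1}) = 0
    for key, value in g.items():
        # f(A u {n+1}) = g(A) + 1 if A is in I, and g(A) otherwise
        f[key + new] = value + (1 if key in I else 0)
    return f

g = {"1": 0, "2": 0, "12": 0}   # the zero function for n = 2
f = extend(g, {"12"}, 2)
print(f)
```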

### $\mathbb{E}(g)$ for a fixed $g$

The following function attains $\mathbb{E}(g)$ for $g$, where power\_set\_str denotes $\mathcal{P}_{n}^{+}$.

In [21]:
def get_Eg_for_single_g_dict(funct,power_set_str):
    #Obj: puts Eg into the correct format to check whether it is a topology
    #Inputs: funct = a function in Pn given as a dictionary, power_set_str = non-empty subsets as strings
    
    # Pick function in Pn
    g=funct

    #get primitive closed sets
    PC=[(x,get_barj({x},g)) for x in power_set_str]
    dict_PC=dict(PC)
    set_PC=[sorted_frozenset(get_barj({x},g)) for x in power_set_str]
    # print("len set_PC",len(set_PC))
    
    #Get all closed sets for extensions for g
    Eg=list(get_Eg(dict_PC,power_set_str,set_PC))
    Eg.append(frozenset()) #remember the empty set
    return Eg

In [22]:
def get_Eg_for_single_g(df,numb,power_set_str):

    #Obj: puts Eg into the correct format to check whether it is a topology
    #Inputs: df=Pn, numb used to get a specific function from df
    
    # Pick function in Pn
    g=df.iloc[numb].to_dict()

    #get primitive closed sets
    PC=[(x,get_barj({x},g)) for x in power_set_str]
    dict_PC=dict(PC)
    set_PC=[sorted_frozenset(get_barj({x},g)) for x in power_set_str]

    #Get all closed sets for extensions for g
    Eg=list(get_Eg(dict_PC,power_set_str,set_PC))
    Eg.append(frozenset()) #remember the empty set
    return Eg

### Primitive closed sets of $\mathbb{E}(g)$ for a fixed $g$

Fix $g \in \tilde{P}_{n}$. We now obtain all the primitive closed sets of $\mathbb{E}(g)$.

In [23]:
# g is a function in Pn (as a dictionary) for some n
def get_primitive_for_single_g(g,power_set_str):
    """
    Objective: Obtain the primitive closed sets of g.
    Input:
    g: a function in Pn, given as a dictionary
    power_set_str: non-empty subsets of {1,...,n} as strings
    Returns: 
    
    dict_PC: dictionary with keys the terms of powerset^{+} and values the associated primitive closed sets.
    set_PC: list of the primitive closed sets (as frozensets)
    """

    #get primitive closed sets
    PC=[(x,get_barj({x},g)) for x in power_set_str]
    dict_PC=dict(PC)
    set_PC=[sorted_frozenset(get_barj({x},g)) for x in power_set_str]

    return dict_PC,set_PC

### Analysis of $\tilde{P}_{n}$<a name="s114"></a>

The following function determines whether all constructed functions are distinct.

In [24]:
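def inspect_Pn1(df):
    """
    Objective: Print the number of functions in df and whether any rows are duplicated.
    Input: df (pd.DataFrame), one row per function.
    Returns: None.
    """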
    print("Number of functions:",len(df.index))
    # df=df.unique()

    duplicate_rows = df[df.duplicated()]
    print("Are there duplicates?",duplicate_rows.shape, "if (0,-) then no common rows")

    # df.head()
    return

The following functions allow one to check whether a function is mildly super additive.

In [25]:
def is_valid_function(f,power_set_str):
    """
    Objective: Check if a given function satisfies the properties of a MSA condition
    
    Input:
    f (dict): A dictionary representing the function, where keys are sets in power_set_str and values are integers.
    power_set_str (list of str): A list of string representations of sets.
    
    Returns:
    bool: True if the function satisfies the MSA properties, False otherwise.
    """
    # define the condition to check for each function
    Pn=[string_to_set(x) for x in power_set_str] #convert to sets
    
    # f(T) must equal 0 for elements T in Pn with size 1
    if any(f[set_to_string(T)] != 0 for T in Pn if len(T) == 1):
        return False

    # for any I, J in Pn with no common elements we need 0 <= f(I U J) - f(I) - f(J) <= 1
    for I in Pn:
        for J in Pn:
            if len(I.intersection(J)) == 0:
                diff = f[set_to_string(I.union(J))] - f[set_to_string(I)] - f[set_to_string(J)]
                if diff < 0 or diff > 1:
                    return False
    return True

def check_functs_MSA(df,power_set_str):
    """
    Objective: Check a DataFrame of functions for MSA compliance.
    
    Input:
    df (pd.DataFrame): A DataFrame where each row represents a function as a dictionary.
    power_set_str (list of str): A list of string representations of sets.
    
    Returns:
    pd.DataFrame: A DataFrame containing the functions that do not satisfy MSA properties.
    """

    num_rows=df.shape[0]

    bad_f_in_df=[] # functions in df that fail to be MSA
    for i in range(0,num_rows):
        f=df.iloc[i].to_dict()
        if not is_valid_function(f,power_set_str):
            bad_f_in_df.append(f)
    
    bad_df=pd.DataFrame(bad_f_in_df)
    
    return bad_df # the functions that are not MSA
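
For reference, the condition being checked can be restated compactly: $f$ vanishes on singletons and $0 \le f(I \cup J) - f(I) - f(J) \le 1$ for disjoint $I, J$. The sketch below is a self-contained variant that uses frozenset keys rather than the notebook's string keys. The names `nonempty_subsets` and `is_msa` are hypothetical, and the two test functions are toy examples.

```python
from itertools import combinations

def nonempty_subsets(n):
    """All non-empty subsets of {1, ..., n} as frozensets."""
    items = range(1, n + 1)
    return [frozenset(c) for r in range(1, n + 1) for c in combinations(items, r)]

def is_msa(f, n):
    """Check f(singleton) = 0 and 0 <= f(I u J) - f(I) - f(J) <= 1 for disjoint I, J."""
    subsets = nonempty_subsets(n)
    if any(f[s] != 0 for s in subsets if len(s) == 1):
        return False
    for I in subsets:
        for J in subsets:
            if I & J:
                continue  # the condition only concerns disjoint pairs
            if not 0 <= f[I | J] - f[I] - f[J] <= 1:
                return False
    return True

zero = {s: 0 for s in nonempty_subsets(3)}   # the zero function
bad = dict(zero)
bad[frozenset({1, 2, 3})] = 2                # jump of 2 over f({1,2}) + f({3})

print(is_msa(zero, 3), is_msa(bad, 3))
```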

### Building $\tilde{P}_{6}$ and decomposing $|\tilde{P}_{6}|$ 

The following function takes $\tilde{P}_{n-1}$ and obtains $\tilde{P}_{n}$ with the column 'Eg\_Size', which records, for each $f \in \tilde{P}_{n}$, the size $|\mathbb{E}(g)|$ of the $g \in \tilde{P}_{n-1}$ that $f$ extends.

In [26]:
def eg_column_Build_Pn_1(list_df_Pn,power_set_str,n):
    
    """
    Obj: Same as Build_Pn_1, modified to add the key 'Eg_Size' to each extension.
    """
    
    # Returns: packets of extensions for each f_dict from Pn, can check those with max size of closed sets (later)
    
    P_n1=[]
    for index,f_dict in enumerate(list_df_Pn):

        # Store primitive closed sets 
        PC=[(x,get_barj({x},f_dict)) for x in power_set_str]
        dict_PC=dict(PC)
        set_PC=[sorted_frozenset(get_barj({x},f_dict)) for x in power_set_str]

        #Get all closed sets for extensions for f_dict
        Eg=list(get_Eg(dict_PC,power_set_str,set_PC))
        Eg.append("empty")
        
        Eg_col_kvpair={"Eg_Size":len(Eg)}

        #Build extensions for f_dict
        extension_of_f_dict=[]
        for term in Eg:
            if term =="empty":                    
                funct=empty_case_get_extension_of_f(f_dict,term,power_set_str,n) #dictionary
                funct.update(Eg_col_kvpair)
                extension_of_f_dict.append(funct)
            else:
                funct=get_extension_of_f(f_dict,term,power_set_str,n)
                funct.update(Eg_col_kvpair)
                extension_of_f_dict.append(funct)

        #We record packets of extensions where we take +1
        P_n1.append(extension_of_f_dict)
    
    return P_n1

We repeat the above process but for the dataframe consisting of a subset of $\tilde{P}_{n-1}$. 

In [27]:
def get_Pnp1_for_single_funct(Pn,n):
    
    #inputs
    # Pn:dataframe and n.
    
    """
    Returns a dataframe of extensions in Pn+1 for dataframe of functions in Pn.
    """
    """
    #Example
    n=3
    power_set_str,power_set_str_np1=get_n_np1_powersets(n)
    # P3

    funct_list=[{'1': 0,'2': 0,"3":0,"12":0,"23":0,"13":0,"123":0}]
    Pn=pd.DataFrame(funct_list)

    #The following should be functions in P4 that are extendions of functions in funct_list
    f_P3=get_Pnp1_for_single_funct(Pn,n)
    print(f_P3)
    """

    #New data created
    power_set_str,power_set_str_np1=get_n_np1_powersets(n)

    #Load Pn as a list of dictionaries.
    list_df_Pn=Pn.to_dict(orient='records')

    #Create Pn+1 and hold as dataframe.
    P_n1=eg_column_Build_Pn_1(list_df_Pn,power_set_str,n)
    Pnp1=extend_sets_to_Pnp1(P_n1) #<---- ISSUE
    
    columns=list(power_set_str_np1)+["Eg_Size"]
    Pnp1 = Pnp1.reindex(columns=columns)


    inspect_Pn1(Pnp1)

    return Pnp1

Here we see get_Pnp1_for_single_funct in action for P3.

In [28]:
#Testing functions example
n=3
power_set_str,power_set_str_np1=get_n_np1_powersets(n)
# P3

# You can take subsets of rows as a database, then extend. Better than below
funct_list=[{'1': 0,'2': 0,"3":0,"12":0,"23":0,"13":0,"123":0}]
Pn=pd.DataFrame(funct_list)
print("The functions we are extending")
print(Pn)

#The following should be functions in P4 that are extensions of functions in funct_list
f_P3=get_Pnp1_for_single_funct(Pn,n)
print(f_P3)

The functions we are extending
   1  2  3  12  23  13  123
0  0  0  0   0   0   0    0
Number of functions: 19
Are there duplicates? (0, 16) if (0,-) then no common rows
    1  2  3  4  12  13  14  23  24  34  123  124  134  234  1234  Eg_Size
0   0  0  0  0   0   0   0   0   0   0    0    0    1    0     1       19
1   0  0  0  0   0   0   0   0   1   1    0    1    1    1     1       19
2   0  0  0  0   0   0   0   0   0   1    0    1    1    1     1       19
3   0  0  0  0   0   0   0   0   1   0    0    1    0    1     1       19
4   0  0  0  0   0   0   1   0   1   0    0    1    1    1     1       19
5   0  0  0  0   0   0   1   0   0   0    0    1    1    1     1       19
6   0  0  0  0   0   0   0   0   0   0    0    1    1    1     1       19
7   0  0  0  0   0   0   0   0   0   0    0    1    1    0     1       19
8   0  0  0  0   0   0   0   0   0   0    0    1    0    0     1       19
9   0  0  0  0   0   0   1   0   1   1    0    1    1    1     1       19
10  0  0  0  0  

The following function allows us to break the construction of $\tilde{P}_{6}$ into parts.

In [29]:
def Pnp1_from_Pn_part(df,n,calc_start,calc_end):
    # df where we are extending from
    # calc_start=0
    # calc_end=5

    rows=range(calc_start,calc_end) #What functions we are extending
    Pn=df.iloc[rows] #dataframe of rows calc_start,...,calc_end-1 of df

    # print("The functions we are extending")
    # print(Pn,"\n")

    part_Pn=get_Pnp1_for_single_funct(Pn,n) #Extensions of Pn
    # print(part_Pn.shape)
    
    strt=f"P{n+1}_parts/{calc_start}_{calc_end}_Part_fromP{n}.xlsx" #forward slash: portable, and matches the paths used when reading the parts back
    part_Pn.to_excel(strt) #Stores in excel
    return
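
The file name used for each part can also be built with `pathlib`, which avoids hand-written path separators. The following is only a sketch: the helper name `part_path` is hypothetical and not part of the notebook, and the folder is assumed to exist already (or to be created by the caller).

```python
from pathlib import Path

def part_path(n, calc_start, calc_end):
    """Hypothetical helper: path of the Excel file that stores the extensions
    of rows calc_start,...,calc_end-1 of P_n."""
    folder = Path(f"P{n+1}_parts")
    return folder / f"{calc_start}_{calc_end}_Part_fromP{n}.xlsx"

p = part_path(5, 0, 100)
print(p.as_posix())  # P6_parts/0_100_Part_fromP5.xlsx
```

Calling `p.parent.mkdir(parents=True, exist_ok=True)` before writing would create the folder if it is missing.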

# Example <a name="s2"></a>

Let us first investigate a single function $g \in \tilde{P}_{3}$, and obtain $\mathbb{E}(g)$ and the subsequent $f=g+\epsilon_{I}$. First let us define $\mathcal{P}_{3}^{+}$ and $\mathcal{P}_{4}^{+}$.

In [30]:
n=3
power_set_str,power_set_str_np1=get_n_np1_powersets(n)

Let $g$ (i.e. $f_{U_3}$) be as below.

In [31]:
f_u3={'1': 0,'2': 0,"3":0,"12":0,"23":0,"13":0,"123":0}
funct_list=[f_u3]
d_funct=pd.DataFrame(funct_list)
d_funct

   1  2  3  12  23  13  123
0  0  0  0   0   0   0    0


In [32]:
#The following should be functions in P4 that are extensions of functions in funct_list
f_P3=get_Pnp1_for_single_funct(d_funct,n)
print(f_P3)

Number of functions: 19
Are there duplicates? (0, 16) if (0,-) then no common rows
    1  2  3  4  12  13  14  23  24  34  123  124  134  234  1234  Eg_Size
0   0  0  0  0   0   0   0   0   0   0    0    0    1    0     1       19
1   0  0  0  0   0   0   0   0   1   1    0    1    1    1     1       19
2   0  0  0  0   0   0   0   0   0   1    0    1    1    1     1       19
3   0  0  0  0   0   0   0   0   1   0    0    1    0    1     1       19
4   0  0  0  0   0   0   1   0   1   0    0    1    1    1     1       19
5   0  0  0  0   0   0   1   0   0   0    0    1    1    1     1       19
6   0  0  0  0   0   0   0   0   0   0    0    1    1    1     1       19
7   0  0  0  0   0   0   0   0   0   0    0    1    1    0     1       19
8   0  0  0  0   0   0   0   0   0   0    0    1    0    0     1       19
9   0  0  0  0   0   0   1   0   1   1    0    1    1    1     1       19
10  0  0  0  0   0   0   1   0   0   0    0    1    1    0     1       19
11  0  0  0  0   0   0   1   

We see that $|\mathbb{E}(g)|=19$ and that there are $19$ functions $f=g+\epsilon_{I}\in \tilde{P}_{4}$.

We can ask what $\mathbb{E}(g)$ is explicitly. First, what are the primitive closed sets in $\mathbb{E}(g)$?

In [None]:
#get primitive closed data.
# dict_PC,set_PC=get_primitive_for_single_g(f_u3,power_set_str)

In [None]:
# dict_PC

# Primitive closed sets
# {'1': {'1', '12', '123', '13'},
#  '2': {'12', '123', '2', '23'},
#  '3': {'123', '13', '23', '3'},
#  '12': {'12', '123'},
#  '13': {'123', '13'},
#  '23': {'123', '23'},
#  '123': {'123'}}

We now obtain the set $\mathbb{E}(g)$.

In [None]:
# Eg=get_Eg(dict_PC,power_set_str,set_PC)
# Eg.add("empty")

# # {'empty',
# #  frozenset({'1', '12', '123', '13', '2', '23'}),
# #  frozenset({'1', '12', '123', '13', '23'}),
# #  frozenset({'1', '12', '123', '13', '2', '23', '3'}),
# #  frozenset({'1', '12', '123', '13', '23', '3'}),
# #  frozenset({'12', '123'}),
# #  frozenset({'12', '123', '13', '2', '23'}),
# #  frozenset({'12', '123', '13'}),
# #  frozenset({'123', '13'}),
# #  frozenset({'123', '23'}),
# #  frozenset({'123'}),
# #  frozenset({'1', '12', '123', '13'}),
# #  frozenset({'12', '123', '2', '23'}),
# #  frozenset({'12', '123', '23'}),
# #  frozenset({'12', '123', '13', '23'}),
# #  frozenset({'123', '13', '23', '3'}),
# #  frozenset({'123', '13', '23'}),
# #  frozenset({'12', '123', '13', '2', '23', '3'}),
# #  frozenset({'12', '123', '13', '23', '3'})}
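
For this particular all-zero $g$, the primitive closed set of each $x$ listed above is the family of all terms containing $x$, and every set in the listing of $\mathbb{E}(g)$ is a union of such families. Assuming this observation (it is read off the output above, and is not a restatement of the Algorithm of Section $5.1$), the count can be cross-checked by closing the primitive closed sets under union. The function names below are illustrative and not part of the notebook.

```python
from itertools import combinations

def nonempty_subsets(n):
    """All nonempty subsets of {1,...,n}, encoded as strings such as '13' (n <= 9)."""
    elems = [str(i) for i in range(1, n + 1)]
    return ["".join(c) for r in range(1, n + 1) for c in combinations(elems, r)]

def principal_upset(x, universe):
    """All terms of the universe that contain every element of x."""
    return frozenset(y for y in universe if set(x) <= set(y))

def union_closure(generators):
    """Smallest family that contains the generators and is closed under union."""
    family = set(generators)
    frontier = set(generators)
    while frontier:
        new = {a | b for a in frontier for b in family} - family
        family |= new
        frontier = new
    return family

def count_unions_plus_empty(n):
    """Number of unions of primitive closed sets of the all-zero g, plus one for 'empty'."""
    universe = nonempty_subsets(n)
    gens = [principal_upset(x, universe) for x in universe]
    return len(union_closure(gens)) + 1

print([count_unions_plus_empty(n) for n in (2, 3, 4)])  # [5, 19, 167]
```

The values agree with the sizes $5$, $19$ and $167$ that the notebook reports for the all-zero function on $2$, $3$ and $4$ elements.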

Let us consider the following function $f_{L_3}$ and determine $\mathbb{E}(f_{L_3})$.

In [None]:
f_l3={'1': 0,'2': 0,"3":0,"12":0,"23":1,"13":1,"123":1}

In [None]:
#get primitive closed data.
dict_PC,set_PC=get_primitive_for_single_g(f_l3,power_set_str)

In [None]:
# dict_PC
# {'1': {'1', '12', '123', '3'},
#  '2': {'12', '123', '2', '3'},
#  '3': {'3'},
#  '12': {'12'},
#  '13': {'1', '12', '123', '13', '3'},
#  '23': {'12', '123', '2', '23', '3'},
#  '123': {'12', '123', '3'}}

In [None]:
# Eg=get_Eg(dict_PC,power_set_str,set_PC)
# Eg.add("empty")
# Eg
# {'empty',
#  frozenset({'1', '12', '123', '2', '23', '3'}),
#  frozenset({'1', '12', '123', '2', '3'}),
#  frozenset({'1', '12', '123', '13', '2', '23', '3'}),
#  frozenset({'1', '12', '123', '13', '2', '3'}),
#  frozenset({'1', '12', '123', '13', '3'}),
#  frozenset({'12', '3'}),
#  frozenset({'12'}),
#  frozenset({'1', '12', '123', '3'}),
#  frozenset({'12', '123', '3'}),
#  frozenset({'12', '123', '2', '3'}),
#  frozenset({'12', '123', '2', '23', '3'}),
#  frozenset({'3'})}

# $\tilde{P}_{n}$ and $|\tilde{P}_{n}|$ decomposition <a name="s3"></a>

We now determine $\tilde{P}_{n}$ from $\tilde{P}_{n-1}$ recursively for $n \le 6$, using get_Pnp1() described in Section "Functions", and give a decomposition of $|\tilde{P}_{n}|$ in terms of $|\tilde{P}_{n-1}|$.

Recall the following. Let $(p_{n})_{n \in \mathbb{N}}$ denote the sequence of terms $p_{n}:=|\tilde{P}_n|$. Let $\boldsymbol{e}_{n}:=(e_{n,i})$ denote the increasing sequence of distinct terms $|\mathbb{E}(g)|$ for $g \in \tilde{P}_{n}$, and $\boldsymbol{d}_{n}:=(d_{n,i})$ denote the sequence where $d_{n,i}$ denotes the number of $g \in \tilde{P}_{n}$ which give the value $e_{n,i}$. Then

  $$p_{n+1} =\boldsymbol{e}_n \cdot \boldsymbol{d}_{n}=\sum e_{n,i}
 \times d_{n,i}.$$
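
As a minimal sketch of this bookkeeping, the sequences can be read off a plain list of the values $|\mathbb{E}(g)|$. The helper name `decompose` is hypothetical; the input list uses the values reported for $\tilde{P}_{3}$ in this notebook (six functions with $13$ extensions and four with $19$).

```python
from collections import Counter

def decompose(eg_sizes):
    """Return (e, d, p): the distinct sizes in increasing order, how many g
    give each size, and the dot product e . d."""
    counts = Counter(eg_sizes)
    e = sorted(counts)
    d = [counts[x] for x in e]
    p = sum(x * m for x, m in zip(e, d))
    return e, d, p

# |E(g)| for the ten g in P_3, as reported in this notebook
sizes_P3 = [13] * 6 + [19] * 4
e3, d3, p4 = decompose(sizes_P3)
print(e3, d3, p4)  # [13, 19] [6, 4] 154
```

The dot product equals the plain sum of the list, which is why recording $|\mathbb{E}(g)|$ for each $g$ is enough to recover $p_{n+1}$.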

## $\tilde{P}_{3}$ and $|\tilde{P}_{3}|$ decomposition <a name="s31"></a>

We now obtain $\tilde{P}_{3}$ by extending $\tilde{P}_{2}$. It is clear that $\tilde{P}_{2}$ consists of exactly two functions. 

In [176]:
#Easy to know $P_2$
P2=pd.DataFrame([{'1': 0, '2': 0, '12': 0},{'1': 0, '2': 0, '12': 1}])

We first define $\mathcal{P}_{2}^{+}$ and $\mathcal{P}_{3}^{+}$.

In [177]:
n=2
power_set_str,power_set_str_np1=get_n_np1_powersets(n)

In [178]:
%%time
P3=get_Pnp1(P2,2)
print(P3)

Number of functions: 10
Are there duplicates? (0, 7) if (0,-) then no common rows
   1  2  3  12  13  23  123
0  0  0  0   0   1   0    1
1  0  0  0   0   1   1    1
2  0  0  0   0   0   1    1
3  0  0  0   0   0   0    1
4  0  0  0   0   0   0    0
5  0  0  0   1   1   1    2
6  0  0  0   1   0   1    1
7  0  0  0   1   1   1    1
8  0  0  0   1   1   0    1
9  0  0  0   1   0   0    1
CPU times: total: 0 ns
Wall time: 14.7 ms


When considering the extension of $\tilde{P}_{2}$ to $\tilde{P}_{3}$ we record $|\mathbb{E}(g)|$ for each $g \in \tilde{P}_{2}$. We then obtain the sequences $\boldsymbol{e}_{2}$ and $\boldsymbol{d}_{2}$ (possibly not ordered by increasing size), and state

$$p_3=\boldsymbol{e}_2 \cdot \boldsymbol{d}_2.$$

In [139]:
P2_eg=get_Eg_column_Pn(P2)
P2_eg=P2_eg.sort_values('Eg_Size') 
df=P2_eg
print(f"e_n : d_n")
print(df['Eg_Size'].value_counts()) #number of g in P2 for each value of Eg_Size
# print(P2_eg)

e_n : d_n
5    2
Name: Eg_Size, dtype: int64
   1  2  12 Eg_Size
0  0  0   0       5
1  0  0   1       5


We have $p_3 = 5 \cdot 2 = 10$, since both $g \in \tilde{P}_{2}$ have $|\mathbb{E}(g)|=5$.

## $\tilde{P}_{4}$ and $|\tilde{P}_{4}|$ decomposition <a name="s32"></a>

We now obtain $\tilde{P}_{4}$ by extending $\tilde{P}_{3}$. We first define $\mathcal{P}_{3}^{+}$ and $\mathcal{P}_{4}^{+}$.

In [191]:
n=3
power_set_str,power_set_str_np1=get_n_np1_powersets(n)

In [192]:
%%time
P4=get_Pnp1(P3,3)

Number of functions: 154
Are there duplicates? (0, 15) if (0,-) then no common rows
CPU times: total: 78.1 ms
Wall time: 84.6 ms


We can check that each  $f \in \tilde{P}_{4}$ is in fact MSA. 

In [None]:
# power4=get_powerset_str(4) #powerset on 4 elements
# check_functs_MSA(P4,power_set_str_np1).shape
#Yes (0, 0)

When considering the extension of $\tilde{P}_{3}$ to $\tilde{P}_{4}$ we record $|\mathbb{E}(g)|$ for each $g \in \tilde{P}_{3}$. We then obtain the sequences $\boldsymbol{e}_{3}$ and $\boldsymbol{d}_{3}$ (possibly not ordered by increasing size), and state

$$p_4=\boldsymbol{e}_3 \cdot \boldsymbol{d}_3.$$

In [143]:
P3_eg=get_Eg_column_Pn(P3)
P3_eg=P3_eg.sort_values('Eg_Size') # get those 13,then 19

df=P3_eg
print(f"e_n : d_n")
print(df['Eg_Size'].value_counts()) #count number of 13s and 19s

In [150]:
# min_df = df[df['Eg_Size'] == 13]
# min_df

In particular $p_4 = 13 \cdot 6 + 19 \cdot 4 = 154$.

## $\tilde{P}_{5}$ and $|\tilde{P}_{5}|$ decomposition <a name="s33"></a>

We now obtain $\tilde{P}_{5}$ by extending $\tilde{P}_{4}$. We first define $\mathcal{P}_{4}^{+}$ and $\mathcal{P}_{5}^{+}$.

In [193]:
n=4
power_set_str,power_set_str_np1=get_n_np1_powersets(n) #gives powerset on 4 and 5 elements respectively

In [194]:
%%time
P5=get_Pnp1(P4,4)

Number of functions: 10334
Are there duplicates? (0, 31) if (0,-) then no common rows
CPU times: total: 6.33 s
Wall time: 10 s


We can check that each  $f \in \tilde{P}_{5}$ is in fact MSA. 

In [None]:
# power5=get_powerset_str(5) #powerset on 5 elements
# check_functs_MSA(P5,power5).shape
# #Yes (0, 0)

When considering the extension of $\tilde{P}_{4}$ to $\tilde{P}_{5}$ we record $|\mathbb{E}(g)|$ for each $g \in \tilde{P}_{4}$. We then obtain the sequences $\boldsymbol{e}_{4}$ and $\boldsymbol{d}_{4}$ (possibly not ordered by increasing size), and state

$$p_5=\boldsymbol{e}_4 \cdot \boldsymbol{d}_4.$$

In [195]:
P4_eg=get_Eg_column_Pn(P4)
P4_eg=P4_eg.sort_values('Eg_Size') # order rows by increasing Eg_Size
df=P4_eg
print(f"e_n : d_n")
print(df['Eg_Size'].value_counts()) #number of g in P4 for each value of Eg_Size

e_n : d_n
42     24
45     24
47     24
82     16
46     12
54     12
111    12
99     10
69      8
133     8
167     4
Name: Eg_Size, dtype: int64


We still want to determine the form of $f_{L_{n}}$ and how to extend it (its definition).

In particular,

$$p_5 = 42 \cdot 24 + 45 \cdot 24 + 46 \cdot 12 + 47 \cdot 24 + 54 \cdot 12 + 69 \cdot 8 + 82 \cdot 16 + 99 \cdot 10 + 111 \cdot 12 + 133 \cdot 8 + 167 \cdot 4 = 10334.$$
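
Two quick consistency checks are possible on the tabulated values. Since every $g \in \tilde{P}_{4}$ is counted exactly once, the terms of $\boldsymbol{d}_{4}$ should sum to $p_4$, and the dot product should give $p_5$. The dictionary below is copied from the `value_counts` output above; the variable names are illustrative.

```python
# Eg_Size -> number of g in P_4 with that size, copied from the output above
e4_to_d4 = {
    42: 24, 45: 24, 46: 12, 47: 24, 54: 12, 69: 8,
    82: 16, 99: 10, 111: 12, 133: 8, 167: 4,
}

p4 = sum(e4_to_d4.values())                    # every g in P_4 is counted once
p5 = sum(e * d for e, d in e4_to_d4.items())   # dot product e_4 . d_4

print(p4, p5)  # 154 10334
```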

## $\tilde{P}_{6}$ and $|\tilde{P}_{6}|$ decomposition <a name="s34"></a>

Since obtaining

P6=get\_Pnp1(P5,5)

takes too long, we break the construction into parts. During this process we record $|\mathbb{E}(g)|$ for $g \in \tilde{P}_{5}$, from which we can determine $p_{6}=\boldsymbol{e}_{5} \cdot \boldsymbol{d}_{5}$. First we outline the process of building in parts by building $\tilde{P}_{4}$ from $\tilde{P}_{3}$ in parts.

In [None]:
n=3
power_set_str,power_set_str_np1=get_n_np1_powersets(n)
# P3
df=P3 #Total dataframe

In [None]:
calc_start=0
calc_end=5

rows=range(calc_start,calc_end) #What functions we are extending
Pn=df.iloc[rows] #takes database of rows 0,1..,calc_up_to

print("The functions we are extending")
print(Pn,"\n")

part_Pn=get_Pnp1_for_single_funct(Pn,n) #Extensions of Pn
print(part_Pn)
strt=f"P3_parts_example/{calc_start}_{calc_end}_PartP4.xlsx" #forward slash, matching the path used when reading the part back
part_Pn.to_excel(strt) #Stores in excel

In [None]:
calc_start=5
calc_end=10

rows=range(calc_start,calc_end) #What functions we are extending
Pn=df.iloc[rows] #takes database of rows 0,1..,calc_up_to

print("The functions we are extending")
print(Pn,"\n")

part_Pn=get_Pnp1_for_single_funct(Pn,n) #Extensions of Pn
print(part_Pn)
strt=f"P3_parts_example/{calc_start}_{calc_end}_PartP4.xlsx" #forward slash, matching the path used when reading the part back
part_Pn.to_excel(strt) #Stores in excel

 We now merge the parts together.

In [None]:
P4_part1=pd.read_excel('P3_parts_example/0_5_PartP4.xlsx', index_col=0) 
P4_part2=pd.read_excel('P3_parts_example/5_10_PartP4.xlsx', index_col=0) 

merged_df = pd.concat([P4_part1, P4_part2], ignore_index=True)

merged_df

We now construct $\tilde{P}_{6}$ from $\tilde{P}_{5}$ in parts (this takes about 4 hours). First we load $\tilde{P}_{5}$.

In [None]:
# Part builder cell for P5 to P6 #At this rate it will take 1.43hrs<time<6hrs to run this calculation
n=5
power_set_str,power_set_str_np1=get_n_np1_powersets(n)
df=P5 #Total dataframe

In [None]:
# start_perf,start_process = time.perf_counter(),time.process_time()
# Pnp1_from_Pn_part(df,n,0,100)
# end_perf,end_process = time.perf_counter(),time.process_time()

# print(f"Perf timer {end_perf-start_perf}") #This method returns the time in seconds.
# print(f"Process timer {end_process-start_process}") #measures the time the process takes, including time that the process is blocked

The following cells consider $100$ functions of $\tilde{P}_{5}$ at a time and determine the set of extensions $f=g+\epsilon_{I}$ for $I \in \mathbb{E}(g)$. This process constructs a series of 104 xlsx files (103 blocks of $100$ functions and a final block of $34$), which store dataframes whose union is $\tilde{P}_{6}$.
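
The block boundaries can be generated in one step, including the shorter final block. This is a sketch with a hypothetical helper name, `chunk_ranges`; the notebook itself handles the final block in a separate cell.

```python
def chunk_ranges(total, size):
    """Half-open (start, end) ranges covering range(total) in blocks of `size`;
    the last block may be shorter."""
    return [(start, min(start + size, total)) for start in range(0, total, size)]

ranges = chunk_ranges(10334, 100)  # 10334 = |P_5|
print(len(ranges), ranges[0], ranges[-1])  # 104 (0, 100) (10300, 10334)
```

Each pair can be passed directly as `calc_start` and `calc_end`, so that no row of $\tilde{P}_{5}$ is skipped or processed twice.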

In [None]:
def runner(start,end): 
    start_perf,start_process = time.perf_counter(),time.process_time()
    Pnp1_from_Pn_part(df,n,start,end)
    end_perf,end_process = time.perf_counter(),time.process_time()

    print(f"Perf timer {end_perf-start_perf}") #wall-clock time in seconds, including time spent sleeping or blocked
    print(f"Process timer {end_process-start_process}") #CPU time of the process in seconds, excluding time spent sleeping
    return

In [None]:
for i in range(0, 10300, 100):
    start=i
    end=start+100
    runner(start,end)

In [None]:
calc_start=10300
calc_end=10334

rows=range(calc_start,calc_end) #What functions we are extending
Pn=df.iloc[rows] #takes database of rows 0,1..,calc_up_to

part_Pn=get_Pnp1_for_single_funct(Pn,n) #Extensions of Pn
strt=f"P6_parts/{calc_start}_{calc_end}_Part_fromP5.xlsx" #forward slash, matching the path used when reading the part back
part_Pn.to_excel(strt) #Stores in excel

To obtain $p_{6}=\boldsymbol{e}_{5} \cdot \boldsymbol{d}_{5}$ we load the files in P6\_parts and record $\boldsymbol{e}_{5}$ and $\boldsymbol{d}_{5}$ (we can find $p_{6}$ as the sum of the sizes of the dataframes).

The following loads the data for the parts of $\tilde{P}_6$ stored in files (this takes about 1.5 hours).

In [None]:
#Store number of rows for each data packet of P6
P6_size_list=[]
Eg_numb_functs_init_counter=[]

#Main Body of calculations
for i in range(0, 10300, 100):
    #Stores number for this packet in P6_size_list
    start=i
    end=start+100
    P6_part=pd.read_excel(f'P6_parts/{start}_{end}_Part_fromP5.xlsx', index_col=0)
    print(f'P6_parts/{start}_{end}_Part_fromP5.xlsx')
    row_numb=P6_part.shape[0]
    P6_size_list.append(row_numb)

    # In this packet gets Eg_Size and get counter for eg sizes and number of functions in Eg_numb_functs_init_counter
    values = P6_part['Eg_Size'].value_counts(ascending=True).keys().tolist()
    counts = P6_part['Eg_Size'].value_counts(ascending=True).tolist()
    zipped=list(zip(values,counts))
    Eg_numb_functs_init_counter.extend(zipped)
    
# It remains to do 10300-10334
#Stores number for this packet in P6_size_list
start,end=(10300,10334)
P6_part=pd.read_excel(f'P6_parts/{start}_{end}_Part_fromP5.xlsx', index_col=0)
print(f'P6_parts/{start}_{end}_Part_fromP5.xlsx')
row_numb=P6_part.shape[0]
P6_size_list.append(row_numb)

# In this packet gets Eg_Size and get counter for eg sizes and number of functions in Eg_numb_functs_init_counter
values = P6_part['Eg_Size'].value_counts(ascending=True).keys().tolist()
counts = P6_part['Eg_Size'].value_counts(ascending=True).tolist()
zipped=list(zip(values,counts))
Eg_numb_functs_init_counter.extend(zipped)

#output-------------------------------------------------------------------------

#Final answer
P6_size=sum(P6_size_list)
print(f"There are {P6_size} functions in P6 \n")

In Eg\_numb\_functs\_init\_counter we record pairs $(x,y(x))$, where $x=|\mathbb{E}(g)|$ and $y(x)$ is the number of rows of a specific P6\_part whose Eg\_Size equals $x$. Each $g$ with $|\mathbb{E}(g)|=x$ contributes exactly $x$ rows (one per extension), so $y(x)$ is $x$ times the number of such $g$ in that part. Let $Y(x)$ denote the number of $g \in \tilde{P}_{5}$ with $|\mathbb{E}(g)|=x$, i.e. the sum of the $y(x)$ terms over all files P6\_part divided by $x$. As a consequence $\boldsymbol{e}_{5}$ consists of the $x$ terms and $\boldsymbol{d}_{5}$ consists of the $Y(x)$ terms.
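
A toy version of this aggregation makes the division by $x$ explicit. The per-part pairs below are hypothetical numbers chosen for illustration, not data from P6\_parts; the remainder check encodes the fact that the total row count for a given $x$ must be a multiple of $x$.

```python
from collections import defaultdict

# Hypothetical (x, y(x)) pairs from two parts: x is an Eg size,
# y(x) the number of rows with that Eg size in the part.
part_counts = [
    [(13, 26), (19, 19)],   # part 1: two g with 13 extensions, one with 19
    [(13, 13), (19, 38)],   # part 2: one g with 13 extensions, two with 19
]

row_totals = defaultdict(int)
for pairs in part_counts:
    for x, y in pairs:
        row_totals[x] += y

Y = {}
for x, total in sorted(row_totals.items()):
    quotient, remainder = divmod(total, x)
    assert remainder == 0, f"row count for Eg size {x} is not a multiple of {x}"
    Y[x] = quotient

print(Y)                                  # {13: 3, 19: 3}
print(sum(x * m for x, m in Y.items()))   # 96, the total number of rows
```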

In [None]:
#We now obtain Eg_numb_functs from Eg_numb_functs_init_counter

# Merge tuples with the same Eg size by summing their counts, e.g. (13, 6),(13, 19) become (13, 25)

from collections import defaultdict

sums = defaultdict(int)

for first, second in Eg_numb_functs_init_counter:
    sums[first] += second

Eg_numb_functs = [(first, second) for first, second in sums.items()]
Eg_numb_functs = sorted(Eg_numb_functs, key=lambda x: x[0])#order so in increasing with respect to eg size

#Build a string using a for loop, using the tuples in Eg_numb_functs
f1,f2=Eg_numb_functs[0]

f2_div=int(f2/f1)

stringer=f'{f1}*{f2_div}'

for item in Eg_numb_functs[1:]: #first item in stringer already
    eg,numb=item #eg,numb not all the same
    num_div=int(numb/eg)
    s=f'+{eg}*{num_div}'
    stringer += s
    
#Output
print(f"Which decomposes as follows:\n {P6_size}={stringer}")

In [None]:
#Check output calculates 5399325 #Yes
s=114*125+121*65+126*237+138*112+158*248+162*247+165*132+169*117+173*132+178*122+182*106+183*132+187*24+192*134+194*123+208*144+209*22+216*234+217*126+218*61+221*125+227*22+229*307+230*250+232*120+233*119+236*122+241*30+242*102+247*61+248*53+250*114+251*59+260*184+268*106+269*64+271*186+273*118+274*56+278*52+279*41+281*136+284*112+289*67+293*133+294*37+300*133+305*67+313*123+324*123+325*132+326*101+329*181+338*120+343*59+345*113+356*19+381*121+390*44+397*107+419*20+420*46+421*23+422*39+429*64+433*35+436*128+437*23+455*64+477*53+496*48+502*16+509*38+518*12+520*107+524*20+531*36+538*76+539*62+553*36+555*66+566*144+578*61+605*62+607*66+613*11+640*28+659*30+759*48+765*21+781*108+802*119+814*42+881*42+938*8+974*135+977*45+983*135+1002*45+1026*126+1029*63+1033*63+1068*96+1121*108+1145*22+1178*39+1189*39+1209*44+1212*11+1224*66+1257*44+1312*11+1322*32+1377*36+1409*16+1421*10+1450*54+1511*20+1666*21+1753*24+2110*42+2193*95+2285*4+2290*54+3544*12+3673*42+3820*48+3983*18+6280*1+6474*6+6702*14+6960*16+7250*9+7580*4

# Upper bound for $|\tilde{P}_{n}|$ <a name="s4"></a>


Recall Section $5.2$ of the thesis. In particular we have the sequences $\mathcal{U}=(U_n)$ and $\mathcal{L}=(L_n)$, where $U_n$ is defined as the maximal term of $\boldsymbol{e}_n$. In a Conjecture in that section we ask which $f \in \tilde{P}_{n-1}$ attain $U_n$. Let $g \in \tilde{P}_{n-1}$ be such that $g(I)=0$ for all $I \in \mathcal{P}_{n-1}^{+}$. We conjecture that $f=g+\epsilon_{I} \in  \mathcal{P}_{n}^{+}$ attains $|\mathbb{E}(f)|=U_n$ if $I=\emptyset$.

## $U_n$ <a name="s41"></a>

Let $g \in \tilde{P}_{n-1}$ be such that $g(I)=0$ for all $I \in \mathcal{P}_{n-1}^{+}$ and let $f_{U_n}=g+\epsilon_{I} \in  \mathcal{P}_{n}^{+}$ with $I=\emptyset$. We will now show that $f_{U_n}$ attains $|\mathbb{E}(f_{U_n})|=U_n$ for low $n$.

### $U_3$

In [48]:
f_u2={'1': 0, '2': 0, '12': 0}

In [49]:
df_f_u2=pd.DataFrame([f_u2])

In [50]:
%%time
power_set_str=get_powerset_str(2)
df_f_u2_eg=get_Eg_column_Pn(df_f_u2)
df_f_u2_eg=df_f_u2_eg.sort_values('Eg_Size') 
df=df_f_u2_eg
print(f"e_n : d_n")
print(df['Eg_Size'].value_counts()) #count number of 13s and 19s

e_n : d_n
5    1
Name: Eg_Size, dtype: int64
CPU times: total: 0 ns
Wall time: 8.02 ms


### $U_4$

In [51]:
f_u3={'1': 0, '2': 0, '12': 0,'3':0,'13':0,'23':0,'123':0}

In [52]:
df_f_u3=pd.DataFrame([f_u3])

In [53]:
%%time
power_set_str=get_powerset_str(3)
df_f_u3_eg=get_Eg_column_Pn(df_f_u3)
df_f_u3_eg=df_f_u3_eg.sort_values('Eg_Size') # get those 13,then 19
df=df_f_u3_eg
print(f"e_n : d_n")
print(df['Eg_Size'].value_counts()) #count number of 13s and 19s
# e_n : d_n
# 19    1
# Name: Eg_Size, dtype: int64
# CPU times: total: 0 ns
# Wall time: 8.4 ms

e_n : d_n
19    1
Name: Eg_Size, dtype: int64
CPU times: total: 0 ns
Wall time: 16.6 ms


### $U_5$

In [45]:
f_u4={'1': 0,'2': 0,'3': 0,'4': 0,'12': 0,'13': 0,'14': 0,'23': 0,'24': 0,'34': 0,'123': 0,'124': 0,'134': 0,'234': 0,'1234': 0}

In [46]:
df_f_u4=pd.DataFrame([f_u4])

In [47]:
%%time
power_set_str=get_powerset_str(4)
df_f_u4_eg=get_Eg_column_Pn(df_f_u4)
df_f_u4_eg=df_f_u4_eg.sort_values('Eg_Size') # get those 13,then 19
df=df_f_u4_eg
print(f"e_n : d_n")
print(df['Eg_Size'].value_counts()) #count number of 13s and 19s
# e_n : d_n
# 167    1
# Name: Eg_Size, dtype: int64
# CPU times: total: 31.2 ms
# Wall time: 45.7 ms

e_n : d_n
167    1
Name: Eg_Size, dtype: int64
CPU times: total: 31.2 ms
Wall time: 45.7 ms


### $U_6$

In [42]:
f_u5={'1': 0,'2': 0,'3': 0,'4': 0,'5': 0,'12': 0,'13': 0,'14': 0,'15': 0,'23': 0,'24': 0,'25': 0,'34': 0,'35': 0,'45': 0,'123': 0,'124': 0,'125': 0,'134': 0,'135': 0,'145': 0,'234': 0,'235': 0,'245': 0,'345': 0,'1234': 0,'1235': 0,'1245': 0,'1345': 0,'2345': 0,'12345': 0}

In [43]:
df_f_u5=pd.DataFrame([f_u5])

In [44]:
%%time
power_set_str=get_powerset_str(5)
df_f_u5_eg=get_Eg_column_Pn(df_f_u5)
df_f_u5_eg=df_f_u5_eg.sort_values('Eg_Size') # get those 13,then 19
df=df_f_u5_eg
print(f"e_n : d_n")
print(df['Eg_Size'].value_counts()) #count number of 13s and 19s
# 7580    1
# Name: Eg_Size, dtype: int64
# CPU times: total: 938 ms
# Wall time: 3.18 s

e_n : d_n
7580    1
Name: Eg_Size, dtype: int64
CPU times: total: 1.83 s
Wall time: 3.32 s


### $U_7$

In [55]:
n=6
power_set_str=get_powerset_str(n)
f_u6 = {}
for key in power_set_str:
    f_u6[key] = 0

In [56]:
df_f_u6=pd.DataFrame([f_u6])

In [None]:
%%time
power_set_str=get_powerset_str(6)
df_f_u6_eg=get_Eg_column_Pn(df_f_u6)
df_f_u6_eg=df_f_u6_eg.sort_values('Eg_Size') # get those 13,then 19
df=df_f_u6_eg
print(f"e_n : d_n")
print(df['Eg_Size'].value_counts()) #count number of 13s and 19s

# Leftovers (Please ignore)

In [189]:
# P5.shape
# df=P5
# # f_l4={'1': 0, '2': 0, '3':0,
# #       '12': 1,'13':0,'23':0,'14': 1,'24': 1,'34': 1,
# #       '123':1,'124': 2,'134': 1,'234': 1,'1234': 2}

# condition = (df['1'] == 0) & (df['2'] == 0) & (df['3'] == 0) & (df['4'] == 0) & (df['12'] == 1) & (df['13'] == 0) & (df['23'] == 0) & (df['14'] == 1) & (df['24'] == 1) & (df['34'] == 1) & (df['123'] == 1) & (df['124'] == 2) & (df['134'] == 1)& (df['234'] == 1) & (df['1234'] == 2)
# # condition=df[df['Eg_size']==114]
# # min_df = df[df['Eg_Size'] == 13]

# result_df = df[condition]

# result_df


(42, 31)

In [None]:
min_df = df[df['Eg_Size'] == 42]
min_df
# min_df.shape#(24, 16)

In [None]:
column_to_remove = 'Eg_Size'
min_df_42 = min_df.drop(column_to_remove, axis=1)
min_df_42

In [200]:
l_min_df_42=min_df_42.to_dict(orient='records')

In [215]:
power_set_str=get_powerset_str(5)

for funct in l_min_df_42:
    
    for term in power_set_str:
        if '5' in term:
            if term=='5':
                funct[term]=0 #extend by empty
                continue
            x_list = list(term)
            x_list.remove('5')
            # Convert the list back to a string
            without = ''.join(x_list)
            
            funct[term]=funct[without]+1 #extend by adding 1 to every larger term containing '5'
    
    f_l5=funct
    df_f_l5=pd.DataFrame([f_l5])
    df_f_l5_eg=get_Eg_column_Pn(df_f_l5)
    df_f_l5_eg=df_f_l5_eg.sort_values('Eg_Size') # get those 13,then 19
    df=df_f_l5_eg
    print(f_l5)
    print(f"e_n : d_n")
    print(df['Eg_Size'].value_counts()) #count number of 13s and 19s


{'1': 0, '2': 0, '3': 0, '4': 0, '12': 1, '13': 1, '14': 1, '23': 0, '24': 1, '34': 0, '123': 1, '124': 2, '134': 1, '234': 1, '1234': 2, '5': 0, '15': 1, '25': 1, '35': 1, '45': 1, '125': 2, '135': 2, '145': 2, '235': 1, '245': 2, '345': 1, '1235': 2, '1245': 3, '1345': 2, '2345': 2, '12345': 3}
e_n : d_n
251    1
Name: Eg_Size, dtype: int64
{'1': 0, '2': 0, '3': 0, '4': 0, '12': 0, '13': 1, '14': 1, '23': 1, '24': 0, '34': 1, '123': 1, '124': 1, '134': 2, '234': 1, '1234': 2, '5': 0, '15': 1, '25': 1, '35': 1, '45': 1, '125': 1, '135': 2, '145': 2, '235': 2, '245': 1, '345': 2, '1235': 2, '1245': 2, '1345': 3, '2345': 2, '12345': 3}
e_n : d_n
251    1
Name: Eg_Size, dtype: int64
{'1': 0, '2': 0, '3': 0, '4': 0, '12': 1, '13': 0, '14': 1, '23': 0, '24': 0, '34': 0, '123': 1, '124': 1, '134': 1, '234': 0, '1234': 1, '5': 0, '15': 1, '25': 1, '35': 1, '45': 1, '125': 2, '135': 1, '145': 2, '235': 1, '245': 1, '345': 1, '1235': 2, '1245': 2, '1345': 2, '2345': 1, '12345': 2}
e_n : d_n
16

## $L_n$ <a name="s42"></a>

Let $g:=f_{L_{n-1}} \in \tilde{P}_{n-1}$ and let $f_{L_n}=g+\epsilon_{I} \in  \mathcal{P}_{n}^{+}$, where $I=\mathcal{P}_{n-1}^{+}$ if $n$ is even and $I=\emptyset$ if $n$ is odd. We will now show that $f_{L_n}$ attains $|\mathbb{E}(f_{L_n})|=L_n$ for low $n$.
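
The alternating rule can be sketched as follows. Two things are assumptions here: the base case $f_{L_1}$ is taken to be the zero function on the single term `'1'`, and $\epsilon_{I}$ is read as adding $1$ to the value at $J \cup \{n\}$ for each $J \in I$, with the new singleton sent to $0$. This reading matches the dictionaries `f_l2` to `f_l5` used in the cells of this section; the function names are illustrative.

```python
from itertools import combinations

def nonempty_subsets(n):
    """All nonempty subsets of {1,...,n}, encoded as strings such as '13' (n <= 9)."""
    elems = [str(i) for i in range(1, n + 1)]
    return ["".join(c) for r in range(1, n + 1) for c in combinations(elems, r)]

def extend_f_L(g, n):
    """Extend g (defined on subsets of {1..n-1}) to subsets of {1..n}:
    add 1 on every new non-singleton term if n is even, add 0 if n is odd."""
    shift = 1 if n % 2 == 0 else 0
    new = str(n)
    f = dict(g)
    for term in nonempty_subsets(n):
        if new not in term:
            continue
        if term == new:
            f[term] = 0  # the new singleton is sent to 0
        else:
            f[term] = g[term.replace(new, "")] + shift
    return f

def build_f_L(n):
    f = {"1": 0}  # assumed base case f_{L_1}
    for k in range(2, n + 1):
        f = extend_f_L(f, k)
    return f

print(build_f_L(3))
```

For $n=3$ this produces the same values as `f_l3` in the $L_3$ cell, namely $1$ on `'12'` and `'123'` and $0$ elsewhere.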

### $L_2$

Consider the function $f_{L_2}$.

In [87]:
f_l2={'1': 0, '2': 0, '12': 1}

In [88]:
df_f_l2=pd.DataFrame([f_l2])

We determine the number of extensions of $f_{L_2}$.

In [89]:
%%time
power_set_str=get_powerset_str(2)
df_f_l2_eg=get_Eg_column_Pn(df_f_l2)
df_f_l2_eg=df_f_l2_eg.sort_values('Eg_Size') 

df=df_f_l2_eg
print(f"e_n : d_n")
print(df['Eg_Size'].value_counts()) #count number of 13s and 19s

Term in powerset: 1
Eg size: 3
Eg size: 4
Term in powerset: 2
Eg size: 4
Eg size: 4
Term in powerset: 12
e_n : d_n
5    1
Name: Eg_Size, dtype: int64
CPU times: total: 0 ns
Wall time: 0 ns


### $L_3$

Consider the function $f_{L_3}$.

In [125]:
# f_l3={'1': 0, '2': 0, '12': 0,'3':0,'13':1,'23':1,'123':1}

f_l3={'1': 0, '2': 0, '12': 1,'3':0,'13':0,'23':0,'123':1}

In [126]:
df_f_l3=pd.DataFrame([f_l3])

We determine the number of extensions of $f_{L_3}$.

In [127]:
power_set_str=get_powerset_str(3)
df_f_l3_eg=get_Eg_column_Pn(df_f_l3)
df_f_l3_eg=df_f_l3_eg.sort_values('Eg_Size') # get those 13,then 19
df=df_f_l3_eg
print(f"e_n : d_n")
print(df['Eg_Size'].value_counts()) #count number of 13s and 19s

# e_n : d_n
# 13    1
# Name: Eg_Size, dtype: int64

Term in powerset: 1
Eg size: 8
Eg size: 8
Eg size: 8
Eg size: 9
Eg size: 9
Term in powerset: 2
Eg size: 10
Eg size: 10
Eg size: 10
Eg size: 10
Eg size: 10
Term in powerset: 3
Eg size: 11
Term in powerset: 12
Eg size: 11
Term in powerset: 13
Eg size: 11
Eg size: 11
Eg size: 11
Eg size: 11
Eg size: 11
Eg size: 11
Term in powerset: 23
Eg size: 12
Eg size: 12
Eg size: 12
Eg size: 12
Eg size: 12
Eg size: 12
Term in powerset: 123
Eg size: 12
Eg size: 12
e_n : d_n
13    1
Name: Eg_Size, dtype: int64


### $L_4$

Consider the function $f_{L_4}$.

In [167]:
# f_l3={'1': 0, '2': 0, '12': 1,'3':0,'13':0,'23':0,'123':1}
# f_l4={'1': 0, '2': 0, '12': 1,'3':0,'13':0,'23':0,'123':1,'4': 0,'14': 1,'24': 1,'34': 1,'124': 2,'134': 1,'234': 1,'1234': 2}

# f_l3={'1': 0, '2': 0, '12': 1,'3':0,'13':0,'23':0,'123':1}
f_l4={'1': 0, '2': 0, '12': 1,'3':0,'13':0,'23':0,'123':1,'4': 0,'14': 1,'24': 1,'34': 1,'124': 2,'134': 1,'234': 1,'1234': 2}     

In [168]:
df_f_l4=pd.DataFrame([f_l4])

We determine the number of extensions of $f_{L_4}$.

In [169]:
%%time
# df_f_l4=pd.DataFrame([f_l4])
power_set_str=get_powerset_str(4)
df_f_l4_eg=get_Eg_column_Pn(df_f_l4)
df_f_l4_eg=df_f_l4_eg.sort_values('Eg_Size') # get those 13,then 19
df=df_f_l4_eg

print(f"e_n : d_n")
print(df['Eg_Size'].value_counts()) #count number of 13s and 19s

# e_n : d_n
# 42    1
# Name: Eg_Size, dtype: int64
# CPU times: total: 62.5 ms
# Wall time: 95.5 ms

e_n : d_n
42    1
Name: Eg_Size, dtype: int64
CPU times: total: 31.2 ms
Wall time: 74.7 ms


### $L_{5}$

Consider the function $f_{L_5}$.

In [170]:
# f_l4={'1': 0, '2': 0, '12': 1,'3':0,'13':0,'23':0,'123':1,'4': 0,'14': 1,'24': 1,'34': 1,'124': 2,'134': 1,'234': 1,'1234': 2}  
# f_l5={'1': 0, '2': 0, '3': 0, '4': 0, '5': 0,'12': 1,'13':0,'23':0,'123':1,'14': 1,'24': 1,'34': 1,'124': 2,'134': 1,'234': 1,'1234': 2,'15': 0, '25': 0, '35': 0, '45': 0,'125': 1,'135': 0, '145': 1,'235': 0,'245': 1,'345': 1,'1235': 1,'1245': 2,'1345': 1,'2345': 1, '12345': 2}

# f_l4={'1': 0, '2': 0, '3':0,'12': 0,'13':1,'23':1,'123':1,'4': 0,'14': 0,'24': 0,'34': 0,'124': 0,'134': 1,'234': 1,'1234': 1}     
# f_l5={'1': 0, '2': 0, '3':0,'12': 0,'13':1,'23':1,'123':1,'4': 0,'14': 0,'24': 0,'34': 0,'124': 0,'134': 1,'234': 1,'1234': 1, '5': 0,'15': 1, '25': 1, '35': 1, '45': 1,'125': 1,'135': 2, '145': 1,'235': 2,'245': 1,'345': 1,'1235': 2,'1245': 1,'1345': 2,'2345': 2, '12345': 2}

#empty ext
# f_l4={'1': 0, '2': 0, '12': 1,'3':0,'13':0,'23':0,'123':1,'4': 0,'14': 1,'24': 1,'34': 1,'124': 2,'134': 1,'234': 1,'1234': 2}     
f_l5={'1': 0, '2': 0, '12': 1,'3':0,'13':0,'23':0,'123':1,'4': 0,'14': 1,'24': 1,'34': 1,'124': 2,'134': 1,'234': 1,'1234': 2, '5': 0,'15': 0, '25': 0, '35': 0, '45': 0,'125': 1,'135': 0, '145': 1,'235': 0,'245': 1,'345': 1,'1235': 1,'1245': 2,'1345': 1,'2345': 1, '12345': 2}


In [171]:
df_f_l5=pd.DataFrame([f_l5])

In [172]:
%%time
# df_f_l5=pd.DataFrame([f_l5])
power_set_str=get_powerset_str(5)
df_f_l5_eg=get_Eg_column_Pn(df_f_l5)
df_f_l5_eg=df_f_l5_eg.sort_values('Eg_Size') # get those 13,then 19
df=df_f_l5_eg
print(f"e_n : d_n")
print(df['Eg_Size'].value_counts()) #count number of 13s and 19s


e_n : d_n
169    1
Name: Eg_Size, dtype: int64
CPU times: total: 422 ms
Wall time: 650 ms


## Remark <a name="s43"></a>

As there are many functions $f$ that also give $U_n$ (and similarly $L_{n}$), we wish to determine whether the sets $\mathbb{E}(f)$ are equal for all such $f$. We show this is not the case.

The simplest case to consider is $\tilde{P}_3$. We compute $\mathbb{E}(f)$ for each $f \in \tilde{P}_3$, then extract those $f$ such that $|\mathbb{E}(f)|=U_3$.

In [None]:
#get P3
P2=pd.DataFrame([{'1': 0, '2': 0, '12': 0},{'1': 0, '2': 0, '12': 1}])
n=2
power_set_str,power_set_str_np1=get_n_np1_powersets(n)
P3=get_Pnp1(P2,2)
print(P3)

In [None]:
power_set_str=get_powerset_str(3)

P3_eg=get_Eg_column_Pn(P3)
P3_eg=P3_eg.sort_values('Eg_Size') # get those 13,then 19

df=P3_eg
print(f"e_n : d_n")
print(df['Eg_Size'].value_counts()) #count number of 13s and 19s

There are $4$ which give $U_3$.

In [None]:
y=P3_eg[P3_eg.Eg_Size == 19]
print(y)
print(y.shape)
#Get Eg for the function in P3
#3  0  0  0   0   0   0    1      19

Remove "Eg_Size" column and turn to a list of dictionaries.

In [None]:
y = y.drop(columns='Eg_Size') #returns a new dataframe instead of modifying a slice of P3_eg in place
y

In [None]:
df=y
list_of_dicts = df.to_dict(orient='records')
list_of_dicts

We now determine $\mathbb{E}(g)$ for each of the above functions. Recall get_Eg_for_single_g().

As the order of the output of get_Eg_for_single_g() can vary, we sort everything into a standard format.

In [None]:
# Eg_cases=[]
sorted_Eg_cases = []

for g in list_of_dicts:
    print(g)
    #[{'1': 0, '2': 0, '3': 0, '12': 0, '13': 0, '23': 0, '123': 1},
    EEg_example=get_Eg_for_single_g_dict(g,power_set_str)
 # [frozenset({'12'}), frozenset({'2', '23', '13', '12', '1'}), frozenset({'2', '23', '3', '13', '12', '1', '123'}), 
 #  frozenset({'2', '23', '3', '13', '12', '1'}), frozenset({'23'}), frozenset({'13', '12', '2', '23'}),
 #  frozenset({'1', '13', '12'}), frozenset({'13', '12'}), frozenset({'12', '23'}), frozenset({'2', '23', '3', '13', '12'}), 
 #  frozenset({'13', '23'}), frozenset({'13', '12', '23', '3'}), frozenset({'23', '3', '13', '12', '1'}), frozenset({'1', '13', '12', '23'}), 
 #  frozenset({'13', '12', '23'}), frozenset({'13', '23', '3'}), frozenset({'13'}), frozenset({'12', '2', '23'}), frozenset()] 
    
    e=EEg_example
    sorted_e=[] #elements are
    # [['12'], ['1', '12', '13', '2', '23'], ['1', '12', '123', '13', '2', '23', '3'], ['1', '12', '13', '2', '23', '3'],
    #  ['23'], ['12', '13', '2', '23'], ['1', '12', '13'], ['12', '13'], ['12', '23'], ['12', '13', '2', '23', '3'], ['13', '23'],
    #  ['12', '13', '23', '3'], ['1', '12', '13', '23', '3'], 
    #  ['1', '12', '13', '23'], ['12', '13', '23'], ['13', '23', '3'], ['13'], ['12', '2', '23'], []]
    for fs in e:
        sorted_fs=sorted(list(fs))
        sorted_e.append(sorted_fs)
        
    sorted_Eg_cases.append(sorted_e)
    print(sorted_e,"\n")

An example of a function $g$ and its $\mathbb{E}(g)$, which has $19$ elements:

{'1': 0, '2': 0, '3': 0, '12': 0, '13': 0, '23': 0, '123': 0}
[['12', '123', '13'], ['12', '123', '2', '23'], ['1', '12', '123', '13', '2', '23', '3'], ['12', '123'], ['1', '12', '123', '13', '2', '23'], ['123', '13'], ['12', '123', '13', '23', '3'], ['12', '123', '13', '2', '23', '3'], ['123', '23'], ['123', '13', '23'], ['12', '123', '23'], ['12', '123', '13', '23'], ['1', '12', '123', '13', '23'], ['123', '13', '23', '3'], ['12', '123', '13', '2', '23'], ['1', '12', '123', '13', '23', '3'], ['1', '12', '123', '13'], ['123'], []] 

In [None]:
#Previous code to pick a good candidate that gives L_n

y=P3_eg[P3_eg.Eg_Size == 13]
print(y)
print(y.shape)
#Get Eg for the function in P3
#3  0  0  0   0   0   0    1      19

y = y.drop('Eg_Size', axis=1)  # returns a new frame; avoids SettingWithCopyWarning on the filtered slice
# y
df=y
list_of_dicts = df.to_dict(orient='records')
list_of_dicts

# Eg_cases=[]
sorted_Eg_cases = []

for g in list_of_dicts:
    print(g)
# {'1': 0, '2': 0, '3': 0, '12': 0, '13': 0, '23': 1, '123': 1}
    EEg_example=get_Eg_for_single_g_dict(g,power_set_str) # g is a dict, so use the dict variant as in the cell above

    
    e=EEg_example
    sorted_e=[] #elements are
  # [['12'], ['12', '123', '13', '2', '3'], ['12', '13', '2', '3'], ['13', '3'], ['12', '13', '3'], ['1', '12', '123', '13', '2', '23', '3'], ['12', '13', '2'], ['12', '2'], ['12', '123', '13', '2', '23', '3'], ['12', '13'], ['1', '12', '123', '13', '2', '3'], ['13'], []] 
    for fs in e:
        sorted_fs=sorted(list(fs))
        sorted_e.append(sorted_fs)
        
    sorted_Eg_cases.append(sorted_e)
    print(sorted_e,"\n")

We see that for $f$ = {'1': 0, '2': 0, '3': 0, '12': 0, '13': 1, '23': 1, '123': 1} we have $\mathbb{E}(f)$ =
[['12', '3'], ['12'], ['1', '12', '123', '2', '3'], ['12', '123', '2', '23', '3'], ['1', '12', '123', '13', '2', '23', '3'], ['1', '12', '123', '2', '23', '3'], ['12', '123', '3'], ['12', '123', '2', '3'], ['1', '12', '123', '13', '2', '3'], ['3'], ['1', '12', '123', '13', '3'], ['1', '12', '123', '3'], []] 
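
To confirm that two functions attaining the same size can still have different $\mathbb{E}(f)$, one can compare the collections as sets of sets, so that ordering plays no role. The sketch below copies the list recorded in the comment of the preceding cell and the list printed above; the helper `canonical` is a hypothetical name, not part of the notebook.

```python
def canonical(list_of_lists):
    """Turn a list of lists into a frozenset of frozensets (order-free)."""
    return frozenset(frozenset(s) for s in list_of_lists)

# List recorded in the comment of the preceding code cell.
E_first = [['12'], ['12', '123', '13', '2', '3'], ['12', '13', '2', '3'],
           ['13', '3'], ['12', '13', '3'],
           ['1', '12', '123', '13', '2', '23', '3'], ['12', '13', '2'],
           ['12', '2'], ['12', '123', '13', '2', '23', '3'], ['12', '13'],
           ['1', '12', '123', '13', '2', '3'], ['13'], []]

# List printed above for f with f(13) = f(23) = f(123) = 1.
E_second = [['12', '3'], ['12'], ['1', '12', '123', '2', '3'],
            ['12', '123', '2', '23', '3'],
            ['1', '12', '123', '13', '2', '23', '3'],
            ['1', '12', '123', '2', '23', '3'], ['12', '123', '3'],
            ['12', '123', '2', '3'], ['1', '12', '123', '13', '2', '3'],
            ['3'], ['1', '12', '123', '13', '3'], ['1', '12', '123', '3'], []]

A, B = canonical(E_first), canonical(E_second)
print(len(A), len(B), A == B)  # 13 13 False
```

Both collections have $13$ elements, yet they differ: for instance $\{13\}$ belongs to the first but not to the second.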

In [None]:
P3_eg=get_Eg_column_Pn(P3)
P3_eg=P3_eg.sort_values('Eg_Size') # get those 13,then 19

df=P3_eg
print(f"e_n : d_n")
print(df['Eg_Size'].value_counts()) #count number of 13s and 19s

For example, we can extract exactly those functions which have $|\mathbb{E}(g)|=13$.

In [None]:
y=P3_eg[P3_eg.Eg_Size == 13]
print(y)
print(y.shape)

Similar to section "4) Example $\mathbb{E}(g)$ for $g \in  \tilde{P}_{3}$", we can obtain $\mathbb{E}(g)$ for a fixed function.

In [None]:
#Get Eg for the function in P3
#3  0  0  0   0   0   0    1      19
numb =3
df=P3
g=df.iloc[numb].to_dict()

g={'1': 0, '2': 0, '3': 0, '12': 0, '13': 0, '23': 0, '123': 1}

print(g,"\n")
EEg_example=get_Eg_for_single_g(df,numb,power_set_str)
EEg_example

In [None]:
# def get_Eg_for_single_g(df,numb,power_set_str):

#     #Obj: puts Eg into correct format to check if topology
#     #Inputs: df=Pn, numb used to get a specific function from df
    
#     # Pick function in Pn
#     g=df.iloc[numb].to_dict()

#     #get primitive closed sets
#     PC=[(x,get_barj({x},g)) for x in power_set_str]
#     dict_PC=dict(PC)
#     set_PC=[sorted_frozenset(get_barj({x},g)) for x in power_set_str]

#     #Get all closed sets for extensions for g
#     Eg=list(get_Eg(dict_PC,power_set_str,set_PC))
#     Eg.append(frozenset()) #remember the empty set
#     return Eg

In [None]:
# def get_Pnp1_for_single_funct(Pn,n):
    
#     #inputs
#     # Pn:dataframe and n.
    
#     """
#     Returns a dataframe of extensions in Pn+1 for dataframe of functions in Pn.
#     """
#     """
#     #Example
#     n=3
#     power_set_str,power_set_str_np1=get_n_np1_powersets(n)
#     # P3

#     funct_list=[{'1': 0,'2': 0,"3":0,"12":0,"23":0,"13":0,"123":0}]
#     Pn=pd.DataFrame(funct_list)

#     #The following should be functions in P4 that are extensions of functions in funct_list
#     f_P3=get_Pnp1_for_single_funct(Pn,n)
#     print(f_P3)
#     """

#     #New data created
#     power_set_str,power_set_str_np1=get_n_np1_powersets(n)

#     #Load Pn as a list of dictionaries.
#     list_df_Pn=Pn.to_dict(orient='records')

#     #Create Pn+1 and hold as dataframe.
#     P_n1=eg_column_Build_Pn_1(list_df_Pn,power_set_str,n)
#     #As previous was done in E(g) parts, joint together
#     Pnp1=extend_sets_to_Pnp1(P_n1)
        
#     Pnp1=get_Eg_column_Pn(Pnp1)
    
#     columns=list(power_set_str_np1)+["Eg_Size"]
#     Pnp1 = Pnp1.reindex(columns=columns)
    
#     # inspect_Pn1(Pnp1)

#     return Pnp1

In [None]:
# def check_all_funct_topspace(df,power_set_str):
#     length=[]
#     for i,g in enumerate(df.to_dict("records")): #list of dictionaries of functions
#         check_top=check_top_for_single_g(df,i,power_set_str)
#         if check_top == False:
#             print(f"Function {i} does NOT give a topological space: {check_top}")
#         else:
#             length.append(check_top)
            
#     if len(df.to_dict("records"))==len(length):
#         print("Every function produces a topological space" )
#     else:
#         print("something is not a topological space")

In [None]:
# def is_finite_topological_space(X, T):
    
#     """
#         # Does this account for unions? 

#     # X = {'a', 'b', 'c'}
#     # T = [X, set(), {'a'}, { 'c'}, {'b'}]
#     # print(is_finite_topological_space(X, T)) # prints False

#     # This condition is not met because the union of {'a'} and {'c'} is {'a','c'} which is not in T.
#     """
    
#     if not all(isinstance(x, set) for x in T):
#         # T must be a collection of sets
#         print("here0")  # x is not defined outside the generator, so print a marker instead
#         return False
#     if not all(x.issubset(X) for x in T):
#         # Each element of T must be a subset of X
#         print("here1")

#         return False
#     if not all(x.intersection(y) in T for x in T for y in T):
#         print("here2")

#         # The intersection of any two elements of T must be in T
#         return False
#     if not all(x.union(y) in T for x in T for y in T):
#         # The union of any two elements of T must be in T
#         print("here3")
#         return False
    
#     if not X in T:
#         # X must be in T
#         print("here4")
#         return False
#     if not set() in T:
#         print("here5")

#         # The empty set must be in T
#         return False
#     return True

# def check_top_for_single_g(df,numb,power_set_str):
#     #Inputs:
#     # df=Pn
#     # numb=choice of function
#     # power_set_str for df
    
#     # Output:True or False
    
#     Eg=get_Eg_for_single_g(df,numb,power_set_str)
#     Top=list(set(x) for x in Eg)
#     Top.append(set())
    
#     X_Eg=set(power_set_str)
    
#     return is_finite_topological_space(X_Eg, Top)


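# def check_df_duplicates(df1,df2,How,value):
#     """
#     Objective: find the rows that df1 and df2 have in common.
#     Input: df1, df2 (dataframes); How ("left" or "right"); value (True or False).
#     Returns: dataframe of the merged rows that occur in both df1 and df2.
#     """
#     # How is either "right" or "left";
#     # "left" keeps the rows of df1 and marks True or False if each is also in df2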
    
#     #     example
#     #     df1 = pd.DataFrame({'team' : ['A', 'B', 'C', 'D', 'E'], 
#     #                     'points' : [12, 15, 22, 29, 24]}) 
#     #  #create second DataFrame
#     # df2 = pd.DataFrame({'team' : ['A', 'D', 'F', 'G', 'H'],
#     #                     'points' : [12, 29, 15, 19, 10]})
    
#     #merge two dataFrames and add indicator column
#     all_df = pd.merge(df1, df2, how=How, indicator='exists')

#     #add column to show if each row in first DataFrame exists in second
#     all_df['exists'] = np.where(all_df.exists == 'both', True, False)

#     #view updated DataFrame
#     # print (all_df)

#     m=all_df.loc[all_df['exists'] == True]
    
#     #Number of terms common
#     num_items = m.loc[m['exists'] == value].shape[0]
#     print(f"Number of those in both: {num_items} \n")

#     return m    


## Examples for understanding code

Are primitive closed sets a basis for the topology? (Maybe not relevant)

Asides:
- Is $\mathbb{E}(g)$ a matroid? No: it does not satisfy downward closure. In our specific example, with $f(12)=1$, $f(123)=1$ and $f=0$ otherwise, the closure of $1$ is $\{1,13\}$; its subset $\{1\}$ is not a closed set.
- Are the primitive closed sets a basis for the topology? Is there a way to construct $\mathbb{E}(g)$ from the primitive closed sets (using a minimal basis)?
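
The failure of downward closure can be tested mechanically. The following is a minimal sketch on a toy family: only the facts that $\{1,13\}$ is closed and $\{1\}$ is not are taken from the example above, and the function name `is_downward_closed` is hypothetical.

```python
from itertools import combinations

def is_downward_closed(family):
    """Check that every subset of every member is also a member."""
    members = {frozenset(s) for s in family}
    for s in members:
        for r in range(len(s)):              # proper subsets only
            for sub in combinations(s, r):
                if frozenset(sub) not in members:
                    return False
    return True

# Toy family: contains the closed set {1, 13} but not {1}.
closed_sets = [set(), {'13'}, {'1', '13'}]
print(is_downward_closed(closed_sets))                           # False
print(is_downward_closed([set(), {'1'}, {'13'}, {'1', '13'}]))   # True
```

A family that fails this test cannot be the collection of independent sets of a matroid.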

In [None]:
# dict_PC,set_PC=get_primitive_for_single_g(df,numb)

# # print(dict_PC) #Remembers what the terms of powerset is
# set_PC.append(frozenset()) # add empty set: this is the set of primitive closed sets
# # for i in set_PC:
# #     print(set(i))
    
# X_Eg= {'13', '23', '123', '12', '2', '3', '1'} #total space ie powerset
# Prim_Top= list(set(x) for x in set_PC) # primitive set as a topology? is it a basis?

# def is_basis(X, T, B):
#     if not all(isinstance(x, set) for x in B):
#         # B must be a collection of sets
#         return False
#     if not all(x.issubset(X) for x in B):
#         # Each element of B must be a subset of X
#         return False
    
    
#     if not all(x.intersection(y) == set() for x in B for y in B if x != y):
#         # The intersection of any two distinct elements of B must be empty
#         return False
    
#     if not all(x.union(y) in T for x in B for y in B):
#         # The union of any two elements of T must be in T
#         return False
    
#     if not all(x.issubset(y) or y.issubset(x) for x in B for y in T):
#         # Each element of B must be a subset of some element of T or conversely
#         return False
#     return True

# # Example usage
# X = {'a', 'b', 'c'}
# T = [X, set(), {'a', 'b'}, {'b', 'c'}, {'b'}]
# B = [{'a', 'b'}, {'b', 'c'}]
# print(is_basis(X, T, B)) # prints False: the two elements of B intersect in {'b'}


In [None]:
#Get all functions in P3 stored as a data frame

#powerset on 3 elements, as strings (must be defined before it is used below)
power3=get_powerset_str(3)

#Construct P3: each entry lists the sets on which the function takes the value 1
pre_functs=[["12","123"],["13","123"],["23","123"],["12","13","123"],["12","23","123"],["13","23","123"],["12","13","23","123"],[],["123"]]
ds=[]
for vals in pre_functs:
    builder1=[(x,1) for x in vals]
    builder0=[(x,0) for x in power3 if x not in vals]
    f_dict={**dict(builder1),**dict(builder0)}
    ds.append(f_dict)
big={'123': 2, '1': 0, '2': 0, '3': 0, '12': 1, '13': 1, '23': 1} #remaining case, with 2 value
dp=ds+[big,]

#Build dataframe
P3 = pd.DataFrame(dp)

#order columns of P3
column_order=power3 #as strings
P3 = P3.reindex(columns=column_order)

print(P3)

In [None]:
## Checking if the calculations for P5, are the same as our old calculations for P5 in xlsx.

# #Previous data P5 in correct format
# dataframe2 = pd.read_excel('P4_P5_xlsx\P_5.xlsx')

# #Make sure columns are in the same order as P5

# #Reordering columns
# column_order=[int(x) for x in power_set_str_np1]
# dataframe2 = dataframe2.reindex(columns=column_order)

# #Making sure columns are the same, in string format
# dataframe2=string_cols(dataframe2)

# # print(dataframe2.head)
# print(dataframe2.shape) #(10334, 31)

# # n=4
# # power_set_str,alt_power_set_str_np1=get_n_np1_powersets(n)
# # check_functs_MSA(dataframe2,power_set_str_np1).shape #(0,0) i.e all are msa

# Comparison of P5 and dataframe2. We see they are the same.

# compar=check_df_duplicates(P5,dataframe2,"left",True) #True:10334 , False:0

In [None]:
# # The calculations for P4, are the same as our old calculations for P4 in xlsx.

# dataframe1 = pd.read_excel('P4_P5_xlsx\P_4.xlsx')
# #Making sure columns are the same, in string format and order as P4 above
# dataframe1=string_cols(dataframe1)

# # #Comparison of P4 and xlsx version
# compar=check_df_duplicates(P4,dataframe1,"left",True)

In [None]:
# n=3
# # Create a set with 3 elements
# # s = {1, 2, 3}
# s=set(range(1,n+1))

# # Generate all possible subsets of the set
# power_set = [set(x)  for r in range(len(s) + 1) for x in itertools.combinations(s, r)]

# # Print the power set
# print(power_set)

# power_set_str=[set_to_string(x) for x in power_set]
# power_set_str=[set_to_string(x) for x in power_set if len(x)>0]
# power_set_str

# #Get a function in P3

# val1_loc=["12","123"] #where it take 1
# val0_loc= [x for x in power_set_str if x not in val1_loc] #where it takes 0
# # print(val0_loc)

# builder1=[(x,1) for x in val1_loc]
# builder0=[(x,0) for x in val0_loc]
# builder0
# dict(builder0)
# dict(builder1)
# f_dict={**dict(builder1),**dict(builder0)}
# f_dict

# A,B="1","123"
# print(f"Is {A} subset of {B}, f minimal? : {fmin(A,B,f_dict)}")
# print(f"Is {A} subset of {B}, f maximal? : {fmax(A,B,f_dict)}")

#Testing for obtaining primitive sets:
# #Example
# for x in power_set_str:
#     print(x,get_barj({x},f_dict))

#As all the closed sets, checked by hand:
# 3 T term {'13', '123', '1', '3', '23', '2'}
# 1 {'1', '13'}
# 2 {'23', '2'}
# 12 {'1', '12', '123', '13', '2', '23'} 
# 13 {'13'} # should be just 13
# 23 {'23'} # should be 23
# 123 {'13', '123', '1', '23', '2'} # should not have 3 or 12

# Store primitive closed sets 

# PC=[(x,get_barj({x},f_dict)) for x in power_set_str]
# dict_PC=dict(PC)
# set_PC=[sorted_frozenset(get_barj({x},f_dict)) for x in power_set_str]
# # set(set_PC)
# Eg={sorted_frozenset(power_set_str)}.union(set_PC)

# Using main functions:
# Eg=list(get_Eg(dict_PC,power_set_str,set_PC))
# # print(len(x)) # I get like 13 so this is an issue.
# Eg
# Eg.append("empty")
# Eg


#Rows are not in the same order:
# # print(P4.equals(dataframe1))

#Checking if duplicates, #https://stackoverflow.com/questions/48647534/find-difference-between-two-data-frames?noredirect=1&lq=1
# df = P4.merge(dataframe1, how = 'inner' ,indicator=False)
""" 
# first dataframe
df1 = pd.DataFrame({
    'Age': ['20', '14', '56', '28', '10'],
    'Weight': [59, 29, 73, 56, 48]})
display(df1)

# second dataframe
df2 = pd.DataFrame({
    'Age': ['16', '20', '24', '40', '22'],
    'Weight': [55, 59, 73, 85, 56]})
display(df2)

#Common rows:
df = df1.merge(df2, how = 'inner' ,indicator=False)
print(df)
"""

"The above method only works for those data frames that don't already have duplicates themselves. For example:"
no_dups=pd.concat([P4,dataframe1]).drop_duplicates(keep=False)
no_dups.head()
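
An order-insensitive comparison that also tolerates duplicate rows can be sketched without pandas by treating each table as a multiset of rows. This is an illustrative alternative, not the notebook's method; `row_key` and `compare_rows` are hypothetical names, and the sample rows are the two functions of `P2` used earlier.

```python
from collections import Counter

def row_key(row):
    """Hashable key for a dict row, independent of column order."""
    return tuple(sorted(row.items()))

def compare_rows(rows1, rows2):
    """Return (common, only_in_1, only_in_2) as Counters of row keys."""
    c1 = Counter(row_key(r) for r in rows1)
    c2 = Counter(row_key(r) for r in rows2)
    return c1 & c2, c1 - c2, c2 - c1

# Same rows, listed in a different order and with columns permuted.
rows_a = [{'1': 0, '2': 0, '12': 0}, {'1': 0, '2': 0, '12': 1}]
rows_b = [{'12': 1, '1': 0, '2': 0}, {'1': 0, '2': 0, '12': 0}]

common, only_a, only_b = compare_rows(rows_a, rows_b)
print(sum(common.values()), sum(only_a.values()), sum(only_b.values()))  # 2 0 0
```

For data frames, the rows could be obtained with `df.to_dict(orient='records')`, as done elsewhere in this notebook.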


## E(g) is a topological space <a name="s4"></a>

### Example: f in P3 forms a topological space <a name="s41"></a>

The set $\mathbb{E}(g)$ is the topology associated to the function $g$ and can be used to construct the extensions of $g$.

We now have:
- A function/method to get $\mathbb{E}(g)$.
- A function to check if $\mathbb{E}(g)$ is a finite topology.
- A check of whether all functions in P3, P4, P5 generate a topology. Answer: yes for P3 and P4; yes for the examples tested in P5.
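
For reference, the topology check can be written compactly with frozensets. This is a minimal sketch, not the notebook's `is_finite_topological_space`; the name `is_closed_set_topology` is hypothetical, and the first test family is the counterexample quoted in that function's docstring.

```python
def is_closed_set_topology(X, closed_sets):
    """Check that the family contains the empty set and X, consists of
    subsets of X, and is closed under pairwise unions and intersections
    (which suffices for a finite family)."""
    X = frozenset(X)
    family = {frozenset(s) for s in closed_sets}
    if frozenset() not in family or X not in family:
        return False
    if not all(s <= X for s in family):
        return False
    return all((a | b) in family and (a & b) in family
               for a in family for b in family)

X = {'a', 'b', 'c'}
bad = [X, set(), {'a'}, {'c'}, {'b'}]   # {'a'} | {'c'} is missing
good = [X, set(), {'a'}, {'a', 'b'}]    # a chain of sets
print(is_closed_set_topology(X, bad))   # False
print(is_closed_set_topology(X, good))  # True
```

Using a set of frozensets makes each membership test a hash lookup, which matters once the family has more than a thousand members, as for the P5 example.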

What we could do next:
- A way to determine the size of $\mathbb{E}(g)$ (using basis).
- Are all Top(g) for $g \in P_n$ homeomorphic? (The topologies have different sizes, so not in general.)
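
One possible route to the size of $\mathbb{E}(g)$ is to generate all unions of the primitive closed sets and count them. The sketch below assumes that every closed set is a union of primitive closed sets, as is the case when these are the closures of single points in a finite topological space; that assumption is not verified here for $\mathbb{E}(g)$. The name `union_closure` is hypothetical, and the three generators are the hand-checked closures of $1$, $13$ and $2$ listed in the scratch cell above.

```python
def union_closure(generators):
    """All finite unions of the generators, including the empty union."""
    closed = {frozenset()}
    for g in generators:
        g = frozenset(g)
        # Add the union of g with every set found so far.
        closed |= {c | g for c in closed}
    return closed

# Closures of '1', '13' and '2' for the example f (f(12) = f(123) = 1).
generators = [{'1', '13'}, {'13'}, {'2', '23'}]
family = union_closure(generators)
print(len(family))  # 6
```

The loop is exponential in the worst case, so this is only practical when the number of distinct unions stays small.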

### Specific Example from P3 is topological space

In [None]:
power_set_str=get_powerset_str(3)

df=P3
numb=4
#Get data
Eg=get_Eg_for_single_g(df,numb,power_set_str)
# print(power_set_str)
print(P3.iloc[numb].to_dict(),"\n")

for term in Eg:
    print(set(term))
    
print("\n  All closed sets")


Check whether Eg is a finite topological space on the powerset, for an example g in P3.

In [None]:
X_Eg= {'13', '23', '123', '12', '2', '3', '1'} #total space ie powerset
Top=list(set(x) for x in Eg) #topology

print(is_finite_topological_space(X_Eg, Top))

### Consider Top(g) for two different g,g' in P3.

In [None]:
#Setup data
power_set_str=get_powerset_str(3)
df=P3
X_Eg= set(power_set_str) #total space ie powerset

#Inputs
numb1=0
numb2=1

#Top space 1
Eg1=get_Eg_for_single_g(df,numb1,power_set_str) # use numb1 defined above, not the stale numb
Top1=list(set(x) for x in Eg1) #topology
print(is_finite_topological_space(X_Eg, Top1))
#------------------------------
#Top space 2
Eg2=get_Eg_for_single_g(df,numb2,power_set_str)
Top2=list(set(x) for x in Eg2) #topology
print(is_finite_topological_space(X_Eg, Top2))
#------------------------------



### Check if all functions in P3,P4,P5 generate a topological space: Yes <a name="s42"></a>

### P3

In [None]:
#inputs
df=P3
power_set_str=get_powerset_str(3)
check_all_funct_topspace(df,power_set_str)

### P4

In [None]:
#inputs
df=P4
power_set_str=get_powerset_str(4)
check_all_funct_topspace(df,power_set_str) #False

### P5 (where Eg for a function allow for calc of part of P6)

In [None]:
#Takes a long time to run

#inputs
df=P5
power_set_str=get_powerset_str(5)
check_all_funct_topspace(df,power_set_str)

In [None]:
#Works for some chosen example.

df=P5
power_set_str=get_powerset_str(5)
numb=0 # function indexed at 0

Eg=get_Eg_for_single_g(df,numb,power_set_str) # set of closed sets
X_Eg= set(power_set_str) #total space ie powerset
Top=list(set(x) for x in Eg) #topology from closed sets Eg #Size: 1178

print(len(Top))

#Check whether topological space:
print(is_finite_topological_space(X_Eg, Top)) #The total powerset on 4 elements is not included.

#Result:
#numb=0
#1178 #Top size
#True