## FiM anon

We implement single FiM and apply it to a fixed share of each node's connections.

We take a subset of 50 nodes from the graph to be the FiM nodes. These are chosen uniformly at random from the original node set. We apply FiM to at least 20% of each node's connections, and those connections are again chosen uniformly at random.

The paper does not detail an implementation. We implement the method by iterating through the nodes of the graph and, for each neighbor of a node, adding the original edge with probability 0.8 or, with probability 0.2, two edges that connect each of the original endpoints to one of the FiM nodes. To prevent a connection from being routed through a FiM node from one endpoint but not from the other, whenever we add a FiM node to a connection we also add that connection to a list of non-edges, which are removed from the graph in a final step. We create a fixed list of FiM nodes and cycle through it, advancing one position at each edge addition, so that ideally a node does not reuse a FiM node. The FiM nodes are chosen uniformly at random from the node set of the original graph. Using relatively many FiM nodes (compared to the average degree) makes it unlikely that a FiM node is chosen as the FiM in one of its own connections.
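Because the loop visits every connection from both of its endpoints, the chance that a given connection ends up replaced is higher than the 0.2 used per visit. A short calculation, assuming the two visits are decided independently (as they are when each visit draws its own random number):

$$
P(\text{connection replaced}) = 1 - (1 - 0.2)^2 = 1 - 0.64 = 0.36
$$

Under that assumption, roughly 36% of each node's connections are routed through a FiM node in expectation, which is consistent with the "at least 20%" target.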

The time complexity of this algorithm is O(n*h), where h is the average degree of the network.
We use 50 as the fixed number of FiM nodes. We use single, per-node FiM: single FiM was the easiest to implement, and more hops did not seem to give much additional privacy (judging from the paper's results). Per-node FiM gave notably better results than per-network FiM.
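The O(n*h) bound follows from the double loop over nodes and their neighbors: the loop body runs once per (node, neighbor) pair, which is n*h times in total, or twice the number of edges. A minimal sketch that counts these visits on a plain adjacency dictionary (the helper name `count_neighbor_visits` and the ring example are illustrative and not part of the original notebook):

```python
def count_neighbor_visits(adjacency):
    """Count how often the body of a node/neighbor double loop runs."""
    visits = 0
    for node, neighbors in adjacency.items():
        for _ in neighbors:
            visits += 1
    return visits


# Illustrative input: a ring of 6 nodes, where every node has degree 2.
n = 6
ring = {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}
avg_degree = sum(len(neighbors) for neighbors in ring.values()) / n
num_edges = n  # a ring on n nodes has n edges

visits = count_neighbor_visits(ring)
```

Here `visits` equals both `n * avg_degree` and `2 * num_edges`, so the work grows linearly in the number of edges.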

My implementation has the problem that, for some connections, both the original and the FiM edges are in the network.
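One way to get a rough picture of such overlaps is to compare the edge set of the original graph with that of the anonymized graph. The sketch below is a hypothetical diagnostic, not part of the original notebook: the names `normalize` and `compare_edge_sets` are made up for illustration, and it works on plain iterables of 2-tuples so it does not depend on any graph library. It reports totals only and does not identify which FiM edges belong to which connection.

```python
def normalize(edges):
    """Map undirected edges to order-independent keys.

    A self-loop (u, u) becomes a one-element frozenset.
    """
    return {frozenset(edge) for edge in edges}


def compare_edge_sets(original_edges, anonymized_edges):
    """Summarize how the anonymized edge set differs from the original one."""
    original = normalize(original_edges)
    anonymized = normalize(anonymized_edges)
    return {
        "kept": len(original & anonymized),      # original edges still present
        "removed": len(original - anonymized),   # original edges that vanished
        "added": len(anonymized - original),     # edges that are new
        "self_loops": sum(1 for edge in anonymized if len(edge) == 1),
    }


# Tiny hand-made example: edge (1, 2) is routed through node 4,
# edge (2, 3) is dropped, and a stray self-loop on node 4 is present.
report = compare_edge_sets(
    [(0, 1), (1, 2), (2, 3)],
    [(1, 0), (1, 4), (4, 2), (4, 4)],
)
```

With networkx, the two arguments could be the edge views of a graph and of its anonymized counterpart, since `G.edges()` yields 2-tuples.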

In [8]:
import networkx as nx
import random
import time

In [40]:
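def fim_anon(G: nx.Graph) -> nx.Graph:
    """Return an anonymized copy of G using single, per-node FiM."""
    # start_time = time.time()
    num_fim = 50  # fixed number of FiM nodes
    # FiM nodes are sampled uniformly at random from the original node set.
    fim_nodes = random.sample(list(G.nodes()), k=num_fim)
    # Notes: can construct it from largest to smallest degree node, i.e.
    # can iterate through the FiM nodes in order, so that we don't get duplicates.
    # We need a percentage of max-degree FiM nodes.
    # For each node's neighbors we add either that edge or a FiM edge pair:
    # O(n*h), where h is the average degree (O(n^2) in the worst case).
    edge_list = []
    non_edge_list = []  # original edges replaced by a FiM pair, removed at the end
    fim_index = 0
    for n in G.nodes():
        for v in G.neighbors(n):
            if random.random() <= 0.2:
                # Route the connection n-v through the current FiM node.
                # Nothing here prevents the FiM node from being n or v itself.
                fim = fim_nodes[fim_index]
                edge_list.append((n, fim))
                edge_list.append((fim, v))
                non_edge_list.append((n, v))
            else:
                edge_list.append((n, v))
            # Advance to the next FiM node after every visited neighbor.
            fim_index = (fim_index + 1) % num_fim
    G_anon = nx.Graph()
    G_anon.add_nodes_from(G.nodes())
    G_anon.add_edges_from(edge_list)
    # Drop replaced edges last, so that a connection replaced from one endpoint
    # is not kept as a direct edge added from the other endpoint.
    G_anon.remove_edges_from(non_edge_list)
    # print(time.time() - start_time)
    return G_anon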

## can maybe be made more efficient by going through the edges of the graph instead
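The edge-based idea from the comment above could look roughly like the following. This is a sketch under stated assumptions, not the notebook's method: `fim_anon_by_edges`, `num_fim`, `p_fim` and `seed` are hypothetical names, the input is assumed to list each undirected edge once and to contain no self-loops, and the function works on plain node and edge iterables. Because each connection is decided exactly once, `p_fim` is the per-connection replacement probability here, which differs from the node-based loop where a connection is visited from both endpoints. The sketch also skips FiM candidates that coincide with an endpoint, which the node-based version does not do.

```python
import random


def fim_anon_by_edges(nodes, edges, num_fim=50, p_fim=0.2, seed=None):
    """Edge-based single FiM sketch: each undirected edge is decided once."""
    rng = random.Random(seed)
    nodes = list(nodes)
    fim_nodes = rng.sample(nodes, k=min(num_fim, len(nodes)))
    if not fim_nodes:
        return [tuple(edge) for edge in edges]
    added = set()     # edges that go into the anonymized graph
    replaced = set()  # original edges routed through a FiM node
    cursor = 0
    for u, v in edges:
        fim = None
        if rng.random() < p_fim:
            # Try each FiM node at most once, starting at the cursor,
            # and skip candidates that are an endpoint of this edge.
            for offset in range(len(fim_nodes)):
                candidate = fim_nodes[(cursor + offset) % len(fim_nodes)]
                if candidate != u and candidate != v:
                    fim = candidate
                    break
        if fim is None:
            added.add(frozenset((u, v)))
        else:
            added.add(frozenset((u, fim)))
            added.add(frozenset((fim, v)))
            replaced.add(frozenset((u, v)))
        cursor = (cursor + 1) % len(fim_nodes)
    # As in the node-based version, replaced edges are removed last.
    return [tuple(edge) for edge in added - replaced]


# Illustrative input: a ring of 100 nodes.
ring_edges = [(i, (i + 1) % 100) for i in range(100)]
anon_edges = fim_anon_by_edges(range(100), ring_edges, num_fim=10, seed=1)
```

With networkx, it could be called as `fim_anon_by_edges(G.nodes(), G.edges())`, and the returned list passed to `add_edges_from` on a new `nx.Graph()`. The loop runs once per edge instead of once per (node, neighbor) pair, so it does about half the iterations of the node-based version.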
                

In [38]:
BAG = nx.barabasi_albert_graph(1000, 10)

In [39]:
BAGanon = fim_anon(BAG)
# Runtime scales almost exactly as O(n*h): n=1k, h=20: 0.02 s; n=1k, h=50: 0.05 s; n=10k, h=50: 0.65 s; n=10k, h=100: 1.4 s

0.011672019958496094
