In [37]:
import random
random.seed(42)

# Lit 🔥 Review Visualiser

### team notes
Below is a basic outline of our data structure, a directed graph

It's heavily commented (some of which won't make it to submission) to keep it as clear as possible

Shout if anything is confusing and we'll figure it out, or if I've misinterpreted any of the decisions we made the first time we talked about it 

## Data Structure

### note on classes
this is just for clarity, won't make the final report

We _could_ represent the graph without declaring a new class, but this will keep things cleaner for us in the long run

The alternative would probably be to create a dictionary of lists, which would look a little like this

In [38]:
graph = {
    'adjacency_list': [[], [], []], # which nodes each node connects to; every node has its own list
    'nodes': [1, 69, 420] # just a list of all the nodes
}

We would then have to define functions to add/get nodes and edges anyway, for example:

In [39]:
def add_node(graph, x):
    graph['adjacency_list'].append([])
    graph['nodes'].append(x)

And then to _get_ a node, we would pass the index, much like the method in the class

In [40]:
def get_node(graph, index):
    return graph['nodes'][index]
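Edges would need the same treatment. A rough sketch of what that could look like for the dictionary version is below; `add_edge` and `get_neighbours` are made-up names for illustration, not something we've agreed on, and the adjacency lists are assumed to hold node indices.

```python
# Sketch only: add_edge and get_neighbours are hypothetical helper names.
graph = {
    'adjacency_list': [[], [], []],  # one list of neighbour indices per node
    'nodes': [1, 69, 420],           # node values, in index order
}

def add_edge(graph, source_index, target_index):
    """Add a directed edge from the node at source_index to the node at target_index."""
    graph['adjacency_list'][source_index].append(target_index)

def get_neighbours(graph, index):
    """Return the values of the nodes that the node at `index` points to."""
    return [graph['nodes'][i] for i in graph['adjacency_list'][index]]

add_edge(graph, 0, 1)
add_edge(graph, 0, 2)
print(get_neighbours(graph, 0))  # [69, 420]
```

Every helper has to take `graph` as its first argument, which is the repetition the class version avoids.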

When you see *self* as one of the parameters in the class methods below, that just means the method acts on the object it was called on, which saves us having to pass the `graph` variable to every function we call
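To make the *self* point concrete, here is a minimal sketch of the same two helpers written as methods. The `Graph` class here is a hypothetical stand-in for illustration only, not our actual class.

```python
# Hypothetical minimal class, for illustration only.
class Graph:
    def __init__(self):
        self.nodes = []           # list of all the node values
        self.adjacency_list = []  # one list of neighbours per node

    def add_node(self, x):
        # `self` is the graph the method was called on
        self.adjacency_list.append([])
        self.nodes.append(x)

    def get_node(self, index):
        return self.nodes[index]

g = Graph()
g.add_node(1)
g.add_node(69)

print(g.get_node(1))         # 69
# Python fills in `self` for us; the call above is the same as:
print(Graph.get_node(g, 1))  # 69
```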

# THE PAPER IS THE GRAPH

What does that mean? Welllll, a Paper is an inherently recursive object in the real world: it has references, which are papers, which have references, which are papers...

So the object that we create to represent a Paper needs to capture this property. However, we need to add a base case (a depth at which we no longer care about the references)

The real world has no recursion depth limit, but Python does
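A quick way to see that limit in action is sketched below. `count_down` is a throwaway function made up for this demo, and the exact limit depends on the environment, so no number is assumed here.

```python
import sys

def count_down(n):
    """Recurse n levels deep, one call per level."""
    if n == 0:
        return 0
    return 1 + count_down(n - 1)

# The interpreter's current limit on nested calls
limit = sys.getrecursionlimit()
print(limit)

# Going deeper than the limit raises RecursionError instead of running forever
try:
    count_down(limit + 100)
    hit_limit = False
except RecursionError:
    hit_limit = True

print(hit_limit)  # True
```

This is why the constructor needs a cut-off depth: without one, each Paper would keep creating more Papers until Python gives up.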

So let's try again and create a different Paper class

In [49]:
class Paper:

    def __init__(self, name = 0, distance = 0):
        """this constructor is recursive, each Paper creates
        a list of Papers that it references"""
        self.__name = name
        self.__distance = distance
        self.__references = self.create_references()

    def create_references(self):
        # return an empty list if we have exceeded relevant depth
        if self.get_distance() > 3:
            return []
        # otherwise, return a list of Paper objects
        return [Paper(random.randint(self.get_name() - 5, self.get_name() - 1), self.get_distance() + 1) for _ in range(5)]

    def get_name(self):
        return self.__name
    
    def get_distance(self):
        return self.__distance

    def find_distance(self, name=0, distance=0, min_distance=float('inf')):
        # note: `distance` is not actually used, each Paper already stores its own distance
        # if the target paper == this paper, record its distance
        if name == self.get_name():
            min_distance = min(min_distance, self.get_distance())

        # then check all of this Paper's references as well
        for paper in self.get_references():

            # call find_distance for each paper and pass on the minimum distance so far
            dist = paper.find_distance(name, distance + 1, min_distance)

            # dist is inf when the target wasn't found, so min() is always safe here
            min_distance = min(min_distance, dist)

        # return min_distance after ALL Papers have been checked
        return min_distance

    def get_references(self):
        return self.__references

    def __repr__(self):
        return f"{self.get_name()}: {self.get_references()}"

In [55]:
paper = Paper(2024)

In [57]:
paper.get_references()

[2022: [2019: [2015: [2013: [], 2014: [], 2012: [], 2010: [], 2012: []], 2016: [2015: [], 2012: [], 2012: [], 2012: [], 2014: []], 2015: [2013: [], 2014: [], 2012: [], 2013: [], 2014: []], 2017: [2016: [], 2013: [], 2013: [], 2016: [], 2012: []], 2018: [2016: [], 2017: [], 2015: [], 2013: [], 2017: []]], 2017: [2012: [2011: [], 2011: [], 2008: [], 2011: [], 2011: []], 2013: [2009: [], 2010: [], 2012: [], 2011: [], 2008: []], 2013: [2012: [], 2011: [], 2008: [], 2012: [], 2011: []], 2012: [2010: [], 2008: [], 2011: [], 2010: [], 2010: []], 2016: [2011: [], 2015: [], 2014: [], 2013: [], 2011: []]], 2020: [2017: [2012: [], 2013: [], 2016: [], 2012: [], 2012: []], 2018: [2015: [], 2013: [], 2017: [], 2013: [], 2013: []], 2018: [2013: [], 2015: [], 2016: [], 2014: [], 2014: []], 2018: [2015: [], 2016: [], 2016: [], 2016: [], 2016: []], 2015: [2014: [], 2011: [], 2012: [], 2010: [], 2011: []]], 2021: [2019: [2018: [], 2015: [], 2015: [], 2014: [], 2014: []], 2018: [2016: [], 2017: [], 2016: 

`find_distance()` returns the shortest distance found from one Paper to another

As you can see above, Paper 2 is referenced directly by `paper` (line 1), but it is also referenced by Paper 1, which is referenced by Paper 16, giving a distance of 3 (line 2)

In [54]:
paper.find_distance(2015)

2
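Because the real references are random, the shortest-distance behaviour is easier to check on a tiny hand-built tree. The sketch below is a standalone simplification, not our `Paper` class: `Node` and this free-standing `find_distance` are hypothetical, and the distance is passed down the recursion rather than stored on each object.

```python
class Node:
    """Hypothetical stand-in for Paper, with hand-picked references instead of random ones."""
    def __init__(self, name, references=()):
        self.name = name
        self.references = list(references)

def find_distance(node, target, distance=0):
    """Return the smallest depth at which `target` appears under `node`, or inf if absent."""
    best = distance if node.name == target else float('inf')
    for ref in node.references:
        # keep the smallest distance found in any branch
        best = min(best, find_distance(ref, target, distance + 1))
    return best

# Paper 2 is referenced directly by the root (distance 1),
# and also via 16 -> 1 -> 2 (distance 3)
root = Node(0, [Node(2), Node(16, [Node(1, [Node(2)])])])

print(find_distance(root, 2))  # 1
print(find_distance(root, 1))  # 2
```

Both copies of Paper 2 are visited, but only the smaller distance survives the `min`.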

Attempting to find the distance to a paper that isn't referenced will return `inf`, which means the paper doesn't appear anywhere in the references

In [45]:
paper.find_distance(12345)

inf
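If we want to show that result to a user, we'd probably want to check for `inf` explicitly rather than print it. A small sketch, assuming a helper we haven't written yet (`describe_distance` is a made-up name):

```python
import math

def describe_distance(dist):
    """Turn a find_distance-style result into a readable message."""
    if math.isinf(dist):
        return "not in the references"
    return f"found at distance {dist}"

print(describe_distance(2))             # found at distance 2
print(describe_distance(float('inf')))  # not in the references
```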

## Testing