This is a computational notebook, as signalled by the `.ipynb` extension in its filename. It contains Python code that visualizes the data in the `wikilink-index.json` file created by our lab bench's 'backlinks' function. As you develop linked thoughts, the backlinks function keeps track of the structure of your notes, and this notebook draws that structure as a network, also known as a 'graph'.

Make sure the file tray at left contains a file called `wikilink-index.json`. It is generated the first time you examine your lab bench for backlinks: hit alt/option+b to open the backlinks panel, close it, then re-open it. Examine that file. You'll see that there is a 'key' called 'links' and another key called 'backlinks'. These two structures get updated with the information showing how all your notes interlink.
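
As a rough guide to what you'll find in that file, here is a minimal sketch using a made-up miniature index. The note names are hypothetical, and the exact shape of your real file may differ (for instance, values may be dictionaries rather than lists), so treat this as an illustration only.

```python
import json

# Hypothetical miniature of wikilink-index.json; the note names are invented.
sample_text = """
{
  "links": {"note-a.md": ["note-b", "note-c"]},
  "backlinks": {"note-b": ["note-a.md"], "note-c": ["note-a.md"]}
}
"""

sample = json.loads(sample_text)

# The two top-level keys: 'links' and 'backlinks'
top_level_keys = sorted(sample.keys())
print(top_level_keys)

# 'links' maps a note to the notes it points at
for source, targets in sample["links"].items():
    print(source, "links to", targets)

# 'backlinks' maps a note to the notes that point at it
for target, sources in sample["backlinks"].items():
    print(target, "is linked from", sources)
```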

Progress through this file by clicking each cell below and then hitting the 'play' button or ctrl+enter. You'll know that a cell has run when the little [ ] at its left changes from [ ] to [*] (running) to a number such as [1]. The numbers keep track of the order in which you've run the cells. Some cells merely 'set the stage' as it were, importing the little pieces of pre-made code we'll use; others define functions that we'll run later. You need to run those 'stage setting' cells first, or else when you actually DO run a function (eg `load_wikilink_data`) you'll get errors.

In [1]:
import json
import matplotlib.pyplot as plt
import networkx as nx

When you ran the cell above, nothing seemed to happen, right? But you'll notice that the [ ] at the left of it has now changed to [1]. Which means it ran successfully, and there is no output to report. So keep going!

In [45]:
def load_wikilink_data(filepath):
    """Load the wikilink JSON file."""
    with open(filepath, 'r') as f:
        return json.load(f)

def create_graph(data):
    """Create a NetworkX graph from the wikilink data 
       including both links and backlinks."""
    G = nx.Graph()
    
    # Process forward links
    
    links = data.get('links', {})
    
    for source, targets in links.items():
        # source is filename WITH extension
        # targets are filenames WITHOUT extension
        if isinstance(targets, list):
            for target in targets:
                G.add_edge(source, target)
        elif isinstance(targets, dict):
            for target in targets.keys():
                G.add_edge(source, target)
        else:
            # Handle case where targets might be a single value
            G.add_edge(source, targets)
    
    # Process backlinks
    backlinks = data.get('backlinks', {})
    for target, sources in backlinks.items():
        # target is filename WITHOUT extension
        # sources are filenames WITH extension
        if isinstance(sources, list):
            for source in sources:
                G.add_edge(source, target)
        elif isinstance(sources, dict):
            for source in sources.keys():
                G.add_edge(source, target)
        else:
            # Handle case where sources might be a single value
            G.add_edge(sources, target)
    
    return G
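
One thing to watch for: the comments in `create_graph` note that sources carry a file extension while targets do not. If that holds for your data, `note-b.md` and `note-b` would show up as two separate nodes. The sketch below shows one possible way to tidy that up before building the graph. The helper name `strip_extension`, the set of extensions, and the sample data are all assumptions for illustration, not part of the lab bench.

```python
import os

# Assumed set of note extensions; adjust to match your own files.
NOTE_EXTENSIONS = {".md", ".txt", ".org", ".ipynb"}

def strip_extension(name):
    """Hypothetical helper: drop a trailing note extension, if present."""
    stem, ext = os.path.splitext(str(name))
    return stem if ext.lower() in NOTE_EXTENSIONS else str(name)

# Invented sample in the shape described by the comments in create_graph
sample_links = {"note-a.md": ["note-b"], "note-b.md": ["note-a"]}

# Collect edges between normalized names
edges = set()
for source, targets in sample_links.items():
    for target in targets:
        edges.add((strip_extension(source), strip_extension(target)))

node_names = sorted({name for edge in edges for name in edge})
print(node_names)
```

A collection of pairs like `edges` can be handed to `G.add_edges_from(edges)` on a NetworkX graph.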



In [3]:
def create_directed_graph(data):
    """Create a directed NetworkX graph from the wikilink data."""
    G = nx.DiGraph()
    
    # Process forward links (source -> target)
    links = data.get('links', {})
    for source, targets in links.items():
        if isinstance(targets, list):
            for target in targets:
                G.add_edge(source, target)
        elif isinstance(targets, dict):
            for target in targets.keys():
                G.add_edge(source, target)
        else:
            G.add_edge(source, targets)
    
    # Process backlinks (source -> target, where backlink structure is inverted)
    backlinks = data.get('backlinks', {})
    for target, sources in backlinks.items():
        if isinstance(sources, list):
            for source in sources:
                G.add_edge(source, target)
        elif isinstance(sources, dict):
            for source in sources.keys():
                G.add_edge(source, target)
        else:
            G.add_edge(sources, target)
    
    return G

In [4]:
def show_graph_info(G):
    """Print basic information about the graph."""
    print(f"Nodes: {len(G.nodes())}")
    print(f"Edges: {len(G.edges())}")
    if len(G.nodes()) > 0:
        print(f"Average connections: {sum(dict(G.degree()).values()) / len(G.nodes()):.1f}")

def plot_graph(G):
    """Create a simple visualization of the graph."""
    plt.figure(figsize=(10, 8))
    
    if len(G.nodes()) == 0:
        plt.text(0.5, 0.5, 'No links found', ha='center', va='center', fontsize=16)
        plt.xlim(0, 1)
        plt.ylim(0, 1)
    else:
        pos = nx.spring_layout(G)
        nx.draw(G, pos, with_labels=True, node_color='lightblue', 
                node_size=1000, font_size=10, font_weight='bold')
    
    plt.title('WikiLink Graph')
    plt.axis('off')
    plt.show()
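
The 'Average connections' figure printed by `show_graph_info` is the average degree: the sum of every node's connection count divided by the number of nodes. Here is a small worked sketch with three invented notes, using only plain Python so the arithmetic is visible.

```python
from collections import Counter

# Invented example: note-a is linked with note-b and with note-c
edges = [("note-a", "note-b"), ("note-a", "note-c")]

# Each edge adds one connection to both of its endpoints
degree = Counter()
for left, right in edges:
    degree[left] += 1
    degree[right] += 1

# Sum of all connection counts divided by the number of nodes
average = sum(degree.values()) / len(degree)
print(dict(degree))
print(f"Average connections: {average:.1f}")
```

Because every edge is counted once for each endpoint, the average works out to twice the number of edges divided by the number of nodes.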

Now that all your functions are defined, let's feed them some data and see what your notes look like! This next cell will load your data, create a graph from it, show you some statistics, and then plot the graph. Note that the cell looks for the file at `../wikilink-index.json`, that is, in the folder one level above this notebook. If you get an error about something not being found: have you got a wikilink-index.json file there? Did you run all of the 'stage setting' cells above?

In [None]:
data = load_wikilink_data('../wikilink-index.json')
graph = create_graph(data)
show_graph_info(graph)
plot_graph(graph)

Probably looks really messy, eh? Notice that some of the links in there carry ## or bullet points or [[ markers and so on? Our data contains extra stuff. If you examine the wikilink-index.json file, you'll see that there are 'values' in the key:value pairs where extra context has crept in. This is a side effect of the backlinks feature being greedy about what it captures. Fortunately, we can see that such entries contain backticks, which makes them easy to identify. So let's rewrite our `load_wikilink_data` function so that it filters out any entry that contains a backtick:

In [53]:
def load_wikilink_data(filepath):
    """Load the wikilink JSON file and filter out values containing backticks."""
    with open(filepath, 'r') as f:
        data = json.load(f)
    
    # Filter out entries containing backticks
    filtered_data = {'links': {}, 'backlinks': {}}
    
    # Filter links
    if 'links' in data:
        for source, targets in data['links'].items():
            if '`' in source:
                continue
            
            filtered_targets = []
            if isinstance(targets, list):
                filtered_targets = [t for t in targets if '`' not in str(t)]
            elif isinstance(targets, dict):
                filtered_targets = {k: v for k, v in targets.items() if '`' not in str(k)}
            else:
                if '`' not in str(targets):
                    filtered_targets = targets
            
            if filtered_targets:  # Only add if there are valid targets
                filtered_data['links'][source] = filtered_targets
    
    # Filter backlinks
    if 'backlinks' in data:
        for target, sources in data['backlinks'].items():
            if '`' in target:
                continue
            
            filtered_sources = []
            if isinstance(sources, list):
                filtered_sources = [s for s in sources if '`' not in str(s)]
            elif isinstance(sources, dict):
                filtered_sources = {k: v for k, v in sources.items() if '`' not in str(k)}
            else:
                if '`' not in str(sources):
                    filtered_sources = sources
            
            if filtered_sources:  # Only add if there are valid sources
                filtered_data['backlinks'][target] = filtered_sources
    
    return filtered_data
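
To see what the filter does in isolation, here is a minimal sketch on an invented list of targets. The entries are made up to resemble the stray context described above.

```python
# Invented targets: two clean note names and two entries with stray context
targets = ["note-b", "## Heading with `code`", "note-c", "- bullet `[[note-d]]`"]

# Keep only the entries that contain no backtick, as load_wikilink_data does
clean_targets = [t for t in targets if '`' not in str(t)]
print(clean_targets)
```

Note that an entry containing ## but no backtick would pass through this filter unchanged.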

In [None]:
## Re-run!
## Only load_wikilink_data changed, so the rest of the code can be re-run as it is.
data = load_wikilink_data('../wikilink-index.json')
graph = create_graph(data)
show_graph_info(graph)
plot_graph(graph)

Ok, that's better, but let's change the layout to make it easier to read. We'll truncate labels, and try to avoid overlap. To do that, we'll change the graph plot code.

In [56]:
def plot_graph2(G, max_label_length=15):
    """Create a visualization using only NetworkX and matplotlib (no external deps)."""
    plt.figure(figsize=(14, 10))
    
    if len(G.nodes()) == 0:
        plt.text(0.5, 0.5, 'No links found', ha='center', va='center', fontsize=16)
        plt.xlim(0, 1)
        plt.ylim(0, 1)
    else:
        # Try to use ForceAtlas2 layout from NetworkX, fallback to spring layout
        try:
            pos = nx.forceatlas2_layout(G)
            layout_name = "ForceAtlas2"
        except (AttributeError, ImportError):
            # Fallback to spring layout if ForceAtlas2 is not available
            pos = nx.spring_layout(G, k=3, iterations=50)
            layout_name = "Spring"
        
        # Create shortened labels
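        def shorten_label(label, max_length=max_label_length):
            # Strip only a trailing note extension, not matches inside the name
            label = str(label)
            for ext in ('.md', '.txt', '.org'):
                if label.endswith(ext):
                    label = label[:-len(ext)]
                    break
            if len(label) > max_length:
                return label[:max_length-3] + '...'
            return label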
        
        shortened_labels = {node: shorten_label(node) for node in G.nodes()}
        
        # Draw the graph
        nx.draw_networkx_edges(G, pos, alpha=0.3, edge_color='gray', width=0.5)
        nx.draw_networkx_nodes(G, pos, node_color='lightblue', 
                              node_size=300, alpha=0.8)
        nx.draw_networkx_labels(G, pos, labels=shortened_labels,
                               font_size=8, font_weight='normal')
    
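    # layout_name is only set when the graph has nodes, so build the title defensively
    title = 'WikiLink Graph'
    if len(G.nodes()) > 0:
        title += f' ({layout_name} Layout)'
    plt.title(title, fontsize=16, pad=20)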
    plt.axis('off')
    plt.tight_layout()
    plt.show()

In [None]:
plot_graph2(graph)


Which notes seem to be most important to you? What might that imply about your observations and thoughts so far in this course?
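
One rough way to approach that question is to rank notes by how many connections they have. The sketch below uses an invented graph; with your own data you could pass the `graph` built earlier to the hypothetical helper `top_notes` instead. Counting connections is only one of several possible measures of importance.

```python
import networkx as nx

def top_notes(G, n=5):
    """Hypothetical helper: return the n most-connected nodes as (name, degree) pairs."""
    ranked = sorted(G.degree(), key=lambda pair: pair[1], reverse=True)
    return ranked[:n]

# Invented example graph
demo = nx.Graph()
demo.add_edges_from([
    ("hub", "note-a"),
    ("hub", "note-b"),
    ("hub", "note-c"),
    ("note-a", "note-b"),
])

for name, degree in top_notes(demo, n=3):
    print(f"{name}: {degree} connections")
```

If you build a directed graph with `create_directed_graph`, `G.in_degree()` counts only incoming links, which may be closer to 'which notes do I keep coming back to'.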

Make sure to save your changes (ctrl+s). You can make a new markdown note by hitting the big plus button at top left and selecting 'markdown'. Then you can write your thoughts and observations about this visualization, linking back to this note (or indeed, to any of the individual cells! Check out View -> Command Palette -> PKM: Show Notebook Overview).