#### Graph Construction
- Nodes: Represent players on the field.
- Edges: Defined by proximity (for example, two players coming within 2 feet of each other) or by other interactions, optionally weighted by inverse distance or by duration of proximity.
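
A minimal sketch of the proximity rule above, in plain Python rather than graph-tool so it can be read on its own. It assumes positions are in yards (so 2 feet is 2/3 of a yard); the function name `proximity_edges` and the sample positions are hypothetical.

```python
import math
from itertools import combinations

# Assumption: x/y positions are in yards, so 2 feet is 2/3 of a yard.
THRESHOLD_YARDS = 2 / 3


def proximity_edges(positions, threshold=THRESHOLD_YARDS):
    """Return (id_a, id_b, weight) for each pair within `threshold`.

    `positions` maps a player id to an (x, y) tuple for a single frame.
    Weight is inverse distance; coincident points get weight 0.0.
    """
    edges = []
    for (a, pa), (b, pb) in combinations(sorted(positions.items()), 2):
        dist = math.hypot(pa[0] - pb[0], pa[1] - pb[1])
        if dist <= threshold:
            weight = 1 / dist if dist != 0 else 0.0
            edges.append((a, b, weight))
    return edges


# Made-up positions for illustration only.
frame = {101: (10.0, 20.0), 102: (10.5, 20.0), 103: (30.0, 5.0)}
edges = proximity_edges(frame)
```

Only players 101 and 102 are within the threshold here, so the result holds a single weighted edge.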

#### Influence Metrics
- Degree Centrality: Counts the number of direct connections a player has.
- Betweenness Centrality: Measures how often a player lies on the shortest path between other players, indicating a bridging/glue role (potentially influential).
- Closeness Centrality: Indicates how quickly a player can interact with all others, reflecting agility or a positional advantage.
- Eigenvector Centrality: Evaluates a player's influence based on connections to other influential players.
- Node Removal: Simulate removing a player and observe the change in connectivity; a large drop in network robustness indicates an influential player.
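
Two of the metrics above, degree centrality and node removal, can be sketched in plain Python. This is an illustration on a toy adjacency dict, not the graph-tool implementation; the helper names and the toy graph are hypothetical.

```python
from collections import deque


def degree_centrality(adj):
    """Fraction of the other nodes each node is directly connected to."""
    n = len(adj)
    if n <= 1:
        return {v: 0.0 for v in adj}
    return {v: len(nbrs) / (n - 1) for v, nbrs in adj.items()}


def count_components(adj, removed=None):
    """Count connected components with BFS, skipping the `removed` node."""
    seen = {removed} if removed is not None else set()
    components = 0
    for start in adj:
        if start in seen:
            continue
        components += 1
        seen.add(start)
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nbr in adj[node]:
                if nbr not in seen:
                    seen.add(nbr)
                    queue.append(nbr)
    return components


def removal_impact(adj):
    """Extra components created by removing each node in turn."""
    base = count_components(adj)
    return {v: count_components(adj, removed=v) - base for v in adj}


# Toy graph: path A - B - C - D plus an extra edge B - E.
adj = {"A": {"B"}, "B": {"A", "C", "E"}, "C": {"B", "D"}, "D": {"C"}, "E": {"B"}}
centrality = degree_centrality(adj)
impact = removal_impact(adj)
```

In the toy graph, B has the highest degree and removing it splits the network into the most pieces. On a fully connected graph every node has the same unweighted degree, so these two metrics are only informative once edges are thresholded or weighted.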

#### Additional Considerations
- Use temporal aggregation (consider interactions over the full duration of a play).
- Consider both teammate and opponent interactions for a fuller picture of influence.
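
One possible way to combine both considerations is to count, per pair, how many frames they spend in proximity over a play, and to tag each pair as teammates or opponents. The sketch below assumes positions in yards and a 2-foot (2/3 yard) threshold; `aggregate_proximity`, the sample frames, and the club labels are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def aggregate_proximity(frames, clubs, threshold=2 / 3):
    """Count frames each pair spends within `threshold`, split by relationship.

    `frames` is a list of {player_id: (x, y)} dicts, one per frame.
    `clubs` maps player_id to a team label.
    """
    counts = Counter()
    for positions in frames:
        for a, b in combinations(sorted(positions), 2):
            (ax, ay), (bx, by) = positions[a], positions[b]
            if math.hypot(ax - bx, ay - by) <= threshold:
                kind = "teammate" if clubs[a] == clubs[b] else "opponent"
                counts[(a, b, kind)] += 1
    return counts


# Made-up data: two frames, three players, two clubs.
frames = [
    {1: (0.0, 0.0), 2: (0.5, 0.0), 3: (5.0, 5.0)},
    {1: (0.0, 0.0), 2: (0.4, 0.0), 3: (0.0, 0.6)},
]
clubs = {1: "LA", 2: "LA", 3: "BUF"}
counts = aggregate_proximity(frames, clubs)
```

The per-pair frame counts could then serve as edge weights (duration of proximity) in place of, or alongside, inverse distance.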

In [3]:
import graph_tool.all as gt
import polars as pl
from collections import defaultdict

In [4]:
players = pl.read_csv("nfl-big-data-bowl-2025/players.csv")
tracking = pl.read_csv("nfl-big-data-bowl-2025/tracking_week_1.csv", null_values=["NA", "na", "N/A", "n/a", "NULL", "null", "None", "none"])

In [8]:
print("tracking:", tracking.tail())

tracking: shape: (5, 18)
┌────────────┬────────┬───────┬─────────────┬───┬──────┬──────┬──────┬─────────────────────────┐
│ gameId     ┆ playId ┆ nflId ┆ displayName ┆ … ┆ dis  ┆ o    ┆ dir  ┆ event                   │
│ ---        ┆ ---    ┆ ---   ┆ ---         ┆   ┆ ---  ┆ ---  ┆ ---  ┆ ---                     │
│ i64        ┆ i64    ┆ i64   ┆ str         ┆   ┆ f64  ┆ f64  ┆ f64  ┆ str                     │
╞════════════╪════════╪═══════╪═════════════╪═══╪══════╪══════╪══════╪═════════════════════════╡
│ 2022090800 ┆ 3696   ┆ null  ┆ football    ┆ … ┆ 1.47 ┆ null ┆ null ┆ pass_outcome_incomplete │
│ 2022090800 ┆ 3696   ┆ null  ┆ football    ┆ … ┆ 1.27 ┆ null ┆ null ┆ null                    │
│ 2022090800 ┆ 3696   ┆ null  ┆ football    ┆ … ┆ 0.38 ┆ null ┆ null ┆ null                    │
│ 2022090800 ┆ 3696   ┆ null  ┆ football    ┆ … ┆ 0.37 ┆ null ┆ null ┆ null                    │
│ 2022090800 ┆ 3696   ┆ null  ┆ football    ┆ … ┆ 0.36 ┆ null ┆ null ┆ null                    │
└────────────┴────────┴───────┴─────────────┴───┴──────┴──────┴──────┴─────────────────────────┘

In [3]:
'''
i[0] = gameId
i[1] = playId
i[2] = nflId
i[3] = displayName
i[4] = frameId
i[5] = frameType
i[6] = time
i[7] = jerseyNumber
i[8] = club
i[9] = playDirection
i[10] = x
i[11] = y
i[12] = s
i[13] = a
i[14] = dis
i[15] = o
i[16] = dir
i[17] = event
'''

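
As an alternative to remembering positional indices, a name-to-index mapping can be built once and reused. This is a sketch: `COL` and `field` are hypothetical helpers, and the hard-coded list simply mirrors the index above (with a DataFrame in hand, `tracking.columns` could supply the same names).

```python
# Column order copied from the index listing above.
COLUMNS = [
    "gameId", "playId", "nflId", "displayName", "frameId", "frameType",
    "time", "jerseyNumber", "club", "playDirection", "x", "y", "s", "a",
    "dis", "o", "dir", "event",
]

# Map each column name to its position in a row tuple.
COL = {name: i for i, name in enumerate(COLUMNS)}


def field(row, name):
    """Look up a value in a row tuple by column name."""
    return row[COL[name]]
```

With this in place, `row[4]` could be written as `field(row, "frameId")`. Polars can also yield dictionaries directly via `iter_rows(named=True)`, at some cost in speed.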

### TODO
- Create an nflId to displayName database, maybe with Redis for quick lookups?
- ~~Create a play by play graph creator~~
- Iterate through a game and create graphs for every play
- Compute analysis on a network
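
For the first TODO item, an in-memory dict may be enough before reaching for Redis. The sketch below assumes each player row exposes `nflId` and `displayName` keys (for example, rows from `players.iter_rows(named=True)`); `build_name_lookup` and the sample rows are hypothetical.

```python
def build_name_lookup(rows):
    """Map nflId -> displayName, skipping rows without an id."""
    return {r["nflId"]: r["displayName"] for r in rows if r["nflId"] is not None}


# Placeholder rows for illustration; real rows would come from players.csv.
sample_rows = [
    {"nflId": 1, "displayName": "Player One"},
    {"nflId": 2, "displayName": "Player Two"},
    {"nflId": None, "displayName": "football"},
]
names = build_name_lookup(sample_rows)
```

`names.get(nfl_id, "unknown")` then gives a constant-time lookup with a fallback for ids that are not in the table.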



In [None]:
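def group_rows_by_frame(data, gameId, playId):
    """Group one play's rows by frameId, keeping frame 1 and every 24th frame."""
    frames = defaultdict(list)
    for row in data.iter_rows():
        frame_id = row[4]  # frameId
        # row[0] is gameId and row[1] is playId
        if (frame_id % 24 == 0 or frame_id == 1) and row[0] == gameId and row[1] == playId:
            frames[frame_id].append(row)
    return frames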

def construct_graph(data, gameId, playId):

    g = gt.Graph(directed=False)

    player_id_prop = g.new_vertex_property("string")
    x_coord_prop = g.new_vertex_property("float")
    y_coord_prop = g.new_vertex_property("float")
    
    g.vertex_properties["player_id"] = player_id_prop
    g.vertex_properties["x"] = x_coord_prop
    g.vertex_properties["y"] = y_coord_prop

    weight_prop = g.new_edge_property("float")

    g.edge_properties["weight"] = weight_prop

    vertex_dict = {}  # nflId -> vertex
    graphs = []  # one graph snapshot per sampled frame

    # First pass: add one vertex per player seen in the sampled frames of this play.
    for row in data.iter_rows():
        if (row[0] == gameId and 
            row[1] == playId and 
            row[3] != 'football' and  # do some data cleaning to remove this
            ((int(row[4]) % 24 == 0) or 
             (int(row[4]) == 1))):

            # nflId is null (None) for rows without a player, such as the ball
            if row[2] is not None and row[2] not in vertex_dict:
                v = g.add_vertex()
                player_id_prop[v] = str(row[2])  # the property is typed "string"
                vertex_dict[row[2]] = v

    frame_groups = group_rows_by_frame(data, gameId, playId)
    
    for frame, rows in frame_groups.items():
        for row in rows:
            nfl_id = row[2]
            if nfl_id in vertex_dict:
                v = vertex_dict[nfl_id]
                x_coord_prop[v] = float(row[10])
                y_coord_prop[v] = float(row[11])
        

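        # Rebuild the edges for this frame: every pair of players is connected,
        # weighted by inverse distance (no proximity threshold is applied here).
        g.clear_edges()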
        vertices = list(vertex_dict.values())
        n = len(vertices)
        print("Number of vertices:", n)
        for i in range(n):
            for j in range(i + 1, n):
                v1 = vertices[i]
                v2 = vertices[j]
                dx = x_coord_prop[v1] - x_coord_prop[v2]
                dy = y_coord_prop[v1] - y_coord_prop[v2]
                dist = (dx**2 + dy**2)**0.5
                weight = 1 / dist if dist != 0 else 0  # Avoid division by zero
                e = g.add_edge(v1, v2)
                weight_prop[e] = weight


        print("Frame:", frame, "Edges computed:", g.num_edges())
        graphs.append(g.copy())

    return graphs


network = construct_graph(tracking, 2022091200, 64)
print(network)

Number of vertices: 22
Frame: 1 Edges computed: 231
Number of vertices: 22
Frame: 24 Edges computed: 231
Number of vertices: 22
Frame: 48 Edges computed: 231
Number of vertices: 22
Frame: 72 Edges computed: 231
Number of vertices: 22
Frame: 96 Edges computed: 231
Number of vertices: 22
Frame: 120 Edges computed: 231
Number of vertices: 22
Frame: 144 Edges computed: 231
[<Graph object, undirected, with 22 vertices and 231 edges, 3 internal vertex properties, 1 internal edge property, at 0x16b807570>, <Graph object, undirected, with 22 vertices and 231 edges, 3 internal vertex properties, 1 internal edge property, at 0x16b807680>, <Graph object, undirected, with 22 vertices and 231 edges, 3 internal vertex properties, 1 internal edge property, at 0x16b807790>, <Graph object, undirected, with 22 vertices and 231 edges, 3 internal vertex properties, 1 internal edge property, at 0x16b807240>, <Graph object, undirected, with 22 vertices and 231 edges, 3 internal vertex properties, 1 internal

- Is this for every second, or one for every play?
- What graph formations/trends lead to the best defensive outcome?
    - Run every graph-tool function
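
On the first question in these notes: the `frame_id % 24 == 0 or frame_id == 1` filter keeps one snapshot every 24 frames, not one per play. A quick sketch of what that means in time, under the assumption (not confirmed in this notebook) that tracking is sampled at 10 frames per second; the helper names are hypothetical.

```python
FRAMES_PER_SECOND = 10  # assumption: tracking data sampled at 10 Hz


def sampled_frames(last_frame, step=24):
    """Frame ids kept by the `frame_id % step == 0 or frame_id == 1` filter."""
    return [f for f in range(1, last_frame + 1) if f == 1 or f % step == 0]


def elapsed_seconds(frame_id, fps=FRAMES_PER_SECOND):
    """Seconds elapsed since frame 1 of the play."""
    return (frame_id - 1) / fps


kept = sampled_frames(144)
times = [elapsed_seconds(f) for f in kept]
```

Under that assumption the snapshots fall roughly 2.4 seconds apart, so a step of 10 would be the choice for one graph per second.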