# Get metadata graph

Here we build the AI ecosystem graph used for the metadata similarity analysis. It is a NetworkX graph in which every node is a model in the AI ecosystem and every edge is a relation between two models: finetunes, quantizations, and adapters, *but not merges*. Because merges are excluded, the graph is a tree.
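
The tree claim can be sanity-checked directly in NetworkX. The following is a minimal sketch on a toy graph, not the real dataset: the model names and the `relation` edge attribute are hypothetical, and it assumes the graph is a directed graph with edges pointing from parent to child.

```python
import networkx as nx

# Toy stand-in for the ecosystem graph (hypothetical model names;
# the "relation" edge attribute is an assumption, not taken from the data).
toy = nx.DiGraph()
toy.add_edge("base-model", "finetune-a", relation="finetune")
toy.add_edge("base-model", "quantized-b", relation="quantization")
toy.add_edge("finetune-a", "adapter-c", relation="adapter")

# Without merges, no node should have more than one parent.
max_parents = max(degree for _, degree in toy.in_degree())

# For directed graphs, these checks are made on the underlying undirected graph.
is_forest = nx.is_forest(toy)
is_tree = nx.is_tree(toy)

print(max_parents, is_forest, is_tree)
```

If the real graph contains several unrelated base models, `nx.is_tree` would return `False` while `nx.is_forest` would still return `True`, so running both checks shows which case applies.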

Each node gets one attribute, `full_json`, which holds the model's complete metadata (the `fullJson` column of the dataset). We then pickle the graph for use in later analyses.
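
As an illustration of the attach-and-pickle step, here is a small sketch on toy data. The model names and JSON strings are made up, the pickle round trip happens in memory rather than on disk, and the membership check for IDs missing from the graph is an added safeguard, not necessarily something the real data needs.

```python
import pickle

import networkx as nx
import pandas as pd

# Toy graph and toy metadata table (all values are hypothetical).
toy = nx.DiGraph()
toy.add_edge("base-model", "finetune-a")

toy_df = pd.DataFrame({
    "modelId": ["base-model", "finetune-a", "not-in-graph"],
    "fullJson": ['{"id": "base-model"}', '{"id": "finetune-a"}', '{"id": "not-in-graph"}'],
})

# Attach the metadata to each node; collect IDs that are absent from the graph,
# since indexing G.nodes with an unknown node raises KeyError.
skipped = []
for model_id, full_json in zip(toy_df["modelId"], toy_df["fullJson"]):
    if model_id in toy:
        toy.nodes[model_id]["full_json"] = full_json
    else:
        skipped.append(model_id)

# Round-trip through pickle (in memory) and read the attribute back.
restored = pickle.loads(pickle.dumps(toy))
attrs = nx.get_node_attributes(restored, "full_json")
print(attrs)
print("skipped:", skipped)
```

Later analyses can read the metadata of a single node with `restored.nodes[model_id]["full_json"]` or of all nodes at once with `nx.get_node_attributes`.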

In [6]:
import numpy as np
import pandas as pd
import networkx as nx
import pickle

In [None]:
# Read the per-model JSON metadata (stored as a CSV file)
df = pd.read_csv("data/ai_ecosystem_jsons.csv")

# Load the previously built graph (merges excluded)
with open('data/ai_ecosystem_graph_nomerges.pkl', 'rb') as f:
    G = pickle.load(f)

In [5]:
# For every row in df, store its fullJson on the matching node in G
for _, row in df.iterrows():
    model_id = row['modelId']
    G.nodes[model_id]['full_json'] = row['fullJson']

# Save the graph
with open('data/ai_ecosystem_graph_nomerges_fulljson.pkl', 'wb') as f:
    pickle.dump(G, f)

In [8]:
# Load the previously built finetune-only graph
with open('data/ai_ecosystem_graph_finetune.pkl', 'rb') as f:
    G_finetuneonly = pickle.load(f)

# For every row in df, store its fullJson on the matching node in G_finetuneonly
for _, row in df.iterrows():
    model_id = row['modelId']
    G_finetuneonly.nodes[model_id]['full_json'] = row['fullJson']

# Save the annotated finetune-only graph
with open('data/ai_ecosystem_graph_finetune_fulljson.pkl', 'wb') as f:
    pickle.dump(G_finetuneonly, f)