<a href="https://colab.research.google.com/github/timsetsfire/pgh-bike-share/blob/master/Final_pgh_bike_share.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Pittsburgh Bike Share

The graph we will work with is based on the Pittsburgh (PGH) Bike Share data. For our purposes, a vertex is a bike station with the properties id, name, number of racks, and geolocation (lat and lng). The edges are the trips from one bike station to another. Each edge has the properties start station, end station, start time (in millis), end time (in millis), trip id, and bike id.

Note that our graph is a directed multigraph: the edges have direction, there can be several parallel edges, and loops are allowed (a loop is a trip that starts and ends at the same station). You can think of parallel edges as separate trips that start at the same vertex and end at the same vertex.

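We will transform this directed multigraph into a weighted directed graph, where the weight of an edge is the number of trips that originated at one vertex and ended at another. Note that the processing step below calls `make_edges_df` with `allow_loops=False`, so loops are dropped from the edge list.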
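
The collapse from a multigraph to a weighted graph can be sketched in a few lines. The trip list below is made up for illustration (the station ids are hypothetical, not taken from the data). Each `(src, dst)` pair is one trip, and counting identical pairs gives the edge weight. The `allow_loops` flag here is only a stand-in for how one might drop loops; it is not the notebook's `make_edges_df` helper.

```python
from collections import Counter

# Hypothetical trips as (start_station, end_station) pairs; ids are made up.
trips = [
    (1000, 1001),
    (1000, 1001),  # parallel edge: a second trip between the same stations
    (1001, 1000),  # opposite direction, so a different edge
    (1002, 1002),  # loop: trip starts and ends at the same station
    (1000, 1002),
]

def collapse_trips(trips, allow_loops=True):
    """Count trips per (src, dst) pair, optionally dropping loops."""
    counts = Counter(trips)
    if not allow_loops:
        counts = Counter({edge: n for edge, n in counts.items() if edge[0] != edge[1]})
    return counts

weighted_edges = collapse_trips(trips)
weighted_edges_no_loops = collapse_trips(trips, allow_loops=False)

for (src, dst), count in sorted(weighted_edges.items()):
    print(f"{src} -> {dst}: {count} trip(s)")
```

The two parallel trips become a single edge of weight 2, while the reverse trip stays a separate edge because direction matters.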



In [None]:
%%sh
git clone https://github.com/timsetsfire/pgh-bike-share.git

Cloning into 'pgh-bike-share'...


In [None]:
%%sh
pip install networkx gmaps cairosvg scikit-network wandb leafmap keplergl -q
pip install --upgrade numpy



ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
google-colab 1.0.0 requires requests~=2.23.0, but you have requests 2.27.1 which is incompatible.
google-colab 1.0.0 requires tornado~=5.1.0; python_version >= "3.0", but you have tornado 6.1 which is incompatible.
datascience 0.10.6 requires folium==0.2.1, but you have folium 0.12.1.post1 which is incompatible.
albumentations 0.1.12 requires imgaug<0.2.7,>=0.2.5, but you have imgaug 0.2.9 which is incompatible.


In [None]:
import os
os.kill(os.getpid(), 9)

In [None]:
!pip install wandb



In [None]:
import pandas as pd
import os
import glob
import wandb 
from sknetwork.clustering import Louvain, PropagationClustering, KMeans
from sknetwork.clustering import modularity
from sknetwork.embedding import Spectral, GSVD, LouvainEmbedding
from sknetwork.visualization import svg_digraph
from IPython.display import SVG
import pickle
from dill.source import getsource
from dill import detect
from cairosvg import svg2png
from sklearn.metrics import silhouette_score
import leafmap.kepler as leafmap_kepler
from shapely.geometry.point import Point
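import numpy as np
import networkx as nx  # used below as `nx`; imported explicitly instead of relying on the star import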
import sys
import geopandas as gpd

sys.path.append("/content/pgh-bike-share/code/python")
from graph_helpers import * 


In [None]:
wandb.init?

In [None]:
wandb.login()
ENTITY = None

wandb: Currently logged in as: tim-whittaker (use `wandb login --relogin` to force relogin)


In [None]:
project_name = "pgh_bike_share"

In [None]:
MODE = "online"

In [None]:
path = os.getcwd()
with wandb.init(entity = ENTITY, project= project_name, job_type='raw-data', mode=MODE) as run:
  artifact = wandb.Artifact("data", type = "data")
  ## add all the flat files to wandb project
  artifact.add_dir( os.path.join(path,"pgh-bike-share","data") )
  run.log_artifact(artifact) # Creates `data:v0` on the first run

wandb: Adding directory to artifact (/content/pgh-bike-share/data)... Done. 0.1s

## Process Edges and Vertices

In [None]:
with wandb.init(entity = ENTITY, project= project_name, group = "date-processing", job_type='data-processing', mode=MODE) as run:

  dataset_art = run.use_artifact('data:latest', type='data')
  dataset_dir = dataset_art.download("wandb/pgh_bike_share/data")

  station_files = glob.glob(os.path.join(dataset_dir, "*Locations*.csv"))
  rental_files = glob.glob(os.path.join(dataset_dir, "*Rental*.csv"))

  edges_df = make_edges_df(rental_files, src="From station id", dst="To station id", weighted=True, allow_loops=False)
  vertices_df = make_vertices_df(
      station_files, vertex_id="Station #", 
      keep_floats = ["Latitude", "Longitude"], 
      float_checks = {"Latitude": {"max": 100, "min": 0}}
      ) 

  ## log data as artifacts
  edges_artifact = wandb.Artifact(name='edges', 
                                      type='processed_data',
                                      description='edges for the pgh bike share graph',
                                      metadata = {
                                          "make_edges_df_func": getsource(detect.code(make_edges_df)),
                                          "docstring": make_edges_df.__doc__,
                                          "args": {"src":"From station id", "dst":"To station id", "weighted":True, "allow_loops": False}
                                      }                                 
                                  )

  vertices_artifact = wandb.Artifact(name="vertices", 
                                      type="processed_data",
                                      description="vertices from the pgh bike share graph",
                                      metadata = {
                                          "make_vertices_df_func": getsource(detect.code(make_vertices_df)),
                                          "docstring": make_vertices_df.__doc__,
                                          "args": { "vertex_id":"Station #","keep_floats":["Latitude", "Longitude"], "float_checks": {"Latitude": {"max": 100, "min": 0}}}
                                      }
                                    )

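  # make sure the output directory exists before writing the processed files
  os.makedirs("/content/pgh-bike-share/data/processed", exist_ok=True)
  edges_df.to_csv("/content/pgh-bike-share/data/processed/edges.csv", index=False)
  vertices_df.to_csv("/content/pgh-bike-share/data/processed/vertices.csv", index=False)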
  edges_artifact.add_file( "/content/pgh-bike-share/data/processed/edges.csv")
  vertices_artifact.add_file("/content/pgh-bike-share/data/processed/vertices.csv")
  run.log_artifact(edges_artifact)
  run.log_artifact(vertices_artifact)
  ## Log the Table to your W&B workspace
  wandb_edges = wandb.Table(dataframe=edges_df)
  wandb_vertices = wandb.Table(dataframe=vertices_df)
  run.log({'edges': wandb_edges})
  run.log({'vertices': wandb_vertices})





## Create Graph

In [None]:
G = None
with wandb.init(entity = ENTITY, project= project_name, group = "date-processing", job_type='create-graph', mode=MODE) as run:
  edges_dir = run.use_artifact("edges:latest").download()
  vertices_dir = run.use_artifact("vertices:latest").download()
  edges_df = pd.read_csv(edges_dir + "/edges.csv")
  vertices_df = pd.read_csv(vertices_dir + "/vertices.csv")

  # vertices_df["id"] = vertices_df["id"]
  G = make_graph("id", "src", "dst", "count", vertices_df, edges_df)

  G = G.to_directed()

  graph_artifact = wandb.Artifact(name="graph", 
                                      type="processed_data",
                                      description="graph representation (using networkx) of the pgh bike share data",
                                      metadata = { "make_graph_func": getsource(detect.code(make_graph)), 
                                                   "args": {"vertex_id": "id", "src_id": "src", "dst_id": "dst", "weight_col": "count"}, 
                                                   "docstring": make_graph.__doc__,
                                                   "graph_info": nx.info(G)
                                      })
  
  with open(os.path.join(path, "graph.pkl"), "wb") as f:
    pickle.dump(G, f)
    print(f.name)

  A = nx.adjacency_matrix(G)
  A_dense = A.todense()
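  # D is the diagonal matrix of reciprocal out-degrees (row sums of A)
  D =  np.diag(1 / np.asarray(A.sum(axis=1)).flatten())
  # row-normalize: T[i, j] is the share of trips leaving station i that end at station j
  T = np.matmul(D, A_dense)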
  # wandb.log({'heatmap_with_text': wandb.plots.HeatMap(x_labels, y_labels, matrix_values, show_text=True)})
  wandb.log({'transition matrix': wandb.plots.HeatMap(list(G.nodes), list(G.nodes), T, show_text=False)})

  vertices_df["geometry"] = vertices_df.apply(lambda x: Point(x.Longitude, x.Latitude), axis=1)
  gdf = gpd.GeoDataFrame(vertices_df)[["geometry", "id", "Station Name"]]
  m = leafmap_kepler.Map(center=(40.444173669007355, -79.9613070687836), zoom=12)
  # m.add_points_from_xy(vertices_df, x = "Longitude", y = "Latitude", )
  m.add_data(gdf, name = "Bike Share Stations")
  m.to_html(outfile="map.html")
  wandb.log({"pgh bike share map": wandb.Html(open("map.html"))})
  wandb.log({"pgh bike share map v2": wandb.Html(open("/content/pgh-bike-share/extras/arc_point_map.html"))})
  graph_artifact.add_file(f.name)
  run.log_artifact(graph_artifact)

wandb: Visualizing heatmap.

/content/graph.pkl
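
As a small illustration of what a transition matrix means here: dividing each row of a weighted adjacency matrix by its row sum gives, for every start station, the share of trips that end at each other station. The 3x3 matrix below is made up. The guard against zero row sums is an assumption about how one might treat a station with no outgoing trips.

```python
import numpy as np

# Hypothetical weighted adjacency matrix: A[i, j] = number of trips from i to j.
A = np.array([
    [0, 3, 1],
    [2, 0, 2],
    [0, 0, 0],  # a station with no outgoing trips
], dtype=float)

def row_normalize(adjacency):
    """Divide each row by its sum; rows that sum to zero are left as zeros."""
    row_sums = adjacency.sum(axis=1, keepdims=True)
    safe_sums = np.where(row_sums == 0, 1.0, row_sums)
    return adjacency / safe_sums

T = row_normalize(A)
print(T)
```

Every row with at least one outgoing trip sums to 1, so it can be read as a probability distribution over destinations.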

In [None]:
# station_files = glob.glob(os.path.join("/content/pgh-bike-share/data", "*Locations*.csv"))
# rental_files = glob.glob(os.path.join("/content/pgh-bike-share/data", "*Rental*.csv"))

# edges_df = make_edges_df(rental_files, src="From station id", dst="To station id", weighted=True, allow_loops=False)
# vertices_df = make_vertices_df(
#     station_files, vertex_id="Station #", 
#     keep_floats = ["Latitude", "Longitude"], 
#     float_checks = {"Latitude": {"max": 100, "min": 0}}
#     ) 

# G = make_graph("id", "src", "dst", "count", vertices_df, edges_df)


# # vertices_df["geometry"] = vertices_df.apply(lambda x: Point(x.Longitude, x.Latitude), axis=1)
# # m = leafmap_kepler.Map(center=(40.444173669007355, -79.9613070687836), zoom=12)
# # gdf = gpd.GeoDataFrame(vertices_df)[["geometry", "id", "Station Name"]]
# # # m.add_points_from_xy(vertices_df, x = "Longitude", y = "Latitude", )
# # m.add_data(gdf, name = "Bike Share Stations")
# # m

## Community detection (aka clustering bike stations)

It seems likely that some groupings of bike stations are more natural than others. People rent bikes to go from Heinz Field to PNC Park, rather than to make a trip from Randy Land to some unrelated destination. We'll compute an embedding of the graph and run the k-means algorithm on it.

The hyperparameters we will vary:
* Type of embedding
* Number of clusters / dimension of the embedding

For this task, we'll use W&B Sweeps.
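
A grid sweep simply tries every combination of the hyperparameter values. The sketch below enumerates those combinations in plain Python, using the embedding names and cluster counts that appear in the sweep configuration later in this notebook. It does not call W&B and only shows the size of the search space.

```python
from itertools import product

# Values mirror the sweep configuration used in this notebook.
embeddings = ["Spectral", "GSVD"]
cluster_counts = list(range(3, 11))  # 3, 4, ..., 10

# Every (embedding, n_clusters) pair is one run of the grid sweep.
grid = [
    {"embedding": embedding, "n_clusters": k}
    for embedding, k in product(embeddings, cluster_counts)
]

print(len(grid), "runs")
print(grid[0], grid[-1])
```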

In [None]:
config_defaults = {
  "embedding": ["Spectral"],
  "n_clusters": [3,4,5]
}
with wandb.init() as run: 
  graph_artifact = wandb.use_artifact(f"{project_name}/graph:latest", type="processed_data")
  graph_artifact.download("./graph_binary")
  with open("./graph_binary/graph.pkl", "rb") as f:
    G = pickle.load(f)
  position = []
  for g in G.nodes:
    position.append( np.array([G.nodes.get(g)["Longitude"], G.nodes.get(g)["Latitude"]]))
  A = nx.adjacency_matrix(G)
  position = np.array(position)


In [None]:
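def create_model(k, embedding_method = "Spectral"):
  # build a k-dimensional embedding and cluster it with KMeans
  if embedding_method == "Spectral": 
    embedding = Spectral(k)
  elif embedding_method == "GSVD":
    embedding = GSVD(k)
  elif embedding_method == "Louvain":
    # note: Louvain has no embedding_method, so train() below cannot use it as-is
    return Louvain()
  else:
    raise ValueError(f"unknown embedding method: {embedding_method}")
  kmeans = KMeans(k, embedding, co_cluster=True)
  return kmeans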

def train(graph, model, eval_metric):
  adjacency_matrix = nx.adjacency_matrix(graph)
  np.random.seed(seed=1337)
  embedding = model.embedding_method
  embedding_space_str = embedding.__class__.__name__
  # use the adjacency matrix of the graph passed in, not the notebook-level `A`
  embedded_adj_matrix = model.embedding_method.fit_transform(adjacency_matrix)
  out = model.fit_transform(adjacency_matrix)

  temp_df = pd.DataFrame(embedded_adj_matrix, columns = [f"dim_{i}" for i in range(model.n_clusters)], index= list(graph.nodes))
  temp_df["cluster"] = out
  temp_df["cluster"] = temp_df["cluster"].apply( lambda x: "cluster-{}".format(x))
  temp_df["embedding"] = embedding_space_str 
  table = wandb.Table(dataframe= temp_df.copy())
  # run.log({f"KMeans on {embedding_space_str} {model.n_clusters} Dim Embedding": table})

  ## set to work with modularity
  mod_score = eval_metric(adjacency_matrix, out)
  sil = silhouette_score(embedded_adj_matrix, out)

  model_artifact = wandb.Artifact(
      name="{}-{}-{}".format(embedding_space_str,model.__class__.__name__,model.n_clusters), 
                                      type='model',
                                      description='clusterer for the pgh bike share graph',
                                      metadata = {
                                          "docstring": model.__doc__,
                                      }                                 
                                  )
  with open("model.pkl", "wb") as f:
    pickle.dump(model, f)
    model_artifact.add_file("model.pkl")


  out_df = pd.DataFrame(out, columns = ["clusters"], index = list(graph.nodes))
  gdf = gpd.GeoDataFrame(vertices_df).drop(["# of Racks"], axis=1).set_index("id")
  gdf = gdf.join(out_df)
  m = leafmap_kepler.Map(center=(40.444173669007355, -79.9613070687836), zoom=12)
  m.add_data(gdf.copy(), name = "Bike Share Stations")
  map_name = "{}-{}-{}_map.html".format(embedding_space_str,model.__class__.__name__,model.n_clusters)
  m.to_html(outfile= map_name)
  wandb_map_html = wandb.Html(open(map_name))
  wandb.log({"map with clusters": wandb_map_html})
  
  # image = svg_digraph(A, labels = out)
  # svg2png(bytestring=image,write_to=f'output_{embedding_space_str}_kmeans_{model.n_clusters}.png')
  # wandb_image = wandb.Image(f'output_{embedding_space_str}_kmeans_{model.n_clusters}.png')


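  # `position` is the notebook-level array of station coordinates built in an earlier cell
  image = svg_digraph(adjacency_matrix, labels = out, position = position)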
  svg2png(bytestring=image,write_to=f'output_{embedding_space_str}_kmeans_{model.n_clusters}_position.png')
  wandb_image_pos = wandb.Image(f'output_{embedding_space_str}_kmeans_{model.n_clusters}_position.png')
  wandb.log({"graph plot with positions": wandb_image_pos})

  print("Modularity: %.4f" % (float(mod_score),))
  print("Silhouette: %.4f" % (float(sil),))
  wandb.log({'embedding': embedding_space_str,
             'embedding_n_components': embedding.n_components, 
             'n_clusters': model.n_clusters,
             'modularity': mod_score,
             'silhouette': sil,
             'seed':1337,
             'predictions': table})
  wandb.log_artifact(model_artifact)

def sweep_train(config_defaults=None):
    # Set default values when none are passed in
    if config_defaults is None:
        config_defaults = {
            "embedding": "Spectral",
            "n_clusters": 4
        }
    # Initialize wandb; the config defaults get overwritten in the Sweep
    wandb.init(config=config_defaults)

    wandb.config.dataset_name = "graph"

    graph_artifact = wandb.use_artifact(f"{project_name}/graph:latest", type="processed_data")
    graph_artifact.download("./graph_binary")
    with open("./graph_binary/graph.pkl", "rb") as f:
        G = pickle.load(f)

    ## initialize model
    model = create_model(wandb.config.n_clusters, wandb.config.embedding)

    eval_metric = modularity
    # eval_metric = silhouette_score(model.embedding_method.transform(A), out)
    train(G,model,eval_metric)
  
sweep_config = {
    "method": "grid",
    "metric": {
        # must match a key logged in train(); the runs log 'modularity' and 'silhouette'
        "name": "modularity",
        "goal": "maximize"
    },
    "parameters": {
        "embedding": {
            "values": ["Spectral", "GSVD"]
        },
        "n_clusters": {
            "values": [3,4,5,6,7,8,9,10]
            # 'values': [3]
        }
    }
}
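
To make the `modularity` score used in `train` less of a black box, here is a minimal sketch of the standard modularity formula for a weighted directed graph: `Q = (1/w) * sum_ij (A_ij - out_i * in_j / w) * [c_i == c_j]`, where `w` is the total edge weight. The toy graph (two triangles joined by one edge) is made up. The function is an illustration, not scikit-network's implementation, which may differ in options such as resolution.

```python
import numpy as np

def directed_modularity(adjacency, labels):
    """Modularity of a labeling for a weighted (possibly directed) graph."""
    adjacency = np.asarray(adjacency, dtype=float)
    labels = np.asarray(labels)
    total_weight = adjacency.sum()
    out_weights = adjacency.sum(axis=1)
    in_weights = adjacency.sum(axis=0)
    # same_cluster[i, j] is True when nodes i and j share a label
    same_cluster = labels[:, None] == labels[None, :]
    # weight expected between i and j if edges were placed at random
    expected = np.outer(out_weights, in_weights) / total_weight
    return float(((adjacency - expected) * same_cluster).sum() / total_weight)

# Toy graph: two triangles (0,1,2) and (3,4,5) joined by the edge 2-3,
# stored symmetrically so every undirected edge appears in both directions.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
A = np.zeros((6, 6))
for i, j in edges:
    A[i, j] = A[j, i] = 1

q_triangles = directed_modularity(A, [0, 0, 0, 1, 1, 1])
q_single = directed_modularity(A, [0, 0, 0, 0, 0, 0])
print(f"two triangles: {q_triangles:.4f}, one cluster: {q_single:.4f}")
```

Splitting the toy graph into its two triangles scores 5/14, while putting every node in one cluster scores 0, which is why a higher modularity is read as a better grouping.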

In [None]:
# wandb.init(project = project_name, group = "experiment-1")
sweep_id = wandb.sweep(sweep_config, project=project_name )

Create sweep with ID: dq6hhqdz
Sweep URL: https://wandb.ai/tim-whittaker/pgh_bike_share/sweeps/dq6hhqdz


In [None]:
wandb.agent(sweep_id, function=sweep_train)

| run | embedding | n_clusters | embedding_n_components | modularity | silhouette | seed |
| --- | --- | --- | --- | --- | --- | --- |
| hbvxjr2g | Spectral | 3 | 3 | 0.2782 | 0.42578 | 1337 |
| 5pmo3l17 | Spectral | 4 | 4 | 0.41139 | 0.41411 | 1337 |
| qmrw3dyb | Spectral | 5 | 5 | 0.40243 | 0.44334 | 1337 |
| 1e1f3hal | Spectral | 6 | 6 | 0.36724 | 0.39362 | 1337 |
| gagazoge | Spectral | 7 | 7 | 0.36528 | 0.35536 | 1337 |
| bu63p6zi | Spectral | 8 | 8 | 0.30086 | 0.2841 | 1337 |
| tu9r4gbs | Spectral | 9 | 9 | 0.32449 | 0.30721 | 1337 |
| t7485vv0 | Spectral | 10 | 10 | 0.22811 | 0.22972 | 1337 |
| uawxok4q | GSVD | 3 | 3 | 0.40435 | 0.5837 | 1337 |
| kkoexqy4 | GSVD | 4 | 4 | 0.38476 | 0.52607 | 1337 |
| h9rv0gh5 | GSVD | 5 | 5 | 0.41183 | 0.4813 | 1337 |
| o3emri91 | GSVD | 6 | 6 | 0.37863 | 0.43169 | 1337 |
| 5py8ofj4 | GSVD | 7 | 7 | 0.36377 | 0.43314 | 1337 |
| 3qft79i6 | GSVD | 8 | 8 | 0.34494 | 0.40049 | 1337 |
| wlib2c2i | GSVD | 9 | 9 | 0.2914 | 0.30456 | 1337 |
| h14sozy9 | GSVD | 10 | 10 | 0.27612 | 0.28424 | 1337 |

wandb: Sweep Agent: Exiting.

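
One way to read the sweep output is to collect the logged scores and rank them. The sketch below copies the modularity and silhouette values from the sweep output above into plain Python and picks the best configuration for each metric. Which metric to prefer is left open.

```python
# (embedding, n_clusters, modularity, silhouette), copied from the sweep output.
results = [
    ("Spectral", 3, 0.2782, 0.42578),
    ("Spectral", 4, 0.41139, 0.41411),
    ("Spectral", 5, 0.40243, 0.44334),
    ("Spectral", 6, 0.36724, 0.39362),
    ("Spectral", 7, 0.36528, 0.35536),
    ("Spectral", 8, 0.30086, 0.2841),
    ("Spectral", 9, 0.32449, 0.30721),
    ("Spectral", 10, 0.22811, 0.22972),
    ("GSVD", 3, 0.40435, 0.5837),
    ("GSVD", 4, 0.38476, 0.52607),
    ("GSVD", 5, 0.41183, 0.4813),
    ("GSVD", 6, 0.37863, 0.43169),
    ("GSVD", 7, 0.36377, 0.43314),
    ("GSVD", 8, 0.34494, 0.40049),
    ("GSVD", 9, 0.2914, 0.30456),
    ("GSVD", 10, 0.27612, 0.28424),
]

# Pick the run with the highest score for each metric.
best_by_modularity = max(results, key=lambda r: r[2])
best_by_silhouette = max(results, key=lambda r: r[3])

print("best modularity:", best_by_modularity)
print("best silhouette:", best_by_silhouette)
```

The two metrics do not agree on a single winner here, so the choice of `n_clusters` depends on which notion of cluster quality matters more for the analysis.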