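@adamov-artem you can call to_numpy() on the data frame to get the count matrix, run the usual PCA/geosketch routine on it, and then pass the indices returned by gs() to the data frame's iloc indexer. Because iloc selects rows by position, the row and column names are carried over to the sketch.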
For example,
X = data_frame.to_numpy() # Rows are the samples that gs() will choose from.
# PCA.
from fbpca import pca
U, s, Vt = pca(X, k=100) # E.g., 100 PCs.
X_dimred = U[:, :100] * s[:100]
# Sketch.
from geosketch import gs
N = 20000 # Number of samples to obtain from the data set.
sketch_index = gs(X_dimred, N, replace=False)
data_frame_sketch = data_frame.iloc[sketch_index, :]
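To check that the labels really survive the indexing step, here is a small sketch on a toy data frame. The cell and gene names are made up, and the hard-coded `sketch_index` is only a stand-in for the list of integer positions that gs() returns, so it runs without geosketch installed.

```python
import numpy as np
import pandas as pd

# Hypothetical toy count matrix: 6 cells x 3 genes, with named rows and columns.
data_frame = pd.DataFrame(
    np.arange(18).reshape(6, 3),
    index=[f"cell_{i}" for i in range(6)],
    columns=["gene_a", "gene_b", "gene_c"],
)

# Stand-in for the integer positions that gs() would return (assumption).
sketch_index = [4, 0, 2]

# Positional selection keeps the row labels and all column labels.
data_frame_sketch = data_frame.iloc[sketch_index, :]

print(list(data_frame_sketch.index))    # ['cell_4', 'cell_0', 'cell_2']
print(list(data_frame_sketch.columns))  # ['gene_a', 'gene_b', 'gene_c']
```

The rows come back in the order given by the index list; if you want the sketch in the original row order, index with `sorted(sketch_index)` instead.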
Hi. Is it possible to run geosketch on a DataFrame object? I need to preserve the row and column names of my count matrix.