error: SpMat::SpMat(): invalid row or column index #62

Open
Shawnmhy opened this issue Sep 28, 2021 · 3 comments


@Shawnmhy

Hi there, I am trying to use largeVis for clustering. I have about 200 datasets, each with roughly 1,000 to 100,000 samples and 2 features (the number of features is consistent). While the largeVis function works for almost all of my datasets, I still get this error message for one of them:


error: SpMat::SpMat(): invalid row or column index
Error in referenceWij(is, x@i, x@x^2, as.integer(threads), perplexity) : 
  SpMat::SpMat(): invalid row or column index
In addition: Warning message:
In largeVis(t(as.matrix(memberships[, c("X", "Y")])), dim = 2, K = K,  :
  The Distances between some neighbors are large enough to cause the calculation of p_{j|i} to overflow. Scaling the distance vector.

I see that someone had this problem before, and the solution was to install the branch 'hotfix/twobugs'. I installed that version successfully, but no luck. Any ideas? Thanks!

The dataset is here: data.csv

The function I run is: largeVis(t(as.matrix(data[, c('X', 'Y')])), dim = 2, K = K, tree_threshold = 100, max_iter = 5, sgd_batches = 1, threads = 1)
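
Since the call works for almost every dataset and fails for one, a few base-R checks on the failing input may help narrow down what is different about it. The sketch below is only a diagnostic aid: `check_input` is a hypothetical helper, not part of largeVis, the toy data frame stands in for data.csv, and the idea that non-finite values, duplicated points, a large K, or a wide value range are relevant to this error is an assumption that the thread does not confirm.

```r
# Hypothetical helper (not part of largeVis): summarise properties of the
# input that are worth ruling out before calling largeVis().
check_input <- function(df, cols = c("X", "Y"), K = 50) {
  m <- as.matrix(df[, cols])
  list(
    n_rows       = nrow(m),
    n_non_finite = sum(!is.finite(m)),     # counts NA, NaN, Inf and -Inf entries
    n_duplicated = sum(duplicated(m)),     # rows that exactly repeat an earlier row
    k_too_large  = K >= nrow(m),           # assumption: K should be below the sample count
    value_range  = range(m[is.finite(m)])  # a very wide range may relate to the overflow warning
  )
}

# Toy data standing in for data.csv; the values are made up.
toy <- data.frame(X = c(1, 2, 2, NA, 5), Y = c(1, 3, 3, 4, 1e6))
str(check_input(toy, K = 3))
```

Running the same helper on a dataset that works and on the one that fails, then comparing the two summaries, is one way to spot the difference.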

@elbamos
Owner

elbamos commented Sep 28, 2021

Hi Shawn...

The most recent branch is feature/backoncran. I just tried this with your data and code and, with some changes for parameters that have been removed from the functions, it ran perfectly.

I note, though, that your dataset has only two input features, and your code would generate a dataset with two output features. LargeVis is a method for dimensionality reduction. Since your data has only two features, I'm not sure what benefit you would get from running it through LargeVis. Is your goal to take advantage of the hd clustering features of the package? If so, considering your data size, you may be better off using the dbscan package.
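
For reference, clustering two-feature data directly with the dbscan package could look roughly like the sketch below. The data are synthetic, and the `eps` and `minPts` values are illustrative assumptions that would need tuning for real data. Note that `dbscan::dbscan` takes samples in rows, so the transpose used in the largeVis call is not needed here.

```r
# Toy two-feature data standing in for the X and Y columns; values are made up.
set.seed(1)
xy <- rbind(
  matrix(rnorm(200, mean = 0, sd = 0.3), ncol = 2),
  matrix(rnorm(200, mean = 5, sd = 0.3), ncol = 2)
)
colnames(xy) <- c("X", "Y")

# Run only if the dbscan package is installed; eps and minPts are
# illustrative and would need tuning for real data.
if (requireNamespace("dbscan", quietly = TRUE)) {
  fit <- dbscan::dbscan(xy, eps = 0.5, minPts = 5)
  print(table(fit$cluster))  # cluster label 0 marks noise points
}
```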

@Shawnmhy
Author

Hi elbamos, thank you for your reply. I tried to install this most recent branch but got an error:

remotes::install_github('elbamos/largeVis@feature/backoncran')
Downloading GitHub repo elbamos/largeVis@feature/backoncran
Error: Failed to install 'largeVis' from GitHub:
  Incorrect number of arguments (14), expecting 16 for 'processx_exec'

Any ideas?

The reason I am using the dbscan clustering from largeVis is that I find the clusters it generates are more 'realistic' (in my analysis context) than those from the dbscan package.

@elbamos
Owner

elbamos commented Sep 28, 2021

Huh... I just tried cutting and pasting your install_github line and it worked properly. I suggest making sure you're using the current version of remotes and related packages and checking your setup.
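
A quick way to check the setup is to list the installed versions of the packages that may be involved. This is a minimal sketch: `installed_versions` is a hypothetical helper, and treating processx and callr as the "related packages" is an assumption based on the `processx_exec` name in the error message. That kind of argument-count message is often reported when a package was updated without restarting R, so restarting the session after any update is worth trying, though whether that applies here is unconfirmed.

```r
# Hypothetical helper: report the installed version of each package, or NA if
# it is not installed. It reads package metadata without loading the package.
installed_versions <- function(pkgs = c("remotes", "processx", "callr")) {
  vapply(pkgs, function(p) {
    if (nzchar(system.file(package = p))) {
      as.character(utils::packageVersion(p))
    } else {
      NA_character_
    }
  }, character(1))
}

print(installed_versions())

# If any of these are outdated, update them, restart R, and then retry:
# install.packages(c("remotes", "processx", "callr"))
# remotes::install_github("elbamos/largeVis@feature/backoncran")
```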
