Error in Optimal_Clusters_Medoids function #5
I'm sorry for the late reply and for the inconvenience caused by the error. I was aware of this particular issue and can reproduce it with the following example on Ubuntu 16.04:
library(ClusterR)
data(soybean)
dat = soybean[, -ncol(soybean)]
opt_md = Optimal_Clusters_Medoids(dat, 20, 'jaccard_coefficient', plot_clusters = TRUE)
Based on the plot give the number of clusters (greater than 1) that you consider optimal? 18
Error in plot.new() : figure margins too large
I added a tryCatch so that this case now returns a warning rather than an error (for both the 'dissimilarity' and 'silhouette' criteria). Please install the newest GitHub version of the package with
devtools::install_github('mlampros/ClusterR')
because I can't submit a new version to CRAN before 18-02-2018. Concerning the error you received, there is a similar Stack Overflow question that explains how to deal with such a case. However, it didn't work for my example, because the plot for 18 clusters could not fit on a PC screen even after I enlarged the plot window (by dragging its edge). You should also know that although you received an error (now a warning), the data is not lost. If you run
str(opt_md)
you get the data for all clusters:
List of 20
$ : NULL
$ :List of 6
..$ avg_intra_clust_dissimilarity: num 0.6
..$ sum_intra_dissim : num 184
..$ avg_width_silhouette : num 0.162
..$ list_intra_dissm :List of 2
.. ..$ : num [1, 1:155] 0.577 0.584 0.591 0.575 0.579 ...
.. ..$ : num [1, 1:152] 0.678 0.674 0.656 0.658 0.57 ...
..........
If by data loss you mean the first plot (shown before you choose the optimal number of clusters), that is something I can't change, because the first plot is only displayed temporarily (it is not returned). However, if you use the ClusterR package in the RStudio IDE, then before choosing the optimal number of clusters you can go to 'Plots' > 'Export' > 'Save as PDF' (or 'Save as Image') in the right-hand pane and save the first plot locally. That way you won't have to worry about whether the second plot appears, whatever number of clusters you choose.
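To illustrate the idea behind the tryCatch change mentioned above, here is a minimal sketch. It is not the actual ClusterR code: `safe_plot`, `failing_plot`, and the stand-in result list are hypothetical names used only for this example. The point is that the plotting call is wrapped so that a failure becomes a warning and the already computed results are still returned.

```r
# Hypothetical helper (not part of ClusterR): try to plot, but never let a
# plotting failure discard the results that were already computed.
safe_plot <- function(results, plot_fun) {
  tryCatch(
    plot_fun(results),
    error = function(e) {
      # Downgrade the plotting error to a warning
      warning("plot could not be drawn: ", conditionMessage(e), call. = FALSE)
    }
  )
  invisible(results)
}

# Stand-in for the per-cluster results (assumed shape, for illustration only)
res <- list(NULL, list(avg_width_silhouette = 0.162))

# A plotting function that always fails, imitating 'figure margins too large'
failing_plot <- function(x) stop("figure margins too large")

out <- safe_plot(res, failing_plot)  # emits a warning instead of an error
str(out)                             # the results are still available
```

With this pattern the computation and the plot are decoupled, so a window that is too small for the plot only costs you the picture, not the clustering output.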
Thank you very much! As soon as I can test the new version I'll let you know.
I'm closing the issue for now; feel free to reopen it if you observe any problems.
Hi, I am working with version 1.1.0 of ClusterR package in RStudio v. 1.0.153. I can also add the information given by the version command:
platform x86_64-pc-linux-gnu
arch x86_64
os linux-gnu
system x86_64, linux-gnu
status
major 3
minor 4.1
year 2017
month 06
day 30
svn rev 72865
language R
version.string R version 3.4.1 (2017-06-30)
nickname Single Candle
I am not sure any of this matters, because the problem I have found is an error from the plot function. After the first plot (avg. dissimilarity and avg. silhouette width), we are asked for the optimal number of clusters; if that number is too large, plot throws an error (figure margins too large...), which aborts execution, and the results are lost. I would suggest creating a separate function for this kind of plot, so that a plotting failure cannot cause an execution error.
By the way, thank you very much for your work.
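As a rough sketch of what such a separate plotting function could look like (an assumption for illustration, not an existing ClusterR function): the hypothetical `plot_clusters_to_pdf` below draws one panel per cluster into a PDF file whose height grows with the number of panels, so the figure does not depend on the size of the on-screen plot window. The bar plots and the random input are placeholders for the real per-cluster values.

```r
# Hypothetical helper (not part of ClusterR): draw one panel per cluster into
# a PDF whose height scales with the number of panels.
plot_clusters_to_pdf <- function(values_per_cluster, file, panel_height = 2.5) {
  n <- length(values_per_cluster)
  grDevices::pdf(file, width = 7, height = panel_height * n)
  on.exit(grDevices::dev.off(), add = TRUE)  # always close the device
  graphics::par(mfrow = c(n, 1), mar = c(2, 4, 2, 1))
  for (i in seq_len(n)) {
    graphics::barplot(values_per_cluster[[i]], main = paste("cluster", i))
  }
  invisible(file)
}

# Placeholder data: 18 clusters with 10 values each
set.seed(1)
vals <- replicate(18, runif(10), simplify = FALSE)

out_file <- plot_clusters_to_pdf(vals, tempfile(fileext = ".pdf"))
file.exists(out_file)
```

Because the function only writes a file and returns its path, it can be called after the clustering results have been stored, and a failure inside it would not affect them.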