Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding cluster analysis dialog to R-Instat #6993

Merged
merged 11 commits into from May 28, 2022

Conversation

Vitalis95
Copy link
Contributor

Partially fixes issue #6195
Still in progress
@rdstern, have a look at the progress have made so far. Thanks

@Vitalis95
Copy link
Contributor Author

Still in progress
@rdstern , I have added a checkbox (Partition Plot) to visualize the clusters from the pam function. Have a look at it. Thanks

Copy link
Collaborator

@rdstern rdstern left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Vitalis95 well done so far. This is looking very promising. I suggest you also involve one of the other interns in the testing as you continue to progress with this dialogue.
It is a major new feature and it would be good for you to provide guidance on how someone should test - their data sets to use and what you think they should try. (Of course they should also try different things, but this should be the start.)
I started with the first tab - and that seems to work. I really like the graphical output. Then I tried hierarchical and couldn't get that to work - though it didn't give an error. Then I decided to return to the Prepare dialogue to scale the data - what has happened to that dialogue? It is just blank - I thought that was working?
I was assuming that I could use the Practical Guide to cluster analysis book - though with your dialogues. I look forward to more testing, with more guidance from you, reported in github.

@Vitalis95
Copy link
Contributor Author

In progress
@rdstern I have added a checkbox for visualizing output from the agnes function

  • The data set am using to test is USArrests from datasets
  • Stand checkbox is used for standardizing the data
  • Metric checkbox is used for calculating dissimilarities between observations - distance
  • Both Stand and Metric arguments are ignored if your data is already dissimilarity matrix; if you have scaled the data and also calculated distance between the observations. Scale/distance dialogue in prepare menu does these, however this dialogue has not been merged, there's a comment at PR Added a dialog to prepare data for cluster analysis #6812 - Have a look at it.
  • The Method checkbox in the hierarchical button is for defining the clustering method
  • Hierarch plot and partition plot checkboxes are used for visualizing the clusters from agnes and pam functions respectively.
    Thanks

@Vitalis95
Copy link
Contributor Author

Fixes issue #6195
@rdstern , @shadrackkibet , @lloyddewit - have a look at it. Thanks

@Ivanluv
Copy link
Contributor

Ivanluv commented Dec 15, 2021

@shadrack the code looks okay from my end

@lloyddewit
Copy link
Contributor

@rdstern Please could you test? thanks

@Vitalis95
Copy link
Contributor Author

@rdstern , have a look at this. Thanks

@rdstern
Copy link
Collaborator

rdstern commented Jan 17, 2022

@Vitalis95 can you get some others to check the dialogue first. I keep trying, but it refuses to merge. It also says I have 770 incoming files changes, and also 2 outgoing files. I am not sure what is going on, and would like your colleagues to check as well.

Copy link
Collaborator

@shadrackkibet shadrackkibet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Vitalis95 I guess these issues originate from manual edits of instat/dlgClusterAnalysis.Designer.vb file. Perhaps the most efficient way is to open a new PR with the following changes done afresh.

@lloyddewit
Copy link
Contributor

@Vitalis95 can you get some others to check the dialogue first. I keep trying, but it refuses to merge. It also says I have 770 incoming files changes, and also 2 outgoing files. I am not sure what is going on, and would like your colleagues to check as well.

I tested. I was able to pull, build and execute this PR without problems. I couldn't test the cluster analysis dialog directly because the menu option was greyed out.

@rdstern Your problem may be on your machine. I've had similar problems when GitHub synchronization became corrupted due to connection problems. You could try restarting VisualSstudio, switching to master branch, and then trying to pull the PR branch again.

@shadrackkibet This PR seems OK and may not need replacing. Are there other problems I'm not aware of (you mentioned manual changes to the designer file)?

Thanks

@shadrackkibet
Copy link
Collaborator

image

@lloyddewit my hypothesis is that the conflicts here might have been sorted out manually which might have resulted in errors. Since you were able to test it from your end I think there is no need to replace this PR.

@Vitalis95
Copy link
Contributor Author

@rdstern , have a look at it.

@rdstern
Copy link
Collaborator

rdstern commented Jan 25, 2022

I am still unable to test on my machine. I still have the same as before. when switching to your branch I have 359 pull requests - ok, and also 2 push requests - never had that before.
When I pull I get the following:
image

I have tried deleting the staged changes, etc, but never got to the stage I can try your branch. Perhaps others @lloyddewit don't have this problem?

@lloyddewit
Copy link
Contributor

@rdstern I think your Git branch for this PR is corrupted.
I suggest you switch to the master branch, and then delete the PR #6993 branch:

image

image

You could then reopen the branch for this PR.

@lloyddewit
Copy link
Contributor

@rdstern Happy to hear that your Git issue is solved. I just enabled the 'Cluster Analysis...' menu option, you should be able to test now.
@shadrackkibet I needed to change frmMain. I branched from the current master. If you're aware of any potential conflicts in other PRs, then please let me know.
thanks

@shadrackkibet
Copy link
Collaborator

@lloyddewit thanks for the alert. There are no frmmain changes elsewhere. Let us test this PR and get it merged as soon as possible.

@rdstern
Copy link
Collaborator

rdstern commented Jan 26, 2022

@lloyddewit and @shadrackkibet I have now looked and this method is a major improvement that I don't want to rush. So, like the dialogue on caret I suggest it be left for Version 0.7.5 (we hope for end Feb) or even 0.8 (perhaps end June). This update is mainly for tidying items that were already in progress before the review. Both this one and caret are too big and important to include at the last minute.

@lloyddewit
Copy link
Contributor

lloyddewit commented Jan 26, 2022

@rdstern no problem, we will not merge before 0.7.4 is released.
Are you willing to test/approve before then, or would you rather wait?
When you say 'caret' which PR are you referring to?

@shadrackkibet Thank you for your comment. If the merge delay triggers a frmMain conflict, then I think I can easily resolve it.

Thanks

@lloyddewit
Copy link
Contributor

@rdstern This PR has a frmMain change so it would be good to merge as soon as possible. Please could you test?
This PR was already reviewed by @Ivanluv so it just needs a final review from @shadrackkibet or I.
I can't remember if you wanted this PR merged with or without the menu option being enabled, please remind me.
Thanks

Copy link
Collaborator

@rdstern rdstern left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking almost ready for an initial merge.
a) One cosmetic change is to move the graph control to just above the save graph in each case.
The also for the Save graph checkbox to become visible only when you choose the graph option.
b) Please also discuss the R code that is generated with @shadrackkibet. I notice the initial numeric output, from the cluster package doesn't save any object. I assume it should? Then, if a graph is constructed it starts again, and so does the clustering twice. Shouldn't we be saving the cluster object - another control, and then this object is used in the graph?

@Vitalis95
Copy link
Contributor Author

@rdstern have a look at it, I have made the changes.

@N-thony
Copy link
Collaborator

N-thony commented May 19, 2022

@rdstern have a look at it, I have made the changes.

@rdstern please, have a look.

Copy link
Collaborator

@rdstern rdstern left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Vitalis95 and @shadrackkibet this is looking good!

@shadrackkibet shadrackkibet merged commit 56ba8bf into IDEMSInternational:master May 28, 2022
@shadrackkibet
Copy link
Collaborator

@Vitalis95 could you please check that this dialog is accessible from the main form and that it is working as expected.

@Vitalis95
Copy link
Contributor Author

@Vitalis95 could you please check that this dialog is accessible from the main form and that it is working as expected.

@shadrackkibet, the dialog can now be accessed and it works fine.

@shadrackkibet
Copy link
Collaborator

Good work @Vitalis95. It will be good to get other interns to use this dialog and report any bugs or improvements.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants