Cluster similar images #6

Open
JoshVarty opened this issue Mar 4, 2019 · 3 comments

Comments

@JoshVarty
Owner

For image related tasks it might be useful to cluster similar images together. We may want to count these images across the train and test set and see whether or not the distributions are equal.
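A minimal sketch of the comparison described above, assuming each image has already been assigned a cluster id. The label arrays here are made-up placeholders, and total variation distance is just one possible way to score how far apart the two distributions are:

```python
import numpy as np

def cluster_proportions(labels, n_clusters):
    """Return the fraction of samples assigned to each cluster."""
    counts = np.bincount(labels, minlength=n_clusters)
    return counts / counts.sum()

# Hypothetical cluster assignments, standing in for real clustering output.
train_labels = np.array([0, 0, 1, 1, 2, 2, 2, 2])
test_labels = np.array([0, 1, 1, 1, 1, 2])

train_props = cluster_proportions(train_labels, n_clusters=3)
test_props = cluster_proportions(test_labels, n_clusters=3)

# Total variation distance: 0 means identical distributions, 1 means disjoint.
tv_distance = 0.5 * np.abs(train_props - test_props).sum()
print(train_props, test_props, tv_distance)
```

A large distance would suggest the train and test sets contain different mixes of image types.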

@JoshVarty
Owner Author

Thoughts:

  • Train a network
  • Remove the head
  • Take the output for each image
  • Run PCA on the set of output vectors to reduce their dimensionality
  • Run K-Means clustering on all of our vectors
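The last two steps could be sketched roughly as below, assuming scikit-learn is available. The `features` array is random placeholder data standing in for the headless network's per-image outputs, and the component and cluster counts are arbitrary choices, not values from this project:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Placeholder for the headless network's outputs: 200 images x 512 features.
features = rng.normal(size=(200, 512))

# PCA is fit on the whole set of vectors, then projects each one.
pca = PCA(n_components=32, random_state=0)
reduced = pca.fit_transform(features)

# Cluster the reduced vectors; n_clusters=5 is an arbitrary assumption.
kmeans = KMeans(n_clusters=5, n_init=10, random_state=0)
cluster_ids = kmeans.fit_predict(reduced)
print(reduced.shape, np.bincount(cluster_ids, minlength=5))
```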

@JoshVarty
Owner Author

It isn't working, possibly because the features our network learns focus on "Does this image have cancer or not?" rather than on how similar the images look.

Possible workarounds:

  • Same approach, but cluster the 1 and 0 classes separately. (Might not work for the test set, though.)
  • Try either earlier layers of the network or simply use the raw input
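Both workarounds can be combined in one rough sketch: flatten the raw pixels and run a separate K-Means per class. The image shapes, labels, and cluster count below are hypothetical placeholders, and scikit-learn is an assumed dependency:

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Hypothetical stand-ins: 60 small RGB images with alternating 0/1 labels.
images = rng.random((60, 16, 16, 3))
labels = np.arange(60) % 2

# Use the raw input: one flat pixel vector per image.
flat = images.reshape(len(images), -1)

n_clusters = 3  # clusters per class, an arbitrary choice
cluster_ids = np.empty(len(images), dtype=int)
for cls in (0, 1):
    mask = labels == cls
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0)
    # Offset the ids so class-1 clusters do not collide with class-0 ones.
    cluster_ids[mask] = km.fit_predict(flat[mask]) + cls * n_clusters
```

The per-class split needs labels, which is why it may not carry over to the test set.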

@JoshVarty
Owner Author

JoshVarty commented Mar 10, 2019

Tried clustering on the raw input images, with more success. Things to try:

  1. Try without PCA
  2. Try with a different number of PCA components
  3. Can we seed the K-means clusters?
    • I think we can use the init parameter. See here
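If the clustering is done with scikit-learn (an assumption, the thread does not name the library), seeding might look like the sketch below. `KMeans` accepts an array of shape (n_clusters, n_features) for `init`; the data and the hand-picked seed indices here are hypothetical:

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Placeholder image vectors: 100 images x 20 features.
vectors = rng.normal(size=(100, 20))

# Hypothetical hand-picked images, one representative per desired cluster.
seed_indices = [0, 10, 20]
initial_centers = vectors[seed_indices]

# With explicit initial centers a single run is enough, so n_init=1.
kmeans = KMeans(n_clusters=len(seed_indices), init=initial_centers, n_init=1)
cluster_ids = kmeans.fit_predict(vectors)
print(np.bincount(cluster_ids, minlength=len(seed_indices)))
```

The centers still move during fitting, so the seeds only influence where each cluster starts.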
