IDEA: Perform weight KMeans clustering in the image-plane #9

Closed
Jammy2211 opened this issue Jan 22, 2018 · 16 comments

Comments

@Jammy2211
Owner

We are used to using weighted KMeans clustering to adapt the source pixelization in the source plane. A consequence of this is that we sample a unique discrete source-plane grid for every iteration of the method, something I believe helps remove systematic biases but makes the method slower and more cumbersome. It's a shame we currently have no option to use a (nearly) fixed discretization, as this would speed things up a lot at the expense of introducing biases. However, we can switch to source-plane clustering at the end of the analysis, such that a long pipeline gains speed without bias.

To achieve this, we can perform weighted KMeans clustering on the source-weights in the image-plane (during the hyper-parameter optimization, where we set up the source-plane pixelization, regularization etc). We then fix the clusters to the 'best-fit' points in the image plane for lens modeling, thereby mimicking the behaviour of a fixed source-plane discretization. This will reuse all the functions the source-plane clustering method uses and requires no extra methods, so it should be simple to implement :).

This could also form the basis of a light-weighted Delaunay pixelization.

@rhayes777
Collaborator

rhayes777 commented Jan 22, 2018 via email

@Jammy2211
Owner Author

We normally trace a sparsely sampled set of image-pixels to the source-plane, and then feed those source-plane coordinates (weighted by something equivalent to their luminosity) into the weighted KMeans algorithm. Because we do this in the source-plane, when we change the lens model (which moves all the source-plane coordinates around) we have to re-grid the source pixelization to match it, by feeding this new set of source-coordinates into the KMeans. Therefore, every pixelization comes out unique.

If we instead feed the same sparsely sampled set of image-pixels into the weighted KMeans algorithm, but using their image-plane coordinates, we'll achieve similar behaviour. That is, we'll compute a set of image-plane coordinates that trace the source's surface brightness. When we trace these coordinates to the source-plane, we can use them to compute the source-pixelization just like we would normally. This would skip the KMeans step, as they already represent the centers of source-pixels (their centers are then used to pair with all sub image-pixels, perform the Voronoi gridding etc.).
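A minimal sketch of that image-plane clustering step, assuming NumPy and a hand-rolled weighted Lloyd iteration rather than whatever clustering routine the code actually uses. The name `weighted_kmeans`, the 20x20 sparse grid and the ring-shaped weights are all made up for illustration:

```python
import numpy as np

def weighted_kmeans(points, weights, n_clusters, n_iter=50, seed=0):
    """Hypothetical weighted Lloyd iteration: heavier points pull centers harder."""
    rng = np.random.default_rng(seed)
    # Seed the centers from the points themselves, favouring heavily weighted ones.
    start = rng.choice(len(points), size=n_clusters, replace=False,
                       p=weights / weights.sum())
    centers = points[start].copy()
    for _ in range(n_iter):
        # Assign every point to its nearest center.
        dist2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist2.argmin(axis=1)
        for k in range(n_clusters):
            members = labels == k
            if members.any():
                # Weighted mean, so bright pixels dominate the center position.
                centers[k] = np.average(points[members], axis=0,
                                        weights=weights[members])
    return centers

# Sparse image-plane grid (illustrative values only).
xs = np.linspace(-2.0, 2.0, 20)
image_coords = np.array([(x, y) for y in xs for x in xs])

# Made-up source-weights: bright along a ring of radius 1, faint elsewhere.
radius = np.linalg.norm(image_coords, axis=1)
source_weights = np.exp(-0.5 * ((radius - 1.0) / 0.2) ** 2) + 1e-3

# Clustering happens on image-plane coordinates, not traced source-plane ones.
image_plane_centers = weighted_kmeans(image_coords, source_weights, n_clusters=30)
```

The returned `image_plane_centers` are what would be stored and later traced to the source-plane for each lens model.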

The crucial difference is that because these coordinates were computed in the image-plane, when we change the lens model they will trace to new source-plane coordinates and automatically adapt the source-pixelization to match it, without any need to re-run the weighted KMeans algorithm. The downside is that this will happen in a regular and predictable way, thus potentially introducing bias.
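To make the "automatically adapt" point concrete, here is a toy sketch. It assumes a singular isothermal sphere centred on the origin as the lens model, which is only a stand-in for the real mass profiles, and the function name and the hard-coded centers are hypothetical:

```python
import numpy as np

def trace_to_source_plane(image_coords, einstein_radius):
    """Toy lens equation for a singular isothermal sphere at the origin:
    beta = theta - theta_E * theta / |theta|."""
    radius = np.linalg.norm(image_coords, axis=1, keepdims=True)
    return image_coords - einstein_radius * image_coords / radius

# Image-plane cluster centers, computed once during hyper-parameter optimization
# (hard-coded here purely for illustration).
image_plane_centers = np.array([[1.2, 0.0], [0.0, 1.5], [-1.1, 0.3], [0.4, -1.3]])

# Changing the lens model moves the source-pixel centers without any re-clustering.
for einstein_radius in (0.9, 1.0, 1.1):
    source_pixel_centers = trace_to_source_plane(image_plane_centers, einstein_radius)
    print(einstein_radius, source_pixel_centers.round(3).tolist())
```

The same four image-plane points give a different set of source-pixel centers for each Einstein radius, and the shift is smooth in the lens parameters, which is exactly where the worry about bias comes from.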

@rhayes777
Collaborator

rhayes777 commented Jan 22, 2018 via email

@Jammy2211
Owner Author

The bias issue is what I explained with an analogy to a square grid, where the use of a single fixed discretization leads to biases as there are always 'favourable' source-plane discretizations. I suspect this approach will introduce the same problem. Yes, the pixelization is varying here, but it does so in a smooth, regular and predictable way nonetheless, which will inevitably have some 'favourable' source-plane discretization.

It's not expensive to flip between the two though, and for a lot of the earlier stages of the analysis there's no point worrying about subtle systematics in the lens model.

@Jammy2211
Owner Author

If you think about it, this is pretty much just doing what I suggested in the issue below lol:

#5

@rhayes777
Collaborator

rhayes777 commented Jan 22, 2018 via email

@rhayes777
Collaborator

rhayes777 commented Jan 22, 2018 via email

@Jammy2211
Owner Author

Jammy2211 commented Jan 22, 2018

You're right, we could reintroduce randomness by reapplying the KMeans in the source-plane, using these initialized points to speed it up (for just the lens modeling step, where KMeans in the image-plane is switched off).

Assuming this doesn’t introduce some systematic bias?

It should remove it!
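A rough sketch of the warm-start idea, assuming NumPy. The helper names are hypothetical, and the random `warm_start` array is only a stand-in for image-plane centers that have already been traced to the source-plane:

```python
import numpy as np

def weighted_inertia(points, weights, centers):
    """Weighted sum of squared distances from each point to its nearest center."""
    dist2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return float((weights * dist2.min(axis=1)).sum())

def refine_centers(points, weights, init_centers, n_iter=5):
    """A few weighted Lloyd iterations starting from given centers (no random init)."""
    centers = np.array(init_centers, dtype=float)  # copy, so the input is untouched
    for _ in range(n_iter):
        dist2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist2.argmin(axis=1)
        for k in range(len(centers)):
            members = labels == k
            if members.any():
                centers[k] = np.average(points[members], axis=0,
                                        weights=weights[members])
    return centers

rng = np.random.default_rng(1)
# Made-up traced source-plane coordinates and weights for one lens model.
source_coords = rng.normal(size=(500, 2))
source_weights = np.exp(-0.5 * (source_coords ** 2).sum(axis=1)) + 1e-3

# Stand-in for the image-plane cluster centers after tracing to the source-plane.
warm_start = source_coords[rng.choice(500, size=20, replace=False)]

# Only a handful of iterations should be needed when the start is already close.
refined = refine_centers(source_coords, source_weights, warm_start, n_iter=5)
```

Because each lens model traces the coordinates differently, the refined centers differ per model, while the number of source pixels stays at whatever the warm start supplied.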

@rhayes777
Collaborator

rhayes777 commented Jan 22, 2018 via email

@Jammy2211
Owner Author

We'd always need to have the same number of pixels if we do that, as it's bad to change the source-plane resolution in weird ways.

It's not too hard though, we could just add extra points to the image-plane cluster and randomly remove enough to reach our target value.

@rhayes777
Collaborator

rhayes777 commented Jan 22, 2018 via email

@Jammy2211
Owner Author

It must not vary at all!

@rhayes777
Collaborator

What I mean is that the number of non-clustered image pixels (raw image pixels?) should not change. If I'm not mistaken, we choose the number of clusters in KMeans, so the number of clustered image pixels should be invariant.

@Jammy2211
Owner Author

So you would randomly remove image pixels before passing them to the weighted KMeans algorithm for clustering?

Makes sense.
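Something like the following, as a sketch only (NumPy assumed; `random_subsample`, the grid and the value of `n_keep` are hypothetical):

```python
import numpy as np

def random_subsample(coords, weights, n_keep, rng):
    """Randomly keep a fixed number of image pixels (and their weights)."""
    keep = rng.choice(len(coords), size=n_keep, replace=False)
    return coords[keep], weights[keep]

# Made-up sparse image-plane grid with uniform weights.
xs = np.linspace(-2.0, 2.0, 20)
image_coords = np.array([(x, y) for y in xs for x in xs])
image_weights = np.ones(len(image_coords))

rng = np.random.default_rng(0)
n_keep = 300  # fixed, so the KMeans input size never varies between draws
sub_coords, sub_weights = random_subsample(image_coords, image_weights, n_keep, rng)
```

The subsampled arrays would then go into the weighted KMeans with a fixed number of clusters, so the source-plane resolution stays the same while the discretization itself changes from draw to draw.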

@rhayes777
Collaborator

rhayes777 commented Jan 22, 2018 via email

@Jammy2211
Owner Author

Implemented.
