IDEA: Perform weight KMeans clustering in the image-plane #9

Jammy2211 · 2018-01-22T10:42:54Z

We are used to using weighted KMeans clustering to adapt the source pixelization in the source plane. A consequence of this is that we sample a unique discrete source-plane grid for every iteration of the method - something I believe helps remove systematic biases, but which makes the method slower and more cumbersome. It's a shame we currently have no option to use a (nearly) fixed discretization, as this would speed things up a lot at the expense of introducing biases. However, we could switch back to source-plane clustering at the end of the analysis, so that a long pipeline gains speed without bias.

To achieve this, we can perform weighted KMeans clustering on the source weights in the image plane (during the hyper-parameter optimization, where we set up the source-plane pixelization, regularization, etc.). We then fix the clusters to the 'best-fit' points in the image plane for lens modeling, thereby mimicking the behaviour of a fixed source-plane discretization. This will reuse all the functions the source-plane clustering method uses and requires no extra methods, so it should be simple to implement :).

This could also form the basis of a light-weighted Delaunay pixelization.

rhayes777 · 2018-01-22T10:46:17Z

Just to be clear: we cluster image pixels and then use those clusters for source plane pixelization? I’m not totally sure I follow.


Jammy2211 · 2018-01-22T10:56:59Z

We normally trace a sparsely sampled set of image-pixels to the source-plane, and then feed those source-plane coordinates (weighted by something equivalent to their luminosity) into the weighted KMeans algorithm. By doing this in the source-plane, when we change the lens model (which moves all the source-plane coordinates around) we have to re-grid the source pixelization to match it, by feeding this new set of source coordinates into the KMeans. Therefore, every pixelization comes out unique.

If we instead feed the same sparsely sampled set of image-pixels into the weighted KMeans algorithm, but using their image-plane coordinates, we'll achieve similar behaviour. That is, we'll compute a set of image-plane coordinates that trace the source's surface brightness. When we trace these coordinates to the source-plane, we can use them to compute the source pixelization just like we would normally. This would skip the KMeans step, as they already represent the centers of source-pixels (their centers are then used to pair with all sub image-pixels, perform the Voronoi gridding, etc.).

The crucial difference is that because these coordinates were computed in the image-plane, when we change the lens model they will trace to new source-plane coordinates and automatically adapt the source pixelization to match it - without any need to re-run the weighted KMeans algorithm. The downside is that this will happen in a regular and predictable way, thus potentially introducing bias.
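
A minimal sketch of this idea, in plain Python so that it has no dependencies. Everything named here is an assumption for illustration rather than the project's code: `weighted_kmeans` is a bare-bones weighted Lloyd's algorithm, `trace_to_source_plane` uses a toy singular isothermal sphere deflection as a stand-in for the real lens model, and the pixel weights are made up. The point it shows is that the clustering runs once in the image plane, and each new lens model only re-traces the fixed cluster centres.

```python
import math
import random


def weighted_kmeans(points, weights, k, n_iter=50, seed=0):
    """Weighted Lloyd's algorithm on 2D points; returns k cluster centres."""
    rng = random.Random(seed)
    centres = rng.sample(points, k)
    for _ in range(n_iter):
        # Per cluster: sum(w * x), sum(w * y), sum(w).
        sums = [[0.0, 0.0, 0.0] for _ in range(k)]
        for (x, y), w in zip(points, weights):
            j = min(range(k),
                    key=lambda c: (x - centres[c][0]) ** 2 + (y - centres[c][1]) ** 2)
            sums[j][0] += w * x
            sums[j][1] += w * y
            sums[j][2] += w
        # An empty (or zero-weight) cluster keeps its previous centre.
        new_centres = [
            (sx / sw, sy / sw) if sw > 0 else centres[j]
            for j, (sx, sy, sw) in enumerate(sums)
        ]
        if new_centres == centres:
            break
        centres = new_centres
    return centres


def trace_to_source_plane(image_points, einstein_radius):
    """Toy lens model (assumption): beta = theta - theta_E * theta / |theta|."""
    traced = []
    for x, y in image_points:
        r = math.hypot(x, y)
        scale = 1.0 - einstein_radius / r if r > 0 else 1.0
        traced.append((x * scale, y * scale))
    return traced


# Hypothetical sparse image-plane grid, with weights peaking on a ring.
rng = random.Random(1)
image_points = [(rng.uniform(-2, 2), rng.uniform(-2, 2)) for _ in range(400)]
weights = [math.exp(-((math.hypot(x, y) - 1.0) ** 2) / 0.1) for x, y in image_points]

# KMeans runs once, in the image plane.
image_centres = weighted_kmeans(image_points, weights, k=20)

# Each lens model only re-traces the same centres; no KMeans call is needed.
for einstein_radius in (0.9, 1.0, 1.1):
    source_centres = trace_to_source_plane(image_centres, einstein_radius)
    print(einstein_radius, len(source_centres))
```

The number of source pixels stays fixed at `k` for every lens model, while their source-plane positions move smoothly with the model parameters, which is exactly the regular, predictable behaviour described above.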

rhayes777 · 2018-01-22T11:01:49Z

This sounds like a very good solution. I don’t fully appreciate how bias will be an issue but if we can produce distinct source pixelizations whilst only performing a single KMeans clustering then that will save a lot of processing time.


Jammy2211 · 2018-01-22T11:07:56Z

The bias issue is what I explained with an analogy to a square grid, where the use of a single fixed discretization leads to biases, as there are always 'favourable' source-plane discretizations. I suspect this approach will introduce the same problem - yes, the pixelization is varying here, but it does so in a smooth, regular and predictable way nonetheless, which will inevitably have some 'favourable' source-plane discretization.

It's not expensive to flip between the two though, and for a lot of the earlier stages of the analysis there's no point worrying about subtle systematics in the lens model.

Jammy2211 · 2018-01-22T11:09:27Z

If you think about it, this is pretty much just doing what I suggested in the issue below lol:

#5

rhayes777 · 2018-01-22T11:09:41Z

It still has the advantage that it is adaptive though. The pixelization in image space could provide the initial centres of pixels in source space, speeding that step up. Assuming this doesn’t introduce some systematic bias?


rhayes777 · 2018-01-22T11:10:13Z

Ha yeah it is.


Jammy2211 · 2018-01-22T11:13:32Z

You're right, we could reintroduce randomness by reapplying the KMeans in the source-plane, using these initialized points to speed it up (for just the lens modeling step, where KMeans in the image-plane is switched off).

> Assuming this doesn’t introduce some systematic bias?

It should remove it!
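
A rough sketch of the warm-start idea, again in dependency-free Python. The function name `refine_centres` and the tiny data set are hypothetical; the only point is that the Lloyd iterations begin from centres supplied by the caller (here, standing in for image-plane cluster centres traced to the source plane) rather than from a fresh random initialization. With scikit-learn, the equivalent would presumably be passing an array as `init` to `KMeans` with `n_init=1`.

```python
def refine_centres(points, weights, init_centres, max_iter=20):
    """Weighted Lloyd iterations started from init_centres.

    Returns (centres, iterations_run). The number of centres never changes.
    """
    centres = list(init_centres)
    k = len(centres)
    for iteration in range(1, max_iter + 1):
        # Per cluster: sum(w * x), sum(w * y), sum(w).
        sums = [[0.0, 0.0, 0.0] for _ in range(k)]
        for (x, y), w in zip(points, weights):
            j = min(range(k),
                    key=lambda c: (x - centres[c][0]) ** 2 + (y - centres[c][1]) ** 2)
            sums[j][0] += w * x
            sums[j][1] += w * y
            sums[j][2] += w
        new_centres = [
            (sx / sw, sy / sw) if sw > 0 else centres[j]
            for j, (sx, sy, sw) in enumerate(sums)
        ]
        if new_centres == centres:
            return centres, iteration
        centres = new_centres
    return centres, max_iter


# Hypothetical source-plane coordinates and luminosity-like weights.
source_points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
source_weights = [1.0, 3.0, 1.0, 1.0]

# Stand-in for image-plane cluster centres traced to the source plane.
warm_start = [(1.0, 1.0), (9.0, 1.0)]

centres, iterations_run = refine_centres(source_points, source_weights, warm_start)
print(centres, iterations_run)  # [(0.0, 1.5), (10.0, 1.0)] 2
```

Because the starting centres are already close to the final ones, the loop should typically stop after a few iterations, although how much time this saves in practice would need to be measured.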

rhayes777 · 2018-01-22T11:15:46Z

We could also introduce randomness in the method by which we create a sparse image pixel set. For example, we could keep a pixel only if its weight is above a randomly generated number.


Jammy2211 · 2018-01-22T11:18:05Z

We'd need to always have the same number of pixels if we do that, as it's bad to change the source-plane resolution in weird ways.

It's not too hard though: we could just add extra points to the image-plane cluster and randomly remove enough to reach our target value.

rhayes777 · 2018-01-22T11:22:25Z

The number of image pixels kept shouldn’t vary too dramatically.


Jammy2211 · 2018-01-22T11:26:57Z

It must not vary at all!

rhayes777 · 2018-01-22T15:03:13Z

What I mean is the number of non-clustered image pixels (raw image pixels?) should not change. If I'm not mistaken we choose the number of clusters in KMeans so the number of clustered image pixels should be invariant.

Jammy2211 · 2018-01-22T15:26:50Z

So you would randomly remove image pixels before passing them to the weighted KMeans algorithm for clustering?

Makes sense.

rhayes777 · 2018-01-22T15:35:33Z

So I guess we can either randomly remove them with no weighting and pass them into weighted KMeans **OR** we can randomly remove them, accounting for weighting, and pass them to normal KMeans.
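
The two options could be sketched as follows. This is an illustration under assumptions, not the implemented method: both helper names are hypothetical, and both keep exactly `n_keep` pixels so that the size of the set handed to KMeans never varies. For the weighted removal, the "keep a pixel if its weight beats a random number" rule is swapped for fixed-size weighted sampling without replacement (Efraimidis-Spirakis keys), because the threshold rule alone would let the number of kept pixels fluctuate.

```python
import math
import random


def thin_uniform(points, weights, n_keep, rng):
    """Option 1: remove pixels uniformly at random and keep the weights,
    so that a weighted KMeans can use them afterwards."""
    kept = rng.sample(range(len(points)), n_keep)
    return [points[i] for i in kept], [weights[i] for i in kept]


def thin_by_weight(points, weights, n_keep, rng):
    """Option 2: keep exactly n_keep pixels, favouring high weights, and
    return them unweighted for a standard KMeans."""
    keyed = []
    for i, w in enumerate(weights):
        u = 1.0 - rng.random()  # in (0, 1], so log(u) is always defined
        # Efraimidis-Spirakis key in log form; zero-weight pixels sort last.
        key = math.log(u) / w if w > 0 else -math.inf
        keyed.append((key, i))
    keyed.sort(reverse=True)
    return [points[i] for _, i in keyed[:n_keep]]


rng = random.Random(0)
# Hypothetical pixels: the first 50 carry no source light at all.
points = [(float(i), 0.0) for i in range(100)]
weights = [0.0] * 50 + [1.0 + 0.1 * i for i in range(50)]

kept_points, kept_weights = thin_uniform(points, weights, n_keep=30, rng=rng)
bright_points = thin_by_weight(points, weights, n_keep=30, rng=rng)
print(len(kept_points), len(bright_points))  # 30 30
```

Either way the randomness enters before clustering, and the number of clusters is still set separately by the KMeans call, so the source-plane resolution stays fixed.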


Jammy2211 · 2019-06-23T10:40:21Z

Implemented.

Jammy2211 mentioned this issue Jan 22, 2018

KMeans initialize seed points #5

Closed

Jammy2211 closed this as completed Jun 23, 2019
