Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Interest in Optimal Spatial Matching? #665

Closed
ljwolf opened this issue Dec 8, 2023 · 3 comments
Closed

Interest in Optimal Spatial Matching? #665

ljwolf opened this issue Dec 8, 2023 · 3 comments

Comments

@ljwolf
Copy link
Member

ljwolf commented Dec 8, 2023

For a different causal inference project, I'm writing a few spatial/feature matching algorithms.

One that I have not seen before mentioned anywhere in the literature is like an optimal spatial matching graph?

Formally, this is a graph where each observation is connected to at least k neighbors in a way that minimises the total edge distance in the graph. So, let distance d_{ij} be the length of a possible edge connecting i to j, and let m_{ij} be a binary variable that is 1 when i is connected to j and 0 otherwise. To build a graph within a single point pattern, we force m_{ij} = m_{ji} so that links are symmetric. But, you can relax this to build a "cross-matching" from one point pattern to another point pattern that minimises the distance of match links.

Letting k be our number of neighbors per observation, the spatial matching graph arises from solving the following integer program.

minimize \sum_i^n \sum_j^n  d_{ij}m_{ij} 
subject to
    \sum_j^n m_{ij} >= k \forall i
    m_{ij} \in {0,1} forall ij

This gives rise to the following graphs:

match_knn

The IOU score at the bottom is the intersection over union score with respect to the standard KNN graph.

In some cases (like odd k) you can't solve this exactly for a dataset with odd numbers. So, depending on how this is specified, you can allow for either:

  1. K-Match: a single observation with k+1 neighbors, keeping inequality for the k-neighbors constraint.
  2. K-Partial Match: a set of "unmatchable" observations with k+1 neighbors, one pair each which has half-weight (plotted in light blue above).
@ljwolf
Copy link
Member Author

ljwolf commented Dec 8, 2023

I've looked into it, and I can't express this in a form for scipy.optimize.milp(), since the decision variable is not one-dimensional. Thus, this would require a soft dependency on pulp.

@ljwolf
Copy link
Member Author

ljwolf commented Dec 8, 2023

I should note, too, that the fact that k=2 causes a tour in the above data doesn't mean that, in general, k=2 solves a tour.

@ljwolf ljwolf mentioned this issue Dec 8, 2023
8 tasks
@martinfleis
Copy link
Member

Closed by #666

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants