Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make nbl.smat.LookupNdBuilder more generic #97

Open
clbarnes opened this issue May 30, 2022 · 2 comments
Open

Make nbl.smat.LookupNdBuilder more generic #97

clbarnes opened this issue May 30, 2022 · 2 comments

Comments

@clbarnes
Copy link
Collaborator

At present it's annotated to take Dotprops and functions which operate on Dotprops. Synblast doesn't use Dotprops but would benefit from a LookupNdBuilder, and I don't think the builder actually uses any dotprops functionality as it just uses the passed-in function to operate on them. Any object with length N and any function which takes 2 of those objects to produce an array of length N should work, so this may just be a case of adding a generic type annotation.

Requested by @swilson27

@schlegelp
Copy link
Collaborator

schlegelp commented Jun 3, 2022

Happy to pick this up if you don't have the bandwidth at the moment, @clbarnes.

At first glance, the only problem appears to be that LookupNdBuilder uses the length of individual neurons to calculate the n_matching_qual_vals variable. See the len(self.dotprops[q_idx]) in this method, for example:

    def _get_pairs(self):
        matching_pairs = list(set(self._yield_matching_pairs()))
        # need to know the eventual distdot count
        # so we know how many non-matching pairs to draw
        q_idx_count = Counter(p[0] for p in matching_pairs)
        n_matching_qual_vals = sum(
            len(self.dotprops[q_idx]) * n_reps for q_idx, n_reps in q_idx_count.items()
        )

        nonmatching_pairs = self._pick_nonmatching_pairs(n_matching_qual_vals)
        return matching_pairs, nonmatching_pairs

If I understand this correctly, you're using it to match the number of points (i.e. size) rather than the number of neurons between matching and non-matching pairs? Currently only Dotprops implements a __len__ method though. In addition, the similarity metric might not be using whatever __len__ measures (synBLAST being an example for that).

I can see three solutions:

  1. Implement __len__ methods for all neuron types.
  2. Use len(object) if possible and, failing that, fall back to the total number
  3. Let the user define a function for that. Default would be len_func=len but could also be e.g. len_func=lambda x: len(x.connectors) for synBLAST or len_func=lambda x: 1 for when we don't care about size, only counts.

I'd favour option 3. Thoughts?

@schlegelp
Copy link
Collaborator

schlegelp commented Jun 6, 2022

@swilson27 I pushed some changed to the generic_lookupnd branch. I haven't tried this with syNBLAST yet but perhaps you want to take it for a spin?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants