Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Vectorize image similarity interfaces #110

Merged
merged 1 commit into from
Apr 20, 2022

Conversation

schencej
Copy link
Contributor

@schencej schencej commented Apr 8, 2022

Similarity interfaces vectorized per out discussion.

One point of contention:

  • Currently the query images input for the high-level interface is an Iterable. If we switched to Sequence we could check that the number of output heatmaps matches the number of input images

@codecov
Copy link

codecov bot commented Apr 8, 2022

Codecov Report

Merging #110 (8f89df8) into master (d51dcc3) will increase coverage by 0.00%.
The diff coverage is 100.00%.

❗ Current head 8f89df8 differs from pull request most recent head 2aed5af. Consider uploading reports for the commit 2aed5af to get more accurate results

@@           Coverage Diff           @@
##           master     #110   +/-   ##
=======================================
  Coverage   99.87%   99.87%           
=======================================
  Files          57       57           
  Lines        2327     2329    +2     
=======================================
+ Hits         2324     2326    +2     
  Misses          3        3           
Flag Coverage Δ
unittests 99.87% <100.00%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
xaitk_saliency/exceptions.py 60.00% <ø> (-6.67%) ⬇️
...cy/impls/gen_image_similarity_blackbox_sal/sbsm.py 100.00% <ø> (ø)
...aitk_saliency/interfaces/gen_descriptor_sim_sal.py 100.00% <ø> (ø)
.../gen_descriptor_sim_sal/test_similarity_scoring.py 100.00% <100.00%> (ø)
...pls/gen_image_similarity_blackbox_sal/test_sbsm.py 100.00% <100.00%> (ø)
...imilarity_blackbox_sal/test_sim_occlusion_based.py 100.00% <100.00%> (ø)
tests/interfaces/test_gen_descriptor_sim_sal.py 100.00% <100.00%> (ø)
...terfaces/test_gen_image_similarity_blackbox_sal.py 100.00% <100.00%> (ø)
...impls/gen_descriptor_sim_sal/similarity_scoring.py 100.00% <100.00%> (ø)
...n_image_similarity_blackbox_sal/occlusion_based.py 100.00% <100.00%> (ø)
... and 1 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d51dcc3...2aed5af. Read the comment docs.

Copy link
Member

@brianhhu brianhhu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, LGTM- thanks for the update! Added a few minor suggestions for typo and grammar fixes.

@schencej
Copy link
Contributor Author

@brianhhu @Purg Did you see my initial comment above? I had an implementation question about the type of our query images input.

@Purg
Copy link
Member

Purg commented Apr 12, 2022

Did you see my initial comment above?

I did! I thought I wrote a comment related to that, but I see that I must not have (or missed pressing the "Comment" button).

Changing the requirement to be a Sequence would be the simplest way to allow that check at the interface level.

Alternatively, we can still acquire such a count without requiring an explicit sequence by creating a wrapper iterator around it:

def generate(...):
    ...
    num_query = 0
    def wrap():
        nonlocal num_query
        for q in query_images:
            yield q
            num_query += 1

    # the same "get actual output" line as before, but changing the iterable input to the wrapper iterator
    output = self._generate(ref_image, wrap(), blackbox)

    # num_query should now be the number of query images
    assert output.shape[0] == num_query, f"Output heatmaps didn't match input images: {output.shape[0]} != {num_query}"

    ...

I suppose technically the iterator count be copied into separate processes (e.g. torch dataloader with iterable dataset), and there is a solution to that as well using multiprocessing.Manager(), but I an open to feedback from @brianhhu + others on if that complexity is OK for this point in time. I'm sure the following could be improved because the space of corner cases is a bit larger.

def generate(...):
    ...
    with multiprocessing.Manager() as manager:
        num_query = manager.Value(int, 0)
        idx_obs = manager.dict()
        lock = manager.Lock()
        def wrap():
            nonlocal lock, count, idx_obs
            for i, q in enumerate(query_images):
                yield q
                with lock:
                    if i not in idx_obs:
                        num_query.value += 1
                        idx_obs[i] = 1

        output = self._generate(ref_image, wrap(), blackbox)

        # num_query should now be the number of query images
        assert output.shape[0] == num_query.value, \
            f"Output heatmaps didn't match input images: {output.shape[0]} != {num_query.value}"

    ...

@brianhhu
Copy link
Member

I have no strong opinion on this matter, but what are the drawbacks of just making it a Sequence? At first glance, it looks like a lot of added complexity for that check if we decide to still use iterators. In the end, I'm open to whatever we think is best- thanks!

@Purg
Copy link
Member

Purg commented Apr 13, 2022

but what are the drawbacks of just making it a Sequence?

The only obvious drawback is that we would not be able to operate on an unsized container, e.g. a stream, but considering that use-case in this context is likely over-design. I think your comment on complexity is valid. We can change to the simpler Sequence type for input for now until we identify use-cases that would actually require Iterable.

@@ -98,7 +94,7 @@ def generate(
sal = np.clip(sal, -1, 1)

# return just HxW components
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is extraneous given we reverted back to returning NxHxW saliency maps.

Copy link
Member

@brianhhu brianhhu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@sonarcloud
Copy link

sonarcloud bot commented Apr 19, 2022

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 0 Code Smells

No Coverage information No Coverage information
0.0% 0.0% Duplication

@schencej
Copy link
Contributor Author

Addressed the last comment and autosquashed. Should be good to go.

@Purg Purg merged commit 2508e77 into XAITK:master Apr 20, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants