AnchorBase coverage update for proposed anchor after computation. #914

Closed
RobertSamoilescu opened this issue Apr 24, 2023 · 2 comments

@RobertSamoilescu
Collaborator

Before describing the issue that I found, I will just define the state key names so we know what they mean:

  • state['t_idx'][anchor] - the list of indices into the sampled data where the anchor applies.
  • state['t_nsamples'][anchor] - the number of sampled instances where the anchor applies, i.e. len(state['t_idx'][anchor]).
  • state['t_positive'][anchor] - the number of instances from the sampled data that have the same label as the instance to be explained.
  • state['t_order'][anchor] - the ordered sequence of predicates in the anchor. I believe that it can be used to compute the contributions as discussed above.
  • state['t_coverage_idx'][anchor] - the list of indices into the coverage dataset where the anchor applies.
  • state['t_coverage'][anchor] - the actual coverage of the anchor, i.e. the ratio between len(state['t_coverage_idx'][anchor]) and the length of the coverage dataset.
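
To make the relationships between these keys concrete, here is a minimal sketch of how they fit together for a single anchor. The index values, the coverage dataset size, and the `precision` variable are made up for illustration, and computing precision as `t_positive / t_nsamples` is my assumption about how these counters are meant to be used.

```python
from collections import defaultdict

# Illustrative state layout using the key names defined above.
state = {
    "t_idx": defaultdict(list),
    "t_nsamples": defaultdict(int),
    "t_positive": defaultdict(int),
    "t_order": defaultdict(list),
    "t_coverage_idx": defaultdict(list),
    "t_coverage": defaultdict(float),
}

anchor = (5, 7)
coverage_data_size = 20  # made-up size of the fixed coverage dataset

# Quantities tied to the sampled data (these drive the precision estimate).
state["t_idx"][anchor] = [0, 3, 4, 9]
state["t_nsamples"][anchor] = len(state["t_idx"][anchor])
state["t_positive"][anchor] = 3
state["t_order"][anchor] = [5, 7]

# Quantities tied to the fixed coverage dataset.
state["t_coverage_idx"][anchor] = [1, 2, 8]
state["t_coverage"][anchor] = len(state["t_coverage_idx"][anchor]) / coverage_data_size

# Assumed precision estimate: fraction of covered samples with the same label.
precision = state["t_positive"][anchor] / state["t_nsamples"][anchor]
```

The point of the split is that the first group changes every time new samples are drawn, while the second group depends only on the anchor and the fixed coverage dataset.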

Note that the coverage dataset is fixed and sampled at the beginning of the anchor_beam algorithm (see link here). It is constructed by sampling the original dataset with replacement. However, we can observe that the coverage of an anchor (i.e., state['t_coverage'][anchor]) is modified by the kllucb algorithm here.
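
As a rough illustration of why the coverage should be stable, the sketch below samples a coverage dataset once with replacement and computes the coverage of an anchor against it. The helper names `sample_coverage_data` and `coverage` are hypothetical, and treating an anchor as "the row equals the explained instance on every anchor feature" is a simplifying assumption, not the library's predicate logic.

```python
import random

def sample_coverage_data(dataset, n_samples, seed=0):
    """Draw a fixed coverage dataset by sampling rows with replacement."""
    rng = random.Random(seed)
    return [rng.choice(dataset) for _ in range(n_samples)]

def coverage(anchor, instance, coverage_data):
    """Return the covered indices and the fraction of rows the anchor applies to."""
    idx = [
        i for i, row in enumerate(coverage_data)
        if all(row[f] == instance[f] for f in anchor)
    ]
    return idx, len(idx) / len(coverage_data)

# Toy data: each row is a tuple of three categorical features.
dataset = [(0, 0, 1), (0, 1, 1), (1, 1, 0), (0, 1, 0)]
instance = (0, 1, 1)
coverage_data = sample_coverage_data(dataset, n_samples=200)

idx_short, cov_short = coverage((0,), instance, coverage_data)
idx_long, cov_long = coverage((0, 1), instance, coverage_data)
```

Because `coverage_data` never changes after it is drawn, calling `coverage` again for the same anchor returns the same value, and adding a predicate can only keep the coverage equal or lower it.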

Tracing it to the source, we can observe that kllucb calls the draw_samples method in multiple places (here and here), which eventually calls update_state here. The update_state function should only update the quantities related to the computation of the precision (that's what the kllucb algorithm is concerned with: informally, finding the arm with the best precision), but there is a line which updates the coverage too, using the coverage computed by default in the sampling function. The line which updates the coverage is here.

In my opinion, there is no reason to update the coverage there since, as mentioned before, the coverage is computed on a fixed dataset when the anchor is constructed (see here and here). Commenting out those lines should fix the error.
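
A minimal sketch of the intended behaviour, assuming a simplified state: the update touches only the precision-related counters and leaves the coverage alone. `update_precision_stats` is a hypothetical stand-in for the role played by update_state, not the actual function or its signature.

```python
from collections import defaultdict

def update_precision_stats(state, anchor, new_idx, same_label):
    """Fold newly drawn samples into the precision counters for one anchor.

    new_idx: indices of the new samples where the anchor applies.
    same_label: one bool per new sample, True if its label matches
    the label of the instance being explained.
    """
    state["t_idx"][anchor].extend(new_idx)
    state["t_nsamples"][anchor] += len(new_idx)
    state["t_positive"][anchor] += sum(same_label)
    # Deliberately no write to state["t_coverage"][anchor]: coverage is
    # computed once, on the fixed coverage dataset, when the anchor is built.

state = {
    "t_idx": defaultdict(list),
    "t_nsamples": defaultdict(int),
    "t_positive": defaultdict(int),
    "t_coverage": defaultdict(float),
}
anchor = (5, 7)
state["t_coverage"][anchor] = 0.145  # value from the example in this issue

update_precision_stats(
    state, anchor, new_idx=[10, 11, 12], same_label=[True, True, False]
)
```

After the call, the sample counters have moved but state["t_coverage"][anchor] is still 0.145.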

To explain with an example what was happening:

  • the algorithm first selects feature 5, but the precision constraint is not satisfied. Thus, best_coverage = -1.
  • then the algorithm selects feature 7, giving the anchor (5, 7). In this case the coverage is 0.145.
    For (5, 7) the precision constraint is satisfied, but the kllucb algorithm modifies the coverage to 0.139 because it is recomputed on the newly sampled data.
  • at the next step, the proposed anchor is (5, 7, 11), which also has a coverage of 0.145. Because 0.145 > 0.139, the algorithm selects (5, 7, 11) instead of (5, 7). Of course the precision constraint is satisfied in this case too.
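
The effect can be reproduced with a small sketch of the selection rule. `run_selection` is a hypothetical simplification of the beam search bookkeeping, and the strict `>` comparison against best_coverage is an assumption inferred from the behaviour described above. The coverage of (5,) was not reported, so it is left as None; it is never compared because the precision check fails first.

```python
def run_selection(steps):
    """Keep the anchor with the highest recorded coverage among those
    that satisfy the precision constraint (strict comparison assumed)."""
    best_anchor, best_coverage = None, -1.0
    for anchor, coverage, precision_ok in steps:
        if precision_ok and coverage > best_coverage:
            best_anchor, best_coverage = anchor, coverage
    return best_anchor, best_coverage

# Coverage of (5, 7) overwritten to 0.139 by the sampling step.
with_overwrite = [
    ((5,), None, False),
    ((5, 7), 0.139, True),
    ((5, 7, 11), 0.145, True),
]

# Coverage of (5, 7) left at its value on the fixed coverage dataset.
without_overwrite = [
    ((5,), None, False),
    ((5, 7), 0.145, True),
    ((5, 7, 11), 0.145, True),
]

result_with_overwrite = run_selection(with_overwrite)
result_without_overwrite = run_selection(without_overwrite)
```

With the overwritten value the longer anchor (5, 7, 11) wins; with the fixed-dataset value the shorter anchor (5, 7) is kept, since 0.145 is not strictly greater than 0.145.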
@jklaise
Contributor

jklaise commented Apr 26, 2023

Closed via #915.

@jklaise jklaise closed this as completed Apr 26, 2023
@jklaise jklaise reopened this Apr 26, 2023
@jklaise
Contributor

jklaise commented Apr 28, 2023

Closed via #919.

@jklaise jklaise closed this as completed Apr 28, 2023