Before describing the issue that I found, I will just define the state key names so we know what they mean:
state['t_idx'][anchor] - represents a list of indices from the sampled data where the anchor applies.
state['t_nsamples'][anchor] - represents the length of the list of indices from the sampled data where the anchor applies. Basically len(state['t_idx'][anchor]).
state['t_positive'][anchor] - represents the number of instances from the sampled data which have the same label as the instance to be explained.
state['t_order'][anchor] - ordered instance of predicates. I believe that it can be used to compute the contributions as discussed above.
state['t_coverage_idx'][anchor] - list of indices from the coverage dataset where the anchor applies.
state['t_coverage'][anchor] - the actual coverage of the anchor. In other words, it is the ratio between the length of state['t_coverage_idx'][anchor] and the length of the coverage dataset.
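To make the relationship between the two coverage keys concrete, here is a minimal sketch. The dataset size (1000) and the index set are made-up values for illustration, not taken from the library; only the key names come from the definitions above.

```python
# Hypothetical numbers: a coverage dataset of 1000 rows, of which
# 145 satisfy every predicate in the anchor (5, 7).
coverage_data_size = 1000
anchor = (5, 7)

state = {"t_coverage_idx": {}, "t_coverage": {}}

# Indices of the coverage dataset where the anchor applies.
state["t_coverage_idx"][anchor] = set(range(145))

# Coverage = fraction of the fixed coverage dataset matched by the anchor.
state["t_coverage"][anchor] = (
    len(state["t_coverage_idx"][anchor]) / coverage_data_size
)
```

Because the coverage dataset is fixed, this value depends only on the anchor and should not change while more samples are drawn.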
Note that the coverage dataset is fixed and sampled at the beginning of the anchor_beam algorithm (see link here). The coverage dataset is constructed by sampling the original dataset with replacement. We can observe that the coverage of an anchor (i.e., state['t_coverage'][anchor]) is modified by the kllucb algorithm here.
Tracing it to the source, we can observe that kllucb calls the draw_samples method in multiple places (here and here), which eventually calls update_state here. The update_state function should update only the quantities related to the computation of the precision (that's what the kllucb algorithm is concerned with: informally, finding the arm with the best precision), but there is a line which updates the coverage too, using a value computed by default in the sampling function. The line which updates the coverage is here.
In my opinion, there is no reason to update the coverage there since, as mentioned before, the coverage is computed on a fixed dataset when the anchor is constructed (see here and here). Commenting out those lines should fix the error.
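As a rough illustration of the proposed behaviour, the sketch below updates only the precision-related counters and leaves the coverage alone. The function name update_state_precision_only and its arguments are hypothetical and do not reproduce the library's actual update_state signature; only the state key names come from the definitions above.

```python
def update_state_precision_only(state, anchor, sample_idx, labels):
    """Record newly drawn samples for `anchor` (hypothetical helper).

    sample_idx: indices of the new samples where the anchor applies.
    labels: 1 if a sample has the same label as the explained instance, else 0.
    """
    state["t_idx"][anchor].update(sample_idx)
    state["t_nsamples"][anchor] += len(labels)
    state["t_positive"][anchor] += sum(labels)
    # Deliberately no write to state["t_coverage"][anchor]: the coverage
    # was computed once on the fixed coverage dataset.


anchor = (5, 7)
state = {
    "t_idx": {anchor: set()},
    "t_nsamples": {anchor: 0},
    "t_positive": {anchor: 0},
    "t_coverage": {anchor: 0.145},
}

update_state_precision_only(state, anchor, [0, 1, 2, 3], [1, 1, 0, 1])
precision = state["t_positive"][anchor] / state["t_nsamples"][anchor]
```

After the call, the precision estimate reflects the new samples while state["t_coverage"][anchor] keeps the value assigned when the anchor was constructed.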
To explain with an example what was happening:
The algorithm first selects feature 5, but the precision constraint is not satisfied. Thus, best_coverage = -1.
Then the algorithm selects feature 7, giving the anchor (5, 7). In this case the coverage is 0.145.
For (5, 7) the precision constraint is satisfied, but the kllucb algorithm changes the coverage to 0.139 because it is recomputed on the newly sampled data.
At the next step, the proposed anchor is (5, 7, 11), which also has a coverage of 0.145. Because 0.145 > 0.139, the algorithm selects (5, 7, 11) instead of (5, 7). Of course, the precision constraint is satisfied in this case too.
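The steps above can be replayed with a simplified selection loop. This is only a sketch: select_anchor is a hypothetical helper, and it assumes the search keeps a candidate only when its precision constraint holds and its coverage is strictly greater than best_coverage. The coverage values are the ones quoted in the example.

```python
def select_anchor(candidates, coverages):
    """Return the (anchor, coverage) pair a simplified search would keep.

    candidates: (anchor, precision_ok) pairs in the order they are proposed.
    coverages: coverage stored for each anchor at comparison time.
    """
    best_anchor, best_coverage = (), -1.0
    for anchor, precision_ok in candidates:
        # Assumed rule: strictly better coverage is required to replace the best.
        if precision_ok and coverages[anchor] > best_coverage:
            best_anchor, best_coverage = anchor, coverages[anchor]
    return best_anchor, best_coverage


candidates = [((5,), False), ((5, 7), True), ((5, 7, 11), True)]

# (5,) never passes the precision check, so its coverage is never looked up.
overwritten = {(5, 7): 0.139, (5, 7, 11): 0.145}  # coverage changed during sampling
unchanged = {(5, 7): 0.145, (5, 7, 11): 0.145}    # coverage left as first computed

result_overwritten = select_anchor(candidates, overwritten)
result_unchanged = select_anchor(candidates, unchanged)
```

With the overwritten coverage the longer anchor (5, 7, 11) is chosen; with the coverage left unchanged, the two candidates tie and the shorter anchor (5, 7) is kept under the strict comparison assumed here.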