Question about Cal metric implementation #29

@BarbaraCapelo

Hi, first of all, thank you for the great work on DeepCore.

I'd like to clarify a potential issue in the implementation of the Cal method. In the file methods/cal.py, within the finish_run function, the indices are selected as follows:

    def finish_run(self):
        scores=[]
        if self.balance:
            selection_result = np.array([], dtype=np.int32)
            for c, knn in zip(range(self.args.num_classes), self.knn):
                class_index = np.arange(self.n_train)[self.dst_train.targets == c]
                scores.append(self.calc_kl(knn, class_index))
                selection_result = np.append(selection_result, class_index[np.argsort(
                    #self.calc_kl(knn, class_index))[::1][:round(self.fraction * len(class_index))]])
                    scores[-1])[::1][:round(self.fraction * len(class_index))]])
        else:
            selection_result = np.argsort(self.calc_kl(self.knn))[::1][:self.coreset_size]
        return {"indices": selection_result, "scores":scores}

I noticed something that might be an issue, and I'd like to confirm it with you. My concern is the [::1] slice applied after np.argsort(). np.argsort() returns indices in ascending order of score, and [::1] leaves that order unchanged, so the samples with the lowest KL divergence scores are selected first.
Wouldn't it make more sense to select the samples with the highest KL divergence scores? That would require [::-1] to reverse the sorted order.
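
To make the difference concrete, here is a minimal sketch with made-up scores. The values and the names kl_scores and coreset_size are hypothetical and not taken from DeepCore; it only shows how the two slices behave on the output of np.argsort():

```python
import numpy as np

# Hypothetical KL divergence scores for five samples (illustrative values only).
kl_scores = np.array([0.30, 0.05, 0.90, 0.10, 0.60])
coreset_size = 2  # hypothetical number of samples to keep

# np.argsort sorts ascending: indices of the lowest scores come first.
order = np.argsort(kl_scores)

# [::1] steps forward by one, so the ascending order is unchanged.
lowest_first = order[::1][:coreset_size]

# [::-1] reverses the order, so the highest scores come first.
highest_first = order[::-1][:coreset_size]

print("lowest-score indices: ", lowest_first)
print("highest-score indices:", highest_first)
```

With these scores, the [::1] version picks the samples at indices 1 and 3 (scores 0.05 and 0.10), while the [::-1] version picks indices 2 and 4 (scores 0.90 and 0.60).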

Thanks in advance for any clarifications!
