Support batch-mode queries? #11

@dataframing

Hi,

I've run into a use case that I don't think modAL (or the other active learning libraries I've looked at) quite supports, but that looks relatively simple to implement. After reviewing modAL's internals a bit, I don't think it officially supports active learning with batch-mode queries.

The sampling strategies (for example, uncertainty sampling) do accept the n_instances parameter, but from what I can tell, uncertainty sampling can return redundant or sub-optimal queries when asked for more than one instance, since each instance is scored independently of the others in the batch. That's limiting in settings where we'd like the active learner to return multiple (if not all) examples from the unlabeled pool at once, because re-training the model after every single label is computationally expensive.
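To make the redundancy concrete, here is a minimal sketch in plain NumPy. It does not call modAL; `top_k_uncertain` and the probabilities are made up for illustration and only mimic what a "take the k most uncertain points" strategy does.

```python
import numpy as np

def top_k_uncertain(proba, k):
    """Hypothetical helper: indices of the k most uncertain pool points.

    Uncertainty is taken as 1 - max class probability, and every point
    is scored on its own, with no regard for the rest of the batch.
    """
    uncertainty = 1.0 - proba.max(axis=1)
    # Stable sort so that ties keep their original pool order.
    return np.argsort(-uncertainty, kind="stable")[:k]

# Made-up class probabilities for a pool of five points.
proba = np.array([
    [0.51, 0.49],  # point 0
    [0.51, 0.49],  # point 1: an exact duplicate of point 0
    [0.52, 0.48],  # point 2: a near duplicate of point 0
    [0.60, 0.40],  # point 3: somewhere else in feature space
    [0.95, 0.05],  # point 4: confidently classified
])

batch = top_k_uncertain(proba, k=3)
print(batch)
```

All three queries land on what is effectively the same example, so labeling the batch tells the model little more than labeling a single point would.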

I found requests for batch-mode support in the popular libact library (issues #57 and #89) but, to the best of my knowledge, they weren't addressed in any of its PRs.

In that case, would it make sense to implement something like Ranked batch-mode active learning by Cardoso et al.? I took a crack at it this weekend to understand it better, and if it's worth integrating and supporting in modAL, I'm happy to polish it and talk it through in a PR.
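For a rough idea of what that could look like, here is a minimal sketch in plain NumPy. It is only loosely modeled on my reading of the ranked batch-mode idea and is not the paper's reference implementation or anything in modAL: the function name `ranked_batch`, the similarity measure (1 / (1 + Euclidean distance to the nearest labeled or already selected point)), and the toy data are all assumptions made for illustration.

```python
import numpy as np

def ranked_batch(X_pool, uncertainty, X_labeled, n_instances):
    """Greedy, diversity-aware batch selection (illustrative sketch only).

    Each step scores the remaining pool points as
        alpha * (1 - similarity) + (1 - alpha) * uncertainty
    where similarity is measured against the labeled points plus the
    points already picked for this batch.
    """
    X_pool = np.asarray(X_pool, dtype=float)
    uncertainty = np.asarray(uncertainty, dtype=float)
    reference = np.asarray(X_labeled, dtype=float)
    remaining = list(range(len(X_pool)))
    selected = []

    for _ in range(min(n_instances, len(remaining))):
        # Assumed weighting: favor diversity while the pool is large
        # relative to the reference set.
        alpha = len(remaining) / (len(remaining) + len(reference))

        # Distance from each remaining point to its nearest reference point.
        diffs = X_pool[remaining][:, None, :] - reference[None, :, :]
        nearest = np.sqrt((diffs ** 2).sum(axis=2)).min(axis=1)
        similarity = 1.0 / (1.0 + nearest)

        scores = alpha * (1.0 - similarity) + (1.0 - alpha) * uncertainty[remaining]
        best = remaining[int(np.argmax(scores))]

        selected.append(best)
        # Treat the pick as if it were labeled, so near copies score lower.
        reference = np.vstack([reference, X_pool[best]])
        remaining.remove(best)

    return selected

# Toy pool: points 0 and 1 are exact duplicates, point 2 is a near duplicate.
X_pool = [[0.0, 0.0], [0.0, 0.0], [0.1, 0.0], [4.0, 4.0], [1.0, 1.0]]
uncertainty = [0.49, 0.49, 0.48, 0.40, 0.05]
X_labeled = [[1.0, 1.0]]

batch = ranked_batch(X_pool, uncertainty, X_labeled, n_instances=3)
print(batch)
```

Because each pick is added to the reference set before the next one is scored, an exact duplicate of an already selected point gets no diversity credit, so on this toy pool the batch does not contain both point 0 and point 1. A single model fit can then serve the whole batch instead of re-training after each query.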

Thanks!
