Give ability for users to get uncertainty values. #144

Dref360 · 2021-08-18T17:42:54Z

Summary:

Allow the user to store uncertainties on disk when calling ActiveLearningLoop.step.

One issue this creates is that BatchBALD indices are not correlated with their uncertainty values, which breaks a test. I'm not sure what the best course of action is.
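To illustrate the mismatch, here is a minimal sketch (not BaaL's implementation) of why ranks from a score-sorting heuristic line up with their uncertainty values, while the order produced by a greedy batch selection need not. The names `rank_by_score`, `greedy_batch`, and `is_non_increasing`, as well as the toy redundancy penalty, are assumptions made purely for illustration.

```python
def rank_by_score(scores):
    """Return indices sorted by descending score, plus the scores themselves."""
    ranks = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranks, scores


def greedy_batch(scores, groups, penalty=0.25):
    """Toy greedy selection: each pick penalises remaining items of the same group.

    This only mimics the shape of a joint-batch criterion; it is not BatchBALD.
    """
    remaining = set(range(len(scores)))
    picked_per_group = {}
    history = []
    while remaining:
        # A candidate's effective score shrinks with the number of items
        # already picked from its group (a stand-in for redundancy).
        best = max(
            sorted(remaining),
            key=lambda i: scores[i] * penalty ** picked_per_group.get(groups[i], 0),
        )
        history.append(best)
        picked_per_group[groups[best]] = picked_per_group.get(groups[best], 0) + 1
        remaining.remove(best)
    return history, scores


def is_non_increasing(values):
    """True when every value is >= the one that follows it."""
    return all(a >= b for a, b in zip(values, values[1:]))


scores = [0.9, 0.8, 0.3, 0.2]
groups = ["a", "a", "b", "b"]

ranks, unc = rank_by_score(scores)
sorted_ok = is_non_increasing([unc[i] for i in ranks])

batch, unc = greedy_batch(scores, groups)
greedy_ok = is_non_increasing([unc[i] for i in batch])
```

Under these assumptions, a test that expects the uncertainties to decrease along the returned indices holds for the sorted ranks but not for the greedy order.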

Features:

Fix issue where NLPDataset was always imported.

Checklist:

Your code is documented (To validate this, add your module to tests/documentation_test.py).
Your code is tested with unit tests.
You moved your Issue to the PR state.

parmidaatg · 2021-08-18T18:40:26Z

src/baal/active/active_loop.py

@@ -64,9 +70,18 @@ def step(self, pool=None) -> bool:
        if len(pool) > 0:
            probs = self.get_probabilities(pool, **self.kwargs)
            if probs is not None and (isinstance(probs, types.GeneratorType) or len(probs) > 0):
-                to_label = self.heuristic(probs)
+                to_label, uncertainty = self.heuristic.get_ranks(probs)


Maybe add a flag to control whether uncertainties are saved, since saving might take more time if people don't need it?

We do not save if the path is None
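The behaviour described in this reply can be sketched as follows. This is a hedged illustration, not the code from the PR: the helper name `maybe_save_uncertainty`, its parameters, the use of `pickle`, and the file-name pattern are all assumptions, the last loosely based on the "pool=90" and "labelled=10" strings checked in the test.

```python
import os
import pickle
import tempfile


def maybe_save_uncertainty(uncertainty, folder, pool_size, labelled_size):
    """Pickle `uncertainty` into `folder`, or do nothing when `folder` is None.

    Hypothetical helper; the file-name pattern is an assumption.
    """
    if folder is None:
        # No path configured: skip the write entirely, so there is no extra cost.
        return None
    os.makedirs(folder, exist_ok=True)
    name = f"uncertainty_pool={pool_size}_labelled={labelled_size}.pkl"
    path = os.path.join(folder, name)
    with open(path, "wb") as f:
        pickle.dump(uncertainty, f)
    return path


# Usage: nothing is written without a folder; one file is written with one.
skipped = maybe_save_uncertainty([0.1, 0.7], None, pool_size=90, labelled_size=10)
with tempfile.TemporaryDirectory() as tmpdir:
    saved = maybe_save_uncertainty([0.1, 0.7], tmpdir, pool_size=90, labelled_size=10)
    n_files = len(os.listdir(tmpdir))
    with open(saved, "rb") as f:
        restored = pickle.load(f)
```

Treating `None` as "do not save" avoids a separate boolean flag: the path argument doubles as the switch.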

parmidaatg · 2021-08-18T18:52:30Z

src/baal/active/heuristics/heuristics.py


            if partial_multi_bald_b.max() < MIN_SPREAD:
                COUNT += 1
                if COUNT > 50 or len(history) >= predictions.shape[0]:
                    break

-        return np.array(history)
-
-    def reorder_indices(self):


why did we remove this exception?

We raise that exception somewhere else in get_ranks

parmidaatg · 2021-08-18T18:57:19Z

src/baal/active/heuristics/heuristics.py


-    def reorder_indices(self, predictions):


Are we randomly sampling the predictions here instead of computing the variance across the iterations?

In Random we just sample randomly

ah I thought it was the Variance one. my bad

parmidaatg · 2021-08-18T19:02:36Z

tests/active/active_loop_test.py

+    _ = active_loop.step()
+    assert len(os.listdir(tmpdir)) == 1
+    file = pjoin(tmpdir, os.listdir(tmpdir)[0])
+    assert "pool=90" in file and "labelled=10"


The part after "and" is always true, no?

It depends on how many items are labelled
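A minimal sketch of the reviewer's point, using a hypothetical file name that is not taken from the PR: because of how Python parses the expression, the second operand is a bare non-empty string, which is always truthy, so only the "pool=90" substring is actually checked.

```python
# Hypothetical file name with a labelled count that does NOT match "labelled=10".
file = "uncertainty_pool=90_labelled=7.pkl"

# Parsed as: ("pool=90" in file) and ("labelled=10").
# A non-empty string literal is always truthy, so the second operand never fails.
original_check = bool("pool=90" in file and "labelled=10")

# Testing membership for both substrings catches the mismatch.
stricter_check = "pool=90" in file and "labelled=10" in file
```

With this file name the original form passes while the stricter form fails, so an assertion written the first way would not notice a wrong labelled count.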

parmidaatg

LGTM

fr.branchaud-charron added 2 commits August 18, 2021 13:22

Give ability for users to get uncertainty values.

35c6832

Allow ALLoop to store uncertainty

ee8380b

Dref360 requested review from parmidaatg and rafapi August 18, 2021 17:42

parmidaatg reviewed Aug 18, 2021

parmidaatg previously approved these changes Aug 23, 2021

Change according to review

da2072a

Dref360 dismissed parmidaatg’s stale review via da2072a August 23, 2021 20:40

parmidaatg approved these changes Aug 24, 2021

parmidaatg merged commit 252f280 into master Sep 7, 2021

parmidaatg deleted the refactor_uncertainty branch September 7, 2021 15:38
