Weak Supervision without manual labels #33

FelixKirschKern · 2023-10-02T12:19:25Z

This is the main PR; related PRs are:

code-kern-ai/refinery-submodule-model#57
code-kern-ai/refinery-gateway#160
code-kern-ai/weak-nlp#7

This PR makes it possible to run weak supervision without any manual labels for the labeling task. Instead of the precision values calculated for each heuristic, custom values can be applied.
In more detail:

If no manual labels are set, all heuristics get the same default precision of 0.8.
This default can be overwritten with another value. If it is, the custom value is also applied when manual labels are set.
A precision can also be set for each heuristic individually. This likewise overwrites the calculated precision when manual labels are set.
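
The three rules above can be sketched as a small resolution function. This is only an illustration, not the code from this PR: `resolve_precision` and its parameter names are hypothetical. The rule that a per-heuristic mapping suppresses the global value when both are given follows the description of the GraphQL arguments. The fallback for a heuristic that is missing from the mapping is an assumption.

```python
from typing import Dict, Optional

DEFAULT_PRECISION = 0.8  # default named in the PR description


def resolve_precision(
    heuristic_id: str,
    calculated: Optional[float] = None,
    overwrite_default: Optional[float] = None,
    overwrite_per_heuristic: Optional[Dict[str, float]] = None,
) -> float:
    """Pick the precision for one heuristic (hypothetical helper).

    `calculated` is None when no manual labels exist.
    """
    if overwrite_per_heuristic is not None:
        # A per-heuristic value wins over everything else.
        if heuristic_id in overwrite_per_heuristic:
            return float(overwrite_per_heuristic[heuristic_id])
    elif overwrite_default is not None:
        # One custom value for all heuristics, even if manual labels exist.
        return overwrite_default
    # Assumption: a heuristic without an overwrite keeps its calculated
    # precision, or the default when there are no manual labels.
    return calculated if calculated is not None else DEFAULT_PRECISION
```

For example, `resolve_precision("h1", calculated=0.6, overwrite_default=0.5)` yields 0.5, because the custom default replaces the calculated value.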

For testing, the weak-supervisor has to be excluded from the regular start and set up to work with a local installation of the weak-nlp package (copy weak-nlp into weak-supervisor and uncomment the line for the local installation in dev.Dockerfile).
The other services can be started from dev-setup with: bash start -b ws-no-manual -e refinery-weak-supervisor

Two GraphQL endpoints are changed:

initiateWeakSupervisionByProjectId
It now has two additional optional arguments: overwriteDefaultPrecision and overwriteWeakSupervision. overwriteDefaultPrecision expects a float value and applies the same precision to all heuristics, overwriting the calculated precision if one exists.
overwriteWeakSupervision can be used to set a distinct precision for each heuristic. It expects a dictionary with the following structure:

{
  "<labeling_task_id>": {
    "<heuristic_id>": "<precision>",
    "<heuristic_id>": "<precision>",
    ...
  },
  ...
}

If both arguments are given, only overwriteWeakSupervision is applied.

runHeuristicThenTriggerWeakSupervision
The same optional arguments are added. overwriteDefaultPrecision works the same as above.
overwriteWeakSupervision only expects a subset of the dictionary from above as the endpoint is labeling task specific:

{
    "<heuristic_id>": "<precision>",
    "<heuristic_id>": "<precision>",
    ...
}
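
The relation between the two dictionary shapes can be sketched as follows. The helper `overwrite_for_task` and the placeholder ids are hypothetical and only illustrate that the task-specific argument is one inner entry of the project-wide dictionary. How a missing labeling task is treated is an assumption.

```python
from typing import Dict

# Project-wide shape: labeling task id -> heuristic id -> precision.
ProjectOverwrite = Dict[str, Dict[str, float]]


def overwrite_for_task(overwrite: ProjectOverwrite, labeling_task_id: str) -> Dict[str, float]:
    """Return the heuristic -> precision mapping of a single labeling task."""
    # Assumption: a task without an entry simply has no overwrites.
    return dict(overwrite.get(labeling_task_id, {}))


# Placeholder ids, not taken from a real project.
project_overwrite = {
    "task-a": {"heuristic-1": 1.0, "heuristic-2": 0.0},
    "task-b": {"heuristic-3": 0.7},
}
task_overwrite = overwrite_for_task(project_overwrite, "task-a")
```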

The endpoint for starting a gates runtime also changed. Here, only one optional argument was added.

from typing import Any, Dict, List, Optional

from pydantic import BaseModel


class ApiConfigRequest(BaseModel):
    heuristics: List[str]
    similarity_search: Optional[List[str]] = None
    overwrite_weak_supervision: Optional[Dict[str, Any]] = None

overwrite_weak_supervision combines both arguments of the GraphQL endpoints: either a float or a dictionary can be provided. The dictionary must have the same structure as for the initiateWeakSupervisionByProjectId endpoint, since a gates runtime can handle multiple labeling tasks.
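
A combined value of this kind could be separated back into the two GraphQL-style arguments roughly as below. This is a sketch under assumptions, not the gates implementation: `split_overwrite` and the `Overwrite` alias are hypothetical, and rejecting booleans is a choice made here for illustration.

```python
from typing import Dict, Optional, Tuple, Union

PerTask = Dict[str, Dict[str, float]]
Overwrite = Union[float, PerTask]


def split_overwrite(value: Optional[Overwrite]) -> Tuple[Optional[float], Optional[PerTask]]:
    """Return (default_precision, per_task_overwrite); hypothetical helper."""
    if value is None:
        # Nothing to overwrite.
        return None, None
    if isinstance(value, dict):
        # Dictionary form: labeling task id -> heuristic id -> precision.
        return None, value
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise TypeError("expected a float or a dictionary")
    # Numeric form: one precision for all heuristics.
    return float(value), None
```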

GraphQL Example 1

mutation {
  initiateWeakSupervisionByProjectId(
    projectId: "328d7d12-b0d3-4874-9e73-4a53a28d0536",
    overwriteDefaultPrecision: 0.5,
    overwriteWeakSupervision: "{\"2dfe75be-4c9f-4bbc-9a5e-2878ec160fe1\":{\"2c3fecf5-7b02-417a-a1eb-86c1d368d7a9\": 1, \"381c7d6a-6deb-47fe-925a-b5e068d81c95\": 0, \"e0051dcd-b878-4c00-a47f-21c72d33aff3\": 0} }"
  ) {
    ok
  }
}

In this case overwriteWeakSupervision takes precedence over the overwriteDefaultPrecision argument, as it is more specific.
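
The escaped string passed as overwriteWeakSupervision in Example 1 does not have to be written by hand. A minimal sketch, assuming the argument is a JSON-encoded string as the example suggests: the first `json.dumps` serializes the dictionary, the second one quotes and escapes that string so it can be pasted into a GraphQL document. The variable names are illustrative. The ids are the ones from Example 1.

```python
import json

# labeling task id -> heuristic id -> precision (ids taken from Example 1)
overwrite = {
    "2dfe75be-4c9f-4bbc-9a5e-2878ec160fe1": {
        "2c3fecf5-7b02-417a-a1eb-86c1d368d7a9": 1,
        "381c7d6a-6deb-47fe-925a-b5e068d81c95": 0,
        "e0051dcd-b878-4c00-a47f-21c72d33aff3": 0,
    }
}

# JSON text of the dictionary; this is the value the argument carries.
argument_value = json.dumps(overwrite)

# Quoted and escaped form for embedding in a GraphQL document as a string literal.
graphql_literal = json.dumps(argument_value)
print(graphql_literal)
```

When the request is sent programmatically, passing `argument_value` as a GraphQL variable avoids the second escaping step.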

GraphQL Example 2

mutation {
  runHeuristicThenTriggerWeakSupervision(
    projectId: "328d7d12-b0d3-4874-9e73-4a53a28d0536",
    informationSourceId: "2c3fecf5-7b02-417a-a1eb-86c1d368d7a9",
    labelingTaskId: "2dfe75be-4c9f-4bbc-9a5e-2878ec160fe1",
    overwriteDefaultPrecision: 0
  ) {
    ok
  }
}

JWittmeyer · 2023-10-04T12:55:47Z

From the description I read that the gates runtime has a new (or changed) startup endpoint. Does this mean that the overwrite values are set on startup?

Two questions:

Does this mean that the values can't be decided at runtime? (e.g. meta note on request)
Don't we need an endpoint to change the values as well, since e.g. we start with nothing to get initial results for cognition and expect some manual labels later?

discussed

JWittmeyer · 2023-10-04T13:00:13Z

I might have misunderstood something, but shouldn't the default be used if no manual labels exist?
What I did:

loaded Clickbait initial
created an LF
ran the LF
ran Weak Supervision

discussed
resolved

JWittmeyer · 2023-10-04T13:02:33Z

In the labeling function view, Run + WS seems to have a different condition: it shows a notification that it's not possible, but there is no error in the logs.

resolved

controller/integration.py

JWittmeyer · 2023-10-10T07:28:11Z

Without any valid heuristic, Run + Weak Supervision shows a misleading error text.

Maybe we change the error text, or ignore it since Run + WS is a bit weird anyway.

discussed
ignored
resolved

-> weak supervision will not be run if the heuristic fails

resolved

implements overwriting and default value of ws stats

981cb6a

FelixKirschKern self-assigned this Oct 2, 2023

This was referenced Oct 2, 2023

Weak Supervision without manual labels code-kern-ai/refinery-submodule-model#57

Merged

Weak Supervision without manual labels code-kern-ai/refinery-gateway#160

Merged

adds extraction task to weak supervision overwrite

d0d71f9

FelixKirschKern marked this pull request as ready for review October 2, 2023 15:14

FelixKirschKern mentioned this pull request Oct 2, 2023

Weak Supervision without manual labels code-kern-ai/weak-nlp#7

Merged

FelixKirschKern added 2 commits October 9, 2023 09:37

updates submodule

cc3d1f2

pr comments, changed attribute name for clarity

45b1452

FelixKirschKern requested a review from JWittmeyer October 9, 2023 14:37

JWittmeyer reviewed Oct 10, 2023

controller/integration.py (outdated)

JWittmeyer reviewed Oct 10, 2023

controller/integration.py (outdated)

JWittmeyer reviewed Oct 10, 2023

controller/integration.py (outdated)

JWittmeyer approved these changes Oct 10, 2023

FelixKirschKern added 3 commits October 10, 2023 11:31

pr comments, typing

0c75c20

updates weak_nlp version

48a7e7c

updated submodule

7c11df5

FelixKirschKern merged commit c6cc86c into dev Oct 10, 2023

FelixKirschKern deleted the ws-no-manual branch October 10, 2023 10:44
