@FelixKirschKern FelixKirschKern commented Oct 2, 2023

This is the main PR, related PRs are:

This PR makes it possible to run weak supervision without setting any manual labels for the labeling task. Instead of the precision values calculated for each heuristic, custom values can be applied.
In more detail:

  1. If no manual labels are set, all heuristics get the same precision, which defaults to 0.8.
  2. This default precision can be overwritten with another value. If it is, that value is also applied when manual labels are set.
  3. A precision can be set for each heuristic individually. This also overwrites the calculated precision when manual labels are set.
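
The three rules above can be sketched as a small lookup. This is only an illustration, not the code in weak-nlp or the weak-supervisor: the function name `resolve_precision` and its parameters are hypothetical, and the fallback order for a heuristic that has no individual value (overwritten default first, then calculated precision, then 0.8) is an assumption read from the list.

```python
from typing import Optional

DEFAULT_PRECISION = 0.8  # default value named in the PR description


def resolve_precision(
    calculated: Optional[float],
    default_override: Optional[float] = None,
    heuristic_override: Optional[float] = None,
) -> float:
    """Pick the precision for one heuristic (hypothetical helper).

    calculated is None when no manual labels exist, so no precision
    could be computed for the heuristic.
    """
    # Rule 3: a per-heuristic value wins over everything else.
    if heuristic_override is not None:
        return heuristic_override
    # Rule 2: an overwritten default also replaces a calculated precision.
    if default_override is not None:
        return default_override
    # Manual labels exist and nothing was overwritten.
    if calculated is not None:
        return calculated
    # Rule 1: no manual labels, no overwrite.
    return DEFAULT_PRECISION


print(resolve_precision(None))        # 0.8
print(resolve_precision(0.93, 0.5))   # 0.5
```

Comparing against `None` rather than relying on truthiness matters here, because 0 is a legitimate precision value.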

For testing, the weak-supervisor has to be excluded from the regular start, started separately, and set up to work with a local installation of the weak-nlp package (copy weak-nlp into weak-supervisor and uncomment the line for the local installation in dev.Dockerfile).
The other services can be started from dev-setup with: bash start -b ws-no-manual -e refinery-weak-supervisor

Two GraphQL endpoints are changed:

  • initiateWeakSupervisionByProjectId
    It now has two additional optional arguments: overwriteDefaultPrecision and overwriteWeakSupervision. overwriteDefaultPrecision expects a float value and applies the same precision to all heuristics; it overwrites the calculated precision if one exists.
    overwriteWeakSupervision can be used to set a distinct precision for each heuristic. It expects a dictionary with the following structure:
{
  "<labeling_task_id>": {
    "<heuristic_id>": "<precision>",
    "<heuristic_id>": "<precision>",
    ...
  },
  ...
}

If both arguments are given, only overwriteWeakSupervision is applied.

  • runHeuristicThenTriggerWeakSupervision
    The same optional arguments are added. overwriteDefaultPrecision works the same as above.
    Here overwriteWeakSupervision expects only the inner dictionary from above, as the endpoint is specific to one labeling task:
{
  "<heuristic_id>": "<precision>",
  "<heuristic_id>": "<precision>",
  ...
}
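
As a rough sketch, the two dictionary shapes can be built and serialized in Python before being handed to the endpoints. GraphQL Example 1 passes the argument as a JSON-encoded string, so the helpers return strings. Both helper names are hypothetical, and the check that every precision lies between 0 and 1 is an assumption based on what a precision is, not something this PR is known to enforce.

```python
import json
from typing import Dict

# task id -> heuristic id -> precision
TaskPrecisions = Dict[str, Dict[str, float]]


def build_overwrite_weak_supervision(precisions: TaskPrecisions) -> str:
    """Serialize the nested structure for initiateWeakSupervisionByProjectId."""
    for task_id, heuristics in precisions.items():
        for heuristic_id, precision in heuristics.items():
            # Assumed sanity check: a precision is a value in [0, 1].
            if not 0.0 <= precision <= 1.0:
                raise ValueError(
                    f"precision {precision} for heuristic {heuristic_id} "
                    f"in task {task_id} is outside [0, 1]"
                )
    return json.dumps(precisions)


def build_task_overwrite(precisions: TaskPrecisions, labeling_task_id: str) -> str:
    """Serialize only the inner dictionary of one labeling task,
    the shape runHeuristicThenTriggerWeakSupervision expects."""
    return json.dumps(precisions.get(labeling_task_id, {}))


precisions = {"task-1": {"heuristic-a": 1.0, "heuristic-b": 0.25}}
print(build_overwrite_weak_supervision(precisions))
print(build_task_overwrite(precisions, "task-1"))
```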

The endpoint for starting a gates runtime also changed. Here only one optional argument was added.

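from typing import Any, Dict, List, Optional
from pydantic import BaseModel

class ApiConfigRequest(BaseModel):
    heuristics: List[str]
    similarity_search: Optional[List[str]] = None
    overwrite_weak_supervision: Optional[Dict[str, Any]] = None  # new optional argument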

overwrite_weak_supervision combines both arguments of the GraphQL endpoints: either a float or a dictionary can be provided. The dictionary must have the same structure as for the initiateWeakSupervisionByProjectId endpoint, as a gates runtime could handle multiple labeling tasks.
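
One way a consumer of this field could tell the two accepted shapes apart is sketched below. It is not taken from the gates code: the function `split_overwrite` and its return convention are hypothetical, and it assumes precisions may arrive either as numbers or as numeric strings.

```python
from typing import Any, Dict, Optional, Tuple

TaskPrecisions = Dict[str, Dict[str, float]]


def split_overwrite(value: Any) -> Tuple[Optional[float], TaskPrecisions]:
    """Split overwrite_weak_supervision into (default precision, per-task dict).

    Hypothetical helper: a bare number is treated as the default precision
    for all heuristics, a dictionary as per-heuristic values keyed by
    labeling task id.
    """
    if value is None:
        return None, {}
    if isinstance(value, bool):
        # bool is a subclass of int in Python; reject it explicitly.
        raise TypeError("overwrite_weak_supervision must be a number or a dict")
    if isinstance(value, (int, float)):
        return float(value), {}
    if isinstance(value, dict):
        per_task = {
            task_id: {h_id: float(p) for h_id, p in heuristics.items()}
            for task_id, heuristics in value.items()
        }
        return None, per_task
    raise TypeError("overwrite_weak_supervision must be a number or a dict")


print(split_overwrite(0.5))
print(split_overwrite({"task-1": {"heuristic-a": "0.7"}}))
```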

GraphQL Example 1
mutation {
  initiateWeakSupervisionByProjectId(
    projectId: "328d7d12-b0d3-4874-9e73-4a53a28d0536",
    overwriteDefaultPrecision: 0.5,
    overwriteWeakSupervision: "{\"2dfe75be-4c9f-4bbc-9a5e-2878ec160fe1\":{\"2c3fecf5-7b02-417a-a1eb-86c1d368d7a9\": 1, \"381c7d6a-6deb-47fe-925a-b5e068d81c95\": 0, \"e0051dcd-b878-4c00-a47f-21c72d33aff3\": 0} }"
  ) {
    ok
  }
}

In this case overwriteWeakSupervision takes precedence over the overwriteDefaultPrecision argument, as it is more specific.
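
Escaping the nested dictionary by hand is error-prone, so the request text for Example 1 could be generated instead. The sketch below is only one possible approach: the helper name `build_initiate_mutation` is hypothetical, wrapping the call in `mutation { ... }` is an assumption about the operation type, and it relies on JSON string escaping being compatible with GraphQL string literals for plain ids and numbers like these.

```python
import json
from typing import Dict, Optional


def build_initiate_mutation(
    project_id: str,
    default_precision: Optional[float] = None,
    overwrite: Optional[Dict[str, Dict[str, float]]] = None,
) -> str:
    """Build the GraphQL text for initiateWeakSupervisionByProjectId."""
    args = [f"projectId: {json.dumps(project_id)}"]
    if default_precision is not None:
        args.append(f"overwriteDefaultPrecision: {default_precision}")
    if overwrite is not None:
        # First dumps: dict -> JSON text. Second dumps: JSON text -> quoted,
        # escaped string literal that can be embedded in the query.
        args.append(f"overwriteWeakSupervision: {json.dumps(json.dumps(overwrite))}")
    joined = ",\n    ".join(args)
    return (
        "mutation {\n"
        "  initiateWeakSupervisionByProjectId(\n"
        f"    {joined}\n"
        "  ) {\n"
        "    ok\n"
        "  }\n"
        "}"
    )


print(build_initiate_mutation(
    "328d7d12-b0d3-4874-9e73-4a53a28d0536",
    default_precision=0.5,
    overwrite={"2dfe75be-4c9f-4bbc-9a5e-2878ec160fe1": {"2c3fecf5-7b02-417a-a1eb-86c1d368d7a9": 1}},
))
```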

GraphQL Example 2
mutation {
  runHeuristicThenTriggerWeakSupervision(
    projectId: "328d7d12-b0d3-4874-9e73-4a53a28d0536",
    informationSourceId: "2c3fecf5-7b02-417a-a1eb-86c1d368d7a9",
    labelingTaskId: "2dfe75be-4c9f-4bbc-9a5e-2878ec160fe1",
    overwriteDefaultPrecision: 0
  ) {
    ok
  }
}


JWittmeyer commented Oct 4, 2023

From the description I read that the gates runtime has a new (or changed) startup endpoint. Does this mean that the overwrite values are set on startup?

Two questions:

  1. Does this mean that the values can't be decided at runtime (e.g. via a meta note on the request)?
  2. Don't we need an endpoint to change the values as well? For example, we start with nothing to get initial results for cognition and expect some manual labels later.
  • discussed


JWittmeyer commented Oct 4, 2023

I might have misunderstood something, but shouldn't the default be used if no manual labels exist?
What I did:

  1. Loaded Clickbait initial
  2. Created an LF
  3. Ran the LF
  4. Ran Weak Supervision
    image
  • discussed
  • resolved


JWittmeyer commented Oct 4, 2023

In the labeling function view, Run + WS seems to have a different condition: it shows a notification that it's not possible, but there is no error in the logs.

image

  • resolved


JWittmeyer commented Oct 10, 2023

image

Without any valid heuristic, Run + Weak Supervision shows a misleading error text.

Maybe we change the error text or ignore it, since Run + WS is a bit weird anyway.

  • discussed
  • ignored
  • resolved

-> weak supervision will not be run if the heuristic fails

  • resolved

@FelixKirschKern FelixKirschKern merged commit c6cc86c into dev Oct 10, 2023
@FelixKirschKern FelixKirschKern deleted the ws-no-manual branch October 10, 2023 10:44