v0.2.0

@urchade urchade released this 26 May 12:37
· 106 commits to main since this release
a515006

What's Changed

New architecture: Token-level GLiNER

  1. Computes scores for the start, end, and inside positions of potential entity spans:

    scores_start, scores_end, scores_inside = self.compute_score_eval(x)
  2. Converts these scores into probabilities with a sigmoid and collects the start and end positions whose probability exceeds a specified threshold:

    start_probs = torch.sigmoid(scores_start)
    end_probs = torch.sigmoid(scores_end)
    inside_probs = torch.sigmoid(scores_inside)
    
    start_indices = [torch.where(start_probs[i] > threshold) for i in range(len(x["tokens"]))]
    end_indices = [torch.where(end_probs[i] > threshold) for i in range(len(x["tokens"]))]
  3. Matches start and end indices to create valid spans, requires the start and end to share the same class label, and filters out low-confidence spans based on the inside scores:

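    valid_spans = []
    for i, (start, end, inside) in enumerate(zip(start_indices, end_indices, inside_probs)):
        spans = []
        # start and end are (token_positions, class_ids) tuples returned by torch.where
        for st, cls_st in zip(*start):
            for ed, cls_ed in zip(*end):
                # a span needs end >= start and the same class at both boundaries
                if ed >= st and cls_st == cls_ed:
                    ins_confidence = inside[st:ed + 1, cls_st]
                    # drop the span if any token inside it falls below the threshold
                    if (ins_confidence < threshold).any():
                        continue
                    # the span score is the mean inside probability over its tokens
                    spans.append((st, ed, x["id_to_classes"][cls_st + 1], ins_confidence.mean().item()))
        valid_spans.append(spans)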
  4. Uses a greedy search algorithm to finalize the list of entity spans (as in the original GLiNER):

    final_spans = [greedy_search(spans, flat_ner, multi_label=multi_label) for spans in valid_spans]
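To see the four steps end to end on a toy input, here is a minimal plain-Python sketch for a single sentence. It uses nested lists instead of tensors so it runs without PyTorch. The helper names (`decode_spans`, `pick_non_overlapping`, `grid`), the logit values, and the two-class `id_to_classes` mapping are all assumptions made for illustration. In particular, `pick_non_overlapping` is a simplified stand-in for the flat-NER case, not the library's `greedy_search`.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def decode_spans(scores_start, scores_end, scores_inside, id_to_classes, threshold=0.5):
    """Decode one sentence; each scores_* is a [token][class] grid of raw logits."""
    def to_probs(rows):
        return [[sigmoid(v) for v in row] for row in rows]

    start_p, end_p, inside_p = (to_probs(s) for s in (scores_start, scores_end, scores_inside))

    # Step 2: keep (token, class) pairs whose probability exceeds the threshold.
    starts = [(t, c) for t, row in enumerate(start_p) for c, p in enumerate(row) if p > threshold]
    ends = [(t, c) for t, row in enumerate(end_p) for c, p in enumerate(row) if p > threshold]

    # Step 3: pair starts with ends of the same class, then check the inside scores.
    spans = []
    for st, cls_st in starts:
        for ed, cls_ed in ends:
            if ed < st or cls_st != cls_ed:
                continue
            inside = [inside_p[t][cls_st] for t in range(st, ed + 1)]
            if any(p < threshold for p in inside):
                continue
            # Assumption: id_to_classes is keyed from 1, mirroring `cls_st + 1` above.
            spans.append((st, ed, id_to_classes[cls_st + 1], sum(inside) / len(inside)))
    return spans

def pick_non_overlapping(spans):
    """Hypothetical simplification of step 4: best-scoring spans first, no overlaps."""
    kept = []
    for span in sorted(spans, key=lambda s: s[3], reverse=True):
        if all(span[1] < k[0] or span[0] > k[1] for k in kept):
            kept.append(span)
    return sorted(kept, key=lambda s: s[0])

def grid(hot, n_tokens=5, n_classes=2, low=-3.0):
    """Build a logit grid filled with `low`, overriding the (token, class) cells in `hot`."""
    rows = [[low] * n_classes for _ in range(n_tokens)]
    for (t, c), value in hot.items():
        rows[t][c] = value
    return rows

id_to_classes = {1: "person", 2: "location"}
scores_start = grid({(0, 0): 3.0, (3, 1): 3.0})
scores_end = grid({(1, 0): 3.0, (2, 0): 3.0, (3, 1): 3.0})
scores_inside = grid({(0, 0): 3.0, (1, 0): 3.0, (2, 0): 1.0, (3, 1): 3.0})

candidates = decode_spans(scores_start, scores_end, scores_inside, id_to_classes)
final = pick_non_overlapping(candidates)
for st, ed, label, score in final:
    print(st, ed, label, round(score, 3))
```

In this toy input the "person" start at token 0 can pair with an end at token 1 or token 2, so step 3 produces two overlapping candidates. The shorter one has the higher mean inside probability, and the simplified greedy pass keeps it and drops the longer one.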

Full Changelog: v0.1.14...v0.2.0