v0.2.0
What's Changed
New architecture: Token-level GLiNER
- Computes scores for the start, end, and inside positions of potential entity spans:
scores_start, scores_end, scores_inside = self.compute_score_eval(x)
- Converts these scores into probabilities and determines the start and end positions where the probabilities exceed a specified threshold:
start_probs = torch.sigmoid(scores_start)
end_probs = torch.sigmoid(scores_end)
inside_probs = torch.sigmoid(scores_inside)
start_indices = [torch.where(start_probs[i] > threshold) for i in range(len(x["tokens"]))]
end_indices = [torch.where(end_probs[i] > threshold) for i in range(len(x["tokens"]))]
- Matches start and end indices to create valid spans, requiring that both ends carry the same class label and filtering out low-confidence spans based on the inside scores:
valid_spans = []
for i, (start, end, inside) in enumerate(zip(start_indices, end_indices, inside_probs)):
    spans = []
    for st, cls_st in zip(*start):
        for ed, cls_ed in zip(*end):
            if ed >= st and cls_st == cls_ed:
                ins_confidence = inside[st:ed + 1, cls_st]
                if (ins_confidence < threshold).any():
                    continue
                spans.append((st, ed, x["id_to_classes"][cls_st + 1], ins_confidence.mean().item()))
    valid_spans.append(spans)
- Uses a greedy search algorithm to finalize the list of entity spans (as in the original GLiNER):
final_spans = [greedy_search(spans, flat_ner, multi_label=multi_label) for spans in valid_spans]
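The decoding steps above can be tried on toy data without a trained model. The following is a minimal pure-Python sketch for a single sentence, not the library's implementation: `decode_spans`, `greedy_select`, and the hand-written score tables are hypothetical names and values introduced for illustration, and `greedy_select` is a simplified stand-in for `greedy_search` that assumes flat, single-label output (it ignores the `flat_ner` and `multi_label` options).

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def decode_spans(scores_start, scores_end, scores_inside, id_to_classes, threshold=0.5):
    """Decode one sentence; every score table is indexed [token][class]."""
    n_tokens, n_classes = len(scores_start), len(scores_start[0])
    # (token, class) pairs whose start / end probability exceeds the threshold
    starts = [(t, c) for t in range(n_tokens) for c in range(n_classes)
              if sigmoid(scores_start[t][c]) > threshold]
    ends = [(t, c) for t in range(n_tokens) for c in range(n_classes)
            if sigmoid(scores_end[t][c]) > threshold]
    spans = []
    for st, cls_st in starts:
        for ed, cls_ed in ends:
            if ed >= st and cls_st == cls_ed:
                inside = [sigmoid(scores_inside[t][cls_st]) for t in range(st, ed + 1)]
                # Drop the span if any token inside it is below the threshold
                if any(p < threshold for p in inside):
                    continue
                # Class ids are 1-based here, mirroring id_to_classes[cls_st + 1] above
                spans.append((st, ed, id_to_classes[cls_st + 1], sum(inside) / len(inside)))
    return spans

def greedy_select(spans):
    """Simplified stand-in for greedy_search: best-scoring non-overlapping spans."""
    kept = []
    for span in sorted(spans, key=lambda s: -s[3]):
        if all(span[1] < k[0] or span[0] > k[1] for k in kept):
            kept.append(span)
    return sorted(kept, key=lambda s: s[0])

# Toy logits for the tokens ["Barack", "Obama", "visited", "Paris"] and two classes
HI, LO = 4.0, -4.0
id_to_classes = {1: "person", 2: "location"}
scores_start = [[HI, LO], [HI, LO], [LO, LO], [LO, HI]]
scores_end = [[LO, LO], [HI, LO], [LO, LO], [LO, HI]]
scores_inside = [[HI, LO], [1.0, LO], [LO, LO], [LO, HI]]

candidates = decode_spans(scores_start, scores_end, scores_inside, id_to_classes)
final = greedy_select(candidates)
print([(s, e, label) for s, e, label, _ in final])
```

With these toy scores the matching step yields three candidates, two of which overlap ("Barack Obama" and "Obama"); the greedy step keeps the higher-scoring one, so the printed result is `[(0, 1, 'person'), (3, 3, 'location')]`.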
Full Changelog: v0.1.14...v0.2.0