debug data Spancat Table Improvements #11504

Merged: 6 commits, Sep 28, 2022

Conversation

pmbaumgartner
Contributor

This PR makes a few small tweaks to the debug data output for spancat data:

  • Corrects the maximum column width so that longer span labels fit (the width was previously capped at 30, the wasabi default)
  • Formats numbers with exactly 2 decimal places (e.g. 1.60 rather than 1.6)
  • Right-aligns the numbers in the table for easier comparison
  • Adds the count (N) for each span type for context
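
The formatting changes listed above can be illustrated with a minimal standard-library sketch. This is not the code from this PR and does not use wasabi; `format_number` and `render_table` are hypothetical helper names chosen for illustration only.

```python
def format_number(value, ndigits=2):
    """Format a number with a fixed count of decimal places (1.6 -> '1.60')."""
    return f"{value:.{ndigits}f}"


def render_table(header, rows, aligns, spacing=3):
    """Render rows as aligned text lines.

    Column widths come from the longest cell, so long labels are never cut off.
    aligns holds "l" (left) or "r" (right) for each column.
    """
    cells = [[str(c) for c in header]] + [[str(c) for c in row] for row in rows]
    widths = [max(len(row[i]) for row in cells) for i in range(len(header))]
    sep = " " * spacing

    def render_row(row):
        padded = [
            cell.rjust(width) if align == "r" else cell.ljust(width)
            for cell, width, align in zip(row, widths, aligns)
        ]
        return sep.join(padded)

    lines = [render_row(cells[0]), sep.join("-" * w for w in widths)]
    lines.extend(render_row(row) for row in cells[1:])
    return lines


# Two rows taken from the table in this PR description.
rows = [
    ("Pathological_formation", format_number(1.42), format_number(4.66), format_number(1.6), 174),
    ("Developing_anatomical_structure", format_number(1.15), format_number(7.35), format_number(3.22), 15),
]
lines = render_table(
    ("Span Type", "Length", "SD", "BD", "N"),
    rows,
    aligns=("l", "r", "r", "r", "r"),
)
print("\n".join(lines))
```

Sizing each column to its longest cell is what keeps the 31-character label from pushing its row out of alignment, and right-aligning the numeric columns lines up the decimal points once every value has two decimals.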

Description

Updated:

Span Type                         Length     SD     BD     N
-------------------------------   ------   ----   ----   ---
Cell                                1.41   4.03   1.57   339
Organism_substance                  1.18   5.46   2.11   144
Pathological_formation              1.42   4.66   1.60   174
Multi-tissue_structure              1.44   4.25   1.65   311
Organism_subdivision                1.06   5.98   2.83    92
Organ                               1.08   5.37   2.03   168
Cellular_component                  1.28   5.34   2.31    94
Anatomical_system                   1.47   6.33   3.76    32
Tissue                              1.66   5.22   1.82    88
Developing_anatomical_structure     1.15   7.35   3.22    15
Immaterial_anatomical_entity        1.36   7.25   3.73    22
-------------------------------   ------   ----   ----   ---
Wgt. Average                        1.34   4.85   1.93     -
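
The PR text does not spell out how the Wgt. Average row is computed. As a sketch, assuming each row is weighted by its span count N, the Length column above reproduces the reported 1.34; `weighted_average` is a hypothetical helper, not a spaCy function.

```python
def weighted_average(values, weights):
    """Return the average of values, each weighted by the matching weight."""
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total


# Length and N columns from the updated table, in row order.
lengths = [1.41, 1.18, 1.42, 1.44, 1.06, 1.08, 1.28, 1.47, 1.66, 1.15, 1.36]
counts = [339, 144, 174, 311, 92, 168, 94, 32, 88, 15, 22]

avg_length = weighted_average(lengths, counts)
print(f"{avg_length:.2f}")  # 1.34
```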

Previous:

Span Type                        Length   SD     BD  
------------------------------   ------   ----   ----
Cell                             1.41     4.03   1.57
Organism_substance               1.18     5.46   2.11
Pathological_formation           1.42     4.66   1.6 
Multi-tissue_structure           1.44     4.25   1.65
Organism_subdivision             1.06     5.98   2.83
Organ                            1.08     5.37   2.03
Cellular_component               1.28     5.34   2.31
Anatomical_system                1.47     6.33   3.76
Tissue                           1.66     5.22   1.82
Developing_anatomical_structure   1.15     7.35   3.22
Immaterial_anatomical_entity     1.36     7.25   3.73
------------------------------   ------   ----   ----
Wgt. Average                     1.34     4.85   1.93

Types of change

enhancement

Checklist

  • I confirm that I have the right to submit this contribution under the project's MIT license.
  • I ran the tests, and all new and existing tests passed.
  • My changes don't require a change to the documentation, or if they do, I've added all required information.

pmbaumgartner added the enhancement (Feature requests and improvements) and feat / spancat (Feature: Span Categorizer) labels on Sep 14, 2022
Contributor

kadarakos left a comment

I have only one minor question, thanks for the PR!

spacy/cli/debug_data.py (outdated, resolved)
spacy/cli/_util.py (outdated, resolved)
svlandeg merged commit e794d4a into explosion:master on Sep 28, 2022