pretty print models #314

Open
wants to merge 10 commits into
base: main

sneakers-the-rat
Contributor

@sneakers-the-rat sneakers-the-rat commented Mar 21, 2024

hi i'm STRING REPRESENTATION OF MODEL DEFINITION

you might remember me from other print statements such as

>>> sv = SchemaView('tests/test_loaders_dumpers/input/personinfo.yaml')
>>> cls = sv.get_class('Person')
>>> print(cls)
ClassDefinition(name='Person', id_prefixes=[], id_prefixes_are_closed=None, definition_uri=None, local_names={}, conforms_to=None, implements=[], instantiates=[], extensions={}, annotations={}, description='A person (alive, dead, undead, or fictional).', alt_descriptions={}, title=None, deprecated=None, todos=[], notes=[], comments=[], examples=[], in_subset=[], from_schema='https://w3id.org/linkml/examples/personinfo', imported_from=None, source=None, in_language=None, see_also=[], deprecated_element_has_exact_replacement=None, deprecated_element_has_possible_replacement=None, aliases=[], structured_aliases={}, mappings=[], exact_mappings=[], close_mappings=[], related_mappings=[], narrow_mappings=[], broad_mappings=[], created_by=None, contributors=[], created_on=None, last_updated_on=None, modified_by=None, status=None, rank=None, categories=[], keywords=[], is_a='NamedThing', abstract=None, mixin=None, mixins=['HasAliases'], apply_to=[], values_from=[], string_serialization=None, slots=['primary_email', 'birth_date', 'age_in_years', 'gender', 'current_address', 'has_employment_history', 'has_familial_relationships', 'has_interpersonal_relationships', 'has_medical_history'], slot_usage={'primary_email': SlotDefinition(name='primary_email', id_prefixes=[], id_prefixes_are_closed=None, definition_uri=None, local_names={}, conforms_to=None, implements=[], instantiates=[], extensions={}, annotations={}, description=None, alt_descriptions={}, title=None, deprecated=None, todos=[], notes=[], comments=[], examples=[], in_subset=[], from_schema=None, imported_from=None, source=None, in_language=None, see_also=[], deprecated_element_has_exact_replacement=None, deprecated_element_has_possible_replacement=None, aliases=[], structured_aliases={}, mappings=[], exact_mappings=[], close_mappings=[], related_mappings=[], narrow_mappings=[], broad_mappings=[], created_by=None, contributors=[], created_on=None, last_updated_on=None, modified_by=None, status=None, rank=None, categories=[], keywords=[], is_a=None, abstract=None, mixin=None, mixins=[], apply_to=[], values_from=[], string_serialization=None, singular_name=None, domain=None, slot_uri=None, multivalued=None, array=None, inherited=None, readonly=None, ifabsent=None, list_elements_unique=None, list_elements_ordered=None, shared=None, key=None, identifier=None, designates_type=None, alias=None, owner=None, domain_of=[], subproperty_of=None, symmetric=None, reflexive=None, locally_reflexive=None, irreflexive=None, asymmetric=None, transitive=None, inverse=None, is_class_field=None, transitive_form_of=None, reflexive_transitive_form_of=None, role=None, is_usage_slot=None, usage_slot_name=None, relational_role=None, slot_group=None, is_grouping_slot=None, path_rule=None, disjoint_with=[], children_are_mutually_disjoint=None, union_of=[], range=None, range_expression=None, enum_range=None, required=None, recommended=None, inlined=None, inlined_as_list=None, minimum_value=None, maximum_value=None, pattern='^\\S+@[\\S+\\.]+\\S+', structured_pattern=None, unit=None, implicit_prefix=None, value_presence=None, equals_string=None, equals_string_in=[], equals_number=None, equals_expression=None, exact_cardinality=None, minimum_cardinality=None, maximum_cardinality=None, has_member=None, all_members=None, none_of=[], exactly_one_of=[], any_of=[], all_of=[])}, attributes={}, class_uri='schema:Person', subclass_of=None, union_of=[], defining_slots=[], tree_root=None, unique_keys={}, rules=[], classification_rules=[], slot_names_unique=None, represents_relationship=None, disjoint_with=[], children_are_mutually_disjoint=None, any_of=[], exactly_one_of=[], none_of=[], all_of=[], slot_conditions={})

well today i'm here to talk to you about

edited, updated format:

>>> sv = SchemaView('tests/test_loaders_dumpers/input/personinfo.yaml')
>>> cls = sv.get_class('Person')
>>> print(cls)
ClassDefinition({
  'name': 'Person',
  'description': 'A person (alive, dead, undead, or fictional).',
  'from_schema': 'https://w3id.org/linkml/examples/personinfo',
  'is_a': 'NamedThing',
  'mixins': ['HasAliases'],
  'slots': ['primary_email', 'birth_date', 'age_in_years', 'gender', 'current_address',
    'has_employment_history', 'has_familial_relationships',
    'has_interpersonal_relationships', 'has_medical_history'],
  'slot_usage': {'primary_email': SlotDefinition({'name': 'primary_email', 'pattern': '^\\S+@[\\S+\\.]+\\S+'})},
  'class_uri': 'schema:Person'
})


codecov bot commented Mar 21, 2024

Codecov Report

Attention: Patch coverage is 98.33333% with 1 lines in your changes are missing coverage. Please review.

Project coverage is 62.80%. Comparing base (ed36311) to head (ef2e4d0).
Report is 1 commits behind head on main.

| Files | Patch % | Lines |
|---|---|---|
| linkml_runtime/utils/schemaview.py | 0.00% | 1 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #314      +/-   ##
==========================================
+ Coverage   62.70%   62.80%   +0.09%     
==========================================
  Files          63       63              
  Lines        8580     8603      +23     
  Branches     2444     2447       +3     
==========================================
+ Hits         5380     5403      +23     
  Misses       2583     2583              
  Partials      617      617              


@cmungall
Member

some kind of weird windows vs ubuntu nondeterminism here...

I like this! As you might expect I'd like to be careful in rolling this out.

@sneakers-the-rat
Contributor Author

> As you might expect I'd like to be careful in rolling this out.

always <3

> some kind of weird windows vs ubuntu nondeterminism here...

I think windows just always passes (it's doing the same thing over here #312 (comment) )

I probably need to update some string matching tests, will do that

@sneakers-the-rat
Contributor Author

OK so i went to go fix the failing tests, and once i realized what they were doing, rather than updating the expected string representation i just swapped the string equality tests for instance equality checks, which are more direct and separate the string representation from the content of the test.
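A minimal sketch of why an instance equality check is sturdier than a string equality check. The `Slot` class and its `__str__` are hypothetical stand-ins, not the classes or tests touched in this PR.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Slot:
    # Hypothetical stand-in for a generated model class.
    name: str
    aliases: List[str] = field(default_factory=list)

    def __str__(self) -> str:
        # A pretty-printer that hides empty fields; its format may change.
        shown = {k: v for k, v in vars(self).items() if v}
        return f"{type(self).__name__}({shown})"


a = Slot(name="age")
b = Slot(name="age")

# Brittle: the result depends on how the object happens to be rendered.
string_check = str(a) == "Slot(name='age', aliases=[])"

# Direct: dataclasses generate __eq__, which compares field values.
instance_check = a == b

print(string_check, instance_check)
```

The string check breaks as soon as the rendering changes, even though the two objects still hold the same data.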

we have a problem tho, because this doesn't quite fix the problem, which is that whenever there is an error or python shows you the object for some reason, you get a huge blob of empty params and you can't see the salient things for that model. the __str__ override is nice for when you're explicitly print()ing or str()ing an object, but the rest of the time python uses __repr__. unfortunately dataclasses define their own __repr__ method by default, so the YAMLRoot override doesn't propagate to the children.

SO. this PR is still imo a step in the right direction, and once/if this is merged then I can go and change the pythongen to add an option for whether or not to set @dataclass(repr=False). This is really mostly an issue for the metamodel classes which have a ton of common attributes, so i'll set that default True (exclude repr) for when generating metamodel classes and False otherwise.

sound good???
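A small sketch of the `__repr__` behavior described above, using only the standard library. `Root`, `Verbose`, and `Compact` are hypothetical names; `Root` merely stands in for a shared base class such as YAMLRoot.

```python
from dataclasses import dataclass
from typing import Optional


class Root:
    # Hypothetical stand-in for a shared base class such as YAMLRoot.
    def __repr__(self) -> str:
        shown = {k: v for k, v in vars(self).items() if v is not None}
        return f"{type(self).__name__}({shown})"


@dataclass
class Verbose(Root):
    # Default repr=True: the decorator writes a new __repr__ onto this class,
    # which shadows the one inherited from Root.
    name: str
    title: Optional[str] = None


@dataclass(repr=False)
class Compact(Root):
    # repr=False: no __repr__ is generated, so Root.__repr__ is inherited.
    name: str
    title: Optional[str] = None


verbose_repr = repr(Verbose(name="Person"))
compact_repr = repr(Compact(name="Person"))
print(verbose_repr)  # lists every field, including title=None
print(compact_repr)  # only the fields that are set
```

With the default decorator the child prints every field, empty or not; with `repr=False` the base class override is the one Python finds.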

I also updated the pretty print methods to preserve the name of the class, rather than just printing as a dict, so now we have

ClassDefinition({ 'class_uri': 'schema:Person',
  'description': 'A person (alive, dead, undead, or fictional).',
  'from_schema': 'https://w3id.org/linkml/examples/personinfo',
  'is_a': 'NamedThing',
  'mixins': ['HasAliases'],
  'name': 'Person',
  'slot_usage': { 'primary_email': { 'name': 'primary_email',
                                     'pattern': '^\\S+@[\\S+\\.]+\\S+'}},
  'slots': [ 'primary_email', 'birth_date', 'age_in_years', 'gender',
             'current_address', 'has_employment_history',
             'has_familial_relationships', 'has_interpersonal_relationships',
             'has_medical_history']})

and once we implement that change to the metamodel generator it should work recursively.
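For illustration only, a rough sketch of the general idea (drop empty fields, keep the class name around the dict rendering). It is not the PR's implementation; `Model` and `_non_empty` are hypothetical names.

```python
from dataclasses import dataclass, field, fields
from pprint import pformat
from typing import Dict, List, Optional


@dataclass(repr=False)
class Model:
    # Hypothetical model class, not part of linkml-runtime.
    name: str
    is_a: Optional[str] = None
    mixins: List[str] = field(default_factory=list)
    annotations: Dict[str, str] = field(default_factory=dict)

    def _non_empty(self) -> dict:
        # Keep only fields that carry information: drop None and empty containers.
        out = {}
        for f in fields(self):
            value = getattr(self, f.name)
            if value is None or value == [] or value == {}:
                continue
            out[f.name] = value
        return out

    def __repr__(self) -> str:
        # Wrap the dict rendering in the class name so the type stays visible.
        return f"{type(self).__name__}({pformat(self._non_empty(), indent=2)})"


text = repr(Model(name="Person", is_a="NamedThing", mixins=["HasAliases"]))
print(text)
```

Because `pformat` sorts dict keys by default, the fields come out alphabetically, as in the output above.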

@sneakers-the-rat
Contributor Author

sneakers-the-rat commented Mar 27, 2024

OK this is ready to review, opened a pull to main repo that matches.

the deal with the complicated print method that isn't just a call to pformat is that, especially with longer schemas, pformat output is actually extremely difficult to read - pformat hangs the indent of a nested object against the indentation level of its parent, rather than the global depth of nesting. for example this is what a schema looks like with vanilla pformatting

{'classes': {'Address': {'name': 'Address',
                         'slots': ['street', 'city', 'altitude']},
             'AnyObject': {'class_uri': 'linkml:Any',
                           'description': 'Example of unconstrained class',
                           'name': 'AnyObject'},
             'AnyOfClasses': {'attributes': {'attribute2': {'any_of': [{'range': 'Person'},
                                                                       {'range': 'Organization'}],
                                                            'name': 'attribute2'}},
                              'name': 'AnyOfClasses'},
             'AnyOfEnums': {'attributes': {'attribute3': {'any_of': [{'range': 'DiagnosisType'},
                                                                     {'range': 'EmploymentEventType'}],
                                                          'name': 'attribute3'}},
                            'name': 'AnyOfEnums'},
             'AnyOfMix': {'attributes': {'attribute4': {'any_of': [{'range': 'integer'},
                                                                   {'range': 'Person'},
                                                                   {'range': 'EmploymentEventType'}],
                                                        'name': 'attribute4'}},
                          'name': 'AnyOfMix'},
             'AnyOfSimpleType': {'attributes': {'attribute1': {'any_of': [{'range': 'string'},
                                                                          {'range': 'integer'}],
                                                               'name': 'attribute1'}},
                                 'name': 'AnyOfSimpleType'},
             'BirthEvent': {'is_a': 'Event',
                            'name': 'BirthEvent',
                            'slots': ['in location']},
             'CodeSystem': {'name': 'CodeSystem', 'slots': ['id', 'name']},
             'Company': {'attributes': {'ceo': {'name': 'ceo',
                                                'range': 'Person',
                                                'slot_uri': 'schema:ceo'}},
                         'is_a': 'Organization',
                         'name': 'Company'},
             'Concept': {'id_prefixes': ['CODE'],
                         'name': 'Concept',
                         'slots': ['id', 'name', 'in code system']},
             'Dataset': {'attributes': {'activities': {'inlined': True,
                                                       'inlined_as_list': True,
                                                       'multivalued': True,
                                                       'name': 'activities',
                                                       'range': 'activity'},
                                        'code systems': {'inlined_as_list': True,
                                                         'multivalued': True,
                                                         'name': 'code systems',
                                                         'range': 'CodeSystem'},
                                        'companies': {'inlined': True,
                                                      'inlined_as_list': True,
                                                      'multivalued': True,
                                                      'name': 'companies',
                                                      'range': 'Company'},
                                        'persons': {'inlined': True,
                                                    'inlined_as_list': True,
                                                    'multivalued': True,
                                                    'name': 'persons',
                                                    'range': 'Person'}},
                         'name': 'Dataset',
                         'rank': 1,
                         'slots': ['metadata'],
                         'tree_root': True},
             'DiagnosisConcept': {'close_mappings': ['biolink:Disease'],
                                  'is_a': 'Concept',
                                  'name': 'DiagnosisConcept'},
             'EmploymentEvent': {'is_a': 'Event',
                                 'name': 'EmploymentEvent',
                                 'rank': 6,
                                 'slot_usage': {'type': {'any_of': [{'range': 'CordialnessEnum'},
                                                                    {'range': 'EmploymentEventType'}],
                                                         'name': 'type',
                                                         'required': False}},
                                 'slots': ['employed at', 'type']},
             'Event': {'name': 'Event',
                       'slots': ['started at time',
                                 'ended at time',
                                 'is current',
                                 'metadata']},
             'FakeClass': {'attributes': {'test_attribute': {'name': 'test_attribute'}},
                           'deprecated': 'this is not a real class, we are '
                                         'using it to test deprecation',
                           'name': 'FakeClass'},
             'FamilialRelationship': {'is_a': 'Relationship',
                                      'name': 'FamilialRelationship',
                                      'rank': 5,
                                      'slot_usage': {'cordialness': {'name': 'cordialness'},
                                                     'related to': {'name': 'related '
                                                                            'to',
                                                                    'range': 'Person',
                                                                    'required': True},
                                                     'type': {'name': 'type',
                                                              'range': 'FamilialRelationshipType',
                                                              'required': True}},
                                      'slots': ['cordialness']},
             'Friend': {'abstract': True, 'name': 'Friend', 'slots': ['name']},
             'HasAliases': {'attributes': {'aliases': {'multivalued': True,
                                                       'name': 'aliases',
                                                       'slot_uri': 'skos:altLabel'}},
                            'mixin': True,
                            'name': 'HasAliases'},
             'MarriageEvent': {'is_a': 'Event',
                               'mixins': ['WithLocation'],
                               'name': 'MarriageEvent',
                               'slots': ['married to']},
             'MedicalEvent': {'is_a': 'Event',
                              'name': 'MedicalEvent',
                              'slots': ['in location',
                                        'diagnosis',
                                        'procedure']},
             'Organization': {'description': 'An organization.\n'
                                             '\n'
                                             'This description\n'
                                             'includes newlines\n'
                                             '\n'
                                             '## Markdown headers\n'
                                             '\n'
                                             ' * and\n'
                                             ' * a\n'
                                             ' * list',
                              'id_prefixes': ['ROR'],
                              'mixins': ['HasAliases'],
                              'name': 'Organization',
                              'rank': 3,
                              'slots': ['id', 'name']},
             'Person': {'attributes': {'is_living': {'name': 'is_living',
                                                     'range': 'LifeStatusEnum'}},
                        'description': 'A person, living or dead',
                        'exact_mappings': ['schema:Person'],
                        'id_prefixes': ['P'],
                        'in_subset': ['subset A'],
                        'mixins': ['HasAliases'],
                        'name': 'Person',
                        'rank': 2,
                        'see_also': ['https://en.wikipedia.org/wiki/Person',
                                     'schema:Person'],
                        'slot_usage': {'name': {'name': 'name',
                                                'pattern': '^\\S+ \\S+$'},
                                       'species name': {'equals_string': 'human',
                                                        'name': 'species name'},
                                       'stomach count': {'equals_number': 1,
                                                         'name': 'stomach '
                                                                 'count'}},
                        'slots': ['id',
                                  'name',
                                  'has employment history',
                                  'has familial relationships',
                                  'has medical history',
                                  'age in years',
                                  'addresses',
                                  'has birth event',
                                  'species name',
                                  'stomach count']},
             'Place': {'mixins': ['HasAliases'],
                       'name': 'Place',
                       'slots': ['id', 'name']},
             'ProcedureConcept': {'is_a': 'Concept',
                                  'name': 'ProcedureConcept'},
             'Relationship': {'name': 'Relationship',
                              'slot_usage': {'cordialness': {'name': 'cordialness',
                                                             'range': 'CordialnessEnum'}},
                              'slots': ['started at time',
                                        'ended at time',
                                        'related to',
                                        'type',
                                        'cordialness']},
             'Sub sub class 2': {'is_a': 'subclass test',
                                 'name': 'Sub sub class 2'},
             'WithLocation': {'mixin': True,
                              'name': 'WithLocation',
                              'slots': ['in location']},
             'class with spaces': {'attributes': {'slot with space 1': {'name': 'slot '
                                                                                'with '
                                                                                'space '
                                                                                '1'}},
                                   'name': 'class with spaces'},
             'subclass test': {'attributes': {'slot with space 2': {'name': 'slot '
                                                                            'with '
                                                                            'space '
                                                                            '2',
                                                                    'range': 'class '
                                                                             'with '
                                                                             'spaces'}},
                               'is_a': 'class with spaces',
                               'name': 'subclass test'},
             'tub sub class 1': {'description': 'Same depth as Sub sub class 1',
                                 'is_a': 'subclass test',
                                 'name': 'tub sub class 1'}},
 'default_curi_maps': ['semweb_context'],
 'default_prefix': 'ks',
 'default_range': 'string',
 'description': 'Kitchen Sink Schema\n'
                '\n'
                'This schema does not do anything useful. It exists to test '
                'all features of linkml.\n'
                '\n'
                'This particular text field exists to demonstrate markdown '
                'within a text field:\n'
                '\n'
                'Lists:\n'
                '\n'
                '   * a\n'
                '   * b\n'
                '   * c\n'
                '\n'
                'And links, e.g to [Person](Person.md)',
 'enums': {'CordialnessEnum': {'name': 'CordialnessEnum',
                               'permissible_values': {'hateful': {'description': 'spiteful',
                                                                  'text': 'hateful'},
                                                      'heartfelt': {'description': 'warm '
                                                                                   'and '
                                                                                   'hearty '
                                                                                   'friendliness',
                                                                    'text': 'heartfelt'},
                                                      'indifferent': {'description': 'not '
                                                                                     'overly '
                                                                                     'friendly '
                                                                                     'nor '
                                                                                     'obnoxiously '
                                                                                     'spiteful',
                                                                      'text': 'indifferent'}}},
           'DiagnosisType': {'name': 'DiagnosisType',
                             'permissible_values': {'TODO': {'text': 'TODO'}}},
           'EmploymentEventType': {'aliases': ['HR code'],
                                   'description': 'codes for different kinds '
                                                  'of employment/HR related '
                                                  'events',
                                   'name': 'EmploymentEventType',
                                   'permissible_values': {'FIRE': {'annotations': {'biolink:opposite': {'tag': 'biolink:opposite',
                                                                                                        'value': 'HIRE'}},
                                                                   'meaning': 'bizcodes:002',
                                                                   'text': 'FIRE'},
                                                          'HIRE': {'description': 'event '
                                                                                  'for '
                                                                                  'a '
                                                                                  'new '
                                                                                  'employee',
                                                                   'meaning': 'bizcodes:001',
                                                                   'text': 'HIRE'},
                                                          'PROMOTION': {'description': 'promotion '
                                                                                       'event',
                                                                        'meaning': 'bizcodes:003',
                                                                        'text': 'PROMOTION'},
                                                          'TRANSFER': {'description': 'transfer '
                                                                                      'internally',
                                                                       'meaning': 'bizcodes:004',
                                                                       'text': 'TRANSFER'}}},
           'FamilialRelationshipType': {'name': 'FamilialRelationshipType',
                                        'permissible_values': {'CHILD_OF': {'text': 'CHILD_OF'},
                                                               'PARENT_OF': {'text': 'PARENT_OF'},
                                                               'SIBLING_OF': {'text': 'SIBLING_OF'}}},
           'LifeStatusEnum': {'name': 'LifeStatusEnum',
                              'permissible_values': {'DEAD': {'text': 'DEAD'},
                                                     'LIVING': {'text': 'LIVING'},
                                                     'UNKNOWN': {'text': 'UNKNOWN'}}},
           'other codes': {'name': 'other codes',
                           'permissible_values': {'a b': {'text': 'a b'}}}},
 'id': 'https://w3id.org/linkml/tests/kitchen_sink',
 'imports': ['linkml:types', 'core'],
 'name': 'kitchen_sink',
 'prefixes': {'A': {'prefix_prefix': 'A',
                    'prefix_reference': 'http://example.org/activities/'},
              'BFO': {'prefix_prefix': 'BFO',
                      'prefix_reference': 'http://purl.obolibrary.org/obo/BFO_'},
              'CODE': {'prefix_prefix': 'CODE',
                       'prefix_reference': 'http://example.org/code/'},
              'P': {'prefix_prefix': 'P',
                    'prefix_reference': 'http://example.org/person/'},
              'RO': {'prefix_prefix': 'RO',
                     'prefix_reference': 'http://purl.obolibrary.org/obo/RO_'},
              'ROR': {'prefix_prefix': 'ROR',
                      'prefix_reference': 'http://example.org/ror/'},
              'biolink': {'prefix_prefix': 'biolink',
                          'prefix_reference': 'https://w3id.org/biolink/'},
              'bizcodes': {'prefix_prefix': 'bizcodes',
                           'prefix_reference': 'https://example.org/bizcodes/'},
              'dce': {'prefix_prefix': 'dce',
                      'prefix_reference': 'http://purl.org/dc/elements/1.1/'},
              'ks': {'prefix_prefix': 'ks',
                     'prefix_reference': 'https://w3id.org/linkml/tests/kitchen_sink/'},
              'lego': {'prefix_prefix': 'lego',
                       'prefix_reference': 'http://geneontology.org/lego/'},
              'linkml': {'prefix_prefix': 'linkml',
                         'prefix_reference': 'https://w3id.org/linkml/'},
              'pav': {'prefix_prefix': 'pav',
                      'prefix_reference': 'http://purl.org/pav/'},
              'schema': {'prefix_prefix': 'schema',
                         'prefix_reference': 'http://schema.org/'},
              'skos': {'prefix_prefix': 'skos',
                       'prefix_reference': 'http://www.w3.org/2004/02/skos/core#'}},
 'see_also': ['https://example.org/'],
 'slots': {'addresses': {'multivalued': True,
                         'name': 'addresses',
                         'range': 'Address'},
           'age in years': {'description': 'number of years since birth',
                            'in_subset': ['subset A', 'subset B'],
                            'maximum_value': 999,
                            'minimum_value': 0,
                            'name': 'age in years',
                            'range': 'integer'},
           'altitude': {'name': 'altitude', 'range': 'decimal'},
           'city': {'name': 'city'},
           'cordialness': {'name': 'cordialness'},
           'diagnosis': {'inlined': True,
                         'name': 'diagnosis',
                         'range': 'DiagnosisConcept'},
           'employed at': {'in_subset': ['subset A'],
                           'name': 'employed at',
                           'range': 'Company'},
           'has birth event': {'name': 'has birth event',
                               'range': 'BirthEvent'},
           'has employment history': {'in_subset': ['subset B'],
                                      'inlined_as_list': True,
                                      'multivalued': True,
                                      'name': 'has employment history',
                                      'range': 'EmploymentEvent',
                                      'rank': 7},
           'has familial relationships': {'in_subset': ['subset B'],
                                          'inlined_as_list': True,
                                          'multivalued': True,
                                          'name': 'has familial relationships',
                                          'range': 'FamilialRelationship'},
           'has marriage history': {'in_subset': ['subset B'],
                                    'inlined_as_list': True,
                                    'multivalued': True,
                                    'name': 'has marriage history',
                                    'range': 'MarriageEvent'},
           'has medical history': {'in_subset': ['subset B'],
                                   'inlined_as_list': True,
                                   'multivalued': True,
                                   'name': 'has medical history',
                                   'range': 'MedicalEvent',
                                   'rank': 5},
           'in code system': {'name': 'in code system', 'range': 'CodeSystem'},
           'in location': {'annotations': {'biolink:opposite': {'tag': 'biolink:opposite',
                                                                'value': 'location_of'}},
                           'name': 'in location',
                           'range': 'Place'},
           'is current': {'name': 'is current', 'range': 'boolean'},
           'life_status': {'name': 'life_status', 'range': 'LifeStatusEnum'},
           'married to': {'name': 'married to', 'range': 'Person'},
           'metadata': {'description': 'Example of a slot that has an '
                                       'unconstrained range',
                        'name': 'metadata',
                        'range': 'AnyObject'},
           'mixin_slot_I': {'mixin': True, 'name': 'mixin_slot_I'},
           'procedure': {'inlined': True,
                         'name': 'procedure',
                         'range': 'ProcedureConcept'},
           'related to': {'name': 'related to'},
           'species name': {'name': 'species name',
                            'pattern': '^[A-Z]+[a-z]+(-[A-Z]+[a-z]+)?\\\\.[A-Z]+(-[0-9]{4})?$'},
           'stomach count': {'name': 'stomach count', 'range': 'integer'},
           'street': {'name': 'street'},
           'tree_slot_A': {'name': 'tree_slot_A', 'slot_uri': 'ks:A'},
           'tree_slot_B': {'is_a': 'tree_slot_A',
                           'name': 'tree_slot_B',
                           'slot_uri': 'ks:B'},
           'tree_slot_C': {'is_a': 'tree_slot_B',
                           'mixins': ['mixin_slot_I'],
                           'name': 'tree_slot_C',
                           'slot_uri': 'ks:C'},
           'type': {'name': 'type'}},
 'source_file': 'tests/test_generators/input/kitchen_sink.yaml',
 'subsets': {'subset A': {'aliases': ['A'],
                          'comments': ['this subset is meaningless, it is just '
                                       'here for testing',
                                       'another comment'],
                          'description': 'test subset A\n'
                                         '\n'
                                         'This is a subset for testing',
                          'name': 'subset A',
                          'rank': 2},
             'subset B': {'aliases': ['B'],
                          'description': 'test subset B',
                          'name': 'subset B',
                          'rank': 1}},
 'title': 'Kitchen Sink Schema'}

and this is what this PR does:

SchemaDefinition({
  'name': 'kitchen_sink',
  'description': ('Kitchen Sink Schema\n'
     '\n'
     'This schema does not do anything useful. It exists to test all features of '
     'linkml.\n'
     '\n'
     'This particular text field exists to demonstrate markdown within a text '
     'field:\n'
     '\n'
     'Lists:\n'
     '\n'
     '   * a\n'
     '   * b\n'
     '   * c\n'
     '\n'
     'And links, e.g to [Person](Person.md)'),
  'title': 'Kitchen Sink Schema',
  'see_also': ['https://example.org/'],
  'id': 'https://w3id.org/linkml/tests/kitchen_sink',
  'imports': ['linkml:types', 'core'],
  'prefixes': {'pav': Prefix({'prefix_prefix': 'pav', 'prefix_reference': 'http://purl.org/pav/'}),
    'dce': Prefix({'prefix_prefix': 'dce', 'prefix_reference': 'http://purl.org/dc/elements/1.1/'}),
    'lego': Prefix({'prefix_prefix': 'lego', 'prefix_reference': 'http://geneontology.org/lego/'}),
    'linkml': Prefix({'prefix_prefix': 'linkml', 'prefix_reference': 'https://w3id.org/linkml/'}),
    'biolink': Prefix({'prefix_prefix': 'biolink', 'prefix_reference': 'https://w3id.org/biolink/'}),
    'ks': Prefix({
      'prefix_prefix': 'ks',
      'prefix_reference': 'https://w3id.org/linkml/tests/kitchen_sink/'
    }),
    'RO': Prefix({'prefix_prefix': 'RO', 'prefix_reference': 'http://purl.obolibrary.org/obo/RO_'}),
    'BFO': Prefix({'prefix_prefix': 'BFO', 'prefix_reference': 'http://purl.obolibrary.org/obo/BFO_'}),
    'CODE': Prefix({'prefix_prefix': 'CODE', 'prefix_reference': 'http://example.org/code/'}),
    'ROR': Prefix({'prefix_prefix': 'ROR', 'prefix_reference': 'http://example.org/ror/'}),
    'A': Prefix({'prefix_prefix': 'A', 'prefix_reference': 'http://example.org/activities/'}),
    'P': Prefix({'prefix_prefix': 'P', 'prefix_reference': 'http://example.org/person/'}),
    'skos': Prefix({
      'prefix_prefix': 'skos',
      'prefix_reference': 'http://www.w3.org/2004/02/skos/core#'
    }),
    'bizcodes': Prefix({'prefix_prefix': 'bizcodes', 'prefix_reference': 'https://example.org/bizcodes/'}),
    'schema': Prefix({'prefix_prefix': 'schema', 'prefix_reference': 'http://schema.org/'})},
  'default_curi_maps': ['semweb_context'],
  'default_prefix': 'ks',
  'default_range': 'string',
  'subsets': {'subset A': SubsetDefinition({
      'name': 'subset A',
      'description': 'test subset A\n\nThis is a subset for testing',
      'comments': ['this subset is meaningless, it is just here for testing', 'another comment'],
      'aliases': ['A'],
      'rank': 2
    }),
    'subset B': SubsetDefinition({'name': 'subset B', 'description': 'test subset B', 'aliases': ['B'], 'rank': 1})},
  'enums': {'FamilialRelationshipType': EnumDefinition({
      'name': 'FamilialRelationshipType',
      'permissible_values': {'SIBLING_OF': PermissibleValue({'text': 'SIBLING_OF'}),
        'PARENT_OF': PermissibleValue({'text': 'PARENT_OF'}),
        'CHILD_OF': PermissibleValue({'text': 'CHILD_OF'})}
    }),
    'DiagnosisType': EnumDefinition({
      'name': 'DiagnosisType',
      'permissible_values': {'TODO': PermissibleValue({'text': 'TODO'})}
    }),
    'EmploymentEventType': EnumDefinition({
      'name': 'EmploymentEventType',
      'description': 'codes for different kinds of employment/HR related events',
      'aliases': ['HR code'],
      'permissible_values': {'HIRE': PermissibleValue({'text': 'HIRE', 'description': 'event for a new employee', 'meaning': 'bizcodes:001'}),
        'FIRE': PermissibleValue({
          'text': 'FIRE',
          'meaning': 'bizcodes:002',
          'annotations': {'biolink:opposite': Annotation(tag='biolink:opposite',
                                           value='HIRE',
                                           extensions={},
                                           annotations={})}
        }),
        'PROMOTION': PermissibleValue({'text': 'PROMOTION', 'description': 'promotion event', 'meaning': 'bizcodes:003'}),
        'TRANSFER': PermissibleValue({'text': 'TRANSFER', 'description': 'transfer internally', 'meaning': 'bizcodes:004'})}
    }),
    'other codes': EnumDefinition({
      'name': 'other codes',
      'permissible_values': {'a b': PermissibleValue({'text': 'a b'})}
    }),
    'LifeStatusEnum': EnumDefinition({
      'name': 'LifeStatusEnum',
      'permissible_values': {'LIVING': PermissibleValue({'text': 'LIVING'}),
        'DEAD': PermissibleValue({'text': 'DEAD'}),
        'UNKNOWN': PermissibleValue({'text': 'UNKNOWN'})}
    }),
    'CordialnessEnum': EnumDefinition({
      'name': 'CordialnessEnum',
      'permissible_values': {'heartfelt': PermissibleValue({'text': 'heartfelt', 'description': 'warm and hearty friendliness'}),
        'hateful': PermissibleValue({'text': 'hateful', 'description': 'spiteful'}),
        'indifferent': PermissibleValue({
          'text': 'indifferent',
          'description': 'not overly friendly nor obnoxiously spiteful'
        })}
    })},
  'slots': {'employed at': SlotDefinition({'name': 'employed at', 'in_subset': ['subset A'], 'range': 'Company'}),
    'is current': SlotDefinition({'name': 'is current', 'range': 'boolean'}),
    'has employment history': SlotDefinition({
      'name': 'has employment history',
      'in_subset': ['subset B'],
      'rank': 7,
      'multivalued': True,
      'range': 'EmploymentEvent',
      'inlined_as_list': True
    }),
    'has marriage history': SlotDefinition({
      'name': 'has marriage history',
      'in_subset': ['subset B'],
      'multivalued': True,
      'range': 'MarriageEvent',
      'inlined_as_list': True
    }),
    'has medical history': SlotDefinition({
      'name': 'has medical history',
      'in_subset': ['subset B'],
      'rank': 5,
      'multivalued': True,
      'range': 'MedicalEvent',
      'inlined_as_list': True
    }),
    'has familial relationships': SlotDefinition({
      'name': 'has familial relationships',
      'in_subset': ['subset B'],
      'multivalued': True,
      'range': 'FamilialRelationship',
      'inlined_as_list': True
    }),
    'married to': SlotDefinition({'name': 'married to', 'range': 'Person'}),
    'in location': SlotDefinition({
      'name': 'in location',
      'annotations': {'biolink:opposite': Annotation(tag='biolink:opposite',
                                       value='location_of',
                                       extensions={},
                                       annotations={})},
      'range': 'Place'
    }),
    'diagnosis': SlotDefinition({'name': 'diagnosis', 'range': 'DiagnosisConcept', 'inlined': True}),
    'procedure': SlotDefinition({'name': 'procedure', 'range': 'ProcedureConcept', 'inlined': True}),
    'addresses': SlotDefinition({'name': 'addresses', 'multivalued': True, 'range': 'Address'}),
    'age in years': SlotDefinition({
      'name': 'age in years',
      'description': 'number of years since birth',
      'in_subset': ['subset A', 'subset B'],
      'range': 'integer',
      'minimum_value': 0,
      'maximum_value': 999
    }),
    'related to': SlotDefinition({'name': 'related to'}),
    'type': SlotDefinition({'name': 'type'}),
    'street': SlotDefinition({'name': 'street'}),
    'city': SlotDefinition({'name': 'city'}),
    'has birth event': SlotDefinition({'name': 'has birth event', 'range': 'BirthEvent'}),
    'in code system': SlotDefinition({'name': 'in code system', 'range': 'CodeSystem'}),
    'metadata': SlotDefinition({
      'name': 'metadata',
      'description': 'Example of a slot that has an unconstrained range',
      'range': 'AnyObject'
    }),
    'species name': SlotDefinition({
      'name': 'species name',
      'pattern': '^[A-Z]+[a-z]+(-[A-Z]+[a-z]+)?\\\\.[A-Z]+(-[0-9]{4})?$'
    }),
    'stomach count': SlotDefinition({'name': 'stomach count', 'range': 'integer'}),
    'tree_slot_A': SlotDefinition({'name': 'tree_slot_A', 'slot_uri': 'ks:A'}),
    'tree_slot_B': SlotDefinition({'name': 'tree_slot_B', 'is_a': 'tree_slot_A', 'slot_uri': 'ks:B'}),
    'tree_slot_C': SlotDefinition({
      'name': 'tree_slot_C',
      'is_a': 'tree_slot_B',
      'mixins': ['mixin_slot_I'],
      'slot_uri': 'ks:C'
    }),
    'mixin_slot_I': SlotDefinition({'name': 'mixin_slot_I', 'mixin': True}),
    'life_status': SlotDefinition({'name': 'life_status', 'range': 'LifeStatusEnum'}),
    'cordialness': SlotDefinition({'name': 'cordialness'}),
    'altitude': SlotDefinition({'name': 'altitude', 'range': 'decimal'})},
  'classes': {'AnyOfSimpleType': ClassDefinition({
      'name': 'AnyOfSimpleType',
      'attributes': {'attribute1': SlotDefinition({
          'name': 'attribute1',
          'any_of': [AnonymousSlotExpression({'range': 'string'}),
            AnonymousSlotExpression({'range': 'integer'})]
        })}
    }),
    'AnyOfClasses': ClassDefinition({
      'name': 'AnyOfClasses',
      'attributes': {'attribute2': SlotDefinition({
          'name': 'attribute2',
          'any_of': [AnonymousSlotExpression({'range': 'Person'}),
            AnonymousSlotExpression({'range': 'Organization'})]
        })}
    }),
    'AnyOfEnums': ClassDefinition({
      'name': 'AnyOfEnums',
      'attributes': {'attribute3': SlotDefinition({
          'name': 'attribute3',
          'any_of': [AnonymousSlotExpression({'range': 'DiagnosisType'}),
            AnonymousSlotExpression({'range': 'EmploymentEventType'})]
        })}
    }),
    'AnyOfMix': ClassDefinition({
      'name': 'AnyOfMix',
      'attributes': {'attribute4': SlotDefinition({
          'name': 'attribute4',
          'any_of': [AnonymousSlotExpression({'range': 'integer'}),
            AnonymousSlotExpression({'range': 'Person'}),
            AnonymousSlotExpression({'range': 'EmploymentEventType'})]
        })}
    }),
    'HasAliases': ClassDefinition({
      'name': 'HasAliases',
      'mixin': True,
      'attributes': {'aliases': SlotDefinition({'name': 'aliases', 'slot_uri': 'skos:altLabel', 'multivalued': True})}
    }),
    'Friend': ClassDefinition({'name': 'Friend', 'abstract': True, 'slots': ['name']}),
    'Person': ClassDefinition({
      'name': 'Person',
      'id_prefixes': ['P'],
      'description': 'A person, living or dead',
      'in_subset': ['subset A'],
      'see_also': ['https://en.wikipedia.org/wiki/Person', 'schema:Person'],
      'exact_mappings': ['schema:Person'],
      'rank': 2,
      'mixins': ['HasAliases'],
      'slots': ['id', 'name', 'has employment history', 'has familial relationships',
        'has medical history', 'age in years', 'addresses', 'has birth event',
        'species name', 'stomach count'],
      'slot_usage': {'name': SlotDefinition({'name': 'name', 'pattern': '^\\S+ \\S+$'}),
        'species name': SlotDefinition({'name': 'species name', 'equals_string': 'human'}),
        'stomach count': SlotDefinition({'name': 'stomach count', 'equals_number': 1})},
      'attributes': {'is_living': SlotDefinition({'name': 'is_living', 'range': 'LifeStatusEnum'})}
    }),
    'Organization': ClassDefinition({
      'name': 'Organization',
      'id_prefixes': ['ROR'],
      'description': ('An organization.\n'
         '\n'
         'This description\n'
         'includes newlines\n'
         '\n'
         '## Markdown headers\n'
         '\n'
         ' * and\n'
         ' * a\n'
         ' * list'),
      'rank': 3,
      'mixins': ['HasAliases'],
      'slots': ['id', 'name']
    }),
    'Place': ClassDefinition({'name': 'Place', 'mixins': ['HasAliases'], 'slots': ['id', 'name']}),
    'Address': ClassDefinition({'name': 'Address', 'slots': ['street', 'city', 'altitude']}),
    'Concept': ClassDefinition({
      'name': 'Concept',
      'id_prefixes': ['CODE'],
      'slots': ['id', 'name', 'in code system']
    }),
    'DiagnosisConcept': ClassDefinition({'name': 'DiagnosisConcept', 'close_mappings': ['biolink:Disease'], 'is_a': 'Concept'}),
    'ProcedureConcept': ClassDefinition({'name': 'ProcedureConcept', 'is_a': 'Concept'}),
    'Event': ClassDefinition({
      'name': 'Event',
      'slots': ['started at time', 'ended at time', 'is current', 'metadata']
    }),
    'Relationship': ClassDefinition({
      'name': 'Relationship',
      'slots': ['started at time', 'ended at time', 'related to', 'type', 'cordialness'],
      'slot_usage': {'cordialness': SlotDefinition({'name': 'cordialness', 'range': 'CordialnessEnum'})}
    }),
    'FamilialRelationship': ClassDefinition({
      'name': 'FamilialRelationship',
      'rank': 5,
      'is_a': 'Relationship',
      'slots': ['cordialness'],
      'slot_usage': {'type': SlotDefinition({'name': 'type', 'range': 'FamilialRelationshipType', 'required': True}),
        'related to': SlotDefinition({'name': 'related to', 'range': 'Person', 'required': True}),
        'cordialness': SlotDefinition({'name': 'cordialness'})}
    }),
    'BirthEvent': ClassDefinition({'name': 'BirthEvent', 'is_a': 'Event', 'slots': ['in location']}),
    'EmploymentEvent': ClassDefinition({
      'name': 'EmploymentEvent',
      'rank': 6,
      'is_a': 'Event',
      'slots': ['employed at', 'type'],
      'slot_usage': {'type': SlotDefinition({
          'name': 'type',
          'required': False,
          'any_of': [AnonymousSlotExpression({'range': 'CordialnessEnum'}),
            AnonymousSlotExpression({'range': 'EmploymentEventType'})]
        })}
    }),
    'MedicalEvent': ClassDefinition({
      'name': 'MedicalEvent',
      'is_a': 'Event',
      'slots': ['in location', 'diagnosis', 'procedure']
    }),
    'WithLocation': ClassDefinition({'name': 'WithLocation', 'mixin': True, 'slots': ['in location']}),
    'MarriageEvent': ClassDefinition({
      'name': 'MarriageEvent',
      'is_a': 'Event',
      'mixins': ['WithLocation'],
      'slots': ['married to']
    }),
    'Company': ClassDefinition({
      'name': 'Company',
      'is_a': 'Organization',
      'attributes': {'ceo': SlotDefinition({'name': 'ceo', 'slot_uri': 'schema:ceo', 'range': 'Person'})}
    }),
    'CodeSystem': ClassDefinition({'name': 'CodeSystem', 'slots': ['id', 'name']}),
    'Dataset': ClassDefinition({
      'name': 'Dataset',
      'rank': 1,
      'slots': ['metadata'],
      'attributes': {'persons': SlotDefinition({
          'name': 'persons',
          'multivalued': True,
          'range': 'Person',
          'inlined': True,
          'inlined_as_list': True
        }),
        'companies': SlotDefinition({
          'name': 'companies',
          'multivalued': True,
          'range': 'Company',
          'inlined': True,
          'inlined_as_list': True
        }),
        'activities': SlotDefinition({
          'name': 'activities',
          'multivalued': True,
          'range': 'activity',
          'inlined': True,
          'inlined_as_list': True
        }),
        'code systems': SlotDefinition({
          'name': 'code systems',
          'multivalued': True,
          'range': 'CodeSystem',
          'inlined_as_list': True
        })},
      'tree_root': True
    }),
    'FakeClass': ClassDefinition({
      'name': 'FakeClass',
      'deprecated': 'this is not a real class, we are using it to test deprecation',
      'attributes': {'test_attribute': SlotDefinition({'name': 'test_attribute'})}
    }),
    'class with spaces': ClassDefinition({
      'name': 'class with spaces',
      'attributes': {'slot with space 1': SlotDefinition({'name': 'slot with space 1'})}
    }),
    'subclass test': ClassDefinition({
      'name': 'subclass test',
      'is_a': 'class with spaces',
      'attributes': {'slot with space 2': SlotDefinition({'name': 'slot with space 2', 'range': 'class with spaces'})}
    }),
    'Sub sub class 2': ClassDefinition({'name': 'Sub sub class 2', 'is_a': 'subclass test'}),
    'tub sub class 1': ClassDefinition({
      'name': 'tub sub class 1',
      'description': 'Same depth as Sub sub class 1',
      'is_a': 'subclass test'
    }),
    'AnyObject': ClassDefinition({
      'name': 'AnyObject',
      'description': 'Example of unconstrained class',
      'class_uri': 'linkml:Any'
    })},
  'source_file': 'tests/test_generators/input/kitchen_sink.yaml'
})

i actually can't emphasize enough how much of a quality of life improvement this is - it might seem like a trivial thing, but it is one of the very first things i noticed when i was using linkml - never print the models, ever, because they will totally wipe out your shell's traceback buffer and offer nothing of value. when working with small schemas with one or two classes this is totally infuriating because you've only set like 5 properties and somehow are getting like 1000 lines of text back when you're trying to see what the hell anything is.

this pr makes it actually pretty manageable to work with linkml interactively, i was delightfully surprised while debugging it. i think it's worth the complexity of the manual string formatting i do here

@sneakers-the-rat
Contributor Author

i have been using this locally for working with linkml today and i think y'all are gonna love it. really changes how possible it is to read and debug problems in the code :)

@cmungall
Member

cmungall commented Apr 5, 2024

This is really great!

Just one Q - the syntax is a bit of a hybrid between python and a kind of OO-json. I agree a plain json serialization is less desirable as the typing becomes implicit. Any thoughts on making the output pure python that evaluates to the objects that are printed? This could be done with **s, but that is a bit ugly when just scanning. Alternatively, just make this normal instantiation syntax?

@sneakers-the-rat
Contributor Author

I'd be fine with either! The instantiation syntax would be tricky bc i am using pformat to handle all the other python objects, so it would be a different kind of hybrid syntax, but adding a ** would be np :)
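
To make the `**` option concrete, here is a minimal sketch using a stand-in dataclass (`Slot` is hypothetical, not a linkml class, and this is not the PR's formatting code). It only illustrates the point under discussion: with `**` in front of the dict, the printed form is plain Python that evaluates back to an equal object.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class Slot:
    """Hypothetical stand-in for a linkml model class."""
    name: str
    range: Optional[str] = None
    required: Optional[bool] = None

    def __repr__(self) -> str:
        # Keep only the fields that were actually set
        set_fields = {
            f.name: getattr(self, f.name)
            for f in fields(self)
            if getattr(self, f.name) is not None
        }
        # The leading ** turns the dict form into a valid constructor call
        return f"{type(self).__name__}(**{set_fields!r})"


slot = Slot(name="married to", range="Person")
printed = repr(slot)
# eval() is used here only to show that the printed text is valid Python
round_tripped = eval(printed)
```

Here `printed` is `Slot(**{'name': 'married to', 'range': 'Person'})`, and `round_tripped == slot` holds. The trade-off raised above still applies: the extra `**` adds noise when scanning nested output.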

@cmungall
Member

cmungall commented Apr 8, 2024

looks like the upstream tests are confused by the autoversioning...

@sneakers-the-rat
Contributor Author

Yes yes that's this: #319

@sneakers-the-rat
Contributor Author

any chance we can merge this pretty plz, i am working on array stuff again and cursing the long print statements while debugging :''(

@sneakers-the-rat
Contributor Author

Here's a drop-in monkeypatch for anyone else affected by this making it super hard to work with linkml models. just call this before you instantiate the models and you should be good

def patch_pretty_print() -> None:
    """
    Fix the godforsaken linkml dataclass reprs

    See: https://github.com/linkml/linkml-runtime/pull/314
    """
    import re
    from pprint import pformat
    from typing import Any
    import textwrap
    from dataclasses import is_dataclass, make_dataclass, field
    from linkml_runtime.linkml_model import meta
    from linkml_runtime.utils.formatutils import items

    def _pformat(fields: list, cls_name: str, indent: str = '  ') -> str:
        """
        pretty format the fields of the items of a ``YAMLRoot`` object without the wonky indentation of pformat.
        see ``YAMLRoot.__repr__``.
        formatting is similar to black - items at similar levels of nesting have similar levels of indentation,
        rather than getting placed at essentially random levels of indentation depending on what came before them.
        """
        res = []
        total_len = 0
        for key, val in fields:
            if val == [] or val == {} or val is None:
                continue
            # pformat handles everything else that isn't a YAMLRoot object, but it sure does look ugly
            # use it to split lines and as the thing of last resort, but otherwise indent = 0, we'll do that
            val_str = pformat(val, indent=0, compact=True, sort_dicts=False)
            # now we indent everything except the first line by indenting and then using regex to remove just the first indent
            val_str = re.sub(rf'\A{re.escape(indent)}', '', textwrap.indent(val_str, indent))
            # now recombine with the key in a format that can be re-eval'd into an object if indent is just whitespace
            val_str = f"'{key}': " + val_str

            # count the total length of this string so we know if we need to linebreak or not later
            total_len += len(val_str)
            res.append(val_str)

        if total_len > 80:
            inside = ',\n'.join(res)
            # we indent twice - once for the inner contents of every inner object, and one to
            # offset from the root element. that keeps us from needing to be recursive except for the
            # single pformat call
            inside = textwrap.indent(inside, indent)
            return cls_name + '({\n' + inside + '\n})'
        else:
            return cls_name + '({' + ', '.join(res) + '})'

    def __repr__(self):
        return _pformat(items(self), self.__class__.__name__)

    for cls_name in dir(meta):
        cls = getattr(meta, cls_name)
        if is_dataclass(cls):
            new_dataclass = make_dataclass(cls.__name__, fields=[('__dummy__', Any, field(default=None))], bases=(cls,), repr=False)
            new_dataclass.__repr__ = __repr__
            new_dataclass.__str__ = __repr__
            setattr(meta, cls.__name__, new_dataclass)
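
The patch works by swapping each dataclass in `meta` for a same-named subclass whose `__repr__` skips unset fields. A stdlib-only sketch of that idea follows; `Person` and `compact_repr` are hypothetical stand-ins for the linkml classes and `_pformat`, and the line-breaking logic is left out.

```python
from dataclasses import dataclass, field, fields, make_dataclass
from typing import Any, List, Optional


@dataclass
class Person:
    """Hypothetical stand-in for a generated linkml dataclass."""
    name: str
    aliases: List[str] = field(default_factory=list)
    title: Optional[str] = None


def compact_repr(self) -> str:
    # Drop empty lists, empty dicts and None, like the check in _pformat
    shown = {
        f.name: getattr(self, f.name)
        for f in fields(self)
        if getattr(self, f.name) not in ([], {}, None)
    }
    return f"{type(self).__name__}({shown!r})"


# Subclass under the same name; repr=False stops dataclasses from
# generating the default verbose __repr__ on the subclass
Patched = make_dataclass(
    Person.__name__,
    fields=[("__dummy__", Any, field(default=None))],
    bases=(Person,),
    repr=False,
)
Patched.__repr__ = compact_repr

default_text = repr(Person(name="Ada"))
patched_text = repr(Patched(name="Ada"))
```

With this, `default_text` is `Person(name='Ada', aliases=[], title=None)` while `patched_text` is `Person({'name': 'Ada'})`. Because the replacement is a subclass, `isinstance` checks against the original class still pass. Objects built from the original class keep the old repr, which is presumably why the patch has to run before any models are instantiated.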
