Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TermSet Integration #862

Closed
wants to merge 43 commits into from
Closed

TermSet Integration #862

wants to merge 43 commits into from

Conversation

mavaylon1
Copy link
Contributor

@mavaylon1 mavaylon1 commented May 12, 2023

Motivation

Add TermSet Integration

TODO:

  • Add gallery doc on TermSet
  • Add data validation to Data
  • Add TermSet unit tests
  • Make sure everything works with ER and pynwb branch
  • Add tests for TermSet usage: data validation on Data
  • Add validation to DynamicTable
  • Add tests for TermSet usage: data validation on add_row
  • Add tests for TermSet usage: data validation on add_column
  • Add tests for TermSet usage: ExternalResources

How to test the behavior?

Show how to reproduce the new behavior (can be a bug fix or a new feature)

Checklist

  • Did you update CHANGELOG.md with your changes?
  • Have you checked our Contributing document?
  • Have you ensured the PR clearly describes the problem and the solution?
  • Is your contribution compliant with our coding style? This can be checked running ruff from the source directory.
  • Have you checked to ensure that there aren't other open Pull Requests for the same change?
  • Have you included the relevant issue number using "Fix #XXX" notation where XXX is the issue number? By including "Fix #XXX" you allow GitHub to close issue #XXX when the PR is merged.

@mavaylon1 mavaylon1 self-assigned this May 12, 2023
@mavaylon1
Copy link
Contributor Author

@rly @oruebel I finished adding the tests and the tutorials. I'll take another run through later today but feel free to give this a look over now. Also for some reason the path to the schema works locally but fails on git. It's a silly thing but let me know. I tried multiple path configurations.

@mavaylon1
Copy link
Contributor Author

mavaylon1 commented May 15, 2023

If you run

pytest

everything passes except backwards compatibility.

FileNotFoundError: [Errno 2] No such file or directory: 'tests/unit/back_compat_tests/1.0.5.h5'

even though it is there

@mavaylon1
Copy link
Contributor Author

So if you run "pytest" alone it will return an error where it can't find the file. However, if you run "pytest test_table.py" then it'll find the file.

@codecov
Copy link

codecov bot commented May 17, 2023

Codecov Report

Patch coverage: 25.66% and project coverage change: -1.30 ⚠️

Comparison is base (5fa6322) 88.21% compared to head (2af0e9f) 86.92%.

❗ Current head 2af0e9f differs from pull request most recent head ef43b23. Consider uploading reports for the commit ef43b23 to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##              dev     #862      +/-   ##
==========================================
- Coverage   88.21%   86.92%   -1.30%     
==========================================
  Files          44       45       +1     
  Lines        9020     9184     +164     
  Branches     2576     2616      +40     
==========================================
+ Hits         7957     7983      +26     
- Misses        752      882     +130     
- Partials      311      319       +8     
Impacted Files Coverage Δ
src/hdmf/term_set.py 17.02% <17.02%> (ø)
src/hdmf/container.py 87.17% <17.94%> (-3.69%) ⬇️
src/hdmf/common/table.py 82.66% <31.57%> (-2.69%) ⬇️
src/hdmf/common/resources.py 88.91% <32.20%> (-8.79%) ⬇️
src/hdmf/build/objectmapper.py 92.50% <33.33%> (-0.23%) ⬇️
src/hdmf/__init__.py 60.00% <100.00%> (+1.66%) ⬆️

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

@@ -2,7 +2,7 @@
h5py==2.10 # support for selection of datasets with list of indices added in 2.10
importlib-metadata==4.2.0; python_version < "3.8" # TODO: remove when minimum python version is 3.8
importlib-resources==5.12.0; python_version < "3.9" # TODO: remove when when minimum python version is 3.9
jsonschema==2.6.0
jsonschema>=2.6.0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The point of this file is to be used when testing whether the minimum requirements set in pyproject.toml are valid. Using >= instead of == defeats the point.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If 2.6.0 is too small, then the minimum version should be increased both here and in pyproject.toml

tests/unit/test_container.py Outdated Show resolved Hide resolved
tests/unit/test_term_set.py Outdated Show resolved Hide resolved
@rly
Copy link
Contributor

rly commented May 18, 2023

linkml_runtime is not a lightweight dependency... Installing it installs

Successfully installed attrs-23.1.0 certifi-2023.5.7 charset-normalizer-3.1.0 click-8.1.3 curies-0.5.5 deprecated-1.2.13 exceptiongroup-1.1.1 greenlet-2.0.1 hbreader-0.9.1 idna-3.4 importlib-metadata-6.6.0 iniconfig-2.0.0 isodate-0.6.1 json-flattener-0.1.9 jsonasobj2-1.0.4 jsonschema-4.17.3 linkml_runtime-1.5.3 packaging-23.1 pluggy-1.0.0 prefixcommons-0.1.12 prefixmaps-0.1.5 pydantic-1.10.7 pyparsing-3.0.9 pyrsistent-0.19.3 pytest-7.3.1 pytest-logging-2015.11.4 pytrie-0.4.0 pyyaml-6.0 rdflib-6.3.2 requests-2.30.0 six-1.16.0 sortedcontainers-2.4.0 tomli-2.0.1 typing-extensions-4.5.0 urllib3-2.0.2 wrapt-1.15.0 zipp-3.15.0

That's kind of a lot of packages, totaling at least 15 MB. Each package can introduce dependency resolution conflicts with users' environments, so in general, it is best for libraries to use as few packages as possible.

Out of the many functions of linkml_runtime, we use only:

SchemaView.schema.prefixes
SchemaView.all_enums()

Both of those seem like it can be handled using simple YAML parsing of the LinkML YAML.
@oruebel @mavaylon1 what do you think about replacing these SchemaView functions with simple calls to parse the LinkML YAML?
Pros: avoids installing a heavy dependency
Cons: if the LinkML format for prefixes and enums changes for some reason, then we would have to update the YAML parsing. But that seems unlikely.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants