Is there a Python API client? #2061

Open
vsoch opened this Issue Sep 14, 2018 · 3 comments

vsoch commented Sep 14, 2018

Since creators of schemas for scientific domains commonly use Python (and, well, sometimes R, but that is more statistics and less bio/neuro), I wanted to ask if there is a Python client. For example, if I'm preparing a dataset in a loop, I would want to do this:

from schemaorg import Schema
from schemaorg.extensions.health_lifesci import *
from glob import glob

# Create schema
schema = Schema()

anatomical = BrainStructure(bodyLocation="head", function="thinking")
schema.add(anatomical)
schema.save("schema.json")

The above would export to whatever format(s) are desired, and then the schema itself could be used however needed! The library could also have functions for interacting with a schema. For example, in datalad you have extractors for metadata that would typically parse over files and use a metadata definition to create a "meta metadata" object. This is actually where I'd want to put different kinds of validation, which could also be supplied:

from schemaorg.validators import SchemaValidator
validator = SchemaValidator("schema.json")

anats = glob("*.nii.gz")

for anat_file in anats:
    validator.validate(anat_file)

There could be different levels for warnings and errors, coinciding with Python logging, etc. Things like validators, nice graphical UIs, transfer, and reports could be generated based on a (default) export of the current schemas, and the user could of course specify a custom one.
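
As a rough sketch of what warning and error levels tied to Python's standard `logging` module might look like: everything here (the `SketchValidator` class, the required/recommended split, the file names) is hypothetical and for illustration only, not an existing schemaorg API.

```python
import logging

logger = logging.getLogger("schema_sketch")


class SketchValidator:
    """Hypothetical validator: a missing required property is logged as an
    error, a missing recommended property as a warning."""

    def __init__(self, required, recommended=()):
        self.required = set(required)
        self.recommended = set(recommended)

    def validate(self, name, metadata):
        """Return True if every required property is present in metadata."""
        present = set(metadata)
        missing_required = sorted(self.required - present)
        missing_recommended = sorted(self.recommended - present)
        for prop in missing_required:
            logger.error("%s: missing required property %r", name, prop)
        for prop in missing_recommended:
            logger.warning("%s: missing recommended property %r", name, prop)
        return not missing_required


# Property names are borrowed from the example above; the rules are made up.
validator = SketchValidator(required=["bodyLocation"], recommended=["function"])
ok = validator.validate("sub-01_T1w.nii.gz", {"bodyLocation": "head"})
bad = validator.validate("sub-02_T1w.nii.gz", {"function": "thinking"})
```

Returning a boolean while sending the details through `logging` would let a caller decide how loud to be (for example, raising the logger level to hide warnings) without changing the validator itself.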

I found a few libraries for extraction, or general parsing, but not what I'm thinking of, e.g.

The library I'm looking for would be more of a template generator and validator, used not by developers but by dataset providers / generators. This would be a nice start / foundation for building a lot of the things I'm wanting to do, and making it easy for scientists, so I'd like to ask about it, and if there isn't one, I would like to give it a first shot :)
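
To make the "template generator" idea a bit more concrete, here is a minimal standard-library sketch, assuming a simple JSON-LD export with an `@graph` list. The `SketchSchema` class and its methods are hypothetical stand-ins for the `Schema` object imagined above, and the `https://schema.org` context is an assumption.

```python
import json


class SketchSchema:
    """Hypothetical container that collects typed items and exports JSON-LD."""

    def __init__(self, context="https://schema.org"):
        self.context = context
        self.items = []

    def add(self, type_name, **properties):
        """Record one item as a dict holding its @type and its properties."""
        self.items.append({"@type": type_name, **properties})

    def to_jsonld(self):
        """Return the collected items as a JSON-LD style dict."""
        return {"@context": self.context, "@graph": self.items}

    def save(self, path):
        """Write the JSON-LD document to a file."""
        with open(path, "w") as handle:
            json.dump(self.to_jsonld(), handle, indent=2)


schema = SketchSchema()
schema.add("BrainStructure", bodyLocation="head", function="thinking")
document = schema.to_jsonld()
```

Calling `schema.save("schema.json")` would then write the same document to disk. A real client would presumably check type and property names against the published specifications rather than accepting arbitrary strings.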

Contributor

danbri commented Sep 21, 2018

The Python you can see in this repo is generally focussed on running and publishing the site, and is probably not a good fit (or a stable API) for general usage. You might find that the general RDF APIs from something like rdflib address some of the needs here, in that you can create data graphs via API calls and then the library will handle things like generating legal JSON-LD syntax.

vsoch commented Sep 21, 2018

ok cool! This is something I'll think more about / take a shot at coming up with a simple solution because I'll definitely need it down the line! For the site that I'm generating at openschemas.github.io I'm using map2model and I'm about to add validation functions to that. But for a client to interact with the schemas (and do some of the examples above) I think I'll make a similar, small related tool.

vsoch commented Nov 6, 2018

Heyo! If you are interested, I created a schemaorg python module (see verbose writeup here) and would love contributions and working with schema.org on this. It differs from what is provided in this repo because it intends to provide programmatic access to the specifications themselves, and to help software developers read / extract / dump out what they need to build software around the schemas.

Here's a good entrypoint to a basic example and the repository. https://openschemas.github.io/schemaorg/
