Categorization of KGX transform warnings and errors #338

Closed
justaddcoffee opened this issue Sep 20, 2021 · 6 comments

@justaddcoffee

Is your feature request related to a problem? Please describe.

We are using KGX for kg-obo (Thanks! It's working great.) Basically we are using KGX to transform all OBO ontologies into KGX TSV format - see here.

One challenge is that when you transform >100 ontologies, you get all kinds of errors and warnings. So, which things transformed "correctly" (by some definition), and which did not?

It'd be useful to have KGX transform return errors and warnings that are categorized in some way. For example, when parsing foodon, we see 15 or so warnings about bnodes not being handled, which in this case we think is okay, so we can probably upload the transformed graph. For other types of warnings, we might not want to upload the graph.

Describe the solution you'd like

It'd be awesome to have kgx.cli.transform return a dict with errors and warnings binned in some way according to what type of error they are. So we could do something like:

import kgx.cli

# NOTE: the dict return value used below is the proposed behaviour, not something transform does today
result = kgx.cli.transform(inputs=input_file,
                           input_format=input_format,
                           output=output_file,
                           output_format=output_format,
                           output_compression="tar.gz")
print("found these types of warnings: " + " ".join(result['warnings'].keys()))
print("found these types of errors: " + " ".join(result['errors'].keys()))

# see if we only get warnings of the type that we want to ignore
# (subset test, so a run with no warnings at all also passes)
if not result['errors'] and set(result['warnings'].keys()) <= {'bnode_warnings'}:
    upload_graph_to_s3()
else:
    print("things didn't go as planned, not uploading graph")

Describe alternatives you've considered

Right now we are thinking of just parsing the logs and binning the errors and warnings manually, which is totally doable. We thought, though, that it might be worth having this functionality live in KGX.
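
A minimal sketch of that manual binning fallback, assuming the messages arrive through Python's standard `logging` module. The `BinningHandler` class, the `only_ignorable` helper, the `CATEGORY_PATTERNS` table, the `demo_transform` logger name, and the sample messages are all hypothetical and invented for illustration; they are not actual KGX names or output.

```python
import logging
import re
from collections import defaultdict

# Hypothetical category patterns; real KGX messages may be worded differently.
CATEGORY_PATTERNS = {
    "bnode_warnings": re.compile(r"bnode", re.IGNORECASE),
}


class BinningHandler(logging.Handler):
    """Collect WARNING-and-above records, binned by matching category."""

    def __init__(self):
        super().__init__(level=logging.WARNING)
        self.bins = {"warnings": defaultdict(list), "errors": defaultdict(list)}

    def emit(self, record):
        message = record.getMessage()
        kind = "errors" if record.levelno >= logging.ERROR else "warnings"
        for category, pattern in CATEGORY_PATTERNS.items():
            if pattern.search(message):
                self.bins[kind][category].append(message)
                return
        # Anything that matches no pattern is kept, not dropped.
        self.bins[kind]["uncategorized"].append(message)


def only_ignorable(bins, ignorable=frozenset({"bnode_warnings"})):
    """True when there are no errors and every warning category is ignorable."""
    return not bins["errors"] and set(bins["warnings"]) <= ignorable


# Demo on a stand-in logger with invented messages (not actual KGX output).
handler = BinningHandler()
demo_logger = logging.getLogger("demo_transform")
demo_logger.addHandler(handler)
demo_logger.propagate = False  # keep the demo from echoing to stderr
demo_logger.warning("Skipping bnode _:b1")
demo_logger.warning("Skipping bnode _:b2")
print(sorted(handler.bins["warnings"]))
print(only_ignorable(handler.bins))
```

Matching on message text is brittle, since any rewording of a message silently moves it into `uncategorized`, which is part of why having the categories come from KGX itself seems preferable.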

Additional context

Per discussion with @cmungall and @caufieldjh

@justaddcoffee added the enhancement ("New feature or request") label Sep 20, 2021
@justaddcoffee changed the title from "Categorization of KGX transform errors" to "Categorization of KGX transform warnings and errors" Sep 20, 2021
@RichardBruskiewich
Collaborator

This relates a bit to #309 as well (I had the same complaints previously).

@RichardBruskiewich
Collaborator

Hi @justaddcoffee, I've just attempted a rehabilitation of the KGX error reporting for the 'validate' and 'graph-summary' commands (PR #364).

My gut feeling is that I can extend the new mechanism to your use case here. I'll attempt that.

@RichardBruskiewich self-assigned this Nov 28, 2021
@justaddcoffee
Author

Great @RichardBruskiewich thanks!

@cmungall
Contributor

cmungall commented Dec 3, 2021

You may want to consider the linkml validation datamodel (based on SHACL validation datamodel)

https://github.com/linkml/linkml-model/blob/main/linkml_model/model/schema/validation.yaml
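
As a rough illustration of what a structured report could look like, here is a minimal sketch of a typed result model that can be collapsed into the `warnings`/`errors` dict requested above. It is only loosely inspired by the idea of a validation report holding a list of results; the `Severity`, `Result`, and `Report` classes and their fields are illustrative assumptions and are not copied from `validation.yaml` or from KGX.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from enum import Enum


class Severity(Enum):
    WARNING = "WARNING"
    ERROR = "ERROR"


@dataclass
class Result:
    """One reported problem (hypothetical shape)."""
    type: str            # category of the problem, e.g. "bnode_warnings"
    severity: Severity
    info: str = ""       # free-text detail


@dataclass
class Report:
    """A collection of results for one transform run (hypothetical shape)."""
    results: list = field(default_factory=list)

    def binned(self):
        """Group result messages by severity, then by type."""
        bins = {"warnings": defaultdict(list), "errors": defaultdict(list)}
        for result in self.results:
            key = "errors" if result.severity is Severity.ERROR else "warnings"
            bins[key][result.type].append(result.info)
        return {key: dict(value) for key, value in bins.items()}


# Demo with invented entries.
report = Report([
    Result("bnode_warnings", Severity.WARNING, "unhandled bnode _:b1"),
    Result("bnode_warnings", Severity.WARNING, "unhandled bnode _:b2"),
])
binned = report.binned()
print(binned)
```

Keeping each result as a record, and deriving the binned view on demand, leaves room for extra fields later (for example the offending node) without changing what callers of the dict see.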

@RichardBruskiewich
Collaborator

@justaddcoffee, PR #338 attempts to implement this use case.

@RichardBruskiewich
Collaborator

Resolved by PR #365
