Make everything installable via PyPI #41

Open
2 tasks
cmungall opened this issue Jun 13, 2022 · 11 comments

Comments

@cmungall
Collaborator

cmungall commented Jun 13, 2022

  • rewrite relation-graph in rust and use PyO3 bindings
  • use PyO3 to wrap rdftab

An alternative to wrapping rdftab is to load the statements table directly in Python. This will be slower, but it should be very straightforward if we skip loading the stanza field, which we don't use. It also has the advantage that we don't need to transform to RDF/XML using riot or robot.
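A minimal sketch of the direct-loading idea, using only the standard library. The column layout (`stanza`, `subject`, `predicate`, `object`, `value`, `datatype`, `language`) is an assumption about the statements table, and `Triple` and `load_statements` are hypothetical names, not part of semsql or rdftab. Parsing is left to whichever RDF library produces the triples; `stanza` is simply left NULL.

```python
import sqlite3
from typing import Iterable, NamedTuple, Optional


class Triple(NamedTuple):
    """One parsed statement; exactly one of `object` / `value` is normally set."""
    subject: str
    predicate: str
    object: Optional[str] = None    # set when the object is a resource (IRI, CURIE, blank node)
    value: Optional[str] = None     # set when the object is a literal (its lexical form)
    datatype: Optional[str] = None
    language: Optional[str] = None


# Assumed schema; adjust to match the real statements table.
CREATE_SQL = """
CREATE TABLE IF NOT EXISTS statements (
    stanza TEXT, subject TEXT, predicate TEXT,
    object TEXT, value TEXT, datatype TEXT, language TEXT
)
"""

# The stanza column is deliberately omitted, so it stays NULL.
INSERT_SQL = """
INSERT INTO statements (subject, predicate, object, value, datatype, language)
VALUES (?, ?, ?, ?, ?, ?)
"""


def load_statements(conn: sqlite3.Connection,
                    triples: Iterable[Triple],
                    batch_size: int = 10_000) -> int:
    """Insert triples in batches and return the number of rows written."""
    conn.execute(CREATE_SQL)
    batch = []
    written = 0
    for triple in triples:
        batch.append(tuple(triple))
        if len(batch) >= batch_size:
            conn.executemany(INSERT_SQL, batch)
            written += len(batch)
            batch.clear()
    if batch:  # flush the final partial batch
        conn.executemany(INSERT_SQL, batch)
        written += len(batch)
    conn.commit()
    return written


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    count = load_statements(conn, [
        Triple("ex:a", "rdfs:subClassOf", object="ex:b"),
        Triple("ex:a", "rdfs:label", value="a", language="en"),
    ])
    print(count, "statements loaded")
```

Batching with `executemany` keeps the Python-side overhead low, but the parse step is likely to dominate the run time, which is why the choice of RDF parser matters more than the insert loop.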

cmungall added a commit to cmungall/relation-graph-py that referenced this issue Jun 13, 2022
@cmungall
Collaborator Author

cmungall commented Jul 6, 2022

See https://github.com/cmungall/relation-graph-py

Note it may not be necessary to wrap rdftab using PyO3; we can use any RDF library (we don't use the stanza field from rdftab).

@cmungall
Collaborator Author

Consider instead: https://github.com/balhoff/whelk-rs

@joeflack4

joeflack4 commented Aug 10, 2022

@hrshdhgd @cmungall I was trying to get semsql to work today in order to troubleshoot some issues I'm having with trying to use SqlImplementation in OAK.

I had a lot of problems with version 0.1.7 of semsql, so I installed the latest version, 0.2.0, but now I'm getting this error: /bin/sh: relation-graph: command not found

For now, should I continue using semsql==0.1.* (resolves to 0.1.7)?

Error message

/bin/sh: relation-graph: command not found
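This error means the shell could not find a `relation-graph` executable on `PATH`. A quick check for that, sketched with the standard library only (the helper name `missing_tools` is hypothetical and not part of semsql):

```python
import shutil
from typing import Iterable, List


def missing_tools(names: Iterable[str]) -> List[str]:
    """Return the names that cannot be found as executables on the current PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools(["relation-graph"])
    if missing:
        print("Not on PATH:", ", ".join(missing))
    else:
        print("All required tools found")
```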

Related

I think this is intentional on OAK's end because of the above error, but I just wanted to let you know that this came up as well:

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
oaklib 0.1.34 requires semsql<0.2.0,>=0.1.6, but you have semsql 0.2.0 which is incompatible.

@cmungall
Collaborator Author

Hi @joeflack4 - always use the latest version. If you are having issues with relation-graph (RG), file an issue here: https://github.com/balhoff/relation-graph/issues.

Did you get your issue resolved?

@jamesaoverton

Using PyO3 for RDFTab is certainly possible, but I wasn't planning to do it because I'll be using LDTab going forward. We've used PyO3 for valve.py and wiring.py and are working on using it for LDTab (using horned-owl). We're happy to share our experience.

For this purpose, I think you're probably better off just porting RDFTab to Python.

@cmungall
Collaborator Author

@jamesaoverton - that makes sense.

The speed of rdflib is the main issue. Even though we get very fast access once we have built the sqlite db, there are still cases where latency in the build is an issue. But having this as an option certainly seems reasonable.

I'm figuring medium term python bindings to horned-owl will solve a lot of use cases...

@LucaCappelletti94

Please do be advised that you will encounter the following complex issues:

  • Different available instruction sets (e.g. AVX256)
  • Different architectures (Mac M1, M2, Intel...)

Do take these things into account while designing your build and deploy process. It took quite a while for us to figure out how to do this for Ensmallen.
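As a rough illustration of why architecture matters for packaging, the sketch below prints the values Python reports for the current machine, which is the kind of information that decides whether a prebuilt wheel can be used. The function name `describe_build_target` is hypothetical. It does not detect CPU instruction-set extensions; that needs a separate mechanism.

```python
import platform
import sysconfig


def describe_build_target() -> dict:
    """Collect the platform details reported by the running interpreter."""
    return {
        "machine": platform.machine(),             # CPU architecture name reported by the OS
        "system": platform.system(),               # operating system name
        "platform_tag": sysconfig.get_platform(),  # platform string used when building distributions
        "python": platform.python_version(),
    }


if __name__ == "__main__":
    for key, value in describe_build_target().items():
        print(f"{key}: {value}")
```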

@joeflack4

Just linking the Slack thread that Chris opened: https://obo-communitygroup.slack.com/archives/C03D93DEALA/p1661527315827469

@jamesaoverton

jamesaoverton commented Aug 29, 2022

I agree with @LucaCappelletti94: Getting PyO3 to work has been the easy part, and cross-compiling binaries for packaging has been much trickier. With a lot of effort we have a workflow to compile for major architectures and push to PyPI using GitHub Actions. This has been tested but is not yet in production: https://github.com/ontodev/valve.py/blob/valve_rs_python_bindings/.github/workflows/build-and-publish-wheels.yml

Suggestions for improvements are welcome.

@cmungall
Collaborator Author

cmungall commented Aug 30, 2022

I have an experimental replacement for rdftab.rs:

https://github.com/INCATools/rdf-sql-bulkloader

This doesn't do any Rust binding itself; it relies on https://github.com/ozekik/lightrdf for that part. If this is fruitful, we may want to coordinate with the lightrdf devs to make sure they follow best practice for releasing wheels etc.

I am still doing perf tests (INCATools/rdf-sql-bulkloader#1)

UPDATE: the bulkloader now uses pyoxigraph, which seems better supported.

@cmungall
Collaborator Author

I added a general discussion of Rust dependencies in OAK here:

INCATools/ontology-access-kit#247
