Load kg-covid-19 into blazegraph instance #134
Comments
@kltm has set up a blazegraph endpoint: now we just need to give him a .jnl file - anyone want to help with this? |
From @balhoff
|
@kltm I had accidentally put a space before the format there. |
fixed |
@justaddcoffee you may need to increase available heap size:
|
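The heap-size command from this comment did not survive in the thread. As a rough sketch only: JVM heap is capped with the standard `-Xmx` flag, and many JVM launcher scripts pick extra flags up from the `JAVA_OPTS` environment variable. Whether the blazegraph-runner launcher reads `JAVA_OPTS` is an assumption here, and the helper name `env_with_heap` and the 32G value are hypothetical.

```python
import os


def env_with_heap(max_heap_gb, base_env=None):
    """Return a copy of the environment whose JAVA_OPTS sets -Xmx.

    Assumption: the launcher script honours JAVA_OPTS. Not confirmed
    for blazegraph-runner in this thread.
    """
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("JAVA_OPTS", "")
    # Drop any earlier -Xmx setting so the new value is the only one.
    kept = [opt for opt in existing.split() if not opt.startswith("-Xmx")]
    kept.append(f"-Xmx{max_heap_gb}G")
    env["JAVA_OPTS"] = " ".join(kept)
    return env


if __name__ == "__main__":
    env = env_with_heap(32)  # 32 is an illustrative value only
    print(env["JAVA_OPTS"])
    # Pass env=env to subprocess.run([...]) when launching the loader.
```

Keeping the environment construction separate from the launch makes it easy to try a few heap sizes from a Jenkins step without editing the launcher itself.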
made a PR for this #151 - likely will have to fiddle with Jenkinsfile a bit more |
Possible working deployment draft on |
Thanks very much @kltm - should have a blazegraph journal soon
@balhoff about whether this is an ontology - I'm not sure. It contains the contents of 3 ontologies, but I don't know if it's an ontology itself. What's this @balhoff thanks again for all your help. |
@justaddcoffee it kind of depends how they were merged. OWL ontologies can be stored in RDF, but there is a schema dictating how the RDF can be structured. For the purposes of the |
thanks @balhoff |
Okay, the completed blazegraph journal is here. Increasing memory cut the blazegraph-runner runtime in half, down to 17h. I'll sync up with Seth to stand up the blazegraph instance. |
This still seems quite long. Did you say 15 million triples? |
Since adding memory reduced the load time, it is highly likely that a lot of time is spent in garbage collection when loading via blazegraph-runner |
The TSV edge file has 15 million lines, but it actually yields 561 million triples:
|
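The count output from this comment is missing from the thread. One possible way to get such a number, assuming the RDF is serialized as N-Triples (one triple per line, `#` for comment lines), is to count the non-blank, non-comment lines. The function names and the file path below are hypothetical.

```python
def count_triples(lines):
    """Count N-Triples statements in an iterable of text lines.

    Assumes one triple per line; blank lines and '#' comment lines
    are skipped. Duplicate statements are counted each time they appear.
    """
    count = 0
    for line in lines:
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            count += 1
    return count


def count_triples_in_file(path):
    """Stream a file from disk so a large dump never sits in memory."""
    with open(path, encoding="utf-8") as handle:
        return count_triples(handle)


if __name__ == "__main__":
    sample = [
        "# a comment line\n",
        "<http://example.org/a> <http://example.org/p> <http://example.org/b> .\n",
        "\n",
        '<http://example.org/a> <http://example.org/q> "x" .\n',
    ]
    print(count_triples(sample))
```

Because duplicates are counted here, the figure can be higher than what the triple store reports after loading.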
Oh! That's a different story. It may be fairly appropriate time frame then. You may be able to speed it up somewhat if you are on SSD. |
Not sure why this would matter, but when I use the NT file produced at the merge step (as opposed to converting after the fact with KGX), the runtime drops to 8 h. (@deepakunni3, thanks again for spotting this.) Triples are down to about 262M now, so the shorter runtime is possibly just because the graph got a lot smaller:
|
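A quick back-of-the-envelope check of the "graph got smaller" explanation, using only the figures quoted in this thread. It assumes the 17 h run corresponds to the 561M-triple load and the 8 h run to the roughly 262M-triple load; the function name is made up for illustration.

```python
def triples_per_second(triples, hours):
    """Average load rate over the whole run."""
    return triples / (hours * 3600)


# Figures as quoted in the thread; the run labels are descriptive only.
runs = {
    "RDF converted afterwards with KGX": (561_000_000, 17),
    "NT file from the merge step": (262_000_000, 8),
}

if __name__ == "__main__":
    for label, (triples, hours) in runs.items():
        rate = triples_per_second(triples, hours)
        print(f"{label}: {rate:,.0f} triples/s")
```

Both runs come out at roughly 9,100 to 9,200 triples per second, which is consistent with the shorter runtime coming mostly from the smaller graph and not from a faster load path.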
@kltm the blazegraph deploy stage failed, perhaps because of a permission problem with the repo operations.git
|
Yes, I fixed the NT exporter to not export unnecessary edges like |
@justaddcoffee You should have a repo invite now to help deal with the issue. |
Thanks @kltm ! |
Ummm...I made a few more changes and I think it worked? Do you have a way to test that? |
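One simple way to test a deployment like this is to ask the SPARQL endpoint for its triple count and compare it with the expected number. The sketch below uses the standard SPARQL protocol (GET with a `query` parameter, JSON results). The endpoint URL is a placeholder, since the real one is not given in this thread, and the function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

COUNT_QUERY = "SELECT (COUNT(*) AS ?n) WHERE { ?s ?p ?o }"


def build_count_url(endpoint):
    """Build a SPARQL-protocol GET URL for the triple-count query."""
    return endpoint + "?" + urllib.parse.urlencode({"query": COUNT_QUERY})


def fetch_triple_count(endpoint, timeout=60):
    """Run the count query and return the result as an int.

    Expects the standard SPARQL JSON results layout.
    """
    request = urllib.request.Request(
        build_count_url(endpoint),
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        data = json.load(response)
    return int(data["results"]["bindings"][0]["n"]["value"])


if __name__ == "__main__":
    # Placeholder endpoint: replace with the real deployment URL.
    print(build_count_url("http://example.org/blazegraph/sparql"))
```

A full count over a graph of this size can be slow, so a generous timeout (or a cheaper `ASK { ?s ?p ?o }` query) may be a better first smoke test.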
Assuming this is "good" for now, next steps might be to move this off the cheat of using a GO repo and into something you have easier operational control of. That said, this works for now... |