Implement Fuseki metadata store #106

Merged
merged 3 commits into from Jul 18, 2018

Owner

@c-w c-w commented Jun 26, 2018

This pull request adds a metadata cache implementation based on Apache Jena Fuseki to supplement the existing SleepyCat and SQLite implementations.

Fuseki can be run as a service separate from Gutenberg, e.g. via Docker, which makes the library much easier to set up: no more need to install bsddb3! This means that going forward we can make bsddb3 an optional dependency. Additionally, Fuseki can run on a different machine from Gutenberg, which enables use cases where multiple users share a single metadata cache.
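
As a rough illustration of what talking to a separately running Fuseki service involves, the sketch below builds (but does not send) a SPARQL 1.1 Protocol query request using only the standard library. The host, port, and the dataset name `/ds` are assumptions chosen for the example, and `build_sparql_query_request` is a hypothetical helper, not part of this pull request; the actual implementation may go through a different client layer.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_sparql_query_request(dataset_url, sparql):
    """Build, without sending, a SPARQL 1.1 Protocol query request.

    Hypothetical helper for illustration only.
    """
    # Fuseki serves a dataset's query service under "<dataset>/query".
    endpoint = dataset_url.rstrip('/') + '/query'
    # The protocol allows a POST with a URL-encoded "query" parameter.
    body = urlencode({'query': sparql}).encode('utf-8')
    return Request(
        endpoint,
        data=body,
        headers={
            'Content-Type': 'application/x-www-form-urlencoded',
            'Accept': 'application/sparql-results+json',
        },
    )


# Assumed location: Fuseki on its default port with a dataset named "ds".
request = build_sparql_query_request(
    'http://localhost:3030/ds',
    'SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 5',
)
```

Because the request targets a URL rather than a local database file, the same cache can in principle be shared by several clients that point at the same dataset.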

@c-w c-w requested review from sethwoodworth, MasterOdin and hugovk Jun 26, 2018
Collaborator

@hugovk hugovk left a comment

I've not tested the code out locally, so just a few minor comments.

README.rst Outdated
Apache Jena Fuseki
------------------

As an alternative to the BSD-DB backend, this package can also leverage `Apache Jena Fuseki <https://jena.apache.org/documentation/fuseki2/>`_
Collaborator

@hugovk hugovk Jun 26, 2018

Use plain language: use "use" rather than "leverage".

Owner Author

@c-w c-w Jun 26, 2018

Done as amend.

try:
    self.graph.query('DELETE WHERE { ?s ?p ?o . }')
except ResultException:
    # this is often just a false positive since jena fuseki does not
Collaborator

@hugovk hugovk Jun 26, 2018

"jena fuseki" -> "Jena Fuseki"

Owner Author

@c-w c-w Jun 26, 2018

Done as amend.

@@ -1,2 +1,3 @@
coverage
flake8
nose
Collaborator

@hugovk hugovk Jun 26, 2018

Good to add this here.

This is really another issue, but we should consider switching away from nose. From November 2015:

Nose has been in maintenance mode for the past several years and will likely
cease without a new person/team to take over maintainership. New projects
should consider using Nose2 <https://github.com/nose-devs/nose2>, py.test <http://pytest.org/>, or just plain unittest/unittest2.

https://nose.readthedocs.io/en/latest/#note-to-users

Besides, we agreed that Nose was going to be in maintenance mode, Nose2 was the way forward, and that was part of the reason I took over maintainership at all. Personally, I wasn't ever agreeing to help make Nose live forever--it was more of a fix critical bugs the best I could with the time that I had available. There's some serious deficiencies in the Nose code base that can only be fixed with a lot of TLC, and no one on the current team really has the energy to commit to it.

That is not a knock on anyone... Nose has been around a long time, has lived through several changes in unit testing mentality, and across a number of versions of Python. It's legacy and with that comes the cruft of organic growth. It's just way more than I can deal with alone.

nose-devs/nose@0f40fa9#commitcomment-14224696

Owner Author

@c-w c-w Jun 26, 2018

Done in 015107b.

Collaborator

@MasterOdin MasterOdin Jun 26, 2018

Should probably consider moving completely away from nose/nose2 and to something like pytest. From the nose2 doc:

However, given the current climate, with much more interest accruing around pytest, nose2 is prioritizing bugfixes and maintenance ahead of new feature development.

It also has an "alpha" classifier attached to it.

Though that should probably be done in a different PR.

@c-w c-w force-pushed the fuseki-store branch 4 times, most recently from 4fd4f67 to c2d4a01 Jun 26, 2018
@c-w c-w force-pushed the fuseki-store branch from c2d4a01 to 9f80480 Jun 26, 2018
Owner Author

@c-w c-w commented Jun 26, 2018

@hugovk @MasterOdin @sethwoodworth Could someone take a look at the PR and let me know if you have any objections to the change? Thanks in advance!


ADD shiro.ini /jena-fuseki/shiro.ini

CMD ["/jena-fuseki/fuseki-server", "--loc=/fuseki", "--update", "/ds"]
Collaborator

@MasterOdin MasterOdin Jun 29, 2018

As I don't know anything about Fuseki, what would happen if someone used the default shiro.ini file that comes with the base Docker image?

Owner Author

@c-w c-w Jul 13, 2018

I was under the impression that the SPARQLUpdateStore didn't support authentication, but I must have stumbled across some pretty old docs. I've enabled pass-through for auth in aa7fcff.

        return FusekiMetadataCache(cache_location, cache_url)
    except InvalidCacheException:
        logging.debug('Unable to create cache based on Apache Jena Fuseki. '
                      'Next trying BSD-DB implementation.')
Collaborator

@MasterOdin MasterOdin Jun 29, 2018

Given that there's no obvious way to turn on debug-level logging, this seems kind of pointless to even have. Also, if a user is seriously trying to use Fuseki and it's not working, they would probably want a warning, though it might be most appropriate to actually throw the InvalidCacheException. They went to the trouble of setting an environment variable, after all.

Owner Author

@c-w c-w Jul 13, 2018

Good idea. If the environment variable was set, we now throw the exception if the cache can't be instantiated.
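
One way to read the behaviour agreed on in this thread is sketched below: an explicit opt-in through the environment makes a Fuseki failure fatal, while the absence of the variable falls through to a local backend. Everything here is a stand-in for illustration. The variable name `EXAMPLE_FUSEKI_URL`, the classes `FusekiCache` and `LocalCache`, and the function `create_cache` are hypothetical and do not claim to match the code in this pull request.

```python
import os


class InvalidCacheException(Exception):
    """Stand-in for the library's exception of the same name."""


class FusekiCache:
    """Hypothetical stand-in for a Fuseki-backed metadata cache."""

    def __init__(self, url):
        if not url:
            raise InvalidCacheException('no Fuseki URL given')
        self.url = url


class LocalCache:
    """Hypothetical stand-in for a local (BSD-DB or SQLite) cache."""


def create_cache(environ=os.environ):
    """Pick a cache backend based on the environment."""
    fuseki_url = environ.get('EXAMPLE_FUSEKI_URL')  # hypothetical name
    if fuseki_url is not None:
        # The user explicitly asked for Fuseki, so a failure is allowed
        # to propagate instead of being logged at debug level and ignored.
        return FusekiCache(fuseki_url)
    # No opt-in: use a local backend as before.
    return LocalCache()
```

The design choice is that silence is reserved for the default path; once the user has configured a backend, a misconfiguration surfaces as an exception rather than as a quiet fallback.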

@@ -201,6 +204,72 @@ def _check_can_be_instantiated(cls):
        del db


class FusekiMetadataCache(MetadataCache):
    _CACHE_URL_PREFIX = 'http://'
Collaborator

@MasterOdin MasterOdin Jun 29, 2018

What happens if their URL is behind https:// (as recommended in their docs for production servers)?

Owner Author

@c-w c-w Jul 13, 2018

See above: I was misled by some old docs. Added https to the white-list.
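
A minimal sketch of what accepting both schemes could look like, assuming the single prefix constant is replaced by a tuple. The names `_CACHE_URL_PREFIXES` and `is_fuseki_url` are illustrative and not taken from the merged code.

```python
# Assumed replacement for the single 'http://' prefix discussed above.
_CACHE_URL_PREFIXES = ('http://', 'https://')


def is_fuseki_url(cache_location):
    """Return True if the location looks like an HTTP(S) endpoint."""
    # str.startswith accepts a tuple and matches any of its members.
    return cache_location.lower().startswith(_CACHE_URL_PREFIXES)
```

Passing a tuple to `str.startswith` keeps the check to one call and makes it easy to extend the list later.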

@c-w c-w force-pushed the fuseki-store branch from 38b14b0 to aa7fcff Jul 13, 2018
Owner Author

@c-w c-w commented Jul 13, 2018

@MasterOdin Addressed your comments. Are you okay for this to be merged now or do you have any further questions/concerns?

@c-w c-w merged commit ed7a03a into master Jul 18, 2018
2 checks passed
@c-w c-w deleted the fuseki-store branch Jul 18, 2018