RDFLib core vs dependencies #391

niklasl · 2014-05-08T19:19:08Z

A long time ago, the intent to split RDFLib into a core and separate packages for complex parsers, SPARQL etc. was cancelled, since there was a desire to have "the batteries included". Since then (and mainly because of the advanced parsing and query features), RDFLib has grown dependencies upon external packages, making it non-self-contained anyway.

One very strange effect of this is the dependency on SPARQLWrapper, which itself depends on RDFLib (which depends on SPARQLWrapper...). This is an odd circular dependency, which e.g. causes an older RDFLib to be installed before running tests.

We should reconsider this situation. It is well and good to depend on packages. Therefore, we should seriously consider breaking out the advanced parts (again), instead of keeping this odd situation with a fat, complex core and dependencies which sometimes depend on parts of those innards. We could e.g. make an rdflib-core, and reshape the 'rdflib' package into an umbrella package, which upon installation would pull in all these parts to form the full suite. (Although some more complex parts should probably be considered optional anyway, such as non-memory backend stores.)
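To make the difference concrete, here is a minimal sketch that compares the two layouts as dependency graphs. The graphs are simplified illustrations based only on the package names mentioned above (rdflib-core does not exist yet), and `find_cycle` is a hypothetical helper, not part of RDFLib or any packaging tool.

```python
# Simplified, illustrative dependency graphs (package -> direct dependencies).
CURRENT = {
    "rdflib": ["SPARQLWrapper"],
    "SPARQLWrapper": ["rdflib"],
}

PROPOSED = {
    "rdflib-core": [],                            # hypothetical core package
    "SPARQLWrapper": ["rdflib-core"],
    "rdflib": ["rdflib-core", "SPARQLWrapper"],   # umbrella package
}


def find_cycle(graph):
    """Return one dependency cycle as a list of package names, or None."""
    visiting = []   # packages on the current depth-first path
    done = set()    # packages already checked and found cycle-free

    def visit(pkg):
        if pkg in done:
            return None
        if pkg in visiting:
            # We walked back onto the current path: that is a cycle.
            return visiting[visiting.index(pkg):] + [pkg]
        visiting.append(pkg)
        for dep in graph.get(pkg, []):
            cycle = visit(dep)
            if cycle:
                return cycle
        visiting.pop()
        done.add(pkg)
        return None

    for pkg in graph:
        cycle = visit(pkg)
        if cycle:
            return cycle
    return None
```

With these graphs, `find_cycle(CURRENT)` reports the rdflib/SPARQLWrapper loop, while `find_cycle(PROPOSED)` returns `None` because everything ultimately depends only on the core.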

(In any case, the intended integration of rdflib-jsonld could just as easily be done by adding a dependency to RDFLib's setup.py and calling it a day. Apart from the strange circularity if we keep the current situation, we wouldn't have to copy the separate repo code into RDFLib, tests and all, and then either mark the separate package "frozen" or maintain a copy manually.)

joernhees · 2014-05-09T00:35:21Z

I'm not exactly sure what you're arguing for. The circular dependencies you mention are a point against splitting into separate packages, but when I finish reading your post I think you're arguing for splitting?!?

dwvisser · 2014-05-09T02:22:39Z

My gut feeling here is that we should determine which subset of rdflib SPARQLWrapper depends on, and see whether that can be turned into a separate library. In any case, rdflib "core" should be independent of code focused on connecting to SPARQL endpoints.

niklasl · 2014-05-09T18:03:30Z

@joernhees Yes, I am arguing for splitting, for the reasons @dwvisser exemplifies. The point is that the core should not depend on additional features, like a SPARQL endpoint mechanism (or advanced parsing code).

Developing, testing and releasing packages in isolation is much simpler (and quicker). There would still be a full RDFLib suite (i.e. the rdflib package) pulling in all "blessed", common and stable components (themselves only depending on an rdflib-core, or maybe one or two additional components). Given that RDFLib already pulls in external dependencies, nothing would change from an end user's point of view.

dbs · 2014-05-09T18:19:38Z

See also #359 - for packagers it would be convenient to have a core with (ideally optional) packages. (At least in Fedora I can take an existing monolithic package and make the transition of splitting it into multiple packages relatively easily).

Splitting also makes sense for the attempt to refactor Python 3 support using six; if we have a smaller core, then we can address the core and the split packages separately, testing the efforts along the way.

An aside about #359 : the SPARQLWrapper tests should probably all live in SPARQLWrapper anyway, possibly mocking the rdflib bits. As an example, right now I have no good way of automatically running the SPARQLWrapper tests in the Fedora package because RDFLib gets installed first as a dependency and there's no (good) way to re-run those tests once SPARQLWrapper is installed.
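As a rough illustration of what "mocking the rdflib bits" could look like, here is a sketch using the standard library's `unittest.mock`. The helper `parse_rdf_response` is hypothetical and only stands in for wrapper-side code that hands a response to rdflib; it is not SPARQLWrapper's actual implementation. The point is that such a test needs no installed rdflib at all.

```python
import sys
from unittest import mock


def parse_rdf_response(data):
    """Hypothetical wrapper-side helper: parse RDF/XML data into a graph."""
    import rdflib  # imported lazily, so a test can substitute a mock module
    graph = rdflib.Graph()
    graph.parse(data=data, format="xml")
    return graph


def run_with_mocked_rdflib(data):
    """Call the helper with a fake 'rdflib' module placed in sys.modules."""
    fake_rdflib = mock.MagicMock(name="rdflib")
    with mock.patch.dict(sys.modules, {"rdflib": fake_rdflib}):
        graph = parse_rdf_response(data)
    return fake_rdflib, graph
```

A test can then assert on how the fake was used, for example `graph.parse.assert_called_once_with(data=..., format="xml")`, without RDFLib having to be installed first.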

gromgull · 2017-01-29T19:48:25Z

I would make the following suggestion:

- keep this repo for a central rdflib package which pulls in everything else. This could also contain the tools/extras (unless we retire some of them)
- make a new repository + package rdflib-core with everything directly in the rdflib package, the memory store, the proxy-stores (auditable, regex, concurrent), and also the main serialisers/parsers (n3/ttl/nquads/trig/rdfxml)
- move sparql to rdflib-sparql
- move sleepycat to rdflib-sleepycat (it's only one file, but bsddb is not built into py3; could also be core)
- remove rdfa/mdata from the core, and let them only be in their own repo (where the actual development already happens today)
- talk to the sparqlwrapper guys and see if we can have the sparqlstore in their repo (which in turn only depends on rdflib-core)

The tests go with each repo, some could stay here for testing the full package.

I don't know about the docs. Maybe they should also live here; if you install all packages, sphinx can find them for generating the API docs.

joernhees · 2017-01-29T23:41:14Z

Let me put this first: any change wrt. this is a step forward.

I'm however a bit worried about splitting too much:

- splits the (already small) developer community
- more setup and maintenance overhead
- harder to keep the full thing consistent

As an example: if we move sparql to rdflib-sparql and something is then changed in there, how do we make sure the whole test suite is re-run? Sure, we can set up CI in each of those repos to test the full thing, but that definitely comes with some overhead.

So I'd reduce those splits as much as possible. For example:

- keep everything that only lives here in this repo in this repo
- handle rdfa/mdata:
  - either internalize them (delete their standalone repos)
  - or remove them from this repo and make them their own packages (that we can depend on)

Moving sparqlstore to SPARQLWrapper is an option. However, if the main point is to remove the circular dependency between rdflib and SPARQLWrapper: that's actually gone (via extras_require). Installing rdflib no longer installs SPARQLWrapper, but to use sparqlstore you need to install it. If you install SPARQLWrapper, you get rdflib as a dependency.
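For readers unfamiliar with the mechanism, here is a minimal sketch of how an optional dependency is typically wired up. The `SETUP_KWARGS` dict mirrors keyword arguments one would pass to setuptools' `setup()`, but the extra name "sparql" and the empty `install_requires` are assumptions for illustration, not a copy of rdflib's real setup.py. `load_optional` is likewise a hypothetical helper showing how code such as sparqlstore could fail with a helpful message when the optional package is missing.

```python
import importlib

# Illustrative setup() keyword arguments; names and contents are assumptions.
SETUP_KWARGS = {
    "name": "rdflib",
    "install_requires": [],                            # no hard SPARQLWrapper dependency
    "extras_require": {"sparql": ["SPARQLWrapper"]},   # opt-in extra
}


def requirements_for(extras=()):
    """List what would be installed for the given extras."""
    reqs = list(SETUP_KWARGS["install_requires"])
    for extra in extras:
        reqs.extend(SETUP_KWARGS["extras_require"][extra])
    return reqs


def load_optional(module_name, extra):
    """Import an optional module, or explain which extra provides it."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"{module_name} is required for this feature; "
            f"install the '{extra}' extra to get it"
        ) from exc
```

Under these assumptions a plain install pulls in nothing extra, while requesting the "sparql" extra adds SPARQLWrapper, so the dependency only runs in one direction by default.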

wikier · 2017-01-30T13:51:21Z

My two cents from the SPARQLWrapper team: we could move sparqlstore into the wrapper, no problem. Maybe we can do this move in the 2.x branch I plan to start sooner rather than later.

gromgull · 2018-10-27T20:54:30Z

For 5.0 we've removed both rdfa and microdata from core, see #828.

jpmccu · 2018-12-03T21:37:23Z

I'd like to object to the removal of rdfa from the core. It's one of the "official" RDF serializations endorsed by W3C, and pyrdfa3 doesn't seem to support Python 2.7. Yes, there are some of us out there who haven't moved up to Python 3 yet.

niklasl added the cleanup label May 8, 2014

niklasl mentioned this issue Oct 26, 2014

Possible circular dependecy RDFLib/sparqlwrapper#46

Closed

ocefpaf mentioned this issue Jan 14, 2015

Circular dependency ioos/conda-recipes#17

Closed

niklasl mentioned this issue Jul 15, 2015

jsonld-0.3 breaks rdflib tests RDFLib/rdflib-jsonld#30

Open

joernhees added this to the rdflib 5.0.0 milestone Jul 15, 2015

joernhees added the meta and discussion labels Jul 15, 2015

gromgull mentioned this issue Jan 26, 2016

code duplication issue between rdflib and pymicrodata #582

Closed

joernhees modified the milestones: rdflib 5.0.0, rdflib 6.0.0 Jan 28, 2016

gromgull mentioned this issue May 15, 2018

remove rdfa and microdata parsers from core RDFLib #828

Merged

joernhees mentioned this issue May 15, 2018

Remove SPARQLWrapper dependency #744

Merged

gromgull mentioned this issue Oct 28, 2018

weisfeiler lehman kernel and example #796

Closed

nicholascar mentioned this issue May 1, 2020

Pros and Cons for more modularisation #1031

Closed

ghost locked and limited conversation to collaborators Dec 25, 2021

ghost converted this issue into discussion #1533 Dec 25, 2021

This issue was moved to a discussion.
