This issue was moved to a discussion.

RDFLib core vs dependencies #391

Closed
niklasl opened this issue May 8, 2014 · 9 comments
Labels: cleanup, discussion, meta (relates primarily to the project and not users of the project)

@niklasl
Member

niklasl commented May 8, 2014

A long time ago, the plan to split RDFLib into a core and separate packages for complex parsers, SPARQL etc. was cancelled, since there was a desire to have "the batteries included". Since then (and mainly because of the advanced parsing and query features), RDFLib has acquired dependencies on external packages, so it is no longer self-contained anyway.

One very strange effect of this is the dependency on SPARQLWrapper, which depends on RDFLib (which depends on SPARQLWrapper...). This is an odd circular dependency, which e.g. causes an older RDFLib to be installed before running tests.

We should reconsider this situation. It is all well and good to depend on packages. Therefore, we should seriously consider breaking out the advanced parts (again), instead of keeping this odd situation with a fat, complex core and dependencies which sometimes depend on parts of those innards. We could e.g. make an rdflib-core and reshape the 'rdflib' package into an umbrella package which, upon installation, would pull in all these parts to form the full suite. (Although some more complex parts should probably be considered optional anyway, such as non-memory backend stores.)

(In any case, the intended integration of rdflib-jsonld could just as easily be done by adding a dependency to RDFLib's setup.py and calling it a day. Apart from the strange circularity if we keep the current situation, we wouldn't have to copy the separate repo's code into RDFLib, tests and all, and then either mark the separate package "frozen" or maintain a copy manually.)
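To make the circularity described above concrete, here is a minimal sketch that looks for a cycle in a package dependency graph. The graphs are simplified illustrations rather than real package metadata, and the `rdflib-core` package and the `find_cycle` helper are hypothetical names used only for this example.

```python
def find_cycle(deps):
    """Return one dependency cycle as a list of package names, or None."""
    visiting, done = [], set()

    def visit(pkg):
        if pkg in done:
            return None
        if pkg in visiting:
            # Close the loop: slice from the first occurrence and repeat it.
            return visiting[visiting.index(pkg):] + [pkg]
        visiting.append(pkg)
        for dep in deps.get(pkg, ()):
            cycle = visit(dep)
            if cycle:
                return cycle
        visiting.pop()
        done.add(pkg)
        return None

    for pkg in deps:
        cycle = visit(pkg)
        if cycle:
            return cycle
    return None


# Situation described in the issue: each package requires the other.
current = {"rdflib": ["SPARQLWrapper"], "SPARQLWrapper": ["rdflib"]}

# Hypothetical split (assumption): both depend only on a self-contained core.
proposed = {
    "rdflib": ["rdflib-core", "SPARQLWrapper"],
    "SPARQLWrapper": ["rdflib-core"],
    "rdflib-core": [],
}

print(find_cycle(current))
print(find_cycle(proposed))
```

Under these assumptions, the first graph contains a loop through both packages, while the split layout has none.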

@niklasl niklasl added the cleanup label May 8, 2014
@joernhees
Member

I'm not exactly sure what you're arguing for. The circular dependencies you mention are a point against splitting into separate packages, but when I finish reading your post I think you're arguing for splitting?!?

@dwvisser

dwvisser commented May 9, 2014

My gut feeling here is that we should determine which subset of rdflib SPARQLWrapper depends on, and see if that can be turned into a separate library. In any case, rdflib "core" should be independent of code focused on connecting to SPARQL endpoints.


@niklasl
Member Author

niklasl commented May 9, 2014

@joernhees Yes, I am arguing for splitting, for the reasons @dwvisser exemplifies. The point is that the core should not depend on additional features, like a SPARQL endpoint mechanism (or advanced parsing code).

Developing, testing and releasing packages in isolation is much simpler (and quicker). There would still be a full RDFLib suite (i.e. the rdflib package) pulling in all "blessed", common and stable components (themselves only depending on an rdflib-core, or maybe one or two additional components). Given that RDFLib already pulls in external dependencies, nothing would change from an end user's point of view.

@dbs
Contributor

dbs commented May 9, 2014

See also #359 - for packagers it would be convenient to have a core with (ideally optional) packages. (At least in Fedora I can take an existing monolithic package and make the transition of splitting it into multiple packages relatively easily).

Splitting also makes sense for the attempt to refactor Python 3 support using six; if we have a smaller core, then we can address the core and the split packages separately, testing the efforts along the way.

An aside about #359: the SPARQLWrapper tests should probably all live in SPARQLWrapper anyway, possibly mocking the rdflib bits. As an example, right now I have no good way of automatically running the SPARQLWrapper tests in the Fedora package, because RDFLib gets installed first as a dependency and there's no (good) way to re-run those tests once SPARQLWrapper is installed.
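As a rough illustration of "mocking the rdflib bits", the sketch below substitutes a fake `rdflib` module through `sys.modules`, so a test can run whether or not the real package is installed. `convert_result` and `run_with_fake_rdflib` are hypothetical helpers written for this example; they are not taken from SPARQLWrapper's code base.

```python
import sys
from unittest import mock


def convert_result(payload):
    """Hypothetical stand-in for wrapper code that hands RDF/XML to rdflib."""
    import rdflib  # resolved at call time, so a test can substitute a fake

    graph = rdflib.Graph()
    graph.parse(data=payload, format="xml")
    return graph


def run_with_fake_rdflib(payload):
    """Call convert_result with rdflib replaced by a MagicMock."""
    fake_rdflib = mock.MagicMock(name="rdflib")
    # patch.dict restores the original sys.modules entries on exit.
    with mock.patch.dict(sys.modules, {"rdflib": fake_rdflib}):
        graph = convert_result(payload)
    return fake_rdflib, graph


fake, graph = run_with_fake_rdflib("<rdf/>")
# The test only checks the interaction with rdflib, not real parsing.
fake.Graph.return_value.parse.assert_called_once_with(data="<rdf/>", format="xml")
```

The trade-off is that such a test only verifies how the wrapper calls rdflib; real parsing behaviour would still need an integration test somewhere.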

@gromgull
Member

I would make the following suggestion:

  • keep this repo for a central rdflib package which pulls in everything else. This could also contain the tools/extras (unless we retire some of them)
  • make a new repository + package rdflib-core with everything directly in the rdflib package, the memory store, the proxy-stores (auditable, regex, concurrent), and also the main serialisers/parsers (n3/ttl/nquads/trig/rdfxml)
  • move sparql to rdflib-sparql
  • move sleepycat to rdflib-sleepycat (it's only one file, but there is no bsddb built into py3? Could also be core.)
  • remove rdfa/mdata from the core, and let them only be in their own repo (where the actual development already happens today)
  • talk to the sparqlwrapper guys and see if we can have the sparqlstore in their repo (which in turn only depends on rdflib-core)

The tests go with each repo, some could stay here for testing the full package.

I don't know about the docs. Maybe also here, if you install all packages sphinx can find them for generating API docs.
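The layout suggested above can be sketched as a dependency graph and checked for a valid install order. This is only an illustration: the exact set of packages the umbrella `rdflib` would pull in is an assumption, the rdfa/mdata packages are left out, and `install_order` is a hypothetical helper.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each package maps to the packages it depends on (assumed, per the proposal).
layout = {
    "rdflib-core": set(),
    "rdflib-sparql": {"rdflib-core"},
    "rdflib-sleepycat": {"rdflib-core"},
    "SPARQLWrapper": {"rdflib-core"},  # would host sparqlstore
    "rdflib": {"rdflib-core", "rdflib-sparql", "rdflib-sleepycat", "SPARQLWrapper"},
}


def install_order(packages):
    """Return the packages with every dependency listed before its dependents."""
    # static_order() raises graphlib.CycleError if the graph has a cycle.
    return list(TopologicalSorter(packages).static_order())


order = install_order(layout)
print(order)
```

In this sketch `rdflib-core` always comes first and the umbrella `rdflib` package last; the relative order of the middle packages is not significant.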

@joernhees
Member

Let me put this first: any change wrt. this is a step forward.

I'm however a bit worried about splitting too much:

  • splits the (already small) developer community
  • more setup and maintenance overhead
  • harder to keep the full thing consistent

As an example: if we move sparql to rdflib-sparql, then something is changed in there, how do we make sure the whole test-suite is re-run? Sure, we can set up CI in each of those repos to test the full thing, but that definitely comes with some overhead.

So I'd keep those splits to a minimum. For example:

  • keep everything that only lives here in this repo in this repo
  • handle rdfa/mdata in one of two ways:
    • either internalize them (delete their standalone repos)
    • or remove them from this repo and make them their own packages (that we can depend on)

Moving sparqlstore to SPARQLWrapper is an option. However, if the main point of this is to remove the circular dependency between rdflib and SPARQLWrapper: that's actually gone (via extras_require). Installing rdflib no longer installs SPARQLWrapper, but to use sparqlstore you need to install it. If you install SPARQLWrapper, you get rdflib as a dependency.
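A minimal sketch of the extras mechanism mentioned above, written as the keyword arguments one might pass to `setuptools.setup()`. The extra's name `sparql` and the listed requirements are illustrative assumptions, not a copy of RDFLib's actual setup.py, and `requirements` is a hypothetical helper.

```python
# Hypothetical arguments for setuptools.setup(); names and lists are assumptions.
setup_kwargs = {
    "name": "rdflib",
    # Always installed with the package.
    "install_requires": ["isodate", "pyparsing"],
    # Installed only when the named extra is requested.
    "extras_require": {"sparql": ["SPARQLWrapper"]},
}


def requirements(kwargs, extras=()):
    """List what would be required for a plain install plus the given extras."""
    reqs = list(kwargs.get("install_requires", []))
    for extra in extras:
        reqs.extend(kwargs.get("extras_require", {})[extra])
    return reqs


print(requirements(setup_kwargs))
print(requirements(setup_kwargs, extras=["sparql"]))
```

With an extra named this way, the optional dependency would be requested with `pip install rdflib[sparql]`; a plain `pip install rdflib` would leave SPARQLWrapper out.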

@wikier
Member

wikier commented Jan 30, 2017

My two cents from the SPARQLWrapper team: we could move sparqlstore into the wrapper, no problem. Maybe we can do this move in the 2.x branch I plan to start sooner rather than later.

@gromgull
Member

For 5.0 we've removed both rdfa and microdata from core, see #828

@jpmccu
Contributor

jpmccu commented Dec 3, 2018

I'd like to object to the removal of rdfa from the core. It's one of the "official" RDF serializations endorsed by W3C, and pyrdfa3 doesn't seem to support Python 2.7. Yes, there are some of us out there who haven't moved up to Python 3 yet.

@ghost ghost locked and limited conversation to collaborators Dec 25, 2021
@ghost ghost converted this issue into discussion #1533 Dec 25, 2021


7 participants