Documenting work to rebase Titanium to use Archimedes. #1

Closed
zmaril opened this Issue · 35 comments

3 participants

@zmaril
Owner

This isn't an issue so much as an ongoing log of the work to rebase Titanium to use Archimedes. Commit messages are only so helpful and there are definitely some undocumented philosophies present in Archimedes.

@zmaril
Owner

The first thing to note is that Archimedes uses a var to store the current graph being accessed. The reasoning is that while it might be useful in some cases to have functions that take arbitrary graph objects as input, most of the time that would be over-engineering in the extreme. I think that (tev/add-vertex) is just as clear as (tev/add-vertex g). For the arbitrary-graph use case, archimedes.core has a macro called with-graph that lets a developer use a different graph whenever they want.
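To make the implicit-graph idea concrete, here is a minimal sketch of the pattern in plain Clojure. It assumes a dynamic var and a binding-based macro; the names *graph*, with-graph, and add-vertex mirror the ones mentioned above, but the bodies are illustrative stand-ins (a graph here is just an atom holding a vector), not Archimedes's actual source.

```clojure
;; Hypothetical stand-in for the graph var; not Archimedes's real code.
(def ^:dynamic *graph* nil)

(defmacro with-graph
  "Evaluates body with *graph* temporarily bound to g."
  [g & body]
  `(binding [*graph* ~g]
     ~@body))

(defn add-vertex
  "Adds a vertex (a property map) to whatever graph is current.
  In this sketch a graph is an atom holding a vector of maps."
  [props]
  (swap! *graph* conj props)
  props)

(def main-graph (atom []))
(def scratch-graph (atom []))

;; Set the default graph once, then call functions without passing it.
(alter-var-root #'*graph* (constantly main-graph))
(add-vertex {:name "a"})

;; Point the same function at a different graph for one block of code.
(with-graph scratch-graph
  (add-vertex {:name "b"}))
```

After this runs, main-graph holds only the first vertex and scratch-graph only the second, which is the "arbitrary graph when you need it" escape hatch described above.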

@zmaril
Owner

Second, there will probably be more deletions than additions here. I'm using the pull-all function from mikera/clojure-utils, which imports all of the functions and defs from another namespace into the current one. That means that lots of code from Archimedes will just be there in most of the namespaces. After the rebase is working, I'll go back and add comments or use the import-fn and import-def commands to give a clearer picture of exactly what is coming over from Archimedes.
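For readers who have not seen this kind of namespace import, the sketch below shows roughly what "pulling everything in" amounts to. It is written against clojure.core only; pull-all-sketch and the sketch.source / sketch.target namespaces are hypothetical names for illustration, and this is not the clojure-utils implementation.

```clojure
;; Build a small source namespace without switching *ns*.
(create-ns 'sketch.source)
(intern 'sketch.source
        (with-meta 'greet {:doc "Greets someone."})
        (fn [who] (str "hello, " who)))
(intern 'sketch.source 'answer 42)

(defn pull-all-sketch
  "Interns the current value of every public var of from-ns into to-ns,
  keeping docstrings and arglists. Hypothetical helper for illustration."
  [from-ns to-ns]
  (create-ns to-ns)
  (doseq [[sym v] (ns-publics from-ns)]
    (intern to-ns
            (with-meta sym (select-keys (meta v) [:doc :arglists]))
            @v)))

(pull-all-sketch 'sketch.source 'sketch.target)

;; The functions and defs are now reachable through the target namespace.
((ns-resolve 'sketch.target 'greet) "titanium")
```

The convenience is that callers only need the target namespace; the cost is that nothing in the target's source file says which names came from elsewhere, which is why a later pass with explicit per-function imports is planned.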

@zmaril
Owner

Namespaces work differently in Archimedes than they do in Titanium.

  • archimedes.vertex contains all of the functions for creating, updating, finding, and deleting vertices.
  • archimedes.edge contains all of the functions for creating, updating, finding, deleting, and asking questions about edges.
  • Both of the above libraries pull in archimedes.element and use all of the definitions in there as their own. It's a small distinction, but it becomes a pain to think about a vertex both as a vertex and as an element after a while. It's just a vertex and there should only be one namespace to deal with it as an object.
  • archimedes.core has all of the functions that deal with the graph object itself. That means transaction management (see later comment) and functions for setting the *graph* var all live in there.
  • archimedes.io has ways of printing out and reading various file formats.

And that's about it. Those namespaces cover much of the core parts of the blueprints library. I've found this separation very useful in my own work and intend to carry this pattern over into Titanium. It will mean changing many parts of the library though and restructuring much of the testing. In particular, there is no way of copying the graph var over. That means you will need to reference archimedes.core/*graph* whenever you want to use it directly (not titanium.core/*graph*).
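One plausible reading of why the graph var cannot simply be copied over: importing a def copies its current value into a fresh var, and a fresh var does not see later rebinding of the original. The sketch below illustrates that with two vars in one namespace; *graph* stands in for archimedes.core/*graph* and copied-graph for a pulled-in copy. Both names are stand-ins, and this is an assumption about the cause rather than something confirmed by the Archimedes source.

```clojure
;; Stand-in for archimedes.core/*graph*.
(def ^:dynamic *graph* nil)

;; Stand-in for a copy made at import time: it captures the value (nil),
;; not the var itself.
(def copied-graph *graph*)

(def seen
  (binding [*graph* :some-graph]
    ;; The original var sees the new binding; the copy does not.
    {:original *graph*
     :copy     copied-graph}))
```

Under that assumption, code that needs the current graph has to go through the original var, which matches the advice above to reference archimedes.core/*graph* directly.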

@zmaril
Owner

Property keys and labels are keywords. The reasoning is not much more than aesthetics at this point. For property keys though, I've had a burning desire to be able to call (:prop v1) where v1 is a vertex object. That would be awesome.
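A rough sketch of how (:prop v1) could be made to work: Clojure resolves keyword calls on any object that implements clojure.lang.ILookup, so a thin wrapper can translate keyword keys into string property names. Here a java.util.HashMap stands in for a real vertex, and lookup-wrapper is a hypothetical helper name; nothing below is taken from Archimedes or Titanium.

```clojure
(defn lookup-wrapper
  "Wraps a string-keyed Java map so that keywords can be called on it.
  Hypothetical helper; a real version would wrap a vertex instead."
  [^java.util.Map props]
  (reify clojure.lang.ILookup
    (valAt [_ k]
      (.get props (name k)))
    (valAt [_ k not-found]
      (let [n (name k)]
        (if (.containsKey props n)
          (.get props n)
          not-found)))))

;; Stand-in for a vertex whose properties are keyed by strings.
(def v1 (lookup-wrapper (java.util.HashMap. {"name" "zack"})))

(:name v1)        ; looks up the "name" property
(:age v1 :none)   ; falls back to the supplied default
```

The trade-off is that every vertex handed back to callers would need to be wrapped (or the vertex class itself would need to implement ILookup), so this is a design direction rather than a drop-in change.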

@zmaril
Owner

(ted/connect! v1 :label v2) looks better to me than (ted/connect! v1 v2 :label) and so that is how Archimedes works.

@zmaril
Owner

It looks like TitanInMemoryGraph is getting removed in the next version of Titan, so most of the tests will have to be rewritten to use embedded Cassandra or BerkeleyDB.

@zmaril
Owner

clojurewerkz.titanium.gpipe will be deleted and replaced with Ogre. clojurewerkz.titanium.elements will stick around and probably stay its own namespace. It deals with abstract properties of elements and so isn't something I think of as associated with working with a vertex or an edge.

@zmaril
Owner

Not touching clojurewerkz.titanium.query.

@zmaril
Owner

Removing gpipe means that about half of clojurewerkz.titanium.conversion will be removed as well.

@zmaril
Owner

I'll probably pull get-vertex, head-vertex, and tail-vertex into Archimedes as well. Those are Tinkerpop dependent. clojurewerkz.titanium.edges looks like it will be pretty empty.

@zmaril
Owner

Same with most of the functions in clojurewerkz.titanium.vertices. Those all depend on Tinkerpop and can be abstracted out into Archimedes. There really isn't anything special about Titan vertices or edges yet that isn't exposed in the Tinkerpop API.

@zmaril
Owner

Bringing over https://github.com/zmaril/hermes/blob/master/src/hermes/type.clj would probably be smart as well.

@zmaril
Owner

Take a look at Hermes source actually for how empty the library will be after it depends on Archimedes:
https://github.com/zmaril/hermes/tree/master/src/hermes

There is a huge amount of stuff that is just Tinkerpop.

@zmaril
Owner

The most important thing coming in from Archimedes is transaction management.
https://github.com/zmaril/hermes/wiki/Transaction-Management

This lets us do awesome stuff like retry-transact!:
https://github.com/zmaril/hermes/blob/master/test/hermes/core_test.clj#L55
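For anyone who does not want to follow the links, the general shape of a retrying transaction helper looks something like the sketch below. This is a generic retry loop in plain Clojure under my own assumptions: retry-sketch, its argument order, and the simulated failure are all hypothetical, and it is not the retry-transact! implementation from Hermes or Archimedes. A real version would also roll back the failed transaction before trying again.

```clojure
(defn retry-sketch
  "Calls (f) up to max-attempts times. When an attempt throws, sleeps for
  (wait-ms attempt) milliseconds and tries again. The failure from the
  final attempt is rethrown. Hypothetical helper for illustration."
  [max-attempts wait-ms f]
  (loop [attempt 1]
    ;; recur is not allowed inside catch, so record the outcome first.
    (let [result (try
                   {:value (f)}
                   (catch Exception e
                     (if (< attempt max-attempts)
                       {:error e}
                       (throw e))))]
      (if (contains? result :value)
        (:value result)
        (do
          (Thread/sleep (long (wait-ms attempt)))
          (recur (inc attempt)))))))

;; Simulate a body that fails twice (say, on a commit conflict) and then
;; succeeds on the third attempt.
(def calls (atom 0))

(def outcome
  (retry-sketch 3
                (fn [attempt] (* 10 attempt)) ; back off a little more each time
                (fn []
                  (when (< (swap! calls inc) 3)
                    (throw (ex-info "simulated commit conflict" {})))
                  :committed)))
```

Passing the wait time as a function of the attempt number keeps the back-off policy (fixed, linear, exponential) in the caller's hands.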

@michaelklishin

Oh, fantastic, thanks for starting this discussion.

So, point by point:

  • Having a var for implicit state is fine. Some ClojureWerkz projects do that, it works well in practice. For Titanium, binding should be more than sufficient for features such as explicit transactions.
  • I'm not familiar with those tools but we don't mind reasonable and well maintained 3rd party dependencies.
  • Namespaces organization sounds very similar to what Titanium has or at least what was the plan when I was designing namespaces. Sounds excellent.
  • Agree on (ted/connect! v1 :label v2). I always wanted to do (ted/connect! v1 -label-> v2), possibly via a separate graph population DSL.
  • For tests, if it's not hard, let's try embedded Cassandra. Part of the appeal of Titan is that it can be backed by Cassandra and HBase. But if that takes more effort than it's worth, we'll just provide a few monger.testkit kind of functions to make working with BerkeleyDB and unique path generation trivial. ClojureWerkz Support depends on Guava that has very nice utilities for temporary directories.
  • Agreed on gpipe.
  • If some namespaces in Titanium largely copy Archimedes, feel free to pull functions from them and remove them altogether. We will simply update the docs.
  • I did not get to the types, if you feel Hermes' implementation is nice and doesn't have much room for improvement, just bring it in as is.
  • Having all Blueprints-related stuff in Archimedes makes total sense to me
  • Again, I did not get to more sophisticated transaction operations before Titan 0.3 was announced. Sounds good. We may tweak the API a bit as I (for some reason I can't easily articulate) liked Titanium's API names a bit better.

Thanks Zach!

@michaelklishin

If you need a GH milestone created for this, let me know. If not, we don't use them actively for ClojureWerkz.

@michaelklishin

By the way, @ifesdjeen may know how to best deal with using embedded Cassandra for tests. Sounds like a feature Cassaforte may want to make really easy, too ;)

@ifesdjeen
Owner

Honestly, I've never used embedded cassandra for tests :/ I only know how to spawn local cluster with cluster manager...

@ifesdjeen
Owner

I'll catch up with @michaelklishin on jabber later today. I can definitely help out with embedded cassandra, just not sure if/why running a separate clean version is not enough

@zmaril
Owner

No worries! I know how to do embedded cassandra for Titan. Really easy. Already did it in Hermes.

@zmaril
Owner

"Agree on (ted/connect! v1 :label v2). I always wanted to do (ted/connect! v1 -label-> v2), possibly via a separate graph population DSL."

An interesting future project would be writing a way to easily specify schemas for the database. Whenever I do Titan projects, I always end up with a titan.clj file that is largely just a schema. Once types are in there, let's see if there is any way to simplify the creation of a schema. types.clj is already pretty bare bones, mostly operating on maps of keywords, classes, and strings.
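As a very rough idea of what a declarative schema could look like, the sketch below describes property keys and edge labels as a plain map of keywords, classes, and options, then flattens it into an ordered list of creation steps. Everything here is hypothetical: the map layout, the :unique? option, the schema-steps function, and the step names are invented for illustration and do not correspond to types.clj or to any Titan API.

```clojure
;; Hypothetical schema description: property keys and edge labels as data.
(def schema
  {:keys   {:name {:type String :unique? true}
            :age  {:type Long}}
   :labels {:knows {}
            :likes {}}})

(defn schema-steps
  "Turns a schema map into an ordered seq of creation steps. A real
  version would call the type-creation functions instead of returning
  data. Hypothetical helper for illustration."
  [schema]
  (concat
    (for [[k opts] (sort-by key (:keys schema))]
      [:create-key (name k) (:type opts) (boolean (:unique? opts))])
    (for [label (sort (keys (:labels schema)))]
      [:create-label (name label)])))

(schema-steps schema)
```

Keeping the schema as data means the same map could be checked against an existing database, printed as documentation, or applied to a fresh graph, which is most of what a per-project titan.clj file ends up doing by hand.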

@zmaril
Owner

http://www.tinkerpop.com/docs/javadocs/blueprints/2.2.0/com/tinkerpop/blueprints/Query.html
http://thinkaurelius.github.com/titan/javadoc/current/com/thinkaurelius/titan/core/TitanQuery.html

query.clj will be emptied out at first as well. All of the current functions deal with Blueprints-specific stuff. However, there are a ton of Titan-specific methods that can and should be put in there at a future date.

@zmaril
Owner

Currently, Titan 0.3.0 has a ton of breaking changes. Read about them here!
https://groups.google.com/forum/?fromgroups=#!topic/aureliusgraphs/vlRg0ey735g

My main goal right now is to get Titanium working on Archimedes for 0.2.0, with decent test coverage backed by embedded Cassandra. After that, it should be a hop, skip, and a jump to getting Titan 0.2.1 working. The pain of upgrading to 0.3.0 should be mitigated by embedded Cassandra tests (no more in-memory graphs, it seems). I know for sure that some changes will need to be made to types.clj based on what Matthias has said online. It looks like indexing is going to be the biggest change.

@zmaril
Owner

From the previously linked email/announcement:
- Properties on vertices can have properties on them (mind boggling...) which is very useful for version, timestamping, etc

This could be huge. I need to look into this more, but this might make it really easy to make an immutable graph database. Another project for another day though.

@zmaril
Owner

Do these methods need to exist if we are using Ogre?

(defn edges-of
  "Returns edges that this vertex is part of with direction and with given labels"
  [^Vertex v direction labels]
  (.getEdges v (to-edge-direction direction) (into-array String labels)))

(defn all-edges-of
  "Returns edges that this vertex is part of, with given labels"
  [^Vertex v labels]
  (.getEdges v Direction/BOTH (into-array String labels)))

(defn outgoing-edges-of
  "Returns outgoing (outbound) edges that this vertex is part of, with given labels"
  [^Vertex v labels]
  (.getEdges v Direction/OUT (into-array String labels)))

(defn incoming-edges-of
  "Returns incoming (inbound) edges that this vertex is part of, with given labels"
  [^Vertex v labels]
  (.getEdges v Direction/IN (into-array String labels)))

(defn connected-vertices-of
  [^Vertex v direction labels]
  (.getVertices v (to-edge-direction direction) (into-array String labels)))

(defn connected-out-vertices
  [^Vertex v labels]
  (.getVertices v Direction/OUT (into-array String labels)))

(defn connected-in-vertices
  [^Vertex v labels]
  (.getVertices v Direction/IN (into-array String labels)))

@michaelklishin

@zmaril well, for the sake of completeness, why not. This API is part of Blueprints so I think we should cover it.

@michaelklishin

These were added primarily because Neocons has similar functions.

@zmaril
Owner

Makes sense. It might be a good idea to rewrite these to use query internally if blueprints doesn't already do that.

@zmaril
Owner

With that, Archimedes has now pulled in all of the Blueprints specific material from Titanium.
zmaril/archimedes@0.0.5...2b451a0

Everything has tests except the vertex methods I just pulled in. Starting work on rewriting the tests in Titanium to use the new APIs.

@zmaril
Owner

All of the old tests (that made sense to port) now pass. Next up is switching to embedded cassandra and bringing over any tests from Hermes that cover things that have been missed. After that, the rebase should be finished and ready to be merged.

@zmaril
Owner

I've brought over all of the embedded tests from Hermes. They all pass. So the first-pass, quick-and-dirty merge is done. Next up is cleaning up and looking through the tests again.

@ifesdjeen
Owner

really good progress man!!

@zmaril
Owner

So, the rebasing is mostly done. Many new tests have been added and ported over. Embedded cassandra is used for some of the testing (all stuff brought over from Hermes). All the tests pass and the code says very clearly what it is importing over from Archimedes. I'd say Titanium is now far past where both Titanium and Hermes used to be. Warn-on-reflection is off because I haven't put type hints into Ogre or Archimedes for the most part yet.

The trick now will be trying to upgrade to 0.2.1 or 0.3.0.

@michaelklishin

Maybe we should merge this to master already?

@zmaril
Owner

I'll update the change log and merge this. Almost every API has changed due to rebasing, and so I'm not sure what level of detail should be provided in the change log. I'll try to give an accurate picture of what happened that isn't too long.

@zmaril zmaril closed this