slender_t

SlenderT is a triple store. It is simple, for simple jobs. I decided to replace a recommendation engine with a set of triples and some simple query tools, and this is the result. I didn’t want to break out the bigger tools for something that will only have hundreds of triples.

The ideas started with “Programming the Semantic Web” by Toby Segaran et al. Toby’s code was in Python; this is a Ruby version. I’ve used some Ruby idioms and created an IRB application (slender_t).

Usage

From the command line:

tmp : slender_t
Loading SlenderT version: 0.1.1
>> st = SlenderT.new
# => SlenderT: []
>> st.add :david, :writes, :code
# => [:david, :writes, :code]
>> st.add :david, :name, "David Richards"
# => [:david, :name, "David Richards"]
>> st.add :code, :name, 'Software'
# => [:code, :name, "Software"]
>> st.find nil, :name
# => [[:david, :name, "David Richards"], [:code, :name, "Software"]]
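
To make the add and find calls above concrete, here is a minimal sketch of a triple store where nil acts as a wildcard in find. The class name TinyTripleStore and its internals are hypothetical and only illustrate the idea; this is not SlenderT's actual implementation.

```ruby
# Hypothetical sketch, not SlenderT's code: triples live in a plain array.
class TinyTripleStore
  def initialize
    @triples = []
  end

  # Store a (subject, predicate, object) triple once and return it.
  def add(subject, predicate, object)
    triple = [subject, predicate, object]
    @triples << triple unless @triples.include?(triple)
    triple
  end

  # Return every triple matching the pattern; nil matches anything.
  def find(subject = nil, predicate = nil, object = nil)
    pattern = [subject, predicate, object]
    @triples.select do |triple|
      pattern.zip(triple).all? { |want, have| want.nil? || want == have }
    end
  end
end

store = TinyTripleStore.new
store.add :david, :writes, :code
store.add :david, :name, "David Richards"
store.add :code, :name, "Software"
names = store.find(nil, :name)
```

Here names holds the two :name triples, in insertion order. A linear scan like this is fine for a few hundred triples, which is the size this library was written for.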

I’ve tried this with some larger data sets. The business data set that O’Reilly provides loads about 36,500 triples in 22.3 seconds. I’d say a data set of around that size is where this library starts to need replacing with something a little more robust, such as Sesame or Redland. It’s up to you, of course. I haven’t made much effort at optimization, with the exception of the add_to_index method, which was exponential before and is nearly linear now.
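
One way an index keeps adds cheap is to store each triple in nested hashes keyed by its parts, so each add is a few hash lookups rather than a scan. The sketch below assumes that general technique; IndexedStore, objects, and subjects are hypothetical names, and this is not the actual add_to_index code.

```ruby
require 'set'

# Hypothetical sketch; not SlenderT's API or its add_to_index method.
class IndexedStore
  def initialize
    # subject => predicate => Set of objects
    @spo = Hash.new { |h, s| h[s] = Hash.new { |g, p| g[p] = Set.new } }
    # predicate => object => Set of subjects
    @pos = Hash.new { |h, p| h[p] = Hash.new { |g, o| g[o] = Set.new } }
  end

  # Each add touches two indexes; the cost does not grow with store size.
  def add(s, p, o)
    @spo[s][p] << o
    @pos[p][o] << s
    [s, p, o]
  end

  # Objects for a known subject and predicate, without scanning all triples.
  def objects(s, p)
    @spo.fetch(s, {}).fetch(p, Set.new).to_a
  end

  # Subjects for a known predicate and object.
  def subjects(p, o)
    @pos.fetch(p, {}).fetch(o, Set.new).to_a
  end
end

idx = IndexedStore.new
idx.add 'BSC', 'name', 'Bear Stearns'
idx.add 'BSC', 'industry', 'Investment Banking'
bsc_names = idx.objects('BSC', 'name')
bankers = idx.subjects('industry', 'Investment Banking')
```

The trade-off is memory: every triple is stored once per index, which is acceptable at the sizes discussed here.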

I’ve implemented the query concept from the book. The syntax is a little strange, but it works fine. Here is an example:

>> db.query(['?company', 'headquarters', 'New_York_New_York'], 
?> ['?company', 'industry', 'Investment Banking'],            
?> ['?contribution', 'contributor', '?company'],              
?> ['?contribution', 'recipient', 'Orrin Hatch'],             
?> ['?contribution', 'amount', '?dollars'])                   
=> [{"?contribution"=>"contrib285", "?dollars"=>30700.0, "?company"=>"BSC"}]

What this means is that BSC contributed $30,700 to Senator Orrin Hatch. If I dig around a little, I can find out some more information.
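
The query above works by binding each '?variable' to values that satisfy every clause at once. Here is a minimal sketch of that idea as a naive nested-loop match over an array of triples. The helper names (variable?, unify, query) and the tiny TRIPLES sample, hand-built from the result shown above, are assumptions for illustration; this is not SlenderT's query code.

```ruby
# Hypothetical sample data, inferred from the query result shown above.
TRIPLES = [
  ['BSC', 'headquarters', 'New_York_New_York'],
  ['BSC', 'industry', 'Investment Banking'],
  ['contrib285', 'contributor', 'BSC'],
  ['contrib285', 'recipient', 'Orrin Hatch'],
  ['contrib285', 'amount', 30700.0]
].freeze

# A term is a variable when it is a string starting with '?'.
def variable?(term)
  term.is_a?(String) && term.start_with?('?')
end

# Extend one binding so that clause matches triple; nil if they conflict.
def unify(clause, triple, binding)
  result = binding.dup
  clause.zip(triple).each do |term, value|
    if variable?(term)
      return nil if result.key?(term) && result[term] != value
      result[term] = value
    elsif term != value
      return nil
    end
  end
  result
end

# Apply clauses one at a time, keeping only bindings that still match.
def query(triples, *clauses)
  clauses.reduce([{}]) do |bindings, clause|
    bindings.flat_map do |binding|
      triples.map { |triple| unify(clause, triple, binding) }.compact
    end
  end
end

found = query(TRIPLES,
              ['?company', 'headquarters', 'New_York_New_York'],
              ['?contribution', 'contributor', '?company'],
              ['?contribution', 'amount', '?dollars'])
```

With this sample, found contains a single binding: ?company is BSC, ?contribution is contrib285, and ?dollars is 30700.0. Because ?company appears in two clauses, only contributions made by a New York company survive the second step.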

>> val = db.query(['?company', 'headquarters', 'New_York_New_York'], 
?> ['?company', 'industry', 'Investment Banking'],            
?> ['?contribution', 'contributor', '?company'],              
?> ['?contribution', 'recipient', '?recipient'],             
?> ['?contribution', 'amount', '?dollars'])
>> val.size
=> 110

That tells us that we know of 110 contributions from New York investment banks.

>> db.find('BSC', 'name', nil)
=> [["BSC", "name", "Bear Stearns"]]

That tells us that BSC is Bear Stearns.

>> val = db.query(['?contribution', 'contributor', 'BSC'],              
?> ['?contribution', 'recipient', '?recipient'],                        
?> ['?contribution', 'amount', '?dollars'])                             
=> [{"?contribution"=>"contrib285", "?dollars"=>30700.0, "?recipient"=>"Orrin Hatch"}, {"?contribution"=>"contrib284",
"?dollars"=>168335.0, "?recipient"=>"Hillary Rodham Clinton"}, {"?contribution"=>"contrib287", "?dollars"=>5600.0,
"?recipient"=>"Christopher Shays"}, {"?contribution"=>"contrib288", "?dollars"=>205100.0, "?recipient"=>"Christopher Dodd"},
{"?contribution"=>"contrib290", "?dollars"=>17300.0, "?recipient"=>"Frank Lautenberg"}, {"?contribution"=>"contrib286",
"?dollars"=>5000.0, "?recipient"=>"Barney Frank"}, {"?contribution"=>"contrib289", "?dollars"=>13000.0, "?recipient"=>"Michael
Dean Crapo"}, {"?contribution"=>"contrib294", "?dollars"=>4600.0, "?recipient"=>"Pete Sessions"},
{"?contribution"=>"contrib295", "?dollars"=>5000.0, "?recipient"=>"Paul E. Kanjorski"}, {"?contribution"=>"contrib292",
"?dollars"=>6600.0, "?recipient"=>"Nita Lowey"}, {"?contribution"=>"contrib293", "?dollars"=>5000.0, "?recipient"=>"Deborah
Pryce"}, {"?contribution"=>"contrib291", "?dollars"=>102260.0, "?recipient"=>"Joe Lieberman"}]
>> val.size
=> 12

That tells us that we know about 12 contributions that Bear Stearns made, including to Hillary Rodham Clinton for $168,335 and Joe Lieberman for $102,260.
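
Since query returns an array of plain hashes, ordinary Enumerable methods are enough to summarize it. This sketch copies the recipients and amounts from the result above into a literal (the contribution ids are dropped for brevity) and totals them; the variable names are only illustrative.

```ruby
# Recipients and amounts copied from the query result shown above.
val = [
  { '?dollars' => 30700.0,  '?recipient' => 'Orrin Hatch' },
  { '?dollars' => 168335.0, '?recipient' => 'Hillary Rodham Clinton' },
  { '?dollars' => 5600.0,   '?recipient' => 'Christopher Shays' },
  { '?dollars' => 205100.0, '?recipient' => 'Christopher Dodd' },
  { '?dollars' => 17300.0,  '?recipient' => 'Frank Lautenberg' },
  { '?dollars' => 5000.0,   '?recipient' => 'Barney Frank' },
  { '?dollars' => 13000.0,  '?recipient' => 'Michael Dean Crapo' },
  { '?dollars' => 4600.0,   '?recipient' => 'Pete Sessions' },
  { '?dollars' => 5000.0,   '?recipient' => 'Paul E. Kanjorski' },
  { '?dollars' => 6600.0,   '?recipient' => 'Nita Lowey' },
  { '?dollars' => 5000.0,   '?recipient' => 'Deborah Pryce' },
  { '?dollars' => 102260.0, '?recipient' => 'Joe Lieberman' }
]

# Total of all twelve contributions.
total = val.sum { |row| row['?dollars'] }

# The three largest contributions, biggest first.
top_three = val.max_by(3) { |row| row['?dollars'] }
                .map { |row| row['?recipient'] }
```

For these twelve rows the total is 568495.0, and the three largest recipients are Christopher Dodd, Hillary Rodham Clinton, and Joe Lieberman.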

You get the picture. I think I’ll create some classes soon to play with some inference, graph merging, and query patterns.

You should also know that you can load and save your triple stores:

# st = SlenderT.load(some_csv_content_filename_or_url)
# st.save(some_filename)
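
As a rough picture of what a CSV round trip for triples can look like, here is a sketch using Ruby's csv library with one triple per row. The helper names triples_to_csv and triples_from_csv and the three-column layout are assumptions; SlenderT's own load and save may format things differently.

```ruby
require 'csv'

# Hypothetical helpers, not SlenderT's load/save implementation.

# Write one triple per CSV row: subject, predicate, object.
def triples_to_csv(triples)
  CSV.generate do |csv|
    triples.each { |triple| csv << triple }
  end
end

# Read rows back as three-element arrays; every value comes back as a String.
def triples_from_csv(text)
  CSV.parse(text).map { |row| row.first(3) }
end

triples = [
  ['BSC', 'name', 'Bear Stearns'],
  ['contrib285', 'recipient', 'Orrin Hatch']
]

text = triples_to_csv(triples)
restored = triples_from_csv(text)
```

Note that plain CSV does not preserve types: symbols and floats written this way are read back as strings, so a store that mixes types needs its own conversion step.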

Also, I was not trying to optimize anything. There is a really slow operation in the add method that I should probably fix, at least. This isn’t meant to be a fast data store, just a simple one. You may want to look at Sesame or Redland if you want to use a data set over 10,000 or 20,000 records. Like I said, I needed 200 or 300 triples for a mini-recommendation engine, and I just couldn’t justify a full-on RDF system for that.

I’m also very interested in thinking about queries on graphs. I have several other projects that I think could benefit from some thoughts on a simple query syntax. I’m thinking of marginal, fathom, and overalls. These are all belief maintenance systems of some sort (joint distribution tables, Bayesian networks, and nonmonotonic reasoning, respectively) that got tabled until I thought through a better way to interface with them.

Note on Patches/Pull Requests

Fork the project.
Make your feature addition or bug fix.
Add tests for it. This is important so I don’t break it in a future version unintentionally.
Commit, do not mess with rakefile, version, or history. (If you want to have your own version, that is fine, but bump version in a commit by itself I can ignore when I pull.)
Send me a pull request. Bonus points for topic branches.

davidrichards/slender_t

Folders and files

Latest commit

History

Repository files navigation

slender_t¶ ↑

Usage¶ ↑

Note on Patches/Pull Requests¶ ↑

Copyright¶ ↑

About

Resources

License

Stars

Watchers

Forks

Languages