# JRuby Mahout
JRuby Mahout is a gem that brings the power of Apache Mahout to JRuby. Mahout is a machine learning library written in Java that handles recommendation, clustering, and classification problems at scale. Until now it has been difficult to use in Ruby projects: you had to implement Java interfaces in JRuby yourself, which is slow going, especially if you have only just started exploring machine learning.

The goal of this library is to make machine learning at scale in JRuby projects simple.

## Quick Overview
This is an early version of the gem, and it supports only Mahout recommendations. It also includes a simple Postgres manager that can be used to manage the tables that hold recommendation data. Unfortunately, ActiveRecord (AR) cannot be used with Mahout, because AR operates at a much higher level and adds overhead that becomes critical when dealing with millions of records in real time.

## Get Mahout
First, download the Mahout library from one of the [mirrors](http://www.apache.org/dyn/closer.cgi/mahout/). JRuby Mahout only supports Mahout 0.7 at this point.

## Get Postgres JDBC Adapter
If you want to use a database for recommendations, install the [JDBC driver for Postgres](http://jdbc.postgresql.org/download.html). Alternatively, you can use file-based recommendations.

## Installation
### 1. Set the environment variable `MAHOUT_DIR` to point at your Mahout installation.
### 2. Add the gem to your `Gemfile`
```ruby
platform :jruby do
  gem "jruby_mahout"
end
```
### 3. Run `bundle install`.

## Brief Introduction
I am planning to add more examples covering JRuby Mahout use cases to [this repo](https://github.com/vasinov/jruby_mahout-examples) soon.

First, define the `MAHOUT_DIR` environment variable for your Mahout installation. For example:

```
export MAHOUT_DIR=/bin/mahout
```

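If the variable is missing or points at the wrong place, it can be easier to find out before loading the gem. The snippet below is a minimal sketch of such a check; `mahout_dir_problem` is a hypothetical helper written for this example and is not part of jruby_mahout.

```ruby
# Hypothetical helper (not part of jruby_mahout): returns nil when the value
# looks usable, or a short message describing what is wrong with it.
def mahout_dir_problem(value)
  return "MAHOUT_DIR is not set" if value.nil? || value.strip.empty?
  return "MAHOUT_DIR does not point to a directory: #{value}" unless File.directory?(value)

  nil
end

# Warn on standard error instead of failing, so the check stays optional.
problem = mahout_dir_problem(ENV["MAHOUT_DIR"])
warn problem if problem
```

The check only confirms that the path exists and is a directory. It does not verify that the directory actually contains a Mahout 0.7 installation.
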
The easiest way to start working with JRuby Mahout recommendations is to initialize a recommender:
```ruby
require 'jruby_mahout'
recommender = JrubyMahout::Recommender.new("PearsonCorrelationSimilarity", 5, "GenericUserBasedRecommender", false)
```

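For readers new to the names in that constructor call: `PearsonCorrelationSimilarity` is a Mahout similarity measure based on the Pearson correlation, which scores how closely two users' ratings move together on a scale from -1 to 1. The plain-Ruby sketch below illustrates the textbook formula only. It does not call Mahout or the gem, the `pearson` helper and the sample ratings are made up for illustration, and Mahout's own implementation may differ in details.

```ruby
# Illustrative only: textbook Pearson correlation over the items two users
# have both rated. Each argument is a Hash of item_id => rating.
def pearson(ratings_a, ratings_b)
  common = ratings_a.keys & ratings_b.keys
  n = common.size
  return nil if n < 2 # not enough overlap to measure a correlation

  xs = common.map { |item| ratings_a[item].to_f }
  ys = common.map { |item| ratings_b[item].to_f }
  mean_x = xs.sum / n
  mean_y = ys.sum / n

  covariance = xs.zip(ys).sum { |x, y| (x - mean_x) * (y - mean_y) }
  spread_x = xs.sum { |x| (x - mean_x)**2 }
  spread_y = ys.sum { |y| (y - mean_y)**2 }
  denominator = Math.sqrt(spread_x * spread_y)
  return nil if denominator.zero? # one user rated every common item the same

  covariance / denominator
end

alice = { 101 => 1, 102 => 2, 103 => 3 }
bob   = { 101 => 2, 102 => 4, 103 => 6 } # same taste, different scale
carol = { 101 => 3, 102 => 2, 103 => 1 } # opposite taste

puts pearson(alice, bob)   # 1.0
puts pearson(alice, carol) # -1.0
```

A user-based recommender uses scores like these to pick the users most similar to the one being served, then recommends items those neighbours rated highly.
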
Set up a data model:
```ruby
recommender.data_model = JrubyMahout::DataModel.new("file", { :file_path => "recommender_data.csv" }).data_model
```

Then get recommendations:
```ruby
puts recommender.recommend(2, 10, nil) # 10 recommendations for user with id = 2
```

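The `recommender_data.csv` file used for the data model has to exist before any of this works. Mahout's file-based data model reads one preference per line in the form `userID,itemID,preference`, with no header row. Assuming the gem's `"file"` data model passes the file straight to Mahout, a small sample can be generated as sketched below. The user IDs, item IDs, and ratings are invented for illustration.

```ruby
require "csv"

# Invented sample preferences: [user_id, item_id, preference]
rows = [
  [1, 101, 5.0],
  [1, 102, 3.0],
  [2, 101, 2.0],
  [2, 103, 4.5],
  [3, 102, 4.0]
]

# Build the file contents: one "userID,itemID,preference" line per row.
csv_text = CSV.generate do |csv|
  rows.each { |row| csv << row }
end

puts csv_text
```

Saving the output with `File.write("recommender_data.csv", csv_text)` gives a file with the name used in the data model example. A data set this small is only useful for checking that the pieces are wired together, not for meaningful recommendations.
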
You can evaluate your recommender to see how well it performs:
```ruby
puts recommender.evaluate(0.7, 0.3)
```

The closer the score is to zero, the better.

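A lower score is better because the score measures prediction error. This README does not say which metric the gem uses. Assuming it is an average absolute difference between predicted and actual preferences (one of the evaluators Mahout provides), the idea can be sketched in plain Ruby as follows. The `average_absolute_difference` helper and the numbers are illustrative and are not part of the gem.

```ruby
# Illustrative only: mean of |actual - predicted| over held-out preferences.
def average_absolute_difference(actual, predicted)
  if actual.empty? || actual.size != predicted.size
    raise ArgumentError, "inputs must be non-empty and the same length"
  end

  total_error = actual.zip(predicted).sum { |a, b| (a - b).abs }
  total_error / actual.size.to_f
end

actual    = [5.0, 3.0, 2.0, 4.0] # preferences held back from training
predicted = [4.0, 3.5, 2.0, 4.5] # what the recommender estimated for them

puts average_absolute_difference(actual, predicted) # 0.5
puts average_absolute_difference(actual, actual)    # 0.0 for perfect predictions
```

Under this assumption, a score of 0.5 would mean the recommender's estimates are off by half a rating point on average.
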
## Advanced
I am working on a series of articles on how to use JRuby Mahout in real-world projects. This is the first one in the series:
- [Machine Learning with Ruby, Part One](http://www.vasinov.com/blog/machine-learning-with-ruby-part-one)

## Development Plans
There are several things this gem should support before it can be used in production. Some of them are:
- Hadoop integration
- Clustering support
- Classification support
- Better docs

If you feel like you can help, please do.

## Testing
JRuby Mahout is thoroughly tested with RSpec.

## Contribute
- Fork the project.
- Write code for a feature or bug fix.
- Add RSpec tests for it.
- Commit your work, but do not change the Rakefile or the version.
- Submit a pull request.