Machine Learning, what language should I use?

I'm trying to figure out the best programming language to use for Machine Learning. My three candidates are:

Octave (MATLAB). This is the language chosen by Andrew Ng for his excellent Machine Learning course at Stanford. Andrew has stated this was a carefully considered decision based on his experience that students learn more quickly in this high-level language.
Python. This seems to be the most popular choice for Machine Learning in industry.
JavaScript. I considered JavaScript because the language is so ubiquitous and convenient. I've been doing a lot of JavaScript lately (haven't we all?) and I knew I could show off my Machine Learning programs in a browser directly if I went down this path.

Linear Algebra

One of the first surprises you experience when you dig into (the current generation of) Machine Learning techniques is that, under the hood, they're largely just applied Linear Algebra. Nothing fancy or difficult. Just good old Matrices and Vectors from high school mathematics.

I still remember with some fondness my Linear Algebra textbook from Maths II in my senior year (over 25 years ago?!). It was called Matrices and Vectors and it had a floppy, green cover with yellowish paper inside.

So my titular question has now become: Linear Algebra, what language should I use?

The Experiment

I implemented a typical Machine Learning problem in each language.

Octave has matrix operations built in. For Python I used numpy (imported as np in the snippets below), and for JavaScript I used mathjs (available as math).

Let's see how it turned out. We'll compare three key parts of the solution in each language.

1. Processing Training Data

Assume the training data has been loaded into the variable data. This code separates the data into two column vectors and counts the number of training examples m.

Octave

X = data(:, 1);
y = data(:, 2);
m = length(y);

Python

X = data[:, 0:1]
y = data[:, 1:2]
m = len(y)

JavaScript

var m = math.size(data)[0]; // row count; must be known before building the row range
var X = math.subset(data, math.index(math.range(0, m), 0));
var y = math.subset(data, math.index(math.range(0, m), 1));

You can see that Octave and Python look quite similar. The tricky part of the Python solution was to use slice indexing (0:1 instead of 0) to maintain rank-2 arrays. The JavaScript solution is very verbose in comparison.
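
To make the slicing point concrete, here is a small sketch using a made-up 3x2 data array (the values are purely illustrative and not from the experiment). It shows that an integer index drops to a rank-1 array, while a slice keeps a column vector.

```python
import numpy as np

# Hypothetical training data: one feature column and one target column.
data = np.array([[1.0, 2.0],
                 [2.0, 4.1],
                 [3.0, 5.9]])

flat = data[:, 0]    # integer index: rank-1 array, shape (3,)
X = data[:, 0:1]     # slice index: rank-2 column vector, shape (3, 1)
y = data[:, 1:2]
m = len(y)           # number of training examples

print(flat.shape, X.shape, y.shape, m)
```

Keeping X and y as column vectors is what lets the later matrix products line up without extra reshaping.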

2. Cost Function

Now let's examine a typical linear regression cost function in each language.
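
For reference, the quantity that all three implementations below compute can be written as follows. The notation is inferred from the code rather than stated in the experiment: X is the matrix of inputs with one row per training example, y is the column vector of targets, theta is the parameter vector, and m is the number of training examples.

```latex
J(\theta) = \frac{1}{2m} \, (X\theta - y)^{\top} (X\theta - y)
```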

Octave

function J = computeCost(X, y, theta)
    m = length(y);  % number of training examples (not visible from the caller's workspace)
    h = X * theta;
    err = h - y;
    J = 1 / (2 * m) * err' * err;
end

Python

def computeCost(X, y, theta):
    m = len(y)  # number of training examples
    h = np.dot(X, theta)
    err = h - y
    return 1.0 / (2.0 * m) * np.dot(err.T, err)

JavaScript

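function computeCost(X, y, theta) {
    var m = math.size(y)[0]; // number of training examples
    var h = math.multiply(X, theta);
    var err = math.subtract(h, y);
    // Scale with math.multiply rather than relying on implicit array-to-number coercion.
    return math.multiply(1 / (2 * m), math.multiply(math.transpose(err), err));
}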

The Octave solution is wonderfully concise and elegant.

The Python solution comes close. We use numpy's array datatype rather than its matrix datatype, as numpy itself recommends. The downside is that the * operator multiplies arrays element by element, so we must resort to the function call dot() to perform matrix multiplication. This clutters things somewhat and is a bit of a drag.
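
If you are on Python 3.5 or later with a reasonably recent numpy, the @ matrix-multiplication operator can stand in for dot(). The sketch below is an alternative spelling, not the code used in the experiment; the function name compute_cost_at and the sample values are made up for illustration. It also pulls the 1x1 result out as a plain float.

```python
import numpy as np

def compute_cost_at(X, y, theta):
    """Same cost as computeCost, written with the @ operator (hypothetical helper)."""
    m = len(y)
    err = X @ theta - y
    # err.T @ err is a 1x1 array; .item() extracts the scalar.
    return (err.T @ err).item() / (2.0 * m)

# Illustrative data: a column of ones for the intercept plus one feature.
X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([[1.0], [2.0], [3.0]])
theta = np.zeros((2, 1))

print(compute_cost_at(X, y, theta))
```

With @ available, the Python version reads almost as closely to the mathematics as the Octave one does.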

Once again the JavaScript solution is quite ugly. Every matrix operation requires a function call: multiply(), subtract(), transpose().

3. Gradient Descent
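
Each implementation below repeats the same update for num_iters iterations. Written out (with notation inferred from the code, where alpha is the learning rate and m the number of training examples), one step is:

```latex
\theta \leftarrow \theta - \frac{\alpha}{m} \, X^{\top} (X\theta - y)
```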

Octave

function theta = gradientDescent(X, y, theta, alpha, num_iters)
    m = length(y);  % number of training examples
    for iter = 1:num_iters
        h = X * theta;
        err = h - y;
        theta_change = alpha / m * (X' * err);
        theta = theta - theta_change;
    end
end

Python

def gradientDescent(X, y, theta, alpha, num_iters):
    m = len(y)  # number of training examples
    for i in range(num_iters):
        h = np.dot(X, theta)
        err = h - y
        theta_change = alpha / m * np.dot(X.T, err)
        theta = theta - theta_change
    return theta

JavaScript

function gradientDescent(X, y, theta, alpha, num_iters) {
    var m = math.size(y)[0]; // number of training examples
    for (var i = 0; i < num_iters; i++) {
        var h = math.multiply(X, theta);
        var err = math.subtract(h, y);
        var theta_change = math.multiply(alpha / m, math.multiply(math.transpose(X), err));
        theta = math.subtract(theta, theta_change);
    }
    return theta;
}

The results are very similar to those for the Cost Function. Octave is the most elegant. Python is OK apart from that annoying dot() function call. And JavaScript is a hot mess.
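
As a sanity check, the Python version can be run on synthetic data and compared against numpy's least-squares solver. Everything other than the gradientDescent function itself (the data, the learning rate, the iteration count, the column of ones added for an intercept) is an assumption made for this sketch, not part of the experiment.

```python
import numpy as np

def gradientDescent(X, y, theta, alpha, num_iters):
    m = len(y)  # number of training examples
    for _ in range(num_iters):
        err = np.dot(X, theta) - y
        theta = theta - alpha / m * np.dot(X.T, err)
    return theta

# Synthetic, noise-free data on the line y = 1 + 2x (illustrative only).
x = np.linspace(0.0, 1.0, 20).reshape(-1, 1)
X = np.hstack([np.ones_like(x), x])  # prepend a column of ones for the intercept
y = 1.0 + 2.0 * x

theta = gradientDescent(X, y, np.zeros((2, 1)), alpha=0.5, num_iters=5000)

# Closed-form least-squares fit to compare against.
theta_exact, *_ = np.linalg.lstsq(X, y, rcond=None)

print(theta.ravel(), theta_exact.ravel())
```

If the two printed vectors agree closely, the update rule is doing what it should. A learning rate that is too large for a given dataset makes theta diverge instead, so alpha usually needs some tuning.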

Conclusion

Octave has the simplest and cleanest syntax for performing Linear Algebra. It's a great choice for learning, studying, and prototyping Machine Learning problems.

Python is close behind Octave in succinctness. It has other things going for it, however: it's a mainstream programming language with a huge user base and massive library support. This makes it the go-to choice for Machine Learning in industry.

JavaScript is a clunky choice for performing Linear Algebra / Machine Learning. This hasn't stopped motivated people from going ahead and doing it anyway, so your mileage may vary.

In conclusion, if you are a researcher and/or interested in understanding and manipulating Machine Learning algorithms at a low level, then consider working in Octave.

If you are in industry and are applying Machine Learning algorithms at scale, then Python might be the right choice.

It's probably best to avoid JavaScript if you can.

benhauser/ml-lab
