Closed
31 commits
ce60921
Create ISSUE_TEMPLATE.md
crockpotveggies Apr 4, 2018
e8e1be6
Update java-ai.md
chrisvnicholson Apr 4, 2018
1d31d07
Create automated-machine-learning.md
chrisvnicholson Apr 4, 2018
01744ae
Update automated-machine-learning.md
chrisvnicholson Apr 4, 2018
0d9a560
Update java-ai.md
chrisvnicholson Apr 4, 2018
e2af6c8
Update java-ai.md
chrisvnicholson Apr 4, 2018
c05e6b6
Update java-ai.md
chrisvnicholson Apr 4, 2018
80df3b6
Update java-ai.md
chrisvnicholson Apr 4, 2018
b1de36a
Update java-ai.md
chrisvnicholson Apr 4, 2018
40069d2
Update java-ai.md
chrisvnicholson Apr 4, 2018
f2fe45e
Update automated-machine-learning.md
chrisvnicholson Apr 4, 2018
1801adc
Update java-ai.md
chrisvnicholson Apr 4, 2018
29d2bf7
Update java-ai.md
chrisvnicholson Apr 4, 2018
a3827f9
Update automated-machine-learning.md
chrisvnicholson Apr 4, 2018
9f99913
Update automated-machine-learning.md
chrisvnicholson Apr 4, 2018
6c1cfb9
Update automated-machine-learning.md
chrisvnicholson Apr 4, 2018
73eff9a
Update automated-machine-learning.md
chrisvnicholson Apr 4, 2018
5b20e05
Update automated-machine-learning.md
chrisvnicholson Apr 4, 2018
7f9cad0
Update automated-machine-learning.md
chrisvnicholson Apr 4, 2018
1679842
Rename automated-machine-learning.md to automated-machine-learning-ai.md
chrisvnicholson Apr 4, 2018
670b061
Update index.html
chrisvnicholson Apr 4, 2018
49e68fc
Update sidebar.html
chrisvnicholson Apr 4, 2018
f6873b1
Update robotic-process-automation-rpa.md
chrisvnicholson Apr 4, 2018
2899bf0
Update opendata.md
chrisvnicholson Apr 4, 2018
b8846d4
Fixes #17 (#18)
mgubaidullin Apr 5, 2018
1c0bb38
Update spark.md
chrisvnicholson Apr 5, 2018
d46f19f
Update lstm.md
chrisvnicholson Apr 5, 2018
b24032e
Update convolutionalnetwork.md
chrisvnicholson Apr 5, 2018
11d594c
Merge branch 'releasenotes_100a' of https://github.com/deeplearning4j…
maxpumperla Apr 5, 2018
88befda
scalnet and nd4s
maxpumperla Apr 5, 2018
bbae3fb
SameDiff
maxpumperla Apr 5, 2018
9 changes: 9 additions & 0 deletions .github/ISSUE_TEMPLATE.md
@@ -0,0 +1,9 @@
## Due Date
*To be completed by:* YYYY-MM-DD


## Description
*Write a short description of what needs to be done.*

## Assignees
*Please ensure you have assigned at least one person to this issue. Include any authors and reviewers required.*
10 changes: 7 additions & 3 deletions _includes/sidebar.html
@@ -31,6 +31,7 @@
<li><a href="/build_vgg_webapp">Build a Web Application for Image Classification</a></li>
<li><a href="/android">Deploy Deeplearning4j to Android</a></li>
<li><a href="/artificial-intelligence-ai.html">What is Artificial Intelligence (AI)?</a></li>
<li><a href="/strong-ai-general-agi.html">What is Strong AI?</a></li>
</ul>
</li>

@@ -52,11 +53,12 @@
<li>
<a href="#">Neural Networks</a>
<ul>
<li><a href="/lstm">Long Short-Term Memory Units</a></li>
<li><a href="/lstm">Long Short-Term Memory Units (LSTMs)</a></li>
<li><a href="/convolutionalnetwork">Convolutional Nets for Image Processing</a></li>
<li><a href="/recurrentnetwork">Recurrent Nets and LSTMs</a></li>
<li><a href="/word2vec">Word2Vec: Neural Word Embeddings</a></li>
<li><a href="/recurrentnetwork">Recurrent Neural Networks (RNNs)</a></li>
<li><a href="/word2vec">Word2Vec, Doc2vec, GloVe: Neural Word Embeddings</a></li>
<li><a href="/restrictedboltzmannmachine">Restricted Boltzmann Machines</a></li>
<li><a href="/generative-adversarial-network">Generative Adversarial Network (GAN)</a></li>
<li><a href="/multilayerperceptron">Multilayer Perceptron</a></li>
<li><a href="/deepautoencoder">Deep AutoEncoder</a></li>
<li><a href="/denoisingautoencoder">Denoising Autoencoders</a></li>
@@ -158,6 +160,8 @@
<li><a href="/decision-tree">Decision Trees</a></li>
<li><a href="/random-forest">Random Forests</a></li>
<li><a href="/scala">Scala, Spark and Deep Learning</a></li>
<li><a href="/java-ai">Java AI and Machine Learning Tools</a></li>
<li><a href="/automated-machine-learning-ai">Automated Machine Learning and AI</a></li>
<li><a href="/compare-dl4j-tensorflow-pytorch">DL4J, TensorFlow, Pytorch, Caffe</a></li>
<li><a href="/glossary">Glossary of Terms for Deep Learning and Neural Nets</a></li>
<li><a href="/deeplearningpapers">Free Online Courses, Tutorials and Papers</a></li>
8 changes: 4 additions & 4 deletions _layouts/index.html
@@ -60,14 +60,14 @@ <h3>Open-Source, Distributed, Deep Learning Library for the JVM</h3>

<header>
<h1><a href="https://deeplearning4j.org/overview">What is Eclipse Deeplearning4j?</a></h1>
<p><a href="https://projects.eclipse.org/projects/technology.deeplearning4j">Eclipse Deeplearning4j</a> is the first commercial-grade, open-source, distributed deep-learning library written for Java and Scala. Integrated with Hadoop and Spark, DL4J brings <a href="https://deeplearning4j.org/artificial-intelligence-ai.html" target="_blank">AI</a>AI to business environments for use on distributed GPUs and CPUs.</p>
<p><a href="https://projects.eclipse.org/projects/technology.deeplearning4j">Eclipse Deeplearning4j</a> is the first commercial-grade, open-source, distributed deep-learning library written for Java and Scala. Integrated with Hadoop and Spark, DL4J brings <a href="https://deeplearning4j.org/artificial-intelligence-ai.html" target="_blank">AI</a> to business environments for use on distributed GPUs and CPUs.</p>

<p><a href="https://www.skymind.io" target="_blank">Skymind</a> is its commercial support arm, bundling Deeplearning4j and other libraries such as Tensorflow and Keras in the Skymind Intelligence Layer (Community Edition), a deep learning environment that gives developers an easy, fast way to train and deploy AI models. <a href="https://skymind.ai/quickstart" target="_blank">SKIL CE is free and downloadable here</a>. SKIL acts as a bridge between Python data science environments and the JVM.</p>
</header>

<section>
<h2>Welcome to Eclipse Deeplearning4j</h2>
<p>Deeplearning4j aims to be cutting-edge plug and play, more convention than configuration, which allows for fast prototyping for non-researchers. DL4J is customizable at scale. Released under the Apache 2.0 license, all derivatives of DL4J belong to their authors. DL4J can <a href="https://deeplearning4j.org/keras-supported-features" target="_blank">import neural net models</a> from most major frameworks via Keras, including <a href="https://deeplearning4j.org/tensorflow" target="_blank">TensorFlow</a>, Caffe and Theano, bridging the gap between the Python ecosystem and the JVM with a cross-team toolkit for data scientists, data engineers and DevOps. <a href="https://deeplearning4j.org/keras-supported-features">Keras</a> is employed as Deeplearning4j's Python API. Skymind is the second-largest contributor to Keras after Google, and offers commercial support for Keras. Machine learning models are served in production with <a href="https://deeplearning4j.org/machine-learning-server.html">Skymind's machine learning server</a>. </p>
<p>Deeplearning4j aims to be cutting-edge plug and play, more convention than configuration, which allows for fast prototyping for data scientists, machine-learning practitioners and software engineers. DL4J is customizable at scale. Released under the Apache 2.0 license, all derivatives of DL4J belong to their authors. DL4J can <a href="https://deeplearning4j.org/keras-supported-features" target="_blank">import neural net models</a> from most major frameworks via Keras, including <a href="https://deeplearning4j.org/tensorflow" target="_blank">TensorFlow</a>, Caffe and Theano, bridging the gap between the Python ecosystem and the JVM with a cross-team toolkit for data scientists, data engineers and DevOps. <a href="https://deeplearning4j.org/keras-supported-features">Keras</a> is Deeplearning4j's Python API. Skymind is the second-largest contributor to Keras after Google, and offers commercial support for Keras. Machine learning models are served in production with <a href="https://deeplearning4j.org/machine-learning-server.html">Skymind's machine learning server</a>. </p>

<div class="row">
<div class="col-md-4 col-sm-6">
@@ -90,7 +90,7 @@ <h3>Open-Source</h3>
<div class="promo small-icon left">
<i class="fa fa-cubes"></i>
<h3>JVM/Python/C++</h3>
<p>Deeplearning4j is written in Java and is compatible with any JVM language, such as <a href="https://deeplearning4j.org/scala">Scala</a>, <a href="https://deeplearning4j.org/clojure">Clojure</a> or <a href="https://deeplearning4j.org/kotlin">Kotlin</a>. The underlying computations are written in C, C++ and Cuda. <a href="https://deeplearning4j.org/keras-supported-features">Keras</a> will serve as the Python API.</p>
<p>Deeplearning4j is written in Java and is compatible with any JVM language, such as <a href="https://deeplearning4j.org/scala">Scala</a>, <a href="https://deeplearning4j.org/clojure">Clojure</a> or <a href="https://deeplearning4j.org/kotlin">Kotlin</a>. The underlying computations are written in C, C++ and Cuda. <a href="https://deeplearning4j.org/keras-supported-features">Keras</a> will serve as the Python API.</p>
</div>
</div>

@@ -189,7 +189,7 @@ <h5>DL4J's Neural Networks</h5>
<a href="convolutionalnetwork.html">Deep Convolutional Networks (CNNs)</a>
<a href="multilayerperceptron.html">Multilayer Perceptron (MLP) for classification</a>
<a href="usingrnns.html">Recurrent Nets (RNNs)</a>
<a href="word2vec.html">Word2vec: Extracting Relations From Raw Text</a>
<a href="word2vec.html">Word2vec: Extracting Relations From Raw Text</a>
<a href="generative-adversarial-network.html">Generative Adversarial Networks (GANs)</a>
<a href="restrictedboltzmannmachine.html">Restricted Boltzmann Machines</a>
<a href="deepbeliefnetwork.html">Deep-Belief Networks (DBN)</a>
62 changes: 62 additions & 0 deletions automated-machine-learning-ai.md
@@ -0,0 +1,62 @@
---
title: Automated Machine Learning & AI
layout: default
---

# Automated Machine Learning and AI

One of the ways AI vendors try to convince companies to buy their machine learning platforms and tools is by claiming that those tools are automated. That's a key selling point, because most companies are acutely aware that they haven't hired enough data scientists (if they have managed to hire any data scientists at all).

Data scientists are people who explore data, clean it, test algorithms that they think might make accurate predictions about that data, and then tune those algorithms until they work well, like an auto mechanic might tune a car. Here's a more complete list of [tasks in the machine learning workflow](./machine-learning-workflow.html).

If the data scientists are lucky, they are given tools to perform those tasks efficiently, and they may even be enabled to deploy those trained machine-learning models to a production environment, to make predictions about data outside the lab.

<p align="center">
<a href="https://docs.skymind.ai/docs/welcome" type="button" class="btn btn-lg btn-success" onClick="ga('send', 'event', 'quickstart', 'click');">GET STARTED WITH MACHINE LEARNING</a>
</p>

Many machine learning vendors, ranging from Google to startups such as DataRobot and H2O.ai, claim that they can automate machine learning. That sounds great! Then you, the hiring manager, won't need to go chasing after data science talent whose skills you can't judge in a bidding war you can't win. You'll just automate all those skills away.

The problem is, the skills that data scientists possess are hard to automate, and people who seek to buy automated AI should be aware of what exactly can be automated, and what can't, with present technology. Data scientists perform many tasks. While automating some of those tasks may lighten their workload, unless you can automate all of their tasks, they are still necessary, and that scarce talent will remain a chokepoint that hinders the implementation of machine learning in many organizations.

## What Can We Automate in Machine Learning?

I mentioned that data scientists *tune* algorithms. When you tune a complex machine (and these algorithms are just mathematical and symbolic machines), you usually have several knobs to turn. It's kind of like cooking something with several ingredients. To produce the right taste, to tune your dish as it were, those ingredients should be added in proper proportion to one another, just like you might add twice as much [buttermilk as you do butter to a biscuit recipe](https://www.marthastewart.com/349650/biscuits). The idea is, the right proportions matter.

A data scientist is frequently operating without a "recipe", and must tune knobs in combination with each other to explore which combination works. In this case, "working" means tuning an algorithm until it is able to learn efficiently from the data it is given to train upon.

### Hyperparameter Optimization

In data science, the knobs on an algorithm are called hyperparameters, and so the data scientists are performing "hyperparameter search" as they test different combinations of those hyperparameters, different ratios between their ingredients.

Hyperparameter search can be automated. [Eclipse Arbiter](https://github.com/deeplearning4j/arbiter) is a hyperparameter optimization library designed to automate hyperparameter tuning for deep neural net training. It is the equivalent of Google Tensorflow's Vizier, or the open-source Python library Spearmint. Arbiter is part of the Deeplearning4j framework. Some startups, like [SigOpt](https://sigopt.com/), are focused solely on hyperparameter optimization.

You can search for the best combination of hyperparameters with different kinds of search algorithm, like grid search, random search and Bayesian methods.
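
As a rough illustration of the difference between grid search and random search, here is a minimal Python sketch. It is not Arbiter's API. The `validation_error` function is a hypothetical stand-in for training a model and measuring its error, and the hyperparameter names and ranges are assumptions chosen only for the example.

```python
import itertools
import random


def validation_error(learning_rate, regularization):
    """Hypothetical stand-in for 'train a model, then measure its error'.

    A real search would train and evaluate a model here. This toy function
    is simply smallest at learning_rate=0.1, regularization=0.01.
    """
    return (learning_rate - 0.1) ** 2 + (regularization - 0.01) ** 2


def grid_search(learning_rates, regularizations):
    """Grid search: evaluate every combination of the listed values."""
    candidates = itertools.product(learning_rates, regularizations)
    return min(candidates, key=lambda c: validation_error(*c))


def random_search(n_trials, seed=0):
    """Random search: sample combinations from continuous ranges."""
    rng = random.Random(seed)
    candidates = [
        (rng.uniform(0.0, 1.0), rng.uniform(0.0, 0.1)) for _ in range(n_trials)
    ]
    return min(candidates, key=lambda c: validation_error(*c))


best_grid = grid_search([0.001, 0.01, 0.1, 1.0], [0.0, 0.01, 0.1])
best_random = random_search(n_trials=50)
print("grid search best:", best_grid)
print("random search best:", best_random)
```

Grid search can only return values that appear on the grid, while random search can land anywhere in the range but offers no guarantee of covering it evenly. Bayesian methods go a step further by using earlier results to choose the next combination to try, which this sketch does not attempt.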

### Algorithm Selection

One thing that AI vendors will do is run the same data through several algorithms whose hyperparameters are set by default, to determine which algorithm can learn best on your data. At the end of the contest, they select the winner. Visualizing these algorithmic beauty contests is a dramatic way to show the work being done. However, it has its limits, notably in the range of algorithms that are chosen to run in any given race, and how well they are tuned.
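
The contest described above might look something like the following sketch. It is only an illustration, not any vendor's implementation: the toy data, the three "algorithms" and the scoring function are all made up for the example, and each contestant runs with no tuning at all.

```python
import random


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def make_data(n, rng):
    """Toy data (an assumption for illustration): y is roughly 2x + 1 plus noise."""
    xs = [rng.uniform(0, 10) for _ in range(n)]
    return [(x, 2 * x + 1 + rng.gauss(0, 0.5)) for x in xs]


def fit_mean(train):
    """Always predict the average training label."""
    mean_y = mean(y for _, y in train)
    return lambda x: mean_y


def fit_nearest_neighbor(train):
    """Predict the label of the closest training point."""
    return lambda x: min(train, key=lambda p: abs(p[0] - x))[1]


def fit_linear(train):
    """Least-squares straight line through the training points."""
    mx = mean(x for x, _ in train)
    my = mean(y for _, y in train)
    slope = sum((x - mx) * (y - my) for x, y in train) / sum(
        (x - mx) ** 2 for x, _ in train
    )
    return lambda x: my + slope * (x - mx)


def mean_squared_error(model, data):
    return mean((model(x) - y) ** 2 for x, y in data)


rng = random.Random(0)
train, held_out = make_data(200, rng), make_data(100, rng)

contestants = {
    "mean": fit_mean,
    "nearest_neighbor": fit_nearest_neighbor,
    "linear": fit_linear,
}
# Every contestant trains on the same data and is scored on the same held-out set.
scores = {name: mean_squared_error(fit(train), held_out) for name, fit in contestants.items()}
winner = min(scores, key=scores.get)
print(scores, "winner:", winner)
```

The limits mentioned above are visible even here: the winner can only come from the `contestants` dictionary, and an algorithm that would do well after tuning can lose to one that happens to work with its defaults.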

### Limited Use Cases on the Happy Path

AI vendors can be smart about the algorithms they select only if they have some knowledge of the problem that is being solved, and the data that is being used to train the algorithm. In many real-world situations, lengthy data exploration and some domain-specific knowledge are necessary to select the right algorithms.

In the world of automated machine learning, we pretend that data exploration and domain knowledge don't matter. We can only do that for a few limited use cases. In software, this is called the [happy path](https://en.wikipedia.org/wiki/Happy_path), or the use case where everything goes as we expect it to. Automated machine learning has a narrow happy path; that is, it's easy to step off the path and get into trouble.

For example, it's easy to automate machine learning for a simple use case like scoring your leads in Salesforce to predict the likelihood that you will close a sale. That's because the schema of the data -- the things you know about your customers -- is constrained by Salesforce software and fairly standardized across sales teams. An automated machine learning solution focused on lead scoring can make strong assumptions about the type of data you will feed it.

But companies need machine learning for more than lead scoring. Their use cases differ, and so does their data. In those cases, it can be hard to offer a pre-baked solution. Data pipelines, also known as ETL, are often the stage of the AI workflow that requires the most human attention. The real world is messy and data, which represents that world, is usually messy, too. Most datasets need to be explored, cleaned and otherwise pre-processed before that data can be fruitfully used to train a machine-learning algorithm. That cleaning and exploration often requires expert humans.

### Professional Services

Those companies have two choices: they can hire their own data scientists or rely on professional services from consulting firms. Every major public cloud vendor has introduced machine-learning solutions teams in an attempt to close the talent gap and make machine learning more available to potential users of their clouds. The major consultancies, from Accenture to Bain, have hired teams of data scientists to build solutions for their clients. Even automated machine-learning startups like DataRobot offer "Customer-facing Data Scientists".

So a lot of the time, AI vendors that sell automated machine learning are really "automating" those tasks with humans; that is, they're allowing their clients to outsource the talent that their clients can't otherwise get access to. This is because the tasks and decisions involved in building AI solutions are many, varied and complex, and the technology does not yet exist to automate all of them. That's not automation, it's services. We should call it what it is and recognize that building machine-learning solutions often requires the refined judgment of experts, combined with automation for a few narrow tasks in a larger AI workflow.

### Transfer Learning and Pre-Trained Models

Machine learning models start out dumb and get smart by being exposed to data that they "train" on. Training involves making guesses about the data, measuring the error in their guesses, and correcting themselves until they make more accurate guesses. Machine learning algorithms train on data to produce an accurate "model" of the data. A trained, accurate model of the data is one that is capable of producing good predictions when it is fed new data that resembles what it trained on. For the purposes of this discussion, imagine a model as a black box that performs a mathematical operation on data to make a prediction about it. The data goes into the model, the prediction comes out; e.g. feed an image of one of your friends into the model, and it will predict the name of the friend in the image.
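
The guess, measure, correct loop can be sketched in a few lines of Python. This is a deliberately tiny example under stated assumptions: the "model" is a single weight, the data follows y = 3x by construction, and the learning rate and step count are arbitrary choices for illustration.

```python
# Toy training data (made up for illustration): every label is exactly 3 * x.
data = [(x, 3.0 * x) for x in [1.0, 2.0, 3.0, 4.0]]

weight = 0.0          # the model starts out "dumb": it predicts 0 for everything
learning_rate = 0.01  # how strongly each correction is applied

for step in range(200):
    # Guess: predict weight * x for every example.
    # Measure: the gradient of the mean squared error with respect to the weight.
    gradient = sum(2 * (weight * x - y) * x for x, y in data) / len(data)
    # Correct: nudge the weight in the direction that reduces the error.
    weight -= learning_rate * gradient

# The trained model is now a "black box" that maps new data to a prediction.
prediction = weight * 5.0
print("learned weight:", weight, "prediction for x=5:", prediction)
```

Real models have millions of weights rather than one, but the loop has the same shape.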

Sometimes, you can train a machine-learning model on one set of data, and then use it for another, slightly different set of data later. This only works when the two datasets resemble each other. For example, most photographs have certain characteristics in common. If you train a machine-learning model on, say, celebrity faces, it will learn what humans look like, and with just a little extra learning, you could teach it to transfer what it knows to photographs of your family and friends, whom it has never seen before. Using a pre-trained model could save you the cost of training your own over thousands of hours on distributed GPUs, an expensive proposition.
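
A toy sketch of why starting from a pre-trained model can be cheaper than training from scratch is shown below. Everything in it is an assumption made for illustration: the model is a single weight, the "source" and "target" datasets are invented and differ only slightly, and the number of gradient steps stands in for training cost.

```python
def train(data, weight, learning_rate=0.01, tolerance=1e-3, max_steps=10_000):
    """Gradient descent on y ~ weight * x. Returns (weight, steps used)."""
    for step in range(max_steps):
        gradient = sum(2 * (weight * x - y) * x for x, y in data) / len(data)
        if abs(gradient) < tolerance:
            return weight, step
        weight -= learning_rate * gradient
    return weight, max_steps


xs = [1.0, 2.0, 3.0, 4.0]
source_data = [(x, 3.0 * x) for x in xs]  # stand-in for the original dataset
target_data = [(x, 3.2 * x) for x in xs]  # a similar, slightly different dataset

# Pre-train on the source data, starting from nothing.
pretrained_weight, _ = train(source_data, weight=0.0)

# Learn the target task twice: from scratch, and starting from the pre-trained weight.
_, steps_from_scratch = train(target_data, weight=0.0)
_, steps_fine_tuning = train(target_data, weight=pretrained_weight)

print("steps from scratch:", steps_from_scratch)
print("steps when fine-tuning:", steps_fine_tuning)
```

Fine-tuning needs fewer steps here only because the two datasets resemble each other. If the target data were very different from the source data, the pre-trained starting point would offer little or no advantage.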

Pre-trained machine-learning models that gain some knowledge of the world are useful in computer vision, and widely available. Some well-known pre-trained computer vision models include AlexNet, LeNet, VGG16, YOLO and Inception. [Those pre-trained computer vision models are available here](https://github.com/deeplearning4j/deeplearning4j/tree/master/deeplearning4j-zoo/src/main/java/org/deeplearning4j/zoo/model). Google's [Cloud AutoML](https://cloud.google.com/automl/) relies on transfer learning, among other methods, to support its claim that it has "automated machine learning."