<a href="https://www.nvidia.com/dli/"> <img src="images/DLI Header.png" alt="Header" style="width: 400px;"/> </a>

### Load and organize data

Let's start by bringing our data, in this case, thousands of images, into our learning environment. We're going to use a tool called DIGITS, where we can visualize and manage our data. 

First, open DIGITS in a new tab using the following link.

### <a href="/digits/" target="_blank">Open DIGITS</a>

#### Loading our first dataset

When you start DIGITS, you will be taken to the home screen where you can create new datasets or new models. 

Select the **Datasets** tab on the left.

![](images/DIGITSdatasetHome.PNG)

Since we want our network to tell us which "class" each image belongs to, we ask DIGITS to prepare a "classification" image dataset by selecting "Classification" from the "Images" menu on the right.

At this point you may be asked for a username. If so, enter any name in lowercase.

#### Loading and organizing our data

You'll see that you have a lot of options for *how* to load a dataset. For this first run-through, we're going to keep things simple and fill out only two fields.

1. Copy and paste the following filepath into the field "Training Images":  <code>/dli/data/train_small</code>
2. Name the dataset so that you can find it. We've chosen: <code>Default Options Small Digits Dataset</code>

Don't see "Training Images"? Click "DIGITS" on the top left and select "Datasets" before selecting "Images" and "Classification."

Note that we've already downloaded the dataset to the computer where DIGITS is running. You'll have a chance to explore it shortly and will learn methods for accessing data as you work through our labs. 

![](images/FirstDataSetSettings.png)

By default, DIGITS puts 25% of your data into a separate database for "Validation." It sets a randomly selected 25% of each class aside so that you can check whether your network is effective on *new* data rather than simply memorizing its training dataset.

Then press "Create."

DIGITS is now creating your dataset from the folder. Inside the folder <code>train_small</code> there were 10 subfolders, one for each class (0, 1, 2, 3, ..., 9). All of the handwritten training images of '0's are in the '0' folder, '1's are in the '1' folder, etc.  
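
To make the folder layout and the per-class validation split concrete, here is a minimal Python sketch. It is not DIGITS' own code: the function `split_per_class`, the file names, and the count of 8 images per class are all hypothetical and chosen only for illustration.

```python
import random


def split_per_class(files_by_class, val_fraction=0.25, seed=0):
    """Hold out val_fraction of each class for validation.

    files_by_class maps a class name (the subfolder name) to its image files.
    Returns two lists of (path, label) pairs: training and validation.
    """
    rng = random.Random(seed)
    train, val = [], []
    for label, files in sorted(files_by_class.items()):
        shuffled = list(files)
        rng.shuffle(shuffled)  # random selection within this class
        n_val = round(len(shuffled) * val_fraction)
        val += [(path, label) for path in shuffled[:n_val]]
        train += [(path, label) for path in shuffled[n_val:]]
    return train, val


# Hypothetical stand-in for train_small: 8 images in each of the 10 digit folders.
files_by_class = {
    str(d): [f"{d}/img_{i}.png" for i in range(8)] for d in range(10)
}
train, val = split_per_class(files_by_class)
print(len(train), len(val))  # 60 20
```

Splitting inside each class, rather than across the whole dataset at once, keeps every digit represented in both the training and validation sets.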

Explore what our data looks like by selecting "Explore the db".

#### Your data

While there is an endless amount of analysis that we could do on the data, make sure you at least note the following:

1. This data is *labeled.* Each image in the dataset is paired with a **label** that informs the computer what number the image represents, 0-9. We're basically providing a question with its answer, or, as our network will see it, a desired output with each input. These are the "examples" that our network will learn from.
2. Each image is simply a digit on a plain background. Image classification is the task of identifying the *predominant* object in an image. For a first attempt, we're using images that only contain *one* object. We'll build skills to deal with messier data in subsequent labs. 
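
As a rough illustration of what "labeled" means here, the sketch below recovers a label from the folder an image sits in, mirroring the one-subfolder-per-class layout described earlier. The helper `label_from_path` and the file name `example.png` are hypothetical; DIGITS handles this pairing for you when it builds the dataset.

```python
from pathlib import PurePosixPath


def label_from_path(image_path):
    """Return the digit label encoded in the image's parent folder name."""
    return int(PurePosixPath(image_path).parent.name)


# Hypothetical file name inside the '7' class folder.
example = "/dli/data/train_small/7/example.png"
pair = (example, label_from_path(example))  # (input, desired output)
print(pair[1])  # 7
```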

This data comes from the [MNIST](http://yann.lecun.com/exdb/mnist/) dataset which was created by Yann LeCun. It's widely considered the "Hello World," or introduction, to deep learning.

### Learning from our data - Training a neural network

Next, we're going to use our data to *train* an artificial neural network. Like their biological inspiration, the human brain, artificial neural networks are learning machines. Also like the brain, these "networks" only become capable of solving problems with experience, in this case, interacting with data. Throughout this lab, we'll use "networks" to mean untrained artificial neural networks and "models" to mean what networks become once they are trained (through exposure to data).

![](images/networktomodel.PNG)

For image classification (and some other tasks), DIGITS comes pre-loaded with award-winning networks. As we take on different challenges in subsequent labs, we'll learn more about selecting networks and even building our own. However, to start, weighing the merits of different networks would be like arguing about the performance of different cars before driving for the first time. Building a network from scratch would be like building your own car. Let's drive first. We'll get there. 

Go to the tab where DIGITS is still open and return to the main screen by clicking "DIGITS" on the top left of the screen.

Creating a new model in DIGITS is a lot like creating a new dataset. From the home screen, the "Models" tab will be pre-selected. Click "Images" under "New Model" and select "Classification", as we're creating an image classification model to match our image classification dataset and image classification task.

![](images/newmodelselect.PNG)

Again, for this first round of training let's keep it simple. The following are the fewest settings you could possibly set to successfully train a network.

1. We need to choose the dataset we just created. Select our **Default Options Small Digits Dataset** dataset.
2. We need to tell the network how long we want it to train. An **epoch** is one trip through the entire training dataset. Set the number of **Training Epochs** to 5 to give our network enough time to learn something, but not take all day. This is a great setting to experiment with. 
3. We need to define which network will *learn* from our data. Since we stuck with default settings in creating our dataset, our database is full of 256x256 color images. Select the network **AlexNet**, if only because it expects 256x256 color images.
4. We need to name the model, since we'll hopefully train many of them. We chose **HandwrittenDigits1**.
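
To get a feel for what the **Training Epochs** setting means in terms of work done, the sketch below counts training iterations, assuming the network processes images in fixed-size batches. The dataset size of 7,500 images and the batch size of 100 are made-up numbers for illustration, not values taken from this lab.

```python
import math


def iterations_for(num_train_images, batch_size, epochs):
    """Total batches processed when every epoch visits each image once."""
    per_epoch = math.ceil(num_train_images / batch_size)
    return per_epoch * epochs


# Hypothetical numbers: 7,500 training images, batches of 100, 5 epochs.
print(iterations_for(num_train_images=7500, batch_size=100, epochs=5))  # 375
```

Doubling the number of epochs doubles the number of iterations, which is why training time grows with this setting.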

![](images/1stmodeltrain.png)

When you have set all of these options, press the Create button.  

You are now training your model! For this configuration, the model training should complete in less than 5 minutes. You can either watch it train, continue reading, or grab a cup of coffee. 

When done, the Job Status on the right will say "Done", and your training graph should look something like:

![](images/graphfromfirsttraining.png)

We'll dig into this graph as a tool for improvement, but the bottom line is that after 5 minutes of training, we have built a model that can map images of handwritten digits to the number they represent with an accuracy of about 87%! 
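
Accuracy here is simply the fraction of images whose predicted class matches the label. The following sketch shows that calculation on a handful of invented predictions; the function name `accuracy` and the sample values are illustrative and are not results from this lab.

```python
def accuracy(predictions, labels):
    """Fraction of predictions that match their labels."""
    if not labels or len(predictions) != len(labels):
        raise ValueError("predictions and labels must be non-empty and equal in length")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


# Invented example: 8 validation images, 7 classified correctly.
labels = [0, 1, 2, 3, 4, 5, 6, 7]
predictions = [0, 1, 2, 3, 4, 5, 6, 1]
print(accuracy(predictions, labels))  # 0.875
```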

Let's test the ability of the model to identify **new** images.  

### Inference

Now that our neural network has learned something, *inference* is the process of making decisions based on what was learned. The power of our trained model is that it can now classify **unlabeled** images. 

![](images/trainingwithinferencevisualization.PNG)

We'll use DIGITS to test our trained model. At the bottom of the model window, you can test a single image or a list of images. On the left, type the path <code>/data/test_small/2/img_4415.png</code> into the Image Path text box. Select the **Classify One** button. After a few seconds, a new window is displayed with the image and information about the model's attempt to classify it.

![](images/classifyoneunlabeled.png)
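
One common way a classifier turns its raw output into a prediction is to convert one score per class into probabilities with a softmax and then pick the largest. The sketch below illustrates that idea with made-up scores; it is an assumption about the general technique, not a description of DIGITS' internals.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(scores):
    """Return (predicted class index, probability of that class)."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, probs[best]


# Made-up scores for the ten digit classes 0-9; class 2 scores highest.
scores = [0.1, 0.3, 4.0, 0.2, 0.0, 0.1, 0.5, 0.9, 0.3, 0.2]
digit, confidence = classify(scores)
print(digit, round(confidence, 2))
```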



It worked! (Try again if it didn't). You took an untrained neural network, exposed it to thousands of *labeled* images, and it now has the ability to accurately predict the *class* of *unlabeled* images. Congratulations!

Note that the same workflow would work with almost any image classification task. You could train AlexNet to distinguish images of dogs from images of cats, images of you from images of me, etc. If you have extra time at the end of this lab, there's another dataset with 101 different classes of images where you can experiment.

While you have been successful with this introductory task, there is a lot more to learn.

Return to the course to continue.
