In [1]:
import Pkg; Pkg.add(Pkg.PackageSpec(url="https://github.com/JuliaComputing/JuliaAcademyData.jl"))
using JuliaAcademyData; activate("Foundations of machine learning")

[?25l[2K

[32m[1m   Updating[22m[39m git-repo `https://github.com/JuliaComputing/JuliaAcademyData.jl`


[?25h

[32m[1m   Updating[22m[39m registry at `~/.juliapro/JuliaPro_v1.4.1-1/registries/JuliaPro`
[32m[1m  Resolving[22m[39m package versions...
[32m[1m   Updating[22m[39m `~/.juliapro/JuliaPro_v1.4.1-1/environments/v1.4/Project.toml`
[90m [no changes][39m
[32m[1m   Updating[22m[39m `~/.juliapro/JuliaPro_v1.4.1-1/environments/v1.4/Manifest.toml`
[90m [no changes][39m
[32m[1m Activating[22m[39m environment at `~/.juliapro/JuliaPro_v1.4.1-1/packages/JuliaAcademyData/1to3l/courses/Foundations of machine learning/Project.toml`


# Motivation

Hello, and welcome! We're excited to be your gateway into machine learning. ML is a rapidly growing field that's buzzing with opportunity. Why? Because the tools and skills employed by ML specialists are extremely powerful and allow them to draw conclusions from large data sets quickly and with relative ease.

Take the Celeste project, for example. This is a project that took 178 **tera**bytes of data on the visible sky and used it to catalogue 188 million stars and galaxies. "Cataloguing" these stars meant identifying characteristics like their locations, colors, sizes, and morphologies. This is an amazing feat, *especially* because this entire calculation took under 15 minutes.

<img src="https://raw.githubusercontent.com/JuliaComputing/JuliaAcademyData.jl/master/courses/Foundations%20of%20machine%20learning/data/Celeste.png" alt="Drawing" style="width: 1000px;"/>

How are Celeste's calculations so fast? To achieve performance on this scale, the Celeste team writes its software in the Julia programming language and runs it on supercomputers at Lawrence Berkeley National Lab's NERSC. In this course, we unfortunately won't be able to give you access to a top-10 supercomputer, but we will teach you how to use Julia!

We're confident that this course will put you on your way to understanding many of the important concepts and "buzz words" in ML. To get you started, we'll teach you how to teach a machine to tell the difference between images of apples and bananas, i.e., to **classify** images as being one or the other type of fruit.

Like Project Celeste, we'll use the [Julia programming language](https://julialang.org/) to do this. In particular, we'll be working in [Jupyter notebooks](http://jupyter.org/) like this one! (Perhaps you already know that the "ju" in Jupyter comes from Julia.)

## What do the images we want to classify look like?

We can use the `Images.jl` package in Julia to load sample images from this dataset. Most of the data we will use live in the `data` folder in this repository.

In [2]:
using Images  # To execute hit <shift> + enter

┌ Info: Precompiling Images [916415d5-f1e6-5110-898d-aaa5f9f070e0]
└ @ Base loading.jl:1260
ERROR: LoadError: SpecialFunctions is not installed properly, run `Pkg.build("SpecialFunctions")`,restart Julia and try again
Stacktrace:
 [1] error(::String) at ./error.jl:33
 [2] top-level scope at /home/markus/.juliapro/JuliaPro_v1.4.1-1/packages/SpecialFunctions/ne2iw/src/SpecialFunctions.jl:6
 [3] include(::Module, ::String) at ./Base.jl:377
 [4] top-level scope at none:2
 [5] eval at ./boot.jl:331 [inlined]
 [6] eval(::Expr) at ./client.jl:449
 [7] top-level scope at ./none:3
in expression starting at /home/markus/.juliapro/JuliaPro_v1.4.1-1/packages/SpecialFunctions/ne2iw/src/SpecialFunctions.jl:4
ERROR: LoadError: Failed to precompile SpecialFunctions [276daf66-3868-5448-9aa4-cd146d93841b] to /home/markus/.juliapro/JuliaPro_v1.4.1-1/compiled/v1.4/SpecialFunctions/78gOt_ilo8l.ji.
Stacktrace:
 [1] error(::String) at ./error.jl:33
 [2] compilecache(::Base.PkgId, ::String) at ./loading.jl:12

ErrorException: Failed to precompile Images [916415d5-f1e6-5110-898d-aaa5f9f070e0] to /home/markus/.juliapro/JuliaPro_v1.4.1-1/compiled/v1.4/Images/H8Vxc_ilo8l.ji.
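
If you see the same error when loading `Images`, the message itself suggests a remedy: rebuild `SpecialFunctions`, restart Julia, and try again. A minimal sketch, assuming the failure comes only from an incomplete `SpecialFunctions` build in your environment (other causes may need different steps):

```julia
import Pkg

# Rebuild the package named in the error message above.
Pkg.build("SpecialFunctions")

# After the build finishes, restart the Julia kernel and run `using Images` again.
```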

In [None]:
apple = load(datapath("data/10_100.jpg"))

In [None]:
banana = load(datapath("data/104_100.jpg"))

The dataset consists of many images of different fruits, viewed from different positions.
These images are [available on GitHub here](https://github.com/Horea94/Fruit-Images-Dataset).
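
Once an image is loaded, it behaves like an ordinary Julia array whose elements are colors. The sketch below illustrates this with a tiny synthetic image rather than the dataset files, so the name `img` and its pixel values are made up for illustration; it assumes `Images.jl` loads correctly in your environment.

```julia
using Images  # provides the RGB color type and channel accessors such as `green`

# A hypothetical 2×3 stand-in for a loaded photo: top row red, bottom row yellow.
img = [RGB(1.0, 0.0, 0.0) RGB(1.0, 0.0, 0.0) RGB(1.0, 0.0, 0.0);
       RGB(1.0, 1.0, 0.0) RGB(1.0, 1.0, 0.0) RGB(1.0, 1.0, 0.0)]

size(img)     # (height, width) of the image in pixels
img[1, 1]     # a single pixel, stored as an RGB color

# Extract the green channel of every pixel as a plain matrix of numbers.
green_values = green.(img)
```

The same calls should work on `apple` and `banana` once those cells run, since `load` returns an array of colors as well.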

## What is the goal?

The ultimate goal is to feed one of these images to the computer and have it identify whether the image shows an apple or a banana! To do so, we will **train** the computer to learn **for itself** how to tell the two kinds of fruit apart.

The following notebooks will walk you step by step through the underlying math and machine learning concepts you need to know in order to accomplish this classification.

They alternate between two types of notebooks. Those labelled **ML** (Machine Learning) give a high-level overview of the concepts we need for machine learning, but gloss over some of the technical details. Those labelled **Tools** dive into the details of coding in Julia that will be key to actually implementing the machine learning algorithms ourselves.

The notebooks contain many **Exercises**. By doing these exercises in Julia, you will learn the basics of machine learning!