## Numerical Computing and Data Analytics

So far we have covered the basic functionality of the Julia language. Now we will take a quick look at Julia's capabilities in Numerical Computing and Data Analytics.

Julia started its life as a language geared towards scientific computing and, as such, mathematical operations and numerical computing are first-class citizens of the language. Pretty much anything you can do with core MATLAB or NumPy, you can do with core Julia with at least the same speed and stability, and in many cases Julia does better.

Let's look at Linear Algebra as an example:

### Linear Algebra

Doing Linear Algebra in Julia is straightforward. Here is how you perform a dot product between two vectors:

In [None]:
a = [1,2,3,4,5,6,7,8,9,10]
b = [11,12,13,14,15,16,17,18,19,20]

a'*b # NOTICE HOW WE TRANSPOSE a BY WRITING a'

Or compute the product of two matrices:

In [None]:
A = rand(10,10)
B = rand(3,10)

In [None]:
A*B'

You can also drop the `*` and write these products by juxtaposition:

In [None]:
a'b

In [None]:
A'B'

In [None]:
AB' # WILL THIS WORK?
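
As a small side sketch, the dot product can also be written with `dot` or the infix `⋅` operator from the `LinearAlgebra` standard library. The names `u`, `v`, `P` and `Q` are illustrative and are not used elsewhere in this notebook:

```julia
using LinearAlgebra

# Small illustrative vectors (not the ones defined in the cells above).
u = [1, 2, 3]
v = [4, 5, 6]

# Three equivalent ways to write the dot product.
d1 = u' * v      # adjoint of u times v
d2 = dot(u, v)   # function form from LinearAlgebra
d3 = u ⋅ v       # infix form (TYPE \cdot AND HIT TAB)

# For a matrix product the inner dimensions must agree.
P = ones(4, 2)
Q = ones(3, 2)
product_size = size(P * Q')   # a 4×2 matrix times a 2×3 matrix
```

All three forms return a plain number, and `product_size` shows how the transpose makes the inner dimensions line up.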

You can check the condition number of a matrix before using it in further computations:

In [None]:
using LinearAlgebra

cond(A)
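
To get a feel for what the condition number tells you, here is a minimal sketch comparing a well-conditioned matrix with a nearly singular one. The matrices `well` and `nearly_singular` are made up for illustration:

```julia
using LinearAlgebra

# The identity matrix is as well conditioned as a matrix can be.
well = Matrix(1.0I, 3, 3)
cond_well = cond(well)

# Two almost identical rows make this matrix nearly singular,
# so its condition number is very large.
nearly_singular = [1.0 1.0; 1.0 1.0 + 1e-10]
cond_bad = cond(nearly_singular)
```

A large condition number is a warning that inverting the matrix or solving systems with it may amplify rounding errors.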

Then you can invert a matrix:

In [None]:
inv(A)

Compute its determinant:

In [None]:
det(A)

Its eigenvalues:

In [None]:
eigen(A).values

And eigenvectors:

In [None]:
eigen(A).vectors
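
Calling `eigen` twice repeats the computation, so it can be worth storing the result once. The sketch below does that for a small symmetric matrix `S` (an illustrative example, not part of the notebook's data) and checks the defining property of the decomposition:

```julia
using LinearAlgebra

S = [2.0 1.0; 1.0 2.0]   # small symmetric example matrix

F = eigen(S)             # compute the decomposition once and reuse it
λ = F.values             # eigenvalues
V = F.vectors            # eigenvectors, one per column

# Each column v of V should satisfy S * v = λ * v.
is_consistent = S * V ≈ V * Diagonal(λ)
```

For a general non-symmetric matrix such as `rand(10,10)`, the eigenvalues may be complex.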

You can easily solve Linear Systems:

In [None]:
A \ b # FIND x WHERE Ax = b
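
A quick way to convince yourself the solver did its job is to check the residual. This sketch uses a small system with made-up values (`M` and `rhs` are illustrative names):

```julia
using LinearAlgebra

M = [4.0 1.0; 2.0 3.0]
rhs = [1.0, 2.0]

x = M \ rhs                  # solve M * x = rhs
residual = norm(M * x - rhs) # should be close to zero

# Same answer via the explicit inverse; the backslash form is
# generally preferred because it avoids forming inv(M).
x_via_inverse = inv(M) * rhs
```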

And do lots more including factorizations, decompositions, rotations and other transformations.
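
As a brief sketch of the factorizations mentioned above, here are LU and QR applied to a small example matrix (`M` is illustrative):

```julia
using LinearAlgebra

M = [4.0 3.0; 6.0 3.0]

# LU factorization with row pivoting: L * U equals M with its rows permuted by p.
F = lu(M)
lu_ok = F.L * F.U ≈ M[F.p, :]

# QR factorization: an orthogonal Q times an upper triangular R.
G = qr(M)
qr_ok = G.Q * G.R ≈ M
```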

## Data Frames

Over the last decade, R-like Data Frames have become the standard data structure for Data Analytics, Statistics and Machine Learning applications. Native in R and available in Python through the `pandas` library, Data Frames can be thought of as tables whose columns hold the variables of interest. Each column can be given a name, and variables can be extracted, referred to and otherwise manipulated by name in computations.

In Julia, Data Frames are available through the DataFrames package:

In [None]:
# using Pkg; Pkg.add("DataFrames")  # UNCOMMENT TO INSTALL THE PACKAGE
using DataFrames

In [None]:
a_data_frame = DataFrame(Y=rand(5), X1 = 1:5, X2 = ["Category 1", "Category 2", "Category 2", "Category 3", "Category 3"])
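
To make the "manipulated by name" idea concrete, here is a minimal sketch of common column and row operations. It uses a separate small table `df` with made-up values so it does not disturb `a_data_frame`:

```julia
using DataFrames

df = DataFrame(Y = [0.5, 1.5, 2.5], X1 = 1:3, X2 = ["a", "b", "b"])

column_names = names(df)        # column names as strings
y = df.Y                        # extract a column by name
x1 = df[!, :X1]                 # same idea with indexing syntax (no copy)
later_rows = df[df.X1 .> 1, :]  # keep only the rows where X1 > 1

df.Z = df.Y .* 2                # add a new column computed from an existing one
```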

You can easily compute summary Statistics for the entire Data Frame:

In [None]:
describe(a_data_frame)

Or you can apply functions to each column individually:

In [None]:
using Statistics

mean(a_data_frame.Y)

In [None]:
std(a_data_frame.Y)

In [None]:
sum(a_data_frame.Y)
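
If you want the result back as a Data Frame, or computed per category, `combine` and `groupby` from DataFrames can help. This is only a sketch with fixed, made-up `Y` values (the table `scores` is illustrative) so the numbers are easy to check by hand:

```julia
using DataFrames, Statistics

scores = DataFrame(Y = [1.0, 2.0, 3.0, 4.0, 5.0],
                   X1 = 1:5,
                   X2 = ["Category 1", "Category 2", "Category 2", "Category 3", "Category 3"])

# Apply a function to several columns at once;
# the result columns are named Y_mean and X1_mean.
overall = combine(scores, :Y => mean, :X1 => mean)

# Apply the same function within each category of X2.
by_category = combine(groupby(scores, :X2), :Y => mean => :mean_Y)
```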

You can load data from disk directly into Data Frames as well:

In [None]:
# using Pkg; Pkg.add("CSV")  # UNCOMMENT TO INSTALL THE PACKAGE
using CSV

In [None]:
# RECENT VERSIONS OF CSV.jl REQUIRE THE TARGET TYPE (DataFrame) AS SECOND ARGUMENT
another_data_frame = CSV.read("./iris.csv", DataFrame; header=0)

In [None]:
rename!(another_data_frame, ["sepal_length","sepal_width","petal_length","petal_width","species"])
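
As an alternative sketch, you can pass the column names straight to `CSV.read` through the `header` keyword, which makes the separate `rename!` step unnecessary. The example reads two rows in the iris layout from an in-memory buffer instead of a file, so the names `raw`, `cols` and `iris_sample` are illustrative:

```julia
using CSV, DataFrames

# Two rows in the same layout as iris.csv (no header line), held in memory.
raw = """
5.1,3.5,1.4,0.2,Iris-setosa
7.0,3.2,4.7,1.4,Iris-versicolor
"""

cols = ["sepal_length", "sepal_width", "petal_length", "petal_width", "species"]

# Supplying the names via `header` tells CSV.jl that the first row is data.
iris_sample = CSV.read(IOBuffer(raw), DataFrame; header = cols)
```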

## Plotting

We've seen a simple example of a scatterplot in a previous notebook. Now let's look at Plotting in Julia in more depth.

Julia does not ship with a built-in plotting library, but the `Plots` package is a popular and powerful choice. We will focus on it here, but you can easily switch to backends like `Plotly` and `PyPlot`, or to other libraries like `Gadfly` (similar to R's `ggplot2`).

In [None]:
using Plots

scatter(another_data_frame.sepal_length,another_data_frame.sepal_width, group=another_data_frame.species)

# ALTERNATIVE USING THE @df MACRO (REQUIRES THE StatsPlots PACKAGE):
# @df another_data_frame scatter(:sepal_length,:sepal_width,group=:species)

In [None]:
title!("Sepal Length vs Width per Iris Species")

xlabel!("Sepal Length")

ylabel!("Sepal Width")

You can keep adding elements to the plot... like plotting a function on top of the scatterplot:

In [None]:
f(x) = x/2

plot!(f)
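
The functions ending in `!` modify the current plot. If you prefer to be explicit, you can keep the plot in a variable and pass it as the first argument. Here is a minimal sketch with made-up data; the names `xs`, `ys`, `p` and the output path are illustrative:

```julia
using Plots

xs = 1:10
ys = xs ./ 2 .+ 0.3 .* sin.(xs)         # made-up data scattered around x/2

p = scatter(xs, ys, label = "data")     # keep a handle to the plot
plot!(p, xs, xs ./ 2, label = "x/2")    # add a line to that same plot
title!(p, "Overlaying a line on a scatterplot")

# Save the finished figure to a temporary location.
png_path = joinpath(tempdir(), "overlay_example.png")
savefig(p, png_path)
```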

You can even create animated plots!

In [None]:
default(legend = false)
x = range(-5, 5, length = 40)
n = 100

@gif for i in range(0, stop = 2π, length = n) # NOTICE THE UNICODE pi (TYPE \pi AND HIT TAB)
    f(x) = sin(x + i)^2   # WAVE SHIFTED BY THE PHASE i
    plot(x, f.(x))        # THE PLOT DRAWN IN EACH ITERATION BECOMES ONE FRAME
end
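
If you want control over the output file and frame rate, a common pattern is to collect the frames with `@animate` and then write them out with `gif`. This is only a sketch: the file name, frame count and `fps` value are arbitrary choices.

```julia
using Plots

xs = range(-5, 5, length = 40)

# Collect one frame per phase value instead of writing the GIF immediately.
anim = @animate for ϕ in range(0, stop = 2π, length = 20)
    plot(xs, sin.(xs .+ ϕ) .^ 2, ylims = (0, 1), legend = false)
end

# Write the collected frames to a GIF at 10 frames per second.
gif_path = joinpath(tempdir(), "wave_example.gif")
gif(anim, gif_path, fps = 10)
```

Fixing `ylims` keeps the axes from rescaling between frames, which makes the motion easier to follow.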