# An Introduction to Basic Julia Functionality 

# What is a vector?
    A vector is a one-dimensional array. It is like one row or column in a spreadsheet, and it can hold any kind of data.

How do we build one?

In [2]:
[1 2 3 4]

1×4 Array{Int64,2}:
 1  2  3  4

In [3]:
[1,2,3,4]

4-element Array{Int64,1}:
 1
 2
 3
 4

In [4]:
 ["dog", "cat", "bird", "mouse"]

4-element Array{String,1}:
 "dog"  
 "cat"  
 "bird" 
 "mouse"

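To declare a vector in Julia, use square brackets: []

Items separated by spaces form a ROW. Julia stores this as a 1×4 matrix (a two-dimensional array), as the output of In [2] shows.

Items separated by commas form a COLUMN vector, which is a true one-dimensional array.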
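
To see the difference between the two forms, it can help to ask Julia for the shape of each array. The short sketch below uses the built-in `size` and `ndims` functions; the variable names `spaced` and `commaed` are only illustrative and do not appear elsewhere in this notebook.

```julia
# Illustrative names, not part of the original notebook
spaced = [1 2 3 4]       # items separated by spaces
commaed = [1, 2, 3, 4]   # items separated by commas

println(size(spaced))    # (1, 4): one row, four columns
println(ndims(spaced))   # 2: stored as a two-dimensional array
println(size(commaed))   # (4,): four elements
println(ndims(commaed))  # 1: a one-dimensional vector
```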


Let's say that you build a vector full of animals:

animalsVector =  ["dog", "cat", "bird", "mouse"]

Notice: by giving this vector a name, we have saved it in the computer's memory and can refer to it later.

To take out just the bird, we use INDEXING.
Each entry of the vector is labeled with a number giving its position, called its index. Julia starts counting at 1.

In [5]:
animalsVector =  ["dog", "cat", "bird", "mouse"]
# Let's find the cat!
# You can index into a vector by putting the position inside square brackets
animalsVector[2]

"cat"

A word of warning: indexing out of bounds will cause an error!

In [6]:
vectorFourLong = [1 2 3 4]
vectorFourLong[5]

BoundsError: attempt to access 1×4 Array{Int64,2} at index [5]
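
One way to avoid this error is to check the size of the array before indexing. The sketch below is only a suggestion: it uses the built-in `length` and `checkbounds` functions and the `end` keyword, with an illustrative vector named `animals`.

```julia
animals = ["dog", "cat", "bird", "mouse"]

println(length(animals))                # 4: the largest valid index
println(animals[end])                   # mouse: `end` always means the last index
println(checkbounds(Bool, animals, 4))  # true: index 4 is safe to use
println(checkbounds(Bool, animals, 5))  # false: index 5 would raise a BoundsError
```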

# What is a Matrix?
    A matrix is a two-dimensional array. It is like a bunch of row vectors stacked on top of each other.

Matrices are built just like vectors: separate the items in a row with spaces.
Use a semicolon to indicate a new row!

In [7]:
["a" "b" "c" "d"; "e" "f" "g" "h"]

2×4 Array{String,2}:
 "a"  "b"  "c"  "d"
 "e"  "f"  "g"  "h"

Indexing into matrices works just like vectors, but you now need two coordinates.
Index with the format [row, column] 

In [16]:
myMatrix = ["a" "b" "c" "d"; "e" "f" "g" "h"]
myMatrix[2,3]

"g"

You can also slice a matrix into rows or columns by using the : operator during indexing

In [17]:
myMatrix = ["a" "b" "c" "d"; "e" "f" "g" "h"]
myMatrix[2,:]
#this operation will extract all of row 2

4-element Array{String,1}:
 "e"
 "f"
 "g"
 "h"

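The same idea works for columns and for smaller blocks. As a sketch, the example below slices a column and a 2×2 block out of the same matrix; the names `thirdColumn` and `middleBlock` are illustrative.

```julia
myMatrix = ["a" "b" "c" "d"; "e" "f" "g" "h"]

# All rows, column 3: gives the one-dimensional vector ["c", "g"]
thirdColumn = myMatrix[:, 3]

# Rows 1 to 2, columns 2 to 3: gives the 2×2 matrix ["b" "c"; "f" "g"]
middleBlock = myMatrix[1:2, 2:3]

println(thirdColumn)
println(middleBlock)
```

Note that slicing out a single row or column gives back a one-dimensional vector, while slicing with two ranges gives back a matrix.
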
# Other ways to build an array/matrix

Range Notation:
Writing start:stop creates a range of values. Inside square brackets, a semicolon joins several ranges end to end into one vector.

In [11]:
[1:10; 21:30]

20-element Array{Int64,1}:
  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
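
Ranges can also take a step size, and they can be placed side by side to form a matrix. The following is a small sketch of those variations; the variable names are illustrative.

```julia
oneToFive = collect(1:5)     # collect turns a range into a vector: [1, 2, 3, 4, 5]
oddNumbers = collect(1:2:9)  # start:step:stop gives [1, 3, 5, 7, 9]
countdown = collect(5:-1:1)  # a negative step counts down: [5, 4, 3, 2, 1]
grid = [1:3 4:6]             # a space puts the ranges side by side as columns

println(oddNumbers)
println(size(grid))          # (3, 2)
```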

Build now, fill later with
Array{T}(undef, dims)
where T is the element type
and dims is the dimensions.
Note: undef means the entries are not initialized. For plain numeric types such as Int64, the matrix simply shows whatever junk values were already sitting in that memory.

In [18]:
Array{Int64}(undef, 2, 3)

2×3 Array{Int64,2}:
 368363696  368363728  158578320
 156449200  157065840  163143984

In [6]:
#You can put the correct values in by indexing
junkArray = Array{Int64}(undef, 2, 3)
junkArray[1, 2] = 18;
junkArray[2, 3] = 45;
junkArray[2, 1] = 6;
junkArray

2×3 Array{Int64,2}:
 344327936         18  344327952
         6  140148768         45
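
Any entry that was never assigned still holds a junk value. If that is a concern, one option is to overwrite every entry before filling in the real values. The sketch below does this with the built-in `fill!` function; the name `scratch` is illustrative.

```julia
scratch = Array{Int64}(undef, 2, 3)  # contents are arbitrary at this point
fill!(scratch, 0)                    # overwrite every entry with 0
scratch[1, 2] = 18                   # now fill in the values we care about
println(scratch)
```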

There are some built-ins for quick matrix building:

**zeros(T, dims)**
**ones(T, dims)**

zeros builds a matrix full of zeros and ones builds a matrix full of ones, each with element type T and the specified dimensions.
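
As a brief sketch of how these are called (the variable names are illustrative): the type argument can be left out, in which case Julia uses Float64.

```julia
intZeros = zeros(Int64, 2, 3)   # 2×3 matrix of integer zeros
floatOnes = ones(Float64, 2, 2) # 2×2 matrix of floating-point ones
defaultZeros = zeros(2, 3)      # no type given, so the elements are Float64

println(intZeros)
println(eltype(defaultZeros))   # Float64
```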

# When is this useful?
# I WANT THE STUDENTS TO HAVE TO FILL THIS OUT!

Tabular data in Julia is often stored in matrices, so it is helpful to know how to manipulate them.

In [9]:
# First, we are going to download some data from the internet, in CSV (comma-separated values) format
P = download("https://raw.githubusercontent.com/kjbiener/introToJulia/master/juilaIntroData.csv","fruitConsumption.csv")
# We have to tell Julia to use the CSV package before we can read the data.
# Recent versions of CSV.jl also need to be told what to read the file into, here a DataFrame.
using CSV, DataFrames
data = CSV.read("fruitConsumption.csv", DataFrame)

8×5 DataFrame
 Row │ People      personA  personB  personC  personD
     │ String      Int64    Int64    Int64    Int64
─────┼───────────────────────────────────────────────
   1 │ Banana            2        4        6        2
   2 │ Pear              4        6        7        5
   3 │ Lemon             6        2        3        7
   4 │ Pineapple         7        9        7        3
   5 │ Orange           12        1        9        2
   6 │ Strawberry        4        1       14        0
   7 │ Apple             9        0        0        1
   8 │ Mango             0        3        2        7


Notice that Julia automatically made the first row into a header, but left the first column in as data. Having these string values in our dataset gets in the way of numeric calculations, so let's get rid of that first column by slicing. The data is currently in a format called a DataFrame, which is a table rather than a plain matrix. To convert it to a matrix, simply use the Matrix() function.

In [18]:
dataNumeric = Matrix(data[:, 2:end])
#or
#dataNumeric = Matrix(data[:, 2:5])

8×4 Array{Int64,2}:
  2  4   6  2
  4  6   7  5
  6  2   3  7
  7  9   7  3
 12  1   9  2
  4  1  14  0
  9  0   0  1
  0  3   2  7

Now that we have the raw numeric data, let's find out some things about it.

In [20]:
# How much total fruit is consumed by each person?
# The sum() function adds up the values in the matrix;
# just specify the dimension you are summing over (dims=1 sums down each column, giving one total per person)
totalFruit = sum(dataNumeric, dims=1)
# What happens if we call sum without specifying the dimensions?
sum(dataNumeric)

145
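
To make the role of `dims` concrete, here is a small sketch on the same numbers. The matrix is typed in by hand under the illustrative name `fruit` so that the example runs without downloading the file.

```julia
# Same values as the fruit table: rows are fruits, columns are people
fruit = [2 4 6 2; 4 6 7 5; 6 2 3 7; 7 9 7 3;
         12 1 9 2; 4 1 14 0; 9 0 0 1; 0 3 2 7]

perPerson = sum(fruit, dims=1)  # sums down each column: 1×4 matrix [44 26 48 27]
perFruit = sum(fruit, dims=2)   # sums across each row: 8×1 matrix, one total per fruit
everything = sum(fruit)         # no dims: a single number, 145

println(perPerson)
println(vec(perFruit))          # vec flattens the 8×1 matrix into a plain vector
println(everything)
```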

We can also run other functions like mean(), maximum(), minimum(), var() and std()
Again, if we specify a dimension, it will evaluate only along that dimension.

In [32]:
# we have to use the statistics package
using Statistics
#example of standard deviation calculation
std(dataNumeric)

3.6009351384005437
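
The other functions accept `dims` in the same way. As a sketch, the example below uses only the Banana and Pear rows, typed in by hand under the illustrative name `twoFruits`.

```julia
using Statistics

# Banana and Pear rows only: rows are fruits, columns are people
twoFruits = [2 4 6 2; 4 6 7 5]

meanPerPerson = mean(twoFruits, dims=1)   # 1×4 matrix: [3.0 5.0 6.5 3.5]
maxPerFruit = maximum(twoFruits, dims=2)  # 2×1 matrix: 6 for Banana, 7 for Pear
smallest = minimum(twoFruits)             # no dims: smallest value overall, 2

println(meanPerPerson)
println(vec(maxPerFruit))
println(smallest)
```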

# Iterators and Conditional Operations
# aka for and if statements

FOR loops:  A chunk of code that can be repeated a certain number of times

In [38]:
#here is an example
for i = 1:4
    println("I ran this loop ", i, " times")
end

I ran this loop 1 times
I ran this loop 2 times
I ran this loop 3 times
I ran this loop 4 times
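
A for loop can also walk directly over the items of a vector instead of over a range of numbers. The sketch below shows both forms, using the built-in `enumerate` function when the position is needed as well; the vector `animals` is illustrative.

```julia
animals = ["dog", "cat", "bird", "mouse"]

# Visit each item in turn
for animal in animals
    println(animal)
end

# enumerate gives the position and the item together
for (i, animal) in enumerate(animals)
    println(i, ": ", animal)
end
```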


Loops let us write a step once instead of repeating it by hand for every value. When we process data, it is sometimes easier to use a for loop to make sure that we are getting exactly the values we want. Let's say that we want to figure out which person ate the most strawberries this week. How can we use indexing and a for loop to figure out who it was?

In [44]:
# First, we need to get the strawberry row out
sRow = dataNumeric[6,:]
println(sRow)
# Now let's initialize our maximum strawberry count to 0 (no count can be lower), and the index to 0
maxStrawberry = 0
indexMost = 0

# Now we examine every value in the strawberry row
for i in eachindex(sRow)
    # Is it the biggest so far?
    if sRow[i] > maxStrawberry
        # We found a bigger value! Let's update the maximum and remember the location it was in
        maxStrawberry = sRow[i]
        indexMost = i
    end
    # If it wasn't the biggest, we just move on to the next value in the list
end

# Now that we have found the maximum, let's figure out which person it was
# The names of the test subjects were
testPeople = ["personA", "personB", "personC", "personD"]
biggestStrawberryFan = testPeople[indexMost]

# Let's print out our information!
println(biggestStrawberryFan, " ate ", maxStrawberry, " strawberries, which was the most in the study.")
    

[4, 1, 14, 0]
personC ate 14 strawberries, which was the most in the study.
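
Writing the loop by hand is good practice, but Julia also has a built-in function, `findmax`, that returns the largest value and its index in one step. The sketch below repeats the search with the strawberry row typed in by hand.

```julia
sRow = [4, 1, 14, 0]  # strawberry row, copied from the output above
testPeople = ["personA", "personB", "personC", "personD"]

# findmax returns a (value, index) pair
maxStrawberry, indexMost = findmax(sRow)

println(testPeople[indexMost], " ate ", maxStrawberry, " strawberries.")
```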
