# Fictional Army - Filtering and Sorting

### Introduction:

This exercise was inspired by this [page](http://chrisalbon.com/python/).

Special thanks to: https://github.com/chrisalbon for sharing the dataset and materials.

### Step 1. Import the necessary libraries

In [None]:
using DotEnv
using Pkg

DotEnv.load!()
path = ENV["ENV_PATH"]
Pkg.activate(path)

using CSV
using DataFrames
using Downloads
using Statistics # mean, std

### Step 2. This is the data given as a dictionary

In [2]:
# Raw data about a fictional army, stored as a Dict of column name => values
raw_data = Dict(
    "regiment" => String["Nighthawks", "Nighthawks", "Nighthawks", "Nighthawks", "Dragoons", "Dragoons", "Dragoons", "Dragoons", "Scouts", "Scouts", "Scouts", "Scouts"],
    "company" => String["1st", "1st", "2nd", "2nd", "1st", "1st", "2nd", "2nd", "1st", "1st", "2nd", "2nd"],
    "deaths" => Int[523, 52, 25, 616, 43, 234, 523, 62, 62, 73, 37, 35],
    "battles" => Int[5, 42, 2, 2, 4, 7, 8, 3, 4, 7, 8, 9],
    "size" => Int[1045, 957, 1099, 1400, 1592, 1006, 987, 849, 973, 1005, 1099, 1523],
    "veterans" => Int[1, 5, 62, 26, 73, 37, 949, 48, 48, 435, 63, 345],
    "readiness" => Int[1, 2, 3, 3, 2, 1, 2, 3, 2, 1, 2, 3],
    "armored" => Int[1, 0, 1, 1, 0, 1, 0, 1, 0, 0, 1, 1],
    "deserters" => Int[4, 24, 31, 2, 3, 4, 24, 31, 2, 3, 2, 3],
    "origin" => String["Arizona", "California", "Texas", "Florida", "Maine", "Iowa", "Alaska", "Washington", "Oregon", "Wyoming", "Louisana", "Georgia"],
)

Dict{String, Vector} with 10 entries:
  "deaths"    => [523, 52, 25, 616, 43, 234, 523, 62, 62, 73, 37, 35]
  "battles"   => [5, 42, 2, 2, 4, 7, 8, 3, 4, 7, 8, 9]
  "veterans"  => [1, 5, 62, 26, 73, 37, 949, 48, 48, 435, 63, 345]
  "company"   => ["1st", "1st", "2nd", "2nd", "1st", "1st", "2nd", "2nd", "1st"…
  "armored"   => [1, 0, 1, 1, 0, 1, 0, 1, 0, 0, 1, 1]
  "regiment"  => ["Nighthawks", "Nighthawks", "Nighthawks", "Nighthawks", "Drag…
  "size"      => [1045, 957, 1099, 1400, 1592, 1006, 987, 849, 973, 1005, 1099,…
  "readiness" => [1, 2, 3, 3, 2, 1, 2, 3, 2, 1, 2, 3]
  "deserters" => [4, 24, 31, 2, 3, 4, 24, 31, 2, 3, 2, 3]
  "origin"    => ["Arizona", "California", "Texas", "Florida", "Maine", "Iowa",…

### Step 3. Create a dataframe and assign it to a variable called army.

In [3]:
army = DataFrame(raw_data)

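| Row | armored | battles | company | deaths | deserters | origin | readiness | regiment | size | veterans |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
|  | Int64 | Int64 | String | Int64 | Int64 | String | Int64 | String | Int64 | Int64 |
| 1 | 1 | 5 | 1st | 523 | 4 | Arizona | 1 | Nighthawks | 1045 | 1 |
| 2 | 0 | 42 | 1st | 52 | 24 | California | 2 | Nighthawks | 957 | 5 |
| 3 | 1 | 2 | 2nd | 25 | 31 | Texas | 3 | Nighthawks | 1099 | 62 |
| 4 | 1 | 2 | 2nd | 616 | 2 | Florida | 3 | Nighthawks | 1400 | 26 |
| 5 | 0 | 4 | 1st | 43 | 3 | Maine | 2 | Dragoons | 1592 | 73 |
| 6 | 1 | 7 | 1st | 234 | 4 | Iowa | 1 | Dragoons | 1006 | 37 |
| 7 | 0 | 8 | 2nd | 523 | 24 | Alaska | 2 | Dragoons | 987 | 949 |
| 8 | 1 | 3 | 2nd | 62 | 31 | Washington | 3 | Dragoons | 849 | 48 |
| 9 | 0 | 4 | 1st | 62 | 2 | Oregon | 2 | Scouts | 973 | 48 |
| 10 | 0 | 7 | 1st | 73 | 3 | Wyoming | 1 | Scouts | 1005 | 435 |
| 11 | 1 | 8 | 2nd | 37 | 2 | Louisana | 2 | Scouts | 1099 | 63 |
| 12 | 1 | 9 | 2nd | 35 | 3 | Georgia | 3 | Scouts | 1523 | 345 |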
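
The columns of `army` come out in alphabetical order rather than in the order written in `raw_data`, because DataFrames.jl sorts the keys when it is given a `Dict`. The sketch below illustrates this on a two-row excerpt of the data. The names `from_dict` and `from_pairs` are made up for the example, and passing pairs directly is only one way to keep the written order.

```julia
using DataFrames

# Built from a Dict: columns are sorted by key.
from_dict = DataFrame(Dict("regiment" => ["Nighthawks", "Dragoons"],
                           "company"  => ["1st", "2nd"],
                           "deaths"   => [523, 43]))

# Built from pairs: columns keep the order in which they are written.
from_pairs = DataFrame("regiment" => ["Nighthawks", "Dragoons"],
                       "company"  => ["1st", "2nd"],
                       "deaths"   => [523, 43])

names(from_dict)   # ["company", "deaths", "regiment"]
names(from_pairs)  # ["regiment", "company", "deaths"]
```

This matters for the later steps that select columns by position: the positions refer to the sorted order.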


### Step 4. Print only the column veterans

In [4]:
army[!, :veterans]

12-element Vector{Int64}:
   1
   5
  62
  26
  73
  37
 949
  48
  48
 435
  63
 345
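
The row selector matters when a single column is extracted: `!` returns the stored column itself, while `:` returns a copy. A minimal sketch on a three-row excerpt (the names `demo`, `shared` and `copied` are illustrative only):

```julia
using DataFrames

demo = DataFrame(veterans = [1, 5, 62], deaths = [523, 52, 25])

shared = demo[!, :veterans]   # the stored column, no copy
copied = demo[:, :veterans]   # an independent copy

shared[1] = 100               # writes through to demo
copied[2] = 200               # leaves demo untouched

demo.veterans                 # [100, 5, 62]
```

`demo.veterans` behaves like `demo[!, :veterans]`, so it also exposes the stored column.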

### Step 5. Print the columns 'veterans' and 'deaths'

In [5]:
army[!, [:veterans, :deaths]]

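| Row | veterans | deaths |
| --- | --- | --- |
|  | Int64 | Int64 |
| 1 | 1 | 523 |
| 2 | 5 | 52 |
| 3 | 62 | 25 |
| 4 | 26 | 616 |
| 5 | 73 | 43 |
| 6 | 37 | 234 |
| 7 | 949 | 523 |
| 8 | 48 | 62 |
| 9 | 48 | 62 |
| 10 | 435 | 73 |
| 11 | 63 | 37 |
| 12 | 345 | 35 |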
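
Column subsets can also be written with `select`, or by excluding columns with `Not`. The following is a sketch on a two-row excerpt, not the only way to do it; `demo`, `picked` and `dropped` are names invented for the example.

```julia
using DataFrames

demo = DataFrame(veterans = [1, 5],
                 deaths   = [523, 52],
                 origin   = ["Arizona", "California"])

picked  = select(demo, :veterans, :deaths)   # new data frame with just these columns
dropped = demo[:, Not(:origin)]              # everything except :origin

names(picked)   # ["veterans", "deaths"]
names(dropped)  # ["veterans", "deaths"]
```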


### Step 6. Print the name of all the columns.

In [6]:
names(army)

10-element Vector{String}:
 "armored"
 "battles"
 "company"
 "deaths"
 "deserters"
 "origin"
 "readiness"
 "regiment"
 "size"
 "veterans"

### Step 7. Select the 'deaths', 'size' and 'deserters' columns from Maine and Alaska

In [7]:
army[in.(army[!, "origin"], Ref(["Maine", "Alaska"])), [:origin,:deaths, :size, :deserters]]

| Row | origin | deaths | size | deserters |
| --- | --- | --- | --- | --- |
|  | String | Int64 | Int64 | Int64 |
| 1 | Maine | 43 | 1592 | 3 |
| 2 | Alaska | 523 | 987 | 24 |
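
The membership test above can also be expressed with `subset` and `ByRow`, which some readers find easier to scan than a broadcast `in.` with `Ref`. A minimal sketch on four rows taken from the data (`demo`, `wanted` and `picked` are illustrative names):

```julia
using DataFrames

demo = DataFrame(origin    = ["Arizona", "Maine", "Alaska", "Texas"],
                 deaths    = [523, 43, 523, 25],
                 size      = [1045, 1592, 987, 1099],
                 deserters = [4, 3, 24, 31])

wanted = ["Maine", "Alaska"]

# in(wanted) is a one-argument function; ByRow applies it to each origin value.
picked = subset(demo, :origin => ByRow(in(wanted)))

picked.origin   # ["Maine", "Alaska"]
```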


### Step 8. Select the rows 3 to 7 and the columns 3 to 6

In [6]:
army[3:7, 3:6]

| Row | company | deaths | deserters | origin |
| --- | --- | --- | --- | --- |
|  | String | Int64 | Int64 | String |
| 1 | 2nd | 25 | 31 | Texas |
| 2 | 2nd | 616 | 2 | Florida |
| 3 | 1st | 43 | 3 | Maine |
| 4 | 1st | 234 | 4 | Iowa |
| 5 | 2nd | 523 | 24 | Alaska |


### Step 9. Select every row after the fourth row and all columns

In [7]:
army[5:end, :]

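| Row | armored | battles | company | deaths | deserters | origin | readiness | regiment | size | veterans |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
|  | Int64 | Int64 | String | Int64 | Int64 | String | Int64 | String | Int64 | Int64 |
| 1 | 0 | 4 | 1st | 43 | 3 | Maine | 2 | Dragoons | 1592 | 73 |
| 2 | 1 | 7 | 1st | 234 | 4 | Iowa | 1 | Dragoons | 1006 | 37 |
| 3 | 0 | 8 | 2nd | 523 | 24 | Alaska | 2 | Dragoons | 987 | 949 |
| 4 | 1 | 3 | 2nd | 62 | 31 | Washington | 3 | Dragoons | 849 | 48 |
| 5 | 0 | 4 | 1st | 62 | 2 | Oregon | 2 | Scouts | 973 | 48 |
| 6 | 0 | 7 | 1st | 73 | 3 | Wyoming | 1 | Scouts | 1005 | 435 |
| 7 | 1 | 8 | 2nd | 37 | 2 | Louisana | 2 | Scouts | 1099 | 63 |
| 8 | 1 | 9 | 2nd | 35 | 3 | Georgia | 3 | Scouts | 1523 | 345 |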


### Step 10. Select every row up to the 4th row and all columns

In [10]:
army[1:4, :]

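| Row | armored | battles | company | deaths | deserters | origin | readiness | regiment | size | veterans |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
|  | Int64 | Int64 | String | Int64 | Int64 | String | Int64 | String | Int64 | Int64 |
| 1 | 1 | 5 | 1st | 523 | 4 | Arizona | 1 | Nighthawks | 1045 | 1 |
| 2 | 0 | 42 | 1st | 52 | 24 | California | 2 | Nighthawks | 957 | 5 |
| 3 | 1 | 2 | 2nd | 25 | 31 | Texas | 3 | Nighthawks | 1099 | 62 |
| 4 | 1 | 2 | 2nd | 616 | 2 | Florida | 3 | Nighthawks | 1400 | 26 |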


### Step 11. Select the 3rd column up to the 7th column

In [11]:
army[:, 3:7]

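| Row | company | deaths | deserters | origin | readiness |
| --- | --- | --- | --- | --- | --- |
|  | String | Int64 | Int64 | String | Int64 |
| 1 | 1st | 523 | 4 | Arizona | 1 |
| 2 | 1st | 52 | 24 | California | 2 |
| 3 | 2nd | 25 | 31 | Texas | 3 |
| 4 | 2nd | 616 | 2 | Florida | 3 |
| 5 | 1st | 43 | 3 | Maine | 2 |
| 6 | 1st | 234 | 4 | Iowa | 1 |
| 7 | 2nd | 523 | 24 | Alaska | 2 |
| 8 | 2nd | 62 | 31 | Washington | 3 |
| 9 | 1st | 62 | 2 | Oregon | 2 |
| 10 | 1st | 73 | 3 | Wyoming | 1 |
| 11 | 2nd | 37 | 2 | Louisana | 2 |
| 12 | 2nd | 35 | 3 | Georgia | 3 |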


### Step 12. Select rows where deaths is greater than 50

In [12]:
army[army[!, :deaths] .> 50, :]

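| Row | armored | battles | company | deaths | deserters | origin | readiness | regiment | size | veterans |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
|  | Int64 | Int64 | String | Int64 | Int64 | String | Int64 | String | Int64 | Int64 |
| 1 | 1 | 5 | 1st | 523 | 4 | Arizona | 1 | Nighthawks | 1045 | 1 |
| 2 | 0 | 42 | 1st | 52 | 24 | California | 2 | Nighthawks | 957 | 5 |
| 3 | 1 | 2 | 2nd | 616 | 2 | Florida | 3 | Nighthawks | 1400 | 26 |
| 4 | 1 | 7 | 1st | 234 | 4 | Iowa | 1 | Dragoons | 1006 | 37 |
| 5 | 0 | 8 | 2nd | 523 | 24 | Alaska | 2 | Dragoons | 987 | 949 |
| 6 | 1 | 3 | 2nd | 62 | 31 | Washington | 3 | Dragoons | 849 | 48 |
| 7 | 0 | 4 | 1st | 62 | 2 | Oregon | 2 | Scouts | 973 | 48 |
| 8 | 0 | 7 | 1st | 73 | 3 | Wyoming | 1 | Scouts | 1005 | 435 |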
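
"Greater than 50" is a strict comparison, so a value of exactly 50 must be excluded. The army data contains no such value, so the sketch below uses made-up boundary numbers (49, 50, 51) purely to show the difference between `>` and `>=`. It also uses `filter` with a `column => predicate` pair as an alternative to a boolean mask.

```julia
using DataFrames

# Hypothetical values chosen to sit on the boundary; not part of the army data.
demo = DataFrame(deaths = [49, 50, 51])

strict    = filter(:deaths => >(50), demo)    # keeps only 51
inclusive = filter(:deaths => >=(50), demo)   # keeps 50 and 51

strict.deaths      # [51]
inclusive.deaths   # [50, 51]
```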


### Step 13. Select rows where deaths is greater than 500 or less than 50

In [13]:
army[(army[!, :deaths] .< 50) .| (army[!, :deaths] .> 500), :]

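| Row | armored | battles | company | deaths | deserters | origin | readiness | regiment | size | veterans |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
|  | Int64 | Int64 | String | Int64 | Int64 | String | Int64 | String | Int64 | Int64 |
| 1 | 1 | 5 | 1st | 523 | 4 | Arizona | 1 | Nighthawks | 1045 | 1 |
| 2 | 1 | 2 | 2nd | 25 | 31 | Texas | 3 | Nighthawks | 1099 | 62 |
| 3 | 1 | 2 | 2nd | 616 | 2 | Florida | 3 | Nighthawks | 1400 | 26 |
| 4 | 0 | 4 | 1st | 43 | 3 | Maine | 2 | Dragoons | 1592 | 73 |
| 5 | 0 | 8 | 2nd | 523 | 24 | Alaska | 2 | Dragoons | 987 | 949 |
| 6 | 1 | 8 | 2nd | 37 | 2 | Louisana | 2 | Scouts | 1099 | 63 |
| 7 | 1 | 9 | 2nd | 35 | 3 | Georgia | 3 | Scouts | 1523 | 345 |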
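
When two conditions are combined with a broadcast `.|`, each comparison needs its own parentheses, because `|` binds more tightly than the comparison operators. The sketch below shows the mask form next to a row-wise `subset` form on the first five `deaths` values; `demo`, `mask`, `extremes` and `same` are illustrative names.

```julia
using DataFrames

demo = DataFrame(deaths = [523, 52, 25, 616, 43])

# Parentheses around each comparison are required before combining with .|
mask = (demo.deaths .> 500) .| (demo.deaths .< 50)
extremes = demo[mask, :]

# Row-wise alternative: an ordinary || inside an anonymous function.
same = subset(demo, :deaths => ByRow(d -> d > 500 || d < 50))

extremes.deaths   # [523, 25, 616, 43]
```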


### Step 14. Select all the regiments not named "Dragoons"

In [9]:
army[army[!, :regiment] .!= "Dragoons", :]

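| Row | armored | battles | company | deaths | deserters | origin | readiness | regiment | size | veterans |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
|  | Int64 | Int64 | String | Int64 | Int64 | String | Int64 | String | Int64 | Int64 |
| 1 | 1 | 5 | 1st | 523 | 4 | Arizona | 1 | Nighthawks | 1045 | 1 |
| 2 | 0 | 42 | 1st | 52 | 24 | California | 2 | Nighthawks | 957 | 5 |
| 3 | 1 | 2 | 2nd | 25 | 31 | Texas | 3 | Nighthawks | 1099 | 62 |
| 4 | 1 | 2 | 2nd | 616 | 2 | Florida | 3 | Nighthawks | 1400 | 26 |
| 5 | 0 | 4 | 1st | 62 | 2 | Oregon | 2 | Scouts | 973 | 48 |
| 6 | 0 | 7 | 1st | 73 | 3 | Wyoming | 1 | Scouts | 1005 | 435 |
| 7 | 1 | 8 | 2nd | 37 | 2 | Louisana | 2 | Scouts | 1099 | 63 |
| 8 | 1 | 9 | 2nd | 35 | 3 | Georgia | 3 | Scouts | 1523 | 345 |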


### Step 15. Select the rows whose origin is Texas or Arizona

In [15]:
army[in.(army[!, :origin], Ref(["Arizona", "Texas"])), :]

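| Row | armored | battles | company | deaths | deserters | origin | readiness | regiment | size | veterans |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
|  | Int64 | Int64 | String | Int64 | Int64 | String | Int64 | String | Int64 | Int64 |
| 1 | 1 | 5 | 1st | 523 | 4 | Arizona | 1 | Nighthawks | 1045 | 1 |
| 2 | 1 | 2 | 2nd | 25 | 31 | Texas | 3 | Nighthawks | 1099 | 62 |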


### Step 16. Select the third cell in the row named Arizona

In [16]:
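# `deaths` is the third key written in `raw_data`, but DataFrame(raw_data) sorts the
# Dict's keys alphabetically, so it is column 4 here. Selecting by name avoids the mismatch.
army[army[!, :origin] .== "Arizona", :deaths]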

1-element Vector{Int64}:
 523
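
Indexing rows with a boolean mask always returns a vector, even when only one row matches. If a scalar is wanted, one option is to unwrap the result with `only`, or to look up the row number first. This is a sketch on a three-row excerpt, with `demo`, `as_vector`, `as_scalar` and `row` as illustrative names.

```julia
using DataFrames

demo = DataFrame(origin = ["Arizona", "California", "Texas"],
                 deaths = [523, 52, 25])

as_vector = demo[demo.origin .== "Arizona", :deaths]   # 1-element Vector{Int64}
as_scalar = only(as_vector)                            # 523; errors unless exactly one match

row = findfirst(==("Arizona"), demo.origin)            # index of the first matching row
demo[row, :deaths]                                     # 523
```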

### Step 17. Select the third cell down in the column named deaths

In [17]:
army[3, :deaths]

25