# CSVs, Loops, Datetime

## DelimitedFiles Package

**Why is the package not a core functionality of Julia?**

1) Startup efficiency: Reading delimited files is not an essential feature needed at startup, and the functionality is fairly large.

2) Functional efficiency: A modular design means that each piece of functionality must follow Julia's design philosophy of consistency over convenience.

In [1]:
using DelimitedFiles
data = DelimitedFiles.readdlm("ebola_virus_epidemic_major_outbreaks.csv", ',')
data

54×9 Matrix{Any}:
 "25 Nov 2015"     "28,637"  …     "4,808"     "14,122"     "3,955"
 "18 Nov 2015"     "28,634"        "4,808"     "14,122"     "3,955"
 "11 Nov 2015"     "28,635"        "4,808"     "14,122"     "3,955"
 "4 Nov 2015"      "28,607"        "4,808"     "14,089"     "3,955"
 "25 Oct 2015"     "28,539"        "4,808"     "14,061"     "3,955"
 "18 Oct 2015"     "28,476"  …     "4,808"     "14,001"     "3,955"
 "11 Oct 2015"     "28,454"        "4,808"     "13,982"     "3,955"
 "27 Sep 2015"     "28,388"        "4,808"     "13,911"     "3,955"
 "20 Sep 2015"     "28,295"        "4,808"     "13,823"     "3,955"
 "13 Sep 2015"     "28,220"        "4,808"     "13,756"     "3,953"
 ⋮                           ⋱                           
 "14 Jul 2014"  982          …  106         397          197
 "2 Jul 2014"   779              75         252          101
 "17 Jun 2014"  528              24          97           49
 "27 May 2014"  309              11          16            5
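The mix of quoted strings and bare integers in the matrix above comes from how `readdlm` treats each field: a field that parses as a number is stored as a number, and anything else stays text. A minimal sketch on made-up in-memory data (the `sample` text is an assumption for illustration, not the contents of the Ebola file):

```julia
using DelimitedFiles

# Made-up two-line sample: one count is quoted and has a thousands separator,
# the other is plain digits.
sample = """
25 Nov 2015,"28,637"
27 May 2014,309
"""

m = readdlm(IOBuffer(sample), ',')

m[1, 2]   # quoted field kept as text, comma included
m[2, 2]   # plain digits parsed as a number
```

This is why the counts need a separate cleaning step before they can be used numerically.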


## Converting Dates

In [2]:
using Dates
Dates.DateTime(data[1,1], "d u y")

2015-11-25T00:00:00
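The format string `"d u y"` reads as day, abbreviated month name, year, separated by spaces. A small sketch of the same parse with a reusable `DateFormat` object (the variable names are illustrative only):

```julia
using Dates

fmt = dateformat"d u y"             # d = day, u = abbreviated month name, y = year

d  = Date("4 Nov 2015", fmt)        # single-digit days parse as well
dt = DateTime("25 Nov 2015", fmt)   # same format, with a midnight time component
```

Building the format once and passing it to each call avoids re-parsing the format string for every date.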

In [3]:
col1 = data[:, 1]

for i = 1:length(col1)
    col1[i] = Dates.DateTime(col1[i], "d u y")
end

In [4]:
col1

54-element Vector{Any}:
 2015-11-25T00:00:00
 2015-11-18T00:00:00
 2015-11-11T00:00:00
 2015-11-04T00:00:00
 2015-10-25T00:00:00
 2015-10-18T00:00:00
 2015-10-11T00:00:00
 2015-09-27T00:00:00
 2015-09-20T00:00:00
 2015-09-13T00:00:00
 ⋮
 2014-07-14T00:00:00
 2014-07-02T00:00:00
 2014-06-17T00:00:00
 2014-05-27T00:00:00
 2014-05-12T00:00:00
 2014-05-01T00:00:00
 2014-04-14T00:00:00
 2014-03-31T00:00:00
 2014-03-22T00:00:00
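The loop above writes the parsed values back into a `Vector{Any}`. As a possible alternative, `map` builds a new vector whose element type is inferred from the results. A sketch on three of the column's values (the `raw` vector is a shortened stand-in for `col1`):

```julia
using Dates

# A few values in the same format as the first column
raw = Any["25 Nov 2015", "18 Nov 2015", "22 Mar 2014"]
fmt = dateformat"d u y"

dates = map(s -> DateTime(s, fmt), raw)   # new vector, one DateTime per string
eltype(dates)                             # DateTime rather than Any
```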

In [5]:
Dates.datetime2rata(col1[1]) # Rata Die day number: days counted from 0001-01-01 as day 1


735927

In [6]:
dayssincemar22(x) = Dates.datetime2rata(x) - Dates.datetime2rata(col1[end])
epidays = Array{Int64}(undef, length(col1))

for i = 1:length(col1)
    epidays[i] = dayssincemar22(col1[i])
end

epidays

54-element Vector{Int64}:
 613
 606
 599
 592
 582
 575
 568
 554
 547
 540
   ⋮
 114
 102
  87
  66
  51
  40
  23
   9
   0
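The day counts can be cross-checked with plain date arithmetic: subtracting two `Date` values yields a `Day` period, and `Dates.value` extracts the integer. A short sketch using the first and last dates of the column:

```julia
using Dates

first_report = Date(2014, 3, 22)
latest       = Date(2015, 11, 25)

elapsed = latest - first_report     # a Day period
days = Dates.value(elapsed)         # plain integer day count

# The same number via Rata Die day numbers, as in the cell above
rata_diff = Dates.datetime2rata(DateTime(latest)) -
            Dates.datetime2rata(DateTime(first_report))
```

Both routes give 613, matching the first entry of `epidays`.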

In [7]:
data[:, 1] = epidays
data

54×9 Matrix{Any}:
 613     "28,637"     "11,314"  …     "4,808"     "14,122"     "3,955"
 606     "28,634"     "11,314"        "4,808"     "14,122"     "3,955"
 599     "28,635"     "11,314"        "4,808"     "14,122"     "3,955"
 592     "28,607"     "11,314"        "4,808"     "14,089"     "3,955"
 582     "28,539"     "11,298"        "4,808"     "14,061"     "3,955"
 575     "28,476"     "11,298"  …     "4,808"     "14,001"     "3,955"
 568     "28,454"     "11,297"        "4,808"     "13,982"     "3,955"
 554     "28,388"     "11,296"        "4,808"     "13,911"     "3,955"
 547     "28,295"     "11,295"        "4,808"     "13,823"     "3,955"
 540     "28,220"     "11,291"        "4,808"     "13,756"     "3,953"
   ⋮                            ⋱                           
 114  982          613          …  106         397          197
 102  779          481              75         252          101
  87  528          337              24          97           49
  66  309          

## Coverting Values

In [9]:
function convert2int(value)

    # Handle dashes (treated as 0)
    if value == "–"
        return 0

    # Handle numbers
    elseif isa(value, Number)
        return value

    # Handle string cases (String and SubString{String} are both AbstractString)
    elseif isa(value, AbstractString)
        value = replace(value, r"[\",≥]" => "")
        return parse(Int64, value)
    end

end

# Testing (the dash is written as a String, the way it appears in the data)
convert2int_tests = ["10,000", "\"270,000\"", "–", 1099, data[1,2]]

# Iterate through inputs, call convert2int, and print input and output
for test in convert2int_tests
    output = convert2int(test)
    println("Input: $test; Output: $output")
end

Input: 10,000; Output: 10000
Input: "270,000"; Output: 270000
Input: –; Output: 0
Input: 1099; Output: 1099
Input: 28,637; Output: 28637
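One detail worth keeping in mind when testing a function like this: in Julia a `Char` literal such as `'–'` never compares equal to the one-character string `"–"`, so a dash only reaches the zero branch when it is written with double quotes. A small sketch, using a hypothetical stand-in `to_int` so the block runs on its own:

```julia
# Hypothetical stand-in mirroring the branches of convert2int
function to_int(value)
    value == "–" && return 0        # dash written as a String
    value isa Number && return value
    value isa AbstractString && return parse(Int, replace(value, r"[\",≥]" => ""))
    return nothing                  # unhandled type, e.g. a Char
end

same = ('–' == "–")                 # false: a Char is not a String

from_string = to_int("–")           # 0
from_char   = to_int('–')           # nothing
```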


In [23]:
new = zeros(Int64, size(data))

for i in 1:size(data, 1)
    for j in 1:size(data, 2)
        result = convert2int(data[i, j])
        new[i, j] = result
    end
end

new

54×9 Matrix{Int64}:
 613  28637  11314  3804  2536  10675  4808  14122  3955
 606  28634  11314  3804  2536  10672  4808  14122  3955
 599  28635  11314  3805  2536  10672  4808  14122  3955
 592  28607  11314  3810  2536  10672  4808  14089  3955
 582  28539  11298  3806  2535  10672  4808  14061  3955
 575  28476  11298  3803  2535  10672  4808  14001  3955
 568  28454  11297  3800  2534  10672  4808  13982  3955
 554  28388  11296  3805  2533  10672  4808  13911  3955
 547  28295  11295  3800  2532  10672  4808  13823  3955
 540  28220  11291  3792  2530  10672  4808  13756  3953
   ⋮                                ⋮               
 114    982    613   411   310    174   106    397   197
 102    779    481   412   305    115    75    252   101
  87    528    337   398   264     33    24     97    49
  66    309    202   281   186     12    11     16     5
  51    260    182   248   171     12    11      0     0
  40    239    160   226   149     13    11      0     0
  23    176    
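The nested loop can also be written with broadcasting, which applies a function to every element and picks the result's element type from the values produced. A minimal sketch on two rows taken from the matrix above, with a simplified, hypothetical `clean` helper standing in for `convert2int`:

```julia
# Simplified stand-in for convert2int: strips thousands separators from strings
clean(v) = v isa AbstractString ? parse(Int, replace(v, "," => "")) : v

raw = Any[613 "28,637"; 606 "28,634"]   # mixed integers and strings

converted = clean.(raw)                 # element-wise, same shape as raw
eltype(converted)                       # Int64, since every result is an integer
```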

In [24]:
DelimitedFiles.writedlm("ebola_virus_epidemic_major_outbreaks_converted.csv", new, ',')
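To see what `writedlm` emits without touching the disk, the output can go to an in-memory buffer and be read back. A short sketch with two rows of converted values (the `table` matrix is a small stand-in for the full result):

```julia
using DelimitedFiles

table = [613 28637; 606 28634]      # two rows of already-converted integers

io = IOBuffer()
writedlm(io, table, ',')
text = String(take!(io))            # the CSV text that would land in the file

back = readdlm(IOBuffer(text), ',', Int)   # read it back as integers
back == table                              # round trip preserves the values
```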