### Example of using Multiple Dispatch, IO, and DataFrames

In [1]:
;ls

bank-full.csv
banktest.jl
fileIO.ipynb
iris.csv
myexp.jl
normtype.jl
Preliminares.ipynb
README.md
tests.jl


In [24]:
include("myexp.jl")

myexp

In [25]:
?myexp

search: myexp



```
myexp(x, n::Integer)
```

Returns `x*x*...*x`, with `x` repeated n times. Equivalent to `x*myexp(x, n-1)` when n > 0.
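
The contents of `myexp.jl` are not shown in this notebook, so the following is only a minimal sketch of how a function matching that docstring could be written. The name `myexp_sketch`, the error raised for negative `n`, and the second method for floating-point exponents are all assumptions added for illustration; they show how Julia selects a method from the types of the arguments.

```julia
# Hypothetical sketch; this is NOT the code in myexp.jl.

"""
    myexp_sketch(x, n::Integer)

Return `x*x*...*x` with `n` factors, computed as `x*myexp_sketch(x, n-1)`.
"""
function myexp_sketch(x, n::Integer)
    n < 0 && throw(ArgumentError("n must be non-negative"))
    n == 0 && return one(x)              # empty product: multiplicative identity
    return x * myexp_sketch(x, n - 1)    # recursive step from the docstring
end

# Assumed extra method: chosen by dispatch when the exponent is a float.
myexp_sketch(x::Real, n::AbstractFloat) = exp(n * log(x))

myexp_sketch(2, 10)             # integer method
myexp_sketch([1 1; 0 1], 3)     # same method, `*` is matrix multiplication
myexp_sketch(2.0, 0.5)          # floating-point method
```

Because the first method only relies on `*` and `one`, it works for any type that defines them, including square matrices.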


In [4]:
include("banktest.jl")

movelast!

In [5]:
data, data_names = readdlm("bank-full.csv", ';', header=true)

(Any[58 "management" … "unknown" "no"; 44 "technician" … "unknown" "no"; … ; 57 "blue-collar" … "unknown" "no"; 37 "entrepreneur" … "other" "no"], AbstractString["age" "job" … "poutcome" "y"])

In [6]:
println(size(data))
print.(data_names[:]," ")
data[1:5,:]

(45211, 17)
age job marital education default balance housing loan contact day month duration campaign pdays previous poutcome y 

5×17 Array{Any,2}:
 58  "management"    "married"  …  261  1  -1  0  "unknown"  "no"
 44  "technician"    "single"      151  1  -1  0  "unknown"  "no"
 33  "entrepreneur"  "married"      76  1  -1  0  "unknown"  "no"
 47  "blue-collar"   "married"      92  1  -1  0  "unknown"  "no"
 33  "unknown"       "single"      198  1  -1  0  "unknown"  "no"

In [7]:
typeof.(data[1,:])

17-element Array{DataType,1}:
 Int64            
 SubString{String}
 SubString{String}
 SubString{String}
 SubString{String}
 Int64            
 SubString{String}
 SubString{String}
 SubString{String}
 Int64            
 SubString{String}
 Int64            
 Int64            
 Int64            
 Int64            
 SubString{String}
 SubString{String}

In [8]:
?preprocess

search: preprocess



```
preprocess(data, col[, levels]) -> Matrix{Float64}
```

Preprocesses the col-th column of data. If the list of levels is provided, preprocess is more efficient. When the levels of the selected column are nominal, preprocess returns a matrix of dummy indicator variables, one for each category found in the given column of the matrix data, except for the last level. The last level is omitted because it is uniquely determined by the others: including it would make the indicator columns sum to a constant column, so the resulting matrix would be singular when combined with an intercept. When the levels of the selected column are numerical, preprocess returns a vector of those values converted to Float64.
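
The dummy-variable scheme described above can be sketched as follows. This is not the `preprocess` defined in `banktest.jl`, whose source is not shown here; `dummy_encode` is a hypothetical helper that covers only the nominal case, and the default of taking the levels from `unique(column)` is an assumption.

```julia
# Hypothetical helper; this is NOT the notebook's `preprocess`.
# Builds one indicator column per level, omitting the last level.
function dummy_encode(column::AbstractVector, levels = unique(column))
    kept = levels[1:end-1]                        # drop the last level
    out = zeros(Float64, length(column), length(kept))
    for (j, level) in enumerate(kept)
        for (i, value) in enumerate(column)
            if value == level
                out[i, j] = 1.0                   # row i belongs to level j
            end
        end
    end
    return out
end

marital = ["married", "single", "married", "divorced"]
dummy_encode(marital, ["married", "single", "divorced"])
```

In this sketch a row of the omitted level ("divorced") is encoded as all zeros, which is why the last indicator column carries no extra information.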


In [9]:
D = buildmultivar(data, [1, 3])

45211×3 Array{Float64,2}:
 58.0  1.0  0.0
 44.0  0.0  1.0
 33.0  1.0  0.0
 47.0  1.0  0.0
 33.0  0.0  1.0
 35.0  1.0  0.0
 28.0  0.0  1.0
 42.0  0.0  0.0
 58.0  1.0  0.0
 43.0  0.0  1.0
 41.0  0.0  0.0
 29.0  0.0  1.0
 53.0  1.0  0.0
  ⋮            
 34.0  0.0  1.0
 38.0  1.0  0.0
 53.0  1.0  0.0
 34.0  0.0  1.0
 23.0  0.0  1.0
 73.0  1.0  0.0
 25.0  0.0  1.0
 51.0  1.0  0.0
 71.0  0.0  0.0
 72.0  1.0  0.0
 57.0  1.0  0.0
 37.0  1.0  0.0

In [10]:
using DataFrames

df = DataFrame(data, Symbol.(vec(data_names)))

Row,age,job,marital,education,default,balance,housing,loan,contact,day,month,duration,campaign,pdays,previous,poutcome,y
1,58,management,married,tertiary,no,2143,yes,no,unknown,5,may,261,1,-1,0,unknown,no
2,44,technician,single,secondary,no,29,yes,no,unknown,5,may,151,1,-1,0,unknown,no
3,33,entrepreneur,married,secondary,no,2,yes,yes,unknown,5,may,76,1,-1,0,unknown,no
4,47,blue-collar,married,unknown,no,1506,yes,no,unknown,5,may,92,1,-1,0,unknown,no
5,33,unknown,single,unknown,no,1,no,no,unknown,5,may,198,1,-1,0,unknown,no
6,35,management,married,tertiary,no,231,yes,no,unknown,5,may,139,1,-1,0,unknown,no
7,28,management,single,tertiary,no,447,yes,yes,unknown,5,may,217,1,-1,0,unknown,no
8,42,entrepreneur,divorced,tertiary,yes,2,yes,no,unknown,5,may,380,1,-1,0,unknown,no
9,58,retired,married,primary,no,121,yes,no,unknown,5,may,50,1,-1,0,unknown,no
10,43,technician,single,secondary,no,593,yes,no,unknown,5,may,55,1,-1,0,unknown,no


In [11]:
describe(df)

Row,variable,mean,min,median,max,nunique,nmissing,eltype
1,age,40.9362,18,,95,77,0,Any
2,job,,admin.,,unknown,12,0,Any
3,marital,,divorced,,single,3,0,Any
4,education,,primary,,unknown,4,0,Any
5,default,,no,,yes,2,0,Any
6,balance,1362.27,-8019,,102127,7168,0,Any
7,housing,,no,,yes,2,0,Any
8,loan,,no,,yes,2,0,Any
9,contact,,cellular,,unknown,3,0,Any
10,day,15.8064,1,,31,31,0,Any


In [12]:
tail(df)

Row,age,job,marital,education,default,balance,housing,loan,contact,day,month,duration,campaign,pdays,previous,poutcome,y
1,25,technician,single,secondary,no,505,no,yes,cellular,17,nov,386,2,-1,0,unknown,yes
2,51,technician,married,tertiary,no,825,no,no,cellular,17,nov,977,3,-1,0,unknown,yes
3,71,retired,divorced,primary,no,1729,no,no,cellular,17,nov,456,2,-1,0,unknown,yes
4,72,retired,married,secondary,no,5715,no,no,cellular,17,nov,1127,5,184,3,success,yes
5,57,blue-collar,married,secondary,no,668,no,no,telephone,17,nov,508,4,-1,0,unknown,no
6,37,entrepreneur,married,secondary,no,2971,no,no,cellular,17,nov,361,2,188,11,other,no
