In [70]:
%%shell
# Installation cell: download and unpack Julia only if it is not already on the PATH
if ! command -v julia > /dev/null 2>&1
then
    wget 'https://julialang-s3.julialang.org/bin/linux/x64/1.3/julia-1.3.1-linux-x86_64.tar.gz' \
        -O /tmp/julia.tar.gz
    tar -x -f /tmp/julia.tar.gz -C /usr/local --strip-components 1
    rm /tmp/julia.tar.gz
fi
julia -e 'using Pkg; pkg"add IJulia; precompile;"'
echo 'Done'

Unrecognized magic `%%shell`.

Julia does not use the IPython `%magic` syntax.   To interact with the IJulia kernel, use `IJulia.somefunction(...)`, for example.  Julia macros, string macros, and functions can be used to accomplish most of the other functionalities of IPython magics.


After you run the first cell (the cell directly above this text), go to Colab's Edit menu and select Notebook settings from the drop-down. Select *Julia 1.3* as the runtime and *GPU* as the hardware accelerator.

<br/>You should see something like this:

> ![alt text](https://drive.google.com/uc?id=1AeglaLmWI-zRXPCErofIZ4BH9zvPCwNy)
<br/>Click on SAVE
<br/>**We are ready to get going**





In [72]:
#Julia 1.3 Environment
using Pkg
pkg"add BenchmarkTools; precompile;"
pkg"add CuArrays; precompile;"

 Resolving package versions...
  Updating `~/.julia/environments/v1.3/Project.toml`
 [no changes]
  Updating `~/.julia/environments/v1.3/Manifest.toml`
 [no changes]
Precompiling project...
 Resolving package versions...
  Updating `~/.julia/environments/v1.3/Project.toml`
 [no changes]
  Updating `~/.julia/environments/v1.3/Manifest.toml`
 [no changes]
Precompiling project...


The main reason we are interested in running Julia on Colab is the GPU functionality. We have already installed libraries in the previous cell, so let's benchmark Colab's GPU performance:

In [73]:
using BenchmarkTools

mcpu = rand(2^10, 2^10)
@benchmark mcpu*mcpu

BenchmarkTools.Trial: 
  memory estimate:  8.00 MiB
  allocs estimate:  2
  --------------
  minimum time:     52.451 ms (0.00% GC)
  median time:      53.807 ms (0.00% GC)
  mean time:        55.979 ms (1.39% GC)
  maximum time:     83.227 ms (9.75% GC)
  --------------
  samples:          90
  evals/sample:     1

In [74]:
using CuArrays

mgpu = cu(mcpu)
@benchmark CuArrays.@sync mgpu*mgpu

BenchmarkTools.Trial: 
  memory estimate:  336 bytes
  allocs estimate:  7
  --------------
  minimum time:     395.547 μs (0.00% GC)
  median time:      448.346 μs (0.00% GC)
  mean time:        467.162 μs (0.13% GC)
  maximum time:     9.255 ms (32.84% GC)
  --------------
  samples:          10000
  evals/sample:     1

The CuArray multiplication should take around 1 ms or less (the median here is about 0.45 ms), much faster than the roughly 54 ms median on the CPU. If so, the GPU is working.
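
To make "much faster" concrete, the two median times reported above can be put on the same unit and divided. This is only arithmetic on the numbers already printed by `@benchmark`; the variable names are chosen here for illustration.

```julia
# Median times copied from the two BenchmarkTools reports above
cpu_median_ms = 53.807          # CPU: mcpu * mcpu
gpu_median_ms = 448.346 / 1000  # GPU: 448.346 μs converted to ms

speedup = cpu_median_ms / gpu_median_ms
println("GPU speedup: ", round(speedup, digits=1), "x")
```

On this run the ratio comes out at roughly 120x. Keep in mind that `cu` converts the matrix to `Float32`, so the two runs do not use the same precision.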

In [75]:
versioninfo()

Julia Version 1.3.1
Commit 2d5741174c (2019-12-30 21:36 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Xeon(R) CPU @ 2.20GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.1 (ORCJIT, broadwell)


# `*Starting JULIA with a bang!!*`

# ***Mini Projects cum Periodic Assessments***

## **Question 1:**

In [76]:
using Pkg
Pkg.add("CSV")

 Resolving package versions...
  Updating `~/.julia/environments/v1.3/Project.toml`
 [no changes]
  Updating `~/.julia/environments/v1.3/Manifest.toml`
 [no changes]


In [77]:
using CSV
Pkg.add("DataFrames")

 Resolving package versions...
  Updating `~/.julia/environments/v1.3/Project.toml`
 [no changes]
  Updating `~/.julia/environments/v1.3/Manifest.toml`
 [no changes]


In [78]:
using DataFrames

## **a. Load the Titanic dataset into a Julia DataFrame using the CSV package**

In [112]:
# Parse the file with CSV.File and materialize it as a DataFrame
Titanic_df = DataFrame(CSV.File("datasets_11657_16098_train.csv"))


│ Row │ PassengerId │ Survived │ Pclass │ Name                                                │
│     │ Int64       │ Int64    │ Int64  │ String                                              │
├─────┼─────────────┼──────────┼────────┼─────────────────────────────────────────────────────┤
│ 1   │ 1           │ 0        │ 3      │ Braund, Mr. Owen Harris                             │
│ 2   │ 2           │ 1        │ 1      │ Cumings, Mrs. John Bradley (Florence Briggs Thayer) │
│ 3   │ 3           │ 1        │ 3      │ Heikkinen, Miss. Laina                              │
│ 4   │ 4           │ 1        │ 1      │ Futrelle, Mrs. Jacques Heath (Lily May Peel)        │
│ 5   │ 5           │ 0        │ 3      │ Allen, Mr. William Henry                            │
│ 6   │ 6           │ 0        │ 3      │ Moran, Mr. James                                    │
│ 7   │ 7           │ 0        │ 1      │ McCarthy, Mr. Timothy J                             │
│ 8   │ 8           │ 0        │ 3      │ Palsson, Master. Gosta Leonard                      │
│ 9   │ 9           │ 1        │ 3      │ Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg)   │
│ 10  │ 10          │ 1        │ 2      │ Nasser, Mrs. Nicholas (Adele Achem)                 │


## **b. Explore and demonstrate: size, type of each column, description, and missing or null values**

In [113]:
println(Titanic_df)

891×12 DataFrame
│ Row │ PassengerId │ Survived │ Pclass │ Name                                                                               │ Sex    │ Age      │ SibSp │ Parch │ Ticket             │ Fare    │ Cabin           │ Embarked │
│     │ Int64       │ Int64    │ Int64  │ String                                                                             │ String │ Float64? │ Int64 │ Int64 │ String             │ Float64 │ String?         │ String?  │
├─────┼─────────────┼──────────┼────────┼────────────────────────────────────────────────────────────────────────────────────┼────────┼──────────┼───────┼───────┼────────────────────┼─────────┼─────────────────┼──────────┤
│ 1   │ 1           │ 0        │ 3      │ Braund, Mr. Owen Harris                                                            │ male   │ 22.0     │ 1     │ 0     │ A/5 21171          │ 7.25    │ 

In [114]:
size(Titanic_df)

(891, 12)

In [115]:
names(Titanic_df)

12-element Array{String,1}:
 "PassengerId"
 "Survived"   
 "Pclass"     
 "Name"       
 "Sex"        
 "Age"        
 "SibSp"      
 "Parch"      
 "Ticket"     
 "Fare"       
 "Cabin"      
 "Embarked"   

In [116]:
eltype.(eachcol(Titanic_df))

12-element Array{Type,1}:
 Int64                  
 Int64                  
 Int64                  
 String                 
 String                 
 Union{Missing, Float64}
 Int64                  
 Int64                  
 String                 
 Float64                
 Union{Missing, String} 
 Union{Missing, String} 

In [117]:
println(describe(Titanic_df))

12×8 DataFrame
│ Row │ variable    │ mean     │ min                 │ median  │ max                         │ nunique │ nmissing │ eltype                  │
│     │ Symbol      │ Union…   │ Any                 │ Union…  │ Any                         │ Union…  │ Union…   │ Type                    │
├─────┼─────────────┼──────────┼─────────────────────┼─────────┼─────────────────────────────┼─────────┼──────────┼─────────────────────────┤
│ 1   │ PassengerId │ 446.0    │ 1                   │ 446.0   │ 891                         │         │          │ Int64                   │
│ 2   │ Survived    │ 0.383838 │ 0                   │ 0.0     │ 1                           │         │          │ Int64                   │
│ 3   │ Pclass      │ 2.30864  │ 1                   │ 3.0     │ 3                           │         │          │ Int64                   │
│ 4   │ Name        │          │ Abbing, Mr. Anthony 

In [118]:
println(first(Titanic_df,10))

10×12 DataFrame
│ Row │ PassengerId │ Survived │ Pclass │ Name                                                │ Sex    │ Age      │ SibSp │ Parch │ Ticket           │ Fare    │ Cabin   │ Embarked │
│     │ Int64       │ Int64    │ Int64  │ String                                              │ String │ Float64? │ Int64 │ Int64 │ String           │ Float64 │ String? │ String?  │
├─────┼─────────────┼──────────┼────────┼─────────────────────────────────────────────────────┼────────┼──────────┼───────┼───────┼──────────────────┼─────────┼─────────┼──────────┤
│ 1   │ 1           │ 0        │ 3      │ Braund, Mr. Owen Harris                             │ male   │ 22.0     │ 1     │ 0     │ A/5 21171        │ 7.25    │ missing │ S        │
│ 2   │ 2           │ 1        │ 1      │ Cumings, Mrs. John Bradley (Florence Briggs Thayer) │ female │ 38.0     │ 1     │ 0 

In [119]:
println(last(Titanic_df,10))

10×12 DataFrame
│ Row │ PassengerId │ Survived │ Pclass │ Name                                     │ Sex    │ Age      │ SibSp │ Parch │ Ticket           │ Fare    │ Cabin   │ Embarked │
│     │ Int64       │ Int64    │ Int64  │ String                                   │ String │ Float64? │ Int64 │ Int64 │ String           │ Float64 │ String? │ String?  │
├─────┼─────────────┼──────────┼────────┼──────────────────────────────────────────┼────────┼──────────┼───────┼───────┼──────────────────┼─────────┼─────────┼──────────┤
│ 1   │ 882         │ 0        │ 3      │ Markun, Mr. Johann                       │ male   │ 33.0     │ 0     │ 0     │ 349257           │ 7.8958  │ missing │ S        │
│ 2   │ 883         │ 0        │ 3      │ Dahlberg, Miss. Gerda Ulrika             │ female │ 22.0     │ 0     │ 0     │ 7552             │ 10.5167 │ missing │ 

In [121]:
println(ismissing.(Titanic_df))

891×12 DataFrame
│ Row │ PassengerId │ Survived │ Pclass │ Name │ Sex  │ Age  │ SibSp │ Parch │ Ticket │ Fare │ Cabin │ Embarked │
│     │ Bool        │ Bool     │ Bool   │ Bool │ Bool │ Bool │ Bool  │ Bool  │ Bool   │ Bool │ Bool  │ Bool     │
├─────┼─────────────┼──────────┼────────┼──────┼──────┼──────┼───────┼───────┼────────┼──────┼───────┼──────────┤
│ 1   │ 0           │ 0        │ 0      │ 0    │ 0    │ 0    │ 0     │ 0     │ 0      │ 0    │ 1     │ 0        │
│ 2   │ 0           │ 0        │ 0      │ 0    │ 0    │ 0    │ 0     │ 0     │ 0      │ 0    │ 0     │ 0        │
│ 3   │ 0           │ 0        │ 0      │ 0    │ 0    │ 0    │ 0     │ 0     │ 0      │ 0    │ 1     │ 0        │
│ 4   │ 0           │ 0        │ 0      │ 0    │ 0    │ 0    │ 0     │ 0     │ 0      │ 0    │ 0     │ 0        │
│ 5   │ 0           │ 0        │ 0      │ 0    │ 0    │ 0    │ 0 

In [None]:
# Pass the DataFrame itself; the quoted name would only test a string literal
isempty(Titanic_df)

false
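
The element-wise `ismissing.(Titanic_df)` table above is hard to read at 891 rows. A per-column count is usually a more compact check. The sketch below uses a few toy columns in a named tuple (the values are illustrative, not taken from the full dataset) so that it runs without any packages.

```julia
# Toy columns standing in for Age, Cabin and Fare (values are illustrative)
cols = (
    Age   = [22.0, 38.0, missing, 35.0],
    Cabin = [missing, "C85", missing, "C123"],
    Fare  = [7.25, 71.28, 7.92, 53.1],
)

# count(ismissing, col) returns how many entries of a column are missing
missing_counts = map(col -> count(ismissing, col), cols)
println(missing_counts)

# With the DataFrame from this notebook the same idea would read:
# [count(ismissing, col) for col in eachcol(Titanic_df)]
```

The `nmissing` column of `describe(Titanic_df)` reports the same information for columns whose element type allows `missing`.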

## **c. Create a separate DataFrame for categorical features (those other than numeric features)**

In [125]:
println(skipmissing(Titanic_df))

Base.SkipMissing{DataFrame}(891×12 DataFrame
│ Row │ PassengerId │ Survived │ Pclass │ Name                                                                               │ Sex    │ Age      │ SibSp │ Parch │ Ticket             │ Fare    │ Cabin           │ Embarked │
│     │ Int64       │ Int64    │ Int64  │ String                                                                             │ String │ Float64? │ Int64 │ Int64 │ String             │ Float64 │ String?         │ String?  │
├─────┼─────────────┼──────────┼────────┼────────────────────────────────────────────────────────────────────────────────────┼────────┼──────────┼───────┼───────┼────────────────────┼─────────┼─────────────────┼──────────┤
│ 1   │ 1           │ 0        │ 3      │ Braund, Mr. Owen Harris                                                            │ male   │ 22.0     │ 1     │ 0     │ A/5

In [144]:
CategoricalArrays.categorical(Titanic_df)

│ Row │ PassengerId │ Survived │ Pclass │ Name                                                │
│     │ Int64       │ Int64    │ Int64  │ Cat…                                                │
├─────┼─────────────┼──────────┼────────┼─────────────────────────────────────────────────────┤
│ 1   │ 1           │ 0        │ 3      │ Braund, Mr. Owen Harris                             │
│ 2   │ 2           │ 1        │ 1      │ Cumings, Mrs. John Bradley (Florence Briggs Thayer) │
│ 3   │ 3           │ 1        │ 3      │ Heikkinen, Miss. Laina                              │
│ 4   │ 4           │ 1        │ 1      │ Futrelle, Mrs. Jacques Heath (Lily May Peel)        │
│ 5   │ 5           │ 0        │ 3      │ Allen, Mr. William Henry                            │
│ 6   │ 6           │ 0        │ 3      │ Moran, Mr. James                                    │
│ 7   │ 7           │ 0        │ 1      │ McCarthy, Mr. Timothy J                             │
│ 8   │ 8           │ 0        │ 3      │ Palsson, Master. Gosta Leonard                      │
│ 9   │ 9           │ 1        │ 3      │ Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg)   │
│ 10  │ 10          │ 1        │ 2      │ Nasser, Mrs. Nicholas (Adele Achem)                 │
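
The two cells above convert string columns to categorical in place but do not produce a separate table of non-numeric columns. One way to do that is to test each column's element type. The sketch below shows the test on toy columns (illustrative values, plain named tuple, no packages); `is_numeric` is a helper name chosen here, not a library function.

```julia
# Toy columns standing in for a few Titanic columns (values are illustrative)
cols = (
    PassengerId = [1, 2, 3],
    Sex         = ["male", "female", "female"],
    Age         = [22.0, 38.0, missing],
    Cabin       = [missing, "C85", missing],
)

# A column is numeric if its element type, ignoring Missing, is a Number
is_numeric(col) = nonmissingtype(eltype(col)) <: Number

categorical_names = [name for (name, col) in pairs(cols) if !is_numeric(col)]
println(categorical_names)

# With the DataFrame from this notebook, the same predicate could drive the split:
# keep = [n for n in names(Titanic_df) if !is_numeric(Titanic_df[!, n])]
# Titanic_cat = Titanic_df[:, keep]
```

Note that this rule is purely type-based: integer-coded categories such as `Survived` and `Pclass` count as numeric and would have to be added to the list by hand if they are wanted.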


## **d. Sort the DataFrame by name**

In [140]:
println(sort(Titanic_df, :Name))

891×12 DataFrame
│ Row │ PassengerId │ Survived │ Pclass │ Name                                                                               │ Sex    │ Age      │ SibSp │ Parch │ Ticket             │ Fare    │ Cabin           │ Embarked │
│     │ Int64       │ Int64    │ Int64  │ String                                                                             │ String │ Float64? │ Int64 │ Int64 │ String             │ Float64 │ String?         │ String?  │
├─────┼─────────────┼──────────┼────────┼────────────────────────────────────────────────────────────────────────────────────┼────────┼──────────┼───────┼───────┼────────────────────┼─────────┼─────────────────┼──────────┤
│ 1   │ 846         │ 0        │ 3      │ Abbing, Mr. Anthony                                                                │ male   │ 42.0     │ 0     │ 0     │ C.A. 5547          │ 7.55    │ 

## **e. Analysis of the Survived column by category (survived, died)**

In [None]:
# Describe the Survived data: count the passengers in each category
using DataFrames
Survi = combine(groupby(Titanic_df, :Survived, sort=true, skipmissing=true), nrow)
println(Survi)

In [None]:
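# Relative frequency of each category (0 = died, 1 = survived)
n_died = count(==(0), Titanic_df.Survived)
n_survived = count(==(1), Titanic_df.Survived)
# Survi is sorted by Survived, so the order is [died, survived]
Survi.freq = [n_died, n_survived] ./ nrow(Titanic_df)
println(Survi)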

In [None]:
#rename!(Survi,:nrow=>:count)
println(Survi)
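
The counting and relative-frequency step used in this part (and again for Pclass and Sex below) can be sketched without any packages. The input here is the `Survived` value of the first ten passengers shown earlier; `category_counts` is a hypothetical helper name.

```julia
# Survived values of the first ten passengers shown earlier in this notebook
survived = [0, 1, 1, 1, 0, 0, 0, 0, 1, 1]

# Count the occurrences of each distinct value
function category_counts(values)
    counts = Dict{eltype(values),Int}()
    for v in values
        counts[v] = get(counts, v, 0) + 1
    end
    return counts
end

counts = category_counts(survived)
# Relative frequency = count divided by the total number of rows
freqs = Dict(k => n / length(survived) for (k, n) in counts)
println(counts)
println(freqs)
```

For a 0/1 column the mean equals the fraction of ones, so the mean of `Survived` in the `describe` output above (0.383838) is already the overall survival rate.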

## **f. Analysis of the Pclass column by category (first, second, third class)**

In [None]:
# Count the passengers in each passenger class
class = combine(groupby(Titanic_df, :Pclass, sort=true, skipmissing=true), nrow)

rename!(class,:nrow=>:count)
println(class)

In [None]:
# Relative frequency of each passenger class; class is sorted by Pclass (1, 2, 3)
class_counts = [count(==(k), Titanic_df.Pclass) for k in 1:3]
class.freq = class_counts ./ nrow(Titanic_df)

println(class)

## **g. Analysis of the Sex column by category (male, female); the result should be the same as above**

In [None]:
# Count the passengers of each sex
sex = combine(groupby(Titanic_df, :Sex), nrow)
println(sex)


2×2 DataFrame
│ Row │ Sex    │ nrow  │
│     │ String │ Int64 │
├─────┼────────┼───────┤
│ 1   │ male   │ 577   │
│ 2   │ female │ 314   │


# ***2. Create the DataFrame given below and perform the operations that follow.***

In [79]:
Air_df=DataFrame(From_To= ["LoNDon_paris", "MAdrid_miLAN", "londON_StockhOlm","Budapest_PaRis","Brussels_londOn"],
FlightNumber=[10045,NaN,10065,NaN,10085],
RecentDelays=[[23,47],"[]",[24,43,87],[13],[67,32]],
Airline=["KLM(!)","<Air France> (12)","(British Airways. )",
"12. Air France","Swiss Air"]);

In [80]:
println(Air_df)

5×4 DataFrame
│ Row │ From_To          │ FlightNumber │ RecentDelays │ Airline             │
│     │ String           │ Float64      │ Any          │ String              │
├─────┼──────────────────┼──────────────┼──────────────┼─────────────────────┤
│ 1   │ LoNDon_paris     │ 10045.0      │ [23, 47]     │ KLM(!)              │
│ 2   │ MAdrid_miLAN     │ NaN          │ []           │ <Air France> (12)   │
│ 3   │ londON_StockhOlm │ 10065.0      │ [24, 43, 87] │ (British Airways. ) │
│ 4   │ Budapest_PaRis   │ NaN          │ [13]         │ 12. Air France      │
│ 5   │ Brussels_londOn  │ 10085.0      │ [67, 32]     │ Swiss Air           │


### a. Some values in the FlightNumber column are missing. These numbers are meant to increase by 10 with each row, so 10055 and 10075 need to be put in place. Fill in these missing numbers and make the column an integer column (instead of a float column).

In [81]:
Air_df[2,:FlightNumber] = 10055.0    
Air_df[4,:FlightNumber] = 10075.0  

10075.0

In [82]:
println(Air_df)

5×4 DataFrame
│ Row │ From_To          │ FlightNumber │ RecentDelays │ Airline             │
│     │ String           │ Float64      │ Any          │ String              │
├─────┼──────────────────┼──────────────┼──────────────┼─────────────────────┤
│ 1   │ LoNDon_paris     │ 10045.0      │ [23, 47]     │ KLM(!)              │
│ 2   │ MAdrid_miLAN     │ 10055.0      │ []           │ <Air France> (12)   │
│ 3   │ londON_StockhOlm │ 10065.0      │ [24, 43, 87] │ (British Airways. ) │
│ 4   │ Budapest_PaRis   │ 10075.0      │ [13]         │ 12. Air France      │
│ 5   │ Brussels_londOn  │ 10085.0      │ [67, 32]     │ Swiss Air           │
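
The cells above fill the two gaps by hand and leave the column as `Float64`. A sketch that derives each missing number from the previous row and then converts to integers could look like this. `fill_sequence` is a hypothetical helper, and it assumes the first entry is not `NaN`.

```julia
# Hypothetical helper: replace each NaN with the previous value plus `step`,
# then convert to Int. Assumes the first entry is not NaN.
function fill_sequence(values; step=10)
    filled = copy(values)
    for i in 2:length(filled)
        if isnan(filled[i])
            filled[i] = filled[i-1] + step
        end
    end
    return Int.(filled)
end

flight_numbers = fill_sequence([10045, NaN, 10065, NaN, 10085])
println(flight_numbers)

# Applied to the notebook's table this could read:
# Air_df.FlightNumber = fill_sequence(Air_df.FlightNumber)
```

`Int.(...)` only succeeds when every value is a whole number, which doubles as a check that no `NaN` is left.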


### b. The From_To column would be better as two separate columns! Split each string on the underscore delimiter _ to give a new temporary DataFrame with the correct values. Assign the correct column names to this temporary DataFrame.

In [83]:
data = split.(Air_df.From_To, '_')

5-element Array{Array{SubString{String},1},1}:
 ["LoNDon", "paris"]    
 ["MAdrid", "miLAN"]    
 ["londON", "StockhOlm"]
 ["Budapest", "PaRis"]  
 ["Brussels", "londOn"] 

In [84]:
foreach(enumerate([:From, :To])) do (i, n)
    Air_df[!, n] = getindex.(data, i)
end

In [85]:
println(Air_df)

5×6 DataFrame
│ Row │ From_To          │ FlightNumber │ RecentDelays │ Airline             │ From     │ To        │
│     │ String           │ Float64      │ Any          │ String              │ SubStri… │ SubStrin… │
├─────┼──────────────────┼──────────────┼──────────────┼─────────────────────┼──────────┼───────────┤
│ 1   │ LoNDon_paris     │ 10045.0      │ [23, 47]     │ KLM(!)              │ LoNDon   │ paris     │
│ 2   │ MAdrid_miLAN     │ 10055.0      │ []           │ <Air France> (12)   │ MAdrid   │ miLAN     │
│ 3   │ londON_StockhOlm │ 10065.0      │ [24, 43, 87] │ (British Airways. ) │ londON   │ StockhOlm │
│ 4   │ Budapest_PaRis   │ 10075.0      │ [13]         │ 12. Air France      │ Budapest │ PaRis     │
│ 5   │ Brussels_londOn  │ 10085.0      │ [67, 32]     │ Swiss Air           │ Brussels │ londOn    │
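
As the column types above show, `split` yields `SubString` values. If plain `String` columns are preferred, the pieces can be converted while they are extracted. A minimal sketch on three of the route strings, with variable names chosen here for illustration:

```julia
routes = ["LoNDon_paris", "MAdrid_miLAN", "londON_StockhOlm"]

parts = split.(routes, '_')       # each element is a 2-element vector of SubString
from  = String.(first.(parts))    # text before the underscore, as Vector{String}
to    = String.(last.(parts))     # text after the underscore, as Vector{String}
println(from, " -> ", to)
```

This relies on every route containing exactly one underscore; with more pieces, `last` would pick the final one.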


### c. Notice how the capitalisation of the city names is all mixed up in this temporary DataFrame. Standardise the strings so that only the first letter is uppercase (e.g. "londON" should become "London").

In [86]:
From=[]
To=[]
for i in Air_df[!,:From]
   push!(From,lowercase(i))
end
println(From)
Air_df.From=From
for i in Air_df[!,:To]
   push!(To,lowercase(i))
end
println(To)
Air_df.To=To

Any["london", "madrid", "london", "budapest", "brussels"]
Any["paris", "milan", "stockholm", "paris", "london"]


5-element Array{Any,1}:
 "paris"    
 "milan"    
 "stockholm"
 "paris"    
 "london"   

In [123]:
From=[]
To=[]
for i in Air_df[!,:From]
   push!(From,uppercasefirst(i))
end
println(From)
Air_df.From=From
for i in Air_df[!,:To]
   push!(To,uppercasefirst(i))
end
println(To)
Air_df.To=To

Any["London", "Madrid", "London", "Budapest", "Brussels"]
Any["Paris", "Milan", "Stockholm", "Paris", "London"]


5-element Array{Any,1}:
 "Paris"    
 "Milan"    
 "Stockholm"
 "Paris"    
 "London"   

In [124]:
println(Air_df)

5×5 DataFrame
│ Row │ FlightNumber │ RecentDelays │ Airline             │ From     │ To        │
│     │ Float64      │ Any          │ String              │ Any      │ Any       │
├─────┼──────────────┼──────────────┼─────────────────────┼──────────┼───────────┤
│ 1   │ 10045.0      │ [23, 47]     │ KLM(!)              │ London   │ Paris     │
│ 2   │ 10055.0      │ []           │ <Air France> (12)   │ Madrid   │ Milan     │
│ 3   │ 10065.0      │ [24, 43, 87] │ (British Airways. ) │ London   │ Stockholm │
│ 4   │ 10075.0      │ [13]         │ 12. Air France      │ Budapest │ Paris     │
│ 5   │ 10085.0      │ [67, 32]     │ Swiss Air           │ Brussels │ London    │
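
The two loops used for the capitalisation step build `Any` vectors, which is why the `From` and `To` columns show the type `Any`. Broadcasting the same two functions gives the same strings in one line and keeps the element type as `String`. A minimal sketch on the `From` values:

```julia
cities = ["LoNDon", "MAdrid", "londON", "Budapest", "Brussels"]

# Lowercase everything, then capitalise only the first character
normalized = uppercasefirst.(lowercase.(cities))
println(normalized)

# Applied to the notebook's table this could read:
# Air_df.From = uppercasefirst.(lowercase.(Air_df.From))
# Air_df.To   = uppercasefirst.(lowercase.(Air_df.To))
```

`uppercasefirst` touches only the first character of the whole string; for multi-word city names, `titlecase` may be the better fit.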


## d. Delete the From_To column from df and attach the temporary DataFrame from the previous questions.


In [88]:
# Drop the From_To column by name (deletecols! is deprecated)
select!(Air_df, Not(:From_To))


│ Row │ FlightNumber │ RecentDelays │ Airline             │ From     │ To        │
│     │ Float64      │ Any          │ String              │ Any      │ Any       │
├─────┼──────────────┼──────────────┼─────────────────────┼──────────┼───────────┤
│ 1   │ 10045.0      │ [23, 47]     │ KLM(!)              │ london   │ paris     │
│ 2   │ 10055.0      │ []           │ <Air France> (12)   │ madrid   │ milan     │
│ 3   │ 10065.0      │ [24, 43, 87] │ (British Airways. ) │ london   │ stockholm │
│ 4   │ 10075.0      │ [13]         │ 12. Air France      │ budapest │ paris     │
│ 5   │ 10085.0      │ [67, 32]     │ Swiss Air           │ brussels │ london    │


In [89]:
println(Air_df)

5×5 DataFrame
│ Row │ FlightNumber │ RecentDelays │ Airline             │ From     │ To        │
│     │ Float64      │ Any          │ String              │ Any      │ Any       │
├─────┼──────────────┼──────────────┼─────────────────────┼──────────┼───────────┤
│ 1   │ 10045.0      │ [23, 47]     │ KLM(!)              │ london   │ paris     │
│ 2   │ 10055.0      │ []           │ <Air France> (12)   │ madrid   │ milan     │
│ 3   │ 10065.0      │ [24, 43, 87] │ (British Airways. ) │ london   │ stockholm │
│ 4   │ 10075.0      │ [13]         │ 12. Air France      │ budapest │ paris     │
│ 5   │ 10085.0      │ [67, 32]     │ Swiss Air           │ brussels │ london    │


## e. In the RecentDelays column, the values have been entered into the DataFrame as a list. We would like each first value in its own column, each second value in its own column, and so on. If there isn't an Nth value, the value should be NaN.

In [126]:
for (a,b) in split.(Air_df.RecentDelays)

LoadError: ignored

In [127]:
for i in 2:3
       Air_df[:, "RD$i"] = getindex.(Air_df.RecentDelays, i)
end

BoundsError: ignored

In [128]:
Pkg.add("Unitful")
using Unitful

 Resolving package versions...
 Installed ConstructionBase ─ v1.0.0
 Installed Unitful ────────── v1.4.0
  Updating `~/.julia/environments/v1.3/Project.toml`
  [1986cc42] + Unitful v1.4.0
  Updating `~/.julia/environments/v1.3/Manifest.toml`
  [187b0558] + ConstructionBase v1.0.0
  [1986cc42] + Unitful v1.4.0


┌ Info: Precompiling Unitful [1986cc42-f94f-5a68-af5c-568840ba703d]
└ @ Base loading.jl:1273


In [129]:
RD_test = for i in Air_df.RecentDelays 

LoadError: ignored

In [130]:
eltype.(eachcol(Air_df.RecentDelays))

1-element Array{DataType,1}:
 Any

In [131]:
Air_df[!,:RecentDelays] = convert.(Array{Int64},Air_df[!,:RecentDelays])

MethodError: ignored

In [132]:
 RecentDelays = [ ] 
  for i in Air_df[!,:RecentDelays]
   Air_df[!,:RecentDelays] = convert.(Array{Int64},Air_df[!,:RecentDelays])
end
print(Air_df.RecentDelays)

MethodError: ignored

In [133]:

for row in eachrow(describe(Air_df))
    if row[:eltype] === Any
         Air_df[:, "RD$row"] = getindex.(Air_df.RecentDelays, row)
     end
end        

ArgumentError: ignored

In [134]:
Air_df[!,:RecentDelays] = map(x -> ismissing(x) ? NA : convert(Array, x), Air_df[!,:RecentDelays])

MethodError: ignored

In [135]:
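# Split seq into num roughly equal consecutive chunks
function chunkIt(seq, num)
    avg = length(seq) / num
    out = []
    last = 0.0

    while last < length(seq)
        # Julia ranges are 1-based and inclusive, so shift the start by one
        stop = min(floor(Int, last + avg), length(seq))
        push!(out, seq[floor(Int, last)+1:stop])
        last += avg
    end
    return out
end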

chunkIt (generic function with 1 method)

In [139]:
for row in eachrow(describe(Air_df))
    if row[:eltype] === Any
         Air_df[:, "RD$row"] = chunkIt(Air_df[row,:RecentDelays], 1) 
     end
end 

MethodError: ignored

In [137]:
for row in eachrow(describe(Air_df))
    if row[:eltype] === Any
         Air_df[!, "RD$row"] = getindex(Air_df.RecentDelays, row)
     end
end

ArgumentError: ignored

In [138]:
println(Air_df)

5×5 DataFrame
│ Row │ FlightNumber │ RecentDelays │ Airline             │ From     │ To        │
│     │ Float64      │ Any          │ String              │ Any      │ Any       │
├─────┼──────────────┼──────────────┼─────────────────────┼──────────┼───────────┤
│ 1   │ 10045.0      │ [23, 47]     │ KLM(!)              │ London   │ Paris     │
│ 2   │ 10055.0      │ []           │ <Air France> (12)   │ Madrid   │ Milan     │
│ 3   │ 10065.0      │ [24, 43, 87] │ (British Airways. ) │ London   │ Stockholm │
│ 4   │ 10075.0      │ [13]         │ 12. Air France      │ Budapest │ Paris     │
│ 5   │ 10085.0      │ [67, 32]     │ Swiss Air           │ Brussels │ London    │
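
The attempts in this part fail for two reasons. `getindex.(Air_df.RecentDelays, i)` throws a `BoundsError` for any row with fewer than `i` delays, and row 2 holds the string `"[]"` rather than an empty vector, which is also why the column type is `Any`. A minimal sketch that handles both is shown below. It works on a plain vector copied from the table, and `as_delays` and the `delay_k` column names are hypothetical choices.

```julia
# The RecentDelays column as entered above; note the string "[]" in row 2
recent_delays = Any[[23, 47], "[]", [24, 43, 87], [13], [67, 32]]

# Treat anything that is not a vector (such as the string "[]") as no delays
as_delays(x) = x isa AbstractVector ? x : Int[]
delays = as_delays.(recent_delays)

# Number of new columns = length of the longest delay list
width = maximum(length, delays)

# delay_columns[k][r] is the k-th delay of row r, or NaN if row r has fewer than k
delay_columns = [[k <= length(d) ? Float64(d[k]) : NaN for d in delays] for k in 1:width]

for (k, col) in enumerate(delay_columns)
    println("delay_", k, " = ", col)
end

# Attaching them to the notebook's table could then read:
# for (k, col) in enumerate(delay_columns)
#     Air_df[!, "delay_$k"] = col
# end
```

The values are stored as `Float64` because `NaN` is a floating-point value; using `missing` instead would allow the columns to stay integer-typed.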
