# How to initialize empty columns in a dataframe

When working with dataframes, we sometimes need to initialize empty columns that will be filled in later.

In [43]:
using DataFrames

In [52]:
name = ["Julia", "Mike", "Tom", "John"]
x = [2, 3, 4, 7]
y = [9, 3, 6, 5]
df = DataFrame(:name => name, :var1 => x, :var2 => y)

| Row | name | var1 | var2 |
|-----|--------|-------|-------|
| | String | Int64 | Int64 |
| 1 | Julia | 2 | 9 |
| 2 | Mike | 3 | 3 |
| 3 | Tom | 4 | 6 |
| 4 | John | 7 | 5 |


Now, we want an empty column called `var3`. How can we do that?

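An empty vector is easy to create, for example `v = Vector[]`, which is a zero-length vector whose element type is `Vector`. However, assigning it as a new column of `df` throws an error, because a new column must have exactly as many elements as the dataframe has rows: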

In [53]:
df[:, :var3] = Vector[]

LoadError: ArgumentError: New columns must have the same length as old columns
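
The mismatch behind this error can be checked directly. The following is a minimal sketch that rebuilds a smaller version of the dataframe so it runs on its own; the names `empty_col` and `err` are introduced here only for illustration.

```julia
using DataFrames

df = DataFrame(name = ["Julia", "Mike", "Tom", "John"], var1 = [2, 3, 4, 7])

empty_col = Vector[]           # zero-length vector
length(empty_col), nrow(df)    # the two lengths differ: 0 elements vs. 4 rows

# Catch the error instead of letting it abort the cell
err = try
    df[:, :var3] = empty_col
    nothing
catch e
    e
end

err isa ArgumentError          # the same error type reported above
```

Because the assignment fails, `df` is left unchanged and still has no `var3` column.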

The fix is to allocate an uninitialized vector with one element per row of `df`:

In [54]:
df[:, :var3] = Vector{String}(undef, nrow(df))

4-element Vector{String}:
 #undef
 #undef
 #undef
 #undef

In [55]:
df

| Row | name | var1 | var2 | var3 |
|-----|--------|-------|-------|--------|
| | String | Int64 | Int64 | String |
| 1 | Julia | 2 | 9 | #undef |
| 2 | Mike | 3 | 3 | #undef |
| 3 | Tom | 4 | 6 | #undef |
| 4 | John | 7 | 5 | #undef |
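
An `#undef` entry is a slot that holds no value yet, and reading it before assignment throws an `UndefRefError`. The sketch below shows one way to check and fill such a column; it rebuilds a one-column dataframe so it runs on its own, and the uppercase rule used to fill `var3` is made up for illustration.

```julia
using DataFrames

df = DataFrame(name = ["Julia", "Mike", "Tom", "John"])
df[!, :var3] = Vector{String}(undef, nrow(df))

isassigned(df.var3, 1)    # false: slot 1 holds no value yet

# Fill the column element by element, here with a made-up rule
for i in 1:nrow(df)
    df.var3[i] = uppercase(df.name[i])
end

# Check that every slot has now been set
all(i -> isassigned(df.var3, i), 1:nrow(df))
```

Here `df[!, :var3] = ...` stores the vector without copying it, whereas `df[:, :var3] = ...` as used above stores a copy. Either form works for creating the column.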


## How to add multiple columns with a `Tuple`

Sometimes we want to add several columns at once. A tuple of (name, type) pairs makes this straightforward.

First, create a tuple of tuples, each holding a column name and its element type:

In [56]:
vAndT = ((:var4, String), (:var5, Union{Missing, Int64}), (:var6, Union{Missing, String}))
# v stands for variable, t stands for type

((:var4, String), (:var5, Union{Missing, Int64}), (:var6, Union{Missing, String}))

In [57]:
typeof(vAndT)

Tuple{Tuple{Symbol, DataType}, Tuple{Symbol, Union}, Tuple{Symbol, Union}}

In [58]:
for (v, t) in vAndT
    df[:, v] = Vector{t}(undef, nrow(df))
end

In [59]:
df

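| Row | name | var1 | var2 | var3 | var4 | var5 | var6 |
|-----|--------|-------|-------|--------|--------|---------|---------|
| | String | Int64 | Int64 | String | String | Int64? | String? |
| 1 | Julia | 2 | 9 | #undef | #undef | missing | #undef |
| 2 | Mike | 3 | 3 | #undef | #undef | missing | #undef |
| 3 | Tom | 4 | 6 | #undef | #undef | missing | #undef |
| 4 | John | 7 | 5 | #undef | #undef | missing | #undef |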
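
In the output above, `var5` shows `missing` rather than `#undef`. The Julia manual notes that `undef` with a `Union{Missing, T}` element type may currently yield `missing`, but that this should not be relied on. When the columns are meant to allow missing values, an alternative is to fill them with `missing` explicitly. The following is a minimal sketch of that variant, not the method used above; the `specs` tuple and its column types are hypothetical.

```julia
using DataFrames

df = DataFrame(name = ["Julia", "Mike", "Tom", "John"])

# Hypothetical column specs: name => element type (before allowing missing)
specs = (:var4 => String, :var5 => Int64, :var6 => Float64)

for (col, T) in specs
    # Every slot starts as `missing`, so the column can be read safely right away
    df[!, col] = Vector{Union{Missing, T}}(missing, nrow(df))
end

df.var5[1] = 10    # values can be filled in later
```

With this approach no entry is ever `#undef`, so displaying or reading the column before it is filled cannot throw an `UndefRefError`.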
