# A deep dive into DataFrames.jl indexing
# Part 2: implementation of indexing in DataFrames.jl
### Bogumił Kamiński

This part does not cover every aspect of how indexing is implemented in DataFrames.jl; instead, I focus on scenarios that are non-obvious (at least to me).

This tutorial is tested under Julia 1.8.2.

In general, to support indexing and broadcasting for your own type, you should follow the instructions in the [Julia manual](https://docs.julialang.org/en/v1/).

In effect, this tutorial is mostly about how you can dig into what Julia does under the hood when it processes your code.

I also hope it will show package developers how hard it is to define custom types that fully support indexing and broadcasting.

Finally, this notebook is more advanced and refers to the source code a lot. I expect that it will be hard to follow without watching the video recording of the tutorial from JuliaCon 2020.

In [1]:
using DataFrames

#### Example 1: Consequences of the fact that `DataFrame` can be resized

In [2]:
df = DataFrame()

In [3]:
size(df)

(0, 0)

`size` reports that the number of rows is `0`, but for `setindex!` and `setproperty!` the number of rows of a data frame with no columns is treated as *undefined*, so a column of any length can be added:

In [4]:
df.x = [1, 2, 3]

3-element Vector{Int64}:
 1
 2
 3

In [5]:
df

3×1 DataFrame
 Row │ x
     │ Int64
─────┼───────
   1 │     1
   2 │     2
   3 │     3


In [6]:
df.y = [1, 2]

LoadError: ArgumentError: New columns must have the same length as old columns

In [7]:
@less df.y = [1, 2]

Base.setproperty!(df::DataFrame, col_ind::Symbol, v::AbstractVector) =
    (df[!, col_ind] = v)
Base.setproperty!(df::DataFrame, col_ind::AbstractString, v::AbstractVector) =
    (df[!, col_ind] = v)
Base.setproperty!(::DataFrame, col_ind::Symbol, v::Any) =
    throw(ArgumentError("It is only allowed to pass a vector as a column of a DataFrame. " *
                        "Instead use `df[!, col_ind] .= v` if you want to use broadcasting."))
Base.setproperty!(::DataFrame, col_ind::AbstractString, v::Any) =
    throw(ArgumentError("It is only allowed to pass a vector as a column of a DataFrame. " *
                        "Instead use `df[!, col_ind] .= v` if you want to use broadcasting."))

In [8]:
@less df[!, :y] = [1, 2]

function Base.setindex!(df::DataFrame, v::AbstractVector, ::typeof(!), col_ind::ColumnIndex)
    insert_single_column!(df, v, col_ind)
    return df
end

In [9]:
@less DataFrames.insert_single_column!(df, [1, 2], :y)

function insert_single_column!(df::DataFrame, v::AbstractVector, col_ind::ColumnIndex)
    if ncol(df) != 0 && nrow(df) != length(v)
        throw(ArgumentError("New columns must have the same length as old columns"))
    end
    dv = isa(v, AbstractRange) ? collect(v) : v
    firstindex(dv) != 1 && _onebased_check_error()

    if haskey(index(df), col_ind)
        j = index(df)[col_ind]
        _columns(df)[j] = dv
    else
        if col_ind isa SymbolOrString
            push!(index(df), Symbol(col_ind))
            push!(_columns(df), dv)
        else
            throw(ArgumentError("Cannot assign to non-existent column: $col_ind"))
        end
    end
    _drop_all_nonnote_metadata!(df)
    return dv
end
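
The length check at the top of `insert_single_column!` explains what happened in `In [4]` and `In [6]`. A minimal sketch, assuming DataFrames.jl 1.x is installed (the column names are arbitrary):

```julia
using DataFrames

df = DataFrame()

# No columns yet, so `ncol(df) != 0` is false and any length is accepted.
df.x = [1, 2, 3]
@assert size(df) == (3, 1)

# Now the row count is fixed at 3, so a 2-element vector is rejected.
err = try
    df.y = [1, 2]
    nothing
catch e
    e
end
@assert err isa ArgumentError

# A range is collected into a `Vector` before it is stored.
df.z = 4:6
@assert df.z isa Vector{Int}
```

The failed assignment leaves `df` unchanged, because the check runs before any column is touched.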

Note that in broadcasting assignment a data frame with no columns is treated as having `0` rows, consistent with the value returned by `size`:

In [10]:
df = DataFrame()
df[!, :x] .= 1

Int64[]

In [11]:
df

However, the pseudo-broadcasting provided by DataFrames.jl in `DataFrame`, `insertcols!` and `combine` broadcasts scalars into a single row, as this is usually what the user expects.

In [12]:
df = DataFrame(:a => 1)

1×1 DataFrame
 Row │ a
     │ Int64
─────┼───────
   1 │     1


In [13]:
insertcols!(DataFrame(), :a => 1)

1×1 DataFrame
 Row │ a
     │ Int64
─────┼───────
   1 │     1


In [14]:
combine(DataFrame(), nrow)

1×1 DataFrame
 Row │ nrow
     │ Int64
─────┼───────
   1 │     0


but not in `select` and `transform` as in this case we keep the number of rows in the source:

In [15]:
select(DataFrame(), nrow)

In [16]:
transform(DataFrame(), nrow)
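
The two rules can be summarized as assertions on the resulting sizes. This is a sketch that assumes DataFrames.jl 1.x; the first three assertions restate the outputs shown above, and the last two restate the rule that `select` and `transform` keep the row count of the source:

```julia
using DataFrames

# Pseudo-broadcasting: a scalar produces one row when the source has no columns.
@assert size(DataFrame(:a => 1)) == (1, 1)
@assert size(insertcols!(DataFrame(), :a => 1)) == (1, 1)
@assert size(combine(DataFrame(), nrow)) == (1, 1)

# `select` and `transform` keep the number of rows of the source, here 0.
@assert nrow(select(DataFrame(), nrow)) == 0
@assert nrow(transform(DataFrame(), nrow)) == 0
```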

#### Example 2: broadcasting assignment of getproperty

In [17]:
df = DataFrame(x=1:2)

2×1 DataFrame
 Row │ x
     │ Int64
─────┼───────
   1 │     1
   2 │     2


Since Julia 1.7 this works (`:y` is a new column):

In [18]:
df.y .= 2

2-element Vector{Int64}:
 2
 2

and this works:

In [19]:
df[!, :z] .= 1

2-element Vector{Int64}:
 1
 1

In [20]:
df

2×3 DataFrame
 Row │ x      y      z
     │ Int64  Int64  Int64
─────┼─────────────────────
   1 │     1      2      1
   2 │     2      2      1


Here is the way to check what is going on:

In [21]:
@code_warntype (df -> df.zz .= 1)(df)

MethodInstance for (::var"#1#2")(::DataFrame)
  from (::var"#1#2")(df) in Main at In[21]:1
Arguments
  #self#::Core.Const(var"#1#2"())
  df::DataFrame
Body::Vector{Int64}
1 ─ %1 = Base.dotgetproperty(df, :zz)::Core.PartialStruct(DataFrames.LazyNewColDataFrame{Symbol, DataFrame}, Any[DataFrame, Core.Const(:zz)])
│   %2 = Base.broadcasted(Base.identity, 1)::Core.Const(Base.Broadcast.Broadcasted(identity, (1,)))
│   %3 = Base.materialize!(%1, %2)::Vector{Int64}
└──      return %3



vs

In [22]:
@code_warntype (df -> df[:, :zz] .= 1)(df)

MethodInstance for (::var"#3#4")(::DataFrame)
  from (::var"#3#4")(df) in Main at In[22]:1
Arguments
  #self#::Core.Const(var"#3#4"())
  df::DataFrame
Body::Any
1 ─ %1 = Base.dotview(df, Main.:(:), :zz)::Any
│   %2 = Base.broadcasted(Base.identity, 1)::Core.Const(Base.Broadcast.Broadcasted(identity, (1,)))
│   %3 = Base.materialize!(%1, %2)::Any
└──      return %3



We see that for `df.zz .= 1` Julia performs the following steps:
1. it calls `Base.dotgetproperty(df, :zz)` instead of a plain `getproperty`;
2. it broadcasts into the object returned by that call using `materialize!`.

Since `:zz` does not exist in `df`, the `dotgetproperty` method defined by DataFrames.jl returns a `LazyNewColDataFrame`, which signals that a new column needs to be allocated.

<div class="alert alert-block alert-info">
<b>Tip:</b>

This behavior is available starting from Julia 1.7. In earlier versions of Julia this is an error.
</div>
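
To make the mechanism concrete without relying on DataFrames.jl internals, here is a Base-only sketch of the same pattern. `Table` and `LazyCol` are hypothetical names invented for this illustration (they are not DataFrames.jl types), and the code assumes Julia 1.7 or later, where `Base.dotgetproperty` exists.

```julia
# Hypothetical toy column store; not a DataFrames.jl type.
struct Table
    cols::Dict{Symbol, Vector}
    nrows::Int
end

# Placeholder returned by `dotgetproperty`; the column is created or
# replaced only when the broadcast result is copied into it.
struct LazyCol
    table::Table
    name::Symbol
end

Base.getproperty(t::Table, s::Symbol) = getfield(t, :cols)[s]
Base.dotgetproperty(t::Table, s::Symbol) = LazyCol(t, s)

# `materialize!` asks the destination for its axes; the default broadcast
# style of a non-array type is derived from `ndims` of that type.
Base.axes(l::LazyCol) = (Base.OneTo(getfield(l.table, :nrows)),)
Base.ndims(::Type{LazyCol}) = 1

function Base.copyto!(l::LazyCol, bc::Base.Broadcast.Broadcasted)
    buffer = Vector{Any}(undef, length(axes(l)[1]))
    copyto!(buffer, bc)        # let Base evaluate the broadcast into the buffer
    col = identity.(buffer)    # narrow the element type
    getfield(l.table, :cols)[l.name] = col
    return col
end

t = Table(Dict{Symbol, Vector}(:x => [1, 2, 3]), 3)
t.y .= 2            # :y does not exist, so it is created
t.x .= "a"          # :x exists, so it is replaced, element type included
t.z .= t.y .+ 1     # the right-hand side may itself be a broadcast
```

The design choice to note is that `dotgetproperty` never touches the data: all the work happens in `copyto!`, which is the last step of `materialize!`.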

If the column exists it gets replaced:

In [23]:
df.x .= "a"

2-element Vector{String}:
 "a"
 "a"

In [24]:
df

2×3 DataFrame
 Row │ x       y      z
     │ String  Int64  Int64
─────┼──────────────────────
   1 │ a           2      1
   2 │ a           2      1


In `df[:, :z] .= 1`, by contrast, Julia broadcasts into the result of `Base.dotview(df, :, :z)`.

Let us check what it returns:

In [25]:
Base.dotview(df, :, :z)

2-element view(::Vector{Int64}, :) with eltype Int64:
 1
 1

In [26]:
Base.dotview(df, :, :x)

2-element view(::Vector{String}, :) with eltype String:
 "a"
 "a"

In [27]:
Base.dotview(df, !, :z)

DataFrames.LazyNewColDataFrame{Symbol, DataFrame}(2×3 DataFrame
 Row │ x       y      z
     │ String  Int64  Int64
─────┼──────────────────────
   1 │ a           2      1
   2 │ a           2      1, :z)

In [28]:
Base.dotview(df, !, :x)

DataFrames.LazyNewColDataFrame{Symbol, DataFrame}(2×3 DataFrame
 Row │ x       y      z
     │ String  Int64  Int64
─────┼──────────────────────
   1 │ a           2      1
   2 │ a           2      1, :x)

In [29]:
@less Base.dotview(df, !, :x)

function Base.dotview(df::AbstractDataFrame, ::typeof(!), cols)
    if !(cols isa ColumnIndex)
        return ColReplaceDataFrame(df, convert(Vector{Int}, index(df)[cols]))
    end
    if cols isa SymbolOrString
        if columnindex(df, cols) == 0 && !is_column_insertion_allowed(df)
            throw(ArgumentError("creating new columns in a SubDataFrame that subsets " *
                                "columns of its parent data frame is disallowed"))
        end
    elseif !(1 <= cols <= ncol(df))
        throw(ArgumentError("creating new columns using an integer index is disallowed"))
    end
    return LazyNewColDataFrame(df, cols isa AbstractString ? Symbol(cols) : cols)
end

if isdefined(Base, :dotgetproperty) # Introduced in Julia 1.7
    function Base.dotgetproperty(df::AbstractDataFrame, col::SymbolOrString)
        if columnindex(df, col) == 0 && !is_column_insertion_allowed(df)
            throw(ArgumentError("creating new columns in a SubDataFrame that su

Note that `dotview` is defined only when special treatment is needed:

In [30]:
methods(Base.dotview, DataFrames)

as "normally" the default implementation is just enough:

In [31]:
Base.dotview(df, 1:1, 1:1)

1×1 SubDataFrame
 Row │ x
     │ String
─────┼────────
   1 │ a


In [32]:
typeof(Base.dotview(df, 1:1, 1:1))

SubDataFrame{DataFrame, DataFrames.SubIndex{DataFrames.Index, UnitRange{Int64}, UnitRange{Int64}}, UnitRange{Int64}}

So we can see that:
1. if we use `df[:, :x]` (an existing column), we get just a view into it; a particular consequence is that we cannot change the `eltype` of the column this way (unlike with `df.x .= "a"`, which replaces the column as shown in `In [23]`);
2. if we use `df[!, ...]` (any column) or `df[:, :z]` (a non-existing column), we get a `LazyNewColDataFrame` object.

Importantly, note that in the indexing context `x[y] .= z` the meaning of `x[y]` can be controlled by the package developer by adding methods to `Base.dotview`.
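
One way to confirm which hook Julia calls is to lower both forms of broadcasting assignment and look at the generated calls. This is a small sketch using `Meta.lower`; `x`, `y` and `z` are only symbols here and nothing is evaluated. The `dotgetproperty` call appears only on Julia 1.7 or later.

```julia
# Lower both forms of broadcasting assignment and capture the printed code.
lowered_index = sprint(show, Meta.lower(Main, :(x[y] .= z)))
lowered_property = sprint(show, Meta.lower(Main, :(x.y .= z)))

println(lowered_index)      # the destination comes from Base.dotview
println(lowered_property)   # the destination comes from Base.dotgetproperty
```

Both forms end in a call to `Base.materialize!`, so a package only has to decide what object the first call returns.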

Let us try to understand what `LazyNewColDataFrame` does.

For this we need to dig into how broadcasting assignment works.

In [33]:
df = DataFrame(x = [1, 2])

2×1 DataFrame
 Row │ x
     │ Int64
─────┼───────
   1 │     1
   2 │     2


We want to manually recreate the steps executed by `df[:, :z] .= 1`:

In [34]:
dest = Base.dotview(df, :, :z)

DataFrames.LazyNewColDataFrame{Symbol, DataFrame}(2×1 DataFrame
 Row │ x
     │ Int64
─────┼───────
   1 │     1
   2 │     2, :z)

In [35]:
bc = Base.broadcasted(identity, 1)

Base.Broadcast.Broadcasted(identity, (1,))

In [36]:
@less Base.materialize!(dest, bc)

@inline function materialize!(dest, bc::Broadcasted{Style}) where {Style}
    return materialize!(combine_styles(dest, bc), dest, bc)
end
@inline function materialize!(::BroadcastStyle, dest, bc::Broadcasted{Style}) where {Style}
    return copyto!(dest, instantiate(Broadcasted{Style}(bc.f, bc.args, axes(dest))))
end

## general `copy` methods
@inline copy(bc::Broadcasted{<:AbstractArrayStyle{0}}) = bc[CartesianIndex()]
copy(bc::Broadcasted{<:Union{Nothing,Unknown}}) =
    throw(ArgumentError("broadcasting requires an assigned BroadcastStyle"))

So we see that Base first determines the broadcast style of the output:

In [37]:
Base.Broadcast.combine_styles(dest, bc)

Base.Broadcast.DefaultArrayStyle{1}()

but e.g.

In [38]:
Base.Broadcast.combine_styles(df, bc)

DataFrames.DataFrameStyle()

as we insist that if a data frame takes part in broadcasting the result should be a data frame (more on this later).

In [39]:
@less Base.materialize!(Base.Broadcast.combine_styles(dest, bc), dest, bc)

@inline function materialize!(::BroadcastStyle, dest, bc::Broadcasted{Style}) where {Style}
    return copyto!(dest, instantiate(Broadcasted{Style}(bc.f, bc.args, axes(dest))))
end
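
The same three steps can be replayed with a plain vector as the destination, which keeps the example independent of DataFrames.jl. This is a sketch of what `dest .= 1` does under the hood; the variable names are arbitrary, and note that `f`, `args` and `axes` are passed to `Broadcasted{Style}` as three separate arguments.

```julia
using Base.Broadcast: Broadcasted, DefaultArrayStyle, broadcasted, combine_styles, instantiate

dest = zeros(Int, 3)            # a plain vector stands in for the destination
bc = broadcasted(identity, 1)   # right-hand side of `dest .= 1`

# Step 1: the scalar alone has a 0-dimensional style;
# combined with `dest` the style becomes 1-dimensional.
@assert bc isa Broadcasted{DefaultArrayStyle{0}}
@assert combine_styles(dest, bc) == DefaultArrayStyle{1}()

# Step 2: rebuild the broadcast object with the axes of the destination.
inst = instantiate(Broadcasted{DefaultArrayStyle{0}}(bc.f, bc.args, axes(dest)))
@assert axes(inst) == axes(dest)

# Step 3: copy the lazily described result into the destination.
copyto!(dest, inst)
```

For a custom destination type, step 3 is the method the package has to provide.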

In [40]:
typeof(bc)

Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{0}, Nothing, typeof(identity), Tuple{Int64}}

In [41]:
@less axes(dest)

Base.axes(x::LazyNewColDataFrame) = (Base.OneTo(nrow(x.df)),)
Base.ndims(::Type{<:LazyNewColDataFrame}) = 1

In [42]:
inst = Base.Broadcast.instantiate(Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{0}}((bc.f, bc.args), axes(dest)))

Base.Broadcast.Broadcasted((identity, (1,)), (Base.OneTo(2),))

In [43]:
@less copyto!(dest, inst)

function Base.copyto!(lazydf::LazyNewColDataFrame, bc::Base.Broadcast.Broadcasted{T}) where T
    df = lazydf.df
    if !haskey(index(df), lazydf.col) && df isa SubDataFrame && lazydf.col isa SymbolOrString
        @assert is_column_insertion_allowed(df)
    end
    if bc isa Base.Broadcast.Broadcasted{<:Base.Broadcast.AbstractArrayStyle{0}}
        bc_tmp = Base.Broadcast.Broadcasted{T}(bc.f, bc.args, ())
        v = Base.Broadcast.materialize(bc_tmp)
        col = similar(Vector{typeof(v)}, nrow(df))
        copyto!(col, bc)
    else
        col = Base.Broadcast.materialize(bc)
    end

    return df[!, lazydf.col] = col
end

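Why is a special path for 0-dimensional objects required? For a broadcast whose style is an `AbstractArrayStyle{0}`, Base's `copy` method is defined as `bc[CartesianIndex()]`, which produces a single value rather than a vector with one entry per row, so `copyto!` has to allocate the column itself. (The call in `In [42]` passes `(bc.f, bc.args)` as one tuple, so that tuple ends up in the function slot of `inst`; this is why the error recorded below complains that a tuple is not callable.)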

In [44]:
Base.Broadcast.materialize(inst)

LoadError: MethodError: objects of type Tuple{typeof(identity), Tuple{Int64}} are not callable

#### Example 3: avoiding dispatch ambiguity

In [45]:
df = DataFrame([1 2 3 4], :auto)

1×4 DataFrame
 Row │ x1     x2     x3     x4
     │ Int64  Int64  Int64  Int64
─────┼────────────────────────────
   1 │     1      2      3      4


In [46]:
df[1, Not(1)] = [11, 12, 13]

3-element Vector{Int64}:
 11
 12
 13

In [47]:
df

1×4 DataFrame
 Row │ x1     x2     x3     x4
     │ Int64  Int64  Int64  Int64
─────┼────────────────────────────
   1 │     1     11     12     13


In [48]:
@less df[1, Not(1)] = [11, 12, 13] # note @eval in the source code

    @eval function Base.setindex!(df::DataFrame,
                                  v::Union{Tuple, AbstractArray},
                                  row_ind::Integer,
                                  col_inds::$T)
        idxs = index(df)[col_inds]
        if length(v) != length(idxs)
            throw(DimensionMismatch("$(length(idxs)) columns were selected but the assigned " *
                                    "collection contains $(length(v)) elements"))
        end
        for (i, x) in zip(idxs, v)
            df[!, i][row_ind] = x
        end
        _drop_all_nonnote_metadata!(df)
        return df
    end
end

# df[MultiRowIndex, SingleColumnIndex] = AbstractVector
for T in (:AbstractVector, :Not, :Colon)
    @eval function Base.setindex!(df::DataFrame,
                                  v::AbstractVector,
                                  row_inds::$T,
                                  col_ind::ColumnIndex)
        if row_inds isa Colon && !haskey(inde

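Why is the `@eval` loop needed?
Because DataFrames.jl accepts many kinds of row and column indices, and combining them through a `Union` in a single method signature easily leads to dispatch ambiguities.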

Here is a simple worked example:

In [49]:
f(x::Union{Float64, Int64}, y::Int64) = 1
f(x::Int64, y) = 2

f (generic function with 2 methods)

In [50]:
f(1, 1)

LoadError: MethodError: f(::Int64, ::Int64) is ambiguous. Candidates:
  f(x::Union{Float64, Int64}, y::Int64) in Main at In[49]:1
  f(x::Int64, y) in Main at In[49]:2
Possible fix, define
  f(::Int64, ::Int64)

In [51]:
for T in (Float64, Int)
    @eval g(x::$T, y::Int64) = 1
end
g(x::Int64, y) = 2

g (generic function with 3 methods)

In [52]:
g(1, 1)

1

In more complex scenarios it gets very complicated to ensure that you cover every possible ambiguity (you have to think of the Cartesian product of all options), so it is simpler to unwrap the `Union` and define one method per type.
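
Ambiguities can also be checked programmatically instead of waiting for a `MethodError`. The sketch below repeats the worked example with hypothetical function names (`pick_union`, `pick_unwrapped`) and uses `Base.isambiguous` to compare every pair of methods; `has_ambiguity` is a helper written for this illustration. `Test.detect_ambiguities` performs a similar check for a whole module.

```julia
# A `Union` in the signature: ambiguous with the second method for (Int64, Int64).
pick_union(x::Union{Float64, Int64}, y::Int64) = 1
pick_union(x::Int64, y) = 2

# The same methods with the `Union` unwrapped by a loop over the types.
for T in (Float64, Int64)
    @eval pick_unwrapped(x::$T, y::Int64) = 1
end
pick_unwrapped(x::Int64, y) = 2

# Returns true if any pair of distinct methods of `f` is ambiguous.
function has_ambiguity(f)
    ms = collect(methods(f))
    return any(Base.isambiguous(a, b) for a in ms for b in ms if a !== b)
end

println(has_ambiguity(pick_union))
println(has_ambiguity(pick_unwrapped))
```

Running such a check in a package's test suite catches new ambiguities when another index type is added later.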

Also have a look at this one to see how to define non-standard indices:

In [53]:
df

1×4 DataFrame
 Row │ x1     x2     x3     x4
     │ Int64  Int64  Int64  Int64
─────┼────────────────────────────
   1 │     1     11     12     13


In [54]:
@less df[:, :] = rand(Int, 1, 4) # note how `!` and `Not` are referenced

    @eval function Base.setindex!(df::DataFrame,
                                  mx::AbstractMatrix,
                                  row_inds::$T1,
                                  col_inds::$T2)
        idxs = index(df)[col_inds]
        if size(mx, 2) != length(idxs)
            throw(DimensionMismatch("number of selected columns ($(length(idxs))) " *
                                    "and number of columns in " *
                                    "matrix ($(size(mx, 2))) do not match"))
        end
        for (j, col) in enumerate(idxs)
            # this will drop metadata appropriately
            df[row_inds, col] = (row_inds === !) ? mx[:, j] : view(mx, :, j)
        end
        return df
    end
end
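
The reason `!` can serve as a row index is that it is an ordinary function, so it has a singleton type, `typeof(!)`, that methods can dispatch on. A Base-only sketch with a hypothetical `Store` type (not part of DataFrames.jl) that mimics the "no copy" versus "copy" distinction:

```julia
# Hypothetical toy container; not a DataFrames.jl type.
struct Store
    cols::Vector{Vector{Int}}
end

# `s[!, j]` lowers to `getindex(s, !, j)`, so we dispatch on `typeof(!)`.
Base.getindex(s::Store, ::typeof(!), j::Integer) = s.cols[j]       # no copy
# `s[:, j]` lowers to `getindex(s, Colon(), j)`.
Base.getindex(s::Store, ::Colon, j::Integer) = copy(s.cols[j])     # fresh copy

s = Store([[1, 2], [3, 4]])
same = s[!, 1] === s.cols[1]     # the stored vector itself
fresh = s[:, 1] !== s.cols[1]    # equal contents, different object
```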

#### Example 4: defining broadcasting

Your type should support `CartesianIndex` indexing because the broadcasting machinery can later rely on it (which was not obvious to me initially):

In [55]:
@less df[CartesianIndex(1, 1)] = 1

Base.setindex!(df::AbstractDataFrame, val, idx::CartesianIndex{2}) =
    (df[idx[1], idx[2]] = val)

Base.broadcastable(df::AbstractDataFrame) = df

struct DataFrameStyle <: Base.Broadcast.BroadcastStyle end

Base.Broadcast.BroadcastStyle(::Type{<:AbstractDataFrame}) =
    DataFrameStyle()

Base.Broadcast.BroadcastStyle(::DataFrameStyle, ::Base.Broadcast.BroadcastStyle) =
    DataFrameStyle()
Base.Broadcast.BroadcastStyle(::Base.Broadcast.BroadcastStyle, ::DataFrameStyle) =
    DataFrameStyle()
Base.Broadcast.BroadcastStyle(::DataFrameStyle, ::DataFrameStyle) = DataFrameStyle()
# The method below is added to avoid dispatch ambiguity
Base.Broadcast.BroadcastStyle(::DataFrameStyle, ::Base.Broadcast.Unknown) =
    DataFrameStyle()

function copyto_widen!(res::AbstractVector{T}, bc::Base.Broadcast.Broadcasted,
                       pos, col) where T
    for i in pos:length(axes(bc)[1])
        val = bc[CartesianIndex(i, col)]
        S = typeof(val)
        if S <:

                parentidx = parentcols(index(src), col2)
                parent(src)[!, parentidx] = Base.unaliascopy(parent(src)[!, parentidx])
            else
                if !wascopied
                    src = copy(src, copycols=false)
                end
                src[!, col2] = Base.unaliascopy(scol)
            end
            return src, true
        end
    end
    return src, wascopied
end

function Base.Broadcast.broadcast_unalias(dest::AbstractDataFrame, src::AbstractDataFrame)
    if size(dest, 2) != size(src, 2)
        throw(DimensionMismatch("Dimension mismatch in broadcasting."))
    end
    wascopied = false
    for col2 in axes(dest, 2)
        scol = src[!, col2]
        src, wascopied = _broadcast_unalias_helper(dest, scol, src, col2, wascopied)
    end
    return src
end

function Base.copyto!(df::AbstractDataFrame, bc::Base.Broadcast.Broadcasted)
    bcf = Base.Broadcast.flatten(bc)
    colnames = unique!(Any[_names(x) for x 

Also below you can see how we force broadcasting to make sure the result is a `DataFrame` using `BroadcastStyle`.
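The two ingredients discussed here, `CartesianIndex` indexing and a custom `BroadcastStyle`, can be tried out on a toy type. The sketch below is my own simplification: `ToyFrame` and `ToyStyle` are hypothetical names, and the `copy` method is far simpler than the one in DataFrames.jl (it assumes `Float64` results and does no widening).

```julia
# Hypothetical toy type: columns stored as a vector of vectors.
struct ToyFrame
    cols::Vector{Vector{Float64}}
end

Base.size(t::ToyFrame) = (length(first(t.cols)), length(t.cols))
Base.axes(t::ToyFrame) = map(Base.OneTo, size(t))
Base.getindex(t::ToyFrame, i::Integer, j::Integer) = t.cols[j][i]
# Broadcasting reads elements of its arguments through `CartesianIndex`.
Base.getindex(t::ToyFrame, idx::CartesianIndex{2}) = t[idx[1], idx[2]]

Base.broadcastable(t::ToyFrame) = t

struct ToyStyle <: Base.Broadcast.BroadcastStyle end
Base.Broadcast.BroadcastStyle(::Type{ToyFrame}) = ToyStyle()
# ToyStyle wins against every other style, mirroring the rules shown above.
Base.Broadcast.BroadcastStyle(::ToyStyle, ::Base.Broadcast.BroadcastStyle) = ToyStyle()
Base.Broadcast.BroadcastStyle(::Base.Broadcast.BroadcastStyle, ::ToyStyle) = ToyStyle()
Base.Broadcast.BroadcastStyle(::ToyStyle, ::ToyStyle) = ToyStyle()
Base.Broadcast.BroadcastStyle(::ToyStyle, ::Base.Broadcast.Unknown) = ToyStyle()

# Build the result column by column; assumes every result fits in Float64.
function Base.copy(bc::Base.Broadcast.Broadcasted{ToyStyle})
    rows, cols = axes(bc)
    return ToyFrame([Float64[bc[CartesianIndex(i, j)] for i in rows] for j in cols])
end

t = ToyFrame([[1.0, 2.0], [3.0, 4.0]])
r = t .+ 1   # the custom style makes the result a ToyFrame
```

If the `CartesianIndex` method is removed, `bc[CartesianIndex(i, j)]` fails when it tries to read from `t`, which is the dependency that surprised me.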

Because `DataFrame` column access is not type stable, broadcasting has to process a data frame column by column to work around this problem.
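The idea of column-by-column processing can be illustrated without DataFrames.jl. In the sketch below (with hypothetical names `cols` and `add_one`), the container only knows that it holds `AbstractVector`s, but each call to `add_one` acts as a function barrier and is compiled for the concrete column type.

```julia
# Columns of different element types stored in an abstractly typed container.
cols = AbstractVector[[1, 2, 3], [1.5, 2.5, 3.5]]

# Function barrier: inside `add_one` the concrete type of `col` is known.
add_one(col::AbstractVector) = col .+ 1

# Process one column at a time; each column keeps its own element type.
newcols = AbstractVector[add_one(col) for col in cols]
```

Each result column keeps its own concrete element type even though the outer container is not concretely typed.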

In [56]:
f(df) = df .+ 1

f (generic function with 3 methods)

In [57]:
@code_warntype f(df)

MethodInstance for f(::DataFrame)
  from f(df) in Main at In[56]:1
Arguments
  #self#[36m::Core.Const(f)[39m
  df[36m::DataFrame[39m
Body[36m::DataFrame[39m
[90m1 ─[39m %1 = Base.broadcasted(Main.:+, df, 1)[36m::Core.PartialStruct(Base.Broadcast.Broadcasted{DataFrames.DataFrameStyle, Nothing, typeof(+), Tuple{DataFrame, Int64}}, Any[Core.Const(+), Core.PartialStruct(Tuple{DataFrame, Int64}, Any[DataFrame, Core.Const(1)]), Core.Const(nothing)])[39m
[90m│  [39m %2 = Base.materialize(%1)[36m::DataFrame[39m
[90m└──[39m      return %2



In [58]:
@less Base.materialize(Base.broadcasted(+, df, 1))

@inline materialize(bc::Broadcasted) = copy(instantiate(bc))
materialize(x) = x

@inline function materialize!(dest, x)
    return materialize!(dest, instantiate(Broadcasted(identity, (x,), axes(dest))))
end

@inline function materialize!(dest, bc::Broadcasted{Style}) where {Style}
    return materialize!(combine_styles(dest, bc), dest, bc)
end
@inline function materialize!(::BroadcastStyle, dest, bc::Broadcasted{Style}) where {Style}
    return copyto!(dest, instantiate(Broadcasted{Style}(bc.f, bc.args, axes(dest))))
end

## general `copy` methods
@inline copy(bc::Broadcasted{<:AbstractArrayStyle{0}}) = bc[CartesianIndex()]
copy(bc::Broadcasted{<:Union{Nothing,Unknown}}) =
    throw(ArgumentError("broadcasting requires an assigned BroadcastStyle"))

const NonleafHandlingStyles = Union{DefaultArrayStyle,ArrayConflict}

@inline function copy(bc::Broadcasted{Style}) where {Style}
    ElType = combine_eltypes(bc.f, bc.args)
    if Base.isconcretetype(ElType)
      

So we see that essentially we need to define `copy` for `Broadcasted{DataFrameStyle}`.
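The relation between `materialize`, `instantiate` and `copy` can be checked on plain vectors, where Base already provides all the methods. This is only a small sanity check, not DataFrames.jl code.

```julia
# Lazy representation of `[1, 2, 3] .+ 1`; nothing is computed yet.
bc = Base.broadcasted(+, [1, 2, 3], 1)
is_lazy = bc isa Base.Broadcast.Broadcasted

# `materialize` is `copy` applied to the instantiated broadcast object.
via_materialize = Base.materialize(bc)
via_copy = copy(Base.Broadcast.instantiate(bc))
```

Both paths give the same vector, so a type with its own broadcast style only has to supply the `copy` method for its style.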

In [59]:
less(copy, (Base.Broadcast.Broadcasted{DataFrames.DataFrameStyle},)) # note getcolbc and copyto_widen!

function Base.copy(bc::Base.Broadcast.Broadcasted{DataFrameStyle})
    ndim = length(axes(bc))
    if ndim != 2
        throw(DimensionMismatch("cannot broadcast a data frame into $ndim dimensions"))
    end
    bcf = Base.Broadcast.flatten(bc)
    colnames = unique!(Any[_names(df) for df in bcf.args if df isa AbstractDataFrame])
    if length(colnames) != 1
        wrongnames = setdiff(union(colnames...), intersect(colnames...))
        if isempty(wrongnames)
            throw(ArgumentError("Column names in broadcasted data frames " *
                                "must have the same order"))
        else
            msg = join(wrongnames, ", ", " and ")
            throw(ArgumentError("Column names in broadcasted data frames must match. " *
                                "Non matching column names are $msg"))
        end
    end
    nrows = length(axes(bcf)[1])
    df = DataFrame()
    for i in axes(bcf)[2]
        if nrows == 0
            col = Any[]
     

                      bc::Base.Broadcast.Broadcasted{<:Base.Broadcast.AbstractArrayStyle{0}})
    # special case of fast approach when bc is providing an untransformed scalar
    if bc.f === identity && bc.args isa Tuple{Any} && Base.Broadcast.isflat(bc)
        for col in axes(df, 2)
            fill!(df[!, col], bc.args[1][])
        end
        return df
    else
        return copyto!(df, convert(Base.Broadcast.Broadcasted{Nothing}, bc))
    end
end

create_bc_tmp(bcf′_col::Base.Broadcast.Broadcasted{T}) where {T} =
    Base.Broadcast.Broadcasted{T}(bcf′_col.f, bcf′_col.args, ())

function Base.copyto!(crdf::ColReplaceDataFrame, bc::Base.Broadcast.Broadcasted)
    bcf = Base.Broadcast.flatten(bc)
    colnames = unique!(Any[_names(x) for x in bcf.args if x isa AbstractDataFrame])
    if length(colnames) > 1 ||
        (length(colnames) == 1 && view(_names(crdf.df), crdf.cols) != colnames[1])
        push!(colnames, view(_names(crdf.df), crdf.cols))
        wrong

#### Example 5: unaliasing in broadcasting assignment

What is aliasing?

Assume we have:

In [60]:
x = [1, 2, 3]

3-element Vector{Int64}:
 1
 2
 3

In [61]:
y = @view x[3:-1:1]

3-element view(::Vector{Int64}, 3:-1:1) with eltype Int64:
 3
 2
 1

now we call:

In [62]:
x .= y

3-element Vector{Int64}:
 3
 2
 1

In [63]:
x

3-element Vector{Int64}:
 3
 2
 1

and all is OK.

But assume we had implemented broadcasting naively:

In [64]:
x = [1, 2, 3]
y = @view x[3:-1:1]

3-element view(::Vector{Int64}, 3:-1:1) with eltype Int64:
 3
 2
 1

In [65]:
naive_broadcast!(x, y) = foreach(i -> x[i] = y[i], eachindex(x, y))

naive_broadcast! (generic function with 1 method)

In [66]:
naive_broadcast!(x, y)

In [67]:
x

3-element Vector{Int64}:
 3
 2
 3

The broadcasting mechanism in Base avoids this problem in the `Base.Broadcast.preprocess` function (which should be called before assigning the source to the target). This function internally calls `Base.Broadcast.broadcast_unalias`, which should be implemented for your custom type.
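To see what unaliasing buys us, the naive loop can be repaired with `Base.unalias`, the function that the default `broadcast_unalias` falls back to for arrays. The name `safer_broadcast!` is hypothetical, and this sketch is not how Base implements broadcasting.

```julia
# Hypothetical helper, shown only to illustrate the effect of unaliasing.
function safer_broadcast!(x, y)
    # `Base.unalias` returns `y` itself unless it might share memory with `x`,
    # in which case it returns a copy of `y`.
    y′ = Base.unalias(x, y)
    for i in eachindex(x, y′)
        x[i] = y′[i]
    end
    return x
end

x = [1, 2, 3]
y = @view x[3:-1:1]
aliased = Base.mightalias(x, y)   # the view shares memory with `x`
safer_broadcast!(x, y)            # `x` is now reversed correctly

z = [4, 5, 6]
untouched = Base.unalias(x, z) === z   # unrelated arrays are passed through as-is
```

The copy is made only when `Base.mightalias` reports a possible overlap, so sources that do not alias the destination pay no extra cost.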

In [68]:
methods(Base.Broadcast.broadcast_unalias)

In [69]:
less(Base.Broadcast.broadcast_unalias, (AbstractDataFrame, Any)) # this is the first of several methods

function Base.Broadcast.broadcast_unalias(dest::AbstractDataFrame, src)
    for col in eachcol(dest)
        src = Base.Broadcast.unalias(col, src)
    end
    return src
end

# The method below is added to avoid dispatch ambiguity
Base.Broadcast.broadcast_unalias(::Nothing, src::AbstractDataFrame) = src

function Base.Broadcast.broadcast_unalias(dest, src::AbstractDataFrame)
    wascopied = false
    for (i, col) in enumerate(eachcol(src))
        if Base.mightalias(dest, col)
            if src isa SubDataFrame
                if !wascopied
                    src = SubDataFrame(copy(parent(src), copycols=false),
                                       index(src), rows(src))
                end
                parentidx = parentcols(index(src), i)
                parent(src)[!, parentidx] = Base.unaliascopy(parent(src)[!, parentidx])
            else
                if !wascopied
                    src = copy(src, copycols=false)
                end
         

Note that, unfortunately, this process is expensive, but we want to stay safe:

In [70]:
df = DataFrame(x=[1,2,3])

Row,x
Unnamed: 0_level_1,Int64
1,1
2,2
3,3


In [71]:
y = view(df, 3:-1:1, 1)

3-element view(::Vector{Int64}, 3:-1:1) with eltype Int64:
 3
 2
 1

In [72]:
df .= y
df

Row,x
Unnamed: 0_level_1,Int64
1,3
2,2
3,1


In [73]:
y

3-element view(::Vector{Int64}, 3:-1:1) with eltype Int64:
 1
 2
 3

In [74]:
df .= y
df

Row,x
Unnamed: 0_level_1,Int64
1,1
2,2
3,3


When is unaliasing triggered by DataFrames.jl?

Well, we already know that ultimately `copyto!` is called in a broadcasting assignment:

In [75]:
methods(copyto!, DataFrames)

Let us have a look at how they are implemented:

In [76]:
less(Base.copyto!, (AbstractDataFrame, Base.Broadcast.Broadcasted))

function Base.copyto!(df::AbstractDataFrame, bc::Base.Broadcast.Broadcasted)
    bcf = Base.Broadcast.flatten(bc)
    colnames = unique!(Any[_names(x) for x in bcf.args if x isa AbstractDataFrame])
    if length(colnames) > 1 || (length(colnames) == 1 && _names(df) != colnames[1])
        push!(colnames, _names(df))
        wrongnames = setdiff(union(colnames...), intersect(colnames...))
        if isempty(wrongnames)
            throw(ArgumentError("Column names in broadcasted data frames " *
                                "must have the same order"))
        else
            msg = join(wrongnames, ", ", " and ")
            throw(ArgumentError("Column names in broadcasted data frames must match. " *
                                "Non matching column names are $msg"))
        end
    end

    bcf′ = Base.Broadcast.preprocess(df, bcf)
    for i in axes(df, 2)
        _copyto_helper!(df[!, i], getcolbc(bcf′, i), i)
    end
    _drop_all_nonnote_metadata!(parent(df
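The key step in the method above is the call to `Base.Broadcast.preprocess` before the column-wise copy. Below is a minimal sketch of the same structure on a toy type. `ToyCols` and `ToyColsStyle` are hypothetical names of my own, and the methods are much simpler than what DataFrames.jl does (no column name checks and no metadata handling).

```julia
# Hypothetical toy container, not part of DataFrames.jl.
struct ToyCols
    cols::Vector{Vector{Int}}
end

Base.size(t::ToyCols) = (length(first(t.cols)), length(t.cols))
Base.axes(t::ToyCols) = map(Base.OneTo, size(t))

# A style is needed so that the destination can take part in `.=`.
struct ToyColsStyle <: Base.Broadcast.BroadcastStyle end
Base.Broadcast.BroadcastStyle(::Type{ToyCols}) = ToyColsStyle()
Base.Broadcast.BroadcastStyle(::ToyColsStyle, ::Base.Broadcast.BroadcastStyle) = ToyColsStyle()
Base.Broadcast.BroadcastStyle(::Base.Broadcast.BroadcastStyle, ::ToyColsStyle) = ToyColsStyle()
Base.Broadcast.BroadcastStyle(::ToyColsStyle, ::ToyColsStyle) = ToyColsStyle()
Base.Broadcast.BroadcastStyle(::ToyColsStyle, ::Base.Broadcast.Unknown) = ToyColsStyle()

# Unalias the source against every destination column.
function Base.Broadcast.broadcast_unalias(dest::ToyCols, src)
    for col in dest.cols
        src = Base.unalias(col, src)
    end
    return src
end

function Base.copyto!(t::ToyCols, bc::Base.Broadcast.Broadcasted)
    # `preprocess` calls `broadcast_unalias` on every argument of `bc`.
    bc′ = Base.Broadcast.preprocess(t, bc)
    for j in axes(t)[2]
        col = t.cols[j]
        for i in eachindex(col)
            col[i] = bc′[CartesianIndex(i, j)]
        end
    end
    return t
end

t = ToyCols([[1, 2, 3]])
y = view(t.cols[1], 3:-1:1)   # aliases the only column of `t`
t .= y                        # safe thanks to `preprocess`
```

Without the `preprocess` call, the loop would read from the column it is overwriting and reproduce the incorrect `[3, 2, 3]` result of the naive implementation.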

#### That is all for today!

I hope this part of the tutorial gave you some insight into how indexing and broadcasting are implemented in DataFrames.jl and what you should take into account when designing your own types that are expected to support indexing/broadcasting.