# Mill v2.0 features

In [1]:
using Mill, Flux, FileIO, JLD2, SparseArrays, BenchmarkTools, Setfield

## Bag count

- `AggregationFunction` was renamed to `AggregationOperator` for clarity; these types are not meant to be used directly by the user.
- Reduced number of exported `Segmented*` methods
- `Segmented*` calls now return `Aggregation` type even for aggregations using only one operator.
- All `Aggregation{T}` types now append `log(length(bag) + one(T))` to their output unless the global bag-count flag is turned off with `Mill.bagcount!(false)`
- slightly stricter type checking
- `Aggregation` is now flattened upon construction
- smart `vcat` implemented
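
To make the bag-count row concrete, here is a minimal plain-Julia sketch of mean-and-max aggregation that appends `log(length(bag) + one(T))` per bag. The helper `meanmax_with_count` and its `bagcount` keyword are hypothetical and not part of Mill; the sketch also assumes that an empty bag falls back to zeros, mirroring a zero-initialised `ψ`.

```julia
using Statistics: mean

# Hypothetical helper, NOT Mill's implementation: per-bag mean and max,
# optionally followed by one extra row holding log(length(bag) + 1).
function meanmax_with_count(x::AbstractMatrix{T}, bags; bagcount::Bool=true) where {T<:AbstractFloat}
    d = size(x, 1)
    cols = map(bags) do b
        agg = if isempty(b)
            zeros(T, 2d)                      # assumption: empty bags yield zeros (ψ = 0)
        else
            xb = x[:, b]                      # instances belonging to this bag
            vcat(vec(mean(xb, dims=2)), vec(maximum(xb, dims=2)))
        end
        bagcount ? vcat(agg, log(length(b) + one(T))) : agg
    end
    return reduce(hcat, cols)
end

x = Float32.(reshape(1:9, 3, 3))
y = meanmax_with_count(x, [1:2, 3:3])                       # 7×2, last row is the bag count
y_nocount = meanmax_with_count(x, [1:2, 3:3]; bagcount=false)  # 6×2, no extra row
```

With the same `x` and bags as in the cells that follow, the last row holds `log(3)` and `log(2)`, and an empty bag would contribute `log(1) = 0`.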

In [2]:
a = SegmentedMeanMax(3)

Aggregation{Float32}:
 SegmentedMean(ψ = Float32[0.0, 0.0, 0.0])
 SegmentedMax(ψ = Float32[0.0, 0.0, 0.0])

In [3]:
SegmentedMean(3) |> typeof

Aggregation{Float32,Tuple{SegmentedMean{Float32,Array{Float32,1}}}}

In [4]:
SegmentedMean(zeros(3)) |> typeof

SegmentedMean{Float64,Array{Float64,1}}

In [5]:
x = reshape(1:9, 3, 3) |> f32

3×3 Array{Float32,2}:
 1.0  4.0  7.0
 2.0  5.0  8.0
 3.0  6.0  9.0

In [6]:
a(x, Mill.bags([1:2, 3:3]))

7×2 Array{Float32,2}:
 2.5      7.0
 3.5      8.0
 4.5      9.0
 4.0      7.0
 5.0      8.0
 6.0      9.0
 1.09861  0.693147

In [7]:
a(x[:, 1:2], Mill.bags([1:2, 0:-1]))

7×2 Array{Float32,2}:
 2.5      0.0
 3.5      0.0
 4.5      0.0
 4.0      0.0
 5.0      0.0
 6.0      0.0
 1.09861  0.0

In [8]:
Mill.bagcount()

true

In [9]:
Mill.bagcount!(false)
Mill.bagcount()

false

In [10]:
a(x, Mill.bags([1:2, 3:3]))

6×2 Array{Float32,2}:
 2.5  7.0
 3.5  8.0
 4.5  9.0
 4.0  7.0
 5.0  8.0
 6.0  9.0

In [11]:
a = Aggregation(SegmentedPNormLSE(3), Aggregation(SegmentedMean(3)), SegmentedMax(3))

Aggregation{Float32}:
 SegmentedPNorm(ψ = Float32[-0.980588, -0.676807, 0.563166], ρ = Float32[-0.20514, 0.198478, -0.788083], c = Float32[0.0, 0.0, 0.0])
 SegmentedLSE(ψ = Float32[1.36628, -0.756589, -1.99925], ρ = Float32[0.0, 0.0, 0.0])
 SegmentedMean(ψ = Float32[0.0, 0.0, 0.0])
 SegmentedMax(ψ = Float32[0.0, 0.0, 0.0])

In [12]:
vcat(SegmentedMean(2), SegmentedMeanMax(2))

Aggregation{Float32}:
 SegmentedMean(ψ = Float32[0.0, 0.0])
 SegmentedMean(ψ = Float32[0.0, 0.0])
 SegmentedMax(ψ = Float32[0.0, 0.0])

## Pre (row) imputing

In [13]:
A = PreImputingMatrix(rand(3,3))
A::AbstractMatrix{Float64}

3×3 PreImputingMatrix{Float64,Array{Float64,1},Array{Float64,2}}:
W:
 0.90315   0.0144307  0.62638
 0.718898  0.31501    0.415531
 0.360414  0.877861   0.457933

ψ:
 0.0  0.0  0.0

In [14]:
hcat(A, A)

3×6 PreImputingMatrix{Float64,Array{Float64,1},Array{Float64,2}}:
W:
 0.90315   0.0144307  0.62638   0.90315   0.0144307  0.62638
 0.718898  0.31501    0.415531  0.718898  0.31501    0.415531
 0.360414  0.877861   0.457933  0.360414  0.877861   0.457933

ψ:
 0.0  0.0  0.0  0.0  0.0  0.0

In [15]:
X = rand(3, 2)

3×2 Array{Float64,2}:
 0.156779  0.322361
 0.707233  0.395962
 0.847294  0.898436

In [16]:
A * X

3×2 Array{Float64,2}:
 0.682529  0.859617
 0.687571  0.729805
 1.06536   0.875207

In [17]:
Y = [1.0 missing; missing 2.0; 3.0 4.0]

3×2 Array{Union{Missing, Float64},2}:
 1.0        missing
  missing  2.0
 3.0       4.0

In [18]:
A * Y

3×2 Array{Float64,2}:
 2.78229  2.53438
 1.96549  2.29214
 1.73421  3.58745

In [19]:
Z = [missing, missing, missing]

3-element Array{Missing,1}:
 missing
 missing
 missing

In [20]:
A * Z

3-element Array{Float64,1}:
 0.0
 0.0
 0.0

In [21]:
gradient((x, y) -> x * y |> sum, A, X)

((W = [0.4791405279186667 1.103194820364753 1.745729040761589; 0.4791405279186667 1.103194820364753 1.745729040761589; 0.4791405279186667 1.103194820364753 1.745729040761589], ψ = nothing), [1.9824625821174064 1.9824625821174064; 1.207302140269231 1.207302140269231; 1.4998441742284894 1.4998441742284894])

In [22]:
gradient((x, y) -> x * y |> sum, A, Y)

((W = [1.0 2.0 7.0; 1.0 2.0 7.0; 1.0 2.0 7.0], ψ = [1.9824625821174064; 1.207302140269231; 0.0]), [1.9824625821174064 0.0; 0.0 1.207302140269231; 1.4998441742284894 1.4998441742284894])

In [23]:
gradient((x, y) -> x * y |> sum, A, Z)

((W = [0.0 0.0 0.0; 0.0 0.0 0.0; 0.0 0.0 0.0], ψ = [1.9824625821174064, 1.207302140269231, 1.4998441742284894]), nothing)
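
The outputs above suggest that pre-imputing fills each missing input entry with the `ψ` value of its row before the ordinary matrix product. The following plain-Julia sketch illustrates that reading; `preimpute_mul` is a hypothetical stand-in and not Mill's implementation.

```julia
# Hypothetical stand-in for `PreImputingMatrix * X` (an assumption based on
# the outputs above): a missing entry X[i, j] is replaced by ψ[i], then the
# usual product with W is computed.
function preimpute_mul(W::AbstractMatrix, ψ::AbstractVector, X::AbstractMatrix)
    filled = [ismissing(X[i, j]) ? ψ[i] : X[i, j] for i in axes(X, 1), j in axes(X, 2)]
    return W * filled
end

W = [1.0 2.0; 3.0 4.0]
ψ = [10.0, 20.0]
X = [1.0 missing; missing 2.0]

out = preimpute_mul(W, ψ, X)   # same as W * [1.0 10.0; 20.0 2.0]
```

Under this reading, `ψ` receives a gradient only from rows that contain a missing entry, which is consistent with the zero in the last component of the `ψ` gradient for `Y` above.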

## Maybe hot

In [24]:
oh1 = Flux.onehot(1, 1:3)

3-element Flux.OneHotVector:
 1
 0
 0

In [25]:
mh1 = maybehot(1, 1:3)
mh1::AbstractVector{Bool}

3-element MaybeHotVector{Int64,Int64,Bool}:
 1
 0
 0

In [26]:
Flux.onehot(mh1)

3-element Flux.OneHotVector:
 1
 0
 0

In [27]:
mh2 = Mill.maybehot(missing, 1:3)
mh2::AbstractVector{Missing}

3-element MaybeHotVector{Missing,Int64,Missing}:
 missing
 missing
 missing

In [28]:
ohb1 = Flux.onehotbatch([1, 3], 1:3)

3×2 Flux.OneHotMatrix{Array{Flux.OneHotVector,1}}:
 1  0
 0  0
 0  1

In [29]:
mhb1 = Mill.maybehotbatch([1, 3], 1:3)
mhb1::AbstractMatrix{Bool}

3×2 MaybeHotMatrix{Int64,Array{Int64,1},Int64,Bool}:
 1  0
 0  0
 0  1

In [30]:
Flux.onehotbatch(mhb1)

3×2 Flux.OneHotMatrix{Array{Flux.OneHotVector,1}}:
 1  0
 0  0
 0  1

In [31]:
mhb2 = Mill.maybehotbatch([1, missing, 3], 1:3)
mhb2::AbstractMatrix{Union{Bool, Missing}}

3×3 MaybeHotMatrix{Union{Missing, Int64},Array{Union{Missing, Int64},1},Int64,Union{Missing, Bool}}:
  true  missing  false
 false  missing  false
 false  missing   true

In [32]:
x = rand(3,3)

3×3 Array{Float64,2}:
 0.68779   0.0751477  0.0180377
 0.847652  0.567769   0.336186
 0.736119  0.252398   0.587399

In [33]:
x * oh1

3-element Array{Float64,1}:
 0.6877900825635983
 0.8476519468174486
 0.736119416739053

In [34]:
x * mh1

3-element Array{Float64,1}:
 0.6877900825635983
 0.8476519468174486
 0.736119416739053

In [35]:
x * mh2

3-element Array{Missing,1}:
 missing
 missing
 missing

In [36]:
x * ohb1

3×2 Array{Float64,2}:
 0.68779   0.0180377
 0.847652  0.336186
 0.736119  0.587399

In [37]:
x * mhb1

3×2 Array{Float64,2}:
 0.68779   0.0180377
 0.847652  0.336186
 0.736119  0.587399

In [38]:
x * mhb2

3×3 Array{Union{Missing, Float64},2}:
 0.68779   missing  0.0180377
 0.847652  missing  0.336186
 0.736119  missing  0.587399

In [39]:
gradient((x, y) -> x * y |> sum, x, mh1)

([1.0 0.0 0.0; 1.0 0.0 0.0; 1.0 0.0 0.0], nothing)

In [40]:
gradient((x, y) -> x * y |> sum, x, mh2)

LoadError: Output should be scalar; gradients are not defined for output missing

In [41]:
gradient((x, y) -> x * y |> sum, x, mhb1)

([1.0 0.0 1.0; 1.0 0.0 1.0; 1.0 0.0 1.0], nothing)

In [42]:
gradient((x, y) -> x * y |> sum, x, mhb2)

LoadError: Output should be scalar; gradients are not defined for output missing
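
The multiplications above behave like column selection, with `missing` propagating to a whole output column. A minimal plain-Julia sketch of that behaviour follows; `hot_mul` is a hypothetical helper that takes the hot index directly and is not part of Mill or Flux.

```julia
# Hypothetical helpers (not Mill's API): multiply `x` by a one-hot column
# described only by its hot index, or by `missing` when the index is unknown.
hot_mul(x::AbstractMatrix, i::Integer) = x[:, i]                     # selects column i
hot_mul(x::AbstractMatrix, ::Missing) = fill(missing, size(x, 1))    # all-missing column

# Batch version: one output column per index. The element type is always
# Union{Missing, T} here, which is a simplification of this sketch.
function hot_mul(x::AbstractMatrix{T}, idx::AbstractVector) where {T}
    out = Matrix{Union{Missing,T}}(missing, size(x, 1), length(idx))
    for (j, i) in enumerate(idx)
        ismissing(i) || (out[:, j] = x[:, i])
    end
    return out
end

x = [1.0 2.0 3.0; 4.0 5.0 6.0]
single = hot_mul(x, 1)
batch = hot_mul(x, [1, missing, 3])
```

Because `sum` over such a result is `missing` as soon as one column is missing, a gradient of the sum is undefined, which matches the `LoadError` cells above.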

## NGramMatrix with Missing

In [43]:
NGramIterator([3,2,1] |> collect, 4, 10) |> collect

6-element Array{Int64,1}:
 2223
 2232
 2321
 3213
 2133
 1333

In [44]:
Mill.string_start_code!(0)

0

In [46]:
NGramIterator([3,2,1] |> collect, 4, 10) |> collect

6-element Array{Int64,1}:
    3
   32
  321
 3213
 2133
 1333
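
The two outputs above are consistent with padding the sequence with `n - 1` start codes and `n - 1` end codes and reading every window of length `n` as a number in base `b`. The sketch below reproduces them in plain Julia; `ngram_codes` is a hypothetical helper, and the default codes `2` and `3` are inferred from the first output rather than taken from Mill's source.

```julia
# Hypothetical re-implementation of the n-gram codes shown above (not Mill's code).
# Assumption: defaults start_code = 2 and end_code = 3, inferred from the output.
function ngram_codes(seq::AbstractVector{<:Integer}, n::Integer, b::Integer;
                     start_code::Integer=2, end_code::Integer=3)
    padded = vcat(fill(start_code, n - 1), seq, fill(end_code, n - 1))
    # Each window of n codes is interpreted as the digits of a base-b number.
    return [foldl((acc, c) -> acc * b + c, padded[i:i+n-1]; init=0)
            for i in 1:length(padded)-n+1]
end

default_codes = ngram_codes([3, 2, 1], 4, 10)
zero_start = ngram_codes([3, 2, 1], 4, 10; start_code=0)
```

Setting the start code to zero shortens the leading codes, since leading zero digits do not contribute to the number.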

In [47]:
Y1 = NGramMatrix(["hello", "world"])

2053×2 NGramMatrix{String,Array{String,1},Int64}:
 "hello"
 "world"

In [48]:
Y1S = SparseMatrixCSC(Y1)

2053×2 SparseMatrixCSC{Int64,UInt64} with 14 stored entries:
  [37  , 1]  =  1
  [105 , 1]  =  1
  [215 , 1]  =  1
  [875 , 1]  =  1
  [1113, 1]  =  1
  [1332, 1]  =  1
  [1489, 1]  =  1
  [112 , 2]  =  1
  [120 , 2]  =  1
  [1196, 2]  =  1
  [1268, 2]  =  1
  [1279, 2]  =  1
  [1297, 2]  =  1
  [1834, 2]  =  1

In [49]:
A1 = rand(10, 2053);
A1 * Y1

10×2 Array{Float64,2}:
 3.2715   3.765
 3.22379  3.8776
 3.76015  4.09328
 4.72192  4.28454
 3.15434  4.24327
 2.16216  3.91107
 3.01392  3.63304
 3.12473  5.00734
 3.77835  2.14699
 3.54556  3.38773

In [50]:
gradient((x, y) -> x * y |> sum, A1, Y1)

([0.0 0.0 … 0.0 0.0; 0.0 0.0 … 0.0 0.0; … ; 0.0 0.0 … 0.0 0.0; 0.0 0.0 … 0.0 0.0], nothing)

In [51]:
Y2 = NGramMatrix([missing, missing])
Y2::AbstractMatrix{Missing}

2053×2 NGramMatrix{Missing,Array{Missing,1},Missing}:
 missing
 missing

In [52]:
Y3 = NGramMatrix([[1,2,3], [4,5,6]])
Y3::AbstractMatrix{Int}

2053×2 NGramMatrix{Array{Int64,1},Array{Array{Int64,1},1},Int64}:
 [1, 2, 3]
 [4, 5, 6]

In [53]:
Y4 = NGramMatrix([missing, "a"])
Y4::AbstractMatrix{Union{Missing,Int}}

2053×2 NGramMatrix{Union{Missing, String},Array{Union{Missing, String},1},Union{Missing, Int64}}:
 missing
 "a"

In [54]:
Mill.Sequence

Union{AbstractString, Base.CodeUnits, AbstractArray{var"#s49",1} where var"#s49"<:Integer}

In [55]:
A2 = PostImputingMatrix(A1)

10×2053 PostImputingMatrix{Float64,Array{Float64,1},Array{Float64,2}}:
W:
 0.275559  0.847974  0.680983  0.135222    …  0.43406   0.422469   0.790355
 0.651621  0.313179  0.893847  0.738713       0.401743  0.682378   0.981197
 0.480651  0.237514  0.151058  0.938627       0.975828  0.0581015  0.196094
 0.124064  0.738416  0.774745  0.00208435     0.944144  0.991936   0.812399
 0.175429  0.410782  0.539103  0.816444       0.967007  0.983326   0.499664
 0.377245  0.721285  0.712769  0.831653    …  0.278597  0.0104127  0.198687
 0.908712  0.886308  0.695659  0.545085       0.930218  0.672024   0.00756091
 0.421417  0.347583  0.283536  0.688452       0.692411  0.511136   0.981275
 0.984789  0.651495  0.817573  0.214749       0.780617  0.904818   0.166428
 0.266691  0.412021  0.596556  0.187815       0.503121  0.97108    0.0416748

ψ:
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0

In [56]:
gradient((x, y) -> x * y |> sum, A2, Y1)

((W = [0.0 0.0 … 0.0 0.0; 0.0 0.0 … 0.0 0.0; … ; 0.0 0.0 … 0.0 0.0; 0.0 0.0 … 0.0 0.0], ψ = nothing), nothing)

In [57]:
gradient((x, y) -> x * y |> sum, A2, Y2)

((W = nothing, ψ = [2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0]), nothing)

In [58]:
gradient((x, y) -> x * y |> sum, A2, Y3)

((W = [0.0 1.0 … 0.0 0.0; 0.0 1.0 … 0.0 0.0; … ; 0.0 1.0 … 0.0 0.0; 0.0 1.0 … 0.0 0.0], ψ = nothing), nothing)

In [59]:
gradient((x, y) -> x * y |> sum, A2, Y4)

((W = [0.0 0.0 … 0.0 0.0; 0.0 0.0 … 0.0 0.0; … ; 0.0 0.0 … 0.0 0.0; 0.0 0.0 … 0.0 0.0], ψ = [1.0; 1.0; … ; 1.0; 1.0]), nothing)

## Post (column) imputing

In [60]:
A = PostImputingMatrix(rand(3,3))
A::AbstractMatrix{Float64}

3×3 PostImputingMatrix{Float64,Array{Float64,1},Array{Float64,2}}:
W:
 0.75634   0.361657  0.956439
 0.717659  0.473828  0.322957
 0.319839  0.492202  0.227207

ψ:
 0.0
 0.0
 0.0

In [61]:
X = rand(3)

3-element Array{Float64,1}:
 0.5382592671567383
 0.9285620635705398
 0.13597963338194807

In [62]:
A * X

3-element Array{Float64,1}:
 0.8729847703587639
 0.8701806808777965
 0.6600918978775794

In [63]:
Y = maybehotbatch([1, missing, 3], 1:3)

3×3 MaybeHotMatrix{Union{Missing, Int64},Array{Union{Missing, Int64},1},Int64,Union{Missing, Bool}}:
  true  missing  false
 false  missing  false
 false  missing   true

In [64]:
A * Y

3×3 Array{Float64,2}:
 0.75634   0.0  0.956439
 0.717659  0.0  0.322957
 0.319839  0.0  0.227207

In [65]:
Z = maybehot(1, 1:3)

3-element MaybeHotVector{Int64,Int64,Bool}:
 1
 0
 0

In [66]:
A * Z

3-element Array{Float64,1}:
 0.7563401344429341
 0.7176588882785486
 0.3198386050280777

In [67]:
gradient((x, y) -> x * y |> sum, A, X)

((W = [0.5382592671567383 0.9285620635705398 0.13597963338194807; 0.5382592671567383 0.9285620635705398 0.13597963338194807; 0.5382592671567383 0.9285620635705398 0.13597963338194807], ψ = nothing), [1.7938376277495605, 1.327687573778762, 1.5066028920894707])

In [68]:
gradient((x, y) -> x * y |> sum, A, Y)

((W = [1.0 0.0 1.0; 1.0 0.0 1.0; 1.0 0.0 1.0], ψ = [1.0; 1.0; 1.0]), nothing)

In [69]:
gradient((x, y) -> x * y |> sum, A, Z)

((W = [1.0 0.0 0.0; 1.0 0.0 0.0; 1.0 0.0 0.0], ψ = nothing), nothing)
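
The post-imputing outputs above can be read as follows: a known one-hot column selects the matching column of `W`, and a missing column produces `ψ` instead. This plain-Julia sketch illustrates that interpretation for inputs given as hot indices; `postimpute_mul` is a hypothetical helper, not Mill's implementation.

```julia
# Hypothetical stand-in for `PostImputingMatrix * maybehotbatch(idx, ...)`
# (an assumption based on the outputs above): each output column is either
# the selected column of W or the imputation vector ψ.
function postimpute_mul(W::AbstractMatrix, ψ::AbstractVector, idx::AbstractVector)
    cols = [ismissing(i) ? ψ : W[:, i] for i in idx]
    return reduce(hcat, cols)
end

W = [1.0 2.0 3.0; 4.0 5.0 6.0]
ψ = [-1.0, -2.0]

out = postimpute_mul(W, ψ, [1, missing, 3])   # middle column comes from ψ
```

Under this reading, `ψ` has one entry per row of `W` and gets a gradient only when at least one input column is missing, which agrees with `ψ = nothing` in the gradients for fully observed inputs.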

## Reflect in model and integration

- better IO for all types and trees
- `single_key_identity` keyword argument of `reflectinmodel`
- `single_scalar_identity` keyword argument of `reflectinmodel`

In [70]:
m = preimputing_dense(5, 5)

PreImputingDense(5, 5)

In [71]:
typeof(m)

Dense{typeof(identity),PreImputingMatrix{Float32,Array{Float32,1},Array{Float32,2}},Array{Float32,1}}

In [72]:
m.W

5×5 PreImputingMatrix{Float32,Array{Float32,1},Array{Float32,2}}:
W:
 -0.0205994  -0.272904    0.303568   -0.753332   0.00729682
  0.689609   -0.1689     -0.697259    0.450062   0.215844
 -0.598527    0.482865    0.0522939   0.37539    0.287853
  0.19854    -0.581905    0.397906   -0.198554   0.521882
  0.698969    0.0742277   0.0909097   0.226935  -0.326836

ψ:
 0.0  0.0  0.0  0.0  0.0

In [73]:
m.b

5-element Array{Float32,1}:
 0.0
 0.0
 0.0
 0.0
 0.0

In [74]:
m.σ

identity (generic function with 1 method)

In [75]:
m = postimputing_dense(5, 5)

PostImputingDense(5, 5)

In [76]:
typeof(m)

Dense{typeof(identity),PostImputingMatrix{Float32,Array{Float32,1},Array{Float32,2}},Array{Float32,1}}

In [77]:
m.W

5×5 PostImputingMatrix{Float32,Array{Float32,1},Array{Float32,2}}:
W:
 -0.00401047   0.149707   0.0387411  -0.743909    0.152047
 -0.330687     0.583643  -0.364024    0.711583   -0.578534
 -0.283126    -0.278242   0.0743308  -0.423035   -0.318288
  0.327039    -0.202112  -0.636051   -0.566097   -0.144081
 -0.228226     0.102411   0.11788    -0.0553263   0.63839

ψ:
 0.0
 0.0
 0.0
 0.0
 0.0

In [78]:
m.b

5-element Array{Float32,1}:
 0.0
 0.0
 0.0
 0.0
 0.0

In [79]:
m.σ

identity (generic function with 1 method)

In [80]:
x1 = reshape([i%3 == 0 ? missing : i for i in 1:10], 1, 10) |> collect
aa = BagNode(ArrayNode(x1), bags([1:2, 3:7, 0:-1, 8:10]))
a = ProductNode((; aa))

ba = ArrayNode(NGramMatrix(["a", missing, missing, "b"]))
bb = ArrayNode(NGramMatrix([[1,2], [3,4], [5], [6, 7, 8]]))
b = ProductNode((; ba, bb))

ca = ArrayNode(maybehotbatch([1,missing,9,missing], 1:10))
cb = ArrayNode(maybehotbatch([1,2,3,4], 1:10))
c = ProductNode((; ca, cb))

ds = ProductNode((; a, b, c))
printtree(ds)

ProductNode with 4 obs
  ├── a: ProductNode with 4 obs
  │        └── aa: BagNode with 4 obs
  │                  └── ArrayNode(1×10 Array, Union{Missing, Int64}) with 10 obs
  ├── b: ProductNode with 4 obs
  │        ├── ba: ArrayNode(2053×4 NGramMatrix, Union{Missing, Int64}) with 4 obs
  │        └── bb: ArrayNode(2053×4 NGramMatrix, Int64) with 4 obs
  └── c: ProductNode with 4 obs
           ├── ca: ArrayNode(10×4 MaybeHotMatrix, Union{Missing, Bool}) with 4 obs
           └── cb: ArrayNode(10×4 MaybeHotMatrix, Bool) with 4 obs

In [81]:
m = reflectinmodel(ds)
printtree(m; trav=true)

ProductModel … ↦ ArrayModel(Dense(21, 10)) [""]
  ├── a: ProductModel … ↦ ArrayModel(identity) ["E"]
  │        └── aa: BagModel … ↦ ⟨SegmentedMean(1)⟩ ↦ ArrayModel(identity) ["M"]
  │                  └── ArrayModel(PreImputingDense(1, 1)) ["Q"]
  ├── b: ProductModel … ↦ ArrayModel(Dense(20, 10)) ["U"]
  │        ├── ba: ArrayModel(PostImputingDense(2053, 10)) ["Y"]
  │        └── bb: ArrayModel(Dense(2053, 10)) ["c"]
  └── c: ProductModel … ↦ ArrayModel(Dense(20, 10)) ["k"]
           ├── ca: ArrayModel(PostImputingDense(10, 10)) ["o"]
           └── cb: ArrayModel(Dense(10, 10)) ["s"]

In [82]:
m["E"].m

ArrayModel(identity)

In [83]:
m["Q"].m.W

1×1 PreImputingMatrix{Float32,Array{Float32,1},Array{Float32,2}}:
W:
 1.0

ψ:
 0.0

In [84]:
m = reflectinmodel(ds; single_key_identity=false, single_scalar_identity=false)
printtree(m)

ProductModel … ↦ ArrayModel(Dense(30, 10))
  ├── a: ProductModel … ↦ ArrayModel(Dense(10, 10))
  │        └── aa: BagModel … ↦ ⟨SegmentedMean(10)⟩ ↦ ArrayModel(Dense(10, 10))
  │                  └── ArrayModel(PreImputingDense(1, 10))
  ├── b: ProductModel … ↦ ArrayModel(Dense(20, 10))
  │        ├── ba: ArrayModel(PostImputingDense(2053, 10))
  │        └── bb: ArrayModel(Dense(2053, 10))
  └── c: ProductModel … ↦ ArrayModel(Dense(20, 10))
           ├── ca: ArrayModel(PostImputingDense(10, 10))
           └── cb: ArrayModel(Dense(10, 10))

In [85]:
m(ds)

10×4 ArrayNode{Array{Float32,2},Nothing}:
 -0.3317468    -0.5125708      0.16964775   -1.4313884
  0.263836      0.98139167    -0.6077279     2.1506412
  0.72772545    1.5863311      0.7082397     2.9281526
  0.112154655  -0.0032429546   0.39403498    0.18122023
 -0.25328735   -0.89813316    -0.052873097  -2.1488771
  0.9207657     1.0648692     -0.35964847    1.8690777
  0.1215893     0.11113492    -0.5825505     0.12865126
 -0.25538045   -1.1753664     -0.15263507   -2.0088832
  0.742897      0.7013071      0.040981848   1.5228921
 -0.30997258   -0.39253962    -0.46471187   -0.63144016

In [None]:
g = gradient(m -> sum(m(ds).data), m)

## Lens utilities

- `ModelLens`
- `findnonempty`
- `findin`
- `replacein`

In [None]:
printtree(ds; trav=true)

In [None]:
printtree(m; trav=true)

In [None]:
lens = findnonempty(ds)

In [None]:
[ModelLens(m, l) for l in lens]

In [None]:
n = ArrayNode(rand(1, 10))
ds2 = replacein(ds, ds["Q"], n)
printtree(ds2)

In [None]:
findin(ds, n)

In [None]:
findin(ds2, n)

## Error checks

In [None]:
vcat(PreImputingMatrix(rand(2,2)),
     PreImputingMatrix(rand(2,2))
)

In [None]:
hcat(PostImputingMatrix(rand(2,2)),
     PostImputingMatrix(rand(2,2))
)

In [None]:
PreImputingMatrix(rand(2,2)) * rand(3,3)

In [None]:
PostImputingMatrix(rand(2,2)) * maybehot(1, 1:4)

In [None]:
maybehot(1, 1:4)[5]

In [None]:
NGramMatrix(["a", "b"])[:, 3]

## Other changes

- renamed default params everywhere to `ψ` for consistency
- `terseprint` is gone and will be available from a standalone package
- `!` versions of functions for global flags
- `ChainRulesCore.rrule` instead of `Zygote.@adjoint` where possible
- `Nothing{T}` and `Maybe{T}` union types
- `ImputingMatrix`, `Sequence`
- `IdentityModel` changed to `ArrayModel{typeof(identity)}`
- `3x` more tests than before
- more efficient aggregation operators
- Julia 1.5 or later is required from now on
- `nobs` from `LearnBase` gone and replaced by `StatsBase` version
- `MacroTools` as a dependency is now used from `Flux`
- `NGramIterator` now works with starting and ending characters
- `MillString` prototyped
- reworked and simplified gradient checking tests