# Chapter 6: Comprehension and Production

## Preparations

Load necessary packages.

In [None]:
using JudiLing, DataFrames, Plots

Here, we will also want to show some progress meters, so we additionally install and load:

In [None]:
using Pkg
Pkg.add("ProgressMeter")
using ProgressMeter

Load the Dutch dataset we will be working with.

In [None]:
# Adjust the filepath to the location of your dutch.csv file.
dutch = JudiLing.load_dataset("../dat/dutch.csv");
dutch = dutch[:,[:Ortho, :Word, :Number, :WordCat, :Lexeme, :Syllables, :Frequency]];

## Comprehension and production

First, we regenerate the `cue_obj` using trigrams and load the word embeddings:

In [None]:
cue_obj = JudiLing.make_cue_matrix(dutch,
                                   grams=3,
                                   target_col="Ortho");
S, words = JudiLing.load_S_matrix("../dat/dutch_w2v.csv", header = false, sep = ",");

### Endstate of learning

Then, we compute the mapping matrix from `cue_obj.C` to the `S` matrix using the `JudiLing.make_transform_matrix` function. The order of the arguments determines the direction of the mapping: it is computed from the first matrix to the second.

In [None]:
F = JudiLing.make_transform_matrix(cue_obj.C, S);
JudiLing.display_matrix(dutch, :Ortho, cue_obj, F, :F)
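
To make the idea of an endstate mapping more concrete, here is a minimal sketch on toy data. It assumes that the endstate of learning corresponds to the least-squares solution, computed here with the pseudoinverse from `LinearAlgebra`; JudiLing's internal implementation may solve the problem differently. The matrices `C_toy` and `S_toy` and all other `_toy` names are made up for illustration.

```julia
using LinearAlgebra

# Toy cue matrix: 3 word forms (rows) x 4 cues (columns)
C_toy = [1.0 1.0 0.0 0.0;
         0.0 1.0 1.0 0.0;
         0.0 0.0 1.0 1.0]

# Toy semantic matrix: 3 word forms (rows) x 2 semantic dimensions (columns)
S_toy = [ 0.5 -1.0;
          1.0  0.2;
         -0.3  0.8]

# Least-squares mapping from cues to meanings (comprehension)
F_toy = pinv(C_toy) * S_toy
Shat_toy = C_toy * F_toy

# Swapping the two matrices gives the mapping from meanings to cues (production)
G_toy = pinv(S_toy) * C_toy
Chat_toy = S_toy * G_toy
```

Because the three toy cue vectors are linearly independent, `Shat_toy` reproduces `S_toy` up to rounding error. `Chat_toy` is only an approximation of `C_toy`, since two semantic dimensions are not enough to reconstruct three independent cue vectors.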

Now we compute the predicted semantic vectors:

In [None]:
Shat = cue_obj.C * F
JudiLing.display_matrix(dutch, :Ortho, cue_obj, Shat, :S)

...and compare them (visually) to the correct semantic vectors:

In [None]:
JudiLing.display_matrix(dutch, :Ortho, cue_obj, S, :S)

We can compute the production matrix in the same manner (by reversing the order of the arguments):

In [None]:
G = JudiLing.make_transform_matrix(S, cue_obj.C);
JudiLing.display_matrix(dutch, :Ortho, cue_obj, G, :G)

and predict the cue vectors:

In [None]:
Chat = S * G
JudiLing.display_matrix(dutch, :Ortho, cue_obj, Chat, :C)

and compare them with the target cue vectors:

In [None]:
JudiLing.display_matrix(dutch, :Ortho, cue_obj, cue_obj.C, :C)

We can compute the accuracy of our mappings more formally using the `eval_SC` function (details in the next notebooks).

Comprehension accuracy:

In [None]:
comp_acc_el = JudiLing.eval_SC(Shat, S)

(We will return to the cause of this warning message in the next notebooks. For now, it can be safely ignored.)

Production accuracy:

In [None]:
JudiLing.eval_SC(Chat, cue_obj.C)
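
As a rough intuition for what such an accuracy measure does, the following sketch counts a predicted vector as correct if it correlates most strongly with its own target vector. This is a simplified reimplementation under that assumption, not JudiLing's code; `toy_accuracy`, `gold`, and `pred` are hypothetical names.

```julia
using Statistics

# Hypothetical helper (not part of JudiLing): proportion of rows in `pred`
# whose highest correlation is with the same-numbered row in `gold`
function toy_accuracy(pred::AbstractMatrix, gold::AbstractMatrix)
    # cor works column-wise, so swap rows and columns to correlate rows with rows
    R = cor(permutedims(pred), permutedims(gold))
    hits = [argmax(R[i, :]) == i for i in 1:size(R, 1)]
    return mean(hits)
end

gold = [1.0 0.0 0.0 2.0;
        0.0 1.0 2.0 0.0;
        1.0 1.0 0.0 0.0]

# Rows 1 and 2 are noisy copies of their targets; row 3 resembles target 2 instead
pred = [0.9 0.1 0.0 1.8;
        0.1 0.8 2.1 0.0;
        0.0 0.9 1.5 0.2]

toy_accuracy(pred, gold)
```

In this toy example two of the three predictions are closest to their own target, so the accuracy is 2/3.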

### Incremental learning

Instead of using the frequency-agnostic endstate-of-learning, we can also use incremental learning to calculate the mapping matrices. 

First, we assume that each word is learned exactly once by setting `n_epochs=1`, and we use a learning rate of `eta=0.1`. This learning rate is convenient for demonstration purposes, but it is likely too large for most applications:

In [None]:
F = JudiLing.wh_learn(cue_obj.C, S, n_epochs=1, eta=0.1);
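
The sketch below illustrates the kind of update that incremental learning performs, assuming the Widrow-Hoff rule: after each learning event, the mapping is nudged towards the target in proportion to the prediction error and the learning rate. `toy_wh_learn` and the toy matrices are hypothetical and only meant to show the principle; use `JudiLing.wh_learn` for actual modelling.

```julia
# Hypothetical helper (not part of JudiLing): Widrow-Hoff learning on toy data
function toy_wh_learn(C, S; eta = 0.1, n_epochs = 1)
    F = zeros(size(C, 2), size(S, 2))   # start without any knowledge
    for epoch in 1:n_epochs, i in 1:size(C, 1)
        c = C[i:i, :]                   # 1 x n_cues cue vector of word i
        s = S[i:i, :]                   # 1 x n_dims target vector of word i
        err = s .- c * F                # prediction error for this learning event
        F .+= eta .* (c' * err)         # small step towards the target
    end
    return F
end

C_toy = [1.0 1.0 0.0;
         0.0 1.0 1.0]
S_toy = [1.0 0.0;
         0.0 1.0]

F_1   = toy_wh_learn(C_toy, S_toy; n_epochs = 1)
F_100 = toy_wh_learn(C_toy, S_toy; n_epochs = 100)

# Mean squared prediction error after 1 and after 100 passes through the data
mse_1   = sum(abs2, C_toy * F_1   .- S_toy) / length(S_toy)
mse_100 = sum(abs2, C_toy * F_100 .- S_toy) / length(S_toy)
```

With more passes through the data, the prediction error shrinks, which is the pattern the learning trajectory later in this section makes visible for the Dutch data.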

Compute the predicted semantic matrix:

In [None]:
Shat = cue_obj.C * F

Again, evaluate using `eval_SC`:

In [None]:
JudiLing.eval_SC(Shat, S)

Next, we pass through the data another 10 times by setting `n_epochs=10`. By setting `weights=F`, we ensure that training continues from our previous `F` matrix (instead of starting from a new one).

In [None]:
F = JudiLing.wh_learn(cue_obj.C, S, n_epochs=10, weights=F, eta=0.1, verbose=true);
Shat = cue_obj.C * F
acc0 = JudiLing.eval_SC(Shat, S)

The previous piece of code passed through the data 10 times but did not retain any information on accuracy during the training. The following code does the same, but computes accuracy after each pass through the data:

In [None]:
# create a list where we will save all accuracies
accuracies = [0.]

# initialise F with zeros
F = zeros(Float64, size(cue_obj.C, 2), size(S, 2))

# now go through the data 10 times
# this takes a few minutes
@showprogress for i=1:10

    # learn all word forms in the dataframe once
    F = JudiLing.wh_learn(cue_obj.C, S, n_epochs=1, weights=F, eta=0.1)
    
    # compute the new Shat matrix
    Shat = cue_obj.C * F
    
    # evaluate and save
    acc = JudiLing.eval_SC(Shat, S)
    push!(accuracies, acc)
end

# plot the learning trajectory
plot(collect(0:10), accuracies, seriestype = :scatter, 
     label="Incremental learning", xlab="Epoch", ylab="Accuracy")
     
# plot the endstate of learning accuracy for comparison
plot!([11], [comp_acc_el], seriestype=:scatter, 
      label="Endstate of learning", legend=:bottomright)

The resulting figure can be saved in the following way:

In [None]:
savefig("../fig/wh_learning.pdf")

Next, we want to learn our mapping matrix F incrementally based on real frequencies. However, using the provided CELEX frequencies would result in a very large number of training events:

In [None]:
sum(dutch.Frequency)

We can mitigate this problem by dividing the frequencies by a constant (100 in this case) and rounding up to the nearest integer:

In [None]:
dutch[!, "Frequency_scaled100"] = dutch.Frequency./100;
dutch[!, "Frequency_scaled100"] = Int.(ceil.(dutch.Frequency_scaled100));
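
A small example with made-up frequencies shows the effect of this scaling. Rounding up ensures that every word with a non-zero frequency keeps at least one learning event; a word with a frequency of 0 would still receive none. The vector `freqs` is hypothetical.

```julia
freqs = [1, 99, 100, 101, 25000]

# Divide by 100 and round up to the nearest integer
scaled = Int.(ceil.(freqs ./ 100))   # [1, 1, 1, 2, 250]

# Integer ceiling division gives the same result without going through floats
scaled_cld = cld.(freqs, 100)        # [1, 1, 1, 2, 250]

# Number of learning events before and after scaling
sum(freqs), sum(scaled)              # (25301, 255)
```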

Now we need to generate the order in which words are supposed to be learned based on their (scaled) frequency. This is done with the `make_learn_seq` function, which takes as arguments the list of frequencies and a random seed (to control the randomness in the sequence generation):

In [None]:
learn_seq = JudiLing.make_learn_seq(dutch.Frequency_scaled100;
                                           random_seed = 314);
learn_seq[1:5]
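
To see what such a learning sequence might look like, here is a minimal sketch. It assumes that each row index is repeated as often as its frequency indicates and that the result is then shuffled with a seeded random number generator; the actual `make_learn_seq` implementation may differ in its details. `toy_learn_seq` and `toy_freqs` are hypothetical names.

```julia
using Random

# Hypothetical helper (not part of JudiLing): repeat each row index as often as
# its frequency says, then shuffle the result reproducibly
function toy_learn_seq(freqs::AbstractVector{<:Integer}; random_seed = 314)
    seq = reduce(vcat, [fill(i, n) for (i, n) in enumerate(freqs)])
    return shuffle(MersenneTwister(random_seed), seq)
end

# Word 1 occurs three times, word 2 once, word 3 twice
toy_freqs = [3, 1, 2]
toy_seq = toy_learn_seq(toy_freqs)
```

The resulting sequence has as many entries as there are learning events (here 6), and the same seed always yields the same order.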

When we pass `learn_seq` to the `wh_learn` function and specify `n_epochs`, the word forms in our Dutch dataset are learned in the order specified by `learn_seq`:

In [None]:
F = JudiLing.wh_learn(cue_obj.C, S, n_epochs=1, 
                             learn_seq=learn_seq, verbose=true, 
                             eta=0.1);

Subsequently, we can again compute the predicted semantic matrix and evaluate the accuracy:

In [None]:
Shat = cue_obj.C * F
JudiLing.eval_SC(Shat, S)

### Frequency-informed endstate-of-learning

Finally, we can compute the frequency-informed endstate of learning by simply passing the frequencies as an additional argument:

In [None]:
F = JudiLing.make_transform_matrix(cue_obj.C, S, dutch.Frequency);

Then we can compute the predicted semantic matrix and its accuracy as usual:

In [None]:
Shat = cue_obj.C * F;
JudiLing.eval_SC(Shat, S)
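
One way to think about the frequency-informed endstate is as a weighted least-squares problem in which prediction errors for frequent words count more than errors for rare words. The sketch below implements this idea on toy data by scaling each row with the square root of its relative frequency. This is an assumption about the underlying principle, not a copy of JudiLing's implementation; `toy_fil` and the toy data are hypothetical.

```julia
using LinearAlgebra

# Hypothetical helper (not part of JudiLing): frequency-weighted least squares
function toy_fil(C, S, freqs)
    p = freqs ./ sum(freqs)          # relative frequencies
    w = sqrt.(p)                     # one weight per word form (row)
    return pinv(w .* C) * (w .* S)   # scale the rows, then solve as before
end

# 3 word forms but only 2 cues, so not all words can be predicted perfectly
C_toy = [1.0 0.0;
         1.0 1.0;
         0.0 1.0]
S_toy = reshape([1.0, 0.0, 1.0], 3, 1)

# The second word is far more frequent than the other two
toy_freqs = [1, 100, 1]

F_plain = pinv(C_toy) * S_toy
F_fil   = toy_fil(C_toy, S_toy, toy_freqs)

# Absolute prediction errors per word form
err_plain = abs.(C_toy * F_plain .- S_toy)
err_fil   = abs.(C_toy * F_fil   .- S_toy)
```

In this toy example, the frequent second word is predicted more accurately with the frequency-weighted mapping, at the cost of larger errors for the two rare words.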

## Exercises

### Exercise 1

Preparation: Load the Latin dataset, and create the cue object and the S matrix:

In [None]:
latin = JudiLing.load_dataset("../dat/latin.csv")

cue_obj_la = JudiLing.make_cue_matrix(latin, grams=3, target_col=:Word)
S_la = JudiLing.make_S_matrix(
    latin,
    ["Lexeme"],
    ["Person", "Number", "Tense", "Voice", "Mood"],
    ncol=300);

### Exercise 2

Compute the `F` and `G` matrices and inspect them with `display_matrix`:

In [None]:
F_la = JudiLing.make_transform_matrix(cue_obj_la.C, S_la)
JudiLing.display_matrix(latin, :Word, cue_obj_la, F_la, :F)

In [None]:
G_la = JudiLing.make_transform_matrix(S_la, cue_obj_la.C)
JudiLing.display_matrix(latin, :Word, cue_obj_la, G_la, :G)

### Exercise 3

Predict `Shat` and `Chat` and compare them with the targets, both visually and with `eval_SC`:

In [None]:
Shat_la = cue_obj_la.C * F_la
JudiLing.display_matrix(latin, :Word, cue_obj_la, Shat_la, :Shat)

In [None]:
JudiLing.display_matrix(latin, :Word, cue_obj_la, S_la, :S)

In [None]:
JudiLing.eval_SC(Shat_la, S_la)

In [None]:
Chat_la = S_la * G_la
JudiLing.display_matrix(latin, :Word, cue_obj_la, Chat_la, :Chat)

In [None]:
JudiLing.display_matrix(latin, :Word, cue_obj_la, cue_obj_la.C, :C)

In [None]:
JudiLing.eval_SC(Chat_la, cue_obj_la.C)

### Exercise 4

Modify the code above to cycle through the Latin dataset 10 times and record the accuracy after each iteration.

In [None]:
# create a list where we will save all accuracies
accuracies = [0.]

# initialise F with zeros
F_la_incr = zeros(Float64, size(cue_obj_la.C, 2), size(S_la, 2))

# now go through the data 10 times
# this takes a few minutes
@showprogress for i=1:10

    # learn all word forms in the dataframe once
    F_la_incr = JudiLing.wh_learn(cue_obj_la.C, S_la, n_epochs=1, weights=F_la_incr, eta=0.1)
    
    # compute the new Shat matrix
    Shat_la = cue_obj_la.C * F_la_incr
    
    # evaluate and save
    acc = JudiLing.eval_SC(Shat_la, S_la)
    push!(accuracies, acc)
end

# plot the learning trajectory
plot(collect(0:10), accuracies, seriestype = :scatter, 
     label="Incremental learning", xlab="Epoch", ylab="Accuracy")
     
# plot the endstate of learning accuracy for comparison
plot!([11], [0.9911], seriestype=:scatter, 
      label="Endstate of learning", legend=:bottomright)

### Exercise 5

Train incrementally according to simulated frequencies.

In [None]:
learn_seq_la = JudiLing.make_learn_seq(latin.sim_freq;
                                           random_seed = 314);
learn_seq_la[1:5]

In [None]:
length(learn_seq_la) 

In [None]:
F_incr = JudiLing.wh_learn(cue_obj_la.C, S_la, n_epochs=1, 
                             learn_seq=learn_seq_la, verbose=true, 
                             eta=0.1);

In [None]:
Shat_incr = cue_obj_la.C * F_incr
JudiLing.eval_SC(Shat_incr, S_la)

### Exercise 6

Compute `F` with the frequency-informed endstate of learning (FIL):

In [None]:
F_fil = JudiLing.make_transform_matrix(cue_obj_la.C, S_la, latin.sim_freq)

Shat_fil = cue_obj_la.C * F_fil
JudiLing.eval_SC(Shat_fil, S_la)