## Julia on Colaboratory ##

[Colaboratory](https://colab.research.google.com) does not provide native support for the [Julia programming language](https://julialang.org). However, since Colaboratory gives you root access to the machine that runs your notebook (the *“runtime”* in Colaboratory terminology), we can install Julia support by uploading a specially crafted Julia notebook  – *this* notebook. We then install Julia and [IJulia](https://github.com/JuliaLang/IJulia.jl) ([Jupyter](https://jupyter.org)/Colaboratory notebook support) and reload the notebook so that Colaboratory detects and initiates what we installed.

In brief:

1. **Run the cell below**
2. **Reload the page**
3. **Edit the notebook name and start hacking Julia code below**

**If your runtime resets**, either manually or if left idle for some time, **repeat steps 1 and 2**.

### Acknowledgements ###

This hack by Pontus Stenetorp is an adaptation of [James Bradbury’s original Colaboratory Julia hack](https://discourse.julialang.org/t/julia-on-google-colab-free-gpu-accelerated-shareable-notebooks/15319/27), that broke some time in September 2019 as Colaboratory increased their level of notebook runtime isolation. There also appears to be CUDA compilation support installed by default for each notebook runtime type in October 2019, which shaves off a good 15 minutes or so from the original hack’s installation time.

In [0]:
# Installation cell
%%shell
if ! command -v julia 2>&1 > /dev/null
then
    wget 'https://julialang-s3.julialang.org/bin/linux/x64/1.2/julia-1.2.0-linux-x86_64.tar.gz' \
        -O /tmp/julia.tar.gz
    tar -x -f /tmp/julia.tar.gz -C /usr/local --strip-components 1
    rm /tmp/julia.tar.gz
fi
julia -e 'using Pkg; pkg"add Plots; add PyPlot; add IJulia; add Knet; precompile"'
julia -e 'using Pkg; pkg"build Knet;"'

--2020-01-05 13:02:01--  https://julialang-s3.julialang.org/bin/linux/x64/1.2/julia-1.2.0-linux-x86_64.tar.gz
Resolving julialang-s3.julialang.org (julialang-s3.julialang.org)... 151.101.2.49, 151.101.66.49, 151.101.130.49, ...
Connecting to julialang-s3.julialang.org (julialang-s3.julialang.org)|151.101.2.49|:443... connected.
HTTP request sent, awaiting response... 302 gce internal redirect trigger
Location: https://storage.googleapis.com/julialang2/bin/linux/x64/1.2/julia-1.2.0-linux-x86_64.tar.gz [following]
--2020-01-05 13:02:01--  https://storage.googleapis.com/julialang2/bin/linux/x64/1.2/julia-1.2.0-linux-x86_64.tar.gz
Resolving storage.googleapis.com (storage.googleapis.com)... 108.177.97.128, 2404:6800:4008:c00::80
Connecting to storage.googleapis.com (storage.googleapis.com)|108.177.97.128|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 91990555 (88M) [application/x-tar]
Saving to: ‘/tmp/julia.tar.gz’


2020-01-05 13:02:02 (257 MB/s) - ‘/tmp/julia.t



In [0]:
using Knet
# Test if Knet is using gpu
Knet.gpu()

┌ Info: Recompiling stale cache file /root/.julia/compiled/v1.2/Knet/f4vSz.ji for Knet [1902f260-5fb4-5aff-8c31-6271790ab950]
└ @ Base loading.jl:1240


0

In [0]:
a = KnetArray(randn(4,4))
sigm.(a)

4×4 KnetArray{Float64,2}:
 0.54592   0.554279  0.343429   0.838957
 0.806902  0.768127  0.285473   0.488519
 0.568271  0.857464  0.0606523  0.424611
 0.53042   0.396642  0.694845   0.125767

In [1]:
#imports
import Pkg
using Pkg; for p in ("Knet","IterTools","WordTokenizers","Test","Random","Statistics","Dates","LinearAlgebra","CuArrays"); haskey(Pkg.installed(),p) || Pkg.add(p); end
using Statistics, IterTools, WordTokenizers, Test, Knet, Random, Dates, Base.Iterators, LinearAlgebra
# Update and list all packages
Pkg.update()
pkgs = Pkg.installed()

for package in keys(pkgs)
    if pkgs[package] == nothing
        pkgs[package] = VersionNumber("0.0.1")
    end
    println("Package name: ", package, " Version: ", pkgs[package])
end
using CuArrays: CuArrays, usage_limit
CuArrays.usage_limit[] = 8_000_000_000
BATCH_SIZE = 64

Knet.atype() = KnetArray{Float32} 
is_lstm_strategy_on = true # if true rnn type becomes lstm, otherwise we preferred to use relu
gpu() # GPU test must result as 0
# Vocabulary Structure
struct Vocab
    w2i::Dict{String,Int}
    i2w::Vector{String}
    unk::Int
    eos::Int
    tokenizer
end

function Vocab(file::String; tokenizer=split, vocabsize=Inf, mincount=1, unk="<unk>", eos="<s>")
    vocab_freq = Dict{String,Int64}(unk => 1, eos => 1)
    w2i = Dict{String, Int64}(unk => 2, eos => 1)
    i2w = Vector{String}()

    push!(i2w, eos)
    push!(i2w, unk)

    open(file) do f
        for line in eachline(f)
            sentence = strip(lowercase(line))
            sentence = tokenizer(line, [' '], keepempty = false)

            for word in sentence
                word == unk && continue
                word == eos && continue # They are default ones to be added later
                vocab_freq[word] = get!(vocab_freq, word, 0) + 1
            end
        end
        close(f)
    end


    # End of vanilla implementation of the vocaulary
    # From here we must add the mincount and vocabsize properties
    # We must change the first two property of the vocab wrt those paramaters
    vocab_freq = sort!(
        collect(vocab_freq),
        by = tuple -> last(tuple),
        rev = true,
    )

    if length(vocab_freq)>vocabsize - 2 # eos and unk ones
        vocab_freq = vocab_freq[1:vocabsize-2] # trim to fit the size
    end

    #vocab_freq = reverse(vocab_freq)

    while true
        length(vocab_freq)==0 && break
        word,freq = vocab_freq[end]
        freq>=mincount && break # since it is already ordered
        vocab_freq = vocab_freq[1:(end - 1)]
    end
    #pushfirst!(vocab_freq,unk=>1,eos=>1) # freq does not matter, just adding the
    for i in 1:length(vocab_freq)
        word, freq = vocab_freq[i]
        ind = (get!(w2i, word, 1+length(w2i)))
        (length(i2w) < ind) && push!(i2w, word)
    end

    return Vocab(w2i, i2w, 2, 1, tokenizer)
end
# Special reader for the task
struct TextReader
    file::String
    vocab::Vocab
end

word2ind(dict,x) = get(dict, x, 2)

#Implementing the iterate function
function Base.iterate(r::TextReader, s=nothing)
    if s == nothing
        state = open(r.file)
        Base.iterate(r,state)
    else
        if eof(s) == true
            close(s)
            return nothing
        else
            line = readline(s)
            line = strip(lowercase(line))
            sent = r.vocab.tokenizer(line, [' '], keepempty = false)
            sent_ind = Int[]
            for word in sent
                ind = word2ind(r.vocab.w2i,word)
                push!(sent_ind,ind)
            end
            push!(sent_ind,r.vocab.eos)
            return (sent_ind, s)
        end
    end
end


Base.IteratorSize(::Type{TextReader}) = Base.SizeUnknown()
Base.IteratorEltype(::Type{TextReader}) = Base.HasEltype()
Base.eltype(::Type{TextReader}) = Vector{Int}
# File 
const datadir = "nn4nlp-code/data/ptb"
isdir(datadir) || run(`git clone https://github.com/neubig/nn4nlp-code.git`)

if !isdefined(Main, :vocab)
    vocab = Vocab("$datadir/train.txt", mincount=1)

    train = TextReader("$datadir/train.txt", vocab)
    test = TextReader("$datadir/valid.txt", vocab)

end
#Embed
struct Embed; w; end

function Embed(vocabsize::Int, embedsize::Int)
    Embed(param(embedsize,vocabsize))
end

function (l::Embed)(x)
    l.w[:,x]
end

#Linear
struct Linear; w; b; end

function Linear(inputsize::Int, outputsize::Int)
    Linear(param(outputsize,inputsize), param0(outputsize))
end

function (l::Linear)(x)
    l.w * mat(x,dims=1) .+ l.b
end
# Mask!
function mask!(a,pad)
    matr = a
    for j in 1:size(matr)[1]
        i=0
        while i<(length(matr[j,:])-1)
            matr[j,length(matr[j,:])-i-1]!=pad && break

            if matr[j,length(matr[j,:])-i]== pad
                matr[j,length(matr[j,:])-i]= 0
            end
            i+=1
        end
    end
    matr
end
# Minibatching
struct LMData
    src::TextReader
    batchsize::Int
    maxlength::Int
    bucketwidth::Int
    buckets
end

function LMData(src::TextReader; batchsize = 64, maxlength = typemax(Int), bucketwidth = 10)
    numbuckets = min(128, maxlength ÷ bucketwidth)
    buckets = [ [] for i in 1:numbuckets ]
    LMData(src, batchsize, maxlength, bucketwidth, buckets)
end

Base.IteratorSize(::Type{LMData}) = Base.SizeUnknown()
Base.IteratorEltype(::Type{LMData}) = Base.HasEltype()
Base.eltype(::Type{LMData}) = Matrix{Int}

function Base.iterate(d::LMData, state=nothing)
    if state == nothing
        for b in d.buckets; empty!(b); end
    end
    bucket,ibucket = nothing,nothing
    while true
        iter = (state === nothing ? iterate(d.src) : iterate(d.src, state))
        if iter === nothing
            ibucket = findfirst(x -> !isempty(x), d.buckets)
            bucket = (ibucket === nothing ? nothing : d.buckets[ibucket])
            break
        else
            sent, state = iter
            if length(sent) > d.maxlength || length(sent) == 0; continue; end
            ibucket = min(1 + (length(sent)-1) ÷ d.bucketwidth, length(d.buckets))
            bucket = d.buckets[ibucket]
            push!(bucket, sent)
            if length(bucket) === d.batchsize; break; end
        end
    end
    if bucket === nothing; return nothing; end
    batchsize = length(bucket)
    maxlen = maximum(length.(bucket))
    batch = fill(d.src.vocab.eos, batchsize, maxlen + 1)
    for i in 1:batchsize
        batch[i, 1:length(bucket[i])] = bucket[i]
    end
    empty!(bucket)
    return batch, state
end
struct RNN_model
    embed::Embed        # language embedding
    rnn::RNN            # RNN (can be bidirectional)
    projection::Linear  # converts output to vocab scores
    dropout::Real       # dropout probability to prevent overfitting
    vocab::Vocab        # language vocabulary  
end

function RNN_model(hidden::Int,      # hidden size for both the encoder and decoder RNN
                embsz::Int,          # embedding size
                vocab::Vocab;     # vocabulary for source language
                layers=1,            # number of layers
                bidirectional=false, # whether encoder RNN is bidirectional
                dropout=0)           # dropout probability

    embed = Embed(length(vocab.i2w),embsz)

    rnn = RNN(embsz,hidden;rnnType=is_lstm_strategy_on ? :lstm : :relu, numLayers=layers,bidirectional=bidirectional ,dropout= dropout)
    
    layerMultiplier = bidirectional ? 2 : 1
    
    projection = Linear(layerMultiplier*hidden,length(vocab.i2w))

    RNN_model(embed,rnn,projection,dropout,vocab)

end
function calc_scores(rm::RNN_model, data; average=true)
    B, Tx = size(data)
    
    project = rm.projection
    emb = rm.embed(data)
    
#     rm.rnn.h = 0
#     rm.rnn.c = 0

    y = rm.rnn(emb)

    return project(reshape(y,:,B*Tx))
    

end
function loss_f(model, batch; average = true)  
    verify = deepcopy(batch[:,2:end])
    mask!(verify,vocab.eos)
        
    scores = calc_scores(model,batch[:,1:end-1]) # trim one end
   
    return nll(scores,verify;average=average)

end
function maploss(lossfn, model, data; average = true)
    total_words = 0
    total_loss = 0
    for part in collect(data)
        curr_loss, curr_word = lossfn(model,part, average = false)
        total_loss += curr_loss
        total_words += curr_word
    end

    average && return total_loss/total_words
    return total_loss, total_words
end
model = RNN_model(512, 512, vocab; bidirectional=true, dropout=0.2)
train_batches = collect(LMData(train))
test_batches = collect(LMData(test))
train_batches50 = train_batches[1:50] # Small sample for quick loss calculation
epoch = adam(loss_f, ((model, batch) for batch in train_batches))
bestmodel, bestloss = deepcopy(model), maploss(loss_f, model, test_batches)

[32m[1m  Updating[22m[39m registry at `~/.julia/registries/General`
[32m[1m  Updating[22m[39m git-repo `https://github.com/JuliaRegistries/General.git`
[?25l[2K[?25h[32m[1m Resolving[22m[39m package versions...
[32m[1m Installed[22m[39m IterTools ─ v1.3.0
[32m[1m  Updating[22m[39m `~/.julia/environments/v1.2/Project.toml`
 [90m [c8e1da08][39m[92m + IterTools v1.3.0[39m
[32m[1m  Updating[22m[39m `~/.julia/environments/v1.2/Manifest.toml`
 [90m [c8e1da08][39m[92m + IterTools v1.3.0[39m
[32m[1m Resolving[22m[39m package versions...
[32m[1m Installed[22m[39m StrTables ────── v1.0.1
[32m[1m Installed[22m[39m WordTokenizers ─ v0.5.3
[32m[1m Installed[22m[39m HTML_Entities ── v1.0.0
[32m[1m  Updating[22m[39m `~/.julia/environments/v1.2/Project.toml`
 [90m [796a5d58][39m[92m + WordTokenizers v0.5.3[39m
[32m[1m  Updating[22m[39m `~/.julia/environments/v1.2/Manifest.toml`
 [90m [7693890a][39m[92m + HTML_Entities v1.0.0[39m
 [90m

┌ Info: Precompiling IterTools [c8e1da08-722c-5040-9ed9-7db0dc04731e]
└ @ Base loading.jl:1242
┌ Info: Precompiling WordTokenizers [796a5d58-b03d-544a-977e-18100b691f6e]
└ @ Base loading.jl:1242
┌ Info: Recompiling stale cache file /root/.julia/compiled/v1.2/Knet/f4vSz.ji for Knet [1902f260-5fb4-5aff-8c31-6271790ab950]
└ @ Base loading.jl:1240


[32m[1m  Updating[22m[39m registry at `~/.julia/registries/General`
[32m[1m  Updating[22m[39m git-repo `https://github.com/JuliaRegistries/General.git`
[?25l[2K[?25h[32m[1m Resolving[22m[39m package versions...
[32m[1m Installed[22m[39m CUDAdrv ──── v5.0.1
[32m[1m Installed[22m[39m CUDAnative ─ v2.7.0
[32m[1m Installed[22m[39m CuArrays ─── v1.6.0
[32m[1m Installed[22m[39m Knet ─────── v1.2.7
[32m[1m  Updating[22m[39m `~/.julia/environments/v1.2/Project.toml`
 [90m [3a865a2d][39m[93m ↑ CuArrays v1.5.0 ⇒ v1.6.0[39m
 [90m [1902f260][39m[95m ↓ Knet v1.3.2 ⇒ v1.2.7[39m
[32m[1m  Updating[22m[39m `~/.julia/environments/v1.2/Manifest.toml`
 [90m [c5f51814][39m[93m ↑ CUDAdrv v4.0.4 ⇒ v5.0.1[39m
 [90m [be33ccc6][39m[93m ↑ CUDAnative v2.6.0 ⇒ v2.7.0[39m
 [90m [3a865a2d][39m[93m ↑ CuArrays v1.5.0 ⇒ v1.6.0[39m
 [90m [1902f260][39m[95m ↓ Knet v1.3.2 ⇒ v1.2.7[39m
 [90m [ae029012][39m[93m ↑ Requires v0.5.2 ⇒ v1.0.0[39m
[32m[1m  Build

Cloning into 'nn4nlp-code'...


(RNN_model(Embed(P(KnetArray{Float32,2}(512,10000))), LSTM(input=512,hidden=512,bidirectional,dropout=0.2), Linear(P(KnetArray{Float32,2}(10000,1024)), P(KnetArray{Float32,1}(10000))), 0.2, Vocab(Dict("adviser" => 1750,"enjoy" => 4607,"advertisements" => 7826,"fight" => 1441,"nicholas" => 3783,"everywhere" => 6278,"surveyed" => 3556,"helping" => 2081,"whose" => 621,"manufacture" => 5052…), ["<s>", "<unk>", "the", "N", "of", "to", "a", "in", "and", "'s"  …  "cluett", "hydro-quebec", "memotec", "photography", "ipo", "ssangyong", "fromstein", "ferc", "gitano", "daewoo"], 2, 1, split)), 9.214204f0)

In [2]:
#progress(ncycle(epoch, 100), seconds=5) do x
#for ep =1:100 
j =0
start = time()
dev_time = 0
println("Start Time=",start)
all_tagged = 0
for i in ncycle(epoch,100) 
    j += 1
    global bestmodel, bestloss
    ## Report gradient norm for the first batch
    f = @diff loss_f(model,train_batches[1])
    gnorm = sqrt(sum(norm(grad(f,x))^2 for x in params(model)))
    ## Report training and validation loss
    trnloss,this_words = maploss(loss_f,model, train_batches50,average = false)
    trnloss= trnloss/this_words
    all_tagged+=this_words
    if(j%10 == 0)
      println("train-nll= ",trnloss)
      end
    if (j%100==0)
      dev_start = time()
      devloss,words  = maploss(loss_f,model, test_batches,average = false )
      ## Save model that does best on validation data
      if devloss < bestloss
          bestmodel, bestloss = deepcopy(model), devloss
      end
      dev_time += time() - dev_start

      train_time = time() - start - dev_time

      println("nll=",devloss/words,"    ppl=", exp(devloss/words),"    words=",words, "    time=",train_time,"    word_per_sec=",all_tagged/train_time)
    end
    
end
Knet.save("lm-lstm.jld2", "model", bestmodel)

Start Time=1.578230789136887e9
train-nll= 7.245623
train-nll= 6.812294
train-nll= 6.5757356
train-nll= 6.4693737
train-nll= 6.3861575
train-nll= 6.3389435
train-nll= 6.3085117
train-nll= 6.219439
train-nll= 6.0866265
train-nll= 5.8965154
nll=5.954507    ppl=385.48676    words=70390    time=168.54808592796326    word_per_sec=39238.653845209636
train-nll= 5.748297
train-nll= 5.638169
train-nll= 5.5402837
train-nll= 5.481519
train-nll= 5.3810406
train-nll= 5.302901
train-nll= 5.214938
train-nll= 5.1277294
train-nll= 5.0314837
train-nll= 4.9339175
nll=4.8852177    ppl=132.31926    words=70390    time=326.67675709724426    word_per_sec=40490.177867360675
train-nll= 4.8225665
train-nll= 4.72732
train-nll= 4.6216307
train-nll= 4.5201325
train-nll= 4.410288
train-nll= 4.320653
train-nll= 4.2322564
train-nll= 4.1465745
train-nll= 4.0589604
train-nll= 3.9847374
nll=3.8957253    ppl=49.19172    words=70390    time=483.9358160495758    word_per_sec=40998.82534415979
train-nll= 3.908325
train-nll= 

InterruptException: ignored

In [1]:
@doc accuracy

No documentation found.

Binding `accuracy` does not exist.
