# Hong Kongese Language Identifier
This notebook adapts the tutorial below to run on the Hong Kongese language identification dataset. The main difference is that we do not load the English word vectors, because they are useless for Hong Kongese.
This notebook uses a dataset with 8 articles each of Hong Kongese and Standard Chinese.

# 5 - Multi-class Sentiment Analysis

In all of the previous notebooks we have performed sentiment analysis on a dataset with only two classes, positive or negative. When we have only two classes, our output can be a single scalar, bounded between 0 and 1, that indicates which class an example belongs to. When we have more than 2 classes, our output must be a $C$ dimensional vector, where $C$ is the number of classes.

In this notebook, we'll be performing classification on a dataset with 3 classes. Note that this dataset isn't actually a sentiment analysis dataset; it's a dataset of articles, and the task is to identify the language each article is written in. However, everything covered in this notebook applies to any dataset with examples that contain an input sequence belonging to one of $C$ classes.

Below, we set up the fields and load the dataset.

The first difference is that we do not need to set the `dtype` in the `LABEL` field. When doing a multi-class problem, PyTorch expects the labels to be numericalized `LongTensor`s.

The second difference is that we load our own JSON files with `TabularDataset` instead of using `IMDB`. The `fields` dictionary maps the `language` key of each record to the `label` field and the `text` key to the `text` field.

In [1]:
import torch
from torchtext import data
from torchtext import datasets
import random

SEED = 1234

torch.manual_seed(SEED)
torch.cuda.manual_seed(SEED)
torch.backends.cudnn.deterministic = True

DATASET="8"

Custom tokenizer to simply split at character level.

In [2]:
def tokenizer(text): # create a tokenizer function
    return list(map(str, text.replace(" ", "")))
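
As a quick check of what this tokenizer produces, the sketch below repeats the same function on two made-up inputs (the sample strings are illustrative and not taken from the dataset). Spaces are dropped, and every remaining character, including punctuation, becomes its own token.

```python
def tokenizer(text):
    # same logic as the cell above: drop spaces, then split into characters
    return list(map(str, text.replace(" ", "")))

# made-up sample inputs
english_tokens = tokenizer("I need food.")
chinese_tokens = tokenizer("你好，世界")

print(english_tokens)
print(chinese_tokens)
```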

Load dataset from data/language directory.

In [3]:
TEXT = data.Field(tokenize=tokenizer)
LABEL = data.LabelField()

fields = {'language': ('label', LABEL), 'text': ('text', TEXT)}
train_data, valid_data, test_data = data.TabularDataset.splits(
                                        path = 'data/language/' + DATASET,
                                        train = 'train.json',
                                        validation = 'valid.json',
                                        test = 'test.json',
                                        format = 'json',
                                        fields = fields
)
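
With `format = 'json'`, `TabularDataset` reads one JSON object per line, and the `fields` dictionary above picks out the `language` and `text` keys. The sketch below writes and re-reads a tiny file in that shape using only the standard library. The two records and the temporary file location are made up for illustration, and the real files under `data/language/` may contain other keys as well.

```python
import json
import os
import tempfile

# Hypothetical records; only the key names come from the `fields` dict above.
records = [
    {"language": "hky", "text": "你食咗飯未呀？"},
    {"language": "zh", "text": "你吃過飯了嗎？"},
]

path = os.path.join(tempfile.mkdtemp(), "train.json")
with open(path, "w", encoding="utf-8") as f:
    for record in records:
        # one JSON object per line
        f.write(json.dumps(record, ensure_ascii=False) + "\n")

# read the file back line by line, as a JSON-lines reader would
with open(path, encoding="utf-8") as f:
    loaded = [json.loads(line) for line in f]

labels = [row["language"] for row in loaded]
print(labels)
```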

Let's look at one of the examples in the training set.

In [4]:
vars(train_data[-1])

{'label': 'zh',
 'text': ['北',
  '京',
  '清',
  '華',
  '大',
  '學',
  '日',
  '前',
  '公',
  '布',
  '為',
  '文',
  '科',
  '教',
  '師',
  '設',
  '立',
  '的',
  '最',
  '高',
  '學',
  '術',
  '榮',
  '譽',
  '名',
  '單',
  '，',
  '1',
  '8',
  '人',
  '名',
  '單',
  '中',
  '不',
  '少',
  '都',
  '曾',
  '經',
  '歌',
  '頌',
  '習',
  '近',
  '平',
  '及',
  '其',
  '政',
  '策',
  '，',
  '反',
  '之',
  '自',
  '由',
  '派',
  '學',
  '者',
  '秦',
  '暉',
  '等',
  '則',
  '未',
  '在',
  '名',
  '單',
  '內',
  '。',
  '有',
  '學',
  '者',
  '受',
  '訪',
  '時',
  '批',
  '評',
  '，',
  '清',
  '華',
  '這',
  '種',
  '「',
  '政',
  '治',
  '第',
  '一',
  '」',
  '的',
  '取',
  '向',
  '，',
  '正',
  '反',
  '映',
  '中',
  '國',
  '高',
  '校',
  '學',
  '術',
  '、',
  '人',
  '文',
  '精',
  '神',
  '的',
  '淪',
  '落',
  '。',
  '清',
  '華',
  '大',
  '學',
  '文',
  '科',
  '建',
  '設',
  '處',
  '日',
  '前',
  '於',
  '網',
  '站',
  '公',
  '布',
  '首',
  '批',
  '1',
  '8',
  '人',
  '「',
  '文',
  '科',
  '資',
  '深',
  '教',
  '授',
  '」',
  '名',
  '單',
  '，',
  '稱',
  '

Next, we'll build the vocabulary. As this dataset is small, it also has a small vocabulary, so we do not need to set a `max_size` on the vocabulary as before.

In [5]:
TEXT.build_vocab(train_data)
LABEL.build_vocab(train_data)
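
Conceptually, building a vocabulary means counting tokens and assigning each one an integer index, with special tokens reserved at the front. The sketch below is a simplified stand-in for that idea in plain Python, not torchtext's actual implementation; the `build_vocab` helper and the sample sentences are hypothetical.

```python
from collections import Counter

def build_vocab(token_lists, specials=("<unk>", "<pad>")):
    # count every token across all examples
    counts = Counter(tok for tokens in token_lists for tok in tokens)
    # specials first, then tokens by descending frequency (ties by code point)
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    itos = list(specials) + [tok for tok, _ in ordered]
    stoi = {tok: i for i, tok in enumerate(itos)}
    return itos, stoi

# made-up training examples, already split into characters
examples = [list("我哋食飯"), list("我們吃飯")]
itos, stoi = build_vocab(examples)

# characters never seen in training fall back to the <unk> index
unk_index = stoi["<unk>"]
indexed = [stoi.get(ch, unk_index) for ch in "我飲茶"]
print(len(itos), indexed)
```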

Next, we can check the labels.

The 3 labels correspond to the languages in the dataset:
- `hky` for Hong Kongese
- `zh` for Standard Chinese
- `en` for English

In [6]:
print(LABEL.vocab.stoi)

defaultdict(<function _default_unk_index at 0x1169b7268>, {'hky': 0, 'zh': 1, 'en': 2})


As always, we set up the iterators.

In [7]:
BATCH_SIZE = 64

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

train_iterator, valid_iterator, test_iterator = data.BucketIterator.splits(
    (train_data, valid_data, test_data), 
    batch_size=BATCH_SIZE, 
    device=device,
    sort_key=lambda x: len(x.text), # the BucketIterator needs to be told what function it should use to group the data.
    sort_within_batch=False)
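
The idea behind bucketing is to group examples of similar length so that each batch needs little padding. The sketch below illustrates that idea in plain Python; it is a simplification, not how `BucketIterator` is actually implemented, and `make_padded_batches` and the sample data are hypothetical.

```python
def make_padded_batches(token_lists, batch_size, pad="<pad>"):
    # sort by length so neighbours in the list have similar lengths
    ordered = sorted(token_lists, key=len)
    batches = []
    for start in range(0, len(ordered), batch_size):
        chunk = ordered[start:start + batch_size]
        longest = max(len(tokens) for tokens in chunk)
        # pad every example in the chunk to the longest one
        batches.append([tokens + [pad] * (longest - len(tokens)) for tokens in chunk])
    return batches

# made-up examples of different lengths
examples = [list("abcdef"), list("ab"), list("abcde"), list("abc")]
batches = make_padded_batches(examples, batch_size=2)

batch_shapes = [[len(tokens) for tokens in batch] for batch in batches]
print(batch_shapes)
```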

We'll be using the CNN model from the previous notebook; however, any of the models covered in these tutorials will work on this dataset. The only difference is that the `output_dim` will now be $C$ instead of $1$.

In [8]:
import torch.nn as nn
import torch.nn.functional as F

class CNN(nn.Module):
    def __init__(self, vocab_size, embedding_dim, n_filters, filter_sizes, output_dim, dropout):
        super().__init__()
        
        self.embedding = nn.Embedding(vocab_size, embedding_dim)
        self.convs = nn.ModuleList([
                                    nn.Conv2d(in_channels = 1, out_channels = n_filters, 
                                              kernel_size = (fs, embedding_dim)) 
                                    for fs in filter_sizes
                                    ])
        self.fc = nn.Linear(len(filter_sizes) * n_filters, output_dim)
        self.dropout = nn.Dropout(dropout)
        
    def forward(self, text):
        
        #text = [sent len, batch size]
        
        text = text.permute(1, 0)
                
        #text = [batch size, sent len]
        
        embedded = self.embedding(text)
                
        #embedded = [batch size, sent len, emb dim]
        
        embedded = embedded.unsqueeze(1)
        
        #embedded = [batch size, 1, sent len, emb dim]
        
        conved = [F.relu(conv(embedded)).squeeze(3) for conv in self.convs]
            
        #conv_n = [batch size, n_filters, sent len - filter_sizes[n] + 1]
        
        pooled = [F.max_pool1d(conv, conv.shape[2]).squeeze(2) for conv in conved]
        
        #pooled_n = [batch size, n_filters]
        
        cat = self.dropout(torch.cat(pooled, dim=1))

        #cat = [batch size, n_filters * len(filter_sizes)]
            
        return self.fc(cat)

We define our model, making sure to set `OUTPUT_DIM` to $C$. We can get $C$ easily by using the size of the `LABEL` vocab, much like we used the length of the `TEXT` vocab to get the size of the vocabulary of the input.

In [9]:
INPUT_DIM = len(TEXT.vocab)
EMBEDDING_DIM = 100
N_FILTERS = 100
FILTER_SIZES = [2,3,4]
OUTPUT_DIM = len(LABEL.vocab)
DROPOUT = 0.5

model = CNN(INPUT_DIM, EMBEDDING_DIM, N_FILTERS, FILTER_SIZES, OUTPUT_DIM, DROPOUT)
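
To see where the layer sizes come from: a filter of height `fs` can be placed at `sent_len - fs + 1` positions along a sequence of length `sent_len`, max pooling reduces each filter to a single value, and the pooled outputs of all filter sizes are concatenated. The sketch below works through that arithmetic for the hyperparameters above; `sent_len = 10` is an arbitrary example length.

```python
N_FILTERS = 100
FILTER_SIZES = [2, 3, 4]
sent_len = 10  # arbitrary example length

# number of positions each filter size can take along the sequence
conv_lengths = {fs: sent_len - fs + 1 for fs in FILTER_SIZES}

# after max pooling, each filter contributes exactly one feature
fc_in_features = len(FILTER_SIZES) * N_FILTERS

# inputs shorter than the largest filter cannot be convolved at all
min_input_len = max(FILTER_SIZES)

print(conv_lengths, fc_in_features, min_input_len)
```

The last value is why the prediction helper later in this notebook pads short inputs to at least 4 tokens.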

Another difference from the previous notebooks is our loss function (aka criterion). Before, we used `BCEWithLogitsLoss`; now we use `CrossEntropyLoss`. Without going into too much detail, `CrossEntropyLoss` applies a *softmax* function to our model outputs, and the loss is given by the *cross entropy* between that and the label.

Generally:
- `CrossEntropyLoss` is used when our examples exclusively belong to one of $C$ classes
- `BCEWithLogitsLoss` is used when our examples exclusively belong to only 2 classes (0 and 1) and is also used in the case where our examples belong to between 0 and $C$ classes (aka multilabel classification).
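
As a worked example of the first case, the sketch below computes softmax and cross entropy by hand for a single example with three classes. It is plain Python for illustration only; the logits are made up, and in the notebook this computation is handled by `nn.CrossEntropyLoss`.

```python
import math

def softmax(logits):
    # subtract the max for numerical stability
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, target):
    # negative log-probability assigned to the correct class
    return -math.log(softmax(logits)[target])

logits = [2.0, 0.5, -1.0]  # made-up scores for 3 classes
probs = softmax(logits)

loss_if_class_0 = cross_entropy(logits, 0)
loss_if_class_2 = cross_entropy(logits, 2)
print(probs, loss_if_class_0, loss_if_class_2)
```

The loss is small when the correct class has the highest score and grows as the model puts its probability elsewhere.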

In [10]:
import torch.optim as optim

optimizer = optim.Adam(model.parameters())

criterion = nn.CrossEntropyLoss()

model = model.to(device)
criterion = criterion.to(device)

Before, we had a function that calculated accuracy in the binary label case, where we said that if the value was over 0.5 we would assume it is positive. In the case where we have more than 2 classes, our model outputs a $C$ dimensional vector, where the value of each element is the belief that the example belongs to that class.

For example, in our labels we have: 'hky' = 0, 'zh' = 1 and 'en' = 2. If the output of our model was something like **[5.1, 0.3, 2.1]**, this means that the model strongly believes the example belongs to class 0, Hong Kongese, and slightly believes the example belongs to class 2, English.

We calculate the accuracy by performing an `argmax` to get the index of the maximum value in the prediction for each element in the batch, and then counting how many times this equals the actual label. We then average this across the batch.
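
The argmax-and-compare step can be sketched without tensors. The helper name `accuracy_from_scores` and the scores below are made up for illustration; the notebook's own version operates on PyTorch tensors.

```python
def accuracy_from_scores(scores, labels):
    # the index of the largest score in each row is the predicted class
    predicted = [row.index(max(row)) for row in scores]
    correct = sum(p == y for p, y in zip(predicted, labels))
    return predicted, correct / len(labels)

# made-up model outputs for a batch of 4 examples and 3 classes
scores = [
    [5.1, 0.3, 0.1],
    [0.2, 1.7, 0.4],
    [0.9, 0.6, -2.3],
    [-0.4, -1.1, -0.1],
]
labels = [0, 1, 1, 2]

predicted, acc = accuracy_from_scores(scores, labels)
print(predicted, acc)
```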

In [11]:
def categorical_accuracy(preds, y):
    """
    Returns accuracy per batch, i.e. if you get 8/10 right, this returns 0.8, NOT 8
    """
    max_preds = preds.argmax(dim=1, keepdim=True) # get the index of the max probability
    correct = max_preds.squeeze(1).eq(y)
    # cast to float before dividing so the result is a fraction on the same device as y
    return correct.float().sum() / y.shape[0]

The training loop is similar to before, without the need to `squeeze` the model predictions as `CrossEntropyLoss` expects the input to be **[batch size, n classes]** and the label to be **[batch size]**.

The label needs to be a `LongTensor`, which it is by default as we did not set the `dtype` to a `FloatTensor` as before.

In [12]:
def train(model, iterator, optimizer, criterion):
    
    epoch_loss = 0
    epoch_acc = 0
    
    model.train()
    
    for batch in iterator:
        
        optimizer.zero_grad()
        
        predictions = model(batch.text)
        
        loss = criterion(predictions, batch.label)
        
        acc = categorical_accuracy(predictions, batch.label)
        
        loss.backward()
        
        optimizer.step()
        
        epoch_loss += loss.item()
        epoch_acc += acc.item()
        
    return epoch_loss / len(iterator), epoch_acc / len(iterator)

The evaluation loop is, again, similar to before.

In [13]:
def evaluate(model, iterator, criterion):
    
    epoch_loss = 0
    epoch_acc = 0
    
    model.eval()
    
    with torch.no_grad():
    
        for batch in iterator:

            predictions = model(batch.text)
            
            loss = criterion(predictions, batch.label)
            
            acc = categorical_accuracy(predictions, batch.label)

            epoch_loss += loss.item()
            epoch_acc += acc.item()
        
    return epoch_loss / len(iterator), epoch_acc / len(iterator)

Next, we train our model.

In [14]:
%%time

N_EPOCHS = 30

lowest_valid_loss = float('inf')
for epoch in range(N_EPOCHS):

    train_loss, train_acc = train(model, train_iterator, optimizer, criterion)
    valid_loss, valid_acc = evaluate(model, valid_iterator, criterion)
    
    # save the model with the lowest validation loss for use later
    saved = False
    if valid_loss < lowest_valid_loss:
        lowest_valid_loss = valid_loss
        with open("./models/language-identifier-" + DATASET + "-best.pt", 'wb') as fb:
            saved = True
            torch.save(model, fb)
    
    print(f'| Epoch: {epoch+1:02} | Train Loss: {train_loss:.3f} | Train Acc: {train_acc*100:.2f}% | Val. Loss: {valid_loss:.3f} | Val. Acc: {valid_acc*100:.2f}% | Saved: {saved}')
    
with open("./models/language-identifier-" + DATASET + "-final.pt", 'wb') as ff:
    torch.save(model, ff)    



| Epoch: 01 | Train Loss: 1.845 | Train Acc: 29.41% | Val. Loss: 1.049 | Val. Acc: 47.99% | Saved: True
| Epoch: 02 | Train Loss: 0.927 | Train Acc: 47.06% | Val. Loss: 0.891 | Val. Acc: 56.59% | Saved: True
| Epoch: 03 | Train Loss: 1.100 | Train Acc: 35.29% | Val. Loss: 0.892 | Val. Acc: 42.01% | Saved: False
| Epoch: 04 | Train Loss: 0.642 | Train Acc: 76.47% | Val. Loss: 0.938 | Val. Acc: 41.22% | Saved: False
| Epoch: 05 | Train Loss: 0.497 | Train Acc: 82.35% | Val. Loss: 0.958 | Val. Acc: 41.22% | Saved: False
| Epoch: 06 | Train Loss: 0.445 | Train Acc: 88.24% | Val. Loss: 0.942 | Val. Acc: 41.61% | Saved: False
| Epoch: 07 | Train Loss: 0.351 | Train Acc: 82.35% | Val. Loss: 0.900 | Val. Acc: 43.18% | Saved: False
| Epoch: 08 | Train Loss: 0.555 | Train Acc: 76.47% | Val. Loss: 0.849 | Val. Acc: 58.23% | Saved: True
| Epoch: 09 | Train Loss: 0.350 | Train Acc: 82.35% | Val. Loss: 0.814 | Val. Acc: 73.10% | Saved: True
| Epoch: 10 | Train Loss: 0.296 | Train Acc: 82.35% | Val. 

We now evaluate the final model on the test set.

In [15]:
test_loss, test_acc = evaluate(model, test_iterator, criterion)

print(f'| Test Loss: {test_loss:.3f} | Test Acc: {test_acc*100:.2f}% |')

| Test Loss: 0.544 | Test Acc: 84.25% |


Deep learning models tend to overfit the training data if they are trained for too many epochs. We'll compare the final model with the best checkpoint saved during training, i.e. the one with the lowest validation loss.
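
The checkpointing rule used during training (save whenever the validation loss reaches a new minimum) can be sketched in isolation. The losses below are the first nine validation losses from the training log above; the names `saved_flags` and `best_epoch` are just for illustration.

```python
# validation losses for epochs 1 to 9, copied from the training log
valid_losses = [1.049, 0.891, 0.892, 0.938, 0.958, 0.942, 0.900, 0.849, 0.814]

lowest = float("inf")
best_epoch = None
saved_flags = []

for epoch, loss in enumerate(valid_losses, start=1):
    improved = loss < lowest
    if improved:
        # this is the point where the notebook writes the "-best.pt" checkpoint
        lowest = loss
        best_epoch = epoch
    saved_flags.append(improved)

print(best_epoch, saved_flags)
```

The resulting flags match the `Saved` column printed for those epochs.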

In [16]:
with open("./models/language-identifier-" + DATASET + "-best.pt", 'rb') as fbl:
    best_model = torch.load(fbl)

In [17]:
best_model_test_loss, best_model_test_acc = evaluate(best_model, test_iterator, criterion)

print(f'| Test Loss: {best_model_test_loss:.3f} | Test Acc: {best_model_test_acc*100:.2f}% |')

| Test Loss: 0.544 | Test Acc: 84.25% |


Choose the model with the lower test loss.

In [18]:
if test_loss > best_model_test_loss:
    print("Will use best_model.")
    selected_model = best_model
else:
    print("Will use final model.")
    selected_model = model

Will use final model.


Similar to how we made a function to predict sentiment for any given sentence, we can now make a function that will predict the language of any given text.

The only difference here is that instead of using a sigmoid function to squash the input between 0 and 1, we use the `argmax` to get the highest predicted class index. We then use this index with the label vocab to get the human readable label.

In [19]:
def predict_sentiment(sentence, trained_model, min_len=4):
    tokenized = tokenizer(sentence)
    if len(tokenized) < min_len:
        tokenized += ['<pad>'] * (min_len - len(tokenized))
    indexed = [TEXT.vocab.stoi[t] for t in tokenized]
    tensor = torch.LongTensor(indexed).to(device)
    tensor = tensor.unsqueeze(1)
    preds = trained_model(tensor)
    print(preds)
    max_preds = preds.argmax(dim=1)
    return max_preds.item()

Now, let's try it out on a few different sentences...

In [20]:
pred_class = predict_sentiment("特朗普上周四（7日）曾表示，在3月1日達成貿易協議的最後期限前，他不會與中國國家主席習近平會晤。", selected_model)
print(f'Predicted class is: {pred_class} = {LABEL.vocab.itos[pred_class]}')

tensor([[ 0.2146,  0.4720, -2.7170]], grad_fn=<AddmmBackward>)
Predicted class is: 1 = zh
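
The printed tensor holds raw scores (logits), not probabilities. If probabilities are wanted, a softmax can be applied to the scores; the sketch below does this in plain Python for the three values printed above, with the label order taken from `LABEL.vocab.stoi`. It is an illustration only and not part of the notebook's pipeline.

```python
import math

labels = ["hky", "zh", "en"]          # order from LABEL.vocab.stoi
logits = [0.2146, 0.4720, -2.7170]    # scores printed for the sentence above

# softmax: exponentiate (shifted by the max for stability) and normalise
m = max(logits)
exps = [math.exp(x - m) for x in logits]
probs = [e / sum(exps) for e in exps]

best = max(range(len(probs)), key=lambda i: probs[i])
for label, p in zip(labels, probs):
    print(f"{label}: {p:.3f}")
print("predicted:", labels[best])
```

Softmax does not change which class wins, so the predicted label is the same as with `argmax` on the raw scores.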


In [21]:
pred_class = predict_sentiment("喺未有互聯網之前，你老母叫你做人唔好太高眼角，正正常常嘅男人嫁出去就算。", selected_model)
print(f'Predicted class is: {pred_class} = {LABEL.vocab.itos[pred_class]}')

tensor([[ 0.9028,  0.6221, -2.2925]], grad_fn=<AddmmBackward>)
Predicted class is: 0 = hky


In [22]:
pred_class = predict_sentiment("I need to get some food.", selected_model)
print(f'Predicted class is: {pred_class} = {LABEL.vocab.itos[pred_class]}')

tensor([[-0.4475, -1.0717, -0.0933]], grad_fn=<AddmmBackward>)
Predicted class is: 2 = en


In [23]:
def range_predictions(prelist, trained_model, min_len=4):
    # map each token to its winning score, grouped by the predicted language
    predict_hky = {}
    predict_zh = {}
    predict_en = {}
    for token in prelist:
        tokenized = tokenizer(token)
        tokenized_len = len(tokenized)
        # pad so the input is at least as long as the largest filter
        if tokenized_len < min_len:
            tokenized += ['<pad>'] * (min_len - tokenized_len)
        indexed = [TEXT.vocab.stoi[t] for t in tokenized]
        tensor = torch.LongTensor(indexed).to(device)
        tensor = tensor.unsqueeze(1)
        preds = trained_model(tensor)
        max_idx = preds.argmax(dim=1).item()
        score = preds[0][max_idx].item() # raw score (logit) of the winning class
        label = LABEL.vocab.itos[max_idx]
        if label == 'hky':
            predict_hky[token] = score
        elif label == 'zh':
            predict_zh[token] = score
        else:
            predict_en[token] = score
    return predict_hky, predict_zh, predict_en

In [24]:
predict_hky, predict_zh, predict_en = range_predictions(TEXT.vocab.itos, selected_model, min_len=4)

In [25]:
# print the five tokens with the highest score for this class
sorted_by_value = sorted(predict_hky.items(), key=lambda kv: kv[1], reverse=True)
for item in sorted_by_value[:5]:
    print(item)

('檢', 0.5987368226051331)
('罷', 0.5766047835350037)
('異', 0.5702905058860779)
('抵', 0.5484750866889954)
('犬', 0.5457524061203003)


In [26]:
# print the five tokens with the highest score for this class
sorted_by_value = sorted(predict_zh.items(), key=lambda kv: kv[1], reverse=True)
for item in sorted_by_value[:5]:
    print(item)

('生', 0.5956845283508301)
('抱', 0.5700251460075378)
('乏', 0.5359424352645874)
('材', 0.5287002325057983)
('園', 0.522555410861969)


In [27]:
# print the five tokens with the highest score for this class
sorted_by_value = sorted(predict_en.items(), key=lambda kv: kv[1], reverse=True)
for item in sorted_by_value[:5]:
    print(item)

Check which articles in the test set were misclassified.

In [28]:
with torch.no_grad():
    for batch in test_iterator:
        predictions = selected_model(batch.text)
        max_preds = predictions.argmax(dim=1)
        wrong_preds = (max_preds.eq(batch.label) == 0).nonzero()
        for wrong_pred in wrong_preds:
            wrong_idx = wrong_pred.item()
            incorrect_prediction = max_preds[wrong_idx].item()
            correct_label = batch.label[wrong_idx].item()
            print("Predicted \"" + LABEL.vocab.itos[incorrect_prediction] + "\" but should be \"" + LABEL.vocab.itos[correct_label] + "\".")
            text_i = batch.text[:,wrong_idx].tolist()
            text_striped_idx = [x for x in text_i if x != TEXT.vocab.stoi['<pad>']]
            full_text = list(map(lambda x: TEXT.vocab.itos[x], text_striped_idx))
            print("Article: ")
            print("".join(full_text))
            print()

Predicted "zh" but should be "hky".
Article: 
曾俊華原來冇話過佢一直認同梁振英？

Predicted "zh" but should be "hky".
Article: 
今次有人動員臺港兩地搞羅<unk><unk>等人，有啲乜嘢深層次<unk>謀？

Predicted "zh" but should be "hky".
Article: 
「中央最信任的香港人」若果是梁愛詩，佢話「梁營眾叛親離」係代表乜嘢？

Predicted "zh" but should be "hky".
Article: 
相關文章：林<unk>光講咗又唔認<unk>前年港視事件早有前科<unk><unk>中流反<unk>二副未得善終<unk>林<unk>光講船員叛變故事為警告港人？<unk>皇<unk>清

Predicted "hky" but should be "zh".
Article: 
港隊在土<unk><unk>舉行的亞室運賽事再傳喜訊，女飛魚<unk>健樂繼昨日於女子100米<unk><unk>封后之後，今日（9月24日）再在50米<unk><unk>決賽，以26秒74時間奪銅。至於女子200米自由<unk>決賽，年<unk>15歲的何南<unk>就以1分58秒49奪得一面銅牌。（港協<unk><unk>委會圖片）

Predicted "hky" but should be "zh".
Article: 
天文台預測，現時影響中國東南<unk><unk>的東北季候風會，會在本<unk>中期逐<unk>被一<unk>偏東氣流取代。然而，預料一<unk>強<unk>東北季候風會在周末抵達華南，該區天氣將會顯著轉<unk>。根據天文台九天天氣預報，下星期日（11月19日）的預測市區最低氣溫得17度，保<unk>衣物是時候出動。

Predicted "hky" but should be "zh".
Article: 
 <unk><unk>天氣警告仍然生效，天文台進一步調低明日（2月1日）市區最低溫預測到7度，達嚴<unk>水平。根據天文台的天氣術語，8至12度為之<unk><unk>，若7度以下稱為嚴<unk>。天文台表示，冬季季候風及其補充，會維持華南的<unk><unk>天氣至下周後期。受一<unk><unk>燥大

Predicted "hky" but should be "zh".
Article: 
小<unk>波無講笑，你食緊條香<unk>將會<unk>少見少，因為新型的黃<unk>病正在全球各地<unk>延，重創市值360<unk>美元的香<unk>產業。市面上見到的香<unk>，九成是香<unk><unk>(Cavendish)。在1965年前，<unk>們被視為次品的香<unk>種類。然而，原先在二十世紀全球出口最多的大米七香<unk>(GrosMichel)，就是因為黃<unk>病<unk>行而無法再大量產出，最終商業性<unk>絕。當時的香<unk>產業後來選擇了香<unk><unk>東山再起。不過，五十年過去，人類總是犯上同樣的錯誤，黃<unk>病<unk>土重來，侵襲香<unk><unk>。黃<unk>病是由<unk><unk><unk><unk><unk>古巴專化型(Fusariumoxysporumf.sp.cubense)這種真<unk>造成。<unk>們會<unk><unk>於<unk>土之中感染香<unk>樹的根部，並分<unk>毒素，令樹無法吸水而死。最麻煩的是，<unk><unk><unk><unk><unk>對抗真<unk><unk>有抵抗力，至今無藥可根治——連<unk><unk>重<unk>也不能，因為該<unk>可於<unk>土存活數十年，再種香<unk>也是死路一條，唯有<unk><unk>田地後改種其他植物才可保住一點經濟效益。話說回來，說是五十年黃<unk>病才<unk>土重來也不盡正確，因為這次的疫病早在九十年代初，於馬來西亞開始爆發。有<unk>的塵土透過風、雨水以及交通工具，傳<unk>到其他地方，由東南亞越<unk>傳到澳洲，並在2013年傳到非洲。上周舉行的國際香<unk>會議甚至因擔心會將真<unk>傳到拉<unk>美洲，臨時將會議由哥斯達<unk>加改到<unk>亞密。至於為何香<unk>業界如斯恐懼此病，除了因為無藥可醫之外，更重要是業界只單一<unk>作，再加上現代香<unk>是無性繁殖的植物——就像水<unk>，農民只需要有一個球<unk>就可種出香<unk>。換句話說，全球的香<unk><unk>基因幾近完全一樣，無法透過遺傳變異發展出對抗黃<unk>病的基因，一<unk>香<unk><unk>樹出事，就等

Predicted "hky" but should be "en".
Article: 
（中譯<unk>本在下）ThecurrentRioOlympicshassadlybeenbrandedasoneoftheugliestinrecentmemoriesasfarastheOlympicspiritisconcerned,orthelackofit.We’vesofarwitnessedmorethanourfairshareoftheunsightlysideofthisglobalsportingevent.BeforetheGamesevenstartedtherewerealreadyrisinganti-RussiansentimentsduetodopingscandalsandthenChineseswimmerSunYangtoppedit,notbybeingcalleda“drugcheat”,buttheensuingretaliatorywarofwordshelaunchedagainsthisaccuserthereafter,whichwasutterlyshameful.Fortunately,theGames’mostcelebratedmedallistMichaelPhelpssavedtheGames’PRimagefromspirallingoutofcontrolwithhisgraciousacknowledgementofhisdefeatinthe100-metrebutterflytoyoungfirst-timeOlympianJosephSchooling,whobecameafanofPhelps’aftermeetinghimeightyearsago.WithsuchadramaticturnofeventsunfoldingattheRioGames,Icouldn’tpossiblystandonthesidelinesandleavetheGamessosoon.Therefore,IdecidedtodedicateanothercolumntotheRioGames.Afterall,ittakesplaceonlyonceeveryfouryears.Whynot?Firstoff,“