## 1. How to run this notebook:
1. If not installed, install the required packages in **[2]**
2. Define the parameters in **[3]**. You will find instructions there.
3. Just run the cells in **[4]**
4. Run the last cell in **[5]** and start annotating
5. Your annotations are stored in *./data/annotations.csv*

## 2. Install

If the `jupyter labextension install` command fails, try installing Node.js first. On macOS you can download it from the Node.js website: https://nodejs.org/en
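
Before running the install cell, it can help to see what is already available. The sketch below is a convenience that is not part of the original notebook: `check_environment` is a hypothetical helper that uses only the standard library to report whether the listed packages are importable and whether a `node` executable is on the `PATH`.

```python
import importlib.util
import shutil


def check_environment(packages=("ipywidgets", "pandas", "numpy")):
    """Report which packages are importable and whether Node.js is on PATH.

    Hypothetical helper, not part of the original notebook.
    """
    # find_spec returns None for a top-level module that cannot be found.
    report = {name: importlib.util.find_spec(name) is not None for name in packages}
    # shutil.which returns None when no executable named "node" is found.
    report["node"] = shutil.which("node") is not None
    return report


for name, available in check_environment().items():
    print(f"{name}: {'found' if available else 'missing'}")
```

Anything reported as missing is a candidate for the install commands in the next cell.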

In [10]:
# !pip install ipywidgets
# !jupyter nbextension enable --py widgetsnbextension
# !jupyter labextension install @jupyter-widgets/jupyterlab-manager

In [1]:
# imports   #thesis-godel-metrics-and-data
import ipywidgets as w
import numpy as np
import pandas as pd
import csv
from collections import defaultdict

## 3. Define Parameters

# `BE CAREFUL: Define index and load parameters`

**load**  
- If you run the notebook for the first time leave as *False*
- If you want to continue annotating define as *True*

**index**
- If you run the notebook for the first time leave as *"explanation"*
- If you want to continue annotating assign the index (as integer) where you left off
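
The two rules above can be checked before any data is loaded. The following is a minimal sketch, not part of the original notebook: `validate_params` and its `n_rows` argument are hypothetical, and rejecting an integer index together with `load = False` is an assumption based on the instructions (a fresh start would rebuild the table without the earlier annotations).

```python
def validate_params(load, index, n_rows=None):
    """Check the load/index combination described above (hypothetical helper)."""
    if not isinstance(load, bool):
        raise TypeError("load must be True or False")
    if index == "explanation":
        return index
    # bool is a subclass of int, so exclude it explicitly
    if isinstance(index, bool) or not isinstance(index, int):
        raise TypeError('index must be "explanation" or an integer')
    if index < 0:
        raise ValueError("index must not be negative")
    if n_rows is not None and index >= n_rows:
        raise ValueError(f"index {index} is outside the {n_rows} available rows")
    if not load:
        # Assumption: resuming at an integer index only makes sense with stored data
        raise ValueError("an integer index requires load = True")
    return index


# Resuming at row 12 of a 300-row annotation file
load = True
index = validate_params(load, 12, n_rows=300)
print(index)
```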


In [2]:
load = False     
index = 'explanation'

#you shouldn't have to modify these paths
jsonl_path = './data/annotations.jsonl'
csv_path = './data/annotations.csv'

## 4. Run Cells

In [3]:
### RUN CELL 1
#setting up data
if not load:
    data = pd.read_json(jsonl_path, lines=True)

    def htmlivize(val):
        return val.replace('|','<br><br>')
    data.dialogue = data.dialogue.transform(htmlivize)

    line = f" <br> -------------------------------------------------------------------------------------------------- <br>"
    response_start = "<font color='blue'> Response: </font> "
    all_text_col = []
    for text,response in zip(data['dialogue'],data['response']): 
        all_text = text + line + response_start + response
        all_text_col.append(all_text)
    data['text'] = all_text_col

    df = pd.DataFrame(list(data.text), columns = ['texts'])
    df['soundness'] = 0
    df['conciseness'] = 0
    df['completeness'] = 0
    df['relevance'] = 0
    df['clarity'] = 0
    df['brevity'] = 0
    df['coherence'] = 0
    df['dialogue_act'] = 'Y'
    df['emotion'] = 'Y'
    df['communicative_goal'] = 'Y'
    df['comments'] = 'Any comments?'

else:
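    # to_csv wrote the row index as the first column, so read it back as the index;
    # keep_default_na=False keeps an empty comment as '' instead of NaN
    df = pd.read_csv(csv_path, delimiter=';', index_col=0, keep_default_na=False)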
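
For a single record, the text preparation in RUN CELL 1 amounts to the following. This is an illustrative sketch only: `build_text` is a hypothetical name, the separator is shortened, and the sample dialogue is made up rather than taken from the data.

```python
SEPARATOR = " <br> " + "-" * 20 + " <br>"  # shortened for the example
RESPONSE_START = "<font color='blue'> Response: </font> "


def build_text(dialogue, response):
    """Compose the HTML shown for one record: dialogue turns, a rule, then the response."""
    # Turns are separated by '|' in the source data; show each on its own paragraph
    turns_html = dialogue.replace("|", "<br><br>")
    return turns_html + SEPARATOR + RESPONSE_START + response


# Made-up example record
example = build_text("Hi, how are you?|Fine, and you?", "Doing well, thanks!")
print(example)
```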

In [4]:
### RUN CELL 2
explanation = f""" 
<p><span style='font-family: "Times New Roman", Times, serif; font-size: 14px; color: rgb(0, 0, 0);'>Explanation:</span></p>
<ul>
    <li style='font-family: "Times New Roman", Times, serif; font-size: 14px; color: rgb(0, 0, 0);'>
        <div style="background-color: #ffffff;font-size: 14px;">This is the annotation tool. We will go through each response and its dialogue context.</div>
    </li>
  <br>
    <li style='font-family: "Times New Roman", Times, serif; font-size: 14px; color: rgb(0, 0, 0);'>
        <div style="background-color: #ffffff;font-size: 14px;">The first 7 criteria have a text box. Type the score (<em>1-5</em>) or click the arrows. For the last 3 criteria, click the button you want to select (<em>Y|N|P</em>). After you have assigned all scores, click <strong>Save and Next</strong>. </div>
    </li>
  <br>
    <li style='font-family: "Times New Roman", Times, serif; font-size: 14px; color: rgb(0, 0, 0);'>
        <div style="background-color: #ffffff;font-size: 14px;">Click&nbsp;<strong>Previous&nbsp;</strong>to go to the previous response, if you want to change something.<br>&nbsp;<span style='font-family: "Lucida Console", Monaco, monospace; font-size: 20px; color: rgb(226, 80, 65);'>!</span> Remember: You will have to adjust the scores of your current slide again, when you go back to it.</div>
    </li>
  <br>
    <li style='font-family: "Times New Roman", Times, serif; font-size: 14px; color: rgb(0, 0, 0);'>
        <div style="background-color: #ffffff;font-size: 14px;">If you want to take a break, click&nbsp;<strong>Pause and store data</strong>.&nbsp;<br>&nbsp;<span style='font-size: 20px; font-family: "Lucida Console", Monaco, monospace; color: rgb(226, 80, 65);'>!</span> Remember: Assign your current index to the variable&nbsp;<span style="font-family: 'Lucida Console', Monaco, monospace;">index</span>&nbsp;in the&nbsp;<span style="font-family: 'Lucida Console', Monaco, monospace;">Define Parameters</span>&nbsp;section of this notebook. Also, set&nbsp;<span style="font-family: 'Lucida Console', Monaco, monospace;">load</span>&nbsp;to <em>True</em>.</div>
    </li>
  <br>
    <li style='font-family: "Times New Roman", Times, serif; font-size: 14px; color: rgb(0, 0, 0);'>
        <div style="background-color: #ffffff;font-size: 14px;">When you are done annotating, click <strong>Finish</strong>. Your annotations are stored in&nbsp;<em>{csv_path}</em>.<br><br><br><br><br><br><br><br></div>
    </li>
</ul>
"""

In [5]:
### RUN CELL 3
def make_score_box(label):
    # IntText has no min/max/readout traits (those belong to sliders and
    # BoundedIntText), so only the supported arguments are passed here.
    # Note that IntText itself does not restrict input to the 1-5 range.
    return w.IntText(value=0,
                     step=1,
                     description=label,
                     disabled=False,
                     continuous_update=False)

soundness = make_score_box('Soundness')
conciseness = make_score_box('Conciseness')
completeness = make_score_box('Completeness')
relevance = make_score_box('Relevance')
clarity = make_score_box('Clarity')
brevity = make_score_box('Brevity')
coherence = make_score_box('Coherence')

# dialogue_act = w.Text(value='Fill in',
#     description='Dialogue-act',
#     disabled=False,
#     continuous_update=False,
#     readout=True,
#     readout_format='d')

# emotion = w.Text(value='Fill in',
#     description='Emotion',
#     disabled=False,
#     continuous_update=False,
#     readout=True,
#     readout_format='d')

# communicative_goal = w.Text(value='Fill in',
#     description='Communicative goal',
#     disabled=False,
#     continuous_update=False,
#     readout=True,
#     readout_format='d')

dialogue_act = w.ToggleButtons(options=['Y', 'N', 'P'], 
                               description= 'Dialogue-act',
                               disabled=False, 
                               button_style='')

emotion = w.ToggleButtons(options=['Y', 'N', 'P'], 
                               description= 'Emotion',
                               disabled=False, 
                               button_style='')

communicative_goal = w.ToggleButtons(options=['Y', 'N', 'P'], 
                               description= 'Communicative goal',
                               disabled=False, 
                               button_style='')

comments = w.Textarea(value='Any comments?')

In [6]:
# RUN CELL 4
# Prepare button functionality
def next(b):
    global df, index, out, progression
    
    if index == 'explanation':
        index = 0
        progress_text.value=f'Current index: {index}'
        
        soundness.value = df.loc[index,'soundness']
        conciseness.value = df.loc[index,'conciseness']
        completeness.value = df.loc[index,'completeness']
        relevance.value = df.loc[index,'relevance']
        clarity.value = df.loc[index,'clarity']
        brevity.value = df.loc[index,'brevity']
        coherence.value = df.loc[index,'coherence']
        dialogue_act.value = df.loc[index,'dialogue_act']
        emotion.value = df.loc[index,'emotion']
        communicative_goal.value = df.loc[index,'communicative_goal']
        comments.value = df.loc[index,'comments']

        text = w.HTMLMath(value=df.loc[index, 'texts'])
        
        with out:
            out.clear_output()
            display(text)

    elif index == len(df):
        progress_text.value=f'Current index: {index}'
        text = w.HTMLMath(value='STOP <br> You are done!!!')
        with out:
            out.clear_output()
            display(text)

    else:
        # update the progress bar (progression is the HBox, not the bar itself)
        progression_bar.value = index+1
        
        # add current slider values to df
        df.loc[index,'soundness'] = soundness.value
        df.loc[index,'conciseness'] = conciseness.value
        df.loc[index,'completeness'] = completeness.value
        df.loc[index,'relevance'] = relevance.value
        df.loc[index,'clarity'] = clarity.value
        df.loc[index,'brevity'] = brevity.value
        df.loc[index,'coherence'] = coherence.value
        df.loc[index,'dialogue_act'] = dialogue_act.value
        df.loc[index,'emotion'] = emotion.value
        df.loc[index,'communicative_goal'] = communicative_goal.value
        df.loc[index, 'comments'] = comments.value

        index+=1
        if index == len(df):
            text = w.HTMLMath(value='STOP <br> You are done!!!')
            progress_text.value=f'Current index: {index}'
            with out:
                out.clear_output()
                display(text)
        
        else:

            soundness.value = df.loc[index,'soundness']
            conciseness.value = df.loc[index,'conciseness']
            completeness.value = df.loc[index,'completeness']
            relevance.value = df.loc[index,'relevance']
            clarity.value = df.loc[index,'clarity']
            brevity.value = df.loc[index,'brevity']
            coherence.value = df.loc[index,'coherence']
            dialogue_act.value = df.loc[index,'dialogue_act']
            emotion.value = df.loc[index,'emotion']
            communicative_goal.value = df.loc[index,'communicative_goal']
            comments.value = df.loc[index,'comments']

            progress_text.value=f'Current index: {index}'
            text = w.HTMLMath(value=df.loc[index, 'texts'])
            # was trying to update the texts show, but does not work
            with out:
                out.clear_output()
                display(text)

# def small_next(b):
    # update position
    # global df, index, out, progression

    # if index == 'explanation':
    #     index = 0
    #     progression.value = index+1

    #     coherence.value = df.loc[index,'coherence_results']
    #     consistency.value = df.loc[index,'consistency_results']
    #     relevance.value = df.loc[index,'relevance_results']
    #     fluency.value = df.loc[index,'fluency_results']

    # else:
    #     index += 1
    
    # if index == len(df):
    #     index=0

    # progress_text.value=f'Current index: {index}'
    
    # coherence.value = df.loc[index,'coherence_results']
    # consistency.value = df.loc[index,'consistency_results']
    # relevance.value = df.loc[index,'relevance_results']
    # fluency.value = df.loc[index,'fluency_results']


    # # refresh display
    # text = w.HTMLMath(value=df.loc[index, 'texts'])
    # with out:
    #     out.clear_output()
    #     display(text)

def back(b):
    # update position
    global df, index, out, progression

    if index == 'explanation':
        progression_bar.value = 0
        index = 'explanation'
        progress_text.value=f'Current index: {index}'
        text = w.HTMLMath(value=explanation)

    else:
        index -= 1
        if index == -1:
            progression_bar.value = 0
            index = 'explanation'
            progress_text.value=f'Current index: {index}'
            text = w.HTMLMath(value=explanation)

        

        else:
            progression_bar.value = index+1
            progress_text.value=f'Current index: {index}'

            soundness.value = df.loc[index,'soundness']
            conciseness.value = df.loc[index,'conciseness']
            completeness.value = df.loc[index,'completeness']
            relevance.value = df.loc[index,'relevance']
            clarity.value = df.loc[index,'clarity']
            brevity.value = df.loc[index,'brevity']
            coherence.value = df.loc[index,'coherence']
            dialogue_act.value = df.loc[index,'dialogue_act']
            emotion.value = df.loc[index,'emotion']
            communicative_goal.value = df.loc[index,'communicative_goal']
            comments.value = df.loc[index,'comments']

            text = w.HTMLMath(value=df.loc[index, 'texts'])
        
    # refresh the display for every branch, including the explanation page
    with out:
        out.clear_output()
        display(text)

def pause_annotation(b):
    global index, df
    # store temp data
    df.to_csv(csv_path, sep=';')

    text = w.HTMLMath(value=f'You paused annotation! Data is stored in {csv_path}. <br>\
                            Next time, assign the index below to <code>index</code> in the Define Parameters section\
                            and set <code>load</code> to True to continue where you left off.\
                            <br><br> Stopped at index: {index}')
    with out:
        out.clear_output()
        display(text)


def finish(b):
    global df
    # store data
    df.to_csv(csv_path, sep=';')


    # notify user
    text = w.HTMLMath(value=f'You have finished annotating! The data is stored in {csv_path}.')
    with out:
        out.clear_output()
        display(text)


In [7]:
### RUN CELL 5
# Compose widget

# set initial Dialogue
if index == 'explanation':
    text = w.HTMLMath(value=explanation)
    
else:
    text = w.HTMLMath(value=df.loc[index, 'texts'])

out = w.Output()
with out:
    out.clear_output()
    display(text)

# save work button
forward = w.Button(description='Save and Next', icon='forward')
forward.on_click(next)
small_forward = w.Button(description='Next', icon='arrow-right')
#small_forward.on_click(small_next)

previous =  w.Button(description='Previous',icon='arrow-left')
previous.on_click(back)

pause = w.Button(description='Pause and store data',icon='download')
pause.on_click(pause_annotation)
fin = w.Button(description='Finish',icon='check')
fin.on_click(finish)

progression_bar = w.IntProgress(
    value=0,
    min=0,
    max=len(df),
    description='Progress',
    bar_style='', # 'success', 'info', 'warning', 'danger' or ''
    style={'bar_color': 'cyan'},
    orientation='horizontal'
)

progress_text = w.HTMLMath(value=f'Current index: {index}')


# sliders & buttons
sliders = w.VBox([soundness, conciseness, completeness, relevance, clarity, brevity, coherence,dialogue_act, emotion, communicative_goal])
buttons = w.HBox([previous,forward, pause, fin])
progression = w.HBox([progression_bar, progress_text])

## 5. Annotate

In [8]:
### RUN LAST CELL
# run widget
display(out)
display(sliders)
display(comments)
display(progression)
display(buttons)

Output(outputs=({'output_type': 'display_data', 'data': {'text/plain': 'HTMLMath(value=\' \\n<p><span style=\\…

VBox(children=(IntText(value=0, description='Soundness'), IntText(value=0, description='Conciseness'), IntText…

Textarea(value='Any comments?')

HBox(children=(IntProgress(value=0, description='Progress', max=300, style=ProgressStyle(bar_color='cyan')), H…

HBox(children=(Button(description='Previous', icon='arrow-left', style=ButtonStyle()), Button(description='Sav…

You can inspect the current DataFrame here. Once you have clicked **Pause and store data** or **Finish**, the annotations are also saved to *./data/annotations.csv*.

In [None]:
df.head()
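
The stored file can also be inspected without the notebook state. The sketch below is an assumption-based example, not part of the original workflow: `count_scored_rows` is a hypothetical helper that reads the `;`-separated CSV with the standard `csv` module and treats a score of 0 as "not annotated yet", which matches the initial values set in RUN CELL 1.

```python
import csv

SCORE_COLUMNS = ["soundness", "conciseness", "completeness", "relevance",
                 "clarity", "brevity", "coherence"]


def count_scored_rows(path, delimiter=";"):
    """Return (scored, total) for a CSV written by the notebook.

    Assumption: a row counts as scored when none of its score columns is 0.
    """
    scored = total = 0
    with open(path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle, delimiter=delimiter):
            total += 1
            if all(int(row[col]) != 0 for col in SCORE_COLUMNS):
                scored += 1
    return scored, total


# Usage, once the file exists:
# scored, total = count_scored_rows('./data/annotations.csv')
# print(f'{scored} of {total} responses fully scored')
```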