*best viewed in [nbviewer](https://nbviewer.jupyter.org/github/CambridgeSemiticsLab/BH_time_collocations/blob/master/data/annotations/annotating_semantics.ipynb)*

# Annotating Time Adverbials with Semantic Classes
## The Semantics of Time and Events
### Cody Kingham
<a href="../../docs/sponsors.md"><img height=200px width=200px align="left" src="../../docs/images/CambridgeU_BW.png"></a>

In [103]:
! echo "last updated:"; date

last updated:
Thu  7 May 2020 12:06:27 BST


## Time adverbials locate events on a timeline

An event, following Croft 2012, refers to an entire aspectual-temporal
expression, with all of the various entities that make up the expression,
including the verb lexeme, morphemes, verb arguments, adverbials, and more. 
Note that the term 'event' is used generically here, rather than to refer to a 
specific kind of aspectual category.

**A time adverbial serves to situate the whole or part of an event
along a metaphorical one-dimensional timeline** (Haspelmath 1997: 23-42). 
This timeline is a metaphorical extension of the spatial dimension (idem).

```
"The baby was born before her great
grandfather died." (Haspelmath 1997: 28)

         RefT: her great-grandfather died
            |
───────────────────────────>
    |
   LSit: the baby was born
```

Here `LSit` (located situation) refers to the event, 
while `RefT` (reference time) refers to the information anchored 
to the time adverbial. The relative positioning of the event and the reference
time is supplied by the preposition "before". 

In the above example, the durational quality of the situation "the baby was
born" is bounded (i.e. not durative). This is due to the interaction of the
various constructions in the sentence, including the verb tense and the semantic
structure associated with the verb lexeme. 

Specifically, an event such as the one in the example expresses two dimensions:
a qualitative (phasal) dimension and a temporal dimension.
Following Croft 2012 (*Verbs: Aspect and Causal Structure*), these two 
dimensions can be captured as spatial metaphors by using a graph:

```
"The door is open."

   │ 
   │
   │     ______
   │    .
q  │    .
   │.....
   │
   └────────────────
           t
```

The x-axis, or *time dimension* (t), is the one-dimensional timeline referred to
by Haspelmath as a metaphorical extension of space. Its domain (input values) is
continuous, so the time dimension can be divided into segments or 
spans of arbitrary size (compare time units like "day", "year", "moment"). 

The y-axis, or *quality dimension* (q), models the phases unique to an event
with points along the axis. Unlike the time dimension, the input values of the
quality dimension must be whole numbers (i.e. not fractional), and the number of
values indicates how many phases an event consists of. Thus, y=1 is the first phase of an event, 
y=2 is the second, and so on, with most event types being summarized with y=1 to 3. 

In the graph above, the horizontal dotted line represents the situation *before* the door was open, and 
the vertical line represents the immediate change in state. On the q-dimension, there are
therefore 2 coordinates (i.e. 'not open' y=1, versus 'open' y=2). The solid horizontal line represents
the state of the door being open. One could also add an additional point on the time dimension
to represent the position of the speaker, which would align with the open state.
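
The two dimensions can also be sketched as a small data structure. This is only an illustration and not part of the project's tooling: the `Phase` class, its field names, and the numeric time values are assumptions made for the example, with phases numbered on q in the order in which they unfold.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Phase:
    """One phase of an event (hypothetical structure for illustration)."""
    q: int                # whole-number position on the quality dimension
    start: float          # continuous position on the time dimension
    end: Optional[float]  # None = the phase extends onward with no stated end
    label: str
    asserted: bool        # True = solid line; False = dotted (implied) line


# "The door is open." as two phases joined by an instantaneous change at t=1.0
door_is_open: List[Phase] = [
    Phase(q=1, start=0.0, end=1.0, label='not open', asserted=False),
    Phase(q=2, start=1.0, end=None, label='open', asserted=True),
]


def phase_count(event: List[Phase]) -> int:
    """Number of distinct points the event occupies on the q-axis."""
    return len({phase.q for phase in event})


print(phase_count(door_is_open))
print([phase.label for phase in door_is_open if phase.asserted])
```

Keeping `q` as an integer and `start`/`end` as floats mirrors the contrast drawn above between the discrete quality dimension and the continuous time dimension.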

### On the "observable" versus "unobservable"

The dotted line on the plot above, the phase before the door was opened, models
information not explicitly found in the sentence. In other cases, as will be seen, 
a given constructional network implies a phase that follows the event and extends 
onwards, likewise without any explicit corresponding element in the sentence. 

These are phases that we, as humans, know intuitively from world knowledge about
how these kinds of events unfold and result. But without any means of validating 
our intuition, we are left with a methodological problem. From the perspective of
linguistics as a science, we stand on one side of a great dividing wall between
the unseen and the seen, where the unseen consists of the concepts, beliefs, practices, 
and customs that lie hidden somewhere in the brain.

<img src="../../docs/figures/schemas/empirical_linguistics.svg" height="600px" width="600px">

But we are not left helpless. As seen in the schema above, cognitive links in the brain
give rise to statistical links amongst constructional patterns. The tricky part is that
patterns and concepts are, of course, not the same thing. Often, intricate combinations
of patterns are woven together in order to pair with a single idea. For instance, in English the formation
of the "perfect" meaning with present tense "have" + past participle shows how several
kinds of patterns (orthographic, lexical, verbal, and syntactic) are together and 
simultaneously linked to one idea. They are, in other words, non-compositional: an emergent
phenomenon.

*to be continued...*

### Merging Croft and Haspelmath's models

**We can combine Croft's spatial models of event structure with Haspelmath's
model of time adverbials quite easily by adding the time adverbial information
below the time dimension.**

```
"The door was open for an hour."

q
│ 
│    
│     ______
│    .       
│    .       
│.....       
│
└──────────────── t
|    |—————|    |  hours      
        |
       RefT: "for an hour"
```

This is an example of Haspelmath's "atelic extent" (1997: 120f). The reference
time of the time adverbial is now added below the time axis, and can be seen
to highlight the span of time during which the door was open.
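
As a rough sketch of this merge, the reference time can be treated as a span on the t-axis and compared against each phase of the event. The `Phase` class, the helper names, and the hour values below are assumptions chosen for illustration only.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Phase:
    """One phase of an event (hypothetical structure for illustration)."""
    q: int
    start: float  # hours, on an arbitrary scale
    end: float
    label: str


def overlap(phase: Phase, span: Tuple[float, float]) -> float:
    """Length of time shared by a phase and a reference-time span."""
    lo = max(phase.start, span[0])
    hi = min(phase.end, span[1])
    return max(0.0, hi - lo)


def highlighted(event: List[Phase], span: Tuple[float, float]) -> List[str]:
    """Labels of the phases that the reference time overlaps."""
    return [phase.label for phase in event if overlap(phase, span) > 0]


# "The door was open for an hour."
door_was_open = [
    Phase(q=1, start=0.0, end=1.0, label='not open'),
    Phase(q=2, start=1.0, end=2.0, label='open'),
]
ref_t = (1.0, 2.0)  # the span contributed by "for an hour"

print(highlighted(door_was_open, ref_t))
```

Only the open state shares any time with the reference span, which matches the diagram: the adverbial measures the state, not the phase that precedes it.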

(What about the final state of the door? This information most likely
needs to be supplied from the context. The shift to past tense here seems to imply
that the door is now closed. But that does not seem required by the semantic context.
It is possible, however, that the past tense and the ending of such a state are statistically
associated, in which case we could say that the construction has a default interpretation
in which the state ends.)

## Aspect is constructional, not verbal

The aspect of an event derives from a whole constructional network, 
rather than a single given construction such as the verb.

## Building Annotations

In this notebook, we aim to develop a method of annotating time adverbials and their
respective events in the Hebrew Bible.
These annotations will form the basis for a multivariate statistical analysis, which 
will seek to uncover the strongest predictors of time adverbial usage in the text.

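The time adverbial dataset of this project consists of >5000 individual instances of
adverbial phrases throughout the Hebrew Bible. It would be prohibitively time-consuming to manually
tag every single one of them. Thus, it is also a goal of this notebook to develop procedures
for generating tentative annotations automatically, which can then be corrected by hand.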

In Haspelmath's survey of time adverbials throughout world languages, he identifies 
several common semantic categories. Modern Hebrew is amongst the languages surveyed. 
Here are the semantic classes for Modern Hebrew with their most common leading prepositions:

    anterior - לפני
    posterior - אחרי
    simultaneous location - ב
    anterior durative - עד
    posterior durative - מן
    atelic extent - ø + quantified NP
    telic extent - ב + quantified NP
    distance future - עוד
    distance past - לפני in sense of "ago"
    distance posterior - זה + quantified NP
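
The list above can be turned into a naive lookup that proposes candidate classes for an adverbial from its leading word. This is a sketch under stated assumptions, not the project's tagger: the table simply transcribes the Modern Hebrew list, the function name and the `quantified` flag are hypothetical, and nothing here is tuned to Biblical Hebrew. Where the list does not say whether a quantified NP is required, the entry is left unrestricted, so ambiguous leaders return more than one candidate.

```python
from typing import List, Optional, Tuple

# (leading word, or None for zero-marking; requires a quantified NP?; class label)
# None in the second slot means the list above does not say either way.
CLASSES: List[Tuple[Optional[str], Optional[bool], str]] = [
    ('לפני', None, 'anterior'),
    ('אחרי', None, 'posterior'),
    ('ב', None, 'simultaneous location'),
    ('עד', None, 'anterior durative'),
    ('מן', None, 'posterior durative'),
    (None, True, 'atelic extent'),
    ('ב', True, 'telic extent'),
    ('עוד', None, 'distance future'),
    ('לפני', None, 'distance past'),
    ('זה', True, 'distance posterior'),
]


def candidate_classes(leader: Optional[str], quantified: bool) -> List[str]:
    """Return every class whose leader matches and whose NP requirement is met."""
    return [
        label
        for lead, needs_quantifier, label in CLASSES
        if lead == leader
        and (needs_quantifier is None or needs_quantifier == quantified)
    ]


print(candidate_classes('ב', quantified=False))
print(candidate_classes('ב', quantified=True))
print(candidate_classes(None, quantified=True))
```

Returning a list rather than a single label keeps the ambiguity visible, so that a human annotator (or a later statistical step) can choose between the candidates.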
    

<hr>

# Python

Now we import the modules and data needed for the analysis.

In [73]:
# standard & data science packages
import collections
import re
import pandas as pd
pd.set_option('display.max_rows', 100) # full option names avoid ambiguous-pattern errors
pd.set_option('display.max_colwidth', 100)
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import rcParams
rcParams['font.serif'] = ['SBL Biblit']
import seaborn as sns
from bidi.algorithm import get_display # bi-directional text support for plotting
from paths import main_table, figs
from IPython.display import HTML, display

# custom packages (see /tools)
from cx_analysis.load import cxs
from tf_tools.load import load_tf
from stats.significance import contingency_table, apply_fishers

# launch Text-Fabric with custom data
TF, API, A = load_tf(silent='deep')
A.displaySetup(condenseType='phrase')
F, E, T, L = A.api.F, A.api.E, A.api.T, A.api.L # corpus analysis methods

# load and set up project dataset
times_full = pd.read_csv(main_table, sep='\t')
times_full.set_index(['node'], inplace=True)
times = times_full[~times_full.classi.str.contains('component')] # select singles



In [3]:
times.head()

node,ref,book,ph_type,text,token,clause,classi,time,time_etcbc,time_pos,...,qual_str,demonstrative,demon_str,demon_dist,ordinal,ord_str,cl_kind,verb,tense,verb_lex
1446800,Gen 1:1,Genesis,prep_ph,בְּרֵאשִׁ֖ית,ב.ראשׁית,בְּרֵאשִׁ֖ית בָּרָ֣א אֱלֹהִ֑ים אֵ֥ת הַשָּׁמַ֖יִם וְאֵ֥ת הָאָֽרֶץ׃,single.prep.bare.øanchor,ראשׁית,R>CJT/,subs,...,,False,,,False,,VC,True,qtl,ברא
1446801,Gen 2:2,Genesis,prep_ph,בַּיֹּ֣ום הַשְּׁבִיעִ֔י,ב.ה.יום.ה.שׁביעי,וַיְכַ֤ל אֱלֹהִים֙ בַּיֹּ֣ום הַשְּׁבִיעִ֔י מְלַאכְתֹּ֖ו,single.prep.definite.def_apposition.ordinal,יום,JWM/,subs,...,,False,,,True,שׁביעי,VC,True,wyqtl,כלה
1446802,Gen 2:2,Genesis,prep_ph,בַּיֹּ֣ום הַשְּׁבִיעִ֔י,ב.ה.יום.ה.שׁביעי,וַיִּשְׁבֹּת֙ בַּיֹּ֣ום הַשְּׁבִיעִ֔י מִכָּל־מְלַאכְתֹּ֖ו,single.prep.definite.def_apposition.ordinal,יום,JWM/,subs,...,,False,,,True,שׁביעי,VC,True,wyqtl,שׁבת
1446803,Gen 2:5,Genesis,prep,טֶ֚רֶם,טרם,וְכֹ֣ל׀ שִׂ֣יחַ הַשָּׂדֶ֗ה טֶ֚רֶם יִֽהְיֶ֣ה בָאָ֔רֶץ,single.bare.øanchor,טרם,VRM/,subs,...,,False,,,False,,VC,True,yqtl,היה
1446804,Gen 2:5,Genesis,prep,טֶ֣רֶם,טרם,וְכָל־עֵ֥שֶׂב הַשָּׂדֶ֖ה טֶ֣רֶם יִצְמָ֑ח,single.bare.øanchor,טרם,VRM/,subs,...,,False,,,False,,VC,True,yqtl,צמח


# Generic Overview

First, let's get re-acquainted with the general makeup of the dataset.

In [4]:
time_surfaces = pd.DataFrame(times['token'].value_counts())
time_surfaces.head(50)

token,count
עוד,344
עתה,340
ב.ה.יום.ה.הוא,201
ה.יום,191
אז,117
ל.עולם,99
ב.ה.בקר,78
כל.ה.יום,76
אחר,67
עד.ה.יום.ה.זה,65


## Generating Automatic Annotations for Biblical Hebrew

We generate automatic annotations to lessen the workload of annotating and to solve 
repetitive tasks at once. These annotations are all tentative, and subject to human
correction and adjustment.

In order to formulate a standard, I want to practice with a few key cases that we've
already seen in the dataset above. Here's a diverse group of common adverbials selected
from the above counts.

    ב.ה.יום.ה.הוא
    ה.יום.ה.זה
    עד.ה.יום.ה.זה
    שׁבע.יום
    
The semantic labels are taken from Croft 2012. The two-dimensional plots from Croft's
method are merged with the one-dimensional timeline of Haspelmath. Where relevant, the
contribution from each construction is indicated by writing it near the graphed lines.

**ב.ה.יום.ה.הוא**

In [5]:
in_that_day = times[times.token == 'ב.ה.יום.ה.הוא']

in_that_day.head()

node,ref,book,ph_type,text,token,clause,classi,time,time_etcbc,time_pos,...,qual_str,demonstrative,demon_str,demon_dist,ordinal,ord_str,cl_kind,verb,tense,verb_lex
1446905,Gen 15:18,Genesis,prep_ph,בַּיֹּ֣ום הַה֗וּא,ב.ה.יום.ה.הוא,בַּיֹּ֣ום הַה֗וּא כָּרַ֧ת יְהוָ֛ה אֶת־אַבְרָ֖ם בְּרִ֣ית,single.prep.definite.def_apposition.demonstrative,יום,JWM/,subs,...,,True,הוא,far,False,,VC,True,qtl,כרת
1446967,Gen 26:32,Genesis,prep_ph,בַּיֹּ֣ום הַה֗וּא,ב.ה.יום.ה.הוא,וַיְהִ֣י׀ בַּיֹּ֣ום הַה֗וּא,single.prep.definite.def_apposition.demonstrative,יום,JWM/,subs,...,,True,הוא,far,False,,VC,True,wyqtl,היה
1447004,Gen 30:35,Genesis,prep_ph,בַּיֹּום֩ הַה֨וּא,ב.ה.יום.ה.הוא,וַיָּ֣סַר בַּיֹּום֩ הַה֨וּא אֶת־הַתְּיָשִׁ֜ים הָֽעֲקֻדִּ֣ים וְהַטְּלֻאִ֗ים וְאֵ֤ת כָּל־הָֽעִזִּי...,single.prep.definite.def_apposition.demonstrative,יום,JWM/,subs,...,,True,הוא,far,False,,VC,True,wyqtl,סור
1447033,Gen 33:16,Genesis,prep_ph,בַּיֹּ֨ום הַה֥וּא,ב.ה.יום.ה.הוא,וַיָּשָׁב֩ בַּיֹּ֨ום הַה֥וּא עֵשָׂ֛ו לְדַרְכֹּ֖ו שֵׂעִֽירָה׃,single.prep.definite.def_apposition.demonstrative,יום,JWM/,subs,...,,True,הוא,far,False,,VC,True,wyqtl,שׁוב
1447108,Gen 48:20,Genesis,prep_ph,בַּיֹּ֣ום הַהוּא֮,ב.ה.יום.ה.הוא,וַיְבָ֨רֲכֵ֜ם בַּיֹּ֣ום הַהוּא֮,single.prep.definite.def_apposition.demonstrative,יום,JWM/,subs,...,,True,הוא,far,False,,VC,True,wyqtl,ברך


In [6]:
in_that_day.iloc[0:1][['ref', 'text', 'clause']]

node,ref,text,clause
1446905,Gen 15:18,בַּיֹּ֣ום הַה֗וּא,בַּיֹּ֣ום הַה֗וּא כָּרַ֧ת יְהוָ֛ה אֶת־אַבְרָ֖ם בְּרִ֣ית


```
בַּיֹּ֣ום הַה֗וּא כָּרַ֧ת יְהוָ֛ה אֶת־אַבְרָ֖ם בְּרִ֣ית

Achievement, irreversible directed 

q
| 
|
|
|         ------>
|        |
|        |  כרת ברית
|.........
|_________________t
/–––––/–––––/–––––/ days
         |
     ביום ההוא
```

This diagram contains an analysis of the situation expressed
in Gen 15:18.

The main constructional network modified by the adverbial is the 
construction כרת "he cut" + direct object.

**ה.יום.ה.זה**



In [7]:
this_day = times[times.token == 'ה.יום.ה.זה']

this_day.head()

node,ref,book,ph_type,text,token,clause,classi,time,time_etcbc,time_pos,...,qual_str,demonstrative,demon_str,demon_dist,ordinal,ord_str,cl_kind,verb,tense,verb_lex
1447763,Deut 2:25,Deuteronomy,attrib_ph,הַיֹּ֣ום הַזֶּ֗ה,ה.יום.ה.זה,הַיֹּ֣ום הַזֶּ֗ה אָחֵל֙,single.definite.def_apposition.demonstrative,יום,JWM/,subs,...,,True,זה,near,False,,VC,True,yqtl,חלל
1447792,Deut 5:24,Deuteronomy,attrib_ph,הַיֹּ֤ום הַזֶּה֙,ה.יום.ה.זה,הַיֹּ֤ום הַזֶּה֙ רָאִ֔ינוּ,single.definite.def_apposition.demonstrative,יום,JWM/,subs,...,,True,זה,near,False,,VC,True,qtl,ראה
1447902,Deut 26:16,Deuteronomy,attrib_ph,הַיֹּ֣ום הַזֶּ֗ה,ה.יום.ה.זה,הַיֹּ֣ום הַזֶּ֗ה יְהוָ֨ה אֱלֹהֶ֜יךָ מְצַוְּךָ֧,single.definite.def_apposition.demonstrative,יום,JWM/,subs,...,,True,זה,near,False,,VC,True,ptcp,צוה
1447908,Deut 27:9,Deuteronomy,attrib_ph,הַיֹּ֤ום הַזֶּה֙,ה.יום.ה.זה,הַיֹּ֤ום הַזֶּה֙ נִהְיֵ֣יתָֽ לְעָ֔ם לַיהוָ֖ה אֱלֹהֶֽיךָ׃,single.definite.def_apposition.demonstrative,יום,JWM/,subs,...,,True,זה,near,False,,VC,True,qtl,היה
1447986,Josh 3:7,Joshua,attrib_ph,הַיֹּ֣ום הַזֶּ֗ה,ה.יום.ה.זה,הַיֹּ֣ום הַזֶּ֗ה אָחֵל֙,single.definite.def_apposition.demonstrative,יום,JWM/,subs,...,,True,זה,near,False,,VC,True,yqtl,חלל


In [8]:
this_day.iloc[0:1][['ref', 'text', 'clause']]

node,ref,text,clause
1447763,Deut 2:25,הַיֹּ֣ום הַזֶּ֗ה,הַיֹּ֣ום הַזֶּ֗ה אָחֵל֙


היום is an interesting case because it is zero-marked. Most zero-marked time adverbials
seem to be durative. Is היום durative or is it punctual?

In [17]:
A.plain(L.u(1447763,'verse')[0], condenseType='verse')


```
הַיֹּ֣ום הַזֶּ֗ה אָחֵל֙ תֵּ֤ת פַּחְדְּךָ֙ וְיִרְאָ֣תְךָ֔ עַל־פְּנֵי֙ הָֽעַמִּ֔ים

Accomplishment, non-incremental

q
|             ... 
|            .  
|        תן  .    
|     /\/\/\/ 
|    | 
|    | אחל
|.....
|_________________t
/–––––/–––––/–––––/ days
      ...... 
        |
     היום הזה
                     
```

In [11]:
this_day.iloc[1:2][['ref', 'text', 'clause']]

node,ref,text,clause
1447792,Deut 5:24,הַיֹּ֤ום הַזֶּה֙,הַיֹּ֤ום הַזֶּה֙ רָאִ֔ינוּ


In [12]:
A.plain(L.u(1447792,'verse')[0], condenseType='verse')


```
הַיֹּ֤ום הַזֶּה֙ רָאִ֔ינוּ כִּֽי־יְדַבֵּ֧ר אֱלֹהִ֛ים אֶת־הָֽאָדָ֖ם וָחָֽי

Activity, undirected

q
|     
|               
|       ראינו   
|     /\/\/\/ 
|    . 
|    . 
|.....
|_________________t
/–––––/–––––/–––––/ days
      ...... 
        |
     היום הזה
                     
```

## Pilot Study: Genesis

I'm going to run a pilot study on time adverbials in the book of Genesis.
This will allow me to practice the annotations, as well as to gather useful
data on patterns.

I hope to come out of the study with further ideas on how to optimize a set of
procedures that can accurately predict the semantic labels of a given construction.
The procedures could also include statistical association data. That would allow me to 
automatically tag all of the single-phrasal adverbials in the dataset, and then manually
correct bad cases. For the whole dataset, I could then measure the success of the automatically 
tagged data, and thus get a sense for how effective the outlined procedures are.
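
One simple way to measure that success is to compare the automatic label of each node with the label left after manual correction. The sketch below assumes both sets of labels are available as plain `{node: label}` dictionaries (for example via `Series.to_dict()`); the function name, the node ids, and the labels are hypothetical placeholders rather than real annotations.

```python
from collections import Counter
from typing import Dict, Tuple


def tagging_report(
    auto: Dict[int, str], manual: Dict[int, str]
) -> Tuple[float, Counter]:
    """Return the share of automatic labels that survived manual review,
    plus a tally of (automatic, corrected) label pairs."""
    shared = auto.keys() & manual.keys()
    if not shared:
        raise ValueError('no nodes in common between the two label sets')
    corrections = Counter(
        (auto[node], manual[node]) for node in shared if auto[node] != manual[node]
    )
    accuracy = 1 - sum(corrections.values()) / len(shared)
    return accuracy, corrections


# placeholder data: four nodes, one of which was corrected by hand
auto_labels = {1: 'class_a', 2: 'class_b', 3: 'class_a', 4: 'class_c'}
manual_labels = {1: 'class_a', 2: 'class_b', 3: 'class_a', 4: 'class_b'}

accuracy, corrections = tagging_report(auto_labels, manual_labels)
print(accuracy)
print(corrections.most_common())
```

The tally of corrected pairs shows which automatic labels are most often wrong, which is the kind of feedback needed to refine the tagging procedures.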

In [24]:
genesis_times = times.loc[times.book == 'Genesis', :]

print(genesis_times.shape[0], 'times in Genesis selected')

# look at what times are contained
pd.DataFrame(genesis_times.token.value_counts()).head(50)

273 times in Genesis selected


token,count
עתה,31
עוד,27
ה.יום,12
ב.ה.בקר,11
אחר,9
אחר.כן,7
ב.ה.יום.ה.הוא,5
אז,5
אחר.ה.דבר.ה.אלה,5
טרם,5


Prepare data for export.

In [36]:
genesis_data = genesis_times.loc[:, ['ref', 'time', 'token', 'clause']].copy() # copy before adding columns

# add columns for manual annotations
genesis_data['TA_type'] = ''
genesis_data['Aspect_main'] = ''
genesis_data['Aspect_second'] = ''

genesis_data

node,ref,time,token,clause,TA_type,Aspect_main,Aspect_second
1446800,Gen 1:1,ראשׁית,ב.ראשׁית,בְּרֵאשִׁ֖ית בָּרָ֣א אֱלֹהִ֑ים אֵ֥ת הַשָּׁמַ֖יִם וְאֵ֥ת הָאָֽרֶץ׃,,,
1446801,Gen 2:2,יום,ב.ה.יום.ה.שׁביעי,וַיְכַ֤ל אֱלֹהִים֙ בַּיֹּ֣ום הַשְּׁבִיעִ֔י מְלַאכְתֹּ֖ו,,,
1446802,Gen 2:2,יום,ב.ה.יום.ה.שׁביעי,וַיִּשְׁבֹּת֙ בַּיֹּ֣ום הַשְּׁבִיעִ֔י מִכָּל־מְלַאכְתֹּ֖ו,,,
1446803,Gen 2:5,טרם,טרם,וְכֹ֣ל׀ שִׂ֣יחַ הַשָּׂדֶ֗ה טֶ֚רֶם יִֽהְיֶ֣ה בָאָ֔רֶץ,,,
1446804,Gen 2:5,טרם,טרם,וְכָל־עֵ֥שֶׂב הַשָּׂדֶ֖ה טֶ֣רֶם יִצְמָ֑ח,,,
...,...,...,...,...,...,...,...
1447116,Gen 50:16,מות,ל.פנה.מות,אָבִ֣יךָ צִוָּ֔ה לִפְנֵ֥י מֹותֹ֖ו,,,
1447117,Gen 50:17,עתה,עתה,וְעַתָּה֙,,,
1447118,Gen 50:20,יום,כ.ה.יום.ה.זה,לְמַ֗עַן עֲשֹׂ֛ה כַּיֹּ֥ום הַזֶּ֖ה,,,
1447119,Gen 50:21,עתה,עתה,וְעַתָּה֙,,,


In [95]:
# gather additional contexts to add to dataset

contexts = {}

for tp in genesis_data.index:
    verse = L.u(tp, 'verse')[0]
    sentence = L.u(tp, 'sentence')[0]
    html_link = A.webLink(tp, _asString=True)
    href = re.search('href="([^"]*)"', html_link).group(1)
    contexts[tp] = {
        'sentence': T.text(sentence),
        'verse': T.text(verse),
        'link': href,
    }
    
new_contexts = pd.DataFrame.from_dict(contexts, orient='index')
new_contexts.index.name = 'node'
new_contexts.head()

node,sentence,verse,link
1446800,בְּרֵאשִׁ֖ית בָּרָ֣א אֱלֹהִ֑ים אֵ֥ת הַשָּׁמַ֖יִם וְאֵ֥ת הָאָֽרֶץ׃,בְּרֵאשִׁ֖ית בָּרָ֣א אֱלֹהִ֑ים אֵ֥ת הַשָּׁמַ֖יִם וְאֵ֥ת הָאָֽרֶץ׃,https://shebanq.ancient-data.org/hebrew/text?book=Genesis&amp;chapter=1&amp;verse=1&amp;version=...
1446801,וַיְכַ֤ל אֱלֹהִים֙ בַּיֹּ֣ום הַשְּׁבִיעִ֔י מְלַאכְתֹּ֖ו אֲשֶׁ֣ר עָשָׂ֑ה,וַיְכַ֤ל אֱלֹהִים֙ בַּיֹּ֣ום הַשְּׁבִיעִ֔י מְלַאכְתֹּ֖ו אֲשֶׁ֣ר עָשָׂ֑ה וַיִּשְׁבֹּת֙ בַּיֹּ֣ום ...,https://shebanq.ancient-data.org/hebrew/text?book=Genesis&amp;chapter=2&amp;verse=2&amp;version=...
1446802,וַיִּשְׁבֹּת֙ בַּיֹּ֣ום הַשְּׁבִיעִ֔י מִכָּל־מְלַאכְתֹּ֖ו אֲשֶׁ֥ר עָשָֽׂה׃,וַיְכַ֤ל אֱלֹהִים֙ בַּיֹּ֣ום הַשְּׁבִיעִ֔י מְלַאכְתֹּ֖ו אֲשֶׁ֣ר עָשָׂ֑ה וַיִּשְׁבֹּת֙ בַּיֹּ֣ום ...,https://shebanq.ancient-data.org/hebrew/text?book=Genesis&amp;chapter=2&amp;verse=2&amp;version=...
1446803,וְכֹ֣ל׀ שִׂ֣יחַ הַשָּׂדֶ֗ה טֶ֚רֶם יִֽהְיֶ֣ה בָאָ֔רֶץ,וְכֹ֣ל׀ שִׂ֣יחַ הַשָּׂדֶ֗ה טֶ֚רֶם יִֽהְיֶ֣ה בָאָ֔רֶץ וְכָל־עֵ֥שֶׂב הַשָּׂדֶ֖ה טֶ֣רֶם יִצְמָ֑ח כּ...,https://shebanq.ancient-data.org/hebrew/text?book=Genesis&amp;chapter=2&amp;verse=5&amp;version=...
1446804,וְכָל־עֵ֥שֶׂב הַשָּׂדֶ֖ה טֶ֣רֶם יִצְמָ֑ח,וְכֹ֣ל׀ שִׂ֣יחַ הַשָּׂדֶ֗ה טֶ֚רֶם יִֽהְיֶ֣ה בָאָ֔רֶץ וְכָל־עֵ֥שֶׂב הַשָּׂדֶ֖ה טֶ֣רֶם יִצְמָ֑ח כּ...,https://shebanq.ancient-data.org/hebrew/text?book=Genesis&amp;chapter=2&amp;verse=5&amp;version=...
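
Note that the extracted links still contain HTML entities (`&amp;`), which may not resolve correctly once pasted into a spreadsheet. A possible remedy, sketched here with the standard library only, is to unescape the captured `href`. The helper name and the sample anchor string are hypothetical; the regular expression is the one used in the cell above.

```python
import html
import re


def extract_href(anchor_html: str) -> str:
    """Pull the href out of an anchor tag and decode HTML entities in it."""
    match = re.search(r'href="([^"]*)"', anchor_html)
    if match is None:
        raise ValueError('no href attribute found')
    return html.unescape(match.group(1))


# made-up anchor string standing in for the webLink output
sample = '<a href="https://example.org/text?book=Genesis&amp;chapter=1&amp;verse=1">Gen 1:1</a>'
print(extract_href(sample))
```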


**Merge context data and export .csv**

In [100]:
genesis_pilot_data = pd.concat([genesis_data, new_contexts], axis=1)

genesis_pilot_data.head(2)

node,ref,time,token,clause,TA_type,Aspect_main,Aspect_second,sentence,verse,link
1446800,Gen 1:1,ראשׁית,ב.ראשׁית,בְּרֵאשִׁ֖ית בָּרָ֣א אֱלֹהִ֑ים אֵ֥ת הַשָּׁמַ֖יִם וְאֵ֥ת הָאָֽרֶץ׃,,,,בְּרֵאשִׁ֖ית בָּרָ֣א אֱלֹהִ֑ים אֵ֥ת הַשָּׁמַ֖יִם וְאֵ֥ת הָאָֽרֶץ׃,בְּרֵאשִׁ֖ית בָּרָ֣א אֱלֹהִ֑ים אֵ֥ת הַשָּׁמַ֖יִם וְאֵ֥ת הָאָֽרֶץ׃,https://shebanq.ancient-data.org/hebrew/text?book=Genesis&amp;chapter=1&amp;verse=1&amp;version=...
1446801,Gen 2:2,יום,ב.ה.יום.ה.שׁביעי,וַיְכַ֤ל אֱלֹהִים֙ בַּיֹּ֣ום הַשְּׁבִיעִ֔י מְלַאכְתֹּ֖ו,,,,וַיְכַ֤ל אֱלֹהִים֙ בַּיֹּ֣ום הַשְּׁבִיעִ֔י מְלַאכְתֹּ֖ו אֲשֶׁ֣ר עָשָׂ֑ה,וַיְכַ֤ל אֱלֹהִים֙ בַּיֹּ֣ום הַשְּׁבִיעִ֔י מְלַאכְתֹּ֖ו אֲשֶׁ֣ר עָשָׂ֑ה וַיִּשְׁבֹּת֙ בַּיֹּ֣ום ...,https://shebanq.ancient-data.org/hebrew/text?book=Genesis&amp;chapter=2&amp;verse=2&amp;version=...


In [102]:
genesis_pilot_data.to_csv('Genesis_pilot/genesis_times.csv', encoding='UTF-16')

Data imported into Google Drive for manual annotation at the following link:
https://docs.google.com/spreadsheets/d/12K623fm6iAWoTcqSwrK0SlahdOLDEC3XdBPMyO5X98M/edit?usp=sharing
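
Once the annotated sheet is downloaded again as a CSV, a quick check for rows that still lack labels might look like the sketch below. It assumes the file keeps the `node`, `TA_type`, and `Aspect_main` column names from the export and that `Aspect_second` may legitimately stay empty; the in-memory sample, its placeholder values, and the helper name are made up for the example.

```python
import csv
import io
from typing import Dict, Iterable, List, Sequence


def unannotated(
    rows: Iterable[Dict[str, str]],
    columns: Sequence[str] = ('TA_type', 'Aspect_main'),
) -> List[str]:
    """Return the node ids of rows where a required annotation is still empty."""
    return [
        row['node']
        for row in rows
        if any(not row[column].strip() for column in columns)
    ]


# stand-in for the exported file: UTF-16 bytes with the same column names
raw = (
    'node,ref,TA_type,Aspect_main,Aspect_second\n'
    '1,ref A,,,\n'
    '2,ref B,label_x,label_y,\n'
).encode('utf-16')

rows = list(csv.DictReader(io.StringIO(raw.decode('utf-16'))))
print(unannotated(rows))
```

For a real file, the same reader can be fed from `open(path, encoding='utf-16', newline='')` in place of the in-memory sample.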