# Assignment 1 – variables, types, booleans, conditionals

**Questions? Drop them in the Slack under #questions**

## Introduction: Operationalizing Linguistic Research

In the humanities we often deal with subjective concepts, things that are not directly measurable. Linguistics, though, sits somewhere on the border between the subjective and the objective. In quantitative linguistics, we try to use objective data to clarify issues once thought to be only subjective. In doing so, we have to learn how to "operationalize" our research questions.

To operationalize means to condense a complex question into a simpler one that can be addressed with empirical data (Stefanowitsch 2010). It's possible that we lose some nuance in the process. But we also enable progress to be made within a specific niche. An accumulation of this progress eventually allows us to re-evaluate our theoretical starting point. 

Consider, for instance, the following two research questions:

> (1) In the world's languages, does the verb serve as the syntactic head of the sentence?

> (2) In the world's languages, how predictive is a verb of the arguments in its sentence?

The first question is a largely subjective one, which linguists continue to disagree on (e.g. Croft, *Radical Construction Grammar*, 2001). There is no way to test that hypothesis with only empirical data. 

The second question, on the other hand, can be answered with some Python, linguistic annotations, and a knowledge of statistics. And even though the answer to that question is simpler, it still has broader implications for more subjective questions.


<img src="../images/research_operationalization.png" height=500px width=500px>

<center><i>From question, to Python, to data, to theories</i></center>

**Further reading**\
[Anatol Stefanowitsch. 2010. "Empirical Cognitive Semantics"](https://pdfs.semanticscholar.org/5237/c136ec1f1c09c42b7fb9ffc7b36a5f217489.pdf#page=364)

<hr>

## Exercise Brief: From Strings to Integers

In the following exercises, we will practise miniature operationalizations, wherein we convert a small linguistic question into Python code that produces counts from which we can form theories. In Pythonic terms, we move from strings to integers.

Besides strings and integers, the following exercises will also test your knowledge of booleans and conditionals.
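As a rough illustration of moving from strings to integers, here is a minimal sketch. It uses a made-up English sample string rather than any of the assignment texts, and the variable names (`sample`, `the_count`, `has_many`, `message`) are hypothetical, chosen only for this example.

```python
# Hypothetical sample string, not one of the assignment texts.
sample = 'the cat sat on the mat'

# str.count answers a string question ("how often does 'the' occur?")
# with an integer.
the_count = sample.count('the')

# A comparison between integers produces a boolean.
has_many = the_count > 1

# A conditional chooses a message based on that boolean.
if has_many:
    message = f'"the" appears {the_count} times'
else:
    message = '"the" appears at most once'

print(message)
```

The same pattern (count, compare, branch) carries over to other texts, although what counts as a meaningful unit to count depends on the linguistic question.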

## Warm-up

Combine these two strings using variables.

In [None]:
'μῆνιν ἄειδε θεὰ Πηληϊάδεω ' 
'Ἀχιλῆος οὐλομένην'

Write code that shows the number of times the letter "e" appears in the text below.

In [34]:
virgil = 'Arma virumque canō, Trōiae quī prīmus ab ōrīs'

Change the code below so that it prints the part of the statement after the colon (:) on a separate line.

In [None]:
print('Fair is foul, and foul is fair: Hover through the fog and filthy air.')

## Exercise 1. Basic Statistics from Strings

Below is an excerpt from the opening lines of *Beowulf*. Examine the various features of the text. For instance, we can see punctuation and newlines. We can also see Old English characters.

Write code that calculates the following statistics from the text:

1. Store the number of sentences and display the answer within a statement (e.g. "The number of sentences is blank"). *hint: think about punctuation*
2. Store the number of lines and display the answer within a statement.
3. Display within a statement how many more lines there are than sentences.
4. What is the [ratio](https://www.mathsisfun.com/numbers/ratio.html) of lines to sentences? Display this in a statement as a decimal.
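As a hint at the general shape of such a calculation, here is a minimal sketch on a made-up two-line sample, not the Beowulf excerpt. It assumes that every sentence ends with a period and that the text does not end with a newline; check whether both assumptions hold for your text. All variable names are hypothetical.

```python
# Hypothetical two-line sample, not the Beowulf excerpt.
sample = '''\
One line. Another sentence
ends here. And a third.\
'''

# Assumption: every sentence ends with a period.
n_sentences = sample.count('.')

# Newlines separate lines, so a text that does not end
# with a newline has one more line than it has newlines.
n_lines = sample.count('\n') + 1

# Dividing two integers with / gives a float (a decimal number).
ratio = n_lines / n_sentences

print(f'The sample has {n_lines} lines and {n_sentences} sentences.')
print(f'The ratio of lines to sentences is {ratio:.2f}.')
```

The `:.2f` in the last f-string rounds the displayed ratio to two decimal places; the stored value of `ratio` is unchanged.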

[Beowulf source](https://www.poetryfoundation.org/poems/43521/beowulf-old-english-version)

In [None]:
'''\
Hwæt. We Gardena in geardagum, 
þeodcyninga, þrym gefrunon, 
hu ða æþelingas ellen fremedon. 
Oft Scyld Scefing sceaþena þreatum, 
monegum mægþum, meodosetla ofteah, 
egsode eorlas. Syððan ærest wearð 
feasceaft funden, he þæs frofre gebad, 
weox under wolcnum, weorðmyndum þah, 
oðþæt him æghwylc þara ymbsittendra 
ofer hronrade hyran scolde, 
gomban gyldan. þæt wæs god cyning. 
ðæm eafera wæs æfter cenned, 
geong in geardum, þone god sende 
folce to frofre; fyrenðearfe ongeat 
þe hie ær drugon aldorlease 
lange hwile. Him þæs liffrea, 
wuldres wealdend, woroldare forgeaf; 
Beowulf wæs breme blæd wide sprang, 
Scyldes eafera Scedelandum in. 
Swa sceal geong guma gode gewyrcean, 
fromum feohgiftum on fæder bearme, 
þæt hine on ylde eft gewunigen 
wilgesiþas, þonne wig cume, 
leode gelæsten; lofdædum sceal 
in mægþa gehwære man geþeon. 
Him ða Scyld gewat to gescæphwile 
felahror feran on frean wære. 
Hi hyne þa ætbæron to brimes faroðe, 
swæse gesiþas, swa he selfa bæd, 
þenden wordum weold wine Scyldinga; 
leof landfruma lange ahte. 
þær æt hyðe stod hringedstefna, 
isig ond utfus, æþelinges fær. 
Aledon þa leofne þeoden, 
beaga bryttan, on bearm scipes, 
mærne be mæste. þær wæs madma fela 
of feorwegum, frætwa, gelæded; 
ne hyrde ic cymlicor ceol gegyrwan 
hildewæpnum ond heaðowædum, 
billum ond byrnum; him on bearme læg 
madma mænigo, þa him mid scoldon 
on flodes æht feor gewitan. 
Nalæs hi hine læssan lacum teodan, 
þeodgestreonum, þon þa dydon 
þe hine æt frumsceafte forð onsendon 
ænne ofer yðe umborwesende. 
þa gyt hie him asetton segen geldenne 
heah ofer heafod, leton holm beran, 
geafon on garsecg; him wæs geomor sefa, 
murnende mod. Men ne cunnon 
secgan to soðe, selerædende, 
hæleð under heofenum, hwa þæm hlæste onfeng.\
'''