What if we want to understand the context of a word? Can we get the words that surround a word in a text and do something with them?

```python
# for each occurrence of target_word in text:
#     get the values of the X words before target_word
#     get the values of the X words after target_word
#     sum the values (?)
#     return the sum for each occurrence of target_word
```

I am going to re-use some of the code from Gina's `word.py` script.

When I am working in a Jupyter notebook, I tend to move my imports to the top of the notebook so that, should I export it to a standalone script, the imports are already where they should be:

In [1]:
import re
from nltk.corpus import stopwords

In [2]:
# =-=-=-=-=-=-=-=-=-=-=
# INPUTS
# =-=-=-=-=-=-=-=-=-=-= 
filename = "texts/mdg.txt"
target = "game"
window = 5
stop_words = set(stopwords.words('english')) 

# For added stop words:
# newStopWords = ['stopWord1','stopWord2'] # This could be a file
# stop_words.update(newStopWords) # stop_words is a set, so update() rather than extend()

In [3]:
# Load the file 
with open(filename, 'r') as myfile:
    text = myfile.read()
# convert it to a list of words 
normalized_words = re.sub("[^a-zA-Z']"," ", text).lower().split()
# drop unwanted words
words = [word for word in normalized_words if word not in stop_words]

# Test how things are going:
print(words[0:40])

['right', 'somewhere', 'large', 'island', 'said', 'whitney', 'rather', 'mystery', 'island', 'rainsford', 'asked', 'old', 'charts', 'call', "'ship", 'trap', 'island', "'", 'whitney', 'replied', 'suggestive', 'name', 'sailors', 'curious', 'dread', 'place', 'know', 'superstition', "can't", 'see', 'remarked', 'rainsford', 'trying', 'peer', 'dank', 'tropical', 'night', 'palpable', 'pressed', 'thick']
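
The output above includes tokens such as `'ship` and a bare `'`, because the regular expression keeps apostrophes so that contractions like `can't` survive. One possible refinement, sketched here on a made-up sentence rather than on the story text, is to trim apostrophes from the ends of each token and drop anything left empty. The `sample` string and the `tokenize` helper are illustrative names of my own, not part of the original script.

```python
import re

def tokenize(text):
    """Lowercase, keep letters and apostrophes, then trim stray quote marks."""
    raw = re.sub(r"[^a-zA-Z']", " ", text).lower().split()
    # strip("'") removes apostrophes only at the ends, so "can't" is untouched
    trimmed = [token.strip("'") for token in raw]
    # a token that was nothing but apostrophes is now empty, so drop it
    return [token for token in trimmed if token]

# Made-up sample sentence (an assumption, not a quotation from the text)
sample = "Old charts call it 'Ship-Trap Island,' Whitney replied. I can't see it."
print(tokenize(sample))
```

Whether this is worth doing depends on the text: a possessive such as `sailors'` would also lose its trailing apostrophe.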


The `enumerate` function is really useful: it takes a list and yields pairs, each made up of a position number and the item at that position:

In [11]:
string = "This is a sentence."
# Use a separate name here so the story's `words` list is not overwritten
sample_words = re.sub("[^a-zA-Z']"," ", string).lower().split()
numbered = enumerate(sample_words)
for i, j in numbered:
    print(i, j)

0 this
1 is
2 a
3 sentence


Note that you cannot simply `print(numbered)`, as this will only display something like `<enumerate object at 0x12105fe>`. You will encounter this in a number of places in Python. It simply means you have created an object that produces its items on demand rather than holding them all at once: to see the contents you need to iterate over it. Whenever you hear the word *iterate* you should almost always think `for`, and that is what we have done above. Since we know that `enumerate` produces pairs, we give the `for` loop a pair of variables. Here I've used `i` and `j`, but you could use anything that makes sense and helps make your code more readable and usable.
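
If you just want to look at what an `enumerate` object holds, wrapping it in `list()` is a quick alternative to a loop. The short sketch below uses a hand-typed `sample_words` list (an illustrative name) and also shows that the object can only be walked through once.

```python
sample_words = ["this", "is", "a", "sentence"]

numbered = enumerate(sample_words)
pairs = list(numbered)      # consume the iterator into a list of tuples
print(pairs)                # [(0, 'this'), (1, 'is'), (2, 'a'), (3, 'sentence')]

leftover = list(numbered)   # the same object is now exhausted
print(leftover)             # []

# start=1 numbers the items from 1 instead of 0
for position, word in enumerate(sample_words, start=1):
    print(position, word)
```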

In [4]:
targetIndices = [i for i, x in enumerate(words) if x == target]

# Let's see those locations
print(targetIndices)

# Now let's make sure that the word associated with those locations is ours:
print([words[i] for i in targetIndices])

# To get the single line above, I first wrote this:
# for i in targetIndices:
#     print (words[i])

[110, 1360, 1377, 1387, 1392, 1454, 1550, 2165, 2196, 2212, 2664, 3931, 3993]
['game', 'game', 'game', 'game', 'game', 'game', 'game', 'game', 'game', 'game', 'game', 'game', 'game']


In [7]:
before = []
after = []
for index, word in enumerate(words):
    for i in targetIndices:
        if index >= (i - window) and index < i:
            before.append(word)
        if index > i and index <= (i + window):
            after.append(word)

In [8]:
print(before)

['rot', 'whitney', 'said', 'rainsford', 'big', 'ord', 'cape', 'buffalo', 'dangerous', 'big', 'sir', 'cape', 'buffalo', 'dangerous', 'big', 'said', 'slow', 'tone', 'hunt', 'dangerous', 'game', 'rainsford', 'expressed', 'surprise', 'big', 'said', 'general', 'shall', 'glad', 'society', 'always', 'hunt', 'hunted', 'every', 'kind', 'rainsford', 'effort', 'held', 'tongue', 'check', 'eludes', 'three', 'whole', 'days', 'wins', 'give', 'option', 'course', 'need', 'play', 'glass', 'rainsford', 'sat', 'staring', 'find', 'quarry', 'escaped', 'course', 'american', 'played', 'sucked', 'breath', 'smiled', 'congratulate', 'said']
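
Because `before` is one flat list, the boundaries between occurrences are lost, and when two occurrences sit close together the target word itself turns up as context (note the `game` in the output above). A possible alternative, sketched here on a made-up sentence, is to slice the word list once per occurrence and keep each pair of windows together. The `context_windows` function and the `toy` sentence are hypothetical names used only for this illustration.

```python
def context_windows(words, target, window):
    """Return one (before, after) pair of word lists per occurrence of target."""
    results = []
    for i, word in enumerate(words):
        if word == target:
            before = words[max(0, i - window):i]   # max() guards against a negative start
            after = words[i + 1:i + 1 + window]    # slicing past the end is safe
            results.append((before, after))
    return results

# Made-up sentence standing in for the story text
toy = "the hunter said the big game is the most dangerous game of all".split()
for before, after in context_windows(toy, "game", 3):
    print(" ".join(before), "[game]", " ".join(after))
```

Slicing also avoids comparing every word against every target index, which may matter on longer texts.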


I think what I want below is a function that will take an index for a word and then return the words before and the words after: 

In [9]:
def windoWord(words, index, window):
    # Words in the `window` positions before `index`; max() keeps the start from going negative
    before = words[max(0, index - window):index]
    # Words in the `window` positions after `index`
    after = words[index + 1:index + 1 + window]
    return before, after

In [None]:
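# Print the context one occurrence at a time, `window` words per line.
# This assumes every occurrence contributed a full window to the list.
for i in range(len(targetIndices)):
    if i * window < len(before):
        print(" ".join(before[i*window:(i+1)*window]) + "\n")

for i in range(len(targetIndices)):
    if i * window < len(after):
        print(" ".join(after[i*window:(i+1)*window]) + "\n")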
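
The outline at the top ends with "sum the values (?)" without settling what a value is. One possible reading, and it is only an assumption on my part, is to count how often each word appears in the windows around the target. The sketch below does that with `collections.Counter`, using short made-up lists in place of the real `before` and `after`.

```python
from collections import Counter

# Stand-ins for the `before` and `after` lists built above (made-up values)
before = ["cape", "buffalo", "dangerous", "big", "hunt", "dangerous"]
after = ["rainsford", "said", "hunt", "general", "said"]

# Adding two Counters sums the counts word by word
context_counts = Counter(before) + Counter(after)

# most_common(n) returns the n words with the highest counts
for word, count in context_counts.most_common(3):
    print(word, count)
```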