# In-Class Exercise: Exploring Gutenberg Using Python
This exercise includes both collaborative and independent components. You will work primarily in your own Jupyter notebook, but you will collaborate on investigating a question of your own choosing.


 First, you will need to install some dependencies:

 
 - Install BSD-DB according to the instructions here:
 https://github.com/c-w/Gutenberg

 - Next, install a library for downloading texts from Project Gutenberg via pip. After selecting the appropriate shell for Anaconda, type the following into the terminal:
 
 ```bash
 pip install gutenberg
 ```
 
 - Finally, install TextBlob and necessary corpora:
 ```bash
 pip install -U textblob
 python -m textblob.download_corpora
 ```

In [2]:
# Let's begin by downloading the version of Beowulf published on Project Gutenberg (etext 16328).
from gutenberg.acquire import load_etext
from gutenberg.cleanup import strip_headers
from textblob import TextBlob

text = strip_headers(load_etext(16328)).strip()
blob = TextBlob(text)
# print(text)  # uncomment to inspect the downloaded text
# Save the text to a local .txt file in this directory.
with open('beowulf.txt', 'w', encoding='utf-16', newline='\n') as source:
    source.write(text)
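
Because the file is written as UTF-16, it has to be read back with the same encoding. The following is a minimal sketch of that round trip. It assumes a temporary directory and a short stand-in string (`sample_text`, a hypothetical name) in place of the downloaded text, so it runs without the `gutenberg` package.

```python
import os
import tempfile

# Stand-in for the downloaded text (an assumption for illustration);
# in the notebook this would be the `text` variable.
sample_text = "A stand-in line with non-ASCII characters: hæreth, unscathèd\nSecond line"

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "beowulf.txt")

    # Write with the same encoding and newline settings as the cell above.
    with open(path, "w", encoding="utf-16", newline="\n") as out_file:
        out_file.write(sample_text)

    # Read back with the matching encoding; a different encoding would
    # either raise an error or return garbled characters.
    with open(path, "r", encoding="utf-16") as in_file:
        restored = in_file.read()

print(restored == sample_text)
```

If you do not need UTF-16 specifically, `encoding="utf-8"` is a common alternative, as long as the same value is used for writing and reading.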

In [7]:
type(text)


str

In [3]:
blob.noun_phrases   # returns a WordList of the noun phrases found in the text

e daughter', 'hæreth', 'b.', 'p.', 'b. xii', 'grdtvg.', 'b.', "suggests 'fácne", 'jul', 'accepting', '_who longest', '= _seldom_', '= _page_', 'scholars regard', "verb meaning '_bend_", 'great scholar', "'_shall kill_", 'oft', 'seldan wære', 'lýtle hwíle bongár búgeð', 'þéah séo brýd duge = _often', 'short time', 'xxx', 'beowulf narrates his adventures to higelac', 'heathobards', 'daneman', 'hard', 'heathobards', 'own dear', 'belovèd companions', 'collar beholdeth', 'ancient ash-warrior', "earlmen 's destruction", 'clearly', 'sadly', 'thane-champion', "'s spirit", 'war-grief', 'word-answer speaketh', 'ingeld', "'art thou", 'thou seest', 'thy father', "'neath visor", 'danemen', 'scyldings', "e'en", "murderer 's progeny", 'exulting', 'ornaments enters', 'boasts', 'thou shouldst', 'till waxeth', "woman 's thane", 'blood-gory sleepeth', 'fated', 'land knoweth', 'ingeld', 'wife-love waxeth', 'heathobards', 'danemen', 'thee [', '] {', 'preliminary statements', 'grendel', 'grendel', 'ornament

In [9]:
for sentence in blob.sentences:
    print(sentence.sentiment.polarity)


0.1
0.0
0.0
0.06818181818181818
0.0
0.02500000000000001
0.0
0.8
0.0
0.0
0.0
0.0
0.0
0.0
0.16
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.35000000000000003
0.0
0.0
0.0
0.0
0.0
-0.2
-0.1
0.0
-0.5
0.0
0.0
0.1
0.05357142857142857
0.0
0.35
0.0
0.2375
0.0
0.0
0.375
0.19999999999999998
0.13333333333333333
0.09375
0.6
0.8
-0.049999999999999996
0.0
0.2
0.16666666666666666
0.85
0.30833333333333335
0.135
0.5
0.15
0.04087301587301586
0.0
0.0
0.0
0.0625
0.125
0.0
0.0
0.0
0.0
0.0
0.0
0.375
0.0
-0.0625
0.0
0.0
0.2333333333333333
0.0
0.8
0.4
0.0
-0.02777777777777779
0.0
-0.04666666666666667
0.0
-0.1
0.0
0.0
-0.1
0.2
0.050000000000000044
0.0
0.0
-0.6
0.0
0.0
0.05
0.30000000000000004
0.16
0.0
0.0
0.1875
0.0
-0.051851851851851864
0.3
0.0
-0.2
0.10000000000000003
0.0
0.0
0.6
0.3
0.6
0.0
0.0
-0.375
0.0
0.8
0.39166666666666666
0.38333333333333336
0.0
0.3125
0.14404761904761906
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
-0.125
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.3
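
A long column of scores is hard to read, so it can help to summarize them. The sketch below is one possible approach, not part of the original exercise: it uses the first eight polarity values from the output above as a small sample, and `summarize` is a hypothetical helper name.

```python
from statistics import mean

# First eight polarity scores from the output above, used as a small sample.
polarities = [0.1, 0.0, 0.0, 0.06818181818181818, 0.0, 0.02500000000000001, 0.0, 0.8]


def summarize(scores):
    """Count positive, negative, and neutral scores and compute their mean."""
    return {
        "positive": sum(1 for s in scores if s > 0),
        "negative": sum(1 for s in scores if s < 0),
        "neutral": sum(1 for s in scores if s == 0),
        "mean": mean(scores),
    }


summary = summarize(polarities)
print(summary)
```

In the notebook, the same helper could be applied to the whole text with `summarize([s.sentiment.polarity for s in blob.sentences])`.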

In [10]:
from operator import itemgetter  

d = blob.word_counts
for key, value in sorted(d.items(), key = itemgetter(1), reverse = True):
    print(key, value)


 1
passes 1
narrow 1
paths 1
unfrequented 1
abrupt 1
nicker-haunts 1
surroundings 1
unawares 1
woods 1
hoar-stones 1
holt-wood 1
irksome 1
unlittle 1
horn 1
sea-dragons 1
serpent 1
mere-dragons 1
noonday 1
wild-beasts 1
wormkind 1
hot-mooded 1
clamor 1
war-trumpet 1
winding 1
bowstring 1
sea-struggle 1
war-missile 1
beast 1
poor 1
sword-pointed 1
boar-spears 1
pulled 1
loath-fashioned 1
donned 1
ample 1
hand-woven 1
explore 1
mix 1
treasure-emblazoned 1
weapon-smith 1
wondrously 1
swine-bodies 1
thenceforward 1
helpers 1
blotted 1
hardened 1
battle-field 1
destined 1
sword-hero 1
endanger 1
recall 1
lending 1
stead 1
protect 1
perceive 1
gem-giver 1
over-measure 1
heavy-sword 1
grim-death 1
fray 1
wave-current 1
day's-length 1
elapsed 1
52 1
domains 1
exploring 1
sooner 1
unscathèd 1
out-guarded 1
limb-mail 1
locked 1
loath-grabbing 1
bottomward 1
grabs 1
mere-beast 1
flood-beasts 1
fierce-biting 1
tusks 1
pursued 1
harmed 1
roofed-hall 1
prevented 1
brightness 1
a-gleaming 1
fire-ligh
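
Printing every word produces very long output. A minimal sketch of one way to show only the most frequent words, using `collections.Counter` from the standard library, follows. The counts in `sample_counts` are made up for illustration, and `top_words` is a hypothetical helper name.

```python
from collections import Counter

# Made-up counts standing in for blob.word_counts (a word -> count mapping).
sample_counts = {"the": 12, "of": 7, "hero": 3, "serpent": 1, "horn": 1}


def top_words(counts, n=10):
    """Return the n most frequent (word, count) pairs, highest count first."""
    return Counter(counts).most_common(n)


for word, count in top_words(sample_counts, n=3):
    print(word, count)
```

In the notebook, `top_words(blob.word_counts, n=20)` would list the twenty most frequent words instead of the entire vocabulary.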

In [8]:
# Find the longest sentence in the work
max_len = 0   # named max_len to avoid shadowing the built-in max()
index = 0
for key, sentence in enumerate(blob.sentences):
    if len(sentence.words) > max_len:
        max_len = len(sentence.words)
        index = key

print(max_len)
print(blob.sentences[index])


142
Soothly this hindered Heming's kinsman;
       55 Other ale-drinking earlmen asserted
          That fearful folk-sorrows fewer she wrought them,
          Treacherous doings, since first she was given
          Adorned with gold to the war-hero youthful,
          For her origin honored, when Offa's great palace
       60 O'er the fallow flood by her father's instructions
          She sought on her journey, where she afterwards fully,
          Famed for her virtue, her fate on the king's-seat
[67]      Enjoyed in her lifetime, love did she hold with
          The ruler of heroes, the best, it is told me,
       65 Of all of the earthmen that oceans encompass,
          Of earl-kindreds endless; hence Offa was famous
          Far and widely, by gifts and by battles,
          Spear-valiant hero; the home of his fathers
          He governed with wisdom, whence Eomær did issue
       70 For help unto heroes, Heming's kinsman,
          Grandson of Garmund, great in encounters.


In [13]:
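# Find the longest word in the work
max_len = 0   # named max_len to avoid shadowing the built-in max()
index = 0     # reset here so this cell does not depend on the previous one
for key, word in enumerate(blob.words):
    if len(word) > max_len:
        max_len = len(word)
        index = key
print(max_len)
print(blob.words[index])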


21
swords-for-the-battle
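
The same search can be written more compactly with Python's built-in `max()` and a `key` function. This is a sketch using a short stand-in list of words taken from the outputs above; in the notebook the list would be `blob.words`.

```python
# Stand-in word list (words taken from the outputs above);
# in the notebook this would be blob.words.
words = ["sword-hero", "swords-for-the-battle", "mere-dragons", "horn"]

# max() with key=len returns the first item of greatest length,
# which matches the behavior of the loop with a strict > comparison.
longest_word = max(words, key=len)

print(len(longest_word), longest_word)
```

For sentences, the equivalent call would be `max(blob.sentences, key=lambda s: len(s.words))`.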


# Parts of Speech

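Another method Montfort described is to use part-of-speech tags to count particular parts of speech. The cells below define a single example sentence (`pride`) and a function that counts adjectives. The function is then applied to the full text in `blob`, but it works equally well on a single sentence such as `pride`.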

In [15]:

pride = TextBlob('''It is a truth universally acknowledged, 
that a single man in possession of a good fortune, must be in 
want of a wife.''')


In [24]:
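def adjs(blob):
    """Count the words tagged 'JJ' (adjective) in a TextBlob."""
    count = 0
    for (word, tag) in blob.tags:
        # 'JJ' only; comparatives (JJR) and superlatives (JJS) are not counted
        if tag == 'JJ':
            count = count + 1
    return count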


In [25]:
adjs(blob)

2685
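
The same idea extends to counting every tag at once. The sketch below assumes a hand-written list of `(word, tag)` pairs in the shape that `blob.tags` returns. The tags are illustrative rather than actual tagger output, and `tag_counts` and `count_adjectives` are hypothetical helper names.

```python
from collections import Counter

# Hand-written (word, tag) pairs in the shape blob.tags returns.
# These are illustrative only, not actual tagger output.
sample_tags = [
    ("a", "DT"), ("single", "JJ"), ("man", "NN"),
    ("good", "JJ"), ("fortune", "NN"), ("better", "JJR"),
]


def tag_counts(tags):
    """Count how often each part-of-speech tag occurs."""
    return Counter(tag for _, tag in tags)


def count_adjectives(tags):
    """Count all adjective tags: JJ, JJR (comparative), and JJS (superlative)."""
    return sum(1 for _, tag in tags if tag.startswith("JJ"))


print(tag_counts(sample_tags))
print(count_adjectives(sample_tags))
```

In the notebook, both helpers could be called with `blob.tags` or `pride.tags`.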

# Creating Figures
There are many ways to create figures. Below is one example, a table, which you can also save to a file.

The `plotly` package must be installed first; otherwise the import below fails with `ModuleNotFoundError`. To write a static image, you will also need to install orca using conda:
```
conda install -c plotly plotly-orca
```

In [21]:
import plotly.graph_objects as go

fig = go.Figure(data=[go.Table(header=dict(values=['A Scores', 'B Scores']),
                 cells=dict(values=[[100, 90, 80, 90], [95, 85, 75, 95]]))
                     ])
fig.show()
fig.write_image("fig1.png")

ModuleNotFoundError: No module named 'plotly'
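
The table above uses placeholder scores. The sketch below shows one way the earlier word counts could be arranged for such a table: `go.Table` takes its cell values as a list of columns, so `(word, count)` rows need to be transposed first. The helper names `rows_to_columns` and `make_table_figure` are hypothetical, and `make_table_figure` is only defined here, not called, because it requires `plotly` to be installed.

```python
# A few (word, count) rows taken from the word-count output above.
rows = [("horn", 1), ("serpent", 1), ("beast", 1)]


def rows_to_columns(rows):
    """Transpose a list of rows into a list of columns."""
    return [list(column) for column in zip(*rows)]


def make_table_figure(rows, headers=("Word", "Count")):
    """Build a plotly table figure from rows (requires plotly)."""
    import plotly.graph_objects as go  # imported lazily; needs plotly installed

    return go.Figure(data=[go.Table(
        header=dict(values=list(headers)),
        cells=dict(values=rows_to_columns(rows)),
    )])


columns = rows_to_columns(rows)
print(columns)
```

With `plotly` available, `make_table_figure(rows).show()` would display the table in the notebook.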

We will work with other types of figures, graphs, and tables in Lab 2.

To turn in the assignment, follow the instructions in `class_notes.ipynb`.