# In-Class Exercise: Exploring Gutenberg Using Python
 This exercise includes both collaborative and independent components. You will work primarily in your own Jupyter notebook, but you will collaborate on investigating a question of your own choosing.


 First, you will need to install some dependencies:

 
 - Install BSD-DB according to the instructions here:
 https://github.com/c-w/Gutenberg

 - Next, we'll use pip to install a library for downloading texts from Project Gutenberg. After opening the appropriate shell for Anaconda, type the following into the terminal:
 
 ```bash
 pip install gutenberg
 ```
 
 - Finally, install TextBlob and the necessary corpora:
 ```bash
 pip install -U textblob
 python -m textblob.download_corpora
 ```
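
Before moving on, it can help to confirm that the packages are visible to the Python interpreter your notebook uses. The following is a minimal sketch, not part of the original exercise; `missing_packages` is a hypothetical helper name, and it only checks that the import names can be found, not that the libraries work correctly.

```python
import importlib.util

def missing_packages(names):
    """Return the names from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]

# "gutenberg" and "textblob" are the import names used later in this notebook.
missing = missing_packages(["gutenberg", "textblob"])
if missing:
    print("Still missing:", ", ".join(missing))
else:
    print("All dependencies found.")
```

If a package is reported as missing, the most likely cause is that pip installed it into a different environment than the one running the notebook.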

In [1]:
# Let's begin by downloading and using the version of Moby Dick published on Project Gutenberg.
from gutenberg.acquire import load_etext
from gutenberg.cleanup import strip_headers
from textblob import TextBlob

text = strip_headers(load_etext(2701)).strip()
blob = TextBlob(text)
# print(text)  # prints 'MOBY DICK; OR THE WHALE\n\nBy Herman Melville ...'
# Save the text to a local .txt file. The in-class/week-6 directory must already exist.
# The with statement closes the file automatically, even if the write fails.
with open('in-class/week-6/mobydick.txt', 'w', encoding="utf-16", newline='\n') as source:
    source.write(text)
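
Because the file is written as UTF-16, it has to be read back with the same encoding. The sketch below shows one way to wrap saving and loading so the two always agree; `save_text` and `load_text` are hypothetical helper names, not functions from the gutenberg library.

```python
from pathlib import Path

def save_text(text, path, encoding="utf-16"):
    """Write `text` to `path`, creating parent directories if needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    # newline="\n" keeps line endings identical on every platform.
    with path.open("w", encoding=encoding, newline="\n") as handle:
        handle.write(text)
    return path

def load_text(path, encoding="utf-16"):
    """Read the file back; the encoding must match the one used to write it."""
    return Path(path).read_text(encoding=encoding)
```

For example, `save_text(text, 'in-class/week-6/mobydick.txt')` followed by `load_text('in-class/week-6/mobydick.txt')` should return the same string that was saved.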

In [2]:
type(text)


In [4]:
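blob.noun_phrases   # Returns a WordList of noun phrases. For comparison, the example in
                    # the TextBlob documentation (a passage about the film "The Blob")
                    # returns WordList(['titular threat', 'blob', 'ultimate movie monster',
                    #                   'amoeba-like mass', ...]); Moby Dick gives a different list.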

In [0]:
for sentence in blob.sentences:
    print(sentence.sentiment.polarity)
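
Printing one polarity score per sentence produces thousands of lines for a novel. A short summary is often easier to discuss. The sketch below assumes you have collected (sentence, polarity) pairs; `summarize_polarity` is a hypothetical helper, and the sample scores are made up for illustration rather than taken from TextBlob.

```python
from statistics import mean

def summarize_polarity(scored_sentences):
    """Summarize (sentence, polarity) pairs; polarity ranges from -1.0 to 1.0."""
    if not scored_sentences:
        raise ValueError("need at least one scored sentence")
    polarities = [score for _, score in scored_sentences]
    return {
        "mean": mean(polarities),
        "most_negative": min(scored_sentences, key=lambda pair: pair[1])[0],
        "most_positive": max(scored_sentences, key=lambda pair: pair[1])[0],
    }

# Hypothetical sample scores, not actual TextBlob output.
sample = [("I love the sea.", 0.5), ("The storm was awful.", -1.0), ("We sailed.", 0.0)]
print(summarize_polarity(sample))
```

With the notebook's `blob`, the pairs could be built as `[(str(s), s.sentiment.polarity) for s in blob.sentences]`.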


In [2]:
from operator import itemgetter

# Print every word with its count, sorted from least to most frequent.
d = blob.word_counts
for key, value in sorted(d.items(), key=itemgetter(1), reverse=False):
    print(key, value)


 10
dream 10
softly 10
serious 10
descending 10
glimpse 10
mostly 10
women 10
shortly 10
japan 10
yesterday 10
vital 10
methinks 10
venerable 10
removed 10
gaining 10
uplifted 10
closed 10
lift 10
smallest 10
depths 10
—i 10
berth 10
until 10
hideous 10
shroud 10
ask 10
gently 10
george 10
mortals 10
woman 10
hints 10
worship 10
priest 10
canoe 10
boldly 10
ignorant 10
block 10
bull 10
backs 10
nantucketers 10
cod 10
impressions 10
vague 10
remaining 10
nearest 10
concern 10
japanese 10
alike 10
century 10
term 10
main-mast 10
quaker 10
experience 10
pious 10
sliding 10
stump 10
accursed 10
stricken 10
murder 10
entitled 10
domestic 10
thrusting 10
finger 10
nights 10
relieved 10
coils 10
shoulder 10
approaching 10
imperial 10
endure 10
valiant 10
divided 10
king-post 10
square 10
headsman 10
quest 10
retained 10
tambourine 10
soil 10
throughout 10
manxman 10
mood 10
woods 10
narwhale 10
grounds 10
host 10
derived 10
detached 10
lip 10
spirits 10
skies 10
blinds 10
murderous 10
mount 1
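
A full listing like the one above is long, and the most frequent words in any English text tend to be function words such as "the" and "of". The sketch below shows one possible way to keep only the top few content words. The stopword set is a tiny illustrative assumption, and `top_words` is a hypothetical helper name.

```python
from collections import Counter

# A tiny illustrative stopword list (an assumption; real lists are much longer).
STOPWORDS = {"the", "of", "and", "a", "to", "in", "that", "his", "it", "i"}

def top_words(word_counts, n=10, stopwords=STOPWORDS):
    """Return the n most frequent (word, count) pairs, skipping stopwords."""
    filtered = Counter({w: c for w, c in word_counts.items() if w not in stopwords})
    return filtered.most_common(n)

# Hypothetical counts for illustration, not the real Moby Dick figures.
sample_counts = {"the": 50, "whale": 12, "sea": 9, "of": 40, "ahab": 7}
print(top_words(sample_counts, n=2))  # [('whale', 12), ('sea', 9)]
```

Since `blob.word_counts` behaves like a dictionary, `top_words(blob.word_counts, n=20)` should work the same way on the full text.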

In [5]:
max_len = 0  # named max_len so the built-in max() is not overwritten
index = 0
# Find the longest sentence in the work, measured in words
for key, sentence in enumerate(blob.sentences):
    if len(sentence.words) > max_len:
        max_len = len(sentence.words)
        index = key
print(blob.sentences[index])

Though in many natural objects, whiteness refiningly enhances beauty,
as if imparting some special virtue of its own, as in marbles,
japonicas, and pearls; and though various nations have in some way
recognised a certain royal preeminence in this hue; even the barbaric,
grand old kings of Pegu placing the title “Lord of the White Elephants”
above all their other magniloquent ascriptions of dominion; and the
modern kings of Siam unfurling the same snow-white quadruped in the
royal standard; and the Hanoverian flag bearing the one figure of a
snow-white charger; and the great Austrian Empire, Cæsarian, heir to
overlording Rome, having for the imperial colour the same imperial hue;
and though this pre-eminence in it applies to the human race itself,
giving the white man ideal mastership over every dusky tribe; and
though, besides, all this, whiteness has been even made significant of
gladness, for among the Romans a white stone marked a joyful day; and
though in other mortal sympathies an
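
The same search can be written more compactly with Python's built-in `max` and a `key` function. This is a sketch on plain strings, counting whitespace-separated words rather than TextBlob tokens, so the counts may differ slightly from the loop above; `longest_sentence` is a hypothetical helper name.

```python
def longest_sentence(sentences):
    """Return (word_count, sentence) for the sentence with the most whitespace-separated words."""
    best = max(sentences, key=lambda s: len(s.split()))
    return len(best.split()), best

sample = ["Call me Ishmael.", "It is not down in any map; true places never are."]
print(longest_sentence(sample))
```

With the notebook's `blob`, the equivalent one-liner would be `max(blob.sentences, key=lambda s: len(s.words))`.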

In [6]:
# Find the longest word in the work
max_len = 0  # named max_len so the built-in max() is not overwritten
index = 0
for key, word in enumerate(blob.words):
    if len(word) > max_len:
        max_len = len(word)
        index = key
print(max_len)
print(blob.words[index])


28
swayings—coyings—flutterings
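
The result above is really three words joined by em-dashes, which the tokenizer treated as a single token. If you want the longest individual word, one option is to split such tokens first. The sketch below assumes em-dashes and double hyphens are the only joiners worth splitting on; `longest_word` is a hypothetical helper name.

```python
import re

def longest_word(tokens):
    """Return the longest piece after splitting tokens on em-dashes and double hyphens."""
    pieces = []
    for token in tokens:
        # "\u2014" is the em-dash; "--" is a common plain-text stand-in for it.
        pieces.extend(p for p in re.split("\u2014|--", token) if p)
    return max(pieces, key=len)

tokens = ["swayings\u2014coyings\u2014flutterings", "whale", "circumnavigation"]
print(longest_word(tokens))  # circumnavigation
```

Calling `longest_word(blob.words)` would apply the same idea to the whole text.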


# Parts of Speech

Another method Montfort described is to use part-of-speech tags to count certain parts of speech. Below is an example that uses a single sentence, but the same approach could be applied to a full manuscript.

In [10]:

pride = TextBlob('''It is a truth universally acknowledged, 
that a single man in possession of a good fortune, must be in 
want of a wife.''')


In [11]:
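def adjs(text_blob):
    """Count the words tagged 'JJ' (adjective) in a TextBlob."""
    count = 0
    for word, tag in text_blob.tags:
        # 'JJ' covers plain adjectives only, not comparatives (JJR) or superlatives (JJS)
        if tag == 'JJ':
            count += 1
    return count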


In [12]:
adjs(pride)
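
To count every part of speech at once rather than one tag at a time, the (word, tag) pairs can be fed to a `Counter`. This is a minimal sketch: the helper names are hypothetical, and the sample pairs are hand-written in Penn Treebank style rather than produced by the tagger.

```python
from collections import Counter

def tag_counts(tagged_words):
    """Count how often each part-of-speech tag appears in (word, tag) pairs."""
    return Counter(tag for _, tag in tagged_words)

def count_adjectives(tagged_words):
    """Count tags starting with 'JJ': JJ, JJR (comparative), JJS (superlative)."""
    return sum(1 for _, tag in tagged_words if tag.startswith("JJ"))

# Hand-written sample pairs (illustrative, not tagger output).
sample = [("a", "DT"), ("single", "JJ"), ("man", "NN"), ("better", "JJR"), ("fortune", "NN")]
print(tag_counts(sample).most_common(1))  # [('NN', 2)]
print(count_adjectives(sample))           # 2
```

With a TextBlob, `tag_counts(pride.tags)` or `tag_counts(blob.tags)` would give the full breakdown for the sentence or the whole novel.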

# Creating Figures
There are many ways to create figures. Below is one example of a table. You can save the figure to a file. 

To save a static image, however, you will first need to install orca using conda:
```
conda install -c plotly plotly-orca
```

In [5]:
import plotly.graph_objects as go

fig = go.Figure(data=[go.Table(header=dict(values=['A Scores', 'B Scores']),
                 cells=dict(values=[[100, 90, 80, 90], [95, 85, 75, 95]]))
                     ])
fig.show()
fig.write_image("in-class/week-6/fig1.png")

ValueError: 
The orca executable is required to export figures as static images,
but it could not be found on the system path.

If you haven't installed orca yet, you can do so using conda as follows:

    $ conda install -c plotly plotly-orca

Alternatively, see other installation methods in the orca project README at
https://github.com/plotly/orca.

After installation is complete, no further configuration should be needed.

If you have installed orca, then for some reason plotly.py was unable to
locate it. In this case, set the `plotly.io.orca.config.executable`
property to the full path of your orca executable. For example:

    >>> plotly.io.orca.config.executable = '/path/to/orca'

After updating this executable property, try the export operation again.
If it is successful then you may want to save this configuration so that it
will be applied automatically in future sessions. You can do this as follows:

    >>> plotly.io.orca.config.save()

If you're still having trouble, feel free to ask for help on the forums at
https://community.plot.ly/c/api/python
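
If orca cannot be installed, one possible workaround is to save the table as a standalone HTML file with `fig.write_html`, which does not rely on the orca executable. The sketch below also shows how word counts could be turned into table columns. `table_columns` and `save_table_html` are hypothetical helper names, and the counts are made up for illustration.

```python
def table_columns(word_counts, n=5):
    """Split the n most frequent words into two parallel columns for a table."""
    ranked = sorted(word_counts.items(), key=lambda pair: pair[1], reverse=True)[:n]
    words = [word for word, _ in ranked]
    counts = [count for _, count in ranked]
    return words, counts

def save_table_html(word_counts, path, n=5):
    """Write the table as an HTML file; this does not need the orca executable."""
    import plotly.graph_objects as go  # imported here so table_columns works without plotly

    words, counts = table_columns(word_counts, n)
    fig = go.Figure(data=[go.Table(header=dict(values=["Word", "Count"]),
                                   cells=dict(values=[words, counts]))])
    fig.write_html(path)

sample_counts = {"whale": 12, "sea": 9, "ahab": 7}  # hypothetical counts
print(table_columns(sample_counts, n=2))  # (['whale', 'sea'], [12, 9])
```

The resulting HTML file opens in any browser, so it can stand in for the PNG until orca is available.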


We will work with other types of figures, graphs, and tables in Lab 2.

To turn in the assignment, follow the instructions in class_notes.ipynb.