### Preparing and Analysing a Sample Text: Elizabeth S. Haldane's (1911) English translation of Descartes' *Meditationes*

**1. Preparation**

In [6]:
import pandas as pd
import re

In [7]:
import spacy

In [8]:
Meditationes = open('Meditationes.txt', encoding="utf8")

In [9]:
sample = Meditationes.read()
Meditationes.close()  # release the file handle once the text is in memory

In [10]:
from spacy.lang.en import English

raw_text = sample
nlp = English()
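# spaCy v3 takes the component name; v2 needed nlp.add_pipe(nlp.create_pipe('sentencizer'))
nlp.add_pipe('sentencizer')
doc = nlp(raw_text)
# Span.string was removed in spaCy v3; Span.text works in both versions
sentences = [sent.text.strip() for sent in doc.sents]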

In [11]:
sentences

['Meditation I.\n\n\n\n Of the things which may be brought within the sphere of the\n\n                          doubtful.',
 'It is now some years since I detected how many were the\nfalse beliefs that I had from my earliest youth admitted as\ntrue, and how doubtful was everything I had since constructed\non this basis; and from that time I was convinced that I must\nonce for all seriously undertake to rid myself of all the\nopinions which I had formerly accepted, and commence to build\nanew from the foundation, if I wanted to establish any firm\nand permanent structure in the sciences.',
 'But as this\nenterprise appeared to be a very great one, I waited until I\nhad attained an age so mature that I could not hope that at\nany later date I should be better fitted to execute my design.',
 'This reason caused me to delay so long that I should feel that\nI was doing wrong were I to occupy in deliberation the time\nthat yet remains to me for action.',
 'To-day, then, since very\nopportun

In [12]:
sentences = [item.replace('\n', " ") for item in sentences]
print(sentences)

['Meditation I.     Of the things which may be brought within the sphere of the                            doubtful.', 'It is now some years since I detected how many were the false beliefs that I had from my earliest youth admitted as true, and how doubtful was everything I had since constructed on this basis; and from that time I was convinced that I must once for all seriously undertake to rid myself of all the opinions which I had formerly accepted, and commence to build anew from the foundation, if I wanted to establish any firm and permanent structure in the sciences.', 'But as this enterprise appeared to be a very great one, I waited until I had attained an age so mature that I could not hope that at any later date I should be better fitted to execute my design.', 'This reason caused me to delay so long that I should feel that I was doing wrong were I to occupy in deliberation the time that yet remains to me for action.', 'To-day, then, since very opportunely for the plan I have
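
The output above still contains long runs of spaces, because each newline is replaced by a single space while the original indentation is kept. A minimal sketch of a stricter clean-up, assuming that collapsing all whitespace is acceptable for the analysis; the helper name `normalise_whitespace` is hypothetical and not part of the notebook:

```python
import re

def normalise_whitespace(text):
    """Collapse every run of whitespace (newlines, tabs, spaces) into one space."""
    return re.sub(r"\s+", " ", text).strip()

# First sentence as returned by the sentencizer in cell In [11]
heading = ("Meditation I.\n\n\n\n Of the things which may be brought within "
           "the sphere of the\n\n                          doubtful.")
cleaned = normalise_whitespace(heading)
print(cleaned)
# Meditation I. Of the things which may be brought within the sphere of the doubtful.
```

Applied to the whole list, this would read `sentences = [normalise_whitespace(item) for item in sentences]`.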

In [13]:
df = pd.DataFrame(sentences) 
df 

Unnamed: 0,0
0,Meditation I. Of the things which may be b...
1,It is now some years since I detected how many...
2,But as this enterprise appeared to be a very g...
3,This reason caused me to delay so long that I ...
4,"To-day, then, since very opportunely for the p..."
...,...
506,But when I perceive things as to which I know ...
507,And I ought in no wise to doubt the truth of s...
508,"For because God is in no wise a deceiver, it f..."
509,But because the exigencies of action often obl...


**2. Analysis**

In [14]:
from spacy.matcher import Matcher

In [15]:
nlp = spacy.load("en_core_web_sm")

doc_sentence = nlp(sample)

In [16]:
matcher = Matcher(nlp.vocab)
pattern = [{"LEMMA":"soul"}]
# spaCy v3 signature; v2 used matcher.add("Soul_PATTERN", None, pattern)
matcher.add("Soul_PATTERN", [pattern])
matches = matcher(doc_sentence)

In [17]:
print("Total matches found:", len(matches))

for match_id, start, end in matches:
    # Slice the Doc the matcher ran on (doc_sentence), not the sentencizer-only doc
    span = doc_sentence[start:end]
    print("Match found:", span.text, span.start_char, span.end_char)

Total matches found: 8
Match found: soul 17278 17282
Match found: soul 17325 17329
Match found: soul 18931 18935
Match found: soul 19792 19796
Match found: soul 119467 119471
Match found: soul 125865 125869
Match found: soul 133974 133978
Match found: soul 135363 135367


In [18]:
df = pd.DataFrame(matches) 
df 

Unnamed: 0,0,1,2
0,14564036968997165607,3841,3842
1,14564036968997165607,3853,3854
2,14564036968997165607,4248,4249
3,14564036968997165607,4458,4459
4,14564036968997165607,26762,26763
5,14564036968997165607,28176,28177
6,14564036968997165607,29975,29976
7,14564036968997165607,30297,30298
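
The columns `0`, `1` and `2` in the table above are the match ID hash, the start token index and the end token index. A small sketch of one way to label them, using a hypothetical `MatchRecord` named tuple and the first two rows of the table as stand-in data:

```python
from collections import namedtuple

# Hypothetical record type; the field names follow the (match_id, start, end)
# tuples that the Matcher returns.
MatchRecord = namedtuple("MatchRecord", ["match_id", "start", "end"])

# First two rows of the table above, standing in for the full `matches` list
matches = [
    (14564036968997165607, 3841, 3842),
    (14564036968997165607, 3853, 3854),
]

records = [MatchRecord(*m) for m in matches]
for record in records:
    print(record.start, record.end)
```

Passing `records` to `pd.DataFrame` should then yield columns named after the fields, since pandas takes column names from named tuples. Storing the result under a new name would also keep the sentence table in `df` from being overwritten.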


In [19]:
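# Sentences come from the sentencizer Doc (doc), while the matches were found in
# doc_sentence; the token indices only line up if both were tokenised identically.
sents = list(doc.sents)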

In [29]:
for sent in sents:
    if matches[0][1] < sent.end:
        print(sent)
        break

 In addition to
this I considered that I was nourished, that I walked, that I
felt, and that I thought, and I referred all these actions to
the soul:  but I did not stop to consider what the soul was,
or if I did stop, I imagined that it was something extremely
rare and subtle like a wind, a flame, or an ether, which was
spread throughout my grosser parts.


In [30]:
for sent in sents:
    if matches[2][1] < sent.end:
        print(sent)
        break

 Let us pass to the
attributes of soul and see if there is any one which is in me?


In [31]:
for sent in sents:
    if matches[3][1] < sent.end:
        print(sent)
        break

 I do
not now admit anything which is not necessarily true:  to
speak accurately I am not more than a thing which thinks, that
is to say a mind or a soul, or an understanding, or a reason,
which are terms whose significance was formerly unknown to me.


In [32]:
for sent in sents:
    if matches[4][1] < sent.end:
        print(sent)
        break

 And
although possibly (or rather certainly, as I shall say in a
moment) I possess a body with which I am very intimately
conjoined, yet because, on the one side, I have a clear and
distinct idea of myself inasmuch as I am only a thinking and
unextended thing, and as, on the other, I possess a distinct
idea of body, inasmuch as it is only an extended and
unthinking thing, it is certain that this I [that is to say,
my soul by which I am what I am], is entirely and absolutely
distinct from my body, and can exist without it.


In [33]:
for sent in sents:
    if matches[5][1] < sent.end:
        print(sent)
        break

 And also from
the fact that amongst these different sense-perceptions some
are very agreeable to me and others disagreeable, it is quite
certain that my body (or rather myself in my entirety,
inasmuch as I am formed of body and soul) may receive
different impressions agreeable and disagreeable from the
other bodies which surround it.


In [34]:
for sent in sents:
    if matches[6][1] < sent.end:
        print(sent)
        break



     But certainly although in regard to the dropsical body it
is only so to speak to apply an extrinsic term when we say
that its nature is corrupted, inasmuch as apart from the need
to drink, the throat is parched; yet in regard to the
composite whole, that is to say, to the mind or soul united to
this body, it is not a purely verbal predicate, but a real
error of nature, for it to have thirst when drinking would be
hurtful to it.


In [35]:
for sent in sents:
    if matches[7][1] < sent.end:
        print(sent)
        break

 But it is quite otherwise with
corporeal or extended objects, for there is not one of these
imaginable by me which my mind cannot easily divide into
parts, and which consequently I do not recognise as being
divisible; this would be sufficient to teach me that the mind
or soul of man is entirely different from the body, if I had
not already learned it from other sources.
