<a href="https://colab.research.google.com/github/sh-mukherjee/love-or-money-word-count/blob/main/Love_Money_Word_Count.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Love or Money? 
## I analyse five classic murder mysteries written in the early 1920s by crime writers who would later go on to join [The Detection Club](https://www.fantasticfiction.com/c/detection-club/). What is more prominent in these mysteries -- love or money?

The novels mentioned here are in the public domain and the texts are taken from [Project Gutenberg](https://www.gutenberg.org/).

* '[The Mysterious Affair at Styles](http://www.gutenberg.org/ebooks/863)' (1920) by [Agatha Christie](https://www.agathachristie.com/)

* '[Whose Body?](http://www.gutenberg.org/ebooks/58820)' (1923) by [Dorothy L. Sayers](https://www.sayers.org.uk/)

* '[The Cask](http://www.gutenberg.org/ebooks/59854)' (1920) by [Freeman Wills Crofts](https://www.fantasticfiction.com/c/freeman-wills-crofts/)

* '[The Red House Mystery](http://www.gutenberg.org/ebooks/1872)' (1922) by [A. A. Milne](https://www.britannica.com/biography/A-A-Milne)

* '[The Bittermeads Mystery](http://www.gutenberg.org/ebooks/1888)' (1922) by [E. R. Punshon](https://www.fantasticfiction.com/p/e-r-punshon/)
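Project Gutenberg files usually wrap the novel in a licence header and footer, and that boilerplate may itself contain the words being counted. The sketch below shows one way to trim it before counting. It is an illustration under assumptions, not a step taken in this notebook: `strip_gutenberg_boilerplate` is a hypothetical helper, and the marker format it looks for is assumed to match the files used here.

```python
import re

# Assumed marker lines, e.g. "*** START OF THE PROJECT GUTENBERG EBOOK ... ***"
_START = re.compile(r"^\*\*\* ?START OF (?:THE|THIS) PROJECT GUTENBERG EBOOK.*$",
                    re.MULTILINE | re.IGNORECASE)
_END = re.compile(r"^\*\*\* ?END OF (?:THE|THIS) PROJECT GUTENBERG EBOOK.*$",
                  re.MULTILINE | re.IGNORECASE)

def strip_gutenberg_boilerplate(text):
    """Return the text between the START and END markers.

    If either marker is missing, the text is returned unchanged.
    """
    start = _START.search(text)
    end = _END.search(text)
    if start is None or end is None or end.start() < start.end():
        return text
    return text[start.end():end.start()].strip()

sample = (
    "Licence header mentioning money\n"
    "*** START OF THE PROJECT GUTENBERG EBOOK EXAMPLE ***\n"
    "The story itself.\n"
    "*** END OF THE PROJECT GUTENBERG EBOOK EXAMPLE ***\n"
    "Licence footer mentioning money\n"
)
print(strip_gutenberg_boilerplate(sample))
```

If the marker lines are absent from a file, the helper leaves the text untouched, so it is safe to apply to every text before counting.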


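I've used a rather simplistic method here -- a count of the number of times the words 'love' and 'money' appear in the texts of these novels. The count is based on each word's lemma, so inflected forms such as 'loves' and 'loved' are generally included as well. The results are illuminating nonetheless!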

To do the word count I've used the [spaCy](https://spacy.io/) library for natural language processing in Python.
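As a rough cross-check that does not need spaCy, a similar count can be approximated with a regular expression over a hand-written list of inflected forms. This is only a sketch, not the method used in this notebook: the helper name `approx_count` and the word lists are assumptions made for illustration, and the result may differ from the lemma-based count because no part-of-speech analysis is involved.

```python
import re

# Hypothetical, hand-picked inflections; spaCy's lemmatiser may group words differently
LOVE_FORMS = ("love", "loves", "loved", "loving")
MONEY_FORMS = ("money", "moneys", "monies")

def approx_count(text, forms):
    """Count whole-word, case-insensitive matches of any of the given forms."""
    pattern = r"\b(?:" + "|".join(re.escape(f) for f in forms) + r")\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))

sample = "She loved him, but he loved money. Love is not a glove."
print(approx_count(sample, LOVE_FORMS))   # whole words only, so 'glove' is not matched
print(approx_count(sample, MONEY_FORMS))
```

A large gap between this approximation and the spaCy count for the same text would be a hint to look more closely at how individual tokens are being lemmatised.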



In [11]:
#!pip install spacy     NOTE: I've commented this out since I built this notebook on Google Colab, which already has spaCy installed

In [7]:
import spacy
import requests
import pandas as pd
import plotly.express as px

nlp = spacy.load("en_core_web_sm")

# Define a function that counts the number of times the word 'love' appears in a text
def word_count_love(string):
    words_countedl = 0
    my_stringl = nlp(string)

    for token in my_stringl:
        # compare the token's lemma, so that inflected forms of the word are counted too
        if token.lemma_ == 'love':
            words_countedl += 1
    return words_countedl

# Define a function that counts the number of times the word 'money' appears in a text
def word_count_money(string):
    words_countedm = 0
    my_stringm = nlp(string)

    for token in my_stringm:
        # compare the token's lemma, so that inflected forms of the word are counted too
        if token.lemma_ == 'money':
            words_countedm += 1
    return words_countedm

In [2]:
# Choose the colours to use in the graphs
colors = ['crimson','gold']

## 'The Mysterious Affair at Styles' (1920) by Agatha Christie
This is Christie's first published novel, and it introduces her most famous creation, Hercule Poirot.

A wealthy woman who had assisted Poirot and some of his fellow refugees from Belgium is found dead. Is it a case of taking the wrong medication, or is there something more sinister going on? Poirot investigates the matter with the help of his friend Captain Hastings.

In [13]:
# Read in the text of 'The Mysterious Affair at Styles' by Agatha Christie
url_ac = 'https://raw.githubusercontent.com/sh-mukherjee/love-or-money-word-count/main/styles.txt'
text_ac = requests.get(url_ac).text

# Create a dataframe with the counts of the words 'Love' and 'Money'
data_ac = [['Love', word_count_love(text_ac)], ['Money', word_count_money(text_ac)]]
df_ac = pd.DataFrame(data_ac, columns = ['Word', 'Count'])

# Plot this dataframe using Plotly Express
fig_ac=px.bar(df_ac,x=df_ac.Word,y=df_ac.Count,title="'The Mysterious Affair at Styles' (1920) by Agatha Christie")
fig_ac.update_traces(marker_color=colors)
fig_ac.show()

## 'Whose Body?' (1923) by Dorothy L. Sayers

This is the first mystery novel by Dorothy L. Sayers, and it introduces her most famous creation, Lord Peter Wimsey.

An architect discovers a stranger's dead body in his bathtub. Around the same time a well-known financier disappears without a trace. Could the two incidents be connected?

In [14]:
# Read in the text of 'Whose Body?' by Dorothy L. Sayers
url_ds = 'https://raw.githubusercontent.com/sh-mukherjee/love-or-money-word-count/main/whosebody.txt'
text_ds = requests.get(url_ds).text

# Create a dataframe with the counts of the words 'Love' and 'Money'
data_ds = [['Love', word_count_love(text_ds)], ['Money', word_count_money(text_ds)]]
df_ds = pd.DataFrame(data_ds, columns = ['Word', 'Count'])

# Plot the dataframe using Plotly Express
fig_ds=px.bar(df_ds,x=df_ds.Word,y=df_ds.Count,title="'Whose Body?' (1923) by Dorothy L. Sayers")
fig_ds.update_traces(marker_color=colors)
fig_ds.show()

## 'The Cask' (1920) by Freeman Wills Crofts
This is one of the first novels by Crofts. It does not feature his series police detective Inspector French, but it is quite fun all the same.

A cask of French wine arrives at a London dockyard, but it turns out to contain a corpse. Who is the victim, and how did the body end up in the cask?


In [15]:
# Read in the text of 'The Cask' by Freeman Wills Crofts
url_fwc = 'https://raw.githubusercontent.com/sh-mukherjee/love-or-money-word-count/main/thecask.txt'
text_fwc = requests.get(url_fwc).text

# Create a dataframe with the counts of the words 'Love' and 'Money'
data_fwc = [['Love', word_count_love(text_fwc)], ['Money', word_count_money(text_fwc)]]
df_fwc = pd.DataFrame(data_fwc, columns = ['Word', 'Count'])

# Plot the dataframe using Plotly Express
fig_fwc=px.bar(df_fwc,x=df_fwc.Word,y=df_fwc.Count,title="'The Cask' (1920) by Freeman Wills Crofts")
fig_fwc.update_traces(marker_color=colors)
fig_fwc.show()

## 'The Red House Mystery' (1922) by A. A. Milne
This is the one and only mystery novel by the author better known for his Winnie-the-Pooh stories. It is quite enjoyable, with an easy camaraderie between the main investigator and his friend.

In [17]:
# Read in the text of 'The Red House Mystery' by A. A. Milne
url_aam = 'https://raw.githubusercontent.com/sh-mukherjee/love-or-money-word-count/main/redhouse.txt'
text_aam = requests.get(url_aam).text

# Create a dataframe with the counts of the words 'Love' and 'Money'
data_aam = [['Love', word_count_love(text_aam)], ['Money', word_count_money(text_aam)]]
df_aam = pd.DataFrame(data_aam, columns = ['Word', 'Count'])

# Plot the dataframe using Plotly Express
fig_aam=px.bar(df_aam,x=df_aam.Word,y=df_aam.Count,title="'The Red House Mystery' (1922) by A. A. Milne")
fig_aam.update_traces(marker_color=colors)
fig_aam.show()

## 'The Bittermeads Mystery' (1922) by E. R. Punshon
E. R. Punshon is not well remembered now, but he was a very active crime fiction writer and published a series featuring his character Inspector Bobby Owen.
This novel is among his earliest and is a stand-alone book.

A somewhat unsavoury man appears in disguise at a house and frightens a woman living there. He means to find out what happened to his friend who had disappeared, and soon he makes a terrible discovery.


In [18]:
# Read in the text of 'The Bittermeads Mystery' by E. R. Punshon
url_erp = 'https://raw.githubusercontent.com/sh-mukherjee/love-or-money-word-count/main/bittermeads.txt'
text_erp = requests.get(url_erp).text

# Create a dataframe with the counts of the words 'Love' and 'Money'
data_erp = [['Love', word_count_love(text_erp)], ['Money', word_count_money(text_erp)]]
df_erp = pd.DataFrame(data_erp, columns = ['Word', 'Count'])

# Plot the dataframe using Plotly Express
fig_erp=px.bar(df_erp,x=df_erp.Word,y=df_erp.Count,title="'The Bittermeads Mystery' (1922) by E. R. Punshon")
fig_erp.update_traces(marker_color=colors)
fig_erp.show()
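
Because the five novels differ in length, raw counts are not directly comparable across books. One way to put them on a common scale is to express each count per 10,000 words. The sketch below is an illustration under assumptions rather than part of the analysis above: `rate_per_10k` is a hypothetical helper, and it estimates a text's length by splitting on whitespace, which will not match spaCy's token count exactly.

```python
def rate_per_10k(count, text):
    """Return occurrences per 10,000 whitespace-separated words (0.0 for empty text)."""
    total_words = len(text.split())
    if total_words == 0:
        return 0.0
    return 10_000 * count / total_words

# Toy example: 2 occurrences in a 400-word text
toy_text = "word " * 400
print(rate_per_10k(2, toy_text))
```

In this notebook it could be called as `rate_per_10k(word_count_love(text_ac), text_ac)`, and likewise for 'money' and for the other four texts.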