---
# DH140 Final Project
# Rachel Han
# August 1, 2024
# The Story of Kangetsu:
# Network Analysis of a Buddhist Scholar's Memoirs
---

# Introduction

This project focuses on an unpublished memoir titled "Kangetsu: The Memoirs of Ruth Strout McCandless As Told to Günther Cologna in April 1994." The memoir is part of the Ruth Strout McCandless Collection on Nyogen Senzaki, housed at UCLA's East Asian Library, which was introduced to me by my supervisor, Dr. Jesse Drian. The memoir consists of transcripts of eight tapes and recounts Ruth McCandless's life, particularly her involvement with Zen Buddhism. Throughout the memoir, she details interactions with prominent Zen Buddhists such as Nyogen Senzaki, Soen Nakagawa, Soyen Shaku, Eido Tai Shimano, and D. T. Suzuki. She was also involved with influential figures in Indian philosophy such as Vanda Scaravelli, Jiddu Krishnamurti, and Selvarajan Yesudian, as well as notable figures in the writing and creative industry such as Anne Lindbergh, Harriet Doerr, Wallace Stegner, and Peter Matthiessen. The memoir also touches on her connections with spiritualists like Carolyn Conger and Brian Weiss, and with educators from Stanford University, including Edith R. Mirrielees, Anthony Sokol, and Wilfred Stone.

Given the numerous characters that appear in the memoir, it became evident that a network analysis would be beneficial to understand Ruth McCandless's interactions and contributions to various fields, particularly Zen Buddhism in mid-20th century America. By creating a network analysis, I aim to highlight her unique position as a female Buddhist scholar and elucidate the prominent figures in the development of Zen Buddhism in the United States. Additionally, the network analysis can illustrate the connections between individuals interested in Buddhism and those in the creative industry, offering a comprehensive view of the social landscape of an upper-class white female Buddhist scholar based in California during the twentieth century.

The memoir includes both famous and non-famous individuals, but for privacy reasons, only recognized figures were included in the network analysis. Although the interview questions follow a chronological order, a thorough reading revealed that the characters could be grouped into professions such as Indian Buddhist practitioner, Japanese Buddhist practitioner, creator, spiritualist, educator, and organizational personnel. Creators were further divided into writer, actress, film producer, and publisher.

This project aims to illuminate Ruth McCandless's significant contributions and connections within the realms of Zen Buddhism, Indian philosophy, and the creative industry through a comprehensive network analysis.

# Methods
## Data Preparation

1. Text Extraction and Cleaning:
   - I began by converting an OCR-scanned document to a text format and saved it as "original copy 2.txt." This document contained the memoirs of Ruth Strout McCandless.
   - To identify named entities within the text, I utilized spaCy, an NLP library. I experimented with different pipelines: the medium-size pipeline (en_core_web_md), large-size pipeline (en_core_web_lg), and transformer pipeline (en_core_web_trf). I found that the results were almost the same across these pipelines.
   - I combined the named entity recognition (NER) results from these pipelines. The results included both famous and non-famous individuals mentioned in the memoir; for this project, I focused only on the famous individuals. I created an Excel file named "Noa_Names.xlsx," which lists them. I compiled all variations of each person's name found in the NER process into one column, “Name in NER,” separated by commas, and put the person's full name in another column, “Full name.”
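
The combination step itself is not shown in the code cells. A minimal sketch of how the three PERSON tallies could be merged for a side-by-side comparison is given here; the helper `combine_tallies` and the toy counts are hypothetical illustrations, not the project's actual code or results.

```python
from collections import Counter

def combine_tallies(tallies):
    """Merge per-pipeline name tallies into one dict: name -> {pipeline: count}."""
    all_names = set()
    for tally in tallies.values():
        all_names.update(tally)
    return {
        name: {pipeline: tally.get(name, 0) for pipeline, tally in tallies.items()}
        for name in sorted(all_names)
    }

# Toy tallies with made-up numbers, one Counter per spaCy pipeline
toy_tallies = {
    "md": Counter({"Senzaki Sensei": 3, "Vanda": 2}),
    "lg": Counter({"Senzaki Sensei": 3, "Soen Roshi": 1}),
    "trf": Counter({"Senzaki Sensei": 4, "Vanda": 2, "Soen Roshi": 1}),
}

combined = combine_tallies(toy_tallies)
for name, counts in combined.items():
    print(name, counts)
```

A name that one pipeline misses shows up with a count of 0 for that pipeline, which makes disagreements between pipelines easy to spot before the name variants are compiled by hand.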


In [1]:
# Using the transformer pipeline to generate named entities
!pip install -U spacy


In [2]:
# The English transformer pipeline and the large pipeline might not work on Jupyter Notebook because of their sizes. They work fine locally in VS Code.
!python -m spacy download en_core_web_trf

Collecting en-core-web-trf==3.7.3
  Downloading https://github.com/explosion/spacy-models/releases/download/en_core_web_trf-3.7.3/en_core_web_trf-3.7.3-py3-none-any.whl (457.4 MB)

In [None]:
import spacy
from spacy import displacy
from collections import Counter
import pandas as pd
pd.set_option('display.max_rows', 600)
pd.set_option('display.max_colwidth', 400)
nlp = spacy.load('en_core_web_trf')

In [1]:
filepath = "original copy 2.txt"
with open(filepath, encoding='utf-8') as file:
    text = file.read()

# Process the text using SpaCy
document = nlp(text)


In [None]:
# Divide the text into chunks to process it more efficiently with spaCy
import math

# Define the number of chunks to divide the text into
number_of_chunks = 70

# Calculate the size of each chunk
chunk_size = math.ceil(len(text) / number_of_chunks)

# Split the text into chunks using list comprehension
text_chunks = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

In [None]:
chunked_documents = list(nlp.pipe(text_chunks))

In [None]:
# Extract and count named entities labeled as "PERSON"
people = [named_entity.text for document in chunked_documents for named_entity in document.ents if named_entity.label_ == "PERSON"]

# Count the occurrences of each person's name
people_tally = Counter(people)

# Create a DataFrame from the tally of people
df_character = pd.DataFrame(people_tally.most_common(), columns=['character', 'count'])
df_character

In [2]:
# Using the large pipeline to generate named entities
!python -m spacy download en_core_web_lg

Collecting en-core-web-lg==3.7.1
  Downloading https://github.com/explosion/spacy-models/releases/download/en_core_web_lg-3.7.1/en_core_web_lg-3.7.1-py3-none-any.whl (587.7 MB)
Installing collected packages: en-core-web-lg
Successfully installed en-core-web-lg-3.7.1
✔ Download and installation successful
You can now load the package via spacy.load('en_core_web_lg')


In [2]:
import spacy
from spacy import displacy
from collections import Counter
import pandas as pd
pd.set_option('display.max_rows', 600)
pd.set_option('display.max_colwidth', 400)
nlp = spacy.load('en_core_web_lg')

In [None]:
filepath = "original copy 2.txt"
with open(filepath, encoding='utf-8') as file:
    text = file.read()

# Process the text using SpaCy
document = nlp(text)

In [None]:
import math

# Define the number of chunks to split the text into
number_of_chunks = 80

chunk_size = math.ceil(len(text) / number_of_chunks)

# Split the text into chunks using list comprehension
text_chunks = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

In [None]:
# Process the text chunks with spaCy
chunked_documents = list(nlp.pipe(text_chunks))

In [None]:
# Extract and count named entities labeled as "PERSON"
people = [named_entity.text for document in chunked_documents for named_entity in document.ents if named_entity.label_ == "PERSON"]

# Tally the occurrences of each name
people_tally = Counter(people)

df = pd.DataFrame(people_tally.most_common(), columns=['character', 'count'])
df

In [5]:
# Using the medium pipeline to generate named entities
!python -m spacy download en_core_web_md

Collecting en-core-web-md==3.7.1
  Downloading https://github.com/explosion/spacy-models/releases/download/en_core_web_md-3.7.1/en_core_web_md-3.7.1-py3-none-any.whl (42.8 MB)
Installing collected packages: en-core-web-md
Successfully installed en-core-web-md-3.7.1
✔ Download and installation successful
You can now load the package via spacy.load('en_core_web_md')


In [6]:
import spacy
from spacy import displacy
from collections import Counter
import pandas as pd
pd.set_option('display.max_rows', 600)
pd.set_option('display.max_colwidth', 400)
nlp = spacy.load('en_core_web_md')

In [11]:
filepath = "original copy 2.txt"
with open(filepath, encoding='utf-8') as file:
    text = file.read()

# Process the text using SpaCy
document = nlp(text)

In [15]:
import math

# Define the number of chunks to split the text into
number_of_chunks = 80

# Calculate the size of each chunk
chunk_size = math.ceil(len(text) / number_of_chunks)

# Split the text into chunks using list comprehension
text_chunks = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

In [16]:
# Process text chunks with spaCy
chunked_documents = list(nlp.pipe(text_chunks))

In [18]:
# Extract and count named entities labeled as "PERSON"
people = [named_entity.text for document in chunked_documents for named_entity in document.ents if named_entity.label_ == "PERSON"]

people_tally = Counter(people)

df = pd.DataFrame(people_tally.most_common(), columns=['character', 'count'])
df

        character  count
0          Duncan     95
1            John     62
2           Keith     62
3  Senzaki Sensei     47
4      Krishna-ji     47
5      Soen Roshi     30
6            Kirk     30
7           Vanda     29
8         Gunther     24
9             Zen     18


2. Text Pre-processing:
   - To create a suitable node-edge table for network analysis, I pre-processed "original copy 2.txt." I removed all unnecessary information, such as titles, endings, punctuation, line breaks, and page breaks. Additionally, each tape recording begins with a marker from "tape 1" to "tape 8." Because of OCR errors, these markers are sometimes recognized slightly differently, for example as "crape 5." Therefore, I wrote a regular expression that captures variations of the word "tape" followed by a number from 1 to 8, and I deleted the lines that match it.
   - I saved the cleaned text as "processed.txt."
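
To make the behavior of the tape-marker pattern concrete, here is a small sketch that applies the same regular expression used in the notebook to a few sample lines. The sample lines are invented for illustration.

```python
import re

# Same pattern as in the notebook cell that removes tape markers
pattern = re.compile(r"\b(?:<|CJ'|'l'|c|r)?[TtCcRr]'?[Aa][Pp][Ee]?\s*[1-8]\b")

# Invented sample lines: two tape markers, one out-of-range number, one body line
sample_lines = [
    "Tape 1",
    "crape 5",
    "Tape 9",
    "This was in Japan I think",
]

# Keep only the lines that do not contain a tape marker
kept = [line for line in sample_lines if not pattern.search(line)]
print(kept)
```

Only markers numbered 1 to 8 are removed, so "Tape 9" survives. Because the filter drops a whole line whenever the pattern matches anywhere in it, a body line that happens to contain a phrase such as "cape 4" would also be removed, so the output is worth spot-checking.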


In [1]:
# Remove all punctuation
import string

file_path = 'original copy 2.txt'

# Read the content of the file (UTF-8, as in the NER cells above)
with open(file_path, 'r', encoding='utf-8') as file:
    content = file.read()

# Create a translation table that maps each ASCII punctuation character to None
translator = str.maketrans('', '', string.punctuation)

nopunctuation_content = content.translate(translator)

print(nopunctuation_content)


Tape 1

This was in Japan I think about 196465 and Senzaki Sensei had not
been back to Japan since he had left it early in the century to follow
Shaku Soen and Daisetsu Teitaro Suzuki to the United States It was a
beautiful beautiful day a Japanese day I would say There was a very
gentle shiguritype rain and little diamond raindrops would come
down the fetch of the monastery There was also some sort of
celebration I dont know which one possibly of Buddhas birthday
Roshi abbots priests of Zen had gathered from all over Japan and
were there in their magnificent brocade robes in this gray gentle quiet
atmosphere and I remember watching them file into the main
building and after they were in Senzaki Sensei and I went in There
were a lot of little lacquered tables around on the floor The people
there were all seated on the floor I remember the monks in their black
robes hurrying around to serve various guests That was my introduction to Japan
Asahina Roshi was the roshi at the time and be

In [2]:

# Create a regex pattern to match "Tape" followed by a number from 1 to 8, including OCR misreadings such as "crape 5"
import re
pattern = re.compile(r"\b(?:<|CJ'|'l'|c|r)?[TtCcRr]'?[Aa][Pp][Ee]?\s*[1-8]\b")

# Split the content into lines
lines = nopunctuation_content.split('\n')

# Drop every line that contains a match for the pattern
filtered_lines = [line for line in lines if not pattern.search(line)]

# Join the filtered lines back into a single string
notape_content = '\n'.join(filtered_lines)

# Output the modified content
print(notape_content)



This was in Japan I think about 196465 and Senzaki Sensei had not
been back to Japan since he had left it early in the century to follow
Shaku Soen and Daisetsu Teitaro Suzuki to the United States It was a
beautiful beautiful day a Japanese day I would say There was a very
gentle shiguritype rain and little diamond raindrops would come
down the fetch of the monastery There was also some sort of
celebration I dont know which one possibly of Buddhas birthday
Roshi abbots priests of Zen had gathered from all over Japan and
were there in their magnificent brocade robes in this gray gentle quiet
atmosphere and I remember watching them file into the main
building and after they were in Senzaki Sensei and I went in There
were a lot of little lacquered tables around on the floor The people
there were all seated on the floor I remember the monks in their black
robes hurrying around to serve various guests That was my introduction to Japan
Asahina Roshi was the roshi at the time and before I l

In [3]:
# Replace line breaks with spaces and delete page breaks

nolinebreaks_content = notape_content.replace('\n', ' ').replace('\r', ' ').replace('\f', '')
print(nolinebreaks_content)



3. Name Replacement:
   - Using the list of famous people's names from "Noa_Names.xlsx," I replaced all variations of these names in "processed.txt" with their full names. The updated file was saved as "rename.txt."
   - As recorded in the “Noa_Names.xlsx” file, I also replaced the pronouns “I”, “my”, “me”, and “mine” with “Ruth McCandless”. Although this approach is not perfectly accurate, it suffices for the project's needs. Because “Ruth McCandless” is rarely mentioned by name in the original text, yet she is the narrator and central figure of the network, this replacement is necessary.
   - I wrapped each name variant in whitespace lookarounds such as `(?<=\s)`, which require a whitespace character immediately before and after the variant, so that it is matched only as a standalone word. This prevents partial replacements within other words.
   - I manually rearranged the name variations so that longer ones are listed before shorter ones. For example, I ordered them as “Senzaki Senseis, Senzaki-sans, Senzaki Sensei, Senzaki”. This prevents repeated replacements, such as replacing “Senzaki Senseis” with “Nyogen Senzaki” and then erroneously replacing “Nyogen Senzaki” with “Nyogen Nyogen Senzaki.”
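
The two safeguards above (standalone-word matching and longest-first ordering) can be sketched on a toy sentence. This is an illustration under assumptions, not the exact patterns stored in the Excel file: it uses the negative lookarounds `(?<!\S)` and `(?!\S)` rather than `(?<=\s)`, so that a name at the very start or end of the text is also matched, and it sorts the variants by length in code instead of by hand.

```python
import re

# Hypothetical variants for one person, deliberately listed shortest first
variants = ["Senzaki", "Senzaki Sensei", "Senzaki Senseis"]
full_name = "Nyogen Senzaki"

# Longest variant first, so "Senzaki Senseis" is tried before "Senzaki"
ordered = sorted(variants, key=len, reverse=True)

# (?<!\S) and (?!\S) require whitespace or a text boundary on each side
pattern = re.compile(
    r"(?<!\S)(?:" + "|".join(re.escape(v) for v in ordered) + r")(?!\S)"
)

sample = "I met Senzaki Sensei and later Senzaki Senseis students"
result = pattern.sub(full_name, sample)
print(result)
```

Because all variants are combined into one alternation and substituted in a single pass, text that has already been replaced is never scanned again, which avoids the “Nyogen Nyogen Senzaki” problem.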


In [4]:
# Open a file in write mode
with open("processed.txt", "w") as file:
    # Write the string to the file
    file.write(nolinebreaks_content)

In [5]:
import re
import pandas as pd

# Load the specific worksheet from the Excel file
df = pd.read_excel('Noa_Names.xlsx', sheet_name='Famous People')

# Read the text from the file
with open('processed.txt', 'r') as file:
   text = file.read().strip()


In [6]:
# Turn the comma-separated name variants into a regex alternation ("a, b" -> "a|b")
df['Name in NER'] = df['Name in NER'].str.replace(", ", "|", regex=False)

# Iterate over each row in the DataFrame
for i in df.index:
   # Get the name-variant pattern and the full name for the current row
   name_pattern = df['Name in NER'][i]
   full_name = df['Full name'][i]
   # Replace every occurrence of a variant in 'text' with the full name
   text = re.sub(name_pattern, full_name, text)

print(text)



In [25]:

with open('rename.txt', 'w') as file:
    file.write(text)


4. Node-Edge Table:
   - Following the method used by Andrew Beveridge and Jie Shan in their paper "Network of Thrones," I counted how often two characters' names occur within 15 words of each other in the text. My method is a simplified version because it does not take pronouns or dialogue into consideration. This is acceptable here because the interview has a more straightforward structure than a novel: the interviewee is recounting what happened.
   - This frequency data was used to create a node-edge table, which represented the degree of connections between individuals in the memoir.
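
As a worked illustration of the window rule, the sketch below counts co-occurrences in a single toy sentence. The function name `count_cooccurrences` and the `window` parameter are introduced here for illustration only; the counting logic mirrors the notebook cell, and the small window of 3 is chosen just to show which pairs are excluded.

```python
from collections import defaultdict

def count_cooccurrences(words, names, window=15):
    """Count pairs of different names whose start positions are within `window` words."""
    # Record (name, position) for every place a full name starts
    matches = []
    for i in range(len(words)):
        for name in names:
            name_words = name.split()
            if words[i:i + len(name_words)] == name_words:
                matches.append((name, i))

    # Count each unordered pair of different names that falls inside the window
    counts = defaultdict(int)
    for idx, (name1, pos1) in enumerate(matches):
        for name2, pos2 in matches[idx + 1:]:
            if name1 != name2 and abs(pos2 - pos1) <= window:
                counts[tuple(sorted((name1, name2)))] += 1
    return dict(counts)

toy_names = ["Ruth McCandless", "Nyogen Senzaki", "Soen Nakagawa"]
toy_words = "Ruth McCandless met Nyogen Senzaki and Soen Nakagawa".split()

narrow = count_cooccurrences(toy_words, toy_names, window=3)
wide = count_cooccurrences(toy_words, toy_names, window=15)
print(narrow)
print(wide)
```

With a window of 3, "Ruth McCandless" (position 0) and "Soen Nakagawa" (position 6) are too far apart to be linked, while the 15-word window links all three pairs. Each resulting pair and its count correspond to one Source, Target, Weight row in the node-edge table.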


In [7]:
#Convert names in a column to a list
import pandas as pd

df = pd.read_excel('Noa_Names.xlsx', sheet_name='Famous People')
name_column = df['Full name']

names = name_column.tolist()  

print(names)


['Jiddu Krishnamurti', 'Nyogen Senzaki', 'Soen Nakagawa', 'Vanda Scaravelli', 'William Brugh Joy', 'Anne Lindbergh', 'D. T. Suzuki', 'Ruth McCandless', 'Radha Rajagopal Sloss', 'Soyen Shaku', 'Eido Tai Shimano', 'Sohaku Ogata', 'Helen Corral', 'Edith R. Mirrielees', 'Carolyn Conger', 'Harriet Doerr', 'Mary Zimbalist', 'Sōgen Asahina', 'Robert Baker Aitken', 'Brian Weiss', 'Wallace Stegner', 'Shubin Tanahashi', 'Anthony Sokol', 'Beatrice Wood', 'Wilfred Stone', 'Geoffrey Farwell', 'Zentatsu Richard Baker', 'Bill Rample', 'Noel Rodriguez', 'Peter Matthiessen', 'Felix Wolff', 'Helen Wolff', 'Manly P. Hall', 'Paul Reps', 'Frederick Spiegelberg', 'John Dodds', 'Gary Snyder', 'Jack Kerouac', 'Reginald Berkeley', 'George Gurdjieff', 'Alan Watts', 'Robertson Davies', 'Sam Zimbalist', 'Elisabeth Haich', 'Selvarajan Yesudian']


In [8]:
# Create a node-edge table that counts how often two names occur within 15 words of each other
import re
from collections import defaultdict


# Load the text data
file_path = 'rename.txt'
with open(file_path, 'r') as file:
    text = file.read()

# Split text into words
words = text.split()

matches = []

# Check each word and see if it matches any name
for i in range(len(words)):
    for name in names:
        name_words = name.split()
        if words[i:i + len(name_words)] == name_words:
            matches.append((name, i))

# Create a dictionary and calculate co-occurrences within 15 words
co_occurrences = defaultdict(int)

for i, (name1, pos1) in enumerate(matches):
    for name2, pos2 in matches[i + 1:]:
        if name1 != name2 and abs(pos2 - pos1) <= 15:
            pair = tuple(sorted((name1, name2)))
            co_occurrences[pair] += 1

# Create a DataFrame to display the results
data = [{'Source': source, 'Target': target, 'Weight': weight} for (source, target), weight in co_occurrences.items()]

df = pd.DataFrame(data)

pd.set_option('display.max_rows', 100)
print(df)

                   Source                  Target  Weight
0          Nyogen Senzaki         Ruth McCandless      80
1            D. T. Suzuki             Soyen Shaku       3
2         Ruth McCandless           Sōgen Asahina       8
3         Ruth McCandless           Soen Nakagawa      55
4           Soen Nakagawa           Sōgen Asahina       1
5        Eido Tai Shimano         Ruth McCandless       6
6           Manly P. Hall         Ruth McCandless       7
7         Ruth McCandless             Soyen Shaku       1
8         Ruth McCandless       William Brugh Joy      30
9          Nyogen Senzaki           Soen Nakagawa      13
10           D. T. Suzuki          Nyogen Senzaki       3
11           D. T. Suzuki         Ruth McCandless       9
12        Ruth McCandless            Sohaku Ogata       4
13         Nyogen Senzaki            Sohaku Ogata       3
14         Nyogen Senzaki  Zentatsu Richard Baker       2
15     Jiddu Krishnamurti         Ruth McCandless      10
16     Jiddu K

## Network Visualization

### 1. Community-Based Network Graph:
   - Using the node-edge table, I created a network graph with nodes colored by their modularity class (community). This classification helped visualize clusters of closely connected individuals. The title of this graph is “Memoir Network by Modularity Class.”
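
A small sketch of community detection on a toy graph may help in reading the result. It assumes a recent NetworkX version, in which `greedy_modularity_communities` defaults to `weight=None` and therefore treats every edge as equally strong unless an edge attribute is named. Passing `weight="Weight"` (the attribute name used in the node-edge table) is shown here as one option to experiment with, not as the method used in this project; the toy edges are invented.

```python
import networkx as nx
from networkx.algorithms import community

# Invented toy network: two tightly linked triangles joined by one weak edge
toy_edges = [
    ("A", "B", 5), ("B", "C", 5), ("A", "C", 5),
    ("D", "E", 5), ("E", "F", 5), ("D", "F", 5),
    ("C", "D", 1),
]
G_toy = nx.Graph()
G_toy.add_weighted_edges_from(toy_edges, weight="Weight")

# Detect communities, letting co-occurrence counts act as edge strengths
toy_communities = community.greedy_modularity_communities(G_toy, weight="Weight")

# Modularity of the detected partition (higher means a clearer division)
score = community.modularity(G_toy, toy_communities, weight="Weight")

for number, members in enumerate(toy_communities):
    print(number, sorted(members))
print(round(score, 3))
```

The index of each community in the returned list is what the notebook stores as the modularity class of its members.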



In [29]:
# Creating an interactive network with labels and responsive highlighting for communities 
import networkx
import matplotlib.pyplot as plt
import numpy as np

#!pip install bokeh
from bokeh.io import output_notebook, show, save
output_notebook()

In [30]:
G = networkx.from_pandas_edgelist(df, 
                                  'Source', 
                                  'Target', 
                                  'Weight')

In [36]:
from bokeh.models import Range1d, Circle, ColumnDataSource, MultiLine, NodesAndLinkedEdges, LabelSet
from bokeh.plotting import figure, from_networkx, show, save, output_file
from bokeh.palettes import BuRd8
from bokeh.transform import linear_cmap
from networkx.algorithms import community

In [37]:
#Calculate the degree for each node and add it as a node attribute
degrees = dict(networkx.degree(G))
networkx.set_node_attributes(G, name='degree', values=degrees)

In [38]:
communities = community.greedy_modularity_communities(G)

In [39]:
modularity_class = {}
modularity_color = {}

#Loop through each community (the loop variable is named "members" so it does not shadow the imported "community" module)
for community_number, members in enumerate(communities):
    color = BuRd8[community_number % len(BuRd8)]  # Ensure color is within palette range
    for name in members:
        modularity_class[name] = community_number
        modularity_color[name] = color

networkx.set_node_attributes(G, name='modularity_class', values=modularity_class)
networkx.set_node_attributes(G, name='modularity_color', values=modularity_color)

In [41]:
title = 'Memoir Network by Modularity Class'

#Define the colors used to highlight nodes and edges
node_highlight_color = 'yellow'
edge_highlight_color = 'red'

#Set a fixed node radius and choose the node attribute that sets the node color
node_radius_attribute = 1
node_color_attribute = 'modularity_color'

#Define categories
TOOLTIPS = [
       ("Modularity Class", "@modularity_class"),
       ("Character", "@index"),
       ("Modularity Color", "$color[swatch]:modularity_color"),
       ("Degree", "@degree")
]

#Generate a plot with specified dimensions, toolbar options, and title
plot = figure(tooltips = TOOLTIPS, tools="wheel_zoom,reset,save,pan", active_scroll='wheel_zoom',
              x_range=Range1d(-10, 10), y_range=Range1d(-10, 10), title=title)

#Construct a network graph object from the network data
network_graph = from_networkx(G, networkx.spring_layout, scale=20, center=(0, 0))

#Assign the colors used to highlight edges
network_graph.edge_renderer.hover_glyph = MultiLine(line_color=edge_highlight_color, line_width=3)
network_graph.edge_renderer.selection_glyph = MultiLine(line_color=edge_highlight_color, line_width=3)

#Draw nodes with a fixed radius, colored by modularity class
network_graph.node_renderer.glyph = Circle(radius=node_radius_attribute, fill_color=node_color_attribute)

#Define the opacity and width for edges
network_graph.edge_renderer.glyph = MultiLine(line_alpha=0.5, line_width=3)

#Specify the colors used to highlight nodes
network_graph.node_renderer.selection_glyph = Circle(radius=node_radius_attribute, fill_color=node_highlight_color, line_width=3)
network_graph.node_renderer.hover_glyph = Circle(radius=node_radius_attribute, fill_color=node_highlight_color, line_width=3)

#Enable highlighting of nodes and edges
network_graph.inspection_policy = NodesAndLinkedEdges()
network_graph.selection_policy = NodesAndLinkedEdges()

plot.renderers.append(network_graph)

#Add Labels to the graph
x, y = zip(*network_graph.layout_provider.graph_layout.values())
node_labels = list(G.nodes())
source = ColumnDataSource({'x': x, 'y': y, 'name': [node_labels[i] for i in range(len(x))]})
labels = LabelSet(x='x', y='y', text='name', source=source, background_fill_color='white', 
                  text_font_size='10px', background_fill_alpha=0.7)
plot.renderers.append(labels)

output_file(f"{title}.html", title=title)
show(plot)
save(plot)

'/home/jovyan/Final Project/Memoir Network by Modularity Class.html'

#### Result 1: Memoir Network by Modularity Class

This graph colors nodes by modularity class, that is, the community each node is assigned to when the network is partitioned to maximize modularity, a measure of the strength of division of a network into communities.

Features:
The node colors come from a colorblind-friendly palette.

Usage:
This graph is useful for automatically visualizing the community structure within the network, showing how individuals are grouped based on their connections. It works better with long texts and large numbers of characters.

Findings:
Ruth McCandless is the central figure.
High-degree nodes like D. T. Suzuki and Nyogen Senzaki are influential within the network, which is consistent with the insights from the close reading.
The communities are not accurate: the modularity classes do not reflect the relationships between characters that I identified through close reading.

### 2. Profession-Based Network Graphs:
   - Although the community-based network graph gives some insight into the network in the text, the characters' professions are diverse, so I also colored nodes by profession to provide a clearer visualization.
   - I assigned each individual to one of six profession categories: Indian Buddhist Practitioner, Japanese Buddhist Practitioner, Creator, Spiritualist, Educator, or Organizational Personnel. The title of this graph is “Memoir Network by Six Professions.”
   - Creators were further divided into sub-categories: Writer, Actress, Film Producer, and Publisher. The title of this graph is “Memoir Network by Eleven Professions.”
   - I also tried to place nodes with related professions together in the same area of the graph. Despite my attempts, I could not achieve this grouping effectively. The two failed approaches are named “Failled: Memoir Network by Professions Grouped Horizontally” and “Failled: Memoir Network by Professions Grouped Separately.”


In [42]:
# Building “Memoir Network by Six Professions”

from bokeh.io import output_notebook, show
from bokeh.models import Range1d, ColumnDataSource, MultiLine, NodesAndLinkedEdges, LabelSet, Legend, LegendItem, Scatter
from bokeh.plotting import figure, from_networkx, show, save, output_file
from bokeh.palettes import Light, HighContrast, Bright
import networkx as nx

output_notebook()

In [43]:
# Assign professions to people 

professions = {
    'Jiddu Krishnamurti': 'Indian Buddhist Practitioner',
    'Nyogen Senzaki': 'Japanese Buddhist Practitioner',
    'Soen Nakagawa': 'Japanese Buddhist Practitioner',
    'Vanda Scaravelli': 'Indian Buddhist Practitioner',
    'William Brugh Joy': 'Spiritualist',
    'Anne Lindbergh': 'Creator',
    'D. T. Suzuki': 'Japanese Buddhist Practitioner',
    'Ruth McCandless': 'Japanese Buddhist Practitioner',
    'Radha Rajagopal Sloss': 'Indian Buddhist Practitioner',
    'Soyen Shaku': 'Japanese Buddhist Practitioner',
    'Eido Tai Shimano': 'Japanese Buddhist Practitioner',
    'Sohaku Ogata': 'Japanese Buddhist Practitioner',
    'Helen Corral': 'Creator',
    'Edith R. Mirrielees': 'Educator',
    'Carolyn Conger': 'Spiritualist',
    'Harriet Doerr': 'Creator',
    'Mary Zimbalist': 'Creator',
    'Sōgen Asahina': 'Japanese Buddhist Practitioner',
    'Robert Baker Aitken': 'Japanese Buddhist Practitioner',
    'Brian Weiss': 'Spiritualist',
    'Wallace Stegner': 'Creator',
    'Shubin Tanahashi': 'Japanese Buddhist Practitioner',
    'Anthony Sokol': 'Educator',
    'Beatrice Wood': 'Creator',
    'Wilfred Stone': 'Educator',
    'Geoffrey Farwell': 'Organizational Personnel',
    'Zentatsu Richard Baker': 'Japanese Buddhist Practitioner',
    'Bill Rample': 'Organizational Personnel',
    'Noel Rodriguez': 'Japanese Buddhist Practitioner',
    'Peter Matthiessen': 'Creator',
    'Felix Wolff': 'Creator',
    'Helen Wolff': 'Creator',
    'Dennis Merzel': 'Japanese Buddhist Practitioner',
    'Manly P. Hall': 'Creator',
    'Paul Reps': 'Creator',
    'Frederick Spiegelberg': 'Educator',
    'John Dodds': 'Educator',
    'Gary Snyder': 'Creator',
    'Jack Kerouac': 'Creator',
    'Reginald Berkeley': 'Creator',
    'George Gurdjieff': 'Spiritualist',
    'Alan Watts': 'Japanese Buddhist Practitioner',
    'Robertson Davies': 'Creator',
    'Sam Zimbalist': 'Creator',
    'Elisabeth Haich': 'Indian Buddhist Practitioner',
    'Selvarajan Yesudian': 'Indian Buddhist Practitioner'
}

# Assign professions to nodes
for node in G.nodes():
    if node in professions:
        G.nodes[node]['profession'] = professions[node]

# Create a color palette for professions
profession_color_palette = {
    'Indian Buddhist Practitioner': Light[7][0],
    'Japanese Buddhist Practitioner': Light[7][1],
    'Creator': Light[7][2],
    'Educator': Light[7][3],
    'Spiritualist': Light[7][4],
    'Organizational Personnel': Light[7][5]
}

# Assign colors to nodes based on profession
for node in G.nodes():
    G.nodes[node]['profession_color'] = profession_color_palette[G.nodes[node]['profession']]

# Define the colors used to highlight nodes and edges
node_highlight_color = 'purple'
edge_highlight_color = 'yellowgreen'

# Define tooltips
HOVER_TOOLTIPS = [
    ("Character", "@index"),
    ("Degree", "@degree"),
    ("Profession", "@profession"),
    ("Profession Color", "$color[swatch]:profession_color"),
]

title = 'Memoir Network by Six Professions'

# Generate a plot with specified dimensions, toolbar options, and title
plot = figure(tooltips=HOVER_TOOLTIPS,
              tools="pan,wheel_zoom,save,reset", active_scroll='wheel_zoom',
              x_range=Range1d(-20, 20), y_range=Range1d(-20, 20), title=title,
              width=800, height=800) 

# Construct a network graph object from the network data
network_graph = from_networkx(G, nx.spring_layout, scale=30, center=(0, 0))

# Draw nodes with a fixed size, colored by profession
network_graph.node_renderer.glyph = Scatter(size=25, fill_color='profession_color')

# Specify the colors used to highlight nodes
network_graph.node_renderer.hover_glyph = Scatter(size=25, fill_color=node_highlight_color, line_width=2)
network_graph.node_renderer.selection_glyph = Scatter(size=25, fill_color=node_highlight_color, line_width=2)

# Define the opacity and width for edges
network_graph.edge_renderer.glyph = MultiLine(line_alpha=0.3, line_width=1)

# Assign the colors used to highlight edges
network_graph.edge_renderer.selection_glyph = MultiLine(line_color=edge_highlight_color, line_width=2)
network_graph.edge_renderer.hover_glyph = MultiLine(line_color=edge_highlight_color, line_width=2)

# Enable highlighting of nodes and edges
network_graph.selection_policy = NodesAndLinkedEdges()
network_graph.inspection_policy = NodesAndLinkedEdges()

plot.renderers.append(network_graph)

# Add Labels to the graph
x, y = zip(*network_graph.layout_provider.graph_layout.values())
node_labels = list(G.nodes())
source = ColumnDataSource({'x': x, 'y': y, 'name': node_labels})
labels = LabelSet(x='x', y='y', text='name', source=source, background_fill_color='white', text_font_size='15px', background_fill_alpha=0.7)
plot.renderers.append(labels)

# Add legend: draw one tiny dummy glyph per profession; legend_label adds it to the plot's legend
for prof, color in profession_color_palette.items():
    dummy_source = ColumnDataSource({'x': [0], 'y': [0]})
    plot.scatter(x='x', y='y', size=1, color=color, source=dummy_source, legend_label=prof)

output_file(f"{title}.html", title=title)
show(plot)
save(plot)


'/home/jovyan/Final Project/Memoir Network by Six Professions.html'

### Result 2: Memoir Network by Six Professions

This graph categorizes nodes based on six professions: Indian Buddhist Practitioner, Japanese Buddhist Practitioner, Creator, Educator, Spiritualist, and Organizational Personnel. This categorization helps in quickly identifying the professional diversity within the network.

Features:
  - Node attributes: each node is colored by its profession, which was assigned manually.
  - The legend on the right side explains the color codes, so viewers can identify each profession.
  - The colors come from colorblind-friendly palettes.

Usage:
This graph is useful for understanding the professional composition of the network and for seeing how different professions are connected. However, the profession labels depend on close reading of the memoir, so the graph takes more time to create.

Findings:
With individuals categorized by profession, Japanese Buddhist practitioners form one of the largest groups of characters. McCandless is also frequently involved with creators, which suggests that creators in that era were interested in exploring different religions and forms of spiritual well-being. In addition, spiritualists such as George Gurdjieff are closely connected with Japanese Buddhist practitioners.
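
The group sizes behind a finding like this can be checked by tallying the profession labels. The sketch below is illustrative only: `professions_excerpt` is a hypothetical name holding the first few entries of the manual six-category labels, so its counts say nothing about the full network. In the notebook the same tally would run over the complete `professions` dictionary, or over the `profession` attribute of the nodes in `G`.

```python
from collections import Counter

# Excerpt of the manual profession labels (six-category scheme); illustration only
professions_excerpt = {
    'Jiddu Krishnamurti': 'Indian Buddhist Practitioner',
    'Nyogen Senzaki': 'Japanese Buddhist Practitioner',
    'Soen Nakagawa': 'Japanese Buddhist Practitioner',
    'Vanda Scaravelli': 'Indian Buddhist Practitioner',
    'William Brugh Joy': 'Spiritualist',
    'Anne Lindbergh': 'Creator',
    'D. T. Suzuki': 'Japanese Buddhist Practitioner',
    'Ruth McCandless': 'Japanese Buddhist Practitioner',
}

# Tally how many characters fall under each profession label
profession_counts = Counter(professions_excerpt.values())
for profession, count in profession_counts.most_common():
    print(f"{profession}: {count}")
```

Running the tally on the full label set makes it easy to compare group sizes directly instead of estimating them from the plot.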

In [44]:
# Building “Memoir Network by Eleven Professions”

professions = {
    'Jiddu Krishnamurti': 'Indian Buddhist Practitioner',
    'Nyogen Senzaki': 'Japanese Buddhist Practitioner',
    'Soen Nakagawa': 'Japanese Buddhist Practitioner',
    'Vanda Scaravelli': 'Indian Buddhist Practitioner',
    'William Brugh Joy': 'Spiritualist',
    'Anne Lindbergh': 'Writer',
    'D. T. Suzuki': 'Japanese Buddhist Practitioner',
    'Ruth McCandless': 'Japanese Buddhist Practitioner',
    'Radha Rajagopal Sloss': 'Indian Buddhist Practitioner',
    'Soyen Shaku': 'Japanese Buddhist Practitioner',
    'Eido Tai Shimano': 'Japanese Buddhist Practitioner',
    'Sohaku Ogata': 'Japanese Buddhist Practitioner',
    'Helen Corral': 'Actress',
    'Edith R. Mirrielees': 'Educator',
    'Carolyn Conger': 'Spiritualist',
    'Harriet Doerr': 'Writer',
    'Mary Zimbalist': 'Actress',
    'Sōgen Asahina': 'Japanese Buddhist Practitioner',
    'Robert Baker Aitken': 'Japanese Buddhist Practitioner',
    'Brian Weiss': 'Spiritualist',
    'Wallace Stegner': 'Writer',
    'Shubin Tanahashi': 'Japanese Buddhist Practitioner',
    'Anthony Sokol': 'Educator',
    'Beatrice Wood': 'Painter',
    'Wilfred Stone': 'Educator',
    'Geoffrey Farwell': 'Organizational Personnel',
    'Zentatsu Richard Baker': 'Japanese Buddhist Practitioner',
    'Bill Rample': 'Organizational Personnel',
    'Noel Rodriguez': 'Japanese Buddhist Practitioner',
    'Peter Matthiessen': 'Writer',
    'Felix Wolff': 'Publisher',
    'Helen Wolff': 'Publisher',
    'Dennis Merzel': 'Japanese Buddhist Practitioner',
    'Manly P. Hall': 'Writer',
    'Paul Reps': 'Writer',
    'Frederick Spiegelberg': 'Educator',
    'John Dodds': 'Educator',
    'Gary Snyder': 'Writer',
    'Jack Kerouac': 'Writer',
    'Reginald Berkeley': 'Writer',
    'George Gurdjieff': 'Spiritualist',
    'Alan Watts': 'Japanese Buddhist Practitioner',
    'Robertson Davies': 'Writer',
    'Sam Zimbalist': 'Film Producer',
    'Elisabeth Haich': 'Indian Buddhist Practitioner',
    'Selvarajan Yesudian': 'Indian Buddhist Practitioner'
}

# Assign professions to nodes
for node in G.nodes():
    if node in professions:
        G.nodes[node]['profession'] = professions[node]

# Create a color palette for professions
profession_color_palette = {
    'Indian Buddhist Practitioner': Light[9][0],
    'Japanese Buddhist Practitioner': Light[9][1],
    'Creator': Light[9][2],  # kept from the six-profession palette; no entry in the dict above uses it
    'Educator': Light[9][3],
    'Spiritualist': Light[9][4],
    'Organizational Personnel': Light[9][5],
    'Writer': Light[9][6],
    'Actress': Bright[7][5],
    'Painter': HighContrast[3][0],
    'Film Producer': HighContrast[3][1],
    'Publisher': HighContrast[3][2]
}

# Assign colors to nodes based on profession
for node in G.nodes():
    G.nodes[node]['profession_color'] = profession_color_palette[G.nodes[node]['profession']]

# Define the colors used to highlight nodes and edges
node_highlight_color = 'purple'
edge_highlight_color = 'yellowgreen'

# Define tooltips
HOVER_TOOLTIPS = [
    ("Character", "@index"),
    ("Degree", "@degree"),
    ("Profession", "@profession"),
    ("Profession Color", "$color[swatch]:profession_color"),
]

title = 'Memoir Network by Eleven Professions'

# Generate a plot with specified dimensions, toolbar options, and title
plot = figure(tooltips=HOVER_TOOLTIPS,
              tools="pan,wheel_zoom,save,reset", active_scroll='wheel_zoom',
              x_range=Range1d(-20, 20), y_range=Range1d(-20, 20), title=title,
              width=1000, height=1000) 

# Construct a network graph object from the network data
network_graph = from_networkx(G, nx.spring_layout, scale=30, center=(0, 0))

# Color nodes by profession (node size is fixed at 25)
network_graph.node_renderer.glyph = Scatter(size=25, fill_color='profession_color')

# Specify the colors used to highlight nodes
network_graph.node_renderer.hover_glyph = Scatter(size=25, fill_color=node_highlight_color, line_width=2)
network_graph.node_renderer.selection_glyph = Scatter(size=25, fill_color=node_highlight_color, line_width=2)

# Define the opacity and width for edges
network_graph.edge_renderer.glyph = MultiLine(line_alpha=0.3, line_width=1)

# Assign the colors used to highlight edges
network_graph.edge_renderer.selection_glyph = MultiLine(line_color=edge_highlight_color, line_width=2)
network_graph.edge_renderer.hover_glyph = MultiLine(line_color=edge_highlight_color, line_width=2)

# Enable highlighting of nodes and edges
network_graph.selection_policy = NodesAndLinkedEdges()
network_graph.inspection_policy = NodesAndLinkedEdges()

plot.renderers.append(network_graph)

# Add Labels to the graph
x, y = zip(*network_graph.layout_provider.graph_layout.values())
node_labels = list(G.nodes())
source = ColumnDataSource({'x': x, 'y': y, 'name': node_labels})
labels = LabelSet(x='x', y='y', text='name', source=source, background_fill_color='white', text_font_size='15px', background_fill_alpha=0.7)
plot.renderers.append(labels)

# Add legend: draw one tiny dummy glyph per profession; legend_label adds it to the plot's legend
for prof, color in profession_color_palette.items():
    dummy_source = ColumnDataSource({'x': [0], 'y': [0]})
    plot.scatter(x='x', y='y', size=1, color=color, source=dummy_source, legend_label=prof)

# Specify output file and save the plot
output_file(f"{title}.html", title=title)
show(plot)
save(plot)


'/home/jovyan/Final Project/Memoir Network by Eleven Professions.html'

### Result 3: Memoir Network by Eleven Professions

The "Memoir Network by Eleven Professions" graph provides a more detailed categorization of the professional roles within Ruth McCandless's network, offering a richer analysis compared to the six-profession graph.

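Features: Nodes are color-coded using a palette of eleven professional categories: Indian Buddhist Practitioner, Japanese Buddhist Practitioner, Creator, Educator, Spiritualist, Organizational Personnel, Writer, Actress, Painter, Film Producer, and Publisher. In this graph the individuals previously grouped as Creators are relabeled with the more specific categories (Writer, Actress, Painter, Film Producer, and Publisher), so the Creator color remains in the palette and legend even though the labels above no longer assign it. This detailed categorization helps in understanding the specific roles of individuals within the network.
The legend on the right side explains the color codes, so viewers can identify each profession.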

Findings: Ruth McCandless interacts with individuals across various fields of expertise, highlighting a notable trend in the mid to late twentieth century: upper-class individuals, particularly writers and those in the entertainment industry, had a keen interest in exploring religion and spirituality. This phenomenon underscores the cultural and intellectual curiosity of the era, where prominent figures sought deeper understanding and experiences beyond their professional domains.


In [29]:
# Building “Failed: Memoir Network by Professions Grouped Horizontally”

professions = {
    'Jiddu Krishnamurti': 'Indian Buddhist Practitioner',
    'Nyogen Senzaki': 'Japanese Buddhist Practitioner',
    'Soen Nakagawa': 'Japanese Buddhist Practitioner',
    'Vanda Scaravelli': 'Indian Buddhist Practitioner',
    'William Brugh Joy': 'Spiritualist',
    'Anne Lindbergh': 'Creator',
    'D. T. Suzuki': 'Japanese Buddhist Practitioner',
    'Ruth McCandless': 'Japanese Buddhist Practitioner',
    'Radha Rajagopal Sloss': 'Indian Buddhist Practitioner',
    'Soyen Shaku': 'Japanese Buddhist Practitioner',
    'Eido Tai Shimano': 'Japanese Buddhist Practitioner',
    'Sohaku Ogata': 'Japanese Buddhist Practitioner',
    'Helen Corral': 'Creator',
    'Edith R. Mirrielees': 'Educator',
    'Carolyn Conger': 'Spiritualist',
    'Harriet Doerr': 'Creator',
    'Mary Zimbalist': 'Creator',
    'Sōgen Asahina': 'Japanese Buddhist Practitioner',
    'Robert Baker Aitken': 'Japanese Buddhist Practitioner',
    'Brian Weiss': 'Spiritualist',
    'Wallace Stegner': 'Creator',
    'Shubin Tanahashi': 'Japanese Buddhist Practitioner',
    'Anthony Sokol': 'Educator',
    'Beatrice Wood': 'Creator',
    'Wilfred Stone': 'Educator',
    'Geoffrey Farwell': 'Organizational Personnel',
    'Zentatsu Richard Baker': 'Japanese Buddhist Practitioner',
    'Bill Rample': 'Organizational Personnel',
    'Noel Rodriguez': 'Japanese Buddhist Practitioner',
    'Peter Matthiessen': 'Creator',
    'Felix Wolff': 'Creator',
    'Helen Wolff': 'Creator',
    'Dennis Merzel': 'Japanese Buddhist Practitioner',
    'Manly P. Hall': 'Creator',
    'Paul Reps': 'Creator',
    'Frederick Spiegelberg': 'Educator',
    'John Dodds': 'Educator',
    'Gary Snyder': 'Creator',
    'Jack Kerouac': 'Creator',
    'Reginald Berkeley': 'Creator',
    'George Gurdjieff': 'Spiritualist',
    'Alan Watts': 'Japanese Buddhist Practitioner',
    'Robertson Davies': 'Creator',
    'Sam Zimbalist': 'Creator',
    'Elisabeth Haich': 'Indian Buddhist Practitioner',
    'Selvarajan Yesudian': 'Indian Buddhist Practitioner'
}

# Assign professions to nodes
for node in G.nodes():
    if node in professions:
        G.nodes[node]['profession'] = professions[node]

# Create a color palette for professions
profession_color_palette = {
    'Indian Buddhist Practitioner': Light[7][0],
    'Japanese Buddhist Practitioner': Light[7][1],
    'Creator': Light[7][2],
    'Educator': Light[7][3],
    'Spiritualist': Light[7][4],
    'Organizational Personnel': Light[7][5]
}

# Assign colors to nodes based on profession
for node in G.nodes():
    G.nodes[node]['profession_color'] = profession_color_palette[G.nodes[node]['profession']]

# Define custom positions to group by profession
profession_positions = {}
offset = 0
for profession, color in profession_color_palette.items():
    subgraph = G.subgraph([n for n, d in G.nodes(data=True) if d['profession'] == profession])
    pos = nx.spring_layout(subgraph, center=(offset, 0), scale=3)
    profession_positions.update(pos)
    offset += 5


# Define the colors used to highlight nodes and edges
node_highlight_color = 'purple'
edge_highlight_color = 'yellowgreen'

# Define tooltips
HOVER_TOOLTIPS = [
    ("Character", "@index"),
    ("Degree", "@degree"),
    ("Profession", "@profession"),
    ("Profession Color", "$color[swatch]:profession_color"),
]

title = 'Failed: Memoir Network by Professions Grouped Horizontally'

# Generate a plot with specified dimensions, toolbar options, and title
plot = figure(tooltips=HOVER_TOOLTIPS,
              tools="pan,wheel_zoom,save,reset", active_scroll='wheel_zoom',
              x_range=Range1d(-20, 20), y_range=Range1d(-20, 20), title=title,
              width=800, height=800) 

# Construct a network graph object from the network data
network_graph = from_networkx(G, profession_positions)

# Color nodes by profession (node size is fixed at 25)
network_graph.node_renderer.glyph = Scatter(size=25, fill_color='profession_color')

# Specify the colors used to highlight nodes
network_graph.node_renderer.hover_glyph = Scatter(size=25, fill_color=node_highlight_color, line_width=2)
network_graph.node_renderer.selection_glyph = Scatter(size=25, fill_color=node_highlight_color, line_width=2)

# Define the opacity and width for edges
network_graph.edge_renderer.glyph = MultiLine(line_alpha=0.3, line_width=1)

# Assign the colors used to highlight edges
network_graph.edge_renderer.selection_glyph = MultiLine(line_color=edge_highlight_color, line_width=2)
network_graph.edge_renderer.hover_glyph = MultiLine(line_color=edge_highlight_color, line_width=2)

# Enable highlighting of nodes and edges
network_graph.selection_policy = NodesAndLinkedEdges()
network_graph.inspection_policy = NodesAndLinkedEdges()

plot.renderers.append(network_graph)

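# Add Labels to the graph
# Names and coordinates are read from the same layout dict so each label stays with its node
# (the custom layout dict is ordered by profession, not in G.nodes() order)
graph_layout = network_graph.layout_provider.graph_layout
node_labels = list(graph_layout.keys())
x, y = zip(*graph_layout.values())
source = ColumnDataSource({'x': x, 'y': y, 'name': node_labels})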
labels = LabelSet(x='x', y='y', text='name', source=source, background_fill_color='white', text_font_size='15px', background_fill_alpha=0.7)
plot.renderers.append(labels)

# Add legend: draw one tiny dummy glyph per profession; legend_label adds it to the plot's legend
for prof, color in profession_color_palette.items():
    dummy_source = ColumnDataSource({'x': [0], 'y': [0]})
    plot.scatter(x='x', y='y', size=1, color=color, source=dummy_source, legend_label=prof)

show(plot)


In [30]:
# Building “Failed: Memoir Network by Professions Grouped Separately”

professions = {
    'Jiddu Krishnamurti': 'Indian Buddhist Practitioner',
    'Nyogen Senzaki': 'Japanese Buddhist Practitioner',
    'Soen Nakagawa': 'Japanese Buddhist Practitioner',
    'Vanda Scaravelli': 'Indian Buddhist Practitioner',
    'William Brugh Joy': 'Spiritualist',
    'Anne Lindbergh': 'Creator',
    'D. T. Suzuki': 'Japanese Buddhist Practitioner',
    'Ruth McCandless': 'Japanese Buddhist Practitioner',
    'Radha Rajagopal Sloss': 'Indian Buddhist Practitioner',
    'Soyen Shaku': 'Japanese Buddhist Practitioner',
    'Eido Tai Shimano': 'Japanese Buddhist Practitioner',
    'Sohaku Ogata': 'Japanese Buddhist Practitioner',
    'Helen Corral': 'Creator',
    'Edith R. Mirrielees': 'Educator',
    'Carolyn Conger': 'Spiritualist',
    'Harriet Doerr': 'Creator',
    'Mary Zimbalist': 'Creator',
    'Sōgen Asahina': 'Japanese Buddhist Practitioner',
    'Robert Baker Aitken': 'Japanese Buddhist Practitioner',
    'Brian Weiss': 'Spiritualist',
    'Wallace Stegner': 'Creator',
    'Shubin Tanahashi': 'Japanese Buddhist Practitioner',
    'Anthony Sokol': 'Educator',
    'Beatrice Wood': 'Creator',
    'Wilfred Stone': 'Educator',
    'Geoffrey Farwell': 'Organizational Personnel',
    'Zentatsu Richard Baker': 'Japanese Buddhist Practitioner',
    'Bill Rample': 'Organizational Personnel',
    'Noel Rodriguez': 'Japanese Buddhist Practitioner',
    'Peter Matthiessen': 'Creator',
    'Felix Wolff': 'Creator',
    'Helen Wolff': 'Creator',
    'Dennis Merzel': 'Japanese Buddhist Practitioner',
    'Manly P. Hall': 'Creator',
    'Paul Reps': 'Creator',
    'Frederick Spiegelberg': 'Educator',
    'John Dodds': 'Educator',
    'Gary Snyder': 'Creator',
    'Jack Kerouac': 'Creator',
    'Reginald Berkeley': 'Creator',
    'George Gurdjieff': 'Spiritualist',
    'Alan Watts': 'Japanese Buddhist Practitioner',
    'Robertson Davies': 'Creator',
    'Sam Zimbalist': 'Creator',
    'Elisabeth Haich': 'Indian Buddhist Practitioner',
    'Selvarajan Yesudian': 'Indian Buddhist Practitioner'
}

# Assign professions to nodes
for node in G.nodes():
    if node in professions:
        G.nodes[node]['profession'] = professions[node]

# Create a color palette for professions
profession_color_palette = {
    'Indian Buddhist Practitioner': Light[7][0],
    'Japanese Buddhist Practitioner': Light[7][1],
    'Creator': Light[7][2],
    'Educator': Light[7][3],
    'Spiritualist': Light[7][4],
    'Organizational Personnel': Light[7][5]
}

# Assign colors to nodes based on profession
for node in G.nodes():
    G.nodes[node]['profession_color'] = profession_color_palette[G.nodes[node]['profession']]

# Define custom positions to group by profession
profession_positions = {}
angle_step = 2 * np.pi / len(profession_color_palette)  # divide circle based on the number of professions
angle = 0
for profession, color in profession_color_palette.items():
    subgraph = G.subgraph([n for n, d in G.nodes(data=True) if d['profession'] == profession])
    # Lay out each profession's subgraph around its own point on a circle of radius 10
    sub_pos = nx.spring_layout(subgraph, center=(10 * np.cos(angle), 10 * np.sin(angle)), scale=1)
    profession_positions.update(sub_pos)
    angle += angle_step

# Normalize positions to fit within a circular layout
# (this projects every node onto the radius-10 circle, discarding the spread within each group)
for key, (x, y) in profession_positions.items():
    norm = (x**2 + y**2)**0.5
    profession_positions[key] = (x / norm * 10, y / norm * 10)

# Define the colors used to highlight nodes and edges
node_highlight_color = 'purple'
edge_highlight_color = 'yellowgreen'

# Define tooltips
HOVER_TOOLTIPS = [
    ("Character", "@index"),
    ("Degree", "@degree"),
    ("Profession", "@profession"),
    ("Profession Color", "$color[swatch]:profession_color"),
]

title = 'Failed: Memoir Network by Professions Grouped Separately'

# Generate a plot with specified dimensions, toolbar options, and title
plot = figure(tooltips=HOVER_TOOLTIPS,
              tools="pan,wheel_zoom,save,reset", active_scroll='wheel_zoom',
              x_range=Range1d(-15, 15), y_range=Range1d(-15, 15), title=title,
              width=800, height=800)

# Construct a network graph object from the network data
network_graph = from_networkx(G, profession_positions)

# Color nodes by profession (node size is fixed at 25)
network_graph.node_renderer.glyph = Scatter(size=25, fill_color='profession_color')

# Specify the colors used to highlight nodes
network_graph.node_renderer.hover_glyph = Scatter(size=25, fill_color=node_highlight_color, line_width=2)
network_graph.node_renderer.selection_glyph = Scatter(size=25, fill_color=node_highlight_color, line_width=2)

# Define the opacity and width for edges
network_graph.edge_renderer.glyph = MultiLine(line_alpha=0.3, line_width=1)

# Assign the colors used to highlight edges
network_graph.edge_renderer.selection_glyph = MultiLine(line_color=edge_highlight_color, line_width=2)
network_graph.edge_renderer.hover_glyph = MultiLine(line_color=edge_highlight_color, line_width=2)

# Enable highlighting of nodes and edges
network_graph.selection_policy = NodesAndLinkedEdges()
network_graph.inspection_policy = NodesAndLinkedEdges()

plot.renderers.append(network_graph)

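# Add Labels to the graph
# Names and coordinates are read from the same layout dict so each label stays with its node
# (the custom layout dict is ordered by profession, not in G.nodes() order)
graph_layout = network_graph.layout_provider.graph_layout
node_labels = list(graph_layout.keys())
x, y = zip(*graph_layout.values())
source = ColumnDataSource({'x': x, 'y': y, 'name': node_labels})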
labels = LabelSet(x='x', y='y', text='name', source=source, background_fill_color='white', text_font_size='15px', background_fill_alpha=0.7)
plot.renderers.append(labels)

# Add legend
# Use one small dummy glyph per profession so each legend entry shows its own color
legend_items = []
for prof, color in profession_color_palette.items():
    dummy_source = ColumnDataSource({'x': [0], 'y': [0]})
    dummy_renderer = plot.scatter(x='x', y='y', size=1, color=color, source=dummy_source)
    legend_items.append(LegendItem(label=prof, renderers=[dummy_renderer], index=0))

legend = Legend(items=legend_items, location="right",
                label_text_font_size="8pt",  # Smaller font size
                glyph_height=10,  # Smaller glyph height
                glyph_width=10,  # Smaller glyph width
                label_standoff=5)  # Smaller distance between label and glyph

plot.add_layout(legend, 'right')

show(plot)


### Result 4 and Result 5

Both attempts failed to separate the nodes into clearly distinct profession groups (arranged horizontally in Result 4 and in a circular layout in Result 5).
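
One possible way to obtain separated profession groups is to compute the positions directly instead of reprojecting spring-layout output onto a circle. The sketch below is a minimal pure-Python illustration under stated assumptions: the function name `grouped_circular_layout`, the two radii, and the six-name sample are hypothetical and not part of the notebook. It returns a `{node: (x, y)}` dictionary, the same shape as `profession_positions` in the cells above.

```python
import math

def grouped_circular_layout(professions, group_radius=10.0, member_radius=2.0):
    """Place each profession on a large circle and its members on a small circle around it.

    `professions` maps node name -> profession label (hypothetical helper, not from the notebook).
    """
    # Collect members per profession, preserving first-seen order
    groups = {}
    for node, prof in professions.items():
        groups.setdefault(prof, []).append(node)

    positions = {}
    for g_idx, (prof, members) in enumerate(groups.items()):
        # Center of this profession's cluster on the outer circle
        g_angle = 2 * math.pi * g_idx / len(groups)
        cx = group_radius * math.cos(g_angle)
        cy = group_radius * math.sin(g_angle)
        for m_idx, node in enumerate(members):
            # Spread the members evenly around the cluster center
            m_angle = 2 * math.pi * m_idx / len(members)
            positions[node] = (cx + member_radius * math.cos(m_angle),
                               cy + member_radius * math.sin(m_angle))
    return positions

# Small excerpt of the notebook's profession labels, for illustration only
sample = {
    'Nyogen Senzaki': 'Japanese Buddhist Practitioner',
    'Soen Nakagawa': 'Japanese Buddhist Practitioner',
    'Jiddu Krishnamurti': 'Indian Buddhist Practitioner',
    'Vanda Scaravelli': 'Indian Buddhist Practitioner',
    'Carolyn Conger': 'Spiritualist',
    'Brian Weiss': 'Spiritualist',
}
layout = grouped_circular_layout(sample)
for name, (px, py) in layout.items():
    print(f"{name}: ({px:.2f}, {py:.2f})")
```

A dictionary like this could be passed to `from_networkx` in place of `profession_positions`, provided it has an entry for every node in `G`. The plot ranges would then need to cover roughly the sum of the two radii in each direction, and the member radius would need to grow for large groups so that their nodes do not overlap.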

## Summary of Tools and Techniques Used
  - spaCy: Used for named entity recognition to identify and extract names from the text.
  - Regular Expressions: Employed to clean the text and handle OCR inconsistencies.
  - Bokeh: A Python visualization library used to create interactive network graphs.
  - Node-Edge Table: Constructed based on co-occurrence frequencies of characters to represent the network connections.
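
The co-occurrence step behind the node-edge table can be sketched as follows. This is a minimal illustration, not the notebook's actual extraction code: the `windows` list is a hypothetical stand-in for the names found in each unit of text, and the size of that unit (sentence, paragraph, or a fixed character span) is an assumption that is not specified here.

```python
from collections import Counter
from itertools import combinations

# Hypothetical stand-in: each inner list holds the names detected in one text window
windows = [
    ['Ruth McCandless', 'Nyogen Senzaki', 'Soen Nakagawa'],
    ['Ruth McCandless', 'Nyogen Senzaki'],
    ['Ruth McCandless', 'Harriet Doerr'],
]

edge_weights = Counter()
for names in windows:
    # Count each unordered pair once per window; sorting makes (a, b) and (b, a) identical
    for a, b in combinations(sorted(set(names)), 2):
        edge_weights[(a, b)] += 1

# Node-edge table rows: source, target, weight
edge_table = [(a, b, w) for (a, b), w in edge_weights.items()]
for row in sorted(edge_table, key=lambda r: -r[2]):
    print(row)
```

Because an edge only records that two names appear near each other, a high weight indicates frequent co-mention rather than a confirmed personal relationship.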


# Results

In this project, network analysis was conducted to illustrate connections between various individuals mentioned in Ruth McCandless's memoir. Ruth McCandless is the central figure in this network, serving as a significant node from which numerous connections radiate outward. The network nodes represent different individuals, and the edges (lines) signify the relationships or connections between them. The nodes are color-coded to represent different communities within the network, identified through modularity-based clustering or manually assigned professions.


# Discussion

## Analysis and Insights

The memoir network analysis reveals the extensive and varied connections of Ruth McCandless with individuals across different fields of expertise. This network graph illustrates her pivotal role in connecting diverse groups, particularly within the realms of Zen Buddhism, Indian philosophy, and the creative industry. The modularity class graph highlights the community structures within her network, showing clusters of closely connected individuals, while the profession-based graphs categorize these individuals by their professional roles, providing a detailed view of their contributions.


Several observations can be drawn from the visual representation:

1. **Dominance of Japanese Buddhist Practitioners**: Japanese Buddhist practitioners are prominently represented in the network, possibly reflecting Ruth McCandless’s own background as a Buddhist and author of two Buddhist books. This suggests a significant presence and influence of Japanese Buddhism within her professional circle.

2. **Close Proximity of Indian Buddhist Practitioners, Japanese Buddhist Practitioners, and Spiritualists**: The graph shows a notable closeness among Indian Buddhist practitioners, Japanese Buddhist practitioners, and spiritualists. This proximity might imply a perceived connection between these groups. However, close reading of the source material may not reveal substantial interactions, indicating that the methodological approach might be skewed by the frequent mention of these names in proximity, rather than actual relationships.

3. **Connections with Other Professions**: The analysis suggests a trend among upper-class individuals, particularly writers and those in the entertainment industry, to explore religion and spirituality in the mid to late 20th century. Ruth McCandless's interactions with various spiritual leaders and philosophers reflect this broader cultural and intellectual curiosity. Some individuals had a genuine interest in Buddhism and sought deeper knowledge about Japanese Buddhism through their relationship with Ruth McCandless, while others were acquaintances from upper social circles.

4. **Interest in Religion in the Literary and Creative Circles**: The graph underscores a notable interest in religion among people in the literary and creative fields during the 20th century. This period saw prominent figures seeking deeper understanding and experiences beyond their professional domains. Religious groups, including Buddhists, were not isolated but rather deeply integrated into broader societal networks.

**Summary**: The graph illustrates the interconnectedness of various belief systems and professional domains within the social fabric of the mid-20th century. It highlights the trend among upper-class individuals, especially in the literary and creative circles, to explore religion and spirituality, reflecting a broader cultural and intellectual curiosity. This interconnectivity underscores the significant role of religious and spiritual interests in shaping social and professional relationships during that era.
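
The proximity between profession groups noted in the observations above can be quantified by counting how many edges fall within a group and how many run between groups. The sketch below is only an illustration under assumptions: the `edges` list is hypothetical and not taken from the memoir data, while the `labels` entries reuse the profession labels assigned in the notebook.

```python
from collections import Counter

# Hypothetical edges for illustration; not taken from the memoir data
edges = [
    ('Ruth McCandless', 'Nyogen Senzaki'),
    ('Ruth McCandless', 'Harriet Doerr'),
    ('Nyogen Senzaki', 'Soen Nakagawa'),
    ('Ruth McCandless', 'Jiddu Krishnamurti'),
]
# Profession labels for the names used above (six-category scheme)
labels = {
    'Ruth McCandless': 'Japanese Buddhist Practitioner',
    'Nyogen Senzaki': 'Japanese Buddhist Practitioner',
    'Soen Nakagawa': 'Japanese Buddhist Practitioner',
    'Harriet Doerr': 'Creator',
    'Jiddu Krishnamurti': 'Indian Buddhist Practitioner',
}

pair_counts = Counter()
for u, v in edges:
    # Sort the two labels so (A, B) and (B, A) land in the same bucket
    pair = tuple(sorted((labels[u], labels[v])))
    pair_counts[pair] += 1

for (p1, p2), n in pair_counts.most_common():
    kind = 'within' if p1 == p2 else 'between'
    print(f"{kind}: {p1} / {p2} = {n}")
```

Comparing these counts with a close reading of the memoir is one way to judge whether an apparent closeness between two groups reflects real interactions or only frequent co-mention.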

## Utility of Findings

Understanding the structure of Ruth McCandless's network and the broader cultural trends it represents can have several applications:

1. Historical Analysis: Identifying key figures and their connections offers insights into historical relationships and influences, particularly in the context of the American Zen Buddhist movement and its development.

2. Literary Studies: Analyzing the network of authors, thinkers, and creators can reveal underlying connections and thematic links in their works, reflecting their spiritual and intellectual engagements.

3. Social Network Analysis: This visualization helps in understanding social structures, key influencers, and the dynamics of relationships within a group, providing a framework for studying other historical or contemporary networks.

The interactive nature of the graph, with its highlighting features and detailed tooltips, makes it a powerful tool for researchers and analysts to explore the network deeply. By examining degrees and classes, one can derive significant insights about the roles and influences of different individuals within the network.
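
The degree shown in the tooltips is simply the number of edges attached to a node. As a minimal sketch of how such a ranking can be derived, assuming a hypothetical edge list rather than the project's real node-edge table:

```python
from collections import Counter

# Hypothetical edge list for illustration; the real edges come from the node-edge table
edges = [
    ('Ruth McCandless', 'Nyogen Senzaki'),
    ('Ruth McCandless', 'Soen Nakagawa'),
    ('Ruth McCandless', 'Harriet Doerr'),
    ('Nyogen Senzaki', 'Soen Nakagawa'),
]

# Each edge adds one to the degree of both of its endpoints
degree = Counter()
for u, v in edges:
    degree[u] += 1
    degree[v] += 1

# Rank individuals from most to least connected
for name, deg in degree.most_common():
    print(f"{name}: {deg}")
```

In a memoir network the narrator will usually rank first, so the more informative comparison is often among the remaining individuals.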

Overall, the memoir network analysis not only sheds light on Ruth McCandless's significant contributions but also reflects broader cultural and intellectual trends of her time, providing a valuable resource for historical, literary, and social studies.

# Bibliography


[Review of “Iron Flute, The. Trans. and Ed. by” Nyogen Senzaki and Ruth Strout McCandless (Book Review)]. (1961). The Middle Way, 36, 84-. London: Buddhist Society UK.

Beveridge, A., & Shan, J. (2016). Network of Thrones. Math Horizons, 23(4), 18–22. https://doi.org/10.4169/mathhorizons.23.4.18

Kitagawa, J. M. (1961). [Review of The Iron Flute; Zen Flesh, Zen Bones: A Collection of Zen and Pre-Zen Writings, by N. Senzaki, R. S. McCandless, & P. Reps]. The Journal of Religion, 41(4), 321–322. http://www.jstor.org/stable/1200996

Labatut, V., & Bost, X. (2019). Extraction and Analysis of Fictional Character Networks: a survey. In Figshare. https://doi.org/10.6084/m9.figshare.7993040.v3

Levine, G. P. A. (2017). Zen Sells. In Long Strange Journey: On Modern Zen, Zen Art, and Other Predicaments (pp. 195–238). University of Hawai’i Press. http://www.jstor.org/stable/j.ctvvn1jn.14

Love, J. P. (1963). [Review of The Iron Flute. 100 Zen Kōan with commentary by Genrō, Fūgai, and Nyōgen., by N. Senzaki & R. S. McCandless]. Monumenta Nipponica, 18(1/4), 386–387. https://doi.org/10.2307/2383157

Martino, R. D. (1954). [Review of Nyogen Senzaki and Ruth Strout McCandless: Buddhism and Zen (Book Review)]. The Review of Religion, 19(1), 54-. New York: Columbia University Press.

McCandless (Ruth S.) collection on Nyogen Senzaki. (n.d.). http://www.oac.cdlib.org/findaid/ark:/13030/c8wh2vs5/

Named Entity Recognition — Introduction to Cultural Analytics & Python. (n.d.). https://melaniewalsh.github.io/Intro-Cultural-Analytics/05-Text-Analysis/12-Named-Entity-Recognition.html#get-named-entities

Ōryū, G. (2000). The Iron Flute: 100 Zen Kōans. Tuttle Publishing.

Ruth Strout McCandless. (n.d.). GALE LITERATURE RESOURCE CENTER. https://go.gale.com/ps/i.do?p=LitRC&u=uclosangeles&id=GALE%7CH1000065800&v=2.1&it=r&sid=summon

Sasaki, R. F. (1960). A Bibliography of Translations of Zen (Ch’an) Works. Philosophy East and West, 10(3/4), 149–166. https://doi.org/10.2307/1397013

Senzaki, N., & McCandless, R. S. (1988). Buddhism and Zen. Macmillan.

spaCy 101: Everything you need to know · spaCy Usage Documentation. (n.d.). spaCy 101: Everything You Need to Know. https://spacy.io/usage/spacy-101