This repository contains various Jupyter notebooks pertaining to the use of stylometric analysis for Chinese literature. These notebooks were originally created for my DHAsia workshop at Stanford in February 2016.
corpus
LICENSE
README.md
Stanford DH Asia Python Basics.ipynb
Stanford DH Asia Stylometry.ipynb
Stanford Workshop Streamlined, HCA.ipynb
Stanford Workshop Streamlined, PCA.ipynb
instructions.pptx
metadata.txt
test.txt


chinese_stylometry

This repository contains various Jupyter notebooks pertaining to the use of stylometric analysis for Chinese literature. These notebooks were originally created for my DHAsia workshop at Stanford in February 2016.

Feel free to use them and modify them for your own research, but please cite this repository.

The Python Basics notebook contains a quick rundown of the basic knowledge necessary to understand the code used in the other notebooks.

The Stylometry notebook contains a detailed explanation of how to conduct stylometric analysis. This includes both hierarchical cluster analysis and principal component analysis.
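As a rough illustration of the hierarchical clustering step, here is a minimal sketch that uses only the standard library. It is not the code from the notebooks. The function names (`char_frequencies`, `manhattan`, `average_linkage`), the toy texts, the choice of character-level features, the Manhattan distance, and average linkage are all assumptions made for the example.

```python
from collections import Counter
from itertools import combinations


def char_frequencies(texts, n_features=4):
    """Relative frequency, per text, of the corpus-wide most common characters."""
    # Drop whitespace; real corpora would likely also need punctuation removed.
    cleaned = {name: [ch for ch in text if not ch.isspace()]
               for name, text in texts.items()}
    totals = Counter()
    for chars in cleaned.values():
        totals.update(chars)
    vocab = [ch for ch, _ in totals.most_common(n_features)]
    table = {}
    for name, chars in cleaned.items():
        counts = Counter(chars)
        table[name] = [counts[ch] / len(chars) for ch in vocab]
    return vocab, table


def manhattan(a, b):
    """Sum of absolute differences between two frequency vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def average_linkage(table):
    """Naive agglomerative clustering; returns the merges in the order made."""
    clusters = [(name,) for name in table]
    merges = []
    while len(clusters) > 1:
        def dist(pair):
            left, right = pair
            total = sum(manhattan(table[a], table[b])
                        for a in left for b in right)
            return total / (len(left) * len(right))

        # Merge the two clusters with the smallest average pairwise distance.
        left, right = min(combinations(clusters, 2), key=dist)
        merges.append((left, right, dist((left, right))))
        clusters = [c for c in clusters if c not in (left, right)]
        clusters.append(left + right)
    return merges


# Toy samples (made up): two "classical-like" and two "vernacular-like" strings.
texts = {
    "a1": "之乎者也之乎者也之之也",
    "a2": "之乎者也之之乎也者之",
    "b1": "的了是我的了的是我的",
    "b2": "的了我是的的了是我了",
}

vocab, table = char_frequencies(texts)
merges = average_linkage(table)
for left, right, distance in merges:
    print(left, "+", right, round(distance, 3))
```

The merge order is what a dendrogram draws: texts joined early are stylistically close under the chosen features, and the final merge joins the two most dissimilar groups.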

The two streamlined notebooks contain only the code needed to run PCA and HCA. Variables that the user should adjust are highlighted with comments.
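To give a sense of what a streamlined PCA script with a clearly marked "adjust here" section might look like, here is a short sketch. It assumes NumPy is installed. The variable names and the frequency values are invented for illustration and are not taken from the notebooks, which may well use a different library for PCA.

```python
import numpy as np

# --- Variables to adjust (hypothetical names, not from the notebooks) ---
N_COMPONENTS = 2  # how many principal components to keep

# Rows are texts, columns are relative frequencies of common characters.
# These numbers are made up for the example.
labels = ["a1", "a2", "b1", "b2"]
freqs = np.array([
    [0.36, 0.00, 0.27, 0.00],
    [0.40, 0.00, 0.20, 0.00],
    [0.00, 0.40, 0.00, 0.20],
    [0.00, 0.30, 0.00, 0.30],
])

# PCA via singular value decomposition of the mean-centered matrix.
centered = freqs - freqs.mean(axis=0)
U, S, Vt = np.linalg.svd(centered, full_matrices=False)

# Coordinates of each text on the first N_COMPONENTS components.
scores = U[:, :N_COMPONENTS] * S[:N_COMPONENTS]

# Share of total variance captured by each component.
explained = (S ** 2) / np.sum(S ** 2)

for label, row in zip(labels, scores):
    print(label, np.round(row, 3))
print("explained variance ratio:", np.round(explained[:N_COMPONENTS], 3))
```

Plotting the first two columns of `scores` as x and y coordinates gives the usual PCA scatter plot, where texts with similar character usage land near each other.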

The general shape of my approach has been influenced by many people, but the methods described in this workshop draw most heavily on stylometric work by Christof Schöch (http://dragonfly.hypotheses.org/), the Computational Stylistics Group's "stylo" package (https://sites.google.com/site/computationalstylistics/), and the following two articles:

J.F. Burrows and D.H. Craig, “Lyrical Drama and the ‘Turbid Mountebanks’: Styles of Dialogue in Romantic and Renaissance Tragedy,” Computers and the Humanities 28 (1994): 63-86.

J.N.G. Binongo and M.W.A. Smith, “The Application of Principal Component Analysis to Stylometry,” Literary and Linguistic Computing 14 (1999): 445-466.
