This is code that we will cover in my Hacking the Humanities class at Leiden University. Video tutorials will be uploaded to my YouTube channel at https://www.youtube.com/channel/UCSarHXwz_HKtiZ3vNTX1rfw
01_ultrabasic1_comments.py
01_ultrabasic2_types.py
02_variables1.py
02_variables2_fstring.py
03_strings1.py
03_strings2_count.py
03_strings3_concat1.py
03_strings3_concat2.py
03_strings3_concat3.py
03_strings4_len.py
03_strings5_substr1.py
03_strings5_substr2.py
03_strings6_find1.py
03_strings7_find2.py
03_strings8_replace.py
03_strings9_methods.py
04_math1.py
04_math2_strings.py
05_changingtypes.py
06_lists1.py
06_lists2_indexing.py
06_lists3_changing.py
06_lists4_toandfrom.py
06_lists5_methods.py
07_booleans1.py
07_booleans2_nonnumeric.py
07_booleans3_if.py
07_booleans4_ifelse1.py
07_booleans4_ifelse2.py
07_booleans5_multiconditions.py
07_booleans6_innot.py
08_loops1_while.py
08_loops2_for.py
08_loops2_forb_liststring.py
09_nestedblocks.py
10_files.py
11_dictionaries1.py
11_dictionaries2_keyvalue.py
12_mostcommoncharacter1.py
12_mostcommoncharacter2.py
12_mostcommoncharacter3.py
12_mostcommoncharacter4.py
12_mostcommoncharacter5.py
13_commonwords1.py
13_commonwords2.py
13_commonwords3.py
13_commonwords4.py
14_errors1.py
14_errors2.py
14_errors3.py
14_errors4.py
15_functions1.py
15_functions2_input.py
15_functions3_multiinput.py
15_functions4_keywordinput.py
15_functions5_output.py
16_libraries.py
17_nltk1.py
17_nltk2_tokenize1.py
17_nltk2_tokenize2.py
17_nltk3_textobject1.py
17_nltk3_textobject2.py
17_nltk4_basicanalysis.py
17_nltk5_stemming.py
18_regexes01.py
18_regexes02_matchobject.py
18_regexes03_specialchar.py
18_regexes04_finditer.py
18_regexes05_sets.py
18_regexes06_escape.py
18_regexes07_multiples.py
18_regexes08_groups.py
18_regexes09_greedy.py
18_regexes10_or.py
18_regexes11_replace.py
19_pandas1_series1.py
19_pandas1_series2.py
19_pandas2_dataframe1.py
19_pandas2_dataframe2.py
19_pandas2_dataframe3.py
19_pandas2_dataframe4.py
19_pandas2_dataframe5.py
19_pandas2_dataframe6.py
19_pandas2_dataframe7.py
20_corpusrep1_dividing.py
20_corpusrep2_openfiles.py
20_corpusrep3_tdm.py
20_corpusrep4_tfidf.py
20_corpusrep5_fast.py
21_viz_matplotlib01_line.py
21_viz_matplotlib02_twolines.py
21_viz_matplotlib03_title.py
21_viz_matplotlib04_save.py
21_viz_matplotlib05_xvsy.py
21_viz_matplotlib06_scatter.py
21_viz_matplotlib07_sizeandcolor.py
21_viz_matplotlib08_scforeach.py
21_viz_matplotlib09_subplot.py
21_viz_matplotlib10_subplotlabels.py
21_viz_matplotlib11_barchart.py
21_viz_matplotlib12_seaborn.py
22_stylometry1_hca.py
22_stylometry2_color.py
22_stylometry3_federalist.py
22_stylometry4_pca.py
22_stylometry5_pcafed.py
22_stylometry6_loadings.py
23_gensim1_prep.py
23_gensim2_topicmodel1.py
23_gensim2_topicmodel2.py
23_gensim2_topicmodel3.py
23_gensim2_topicmodel4.py
23_gensim2_topicmodel5.py
23_gensim3_word2vec.py
24_web1_urllib1.py
24_web1_urllib2.py
24_web2_wikiapi.py
24_web3_json.py
24_web4_moreapi.py
24_web5_soup.py
LICENSE
README.md
federalist.txt
holmes.txt
weather.csv


This repository contains all of the code we will cover in the Fall 2018 edition of my Hacking the Humanities class, which I teach at Leiden University as part of the Digital Humanities minor program.

This code accompanies the Hacking the Humanities tutorial videos that I filmed (https://www.youtube.com/playlist?list=PL6kqrM2i6BPIpEF5yHPNkYhjHm-FYWh17). The goal is to give students a resource so that we can spend less class time on the technical details of coding and more on how to use this code to study human culture.

While this should serve as a functional introduction to Python, it is not meant to be comprehensive. Instead, it is designed to help students with no knowledge of coding at all learn how to do text analysis and data visualization.

This code assumes you have installed the Anaconda distribution of Python 3. It should be compatible with Python 3.6 and up.
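If you are unsure which Python version you have, a short check like the one below can tell you. This is only a sketch and is not one of the course files; the function name `python_is_supported` is made up for illustration, and the minimum of 3.6 is taken from the sentence above.

```python
import sys

# Minimum version mentioned in this README (Python 3.6 and up).
MIN_VERSION = (3, 6)


def python_is_supported(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True if the (major, minor) version is at least the minimum."""
    # Compare only the major and minor numbers, e.g. (3, 7) >= (3, 6).
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    if python_is_supported():
        print("Python version OK:", sys.version.split()[0])
    else:
        print("Please upgrade to Python 3.6 or newer.")
```

The check avoids f-strings on purpose, so that it can still run and print the warning on an interpreter older than 3.6.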

As of October 1, 2018, the code is complete, but I will keep updating it as I catch errors.

Note

Code in the stylometry files opens texts from a folder called "corpus". This folder is created by the code in 20_corpusrep1_dividing.py, so run that file before you try to run the stylometry code. The two Federalist Papers examples (22_stylometry3_federalist.py and 22_stylometry5_pcafed.py) do NOT require the corpus folder, so feel free to run those immediately.
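To see whether the corpus folder is already in place before running the stylometry code, you could use a check along these lines. It is a minimal sketch, not part of the course files: the helper name `corpus_ready` is hypothetical, and it assumes the folder sits in the directory you run Python from.

```python
from pathlib import Path


def corpus_ready(folder="corpus"):
    """Return True if the folder exists and has at least one entry in it."""
    path = Path(folder)
    # is_dir() is False for a missing path, so iterdir() is only
    # called on a folder that really exists.
    return path.is_dir() and any(path.iterdir())


if __name__ == "__main__":
    if corpus_ready():
        print("Found a non-empty corpus folder.")
    else:
        print("No corpus folder found. Run 20_corpusrep1_dividing.py first.")
```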

Updates

Renamed files to aid organization. Added DataFrames, corpus basics, matplotlib, stylometry, topic modeling, word2vec, APIs, and web scraping.