ipython notebooks for processing bunraku collection data @cul 🇯🇵 🎎
# bunraku-ipy

Jupyter notebooks etc. for processing data from the Barbara Curtis Adachi Bunraku (Japanese Puppet Theater) Collection.

## pipeline(s)

### online collection data / bunraku-online.ipynb

Start: CakePHP site powered by a relational MySQL database

1. MySQL dump to CSVs
2. Import CSVs into IPython as pandas DataFrames
3. Merge relational data (from CSV join tables) onto DataFrames by type
4. Export DataFrames as JSON records (and CSVs, for archival purposes only)
5. Drop null key:value pairs from the JSON (jq)
6. Convert the null-free JSON to YAML (PyYAML)
7. Generate Jekyll collections (and pages) from the YAML using the Yaml-Splitter plugin

End: static Jekyll site powered by YAML data, with a JSON index for static search
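Steps 2–6 above can be sketched in a few lines of Python. The table and column names below are hypothetical stand-ins for the real MySQL exports, and the null-dropping is done here in Python for illustration (the actual pipeline shells out to jq); PyYAML is assumed to be installed.

```python
import json
import pandas as pd
import yaml  # PyYAML

# Step 2: load CSVs exported from the MySQL dump
# (inlined here as DataFrames; real data comes from pd.read_csv)
plays = pd.DataFrame({"play_id": [1, 2],
                      "title": ["Chushingura", "Sonezaki Shinju"]})
performances = pd.DataFrame({"perf_id": [10, 11],
                             "play_id": [1, 2],
                             "year": [1983, None]})

# Step 3: merge relational data onto the main DataFrame via the join key
merged = performances.merge(plays, on="play_id", how="left")

# Step 4: export as JSON records
records = json.loads(merged.to_json(orient="records"))

# Step 5: drop null key:value pairs (the real pipeline uses jq for this)
no_nulls = [{k: v for k, v in rec.items() if v is not None}
            for rec in records]

# Step 6: convert the cleaned records to YAML for Jekyll
yaml_text = yaml.safe_dump(no_nulls, allow_unicode=True)
print(yaml_text)
```

Dropping nulls before the YAML step keeps the Jekyll front matter free of empty fields.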

### total collection data / bunraku-full.ipynb

The data accessible on the original PHP site (and on the new Jekyll site) represents only about 60% of the information stored in the MySQL database. To preserve the rest for future use, I used a separate IPython notebook/pipeline to output CSVs and JSON in which images/media marked 'offline' are not dropped.

## stats

There is also a Jupyter notebook, bunraku-stats.ipynb, for generating matplotlib graphs and D3-specific/refactored JSON for data visualization.
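"Refactoring" flat records into the nested shape that D3 hierarchy layouts expect might look like the following (the field names and grouping are hypothetical, not taken from the actual notebook):

```python
import json
from collections import defaultdict

# Flat performance records (hypothetical sample data)
records = [
    {"play": "Chushingura", "year": 1983},
    {"play": "Chushingura", "year": 1990},
    {"play": "Sonezaki Shinju", "year": 1985},
]

# Group by play, then emit the {"name": ..., "children": [...]}
# nesting that D3 hierarchy layouts consume
groups = defaultdict(list)
for rec in records:
    groups[rec["play"]].append({"name": rec["year"]})

d3_tree = {
    "name": "performances",
    "children": [{"name": play, "children": kids}
                 for play, kids in groups.items()],
}
print(json.dumps(d3_tree, indent=2))
```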