Jupyter notebooks, etc. for processing data from the Barbara Curtis Adachi Bunraku (Japanese Puppet Theater) Collection.
|CakePHP site powered by a relational MySQL database|
|1||MySQL dump to CSVs|
|2||Import CSVs into IPython as pandas DataFrames|
|3||Merge relational data (from CSV join tables) onto DataFrames by type|
|4||Export DataFrames as JSON records (and CSVs, for archival purposes only)|
|5||Drop null key:value pairs from JSON (bash, jq)|
|6||Convert the null-free JSON to YAML (bash, PyYAML)|
|7||Generate Jekyll collections (and pages) from YAML using the Yaml-Splitter plugin|
|Static Jekyll site powered by YAML data, with a JSON index for static search|
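Steps 2 through 5 can be sketched roughly as follows. This is a minimal illustration and not the notebooks' actual code: the table names, column names, and sample rows (`plays`, `authors`, `play_id`, `author_id`) are hypothetical, the CSVs are inlined as strings instead of being read from the MySQL dump, and the null-dropping step is done in Python here rather than with jq.

```python
import io
import json

import pandas as pd

# Hypothetical stand-ins for CSVs dumped from MySQL (step 1).
PLAYS_CSV = "id,title\n1,Play A\n2,Play B\n"
AUTHORS_CSV = "id,name\n10,Author X\n11,Author Y\n"
# Hypothetical join table linking plays to authors.
AUTHORS_PLAYS_CSV = "play_id,author_id\n1,10\n1,11\n"


def load(csv_text):
    """Step 2: read one CSV into a DataFrame."""
    return pd.read_csv(io.StringIO(csv_text))


def merge_authors(plays, authors, links):
    """Step 3: attach a list of author names to each play via the join table."""
    named = links.merge(authors, left_on="author_id", right_on="id")
    grouped = named.groupby("play_id")["name"].agg(list).rename("authors")
    # A left join keeps plays that have no linked authors.
    return plays.merge(grouped, left_on="id", right_index=True, how="left")


def to_records(df):
    """Step 4: export as JSON records (missing values become null)."""
    return json.loads(df.to_json(orient="records"))


def drop_nulls(records):
    """Step 5: remove key:value pairs whose value is null."""
    return [{k: v for k, v in rec.items() if v is not None} for rec in records]


merged = merge_authors(load(PLAYS_CSV), load(AUTHORS_CSV), load(AUTHORS_PLAYS_CSV))
clean = drop_nulls(to_records(merged))
print(json.dumps(clean, indent=2))
```

Dropping nulls before the YAML conversion keeps the generated Jekyll front matter free of empty keys, so a record only carries the fields it actually has.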
The data accessible on the original PHP site (and on the new Jekyll site) represents only about 60% of the information stored in the MySQL database. To preserve the remaining information for future use, I used a separate Jupyter notebook/pipeline to output CSVs and JSON in which images/media marked 'offline' were not dropped.
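The difference between the site pipeline and the archival pipeline can be pictured as a single filter that is either applied or skipped. The sketch below assumes a hypothetical `status` column holding the value `offline`; the real database may record this flag differently.

```python
import pandas as pd

# Hypothetical media table; the "status" column name is an assumption.
media = pd.DataFrame(
    {
        "id": [1, 2, 3],
        "filename": ["a.jpg", "b.jpg", "c.jpg"],
        "status": ["online", "offline", "online"],
    }
)


def select_media(df, keep_offline=False):
    """Return all rows for the archive, or only non-offline rows for the site."""
    if keep_offline:
        return df.copy()
    return df[df["status"] != "offline"].copy()


site_media = select_media(media)                        # site pipeline
archive_media = select_media(media, keep_offline=True)  # archival pipeline
```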
There is also a Jupyter notebook (bunraku-stats.ipynb) for generating matplotlib graphs and D3-specific, refactored JSON for data visualization.
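As an illustration of what refactoring records into D3-friendly JSON might look like, the sketch below counts records per category and emits a list of `{"name", "value"}` objects, a shape many D3 examples consume. The field name `kind` and the sample records are hypothetical and are not taken from bunraku-stats.ipynb.

```python
import json
from collections import Counter

# Hypothetical records; "kind" is an assumed field name.
records = [
    {"title": "Play A", "kind": "x"},
    {"title": "Play B", "kind": "y"},
    {"title": "Play C", "kind": "x"},
]


def to_d3_counts(rows, field):
    """Count rows per value of `field`, most frequent first."""
    counts = Counter(row[field] for row in rows if field in row)
    return [{"name": name, "value": n} for name, n in counts.most_common()]


d3_data = to_d3_counts(records, "kind")
print(json.dumps(d3_data))
```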