jayhack/inpixels09-archive

inpixels09 — archive

A complete archive of the WordPress.com blog "in Pixels" (inpixels09.wordpress.com), written by Jay Hack between September 2009 and January 2010 during a 5.5-month student exchange in Nagano, Japan.

The archive captures all 26 published posts, their original rendered HTML pages, and every embedded image, in formats that should outlive the original site.

Rendered site: https://jayhack.github.io/inpixels09-archive/

What's in here

.
├── README.md
├── _config.yml                 # Jekyll config for GitHub Pages
├── Gemfile                     # Pinned to the github-pages gem for local previews
├── index.md, about.md          # Landing pages for the rendered site
├── _posts/                     # Jekyll-friendly copies of each post (ASCII slugs, rewritten image paths)
├── scripts/
│   ├── build_archive.py        # Rebuilds posts-md/, mirror/, assets/ from posts-json/all-posts.json
│   └── build_jekyll.py         # Rebuilds _posts/ from posts-md/
├── posts-json/
│   ├── site.json               # Site metadata from the WordPress.com REST API
│   ├── tags.json               # All tags
│   ├── categories.json         # All categories
│   ├── all-posts.json          # Single-file dump of every post (source of truth)
│   └── posts/                  # One pretty-printed JSON file per post
├── posts-md/
│   ├── INDEX.md                # Chronological table of contents
│   └── YYYY-MM-DD-<slug>.md    # One Markdown file per post (with YAML frontmatter)
├── mirror/
│   └── YYYY-MM-DD-<slug>.html  # The original rendered HTML page for each post
└── assets/
    └── <slug>/                 # Images embedded in each post, downloaded locally

Format choices

Three independent representations are kept on purpose, because each one fails in a different way over time:

  1. JSON (posts-json/) — structured, machine-readable, lossless. Pulled straight from the WordPress.com public REST API (/sites/<host>/posts). Contains the full HTML body, dates, tags, categories, author info, etc. This is the canonical source the other formats are derived from.
  2. Markdown (posts-md/) — human-readable, plain-text, future-proof. Image links in the Markdown have been rewritten to point at the local copies in assets/, so you can read these files entirely offline.
  3. HTML mirror (mirror/) — the original rendered page, exactly as WordPress.com served it at archive time. Useful if you ever want to see the posts with their original styling and surrounding chrome.
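
As a rough illustration of working with the canonical JSON directly, the sketch below lists every post in chronological order. It assumes the usual WordPress.com v1.1 response shape (a top-level "posts" list, with "tags" as an object keyed by tag name); the function names list_posts and load_posts are hypothetical and are not part of the scripts in this repository.

```python
import json
from pathlib import Path


def list_posts(payload):
    """Return (date, title, tag_names) tuples sorted oldest-first.

    Assumption: payload has a top-level "posts" list whose items carry
    "date", "title", and a "tags" object keyed by tag name.
    """
    rows = []
    for post in payload.get("posts", []):
        tag_names = sorted(post.get("tags") or {})
        rows.append((post["date"][:10], post["title"], tag_names))
    return sorted(rows)


def load_posts(path="posts-json/all-posts.json"):
    """Read the single-file dump from disk and summarize it."""
    with Path(path).open(encoding="utf-8") as fh:
        return list_posts(json.load(fh))
```

Calling load_posts() from the repository root should yield one tuple per post, which is enough to cross-check against posts-md/INDEX.md.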

Rebuilding

The Markdown, HTML mirror, and image assets are all regenerable from posts-json/all-posts.json:

python3 scripts/build_archive.py

The script is dependency-free (standard library only) and idempotent — it skips any HTML page or image that is already present on disk.
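
The skip-if-present behaviour described above can be sketched roughly as follows. This is a minimal standard-library illustration, not the code in scripts/build_archive.py; fetch_if_missing, download, and the ".part" temporary suffix are all hypothetical names chosen for the example.

```python
import urllib.request
from pathlib import Path


def download(url):
    """Fetch a URL and return its body as bytes (needs network access)."""
    with urllib.request.urlopen(url) as resp:
        return resp.read()


def fetch_if_missing(url, dest, fetch=download):
    """Write the contents of url to dest unless dest already exists.

    Returns True if a download happened, False if the file was skipped.
    """
    dest = Path(dest)
    if dest.exists():
        return False  # already archived; leave the existing copy untouched
    dest.parent.mkdir(parents=True, exist_ok=True)
    # Write under a temporary name first so an interrupted run does not
    # leave a partial file that a later run would mistake for a finished one.
    tmp = dest.with_name(dest.name + ".part")
    tmp.write_bytes(fetch(url))
    tmp.replace(dest)
    return True
```

Running a function like this twice over the same list of URLs does the work once and then does nothing, which is what makes a rebuild safe to repeat.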

To regenerate the Jekyll-friendly _posts/ from the portable Markdown:

python3 scripts/build_jekyll.py
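
For a sense of what "ASCII slugs" might involve, here is a hedged sketch of turning a WordPress slug into a Jekyll post filename. How build_jekyll.py actually derives its slugs is not documented here, so ascii_slug, jekyll_filename, and the "post" fallback are assumptions made purely for illustration.

```python
import re
import unicodedata
from urllib.parse import unquote


def ascii_slug(slug, fallback="post"):
    """Reduce a slug to lowercase ASCII letters, digits, and hyphens."""
    text = unquote(slug)  # slugs for non-Latin titles may be percent-encoded
    text = unicodedata.normalize("NFKD", text)  # split accents from letters
    text = text.encode("ascii", "ignore").decode("ascii").lower()
    text = re.sub(r"[^a-z0-9]+", "-", text).strip("-")
    # A fully non-Latin title leaves nothing behind, so use the fallback.
    return text or fallback


def jekyll_filename(date_iso, slug, fallback="post"):
    """Build the YYYY-MM-DD-<slug>.md name Jekyll expects under _posts/."""
    return f"{date_iso[:10]}-{ascii_slug(slug, fallback)}.md"
```

A real script would probably pass something unique per post (the post ID, say) as the fallback, so that two non-Latin titles published on the same day cannot collide.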

To preview the rendered Pages site locally:

bundle install
bundle exec jekyll serve

Re-fetching from WordPress.com

If you ever want to pull a fresh copy of the source data from the live site:

HOST=inpixels09.wordpress.com
curl -sS "https://public-api.wordpress.com/rest/v1.1/sites/$HOST" -o posts-json/site.json
curl -sS "https://public-api.wordpress.com/rest/v1.1/sites/$HOST/posts/?number=100&fields=ID,date,modified,title,URL,slug,excerpt,content,tags,categories,author" -o posts-json/all-posts.json
curl -sS "https://public-api.wordpress.com/rest/v1.1/sites/$HOST/tags?number=200" -o posts-json/tags.json
curl -sS "https://public-api.wordpress.com/rest/v1.1/sites/$HOST/categories?number=200" -o posts-json/categories.json
curl -sS "https://public-api.wordpress.com/rest/v1.1/sites/$HOST/comments?number=100" -o posts-json/all-comments.json
python3 scripts/build_archive.py
python3 scripts/build_jekyll.py
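
If you would rather script the fetch in Python than in curl, the request URL for the posts dump can be built as sketched below. The helper names posts_url and is_complete are hypothetical, and the completeness check assumes the response carries a "found" count alongside the "posts" list, as v1.1 responses normally do.

```python
import json
import urllib.request
from urllib.parse import urlencode

API_ROOT = "https://public-api.wordpress.com/rest/v1.1/sites"
POST_FIELDS = "ID,date,modified,title,URL,slug,excerpt,content,tags,categories,author"


def posts_url(host, number=100, fields=POST_FIELDS):
    """Build the same posts URL as the curl command above."""
    # safe="," keeps the commas in the field list unescaped.
    query = urlencode({"number": number, "fields": fields}, safe=",")
    return f"{API_ROOT}/{host}/posts/?{query}"


def is_complete(payload):
    """True if a single response already holds every post the site reports."""
    return len(payload.get("posts", [])) >= payload.get("found", 0)


def fetch_posts(host):
    """Download the posts dump and return the parsed JSON (needs network)."""
    with urllib.request.urlopen(posts_url(host)) as resp:
        return json.load(resp)
```

With 26 posts, one request of 100 covers the whole blog; is_complete is only a guard in case the same approach is reused on a larger site.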

Posts

See posts-md/INDEX.md for the full chronological list. The first post is Tokyo (2009-09-06) and the last is 消滅のやり方 / How to Disappear Completely (2010-01-09).
