
whusterj.github.io

My website.

To set up a new dev environment, see Installation.md.

Then run:

jekyll serve

GitHub Pages is integrated with Jekyll and automatically builds and deploys this site. See the GitHub Pages documentation for details.

Hash Styles

I know I should automate this, but for now I append a hash to style.css in the default.html template:

md5 static/css/style.css
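
This could be automated with a short script. Here's a rough sketch, assuming the stylesheet link in _layouts/default.html carries a ?v=... query string (that file path and query-string convention are my assumptions):

# Hypothetical automation: stamp the stylesheet's current hash into the template.
# `md5 -q` and `sed -i ''` are the macOS/BSD forms; on Linux, use
# `md5sum static/css/style.css | cut -d' ' -f1` and plain `sed -i`.
HASH=$(md5 -q static/css/style.css)
sed -i '' "s/style\.css?v=[0-9a-f]*/style.css?v=$HASH/" _layouts/default.html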

Handling Python Notebooks

I've started doing more work in Python notebooks and would like to develop a publishing workflow to this blog. Here are the steps I'm taking right now, but it's a bit manual:

  1. Export the notebook to markdown:

jupyter nbconvert --to markdown my_notebook.ipynb

  2. Create a new post in _posts and copy-paste the markdown.

  3. Move the images to the static/images directory and update the links in the markdown. Make sure to include the absolute_url directive - this is the most laborious step (see the sketch after this list).

  4. Convert LaTeX statements to MathJax. This is also laborious, because you can't simply substitute the delimiters with a find-and-replace.
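
For step 3, each image link in the exported markdown ends up looking something like the line below. This is just a sketch: absolute_url is Jekyll's built-in Liquid filter, but the path and filename here are placeholders.

![png]({{ "/static/images/my_notebook_files/output_3_0.png" | absolute_url }})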

I've added my first notebook post, and I'm also seeing some CSS issues. I will try to correct these in the main stylesheet, but it may add more steps.

Handling Images

Extract Metadata from Photos

Short version: use gen_photo_frontmatter.sh

In early 2022 I added a 'photos' page and photos collection. To make this easier to manage and the information more interesting, I'm keeping the details in the image metadata and extracting it to the markdown frontmatter using exiftool.

To install the tool:

sudo apt install libimage-exiftool-perl

And use it to extract some basic image info that we can copy-paste into the markdown frontmatter:

exiftool *.jpg -S -Title -ImageWidth -ImageHeight -Make -Model -FNumber -ExposureTime -ISO -LensID -Keywords -DateTimeOriginal -d "%Y-%m-%d %H:%M:%S"
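
For reference, a minimal sketch of the kind of wrapper a script like gen_photo_frontmatter.sh could be built around (this is my illustration, not the actual script): run exiftool per photo and wrap its key-value output in YAML frontmatter delimiters.

# Hypothetical helper, not the real gen_photo_frontmatter.sh.
photo_frontmatter() {
  echo "---"
  exiftool "$1" -S -Title -ImageWidth -ImageHeight -Make -Model \
    -FNumber -ExposureTime -ISO -LensID -Keywords -DateTimeOriginal \
    -d "%Y-%m-%d %H:%M:%S"
  echo "---"
}

photo_frontmatter static/images/example.jpg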

Convert Photos to WebP

Install the CLI utility:

brew install webp            # macOS
sudo apt-get install webp    # Debian/Ubuntu

Bulk compress some images to quality 75. In my tests, this can reduce file size by 50% or more (e.g. from 190 KB to 90 KB).

find ./ -type f -name '*.jpg' -exec sh -c 'cwebp -q 75 "$1" -o "${1%.jpg}.webp"' _ {} \;

Lossless compression. The following command gives the highest possible compression and will take some time to run:

find ./ -type f -name '*.jpg' -exec sh -c 'cwebp -lossless -m 6 -z 9 -q 100 "$1" -o "${1%.jpg}.webp"' _ {} \;

In my tests, the above settings for lossless compression of JPGs actually INCREASED file size. This might work better for large lossless images, but since JPG itself is lossy, we're actually backtracking with this one.
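
To check whether a given conversion actually saved space, a quick before/after comparison works (a sketch using standard tools; assumes filenames without spaces):

# Print size and name for each JPG and its WebP counterpart.
for f in *.jpg; do
  ls -l "$f" "${f%.jpg}.webp" | awk '{ print $5, $NF }'
done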

Sync Images and Videos to R2

NO LONGER USED, FOR REFERENCE ONLY: I experimented with using Cloudflare's R2 to host my images. Unfortunately, my analytics showed a big spike in latency when serving these images from the R2 bucket.

Back Up Images to R2

R2 is free with my Cloudflare account and, I think, a good backup solution in case I ever need to migrate off GH Pages.

Here are instructions to sync this repo's images to and from R2:

Media can be served from the subdomain images.williamhuster.com. I used rclone to quickly sync the files between this repo and the R2 bucket.

If this is the first time on a given computer, follow rclone's configuration steps for R2.
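
For reference, the R2 remote definition in rclone.conf ends up looking roughly like this (all values are placeholders; R2 speaks the S3 API, which is why the type is s3):

[r2]
type = s3
provider = Cloudflare
access_key_id = <your-r2-access-key>
secret_access_key = <your-r2-secret-key>
endpoint = https://<account-id>.r2.cloudflarestorage.com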

Sync from my R2 bucket to local, using --interactive or -i to confirm changes and avoid data loss:

rclone sync r2:blog-images static/images -i

When adding new files, sync from local up to R2:

rclone sync static/images r2:blog-images -i
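
When in doubt, rclone's --dry-run flag previews what a sync would change without touching anything:

rclone sync static/images r2:blog-images --dry-run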

Use a CDN?

If I end up having many GB of images and video, then it would be better to serve media from a CDN. I researched different solutions and the frontrunner for me was Cloudflare Images. This costs $5/month for up to 100K images. Features include auto-generated variants, which is attractive.

The big drawback is that I would somehow need to automate the image upload process. CF Images generates a UUID for each image by default, but you can also specify a "custom key." I would just use the filename to make each image easier to identify.
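
If I ever revisit this, the upload could presumably be scripted against the Cloudflare Images API. A rough sketch (account ID and token are placeholders, and the id form field for the custom key is based on my reading of the API docs):

# Hypothetical bulk upload: POST each image with its filename as the custom ID.
for f in static/images/*.jpg; do
  curl -s -X POST \
    "https://api.cloudflare.com/client/v4/accounts/<ACCOUNT_ID>/images/v1" \
    -H "Authorization: Bearer <API_TOKEN>" \
    -F "file=@$f" \
    -F "id=$(basename "$f")"
done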

For now (2024-02-16) I've decided to move back to GitHub for image hosting. I only have ~50MB of media so far, and GitHub seems to put media on its user content CDN. Most of all, it is free to use!