This repository has been archived by the owner on Feb 25, 2022. It is now read-only.

Serve .md files from Helix Content Proxy #452

Closed
trieloff opened this issue May 28, 2020 · 1 comment
Labels
enhancement New feature or request

Comments

@trieloff
Contributor

trieloff commented May 28, 2020

Paraphrasing @davidnuescheler:

When I'm looking at my site and https://example.com/foo/bar.html isn't working as expected, I first need to make sure the Markdown is alright. Normally this involves reading the fstab.yaml, then calling word2md, and a bunch of other stuff. Wouldn't it be great if I could just use https://example.com/foo/bar.md to get the Markdown representation?

Yes, it would be great. We could just use the helix-content-proxy and a new Request-Type: Content to serve the markdown.

Then Helix-Pipeline could use this URL to fetch the raw Markdown content: adobe/helix-pipeline#734

Challenges

  • helix-content-proxy expects (but doesn't enforce yet) absolute refs (shas) to keep content in the cache forever
  • our VCL only has symbolic branch names (like master) and resolution only happens in helix-dispatch

I see two ways to deal with this:

Approach A: serve it anyway, but with inconvenient caches

  1. we call helix-content-proxy, which will never enforce shas
  2. we serve content that might be slightly out of date to begin with (due to raw.githubusercontent.com's built-in caching that resolve-git-ref would normally circumvent)
  3. the MD file is served with a cache lifetime that will be too short for the cache to be effective and too long to be convenient for the developer
  4. the .md URL will be unusable for helix-pipeline (see "Fetch Markdown from Helix Content Proxy", helix-pipeline#734)

Approach B: serve a redirect, to an inconvenient URL

  1. given a symbolic name, we call helix-resolve-git-ref and return the result as a 302 temporary redirect to the same file, this time including the correct ref as a URL parameter
  2. the redirect remains uncached (potentially slow)
  3. given an absolute ref, either from the strain or through the ref URL parameter, we call helix-content-proxy and serve with a long cache lifetime
  4. helix-pipeline will append the ?ref=… parameter to the URL when fetching (it is always being called with a pre-resolved ref), so that no additional lookup is needed

So our developer would take https://example.com/foo/bar.html, change the html to md in the browser, hit return and get https://example.com/foo/bar.md?ref=0b4cd12233ba3d3585b7270c63aec3b494fb0202. Upon making changes in the repo, the developer would need to remove the ref= parameter before reloading.
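To make the flow of Approach B concrete, here is a rough sketch in Python. It is not how helix-dispatch or the VCL would actually implement it: `handle_md_request`, `resolve_ref`, and the `strain_ref` default are hypothetical names, the 40-hex-character check is an assumed definition of "absolute ref", and the Cache-Control values are only placeholders for "long" and "uncached".

```python
import re
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit

# Assumption: an "absolute ref" is a full 40-character hex commit sha.
SHA_RE = re.compile(r"[0-9a-f]{40}")


def is_absolute_ref(ref):
    return bool(ref) and SHA_RE.fullmatch(ref) is not None


def handle_md_request(url, resolve_ref, strain_ref="master"):
    """Decide how to answer a .md request under Approach B.

    resolve_ref is a hypothetical callable standing in for
    helix-resolve-git-ref: it maps a symbolic name to a sha.
    """
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    # The ref comes from the URL parameter, else from the strain.
    ref = query.get("ref", [strain_ref])[0]

    if is_absolute_ref(ref):
        # Content behind a sha never changes, so cache it for a long time.
        return {
            "status": 200,
            "headers": {"Cache-Control": "max-age=31536000, immutable"},
        }

    # Symbolic name: resolve it and redirect to the same file with ?ref=<sha>.
    query["ref"] = [resolve_ref(ref)]
    location = urlunsplit(parts._replace(query=urlencode(query, doseq=True)))
    return {
        "status": 302,
        "headers": {"Location": location, "Cache-Control": "no-store"},
    }
```

Calling it with https://example.com/foo/bar.md and a resolver that returns the sha above would yield a 302 to the `?ref=` URL; calling it again with that URL yields a 200 with the long-lived cache header.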

Approach C: the best (or worst) of both worlds

We use Approach A when dealing with browsers and Approach B when dealing with helix-fetch and other API clients (they'd use a header like helix-redirect: please).
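As a sketch of that switch (Python, purely illustrative): the only thing taken from the proposal is the `helix-redirect` header name; `choose_approach` is a made-up helper, and treating the mere presence of the header as the opt-in is an assumption.

```python
def choose_approach(headers):
    """Return "B" for clients that opt into redirects, "A" otherwise."""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    # Assumption: any value of helix-redirect counts as opting in.
    return "B" if "helix-redirect" in normalized else "A"
```

A browser that sends no such header would get Approach A, while helix-fetch sending `helix-redirect: please` would get the redirect from Approach B.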

@trieloff
Contributor Author

trieloff commented Jun 3, 2020

Approach D: dear developer, you decide

  1. we call helix-content-proxy, which will never enforce shas
  2. but given helix-content-proxy#34 ("Make Caching Dependent on presence of helix-bot"), it will serve from master either uncached or, if the bot is installed, with a long shared cache timeout (the bot will flush the cache on push)
  3. the browser will treat the file as uncachable and refetch at every reload
  4. helix-pipeline and other API clients will use the ?ref= shortcut to get the cached version (because they know the full ref)
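The caching rules of Approach D could look roughly like the following Python sketch. The function name, the `bot_installed` flag, the 40-hex-character sha check, and the concrete header values are all assumptions for illustration; the real behavior would be whatever helix-content-proxy#34 ends up implementing.

```python
import re

# Assumption: a full ref is a 40-character hex commit sha.
SHA_RE = re.compile(r"[0-9a-f]{40}")

# Placeholder values, not taken from any Helix service.
IMMUTABLE = "max-age=31536000, immutable"
SHARED_ONLY = "max-age=0, s-maxage=86400"  # shared caches keep it, browsers refetch
UNCACHED = "no-store"


def cache_control(ref, bot_installed):
    """Pick a Cache-Control value for a .md response under Approach D."""
    if ref and SHA_RE.fullmatch(ref):
        # API clients such as helix-pipeline pass ?ref=<sha>: safe to cache long.
        return IMMUTABLE
    if bot_installed:
        # The bot flushes the shared cache on push, so a long shared TTL is fine.
        return SHARED_ONLY
    # No sha and no bot: nothing would ever invalidate a cached copy.
    return UNCACHED
```

In both of the non-sha cases the browser sees the file as uncachable and refetches on reload, which matches point 3 above.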
