New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement a Discourse-backed blog page #208

Merged
merged 14 commits into from Jan 9, 2018

Conversation

Projects
None yet
3 participants
@notriddle
Contributor

notriddle commented Dec 8, 2017

How it works

Create the topic in the Newsletter category. If you don't tag it as #published, it won't show up. Once you tag it as published (and only staff have permission to tag a topic as published), it will show up on Janitor's blog page.

Also, the webhook payload is not read, and the signature is not validated. So you can manually trigger a poll by pointing your browser at that URL.

Importing old entries

The "slug" in Discourse, which can be seen in the URL, becomes the #id of the post on Janitor (for example, the slug of this topic is newsletters-composed-in-discourse-and-published-in-janitor). If you don't want to break old links, make sure the post title will result in a correct slug.

Also, you can click the admin wrench and change the topic timestamp. The Janitor's blog page sorts posts by their timestamp, so old newsletter entries will line get sorted correctly.

@notriddle notriddle added the WIP label Dec 8, 2017

@jankeromnes

This comment has been minimized.

Member

jankeromnes commented Dec 8, 2017

So cool! Thanks a lot for porting the Blog page over to our new design, and also for preparing it for pulling from our Discourse!

I wanted to have a look this morning but then I got distracted. Let's try again. 😄

title,
user,
blog,
log,

This comment has been minimized.

@jankeromnes

jankeromnes Dec 8, 2017

Member

Interesting, why are you passing log to the template? 🙂

This comment has been minimized.

@notriddle

notriddle Dec 8, 2017

Contributor

Erm, I was debugging stuff. It's not needed any more...

@@ -0,0 +1,14 @@
<section class="blog">
<h2 class="header">Blog</h2>
{{

This comment has been minimized.

@jankeromnes

jankeromnes Dec 8, 2017

Member

Nit: Our templating system is a bit weird, but I like to set up indentation just right so that the produced HTML doesn't contain suspicious whitespaces. Here, everything outside {{ and }} will be included in the document, i.e. several trailing spaces and then a new line. To work around this, you can do something like:

      <h2 class="header">Blog</h2>{{
        for (let post in blog.posts) {
          const {title, body} = blog.posts[post];
      }}
      <div class="panel panel-default">
        ...
      </div>{{
        }
      }}
    </section>
}}
<div class="panel panel-default">
<div class="panel-heading"><h3>{{= title in html}}</h3></div>
<div class="panel-body"><section>{{= body}}</section></div>

This comment has been minimized.

@jankeromnes

jankeromnes Dec 8, 2017

Member

So the title will be successfully escaped, but how do we prevent body from including nefarious <script> tags that steals user credentials or performs random API calls? Do we trust everything we'll import from Discourse?

This comment has been minimized.

@notriddle

notriddle Dec 8, 2017

Contributor

I would just trust everything we import from Discourse.

@notriddle notriddle removed the WIP label Dec 8, 2017

@notriddle

This comment has been minimized.

Contributor

notriddle commented Dec 8, 2017

The blog is now stored in the database, and synced with Discourse when /blog-webhook-new/ is called. Which it will be whenever anything happens in the 'newsletter' category (I configured a Discourse webhook for that).

@notriddle notriddle force-pushed the notriddle:new-blog-page branch from 1025e85 to 8c46f9c Dec 8, 2017

@notriddle notriddle added the Web app label Dec 13, 2017

@notriddle notriddle force-pushed the notriddle:new-blog-page branch 4 times, most recently from 693a8fe to 7eaff2d Dec 14, 2017

@notriddle notriddle changed the title from New blog page to Implement a Discourse-backed blog page Dec 15, 2017

@notriddle notriddle force-pushed the notriddle:new-blog-page branch from 7eaff2d to d1ba5c2 Dec 15, 2017

notriddle added a commit to notriddle/janitor that referenced this pull request Dec 15, 2017

Implement a Discourse-backed blog page
How it works
============

Create the topic in the Newsletter category. If you don't tag it as
`#published`, it won't show up. Once you tag it as published (and only
staff have permission to tag a topic as published), it will show up on
Janitor's blog page.

Also, the webhook payload is not read, and the signature is not
validated. So you can manually trigger a poll by pointing your browser
at that URL.

The Webhook: https://discourse.janitor.technology/admin/api/web_hooks/1

The Pull Request implementing it:
JanitorTechnology#208

Importing old entries
=====================

The "slug" in Discourse, which can be seen in the URL, becomes the `#id`
of the post on Janitor (for example, the slug of this topic is
`newsletters-composed-in-discourse-and-published-in-janitor`). If you
don't want to break old links, make sure the post title will result in a
correct slug.

Also, you can click the admin wrench and change the topic timestamp. The
Janitor's blog page sorts posts by their timestamp, so old newsletter
entries will line get sorted correctly.

@notriddle notriddle force-pushed the notriddle:new-blog-page branch 3 times, most recently from 5acddbf to a416b9b Dec 15, 2017

@notriddle

This comment has been minimized.

Contributor

notriddle commented Dec 15, 2017

To avoid messing up link-buttons, I moved the link styling over to being blog-specific.

Implement a Discourse-backed blog page
How it works
============

Create the topic in the Newsletter category. If you don't tag it as
`#published`, it won't show up. Once you tag it as published (and only
staff have permission to tag a topic as published), it will show up on
Janitor's blog page.

Also, the webhook payload is not read, and the signature is not
validated. So you can manually trigger a poll by pointing your browser
at that URL.

Importing old entries
=====================

The "slug" in Discourse, which can be seen in the URL, becomes the `#id`
of the post on Janitor (for example, the slug of this topic is
`newsletters-composed-in-discourse-and-published-in-janitor`). If you
don't want to break old links, make sure the post title will result in a
correct slug.

Also, you can click the admin wrench and change the topic timestamp. The
Janitor's blog page sorts posts by their timestamp, so old newsletter
entries will line get sorted correctly.

@notriddle notriddle force-pushed the notriddle:new-blog-page branch from a416b9b to 733dc2d Dec 15, 2017

@jankeromnes

Just a few more comments in passing. Thanks again for implementing this!

lib/blog.js Outdated
@@ -0,0 +1,85 @@
// Copyright © 2015 Jan Keromnes. All rights reserved.

This comment has been minimized.

@jankeromnes

jankeromnes Dec 15, 2017

Member

Nit: Copyright 2017 @notriddle

exports.blogWebhookNew = function (response) {
blog.pull();
response.json({}, null, 2);

This comment has been minimized.

@jankeromnes

jankeromnes Dec 15, 2017

Member

Nit: I feel like this web hook would make more sense within the Janitor API (since you're returning JSON anyway).

The API handler implementations are in /api/, and you could create a new blog-api.js file with just the webhook for now.

@notriddle

This comment has been minimized.

Contributor

notriddle commented Dec 15, 2017

Moved the webhook.

app.js Outdated
});
// New public blog page webhook.
app.route(/^\/blog-new-webhook\/?$/, (data, match, end, query) => {

This comment has been minimized.

@jankeromnes

jankeromnes Dec 22, 2017

Member

Nit: Since you moved the webhook to the API, please remove this handler.

description: 'Pull the blog section from Discourse.',
handler: (_request, response) => {
blog.pull();

This comment has been minimized.

@jankeromnes

jankeromnes Dec 22, 2017

Member

Nit: Don't you want to handle async results or errors here?

notriddle added some commits Dec 22, 2017

@jankeromnes

Thanks again for this very cool new blog page!

I was finally able to make a pass on the code (sorry for the delay, I was away for the holidays. Also, happy new year! 🍾🎆🎉)

I haven't yet looked too closely at the visual result (although I remember it looks really nice now, and I'm super happy that we'll finally have a Comment feature), and believe I'll have a few nits there as well, but since it already looks very nice and also the page won't be publicly exposed at first, we can also leave any adjustments for later without further delaying this pull request.

Please have a look at my nits below (sorry for being extremely nit-picky...). They're mostly about being consistent with the coding style in the rest of the code base, adding a few comments here and there.

And again, huge 👍 on this! 😄💯🥇🎉

title: 'Blog'
});
// API sub-resource for the sync webhook.

This comment has been minimized.

@jankeromnes

jankeromnes Jan 3, 2018

Member

Nit: The word "sync" is ambiguous here (e.g. "how can a webhook be sync/async?"). Please be more explicit, e.g. for the Discourse synchronization webhook.

});
// API sub-resource for the sync webhook.
const webhookAPI = blogAPI.api('/sync-webhook');

This comment has been minimized.

@jankeromnes

jankeromnes Jan 3, 2018

Member

Nit: This is not a strong opinion, but in other API handlers (e.g. /api/projects/:project/pull) we use a single full verb for actions, without suffixes. What do you think about /api/blog/synchronize? (Especially, do you find this explicit enough for everyone to guess what this does?)

// API sub-resource for the sync webhook.
const webhookAPI = blogAPI.api('/sync-webhook');
webhookAPI.get({

This comment has been minimized.

@jankeromnes

jankeromnes Jan 3, 2018

Member

Nit: Do you think we'll implement other HTTP methods than GET for this webhook? (e.g. POST, PUT, DELETE... ?) If not, maybe there is no need to make an API sub-resource, and you can simply add a prefix to the handler, like so:

blogAPI.get('/synchronize', {
  /* ... */
});
description: 'Pull the blog section from Discourse.',
handler: (_request, response) => {
blog.pull().then(null, log);

This comment has been minimized.

@jankeromnes

jankeromnes Jan 3, 2018

Member

Nit: I prefer painfully explicit code to clever hacks, please expand this line to something like:

handler: (request, response) => {
  blog.pull().then(status => {
    log('synchronized blog', status);
    response.json(status, null, 2);
  }, error => {
    log('[fail] synchronizing blog', error);
    response.statusCode = 500; // Internal Server Error
    response.json({ error: 'Could not synchronize' }, null, 2);
  });
}

or with async/await:

handler: async (request, response) => {
  try {
    const status = await blog.pull();
    log('synchronized blog', status);
    response.json(status, null, 2);
  } catch (error) {
    log('[fail] synchronizing blog', error);
    response.statusCode = 500; // Internal Server Error
    response.json({ error: 'Could not synchronize' }, null, 2);
  }
}

(Note: If status doesn't hold anything valuable, no need to put it in a variable or to log it, just response.json({ status: 'Synchronized' }, null, 2).)

This comment has been minimized.

@nt1m

nt1m Jan 3, 2018

Member

+1 for async/await

title: 'Synchronize',
description: 'Pull the blog section from Discourse.',
handler: (_request, response) => {

This comment has been minimized.

@jankeromnes

jankeromnes Jan 3, 2018

Member

Interesting, do you always prefix unused variables with _? If you think this convention would be useful for Janitor, please enforce it via our ESLint rules, and auto-fix the entire code base (with npm run lint-fix).

This comment has been minimized.

@notriddle

notriddle Jan 3, 2018

Contributor

I usually do that, yeah. It's a compiler-recommended lint in both Rust and Elixir, so I've kinda picked it up.

I'd rather do that as a separate pull request, though.

@@ -263,6 +263,10 @@ Array.forEach(document.querySelectorAll('h1[id],h2[id]'), element => {
element.appendChild(small);
});
Array.forEach(document.querySelectorAll('.blog article p a'), element => {
element.target = '_blank';

This comment has been minimized.

@jankeromnes

jankeromnes Jan 3, 2018

Member

Don't all blog links already have a target="_blank" attribute? (Seriously asking, I don't know if they do, and if they don't then your code here is probably necessary.)

This comment has been minimized.

@nt1m

nt1m Jan 3, 2018

Member

Might be nice to put this into a custom server-side blog formatter (eg. the one that would remove script tags), rather than janitor.js ?

This comment has been minimized.

@notriddle

notriddle Jan 3, 2018

Contributor

Don't all blog links already have a target="_blank" attribute?

No. Discourse doesn't put that in the HTML it generates.

const {title, body_html, slug, id, posts_count} = blog.posts[post];
const title_url = blog.post_url_template
.replace("{{slug}}", slug)
.replace("{{id}}", id);

This comment has been minimized.

@jankeromnes

jankeromnes Jan 3, 2018

Member

Nit: Oh, I see why you added the template URLs in the DB instead of using variables global to the blog module: You're re-using them here. Instead of adding them to the DB and duplicating the templating code, I suggest adding a "secret getter" to all post objects in /lib/blog.js, which can be used in this HTML template but won't get serialized to the DB file.

You can do it like so:

if (!post.hasOwnProperty('_titleUrl')) {
  Object.defineProperty(post, '_titleUrl', {
    get () {
      return PostUrlTemplate.replace('{{slug}}', this.slug).replace('{{id}}', this.id);
    }
  });
}

Then you'll be able to use post._titleUrl in this new blog HTML template.

(See also this example.)

const comments_count = posts_count - 1;
const comments_url = blog.comments_url_template
.replace("{{slug}}", slug)
.replace("{{id}}", id);

This comment has been minimized.

@jankeromnes

jankeromnes Jan 3, 2018

Member

Nit: Same comment about the "secret getter".

}}
<article>
<h2 id="{{= slug in xmlattr}}">{{= title in html}}</h2>
{{= body_html}}

This comment has been minimized.

@jankeromnes

jankeromnes Jan 3, 2018

Member

Nit: This looks dangerous to me, even if Discourse promises to strip all <script> elements from its cooked HTML. Maybe we should add a Camp template parser that leaves HTML code untouched except for <script> elements? (See these short examples for how this could be done.)

This comment has been minimized.

@notriddle

notriddle Jan 3, 2018

Contributor

HTML sanitization is harder than that. I know.

<h2 id="{{= slug in xmlattr}}">{{= title in html}}</h2>
{{= body_html}}
</article>
<footer><a class="btn" href="{{= url}}">{{= comments_count in html}} comments</a></footer>{{

This comment has been minimized.

@jankeromnes

jankeromnes Jan 3, 2018

Member

Nit: Would escaping {{= url}} with in uri break the URL? (See also the list of default template parsers for brief explanations of what they do.)

This comment has been minimized.

@notriddle

notriddle Jan 3, 2018

Contributor

in uri mode actually escapes special characters like ://, since it's intended to be part of the URL, not the whole thing. It would be better to use in xmlattr, I guess.

@notriddle notriddle force-pushed the notriddle:new-blog-page branch from f92cdd2 to b0b23e2 Jan 3, 2018

@notriddle notriddle force-pushed the notriddle:new-blog-page branch from b0b23e2 to 643781c Jan 3, 2018

@notriddle

This comment has been minimized.

Contributor

notriddle commented Jan 3, 2018

Fixed most of the nits.

lib/blog.js Outdated
}
// Get the cached Discourse-backed blog.
module.exports.getDb = function () {

This comment has been minimized.

@nt1m

nt1m Jan 3, 2018

Member

nit: dropping module. is more consistent with the current code, and also works.

lib/blog.js Outdated
module.exports.synchronize = async function () {
const blog = module.exports.getDb();
const time = Date.now();
let topics = (await fetchDiscourseAPI(discourseBlogTopicsUrl)).topic_list.topics;

This comment has been minimized.

@nt1m

nt1m Jan 3, 2018

Member

nit: you can drop the brackets

Also, can you use destructuring ?

let { topics } = await fetchDiscourseAPI(discourseBlogTopicsUrl).topic_list;
}
/* Blogs page */

This comment has been minimized.

@nt1m

nt1m Jan 3, 2018

Member

nit: "Blog page"

.blog {
font-size: 16px;
}

This comment has been minimized.

@nt1m

nt1m Jan 3, 2018

Member

nit: Please add a blank line after each CSS rule to be consistent with the rest of the CSS file

notriddle added some commits Jan 3, 2018

lib/blog.js Outdated
exports.synchronize = async function () {
const blog = module.exports.getDb();
const time = Date.now();
let topics = await fetchDiscourseAPI(discourseBlogTopicsUrl).topic_list.topics;

This comment has been minimized.

@notriddle

notriddle Jan 3, 2018

Contributor

@nt1m Are you sure I can remove the brackets here? Because the tests are failing now.

This comment has been minimized.

@nt1m

nt1m Jan 3, 2018

Member

Hmm, apparently not. I remember something like this was/is possible, but maybe I'm confusing this with something else.

@notriddle

This comment has been minimized.

Contributor

notriddle commented Jan 4, 2018

Okay, I changed it back.

@nt1m

This comment has been minimized.

Member

nt1m commented Jan 4, 2018

Okay, I changed it back.

@notriddle My comment about using destructuring still applies though.

@notriddle

This comment has been minimized.

Contributor

notriddle commented Jan 4, 2018

Okay, @nt1m. Switched to using destructuring.

@jankeromnes

Thanks a lot for working so hard on improving this pull request! I'm running out of things to complain about, but I couldn't resist publishing another round of nits (I'm very sorry about them).

After that, I guess the last big complaint I might have about this feature is the security aspect of fully trusting imported Discourse HTML, but maybe I'm making too big a deal out of this. I guess I should try to exploit it, and then if I fail, I should stop complaining. 😄

});
// API sub-resource for the Discourse-Blog synchronization webhook.
// should show up as https://janitor.technology/api/blog/synchronize

This comment has been minimized.

@jankeromnes

jankeromnes Jan 8, 2018

Member

Nit: This isn't really an "API sub-resource" anymore, but more like an "API handler". And the second line would need a capital letter and a full stop. But is this comment really necessary?

try {
const status = await blog.synchronize();
log('synchronized blog', status);
response.json(status, null, 2);

This comment has been minimized.

@jankeromnes

jankeromnes Jan 8, 2018

Member

Nit: Wrapping the count variable in a status object makes it easier for us to add more return fields later, but it also looks a bit scary from a security stand point (e.g. what if one day a status object contains something like "warning": "private token abcd0123 expires soon", or other unintended leak of private information?). So if you don't mind, I would prefer the more restrictive version where we craft just the count value into a message, like we do when deploying configuration files to all containers. (Or you could keep the return Object, but destructure it here like so: const { count } = await blog.synchronize().)

Note: That's also why we don't call response.json(error, null, 2) below, which could be significantly more helpful to users than the generic "Could not synchronize" message, but error-leaking is a well-known attack vector which we've (to my knowledge) entirely avoided throughout this app.

examples: [{
response: {
body: JSON.stringify({count: 1}, null, 2)

This comment has been minimized.

@jankeromnes

jankeromnes Jan 8, 2018

Member

Nit: Please add a space after { and another before } (to make in-line objects more readable, and remain consistent with our style guide). Bonus points for enforcing this by amending our ESLint rules.

@@ -149,6 +149,25 @@ exports.blogPage = function (response, user = null) {
]);
};
// Public blog page.
const blogSectionNew = camp.template('./templates/blog-new.html');
exports.blogPageNew = function (response, params) {

This comment has been minimized.

@jankeromnes

jankeromnes Jan 8, 2018

Member

Nit: In general, we prefer full variable names rather than abbreviations, so please use parameters instead of params (sorry, "code beauty" is very subjective, and I agree it doesn't help to disambiguate in this case, but we should at least remain consistent, see the webProxy route).

EDIT: Or you could simply have blog, topics and user as function parameters, without wrapping them into a parameters object.

const blogSectionNew = camp.template('./templates/blog-new.html');
exports.blogPageNew = function (response, params) {
const title = 'Blog';
const {user, blog, topics} = params;

This comment has been minimized.

@jankeromnes

jankeromnes Jan 8, 2018

Member

Nit: Please add a space after { and another before } for in-line objects.

lib/blog.js Outdated
topics.sort((a, b) => {
if (a.created_at > b.created_at) {
return -1;
} else if (a.created_at === b.created_at) {

This comment has been minimized.

@jankeromnes

jankeromnes Jan 8, 2018

Member

Nit: No need to use else after a return. You could simplify this code like so:

if (a.created_at > b.created_at) {
  return -1;
}
if (a.created_at === b.created_at) {
  return 0;
}
return 1;
lib/blog.js Outdated
return 1;
}
});
blog.topics = await Promise.all(topics.map(async topic => {

This comment has been minimized.

@jankeromnes

jankeromnes Jan 8, 2018

Member

Nit: This line is doing too many things at once. Could you please split it, e.g. by creating an intermediary promises Array variable?

lib/blog.js Outdated
});
blog.topics = await Promise.all(topics.map(async topic => {
const bodyUrl = module.exports.getPostUrl(topic);
const bodyHtml = (await fetchDiscourseAPI(bodyUrl)).post_stream.posts[0].cooked;

This comment has been minimized.

@jankeromnes

jankeromnes Jan 8, 2018

Member

Nit: Same as above, I'm not too fond of inline-awaits, and prefer code that's simpler even if slightly longer, with explicitly-named intermediary variables.

lib/blog.js Outdated
// Big note for anybody who's trying to understand how the blog works:
// Discourse is made up of "topics" that contain one or more "posts".
// The first post in a published (published is a tag) blog (blog is a topic)
// post is considered the blog post itself. The rest are comments on it.

This comment has been minimized.

@jankeromnes

jankeromnes Jan 8, 2018

Member

Super-nit: The last sentence reads "The first post in a published blog post is considered the blog post itself" (I don't think there can be a "first post" in a "post"). Maybe replace the word "post" (in italic) by a more correct word like "topic" or "discussion" or "thread" or another synonym.

.blog {
font-size: 16px;

This comment has been minimized.

@jankeromnes

jankeromnes Jan 8, 2018

Member

Nit: Ok, this is starting to look really good now, but I think the new blog's readability can still be improved.

Please significantly bump up the font size, the line height, and narrow the post width, like in the old blog. You can take inspiration from these CSS rules (especially, font-size: 18px, line-height: 1.6 and max-width: 700px; padding-left: 25px; padding-right: 25px).

This comment has been minimized.

@jankeromnes

jankeromnes Jan 8, 2018

Member

Nit: Also, please don't indent titles like 1. Announcing Windows Environments (I don't like how Discourse indents them, it hurts readability. I also don't like the enormous bullet points, but I can live with them so I won't complain about them officially.)

This comment has been minimized.

@notriddle

notriddle Jan 8, 2018

Contributor

I didn't indent anything. You wrote Markdown Numbered Lists, and that's what Discourse gave me.

I went ahead and changed them to actual headings.

notriddle added some commits Jan 8, 2018

@jankeromnes

R+ with nits, thanks! 😄

I'm OK with this being merged, but before merging, please fix the few nits below.

lib/blog.js Outdated
// Perform an asynchronous Discourse API request.
function fetchDiscourseAPI (url) {
return new Promise((resolve, reject) => {
https.get(url + '.json', response_stream => {

This comment has been minimized.

@jankeromnes

jankeromnes Jan 9, 2018

Member

Nit: The project coding style prefers camelCase over snake_case.

lib/blog.js Outdated
let json = '';
response_stream.on('data', (chunk) => {
try {
json += chunk;

This comment has been minimized.

@jankeromnes

jankeromnes Jan 9, 2018

Member

Nit: Might be safer with String(chunk) instead of just chunk.

lib/blog.js Outdated
try {
json += chunk;
} catch (error) {
response_stream.destroy(error);

This comment has been minimized.

@jankeromnes

jankeromnes Jan 9, 2018

Member

Are we absolutely certain that stream.destroy(error) will trigger the stream.on('error', reject) below? If not I'm worried that some errors will go un-noticed.

This comment has been minimized.

@notriddle
lib/blog.js Outdated
'data': {
'updated': null,
'pull-time': [],
},

This comment has been minimized.

@jankeromnes

jankeromnes Jan 9, 2018

Member

Nit: The metrics library will automatically add any missing data entries, so you don't really need to pre-fill data with empty values here.

lib/blog.js Outdated
blog.topics = await Promise.all(topicsPromises);
const now = Date.now();
metrics.set(blog, 'updated', now);
metrics.push(blog, 'pull-time', [now, Date.now() - time]);

This comment has been minimized.

@jankeromnes

jankeromnes Jan 9, 2018

Member

Nit: I think you mean now - time instead of Date.now() - time.

const title = 'Blog';
const { topics } = blog.getDb();
const contents = topics.map(topic => {

This comment has been minimized.

@jankeromnes

jankeromnes Jan 9, 2018

Member

Nit: s/contents/posts/?

@@ -0,0 +1,12 @@
<section class="blog">{{
for (const topic of contents) {

This comment has been minimized.

@jankeromnes

jankeromnes Jan 9, 2018

Member

Nit: Is it plain wrong to call these variables for (const post of posts)?

@notriddle notriddle merged commit 42ee35f into JanitorTechnology:master Jan 9, 2018

2 checks passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
security/snyk No new issues
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment