Anchor inside navigation #96

jkfran · 2021-04-07T17:12:25Z

When the navigation title includes an anchor, the navigation is missing. Our Discourse instances don't add the anchor automatically, but it can happen, as on https://discuss.kubernetes.io/t/introduction-to-microk8s/11243.

The issue was solved by changing ## Navigation to <h2>Navigation</h2>.

evilnick · 2021-04-07T17:16:11Z

@jkfran thanks for working out the cause. I imagine some regex update would fix it ;)

jpmartinspt · 2021-04-07T20:27:13Z

Just to add some more info, white spaces inside the h2 broke the parsing.
Ex: <h2> URLs</h2> will result in url_map failing to be parsed.
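
To illustrate the two failure cases reported so far (an anchor inside the heading and leading whitespace inside the h2), here is a minimal sketch of a more tolerant heading lookup. It is not the parser used by the docs front end; `H2TextCollector`, `h2_titles` and `has_section` are hypothetical names, and the anchor markup in the usage example is only an assumption about what the rendered topic HTML may look like.

```python
from html.parser import HTMLParser


class H2TextCollector(HTMLParser):
    """Collect the visible text of every <h2>, ignoring nested tags."""

    def __init__(self):
        super().__init__()
        self.headings = []
        self._buffer = None  # None while we are outside an <h2>

    def handle_starttag(self, tag, attrs):
        if tag == "h2":
            self._buffer = []

    def handle_data(self, data):
        # Only keep text that appears inside an <h2>.
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "h2" and self._buffer is not None:
            # Collapse runs of whitespace and trim both ends.
            self.headings.append(" ".join("".join(self._buffer).split()))
            self._buffer = None


def h2_titles(html):
    """Return the normalised text of all <h2> headings in the HTML."""
    parser = H2TextCollector()
    parser.feed(html)
    parser.close()
    return parser.headings


def has_section(html, title):
    """Hypothetical helper: is there an <h2> whose text equals the title?"""
    wanted = title.casefold()
    return any(heading.casefold() == wanted for heading in h2_titles(html))
```

With this approach both `has_section('<h2> URLs</h2>', 'URLs')` and `has_section('<h2><a class="anchor" href="#navigation"></a>Navigation</h2>', 'Navigation')` match, because the comparison is made on the heading text rather than on the exact markup.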

evilnick · 2021-04-08T14:39:52Z

I assume we are going to make some changes to make handling these things more robust? I have minimal control over the discourse we use for MicroK8s, so I'm not even notified if any aspect of it changes. If this had happened today, for example, it would have been a major problem.

anthonydillon · 2021-04-08T15:30:27Z

Understood. This is triaged as a high priority and so should be scheduled to be remedied soon.

ktsakalozos · 2021-04-08T15:47:35Z

Thank you web-team and @nick Veitch for addressing this issue so quickly especially since it was outside your working hours.

What guards will we have in place so we do not need to face such an unpleasant incident again? One case we have to consider is what would happen if https://discuss.kubernetes.io goes down or is inaccessible for a significant amount of time. Do we keep backups of our docs? How can we use them? Do we have enough alerting to quickly react to such incidents?

This is probably not the right place to have this discussion. Please let me know how we can move forward and possibly address some of the above. As I am not familiar with the processes already in place, it is very possible that we already address most of the issues I am concerned with. Thank you again.

nottrobin · 2021-04-09T11:45:53Z

Hi @ktsakalozos

> What guards will we have in place so we do not need to face such an unpleasant incident again? One case we have to consider is what would happen if https://discuss.kubernetes.io goes down or is inaccessible for a significant amount of time.

If the back-end that contains the content is down, currently the cached content should be shown for about 5 minutes, after which we will start displaying an error page. I don't know what this error message currently looks like, but we could certainly look at improving the messaging to tell the user as clearly as we can what happened. I'm not sure this behaviour should be fundamentally changed though, since the canonical source of the content is the kubernetes discourse. This is where it is edited etc. - it is a fundamental part of the system. If we feel relying on an external discuss.kubernetes.io in this way is too risky, we should simply move the content somewhere else, but that discussion should probably involve Mark.

> Do we keep backups of our docs? How can we use them?

This is a very good point, thanks for bringing it up. I don't know what backups are taken of the Discourse platforms run by IS, but we certainly don't back up the content in discuss.kubernetes.io. I've filed #98, and we'll discuss it soon.

> Do we have enough alerting to quickly react to such incidents?

I'm not sure. We have some alerting, and we have been actively working on this. @tbille was setting up some alerting that would help, but I'm not sure where that work got to. @tbille do you know if there's an issue about this anywhere? If not, could you create one?

ktsakalozos · 2021-04-09T12:58:23Z

Hi @nottrobin, thank you for the reply.

> I'm not sure this behaviour should be fundamentally changed though, since the canonical source of the content is the kubernetes discourse. This is where it is edited etc. - it is a fundamental part of the system. If we feel relying on an external discuss.kubernetes.io in this way is too risky, we should simply move the content somewhere else, but that discussion should probably involve Mark.

The content should be served from the kubernetes discourse since this is where it is edited. We agree on this. However, IMHO when there is an outage we should have a plan to mitigate it. What if the core personnel are asleep, unavailable, or on holiday? What if discuss.kubernetes.io goes down for a weekend or makes a change that we are not able to adjust to in a short period of time? It also feels wrong to have the engineers work under the pressure of the site being down. I would much prefer if we could flip a switch and serve the content from a backup while we wait for Nick and the web-team to wake up and work on the issue without stress during their working hours. Again, this is only my opinion.

evilnick · 2021-04-09T13:44:04Z

Hi @nottrobin, welcome back! Thanks for looking at this. My brief thoughts on this:

> I'm not sure this behaviour should be fundamentally changed though, since the canonical source of the content is the kubernetes discourse. This is where it is edited etc. - it is a fundamental part of the system.

I know there are very good reasons for keeping the content there, so nobody is talking about moving it. However, if the front-end code can't see or make sense of the discourse, there is a possibility we can't either. It makes sense to me to continue serving the last 'single source of truth' as we know it until everything is working again. I would particularly like to see a solution where:

- the front end notifies us/someone if the discourse is 'unhealthy'
- it continues serving the cache or a backup indefinitely until the problem is resolved.

As for what counts as unhealthy, I think failing to load the navigation or URL mapping would qualify, and I think this would be useful for all of the discourse-based docs, not just the MicroK8s ones.
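
A rough sketch of that behaviour, in case it helps the discussion: keep the last content that passed a health check, notify someone whenever a refresh fails, and keep serving the last good copy for as long as the problem lasts. Everything here is hypothetical (`LastGoodCache`, `fetch`, `is_healthy`, `notify`); the 300-second default only mirrors the "about 5 minutes" mentioned earlier in the thread and is not taken from the real configuration.

```python
import time

_MISSING = object()  # sentinel: nothing healthy has been fetched yet


class LastGoodCache:
    """Serve fresh content when healthy, otherwise the last good copy."""

    def __init__(self, fetch, is_healthy, notify,
                 ttl_seconds=300, clock=time.monotonic):
        self._fetch = fetch            # callable returning parsed content
        self._is_healthy = is_healthy  # callable: content -> bool
        self._notify = notify          # callable receiving the error
        self._ttl = ttl_seconds
        self._clock = clock
        self._value = _MISSING
        self._fetched_at = 0.0

    def get(self):
        now = self._clock()
        fresh_enough = now - self._fetched_at < self._ttl
        if self._value is not _MISSING and fresh_enough:
            return self._value
        try:
            content = self._fetch()
            if not self._is_healthy(content):
                raise ValueError("content failed the health check")
        except Exception as error:
            # Tell someone, then fall back to the last good copy if any.
            self._notify(error)
            if self._value is _MISSING:
                raise
            return self._value
        self._value = content
        self._fetched_at = now
        return content
```

A health check along the lines suggested above could be as simple as `lambda page: bool(page["navigation"]) and bool(page["url_map"])`, assuming the parsed page is a dict with those keys. Note that this sketch retries and notifies on every request while the source is unhealthy, so a real version would probably want to rate-limit both.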

SirSamTumless added the Priority: High label Apr 8, 2021
