[docs] postgres collation warning #1017

lzap · 2022-11-10T19:51:08Z

Check the Postgres collation of your Mastodon instance now. A story about incorrectly ordered toots (long).

I spent three evenings investigating why my instance stopped updating notifications and statuses correctly. I figured out that the statuses were not gone, just ordered incorrectly. As if something had shaken them a bit, but not much, just a bit.

I was debugging goroutines and learning about the Universally Unique Lexicographically Sortable Identifier (ULID), which is the ID format this server uses for statuses. No luck. This is what they look like, btw:

01GHGAC5EHKSQQ0YRPXNWVZ7EJ
01GHGA78BHHQ8A3T6SFVYXAV4Y

These ULIDs are used as unique identifiers and because they are lexicographically sortable, Mastodon implementations take advantage of that and sort by this database column.
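
To illustrate why sorting by ULID gives chronological order, here is a minimal Python sketch. It is not taken from any server codebase. It assumes the standard ULID layout, where the first 10 characters are a Crockford base32 millisecond timestamp, and the helper name ulid_timestamp_ms is made up for this example.

```python
from datetime import datetime, timezone

# Crockford base32 alphabet used by ULIDs (no I, L, O, U).
CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"

def ulid_timestamp_ms(ulid: str) -> int:
    """Decode the 48-bit millisecond timestamp from the first 10 characters."""
    value = 0
    for ch in ulid[:10]:
        value = value * 32 + CROCKFORD.index(ch)
    return value

# The two example IDs from the text above.
newer = "01GHGAC5EHKSQQ0YRPXNWVZ7EJ"
older = "01GHGA78BHHQ8A3T6SFVYXAV4Y"

for u in (newer, older):
    ts = datetime.fromtimestamp(ulid_timestamp_ms(u) / 1000, tz=timezone.utc)
    print(u, ts.isoformat())

# Plain byte-wise string comparison agrees with the timestamp order.
print((older < newer) == (ulid_timestamp_ms(older) < ulid_timestamp_ms(newer)))
```

This only holds when strings are compared byte by byte (digits first, then A to Z), which is what a C collation does.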

Now it might be clear, but jeeez, it took me a while until I finally figured it out: I created my Postgres database on a system with the cs_CZ.UTF-8 locale. Therefore my database was created with the cs_CZ collation.

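See, in Czech we have one special character, "CH", and in Czech collation it sorts between "H" and "I". That was the problem: any ID containing "CH" was sorted as if that pair were a single letter after "H", so it landed out of place. This is the big lesson that I learned.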
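
The effect can be reproduced without a database. The following Python sketch is a deliberately simplified model of the Czech rule described above, not the real cs_CZ collation. The function czech_like_key and the two IDs are invented for illustration. It treats "CH" as a single letter sorting right after "H" and compares the result with plain byte order.

```python
def czech_like_key(s: str) -> list:
    """Simplified sort key: the digraph 'CH' counts as one letter just after 'H'."""
    key = []
    i = 0
    while i < len(s):
        if s[i:i + 2] == "CH":
            key.append((ord("H"), 1))  # after plain 'H' (72, 0), before 'I' (73, 0)
            i += 2
        else:
            key.append((ord(s[i]), 0))
            i += 1
    return key

# Two hypothetical ULIDs; the first has the smaller (earlier) timestamp prefix.
ids = ["01GHCH0000AAAAAAAAAAAAAAAA", "01GHD00000AAAAAAAAAAAAAAAA"]

byte_order = sorted(ids)                      # what a C collation gives
czech_like = sorted(ids, key=czech_like_key)  # "CH" moved after "H"

print(byte_order)
print(czech_like)
```

With byte order the earlier ID comes first. With the "CH" rule the two swap places, which is the kind of slight shuffling I was seeing.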

Always create the SQL database for a Mastodon instance with a "neutral" (English, none or C) collation: C.UTF-8. In the case of Postgres, what you need to do is:

CREATE DATABASE xxx WITH LOCALE 'C.UTF-8' TEMPLATE template0;

To check your collation, run this on Postgres:

SELECT datcollate AS collation FROM pg_database WHERE datname = current_database();

Czech is not the only language that might cause problems, I suppose. Check your databases now! Boost it. Thanks! Have fun.

https://social.zapletalovi.com/@lukas/statuses/01GHHJQKMCGSB8TV1SMGE6JDM0

tsmethurst · 2022-11-11T08:42:00Z

Thank you so much for your debugging work!

[docs] postgres collation warning

507c39a

tsmethurst merged commit b755906 into superseriousbusiness:main Nov 11, 2022

lzap deleted the pg-collate-doc branch November 11, 2022 10:13
