fix: postgres errors #80

Merged · 3 commits merged into long2ice:dev · Jan 31, 2024
Conversation

@lasseintree (Contributor) commented on Jan 30, 2024:

While setting up meilisync with a Postgres database I encountered some errors:

  • meilisync check did not work because RealDictCursor was used as cursor_factory while get_count used integer indexing, which RealDictCursor does not support (see Data not syncing #68 (comment)).
    • I have made my own custom Row and Cursor types that support both key indexing and integer indexing.
  • When deleting something from the database, columntypes is not included in the payload. .get is used to fetch columntypes and its length is then computed, but since .get is called without a default, a delete ends up calling len on None, which raises an error.
    • Added an empty list as the default for the various column fields in the payload.
  • I am not sure if this is an error, but for the way I want to use meilisync it is. When setting fields for an index in the config.yml file, it updates the documents of the index with all the fields of the affected row. This makes the number of fields returned in search results inconsistent, and could potentially reveal information to the user that it should not have.
    • Made it so that when fields are given in the config, only those fields are written to the documents; if no fields are given, all fields are included. (Minimal sketches of all three fixes follow this list.)
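
Below are minimal sketches of the three fixes, assuming psycopg2; the class and function names are illustrative rather than the exact identifiers from this PR.

from psycopg2.extensions import cursor as BaseCursor


class Row(dict):
    """Row supporting both key indexing (row["id"]) and integer indexing (row[0])."""

    def __init__(self, cur, values):
        super().__init__(zip((col.name for col in cur.description), values))
        self._values = tuple(values)

    def __getitem__(self, key):
        if isinstance(key, int):
            return self._values[key]  # integer indexing, as used by get_count
        return super().__getitem__(key)  # key indexing, as with RealDictRow


class Cursor(BaseCursor):
    """Pass as cursor_factory so queries return Row objects instead of tuples."""

    def fetchone(self):
        values = super().fetchone()
        return Row(self, values) if values is not None else None

    def fetchall(self):
        return [Row(self, values) for values in super().fetchall()]


def get_columntypes(payload: dict) -> list:
    # Delete events omit "columntypes"; defaulting to [] avoids len(None).
    return payload.get("columntypes", [])


def pick_fields(row: dict, fields: list) -> dict:
    # Keep only the configured fields; an empty fields list keeps everything.
    return {k: v for k, v in row.items() if k in fields} if fields else dict(row)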

I have also made a healthcheck bash script for when meilisync runs as a service in docker compose. It runs meilisync check; if errors occur, it runs meilisync refresh on the tables that are out of sync and returns exit code 1. If meilisync check reports no errors, it returns exit code 0 without running meilisync refresh.

#!/bin/bash

# Collect the names of out-of-sync tables from `meilisync check`:
# error lines contain a quoted, dot-separated name, so take the quoted
# field and keep the second dot-separated part (the table name).
tables=$(meilisync check 2>&1 | grep 'ERROR' | awk -F '"' '{print $2}' | awk -F '.' '{print $2}')
args=""
if echo "$tables" | grep -q "\w"; then
    # Build a "-t <table>" argument for every out-of-sync table,
    # then refresh only those tables.
    for table in $tables; do
        args="$args -t $table"
    done
    meilisync refresh $args 2> /dev/null
    false  # exit code 1: check failed, a refresh was started
else
    true   # exit code 0: everything is in sync
fi

I wasn't sure whether to include it in the pull request, but adding it to the repo would make it easy for people using meilisync to have a healthcheck in docker compose:

    healthcheck:
      test: ["CMD-SHELL", "./meilisync-healthcheck.sh"]
      interval: 1h
      timeout: 10m
      retries: 2
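
For context, docker marks the service healthy when the healthcheck command exits with code 0 and treats a non-zero exit as a failed probe, which is why the script ends with true/false. The resulting status can be checked with:

docker inspect --format '{{.State.Health.Status}}' <container-name>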

When jumping between branches, where meilisync requires wal_level >= logical but some branches do not have wal2json, the database would not start without wal2json while the replication slot was still active. I have made an SQL script that drops the meilisync replication slot; run it when shutting down your Postgres database so that it can start again without wal2json. A replication slot cannot be dropped while it is active, and after its process is terminated it tries to start up again, so the drop has to happen before that. The script therefore tries up to 5 times, since the first attempt is sometimes unsuccessful.

-- Drop the "meilisync" replication slot, retrying while it is still in use.
CREATE OR REPLACE FUNCTION drop_replication_slot_with_retry() RETURNS VOID AS $$
DECLARE
    attempts INT := 0;
BEGIN
    LOOP
        BEGIN
            EXECUTE 'SELECT pg_drop_replication_slot(''meilisync'')';
            RAISE NOTICE 'Replication slot "meilisync" has been removed';
            EXIT;
        EXCEPTION
            WHEN others THEN
                IF attempts >= 5 THEN
                    RAISE NOTICE 'Replication slot "meilisync" is still active after 5 attempts. Aborting...';
                    RAISE;
                ELSIF SQLSTATE = '55006' THEN
                    -- 55006 (object_in_use): the slot is still held by a
                    -- walsender, so terminate the replication backends and retry.
                    attempts := attempts + 1;
                    EXECUTE 'SELECT pg_terminate_backend(pid) FROM pg_stat_replication';
                    RAISE NOTICE 'Replication slot "meilisync" is active. Terminating backend and retrying...';
                    PERFORM pg_sleep(1);
                ELSE
                    RAISE;
                END IF;
        END;
    END LOOP;
END;
$$ LANGUAGE plpgsql;
SELECT drop_replication_slot_with_retry();
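
To use it, save the script to a file (the filename and connection options below are my own choices, not from the repo) and run it against the database right before shutting Postgres down:

psql -U postgres -d mydb -f drop_meilisync_replication_slot.sql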

@long2ice (Owner) commented:

Thanks! Please run make style locally.

@long2ice (Owner) commented on this part of the healthcheck script:

for table in $tables; do
    args="$args -t $table"
done
meilisync refresh $args 2> /dev/null

This may have performance problems with large data.

@lasseintree (Contributor, Author) replied:

I tested locally with a database that had an index for a table with 100k rows. With the refresh size set to 100, meaning a thousand inserts would have to take place, it was still not done after 10 minutes; with a size of 1000 it finished in 10 seconds. I know this is not exactly scalable, since a size of 1000 running 100 inserts would still be slower than a size of 1 with 100 inserts, but it does indicate that an index with 100 times as many rows as the refresh size still finishes in a reasonable time. Even so, it might eat up a lot of RAM and processing power, so I will just remove the healthcheck and use it myself when running locally.

@long2ice long2ice merged commit 9a588ba into long2ice:dev Jan 31, 2024
1 check passed
@long2ice (Owner) commented:

Thanks!
