fix: postgres errors #80

Merged · 3 commits merged into long2ice:dev · Jan 31, 2024
Conversation

@lasseintree (Contributor) commented on Jan 30, 2024:

While setting up meilisync with a Postgres database I encountered some errors:

  • meilisync check did not work because RealDictCursor was used as cursor_factory while get_count used integer indexing, which RealDictCursor does not support (see Data not syncing #68 (comment)).
    • I have made my own custom Row and Cursor types that support both key indexing and integer indexing.
  • When deleting something from the database, columntypes is not included in the payload. .get is used to fetch columntypes and its length is then computed, but since .get is called without a default, a delete ends up calling len on None, which raises an error.
    • Added an empty list as the default for the various column fields in the payload.
  • I am not sure if this is an error, but for the way I want to use meilisync it is. When setting fields for an index in the config.yml file, it updates the documents of the index with all the fields of the affected row. This makes the number of fields returned in search results inconsistent, and could potentially reveal information to the user that it should not have.
    • Made it so that when fields are given in the config, only those fields are written to the documents; if no fields are given, all fields are included. (Minimal sketches of all three fixes follow this list.)
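
Below are minimal sketches of the three fixes, assuming psycopg2; the class and function names are illustrative rather than the exact identifiers from this PR.

from psycopg2.extensions import cursor as BaseCursor


class Row(dict):
    """Row supporting both key indexing (row["id"]) and integer indexing (row[0])."""

    def __init__(self, cur, values):
        super().__init__(zip((col.name for col in cur.description), values))
        self._values = tuple(values)

    def __getitem__(self, key):
        if isinstance(key, int):
            return self._values[key]  # integer indexing, as used by get_count
        return super().__getitem__(key)  # key indexing, as with RealDictRow


class Cursor(BaseCursor):
    """Pass as cursor_factory so queries return Row objects instead of tuples."""

    def fetchone(self):
        values = super().fetchone()
        return Row(self, values) if values is not None else None

    def fetchall(self):
        return [Row(self, values) for values in super().fetchall()]


def get_columntypes(payload: dict) -> list:
    # Delete events omit "columntypes"; defaulting to [] avoids len(None).
    return payload.get("columntypes", [])


def pick_fields(row: dict, fields: list) -> dict:
    # Keep only the configured fields; an empty fields list keeps everything.
    return {k: v for k, v in row.items() if k in fields} if fields else dict(row)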

I have also made a healthcheck bash script for when meilisync runs as a service in docker compose. It runs meilisync check; if errors occur, it runs meilisync refresh on the tables that are out of sync and returns exit code 1. If meilisync check reports no errors, it returns exit code 0 without running meilisync refresh.

#!/bin/bash

# Collect the names of out-of-sync tables from `meilisync check`:
# error lines contain a quoted, dot-separated name, so take the quoted
# field and keep the second dot-separated part (the table name).
tables=$(meilisync check 2>&1 | grep 'ERROR' | awk -F '"' '{print $2}' | awk -F '.' '{print $2}')
args=""
if echo "$tables" | grep -q "\w"; then
    # Build a "-t <table>" argument for every out-of-sync table,
    # then refresh only those tables.
    for table in $tables; do
        args="$args -t $table"
    done
    meilisync refresh $args 2> /dev/null
    false  # exit code 1: check failed, a refresh was started
else
    true   # exit code 0: everything is in sync
fi

I wasn't sure whether to include it in the pull request, but adding it to the repo would make it easy for people using meilisync to have a healthcheck in docker compose:

    healthcheck:
      test: ["CMD-SHELL", "./meilisync-healthcheck.sh"]
      interval: 1h
      timeout: 10m
      retries: 2
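
For context, docker marks the service healthy when the healthcheck command exits with code 0 and treats a non-zero exit as a failed probe, which is why the script ends with true/false. The resulting status can be checked with:

docker inspect --format '{{.State.Health.Status}}' <container-name>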

When jumping between branches, where meilisync requires wal_level >= logical but some branches do not have wal2json, the database would not start without wal2json while the replication slot was still active. I have made an SQL script that drops the meilisync replication slot; run it when shutting down your Postgres database so that it can start again without wal2json. A replication slot cannot be dropped while it is active, and after its process is terminated it tries to start up again, so the drop has to happen before that. The script therefore tries up to 5 times, since the first attempt is sometimes unsuccessful.

-- Drop the "meilisync" replication slot, retrying while it is still in use.
CREATE OR REPLACE FUNCTION drop_replication_slot_with_retry() RETURNS VOID AS $$
DECLARE
    attempts INT := 0;
BEGIN
    LOOP
        BEGIN
            EXECUTE 'SELECT pg_drop_replication_slot(''meilisync'')';
            RAISE NOTICE 'Replication slot "meilisync" has been removed';
            EXIT;
        EXCEPTION
            WHEN others THEN
                IF attempts >= 5 THEN
                    RAISE NOTICE 'Replication slot "meilisync" is still active after 5 attempts. Aborting...';
                    RAISE;
                ELSIF SQLSTATE = '55006' THEN
                    -- 55006 (object_in_use): the slot is still held by a
                    -- walsender, so terminate the replication backends and retry.
                    attempts := attempts + 1;
                    EXECUTE 'SELECT pg_terminate_backend(pid) FROM pg_stat_replication';
                    RAISE NOTICE 'Replication slot "meilisync" is active. Terminating backend and retrying...';
                    PERFORM pg_sleep(1);
                ELSE
                    RAISE;
                END IF;
        END;
    END LOOP;
END;
$$ LANGUAGE plpgsql;
SELECT drop_replication_slot_with_retry();
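
To use it, save the script to a file (the filename and connection options below are my own choices, not from the repo) and run it against the database right before shutting Postgres down:

psql -U postgres -d mydb -f drop_meilisync_replication_slot.sql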

@long2ice (Owner) commented:

Thanks! Please run make style locally.

@long2ice (Owner) commented on this part of the healthcheck script:

for table in $tables; do
    args="$args -t $table"
done
meilisync refresh $args 2> /dev/null

This may have performance problems with large data.

@lasseintree (Contributor, Author) replied:

I tested locally with a database that had an index for a table with 100k rows. With the refresh size set to 100, meaning a thousand inserts would have to take place, it was still not done after 10 minutes; with a size of 1000 it finished in 10 seconds. I know this is not exactly scalable, since a size of 1000 running 100 inserts would still be slower than a size of 1 with 100 inserts, but it does indicate that an index with 100 times as many rows as the refresh size still finishes in a reasonable time. Even so, it might eat up a lot of RAM and processing power, so I will just remove the healthcheck and use it myself when running locally.

@long2ice long2ice merged commit 9a588ba into long2ice:dev Jan 31, 2024
1 check passed
@long2ice (Owner) commented:

Thanks!
