I have been using pgvector in a project and so far I'm loving it!
One thing I've noticed is that, when using an IVFFlat index, newly inserted rows are harder to find (e.g., with 100 lists I have to raise probes to something like 80-90 before they show up). My guess is that, while these new entries do get added to the index, k-means isn't rerun for them, so they don't end up in the proper inverted list.
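To make the suspected mechanism concrete, here is a toy model in plain Python. It is only a sketch under assumptions, not pgvector's implementation: it assumes the centroids are fixed when the index is built, that each new row is appended to the list of its nearest existing centroid, and that a query scans only the `probes` lists whose centroids are closest to it. `ToyIVF` and all values below are hypothetical.

```python
import math


def nearest_order(centroids, v):
    """Return centroid indices sorted by distance to v, closest first."""
    return sorted(range(len(centroids)), key=lambda i: math.dist(centroids[i], v))


class ToyIVF:
    """Toy inverted-file index whose centroids never change after creation."""

    def __init__(self, centroids):
        self.centroids = list(centroids)
        self.lists = [[] for _ in self.centroids]

    def insert(self, v):
        # Assumption: a new row goes to the list of its nearest existing centroid.
        self.lists[nearest_order(self.centroids, v)[0]].append(v)

    def search(self, q, probes):
        # Scan only the `probes` lists whose centroids are closest to the query.
        candidates = []
        for i in nearest_order(self.centroids, q)[:probes]:
            candidates.extend(self.lists[i])
        return min(candidates, key=lambda v: math.dist(v, q), default=None)

    def probes_needed(self, q, target):
        # Smallest probes value for which the search returns `target`.
        for p in range(1, len(self.centroids) + 1):
            if self.search(q, p) == target:
                return p
        return None


# Centroids as they might look after building the index on the initial data.
index = ToyIVF([(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)])
for row in [(1.0, 0.0), (9.0, 1.0), (0.0, 9.0)]:
    index.insert(row)

# A later row lands in a region that no centroid represents well.
new_row = (5.2, 4.9)
index.insert(new_row)

# A query right next to the new row ranks the centroids in a different order.
query = (4.9, 5.2)
print(index.search(query, probes=1))        # a distant older row, not new_row
print(index.probes_needed(query, new_row))  # every list has to be scanned
```

In this toy setup the new row and the query are close to each other but roughly equidistant from all centroids, so they fall on different sides of a list boundary and the row only turns up once every list is probed. Whether this is what actually happens inside the index is still a guess.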
The issue goes away if I reindex often. I have added a cron job that runs a reindex every X new rows, but I was wondering whether there's a way to achieve the same thing without the cron job, or whether there are plans to support that?
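For reference, the cron job logic is roughly the following. This is a minimal sketch, not a recommended setup: the index name `items_embedding_idx`, the threshold, and the way the row counts are obtained are all hypothetical, and `REINDEX INDEX CONCURRENTLY` assumes PostgreSQL 12 or later.

```python
def should_reindex(rows_now, rows_at_last_reindex, threshold):
    """True once at least `threshold` rows were added since the last reindex."""
    return rows_now - rows_at_last_reindex >= threshold


def reindex_sql(index_name):
    """Build the REINDEX statement, quoting the identifier defensively."""
    quoted = '"' + index_name.replace('"', '""') + '"'
    # CONCURRENTLY avoids blocking writes, but cannot run inside a transaction block.
    return f"REINDEX INDEX CONCURRENTLY {quoted};"


# Hypothetical numbers: last reindex at 100,000 rows, reindex every 10,000 new rows.
if should_reindex(rows_now=112_500, rows_at_last_reindex=100_000, threshold=10_000):
    print(reindex_sql("items_embedding_idx"))
```

The job would run the printed statement through whatever client it already uses and then store the new row count as the baseline for the next run.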
I have seen that #105 and #136 touch on a similar subject, so I guess the best option for now is the cron job?
Just to clarify, I did create the index after all the initial data was in the table. It's the rows added afterwards that worry me.