Fix gist contained #2

funny-falcon · 2011-02-17T06:32:26Z

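-- Table with a single prefix_range column.
create table tst (
  pref prefix_range
);

-- Fill it with the zero-padded five-digit strings '00001' .. '99999'.
insert into tst
select trim(to_char(i, '00000')) from generate_series(1, 99999) as i;

-- Reference count from a sequential scan (no index exists yet).
select count(*)
from tst
where pref <@ '55';

-- Build a GiST index and steer the planner away from sequential scans.
create index tst_ix on tst using gist ( pref );

set enable_seqscan = off;

-- Same query through the index; it should return the same count as above.
select count(*)
from tst
where pref <@ '55';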

dimitri · 2011-02-19T20:56:36Z

On my way to apply the debug fix.

I would apply the <@ fix too (I'd prefer to test it first, once I've set things up here), but don't you think we need to do the same thing for the @> code path? If not, why?

Thanks for contributing!

funny-falcon · 2011-02-19T21:30:15Z

Because if the union of the prefixes doesn't contain our query, then none of the individual prefixes contains it either (this follows from set theory). So the current implementation of @> is correct.
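To make the argument concrete, here is a minimal sketch in Python rather than the extension's C code. It assumes a simplified model in which every key is a plain string prefix and the union of several keys is their longest common prefix; the real prefix_range type also carries first/last characters. All function names are hypothetical and are not part of the extension.

```python
from os.path import commonprefix


def contains(pref: str, query: str) -> bool:
    """Simplified pref @> query: pref is a prefix of query."""
    return query.startswith(pref)


def contained_by(pref: str, query: str) -> bool:
    """Simplified pref <@ query: query is a prefix of pref."""
    return pref.startswith(query)


def union(prefixes: list[str]) -> str:
    """Simplified union key of an internal node: the longest common prefix."""
    return commonprefix(prefixes)


def may_hold_contained(node_key: str, query: str) -> bool:
    """Assumed safe internal-node test for <@: descend whenever the node key
    and the query overlap, i.e. one of them is a prefix of the other."""
    return node_key.startswith(query) or query.startswith(node_key)


leaves = ["55123", "55987", "56000"]
node_key = union(leaves)

# @> can prune on the union: a leaf that is a prefix of the query forces the
# (shorter) union to be a prefix of the query as well.
prunable_for_contains = not contains(node_key, "41999")

# <@ cannot reuse the leaf test on the union: the union fails it for '55'
# although two of the leaves pass it.
union_passes_leaf_test = contained_by(node_key, "55")
leaves_matching = [leaf for leaf in leaves if contained_by(leaf, "55")]
```

In this model the @> path may safely skip a subtree whose union fails the test, while the <@ path has to fall back to an overlap test on internal nodes, otherwise it skips subtrees that still hold matching leaves.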

I am now trying to extend "prefix" to allow ranges longer than one symbol.
Some preparations are under way in the prepare_big_range branch.
Main idea: phone number N belongs to range '123[23-45]' if len(N) >= 5 AND '12323' <= N < '12346' (the last prefix is incremented).
I already use this technique without a gist index: a btree index on ( (to_part(phone)), (from_part(phone)) ), where to_part returns the incremented last prefix and from_part returns the first prefix.
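The membership rule can be sketched as follows, in Python and for digits-only ranges. The names from_part and to_part come from the comment above, but their bodies here are assumptions, as are parse_range, belongs, and the choice to return None when the upper bound cannot be incremented (all nines).

```python
import re

# Hypothetical notation parser for ranges such as '123[23-45]'.
RANGE_RE = re.compile(r"^(\d*)\[(\d+)-(\d+)\]$")


def parse_range(text: str) -> tuple[str, str, str]:
    m = RANGE_RE.match(text)
    if m is None:
        raise ValueError(f"not a range: {text!r}")
    prefix, first, last = m.groups()
    if len(first) != len(last) or first > last:
        raise ValueError(f"bad bounds in {text!r}")
    return prefix, first, last


def from_part(text: str) -> str:
    """First prefix of the range: '123[23-45]' gives '12323'."""
    prefix, first, _ = parse_range(text)
    return prefix + first


def to_part(text: str):
    """Incremented last prefix: '123[23-45]' gives '12346'.
    Returns None when the last prefix is all nines (no upper bound)."""
    prefix, _, last = parse_range(text)
    upper = prefix + last
    if set(upper) == {"9"}:
        return None
    return str(int(upper) + 1).zfill(len(upper))


def belongs(number: str, text: str) -> bool:
    """len(N) >= len(lower) AND lower <= N < upper, compared as strings."""
    lower, upper = from_part(text), to_part(text)
    return (
        len(number) >= len(lower)
        and lower <= number
        and (upper is None or number < upper)
    )
```

Because both bounds are plain strings, the same two comparisons can be served by an ordinary btree index on the pair of computed bounds, which is the point of the technique described above.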

At the moment I am stuck on the penalty function: I could not write something as good as yours.
Could you describe its main principle?

dimitri · 2011-02-19T21:39:58Z

I had a TODO item here, so that we have '1234-1235' ranges, and even '123-356' ones. That should allow for much denser indexes. If we want to expand on the current notation, I don't think we have to keep the prefix/first/last idea; let's just go to the plain range [start, end].

Oooops, the penalty function is poorly commented, sorry about that. Will try to fix that later, but I haven't been looking at this code for more than 2 years now I think.

funny-falcon · 2011-02-20T07:53:51Z

> so that we have '1234-1235' ranges, and even '123-356' ones

Yeah, that's what I want too :)

Another thought:
Since its primary use case is telephony, I stick to the following rules at my work:

range 123000-123xxx equal to ranges 12300-123xxx and 1230-123xxx and 123-123xxx
range 123xxx-123999 equal to ranges 123xxx-12399 and 123xxx-1239 and 123xxx-123
range 123000-123999 equal to ... 123
So I always truncate ranges according to these rules.

But this applies only to numeric ranges. Non-numeric prefixes could still be treated as exact
(for example, in one case I manually add the prefix 'rt' to a number), with the rest of the string treated as numeric.
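A minimal Python sketch of these truncation rules, under the assumption that a range is given as two strings, each made of an optional non-digit tag (such as 'rt') followed by digits. The function names and the 'start-end' output notation are hypothetical.

```python
import re


def split_exact(text: str) -> tuple[str, str]:
    """Split a leading non-digit tag (kept exact) from the numeric rest."""
    m = re.match(r"^(\D*)(\d*)$", text)
    if m is None:
        raise ValueError(f"digits must follow the tag in {text!r}")
    return m.group(1), m.group(2)


def truncate_range(start: str, end: str) -> str:
    """Drop trailing zeros from the start and trailing nines from the end.
    If both bounds collapse to the same string, the range is a plain prefix."""
    tag_start, low = split_exact(start)
    tag_end, high = split_exact(end)
    if tag_start != tag_end:
        raise ValueError("both bounds must carry the same non-numeric tag")
    low = low.rstrip("0")
    high = high.rstrip("9")
    if low == high:
        return tag_start + low
    return f"{tag_start}{low}-{tag_end}{high}"
```

For instance, truncate_range('123000', '123999') collapses to the plain prefix '123', and a tagged pair such as 'rt123000' and 'rt123999' collapses to 'rt123'.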

What do you think about it?

lathspell · 2012-10-11T08:57:15Z

I've also encountered this bug (http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=690160) and the patch seems to fix it. Any reasons why it is not yet applied?

dimitri · 2012-10-15T09:33:41Z

Lack of round tuits. Sorry about the delay, I've now been able to test it locally and apply the patch.

Thanks a lot @funny-falcon for your contribution!

funny-falcon added 2 commits February 17, 2011 09:28

fix debug mode

c57de26

fix gist contained

99a5739

dimitri closed this Oct 15, 2012
