Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix gist contained #2

Closed
wants to merge 2 commits into from

Conversation

funny-falcon
Copy link
Contributor

create table tst (
  pref prefix_range
);

insert into tst
select trim(to_char(i, '00000')) from generate_series(1, 99999) as i;

select count(*)
from tst
where pref <@ '55';

create index tst_ix on tst using gist ( pref );

set enable_seqscan = off;

select count(*)
from tst
where pref <@ '55';

@dimitri
Copy link
Owner

dimitri commented Feb 19, 2011

On my way to apply the debug fix.

The <@ fix, I would apply (I'd prefer to test first, setting up things here), but don't you think we need to do the same thing for the @> code path too? If we don't, why?

Thanks for contributing!

@funny-falcon
Copy link
Contributor Author

Because, if union of prefixes doesn't contain our query, then each of prefixes also doesn't contain our query (hence to set theory). So that current implementation is correct.

I try now to extend "prefix" to allow ranges longer than one symbol.
Some preparations are going in prepare_big_range branch.
Main thought: phone number N belongs to range '123[23-45]' if len(N) >= 5 AND '12323' <= N < '12346' - last prefix is incremented.
I use this technique to work without gist index - by btree index on ( (to_part(phone)), (from_part(phone))) where to_part returns incremented last prefix and from_part - first prefix.

At the moment I stuck at the penalty function - could not write something as good as yours.
Could you describe main principle of it?

@dimitri
Copy link
Owner

dimitri commented Feb 19, 2011

I had a TODO item here, so that we have '1234-1235' ranges, and even '123-356' ones. That should allow for much denser indexes. If we want to expand on the current notation, I don't think we have to keep prefix/first/last idea, let's just go to the plain range [start, end].

Oooops, the penalty function is poorly commented, sorry about that. Will try to fix that later, but I haven't been looking at this code for more than 2 years now I think.

@funny-falcon
Copy link
Contributor Author

so that we have '1234-1235' ranges, and even '123-356' one
Yeah, that's what I want too :)

Another thought:
Since it`s primary usage case is telephony, I stick to use following rules at my work:

  • range 123000-123xxx equal to ranges 12300-123xxx and 1230-123xxx and 123-123xxx
  • range 123xxx-123999 equal to ranges 123xxx-12399 and 123xxx-1239 and 123xxx-123
  • range 123000-123999 equal to ... 123
    So that I always truncate ranges according to this rules.

But it applied only to numeric ranges. It could still count non-numeric prefixes as exact
(for example, in one case I manually add prefix 'rt' to number) and count rest of string as numeric.

What do you think about?

@lathspell
Copy link

I've also encountered this bug (http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=690160) and the patch seems to fix it. Any reasons why it is not yet applied?

@dimitri
Copy link
Owner

dimitri commented Oct 15, 2012

Lack of round tuits. Sorry about the delay, I've now been able to test it locally and apply the patch.

Thanks a lot @funny-falcon for your contribution!

@dimitri dimitri closed this Oct 15, 2012
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants