get_rel_size() and get_path_keys() questions #43

johnmudd · 2013-12-06T20:16:54Z

I just want to verify some details before I implement these methods. My underlying data is ISAM files with indexes.

get_rel_size():
Do I return the number of rows that I expect my execute() method to yield or the numbers of rows that Postgres will return after applying the full search criteria? In get_rel_size() I plan to look at the quals (if any), determine which indexes I can use based on the quals, calculate the number of rows I'll yield for each index and return the lowest number, the number associated with the “best” index. It's similar to the steps I'll follow in execute().

My assumption is that I should use an index whenever possible to help reduce the number of records I’ll yield and to minimize the number of reads I’ll perform to find these candidate records. I also assume there’s not much benefit in applying every qual to the records. Postgres can do that filtering about as efficiently as I can. And I assume in some cases the full search criteria is not represented in the quals so in those cases I have no choice but to let Postgres perform the final filtering.

Will execute() always receive the exact same quals as get_rel_size()? If so then I'll save my index selection from get_rel_size() for later use in execute(). In fact, I probably should cache my decisions since the app will be repeating a limited set of queries.

get_path_keys():
I plan to return one row count estimate for each index in my ISAM file. In the case of compound indexes I'll return one estimate for each combination of fields. So in the case of an index of three fields such as [a,b,c] I’ll return row counts for [a], [b], [a,b], [c], [a,c], [b,c] and [a,b,c]. I’m wondering if these estimates on the combinations will be misleading though. I think I can return accurate estimates but the numbers won’t reflect the amount of reads required to locate the records. For example, if only field “a” appears in the search criteria then I can use the index to go directly to the target records. But it will be more complicated for searches by “c” alone. The index will help me skip some records but in the worst case I will have to read all records to find matches on “c” alone. Is that a valid concern?

johnmudd closed this as completed Dec 9, 2013

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

get_rel_size() and get_path_keys() questions #43

get_rel_size() and get_path_keys() questions #43

johnmudd commented Dec 6, 2013

get_rel_size() and get_path_keys() questions #43

get_rel_size() and get_path_keys() questions #43

Comments

johnmudd commented Dec 6, 2013