Refs #4594 - Adds ability to remove DISTINCT modifier from SQL by passin... #5241

Closed
wants to merge 2 commits into
from

Projects

None yet

4 participants

@Srokap
Collaborator
Srokap commented Mar 17, 2013

...g 'distinct' => false in $options. That does not fix the ticket, but gives ability to optimize queries without breaking compatibility.

#418 rebased on master branch

Srokap added some commits Nov 4, 2012
@Srokap Srokap Refs #4594 - Adds ability to remove DISTINCT modifier from SQL by pas…
…sing 'distinct' => false in $options. That does not fix the ticket, but gives ability to optimize queries without breaking compatibility.
68546a2
@Srokap Srokap Refs #4594 - Addded some more egef* 'distinct' usages df25045
@Srokap
Collaborator
Srokap commented Mar 17, 2013

Rebased to most recent master to pass Travis tests

@ewinslow
Member

Is there somewhere besides messages that I can test this?

@Srokap
Collaborator
Srokap commented Mar 18, 2013

We want it to be safe in almost every non-fancy egef* call. I suppose it will break on egef_metadata as it always does this n_table join (I'd like to remove it when not actually used by specifying other params).

I can dig into this deeper before you test. I'd like to add more usages anyway.

@cash
cash commented Jul 1, 2013

I would rather make the distinct be false by default. It will be easier to make it false on a major change to the API. For example, we could introduce a data mapper in a future version - something like this:

$posts = elgg()->entities->fetch(array(
    'type' => 'object',
    'subtype' => 'blog',
    'owner_guid' => $user_guid,
));

Using this API would have distinct turned off by default while using the old elgg_get_entities() has it turned on by default.

Do you think it would be possible to detect when we need to use 'distinct'?

@ewinslow
Member
ewinslow commented Jul 1, 2013

Ooooo Evan likes Cash's API idea.

Evan Winslow
http://community.elgg.org/profile/ewinslow

On Mon, Jul 1, 2013 at 11:46 AM, Cash Costello notifications@github.comwrote:

I would rather make the distinct be false by default. It will be easier to
make it false on a major change to the API. For example, we could introduce
a data mapper in a future version - something like this:

$posts = elgg()->entities->fetch(array(
'type' => 'object',
'subtype' => 'blog',
'owner_guid' => $user_guid,
));

Using this API would have distinct turned off by default while using the
old elgg_get_entities() has it turned on by default.

Do you think it would be possible to detect when we need to use 'distinct'?


Reply to this email directly or view it on GitHubhttps://github.com/Elgg/Elgg/pull/5241#issuecomment-20301828
.

@Srokap
Collaborator
Srokap commented Jul 1, 2013

I'm thinking a bit about way of introducing query object, but entities->fetch is nice as first step. I was thinking about adapting Doctrine's Query Builder https://github.com/doctrine/doctrine2/blob/master/lib/Doctrine/ORM/QueryBuilder.php + some additional API to make whole thing more tuned to Elgg plugins world (ie. how to get existing join /by smarter name than random number/ to add more wheres regarding it). Refs: #5071

Another thing is that elgg()->entities->fetch could return collection object that is lazily evaluated. Something based on ElggBatch probably. Not sure if we have a ticket for it already.

I think detecting need for distinct may be tricky to do reliably. We could do some analysis, but it seem to involve analyzing joins and wheres structure so we could catch situations when it's very likely to be necessary to use distinct (ie. join with metadata table on entity guid but where on metadata with multiple names or alternative of names?). Seems not trivial but doable to some extent.

@mrclay
Member
mrclay commented Aug 12, 2013

I don't want to hold this up debating a querying API. The commits so far look good. Proposal: Rebase. If distinct is not set in the $options array, then pre-process the returned rows watching for duplicate GUIDs. If found, filter them out and log a warning/register_error (not sure which).

@mrclay
Member
mrclay commented Nov 24, 2013

Suggest bumping this to 1.10. Manually filtering duplicates will not work for the COUNT queries so that's out. We could always change the default to false later in a new API.

@Srokap Srokap self-assigned this Feb 16, 2014
@ewinslow ewinslow removed this from the Elgg 1.10.0 milestone Jun 13, 2014
@mrclay
Member
mrclay commented Oct 20, 2014

1.x PR #7338

@mrclay mrclay closed this Oct 20, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment