Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unable to search with multiple relationships to the same table #94

Open
diresquirrel opened this issue Jun 25, 2014 · 21 comments
Open

Unable to search with multiple relationships to the same table #94

diresquirrel opened this issue Jun 25, 2014 · 21 comments

Comments

@diresquirrel
Copy link

When working with a model with multiple belongs_to relationships to the same table, the query being generated ends up ignoring the relationships due to the WHERE clause of the query going directly to the table name rather than the association. For example:

class support_ticket < ActiveRecord::Base
  belongs_to :requestor, class_name: "User"
  belongs_to :worker, class_name: "User"
  ...
  scoped_search :in => : requestor, :on => :name
  scoped_search :in => : worker, :on => :name
end

The WHERE clause in the query generated from those relationships is similar to the following:

WHERE (`users`.`name` LIKE '%john%') ...

Thus the filtering only applies to whichever association happens to JOIN to the users table without an alias. Ideally it would be:

WHERE (`workers_support_tickets`.`name` LIKE '%john%') ...

Or something similar, depending on how the db alias on the JOIN is generated by ActiveRecord for the query.

@anazar
Copy link

anazar commented Jul 16, 2014

@wvanbergen - I've encountered the same issue... any ideas on how to fix this?

@brocktimus
Copy link
Collaborator

Can you send me the full SQL of the query it generates at the moment? I don't have the code open right now, just figuring out if its already aliasing the join properly.

I think this should be fixed at the same time as #91 since they both relate to joining.

Although this one looks like it could be an easier fix. Since we already have the in key for the table we're joining to we could just use that as the table alias in the join.

I think it'll also need to be a LEFT OUTER JOIN rather than an INNER JOIN because some OR will be prefiltered by the inner join. Potentially a separate issue but very closely related.

@diresquirrel
Copy link
Author

@brocktimus - Here is the SQL for the joins in question:

SELECT
`claims`.`id` AS t0_r0,
`claims`.`company_id` AS t0_r1,
...
`users`.`first_name` AS t1_r2,
`users`.`last_name` AS t1_r3,
`users`.`email` AS t1_r4,
`users`.`role_id` AS t1_r5,
`users`.`company_id` AS t1_r6,
...
`primary_client_contacts_claims`.`first_name` AS t7_r2,
`primary_client_contacts_claims`.`last_name` AS t7_r3,
`primary_client_contacts_claims`.`email` AS t7_r4,
`primary_client_contacts_claims`.`role_id` AS t7_r5,
`primary_client_contacts_claims`.`company_id` AS t7_r6,
...
FROM `claims`
LEFT OUTER JOIN `users` ON `users`.`id` = `claims`.`negotiator_id`
...
LEFT OUTER JOIN `users` `primary_client_contacts_claims` ON `primary_client_contacts_claims`.`id` = `claims`.`primary_client_contact_id`
WHERE ((`users`.`last_name` LIKE '%Jones%'))
ORDER BY claims.created_at desc LIMIT 30 OFFSET 0

And our call to scoped_search is as follows:

scoped_search in: :primary_client_contact, on: :last_name, alias: :primary_client_contact_last_name

So as you can see in the SQL above, the WHERE clause should be referencing the users table via the primary_client_contacts_claims alias.

@anazar
Copy link

anazar commented Jul 17, 2014

@brocktimus - just confirming that the example from @diresquirrel is the same issue I'm having. Having multiple belongs_to on the same class causes issues when you want to create searches on each association.

@brocktimus
Copy link
Collaborator

@diresquirrel @anazar thanks guys, much appreciated. Could I also see:

  • The part of the model which sets up your joins to the negotiator and primary_client_contacts_claims
  • The code you're running to get that SQL. Mainly interested in joins, includes, references and search_for() bits.

If we're already doing LEFT OUTER JOINs then it should be a relatively trivial fix to change the WHERE alias I think. I'll go dive into that code now to check.

@diresquirrel
Copy link
Author

@brocktimus thanks for helping to nail this down.

Relationships in the model:

class Claim < ActiveRecord::Base
  ...
  belongs_to :negotiator, class_name: "User"
  belongs_to :primary_client_contact, class_name: "User"
  ...
end

Running the query:

  queue = Models::Claim.restrict!(current_user)
                        .includes(:negotiator, :vehicle, :company, :claim_label, :carrier, :subro)
                        .order(@search.sort_column + " " + @search.sort_direction)
                        .search_for(@search.to_s)
                        .paginate(page: page)

In this case, @search.to_s generates the following:

"primary_client_contact_last_name ~ \"Jones\""

Let me know if you need any other details.

@brocktimus
Copy link
Collaborator

OK this ends up being much more complicated than initially anticipated. The reason being how rails handles subsequent includes. I did some testing on my own projects while investigating.

For example look at the code generated in your project doing the following:

Claim.includes(:negotiator, :primary_client_contact).first
Claim.includes(:primary_client_contact, :negotiator).first

I think you'll find you end up with SQL like this

SELECT ...
FROM claims
LEFT OUTER JOIN `users` ON `users`.`id` = `claims`.`negotiator_id`
LEFT OUTER JOIN `users` `primary_client_contacts_claims` ON `primary_client_contacts_claims`.`id` = `claims`.`primary_client_contact_id`

SELECT ...
FROM claims
LEFT OUTER JOIN `users` ON `users`.`id` = `claims`.`primary_client_contact_id`
LEFT OUTER JOIN `users` `negotiators_claims` ON `negotiators_claims`.`id` = `claims`.`negotiator_id`

Basically the order they're included matters, which makes it very difficult to generate the appropriate where clause. Even if the includes are only happening internal to scoped search (not in this case) we then need to do complex tracking of "how many times we have included against this table and what each would be aliased as".

The current case is potentially even harder because they're included into the relation prior to #search_for being called. This means we would need to introspect the previous includes and then try and do all the figuring out, no idea if thats even possible.

I think rails should be aliasing all includes, just for consistency. I'm going to raise that in an issue and maybe try the pull request myself.

In the immediate term I think we can make the above query work as follows. I think this will work, at least on rails >= 4. It should work because when you pass an active record relation into a where clause it turns it into a sub select with a select(:id) appended by default.

scoped_relation = Models::Claim.search_for(@search.to_s)
queue = Models::Claim.restrict!(current_user)
                        .includes(:negotiator, :vehicle, :company, :claim_label, :carrier, :subro)
                        .order(@search.sort_column + " " + @search.sort_direction)
                        .where(id: scoped_relation)
                        .paginate(page: page)

In the short term I don't think we can do a single search against a single related table being joined to multiple times due to aforementioned complexities. My pull request here with the failing test is potentially a misnomer, if its not I'm going to try fixing it though.

Hows all that sound?

@brocktimus
Copy link
Collaborator

OK apparently that includes behaviour is working as intended rails/rails#16241.

So I'm going to have a look and see how hard it would be to make includes order independent. But that wouldn't make it in until 4.2 and is dependent on me being able to figure it out and it being allowed.

Other thing I'm considering would be to manually construct left outer joins with scoped search specific aliases. It means your query would end up having joins to scoped_search_negotiator and scoped_search_primary_contact or similar. These are completely separate from any joins / includes you do manually and only "used" by us when we're searching on that field. Potentially more work since we can't lean on AR as heavily, but that is both a good and bad thing in some respects.

Thoughts @anazar + @diresquirrel ?

@brocktimus
Copy link
Collaborator

It seems to be hitting:

The last of which you can see does different aliasing logic based on order as I thought.

Bad part being by the time it gets to the aliasing we're VERY far from the belongs_to references so I have a feeling any changes could break a lot of stuff. Will be asking rails core mailing list if its worth trying beforehand.

@diresquirrel
Copy link
Author

@brocktimus I was afraid this would be the case. I'm testing out your sample with the active record relation nested in the where clause to see if that will fit what we're trying to achieve.

The only issue I can immediately see with "scoped search"-specific left outer joins is that it could be creating duplicate joins for any tables already being included via AR. Not ideal in terms of query efficiency, but in our case it's not too horrible because for many of our queries we search against tables/relationships that we don't need to explicitly include to pull data.

I'm going to look a bit more at AR. My initial thought is that getting rid of the special case (https://github.com/rails/rails/blob/master/activerecord/lib/active_record/associations/alias_tracker.rb#L70) and treating all aliases the same would be the easiest path, but if that current implementation is the desired behavior then the solution gets much more complicated.

@brocktimus
Copy link
Collaborator

@diresquirrel thinking on it more mine will only work if its the only belongs_to of that table being searched for. Either by being the only one in the definition or an explicit search. Otherwise all searches will just hit the first alias.

I tried removing the if block which looks out of place, huge number of errors appeared. Including some segfaults! I think ts beyond my ability to try and remove that.

Yeah I think all of the options I'm coming up with are sub optimal from a query perspective. But since this library isn't designed to do super hi performance searching it shouldn't be too bad. Everything is based on ILIKE / LIKE so I think that will cause more slowness than extra joins which are used in filtering.

Options I'm coming up with I think would all be near equivalent:

  • Sub select rather than doing joins, quite awkward and likely slow on some DBs. In some basic testing PG perf was about equivalent though
  • Larger sub selects but similar principles
  • Aforementioned manual LEFT OUTER JOIN by us

So I think the manually aliased LEFT OUTER JOIN is only real option...

@anazar
Copy link

anazar commented Jul 26, 2014

@brocktimus - i agree that manually aliasing the left outer join seems like the only real option

@anazar
Copy link

anazar commented Jul 31, 2014

@brocktimus @wvanbergen -- any more thoughts on how to proceed here?

@brocktimus
Copy link
Collaborator

Maybe @wvanbergen can fix it quickly but I couldn't figure out a nice way. He's on holidays (I think) so unsure if he will reply.

I've started deconstructing the QueryBuilder to perform an analysis and then build it up using more Arel which was the big change we wanted to do when we dropped 2.x series. As a part of that I'm going to build in a JoinRegistry or similar and make sure any relationship needed for scoped search is joined once with a special scoped search alias.

@anazar
Copy link

anazar commented Aug 18, 2014

@wvanbergen -- any thoughts on this?

@brocktimus
Copy link
Collaborator

Just so everyone is aware, my current work in progress Arel Builder is https://github.com/brocktimus/scoped_search/blob/arel_builder/lib/scoped_search/arel_builder.rb.

I'm basically porting all SQL generation into Arel by just visiting the internal AST. Still got a fair amount to do and I'm only working on this in spare time. I basically want the public API to just be #wheres, #joins and #orders or similar. As little strings as possible, then less specific DB adapters should be required as well.

The #fields_for (private method) which ins't implemented will be returning Arel fields. Then its the point which "knows" which fields / tables are included in the query and can shove that into appropriate places without having to pass them around constantly.

Biggest unknowns is a few areas of the code base which have no tests or documentation so I'm unsure what they even do. ext_method stuff I'm looking at you. Then the has_many and has_many through stuff and making that work nicely across many versions of rails (they're changing that private API a lot).

@anazar
Copy link

anazar commented Oct 8, 2014

@brocktimus -- any luck making any progress on this?

@brocktimus
Copy link
Collaborator

Not yet. All this stuff is in spare time which is severely lacking of late 👎.

@ohadlevy
Copy link
Contributor

isnt this similar to what @abenari did in a7666e4 ?

@kbrock
Copy link

kbrock commented Mar 15, 2018

Is this still a problem?

@panbanda
Copy link

Yep, still a problem

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants