Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Another joinWith() approach #1618

Closed
creocoder opened this issue Dec 25, 2013 · 25 comments
Closed

Another joinWith() approach #1618

creocoder opened this issue Dec 25, 2013 · 25 comments

Comments

@creocoder
Copy link
Contributor

Suggest joinWith() removing and making one with() method with following signature:

with($with, $eagerLoading = true, $joinType = 'LEFT OUTER JOIN', $together = false);

to make API more logical.

@qiangxue
Copy link
Member

with() is an interface method shared by all DB solutions. However, not all DB solutions support join. So it's better we separate joinWith() from with().

@creocoder
Copy link
Contributor Author

So as i see the aspect of separation is:

However, not all DB solutions support join.

Ok, fine. Such DB solutions can ignore $together param for example. What other advantage of separating?

@qiangxue
Copy link
Member

The advantage of separation is that it is clearer - when you see the term joinWith(), your immediate response is that it will create a JOIN SQL and you should be extra careful. Using a $together parameter to indicate this is not as clear and could be easily overlooked.

@creocoder
Copy link
Contributor Author

@qiangxue Ok, anyway joinWith() looks for me like a fast hack to close #1581 issue. Dont you think issue was born by eliminating "together" feature from relations? How about more radical approach? I mean something like Yii 1 approach with dinamic relation options.

@qiangxue
Copy link
Member

We don't want to go for the 1.1 approach because it involves complicated table/column aliasing, which is not trivial to implement and is also tricky to use. The issue pops up because people are accustomed to the 1.1 approach which mixes the concept of data fetching and data filtering. In practice, it's very possible that you join with some relations while eager loading some other relations. Data fetching and filtering should be separated as designed in 2.0. The joinWith() implementation isn't a hack to #1581. I consider it as an important complement to with().

@slavcodev
Copy link
Contributor

I did not understand the need joinWith so this post does not make sense

@Ragazzo
Copy link
Contributor

Ragazzo commented Dec 25, 2013

@slavcodev lazyWith and eagerWith is more confusing. I think that people who knows that Yii2 AR first select all id's than select related AR by this ids would find more confusing eagerWith than joinWith, because second one is exactly pointing to what sql condition will be executed.

@creocoder
Copy link
Contributor Author

@qiangxue So the joinWith() role is strictly practical reasons, right? To simplify joins with relations.

@nineinchnick
Copy link
Contributor

I agree that this should be a separate method but I don't like @qiangxue explanation. You shouldn't state that data fetching and filtering should be separated. I think most database system, SQL or not, are faster in performing those tasks than PHP. Even NoSQL systems are advertised as providing an efficient map-reduce filtering method.

I think the most important cons are:

  • creating duplicate values in the result set
  • making the result set bigger
  • could create a large temporary data set, sometimes a subquery is faster than a join

The pros are:

  • easier to build conditions as one object
  • allow to use aggregate functions
  • very fast processing and little overhead (no primary key value lists, single query)
  • database planner can optimize the execution

I'm not listing aliasing as a con. Personally I never had trouble with that. Even if it's hard to implement it doesn't state anything about good design.

I like when three types of data loading are defined:

  • lazy loading
  • eager loading using multiple queries
  • eager loading using single query

Actually the terms lazy and eager should be well recognized by devs, as it is common to a lot of ORMs. Now introducing them to with and together creates the confusion. I'd say that eagerWith would be better than joinWith, even gramatically.

Recently I extensively audited the ActiveFinder from Yii 1 to add support for MANY_MANY through relations. It's complex, but complete. It's a shame to throw it away. I find it very useful and I build some tools around it.

To conclude, I think it's a very important feature and if it's not done well it can drive a lot devs away from Yii 2.

PS. Actually, the best solution is to use subqueries instead of separate queries and passing list of PK values. That allows cross-referencing tables with some limitations. This is just a crazy idea, I'm going to think about it more in near future when browsing the code.

@qiangxue
Copy link
Member

@creocoder Yes.

@nineinchnick The biggest drawback of AR in 2.0 is the potential performance issue for eager loading (even this may not be very true). Other than this, it has all the pros you listed for AR 1.1.

As you may have found out, the current AR 2.0 design allows it to be supported by NoSQL and other non-relational DBMS, and it even supports relation between ARs from different types of DBMS.

Regarding the term "eager" and "lazy", the rule for AR 2.0 is very simple: if you use with() or joinWith(), it is eager loading; otherwise it is lazy loading. For this reason, the terms eagerWith() and lazyWith() are not any more expressive. On the contrary, they may bring confusion (e.g. what does lazyWith() do?)

@nineinchnick
Copy link
Contributor

You're right, it's just a question what terms should we use for eager and eager-er, that is the two eager loading methods I mentioned.

I'd just like to address that the current method of eager loading by passing a list of PK values to a second query feels dirty. I will look into it and try to find a solution.

@creocoder
Copy link
Contributor Author

@nineinchnick

I'd just like to address that the current method of eager loading by passing a list of PK values to a second query feels dirty. I will look into it and try to find a solution.

You confuse eager loading with query strategy. Eager loading is wide term which related not only to ORMs. Both with() and joinWith() use eager loading methodology, but different query strategies. So your comment is related to query strategy of with() and not related to eager loading methodology at all.

@creocoder
Copy link
Contributor Author

@nineinchnick

Actually, the best solution is to use subqueries instead of separate queries and passing list of PK values.

Not all DBs support subqueries. Also there is some troubles with subqueries in some legacy mysql versions for example.

@qiangxue
Copy link
Member

I'd just like to address that the current method of eager loading by passing a list of PK values to a second query feels dirty.

We had the similar thought initially when we were redesigning AR. But our tests and investigation found that it is not that bad. It actually has many benefits, including the support for relational queries for NoSQL DBs. Performance wise, there is no proof that it would be worse than join (it may be faster in fact.) The only limitation is when you have too many PKs returned, the queries for related records could be too long to be accepted by DBMS. However, this is very uncommon.

@creocoder
Copy link
Contributor Author

@qiangxue Another idea for this:

$items = MyModel::find()
    ->with([
        'relation1' => function ($query, $options) {
            $options->together = true;
            $options->joinType = 'INNER JOIN';
        },
        'relation3' => function ($query, $options) {
            $query->andWhere(...);
            $options->together = true;
        },
        'relation4',
        'relation5',
    ])

there is no proof that it would be worse than join

I remember times when CActiveFinder::together become true by default with opposite argument :) So to be honest the real reason to avoid joins IS NOT perfomance. Reason is easy caching and some NoSQL dbs troubles. If someone will not use caching he will garantee loose perfomance with NoJOIN approach. Can prepare proofs to show NoJOIN approach without caching VS JOIN approach without caching.

@creocoder
Copy link
Contributor Author

@qiangxue And little more here:

However, not all DB solutions support join.

Fine, lets name this more abstractly as together. For RDBS its JOIN. I'm sure all NoSQL db solutions have possibility to make queries like that. Am i wrong? If not, where is the problem and why query strategy was changed to:

potential performance issue

strategy ?

@qiangxue
Copy link
Member

qiangxue commented Jan 8, 2014

@creocoder I'm interested to see your profiling results which compares non-join and join queries.

I have pretty much explained why we chose the current design and method names in my previous replies. I think we should spend time repeating the same debate again.

@creocoder
Copy link
Contributor Author

@qiangxue Ok will create profiling tests then with different environments: local sql server and remote sql server.

@qiangxue
Copy link
Member

qiangxue commented Jan 8, 2014

Cool. Make sure you disable query cache when testing: http://dev.mysql.com/doc/refman/5.1/en/query-cache.html

@cebe
Copy link
Member

cebe commented Jan 9, 2014

Fine, lets name this more abstractly as together. For RDBS its JOIN. I'm sure all NoSQL db solutions have possibility to make queries like that. Am i wrong?

No, most of them do not have the possibility to retrieve data in a join-like query. Querys do return records of one type in most cases and if it is multi type then the way the records are fetched is different from relation way but like cross type search or something like that. redis does not have it because it is key value only. elasticsearch can only fetch parent-child relation in one query. Not sure about Mongo.
In general you can not assume JOIN to be present in NoSQL DBMS Most of them have different goals and join is the most tricky part in querying data as far as I see so they leave it out as it may not easily be distributeable over a cluster for example.

@konapaz
Copy link

konapaz commented Sep 18, 2016

Is there any way to load eager data "with" or 'joinWith' for ORM by another database ?
I use Yii 2.0.9

For example
$companies = Company::find()->joinWith('city')->all();

In model Company I have

public function getCity() {
        return $this->hasOne(\common\models\City::className(), ['id' => 'city_id']);
   }

City is a model that has already configured to another database using:

 public static function getDb() {
        return Yii::$app->get('db_another_database');
    }

But is not work, the generated query does not include the name of another_database.city_table.id in condition LEFT JOIN, so a database exception occurs

I tried

$companies = Company::find()->leftJoin('another_database.city_table AS city' , 'company.city_id=city.id')->all();

But although the generated query is corrected the data of city for each company loaded by lazy method

@SilverFire
Copy link
Member

@konapaz Duplicates #5470

@konapaz
Copy link

konapaz commented Sep 19, 2016

So the best solution I found is

$companies = Company::find()
            ->joinWith(['city' => function($query) { 
                   return $query->from('another_database.city_table')
            }])->all();

So what is your opinion ? @SilverFire

@SilverFire
Copy link
Member

If you want to JOIN on plain SQL - yes, it's the best solution.
If eager loading in multiple queries is fine for you - use Company::find()->with('city')->all();

@keeper-evil
Copy link

#5470 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

9 participants