New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Listening for table operations #483

Closed
cowwoc opened this Issue Aug 26, 2013 · 24 comments

Comments

Projects
None yet
2 participants
@cowwoc
Contributor

cowwoc commented Aug 26, 2013

I'd like to detect programming errors (validate application-specific rules) by "listening" to what tables are being locked, in what order, and using what kind of lock (shared vs write-lock). If I detect an inconsistent locking order, I throw an exception.

I tried implementing this by subclassing/delegating to QueryDSL classes but they aren't designed to make this easy nor is it really what I'm interested in doing. I want to minimize the amount of work needed to implement this.

Right now I've implemented this by manually invoking LockMonitor (the class that checks the locking order) alongside QueryDSL calls, but I'd like to implement this in a more elegant manner. Any ideas?

@timowest

This comment has been minimized.

Show comment
Hide comment
@timowest

timowest Aug 26, 2013

Member

Do you look for a way to intercept Querydsl DML statements or only notifications? There isn't yet a suitable mechanism for this, but this could be easily added. Preferably in a quite general way.

Member

timowest commented Aug 26, 2013

Do you look for a way to intercept Querydsl DML statements or only notifications? There isn't yet a suitable mechanism for this, but this could be easily added. Preferably in a quite general way.

@cowwoc

This comment has been minimized.

Show comment
Hide comment
@cowwoc

cowwoc Aug 26, 2013

Contributor

I agree with you that a general mechanism is preferable. My goal is to register a listener and monitor all queries from a central location. The tricky part is that it's only makes sense to probe QueryDSL's metadata immediately before a DML is executed. Only these end-points know what tables are being joined, and the type of lock being used (think of sub-queries).

I took a look at SQLQuery.getMetaData() but it's not clear whether it contains all the information I need. For example, QueryMetaData.getJoins() returns JoinExpression but JoinExpression.getTarget() returns Expression<?> but I expect RelationalPath<?> because I want to invoke RelationalPath.getType(). Another problem is that queries, updates and deletes will need to implement a common interface for this to work (I don't know if this is already the case).

... so, where do we begin? :)

Contributor

cowwoc commented Aug 26, 2013

I agree with you that a general mechanism is preferable. My goal is to register a listener and monitor all queries from a central location. The tricky part is that it's only makes sense to probe QueryDSL's metadata immediately before a DML is executed. Only these end-points know what tables are being joined, and the type of lock being used (think of sub-queries).

I took a look at SQLQuery.getMetaData() but it's not clear whether it contains all the information I need. For example, QueryMetaData.getJoins() returns JoinExpression but JoinExpression.getTarget() returns Expression<?> but I expect RelationalPath<?> because I want to invoke RelationalPath.getType(). Another problem is that queries, updates and deletes will need to implement a common interface for this to work (I don't know if this is already the case).

... so, where do we begin? :)

@timowest

This comment has been minimized.

Show comment
Hide comment
@timowest

timowest Sep 3, 2013

Member

Here is a sketch I just threw together

interface SQLListener {

  void notifyQuery(QueryMetadata md);

  void notifyDelete(QueryMetadata md, RelationalPath<?> entity);

  void notifyMerge(QueryMetadata md, RelationalPath<?> entity, 
    List<Path<?>> keys, List<Path<?>> columns, List<Expression<?>> values,
    SubQueryExpression<?> subQuery);

  void notifyMerge(QueryMetadata md, RelationalPath<?> entity,
    List<SQLMergeBatch> batches);

  void notifyInsert(QueryMetadata md, RelationalPath<?> entity,
    List<Path<?>> columns, List<Expression<?>> values,
    SubQueryExpression<?> subQuery);

  void notifyInsert(QueryMetadata md, RelationalPath<?> entity,
    List<SQLInsertBatch> batches);

  void notifyUpdate(QueryMetadata md, RelationalPath<?> entity,
    List<Pair<Path<?>, Expression<?>> updates);

  void notifyUpdate(QueryMetadata md, RelationalPath<?> entity,
    List<SQLUpdateBatch> batches);

}

This covers all the elements that are contained in the queries and DML clauses including the batches.

Member

timowest commented Sep 3, 2013

Here is a sketch I just threw together

interface SQLListener {

  void notifyQuery(QueryMetadata md);

  void notifyDelete(QueryMetadata md, RelationalPath<?> entity);

  void notifyMerge(QueryMetadata md, RelationalPath<?> entity, 
    List<Path<?>> keys, List<Path<?>> columns, List<Expression<?>> values,
    SubQueryExpression<?> subQuery);

  void notifyMerge(QueryMetadata md, RelationalPath<?> entity,
    List<SQLMergeBatch> batches);

  void notifyInsert(QueryMetadata md, RelationalPath<?> entity,
    List<Path<?>> columns, List<Expression<?>> values,
    SubQueryExpression<?> subQuery);

  void notifyInsert(QueryMetadata md, RelationalPath<?> entity,
    List<SQLInsertBatch> batches);

  void notifyUpdate(QueryMetadata md, RelationalPath<?> entity,
    List<Pair<Path<?>, Expression<?>> updates);

  void notifyUpdate(QueryMetadata md, RelationalPath<?> entity,
    List<SQLUpdateBatch> batches);

}

This covers all the elements that are contained in the queries and DML clauses including the batches.

@cowwoc

This comment has been minimized.

Show comment
Hide comment
@cowwoc

cowwoc Sep 3, 2013

Contributor

Okay. Two questions:

  1. Why does insert/update/delete/merge have QueryMetadata (the word query implies Select)?
  2. Does QueryMetadata allow me to retrieve the list of table being queried (union across entire operation, including subqueries)?
Contributor

cowwoc commented Sep 3, 2013

Okay. Two questions:

  1. Why does insert/update/delete/merge have QueryMetadata (the word query implies Select)?
  2. Does QueryMetadata allow me to retrieve the list of table being queried (union across entire operation, including subqueries)?
@timowest

This comment has been minimized.

Show comment
Hide comment
@timowest

timowest Sep 4, 2013

Member
  1. QueryMetadata contains flags and where conditions.
  2. Yes, but there is no direct access for it.
Member

timowest commented Sep 4, 2013

  1. QueryMetadata contains flags and where conditions.
  2. Yes, but there is no direct access for it.
@cowwoc

This comment has been minimized.

Show comment
Hide comment
@cowwoc

cowwoc Sep 4, 2013

Contributor
  1. Can you add direct access to the list of tables being operated on?
  2. Regarding "batches", does this contain the information actually inserted/merged/deleted and in the normal (non-batch) case the list contains a single element?
  3. That last method, containing List<List<Pair<Path<?>,Expression<?>>>> batches, looks confusing. Perhaps we can introduce some intermediate type to make this easier to read.
Contributor

cowwoc commented Sep 4, 2013

  1. Can you add direct access to the list of tables being operated on?
  2. Regarding "batches", does this contain the information actually inserted/merged/deleted and in the normal (non-batch) case the list contains a single element?
  3. That last method, containing List<List<Pair<Path<?>,Expression<?>>>> batches, looks confusing. Perhaps we can introduce some intermediate type to make this easier to read.
@timowest

This comment has been minimized.

Show comment
Hide comment
@timowest

timowest Sep 4, 2013

Member

Can you add direct access to the list of tables being operated on?

Yes

Regarding "batches", does this contain the information actually inserted/merged/deleted and in the normal (non-batch) case the list contains a single element?

Merge, Insert and Update come in pairs of methods. The first one called in non-batch cases and the second one in batch cases.

That last method, containing List<List<Pair<Path,Expression>>> batches, looks confusing. Perhaps we can introduce some intermediate type to make this easier to read.

Yes, good idea.

Member

timowest commented Sep 4, 2013

Can you add direct access to the list of tables being operated on?

Yes

Regarding "batches", does this contain the information actually inserted/merged/deleted and in the normal (non-batch) case the list contains a single element?

Merge, Insert and Update come in pairs of methods. The first one called in non-batch cases and the second one in batch cases.

That last method, containing List<List<Pair<Path,Expression>>> batches, looks confusing. Perhaps we can introduce some intermediate type to make this easier to read.

Yes, good idea.

timowest added a commit that referenced this issue Sep 4, 2013

@cowwoc

This comment has been minimized.

Show comment
Hide comment
@cowwoc

cowwoc Sep 4, 2013

Contributor

Merge, Insert and Update come in pairs of methods. The first one called in non-batch cases and the second one in batch cases.

Ah, I get it now. I misread the API (didn't notice there were two separate methods for batch and non-batch operations).

Okay, so I guess this event listener looks good. Once you add the ability to look up the list of affected tables, I should be able to test it.

Contributor

cowwoc commented Sep 4, 2013

Merge, Insert and Update come in pairs of methods. The first one called in non-batch cases and the second one in batch cases.

Ah, I get it now. I misread the API (didn't notice there were two separate methods for batch and non-batch operations).

Okay, so I guess this event listener looks good. Once you add the ability to look up the list of affected tables, I should be able to test it.

@timowest

This comment has been minimized.

Show comment
Hide comment
@timowest

timowest Sep 11, 2013

Member

Do you have any opinion how the listener should be given to the queries and clauses? At first I thought about putting it as a member into Configuration, but the listener lifecycle seems shorter.

Maybe some kind of Session concept is also needed now on the Querydsl SQL level?

Session {
    Configuration configuration;
    SQLListener listener;
   // future: first & second level cache etc
}
Member

timowest commented Sep 11, 2013

Do you have any opinion how the listener should be given to the queries and clauses? At first I thought about putting it as a member into Configuration, but the listener lifecycle seems shorter.

Maybe some kind of Session concept is also needed now on the Querydsl SQL level?

Session {
    Configuration configuration;
    SQLListener listener;
   // future: first & second level cache etc
}
@cowwoc

This comment has been minimized.

Show comment
Hide comment
@cowwoc

cowwoc Sep 12, 2013

Contributor
  1. The design should support multiple listeners.
  2. I'm not too familiar with the purpose of Configuration. Can you please explain why SQLListener would have a shorter lifecycle than Configuration?
Contributor

cowwoc commented Sep 12, 2013

  1. The design should support multiple listeners.
  2. I'm not too familiar with the purpose of Configuration. Can you please explain why SQLListener would have a shorter lifecycle than Configuration?
@timowest

This comment has been minimized.

Show comment
Hide comment
@timowest

timowest Sep 13, 2013

Member
  1. Yes, that's possible
  2. Configuration is meant to be used for the whole lifetime of the running application. It is used to declare mappings from Java to JDBC types. SQLListener might be used for shorter lifetimes, for example if it is used to record change sets and to handle them at transaction commit/rollback boundaries.
Member

timowest commented Sep 13, 2013

  1. Yes, that's possible
  2. Configuration is meant to be used for the whole lifetime of the running application. It is used to declare mappings from Java to JDBC types. SQLListener might be used for shorter lifetimes, for example if it is used to record change sets and to handle them at transaction commit/rollback boundaries.
@cowwoc

This comment has been minimized.

Show comment
Hide comment
@cowwoc

cowwoc Sep 13, 2013

Contributor

I understand.

I would insert this behavior at two levels:

  1. Per-query listeners.
  2. Global listeners.

For per-query listeners, you'd invoke addListener() on an individual query object. This is the bulk of the work, is conceptually easy to understand and is thread-safe.

Global listeners are a convenience mechanism (inversion of control) where you register the listeners once, and anytime a new query instance is instantiated, it automatically pulls the currently-registered listeners from some global location.

For the latter case, you'd need to ensure thread-safety and consider whether users need to be able to change the listeners over the lifetime of the Configuration. I personally don't see a need to removeListener() or modify the list of global listener over the lifetime of Configuration but it doesn't mean others won't come up with such a case one day.

Contributor

cowwoc commented Sep 13, 2013

I understand.

I would insert this behavior at two levels:

  1. Per-query listeners.
  2. Global listeners.

For per-query listeners, you'd invoke addListener() on an individual query object. This is the bulk of the work, is conceptually easy to understand and is thread-safe.

Global listeners are a convenience mechanism (inversion of control) where you register the listeners once, and anytime a new query instance is instantiated, it automatically pulls the currently-registered listeners from some global location.

For the latter case, you'd need to ensure thread-safety and consider whether users need to be able to change the listeners over the lifetime of the Configuration. I personally don't see a need to removeListener() or modify the list of global listener over the lifetime of Configuration but it doesn't mean others won't come up with such a case one day.

@cowwoc

This comment has been minimized.

Show comment
Hide comment
@cowwoc

cowwoc Sep 13, 2013

Contributor

I think that SQLListener will mostly likely be bound to a request scope (which consists of one or more transactions).

Contributor

cowwoc commented Sep 13, 2013

I think that SQLListener will mostly likely be bound to a request scope (which consists of one or more transactions).

timowest added a commit that referenced this issue Sep 14, 2013

timowest added a commit that referenced this issue Sep 14, 2013

timowest added a commit that referenced this issue Sep 24, 2013

@timowest

This comment has been minimized.

Show comment
Hide comment
@timowest

timowest Oct 20, 2013

Member

Released in 3.2.4

Member

timowest commented Oct 20, 2013

Released in 3.2.4

@timowest timowest closed this Oct 20, 2013

@cowwoc

This comment has been minimized.

Show comment
Hide comment
@cowwoc

cowwoc Oct 24, 2013

Contributor

Timo,

Can you add direct access to the list of tables being operated on?

Yes

Did you ever get around to adding this? And how do I find out if SELECT ... FOR UPDATE was used?

Contributor

cowwoc commented Oct 24, 2013

Timo,

Can you add direct access to the list of tables being operated on?

Yes

Did you ever get around to adding this? And how do I find out if SELECT ... FOR UPDATE was used?

@timowest

This comment has been minimized.

Show comment
Hide comment
@timowest

timowest Oct 24, 2013

Member

Did you ever get around to adding this?

Yes, it is available via RelationalPathExtractor
http://www.querydsl.com/static/querydsl/3.2.4/apidocs/com/mysema/query/sql/RelationalPathExtractor.html

And how do I find out if SELECT ... FOR UPDATE was used?

That is currently not possible in a clean way, since Querydsl SQL handles for update as a string addition to the end of the query.

The flags are available via queryMetadata.getFlags()

Member

timowest commented Oct 24, 2013

Did you ever get around to adding this?

Yes, it is available via RelationalPathExtractor
http://www.querydsl.com/static/querydsl/3.2.4/apidocs/com/mysema/query/sql/RelationalPathExtractor.html

And how do I find out if SELECT ... FOR UPDATE was used?

That is currently not possible in a clean way, since Querydsl SQL handles for update as a string addition to the end of the query.

The flags are available via queryMetadata.getFlags()

@cowwoc

This comment has been minimized.

Show comment
Hide comment
@cowwoc

cowwoc Oct 24, 2013

Contributor

So, I finally got around to using the event listeners and here is some feedback:

  • My listener checks for a consistent locking order in order to reduce the probability of deadlocks. It needs access to the Connection to check the transaction isolation mode because READ_COMMITTED (or lower) do not hold locks so I need to take this into consideration.
  • My listener needed access to the SQLTemplates in order to check for SELECT ... FOR UPDATE.

So this makes me wonder whether listeners should be given a reference to:

  • Connection
  • SQLTemplates

perhaps by way of QueryMetadata.

We also need to find an easier way to check for FOR UPDATE. On that topic:

  1. Why does QueryFlag's first constructor invoke this(position, TemplateExpressionImpl.create(Object.class, flag));? Shouldn't the type be String.class instead of Object.class?
  2. Is there a QueryFlag "extractor" that'll give me back a string to compare against SQLTemplates.getForUpdate()?
Contributor

cowwoc commented Oct 24, 2013

So, I finally got around to using the event listeners and here is some feedback:

  • My listener checks for a consistent locking order in order to reduce the probability of deadlocks. It needs access to the Connection to check the transaction isolation mode because READ_COMMITTED (or lower) do not hold locks so I need to take this into consideration.
  • My listener needed access to the SQLTemplates in order to check for SELECT ... FOR UPDATE.

So this makes me wonder whether listeners should be given a reference to:

  • Connection
  • SQLTemplates

perhaps by way of QueryMetadata.

We also need to find an easier way to check for FOR UPDATE. On that topic:

  1. Why does QueryFlag's first constructor invoke this(position, TemplateExpressionImpl.create(Object.class, flag));? Shouldn't the type be String.class instead of Object.class?
  2. Is there a QueryFlag "extractor" that'll give me back a string to compare against SQLTemplates.getForUpdate()?
@cowwoc

This comment has been minimized.

Show comment
Hide comment
@cowwoc

cowwoc Oct 24, 2013

Contributor

FYI, here is how I ended up checking for FOR UPDATE:

boolean forUpdate = false;
for (QueryFlag queryFlags: md.getFlags())
{
    Expression<?> flag = queryFlags.getFlag();
    if (!(flag instanceof TemplateExpression))
        continue;
    TemplateExpression<?> expression = (TemplateExpression<?>) flag;
    Template template = expression.getTemplate();
    if (template.toString().equals(dialect.getForUpdate()))
    {
        forUpdate = true;
        break;
    }
}

Definitely not intuitive or efficient.

Contributor

cowwoc commented Oct 24, 2013

FYI, here is how I ended up checking for FOR UPDATE:

boolean forUpdate = false;
for (QueryFlag queryFlags: md.getFlags())
{
    Expression<?> flag = queryFlags.getFlag();
    if (!(flag instanceof TemplateExpression))
        continue;
    TemplateExpression<?> expression = (TemplateExpression<?>) flag;
    Template template = expression.getTemplate();
    if (template.toString().equals(dialect.getForUpdate()))
    {
        forUpdate = true;
        break;
    }
}

Definitely not intuitive or efficient.

@cowwoc

This comment has been minimized.

Show comment
Hide comment
@cowwoc

cowwoc Oct 24, 2013

Contributor

3.. Please add more Javadoc to SQLListener, especially for the batch operations. For example, in the case of notifyDeletes(QueryMetadata md, RelationalPath<?> entity, List<QueryMetadata> batches) it's not clear what is the difference between the first and last argument (if the last argument contains the batches what do I need the first argument for?)

Contributor

cowwoc commented Oct 24, 2013

3.. Please add more Javadoc to SQLListener, especially for the batch operations. For example, in the case of notifyDeletes(QueryMetadata md, RelationalPath<?> entity, List<QueryMetadata> batches) it's not clear what is the difference between the first and last argument (if the last argument contains the batches what do I need the first argument for?)

@cowwoc

This comment has been minimized.

Show comment
Hide comment
@cowwoc

cowwoc Oct 24, 2013

Contributor

4.. Why does RelationalPathExtractor return a List instead of a Set? Is it because technically speaking the SQL specification does not guarantee execution order for multi-table operations? Could returning a Set instead of a List ever get us in trouble?

Contributor

cowwoc commented Oct 24, 2013

4.. Why does RelationalPathExtractor return a List instead of a Set? Is it because technically speaking the SQL specification does not guarantee execution order for multi-table operations? Could returning a Set instead of a List ever get us in trouble?

@timowest

This comment has been minimized.

Show comment
Hide comment
@timowest

timowest Oct 25, 2013

Member

Maybe SQLQueryMetadata as a subinterface of QueryMetadata could work. It could provide SQL specific flags in a direct way (e.g. sqlQueryMetadata.isForUpdate()).

  1. Object is the class of the of the template expression.
  2. No, but that could be covered via sqlQueryMetadata.
  3. Ok.
  4. RelationalPathExtractor returns a Set
Member

timowest commented Oct 25, 2013

Maybe SQLQueryMetadata as a subinterface of QueryMetadata could work. It could provide SQL specific flags in a direct way (e.g. sqlQueryMetadata.isForUpdate()).

  1. Object is the class of the of the template expression.
  2. No, but that could be covered via sqlQueryMetadata.
  3. Ok.
  4. RelationalPathExtractor returns a Set
@cowwoc

This comment has been minimized.

Show comment
Hide comment
@cowwoc

cowwoc Oct 25, 2013

Contributor
  1. I still don't understand this. What are some possible values (aside from Object) for this property?
  2. Okay, do I need to open a separate issue for this?
  3. Okay, I'll keep my eye open for a related commit.
  4. Right, but my question was, why?
Contributor

cowwoc commented Oct 25, 2013

  1. I still don't understand this. What are some possible values (aside from Object) for this property?
  2. Okay, do I need to open a separate issue for this?
  3. Okay, I'll keep my eye open for a related commit.
  4. Right, but my question was, why?
@timowest

This comment has been minimized.

Show comment
Hide comment
@timowest

timowest Oct 25, 2013

Member

I still don't understand this. What are some possible values (aside from Object) for this property?

In general Object is used as the template type if the template expression represents an SQL snippet which doesn't have any semantics as an individual expression

Okay, do I need to open a separate issue for this?

Yes, that might be better covered in another issue

Okay, I'll keep my eye open for a related commit.
Right, but my question was, why?

You asked why List and not Set. With List the semantics would be much more tricky.

  • What kind of order should be used?
  • What about parent paths of columns?
  • What are use cases where the order is significant?
Member

timowest commented Oct 25, 2013

I still don't understand this. What are some possible values (aside from Object) for this property?

In general Object is used as the template type if the template expression represents an SQL snippet which doesn't have any semantics as an individual expression

Okay, do I need to open a separate issue for this?

Yes, that might be better covered in another issue

Okay, I'll keep my eye open for a related commit.
Right, but my question was, why?

You asked why List and not Set. With List the semantics would be much more tricky.

  • What kind of order should be used?
  • What about parent paths of columns?
  • What are use cases where the order is significant?
@cowwoc

This comment has been minimized.

Show comment
Hide comment
@cowwoc

cowwoc Oct 25, 2013

Contributor

You asked why List and not Set. With List the semantics would be much more tricky.

What kind of order should be used?
What about parent paths of columns?
What are use cases where the order is significant?

Okay. Let's leave it as Set. In any case I shouldn't be relying on the table order because the SQL specification doesn't define a locking order for multi-table operations.

Contributor

cowwoc commented Oct 25, 2013

You asked why List and not Set. With List the semantics would be much more tricky.

What kind of order should be used?
What about parent paths of columns?
What are use cases where the order is significant?

Okay. Let's leave it as Set. In any case I shouldn't be relying on the table order because the SQL specification doesn't define a locking order for multi-table operations.

@timowest timowest added this to the 3.2.4 milestone Apr 13, 2014

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment