1) If an SQL::Identifier object containing a double underscore was used as a table in the receiving dataset or one of the arguments, the automatically created alias symbol for it contained a double underscore, and when it was used to qualify other identifiers, it led to incorrect/broken SQL. Wrap such alias symbols in SQL::Identifiers before using them as qualifiers to avoid the issue.
2) When the first dataset graphed contains a join (which results in an implicit from_self), override the implicit_qualifier with the alias used in the from_self, so that the join conditions are set up correctly.
3) When the first dataset graphed selects from a qualified identifier, make the join conditions and the selected columns for that dataset use the fully-qualified name.
4) When eager_graphing, set the implicit qualifier for the initial graphs to the first source of the dataset, instead of to the alias symbol.
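Sequel historically treated a double underscore in a plain symbol as implicit qualification (:table__column), which is why an alias symbol containing one is dangerous as a qualifier. A minimal pure-Ruby sketch of that splitting behavior (the Identifier struct and literalize helper are illustrative stand-ins, not Sequel's actual code):

```ruby
# Stand-ins for illustration only; not Sequel's real classes.
Identifier = Struct.new(:name)  # opaque identifier: never split

def literalize(id)
  case id
  when Identifier
    "\"#{id.name}\""
  when Symbol
    # Historical Sequel behavior: a double underscore in a plain symbol
    # denotes qualification, so :my__table becomes "my"."table".
    parts = id.to_s.split('__', 2)
    parts.size == 2 ? "\"#{parts[0]}\".\"#{parts[1]}\"" : "\"#{id}\""
  end
end

literalize(:my__table)                   # => "\"my\".\"table\"" (split!)
literalize(Identifier.new("my__table"))  # => "\"my__table\"" (kept intact)
```

Wrapping the generated alias in an identifier object before qualifying makes the double underscore inert, which is the fix described above.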
…es adapter Using cursors to implement paging should always be faster than the default strategies used by paged_each, and it also has fewer restrictions (works on unordered datasets), so it should be the default. Currently, I've made it the only approach, since I can't think of a reason you'd want to use one of the other strategies. If for some reason one of the other strategies makes sense, I'll be happy to offer a way to use one of the default strategies.
…pdating rows based on a cursor's current position This should be the fastest way to update a large dataset where you need to update individual rows because the logic for how to update the rows is in the application and not the database.
…rting cursor use outside of a transaction This can be useful for long-running cursors that don't need to see a consistent view of the database, and where keeping a transaction open may negatively affect performance.
Cursors are the fastest way to implement paging. The postgres adapter already has a use_cursor method for using cursors (predating the introduction of paged_each); this just integrates the two so that you can get a cursor implementation using the paged_each API.
…mance This new strategy makes #paged_each not use offsets, but instead use filters based on the last retrieved row to implement the paging. Using this new filtering approach is significantly faster than using the previous offset approach. In simple testing:

              indexed  unindexed
 100000 rows     7.3x       4.1x
 200000 rows    13.4x       4.6x
 300000 rows    19.9x       4.6x

Unfortunately, determining when this approach is safe to use is difficult. In SQL, identifiers appearing in the ORDER BY clause can be aliased in the SELECT clause, the ORDER BY clause can refer to either the identifier or the alias, and the two can be ambiguous (SELECT a AS b, b AS a FROM c ORDER BY a, b), and I'm not sure all databases handle the rules per the SQL standard (it would surprise me if so). It's also possible to order on columns not being selected, in which case you can't get the values for the last retrieved row. Also, if any of the column values for the last retrieved row are NULL, you can't use a filtering-based approach. Basically, there are too many corner cases to turn this on by default, which is why it requires manually specifying the option. To handle nontrivial cases, when using the :strategy=>:filter option, you can also provide a :filter_value option proc. This proc takes the last retrieved row and an array of order by expressions, and should return an array of values in the last retrieved row related to those expressions. The default is to guess at the symbols used and just look in the hash for them, but that can fail in some of the cases mentioned above.
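The filtering idea can be sketched in plain Ruby (the after_last_row? helper is hypothetical, not Sequel's implementation): build a row-value comparison from the ORDER BY columns and the last retrieved row, and select only rows that sort after it, instead of skipping rows with OFFSET.

```ruby
# Build a row-value comparison from the ORDER BY columns and the last row.
# (col1, col2) > (v1, v2) expands to:
#   col1 > v1 OR (col1 = v1 AND col2 > v2)
# Hypothetical sketch of the filtering strategy, not Sequel's internals.
# Note: if any last value were nil, the comparisons below would raise,
# which mirrors why NULLs rule out the filtering approach.
def after_last_row?(row, order_columns, last_values)
  order_columns.each_with_index do |col, i|
    return true if row[col] > last_values[i]
    return false if row[col] < last_values[i]
  end
  false # equal on all order columns: not after the last row
end

rows = [{id: 1}, {id: 2}, {id: 3}, {id: 4}]
# Page 2 of size 2: filter on the last row of page 1 instead of OFFSET 2.
last_values = [2]
page2 = rows.select { |r| after_last_row?(r, [:id], last_values) }.first(2)
page2  # => [{id: 3}, {id: 4}]
```

With an index on the order columns, the database can seek directly to the filter boundary, which is what makes this so much faster than scanning and discarding OFFSET rows.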
This makes three separate but related changes to remove extra parens. When creating complex expressions with an arbitrary number of arguments, strip off any NOOP expressions. This will allow things like AND(foo, NOOP(AND(bar, baz))) to combine into AND(foo, bar, baz). When using Sequel.& and related methods, in the single-argument case, return the argument directly if it is already of the correct class. If the single argument is not already of the correct class, wrap it in a NOOP.
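The flattening step can be sketched with minimal stand-in classes (Noop and And here are illustrative structs, not Sequel's real expression classes):

```ruby
# Minimal stand-ins for illustration; not Sequel's actual classes.
Noop = Struct.new(:arg)   # wraps an expression without changing it
And  = Struct.new(:args)  # n-ary boolean AND expression

# Combine arguments into an AND, unwrapping NOOP wrappers first so that
# nested ANDs can merge: AND(foo, NOOP(AND(bar, baz))) => AND(foo, bar, baz).
def and_expr(*args)
  stripped  = args.map { |a| a.is_a?(Noop) ? a.arg : a }
  flattened = stripped.flat_map { |a| a.is_a?(And) ? a.args : [a] }
  # Single-argument case: return the argument directly.
  return flattened.first if flattened.size == 1
  And.new(flattened)
end

inner    = and_expr(:bar, :baz)
combined = and_expr(:foo, Noop.new(inner))
combined.args  # => [:foo, :bar, :baz]
```

Without the unwrapping, the inner expression would stay opaque and the generated SQL would carry a redundant pair of parentheses around (bar AND baz).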
alias is a keyword in ruby, so when I originally wrote the code I didn't want to use it. However, having method names that overlap with keywords is fine as long as you call the methods with an explicit receiver. I think this makes the calling code easier to read.
…ression objects as first argument to Dataset#graph
…set_associations plugin This adapts the limit strategies used for filtering by associations to the dataset_associations plugin. This also fixes an issue with associations that require joins. Previously, the receiving dataset was joined directly, but that can break the query if the receiving dataset used an unqualified identifier that was made ambiguous by the join. Instead of joining directly, a nested subselect is now used.
Some of these guards are only necessary for a specific adapter, in which case they were made adapter specific. Some of these don't need guards at all on Microsoft SQL Server 2012, so I removed those guards.
Use a timestamp format that will work on ruby 1.8. Remove related guards from mssql adapter spec.
MSSQL only supports 3 decimal places for fractional seconds (millisecond precision)
This doesn't change the default input string to include fractional seconds in timestamps, since we can't be sure if the underlying database supports that.
…tgreSQL These make it easier to deal with searching based on user input, since users generally will not be using text search boolean operators or weight labels.
This applies the same type of strategy as eager loading limited associations to filtering by limited associations. If a limited association is used when filtering by associations, Sequel automatically uses the more complex filter that would be used if the association had conditions, but with an additional subquery in that filter to restrict the results to only those where the object meets the limit requirements for the association. This change can cause a significant slowdown, but it is required to get correct results, and my assumption is that anyone doing: Artist.where(:first_10_albums=>Album) really wants the artists where that album is one of their first 10 albums, not artists where that album is their 11th album.
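The intended semantics can be illustrated with in-memory data (the Album struct and helper below are hypothetical stand-ins, not the plugin's SQL): an artist matches only if the given album falls within the association's limit, not merely if it is associated at all.

```ruby
# Hypothetical in-memory illustration of filtering by a limited association.
# An artist matches :first_2_albums => album only when the album is among
# the first 2 of that artist's albums, not just any of its albums.
Album = Struct.new(:id, :artist_id, :position)

albums = [
  Album.new(1, 1, 1), Album.new(2, 1, 2), Album.new(3, 1, 3), # artist 1
  Album.new(4, 2, 1)                                          # artist 2
]

def artists_with_album_in_first_n(albums, album, n)
  albums.group_by(&:artist_id).select do |_artist_id, artist_albums|
    artist_albums.sort_by(&:position).first(n).include?(album)
  end.keys
end

third = albums[2]  # artist 1's third album
artists_with_album_in_first_n(albums, third, 2)      # => [] (past the limit)
artists_with_album_in_first_n(albums, albums[0], 2)  # => [1]
```

The extra subquery in the generated filter plays the role of the first(n) slice above, restricting matches to rows that satisfy the limit.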
Previously, the strategy Sequel used for associations with conditions was to start with the filter for the association without a condition, and then add another filter for the conditions part. However, the filter for the conditions in general just repeated most of the code for the filter for the association without the condition. This was especially bad for associations involving joins, as there would be multiple subqueries instead of a single subquery. This new code just drops the filter for the association without a condition, fixing a few corner cases for many_*_many associations where an IS NOT NULL condition was not added. While here, add an association reflection helper method for returning the dataset with the eager block condition already added. This reorders some clauses, but as the specs would be changing anyway due to removal of other clauses, it makes sense to do it at the same time.
No code changes, but this makes sure to test that eager loading works correctly when there are associated objects for multiple current objects.
…ders If an order is specified for a *_one association, then the association is probably not a true one-to-one association, and therefore could probably benefit from an eager limit strategy. Previously, this was only done automatically for *_one associations if they had offsets. With this change, the distinct on eager limit strategy can be used by default.
This is to stop the default eager_graphers from looking into the eager_graph metadata in the dataset. I've decided not to document this, as the use case for it is very rare.
This expands on the idea of limit strategies for eager and applies it to eager_graph. Both the distinct_on and window_function limit strategies are supported. Unlike in the eager case, the eager_graph case does not use a limit strategy by default, as it is possible (and, I expect, likely in the general case) for it to perform significantly worse. As eager_graph already takes an arbitrary number of arguments, any of which can be hashes, it can't support an options hash, so add an eager_graph_with_options method that just takes the associations as a single argument and supports an options hash. Remove the protected new_eager_graph method added recently, and use eager_graph_with_options instead. Among other things, this allows you to do a regular eager_graph while overriding the join type. Start passing the :join_type option to the :eager_grapher proc, instead of having it look in the dataset's :eager_graph metadata. Since non-ruby eager_graph limit strategies are now supported, make the eager_graph code keep track of which associations need to be manually sliced, and modify the EagerGraphLoader to use that metadata. Refactor some of the association reflection code to DRY it up.
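The manual-slice fallback for associations without a database-side limit strategy can be sketched as follows (the slice_limited_associations helper and hash-based records are hypothetical, not the EagerGraphLoader itself):

```ruby
# After graphing, each parent record carries the full array of associated
# objects. For associations flagged as needing manual slicing, apply the
# limit/offset in ruby after loading. Hypothetical sketch.
def slice_limited_associations(records, limits)
  records.each do |record|
    limits.each do |assoc, (limit, offset)|
      full = record[assoc] || []
      record[assoc] = full[offset || 0, limit] || []
    end
  end
end

artist = {albums: [:a, :b, :c, :d]}
slice_limited_associations([artist], albums: [2, 1])  # limit 2, offset 1
artist[:albums]  # => [:b, :c]
```

The metadata mentioned above plays the role of the limits hash here: it tells the loader which associations still need this post-processing step.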
…are no associated results
…sion proc in the pg_array extension The previous code was too strict. A missing conversion proc is an optimization for the case of a conversion proc that returns the argument unmodified, so it shouldn't cause an error to be raised.
…n creating new associated objects in the nested_attributes plugin If you have a presence validation on a foreign key in the associated object, it would previously fail when using nested attributes, because all validations are done before the parent object is saved, and the foreign key value is not available until after the parent object is saved (if the parent object is also a new object). Handle this case by assigning the foreign key value for the associated object before validation, and if the parent object doesn't have a primary key, setting it to a dummy zero value. If a dummy value is used, it is removed after validation to avoid issues if the save isn't successful.
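The workaround can be sketched in plain Ruby (simplified hash-based stand-ins for model objects, not the plugin code): before validating the child, fill its foreign key with the parent's primary key, or a dummy 0 when the parent is unsaved, and clear the dummy afterwards.

```ruby
# Simplified sketch of validating a child whose FK presence is validated
# before the (possibly unsaved) parent has a primary key.
def validate_child_with_fk(parent_pk, child)
  dummy_used = false
  if child[:artist_id].nil?
    child[:artist_id] = parent_pk || 0   # dummy value for a new parent
    dummy_used = parent_pk.nil?
  end
  valid = !child[:artist_id].nil?        # stand-in presence validation
  child[:artist_id] = nil if dummy_used  # remove dummy after validation
  valid
end

child = {name: "Album", artist_id: nil}
validate_child_with_fk(nil, child)  # => true (dummy 0 satisfies presence)
child[:artist_id]                   # => nil (dummy removed before save)
```

Removing the dummy after validation matters because, if the save later fails, the child should not be left holding a foreign key value that never corresponded to a real parent row.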