Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve RowMetadata for presence/absence check of columns #58

Merged
merged 1 commit into from
Apr 29, 2019

Conversation

nbenjamin
Copy link
Contributor

@mp911de I have created initial PR for #56. Please let me know if any changes required

@mp911de
Copy link
Member

mp911de commented Mar 13, 2019

Thanks a lot for your pull request. Typically, we start a discussion on these proposals whether we like the idea or whether we can solve the use-case in a different approach.

I'd like to hear the opinion of the rest of the people involved in R2DBC what they think about my proposal. If you have time and if it's possible for you, please join our weekly call (Fridays, see http://r2dbc.io/resources/ for details) so we can discuss the proposal together.

Regarding the call: Due to DST adjustment, the time for the call in the US is one hour later, so 7:30AM PT (USA Pacific)/10:30AM ET (USA Eastern)/3:30PM CET (Central European).

@nbenjamin
Copy link
Contributor Author

@mp911de I am new to this open source contribution model. Thanks for sharing the call details, I will try to join this.

@mp911de
Copy link
Member

mp911de commented Mar 13, 2019

Welcome to our community, great to see you engaging with open source.

@mp911de mp911de changed the title feature(RowMetadata)[gh-56]: add getColumnMetadata on RowMetadata Improve RowMetadata for presence/absence check of columns Mar 15, 2019
@nbenjamin
Copy link
Contributor Author

nbenjamin commented Mar 29, 2019

@mp911de Are we still waiting for the feedback from others on this, or if you have any other suggestions, please let me know so I can implement those changes and apply to other modules.

@mp911de
Copy link
Member

mp911de commented Mar 29, 2019

Waiting on @nebhale's input on that one.

@mp911de mp911de requested a review from nebhale March 29, 2019 14:55
@nebhale nebhale self-assigned this Apr 12, 2019
@nebhale nebhale added this to the 1.0.0.M8 milestone Apr 12, 2019
@nebhale nebhale added the type: enhancement A general enhancement label Apr 12, 2019
@mp911de
Copy link
Member

mp911de commented Apr 16, 2019

@nbenjamin Care to squash your commits to a single one and force-push?

@nbenjamin
Copy link
Contributor Author

@mp911de squashed my commits, ready for merge.

@mp911de
Copy link
Member

mp911de commented Apr 17, 2019

After reiterating on that topic, I wonder whether we should rather use Collection instead of SortedSet. The reason is that a SortedSet (e.g. TreeSet) applies a comparator for both, element ordering and lookup of elements.

With columns, we have a distinct column order that does not depend on a Comparator. What we want to achieve is enabling case-sensitive/accent-sensitive/-insensitive access to columns.
E.g.

  • getColumnNames().contains("foo") -> true
  • getColumnNames().contains("Foo") -> true
  • getColumnNames().contains("[Foo]") -> false (SQL Server syntax here, other databases can use backticks, single-/double quotes)
  • getColumnNames().contains("[foo]") -> true

Reducing the contract to Collection would retain the desired functionality and Collection is less prescriptive about ordering. However, Collection.contains(…) states that it returns true if o1.equals(o2). From a SortedSet perspective, each SortedSet is broken because in sorted sets SortedSet.contains(…) returns true if o1.compareTo(o2) == 0.

Furthermore, we expect that getColumnNames() is immutable.

So there are two two to do/discuss:

  1. Extend Javadoc:
  • Mention that the returned column names are unmodifiable in Javadoc and every attempt to modify the collection will result in UnsupportedOperationException.
  • Also mention that iteration order depends depends on the actual query and is result-specific
  • Mention that column lookup follows database sorting rules and some databases can support escape characters to enforce a particular mode of comparison
  • The returned collection does not contain duplicate column names
  1. Consider whether to use Collection vs. SortedSet.

@nbenjamin
Copy link
Contributor Author

@mp911de yes it makes sense to use Collection instead of SortedSet. Please review the latest changes

@nbenjamin
Copy link
Contributor Author

@mp911de incorporated the changes, please let me know if any changes required

@mp911de mp911de self-requested a review April 25, 2019 13:34
Copy link
Member

@mp911de mp911de left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good for a merge. I would like to await tomorrow's call before merging this one to mention that we're going to merge a breaking change so driver devs can prepare for this change.

*
* @return the column names.
*/
Collection<String> getColumnNames();
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This new method and io.r2dbc.spi.ColumnMetadata.getName() should explicitly describe whether they return real column names or their aliases in Javadoc, because JDBC has different methods for these purposes.

See also more detailed description in
r2dbc/r2dbc-h2#73 (comment)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @katzyn for bringing this up, I have updated the javadoc.

Replace Stream with forEach

replace foreach with for loop

remove unused import

change to Collection

Change to LinkedHashSet

update jave doc
@katzyn
Copy link

katzyn commented Apr 29, 2019

BTW, at least some databases may return duplicate column aliases (names). For example

SELECT 1 A, 2 A;

returns

 a | a 
---+---
 1 | 2

in PostgreSQL and

> A A
> - -
> 1 2

in H2. Perhaps such situation should be handled somehow too.

@mp911de
Copy link
Member

mp911de commented Apr 29, 2019

You're right, only indices are unique.

Names, as you already mentioned, can occur multiple times. We do not want to grow to multimaps or the like. That being said, accessing a column by name (Row.get("A")) is subject to ambiguity. One should use RowMetadata.getColumnMetadata(0) or RowMetadata.getColumnMetadatas() with iteration to disambiguate a column.

@mp911de mp911de assigned mp911de and unassigned nebhale Apr 29, 2019
@katzyn
Copy link

katzyn commented Apr 29, 2019

But proposed implementation returns a LinkedHashSet, so it looks like you want to preserve order of columns.

Some List<String> implementation (possibly with optimized contains(String) method) with duplicates seems to be more reasonable for such query.

@katzyn
Copy link

katzyn commented Apr 29, 2019

I think that size of the returned collection should be equal to number of columns.

@katzyn
Copy link

katzyn commented Apr 29, 2019

accessing a column by name (Row.get("A")) is subject to ambiguity

Of course, R2DBC is not a JDBC, but JDBC specification exactly describes this situation for ResultSet, see section 15.2.3 Retrieving Values in JDBC 4.3:

Column labels supplied to getter methods are case insensitive. If a select list contains
the same column more than once, the first instance of the column will be returned.

It's more reasonable to declare and implement the same behavior for R2DBC.

@mp911de
Copy link
Member

mp911de commented Apr 29, 2019

The proposal started with a Set/SortedSet to quickly check whether a column name was available. getColumnNames() is meant as optimization and now that we relaxed typing to Collection we can include duplicate names when iterating over the collection. Exposing a List with case-insensitive comparison semantics doesn't seem right.

Thanks also for the JDBC reference. I think this is a valuable addition to Row.get(…) and RowMetadata.getColumnMetadata(…). We do not want to change everything, it makes sense to preserve the good parts from JDBC's behavior. Do you want to file a ticket or submit a PR to clarify .get()/getColumnMetadata() behavior?

@mp911de
Copy link
Member

mp911de commented Apr 29, 2019

@nbenjamin I'll take this PR from here. No need to add further changes.

@nbenjamin
Copy link
Contributor Author

sure @mp911de

@mp911de mp911de merged commit 12cfb9b into r2dbc:master Apr 29, 2019
mp911de added a commit that referenced this pull request Apr 29, 2019
Slightly tweak Javadoc wording. Update MockRowMetadata.getColumnNames() implementation to reflect case-insensitive lookup.

[#56][#58]
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type: enhancement A general enhancement
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants