Skip to content

[CALCITE-3560] Additional calcite.util.Source implementation for generic text source (eg. CharSource)#1632

Merged
asereda-gs merged 1 commit intoapache:masterfrom
asereda-gs:source-reader
Dec 5, 2019
Merged

[CALCITE-3560] Additional calcite.util.Source implementation for generic text source (eg. CharSource)#1632
asereda-gs merged 1 commit intoapache:masterfrom
asereda-gs:source-reader

Conversation

@asereda-gs
Copy link
Member

@asereda-gs asereda-gs commented Dec 4, 2019

Currently calcite calcite.util.Source interface can be built only from File or URL. This forces data to be stored on disk or accessed remotely using URL api.

This proposal adds another Source implementation on the top of CharSource so calcite adapters can operate on in-memory elements like String or generic text sources.

…ric text source (eg. CharSource)

Currently calcite calcite.util.Source interface can be built only from File or URL. This forces data to be stored on disk or accessed remotely using URL api.

This proposal adds another Source implementation on the top of CharSource so calcite adapters can operate on in-memory elements like String or generic text sources.
Comment on lines +149 to +150
@Override public Source relative(final Source source) {
throw unsupported();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is used in

builder.put(sourceSansJson.relative(baseSource).path(), table);

So it looks like the implementation should be not just a CharSource, but it should be like a virtual file system, so it is not clear how you want CharSource to blend into csv and other adapters.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CsvSchema won't create Sources on the top of CharSource. New (CharSource) Source implementation was created for users who would instantiate JsonTable directly (via constructor).

On your point with virtual file system, I think Source API is a little "coarse" in the sense that it exposes two functionalities:

  1. Operations on FS / URL path elements (trim / append / relative)
  2. supplier of InputStream / Reader (a la CharSource / ByteSource)

Obviously you can't do (1) when your data-source is String and not a File.

I was thinking of introducing something like Locator API where one can compose / operate on path elements and then expose CharSource from Locator. To some extent Source = Locator + CharSource. This requires more thought and I'm still not convinced that it worths it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vlsi are you OK if I merge this PR as is ?
I don't think there is a nice work-around for Source API without "origin" (file/url)

* @return {@code Source} delegate for {@code CharSource} (can't be null)
* @throws NullPointerException when {@code source} is null
*/
public static Source fromCharSource(CharSource source) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this significantly better than java.io.Reader and/or java.io.InputStream?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. The idea is that Reader / InputStream are "single-pass" in the sense that you can't re-read them. CharSource is a supplier of Reader similarly to File / URL.

You can't provide InputStream to Source because it can be read only once.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I see. That is fine then.

@asereda-gs asereda-gs merged commit 9e80eb6 into apache:master Dec 5, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants