Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Zero-config CSV queries #135

Closed
nfultz opened this issue Aug 6, 2019 · 8 comments
Closed

Zero-config CSV queries #135

nfultz opened this issue Aug 6, 2019 · 8 comments
Labels
enhancement New feature or request good first issue Good for newcomers

Comments

@nfultz
Copy link

nfultz commented Aug 6, 2019

I'm not sure how feasible this may be, but the old logparser program from Microsoft supported queries like

Select * from test.csv

And it would read from test.csv, without needing any config files.

@cube2222
Copy link
Owner

cube2222 commented Aug 6, 2019

Hey, I think some thought must be put into if this wouldn't cause any trouble with ambivalent meanings further down the road.

However, what we want to support for sure is allowing to input csv / json data via standard input for queries like:

Select * from stdin

Maybe a table valued function would be a good idea? That would be very easy to add after the recent parser customizations.

Select * from loadfile(type => 'csv', path => 'test.csv')

Not sure though if it's a good idea to provide so many ways to solve a single problem (open a csv file)

@cube2222 cube2222 added enhancement New feature or request good first issue Good for newcomers labels Aug 8, 2019
@nfultz
Copy link
Author

nfultz commented Aug 8, 2019

Select * from stdin

This would be fantastic! One problem would be, how to tell what format stdin is providing? I can imagine people piping in json at least as often CSV. logparser dealt with that by having a gazillion -i:format flags, but I don't find that very elegant.

Another option would be to pipe in a config from stdin, something like this may not be too bad:

octosql -c - "Select * from tbl"

Maybe a table valued function would be a good idea? That would be very easy to add after the recent parser customizations.

Select * from loadfile(type => 'csv', path => 'test.csv')

Not sure though if it's a good idea to provide so many ways to solve a single problem (open a csv file)

This would be a very general, but doesn't seem very standard. I'll take a look at how Hive, Presto try to deal with this, maybe there's some good ideas there.

@cube2222
Copy link
Owner

I think that discerning between csv and json wouldn't be too hard :) and I don't think there are much more formats people would like to pipe in through stdin (Excel?)

@tooolbox
Copy link

tooolbox commented May 1, 2021

I very much like this idea. Having the ability to arbitrarily query and join a few files without needing a config file would be very convenient.

Maybe a table valued function would be a good idea? That would be very easy to add after the recent parser customizations.

I think this is a decent idea, but I don't see why the shorthand select * from test.csv couldn't work. If need be, you could require it to obviously be a path such as select * from ./test.csv Also something like

select * from `./test.xlsx`.Sheet1

for Excel (using the MySQL quoting there).

If you do go the function route, I would imagine the type param could be inferred from the file extension, so you shouldn't have to write it most of the time. And for Excel, the sheet name and any other data could be specified as function params.

@cube2222
Copy link
Owner

cube2222 commented May 1, 2021

I'm actually working on a rewrite of OctoSQL on the redesign branch and this syntax is supported there. Examples:

octosql "SELECT * FROM test.json"
octosql "SELECT * FROM ./test/test.json"

I'll probably release an initial version in the following months.

@tooolbox
Copy link

tooolbox commented May 1, 2021

That's very exciting, thanks for sharing!

@tooolbox
Copy link

Just curious, it seems like you released this new zero-config, but I don't see Excel support (even though I see previous commits for it). Was this lost in the shuffle?

@cube2222
Copy link
Owner

@tooolbox Sorry, lost track of this issue.

Yeah, zero-config is now here and excel support is gone for now cause I haven't yet had the time to port it.

That said, I'm planning to bring it back soon. Should be very little work to port it over as a plugin.

Closing this as zero-config csv queries are here, but I'll open another issue for tracking Excel support.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request good first issue Good for newcomers
Projects
None yet
Development

No branches or pull requests

3 participants