Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Get distinct array values in a given column for an entire table #4956

Closed
bputt opened this issue Feb 21, 2017 · 2 comments
Closed

Get distinct array values in a given column for an entire table #4956

bputt opened this issue Feb 21, 2017 · 2 comments

Comments

@bputt
Copy link

bputt commented Feb 21, 2017

Use case: Getting the unique values of an array for an entire table

Feature description: Being able to run a SELECT DISTINCT statement against an array column.

Schema:

CREATE TABLE someTable (
    someArr array(STRING)
);

Example data:

INSERT INTO someTable(someArr) VALUES (["a", "b"]);
INSERT INTO someTable(someArr) VALUES (["b", "c"]);

SQL Statement:

SELECT DISTINCT someArr FROM someTable;

Currently throws: Cannot GROUP BY 'someArr': invalid data type 'string_array'

Would like for it to return something like:

a
b
c
@joemoe
Copy link
Contributor

joemoe commented Feb 23, 2017

thanks @bputt for the suggestions, we put that onto our backlog. i'll keep you posted.

@joemoe joemoe added enhancement Enhancement that doesn't fit into a more specific feature label. Try avoid using this shuffle labels Feb 23, 2017
@mfussenegger mfussenegger added feature: sql: unnest and removed shuffle enhancement Enhancement that doesn't fit into a more specific feature label. Try avoid using this labels Feb 7, 2018
@mfussenegger
Copy link
Member

We've merged support for table functions within the select list (#7716). This will be available in 3.2. Together with unnest this can be used for your example:

create table t (someArr array(string));
insert into t (someArr) values (['a', 'b']);
insert into t (someArr) values (['b', 'c']);
select distinct * from (select unnest(someArr) from t) tt;
+-----------------+
| unnest(somearr) |
+-----------------+
| a               |
| b               |
| c               |
+-----------------+

Note that it will be necessary to use a subquery for the unnest and do the aggregation/grouping on-top.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants