# Question 22 - Picking a survey group

You work for a large hardware company 
(one that manufactures watches, computers, and phones) 
and you're trying to understand user sentiment towards 
the company's brand and products. You decide to send
out a survey to a random set of users across the different products.

Can you write a query that samples users across the different 
product offerings? The output of your query should be user_id 
and group (i.e. the sampling group the user belongs to).

You have a table with all users and their registered devices. 
The schema of the table is below:

Table: user_devices

|Column Name |Data Type |Description|
|---|---|---|
|user_id |integer |id of the user|
|devices |array of strings |lists the devices (watch, computer, phone)|
|device_ids |array of integers |id of the devices used by the user|
|user_create_time |integer |epoch time when the user's account was created|
|total_spend |integer |lifetime spend of a user|
|country |string |user country|

Solution: assuming we sample up to 100 users per group, 
where each group is a unique combination of devices used 
(e.g. watch, watch+computer, watch+computer+phone, ...)

```sql
-- flatten array of devices
with user_device_pairs as (
    SELECT 
        user_id, 
        flattened_devices as device
    FROM user_devices ud
    CROSS JOIN UNNEST(ud.devices) AS flattened_devices
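    group by 1,2 -- de-duplicate (user_id, device) pairs
)

-- create "group" as the concatenation of devices used by each user
, users as (
    select
        user_id,
        -- sort inside the aggregate: an ORDER BY in the CTE above does not
        -- guarantee the order in which rows are aggregated here
        string_agg(device, ',' order by device) as grp -- in redshift: listagg(device, ',') within group (order by device)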
    from user_device_pairs
    group by 1
)

-- sample N users per group
select 
    user_id, 
    grp
from ( 
select
    user_id,
    grp,
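    -- row_number() never produces ties, so at most 100 rows per group pass the filter
    row_number() over(partition by grp order by random()) as r -- use rand() in BigQuery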
from users 
) a
where r <= 100 -- number of users to sample per group
```
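
To try the query without production data, a small test table helps. The sketch below is for Postgres and is an assumption throughout: the column types (`text[]`, `integer[]`) are one plausible reading of the schema, and the four rows are invented. Users 2 and 3 list the same devices in a different order, which is the case the grouping step has to handle.

```sql
-- Hypothetical test table; the array column types are assumed, not given in the schema
create table user_devices (
    user_id          integer,
    devices          text[],
    device_ids       integer[],
    user_create_time integer,
    total_spend      integer,
    country          text
);

-- Invented sample rows
insert into user_devices values
    (1, array['watch'],                      array[11],         1577836800, 300,  'US'),
    (2, array['phone', 'watch'],             array[21, 22],     1580515200, 900,  'US'),
    (3, array['watch', 'phone'],             array[31, 32],     1583020800, 1200, 'CA'),
    (4, array['computer', 'phone', 'watch'], array[41, 42, 43], 1585699200, 2500, 'DE');

-- Build the group label per user, sorting devices inside the aggregate
select
    ud.user_id,
    string_agg(distinct d.device, ',' order by d.device) as grp
from user_devices ud
cross join unnest(ud.devices) as d(device)
group by ud.user_id
order by ud.user_id;
```

With these rows, users 2 and 3 should both get the label `phone,watch`, so they fall into the same sampling group regardless of the order in which their devices were stored.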

Note that I'm unsure of this solution, as I don't have a Postgres or BigQuery instance handy.
For Redshift and MySQL, which don't support arrays and/or `UNNEST`, see this link:
https://www.holistics.io/blog/splitting-array-string-into-rows-in-amazon-redshift-or-mysql/

For `WITH ORDINALITY` examples, see https://stackoverflow.com/a/8767450
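
As a rough illustration of what `WITH ORDINALITY` adds, the Postgres-only sketch below numbers each unnested element by its position in the array. It also unnests `devices` and `device_ids` side by side, which assumes the two arrays are parallel (the n-th id belongs to the n-th device); the schema does not state this. The aliases `d`, `device_id`, and `ord` are made up for the example.

```sql
-- Postgres 9.4+ only: unnest two arrays in parallel and keep each element's position
select
    ud.user_id,
    d.device,
    d.device_id, -- assumes device_ids[n] belongs to devices[n]
    d.ord        -- 1-based position of the element in the arrays
from user_devices ud
cross join unnest(ud.devices, ud.device_ids)
    with ordinality as d(device, device_id, ord)
order by ud.user_id, d.ord;
```

If the arrays differ in length, Postgres pads the shorter one with NULLs, so a NULL `device` or `device_id` in the output would be a sign that the parallel-array assumption does not hold.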

For BigQuery `UNNEST`, see https://cloud.google.com/bigquery/docs/reference/standard-sql/arrays#flattening-arrays
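
For reference, here is how the same idea might look in BigQuery Standard SQL. This is an untested sketch: it assumes `user_devices` resolves in your default dataset (otherwise qualify it as `dataset.user_devices`), and it uses `rand()` because BigQuery has no `random()` function.

```sql
-- BigQuery Standard SQL sketch (untested)
with users as (
    select
        ud.user_id,
        -- distinct + order by gives one stable label per device combination
        string_agg(distinct device, ',' order by device) as grp
    from user_devices ud
    cross join unnest(ud.devices) as device
    group by ud.user_id
)

select
    user_id,
    grp
from (
    select
        user_id,
        grp,
        row_number() over (partition by grp order by rand()) as r
    from users
) a
where r <= 100 -- number of users to sample per group
```

The output column is named `grp` rather than `group` because `GROUP` is a reserved keyword; it can be aliased with backticks in the final select if the literal name is required.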