# Pivoting SQL results in pandas

This notebook shows how to pivot array data using SQL ARRAY JOIN and pandas DataFrame.pivot_table(). This problem appeared as a [question on Stack Overflow](https://stackoverflow.com/questions/54811905/return-clickhouse-array-as-column).

First create some test data.  We'll use clickhouse-driver for this so we can see the SQL.  

In [1]:
from clickhouse_driver import Client
client = Client('localhost')
client.execute('CREATE TABLE IF NOT EXISTS f '
               '(f1 String,  f2 Array(Int32),  f3 Array(String)) '
               'ENGINE = Memory')
client.execute('TRUNCATE TABLE f')
client.execute(
    'INSERT INTO f (f1, f2, f3) VALUES', [
        ('a', [1,2,3], ['x', 'y', 'z']),
        ('b', [4,5,6], ['x', 'y', 'z']),
    ]
)

Now load SQLAlchemy and the `sql` IPython extension, which provides the `%sql` magic used in the following cells.

In [2]:
from sqlalchemy import create_engine
%load_ext sql

Connect to ClickHouse, which is assumed to be running on localhost with the default user and an empty password.

In [3]:
%sql clickhouse://default:@localhost/default

'Connected: default@default'

Use a SQL query with ARRAY JOIN to unfold the arrays: elements at the same index in f2 and f3 are paired, and each pair becomes its own row alongside the f1 value it came from.

In [13]:
result = %sql SELECT * FROM f ARRAY JOIN f2, f3
df = result.DataFrame()
df

 * clickhouse://default:***@localhost/default
Done.


Unnamed: 0,f1,f2,f3
0,a,1,x
1,a,2,y
2,a,3,z
3,b,4,x
4,b,5,y
5,b,6,z
6,c,7,y
7,c,8,z
8,c,9,aa
9,c,10,bb
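To see what ARRAY JOIN does without a ClickHouse server, the same unfolding can be approximated in pandas alone. This is a minimal sketch, not part of the original notebook: the `raw` and `flat` names are illustrative, it only uses the first two test rows, and it assumes pandas 1.3 or later, where `DataFrame.explode` accepts a list of columns.

```python
import pandas as pd

# Same rows as the first insert into table f, with the arrays kept as Python lists.
raw = pd.DataFrame({
    "f1": ["a", "b"],
    "f2": [[1, 2, 3], [4, 5, 6]],
    "f3": [["x", "y", "z"], ["x", "y", "z"]],
})

# Exploding both list columns together pairs elements by position,
# which mirrors the effect of ARRAY JOIN f2, f3 on the server.
flat = raw.explode(["f2", "f3"], ignore_index=True)
print(flat)
```

Like ARRAY JOIN over several arrays, a multi-column explode expects the lists in each row to have the same length.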


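Now we can pivot f2 and f3 into a new data frame that is indexed by f1, has the f3 array entries as columns, and holds the f2 values in the cells. Combinations that do not occur in the data are filled with NaN, which is why the integer values are shown as floats.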

In [15]:
dfp = df.pivot_table(columns='f3', values='f2', index='f1')
print(dfp)

f3   aa    bb    x    y    z
f1                          
a   NaN   NaN  1.0  2.0  3.0
b   NaN   NaN  4.0  5.0  6.0
c   9.0  10.0  NaN  7.0  8.0
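The pivot step can also be tried on its own, without a database connection. The sketch below is an illustration under assumptions: it rebuilds the flattened rows for `a` and `b` by hand and uses `DataFrame.pivot()` rather than the `pivot_table()` call above; the `wide` name is made up for this example.

```python
import pandas as pd

# Hand-built copy of the ARRAY JOIN result for rows 'a' and 'b'.
df = pd.DataFrame({
    "f1": ["a", "a", "a", "b", "b", "b"],
    "f2": [1, 2, 3, 4, 5, 6],
    "f3": ["x", "y", "z", "x", "y", "z"],
})

# pivot() only reshapes: one row per f1, one column per f3 value.
# It requires each (f1, f3) pair to appear at most once.
wide = df.pivot(index="f1", columns="f3", values="f2")
print(wide)
```

When every (f1, f3) pair is unique, `pivot()` and `pivot_table()` produce the same layout. The difference is that `pivot_table()` aggregates repeated pairs (with the mean by default), while `pivot()` refuses to reshape them.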


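This approach still works when we add data with new property names in the f3 array. Try adding a row to the table and then rerun the cells that select and pivot data. The outputs shown above already include this row, because the insert cell (In [12]) was executed before the select and pivot cells.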

In [12]:
client.execute(
    'INSERT INTO f (f1, f2, f3) VALUES', [
        ('c', [7,8,9,10], ['y', 'z', 'aa', 'bb']),
    ]
)

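If you run this insert again, the table will contain duplicate rows, but the pivoted result will not change: pivot_table() aggregates repeated (f1, f3) pairs with its default mean, and the mean of identical values is the value itself.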
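The effect of duplicate rows can be checked in isolation. The following sketch uses a small hand-built frame (the `df_dup`, `table`, and `pivot_failed` names are hypothetical) in which every row appears twice, as it would after inserting the same data a second time.

```python
import pandas as pd

# Two flattened rows for 'c', each repeated to simulate a duplicate insert.
rows = [("c", 7, "y"), ("c", 8, "z")]
df_dup = pd.DataFrame(rows * 2, columns=["f1", "f2", "f3"])

# pivot_table() averages the repeated pairs, so the values are unchanged.
table = df_dup.pivot_table(index="f1", columns="f3", values="f2")
print(table)

# pivot() does not aggregate and raises ValueError on repeated pairs.
try:
    df_dup.pivot(index="f1", columns="f3", values="f2")
    pivot_failed = False
except ValueError:
    pivot_failed = True
print("pivot() failed on duplicates:", pivot_failed)
```

Note that the mean only hides duplicates when the repeated values are identical. If the same (f1, f3) pair arrives with different f2 values, the pivoted cell shows their average, so a different `aggfunc` may be a better fit in that case.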