## Selecting or Projecting Data

Let us look at the different ways of projecting data. `SELECT` is the clause we primarily use to project data.
* We can project all columns using `*`, or specific columns by listing their names.
* We can give a column or expression an alias using `AS` in the `SELECT` clause.
* `DISTINCT` returns the unique records for the selected columns. We can also use `DISTINCT *` to get unique records across all the columns.
* The `SELECT` clause can also contain aggregate functions such as `count` and `sum`.
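The points above can be tried without a PostgreSQL server. The following is a minimal sketch that assumes an in-memory SQLite database (Python's built-in `sqlite3` module) as a stand-in for the notebook's connection; the table holds only the five `orders` rows shown in this section, and the alias `status` is chosen purely for illustration.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the PostgreSQL database used here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "order_id INTEGER, order_date TEXT, "
    "order_customer_id INTEGER, order_status TEXT)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        (1, "2013-07-25 00:00:00", 11599, "CLOSED"),
        (2, "2013-07-25 00:00:00", 256, "PENDING_PAYMENT"),
        (3, "2013-07-25 00:00:00", 12111, "COMPLETE"),
        (4, "2013-07-25 00:00:00", 8827, "CLOSED"),
        (5, "2013-07-25 00:00:00", 11318, "COMPLETE"),
    ],
)

# Project two columns and rename one of them with AS.
cursor = conn.execute(
    "SELECT order_id, order_status AS status FROM orders ORDER BY order_id"
)
column_names = [d[0] for d in cursor.description]
projected = cursor.fetchall()

# DISTINCT collapses repeated values of the selected column.
statuses = conn.execute(
    "SELECT DISTINCT order_status FROM orders ORDER BY order_status"
).fetchall()

# An aggregate function reduces many rows to a single value.
(rows_count,) = conn.execute("SELECT COUNT(*) FROM orders").fetchone()

print(column_names)
print(statuses)
print(rows_count)
```

The alias shows up as the column name in the result, while the underlying table is unchanged.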

In [1]:
%load_ext sql

In [2]:
%env DATABASE_URL=postgresql://suryakantkumar:None@localhost:5432/suryakantkumar

env: DATABASE_URL=postgresql://suryakantkumar:None@localhost:5432/suryakantkumar


* Get all the Columns from Orders

In [3]:
%%sql 

SELECT 
    *
FROM 
    orders 
LIMIT 
    5

5 rows affected.


order_id,order_date,order_customer_id,order_status
1,2013-07-25 00:00:00,11599,CLOSED
2,2013-07-25 00:00:00,256,PENDING_PAYMENT
3,2013-07-25 00:00:00,12111,COMPLETE
4,2013-07-25 00:00:00,8827,CLOSED
5,2013-07-25 00:00:00,11318,COMPLETE


* Get Unique Rows from Orders, Considering All the Columns

In [4]:
%%sql 

SELECT 
    DISTINCT *
FROM 
    orders 
LIMIT 
    5

 * postgresql://suryakantkumar:***@localhost:5432/suryakantkumar
5 rows affected.


order_id,order_date,order_customer_id,order_status
1,2013-07-25 00:00:00,11599,CLOSED
2,2013-07-25 00:00:00,256,PENDING_PAYMENT
3,2013-07-25 00:00:00,12111,COMPLETE
4,2013-07-25 00:00:00,8827,CLOSED
5,2013-07-25 00:00:00,11318,COMPLETE


* Get order_customer_id, order_date, order_status and Derive order_month from Orders

In [5]:
%%sql 

SELECT 
    order_customer_id,
    order_date,
    to_char(order_date, 'yyyy-MM') AS order_month,
    order_status
FROM 
    orders 
LIMIT 
    5

 * postgresql://suryakantkumar:***@localhost:5432/suryakantkumar
5 rows affected.


order_customer_id,order_date,order_month,order_status
11599,2013-07-25 00:00:00,2013-07,CLOSED
256,2013-07-25 00:00:00,2013-07,PENDING_PAYMENT
12111,2013-07-25 00:00:00,2013-07,COMPLETE
8827,2013-07-25 00:00:00,2013-07,CLOSED
11318,2013-07-25 00:00:00,2013-07,COMPLETE


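* Get Unique Months from the Orders (the query returns 13 rows; only 10 of them are displayed below, and their order is not guaranteed because there is no `ORDER BY`)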

In [6]:
%%sql

SELECT 
    DISTINCT to_char(order_date, 'yyyy-MM') AS order_month
FROM 
    orders

 * postgresql://suryakantkumar:***@localhost:5432/suryakantkumar
13 rows affected.


order_month
2014-01
2014-05
2013-12
2013-11
2014-04
2014-07
2014-03
2013-08
2013-10
2013-07
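`DISTINCT` on its own does not promise any row order, which is why the months above appear shuffled. A minimal sketch of combining `DISTINCT` with `ORDER BY` follows. It assumes an in-memory SQLite database in place of PostgreSQL, so `strftime('%Y-%m', ...)` is swapped in for `to_char(order_date, 'yyyy-MM')`; the four sample rows are hypothetical.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for PostgreSQL; sample rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, order_date TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [
        (1, "2013-08-02 00:00:00"),
        (2, "2013-07-25 00:00:00"),
        (3, "2013-08-15 00:00:00"),
        (4, "2013-07-26 00:00:00"),
    ],
)

# strftime('%Y-%m', ...) plays the role of to_char(order_date, 'yyyy-MM').
# ORDER BY on the alias makes the output order deterministic.
order_months = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT strftime('%Y-%m', order_date) AS order_month "
        "FROM orders ORDER BY order_month"
    )
]

print(order_months)
```

In the PostgreSQL cell, adding `ORDER BY order_month` after `FROM orders` should have the same effect.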


* Compute Count of rows from Orders

In [7]:
%%sql

SELECT
    COUNT(*) AS rows_count
FROM
    orders

 * postgresql://suryakantkumar:***@localhost:5432/suryakantkumar
1 rows affected.


rows_count
68883


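* Compute Count of Unique rows from Orders. `COUNT(DISTINCT *)` is not valid syntax in PostgreSQL, so the first attempt below fails with a syntax error. Wrapping `SELECT DISTINCT *` in a subquery and counting its rows works.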

In [8]:
%%sql

SELECT 
    COUNT(DISTINCT *) 
FROM 
    orders

 * postgresql://suryakantkumar:***@localhost:5432/suryakantkumar
(psycopg2.errors.SyntaxError) syntax error at or near "*"
LINE 2:     COUNT(DISTINCT *) 
                           ^

[SQL: SELECT 
    COUNT(DISTINCT *) 
FROM 
    orders]
(Background on this error at: https://sqlalche.me/e/14/f405)


In [9]:
%%sql

SELECT 
    COUNT(*) 
FROM 
    (SELECT 
         DISTINCT * 
     FROM 
         orders) AS temp

 * postgresql://suryakantkumar:***@localhost:5432/suryakantkumar
1 rows affected.


count
68883
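Both counts come out as 68883 because `orders` has no fully duplicated rows. To see the subquery approach make a difference, here is a minimal sketch on a table that does contain a duplicate. It assumes an in-memory SQLite database in place of PostgreSQL, and the table `order_events` and its rows are hypothetical.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for PostgreSQL.
# order_events is a hypothetical table with one fully duplicated row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE order_events (order_id INTEGER, order_status TEXT)")
conn.executemany(
    "INSERT INTO order_events VALUES (?, ?)",
    [(1, "CLOSED"), (1, "CLOSED"), (2, "COMPLETE")],
)

# Plain COUNT(*) counts every row, duplicates included.
(total_rows,) = conn.execute("SELECT COUNT(*) FROM order_events").fetchone()

# Counting over a DISTINCT subquery counts each unique row once.
(unique_rows,) = conn.execute(
    "SELECT COUNT(*) FROM (SELECT DISTINCT * FROM order_events) AS temp"
).fetchone()

print(total_rows, unique_rows)
```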


* Compute Count of Unique Order Months from Orders

In [10]:
%%sql

SELECT
    COUNT(DISTINCT to_char(order_date, 'yyyy-MM')) AS distinct_months_count
FROM
    orders

 * postgresql://suryakantkumar:***@localhost:5432/suryakantkumar
1 rows affected.


distinct_months_count
13
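Unlike `COUNT(DISTINCT *)`, `COUNT(DISTINCT expression)` is valid, and it differs from `COUNT(expression)` only in that repeated values are counted once. A minimal sketch of the contrast follows, again assuming an in-memory SQLite database with `strftime('%Y-%m', ...)` swapped in for `to_char`; the three sample dates are hypothetical.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for PostgreSQL; sample dates are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_date TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?)",
    [
        ("2013-07-25 00:00:00",),
        ("2013-07-26 00:00:00",),
        ("2013-08-01 00:00:00",),
    ],
)

# COUNT(expr) counts every non-null value; COUNT(DISTINCT expr) counts unique ones.
months_count, distinct_months_count = conn.execute(
    "SELECT "
    "COUNT(strftime('%Y-%m', order_date)), "
    "COUNT(DISTINCT strftime('%Y-%m', order_date)) "
    "FROM orders"
).fetchone()

print(months_count, distinct_months_count)
```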
