## Selecting or Projecting Data

Let us understand different aspects of projecting data. We primarily use `SELECT` to project data.
* We can project all columns using `*`, or specific columns by listing their names.
* We can give a column or expression an alias using `AS` in the `SELECT` clause.
* `DISTINCT` can be used to get the distinct records from the selected columns. We can also use `DISTINCT *` to get unique records across all the columns.
* As part of the `SELECT` clause we can use aggregate functions such as `count` and `sum`.
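The first two points can be tried without a database server. The following is a minimal sketch, assuming an in-memory SQLite database as a stand-in for the PostgreSQL `retail_db`. The three rows are copied from the `orders` output in this section, while the helper `column_names` and the aliases `id` and `status` are hypothetical names used only for illustration.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the PostgreSQL retail database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "order_id INTEGER, order_date TEXT, "
    "order_customer_id INTEGER, order_status TEXT)"
)
# Three rows copied from the orders output shown in this section.
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        (2646, "2013-08-08 00:00:00", 10712, "CLOSED"),
        (74, "2013-07-25 00:00:00", 662, "PENDING_PAYMENT"),
        (6975, "2013-09-06 00:00:00", 11555, "COMPLETE"),
    ],
)


def column_names(query):
    """Run a query and return the column names of its result set (hypothetical helper)."""
    cursor = conn.execute(query)
    return [col[0] for col in cursor.description]


# Project all columns, a subset of columns, and columns renamed with AS.
all_columns = column_names("SELECT * FROM orders")
some_columns = column_names("SELECT order_customer_id, order_status FROM orders")
aliased = column_names("SELECT order_id AS id, order_status AS status FROM orders")

print(all_columns)
print(some_columns)
print(aliased)
```

The result set takes its column names from the `SELECT` list, so an alias changes only the name in the output, not the underlying table.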

In [1]:
%load_ext sql

In [2]:
%env DATABASE_URL=postgresql://itv002461_retail_user:7ji8g7gg8p8olbqbna5vz1tjyikaixco@pg.itversity.com:5433/itv002461_retail_db

env: DATABASE_URL=postgresql://itv002461_retail_user:7ji8g7gg8p8olbqbna5vz1tjyikaixco@pg.itversity.com:5433/itv002461_retail_db


In [3]:
%sql SELECT * FROM orders LIMIT 10

10 rows affected.


order_id,order_date,order_customer_id,order_status
2646,2013-08-08 00:00:00,10712,CLOSED
74,2013-07-25 00:00:00,662,PENDING_PAYMENT
5139,2013-08-25 00:00:00,7989,PENDING
6975,2013-09-06 00:00:00,11555,COMPLETE
75,2013-07-25 00:00:00,2505,PENDING_PAYMENT
9596,2013-09-23 00:00:00,3299,COMPLETE
76,2013-07-25 00:00:00,6898,COMPLETE
17204,2013-11-09 00:00:00,3960,CLOSED
77,2013-07-25 00:00:00,7915,PENDING_PAYMENT
19694,2013-11-24 00:00:00,964,ON_HOLD


In [4]:
%%sql 

SELECT * FROM information_schema.columns 
WHERE table_catalog = 'itv002461_retail_db' 
    AND table_name = 'orders'

 * postgresql://itv002461_retail_user:***@pg.itversity.com:5433/itv002461_retail_db
4 rows affected.


table_catalog,table_schema,table_name,column_name,ordinal_position,column_default,is_nullable,data_type,character_maximum_length,character_octet_length,numeric_precision,numeric_precision_radix,numeric_scale,datetime_precision,interval_type,interval_precision,character_set_catalog,character_set_schema,character_set_name,collation_catalog,collation_schema,collation_name,domain_catalog,domain_schema,domain_name,udt_catalog,udt_schema,udt_name,scope_catalog,scope_schema,scope_name,maximum_cardinality,dtd_identifier,is_self_referencing,is_identity,identity_generation,identity_start,identity_increment,identity_maximum,identity_minimum,identity_cycle,is_generated,generation_expression,is_updatable
itv002461_retail_db,public,orders,order_id,1,,NO,integer,,,32.0,2.0,0.0,,,,,,,,,,,,,itv002461_retail_db,pg_catalog,int4,,,,,1,NO,NO,,,,,,NO,NEVER,,YES
itv002461_retail_db,public,orders,order_date,2,,NO,timestamp without time zone,,,,,,6.0,,,,,,,,,,,,itv002461_retail_db,pg_catalog,timestamp,,,,,2,NO,NO,,,,,,NO,NEVER,,YES
itv002461_retail_db,public,orders,order_customer_id,3,,NO,integer,,,32.0,2.0,0.0,,,,,,,,,,,,,itv002461_retail_db,pg_catalog,int4,,,,,3,NO,NO,,,,,,NO,NEVER,,YES
itv002461_retail_db,public,orders,order_status,4,,NO,character varying,45.0,180.0,,,,,,,,,,,,,,,,itv002461_retail_db,pg_catalog,varchar,,,,,4,NO,NO,,,,,,NO,NEVER,,YES


In [5]:
%%sql 

SELECT order_customer_id, order_date, order_status 
FROM orders 
LIMIT 10

 * postgresql://itv002461_retail_user:***@pg.itversity.com:5433/itv002461_retail_db
10 rows affected.


order_customer_id,order_date,order_status
10712,2013-08-08 00:00:00,CLOSED
662,2013-07-25 00:00:00,PENDING_PAYMENT
7989,2013-08-25 00:00:00,PENDING
11555,2013-09-06 00:00:00,COMPLETE
2505,2013-07-25 00:00:00,PENDING_PAYMENT
3299,2013-09-23 00:00:00,COMPLETE
6898,2013-07-25 00:00:00,COMPLETE
3960,2013-11-09 00:00:00,CLOSED
7915,2013-07-25 00:00:00,PENDING_PAYMENT
964,2013-11-24 00:00:00,ON_HOLD


In [6]:
%%sql 

SELECT order_customer_id, 
    to_char(order_date, 'yyyy-MM-dd') AS order_date, 
    order_status 
FROM orders 
LIMIT 10

 * postgresql://itv002461_retail_user:***@pg.itversity.com:5433/itv002461_retail_db
10 rows affected.


order_customer_id,order_date,order_status
10712,2013-08-08,CLOSED
662,2013-07-25,PENDING_PAYMENT
7989,2013-08-25,PENDING
11555,2013-09-06,COMPLETE
2505,2013-07-25,PENDING_PAYMENT
3299,2013-09-23,COMPLETE
6898,2013-07-25,COMPLETE
3960,2013-11-09,CLOSED
7915,2013-07-25,PENDING_PAYMENT
964,2013-11-24,ON_HOLD


In [7]:
%%sql 

SELECT order_customer_id, 
    to_char(order_date, 'yyyy-MM') AS order_month, 
    order_status 
FROM orders 
LIMIT 10

 * postgresql://itv002461_retail_user:***@pg.itversity.com:5433/itv002461_retail_db
10 rows affected.


order_customer_id,order_month,order_status
10712,2013-08,CLOSED
662,2013-07,PENDING_PAYMENT
7989,2013-08,PENDING
11555,2013-09,COMPLETE
2505,2013-07,PENDING_PAYMENT
3299,2013-09,COMPLETE
6898,2013-07,COMPLETE
3960,2013-11,CLOSED
7915,2013-07,PENDING_PAYMENT
964,2013-11,ON_HOLD


In [8]:
%%sql 

SELECT DISTINCT to_char(order_date, 'yyyy-MM') AS order_month 
FROM orders

 * postgresql://itv002461_retail_user:***@pg.itversity.com:5433/itv002461_retail_db
13 rows affected.


order_month
2014-01
2014-05
2013-12
2013-11
2014-04
2014-07
2014-03
2013-08
2013-10
2013-07
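The months above come back in no particular order, because `DISTINCT` on its own does not guarantee any ordering. We can add `ORDER BY` when a sorted result is needed. Here is a minimal sketch, again assuming an in-memory SQLite database as a stand-in for PostgreSQL. SQLite has no `to_char`, so `strftime('%Y-%m', ...)` is used as its counterpart, and the rows are a small sample based on the `orders` output in this section.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the PostgreSQL retail database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, order_date TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [
        (2646, "2013-08-08 00:00:00"),
        (74, "2013-07-25 00:00:00"),
        (75, "2013-07-25 00:00:00"),
        (5139, "2013-08-25 00:00:00"),
        (19694, "2013-11-24 00:00:00"),
    ],
)

# strftime('%Y-%m', ...) plays the role of to_char(order_date, 'yyyy-MM').
# ORDER BY makes the order of the distinct months deterministic.
query = """
SELECT DISTINCT strftime('%Y-%m', order_date) AS order_month
FROM orders
ORDER BY order_month
"""
order_months = [row[0] for row in conn.execute(query)]
print(order_months)
```

Five orders collapse into three distinct months, returned in ascending order.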


In [9]:
%sql SELECT count(1) FROM orders

 * postgresql://itv002461_retail_user:***@pg.itversity.com:5433/itv002461_retail_db
1 rows affected.


count
68883


In [10]:
%%sql 

SELECT count(DISTINCT to_char(order_date, 'yyyy-MM')) AS distinct_month_count 
FROM orders

 * postgresql://itv002461_retail_user:***@pg.itversity.com:5433/itv002461_retail_db
1 rows affected.


distinct_month_count
13
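`count(DISTINCT ...)` returns the same number as the row count of the corresponding `SELECT DISTINCT` query, but as a single value. The sketch below checks this on a small sample, assuming an in-memory SQLite database as a stand-in for PostgreSQL and `strftime` in place of `to_char`. The variable names are illustrative only.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the PostgreSQL retail database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, order_date TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [
        (2646, "2013-08-08 00:00:00"),
        (74, "2013-07-25 00:00:00"),
        (75, "2013-07-25 00:00:00"),
        (5139, "2013-08-25 00:00:00"),
        (19694, "2013-11-24 00:00:00"),
    ],
)

# Total number of rows in the table.
total_orders = conn.execute("SELECT count(1) FROM orders").fetchone()[0]

# Number of distinct months, computed inside the database.
distinct_month_count = conn.execute(
    "SELECT count(DISTINCT strftime('%Y-%m', order_date)) FROM orders"
).fetchone()[0]

# The same months fetched as rows, to compare against the aggregate.
distinct_rows = conn.execute(
    "SELECT DISTINCT strftime('%Y-%m', order_date) FROM orders"
).fetchall()

print(total_orders, distinct_month_count, len(distinct_rows))
```

Computing the count in the database avoids transferring every distinct row to the client just to count them.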
