# Summarize Columns
Companion notebook for the blog post [Getting Started with Python Integration to SAS® Viya® - Part 9 - Summarize Columns](https://blogs.sas.com/content/sgf/2022/09/14/getting-started-with-python-integration-to-sas-viya-part-9-summarize-columns/).

## Import Packages
See the documentation for the [SWAT (SAS Scripting Wrapper for Analytics Transfer)](https://sassoftware.github.io/python-swat/index.html) package.

In [1]:
import swat
import pandas as pd

## Custom personal module (not a public package) that connects to my CAS server environment
from casConnect import connect_to_cas

## Make a Connection to CAS

##### To connect to the CAS server you will need:
1. the host name,
2. the port number,
3. your user name and password.

Visit the documentation [Getting Started with SAS® Viya® for Python](https://go.documentation.sas.com/doc/en/pgmsascdc/default/caspg3/titlepage.htm) for more information about connecting to CAS.

**Be aware that connecting to the CAS server can be implemented in various ways, so you might need to see your system administrator about how to make a connection. Please follow company policy regarding authentication.**

In [2]:
##
## Connect to CAS
##

## General connection syntax
# conn = swat.CAS(hostname, port, username, password)

## SAS Viya for Learners 3.5 connection (reads the settings from environment variables)
# import os
# hostValue = os.environ.get('CASHOST')
# portValue = os.environ.get('CASPORT')
# passwordToken = os.environ.get('SAS_VIYA_TOKEN')
# conn = swat.CAS(hostname=hostValue, port=portValue, password=passwordToken)

## Personal connection using my custom module
conn = connect_to_cas()

type(conn)

swat.cas.connection.CAS
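
The environment-variable pattern shown in the comments above can be made a little more defensive. What follows is a minimal sketch rather than the blog post's method: it only collects and validates the settings, and it never contacts a server. The `read_cas_settings` helper and the `CASUSER` and `CASPASSWORD` variable names are hypothetical; only `CASHOST` and `CASPORT` appear in this notebook.

```python
import os


def read_cas_settings(env=None):
    """Collect CAS connection settings from environment variables.

    CASHOST and CASPORT are the names used earlier in this notebook;
    CASUSER and CASPASSWORD are hypothetical names chosen for this sketch.
    """
    env = os.environ if env is None else env
    required = {
        "hostname": "CASHOST",
        "port": "CASPORT",
        "username": "CASUSER",
        "password": "CASPASSWORD",
    }
    # Fail early, and name every missing variable at once
    missing = [var for var in required.values() if not env.get(var)]
    if missing:
        raise KeyError(f"Missing environment variables: {', '.join(missing)}")

    settings = {key: env[var] for key, var in required.items()}
    settings["port"] = int(settings["port"])  # the port arrives as text
    return settings


# With a reachable CAS server, the result could be passed on as:
# conn = swat.CAS(**read_cas_settings())
```

Keeping the lookup separate from the `swat.CAS` call means a missing setting is reported before any connection attempt is made.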

## Load and explore data

In [11]:
## replace = True overwrites an existing in-memory copy, so this cell can be re-run safely
conn.loadTable(path = 'WATER_CLUSTER.sashdat', caslib = 'samples',
               casOut = dict(caslib = 'casuser', replace = True))
 
tbl = conn.CASTable('water_cluster', caslib='casuser')
 
tbl.head()


Unnamed: 0,Year,Month,Day,Date,Serial,Property,Address,City,Zip,Lat,Property_type,Meter_Location,Clli,DMA,Weekday,Weekend,Daily_W_C_M3,Week,US Holiday,CLUSTER
0,2014.0,1.0,31.0,2014-01-31,955.0,773.0,1800 POST OAK BLVD,HOUSTON,77056.0,-95.461478,0.0,internal,HSTNTXNA,1.0,6.0,0.0,4.376,4.0,,4.0
1,2015.0,12.0,26.0,2015-12-26,1076.0,879.0,1811 E CROSSTIMBERS ST,HOUSTON,77093.0,-95.352264,0.0,external,HSTNTXOX,2.0,5.0,0.0,1.515,51.0,,4.0
2,2014.0,1.0,19.0,2014-01-19,955.0,773.0,1800 POST OAK BLVD,HOUSTON,77056.0,-95.461478,0.0,internal,HSTNTXNA,1.0,1.0,1.0,1.694,3.0,,4.0
3,2014.0,5.0,9.0,2014-05-09,871.0,706.0,17575 ALDINE WESTFIELD RD,HOUSTON,77073.0,-95.364653,0.0,external,HSTNTXWE,1.0,6.0,0.0,0.728,18.0,,4.0
4,2014.0,1.0,30.0,2014-01-30,955.0,773.0,1800 POST OAK BLVD,HOUSTON,77056.0,-95.461478,0.0,internal,HSTNTXNA,1.0,5.0,0.0,3.973,4.0,,4.0
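
`loadTable` fails if the target table is already in memory, which happens when a load cell is re-run. One way to guard against that without overwriting anything is to check for the table first. The sketch below is an illustration under assumptions: `load_if_missing` is a hypothetical helper, and it assumes the `tableExists` action returns a result whose `exists` entry is 0 when the table is not in memory.

```python
def load_if_missing(conn, path, source_caslib, name, target_caslib):
    """Load a file into memory only when the target table is absent.

    `conn` is expected to behave like a swat.CAS connection. Returns True
    when a load was requested and False when the table already existed.
    """
    check = conn.tableExists(name=name, caslib=target_caslib)
    if check["exists"]:  # assumption: 0 means the table is not in memory
        return False
    conn.loadTable(path=path, caslib=source_caslib,
                   casOut=dict(name=name, caslib=target_caslib))
    return True


# Possible use with the connection from this notebook:
# load_if_missing(conn, 'WATER_CLUSTER.sashdat', 'samples',
#                 'water_cluster', 'casuser')
```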


## Simple summarizations

In [13]:
(tbl
 .Daily_W_C_M3
 .sum()
)

401407.888

In [14]:
(tbl
 .Daily_W_C_M3
 .max()
)

11910.0

In [15]:
(tbl
 .Daily_W_C_M3
 .min()
)

0.0

In [16]:
(tbl
 .Daily_W_C_M3
 .mean()
)

8.59177842465753
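
As a rough local cross-check of what these four methods report, the same statistics can be computed in plain Python on the five `Daily_W_C_M3` values displayed by `tbl.head()` earlier. This is only an illustration on a tiny sample; it does not touch the CAS table and will not reproduce the full-table results shown above.

```python
from statistics import fmean

# Daily_W_C_M3 values copied from the five rows displayed by tbl.head()
sample = [4.376, 1.515, 1.694, 0.728, 3.973]

summary = {
    "sum": sum(sample),
    "max": max(sample),
    "min": min(sample),
    "mean": fmean(sample),
}

for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```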

## Find the property with the max water consumption

In [17]:
## Store the max water consumption value
maxWaterConsumption = (tbl
                      .Daily_W_C_M3
                      .max()
)
 
## Filter the CAS table for the property with the max water usage
(tbl
 .query(f"Daily_W_C_M3 = {maxWaterConsumption}")
 .head()
)

Unnamed: 0,Year,Month,Day,Date,Serial,Property,Address,City,Zip,Lat,Property_type,Meter_Location,Clli,DMA,Weekday,Weekend,Daily_W_C_M3,Week,US Holiday,CLUSTER
0,2015.0,9.0,29.0,2015-09-29,198.0,163.0,1660 S DAIRY ASHFORD ST,HOUSTON,77077.0,-95.606308,0.0,external,HSTNTXBU,1.0,1.0,1.0,11910.0,39.0,,1.0
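
The filter passed to `query` above is an ordinary string, so it can be assembled by a small helper when several such lookups are needed. The following is a sketch only: `where_equals` is a hypothetical function, and the quoting of text values (double quotes, with embedded quotes doubled) is an assumption about the expression syntax rather than something this notebook demonstrates. Column names that contain spaces, such as `US Holiday`, are not handled.

```python
def where_equals(column, value):
    """Build a `column = value` filter string for CASTable.query.

    Numbers are written with repr(); text is wrapped in double quotes
    with embedded double quotes doubled (assumed quoting convention).
    """
    if isinstance(value, bool):
        raise TypeError("Boolean values are not handled in this sketch")
    if isinstance(value, (int, float)):
        literal = repr(value)
    else:
        literal = '"' + str(value).replace('"', '""') + '"'
    return f"{column} = {literal}"


print(where_equals("Daily_W_C_M3", 11910.0))  # Daily_W_C_M3 = 11910.0
print(where_equals("City", "HOUSTON"))        # City = "HOUSTON"
```

With the table from this notebook, the first string could be used as `tbl.query(where_equals("Daily_W_C_M3", maxWaterConsumption)).head()`.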


## Find the top 10 daily water consumption values and properties

In [18]:
df_top10 = tbl.nlargest(10, 'Daily_W_C_M3')
display(df_top10)

Unnamed: 0,Year,Month,Day,Date,Serial,Property,Address,City,Zip,Lat,Property_type,Meter_Location,Clli,DMA,Weekday,Weekend,Daily_W_C_M3,Week,US Holiday,CLUSTER
0,2015.0,9.0,29.0,2015-09-29,198.0,163.0,1660 S DAIRY ASHFORD ST,HOUSTON,77077.0,-95.606308,0.0,external,HSTNTXBU,1.0,1.0,1.0,11910.0,39.0,,1.0
1,2015.0,9.0,8.0,2015-09-08,198.0,163.0,1660 S DAIRY ASHFORD ST,HOUSTON,77077.0,-95.606308,0.0,external,HSTNTXBU,1.0,1.0,1.0,11851.0,36.0,,1.0
2,2015.0,9.0,12.0,2015-09-12,198.0,163.0,1660 S DAIRY ASHFORD ST,HOUSTON,77077.0,-95.606308,0.0,external,HSTNTXBU,1.0,5.0,0.0,11755.0,36.0,,1.0
3,2015.0,9.0,26.0,2015-09-26,198.0,163.0,1660 S DAIRY ASHFORD ST,HOUSTON,77077.0,-95.606308,0.0,external,HSTNTXBU,1.0,5.0,0.0,11612.0,38.0,,1.0
4,2015.0,9.0,9.0,2015-09-09,198.0,163.0,1660 S DAIRY ASHFORD ST,HOUSTON,77077.0,-95.606308,0.0,external,HSTNTXBU,1.0,2.0,0.0,11231.0,36.0,,1.0
5,2015.0,9.0,20.0,2015-09-20,198.0,163.0,1660 S DAIRY ASHFORD ST,HOUSTON,77077.0,-95.606308,0.0,external,HSTNTXBU,1.0,6.0,0.0,11061.0,37.0,,1.0
6,2015.0,9.0,21.0,2015-09-21,198.0,163.0,1660 S DAIRY ASHFORD ST,HOUSTON,77077.0,-95.606308,0.0,external,HSTNTXBU,1.0,7.0,1.0,10828.0,37.0,,1.0
7,2015.0,10.0,3.0,2015-10-03,198.0,163.0,1660 S DAIRY ASHFORD ST,HOUSTON,77077.0,-95.606308,0.0,external,HSTNTXBU,1.0,5.0,0.0,10740.0,39.0,,1.0
8,2015.0,10.0,5.0,2015-10-05,198.0,163.0,1660 S DAIRY ASHFORD ST,HOUSTON,77077.0,-95.606308,0.0,external,HSTNTXBU,1.0,7.0,1.0,10677.0,39.0,,1.0
9,2015.0,9.0,25.0,2015-09-25,198.0,163.0,1660 S DAIRY ASHFORD ST,HOUSTON,77077.0,-95.606308,0.0,external,HSTNTXBU,1.0,4.0,0.0,10417.0,38.0,,1.0
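
For intuition about the selection that `nlargest` performs, the same idea can be sketched locally with the standard library's `heapq.nlargest`: keep the `n` rows with the largest values in one field, ordered from largest to smallest. The example uses the `Date` and `Daily_W_C_M3` values from the five rows shown by `tbl.head()` earlier; it is a plain-Python analogy, not how SWAT implements the method.

```python
import heapq

# (Date, Daily_W_C_M3) pairs copied from the five rows shown by tbl.head()
rows = [
    ("2014-01-31", 4.376),
    ("2015-12-26", 1.515),
    ("2014-01-19", 1.694),
    ("2014-05-09", 0.728),
    ("2014-01-30", 3.973),
]

# Keep the three rows with the highest consumption, largest first
top3 = heapq.nlargest(3, rows, key=lambda row: row[1])
for date, consumption in top3:
    print(date, consumption)
```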


## Terminate the CAS Connection

In [19]:
conn.terminate()