# ex06-Doing Math Across Table Columns

As we already know, the demo database (i.e., demo.db3) was extracted from a numerical hydrological model. Most of its columns hold integers, decimals, or floating-point values, so it is natural to want to run calculations or statistical analysis on them. SQL can handle calculations ranging from basic math to more advanced statistics.

***Basic Math Operators***
<li>+ Addition
<li>- Subtraction
<li>* Multiplication
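<li>/ Division (with two integer operands it returns the quotient only, no remainder; make one operand a decimal, e.g. 7/2.0, to get 3.5)
<li>% Modulo (returns just the remainder)
<li>^ Exponentiation (available in some databases such as PostgreSQL, but not in SQLite; SQLite 3.35 and later offers power(x, y) when it is compiled with the built-in math functions)
<li>Others (e.g., functions such as round() and abs())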
    
Let’s try to use the most frequently used SQL math operators on the demo data. Instead of using numbers in queries, we’ll use the names of the columns that contain the numbers. When we execute the query, the calculation will occur on each row of the table.    

In [None]:
%load_ext sql

### 1. Connect to the demo database (demo.db3)

In [None]:
%sql sqlite:///data/demo.db3

If you do not remember which tables the demo database contains, you can always list them with the following query.

In [None]:
%sql SELECT name FROM sqlite_master WHERE type='table'

### 2. Test math operators the easy way

Using the SELECT statement, we can easily test the math operators.

In [None]:
%sql SELECT 3+4

In [None]:
%sql SELECT 12 * 4

In [None]:
%sql SELECT 12 % 4

In [None]:
%sql SELECT round(123.456,2) as Rounded
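
The same checks can be run outside the notebook magics. The following is a minimal sketch using Python's built-in sqlite3 module against a throwaway in-memory database (not demo.db3); the helper name `one` is made up for this illustration. It shows how integer and decimal division differ in SQLite, and that the `^` operator is rejected.

```python
import sqlite3

# Throwaway in-memory database; no tables are needed for literal math.
conn = sqlite3.connect(":memory:")

def one(sql):
    """Hypothetical helper: run a single-value query and return that value."""
    return conn.execute(sql).fetchone()[0]

int_div = one("SELECT 7 / 2")             # both operands are integers
real_div = one("SELECT 7 / 2.0")          # one decimal operand gives a decimal result
remainder = one("SELECT 7 % 2")           # modulo keeps only the remainder
rounded = one("SELECT round(123.456, 2)")

# SQLite has no ^ operator, so this statement is expected to fail.
try:
    conn.execute("SELECT 2 ^ 3")
    caret_supported = True
except sqlite3.OperationalError:
    caret_supported = False

print(int_div, real_div, remainder, rounded, caret_supported)
```

If a calculation unexpectedly returns a whole number, checking whether both operands are integers is usually the first thing to try.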

### 3. Doing Math Across Table Columns

Take the watershed_monthly table as an example.

#### 3.1 Check the table columns first

In [None]:
%sql SELECT * From watershed_monthly LIMIT 3

#### 3.2 Calculate the difference between two columns

For example, we are interested in the difference between potential evapotranspiration (PET_mm) and precipitation (PREC_mm), known as the potential evapotranspiration deficit (PED). PED can be thought of as a drought index: it is the difference between how much water could potentially be lost from the soil through evapotranspiration and how much is supplied by precipitation. When PED is high, plants do not have the full amount of water they need for growth.

In [None]:
%%sql sqlite://
SELECT YR, MO,  
PREC_mm as Precipitation, 
PET_mm as PET, 
PET_mm-PREC_mm as PED 
From watershed_monthly LIMIT 10
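
To see the row-by-row behaviour in isolation, here is a small sketch that builds a stand-in table in memory. The table and column names are borrowed from the query above, but the two rows of values are invented for illustration and are not taken from demo.db3.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE watershed_monthly (YR INTEGER, MO INTEGER, PREC_mm REAL, PET_mm REAL)"
)
# Invented sample values: a wet winter month and a dry summer month.
conn.executemany(
    "INSERT INTO watershed_monthly VALUES (?, ?, ?, ?)",
    [(2010, 1, 80.0, 20.0), (2010, 7, 30.0, 150.0)],
)

# The subtraction is evaluated once per row.
rows = conn.execute(
    "SELECT YR, MO, PET_mm - PREC_mm AS PED "
    "FROM watershed_monthly ORDER BY YR, MO"
).fetchall()

for row in rows:
    print(row)
```

In this made-up data, the wet month gets a negative PED (more rain than potential loss) and the dry month a positive one.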

###### We can also calculate the ratio of PED to precipitation, as a percentage.

In [None]:
%%sql sqlite://
SELECT YR, MO,  PREC_mm as Precipitation, 
PET_mm as PET, 
(PET_mm-PREC_mm)/PREC_mm*100.0 as PED_Ratio 
From watershed_monthly LIMIT 10
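
One thing to watch with a ratio is a month with zero precipitation. In SQLite, dividing by zero yields NULL rather than raising an error. The sketch below uses an in-memory stand-in table with invented values (not rows from demo.db3) to show this.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE watershed_monthly (YR INTEGER, MO INTEGER, PREC_mm REAL, PET_mm REAL)"
)
# Invented sample values; the second month has no precipitation at all.
conn.executemany(
    "INSERT INTO watershed_monthly VALUES (?, ?, ?, ?)",
    [(2010, 1, 50.0, 75.0), (2010, 2, 0.0, 40.0)],
)

rows = conn.execute(
    "SELECT MO, (PET_mm - PREC_mm) / PREC_mm * 100.0 AS PED_Ratio "
    "FROM watershed_monthly ORDER BY MO"
).fetchall()

# A NULL result arrives in Python as None.
for mo, ratio in rows:
    print(mo, ratio)
```

Rows with a NULL ratio can be dropped afterwards with a condition such as `WHERE PREC_mm > 0`, if that suits the analysis.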

#### 3.3 Use math operators in a WHERE statement

For example, we could use the modulo operator (%) to keep only the months (MO) that are divisible by 3, i.e. the last month of each quarter.

In [None]:
%%sql sqlite://
SELECT RCH, YR, MO, FLOW_INcms, FLOW_OUTcms 
From rch 
WHERE YR>2009 
and RCH=10 
and MO % 3 = 0
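
The effect of the `MO % 3 = 0` condition is easy to check on a tiny table. This sketch creates an in-memory stand-in for rch that holds only a MO column with the months 1 to 12; the real rch table has more columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified stand-in for the rch table: one row per month.
conn.execute("CREATE TABLE rch (MO INTEGER)")
conn.executemany("INSERT INTO rch VALUES (?)", [(m,) for m in range(1, 13)])

# Keep only the months whose remainder after division by 3 is zero.
kept_months = [
    row[0]
    for row in conn.execute("SELECT MO FROM rch WHERE MO % 3 = 0 ORDER BY MO")
]
print(kept_months)
```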

### 4. Do some statistics with Aggregate Functions

So far, we’ve performed math operations across columns in each row of a table. We can also calculate a result from the values within a single column using aggregate functions, which compute a single result from multiple inputs. Two of the most-used aggregate functions in data analysis are avg() and sum().

#### 4.1 average

avg - calculates the average of all values in that column (omits null values).

In [None]:
%%sql sqlite://
SELECT avg(FLOW_INcms), avg(FLOW_OUTcms) 
From rch 

#### 4.2 sum

sum - calculates the sum of the values in that column (omits null values).

In [None]:
%%sql sqlite://
SELECT sum(FLOW_INcms), sum(FLOW_OUTcms) 
From rch 

#### 4.3 extreme values

max - calculates the maximum value in that column (omits null values).

min - calculates the minimum value in that column (omits null values).

In [None]:
%%sql sqlite://
SELECT min(FLOW_INcms), max(FLOW_OUTcms) 
From rch 
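
The note "omits null values" applies to all four aggregates above. As a minimal sketch, the in-memory table below (a simplified stand-in for rch with invented values) contains one NULL, and comparing count(*) with count(FLOW_INcms) shows which rows the aggregates actually use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rch (FLOW_INcms REAL)")
# Invented values; None is stored as NULL.
conn.executemany("INSERT INTO rch VALUES (?)", [(2.0,), (4.0,), (None,)])

all_rows, non_null, average, total, lowest, highest = conn.execute(
    "SELECT count(*), count(FLOW_INcms), avg(FLOW_INcms), "
    "sum(FLOW_INcms), min(FLOW_INcms), max(FLOW_INcms) FROM rch"
).fetchone()

# The average is taken over the two non-NULL values, not over all three rows.
print(all_rows, non_null, average, total, lowest, highest)
```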

### 5. Calculate by ourselves

We can also compute some values ourselves by combining math operators with aggregate functions. For example, we can calculate the averages of FLOW_INcms and FLOW_OUTcms as the sum divided by the count.

In [None]:
%%sql sqlite://
SELECT sum(FLOW_INcms)/COUNT(FLOW_INcms) as AVG_FlowIn, 
sum(FLOW_OUTcms)/COUNT(FLOW_OUTcms) as AVG_FlowOut
From rch 
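
One caveat when computing an average by hand: if the column holds only integers, both sum() and count() return integers, and SQLite's `/` then performs integer division. The sketch below uses an invented in-memory table `t` with an INTEGER column `v` (neither exists in demo.db3) to compare the manual result with avg(), which always returns a floating-point value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table with an integer column.
conn.execute("CREATE TABLE t (v INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(1,), (2,)])

manual_int, manual_real, builtin = conn.execute(
    "SELECT sum(v) / count(v), "        # integer / integer truncates
    "       sum(v) * 1.0 / count(v), "  # multiplying by 1.0 forces decimal math
    "       avg(v) "
    "FROM t"
).fetchone()

print(manual_int, manual_real, builtin)
```

For columns that already store decimal values, the plain sum/count form should agree with avg().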

### Summary

Aggregating data (also referred to as rolling up, summarizing, or grouping data) is creating some sort of total from a number of records. Sum, min, max, count, and average are common aggregate operations.

In fact, the examples above do not show the real power of these aggregate functions. They become far more powerful when they are combined with ***GROUP BY*** and ***ORDER BY*** clauses.

# <font color="red">Click [here](ex07-Aggregating%20data%20with%20GROUP%20BY%20and%20ORDER%20BY.ipynb) to continue to the next notebook</font>