# Performing Mathematical Operations on Table Columns

SQL lets us perform calculations and statistical analyses directly on data stored in database tables. These calculations mainly involve integer, decimal, and floating-point types. SQL supports a wide range of math operators and functions, from basic arithmetic to aggregate statistical functions.

***Basic Math Operators***

- Addition (+)
- Subtraction (-)
- Multiplication (*)
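- Division (/) (in MySQL, returns a decimal result, e.g. `7 / 2` gives `3.5000`; use `DIV` for integer division that keeps only the quotient)
- Modulo (%) (returns only the remainder)
- Exponentiation (`POW()` or `POWER()` in MySQL, where `^` is the bitwise XOR operator; PostgreSQL uses `^` for exponentiation)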
- and others

Let's apply these frequently used SQL math operators to sample data. After a quick test with literal values, our queries will reference the columns that hold the numeric data, so that the calculation is carried out for each row of the table.

In [None]:
%load_ext sql

Connect to a MySQL database

In [None]:
%sql mysql://root:root@localhost:3306/sql-training

### Testing Math Operators with a SELECT Statement:

A SELECT statement without a FROM clause is a quick way to test math operators on literal values.

In [None]:
%sql SELECT 3+4

 * mysql://root:***@localhost:3306/sql-training
1 rows affected.


3+4
7


In [None]:
%sql SELECT 12 * 4

 * mysql://root:***@localhost:3306/sql-training
1 rows affected.


12 * 4
48


In [None]:
%sql SELECT 12 % 4

 * mysql://root:***@localhost:3306/sql-training
1 rows affected.


12 % 4
0


In [None]:
%sql SELECT round(123.456,2) as Rounded

 * mysql://root:***@localhost:3306/sql-training
1 rows affected.


Rounded
123.46


### Using Mathematics to Perform Operations Across Table Columns

Let's consider the table named "watershed_monthly" as our example.

#### First, examine the columns of the table.

In [None]:
%sql SELECT * From watershed_monthly LIMIT 3

 * mysql://root:***@localhost:3306/sql-training
3 rows affected.


YR,MO,PREC_mm,SURQ_mm,LATQ_mm,GWQ_mm,PERCOLA_mm,TILEQ_mm,SW_mm,ET_mm,PET_mm,WYLD_mm,SYLD_tons,NO3_SURQ,NO3_LATQ,NO3_PERC,NO3_CROP,N_ORG,P_SOL,P_ORG,TILENO3
1981,1,96.29016,0.5159814,0.412547,6.6881137,19.906706,0.0,1854.2242,6.871742,12.269017,8.681976,0.17933472,0.00021739966,0.0022701763,1.3104769,0.0,0.12622824,0.00011268913,0.015406809,0.0
1981,2,160.22804,3.3468022,0.6452789,9.148776,34.368122,0.0,1766.253,9.165532,14.773177,13.862185,1.2127172,0.0006267096,0.0019556785,1.1741264,0.0,0.8965996,0.0005604379,0.109999985,0.0
1981,3,136.65291,3.8249943,1.4813166,18.518496,34.367245,0.0,1990.7535,13.520476,23.3635,25.01851,1.2629602,0.00084956817,0.0075485506,0.5044969,0.0,0.6799346,0.0005445935,0.08338834,0.0


#### Calculating the Potential Evapotranspiration Deficit (PED)
Here we compute the Potential Evapotranspiration Deficit (PED) as the difference between two columns: potential evapotranspiration (`PET_mm`) and precipitation (`PREC_mm`). PED serves as a drought index: it measures the gap between the water that could be lost from the soil through evapotranspiration and the water that is actually available. Higher PED values mean that plants receive too little water for optimal growth, while negative values mean that precipitation exceeded potential evapotranspiration.

In [None]:
%%sql
SELECT YR, MO,
PREC_mm as Precipitation,
PET_mm as PET,
PET_mm-PREC_mm as PED
From watershed_monthly LIMIT 10

 * mysql://root:***@localhost:3306/sql-training
10 rows affected.


YR,MO,Precipitation,PET,PED
1981,1,96.29016,12.269017,-84.021143
1981,2,160.22804,14.773177,-145.454863
1981,3,136.65291,23.3635,-113.28941
1981,4,118.85741,36.19556,-82.66185
1981,5,84.54698,89.77256,5.225579999999994
1981,6,44.837185,123.68332,78.846135
1981,7,32.325985,174.0089,141.682915
1981,8,20.651497,152.6375,131.98600299999998
1981,9,16.579657,115.852905,99.273248
1981,10,59.67293,43.6953,-15.977629999999998


We can also express PED as a percentage of precipitation with the following SQL query:

In [None]:
%%sql
SELECT YR, MO,  PREC_mm as Precipitation,
PET_mm as PET,
(PET_mm-PREC_mm)/PREC_mm*100.0 as PED_Ratio
From watershed_monthly LIMIT 10

 * mysql://root:***@localhost:3306/sql-training
10 rows affected.


YR,MO,Precipitation,PET,PED_Ratio
1981,1,96.29016,12.269017,-87.25828578953447
1981,2,160.22804,14.773177,-90.77990531494986
1981,3,136.65291,23.3635,-82.90303514209832
1981,4,118.85741,36.19556,-69.54707325357333
1981,5,84.54698,89.77256,6.180682030274758
1981,6,44.837185,123.68332,175.8498777298352
1981,7,32.325985,174.0089,438.29419273689575
1981,8,20.651497,152.6375,639.1110678320317
1981,9,16.579657,115.852905,598.7653906229785
1981,10,59.67293,43.6953,-26.77534017518496



This query returns `YR`, `MO`, `PREC_mm` (as Precipitation), `PET_mm` (as PET, the potential evapotranspiration), and `PED_Ratio`, which is computed by subtracting precipitation from PET, dividing the result by precipitation, and multiplying by 100.0. The rows come from the `watershed_monthly` table and are limited to the first ten.
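
The same arithmetic can be tried outside the database server. The following is a minimal sketch, assuming an in-memory SQLite database as a stand-in for MySQL (an assumption made only so that the example runs anywhere). Three rows are copied from the output above, and a fourth, hypothetical row with zero precipitation shows how `NULLIF` keeps the ratio from dividing by zero.

```python
import sqlite3

# In-memory SQLite stands in for the MySQL database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE watershed_monthly (YR INTEGER, MO INTEGER, PREC_mm REAL, PET_mm REAL)"
)
conn.executemany(
    "INSERT INTO watershed_monthly VALUES (?, ?, ?, ?)",
    [
        (1981, 1, 96.29016, 12.269017),   # copied from the output above
        (1981, 5, 84.54698, 89.77256),    # copied from the output above
        (1981, 7, 32.325985, 174.0089),   # copied from the output above
        (1999, 1, 0.0, 10.0),             # hypothetical dry month, not from the source
    ],
)

ped_rows = conn.execute(
    """
    SELECT YR, MO,
           PET_mm - PREC_mm AS PED,
           -- NULLIF turns a zero precipitation into NULL, so the ratio
           -- is NULL for that row instead of a division by zero
           (PET_mm - PREC_mm) / NULLIF(PREC_mm, 0) * 100.0 AS PED_Ratio
    FROM watershed_monthly
    ORDER BY YR, MO
    """
).fetchall()

for yr, mo, ped, ratio in ped_rows:
    # The ratio is None (SQL NULL) for the zero-precipitation row
    print(yr, mo, round(ped, 3), None if ratio is None else round(ratio, 2))
```

`NULLIF` is available in both SQLite and MySQL, so the guarded expression should carry over to the notebook's query unchanged.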

#### Using Math Operators in a WHERE Clause:

Math operators can also appear in a WHERE clause to filter rows on numerical conditions. For example, the modulo operator (%) can select rows at regular intervals.

For instance, consider the following SQL query:

In [None]:
%%sql
SELECT RCH, YR, MO, FLOW_INcms, FLOW_OUTcms
From rch
WHERE YR < 2009
and RCH=10
and MO % 3 = 0

 * mysql://root:***@localhost:3306/sql-training
1 rows affected.


RCH,YR,MO,FLOW_INcms,FLOW_OUTcms
10,1981,3,8.964657,8.77624


In this SQL query, we are selecting the columns `RCH`, `YR`, `MO`, `FLOW_INcms`, and `FLOW_OUTcms` from the `rch` table. The `WHERE` clause is used to filter the data based on the following conditions:

-   Rows where the `YR` (year) is less than 2009,
-   Rows where the `RCH` (river channel) is equal to 10, and
-   Rows where the `MO` (month) satisfies the condition of being a multiple of 3 (i.e., the remainder of `MO` divided by 3 is 0).

This SQL query retrieves data only for months that are divisible by 3 (March, June, September, and December) and that occurred before the year 2009 in the specified river channel (RCH = 10).
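
To check which months pass the `MO % 3 = 0` test, here is a small sketch that assumes an in-memory SQLite database as a stand-in for MySQL. The table holds twelve hypothetical rows, one per month, with only the columns the filter needs.

```python
import sqlite3

# In-memory SQLite stands in for the MySQL database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rch (RCH INTEGER, YR INTEGER, MO INTEGER)")
# Hypothetical rows: one per month of 1981 for RCH = 10
conn.executemany(
    "INSERT INTO rch VALUES (10, 1981, ?)", [(m,) for m in range(1, 13)]
)

# Keep only months whose remainder after division by 3 is zero
matching_months = [
    row[0]
    for row in conn.execute(
        "SELECT MO FROM rch WHERE YR < 2009 AND RCH = 10 AND MO % 3 = 0 ORDER BY MO"
    )
]
print(matching_months)  # [3, 6, 9, 12]
```

In MySQL the same condition can also be written as `MOD(MO, 3) = 0`.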

### Statistics with Aggregate Functions
Besides row-by-row calculations across columns, we can compute statistics that derive a single result from many values in the same column. This is the job of aggregate functions. Two of the most frequently used are `avg()` and `sum()`, which return the average and the total of the values in a column, respectively.

#### Average Calculation

The `avg` function computes the arithmetic mean of the values in the specified column. Null values are excluded from the calculation.

In [None]:
%%sql
SELECT avg(FLOW_INcms), avg(FLOW_OUTcms)
From rch

 * mysql://root:***@localhost:3306/sql-training
1 rows affected.


avg(FLOW_INcms),avg(FLOW_OUTcms)
317.28117681660007,316.4445325763
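
Excluding null values is not the same as treating them as zero. The sketch below illustrates the difference, assuming an in-memory SQLite database as a stand-in for MySQL and a hypothetical `flows` table with one missing value.

```python
import sqlite3

# In-memory SQLite stands in for the MySQL database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE flows (FLOW_INcms REAL)")  # hypothetical table
conn.executemany("INSERT INTO flows VALUES (?)", [(10.0,), (20.0,), (None,)])

avg_ignoring_null, avg_null_as_zero = conn.execute(
    """
    SELECT avg(FLOW_INcms),              -- NULL row is skipped: (10 + 20) / 2
           avg(COALESCE(FLOW_INcms, 0))  -- NULL counted as 0: (10 + 20 + 0) / 3
    FROM flows
    """
).fetchone()
print(avg_ignoring_null, avg_null_as_zero)  # 15.0 10.0
```

Which of the two is appropriate depends on whether a missing measurement really means "no flow" or simply "not recorded".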


#### Sum Function:

The `sum` function computes the total of the values in a specific column. Null values are excluded from the computation.

In [None]:
%%sql
SELECT sum(FLOW_INcms), sum(FLOW_OUTcms)
From rch

 * mysql://root:***@localhost:3306/sql-training
1 rows affected.


sum(FLOW_INcms),sum(FLOW_OUTcms)
31728.117681660005,31644.45325763
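
One edge case is worth knowing: when no rows match, `sum` returns NULL rather than 0. The sketch below shows this, assuming an in-memory SQLite database as a stand-in for MySQL and a hypothetical `flows` table. `COALESCE` supplies a default value where one is needed.

```python
import sqlite3

# In-memory SQLite stands in for the MySQL database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE flows (RCH INTEGER, FLOW_INcms REAL)")  # hypothetical table
conn.executemany(
    "INSERT INTO flows VALUES (?, ?)", [(10, 1.5), (10, 2.5), (10, None)]
)

# NULL values are skipped: 1.5 + 2.5
total = conn.execute(
    "SELECT sum(FLOW_INcms) FROM flows WHERE RCH = 10"
).fetchone()[0]

# No rows match RCH = 99, so the sum is NULL (None in Python)
empty_total = conn.execute(
    "SELECT sum(FLOW_INcms) FROM flows WHERE RCH = 99"
).fetchone()[0]

# COALESCE replaces the NULL with a default of 0
safe_total = conn.execute(
    "SELECT COALESCE(sum(FLOW_INcms), 0) FROM flows WHERE RCH = 99"
).fetchone()[0]

print(total, empty_total, safe_total)  # 4.0 None 0
```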


#### Extreme Values

The `max` and `min` functions return the highest and lowest values in a specific column, respectively. Both ignore null values, so they identify the extreme data points among the non-null entries of the column.

In [None]:
%%sql
SELECT min(FLOW_INcms), max(FLOW_OUTcms)
From rch

 * mysql://root:***@localhost:3306/sql-training
1 rows affected.


min(FLOW_INcms),max(FLOW_OUTcms)
0.31832552,8006.854
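
Because `min` and `max` are ordinary expressions, they can be combined with math operators, for example to obtain the range of a column. A minimal sketch, assuming an in-memory SQLite database as a stand-in for MySQL and a hypothetical `flows` table:

```python
import sqlite3

# In-memory SQLite stands in for the MySQL database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE flows (FLOW_OUTcms REAL)")  # hypothetical table
conn.executemany(
    "INSERT INTO flows VALUES (?)", [(0.5,), (8.75,), (120.25,), (None,)]
)

lowest, highest, spread = conn.execute(
    """
    SELECT min(FLOW_OUTcms),
           max(FLOW_OUTcms),
           max(FLOW_OUTcms) - min(FLOW_OUTcms)  -- range of the non-NULL values
    FROM flows
    """
).fetchone()
print(lowest, highest, spread)  # 0.5 120.25 119.75
```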


### Self-Calculations
By combining aggregate functions with math operators, we can build calculations of our own. For instance, dividing `sum()` by `COUNT()` gives the average values of FLOW_INcms and FLOW_OUTcms, matching the results of `avg()`.

In [None]:
%%sql
SELECT sum(FLOW_INcms)/COUNT(FLOW_INcms) as AVG_FlowIn,
sum(FLOW_OUTcms)/COUNT(FLOW_OUTcms) as AVG_FlowOut
From rch

 * mysql://root:***@localhost:3306/sql-training
1 rows affected.


AVG_FlowIn,AVG_FlowOut
317.28117681660007,316.4445325763
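
The manual average matches `avg()` only when the count is taken over the column itself, because `COUNT(column)` skips null values while `COUNT(*)` counts every row. The sketch below illustrates this, assuming an in-memory SQLite database as a stand-in for MySQL and a hypothetical `flows` table with one missing value.

```python
import sqlite3

# In-memory SQLite stands in for the MySQL database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE flows (FLOW_INcms REAL)")  # hypothetical table
conn.executemany("INSERT INTO flows VALUES (?)", [(10.0,), (20.0,), (None,)])

builtin_avg, manual_avg, per_row_avg = conn.execute(
    """
    SELECT avg(FLOW_INcms),
           sum(FLOW_INcms) / count(FLOW_INcms),  -- divides by non-NULL values (2)
           sum(FLOW_INcms) / count(*)            -- divides by all rows (3)
    FROM flows
    """
).fetchone()
print(builtin_avg, manual_avg, per_row_avg)  # 15.0 15.0 10.0
```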


### **Summary:**

Data aggregation, also known as rolling up, summarizing, or grouping data, calculates totals or summary values from multiple records. Common aggregate operations include sums, minimums, maximums, counts, and averages.

These aggregate functions become far more useful when combined with the _**GROUP BY**_ and _**ORDER BY**_ clauses. _**GROUP BY**_ splits the data into subsets based on specific criteria so that each subset gets its own summary value, and _**ORDER BY**_ presents the results in a meaningful order.
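
As a brief illustration of that combination, the sketch below computes one summary row per year and sorts the years by average flow. It assumes an in-memory SQLite database as a stand-in for MySQL, and the four rows in the `rch` table are hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for the MySQL database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE rch (RCH INTEGER, YR INTEGER, MO INTEGER, FLOW_OUTcms REAL)"
)
# Hypothetical rows, not taken from the source data
conn.executemany(
    "INSERT INTO rch VALUES (?, ?, ?, ?)",
    [
        (10, 1981, 1, 2.0),
        (10, 1981, 2, 4.0),
        (10, 1982, 1, 10.0),
        (10, 1982, 2, 20.0),
    ],
)

yearly_summary = conn.execute(
    """
    SELECT YR,
           avg(FLOW_OUTcms) AS avg_out,
           max(FLOW_OUTcms) AS peak_out
    FROM rch
    GROUP BY YR            -- one result row per year
    ORDER BY avg_out DESC  -- wettest year first
    """
).fetchall()
print(yearly_summary)  # [(1982, 15.0, 20.0), (1981, 3.0, 4.0)]
```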