# Doing Math Across Table Columns
**Basic Math Operators**

- `+` Addition
- `-` Subtraction
- `*` Multiplication
- `/` Division (in MySQL this returns a decimal result; use `DIV` to get the integer quotient only, with no remainder)
- `%` Modulo (returns just the remainder; `MOD` is a synonym)
- `POW(x, y)` Exponentiation (in MySQL, `^` is the bitwise XOR operator, not a power operator)
- Others, such as the `ROUND()` function

Let’s try the most frequently used SQL math operators on the world data. Instead of using literal numbers in queries, we’ll use the names of the columns that contain the numbers. When we execute the query, the calculation is performed on each row of the table.
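
To make the row-by-row behaviour concrete, here is a minimal sketch. It assumes Python's built-in sqlite3 module as a stand-in for the MySQL server so that it runs without any setup; the `demo` table and its values are made up for illustration and are not part of the world data.

```python
import sqlite3

# In-memory SQLite database, used here only as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (name TEXT, a REAL, b REAL)")  # hypothetical table
conn.executemany(
    "INSERT INTO demo VALUES (?, ?, ?)",
    [("x", 10.0, 4.0), ("y", 7.0, 2.0), ("z", 1.5, 0.5)],  # invented values
)

# Each expression is evaluated once per row, so three rows give three results.
rows = conn.execute(
    "SELECT name, a + b, a - b, a * b FROM demo ORDER BY name"
).fetchall()
for row in rows:
    print(row)
conn.close()
```

Only `+`, `-` and `*` are used here on purpose: division and `^` behave differently between database engines, so those are best tested against MySQL itself.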

In [27]:
import mysql.connector as sql
import pandas as pd

In [28]:
connection = sql.connect(
    host="localhost",
    user="root",
    password="12345"
)

cursor = connection.cursor()

If you do not remember which tables the demo database contains, you can always list them with SHOW TABLES.

In [29]:
pd.read_sql_query("""
    SHOW TABLES
    FROM world""",
    connection)

Unnamed: 0,Tables_in_world
0,city
1,country
2,countrylanguage


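Let's see what the country table contains. Note that the connection above was opened without a database argument, so a query that uses an unqualified table name fails with error 1046 (No database selected). The later queries avoid this by writing world.country.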

In [30]:
pd.read_sql_query("""
    SELECT *
    FROM country
    LIMIT 10""",
    connection)

DatabaseError: Execution failed on sql '
    SELECT *
    FROM country
    LIMIT 10': 1046 (3D000): No database selected

### 1. Test math operators the easy way

A SELECT statement without a FROM clause evaluates an expression directly, which makes it a quick way to test the math operators.

In [None]:
pd.read_sql_query("""
    SELECT 3+4
    """,
    connection)

Unnamed: 0,3+4
0,7


In [None]:
pd.read_sql_query("""
    SELECT 12 * 4
    """,
    connection)

Unnamed: 0,12 * 4
0,48


In [None]:
pd.read_sql_query("""
    SELECT 12 % 4
    """,
    connection)

Unnamed: 0,12 % 4
0,0


In [None]:
pd.read_sql_query("""
    SELECT ROUND(124.321,35) AS Rounded
    """,
    connection)

Unnamed: 0,Rounded
0,124.321


### 2. Doing Math Across Table Columns
Take the country table as an example.

### 2.1 Check the columns first:

In [None]:
pd.read_sql_query("""
    DESCRIBE world.country
    """,
    connection)

Unnamed: 0,Field,Type,Null,Key,Default,Extra
0,Code,b'char(3)',NO,PRI,b'',
1,Name,b'char(52)',NO,,b'',
2,Continent,"b""enum('Asia','Europe','North America','Africa...",NO,,b'Asia',
3,Region,b'char(26)',NO,,b'',
4,SurfaceArea,"b'float(10,2)'",NO,,b'0.00',
5,IndepYear,b'smallint',YES,,,
6,Population,b'int',NO,,b'0',
7,LifeExpectancy,"b'float(3,1)'",YES,,,
8,GNP,"b'float(10,2)'",YES,,,
9,GNPOld,"b'float(10,2)'",YES,,,


### 2.2 Calculate difference between two columns

In [None]:
pd.read_sql_query("""
    SELECT GNP - GNPOld AS GNP_Delta
    FROM world.country
    """,
    connection)

Unnamed: 0,GNP_Delta
0,35.0
1,
2,-1336.0
3,
4,705.0
...,...
234,312.0
235,
236,-12363.0
237,-545.0
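
The empty cells in the result are most likely rows where GNPOld (or GNP) is NULL: DESCRIBE lists both columns as nullable, and arithmetic that involves NULL returns NULL. The sketch below reproduces this under stated assumptions: it uses Python's built-in sqlite3 module in place of MySQL so that it runs anywhere, and the two rows are invented.

```python
import sqlite3

# SQLite in memory stands in for MySQL; the rows below are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE country (Name TEXT, GNP REAL, GNPOld REAL)")
conn.executemany(
    "INSERT INTO country VALUES (?, ?, ?)",
    [("A", 100.0, 80.0), ("B", 50.0, None)],  # None is stored as NULL
)

# Arithmetic with NULL yields NULL, which Python receives as None.
deltas = conn.execute(
    "SELECT Name, GNP - GNPOld AS GNP_Delta FROM country ORDER BY Name"
).fetchall()
print(deltas)

# One option: skip the rows where the old value is missing.
known = conn.execute(
    "SELECT Name, GNP - GNPOld AS GNP_Delta FROM country "
    "WHERE GNPOld IS NOT NULL ORDER BY Name"
).fetchall()
print(known)
conn.close()
```

Filtering with IS NOT NULL keeps only the rows for which a difference can actually be computed, which is usually safer than substituting a default value.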


**We can also calculate the population density:**

In [None]:
pd.read_sql_query("""
    SELECT Name, Population / SurfaceArea AS Density
    FROM world.country
    ORDER BY Density DESC
    """,
    connection)

Unnamed: 0,Name,Density
0,Macao,26277.777778
1,Monaco,22666.666667
2,Hong Kong,6308.837209
3,Singapore,5771.844660
4,Gibraltar,4166.666667
...,...,...
234,Bouvet Island,0.000000
235,Heard Island and McDonald Islands,0.000000
236,British Indian Ocean Territory,0.000000
237,South Georgia and the South Sandwich Islands,0.000000


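### 2.3 Use math operators together with a WHERE clause
We can restrict the density calculation to the countries in Europe. Here the WHERE clause only filters the rows; the math itself still happens in the SELECT list.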

In [None]:
pd.read_sql_query("""
    SELECT Name, Population / SurfaceArea AS Density
    FROM world.country
    WHERE Continent='Europe'
    """,
    connection)

Unnamed: 0,Name,Density
0,Albania,118.310839
1,Andorra,166.666667
2,Austria,96.492923
3,Belgium,335.506914
4,Bulgaria,73.795881
5,Bosnia and Herzegovina,77.582671
6,Belarus,49.306358
7,Switzerland,173.442496
8,Czech Republic,130.323587
9,Germany,230.139039


### 3. Do some statistics with Aggregate Functions
So far, we’ve performed math operations across columns within each row of a table. We can also calculate a result from the values within a single column by using aggregate functions, which compute a single result from multiple input rows. Two of the most-used aggregate functions in data analysis are AVG() and SUM().

### 3.1 AVG
AVG - calculates the average of all values in that column (omits null values).

In [31]:
pd.read_sql_query("""
    SELECT AVG(Population)
    FROM world.country""",
    connection)

Unnamed: 0,AVG(Population)
0,25434100.0


### 3.2 SUM
SUM - calculates the sum of the values in that column (omits null values).

In [32]:
pd.read_sql_query("""
    SELECT SUM(Population)
    FROM world.country
    WHERE Continent='Europe'""",
    connection)

Unnamed: 0,SUM(Population)
0,730074600.0


### 3.3 Extreme values
MAX - calculates the maximum value in that column (omits null values).

MIN - calculates the minimum value in that column (omits null values).

In [33]:
pd.read_sql_query("""
    SELECT MIN(Population), MAX(Population)
    FROM world.country
    WHERE Continent='Europe'""",
    connection)

Unnamed: 0,MIN(Population),MAX(Population)
0,1000,146934000


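### 4. Calculate it ourselves
We can also calculate some values ourselves by combining math operators with aggregate functions. For example, dividing SUM by COUNT reproduces the average that AVG returned in section 3.1.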

In [35]:
pd.read_sql_query("""
    SELECT SUM(Population)/Count(Name)
    FROM world.country""",
    connection)

Unnamed: 0,SUM(Population)/Count(Name)
0,25434100.0
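
The manual calculation matches AVG here because Name and Population are both declared NOT NULL. On a nullable column such as LifeExpectancy, what you count matters. The sketch below illustrates the difference under assumptions: Python's built-in sqlite3 module stands in for MySQL, and the three rows are invented.

```python
import sqlite3

# SQLite in memory stands in for MySQL; the rows below are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE country (Name TEXT, LifeExpectancy REAL)")
conn.executemany(
    "INSERT INTO country VALUES (?, ?)",
    [("A", 70.0), ("B", 80.0), ("C", None)],  # one missing value
)

# AVG and COUNT(column) skip NULLs; COUNT(*) counts every row.
avg_builtin, avg_non_null, avg_all_rows = conn.execute(
    """
    SELECT AVG(LifeExpectancy),
           SUM(LifeExpectancy) / COUNT(LifeExpectancy),
           SUM(LifeExpectancy) / COUNT(*)
    FROM country
    """
).fetchone()
print(avg_builtin, avg_non_null, avg_all_rows)
conn.close()
```

Dividing by COUNT(*) silently treats the missing value as zero, so the hand-written average only agrees with AVG when the divisor counts the same non-NULL rows.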


### Summary

Aggregating data (also referred to as rolling up, summarizing, or grouping data) means computing some sort of total from a number of records. Sum, min, max, count, and average are common aggregate operations.

The examples above do not show the real power of these aggregate functions. They become far more useful when they are combined with GROUP BY and ORDER BY clauses.
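
As a brief preview of that combination, the sketch below computes one total and one average per continent and sorts the groups by total. It rests on assumptions: Python's built-in sqlite3 module is used instead of MySQL so that it runs without a server, and the five rows are invented rather than taken from the world data.

```python
import sqlite3

# SQLite in memory stands in for MySQL; the rows below are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE country (Name TEXT, Continent TEXT, Population INTEGER)")
conn.executemany(
    "INSERT INTO country VALUES (?, ?, ?)",
    [
        ("A", "Europe", 100),
        ("B", "Europe", 300),
        ("C", "Asia", 1000),
        ("D", "Asia", 200),
        ("E", "Asia", 300),
    ],
)

# GROUP BY yields one aggregate row per continent; ORDER BY ranks the groups.
per_continent = conn.execute(
    """
    SELECT Continent, SUM(Population) AS Total, AVG(Population) AS Mean
    FROM country
    GROUP BY Continent
    ORDER BY Total DESC
    """
).fetchall()
for row in per_continent:
    print(row)
conn.close()
```

Against the real data, the same query shape with world.country in the FROM clause should give one row per continent.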

# References
- [Chonghua Yin notebook](https://github.com/royalosyin/Practice-SQL-with-SQLite-and-Jupyter-Notebook/blob/master/ex06-Doing%20Math%20Across%20Table%20Columns.ipynb)