# Mathematical Operations

In [1]:
from sqlalchemy import create_engine, Table, MetaData, select
from sqlalchemy import and_, or_, not_, between, desc, func

# instantiate the database connection
engine = create_engine('sqlite:///../data/sqlalchemy/census.sqlite')
connection = engine.connect()
print('Tables:', engine.table_names())

# instantiate the table obj
census = Table('census', MetaData(), autoload=True, autoload_with=engine)
print('Columns:', census.columns.keys())
print('Setup complete!')

Tables: ['census', 'state_fact']
Columns: ['state', 'sex', 'age', 'pop2000', 'pop2008']
Setup complete!


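SQLAlchemy column expressions support addition (`+`), subtraction (`-`), multiplication (`*`), division (`/`), and modulus (`%`). Note that these operators behave differently on non-numeric column types; for example, `+` on string columns produces concatenation rather than a sum.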
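As a rough illustration of the SQL these operators translate to, the sketch below runs the equivalent raw SQL against an in-memory SQLite database using the standard-library `sqlite3` module instead of SQLAlchemy. The literal values are made up for the example. In SQLite, string concatenation is written `||`, which is what `+` between string columns is rendered as.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Integer operands: "/" truncates toward zero and "%" returns the remainder.
numeric = conn.execute("SELECT 7 + 2, 7 - 2, 7 * 2, 7 / 2, 7 % 2").fetchone()

# Text operands: SQLite concatenates with "||" rather than "+".
text = conn.execute("SELECT 'New' || ' ' || 'York'").fetchone()[0]

print(numeric)  # (9, 5, 14, 3, 1)
print(text)     # New York
conn.close()
```

Note that `7 / 2` yields `3`, not `3.5`: both operands are integers, so SQLite performs integer division.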

**Calculate the difference between two numerical columns**

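Wrap the arithmetic expression in parentheses so that `label()` applies to the whole difference rather than to `pop2000` alone. Because the difference is not wrapped in an aggregate such as `func.sum()`, grouping by `age` makes SQLite return the value from one arbitrarily chosen row in each age group rather than a per-age total.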

In [4]:
stmt = select([
    census.columns.age,
    (census.columns.pop2008 - census.columns.pop2000).label('pop_change')
])
query = stmt.group_by(census.columns.age) \
            .order_by(desc('pop_change')) \
            .limit(5)
            
connection.execute(query).fetchall()

[(61, 25201), (54, 23503), (55, 21716), (60, 19677), (58, 19526)]
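For comparison, here is a minimal sketch of the same pattern (a labelled difference, grouped, ordered, and limited) written as raw SQL with the standard-library `sqlite3` module. The `census_demo` table and its rows are invented for illustration, and the difference is wrapped in `SUM()` so that each age group gets a true total, which is an assumption about the intent rather than what the cell above does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE census_demo "
    "(state TEXT, sex TEXT, age INTEGER, pop2000 INTEGER, pop2008 INTEGER)"
)

# Hypothetical rows, not taken from the real census table.
rows = [
    ("New York", "F", 0, 100, 120),
    ("New York", "M", 0, 110, 125),
    ("Texas", "F", 1, 200, 260),
    ("Texas", "M", 1, 210, 250),
]
conn.executemany("INSERT INTO census_demo VALUES (?, ?, ?, ?, ?)", rows)

# "AS pop_change" plays the role of .label('pop_change');
# the alias can then be used in ORDER BY.
result = conn.execute(
    """
    SELECT age, SUM(pop2008 - pop2000) AS pop_change
    FROM census_demo
    GROUP BY age
    ORDER BY pop_change DESC
    LIMIT 5
    """
).fetchall()

print(result)  # [(1, 100), (0, 35)]
conn.close()
```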

**Using the Case statement**

* `case()` accepts a list of `(condition, value)` tuples and returns the value paired with the first condition that matches.
* the optional `else_` argument supplies the value to return when none of the conditions match.

In [7]:
from sqlalchemy import case, cast, Float

# calculate the sum of the population living in 'New York' in 2008
# if the condition is met, add pop2008; otherwise add the else_ default of 0
query = select([
    func.sum(case([
        (census.columns.state == 'New York', census.columns.pop2008)
    ], else_ = 0))
])
connection.execute(query).scalar() # use 'fetchall()' when multiple rows

19465159
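The conditional sum above corresponds to a SQL `CASE WHEN ... THEN ... ELSE ... END` expression inside `SUM()`. The sketch below runs that SQL directly with the standard-library `sqlite3` module on a small invented table (`pop_demo` and its values are hypothetical). It also shows one reason to supply the `ELSE` branch: without it, unmatched rows yield `NULL`, and a sum over only `NULL` values is itself `NULL`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pop_demo (state TEXT, pop2008 INTEGER)")

# Hypothetical rows, not taken from the real census table.
conn.executemany(
    "INSERT INTO pop_demo VALUES (?, ?)",
    [("New York", 300), ("New York", 200), ("Texas", 500)],
)

# Rows matching the condition contribute pop2008; all others contribute 0.
ny_total = conn.execute(
    "SELECT SUM(CASE WHEN state = 'New York' THEN pop2008 ELSE 0 END) FROM pop_demo"
).fetchone()[0]

# No row matches 'Ohio'. With ELSE 0 the sum is 0; without ELSE it is NULL.
ohio_with_else = conn.execute(
    "SELECT SUM(CASE WHEN state = 'Ohio' THEN pop2008 ELSE 0 END) FROM pop_demo"
).fetchone()[0]
ohio_without_else = conn.execute(
    "SELECT SUM(CASE WHEN state = 'Ohio' THEN pop2008 END) FROM pop_demo"
).fetchone()[0]

print(ny_total)           # 500
print(ohio_with_else)     # 0
print(ohio_without_else)  # None
conn.close()
```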

**Using the Cast statement**

* used to convert between data types, e.g. ints to floats for division, or strings to dates and times. Integer division is the most common case: we usually want a float back, and while some databases convert automatically, `cast()` makes the conversion to a particular type explicit.
* accepts a column or expression and the target type.

In [8]:
# What percentage of the population lived in 'New York' in 2008
query = select([
    (func.sum(case([
        (census.columns.state == 'New York', census.columns.pop2008)
    ], else_ = 0)) /
    cast(func.sum(census.columns.pop2008), Float) * 100).label('ny_percent')
])
connection.execute(query).scalar()

6.426761976501632
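To see why the cast matters, the following sketch computes the same kind of percentage twice in raw SQL with the standard-library `sqlite3` module: once with integer operands only, and once with the denominator cast to a floating-point type. The `pop_demo` table and its values are made up so that the expected share is exactly 25%; `CAST(... AS REAL)` is used here as the SQLite spelling of a float cast.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pop_demo (state TEXT, pop2008 INTEGER)")

# Hypothetical rows: New York holds 300 of 1200 people, i.e. 25%.
conn.executemany(
    "INSERT INTO pop_demo VALUES (?, ?)",
    [("New York", 300), ("Texas", 500), ("Ohio", 400)],
)

ny_sum = "SUM(CASE WHEN state = 'New York' THEN pop2008 ELSE 0 END)"

# Integer / integer truncates: 300 / 1200 is 0, so the percentage is 0.
integer_pct = conn.execute(
    f"SELECT {ny_sum} / SUM(pop2008) * 100 FROM pop_demo"
).fetchone()[0]

# Casting the denominator forces floating-point division.
float_pct = conn.execute(
    f"SELECT {ny_sum} / CAST(SUM(pop2008) AS REAL) * 100 FROM pop_demo"
).fetchone()[0]

print(integer_pct)  # 0
print(float_pct)    # 25.0
conn.close()
```

Casting either operand is enough; the notebook casts the total because it is the denominator shared by every percentage.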

**Calculate the percentage of women in the 2000 census**

In [9]:
# Build an expression to calculate female population in 2000
female_pop2000 = func.sum(
    case([
        (census.columns.sex == 'F', census.columns.pop2000)
    ], else_= 0))

# Build an expression for the total population in 2000, cast to Float
total_pop2000 = cast(func.sum(census.columns.pop2000), Float)

# Build a query to calculate the percentage of females in 2000
query = select([female_pop2000 / total_pop2000 * 100])

# Execute the query and return the scalar result
connection.execute(query).scalar()

51.09467432293413