In [1]:
import pandas as pd

# Operators and Dunder Methods
- protocols that determine how the Python language reacts to operations
- for example, when you use the `+` operator, Python dispatches to the `.__add__` method
- when you loop with a `for` statement, Python dispatches to the `.__iter__` method
- dunder = double underscore

In [2]:
2+4

6

In [4]:
(2).__add__(4)

6
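
The same dispatch applies to your own classes. A minimal sketch, using a hypothetical `Money` class (not part of pandas or of these notes) that defines `.__add__` and `.__iter__`:

```python
class Money:
    """Hypothetical toy class, used only to illustrate operator dispatch."""

    def __init__(self, cents):
        self.cents = cents

    def __add__(self, other):
        # Python calls this for `self + other`
        return Money(self.cents + other.cents)

    def __iter__(self):
        # Python calls this when the object is used in a `for` loop
        yield self.cents // 100   # whole dollars
        yield self.cents % 100    # remaining cents


total = Money(150) + Money(275)       # dispatches to Money.__add__
parts = [part for part in total]      # dispatches to Money.__iter__
print(total.cents, parts)
```

Neither `+` nor the loop knows anything about `Money`; both only look for the matching dunder method.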

In [5]:
url = 'https://github.com/mattharrison/datasets/raw/master/data/vehicles.csv.zip'
df=pd.read_csv(url)
city_mpg = df.city08
highway_mpg = df.highway08

One way to calculate the average of two series:

In [10]:
(city_mpg + highway_mpg)/2

0        22.0
1        11.5
2        28.0
3        11.0
4        20.0
         ... 
41139    22.5
41140    24.0
41141    21.0
41142    21.0
41143    18.5
Length: 41144, dtype: float64

## 6.3 Index Alignment
You can apply most math operations to a series with another series, and you can also use a scalar.

When you operate on two series, pandas will *align* the index before performing the operation.

Aligning takes each index entry in the left series and matches it up with every entry with the same name in the index of the right series.

Because of index alignment, you will want to make sure that the indexes:
- are unique (no duplicates)
- are common to both series

Example of two series with repeated index entries as well as non-common entries:

In [11]:
s1 = pd.Series([10,20,30], index=[1,2,2])

In [13]:
s2 = pd.Series([35,44,53], index=[2,2,4], name='s2')

In [14]:
s1

1    10
2    20
2    30
dtype: int64

In [15]:
s2

2    35
2    44
4    53
Name: s2, dtype: int64

In [16]:
s1+s2

1     NaN
2    55.0
2    64.0
2    65.0
2    74.0
4     NaN
dtype: float64
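
A rough way to see where those six rows come from is to check both conditions before operating. The sketch below rebuilds `s1` and `s2` so it runs on its own; the variable names `unique_flags`, `s1_in_s2`, `s2_in_s1`, and `pairs` are only illustrative:

```python
import pandas as pd

s1 = pd.Series([10, 20, 30], index=[1, 2, 2])
s2 = pd.Series([35, 44, 53], index=[2, 2, 4], name='s2')

# Are the index labels unique? Both series repeat the label 2.
unique_flags = (s1.index.is_unique, s2.index.is_unique)

# Which labels of each series also appear in the other one?
s1_in_s2 = s1.index.isin(s2.index)   # label 1 has no partner in s2
s2_in_s1 = s2.index.isin(s1.index)   # label 4 has no partner in s1

# The four rows for label 2 are every left/right pairing of its values.
pairs = [left + right for left in s1.loc[2] for right in s2.loc[2]]
print(unique_flags, s1_in_s2, s2_in_s1, pairs)
```

Labels without a partner (1 and 4) produce `NaN`, and the repeated label 2 produces 2 x 2 = 4 rows.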

## 6.4 Broadcasting
- when you perform math operations with a scalar, pandas *broadcasts* the operation to all values
- broadcast math operations are optimized and run quickly on the CPU; this is called *vectorization*
- a numeric pandas series is a block of memory, and modern CPUs use a technology called Single Instruction/Multiple Data (SIMD) to apply a math operation to the whole block of memory
- the available operations include: `+`, `-`, `*`, `/`, `//` (floor division), `%` (modulus), `@` (matrix multiplication), `**` (power), `<`, `>`, `<=`, `>=`, `==`, `!=`, `&` (binary and), `^` (binary xor), `|` (binary or)
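
A small sketch of broadcasting with a scalar, using a made-up three-value series rather than the vehicles data:

```python
import pandas as pd

s = pd.Series([10, 20, 30])

plus_five = s + 5     # the scalar 5 is applied to every value
squared = s ** 2      # power is broadcast the same way
is_big = s >= 20      # comparisons broadcast too and return booleans
print(plus_five.tolist(), squared.tolist(), is_big.tolist())
```

No loop is written in any of the three lines; pandas applies the operation to the whole block of values at once.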

## 6.5 Iteration

Note that a series also has a `.__iter__` method, so you can loop over its items. However, it is recommended to avoid using a `for` loop with a series. If you use a loop to search or filter for values, we will see that there are other ways that are faster and make the code easier to understand.
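
As a sketch of the difference, here is the same filter written with a loop and with a boolean comparison. The series `mpg` and its values are made up for illustration:

```python
import pandas as pd

mpg = pd.Series([19, 9, 23, 10, 17])

# Loop version: works because a series defines .__iter__
looped = []
for value in mpg:
    if value > 15:
        looped.append(value)

# Vectorized version: compare once, then index with the boolean series
filtered = mpg[mpg > 15]
print(looped, filtered.tolist())
```

Both give the same values, but the vectorized version is a single expression and keeps the original index labels on the result.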


## 6.6 Operator Methods
pandas provides methods for the standard operators so that you can *parameterize* them, that is, change their behavior through parameters
- the dunder methods generally fill in NaN when one of the operands is missing following index alignment
- the operator methods have a `fill_value` parameter that changes this behavior.
- if one of the operands is missing, it will use the `fill_value` instead


If we call the `.add` method with default parameters, we get the same result as with the `+` operator:


In [17]:
s1+s2

1     NaN
2    55.0
2    64.0
2    65.0
2    74.0
4     NaN
dtype: float64

In [18]:
s1.add(s2)

1     NaN
2    55.0
2    64.0
2    65.0
2    74.0
4     NaN
dtype: float64

However, we can use the `fill_value` parameter to specify that zero is used for the missing operand instead:

In [19]:
s1.add(s2, fill_value=0)

1    10.0
2    55.0
2    64.0
2    65.0
2    74.0
4    53.0
dtype: float64
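
The other operator methods take `fill_value` as well. A minimal sketch with `.mul`, using two made-up series with unique string labels so that only the missing labels matter:

```python
import pandas as pd

a = pd.Series([1, 2, 3], index=['a', 'b', 'c'])
b = pd.Series([10, 20], index=['b', 'd'])

plain = a * b                      # NaN wherever a label is missing on one side
filled = a.mul(b, fill_value=1)    # a missing operand is treated as 1
print(plain.isna().tolist(), filled.tolist())
```

Using 1 here plays the same role as 0 did for addition: it leaves the value that is present unchanged.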

## 6.7 Chaining
- another stylistic reason to prefer the method to the operator is that it makes *chaining* manipulations easier
    - pandas methods do not mutate data in place but instead return a new object
    - this allows us to keep tacking on method calls to the returned object

In [23]:
((city_mpg 
  + highway_mpg)
  /2
  )

0        22.0
1        11.5
2        28.0
3        11.0
4        20.0
         ... 
41139    22.5
41140    24.0
41141    21.0
41142    21.0
41143    18.5
Length: 41144, dtype: float64

In [21]:
(city_mpg
 .add(highway_mpg)
 .div(2)
 )

0        22.0
1        11.5
2        28.0
3        11.0
4        20.0
         ... 
41139    22.5
41140    24.0
41141    21.0
41142    21.0
41143    18.5
Length: 41144, dtype: float64
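
Because each method returns a new series, the chain can keep growing. A sketch that counts how many averages exceed 20, using short made-up series as stand-ins for `city_mpg` and `highway_mpg` (the threshold 20 is arbitrary):

```python
import pandas as pd

# Hypothetical stand-ins for city_mpg and highway_mpg
city = pd.Series([19, 9, 23])
highway = pd.Series([25, 14, 33])

n_efficient = (city
 .add(highway)
 .div(2)      # averages: 22.0, 11.5, 28.0
 .gt(20)      # boolean series: is the average above 20?
 .sum()       # True counts as 1, so this counts the matches
 )
print(n_efficient)
```

Written with operators, the same steps would need another layer of parentheses for each stage.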

# Summary 
*Method*|*Operator*|*Description*
:---|---:|:---
`s.add(s2)` | s+s2| adds series
`s.radd(s2)` | s2+s| adds series
`s.sub(s2)` | s-s2| subtracts series
`s.rsub(s2)` | s2-s| subtracts series
`s.mul(s2), s.multiply(s2)` | s*s2 | multiplies series
`s.rmul(s2)` | s2*s |multiplies series
`s.div(s2), s.truediv(s2)` | s/s2| divides series
`s.rdiv(s2), s.rtruediv(s2)` | s2/s| divides series
`s.mod(s2)` | s % s2 | modulo of series division
`s.rmod(s2)` | s2 % s | modulo of series division
`s.floordiv(s2)` | s // s2 | floor divides series
`s.rfloordiv(s2)` | s2 // s | floor divides series
`s.pow(s2)` | s**s2 | exponential power of series
`s.rpow(s2)` | s2**s| exponential power of series
`s.eq(s2)` | s==s2 | elementwise equals of series
`s.ne(s2)` | s!=s2 | elementwise not equals of series
`s.gt(s2)` | s>s2 | elementwise greater than of series
`s.lt(s2)` | s<s2 | elementwise less than of series
`s.ge(s2)` | s>=s2 | elementwise greater than or equal of series
`s.le(s2)` | s<=s2 | elementwise less than or equal of series
`np.invert(s)` | ~s | elementwise inversion of boolean series (no pandas method)
`np.logical_and(s, s2)` | s & s2 | elementwise logical and of boolean series (no pandas method)
`np.logical_or(s, s2)` | s \| s2 | elementwise logical or of boolean series (no pandas method)
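
A short sketch of a few rows from the table: the reflected (`r`) methods swap the operands, and the NumPy functions stand in for the boolean operators. The series values are made up:

```python
import numpy as np
import pandas as pd

s = pd.Series([1, 2, 4])

reflected_sub = s.rsub(10)     # same as 10 - s
reflected_div = s.rdiv(8)      # same as 8 / s

a = pd.Series([True, True, False])
b = pd.Series([True, False, False])

both = np.logical_and(a, b)    # same as a & b
inverted = np.invert(a)        # same as ~a
print(reflected_sub.tolist(), reflected_div.tolist(),
      list(both), list(inverted))
```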




