# SUMIF(), COUNTIF(), and AVERAGEIF()
---

## **In Excel:**

    =SUMIF(range, criteria, [sum_range])
    =COUNTIF(range, criteria)
    =AVERAGEIF(range, criteria, [average_range])

## **In Python:**

**Using `np.where()`**

    # Non-matching rows become 0: harmless for a sum, but it would distort a count or a mean
    np.where(df['column'] == criteria, df['column_to_sum'], 0).sum()
<br>
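
To illustrate why the zero-fill `np.where()` pattern suits sums better than counts or means, here is a minimal sketch on a small made-up DataFrame. The `region` and `sales` columns and their values are assumptions for illustration only, not part of the notebook's datasets, and the `np.nan` fill with `np.nanmean` is one possible workaround rather than the notebook's method.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data for illustration only
df = pd.DataFrame({
    'region': ['East', 'West', 'East', 'West'],
    'sales': [100.0, 200.0, 300.0, 400.0],
})

is_east = df['region'] == 'East'

# Filling non-matching rows with 0 does not change a sum
sum_east = np.where(is_east, df['sales'], 0).sum()

# The zeros still count as rows, so the mean is pulled toward 0
diluted_mean = np.where(is_east, df['sales'], 0).mean()

# Filling with NaN and using np.nanmean ignores the non-matching rows
mean_east = np.nanmean(np.where(is_east, df['sales'], np.nan))

# A conditional count is the number of True values in the condition
count_east = is_east.sum()

print(sum_east, diluted_mean, mean_east, count_east)
```

Here the sum is 400.0 either way, but the zero-filled mean comes out as 100.0 while the mean of the two matching rows is 200.0.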

**Pandas approach**

    df[df['column'] == criteria]['column_to_sum'].sum()
    df[df['column'] == criteria]['column_to_count'].count()
    df[df['column'] == criteria]['column_to_average'].mean()

<br>
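
As a minimal sketch of how the three Pandas patterns line up with their Excel counterparts, the example below uses a small made-up DataFrame. The `region` and `sales` columns and their values are illustrative assumptions, not part of the notebook's datasets.

```python
import pandas as pd

# Hypothetical toy data for illustration only
df = pd.DataFrame({
    'region': ['East', 'West', 'East', 'West', 'East'],
    'sales': [100, 200, 300, 400, 500],
})

# Keep only the rows where the criterion holds
east_rows = df[df['region'] == 'East']

total = east_rows['sales'].sum()      # like =SUMIF(region, "East", sales)
n_rows = east_rows['sales'].count()   # like =COUNTIF(region, "East")
average = east_rows['sales'].mean()   # like =AVERAGEIF(region, "East", sales)

print(total, n_rows, average)
```

With this toy data the three results are 900, 3, and 300.0.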

*Tip: You can also save the condition as a variable and reuse it inside the square brackets:*

    condition = df['column'] == criteria

    df[condition]['column_to_average'].mean()

<br><br>
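
One possible alternative to chaining two sets of square brackets is to pass a saved condition and the column name to `.loc` in a single step. The sketch below assumes a made-up DataFrame (`df`, `category`, and `amount` are hypothetical names) and shows the same saved condition being reused for several aggregates.

```python
import pandas as pd

# Hypothetical toy data for illustration only
df = pd.DataFrame({
    'category': ['A', 'B', 'A', 'C'],
    'amount': [10.0, 20.0, 30.0, 40.0],
})

# Save the condition once (a Boolean Series with one True/False per row)
condition = df['category'] == 'A'

# Chained brackets: filter the rows first, then pick the column
chained_mean = df[condition]['amount'].mean()

# .loc: select the matching rows and the column in one step
loc_mean = df.loc[condition, 'amount'].mean()
loc_sum = df.loc[condition, 'amount'].sum()

print(chained_mean, loc_mean, loc_sum)
```

Both forms return the same values when reading data; `.loc` is generally the safer habit if the selection is later used for assignment.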

### Load required packages and data
---

In [1]:
# Import required packages
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

In [2]:
# Save Github location paths to a variable
failed_bank_path = 'https://github.com/The-Calculated-Life/python_analysis_for_excel/blob/main/data/failed_banks.xlsx?raw=true'
bx_books_path = 'https://raw.githubusercontent.com/The-Calculated-Life/python_analysis_for_excel/main/data/bx_books.csv'

# Read excel and CSV files
bank_detail = pd.read_excel(failed_bank_path, sheet_name='detail')
bx_books = pd.read_csv(bx_books_path)

<br><br>
### SUMIF(), COUNTIF(), AVERAGEIF() Examples
---

In [3]:
# Calculate total ESTIMATED LOSS for banks with more than $50,000 in ESTIMATED LOSS (using np.where)
np.where(bank_detail['ESTIMATED LOSS'] > 50000, bank_detail['ESTIMATED LOSS'], 0).sum()

66580025.0

<br>

In [5]:
# Calculate total ESTIMATED LOSS for banks with more than $50,000 in ESTIMATED LOSS (Pandas)
bank_detail[bank_detail['ESTIMATED LOSS'] > 50000]['ESTIMATED LOSS'].sum()

66580025.0

<br>

In [6]:
# Save the condition as a variable
big_losses = bank_detail['ESTIMATED LOSS'] > 50000

<br>

In [7]:
# Calculate the total ASSETS for banks with an ESTIMATED LOSS over $50,000
bank_detail[big_losses]['ASSETS'].sum()

311588628

<br>

In [8]:
# How many banks had an estimated loss greater than $50,000?
bank_detail[big_losses]['CERT'].count()

233

<br>
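
Note that `.count()` counts the non-missing values in the selected column, so the result can depend on which column is chosen. The sketch below uses made-up data (the `cert`, `loss`, and `assets` columns are hypothetical stand-ins, not the real `bank_detail` values) to show that summing the Boolean condition counts matching rows regardless of missing values in other columns.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data for illustration only; 'assets' has missing values
df = pd.DataFrame({
    'cert': [101, 102, 103, 104],
    'loss': [60000.0, 10000.0, 75000.0, 90000.0],
    'assets': [500.0, np.nan, np.nan, 800.0],
})

big = df['loss'] > 50000  # True for rows 0, 2, and 3

# count() skips NaN, so the answer depends on the column being counted
count_cert = df[big]['cert'].count()      # no missing values among matches
count_assets = df[big]['assets'].count()  # one matching row has NaN assets

# Summing the condition treats True as 1 and counts every matching row
count_rows = big.sum()

print(count_cert, count_assets, count_rows)
```

Here counting `cert` and summing the condition both give 3, while counting `assets` gives 2 because one matching row is missing a value.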

In [9]:
# Calculate the average losses for banks with more than $50,000 in estimated losses (using brackets)
bank_detail[big_losses]['ESTIMATED LOSS'].mean()

285751.18025751074

<br><br>
**QUICK CHALLENGE #1:**

**Task: Write code that counts the number of books published in 2002 using the `bx_books` DataFrame.**


In [10]:
# Your code for quick challenge #1 here:
bx_books.head()

| Unnamed: 0 | isbn | book_title | book_author | year_of_publication | publisher |
| --- | --- | --- | --- | --- | --- |
| 0 | 195153448 | Classical Mythology | Mark P. O. Morford | 2002 | Oxford University Press |
| 1 | 2005018 | Clara Callan | Richard Bruce Wright | 2001 | HarperFlamingo Canada |
| 2 | 60973129 | Decision in Normandy | Carlo D'Este | 1991 | HarperPerennial |
| 3 | 374157065 | Flu: The Story of the Great Influenza Pandemic... | Gina Bari Kolata | 1999 | Farrar Straus Giroux |
| 4 | 393045218 | The Mummies of Urumchi | E. J. W. Barber | 1999 | W. W. Norton & Company |


In [12]:
bx_books[bx_books['year_of_publication'] == 2002]['isbn'].count()

17628