# Calculating on a pivot table
Pivot tables are filled with summary statistics, but they are only a first step toward finding something insightful. Often you'll need to perform further calculations on them. A common task is finding the rows or columns where the highest or lowest value occurs.

Recall from Chapter 1 that you can easily subset a Series or DataFrame to find rows of interest using a logical condition inside of square brackets. For example: **series[series > value]**.
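As a reminder of how that subsetting pattern works, here is a minimal sketch. The values and labels are made up for illustration and do not come from the temperatures data set.

```python
import pandas as pd

# Hypothetical toy data, not taken from the temperatures data set
series = pd.Series([12.5, 18.0, 25.3, 9.1], index=["a", "b", "c", "d"])

# The comparison builds a boolean mask with one True/False per element
mask = series > 15

# Indexing with the mask keeps only the elements where the mask is True
above_15 = series[mask]
print(above_15)
```

The same pattern works when the right-hand side is computed from the Series itself, such as `series.max()`, which is what the exercise below relies on.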

**pandas** is loaded as *pd*, and the DataFrame **temp_by_country_city_vs_year** is available (the cell below rebuilds it from the raw data).

In [2]:
import pandas as pd

# Load the temperatures data set (local path; adjust for your machine)
path = r'/media/documentos/Cursos/Data Science/Python/Data_Science_Python/data_sets/'
file = 'temperatures.csv'
temperatures = pd.read_csv(path + file, index_col=0)

# Parse the dates and extract the year to use as the pivot columns
temperatures['date'] = pd.to_datetime(temperatures['date'], format='%Y-%m-%d')
temperatures["year"] = temperatures["date"].dt.year

# Mean temperature for each (country, city) row and each year column
temp_by_country_city_vs_year = temperatures.pivot_table("avg_temp_c", index=["country", "city"], columns="year")
print(temp_by_country_city_vs_year.head())

year                        2000       2001       2002       2003       2004  \
country     city                                                               
Afghanistan Kabul      15.822667  15.847917  15.714583  15.132583  16.128417   
Angola      Luanda     24.410333  24.427083  24.790917  24.867167  24.216167   
Australia   Melbourne  14.320083  14.180000  14.075833  13.985583  13.742083   
            Sydney     17.567417  17.854500  17.733833  17.592333  17.869667   
Bangladesh  Dhaka      25.905250  25.931250  26.095000  25.927417  26.136083   

year                        2005       2006       2007       2008       2009  \
country     city                                                               
Afghanistan Kabul      14.847500  15.798500  15.518000  15.479250  15.093333   
Angola      Luanda     24.414583  24.138417  24.241583  24.266333  24.325083   
Australia   Melbourne  14.378500  13.991083  14.991833  14.110583  14.647417   
            Sydney     18.028083  17.74

- Calculate the mean temperature for each year, assigning to **mean_temp_by_year**.
- Filter **mean_temp_by_year** for the year that had the highest mean temperature.
- Calculate the mean temperature for each city (across columns), assigning to **mean_temp_by_city**.
- Filter **mean_temp_by_city** for the city that had the lowest mean temperature.
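The two `.mean()` calls in these steps differ only in the axis they collapse. The sketch below illustrates this on a small made-up table (the city names and values are hypothetical), with cities as rows and years as columns, like the pivot table above.

```python
import pandas as pd

# Hypothetical toy pivot table: cities as rows, years as columns
toy = pd.DataFrame(
    {2000: [10.0, 20.0], 2001: [12.0, 22.0]},
    index=["city_a", "city_b"],
)

# Default axis="index": collapse the rows, giving one mean per column (year)
mean_by_year = toy.mean()

# axis="columns": collapse the columns, giving one mean per row (city)
mean_by_city = toy.mean(axis="columns")

print(mean_by_year)
print(mean_by_city)
```

Here `mean_by_year` is indexed by year and `mean_by_city` is indexed by city, so each result can be filtered with the boolean-condition pattern recalled earlier.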

In [6]:
# Get the worldwide mean temp by year
mean_temp_by_year = temp_by_country_city_vs_year.mean()

# Filter for the year that had the highest mean temp
print(mean_temp_by_year[mean_temp_by_year == mean_temp_by_year.max()])

# Get the mean temp by city
mean_temp_by_city = temp_by_country_city_vs_year.mean(axis="columns")

# Filter for the city that had the lowest mean temp
print(mean_temp_by_city[mean_temp_by_city == mean_temp_by_city.min()])

year
2013    20.312285
dtype: float64
country  city  
China    Harbin    4.876551
dtype: float64
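
As an alternative to filtering, pandas also offers `idxmax()` and `idxmin()`, which return the index label of the extreme value directly. This is not part of the exercise; the sketch below uses made-up values standing in for **mean_temp_by_year** to compare the two approaches.

```python
import pandas as pd

# Hypothetical toy values standing in for mean_temp_by_year
mean_by_year = pd.Series({2011: 19.8, 2012: 20.1, 2013: 20.3})

# idxmax() returns the label of the first occurrence of the maximum
hottest_year = mean_by_year.idxmax()
print(hottest_year, mean_by_year.loc[hottest_year])

# The filter from the exercise returns a Series instead of a single label
hottest_as_series = mean_by_year[mean_by_year == mean_by_year.max()]
print(hottest_as_series)
```

One difference to keep in mind: if several entries tie for the maximum, the filter keeps all of them, whereas `idxmax()` reports only the first.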
