In [1]:
import numpy as np
import pandas as pd

## Filling Missing Values with Group-Specific Values
When cleaning up missing data, in some cases you will remove data observations
using dropna, but in others you may want to fill in the null (NA) values using a
fixed value or some value derived from the data. fillna is the right tool to use;
for example, the cells below fill the null values with the mean of the Series:

In [2]:
s = pd.Series(np.random.standard_normal(6))
s[::2] = np.nan
s

0         NaN
1    1.132940
2         NaN
3    0.815862
4         NaN
5    1.574887
dtype: float64

In [3]:
s.fillna(s.mean())

0    1.174563
1    1.132940
2    1.174563
3    0.815862
4    1.174563
5    1.574887
dtype: float64

Suppose you need the fill value to vary by group, for example filling each missing value with the mean of its region (East or West).

In [4]:
states = ["Ohio", "New York", "Vermont", "Florida", "Oregon", "Nevada", "California", "Idaho"]

In [5]:
group_key = ["East", "East", "East", "East", "West", "West", "West", "West"]

In [6]:
data = pd.Series(np.random.standard_normal(8), index=states)
data[["Vermont", "Nevada", "Idaho"]] = np.nan
data

Ohio         -1.121787
New York      0.207882
Vermont            NaN
Florida       1.067554
Oregon        2.082142
Nevada             NaN
California    1.796159
Idaho              NaN
dtype: float64
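
One possible way to fill by group, sketched here under a few assumptions, is to compute each group's mean with groupby and transform, and then pass the result to fillna. The values are copied from the output above so that the example is reproducible; the names group_means and filled are only illustrative.

```python
import numpy as np
import pandas as pd

states = ["Ohio", "New York", "Vermont", "Florida",
          "Oregon", "Nevada", "California", "Idaho"]
group_key = ["East", "East", "East", "East",
             "West", "West", "West", "West"]

# Values copied from the displayed output (originally random draws)
data = pd.Series(
    [-1.121787, 0.207882, np.nan, 1.067554,
     2.082142, np.nan, 1.796159, np.nan],
    index=states,
)

# Mean of each group, broadcast back to the original index;
# NaN entries are skipped when the mean is computed
group_means = data.groupby(group_key).transform("mean")

# Replace each missing value with the mean of its own group
filled = data.fillna(group_means)
print(filled)
```

Using transform keeps the result aligned with the original index, so fillna can match each state to its group mean directly. Calling apply with a function that runs fillna on each group is another option, though the shape of the resulting index can depend on the pandas version.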

## Group Weighted Average

In [7]:
df = pd.DataFrame({"category": ["a", "a", "a", "a", "b", "b", "b", "b"],
                   "data": np.random.standard_normal(8),
                   "weights": np.random.uniform(size=8)})
df

  category      data   weights
0        a  2.074501  0.681613
1        a  2.790271  0.202006
2        a -0.687044  0.590698
3        a -1.992536  0.128009
4        b  1.386038  0.447775
5        b  0.438358  0.099912
6        b -0.235804  0.081070
7        b  0.369543  0.918887


Find the weighted average of data by category.
(Hint: group the rows with groupby and use np.average with its weights argument.)
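
A minimal sketch of one possible solution follows. It assumes the table values shown above (copied in so the result is reproducible), and the helper name get_wavg is hypothetical. The data and weights columns are selected explicitly so that the function only sees the columns it needs.

```python
import numpy as np
import pandas as pd

# Values copied from the displayed table (originally random draws)
df = pd.DataFrame({
    "category": ["a", "a", "a", "a", "b", "b", "b", "b"],
    "data": [2.074501, 2.790271, -0.687044, -1.992536,
             1.386038, 0.438358, -0.235804, 0.369543],
    "weights": [0.681613, 0.202006, 0.590698, 0.128009,
                0.447775, 0.099912, 0.081070, 0.918887],
})

def get_wavg(group):
    # Weighted average of one group's data column, using its weights column
    return np.average(group["data"], weights=group["weights"])

# One weighted average per category, indexed by category label
result = df.groupby("category")[["data", "weights"]].apply(get_wavg)
print(result)
```

np.average divides the weighted sum by the sum of the weights, so the weights within a category do not need to add up to 1.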