# Chapter 8-3: Updating Values with Boolean Masks - Exercise

## Question
One thing you'll find if you inspect the "bathrooms" column of the house sales dataset is that many of the houses have, somehow, a quarter of a bathroom. In other words, the value in the column is 1.25, 2.25, 3.25, etc.

Now in my opinion, the concept of a "quarter bathroom" is pretty darn ridiculous - I mean, what the heck are you even going to do in there?!

So your job for this exercise is to replace all of those quarter bathroom values with sensible equivalents (i.e., rounded down to the nearest whole number, so 2.25 becomes 2). I'll even give you a list of values to replace so you don't have to type:

[1.25, 2.25, 3.25, 4.25, 5.25, 6.25]

Now if this method of replacing a large number of values - that all meet the same specific criteria - seems inefficient, you're right! And we'll see how to fix that in the next video.
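For reference, here is a minimal sketch of the list-based approach, run on a tiny made-up DataFrame rather than the real dataset. The `toy` frame and its values are assumptions for illustration only; `isin` builds the Boolean mask from the list, and `.loc` updates just the masked rows.

```python
import pandas as pd

# Toy data standing in for the "bathrooms" column (hypothetical values)
toy = pd.DataFrame({"bathrooms": [1.0, 2.25, 1.5, 3.25, 2.0]})

quarter_values = [1.25, 2.25, 3.25, 4.25, 5.25, 6.25]

# Boolean mask: True where the value is one of the listed quarter values
mask = toy["bathrooms"].isin(quarter_values)

# Drop the extra quarter in the masked rows only
toy.loc[mask, "bathrooms"] = toy.loc[mask, "bathrooms"] - 0.25

print(toy["bathrooms"].tolist())
```

Rows that are not in the list (1.0, 1.5, 2.0) are left untouched because the mask is False for them.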

In [1]:
import pandas as pd

In [2]:
house_path = 'data/kc_house_data.csv'

In [3]:
house_sales = pd.read_csv(house_path)
house_sales.head(10)

Unnamed: 0,id,date,price,bedrooms,bathrooms,sqft_living,sqft_lot,floors,waterfront,view,...,grade,sqft_above,sqft_basement,yr_built,yr_renovated,zipcode,lat,long,sqft_living15,sqft_lot15
0,7129300520,20141013T000000,221900.0,3,1.0,1180,5650,1.0,0,0,...,7,1180,0,1955,0,98178,47.5112,-122.257,1340,5650
1,6414100192,20141209T000000,538000.0,3,2.25,2570,7242,2.0,0,0,...,7,2170,400,1951,1991,98125,47.721,-122.319,1690,7639
2,5631500400,20150225T000000,180000.0,2,1.0,770,10000,1.0,0,0,...,6,770,0,1933,0,98028,47.7379,-122.233,2720,8062
3,2487200875,20141209T000000,604000.0,4,3.0,1960,5000,1.0,0,0,...,7,1050,910,1965,0,98136,47.5208,-122.393,1360,5000
4,1954400510,20150218T000000,510000.0,3,2.0,1680,8080,1.0,0,0,...,8,1680,0,1987,0,98074,47.6168,-122.045,1800,7503
5,7237550310,20140512T000000,1225000.0,4,4.5,5420,101930,1.0,0,0,...,11,3890,1530,2001,0,98053,47.6561,-122.005,4760,101930
6,1321400060,20140627T000000,257500.0,3,2.25,1715,6819,2.0,0,0,...,7,1715,0,1995,0,98003,47.3097,-122.327,2238,6819
7,2008000270,20150115T000000,291850.0,3,1.5,1060,9711,1.0,0,0,...,7,1060,0,1963,0,98198,47.4095,-122.315,1650,9711
8,2414600126,20150415T000000,229500.0,3,1.0,1780,7470,1.0,0,0,...,7,1050,730,1960,0,98146,47.5123,-122.337,1780,8113
9,3793500160,20150312T000000,323000.0,3,2.5,1890,6560,2.0,0,0,...,7,1890,0,2003,0,98038,47.3684,-122.031,2390,7570


In [4]:
# Fractional part of each value (astype(int) truncates the decimals)
house_sales['bathrooms'] - house_sales['bathrooms'].astype(int)

0        0.00
1        0.25
2        0.00
3        0.00
4        0.00
         ... 
21608    0.50
21609    0.50
21610    0.75
21611    0.50
21612    0.75
Name: bathrooms, Length: 21613, dtype: float64

In [7]:
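# Whole-number part of every bathroom count
int_bathrooms = house_sales['bathrooms'].astype(int)
# True only where the fractional part is exactly a quarter
bathrooms_mask = (house_sales['bathrooms'] - int_bathrooms) == 0.25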
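To see what that mask is doing, here is a small sketch on a made-up Series (the values are hypothetical, not taken from the dataset). Comparing floats with `==` is usually risky, but it is safe here because 0.25 and its multiples are exactly representable in binary floating point.

```python
import pandas as pd

# Hypothetical bathroom counts for illustration
bathrooms = pd.Series([1.0, 2.25, 1.5, 3.25, 0.75])

# astype(int) truncates toward zero, leaving the whole-number part
whole = bathrooms.astype(int)

# What is left over is the fractional part
fraction = bathrooms - whole

# True only for the quarter-bathroom rows
mask = fraction == 0.25

print(whole.tolist())
print(fraction.tolist())
print(mask.tolist())
```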

In [9]:
house_sales[bathrooms_mask]

Unnamed: 0,id,date,price,bedrooms,bathrooms,sqft_living,sqft_lot,floors,waterfront,view,...,grade,sqft_above,sqft_basement,yr_built,yr_renovated,zipcode,lat,long,sqft_living15,sqft_lot15
1,6414100192,20141209T000000,538000.0,3,2.25,2570,7242,2.0,0,0,...,7,2170,400,1951,1991,98125,47.7210,-122.319,1690,7639
6,1321400060,20140627T000000,257500.0,3,2.25,1715,6819,2.0,0,0,...,7,1715,0,1995,0,98003,47.3097,-122.327,2238,6819
24,3814700200,20141120T000000,329000.0,3,2.25,2450,6500,2.0,0,0,...,8,2450,0,1985,0,98030,47.3739,-122.172,2200,6865
41,7766200013,20140811T000000,775000.0,4,2.25,4220,24186,1.0,0,0,...,8,2600,1620,1984,0,98166,47.4450,-122.347,2410,30617
54,4217401195,20150303T000000,920000.0,5,2.25,2730,6000,1.5,0,0,...,8,2130,600,1927,0,98105,47.6571,-122.281,2730,6000
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
21579,7011201004,20140529T000000,645000.0,3,3.25,1730,1229,2.0,0,2,...,9,1320,410,2008,0,98119,47.6374,-122.369,1710,1686
21582,3052700432,20141112T000000,490000.0,3,2.25,1500,1290,2.0,0,0,...,8,1220,280,2006,0,98117,47.6785,-122.375,1460,1375
21592,1931300412,20150416T000000,475000.0,3,2.25,1190,1200,3.0,0,0,...,8,1190,0,2008,0,98103,47.6542,-122.346,1180,1224
21595,1972201967,20141031T000000,520000.0,2,2.25,1530,981,3.0,0,0,...,8,1480,50,2006,0,98103,47.6533,-122.346,1530,1282


In [11]:
# Whole-number replacements for just the quarter-bathroom rows
updated_bathrooms = int_bathrooms[bathrooms_mask]

In [12]:
# Overwrite the masked rows; pandas matches the new values up by index label
house_sales.loc[bathrooms_mask, 'bathrooms'] = updated_bathrooms
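The same update can be traced end to end on a toy DataFrame (made-up values, assumed only for illustration). Note that the replacement Series keeps the original index labels of the masked rows, which is how `.loc` knows where each value belongs.

```python
import pandas as pd

# Hypothetical stand-in for the house sales data
toy = pd.DataFrame({"bathrooms": [1.0, 2.25, 1.5, 3.25]})

int_bathrooms = toy["bathrooms"].astype(int)
mask = (toy["bathrooms"] - int_bathrooms) == 0.25

# Only rows 1 and 3 survive the mask, and they keep those index labels
updated = int_bathrooms[mask]

# Assign back into the masked rows; values are aligned on the index
toy.loc[mask, "bathrooms"] = updated

print(updated.index.tolist())
print(toy["bathrooms"].tolist())
```

The column stays a float column, so the whole numbers show up as 2.0 and 3.0 rather than 2 and 3.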

In [13]:
house_sales['bathrooms'].value_counts()

bathrooms
2.50    5380
2.00    3977
1.00    3861
1.75    3048
1.50    1446
3.00    1342
2.75    1185
3.50     731
4.00     215
3.75     155
4.50     100
0.75      72
5.00      34
4.75      23
0.00      10
5.50      10
6.00       8
0.50       4
5.75       4
6.75       2
8.00       2
6.50       2
7.50       1
7.75       1
Name: count, dtype: int64
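
Scanning the counts by eye works, but a quick programmatic check is less error-prone. This is a sketch on a made-up, already-cleaned Series (the `cleaned` values are assumptions); on the real data you would run the same expression against `house_sales['bathrooms']`.

```python
import pandas as pd

# Hypothetical column after the quarter bathrooms have been replaced
cleaned = pd.Series([1.0, 2.0, 1.5, 3.0, 0.75])

# % 1 keeps only the fractional part of each value
remaining_quarters = int((cleaned % 1 == 0.25).sum())

print(remaining_quarters)
```

A result of zero means no value with a fractional part of 0.25 is left in the column.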