## This notebook contains:
### 01. Import libraries and data
### 02. Calculate descriptive statistics for each cluster
### 03. View cluster counts by location
### 04. Assess other cluster qualities: kitchen, flat type, state population category
### 05. See how many good deals there are in each cluster

# 01. Import libraries and data

In [1]:
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib
import os

In [2]:
# This command prompts matplotlib visuals to appear in the notebook

%matplotlib inline

In [3]:
path = r'C:\Users\jacym\Desktop\Career Foundry projects\german rent'

In [4]:
# Import rent data

df = pd.read_csv(os.path.join(path, '02 data', 'cleaned data', 'clustered_data.csv'), index_col = False)

In [5]:
df.head()

Unnamed: 0,totalRent,price/unit,livingSpace,picturecount,yearConstructed,noRooms,population,populationTrend,populationDensity,districtPopTrend,...,garden,baseRentRange,noRoomsRange,livingSpaceRange,yearConstructedRange,regio2,regio3,description,date,popTrendCat
0,840.0,9.767442,86.0,6,1965.0,4.0,17935147,-0.02,526,0.33,...,True,4,4,4,2.0,Dortmund,Schüren,Die ebenerdig zu erreichende Erdgeschosswohnun...,2019-05-10,stable
1,1320.65,15.179885,87.0,12,2018.0,3.0,17935147,-0.02,526,0.33,...,False,6,3,4,9.0,Dortmund,Kirchhörde,Der attraktive Neubau mit 10 Wohnungen liegt i...,2019-05-10,stable
2,493.8,7.964516,62.0,0,1958.0,2.0,17935147,-0.02,526,0.33,...,False,2,2,3,2.0,Dortmund,Innenstadt,"Wohnraum, Schlafraum, Küche, Diele und Bad",2020-02-01,stable
3,460.0,8.363636,55.0,14,1930.0,2.0,17935147,-0.02,526,0.33,...,True,2,2,2,1.0,Dortmund,Derne,"Altbau, mehrfach saniert, einfache Ausstattung...",2019-05-10,stable
4,2205.0,14.898649,148.0,32,2015.0,3.0,17935147,-0.02,526,0.33,...,False,8,3,6,8.0,Dortmund,Lücklemberg,Qi – das innovative Wohnresort an der Olpketal...,2019-05-10,stable


In [6]:
df.shape

(171371, 30)

In [32]:
# Count unique listings (scoutId) in each cluster
df.groupby(['cluster']).agg({'scoutId': ['nunique']})

cluster,scoutId (nunique)
bigcity,11026
budget,48285
midhigh,42049
midlow,41776
upscale,28235


# 02. Calculate descriptive statistics for each cluster

In [7]:
df.groupby('cluster').agg({'totalRent': ['mean', 'std'],
                           'price/unit': ['mean', 'std'],
                           'livingSpace': ['mean', 'std']})

cluster,totalRent (mean),totalRent (std),price/unit (mean),price/unit (std),livingSpace (mean),livingSpace (std)
bigcity,1356.059561,817.827508,16.814508,5.45839,80.683281,37.780991
budget,483.675904,162.681281,7.975155,1.456428,61.061406,17.657399
midhigh,756.221164,295.468087,13.548828,6.105228,59.363078,19.169273
midlow,683.349209,224.821659,9.93994,2.363322,69.561887,18.331846
upscale,1611.137709,780.246614,13.889768,5.507451,118.560051,34.404286


In [8]:
# Store the summary statistics and copy them to the clipboard
stats = df.groupby('cluster').agg({'totalRent': ['mean', 'std'],
                                   'price/unit': ['mean', 'std'],
                                   'livingSpace': ['mean', 'std']})
stats.to_clipboard()

# 03. View cluster counts by location

In [30]:
df.groupby(['cluster', 'regio1']).agg({'scoutId': ['nunique']})

cluster,regio1,scoutId (nunique)
bigcity,Berlin,8363
bigcity,Hamburg,2663
budget,Baden_Württemberg,9
budget,Bayern,6
budget,Brandenburg,2944
budget,Bremen,16
budget,Hessen,54
budget,Mecklenburg_Vorpommern,1736
budget,Niedersachsen,490
budget,Nordrhein_Westfalen,5


In [31]:
# Copy to clipboard to review locations in Excel
cluster_locs = df.groupby(['cluster', 'regio1']).agg({'scoutId': ['nunique']})
cluster_locs.to_clipboard()

# 04. Assess other cluster qualities: kitchen, flat type, state population category

In [15]:
df.groupby(['cluster', 'hasKitchen']).agg({'hasKitchen': ['count']})

cluster,hasKitchen,count
bigcity,False,4178
bigcity,True,6848
budget,False,40214
budget,True,8071
midhigh,False,21634
midhigh,True,20415
midlow,False,33057
midlow,True,8719
upscale,False,13713
upscale,True,14522


Most likely to have a kitchen: big city units (about 62%), followed by upscale units (about 51%). Least likely: budget units (about 17%).
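
Raw counts are hard to compare across clusters of different sizes, so it can help to convert them to shares. The following is a minimal sketch that uses only the counts from the table above; the names `counts` and `kitchen_share` are illustrative and do not appear in the notebook.

```python
import pandas as pd

# Counts copied from the cluster/hasKitchen table above.
counts = pd.DataFrame(
    {'no_kitchen': [4178, 40214, 21634, 33057, 13713],
     'kitchen': [6848, 8071, 20415, 8719, 14522]},
    index=['bigcity', 'budget', 'midhigh', 'midlow', 'upscale'],
)

# Share of units with a kitchen = kitchen count / total units in the cluster.
kitchen_share = counts['kitchen'] / counts.sum(axis=1)

# Sort so the cluster with the highest share comes first.
print(kitchen_share.sort_values(ascending=False).round(3))
```

On the full data, `df.groupby('cluster')['hasKitchen'].mean()` should give the same shares, assuming `hasKitchen` is stored as a boolean column.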

In [16]:
df.groupby(['cluster', 'typeOfFlat'], dropna = True).agg({'typeOfFlat': ['count']})

cluster,typeOfFlat,count
bigcity,apartment,7021
bigcity,ground_floor,1088
bigcity,half_basement,28
bigcity,loft,49
bigcity,maisonette,331
bigcity,other,213
bigcity,penthouse,199
bigcity,raised_ground_floor,223
bigcity,roof_storey,999
bigcity,terraced_flat,179


In [17]:
df.groupby(['cluster', 'popTrendCat']).agg({'popTrendCat': ['count']})

cluster,popTrendCat,count
bigcity,increase,11026
budget,decrease,47281
budget,increase,157
budget,stable,847
midhigh,decrease,12161
midhigh,increase,21543
midhigh,stable,8345
midlow,increase,5375
midlow,stable,36401
upscale,decrease,4116


In [None]:
# Big city units are all in states with increasing populations.
# Midlow units are mostly in states with stable populations.
# Upscale units are mostly in states with increasing or stable populations.
# Budget units are mostly in states with decreasing populations.
# Midhigh units are the most spread out across the three categories.
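
These comparisons are easier to check as row-normalized shares than as raw counts. The sketch below shows one way to do that with `pd.crosstab`; the `toy` DataFrame holds invented rows purely for illustration and only borrows the notebook's column names.

```python
import pandas as pd

# Toy stand-in for df: invented rows, used only to show the technique.
toy = pd.DataFrame({
    'cluster': ['budget', 'budget', 'budget', 'budget', 'midlow', 'midlow'],
    'popTrendCat': ['decrease', 'decrease', 'decrease', 'stable',
                    'stable', 'increase'],
})

# normalize='index' divides each row by its total, so every cluster's
# shares sum to 1 regardless of how many units the cluster contains.
shares = pd.crosstab(toy['cluster'], toy['popTrendCat'], normalize='index')
print(shares)
```

Applied to the notebook's data, the equivalent call would be `pd.crosstab(df['cluster'], df['popTrendCat'], normalize='index')`.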

# 05. See how many good deals there are in each cluster

In [18]:
upscale = df[df['cluster'] == 'upscale']
bigcity = df[df['cluster'] == 'bigcity']
midhigh = df[df['cluster'] == 'midhigh']
midlow = df[df['cluster'] == 'midlow']
budget = df[df['cluster'] == 'budget']


In [19]:
df['cluster'].value_counts()

budget     48285
midhigh    42049
midlow     41776
upscale    28235
bigcity    11026
Name: cluster, dtype: int64

I'm defining a good deal as an apartment whose total rent is at least half a standard deviation below the average total rent for its cluster. To see how many good deals there are in each cluster, search each subset for rows with totalRent less than or equal to that cluster's good deal threshold (mean minus 0.5 times the standard deviation).
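
As a hedged sketch of this definition, the block below first derives the thresholds from the per-cluster means and standard deviations reported in section 02, then shows one way to flag and count good deals without hard-coding the cutoffs. The `toy` DataFrame and the names `thresholds`, `cutoff`, and `good_deal` are assumptions for illustration; the toy rents are invented.

```python
import pandas as pd

# Mean and std of totalRent per cluster, copied from the section 02 output.
stats = pd.DataFrame(
    {'mean': [1356.059561, 483.675904, 756.221164, 683.349209, 1611.137709],
     'std': [817.827508, 162.681281, 295.468087, 224.821659, 780.246614]},
    index=['bigcity', 'budget', 'midhigh', 'midlow', 'upscale'],
)

# Good deal threshold: half a standard deviation below the cluster mean.
thresholds = (stats['mean'] - 0.5 * stats['std']).round(2)
print(thresholds)

# Toy listings (invented values) to show the per-row flag and the count.
toy = pd.DataFrame({
    'cluster': ['budget', 'budget', 'budget', 'upscale', 'upscale', 'upscale'],
    'totalRent': [300.0, 500.0, 700.0, 1000.0, 1600.0, 2200.0],
})

# transform() broadcasts each cluster's statistic back onto its rows.
grouped = toy.groupby('cluster')['totalRent']
cutoff = grouped.transform('mean') - 0.5 * grouped.transform('std')
toy['good_deal'] = toy['totalRent'] <= cutoff

# Summing a boolean column counts the True values per cluster.
good_deal_counts = toy.groupby('cluster')['good_deal'].sum()
print(good_deal_counts)
```

Computing the cutoff from the data keeps the thresholds in sync if the clusters are ever recomputed, and the final `groupby(...).sum()` answers the "how many" question directly instead of relying on the row count of a displayed subset.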

In [29]:
df.groupby('cluster').agg({'totalRent':['mean', 'std']})

cluster,totalRent (mean),totalRent (std)
bigcity,1356.059561,817.827508
budget,483.675904,162.681281
midhigh,756.221164,295.468087
midlow,683.349209,224.821659
upscale,1611.137709,780.246614


In [25]:
# Upscale good deal threshold: 1611.14 - 0.5 * 780.25 = 1221.01
upscale[upscale['totalRent'] <= 1221.01]

Unnamed: 0,totalRent,price/unit,livingSpace,picturecount,yearConstructed,noRooms,population,populationTrend,populationDensity,districtPopTrend,...,garden,baseRentRange,noRoomsRange,livingSpaceRange,yearConstructedRange,regio2,regio3,description,date,popTrendCat
138,1034.9,8.770339,118.00,12,1993.0,4.0,17935147,-0.02,526,0.33,...,False,5,4,5,5.0,Dortmund,Sölderholz,Die Wohnung liegt im Dachgeschoss eines freund...,2019-10-08,stable
162,1086.5,10.396134,104.51,14,2001.0,4.0,17935147,-0.02,526,0.33,...,True,6,4,5,6.0,Dortmund,Innenstadt,Die Wohnung befindet sich im 3. OG eines im Ja...,2019-05-10,stable
221,779.0,11.803030,66.00,37,1959.0,3.0,17935147,-0.02,526,0.33,...,True,5,3,3,2.0,Dortmund,Aplerbecker_Mark,Neu sanierte Erdgeschoßwohnung mit eigenem Mie...,2019-10-08,stable
418,1191.0,12.153061,98.00,16,1960.0,4.5,17935147,-0.02,526,0.33,...,False,6,4,4,2.0,Dortmund,Innenstadt,"Helle, gut aufgeteilte 4,5 Raum-Wohnung mit Ba...",2019-10-08,stable
522,1075.0,10.539216,102.00,23,1985.0,3.0,17935147,-0.02,526,0.33,...,False,6,3,5,4.0,Dortmund,Lücklemberg,"Das Objekt ist ein altes Fachwerkhaus, das Lan...",2018-09-22,stable
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
171333,1020.0,9.807692,104.00,14,2013.0,4.0,987129,-0.40,384,0.14,...,True,6,4,5,8.0,Merzig_Wadern_Kreis,Beckingen,"Sehr moderne, hochwertig ausgestattete helle\n...",2018-09-22,decrease
171343,800.0,6.153846,130.00,19,1961.0,4.0,987129,-0.40,384,0.14,...,False,4,4,6,2.0,Merzig_Wadern_Kreis,Merzig,Es handelt sich um eine großzüge 4-Zimmer-Wohn...,2019-05-10,decrease
171349,1190.0,7.933333,150.00,13,2000.0,4.0,987129,-0.40,384,-0.20,...,True,6,4,6,5.0,Sankt_Wendel_Kreis,Marpingen,Am Ende einer verkehrsberuhigten Sackgasse mit...,2020-02-01,decrease
171357,980.0,6.533333,150.00,12,1995.0,4.5,987129,-0.40,384,-0.20,...,True,5,4,6,5.0,Sankt_Wendel_Kreis,Sankt_Wendel,1995 moderner Totalumbau eines Hauses von 1910...,2018-09-22,decrease


In [26]:
# Big city good deal threshold: 1356.06 - 0.5 * 817.83 = 947.15
bigcity[bigcity['totalRent'] <= 947.15]

Unnamed: 0,totalRent,price/unit,livingSpace,picturecount,yearConstructed,noRooms,population,populationTrend,populationDensity,districtPopTrend,...,garden,baseRentRange,noRoomsRange,livingSpaceRange,yearConstructedRange,regio2,regio3,description,date,popTrendCat
137825,760.00,13.598139,55.89,9,1953.0,3.0,1846970,0.36,2446,0.90,...,False,5,3,2,2.0,Hamburg,Horn,Freuen Sie sich auf diese grundsanierte Wohnun...,2019-05-10,increase
137827,780.00,19.500000,40.00,13,1995.0,2.0,1846970,0.36,2446,0.90,...,True,4,2,1,5.0,Hamburg,Stellingen,Schöne 2 Zimmer Wohnung in Hamburg-Stellingen....,2018-09-22,increase
137829,832.00,15.757576,52.80,12,1972.0,2.0,1846970,0.36,2446,0.90,...,False,4,2,2,3.0,Hamburg,Heimfeld,In gefragter und attraktiver Lage in Hamburg-E...,2018-09-22,increase
137834,726.00,15.861918,45.77,10,1968.0,1.0,1846970,0.36,2446,0.90,...,False,3,1,2,2.0,Hamburg,Poppenbüttel,Bitte übermitteln Sie uns vorab Ihrerseits übe...,2020-02-01,increase
137844,860.00,16.226415,53.00,19,2019.0,2.0,1846970,0.36,2446,0.90,...,False,5,2,2,9.0,Hamburg,Bergedorf,Hier im Stadtteil Bergedorf wächst eines der a...,2019-10-08,increase
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
165190,945.00,10.110196,93.47,14,1994.0,4.0,3659468,0.39,4103,1.04,...,False,5,4,4,5.0,Berlin,Köpenick_Köpenick,Die ruhige und gut geschnittene Wohnung befin...,2018-09-22,increase
165193,630.00,11.052632,57.00,9,1996.0,1.0,3659468,0.39,4103,1.04,...,False,3,1,2,5.0,Berlin,Friedrichshagen_Köpenick,"1-Zimmer-Wohnung, DG links\n\n\nGröße: 57 m² ...",2019-05-10,increase
165194,850.00,14.209295,59.82,10,1910.0,2.0,3659468,0.39,4103,1.04,...,False,5,2,2,1.0,Berlin,Spandau_Spandau,Extras:\n- Treppenhausreinigung\n- Hausmeister...,2018-09-22,increase
165196,919.00,32.302285,28.45,13,2019.0,1.0,3659468,0.39,4103,1.04,...,True,6,1,1,9.0,Berlin,Charlottenburg_Charlottenburg,Bei dem Objekt handelt es sich um einen energi...,2020-02-01,increase


In [28]:
# Midhigh good deal threshold: 756.22 - 0.5 * 295.47 = 608.49
midhigh[midhigh['totalRent'] <= 608.49]

Unnamed: 0,totalRent,price/unit,livingSpace,picturecount,yearConstructed,noRooms,population,populationTrend,populationDensity,districtPopTrend,...,garden,baseRentRange,noRoomsRange,livingSpaceRange,yearConstructedRange,regio2,regio3,description,date,popTrendCat
6720,550.0,24.875622,22.11,6,1991.0,1.0,17935147,-0.02,526,0.30,...,False,2,1,1,5.0,Mettmann_Kreis,Langenfeld_Rheinland,Hochwertig möbliertes Apartment im Zentrum von...,2018-09-22,stable
8256,400.0,22.222222,18.00,6,1991.0,1.0,17935147,-0.02,526,0.29,...,False,2,1,1,5.0,Essen,Kupferdreh,Bei unserem Objekt handelt es sich um ein anse...,2018-09-22,stable
8746,485.0,20.208333,24.00,24,1962.0,1.0,17935147,-0.02,526,0.29,...,False,3,1,1,2.0,Essen,Nordviertel,Moderne Studentenzimmer in unmittelbarer Nähe ...,2019-10-08,stable
10847,350.0,21.875000,16.00,10,1958.0,1.0,17935147,-0.02,526,0.29,...,False,1,1,1,2.0,Essen,Nordviertel,Dein neues Zuhause liegt im 2. OG eines gepfle...,2020-02-01,stable
12761,299.0,20.000000,14.95,14,1920.0,1.0,17935147,-0.02,526,0.49,...,False,1,1,1,1.0,Wuppertal,Oberbarmen,Wohngemeinschaft in gepflegter WG in einer ehe...,2020-02-01,stable
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
170648,405.0,18.409091,22.00,11,1996.0,1.0,987129,-0.40,384,0.17,...,False,2,1,1,5.0,Stadtverband_Saarbrücken_Kreis,Saarbrücken,Schickes Einzimmer-Appartment!\nGehobener Inne...,2018-09-22,decrease
170670,450.0,17.744479,25.36,13,1973.0,1.0,987129,-0.40,384,0.17,...,False,1,1,1,3.0,Stadtverband_Saarbrücken_Kreis,Saarbrücken,Schönes 1 Zimmer Apartment in der Saarbrücker ...,2020-02-01,decrease
170724,400.0,17.391304,23.00,8,1991.0,1.0,987129,-0.40,384,0.17,...,False,1,1,1,5.0,Stadtverband_Saarbrücken_Kreis,Saarbrücken,Das Appartment liegt im Erdgeschoss einer Stud...,2019-05-10,decrease
170828,440.0,17.600000,25.00,4,1971.0,1.0,987129,-0.40,384,0.17,...,False,2,1,1,3.0,Stadtverband_Saarbrücken_Kreis,Saarbrücken,Mehrfamilienhaus,2020-02-01,decrease


In [24]:
# Midlow good deal threshold: 683.35 - 0.5 * 224.82 = 570.94
midlow[midlow['totalRent'] <= 570.94]

Unnamed: 0,totalRent,price/unit,livingSpace,picturecount,yearConstructed,noRooms,population,populationTrend,populationDensity,districtPopTrend,...,garden,baseRentRange,noRoomsRange,livingSpaceRange,yearConstructedRange,regio2,regio3,description,date,popTrendCat
2,493.80,7.964516,62.00,0,1958.0,2.0,17935147,-0.02,526,0.33,...,False,2,2,3,2.0,Dortmund,Innenstadt,"Wohnraum, Schlafraum, Küche, Diele und Bad",2020-02-01,stable
3,460.00,8.363636,55.00,14,1930.0,2.0,17935147,-0.02,526,0.33,...,True,2,2,2,1.0,Dortmund,Derne,"Altbau, mehrfach saniert, einfache Ausstattung...",2019-05-10,stable
5,525.00,7.835821,67.00,6,1965.0,2.0,17935147,-0.02,526,0.33,...,False,3,2,3,2.0,Dortmund,Innenstadt,Die Wohnung befindet sich in einem gepflegtem ...,2019-05-10,stable
8,522.09,12.198364,42.80,7,1954.0,2.5,17935147,-0.02,526,0.33,...,False,2,2,2,2.0,Dortmund,Eving,Die Wohnung befindet sich in einem gepflegten ...,2018-09-22,stable
12,480.00,10.400867,46.15,6,1937.0,2.0,17935147,-0.02,526,0.33,...,True,2,2,2,1.0,Dortmund,Innenstadt,Das Mehrfamilienhaus besteht aus sechs Wohnein...,2019-10-08,stable
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
137812,350.00,6.034483,58.00,5,1984.0,2.0,13113880,0.35,186,-0.34,...,False,1,2,2,4.0,Kronach_Kreis,Steinwiesen,"Diese attraktive, gepflegte Wohnung kann zum 0...",2019-10-08,increase
137816,280.00,7.567568,37.00,3,2007.0,1.0,13113880,0.35,186,0.06,...,False,1,1,1,7.0,Neustadt_a.d._Waldnaab_Kreis,Vohenstrauß,Gewerberäume sind ab sofort anmietbar. Lagerrä...,2019-10-08,increase
137817,570.00,6.705882,85.00,4,2000.0,3.0,13113880,0.35,186,0.06,...,False,3,3,4,5.0,Neustadt_a.d._Waldnaab_Kreis,Tännesberg,"Diese ansprechende, modernisierte Wohnung im z...",2018-09-22,increase
137818,395.00,5.302725,74.49,7,1960.0,3.0,13113880,0.35,186,0.06,...,False,2,3,3,2.0,Neustadt_a.d._Waldnaab_Kreis,Eschenbach_in_der_Oberpfalz,Das Objekt befindet sich an einer normal befah...,2019-10-08,increase


In [27]:
# Budget good deal threshold: 483.68 - 0.5 * 162.68 = 402.34
budget[budget['totalRent'] <= 402.34]

Unnamed: 0,totalRent,price/unit,livingSpace,picturecount,yearConstructed,noRooms,population,populationTrend,populationDensity,districtPopTrend,...,garden,baseRentRange,noRoomsRange,livingSpaceRange,yearConstructedRange,regio2,regio3,description,date,popTrendCat
40207,360.0,5.294896,67.99,2,1960.0,3.0,4092379,0.08,206,0.06,...,True,1,3,3,2.0,Kusel_Kreis,Kusel,Schöne Dreizimmerwohnung in Sechsfamilienhaus...,2019-05-10,stable
40222,375.0,5.820270,64.43,6,1964.0,3.0,4092379,0.08,206,0.06,...,False,1,3,3,2.0,Kusel_Kreis,Kusel,Schöne Dreizimmerwohnung zu vermieten. Die Woh...,2018-09-22,stable
40277,395.0,7.452830,53.00,11,1933.0,2.0,4092379,0.08,206,0.29,...,False,2,2,2,1.0,Bad_Dürkheim_Kreis,Bad_Dürkheim,Ab sofort zu besichtigen! . Vom Eigentümer ist...,2020-02-01,stable
41163,330.0,4.925373,67.00,4,1952.0,2.0,4092379,0.08,206,0.16,...,False,1,2,3,2.0,Pirmasens,Stadtmitte,"Diese attraktive, gepflegte DG-Wohnung kann ab...",2019-05-10,stable
41178,360.0,5.806452,62.00,6,1920.0,3.0,4092379,0.08,206,0.16,...,False,1,3,3,1.0,Pirmasens,Stadtmitte,Diese Wohnung befindet sich im 1. Obergeschoss...,2019-10-08,stable
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
171299,345.0,8.625000,40.00,6,1952.0,1.0,987129,-0.40,384,-0.08,...,False,1,1,1,2.0,Neunkirchen_Kreis,Neunkirchen,"In diese frisch gestrichene Wohnung, die im zw...",2019-10-08,decrease
171320,372.0,7.018868,53.00,5,1920.0,2.0,987129,-0.40,384,-0.08,...,False,1,2,2,1.0,Neunkirchen_Kreis,Neunkirchen,Objektbeschreibung\nNeunkirchen-Am Hirschberg:...,2020-02-01,decrease
171324,372.0,8.086957,46.00,3,1955.0,2.0,987129,-0.40,384,-0.08,...,True,2,2,2,2.0,Neunkirchen_Kreis,Neunkirchen,Moderne Mietwohnung mit schönem Gartengrundstü...,2020-02-01,decrease
171354,375.0,7.653061,49.00,6,1996.0,2.0,987129,-0.40,384,-0.20,...,False,1,2,2,5.0,Sankt_Wendel_Kreis,Nonnweiler,In einem 1996 erbauten Mehrparteienhaus in Non...,2019-10-08,decrease
