# Project Urban Rooftop Classification: First Notebook: Addresses and Lat/Long

From the screenshot to the classification 

### Geospatial dataset: Addresses

To build a low-cost geospatial database for rooftop classification from scratch (unfortunately not completely free: the Google Maps Static API costs 2 USD per 1000 images), we first need the addresses of the roofs. Fortunately, addresses (street names and house numbers) are available for free on different websites such as

[Berliner Straßen](https://berlin.kauperts.de/Strassen/Alt-Lankwitz-12247-Berlin)

Furthermore, we want the Lat/Long coordinates of the buildings, which can be obtained, for example, here:

[Address2LatLong](https://www.latlong.net/convert-address-to-lat-long.html) via web scraping

Alternatively, the coordinates can be obtained via geopy. A good tutorial on how to use geopy can be found here:
[GeoPy](https://www.askpython.com/python/python-geopy-to-find-geocode-of-an-address)


In this notebook I will explore the GeoPy library. Let's start with one address as an example: Ritterstr. 12, 10969 Berlin (Spiced Academy)



In [1]:
# In case geopy is not installed yet run:
#!pip install geopy

In [2]:
# importing the needed libraries:
from geopy.geocoders import Nominatim
import pandas as pd
import os


In [3]:
# geopy supports several geocoding services, among them ArcGIS, Google Maps, and Nominatim.
# Google Maps needs your own API key, whereas Nominatim (OpenStreetMap) can be queried without one.
geolocator = Nominatim(user_agent="http")
# location is an object that holds the spatial information found for the query
location = geolocator.geocode("12 Ritterstraße Berlin 10969")
print(location.address)  # the address as found by the geolocator
print((location.latitude, location.longitude))  # the corresponding latitude and longitude


Yafo, 12, Ritterstraße, Luisenstadt, Kreuzberg, Friedrichshain-Kreuzberg, Berlin, 10969, Deutschland
(52.5020453, 13.4108962)
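
The lookup above assumes that the geocoder finds a match. geopy's `geocode()` returns `None` when nothing is found, so a small wrapper can make that case explicit. The following is a minimal sketch, not part of the original notebook: `lookup_lat_long` and `fake_geocode` are hypothetical names, and `fake_geocode` is only an offline stand-in so the example runs without a network connection.

```python
from types import SimpleNamespace
from typing import Callable, Optional, Tuple


def lookup_lat_long(address: str, geocode: Callable) -> Optional[Tuple[float, float]]:
    """Return (latitude, longitude) for an address, or None if there is no match."""
    location = geocode(address)
    if location is None:  # geopy's geocode() returns None when nothing is found
        return None
    return (location.latitude, location.longitude)


def fake_geocode(address: str):
    """Hypothetical offline stand-in for geolocator.geocode (demonstration only)."""
    known = {
        "12 Ritterstraße Berlin 10969": SimpleNamespace(
            latitude=52.5020453, longitude=13.4108962
        )
    }
    return known.get(address)


print(lookup_lat_long("12 Ritterstraße Berlin 10969", fake_geocode))
print(lookup_lat_long("no such address", fake_geocode))
```

With a real geolocator the call would presumably look like `lookup_lat_long("12 Ritterstraße Berlin 10969", geolocator.geocode)`.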


## Building addresses

That worked well for one example. Let's see if we can generate a bunch of addresses with the help of [Berliner Straßen](https://berlin.kauperts.de/Strassen/Alt-Lankwitz-12247-Berlin)


For my project I am planning to investigate 3 to 5 streets in Berlin with different urban/suburban attributes:

- Suburban: Alt-Lankwitz-12247-Berlin, Nr. 2-108
- Urban: Ritterstraße-10969-Berlin, Nr. 1-127
- Mixed urban/suburban: Attilastr-12105-Berlin, Nr. 1-68 and 108-180 (Attilastr. crosses two districts of Berlin, Steglitz and Tempelhof; I chose Tempelhof because it has a good mix of single-family and multi-family houses)
- Mixed urban/suburban: Steglitzer-Damm-12169-Berlin, Nr. 1-128

In [6]:
# Generating suburban addresses for one street
StrName = "Alt-Lankwitz"  # street name; could be replaced by user input
PLZ = "12247"  # ZIP code; could be replaced by user input
HNum = list(range(2, 109))  # house numbers 2-108
AddressesSUrb = []  # list of address strings
for item in HNum:
    # build the address string: house number, street name, ZIP code, city
    Address = f"{item} {StrName} {PLZ} Berlin"
    AddressesSUrb.append(Address)
# check that the address format is alright
print(f'Very good, we created suburban addresses in the format: House# StrName ZIP-Code City: {AddressesSUrb[0]}')
print(f'You created {len(AddressesSUrb)} addresses')

Very good, we created suburban addresses in the format: House# StrName ZIP-Code City: 2 Alt-Lankwitz 12247 Berlin
You created 107 addresses


In [7]:
# Now we can do the same with the other streets: Attilastraße-12105-Berlin, Nr. 1-68, 108-180
StrName = "Attilastraße"  # street name; could be replaced by user input
PLZ = "12105"  # ZIP code; could be replaced by user input
HNum = list(range(1, 69))  # house numbers 1-68
HNum_cont = list(range(108, 181))  # house numbers 108-180
HNum.extend(HNum_cont)  # note: the house numbers are not contiguous here (it's a Berlin issue)
AddressesSUrb2 = []  # list of address strings
for item in HNum:
    # build the address string: house number, street name, ZIP code, city
    Address = f"{item} {StrName} {PLZ} Berlin"
    AddressesSUrb2.append(Address)
# check that the address format is alright
print(f'Very good, we created suburban addresses in the format: House# StrName ZIP-Code City: {AddressesSUrb2[0]}')
print(f'You created {len(AddressesSUrb2)} addresses')

Very good, we created suburban addresses in the format: House# StrName ZIP-Code City: 1 Attilastraße 12105 Berlin
You created 141 addresses


In [8]:
# Now we can do the same with the other streets: mixed urban/suburban street Steglitzer-Damm-12169-Berlin, Nr. 1-128
StrName = "Steglitzer Damm"  # street name; could be replaced by user input
PLZ = "12169"  # ZIP code; could be replaced by user input
HNum = list(range(1, 129))  # house numbers 1-128
AddressesSUrbMix = []  # list of address strings
for item in HNum:
    # build the address string: house number, street name, ZIP code, city
    Address = f"{item} {StrName} {PLZ} Berlin"
    AddressesSUrbMix.append(Address)

# check that the address format is alright
print(f'Very good, we created SuburbanMix addresses in the format: House# StrName ZIP-Code City: {AddressesSUrbMix[0]}')
print(f'You created {len(AddressesSUrbMix)} addresses')

Very good, we created SuburbanMix addresses in the format: House# StrName ZIP-Code City: 1 Steglitzer Damm 12169 Berlin
You created 128 addresses


In [9]:
# Now we can do the same with the other streets: urban street Ritterstraße-10969-Berlin, Nr. 1-127
StrName = "Ritterstraße"  # street name; could be replaced by user input
PLZ = "10969"  # ZIP code; could be replaced by user input
HNum = list(range(1, 125))  # house numbers 1-124; note that range(1, 125) stops short of the listed Nr. 1-127
AddressesUrban = []  # list of address strings
for item in HNum:
    # build the address string: house number, street name, ZIP code, city
    Address = f"{item} {StrName} {PLZ} Berlin"
    AddressesUrban.append(Address)

# check that the address format is alright
print(f'Very good, we created Urban addresses in the format: House# StrName ZIP-Code City: {AddressesUrban[0]}')
print(f'You created {len(AddressesUrban)} addresses')

Very good, we created Urban addresses in the format: House# StrName ZIP-Code City: 1 Ritterstraße 10969 Berlin
You created 124 addresses


In [10]:
# Let's check how many addresses we have in total:
total = len(AddressesUrban) + len(AddressesSUrb) + len(AddressesSUrb2) + len(AddressesSUrbMix)
print(f'You created a total of {total} addresses')

You created a total of 500 addresses


In [11]:
# Let's put them all together into one single list
Adresses = []
Adresses.extend(AddressesUrban)
Adresses.extend(AddressesSUrb)
Adresses.extend(AddressesSUrb2)
Adresses.extend(AddressesSUrbMix)
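
The per-street cells above all follow the same pattern, so the address generation could also be driven by a small table of street definitions. The following is a minimal sketch under that assumption: `build_addresses`, `STREETS`, and `all_addresses` are hypothetical names that do not appear in the notebook, and the house-number ranges are the ones actually used in the cells above (written as inclusive ranges).

```python
from typing import Iterable, List, Tuple


def build_addresses(
    street: str,
    zip_code: str,
    ranges: Iterable[Tuple[int, int]],
    city: str = "Berlin",
) -> List[str]:
    """Build 'House# Street ZIP City' strings for inclusive house-number ranges."""
    addresses = []
    for first, last in ranges:
        for number in range(first, last + 1):
            addresses.append(f"{number} {street} {zip_code} {city}")
    return addresses


# Street definitions taken from the cells above (ranges are inclusive).
STREETS = [
    ("Ritterstraße", "10969", [(1, 124)]),
    ("Alt-Lankwitz", "12247", [(2, 108)]),
    ("Attilastraße", "12105", [(1, 68), (108, 180)]),
    ("Steglitzer Damm", "12169", [(1, 128)]),
]

all_addresses = []
for street, zip_code, ranges in STREETS:
    all_addresses.extend(build_addresses(street, zip_code, ranges))

print(len(all_addresses))
print(all_addresses[0])
```

Keeping the streets in one list means that adding a fifth street only requires one more tuple, and non-contiguous numbering such as on Attilastraße is expressed as several ranges.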

### Next we will get the Lat/Long coordinates with location.latitude and location.longitude

In [12]:
# initiating the geolocator Nominatim
geolocator = Nominatim(user_agent="http")

# initialize the result lists
LatCor = []
LongCor = []
# Unfortunately we have to loop through the addresses one by one due to the request limitations
# of the service (this took 4 min 15.5 sec; don't let the computer go into sleep mode).
for address in Adresses:
    location = geolocator.geocode(address)
    if location is None:
        # no match found: store None so the lists stay aligned with Adresses
        LatCor.append(None)
        LongCor.append(None)
        continue
    LatCor.append(location.latitude)
    LongCor.append(location.longitude)
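
Because the geocoding service limits how fast requests may be sent, the loop can be wrapped in a helper that pauses between calls and tolerates addresses without a match. The following is a minimal sketch, not the notebook's own code: `geocode_all` is a hypothetical name, and the one-second default delay is an assumption based on the public Nominatim usage policy. geopy also ships a `RateLimiter` class in `geopy.extra.rate_limiter` for the same purpose; the sketch uses plain `time.sleep` to stay dependency-free.

```python
import time
from typing import Callable, List, Optional, Tuple


def geocode_all(
    addresses: List[str],
    geocode: Callable,
    delay_seconds: float = 1.0,
) -> Tuple[List[Optional[float]], List[Optional[float]]]:
    """Geocode addresses one by one, pausing between requests.

    Addresses without a match get None for both coordinates, so the
    result lists stay aligned with the input list.
    """
    latitudes: List[Optional[float]] = []
    longitudes: List[Optional[float]] = []
    for index, address in enumerate(addresses):
        if index > 0:
            time.sleep(delay_seconds)  # throttle requests to the geocoding service
        location = geocode(address)
        latitudes.append(location.latitude if location is not None else None)
        longitudes.append(location.longitude if location is not None else None)
    return latitudes, longitudes
```

In the notebook this would presumably be called as `LatCor, LongCor = geocode_all(Adresses, geolocator.geocode)`.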


In [27]:
# That took longer than hoped for, but that should be ok. Now that we have the addresses and the Long/Lat,
# it's time to save the data in a DataFrame and then to a csv file.
print(type(LongCor))
print(type(Adresses))
df_add = pd.DataFrame({'Address':Adresses, 'Long':LongCor, 'Lat':LatCor})
df_add.head()

<class 'list'>
<class 'list'>


| | Address | Long | Lat |
|---|---|---|---|
| 0 | 1 Ritterstraße 10969 Berlin | 13.413797 | 52.500577 |
| 1 | 2 Ritterstraße 10969 Berlin | 13.413585 | 52.50068 |
| 2 | 3 Ritterstraße 10969 Berlin | 13.413377 | 52.501215 |
| 3 | 4 Ritterstraße 10969 Berlin | 13.320881 | 52.433071 |
| 4 | 5 Ritterstraße 10969 Berlin | 13.320525 | 52.433183 |
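
In the output above, rows 3 and 4 have coordinates that differ noticeably from rows 0 to 2, although all five addresses are on the same street. A simple plausibility check could flag such rows for manual review. The following is a minimal sketch, not part of the original notebook: `flag_outliers` is a hypothetical helper, and the threshold of 0.02 degrees is an arbitrary assumption chosen for illustration.

```python
from statistics import median
from typing import List


def flag_outliers(
    longitudes: List[float],
    latitudes: List[float],
    max_offset_deg: float = 0.02,
) -> List[int]:
    """Return indices whose coordinates lie far from the median of the group."""
    med_long = median(longitudes)
    med_lat = median(latitudes)
    return [
        i
        for i, (lon, lat) in enumerate(zip(longitudes, latitudes))
        if abs(lon - med_long) > max_offset_deg or abs(lat - med_lat) > max_offset_deg
    ]


# The five Ritterstraße rows shown in the output above.
longs = [13.413797, 13.413585, 13.413377, 13.320881, 13.320525]
lats = [52.500577, 52.50068, 52.501215, 52.433071, 52.433183]
print(flag_outliers(longs, lats))  # prints [3, 4]
```

Applied per street, such a check would only point at candidates; whether a flagged address was geocoded to the wrong place still has to be verified by hand.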


In [29]:
foldername = 'data'
# create the output folder if it does not exist yet
os.makedirs(foldername, exist_ok=True)

df_add.to_csv(os.path.join(foldername, 'addressesLatLong.csv'), index_label='Index')