# How similar are Indonesian embassies based on their location?

### Problem Description

An <b>embassy</b> represents a country abroad. Its existence signals a relationship between two countries and serves as a channel to communicate and strengthen ties. An embassy's location must satisfy strict and complex requirements from both the sending country and the host country, and it is usually in a special diplomatic compound or district. Despite all the careful planning, the decision on where to establish an embassy may also be driven by other needs, such as a preference for a certain neighborhood or district, proximity to certain amenities, or nearby places that support the embassy's mission.  

Knowing how embassies are similar or different could <b>help give Indonesian Foreign Affairs officials a bigger picture</b> of the general environment in which their embassies are located. It could help answer questions such as: should certain embassies be treated differently? Do embassies with similar characteristics expose their staff to the same or different levels of work stress? Do certain embassies experience particular disturbances?

### What Data Do We Need?

To carry out the analysis we need a list of Indonesian embassies abroad with their latitude and longitude. We combine it with data from the Foursquare API to learn more about each embassy's neighborhood and the venues surrounding it. We then cluster the embassies, compare them, and group similar ones together.

### Data Gathering

#### 1. List of Indonesian Embassies abroad

Getting the list of Indonesian embassies abroad is a bit difficult, because the Ministry of Foreign Affairs website does not present the data in a form that is easy to analyze. With the help of Google and Wikipedia, however, we found a page from which we can extract the data we need.

In [1]:
list_of_ID_embassies = 'https://id.wikipedia.org/wiki/Kedutaan_besar_Republik_Indonesia'

#### Import the libraries that we need

In [1]:
#!conda install -c anaconda beautifulsoup4
from bs4 import BeautifulSoup # this module helps with web scraping
import requests  # this module helps us to download a web page

#### Get the website page

In [5]:
page_list_of_ID_embassies  = requests.get(list_of_ID_embassies).text
page_list_of_ID_embassies

'<!DOCTYPE html>\n<html class="client-nojs" lang="id" dir="ltr">\n<head>\n<meta charset="UTF-8"/>\n<title>Kedutaan besar Republik Indonesia - Wikipedia bahasa Indonesia, ensiklopedia bebas</title>\n<script>document.documentElement.className="client-js";RLCONF={"wgBreakFrames":!1,"wgSeparatorTransformTable":[",\\t.",".\\t,"],"wgDigitTransformTable":["",""],"wgDefaultDateFormat":"dmy","wgMonthNames":["","Januari","Februari","Maret","April","Mei","Juni","Juli","Agustus","September","Oktober","November","Desember"],"wgRequestId":"3a7aeeb9-cccd-4bcc-ac24-98aa49e1ba29","wgCSPNonce":!1,"wgCanonicalNamespace":"","wgCanonicalSpecialPageName":!1,"wgNamespaceNumber":0,"wgPageName":"Kedutaan_besar_Republik_Indonesia","wgTitle":"Kedutaan besar Republik Indonesia","wgCurRevisionId":18023174,"wgRevisionId":18023174,"wgArticleId":28406,"wgIsArticle":!0,"wgIsRedirect":!1,"wgAction":"view","wgUserName":null,"wgUserGroups":["*"],"wgCategories":["Artikel dengan pranala luar nonaktif","Artikel dengan prana

#### Extract the website page for table of Indonesian Embassies

In [7]:
soup = BeautifulSoup(page_list_of_ID_embassies,"html5lib")

In [9]:
table = soup.find('table')
table

<table class="wikitable sortable" style="text-align: center;">

<tbody><tr>
<th>Perwakilan
</th>
<th>Duta Besar
</th>
<th>Pelantikan
</th>
<th>Website
</th>
<th>Merangkap
</th>
<th>Daftar
</th>
<th>Ref.
</th></tr>
<tr>
<td align="left"><span class="flagicon"><img alt="" class="thumbborder" data-file-height="600" data-file-width="900" decoding="async" height="15" src="//upload.wikimedia.org/wikipedia/commons/thumb/9/9a/Flag_of_Afghanistan.svg/23px-Flag_of_Afghanistan.svg.png" srcset="//upload.wikimedia.org/wikipedia/commons/thumb/9/9a/Flag_of_Afghanistan.svg/35px-Flag_of_Afghanistan.svg.png 1.5x, //upload.wikimedia.org/wikipedia/commons/thumb/9/9a/Flag_of_Afghanistan.svg/45px-Flag_of_Afghanistan.svg.png 2x" width="23"/> </span><a href="/wiki/Afganistan" title="Afganistan">Afganistan</a>
</td>
<td><a href="/wiki/Arief_Rachman" title="Arief Rachman">Arief Rachman</a>
</td>
<td>13 Maret 2017
</td>
<td><a href="/wiki/Kedutaan_Besar_Republik_Indonesia_di_Kabul" title="Kedutaan Besar Republik

#### The table consists of Country, Ambassador, Inauguration, Website (Country Capital), Concurrent Accreditation, List, Ref

In [270]:
table_embassies = []  # List of Indonesian embassies
rows = table.find_all('tr')
for row in rows[1:]:  # Skip the table header row
    row_data = row.find_all('td')
    table_embassies.append({
        'Country': row_data[0].text,  # first column: host country
        'Capital': row_data[3].text,  # fourth column: website link labelled with the capital
    })

#### See the top 10

In [271]:
table_embassies[:10]

[{'Country': '\xa0Afganistan\n', 'Capital': 'Kabul [1]\n'},
 {'Country': '\xa0Afrika Selatan\n', 'Capital': 'Pretoria [2]\n'},
 {'Country': '\xa0Aljazair\n', 'Capital': 'Algiers\n'},
 {'Country': '\xa0Amerika Serikat\n', 'Capital': 'Washington, D.C. [3]\n'},
 {'Country': '\xa0Arab Saudi\n', 'Capital': 'Riyadh\n'},
 {'Country': '\xa0Argentina\n', 'Capital': 'Buenos Aires [4]\n'},
 {'Country': '\xa0Australia\n', 'Capital': 'Canberra [5]\n'},
 {'Country': '\xa0Austria\n', 'Capital': 'Wina\n'},
 {'Country': '\xa0Azerbaijan\n', 'Capital': 'Baku\n'},
 {'Country': '\xa0Bahrain\n', 'Capital': 'Manama [6]\n'}]

#### Import pandas

In [2]:
import pandas as pd
pd.set_option('display.max_rows', None) #Need to see all the rows

In [273]:
df_ID_embassies=pd.DataFrame(table_embassies)

In [274]:
df_ID_embassies.head(200)

Unnamed: 0,Country,Capital
0,Afganistan\n,Kabul [1]\n
1,Afrika Selatan\n,Pretoria [2]\n
2,Aljazair\n,Algiers\n
3,Amerika Serikat\n,"Washington, D.C. [3]\n"
4,Arab Saudi\n,Riyadh\n
5,Argentina\n,Buenos Aires [4]\n
6,Australia\n,Canberra [5]\n
7,Austria\n,Wina\n
8,Azerbaijan\n,Baku\n
9,Bahrain\n,Manama [6]\n


There are 98 Indonesian Embassies according to our data

#### Clean the data to avoid problems later

1. Get rid of the '\n'
2. Get rid of the []

In [275]:
df_ID_embassies['Country'] = df_ID_embassies['Country'].replace(r'\n', '', regex=True)
df_ID_embassies['Capital'] = df_ID_embassies['Capital'].replace(r'\n', '', regex=True)
df_ID_embassies['Capital'] = df_ID_embassies['Capital'].replace(r'\[\d+\]', '', regex=True) # footnote markers: one or more digits inside []
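The raw cells shown earlier also begin with a non-breaking space (`\xa0`), and removing a footnote marker such as `[1]` leaves a trailing space behind. As a minimal sketch of what a fuller cleanup could look like on single strings, here is a plain-Python version; `clean_cell` is a hypothetical helper, not part of the notebook.

```python
import re

def clean_cell(text):
    """Hypothetical helper: tidy one scraped table cell."""
    text = re.sub(r'\[\d+\]', '', text)  # drop footnote markers such as [1]
    text = text.replace('\xa0', ' ')     # non-breaking space before the country name
    return text.strip()                  # trim the newline and any stray spaces

# Sample strings copied from the scraped output above
samples = ['\xa0Afganistan\n', 'Kabul [1]\n', 'Washington, D.C. [3]\n']
cleaned = [clean_cell(s) for s in samples]
print(cleaned)
```

On a DataFrame column the same function could be applied with `Series.map(clean_cell)`.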

In [276]:
df_ID_embassies.head(100)

Unnamed: 0,Country,Capital
0,Afganistan,Kabul
1,Afrika Selatan,Pretoria
2,Aljazair,Algiers
3,Amerika Serikat,"Washington, D.C."
4,Arab Saudi,Riyadh
5,Argentina,Buenos Aires
6,Australia,Canberra
7,Austria,Wina
8,Azerbaijan,Baku
9,Bahrain,Manama


3. Drop Perbara (ASEAN), since it is located in Jakarta, the Indonesian capital, not abroad

In [277]:
df_ID_embassies = df_ID_embassies.drop(62)

4. Drop Taiwan since Indonesia doesn't have an embassy there

In [278]:
df_ID_embassies = df_ID_embassies.drop(82)

5. Change Washington, D.C. to Washington

In [279]:
df_ID_embassies.loc[3,'Capital'] ='Washington'

6. Drop Kamerun since it is handled by the embassy in Nigeria

In [280]:
df_ID_embassies = df_ID_embassies.drop(34)

7. Change Kairo to Cairo

In [281]:
df_ID_embassies.loc[50,'Capital'] ='Cairo'

8. Drop the Indonesian representatives to the UN

In [282]:
df_ID_embassies = df_ID_embassies.drop(60)
df_ID_embassies = df_ID_embassies.drop(61)

9. Change Bukares to Bucharest

In [283]:
df_ID_embassies.loc[68,'Capital'] ='Bucharest'

10. Change Moskwa to Moscow

In [284]:
df_ID_embassies.loc[69,'Capital'] ='Moscow'

11. Change Kolombo to Colombo

In [285]:
df_ID_embassies.loc[76,'Capital'] ='Colombo'

12. Change Damaskus to Damascus

In [286]:
df_ID_embassies.loc[78,'Capital'] ='Damascus'

13. Change Vatikan to Vatican

In [287]:
df_ID_embassies.loc[83,'Capital'] ='Vatican'

In [288]:
df_ID_embassies = df_ID_embassies.reset_index(drop=True)
df_ID_embassies

Unnamed: 0,Country,Capital
0,Afganistan,Kabul
1,Afrika Selatan,Pretoria
2,Aljazair,Algiers
3,Amerika Serikat,Washington
4,Arab Saudi,Riyadh
5,Argentina,Buenos Aires
6,Australia,Canberra
7,Austria,Wina
8,Azerbaijan,Baku
9,Bahrain,Manama
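The cleanup steps above address rows by their position (`drop(62)`, `loc[50, ...]`), which silently targets the wrong row if the Wikipedia table changes. A sketch of the same idea keyed on names instead is shown below in plain Python. The rename pairs and the dropped names come from the steps above; the assumption that the `Country` cells read exactly `Perbara`, `Taiwan` and `Kamerun`, and the sample rows themselves, are illustrative only.

```python
# Entries to drop, keyed by country name (exact cell text is an assumption)
DROP_COUNTRIES = {'Perbara', 'Taiwan', 'Kamerun'}

# Indonesian spellings of capitals mapped to the English ones used later
CAPITAL_RENAMES = {
    'Washington, D.C.': 'Washington',
    'Kairo': 'Cairo',
    'Bukares': 'Bucharest',
    'Moskwa': 'Moscow',
    'Kolombo': 'Colombo',
    'Damaskus': 'Damascus',
    'Vatikan': 'Vatican',
}

def tidy(rows):
    """Drop unwanted rows, then translate capital names where a rename exists."""
    kept = [r for r in rows if r['Country'] not in DROP_COUNTRIES]
    return [
        {**r, 'Capital': CAPITAL_RENAMES.get(r['Capital'], r['Capital'])}
        for r in kept
    ]

# Illustrative rows, not the real table
sample_rows = [
    {'Country': 'Mesir', 'Capital': 'Kairo'},
    {'Country': 'Taiwan', 'Capital': 'Taipei'},
    {'Country': 'Rusia', 'Capital': 'Moskwa'},
    {'Country': 'Australia', 'Capital': 'Canberra'},
]
tidied = tidy(sample_rows)
print(tidied)
```

With a DataFrame, the same dictionary can be passed to `Series.replace`, and the filter becomes `df[~df['Country'].isin(DROP_COUNTRIES)]`.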


#### Save the data

In [289]:
df_ID_embassies.to_csv('ID_Embassies.csv')

In [3]:
df_ID_embassies = pd.read_csv('ID_Embassies.csv',index_col=0)
df_ID_embassies

Unnamed: 0,Country,Capital
0,Afganistan,Kabul
1,Afrika Selatan,Pretoria
2,Aljazair,Algiers
3,Amerika Serikat,Washington
4,Arab Saudi,Riyadh
5,Argentina,Buenos Aires
6,Australia,Canberra
7,Austria,Wina
8,Azerbaijan,Baku
9,Bahrain,Manama


So the final tally is 93 Indonesian embassies abroad.

#### 2. Get the addresses of the Indonesian Embassies

Since we have no latitude and longitude data for the embassies, we first need each embassy's address, which we can convert later. This is also a problem, because the addresses are not available in a form that is easy to extract and analyze. However, the embassy websites share a common URL pattern that we can exploit:

https://kemlu.go.id/CAPITAL

In [24]:
capitals = df_ID_embassies['Capital'].values

In [25]:
address = []
for capital in capitals:
    formatted_capital = capital.lower().strip().replace(' ', '')
    url = 'https://kemlu.go.id/{}/en'.format(formatted_capital)
    embassy_website = requests.get(url).text
    soup = BeautifulSoup(embassy_website, "html5lib")
    # The bottom part of the page with the embassy address
    div_address = soup.find('div', {"class": "col-12 col-md-6 text-center text-md-left"})
    # The first link inside that block holds the address text
    first_link = div_address.find('a') if div_address else None
    # Append exactly one entry per capital so the list stays aligned with df_ID_embassies
    address.append(first_link.text if first_link else '')
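To make the URL pattern concrete, the small sketch below isolates the slug-building step from the loop above. `embassy_url` is a hypothetical helper name, and whether each resulting page actually exists on the ministry's site is not checked here.

```python
def embassy_url(capital):
    """Build the embassy page URL from a capital name, as the loop above does."""
    slug = capital.lower().strip().replace(' ', '')  # 'Buenos Aires' -> 'buenosaires'
    return 'https://kemlu.go.id/{}/en'.format(slug)

# A few capitals taken from the table above
for name in ['Kabul', 'Buenos Aires', 'Den Haag']:
    print(name, '->', embassy_url(name))
```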

#### Create a new DataFrame for the embassy addresses

In [28]:
embassies_address = pd.DataFrame(data=address, columns=['Address']) 
embassies_address

Unnamed: 0,Address
0,\n \n \n...
1,\n \n \n...
2,\n \n \n...
3,\n \n \n...
4,\n \n \n...
5,\n \n \n...
6,\n \n \n...
7,\n \n \n...
8,\n \n \n...
9,\n \n \n...


Apparently the addresses scraped from the embassy websites need a makeover.

In [30]:
embassies_address['Address'] = embassies_address['Address'].str.strip()
embassies_address['Address'] = embassies_address['Address'].str.replace('^ +', '_',regex=True)
embassies_address['Address'] = embassies_address['Address'].str.replace(' +$', '_',regex=True)
embassies_address['Address'] = embassies_address['Address'].replace(r'\\n',' ', regex=True)
embassies_address['Address'] = embassies_address['Address'].replace(to_replace=[r"\\t|\\n|\\r", "\t|\n|\r"], value=["",""], regex=True)
embassies_address['Address'] = embassies_address['Address'].replace(r"(?i)[^0-9a-z!?.;,@' -]",'',regex=True)
embassies_address

Unnamed: 0,Address
0,"Malalai Watt, Shah-re-Naw, Ministry of Interio..."
1,Embassy of the Republic of Indonesia949 Franci...
2,"Embassy of the Republic of Indonesia-61, Avenu..."
3,
4,"Diplomatic Quarter, P.O. Box 94343 - Riyadh 11693"
5,"Mariscal Ramon Castilla 2901, 1425 Capital Fed..."
6,Embassy of the Republic of IndonesiaAddress 8 ...
7,Embassy of the Republic of Indonesia in Vienna...
8,EMBASSY OF THE REPUBLIC OF INDONESIAAzer Aliye...
9,Embassy of the Republic of Indonesia to Bahrai...
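In the output above, some entries run the embassy name straight into the street number (`Indonesia949`), which appears to come from deleting line breaks instead of replacing them with a space. A rough sketch of an alternative is given below: collapse every run of whitespace into one space, then strip a leading "Embassy of the Republic of Indonesia" label. `normalise_address` and the sample strings are hypothetical, and variants such as "Embassy ... in Vienna" are deliberately not handled.

```python
import re

# Leading label to strip, followed by any punctuation or spaces (assumed pattern)
PREFIX = re.compile(r'^embassy of the republic of indonesia\W*', re.IGNORECASE)

def normalise_address(raw):
    """Collapse whitespace and drop the leading embassy label, if present."""
    text = re.sub(r'\s+', ' ', raw).strip()  # newlines and tabs become single spaces
    return PREFIX.sub('', text)

# Made-up inputs shaped like the scraped text
examples = [
    '\n   \n  Embassy of the Republic of Indonesia\n949 Francis Baard Street\n',
    'EMBASSY OF THE REPUBLIC OF INDONESIA-61, Avenue Souidani Boudjemaa',
    'Diplomatic Quarter, P.O. Box 94343 - Riyadh 11693',
]
normalised = [normalise_address(e) for e in examples]
for line in normalised:
    print(line)
```

Even with a step like this, the formats differ enough between sites that a manual pass is still likely to be needed.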


The address format is not uniform across the websites, so we need to clean it up manually.

#### Save the addresses and edit them manually

In [32]:
embassies_address.to_csv('embassies_address.csv')

#### Read the result

In [86]:
embassies_address = pd.read_csv('embassies_address_refine.csv')
embassies_address

Unnamed: 0,Address
0,Shah-re-Naw Ministry of Interior Street Kabul
1,949 Francis Baard Street Hatfield. Pretoria
2,Avenue Souidani Boudjemaa 61 Algiers
3,2020 Massachusetts Avenue NW. Washington DC
4,Diplomatic Quarter. Riyadh
5,Mariscal Ramon Castilla 2901. Buenos Aires
6,8 Darwin Avenue Yarralumla. Canberra
7,Gustav Tschermakgasse 5-7 Vienna
8,Azer Aliyev 3 Nasimi Baku
9,Villa 2113 Road 2432 Manama


#### Insert into the Indonesian Embassies DataFrame

In [88]:
df_ID_embassies['Address'] = embassies_address['Address']  # assign the column, not the whole DataFrame

In [95]:
df_ID_embassies

Unnamed: 0,Country,Capital,Address
0,Afganistan,Kabul,Shah-re-Naw Ministry of Interior Street Kabul
1,Afrika Selatan,Pretoria,949 Francis Baard Street Hatfield. Pretoria
2,Aljazair,Algiers,Avenue Souidani Boudjemaa 61 Algiers
3,Amerika Serikat,Washington,2020 Massachusetts Avenue NW. Washington DC
4,Arab Saudi,Riyadh,Diplomatic Quarter. Riyadh
5,Argentina,Buenos Aires,Mariscal Ramon Castilla 2901. Buenos Aires
6,Australia,Canberra,8 Darwin Avenue Yarralumla. Canberra
7,Austria,Wina,Gustav Tschermakgasse 5-7 Vienna
8,Azerbaijan,Baku,Azer Aliyev 3 Nasimi Baku
9,Bahrain,Manama,Villa 2113 Road 2432 Manama


In [96]:
df_ID_embassies.to_csv('ID_Embassies_with_address.csv')
df_ID_embassies = pd.read_csv('ID_Embassies_with_address.csv',index_col=0)
df_ID_embassies

Unnamed: 0,Country,Capital,Address
0,Afganistan,Kabul,Shah-re-Naw Ministry of Interior Street Kabul
1,Afrika Selatan,Pretoria,949 Francis Baard Street Hatfield. Pretoria
2,Aljazair,Algiers,Avenue Souidani Boudjemaa 61 Algiers
3,Amerika Serikat,Washington,2020 Massachusetts Avenue NW. Washington DC
4,Arab Saudi,Riyadh,Diplomatic Quarter. Riyadh
5,Argentina,Buenos Aires,Mariscal Ramon Castilla 2901. Buenos Aires
6,Australia,Canberra,8 Darwin Avenue Yarralumla. Canberra
7,Austria,Wina,Gustav Tschermakgasse 5-7 Vienna
8,Azerbaijan,Baku,Azer Aliyev 3 Nasimi Baku
9,Bahrain,Manama,Villa 2113 Road 2432 Manama


Let us try to get the latitude and longitude with the geopy library.

In [97]:
#conda install -c conda-forge geopy

In [98]:
from geopy.geocoders import Nominatim
from geopy.extra.rate_limiter import RateLimiter

In [99]:
geocoder = Nominatim(user_agent='embassies')
geocode = RateLimiter(geocoder.geocode, min_delay_seconds=1, return_value_on_exception=None)

#### Get the embassy addresses

In [100]:
address = df_ID_embassies['Address'].values

#### Loop over all the addresses

In [101]:
long_and_lat = []
for addr in address:
    print(addr)
    location = geocode(addr)
    long_and_lat.append(location)

Shah-re-Naw Ministry of Interior Street Kabul
949 Francis Baard Street Hatfield. Pretoria
Avenue Souidani Boudjemaa 61 Algiers
2020 Massachusetts Avenue NW. Washington DC
Diplomatic Quarter. Riyadh
Mariscal Ramon Castilla 2901. Buenos Aires
8 Darwin Avenue Yarralumla. Canberra
Gustav Tschermakgasse 5-7 Vienna
Azer Aliyev 3 Nasimi Baku
Villa 2113 Road 2432 Manama
Road No 53 Plot No 14 Gulshan Dhaka
Tobias Asserlaan 8 Den Haag
Boulevardde la Woluwe 38 Brussels
Splitska 9. Sarajevo
SES Avenida Das Nacoes Quadra 805 Brasilia-DF
30 Great Peter Street. London
Jalan Kebangsaan Kampung Kawasan Diplomatik Mukim Kianggeh Bandar Seri Begawan
Simeonovsko Shosse Sofia
Nad Budankami II  7. Praha
Avenida Las Urbinas 160 Providencia. Santiago
Alle 1 Hellerup Copenhagen
CALLE QUITEO LIBRE E15 QUITO
Egypt Street Mekanissa Road Woreda 05  Addis Ababa
Marama Building  91 Gordon Street Fiji
Salcedo Street 185 Manila
Kuusisaarentie 3. Helsinki
Varosligeti fasor 26. Budapest
50-A Kautilya Marg Chanakyapuri. 

RateLimiter caught an error, retrying (0/2 tries). Called with (*('Calle 70 Bogota',), **{}).
Traceback (most recent call last):
  File "/Users/boysetiawan/opt/anaconda3/lib/python3.7/site-packages/urllib3/connectionpool.py", line 421, in _make_request
    six.raise_from(e, None)
  File "<string>", line 3, in raise_from
  File "/Users/boysetiawan/opt/anaconda3/lib/python3.7/site-packages/urllib3/connectionpool.py", line 416, in _make_request
    httplib_response = conn.getresponse()
  File "/Users/boysetiawan/opt/anaconda3/lib/python3.7/http/client.py", line 1344, in getresponse
    response.begin()
  File "/Users/boysetiawan/opt/anaconda3/lib/python3.7/http/client.py", line 306, in begin
    version, status, reason = self._read_status()
  File "/Users/boysetiawan/opt/anaconda3/lib/python3.7/http/client.py", line 267, in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
  File "/Users/boysetiawan/opt/anaconda3/lib/python3.7/socket.py", line 589, in readinto
    

380 Yeouidaebang-ro Yeongdeungpo-gu. Seoul
Munsudong Taedonggang Distric Pyongyang
Ulica Medveak 56 Zagreb
5ta Avenida  1607 Miramar. La Habana
Daiya Block 1 Rashed Ahmed Al-Roumi Street
Kaysone Phomvihane Avenue. Vientiane
Presidential Palace Avenue Rue 68 Sector 3 Beirut
Hay Al Karamah Qobri Taariq Al Sari Tripoli
Jalan Tun Razak 233 Kualalumpur 
Rue Beni Boufrah 63 Rabat 
Julio Verne No 27 Mexico City
Aisha El Taymouria Street 13 Garden City Cairo
Streets No 141 Sommerschield  Maputo
Pyiudaungsu Yeiktha Road 100 Yangon
103 Nelson Mandela Avenue. Windhoek
Katsina Ala Crescent 10 Abuja
Fritzners gate 12. Oslo
Al-Shatty Qurum Building Way 3015 Muscat
Diplomatic Enclave I Street 5 Islamabad
Casa no 15 y Ricardo Arango Urbanizacion Obarrio Calle 55 Este. Panama City
Sir John Giuse Drive Lot 12 Section 410 Port Moresby
Avenida Las Flores 334-336 San Isidro. Lima
ulica Estoska 3 Warsawa
Avenida Dom Vasco da Gama no 40 Lisbon
47-49 rue Cortambert. Paris
Al Salmiya Street  Zone 66  Street 94

In [102]:
long_and_lat

[None,
 Location(Francis Baard Street, Hatfield, Tshwane Ward 56, Pretoria, City of Tshwane Metropolitan Municipality, Gauteng, 1166, South Africa, (-25.7458008, 28.2406273, 0.0)),
 None,
 Location(Embassy of Indonesia, 2020, Massachusetts Avenue Northwest, Dupont Circle and surrunding block, Dupont Circle, Washington, District of Columbia, 20036-5305, United States, (38.9102789, -77.0461492, 0.0)),
 Location(Diplomatic Quarter, حي السفارات, عرقة, Municipalty of Irqah, الرياض, منطقة الرياض, السعودية, (24.677103449999997, 46.625145184164424, 0.0)),
 Location(Embajada de Indonesia, 2901, Mariscal Ramón Castilla, Barrio Parque, Palermo, Buenos Aires, Comuna 14, Ciudad Autónoma de Buenos Aires, 1425, Argentina, (-34.5791904, -58.399681164365234, 0.0)),
 Location(Darwin Avenue, Yarralumla, Canberra, District of Canberra Central, Australian Capital Territory, 2600, Australia, (-35.3035676, 149.1154008, 0.0)),
 None,
 None,
 Location(Villa, طريق 2502, القضيبية, المنامة, محافظة العاصمة, 308, ا
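Many addresses come back as `None`, often because the query is more detailed than the geocoder's data. One possible workaround, sketched below under that assumption, is to retry with progressively shorter queries by dropping leading words. `geocode_with_fallback` and `fake_geocode` are hypothetical names, and the coordinates in the stand-in geocoder are made up so the example runs offline.

```python
def geocode_with_fallback(address, geocode_fn, min_words=1):
    """Try the full address, then drop leading words until something matches.

    Returns (query_that_matched, result), or (None, None) if nothing matched.
    """
    words = address.split()
    for start in range(len(words) - min_words + 1):
        query = ' '.join(words[start:])
        result = geocode_fn(query)
        if result is not None:
            return query, result
    return None, None

def fake_geocode(query):
    """Offline stand-in for a real geocoder; the coordinates are invented."""
    known = {'Gulshan Dhaka': (1.0, 2.0)}
    return known.get(query)

matched_query, coords = geocode_with_fallback(
    'Road No 53 Plot No 14 Gulshan Dhaka', fake_geocode)
print(matched_query, coords)
```

In the notebook, the rate-limited `geocode` callable could be passed as `geocode_fn`. Each retry is a separate request, and a shorter query gives a coarser position (a district or city centre instead of the building), so results found this way deserve a manual check.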

#### Format the result in a DataFrame

In [103]:
geolocation = []
for geo in long_and_lat:
    if geo is None:
        # Geocoding failed for this address: store (0, 0) as a placeholder
        geolocation.append({'latitude': 0, 'longitude': 0})
    else:
        geolocation.append({'latitude': geo.latitude, 'longitude': geo.longitude})
geolocation

[{'latitude': 0, 'longitude': 0},
 {'latitude': -25.7458008, 'longitude': 28.2406273},
 {'latitude': 0, 'longitude': 0},
 {'latitude': 38.9102789, 'longitude': -77.0461492},
 {'latitude': 24.677103449999997, 'longitude': 46.625145184164424},
 {'latitude': -34.5791904, 'longitude': -58.399681164365234},
 {'latitude': -35.3035676, 'longitude': 149.1154008},
 {'latitude': 0, 'longitude': 0},
 {'latitude': 0, 'longitude': 0},
 {'latitude': 26.222771450000003, 'longitude': 50.58894775365927},
 {'latitude': 0, 'longitude': 0},
 {'latitude': 52.0861442, 'longitude': 4.2886995},
 {'latitude': 0, 'longitude': 0},
 {'latitude': 43.8507066, 'longitude': 18.4035497},
 {'latitude': 0, 'longitude': 0},
 {'latitude': 51.4968935, 'longitude': -0.1295604},
 {'latitude': 0, 'longitude': 0},
 {'latitude': 0, 'longitude': 0},
 {'latitude': 50.0711315, 'longitude': 14.3729314},
 {'latitude': -33.4221722, 'longitude': -70.6120543},
 {'latitude': 55.7222381, 'longitude': 12.5595906},
 {'latitude': 0, 'longit
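Using `0` as the placeholder works here, but latitude 0, longitude 0 is also a real point on the map, so a failed lookup can later be mistaken for a location. A small sketch of the alternative, storing `None` for failures, follows. `FakeLocation` is a hypothetical stand-in for the geocoder's result object so that the example runs without network access.

```python
from collections import namedtuple

# Stand-in for a geocoder result with .latitude and .longitude attributes
FakeLocation = namedtuple('FakeLocation', ['latitude', 'longitude'])

def to_record(geo):
    """Turn one geocoder result into a dict, keeping failures as None."""
    if geo is None:
        return {'latitude': None, 'longitude': None}
    return {'latitude': geo.latitude, 'longitude': geo.longitude}

# The second value is copied from the output above; the None entries mimic failures
results = [None, FakeLocation(-25.7458008, 28.2406273), None]
records = [to_record(g) for g in results]

# Positions of the addresses that still need coordinates
missing = [i for i, r in enumerate(records) if r['latitude'] is None]
print(missing)
```

When such records are loaded into a DataFrame, the `None` values become `NaN`, and the rows without coordinates can be selected with `df['latitude'].isna()`.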

#### Make geolocation DataFrame

In [104]:
embassies_geolocation = pd.DataFrame(data=geolocation, columns=['latitude','longitude']) 
embassies_geolocation

Unnamed: 0,latitude,longitude
0,0.0,0.0
1,-25.745801,28.240627
2,0.0,0.0
3,38.910279,-77.046149
4,24.677103,46.625145
5,-34.57919,-58.399681
6,-35.303568,149.115401
7,0.0,0.0
8,0.0,0.0
9,26.222771,50.588948


#### Combine with the Indonesian Embassies DataFrame

In [105]:
df_ID_embassies = df_ID_embassies.join(embassies_geolocation)
df_ID_embassies

Unnamed: 0,Country,Capital,Address,latitude,longitude
0,Afganistan,Kabul,Shah-re-Naw Ministry of Interior Street Kabul,0.0,0.0
1,Afrika Selatan,Pretoria,949 Francis Baard Street Hatfield. Pretoria,-25.745801,28.240627
2,Aljazair,Algiers,Avenue Souidani Boudjemaa 61 Algiers,0.0,0.0
3,Amerika Serikat,Washington,2020 Massachusetts Avenue NW. Washington DC,38.910279,-77.046149
4,Arab Saudi,Riyadh,Diplomatic Quarter. Riyadh,24.677103,46.625145
5,Argentina,Buenos Aires,Mariscal Ramon Castilla 2901. Buenos Aires,-34.57919,-58.399681
6,Australia,Canberra,8 Darwin Avenue Yarralumla. Canberra,-35.303568,149.115401
7,Austria,Wina,Gustav Tschermakgasse 5-7 Vienna,0.0,0.0
8,Azerbaijan,Baku,Azer Aliyev 3 Nasimi Baku,0.0,0.0
9,Bahrain,Manama,Villa 2113 Road 2432 Manama,26.222771,50.588948


#### See which embassies have a geolocation

In [106]:
embassies_with_geolocation = df_ID_embassies[df_ID_embassies['latitude'] != 0]
embassies_with_geolocation

Unnamed: 0,Country,Capital,Address,latitude,longitude
1,Afrika Selatan,Pretoria,949 Francis Baard Street Hatfield. Pretoria,-25.745801,28.240627
3,Amerika Serikat,Washington,2020 Massachusetts Avenue NW. Washington DC,38.910279,-77.046149
4,Arab Saudi,Riyadh,Diplomatic Quarter. Riyadh,24.677103,46.625145
5,Argentina,Buenos Aires,Mariscal Ramon Castilla 2901. Buenos Aires,-34.57919,-58.399681
6,Australia,Canberra,8 Darwin Avenue Yarralumla. Canberra,-35.303568,149.115401
9,Bahrain,Manama,Villa 2113 Road 2432 Manama,26.222771,50.588948
11,Belanda,Den Haag,Tobias Asserlaan 8 Den Haag,52.086144,4.288699
13,Bosnia dan Herzegovina,Sarajevo,Splitska 9. Sarajevo,43.850707,18.40355
15,Britania Raya,London,30 Great Peter Street. London,51.496893,-0.12956
18,Ceko,Praha,Nad Budankami II 7. Praha,50.071131,14.372931


In [108]:
embassies_without_geolocation = df_ID_embassies[df_ID_embassies['latitude'] == 0]
embassies_without_geolocation

Unnamed: 0,Country,Capital,Address,latitude,longitude
0,Afganistan,Kabul,Shah-re-Naw Ministry of Interior Street Kabul,0.0,0.0
2,Aljazair,Algiers,Avenue Souidani Boudjemaa 61 Algiers,0.0,0.0
7,Austria,Wina,Gustav Tschermakgasse 5-7 Vienna,0.0,0.0
8,Azerbaijan,Baku,Azer Aliyev 3 Nasimi Baku,0.0,0.0
10,Bangladesh,Dhaka,Road No 53 Plot No 14 Gulshan Dhaka,0.0,0.0
12,Belgia,Brussels,Boulevardde la Woluwe 38 Brussels,0.0,0.0
14,Brasil,Brasilia,SES Avenida Das Nacoes Quadra 805 Brasilia-DF,0.0,0.0
16,Brunei,Bandar Seri Begawan,Jalan Kebangsaan Kampung Kawasan Diplomatik Mu...,0.0,0.0
17,Bulgaria,Sofia,Simeonovsko Shosse Sofia,0.0,0.0
21,Ekuador,Quito,CALLE QUITEO LIBRE E15 QUITO,0.0,0.0


So some of the 93 embassies still have no geolocation. Let us count them.

In [109]:
embassies_without_geolocation.shape

(41, 5)

In [123]:
address_list = embassies_without_geolocation['Address'].to_dict()  # {row index: address}
address_list

{0: 'Shah-re-Naw Ministry of Interior Street Kabul',
 2: 'Avenue Souidani Boudjemaa 61 Algiers',
 7: 'Gustav Tschermakgasse 5-7 Vienna',
 8: 'Azer Aliyev 3 Nasimi Baku',
 10: 'Road No 53 Plot No 14 Gulshan Dhaka',
 12: 'Boulevardde la Woluwe 38 Brussels',
 14: 'SES Avenida Das Nacoes Quadra 805 Brasilia-DF',
 16: 'Jalan Kebangsaan Kampung Kawasan Diplomatik Mukim Kianggeh Bandar Seri Begawan',
 17: 'Simeonovsko Shosse Sofia',
 21: 'CALLE QUITEO LIBRE E15 QUITO',
 22: 'Egypt Street Mekanissa Road Woreda 05  Addis Ababa',
 23: 'Marama Building  91 Gordon Street Fiji',
 28: "Salhiya Hay Al-l'lam 220  Zukak 5 Baghdad",
 29: 'Ghaemmagham 180 Tehran',
 33: 'Street 268 Preah Suramarit Boulevard Phnom Penh',
 35: 'Saraishyk Street Diplomatic town. Nur-Sultan',
 37: 'Calle 70 Bogota',
 39: 'Munsudong Taedonggang Distric Pyongyang',
 40: 'Ulica Medveak 56 Zagreb',
 42: 'Daiya Block 1 Rashed Ahmed Al-Roumi Street',
 44: 'Presidential Palace Avenue Rue 68 Sector 3 Beirut',
 45: 'Hay Al Karamah Qob

In [124]:
len(address_list)

41

In [None]:
#conda install -c conda-forge geocoder

In [130]:
import geocoder

In [None]:
# Google's geocoder may need an API key; cap the retries so the loop cannot spin forever
lat_lng_coords = None
max_tries = 5

for attempt in range(max_tries):
    g = geocoder.google('YahyoGulomov Street 73 Tashkent')
    lat_lng_coords = g.latlng
    if lat_lng_coords:  # stop as soon as coordinates come back
        break

if lat_lng_coords:
    latitude, longitude = lat_lng_coords
    print('latitude = {0} longitude = {1}'.format(latitude, longitude))
else:
    print('No coordinates returned after {} tries'.format(max_tries))