In [2]:
import pandas as pd
import numpy as np
import re
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings('ignore')

# 1. Get a list of all cities in Israel

This is a pipeline for visualizing the local municipal authorities of Israel on a map using WGS coordinates.
In this notebook we extract the name and code of every city in Israel from the 2018 population report of Lamas (the Central Bureau of Statistics).

For simplicity, from here on the term 'city' refers to any kind of Israeli local municipal authority, such as a city, village, kibbutz, moshav, etc.

Note that this table contains ITM (Israeli Transverse Mercator) coordinates. Converting ITM to WGS is quite a headache, though, and I found it easier to use the Google API.

In [3]:
#bycode2018.xlsx is the Israeli Central Bureau of Statistics (Lamas) report on the population of Israel for 2018
#The meaning of each feature can be viewed in the index_2018.csv file
#Both can be downloaded here:
#https://www.cbs.gov.il/he/publications/Pages/2019/%D7%99%D7%99%D7%A9%D7%95%D7%91%D7%99%D7%9D-%D7%91%D7%99%D7%A9%D7%A8%D7%90%D7%9C.aspx



In [5]:
file_name = r'..\csv\lamas_report_2018.xlsx'
#read_excel does not accept an encoding argument in current pandas versions, so it is omitted
israel = pd.read_excel(file_name)

In [4]:
#extract the Hebrew column index from the Lamas file
heb_index=pd.Series(israel.columns)
heb_index

0                  שם יישוב
1                 סמל יישוב
2                     תעתיק
3                      מחוז
4                       נפה
5                 אזור טבעי
6            מעמד מונציפאלי
7           שיוך מטרופוליני
8                  דת יישוב
9     סך הכל אוכלוסייה 2018
10            יהודים ואחרים
11              מזה: יהודים
12                    ערבים
13                שנת ייסוד
14         צורת יישוב שוטפת
15         השתייכות ארגונית
16              קואורדינטות
17                     גובה
18               ועדת תכנון
19               מרחב משטרה
20                      שנה
21         שם יישוב באנגלית
22     אשכול רשויות מקומיות
dtype: object

In [5]:
#From the Lamas file I took 2 features: the English name and the ID code of each city. The code will be used as the index and
#will help us retrieve data from the Lamas table later on
#(.copy() gives an independent frame, so the edits below do not act on a view of the original table)
engDF = israel[heb_index[[1, 21]]].copy()

#name the columns
engDF.columns = ['code', 'name']

#drop missing values
engDF = engDF.dropna()

#use the code column as the index
engDF = engDF.set_index('code')

engDF.head()

code,name
967,Abu Juway'ad
472,Abu Ghosh
473,Abu Sinan
935,Abu Surayhan
958,Abu 'Abdun


# 2. Get WGS coordinates using the Google Maps API

In [6]:
#To use a Google API of any kind, go to Google Cloud, set up a billing method and generate an API key for the task you need.
#Note that this service costs money, but Google gives you a free trial.

#This is a simple function that takes a df of addresses and returns the same df with latitude and longitude WGS coordinates


from googlemaps import Client as GoogleMaps
import pandas as pd
import warnings
warnings.filterwarnings('ignore')



#define a function to work with
def getWGS(engDF):
    '''
    INPUT - a df with a 'name' column, to be used as the address
    OUTPUT - a tuple: (df with the columns name, lat, long and nameIL,
             list of the names that could not be geocoded)
    '''
    #start a GMAPS API session
    #assign the Google key - generate one manually in your Google Cloud account -> API section

    gmaps = GoogleMaps('Enter Your key here')

    #copy the addresses to a new df and add lat and long columns
    df = engDF.copy()
    df['lat'] = np.nan
    df['long'] = np.nan

    #manipulate the address - here you can add any string to your addresses to help Google Maps find them
    df['nameIL'] = df['name'] + ', Israel'
    google_errors = []

    #iterate over the addresses and get the coordinates
    for x in df.index.values:
        try:
            print(f'Starting round {x} of {len(df)-1}')
            geocode_result = gmaps.geocode(df.loc[x, 'nameIL'])
            location = geocode_result[0]['geometry']['location']
            #write with .loc so the values land in df itself and not in a temporary copy
            df.loc[x, 'lat'] = location['lat']
            df.loc[x, 'long'] = location['lng']
            print(df.loc[x, 'lat'], df.loc[x, 'long'])
        except Exception:
            #any failure (no result, API error) is recorded so the name can be reviewed manually
            google_errors.append(df.loc[x, 'name'])

    return df, google_errors
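
The lookup inside the loop indexes into the first geocoding hit, which fails when no hit comes back. Below is a minimal sketch of that extraction step as a standalone helper. The name `extract_lat_lng` is hypothetical, the sample dictionary is hand-written to mirror the keys used above (it is not real API output), and it assumes that an empty list means no match was found.

```python
from typing import Optional, Tuple


def extract_lat_lng(geocode_result: list) -> Optional[Tuple[float, float]]:
    """Return (lat, lng) from the first geocoding hit, or None if there is no hit."""
    if not geocode_result:
        # assumption: an empty list means the address was not found
        return None
    location = geocode_result[0]["geometry"]["location"]
    return location["lat"], location["lng"]


# hand-written sample with the same nesting the loop indexes into
sample = [{"geometry": {"location": {"lat": 31.8, "lng": 35.1}}}]

print(extract_lat_lng(sample))
print(extract_lat_lng([]))
```

Returning `None` for a missing hit makes "not found" an explicit case, so it does not have to surface as an `IndexError` that the `except` branch swallows.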



In [7]:
#as any query to the API (potentially) costs money, it is better to try it first with a small test batch

#get the coordinates
google_results = getWGS(engDF.iloc[:4, :])

#if it works well - do it for the whole set:
#google_results = getWGS(engDF)

Starting round 967 of 3
Starting round 472 of 3
Starting round 473 of 3
Starting round 935 of 3


In [8]:
google_results[0]

code,name,lat,long,nameIL
967,Abu Juway'ad,,,"Abu Juway'ad, Israel"
472,Abu Ghosh,,,"Abu Ghosh, Israel"
473,Abu Sinan,,,"Abu Sinan, Israel"
935,Abu Surayhan,,,"Abu Surayhan, Israel"


In [9]:
#save your results
#google_results[0].to_csv(r'..\csv\israel_with_lamas_codes_wgs.csv')



I reviewed the results manually.

For some cities I saw lat/long values way out of range, so I added them manually.
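
The out-of-range check was done by eye, but it can also be sketched in code. The helper below flags coordinates that fall outside a rough bounding box. The name `in_expected_range` is hypothetical and the box limits are illustrative assumptions chosen generously around Israel, not official values, so adjust them to your data.

```python
# rough bounding box in WGS degrees; illustrative assumptions, not official limits
LAT_MIN, LAT_MAX = 29.0, 34.0
LON_MIN, LON_MAX = 34.0, 36.0


def in_expected_range(lat, lon):
    """Return True if the point lies inside the assumed bounding box."""
    if lat is None or lon is None:
        return False
    # NaN fails every comparison, so missing coordinates are reported as out of range
    return LAT_MIN <= lat <= LAT_MAX and LON_MIN <= lon <= LON_MAX


print(in_expected_range(31.77, 35.21))   # a point inside the box
print(in_expected_range(40.71, -74.00))  # a point far outside the box
```

Rows for which the check returns `False` are the ones worth reviewing by hand.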

The missing values were mostly small Bedouin villages, most of which I also couldn't find manually on Google Maps.

But generally, it looks good!