Part 1
Installing required packages

In [3]:
!conda install -c conda-forge beautifulsoup4 --yes
!conda install -c conda-forge lxml --yes
!conda install -c conda-forge requests --yes

Fetching package metadata .............
Solving package specifications: .

Package plan for installation in environment /opt/conda/envs/DSX-Python35:

The following packages will be UPDATED:

    beautifulsoup4: 4.6.0-py35h442a8c9_1 --> 4.6.3-py35_0 conda-forge

beautifulsoup4 100% |################################| Time: 0:00:00   5.41 MB/s
Fetching package metadata .............
Solving package specifications: .

Package plan for installation in environment /opt/conda/envs/DSX-Python35:

The following packages will be UPDATED:

    libxml2: 2.9.4-h6b072ca_5     --> 2.9.8-h422b904_2     conda-forge
    libxslt: 1.1.29-hcf9102b_5    --> 1.1.32-h88dbc4e_2    conda-forge
    lxml:    4.1.0-py35ha401a81_0 --> 4.2.5-py35hc9114bc_0 conda-forge

libxml2-2.9.8- 100% |################################| Time: 0:00:00  25.64 MB/s
libxslt-1.1.32 100% |################################| Time: 0:00:00  67.04 MB/s
lxml-4.2.5-py3 100% |################################| Time: 0:00:00  66.36 MB/s

Importing required libraries

In [4]:
import pandas as pd
import numpy as np
import requests
from bs4 import BeautifulSoup

Obtaining the data from the website

In [5]:
source = requests.get('https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M').text

Creating a BeautifulSoup object with the 'lxml' parser

In [6]:
soup = BeautifulSoup(source, 'lxml')

Finding required data using HTML tags

In [7]:
table = soup.find('table')
cells = table.find_all('td')
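
Collecting every 'td' in one flat list relies on each row having exactly three cells. As a hedged alternative, the sketch below walks the table row by row and skips anything that is not a three-cell row. The sample markup and the names sample_html, sample_soup, sample_table and rows are made up for illustration, and 'html.parser' is used only so the snippet runs without lxml installed.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the fetched page.
sample_html = """
<table>
  <tr><th>Postcode</th><th>Borough</th><th>Neighbourhood</th></tr>
  <tr><td>M1A</td><td>Not assigned</td><td>Not assigned</td></tr>
  <tr><td>M3A</td><td>North York</td><td>Parkwoods</td></tr>
</table>
"""

# 'html.parser' ships with Python; 'lxml' can be passed instead if installed.
sample_soup = BeautifulSoup(sample_html, 'html.parser')
sample_table = sample_soup.find('table')

rows = []
for tr in sample_table.find_all('tr'):
    tds = tr.find_all('td')
    if len(tds) != 3:
        continue  # skips the header row (th cells only) and malformed rows
    rows.append([td.get_text(strip=True) for td in tds])

print(rows)
```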

Defining empty lists for scraping data from the table

In [8]:
postcodes = []
boroughs = []
neighbourhoods = []

Scraping the data from the table into the lists while stripping surrounding whitespace

In [9]:
for index in range(0, len(cells), 3):
    postcodes.append(cells[index].text.strip())
    boroughs.append(cells[index+1].text.strip())
    neighbourhoods.append(cells[index+2].text.strip())
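
The loop above steps through the flat list of cells three at a time. A minimal sketch of the same idea with list slicing follows, with a guard for a cell count that does not divide evenly by three. The list cell_texts and the sample_* names are hypothetical stand-ins for the stripped cell texts.

```python
# Hypothetical flat list of already-stripped cell texts.
cell_texts = ['M3A', 'North York', 'Parkwoods',
              'M4A', 'North York', 'Victoria Village']

# Fail early if the cells do not divide evenly into three-column rows.
if len(cell_texts) % 3 != 0:
    raise ValueError('cell count is not a multiple of 3')

# A slice with a step of 3 picks every third item, starting at each offset.
sample_postcodes = cell_texts[0::3]
sample_boroughs = cell_texts[1::3]
sample_neighbourhoods = cell_texts[2::3]
```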

Creating a data frame and populating it with the data from the lists

In [10]:
df = pd.DataFrame(data=[postcodes, boroughs, neighbourhoods]).transpose()

Adding column headers

In [11]:
df.columns = ['Postcode', 'Borough', 'Neighbourhood']
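
As a sketch of an alternative, a dictionary of lists builds the data frame and names its columns in one step, so no transpose is needed. The sample_* lists and sample_df below are illustrative values, not the scraped data.

```python
import pandas as pd

# Hypothetical sample lists standing in for the scraped ones.
sample_postcodes = ['M3A', 'M4A']
sample_boroughs = ['North York', 'North York']
sample_neighbourhoods = ['Parkwoods', 'Victoria Village']

# Each dictionary key becomes a column header.
sample_df = pd.DataFrame({
    'Postcode': sample_postcodes,
    'Borough': sample_boroughs,
    'Neighbourhood': sample_neighbourhoods,
})
```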

Replacing 'Not assigned' boroughs with NaN and dropping those rows

In [12]:
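# Assign the result back; an in-place replace on a selected column may not update df in recent pandas versions
df['Borough'] = df['Borough'].replace('Not assigned', np.nan)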
df.dropna(subset=['Borough'], inplace=True)
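
The same rows can also be removed with a boolean mask, without going through NaN. This is a minimal sketch on made-up data; sample_df and assigned are hypothetical names.

```python
import pandas as pd

# Hypothetical sample with one unassigned borough.
sample_df = pd.DataFrame({
    'Postcode': ['M1A', 'M3A', 'M4A'],
    'Borough': ['Not assigned', 'North York', 'North York'],
    'Neighbourhood': ['Not assigned', 'Parkwoods', 'Victoria Village'],
})

# Keep only the rows whose borough is assigned, then renumber the index.
assigned = sample_df[sample_df['Borough'] != 'Not assigned'].reset_index(drop=True)
```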

Combining rows that share a postcode and joining their neighbourhoods with commas

In [13]:
df = df.groupby(['Postcode', 'Borough'])['Neighbourhood'].apply(', '.join).reset_index()
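
To make the grouping step concrete, here is a small worked example on made-up rows, where one postcode appears twice. The names sample_df and combined are hypothetical.

```python
import pandas as pd

# Hypothetical sample in which one postcode appears on two rows.
sample_df = pd.DataFrame({
    'Postcode': ['M5A', 'M5A', 'M6A'],
    'Borough': ['Downtown Toronto', 'Downtown Toronto', 'North York'],
    'Neighbourhood': ['Harbourfront', 'Regent Park', 'Lawrence Heights'],
})

# Join the neighbourhood names of each (Postcode, Borough) group into one string.
combined = (
    sample_df.groupby(['Postcode', 'Borough'])['Neighbourhood']
    .agg(', '.join)
    .reset_index()
)
```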

Assigning the borough name to neighbourhoods that are not assigned

In [14]:
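# Where the neighbourhood is 'Not assigned', take the borough name from the same row
df['Neighbourhood'] = df['Neighbourhood'].mask(df['Neighbourhood'] == 'Not assigned', df['Borough'])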
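
A small worked example of this fill on made-up rows, using Series.mask so that each unassigned neighbourhood takes the borough from its own row. The data and the name sample_df are illustrative only.

```python
import pandas as pd

# Hypothetical sample with one unassigned neighbourhood.
sample_df = pd.DataFrame({
    'Postcode': ['M7A', 'M3A'],
    'Borough': ["Queen's Park", 'North York'],
    'Neighbourhood': ['Not assigned', 'Parkwoods'],
})

# mask() replaces values where the condition is True, aligned row by row.
unassigned = sample_df['Neighbourhood'] == 'Not assigned'
sample_df['Neighbourhood'] = sample_df['Neighbourhood'].mask(unassigned, sample_df['Borough'])
```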

Printing the number of rows in the dataframe

In [15]:
df.shape[0]

103

Showing the resulting dataframe

In [16]:
df.head(12)

   Postcode  Borough      Neighbourhood
0  M1B       Scarborough  Rouge, Malvern
1  M1C       Scarborough  Highland Creek, Rouge Hill, Port Union
2  M1E       Scarborough  Guildwood, Morningside, West Hill
3  M1G       Scarborough  Woburn
4  M1H       Scarborough  Cedarbrae
5  M1J       Scarborough  Scarborough Village
6  M1K       Scarborough  East Birchmount Park, Ionview, Kennedy Park
7  M1L       Scarborough  Clairlea, Golden Mile, Oakridge
8  M1M       Scarborough  Cliffcrest, Cliffside, Scarborough Village West
9  M1N       Scarborough  Birch Cliff, Cliffside West


Part 2
Creating a dataframe with coordinates from the provided link, since the Geocoder package proved to be unreliable

In [17]:
coordinates = pd.read_csv('http://cocl.us/Geospatial_data')

Renaming the columns of the coordinates dataframe so that it can be merged with the initial dataframe

In [18]:
coordinates.columns = ['Postcode', 'Latitude', 'Longitude']
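
The column names can also be set while reading the file: passing header=0 together with names tells read_csv to discard the file's own header row and use the given labels. The sketch below reads from an in-memory sample rather than the link; the header labels in sample_csv are an assumption about the file, and sample_coords is a hypothetical name.

```python
import io

import pandas as pd

# Hypothetical two-row sample; the header labels are an assumption.
sample_csv = io.StringIO(
    "Postal Code,Latitude,Longitude\n"
    "M1B,43.806686,-79.194353\n"
    "M1C,43.784535,-79.160497\n"
)

# header=0 marks the first line as a header to replace with `names`.
sample_coords = pd.read_csv(
    sample_csv,
    header=0,
    names=['Postcode', 'Latitude', 'Longitude'],
)
```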

Merging the two dataframes on the 'Postcode' column

In [19]:
df = pd.merge(df, coordinates, on=['Postcode'], how='inner')
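
An inner merge silently drops postcodes that have no coordinates. One way to check for that, sketched here on made-up data, is a left merge with indicator=True, which adds a '_merge' column, plus validate='one_to_one' to confirm that each postcode appears once on both sides. The frames left and right and the postcode 'M9Z' are hypothetical.

```python
import pandas as pd

# Hypothetical frames; 'M9Z' has no matching coordinates row.
left = pd.DataFrame({
    'Postcode': ['M1B', 'M1C', 'M9Z'],
    'Borough': ['Scarborough', 'Scarborough', 'Etobicoke'],
})
right = pd.DataFrame({
    'Postcode': ['M1B', 'M1C'],
    'Latitude': [43.806686, 43.784535],
    'Longitude': [-79.194353, -79.160497],
})

# how='left' keeps every row of `left`; '_merge' records where each row matched.
checked = pd.merge(left, right, on='Postcode', how='left',
                   validate='one_to_one', indicator=True)

# Postcodes present only in `left`, i.e. without coordinates.
missing = checked.loc[checked['_merge'] == 'left_only', 'Postcode'].tolist()
print(missing)
```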

Showing the resulting dataframe

In [20]:
df.head(12)

   Postcode  Borough      Neighbourhood                                    Latitude   Longitude
0  M1B       Scarborough  Rouge, Malvern                                   43.806686  -79.194353
1  M1C       Scarborough  Highland Creek, Rouge Hill, Port Union           43.784535  -79.160497
2  M1E       Scarborough  Guildwood, Morningside, West Hill                43.763573  -79.188711
3  M1G       Scarborough  Woburn                                           43.770992  -79.216917
4  M1H       Scarborough  Cedarbrae                                        43.773136  -79.239476
5  M1J       Scarborough  Scarborough Village                              43.744734  -79.239476
6  M1K       Scarborough  East Birchmount Park, Ionview, Kennedy Park      43.727929  -79.262029
7  M1L       Scarborough  Clairlea, Golden Mile, Oakridge                  43.711112  -79.284577
8  M1M       Scarborough  Cliffcrest, Cliffside, Scarborough Village West  43.716316  -79.239476
9  M1N       Scarborough  Birch Cliff, Cliffside West                      43.692657  -79.264848
