# Creating a database with a list of the issue of new land title deeds in Kenya.

This is a project scraping HTML pages of weekly Kenya Gazette notices starting from 2010 in order to retrieve and categorise land-related notices. The result is a database with more than 200,000 rows of land-related notice entries.


### Getting the Data
* [Weekly Kenya Gazette notices](http://kenyalaw.org/kenya_gazette/gazette)
* [Districts, areas and population](http://www.statoids.com/yke.html)
* [Shapefile of Kenyan provinces](https://www.igismap.com/kenya-shapefile-download-boundary-line-administrative-state-and-polygon/)

## The Kenya Gazette

## This is what the home page of the Kenya Gazette looks like

![Home Page](screenshot_of_home_page.png)

## This is what the landing page for one year of gazette notices looks like

![Landing Page for 2022](screenshot_of_2022_page.png)

# Setting Everything Up

In [1]:
#import the file from the web
import requests
from bs4 import BeautifulSoup
import pandas as pd
from unicodedata import normalize
import re

# I can give a number or use None to remove maximum ceiling & display all columns
pd.options.display.max_columns = None

# I want to be able to see the entire narrative, so remove the maximum width for each column
pd.options.display.max_colwidth = None



from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry


#To resolve retry http error
session = requests.Session()
retry = Retry(connect=3, backoff_factor=0.5)
adapter = HTTPAdapter(max_retries=retry)
session.mount('http://', adapter)
session.mount('https://', adapter)

### Outlining the years of Interest

In [2]:
#From 2010 to 2020
#Weekly issues

#Every url for a year's worth of gazette notices starts with this url_year_base below

url_year_base="http://kenyalaw.org/kenya_gazette/gazette/year/"

#I am trying to scrape for the years with htmls, that is 2010 to 2020
#Getting the range from 2010 to 2021 so 2020 can be included, then setting it into a list

years=list(range(2010,2021))
years

[2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020]

# Getting the urls for each year 

In [3]:
#to get each url for each year containing a list of the year's gazette notices

year_link_dict_list=[]

for year in years:
    year_link_dict={}
    url_year=url_year_base+str(year)
#     print(url_year)
    year_link_dict['year']=year
    year_link_dict['url_year']=url_year
    year_link_dict_list.append(year_link_dict)
    
url_year_links = [x['url_year'] for x in year_link_dict_list]

In [4]:
#To see a list of dictionaries each with a year and its url
year_link_dict_list

[{'year': 2010,
  'url_year': 'http://kenyalaw.org/kenya_gazette/gazette/year/2010'},
 {'year': 2011,
  'url_year': 'http://kenyalaw.org/kenya_gazette/gazette/year/2011'},
 {'year': 2012,
  'url_year': 'http://kenyalaw.org/kenya_gazette/gazette/year/2012'},
 {'year': 2013,
  'url_year': 'http://kenyalaw.org/kenya_gazette/gazette/year/2013'},
 {'year': 2014,
  'url_year': 'http://kenyalaw.org/kenya_gazette/gazette/year/2014'},
 {'year': 2015,
  'url_year': 'http://kenyalaw.org/kenya_gazette/gazette/year/2015'},
 {'year': 2016,
  'url_year': 'http://kenyalaw.org/kenya_gazette/gazette/year/2016'},
 {'year': 2017,
  'url_year': 'http://kenyalaw.org/kenya_gazette/gazette/year/2017'},
 {'year': 2018,
  'url_year': 'http://kenyalaw.org/kenya_gazette/gazette/year/2018'},
 {'year': 2019,
  'url_year': 'http://kenyalaw.org/kenya_gazette/gazette/year/2019'},
 {'year': 2020,
  'url_year': 'http://kenyalaw.org/kenya_gazette/gazette/year/2020'}]

# Getting the HTML content on each year's page for all gazette notices within it. 

### Here, as a list of dictionaries, I get the following information:
* year
* url for the year
* url for each gazette notice within the year.

In [5]:
    
for year_link_dict in year_link_dict_list:
    print("$$$$$$$$$$$$")
    url_year_link=year_link_dict['url_year']
    print(url_year_link)
    year_raw_html = requests.get(url_year_link).content
    print(type(year_raw_html))
    
    #assign year_urls_soup_doc as the doc holding parsed html 
    #learn type of year_urls_soup_doc
    year_urls_soup_doc = BeautifulSoup(year_raw_html, "html.parser")
    print(type(year_urls_soup_doc))
    print("________")
    
    
    #These are the links on the page for all gazette notices
    links_within_year=year_urls_soup_doc.select('#content')

    #Both weekly and special gazette notices sections
    sections_within_year = str(links_within_year).split('<p>') 
    
    #To get weekly issues only use sections_within_year[1]
    #Then split it by tr to get each entry of a gazette notice
    sections_within_year[1].split('<tr>')
    weekly_section=sections_within_year[1]
    weekly_section_entries=weekly_section.split('<tr>')
    
    
    link_href_list=[]

    for link_within_section in weekly_section_entries:
        try:
            link_href=link_within_section.split('<td>')[2].split('"')[1]
#             print(link_href)
#             print("______")
            link_href_list.append(link_href)
            year_link_dict['gazette_links']=link_href_list

        except:
            pass

#     print(year_link_dict)
    
print(year_link_dict_list[0])

$$$$$$$$$$$$
http://kenyalaw.org/kenya_gazette/gazette/year/2010
<class 'bytes'>
<class 'bs4.BeautifulSoup'>
________
$$$$$$$$$$$$
http://kenyalaw.org/kenya_gazette/gazette/year/2011
<class 'bytes'>
<class 'bs4.BeautifulSoup'>
________
$$$$$$$$$$$$
http://kenyalaw.org/kenya_gazette/gazette/year/2012
<class 'bytes'>
<class 'bs4.BeautifulSoup'>
________
$$$$$$$$$$$$
http://kenyalaw.org/kenya_gazette/gazette/year/2013
<class 'bytes'>
<class 'bs4.BeautifulSoup'>
________
$$$$$$$$$$$$
http://kenyalaw.org/kenya_gazette/gazette/year/2014
<class 'bytes'>
<class 'bs4.BeautifulSoup'>
________
$$$$$$$$$$$$
http://kenyalaw.org/kenya_gazette/gazette/year/2015
<class 'bytes'>
<class 'bs4.BeautifulSoup'>
________
$$$$$$$$$$$$
http://kenyalaw.org/kenya_gazette/gazette/year/2016
<class 'bytes'>
<class 'bs4.BeautifulSoup'>
________
$$$$$$$$$$$$
http://kenyalaw.org/kenya_gazette/gazette/year/2017
<class 'bytes'>
<class 'bs4.BeautifulSoup'>
________
$$$$$$$$$$$$
http://kenyalaw.org/kenya_gazette/gazette/y

# Getting the list of dictionaries for each year into a pandas dataframe

In [6]:
df_links_list=pd.DataFrame(year_link_dict_list)
df_links_list.head(3)

Unnamed: 0,year,url_year,gazette_links
0,2010,http://kenyalaw.org/kenya_gazette/gazette/year/2010,"[http://kenyalaw.org/kenya_gazette/gazette/volume/Mjg0/Vol. CXII - No. 135/, http://kenyalaw.org/kenya_gazette/gazette/volume/Mjc0/Vol. CXII - No. 133/, http://kenyalaw.org/kenya_gazette/gazette/volume/Mjcy/Vol. CXII - No. 131/, http://kenyalaw.org/kenya_gazette/gazette/volume/Mjcw/Vol. CXII - No. 128/, http://kenyalaw.org/kenya_gazette/gazette/volume/MjY5/Vol. CXII - No. 125/, http://kenyalaw.org/kenya_gazette/gazette/volume/MjY4/Vol. CXII - No. 123/, http://kenyalaw.org/kenya_gazette/gazette/volume/MjY3/Vol. CXII - No. 120/, http://kenyalaw.org/kenya_gazette/gazette/volume/MjYx/Vol. CXII - No. 114/, http://kenyalaw.org/kenya_gazette/gazette/volume/MjU1/Vol. CXII - No. 108/, http://kenyalaw.org/kenya_gazette/gazette/volume/MjU2/Vol. CXII - No. 109/, http://kenyalaw.org/kenya_gazette/gazette/volume/MjUy/Vol. CXII - No. 105/, http://kenyalaw.org/kenya_gazette/gazette/volume/MjQ4/Vol. CXII - No. 101/, http://kenyalaw.org/kenya_gazette/gazette/volume/MjQ1/Vol. CXII - No. 99/, http://kenyalaw.org/kenya_gazette/gazette/volume/MjQx/Vol. CXII - No. 95/, http://kenyalaw.org/kenya_gazette/gazette/volume/MjM4/Vol. CXII - No. 93/, http://kenyalaw.org/kenya_gazette/gazette/volume/MjM2/Vol. CXII - No. 91/, http://kenyalaw.org/kenya_gazette/gazette/volume/MjM1/Vol. CXII - No. 90/, http://kenyalaw.org/kenya_gazette/gazette/volume/MjMz/Vol. CXII - No. 88/, http://kenyalaw.org/kenya_gazette/gazette/volume/MjMx/Vol. CXII - No. 85/, http://kenyalaw.org/kenya_gazette/gazette/volume/MjI5/Vol. CXII - No. 83/, http://kenyalaw.org/kenya_gazette/gazette/volume/MjI1/Vol. CXII - No. 79/, http://kenyalaw.org/kenya_gazette/gazette/volume/MjIz/Vol. CXII - No. 77/, http://kenyalaw.org/kenya_gazette/gazette/volume/MjE5/Vol. CXII - No. 73/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTk3/Vol. CXII - No. 70/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTk4/Vol. CXII - No. 69/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTk5/Vol. CXII - No. 67/, http://kenyalaw.org/kenya_gazette/gazette/volume/MjAx/Vol. CXII - No. 65/, http://kenyalaw.org/kenya_gazette/gazette/volume/MjE2/Vol. CXII - No. 62/, http://kenyalaw.org/kenya_gazette/gazette/volume/MjA1/Vol. CXII - No. 61/, http://kenyalaw.org/kenya_gazette/gazette/volume/MjA3/Vol. CXII - No. 58/, http://kenyalaw.org/kenya_gazette/gazette/volume/MjA4/Vol. CXII - No. 56/, http://kenyalaw.org/kenya_gazette/gazette/volume/MjA5/Vol. CXII - No. 55/, http://kenyalaw.org/kenya_gazette/gazette/volume/MjEx/Vol. CXII - No. 53/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTky/Vol. CXII - No. 49/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTg5/Vol. CXII - No. 46/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTg3/Vol. CXII - No. 44/, http://kenyalaw.org/kenya_gazette/gazette/volume/MjE1/Vol. CXII - No. 40/, http://kenyalaw.org/kenya_gazette/gazette/volume/MjE0/Vol. CXII - No. 38/, http://kenyalaw.org/kenya_gazette/gazette/volume/Mjc3/Vol. CXII - No. 37/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTc4/Vol. CXII - No. 34/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTcz/Vol. CXII - No. 30/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTcx/Vol. CXII - No. 28/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTY4/Vol. CXII - No. 24/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTY2/Vol. CXII - No. 22/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTY1/Vol. CXII - No. 21/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTU5/Vol. CXII - No. 18/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTU1/Vol. CXII - No. 15/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTU3/Vol. CXII - No. 16/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTUz/Vol. CXII - No. 13/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTUy/Vol. CXII - No. 12/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTUx/Vol. CXII - No. 11/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTUw/Vol. CXII - No. 10/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTQ4/Vol. CXII - No. 8/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTQ1/Vol. CXII - No. 5/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTQz/Vol. CXII - No. 3/]"
1,2011,http://kenyalaw.org/kenya_gazette/gazette/year/2011,"[http://kenyalaw.org/kenya_gazette/gazette/volume/OTQ4/Vol.CXIII-No.128/, http://kenyalaw.org/kenya_gazette/gazette/volume/OTk0/Vol.CXIII-No.125/, http://kenyalaw.org/kenya_gazette/gazette/volume/OTky/Vol. CXV - No. 123/, http://kenyalaw.org/kenya_gazette/gazette/volume/OTk3/Vol.CXIII-No.122/, http://kenyalaw.org/kenya_gazette/gazette/volume/OTg5/Vol.CXIII-No.119/, http://kenyalaw.org/kenya_gazette/gazette/volume/OTkw/Vol. CXV - No. 120/, http://kenyalaw.org/kenya_gazette/gazette/volume/OTg3/Vol.CXIII-No.117/, http://kenyalaw.org/kenya_gazette/gazette/volume/OTg1/Vol.CXIII-No.114/, http://kenyalaw.org/kenya_gazette/gazette/volume/OTg0/Vol. CXV - No. 113/, http://kenyalaw.org/kenya_gazette/gazette/volume/OTgz/Vol.CXIII-No.112/, http://kenyalaw.org/kenya_gazette/gazette/volume/OTgy/Vol. CXV - No. 111/, http://kenyalaw.org/kenya_gazette/gazette/volume/OTgx/Vol.CXIII-No.110/, http://kenyalaw.org/kenya_gazette/gazette/volume/OTgw/Vol. CXV - No. 109/, http://kenyalaw.org/kenya_gazette/gazette/volume/OTc5/Vol.CXIII-No.108/, http://kenyalaw.org/kenya_gazette/gazette/volume/OTc3/Vol.CXIII-No.106/, http://kenyalaw.org/kenya_gazette/gazette/volume/OTc1/Vol.CXIII-No.104/, http://kenyalaw.org/kenya_gazette/gazette/volume/OTIy/Vol.CXIII-No.102/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTA0Nw--/Vol. CXIII - No. 103/, http://kenyalaw.org/kenya_gazette/gazette/volume/MjA3OA--/Vol.CXIII-No.99/, http://kenyalaw.org/kenya_gazette/gazette/volume/OTc0/Vol. CXV - No. 101/, http://kenyalaw.org/kenya_gazette/gazette/volume/OTcy/Vol.CXIII-No.94/, http://kenyalaw.org/kenya_gazette/gazette/volume/OTcx/Vol.CXIII-No.92/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTA0NA--/Vol.CXIII-No.90/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTA0Mg--/Vol.CXIII-No.87/, http://kenyalaw.org/kenya_gazette/gazette/volume/OTU0/Vol.CXIII-No.85/, http://kenyalaw.org/kenya_gazette/gazette/volume/OTcw/Vol.CXIII-No.81/, http://kenyalaw.org/kenya_gazette/gazette/volume/OTY5/Vol.CXIII-No.79/, http://kenyalaw.org/kenya_gazette/gazette/volume/OTY4/Vol.CXIII-No.76/, http://kenyalaw.org/kenya_gazette/gazette/volume/OTM2/Vol.CXIII-No.71/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTA0MQ--/Vol.CXIII-No.67/, http://kenyalaw.org/kenya_gazette/gazette/volume/OTYz/Vol.CXIII-No.63/, http://kenyalaw.org/kenya_gazette/gazette/volume/OTYx/Vol.CXIII-No.61/, http://kenyalaw.org/kenya_gazette/gazette/volume/OTMz/Vol.CXIII-No.59/, http://kenyalaw.org/kenya_gazette/gazette/volume/OTMy/Vol.CXIII-No.57/, http://kenyalaw.org/kenya_gazette/gazette/volume/OTMx/Vol.CXIII-No.56/, http://kenyalaw.org/kenya_gazette/gazette/volume/OTU4/Vol.CXIII-No.52/, http://kenyalaw.org/kenya_gazette/gazette/volume/OTQ5/Vol.CXIII-No.48/, http://kenyalaw.org/kenya_gazette/gazette/volume/OTA2/Vol.CXIII-No.47/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTA0MA--/Vol.CXIII-No.46/, http://kenyalaw.org/kenya_gazette/gazette/volume/OTUw/Vol.CXIII-No.44/, http://kenyalaw.org/kenya_gazette/gazette/volume/OTAz/Vol.CXIII-No.41/, http://kenyalaw.org/kenya_gazette/gazette/volume/OTAx/Vol.CXIII-No.39/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTA0OQ--/Vol.CXIII-No.36/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTAzOA--/Vol.CXIII-No.35/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTAzNQ--/Vol.CXIII-No.33/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTAwOA--/Vol.CXIII-No.30/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTAzMA--/Vol.CXIII-No.28/, http://kenyalaw.org/kenya_gazette/gazette/volume/OTU2/Vol.CXIII-No.26/, http://kenyalaw.org/kenya_gazette/gazette/volume/OTU1/Vol.CXIII-No.24/, http://kenyalaw.org/kenya_gazette/gazette/volume/OTI0/Vol.CXIII-No.21/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTI1/Vol.CXIII-No.19/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTIy/Vol.CXIII-No.16/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTIw/Vol.CXIII-No.14/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTE4/Vol.CXIII-No.12/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTE2/Vol.CXIII-No.10/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTE3/Vol. CXIII - No. 11/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTEz/Vol.CXIII-No.7/, http://kenyalaw.org/kenya_gazette/gazette/volume/OTY-/Vol.CXIII-No.4/, http://kenyalaw.org/kenya_gazette/gazette/volume/OTE-/Vol.CXIII-No.1/]"
2,2012,http://kenyalaw.org/kenya_gazette/gazette/year/2012,"[http://kenyalaw.org/kenya_gazette/gazette/volume/NTI-/Vol.CXIV-No.107/, http://kenyalaw.org/kenya_gazette/gazette/volume/NTE-/Vol.CXIV-No.104/, http://kenyalaw.org/kenya_gazette/gazette/volume/Njg-/Vol.CXIV-No.102/, http://kenyalaw.org/kenya_gazette/gazette/volume/NTA-/Vol.CXIV-No.99/, http://kenyalaw.org/kenya_gazette/gazette/volume/NDg-/Vol.CXIV-No.93/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTEx/Vol.CXIV-No.92/, http://kenyalaw.org/kenya_gazette/gazette/volume/NDQ-/Vol.CXIV-No.84/, http://kenyalaw.org/kenya_gazette/gazette/volume/NDM-/Vol.CXIV-No.83/, http://kenyalaw.org/kenya_gazette/gazette/volume/NDE-/Vol.CXIV-No.81/, http://kenyalaw.org/kenya_gazette/gazette/volume/NDA-/Vol.CXIV-No.78/, http://kenyalaw.org/kenya_gazette/gazette/volume/Mzk-/Vol.CXIV-No.76/, http://kenyalaw.org/kenya_gazette/gazette/volume/Mzc-/Vol.CXIV-No.75/, http://kenyalaw.org/kenya_gazette/gazette/volume/MzY-/Vol.CXIV-No.73/, http://kenyalaw.org/kenya_gazette/gazette/volume/MzQ-/Vol.CXIV-No.49/, http://kenyalaw.org/kenya_gazette/gazette/volume/MzM-/Vol.CXIV-No.47/, http://kenyalaw.org/kenya_gazette/gazette/volume/MzE-/Vol.CXIV-No.42/, http://kenyalaw.org/kenya_gazette/gazette/volume/MzA-/Vol.CXIV-No.39/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTI3/Vol.CXIV-No.35/, http://kenyalaw.org/kenya_gazette/gazette/volume/Mjg-/Vol.CXIV-No.37/, http://kenyalaw.org/kenya_gazette/gazette/volume/MjQ-/Vol.CXIV-No.33/, http://kenyalaw.org/kenya_gazette/gazette/volume/MjM-/Vol.CXIV-No.31/, http://kenyalaw.org/kenya_gazette/gazette/volume/NTM-/Vol.CXIV-No.28/, http://kenyalaw.org/kenya_gazette/gazette/volume/MjI-/Vol.CXIV-No.27/, http://kenyalaw.org/kenya_gazette/gazette/volume/MjE-/Vol.CXIV-No.25/, http://kenyalaw.org/kenya_gazette/gazette/volume/MjA-/Vol.CXIV-No.22/, http://kenyalaw.org/kenya_gazette/gazette/volume/ODk-/Vol.CXIV-No.24/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTg-/Vol.CXIV-No.20/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTU-/Vol.CXIV-No.18/, http://kenyalaw.org/kenya_gazette/gazette/volume/ODg-/Vol.CXIV-No.17/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTM-/Vol.CXIV-No.15/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTE-/Vol.CXIV-No.13/, http://kenyalaw.org/kenya_gazette/gazette/volume/OQ--/Vol.CXIV-No.11/, http://kenyalaw.org/kenya_gazette/gazette/volume/OA--/Vol.CXIV-No.9/, http://kenyalaw.org/kenya_gazette/gazette/volume/Nw--/Vol.CXIV-No.6/, http://kenyalaw.org/kenya_gazette/gazette/volume/Ng--/Vol.CXIV-No.3/, http://kenyalaw.org/kenya_gazette/gazette/volume/Mw--/Vol.CXIV-No.2/, http://kenyalaw.org/kenya_gazette/gazette/volume/Mg--/Vol.CXIV-No.1/]"
3,2013,http://kenyalaw.org/kenya_gazette/gazette/year/2013,"[http://kenyalaw.org/kenya_gazette/gazette/volume/NzY5/Vol. CXV-No. 180/, http://kenyalaw.org/kenya_gazette/gazette/volume/NzY1/Vol. CXV-No. 178/, http://kenyalaw.org/kenya_gazette/gazette/volume/NzE0/Vol. CXV-No. 173/, http://kenyalaw.org/kenya_gazette/gazette/volume/NzA2/Vol. CXV-No. 170/, http://kenyalaw.org/kenya_gazette/gazette/volume/Njkw/Vol. CXV-No. 167/, http://kenyalaw.org/kenya_gazette/gazette/volume/Njgz/Vol. CXV-No. 163/, http://kenyalaw.org/kenya_gazette/gazette/volume/Njgy/Vol. CXV-No. 161/, http://kenyalaw.org/kenya_gazette/gazette/volume/NjYy/Vol. CXV-No. 158/, http://kenyalaw.org/kenya_gazette/gazette/volume/NjQ5/Vol. CXV-No. 157/, http://kenyalaw.org/kenya_gazette/gazette/volume/NjQ0/Vol. CXV-No. 152/, http://kenyalaw.org/kenya_gazette/gazette/volume/MzE4/Vol. CXV-No. 151/, http://kenyalaw.org/kenya_gazette/gazette/volume/MzE1/Vol. CXV-No. 147/, http://kenyalaw.org/kenya_gazette/gazette/volume/MzEz/Vol. CXV-No. 144/, http://kenyalaw.org/kenya_gazette/gazette/volume/MzEw/Vol. CXV-No. 139/, http://kenyalaw.org/kenya_gazette/gazette/volume/MzAx/Vol. CXV-No. 138/, http://kenyalaw.org/kenya_gazette/gazette/volume/Mjk2/Vol. CXV-No. 133/, http://kenyalaw.org/kenya_gazette/gazette/volume/MzE2/Vol. CXV-No. 131/, http://kenyalaw.org/kenya_gazette/gazette/volume/MzEy/Vol. CXV-No. 126/, http://kenyalaw.org/kenya_gazette/gazette/volume/MzE0/Vol. CXV-No. 121/, http://kenyalaw.org/kenya_gazette/gazette/volume/MzE3/Vol. CXV-No. 117/, http://kenyalaw.org/kenya_gazette/gazette/volume/NjUw/Vol. CXV-No. 114/, http://kenyalaw.org/kenya_gazette/gazette/volume/NjQ2/Vol. CXV-No. 112/, http://kenyalaw.org/kenya_gazette/gazette/volume/NjQ3/Vol. CXV-No. 109/, http://kenyalaw.org/kenya_gazette/gazette/volume/NjUx/Vol. CXV-No. 106/, http://kenyalaw.org/kenya_gazette/gazette/volume/NjUy/Vol. CXV-No. 103/, http://kenyalaw.org/kenya_gazette/gazette/volume/NjUz/Vol. CXV-No. 99/, http://kenyalaw.org/kenya_gazette/gazette/volume/NjU0/Vol. CXV-No. 96/, http://kenyalaw.org/kenya_gazette/gazette/volume/NjU1/Vol. CXV-No. 92/, http://kenyalaw.org/kenya_gazette/gazette/volume/NjU2/Vol. CXV-No. 89/, http://kenyalaw.org/kenya_gazette/gazette/volume/NjU3/Vol. CXV-No. 85/, http://kenyalaw.org/kenya_gazette/gazette/volume/NjU4/Vol. CXV-No. 81/, http://kenyalaw.org/kenya_gazette/gazette/volume/NjU5/Vol. CXV-No. 77/, http://kenyalaw.org/kenya_gazette/gazette/volume/NjYx/Vol. CXV-No. 75/, http://kenyalaw.org/kenya_gazette/gazette/volume/Njcx/Vol. CXV-No. 70/, http://kenyalaw.org/kenya_gazette/gazette/volume/Njcy/Vol. CXV-No. 67/, http://kenyalaw.org/kenya_gazette/gazette/volume/Njc2/Vol. CXV-No. 65/, http://kenyalaw.org/kenya_gazette/gazette/volume/Njc3/Vol. CXV-No. 62/, http://kenyalaw.org/kenya_gazette/gazette/volume/Njc5/Vol. CXV-No. 59/, http://kenyalaw.org/kenya_gazette/gazette/volume/Njc4/Vol. CXV-No. 57/, http://kenyalaw.org/kenya_gazette/gazette/volume/Njgw/Vol. CXV-No. 55/, http://kenyalaw.org/kenya_gazette/gazette/volume/Njcw/Vol. CXV-No. 51/, http://kenyalaw.org/kenya_gazette/gazette/volume/NjY5/Vol. CXV-No. 46/, http://kenyalaw.org/kenya_gazette/gazette/volume/NjY3/Vol. CXV-No. 39/, http://kenyalaw.org/kenya_gazette/gazette/volume/Njgx/Vol. CXV-No. 32/, http://kenyalaw.org/kenya_gazette/gazette/volume/Njg0/Vol. CXV-No. 28/, http://kenyalaw.org/kenya_gazette/gazette/volume/Njcz/Vol. CXV-No. 22/, http://kenyalaw.org/kenya_gazette/gazette/volume/NjY4/Vol. CXV-No. 18/, http://kenyalaw.org/kenya_gazette/gazette/volume/NjY1/Vol. CXV-No. 13/, http://kenyalaw.org/kenya_gazette/gazette/volume/Njg5/Vol. CXV-No. 10/, http://kenyalaw.org/kenya_gazette/gazette/volume/Njg4/Vol. CXV-No. 8/, http://kenyalaw.org/kenya_gazette/gazette/volume/Njg3/Vol. CXV-No. 3/, http://kenyalaw.org/kenya_gazette/gazette/volume/MzAz/Vol. CXV-No. 1/]"
4,2014,http://kenyalaw.org/kenya_gazette/gazette/year/2014,"[http://kenyalaw.org/kenya_gazette/gazette/volume/MTEwNg--/Vol.CXVI-No.151/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTEwMg--/Vol.CXVI-No.148/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTEwMQ--/Vol.CXVI-No.145/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTEwMA--/Vol.CXVI-No.143/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTA5OQ--/Vol.CXVI-No.141/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTA5Nw--/Vol.CXVI-No.139/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTA5NQ--/Vol.CXVI-No.135/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTA5NA--/Vol.CXVI-No.133/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTA5Mg--/Vol.CXVI-No.129/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTA5MA--/Vol.CXVI-No.126/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTA4OA--/Vol.CXVI.No.123/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTA4Ng--/Vol.CXVI-No.122/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTA4NQ--/Vol.CXVI-No.118/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTA3Nw--/Vol.CXVI-No.115/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTA3Ng--/Vol.CXVI-No.112/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTA3NA--/Vol.CXVI-No.110/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTA3NQ--/Vol.CXVI-No.106/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTA3Mg--/Vol.CXVI-No.103/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTA3Mw--/Vol.CXVI-No.101/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTA2NQ--/Vol.CXVI-No.98/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTA2Mg--/Vol.CXVI-No.94/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTA1Nw--/Vol.CXVI-No.91/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTA1Ng--/Vol.CXVI-No.87/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTA1Mw--/Vol.CXVI-No.85/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTA0Ng--/Vol.CXVI-No.80/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTAzMQ--/Vol.CXVI-No.79/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTAyNQ--/Vol.CXVI-No.76/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTAyMg--/Vol.CXVI-No.74/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTAxNA--/Vol.CXVI-No.71/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTAxMw--/Vol.CXVI-No.70/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTAxMA--/Vol. CXVI-No. 67/, http://kenyalaw.org/kenya_gazette/gazette/volume/MTAwNQ--/Vol. CXVI-No. 65/, http://kenyalaw.org/kenya_gazette/gazette/volume/ODg5/Vol. CXVI-No. 63/, http://kenyalaw.org/kenya_gazette/gazette/volume/ODg2/Vol. CXVI-No. 59/, http://kenyalaw.org/kenya_gazette/gazette/volume/ODgz/Vol. CXVI-No. 56/, http://kenyalaw.org/kenya_gazette/gazette/volume/ODgx/Vol. CXVI-No. 52/, http://kenyalaw.org/kenya_gazette/gazette/volume/ODc3/Vol. CXVI-No. 49/, http://kenyalaw.org/kenya_gazette/gazette/volume/ODc2/Vol. CXVI-No. 46/, http://kenyalaw.org/kenya_gazette/gazette/volume/ODcw/Vol. CXVI-No. 43/, http://kenyalaw.org/kenya_gazette/gazette/volume/ODY5/Vol. CXVI-No. 39/, http://kenyalaw.org/kenya_gazette/gazette/volume/ODY0/Vol. CXVI-No. 37/, http://kenyalaw.org/kenya_gazette/gazette/volume/ODYz/Vol. CXVI-No. 33/, http://kenyalaw.org/kenya_gazette/gazette/volume/ODYw/Vol. CXVI-No. 32/, http://kenyalaw.org/kenya_gazette/gazette/volume/ODU3/Vol. CXVI-No. 29/, http://kenyalaw.org/kenya_gazette/gazette/volume/ODQ3/Vol. CXVI-No. 25/, http://kenyalaw.org/kenya_gazette/gazette/volume/ODQx/Vol. CXVI-No. 22/, http://kenyalaw.org/kenya_gazette/gazette/volume/ODQw/Vol. CXVI-No. 18/, http://kenyalaw.org/kenya_gazette/gazette/volume/ODM5/Vol. CXVI-No. 12/, http://kenyalaw.org/kenya_gazette/gazette/volume/ODM3/Vol. CXVI-No. 10/, http://kenyalaw.org/kenya_gazette/gazette/volume/ODMy/Vol. CXVI-No. 7/, http://kenyalaw.org/kenya_gazette/gazette/volume/Nzc5/Vol.CXVI-No.4/, http://kenyalaw.org/kenya_gazette/gazette/volume/Nzc1/Vol.CXVI-No.1/]"


### Explode this so that each html for a weekly gazette notice gets its own row

In [7]:
df_links = df_links_list.explode("gazette_links").reset_index()
df_links.head()

Unnamed: 0,index,year,url_year,gazette_links
0,0,2010,http://kenyalaw.org/kenya_gazette/gazette/year/2010,http://kenyalaw.org/kenya_gazette/gazette/volume/Mjg0/Vol. CXII - No. 135/
1,0,2010,http://kenyalaw.org/kenya_gazette/gazette/year/2010,http://kenyalaw.org/kenya_gazette/gazette/volume/Mjc0/Vol. CXII - No. 133/
2,0,2010,http://kenyalaw.org/kenya_gazette/gazette/year/2010,http://kenyalaw.org/kenya_gazette/gazette/volume/Mjcy/Vol. CXII - No. 131/
3,0,2010,http://kenyalaw.org/kenya_gazette/gazette/year/2010,http://kenyalaw.org/kenya_gazette/gazette/volume/Mjcw/Vol. CXII - No. 128/
4,0,2010,http://kenyalaw.org/kenya_gazette/gazette/year/2010,http://kenyalaw.org/kenya_gazette/gazette/volume/MjY5/Vol. CXII - No. 125/


In [8]:
#save the result to a csv file
df_links.to_csv("titles_links.csv")

## This is what the one section of land-related notices in the Weekly Kenya Gazette looks like

![One section of land-related notices](one_section_of_land_notices_in_weekly_gazette.png)

# Create a function that scrapes a url for a gazette notice to extract the following information:

* volume number: volume_num
* volume date: volume_date
* volume url: volume_url
* title of notice's number: notice_num_title
* title of notice's act: notice_act_title
* notice's capital and section: notice_num_year
* notice's subtitle: notice_sub_title
* notice's full body of info: notice_body
* specific notice's date: notice_date
* name of notice's registrar: notice_registrar_name
* number and location of notice: notice_num_loc
* notice's location: notice_loc

In [9]:
def scrape_url(my_url):
    try: 
        raw_html = requests.get(my_url).content

        #assign soup_doc as the doc holding parsed html 
        #learn type of soup_doc
        soup_doc = BeautifulSoup(raw_html, "html.parser")

        #Finding volume number to keep track of what data is from what source in larger csv
        
            
        

        # creating a list of all divs with id starting with GAZETTE NOTICE
        list_all_divs= [elem.parent.parent for elem in soup_doc.find_all("em", text=re.compile("(Land Registrar|Registrar of Land|Registrar of Titles)", re.IGNORECASE))]
        #The parent of p is a div
        #print("Found", len(list_all_divs), "potentials")
        #create function for making a dictionary out of each entry within the list of all divs
        def class_to_dictionary(list_of_items):
            #initialise keys of dictionary
            dict_titles=['volume_num',
                     'volume_date',
                     'volume_url',
                     'notice_num_title',
                     'notice_act_title',
                     'notice_num_year',
                     'notice_sub_title',
                     'notice_body',
                     'notice_date',
                     'notice_registrar_name',
                     'notice_num_loc',
                     'notice_loc']
            
            #initialise empty list which will contain all new notice entries
            all_new_notice_entries=[]
            
            #For one entry within the list of items which is a variable you will pass in the function
            #list_of_items will usually be the large capsules containing gazette entries
            #mostly for the ones I have seen there are two such capsules among these that have land registration act entries
            
            for one_item in list_of_items:
                
                
                #if the words LAND REGISTRATION ACT or Land Registrar appear in the text
                if "LAND REGISTRATION ACT" in one_item.text or "Land Registrar" in one_item.text:

                    #Then split one_item at the <hr/> point
                    entry_deets = str(one_item).split('<hr/>') 
                    
                    #now entry_deets is a list of notices within the capsule
                    for entry_deet in entry_deets:
                        
                        #one entry_deet is a notice within the capsule
                        
                        #initialise a list which I plan to zip to dict_titles to make a dictionary
                        entry_deet_full_list=[]
                        
                        #since I changed to string to split on hr
                        #I need to return the content to beautiful soup classes

                        entry_deet_soup = BeautifulSoup(entry_deet, "html.parser")
                        
                        #now make a list of all p's within the gazette notice
                        entry_deet_paragraphs = entry_deet_soup.find_all('p')
                        
                        #When converting to text I kept getting Xa0
                        # Normalize unicode data like this
        #               # normalize("NFKD", x.text)
                        # https://stackoverflow.com/questions/10993612/how-to-remove-xa0-from-string-in-python
                        entry_deet_paragraphs = [normalize("NFKD", x.text) for x in entry_deet_paragraphs if x.text.strip()!='']
                        if len(entry_deet_paragraphs) == 0:
                            continue


                        #Introducing number, date and url
                        #Then the rest comes from 
                        volume_num=soup_doc.select(".gazette-content #content div")[2].text
                        volume_date=soup_doc.select(".gazette-content #content div")[3].text
                        volume_url=my_url
                        notice_num_title=entry_deet_paragraphs[0]
                        notice_act_title=entry_deet_paragraphs[1]
                        notice_num_year=entry_deet_paragraphs[2]
                        notice_sub_title=entry_deet_paragraphs[3]
                        notice_body=entry_deet_paragraphs[4]
                        notice_date=entry_deet_paragraphs[5]
                        notice_registrar_name=entry_deet_paragraphs[6]
                        notice_num_loc=entry_deet_paragraphs[7]
                        try:
                            notice_loc=(entry_deet_soup.find_all('p')[-1]).text.split(',')[1]
                        except:
                            notice_loc="Na"
                        
                        entry_deet_full_list.append(volume_num)
                        entry_deet_full_list.append(volume_date)
                        entry_deet_full_list.append(volume_url)
                        entry_deet_full_list.append(notice_num_title)
                        entry_deet_full_list.append(notice_act_title)
                        entry_deet_full_list.append(notice_num_year)
                        entry_deet_full_list.append(notice_sub_title)
                        entry_deet_full_list.append(notice_body)
                        entry_deet_full_list.append(notice_date)
                        entry_deet_full_list.append(notice_registrar_name)
                        entry_deet_full_list.append(notice_num_loc)
                        entry_deet_full_list.append(notice_loc)



        #             #     print(new_deet_list)

                        first_rep = dict(zip(dict_titles,entry_deet_full_list))
        #         #         print(first_rep)

        #         #         print("######")

                        all_new_notice_entries.append(first_rep)
            return all_new_notice_entries


        data = class_to_dictionary(list_all_divs)
        return data
    except Exception as e:
#         raise e
        pass

#### Example scrape for one url using the function created

In [10]:
#read csv
#loop through links
# my_url = "http://kenyalaw.org/kenya_gazette/gazette/volume/MjIzMA--/Vol.CXXII-No.199/"

#Get the list of dictionaries
data = scrape_url("http://kenyalaw.org/kenya_gazette/gazette/volume/MjIzMA--/Vol.CXXII-No.199/")
data
#convert to a dataframe
# pd.DataFrame(data)

#save the resulting large dataframe to csv

In [11]:
scrape_url('http://kenyalaw.org/kenya_gazette/gazette/volume/Mjg0/Vol. CXII - No. 135/')

[{'volume_num': 'Vol. CXII - No. 135',
  'volume_date': 'NAIROBI, \n\t\t    \t31 December,2010 ',
  'volume_url': 'http://kenyalaw.org/kenya_gazette/gazette/volume/Mjg0/Vol. CXII - No. 135/',
  'notice_num_title': 'Gazette Notice No. 16759',
  'notice_act_title': 'THE REGISTERED LAND ACT',
  'notice_num_year': '(Cap. 300, section 35)',
  'notice_sub_title': 'Issue of a New Certificate of Lease',
  'notice_body': 'WHEREAS Francis Meso, of P.O. Box 52540, Nairobi in the Republic of Kenya, is registered as proprietor in leasehold interests of that piece of land containing 0.0141 hectare or thereabouts, situate in the city of Nairobi, registered under title No. Nairobi Block 93/686, and whereas sufficient evidence has been adduced to show that the land title deed issued thereof has been lost, notice is given that after the expiration of sixty (60) days from the date hereof, I shall issue a new certificate of lease provided that no objection has been received within that period.',
  'notice

# Use the function created to now scrape through all gazette links which were earlier scraped and saved into the df_links gazette link
Save it into a large list and then create a dataframe from it

In [12]:
data_list = []

for gazette_link in df_links.gazette_links:
    try:
        print(gazette_link)
        my_url = gazette_link

        #Get the list of dictionaries
        data = scrape_url(my_url)
        print(len(data))
        if data:
            data_list.append(data)
    except:
        pass
        
# save the resulting large dataframe to csv

data_list_flat = [item for sublist in data_list for item in sublist]
df=pd.DataFrame(data_list_flat)
df.shape

http://kenyalaw.org/kenya_gazette/gazette/volume/Mjg0/Vol. CXII - No. 135/
32
http://kenyalaw.org/kenya_gazette/gazette/volume/Mjc0/Vol. CXII - No. 133/
32
http://kenyalaw.org/kenya_gazette/gazette/volume/Mjcy/Vol. CXII - No. 131/
58
http://kenyalaw.org/kenya_gazette/gazette/volume/Mjcw/Vol. CXII - No. 128/
46
http://kenyalaw.org/kenya_gazette/gazette/volume/MjY5/Vol. CXII - No. 125/
24
http://kenyalaw.org/kenya_gazette/gazette/volume/MjY4/Vol. CXII - No. 123/
30
http://kenyalaw.org/kenya_gazette/gazette/volume/MjY3/Vol. CXII - No. 120/
http://kenyalaw.org/kenya_gazette/gazette/volume/MjYx/Vol. CXII - No. 114/
61
http://kenyalaw.org/kenya_gazette/gazette/volume/MjU1/Vol. CXII - No. 108/
35
http://kenyalaw.org/kenya_gazette/gazette/volume/MjU2/Vol. CXII - No. 109/
0
http://kenyalaw.org/kenya_gazette/gazette/volume/MjUy/Vol. CXII - No. 105/
36
http://kenyalaw.org/kenya_gazette/gazette/volume/MjQ4/Vol. CXII - No. 101/
45
http://kenyalaw.org/kenya_gazette/gazette/volume/MjQ1/Vol. CXII - No

7
http://kenyalaw.org/kenya_gazette/gazette/volume/MTE4/Vol.CXIII-No.12/
5
http://kenyalaw.org/kenya_gazette/gazette/volume/MTE2/Vol.CXIII-No.10/
38
http://kenyalaw.org/kenya_gazette/gazette/volume/MTE3/Vol. CXIII - No. 11/
0
http://kenyalaw.org/kenya_gazette/gazette/volume/MTEz/Vol.CXIII-No.7/
31
http://kenyalaw.org/kenya_gazette/gazette/volume/OTY-/Vol.CXIII-No.4/
8
http://kenyalaw.org/kenya_gazette/gazette/volume/OTE-/Vol.CXIII-No.1/
http://kenyalaw.org/kenya_gazette/gazette/volume/NTI-/Vol.CXIV-No.107/
http://kenyalaw.org/kenya_gazette/gazette/volume/NTE-/Vol.CXIV-No.104/
http://kenyalaw.org/kenya_gazette/gazette/volume/Njg-/Vol.CXIV-No.102/
0
http://kenyalaw.org/kenya_gazette/gazette/volume/NTA-/Vol.CXIV-No.99/
http://kenyalaw.org/kenya_gazette/gazette/volume/NDg-/Vol.CXIV-No.93/
45
http://kenyalaw.org/kenya_gazette/gazette/volume/MTEx/Vol.CXIV-No.92/
0
http://kenyalaw.org/kenya_gazette/gazette/volume/NDQ-/Vol.CXIV-No.84/
59
http://kenyalaw.org/kenya_gazette/gazette/volume/NDM-/Vo

2894
http://kenyalaw.org/kenya_gazette/gazette/volume/MTA3NQ--/Vol.CXVI-No.106/
0
http://kenyalaw.org/kenya_gazette/gazette/volume/MTA3Mg--/Vol.CXVI-No.103/
87
http://kenyalaw.org/kenya_gazette/gazette/volume/MTA3Mw--/Vol.CXVI-No.101/
0
http://kenyalaw.org/kenya_gazette/gazette/volume/MTA2NQ--/Vol.CXVI-No.98/
60
http://kenyalaw.org/kenya_gazette/gazette/volume/MTA2Mg--/Vol.CXVI-No.94/
88
http://kenyalaw.org/kenya_gazette/gazette/volume/MTA1Nw--/Vol.CXVI-No.91/
94
http://kenyalaw.org/kenya_gazette/gazette/volume/MTA1Ng--/Vol.CXVI-No.87/
42
http://kenyalaw.org/kenya_gazette/gazette/volume/MTA1Mw--/Vol.CXVI-No.85/
82
http://kenyalaw.org/kenya_gazette/gazette/volume/MTA0Ng--/Vol.CXVI-No.80/
81
http://kenyalaw.org/kenya_gazette/gazette/volume/MTAzMQ--/Vol.CXVI-No.79/
68
http://kenyalaw.org/kenya_gazette/gazette/volume/MTAyNQ--/Vol.CXVI-No.76/
77
http://kenyalaw.org/kenya_gazette/gazette/volume/MTAyMg--/Vol.CXVI-No.74/
http://kenyalaw.org/kenya_gazette/gazette/volume/MTAxNA--/Vol.CXVI-No.71/

0
http://kenyalaw.org/kenya_gazette/gazette/volume/MTM2OA--/Vol.CXVIII-No.122/
0
http://kenyalaw.org/kenya_gazette/gazette/volume/MTM2Nw--/Vol.CXVIII-No.116/
91
http://kenyalaw.org/kenya_gazette/gazette/volume/MTM2Mg--/Vol.CXVIII-No.114/
99
http://kenyalaw.org/kenya_gazette/gazette/volume/MTM2MQ--/Vol.CXVIII-No.110/
61
http://kenyalaw.org/kenya_gazette/gazette/volume/MTM1OQ--/Vol.CXVIII-No.107/
29
http://kenyalaw.org/kenya_gazette/gazette/volume/MTM1Ng--/Vol.CXVIII-No.103/
61
http://kenyalaw.org/kenya_gazette/gazette/volume/MTM1NA--/Vol.CXVIII-No.98/
61
http://kenyalaw.org/kenya_gazette/gazette/volume/MTM1Mw--/Vol.CXVIII-No.95/
82
http://kenyalaw.org/kenya_gazette/gazette/volume/MTM1MA--/Vol.CXVIII-No. 92/
113
http://kenyalaw.org/kenya_gazette/gazette/volume/MTM0OQ--/Vol.CXVIII-No.87/
107
http://kenyalaw.org/kenya_gazette/gazette/volume/MTM0NQ--/Vol.CXVIII-No.84/
67
http://kenyalaw.org/kenya_gazette/gazette/volume/MTM0NA--/Vol.CXVIII-No.81/
118
http://kenyalaw.org/kenya_gazette/gazette

0
http://kenyalaw.org/kenya_gazette/gazette/volume/MTg2Mw--/Vol.CXX-No.139/
0
http://kenyalaw.org/kenya_gazette/gazette/volume/MTg1OQ--/Vol.CXX-No.137/
0
http://kenyalaw.org/kenya_gazette/gazette/volume/MTg1OA--/Vol.CXX-No.134/
0
http://kenyalaw.org/kenya_gazette/gazette/volume/MTg1NQ--/Vol.CXX-No.130/
0
http://kenyalaw.org/kenya_gazette/gazette/volume/MTg1Mg--/Vol.CXX-No.125/
http://kenyalaw.org/kenya_gazette/gazette/volume/MTg0OQ--/Vol.CXX-No.123/
0
http://kenyalaw.org/kenya_gazette/gazette/volume/MTg0OA--/Vol.CXX-No.121/
0
http://kenyalaw.org/kenya_gazette/gazette/volume/MTg0Ng--/Vol.CXX-No.120/
0
http://kenyalaw.org/kenya_gazette/gazette/volume/MTg0Mw--/Vol.CXX-No.116/
http://kenyalaw.org/kenya_gazette/gazette/volume/MTgzMg--/Vol.CXX-No.113/
http://kenyalaw.org/kenya_gazette/gazette/volume/MTgxOA--/Vol.CXX-No.108/
87
http://kenyalaw.org/kenya_gazette/gazette/volume/MTgyOQ--/Vol.CXX-No.105/
http://kenyalaw.org/kenya_gazette/gazette/volume/MTgyOA--/Vol.CXX-No.102/
http://kenyalaw.org

http://kenyalaw.org/kenya_gazette/gazette/volume/MjIyNQ--/Vol.CXXII-No.192/
http://kenyalaw.org/kenya_gazette/gazette/volume/MjIyNA--/Vol.CXXII-No.187/
0
http://kenyalaw.org/kenya_gazette/gazette/volume/MjIyMg--/Vol.CXXII-No.182/
0
http://kenyalaw.org/kenya_gazette/gazette/volume/MjIyMQ--/Vol.CXXII-No.177/
0
http://kenyalaw.org/kenya_gazette/gazette/volume/MjIxOQ--/Vol.CXXII-No.174/
0
http://kenyalaw.org/kenya_gazette/gazette/volume/MjIxNw--/Vol.CXXII-No.169/
0
http://kenyalaw.org/kenya_gazette/gazette/volume/MjIxNQ--/Vol.CXXII-No.167/
0
http://kenyalaw.org/kenya_gazette/gazette/volume/MjIxMw--/Vol.CXXII-No.162/
0
http://kenyalaw.org/kenya_gazette/gazette/volume/MjIwMA--/Vol.CXXII-No.159/
http://kenyalaw.org/kenya_gazette/gazette/volume/MjE5OQ--/Vol.CXXII-No.155/
0
http://kenyalaw.org/kenya_gazette/gazette/volume/MjE5OA--/Vol.CXXII-No.154/
0
http://kenyalaw.org/kenya_gazette/gazette/volume/MjE5Nw--/Vol.CXXII-No.150/
3969
http://kenyalaw.org/kenya_gazette/gazette/volume/MjE5NA--/Vol.CXX

(230211, 12)

In [13]:
df.to_csv("titles_bignotebook.csv")

# Now merge the data in df_links to the new dataframe df

In [15]:
df_large = pd.merge(df_links, df, left_on=df_links.gazette_links, right_on=df.volume_url, how='left')
df_large.head(3)

Unnamed: 0,key_0,index,year,url_year,gazette_links,volume_num,volume_date,volume_url,notice_num_title,notice_act_title,notice_num_year,notice_sub_title,notice_body,notice_date,notice_registrar_name,notice_num_loc,notice_loc
0,http://kenyalaw.org/kenya_gazette/gazette/volume/Mjg0/Vol. CXII - No. 135/,0,2010,http://kenyalaw.org/kenya_gazette/gazette/year/2010,http://kenyalaw.org/kenya_gazette/gazette/volume/Mjg0/Vol. CXII - No. 135/,Vol. CXII - No. 135,"NAIROBI, \n\t\t \t31 December,2010",http://kenyalaw.org/kenya_gazette/gazette/volume/Mjg0/Vol. CXII - No. 135/,Gazette Notice No. 16759,THE REGISTERED LAND ACT,"(Cap. 300, section 35)",Issue of a New Certificate of Lease,"WHEREAS Francis Meso, of P.O. Box 52540, Nairobi in the Republic of Kenya, is registered as proprietor in leasehold interests of that piece of land containing 0.0141 hectare or thereabouts, situate in the city of Nairobi, registered under title No. Nairobi Block 93/686, and whereas sufficient evidence has been adduced to show that the land title deed issued thereof has been lost, notice is given that after the expiration of sixty (60) days from the date hereof, I shall issue a new certificate of lease provided that no objection has been received within that period.","Dated the 31st December, 2010.","B. K. LEITICH,","Land Registrar, Nairobi.",Nairobi.
1,http://kenyalaw.org/kenya_gazette/gazette/volume/Mjg0/Vol. CXII - No. 135/,0,2010,http://kenyalaw.org/kenya_gazette/gazette/year/2010,http://kenyalaw.org/kenya_gazette/gazette/volume/Mjg0/Vol. CXII - No. 135/,Vol. CXII - No. 135,"NAIROBI, \n\t\t \t31 December,2010",http://kenyalaw.org/kenya_gazette/gazette/volume/Mjg0/Vol. CXII - No. 135/,Gazette Notice No. 16760,THE REGISTERED LAND ACT,"(Cap. 300, section 35)",Issue of a New Land Title Deed,"WHEREAS Milcah Wangui Wambura, of P.O. Box 42660, Nairobi in the Republic of Kenya, is registered as proprietor in absolute ownership interest of that piece of land containing 7.6 acres or thereabout, situate in the city of Nairobi, registered under title No. Dagoretti/Mutuini/309, and whereas sufficient evidence has been adduced to show that the land title deed issued thereof has been lost, notice is given that after the expiration of sixty (60) days from the date hereof, I shall issue a new land title deed provided that no objection has been received within that period.","Dated the 31st December, 2010.","B. K. LEITICH,","Land Registrar, Nairobi.",Nairobi.
2,http://kenyalaw.org/kenya_gazette/gazette/volume/Mjg0/Vol. CXII - No. 135/,0,2010,http://kenyalaw.org/kenya_gazette/gazette/year/2010,http://kenyalaw.org/kenya_gazette/gazette/volume/Mjg0/Vol. CXII - No. 135/,Vol. CXII - No. 135,"NAIROBI, \n\t\t \t31 December,2010",http://kenyalaw.org/kenya_gazette/gazette/volume/Mjg0/Vol. CXII - No. 135/,Gazette Notice No. 16761,THE REGISTERED LAND ACT,"(Cap. 300, section 35)",Issue of a New Land Title Deed,"WHEREAS Joseph Muchiri Kenja, is registered as proprietor in absolute ownership interest of that piece of land containing 0.045 hectare or thereabouts, situate in the district of Nakuru, registered under title No. Nakuru/Municipality Block 22/607, and whereas sufficient evidence has been adduced to show that the land title deed issued thereof has been lost, notice is given that after the expiration of sixty (60) days from the date hereof, I shall issue a new land title deed provided that no objection has been received within that period.","Dated the 31st December, 2010.","J. M. MWAURA,","Land Registrar,",Na
3,http://kenyalaw.org/kenya_gazette/gazette/volume/Mjg0/Vol. CXII - No. 135/,0,2010,http://kenyalaw.org/kenya_gazette/gazette/year/2010,http://kenyalaw.org/kenya_gazette/gazette/volume/Mjg0/Vol. CXII - No. 135/,Vol. CXII - No. 135,"NAIROBI, \n\t\t \t31 December,2010",http://kenyalaw.org/kenya_gazette/gazette/volume/Mjg0/Vol. CXII - No. 135/,Gazette Notice No. 16762,THE REGISTERED LAND ACT,"(Cap. 300, section 35)",Issue of a New Land Title Deed,"WHEREAS Charles Kangangi, of P.O. Box 11, Subukia in the Republic of Kenya, is registered as proprietor in absolute ownership interest of that piece of land containing 1.445 hectares or thereabout, situate in the district of Nakuru, registered under title No. Subukia/Subukia Block 3/113, and whereas sufficient evidence has been adduced to show that the land title deed issued thereof has been lost, notice is given that after the expiration of sixty (60) days from the date hereof, I shall issue a new land title deed provided that no objection has been received within that period.","Dated the 31st December, 2010.","J. M. MWAURA,","Land Registrar,",Na
4,http://kenyalaw.org/kenya_gazette/gazette/volume/Mjg0/Vol. CXII - No. 135/,0,2010,http://kenyalaw.org/kenya_gazette/gazette/year/2010,http://kenyalaw.org/kenya_gazette/gazette/volume/Mjg0/Vol. CXII - No. 135/,Vol. CXII - No. 135,"NAIROBI, \n\t\t \t31 December,2010",http://kenyalaw.org/kenya_gazette/gazette/volume/Mjg0/Vol. CXII - No. 135/,Gazette Notice No. 16763,THE REGISTERED LAND ACT,"(Cap. 300, section 35)",Issue of a New Land Title Deed,"WHEREAS John Njenga Kariuki, of P.O. Box 14938, Nakuru in the Republic of Kenya, is registered as proprietor in absolute ownership interest of that piece of land containing 0.047 hectare or thereabouts, situate in the district of Nakuru, registered under title No. Kiambogo/Kiambogo Block 2/6772, and whereas sufficient evidence has been adduced to show that the land title deed issued thereof has been lost, notice is given that after the expiration of sixty (60) days from the date hereof, I shall issue a new land title deed provided that no objection has been received within that period.","Dated the 31st December, 2010.","M. SUNGU,","Land Registrar,",Na


In [16]:
df_large.shape

(230428, 17)

In [17]:
df_large.to_csv("titles_new_df_large.csv")

# Read data on Districts, Population and Area and merge it to the location information in the dataframe

This data is important so that I can normalise the data on population but also so that I can get the code that I can use to group the data by provinces

In [18]:
df_districts=pd.read_excel('/Users/ivynyayieka/Downloads/districts.xlsx')
df_districts.head()

Unnamed: 0,District,HASC,Cod,Population,Area(km.²),Capital
0,Baringo,KE.RV.BA,701,264978,8646.0,Kabarnet
1,Bomet,KE.RV.BO,702,382794,1882.0,Bomet
2,Bondo,KE.NY.BO,611,238780,987.0,Bondo
3,Bungoma,KE.WE.BN,801,876491,2069.0,Bungoma
4,Buret,KE.RV.BU,718,316882,955.0,Litein


In [19]:
df_districts['district_named']=df_districts.District+" District"
df_districts.head()

Unnamed: 0,District,HASC,Cod,Population,Area(km.²),Capital,district_named
0,Baringo,KE.RV.BA,701,264978,8646.0,Kabarnet,Baringo District
1,Bomet,KE.RV.BO,702,382794,1882.0,Bomet,Bomet District
2,Bondo,KE.NY.BO,611,238780,987.0,Bondo,Bondo District
3,Bungoma,KE.WE.BN,801,876491,2069.0,Bungoma,Bungoma District
4,Buret,KE.RV.BU,718,316882,955.0,Litein,Buret District


### Clean up the districts' information in the dataframe to make it possible to join with districts dataframe

In [20]:
df_large.notice_loc=df_large.notice_loc.str.replace(".", "")
df_large.notice_loc=df_large.notice_loc.str.replace("Districts", "")
df_large.notice_loc=df_large.notice_loc.str.replace("District", "")
df_large.notice_loc=df_large.notice_loc.str.strip()
df_large.notice_loc.unique()

  df_large.notice_loc=df_large.notice_loc.str.replace(".", "")


array(['Nairobi', 'Na', 'Eldoret', 'Kiambu', 'Kirinyaga', 'Thika', 'Meru',
       'Machakos', 'Kitui', 'Kajiado', 'Mombasa', 'Nakuru', 'Embu', nan,
       'Nyeri', 'Laikipia', 'Siaya', 'Kitale', '', 'Naivasha', 'Mbeere',
       'Kericho/Bureti', 'Kilifi/Kaloleni/Malindi/Ganze', 'Uasin-Gishu',
       'Narok North', 'Vihiga', 'Kuria', 'Busia/Teso', 'Kwale Distict',
       'Kakamega', 'Lamu', 'Bungoma', 'Trans Nzoia', 'Makueni',
       'Rachuonyo', 'Kwale', 'Kisumu', 'Bondo', 'Uasin Gishu', 'Nyando',
       'Migori/Rongo', 'Busia', 'Busia (K)', 'Nyandarua/Samburu',
       'Eldoretv', 'Koibatek', 'Kisii', 'Nyamira', 'Bungoma/Mt Elgon',
       'Homa Bay/Ndhiwa Suba', 'Narok North/South', 'Narok',
       'Kisumu East/Kisumu West', 'Meru South', 'Murang’a', 'Kilifi',
       'Muranga', 'Nandi', 'Meru North', 'Kisumu East/West',
       'Kajiado North', 'Homa–Bay', 'Kisii Central', 'Mwingi', 'Kericho',
       'Ugenya/Ugunja', 'Malindi', 'Homa-Bay', 'Busia/ Teso',
       'Bomet/Buret/Sotik', 'Bom

In [21]:
df_districts.District.unique()

array(['Baringo', 'Bomet', 'Bondo', 'Bungoma', 'Buret', 'Busia',
       'Butere/Mumias', 'Embu', 'Garissa', 'Gucha', 'Homa Bay', 'Ijara',
       'Isiolo', 'Kajiado', 'Kakamega', 'Keiyo', 'Kericho', 'Kiambu',
       'Kilifi', 'Kirinyaga', 'Kisii Central', 'Kisumu', 'Kitui',
       'Koibatek', 'Kuria', 'Kwale', 'Laikipia', 'Lamu', 'Lugari',
       'Machakos', 'Makueni', 'Malindi', 'Mandera', 'Maragua', 'Marakwet',
       'Marsabit', 'Mbeere', 'Meru Central', 'Meru North', 'Meru South',
       'Migori', 'Mombasa', 'Mount Elgon', 'Moyale', "Murang'a", 'Mwingi',
       'Nairobi', 'Nakuru', 'Nandi', 'Narok', 'Nyamira', 'Nyandarua',
       'Nyando', 'Nyeri', 'Rachuonyo', 'Samburu', 'Siaya', 'Suba',
       'Taita Taveta', 'Tana River', 'Teso', 'Tharaka', 'Thika',
       'Trans Mara', 'Trans Nzoia', 'Turkana', 'Uasin Gishu', 'Vihiga',
       'Wajir', 'West Pokot'], dtype=object)

### Merge the main dataframe to include the districts' information

In [22]:
df_large_districts = pd.merge(df_large, df_districts, left_on='notice_loc', right_on='District', how='left')
df_large_districts.head()

Unnamed: 0,key_0,index,year,url_year,gazette_links,volume_num,volume_date,volume_url,notice_num_title,notice_act_title,notice_num_year,notice_sub_title,notice_body,notice_date,notice_registrar_name,notice_num_loc,notice_loc,District,HASC,Cod,Population,Area(km.²),Capital,district_named
0,http://kenyalaw.org/kenya_gazette/gazette/volume/Mjg0/Vol. CXII - No. 135/,0,2010,http://kenyalaw.org/kenya_gazette/gazette/year/2010,http://kenyalaw.org/kenya_gazette/gazette/volume/Mjg0/Vol. CXII - No. 135/,Vol. CXII - No. 135,"NAIROBI, \n\t\t \t31 December,2010",http://kenyalaw.org/kenya_gazette/gazette/volume/Mjg0/Vol. CXII - No. 135/,Gazette Notice No. 16759,THE REGISTERED LAND ACT,"(Cap. 300, section 35)",Issue of a New Certificate of Lease,"WHEREAS Francis Meso, of P.O. Box 52540, Nairobi in the Republic of Kenya, is registered as proprietor in leasehold interests of that piece of land containing 0.0141 hectare or thereabouts, situate in the city of Nairobi, registered under title No. Nairobi Block 93/686, and whereas sufficient evidence has been adduced to show that the land title deed issued thereof has been lost, notice is given that after the expiration of sixty (60) days from the date hereof, I shall issue a new certificate of lease provided that no objection has been received within that period.","Dated the 31st December, 2010.","B. K. LEITICH,","Land Registrar, Nairobi.",Nairobi,Nairobi,KE.NA.NB,101.0,2143254.0,696.0,Nairobi,Nairobi District
1,http://kenyalaw.org/kenya_gazette/gazette/volume/Mjg0/Vol. CXII - No. 135/,0,2010,http://kenyalaw.org/kenya_gazette/gazette/year/2010,http://kenyalaw.org/kenya_gazette/gazette/volume/Mjg0/Vol. CXII - No. 135/,Vol. CXII - No. 135,"NAIROBI, \n\t\t \t31 December,2010",http://kenyalaw.org/kenya_gazette/gazette/volume/Mjg0/Vol. CXII - No. 135/,Gazette Notice No. 16760,THE REGISTERED LAND ACT,"(Cap. 300, section 35)",Issue of a New Land Title Deed,"WHEREAS Milcah Wangui Wambura, of P.O. Box 42660, Nairobi in the Republic of Kenya, is registered as proprietor in absolute ownership interest of that piece of land containing 7.6 acres or thereabout, situate in the city of Nairobi, registered under title No. Dagoretti/Mutuini/309, and whereas sufficient evidence has been adduced to show that the land title deed issued thereof has been lost, notice is given that after the expiration of sixty (60) days from the date hereof, I shall issue a new land title deed provided that no objection has been received within that period.","Dated the 31st December, 2010.","B. K. LEITICH,","Land Registrar, Nairobi.",Nairobi,Nairobi,KE.NA.NB,101.0,2143254.0,696.0,Nairobi,Nairobi District
2,http://kenyalaw.org/kenya_gazette/gazette/volume/Mjg0/Vol. CXII - No. 135/,0,2010,http://kenyalaw.org/kenya_gazette/gazette/year/2010,http://kenyalaw.org/kenya_gazette/gazette/volume/Mjg0/Vol. CXII - No. 135/,Vol. CXII - No. 135,"NAIROBI, \n\t\t \t31 December,2010",http://kenyalaw.org/kenya_gazette/gazette/volume/Mjg0/Vol. CXII - No. 135/,Gazette Notice No. 16761,THE REGISTERED LAND ACT,"(Cap. 300, section 35)",Issue of a New Land Title Deed,"WHEREAS Joseph Muchiri Kenja, is registered as proprietor in absolute ownership interest of that piece of land containing 0.045 hectare or thereabouts, situate in the district of Nakuru, registered under title No. Nakuru/Municipality Block 22/607, and whereas sufficient evidence has been adduced to show that the land title deed issued thereof has been lost, notice is given that after the expiration of sixty (60) days from the date hereof, I shall issue a new land title deed provided that no objection has been received within that period.","Dated the 31st December, 2010.","J. M. MWAURA,","Land Registrar,",Na,,,,,,,
3,http://kenyalaw.org/kenya_gazette/gazette/volume/Mjg0/Vol. CXII - No. 135/,0,2010,http://kenyalaw.org/kenya_gazette/gazette/year/2010,http://kenyalaw.org/kenya_gazette/gazette/volume/Mjg0/Vol. CXII - No. 135/,Vol. CXII - No. 135,"NAIROBI, \n\t\t \t31 December,2010",http://kenyalaw.org/kenya_gazette/gazette/volume/Mjg0/Vol. CXII - No. 135/,Gazette Notice No. 16762,THE REGISTERED LAND ACT,"(Cap. 300, section 35)",Issue of a New Land Title Deed,"WHEREAS Charles Kangangi, of P.O. Box 11, Subukia in the Republic of Kenya, is registered as proprietor in absolute ownership interest of that piece of land containing 1.445 hectares or thereabout, situate in the district of Nakuru, registered under title No. Subukia/Subukia Block 3/113, and whereas sufficient evidence has been adduced to show that the land title deed issued thereof has been lost, notice is given that after the expiration of sixty (60) days from the date hereof, I shall issue a new land title deed provided that no objection has been received within that period.","Dated the 31st December, 2010.","J. M. MWAURA,","Land Registrar,",Na,,,,,,,
4,http://kenyalaw.org/kenya_gazette/gazette/volume/Mjg0/Vol. CXII - No. 135/,0,2010,http://kenyalaw.org/kenya_gazette/gazette/year/2010,http://kenyalaw.org/kenya_gazette/gazette/volume/Mjg0/Vol. CXII - No. 135/,Vol. CXII - No. 135,"NAIROBI, \n\t\t \t31 December,2010",http://kenyalaw.org/kenya_gazette/gazette/volume/Mjg0/Vol. CXII - No. 135/,Gazette Notice No. 16763,THE REGISTERED LAND ACT,"(Cap. 300, section 35)",Issue of a New Land Title Deed,"WHEREAS John Njenga Kariuki, of P.O. Box 14938, Nakuru in the Republic of Kenya, is registered as proprietor in absolute ownership interest of that piece of land containing 0.047 hectare or thereabouts, situate in the district of Nakuru, registered under title No. Kiambogo/Kiambogo Block 2/6772, and whereas sufficient evidence has been adduced to show that the land title deed issued thereof has been lost, notice is given that after the expiration of sixty (60) days from the date hereof, I shall issue a new land title deed provided that no objection has been received within that period.","Dated the 31st December, 2010.","M. SUNGU,","Land Registrar,",Na,,,,,,,


In [23]:
df_large_districts.to_csv('titles_new_df_large_districts.csv')

# In this next section, I use regex and other relevant scraping techniques to extract more information from the notices. 

## This is what one land-related gazette notice looks like:

![Landing Page for 2022](one_land_related_gazette_notice.png)

### Here are some of the columns created:

* First person named in a title: first_named
* Area of land referenced: acres_hectares
* Title deed number: title_number
* ID number of first person named: first_ID
* Address details of first person named: first_address
* LR number: lr_num
* Municipality referenced: municipality
* Second person named in a title: second_person
* IR number: ir_num
* Number of days to expiration: days
* Succession cause number: succession
* township: township
* district: in_the_district_of
* province code: province_code

In [24]:
df_large_districts.notice_sub_title=df_large_districts.notice_sub_title.str.upper()
df_large_districts.notice_sub_title.unique()

array(['ISSUE OF A NEW CERTIFICATE OF LEASE',
       'ISSUE OF A NEW LAND TITLE DEED', 'REGISTRATION OF INSTRUMENT',
       'OPENING OF A NEW REGISTER', 'THE REGISTERED LAND ACT', nan,
       'ISSUE OF A NE W LAND TITLE DEED', '(CAP. 280)',
       'RECONSTRUCTION OF A WHITE AND GREEN CARD',
       'ISSUE OF NEW LAND TITLE DEEDS', 'PROHIBITION (RESTRICTION) ORDER',
       'ISSUE OF A NEW LAND TILE DEED', 'ISSUE OF A NEW LAND TILE DEEDS',
       'ISSUE OF A NEW LAND TITTLE DEED',
       'ISSUE OF A NEW LAND TTITLE DEED',
       'ISSUE OF A NEW LAND CERTIFICATE', 'ISSUE OF NEW LAND TITLE DEED',
       'RECONSTRUCTION OF GREEN CARD AND WHITE CARD',
       'ISSUE OF A NEW CERTIFICATE OF TITLE',
       'ISSUE OF A NEW CERTIFATE OF LEASE',
       'RECONSTRUCTION OF WHITE AND GREEN CARD',
       'RECONSTRUCTION OF GREEN CARD AND WHITE',
       'WHEREAS (1) MOHAMED BIN SALIM, (2) RUKIYA BINT SALIM AND (3) FATUMA BINT ABDREHMAN, ALL OF P.O. BOX 74, LAMU IN THE REPUBLIC OF KENYA, IS REGISTERED PR

In [25]:
df_large_districts['first_named']=df_large_districts.notice_body.str.extract("WHEREAS ([A-Z][\w ]*)[^ \w]")

In [26]:
df_large_districts.sample(5)

Unnamed: 0,key_0,index,year,url_year,gazette_links,volume_num,volume_date,volume_url,notice_num_title,notice_act_title,notice_num_year,notice_sub_title,notice_body,notice_date,notice_registrar_name,notice_num_loc,notice_loc,District,HASC,Cod,Population,Area(km.²),Capital,district_named,first_named
228064,http://kenyalaw.org/kenya_gazette/gazette/volume/MjE5Nw--/Vol.CXXII-No.150/,10,2020,http://kenyalaw.org/kenya_gazette/gazette/year/2020,http://kenyalaw.org/kenya_gazette/gazette/volume/MjE5Nw--/Vol.CXXII-No.150/,Vol.CXXII-No.150,"NAIROBI, \n\t\t \t07 August,2020",http://kenyalaw.org/kenya_gazette/gazette/volume/MjE5Nw--/Vol.CXXII-No.150/,GAZETTE NOTICE NO. 5487,THE LAND REGISTRATION ACT,(No. 3 of 2012),ISSUE OF A NEW LAND TITLE DEED,"WHEREAS Esther Nyambura Kambo, of P.O. Box 93, Gilgil in the Republic of Kenya, is registered as proprietor in absolute ownership interest of all that piece of land containing 0.04 hectare or thereabouts, situate in the district of Nakuru, registered under title No. Kiambogo/Kiambogo Block 2/2042 (Mwariki), and whereas sufficient evidence has been adduced to show that the land title deed issued thereof has been lost, notice is given that after the expiration of sixty (60) days from the date hereof, I shall issue a new land title deed provided that no objection has been received within that period.","Dated the 7th August, 2020.","H. N. KHAREMWA,","MR/0783823 Land Registrar, Nakuru District",Nakuru,Nakuru,KE.RV.NK,709.0,1187039.0,7242.0,Nakuru,Nakuru District,Esther Nyambura Kambo
134884,http://kenyalaw.org/kenya_gazette/gazette/volume/MTE2NA--/Vol.CXVII-No.47/,5,2015,http://kenyalaw.org/kenya_gazette/gazette/year/2015,http://kenyalaw.org/kenya_gazette/gazette/volume/MTE2NA--/Vol.CXVII-No.47/,Vol.CXVII-No.47,"NAIROBI, \n\t\t \t08 May,2015",http://kenyalaw.org/kenya_gazette/gazette/volume/MTE2NA--/Vol.CXVII-No.47/,Gazette Notice No. 3219,THE LAND REGISTRATION ACT,(No. 3 of 2012),ISSUE OF A NEW LAND TITLE DEED,"WHEREAS Francis Onyango Jagam (ID/7246270), is registered as proprietor in absolute ownership interest of that piece of land containing 5.26 hectares or thereabout, situate in the district of Ugenya, registered under title No. Uholo/Rambula/453, and whereas sufficient evidence has been adduced to show that the land title deed issued thereof has been lost, notice is given that after the expiration of sixty (60) days from the date hereof, I shall issue a new title deed provided that no objection has been received within that period.","Dated the 8th May, 2015.","P. A. NYANJA,","MR/7407161 Land Registrar, Ugenya/Ugunja Districts.",Ugenya/Ugunja,,,,,,,,Francis Onyango Jagam
176410,http://kenyalaw.org/kenya_gazette/gazette/volume/MTEyMQ--/Vol.CXVII-No.20/,5,2015,http://kenyalaw.org/kenya_gazette/gazette/year/2015,http://kenyalaw.org/kenya_gazette/gazette/volume/MTEyMQ--/Vol.CXVII-No.20/,Vol.CXVII-No.20,"NAIROBI, \n\t\t \t27 February,2015",http://kenyalaw.org/kenya_gazette/gazette/volume/MTEyMQ--/Vol.CXVII-No.20/,GAZETTE NOTICE NO. 1313,THE LAND REGISTRATION ACT,(No. 3 of 2012),REGISTRATION OF INSTRUMENT,"WHEREAS Moses Mbogwah Mainah (deceased), is registered as proprietor of that piece of land containing 0.045 hectare or thereabouts, known as Ndumberi/Ndumberi/1431, situate in the district of Kiambu, and whereas the chief magistrate’s court of Kenya at Kiambu in Succession Cause No. 212 of 2007, has issued grant of letters of administration to (1) Pauline Wawira Mbugwah and (2) Thomas Njiru Ngari, and whereas the land title deed issued earlier to the said Moses Mbogwah Mainah (deceased) has been reported missing or lost, notice is given that after the expiration of thirty (30) days from the date hereof, provided no valid objection has been received within that period, I intend to dispense with the production of the said land title deed and proceed with registration of the said instrument of R.L. 19 and R.L. 7, and upon such registration the land title deed issued earlier to the said Moses Mbogwah Mainah (deceased), shall be deemed to be cancelled and of no effect.","Dated the 27th February, 2015.","W. N. MUGURO,","MR/6901371 Land Registrar, Kiambu District.",Na,,,,,,,,Moses Mbogwah Mainah
126105,http://kenyalaw.org/kenya_gazette/gazette/volume/MTE2OQ--/Vol.CXVII-No.52/,5,2015,http://kenyalaw.org/kenya_gazette/gazette/year/2015,http://kenyalaw.org/kenya_gazette/gazette/volume/MTE2OQ--/Vol.CXVII-No.52/,Vol.CXVII-No.52,"NAIROBI, \n\t\t \t22 May,2015",http://kenyalaw.org/kenya_gazette/gazette/volume/MTE2OQ--/Vol.CXVII-No.52/,GAZETTE NOTICE NO. 3508,THE LAND REGISTRATION ACT,(No. 3 of 2012),ISSUE OF A NEW LAND TITLE DEED,"WHEREAS Samuel Mwangi Nyaga (ID/0346315), of P.O. Box 285, Murang’a in the Republic of Kenya, is registered as proprietor in absolute ownership interest of that piece of land containing 0.40 hectare or thereabouts, situate in the district of Kirinyaga, registered under title No. Kiine/Sagana/3103, and whereas sufficient evidence has been adduced to show that the land title deed issued thereof has been lost, notice is given that after the expiration of sixty (60) days from the date hereof, I shall issue a new title deed provided that no objection has been received within that period.","Dated the 22nd May, 2015.","C. W. NJAGI,","MR/7413661 Land Registrar, Kirinyaga District.",Kirinyaga,Kirinyaga,KE.CE.KY,202.0,457105.0,1478.0,Kerugoya/Kutus,Kirinyaga District,Samuel Mwangi Nyaga
51719,http://kenyalaw.org/kenya_gazette/gazette/volume/MTIzMA--/Vol.CXVII-No.114/,5,2015,http://kenyalaw.org/kenya_gazette/gazette/year/2015,http://kenyalaw.org/kenya_gazette/gazette/volume/MTIzMA--/Vol.CXVII-No.114/,Vol.CXVII-No.114,"NAIROBI, \n\t\t \t23 October,2015",http://kenyalaw.org/kenya_gazette/gazette/volume/MTIzMA--/Vol.CXVII-No.114/,GAZETTE NOTICE NO. 7896,THE LAND REGISTRATION ACT,(No. 3 of 2012),ISSUE OF NEW LAND TITLE DEEDS,"WHEREAS Mary Njoki Nguru (ID/3084131), of P.O. Box 34, Matathia in the Republic of Kenya, is registered as proprietor in absolute ownership interest of those pieces of land containing 2.18, 2.13, 0.40 and 0.40 hectare or thereabouts, situate in the district of Naivasha, registered under title Nos. Longonot/Kijabe/612, 613, 1828 and 1829, and whereas sufficient evidence has been adduced to show that the land title deeds issued thereof have been lost, notice is given that after the expiration of sixty (60) days from the date hereof, I shall issue new title deeds provided that no objection has been received within that period.","Dated the 23rd October, 2015.","C. W. MWANIKI,","MR/8220865 Land Registrar, Naivasha District.",Naivasha,,,,,,,,Mary Njoki Nguru


In [27]:
df_large_districts.notice_body.sample(10)

35109                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 WHEREAS Joseph Wandera Odanga, is registered as proprietor in absolute ownership interest of that piece of land containing 0.39 hectare or thereabouts, situate in the district of Busia/Teso, registered under title No. Samia/Luanda/Mudoma/557, and whereas sufficient evidence has been adduced to show that the land title deed issued thereof has been lost, notice is given that after the expiration of sixty (60) days fr

In [28]:
df_large_districts['acres_hectares']=df_large_districts.notice_body.str.extract("containing ([\d.]+ hectares)")
# df_large_districts['acres_hectares']=df_large_districts.notice_body.str.extract("containing ([\d]+[.]*[\d]+) acres")+" acres"

df_large_districts['acres_hectares'].sample(10)


207945     3.0 hectares
107845     3.2 hectares
166727              NaN
14602               NaN
75484     2.00 hectares
65178               NaN
207749              NaN
5221                NaN
22972               NaN
168435    1.01 hectares
Name: acres_hectares, dtype: object

In [29]:
df_large_districts['acres_hectares']=df_large_districts['acres_hectares'].fillna(df_large_districts.notice_body.str.extract("containing ([\d.]+ acres)", expand=False))
df_large_districts['acres_hectares'].sample(10)

123181             NaN
145113             NaN
41210              NaN
16149              NaN
113624             NaN
177090         6 acres
65421              NaN
163772    4.0 hectares
134155             NaN
134779             NaN
Name: acres_hectares, dtype: object

In [30]:
df_large_districts['title_number']=df_large_districts.notice_body.str.extract("title No. ([^,]+),") 
# df_large_districts['first_named']=df_large_districts.notice_body.str.extract("WHEREAS ([A-Z][\w ]*)[^ \w]")
df_large_districts['title_number'].sample(10)

169707                         Nya/Kitiri/4954
377                      Bungoma/Kamakoiwa/130
187175                     Isukha/Shirere/2936
219210                                     NaN
141928                                     NaN
21263                     Kisumu/Kochieng/2382
163107                       Ngong/Ngong/20637
199987                Marama/Shinamwenyuli/317
50475     Pioneer/Ngeria Block 1 (EATEC)/12478
201035                       Nyaki/Thuura/1880
Name: title_number, dtype: object

In [31]:
df_large_districts['first_ID']=df_large_districts.notice_body.str.extract("ID[/]([\d]+)\b") 
df_large_districts['first_ID'].sample(10)

55316     NaN
58863     NaN
118089    NaN
86107     NaN
117966    NaN
87001     NaN
201948    NaN
2850      NaN
29184     NaN
224293    NaN
Name: first_ID, dtype: object

In [32]:
df_large_districts.sample(10)


Unnamed: 0,key_0,index,year,url_year,gazette_links,volume_num,volume_date,volume_url,notice_num_title,notice_act_title,notice_num_year,notice_sub_title,notice_body,notice_date,notice_registrar_name,notice_num_loc,notice_loc,District,HASC,Cod,Population,Area(km.²),Capital,district_named,first_named,acres_hectares,title_number,first_ID
185917,http://kenyalaw.org/kenya_gazette/gazette/volume/MTExNw--/Vol.CXVII-No.12/,5,2015,http://kenyalaw.org/kenya_gazette/gazette/year/2015,http://kenyalaw.org/kenya_gazette/gazette/volume/MTExNw--/Vol.CXVII-No.12/,Vol.CXVII-No.12,"NAIROBI, \n\t\t \t06 February,2015",http://kenyalaw.org/kenya_gazette/gazette/volume/MTExNw--/Vol.CXVII-No.12/,GAZETTE NOTICE NO. 799,THE LAND REGISTRATION ACT,(No. 3 of 2012),REGISTRATION OF INSTRUMENTS,"WHEREAS Robert Murithi (deceased), is registered as proprietor of those pieces of known as Abogeta/L-Mikumbune/164, Abothuguchi/Kiija/233, Nkuene/Taita/589 and 590, situate in the district of Meru, and whereas the High Court in succession cause No. 337 of 2005, has issued grant of letters of administration and certificate of confirmation to Alice Gakii Robert, and whereas the said court has executed an application to be registered as proprietor by transmission R.L. 19, and whereas the land title deeds in respect of the land registered in the name of Robert Murithi are lost, notice is given that after the expiration of thirty (30) days from the date hereof, provided no valid objection has been received within that period, I intend to dispense with the production of the said land title deeds and proceed with registration of the said application to be registered as proprietor by transmission R.L. 19, in the name of Alice Gakii Robert, and upon such registration the land title deeds issued earlier to the said Robert Murithi (deceased), shall be deemed to be cancelled and of no effect.","Dated the 6th February, 2015.","B. K. KAMWARO,","MR/6742325 Land Registrar, Meru District.",Meru,,,,,,,,Robert Murithi,,,
209845,http://kenyalaw.org/kenya_gazette/gazette/volume/MTMzMw--/Vol.CXVII-No.72/,6,2016,http://kenyalaw.org/kenya_gazette/gazette/year/2016,http://kenyalaw.org/kenya_gazette/gazette/volume/MTMzMw--/Vol.CXVII-No.72/,Vol.CXVII-No.72,"NAIROBI, \n\t\t \t01 July,2016",http://kenyalaw.org/kenya_gazette/gazette/volume/MTMzMw--/Vol.CXVII-No.72/,GAZETTE NOTICE NO. 5051,THE LAND REGISTRATION ACT,(No. 3 of 2012),ISSUE OF A NEW LAND TITLE DEED,"WHEREAS Francis Jow Adwodi, of Yala in the Republic of Kenya, is registered as proprietor in absolute ownership interest of all that piece of land containing 2.15 hectares or thereabout, situate in the district of Siaya, registered under title No. North Gem/Siriwo/883, and whereas sufficient evidence has been adduced to show that the land title deed issued thereof has been lost, notice is given that after the expiration of sixty (60) days from the date hereof, I shall issue a new title deed provided that no objection has been received within that period.","Dated the 1st July, 2016.","P. A. OWEYA,","MR/9612343 Land Registrar, Siaya District.",Siaya,Siaya,KE.NY.SI,609.0,480184.0,1520.0,Siaya,Siaya District,Francis Jow Adwodi,2.15 hectares,North Gem/Siriwo/883,
80019,http://kenyalaw.org/kenya_gazette/gazette/volume/MTE5Nw--/Vol.CXVII-No.84/,5,2015,http://kenyalaw.org/kenya_gazette/gazette/year/2015,http://kenyalaw.org/kenya_gazette/gazette/volume/MTE5Nw--/Vol.CXVII-No.84/,Vol.CXVII-No.84,"NAIROBI, \n\t\t \t14 August,2015",http://kenyalaw.org/kenya_gazette/gazette/volume/MTE5Nw--/Vol.CXVII-No.84/,GAZETTE NOTICE NO. 5903,THE LAND REGISTRATION ACT,(No. 3 of 2012),LOSS OF LAND REGISTER,"WHEREAS Rosemary Nyambura Mukundi (ID/3359505/66), of P.O. Box 1671, Thika in the Republic of Kenya, is registered as proprietor of that piece of land situate in the district of Thika, registered under title No. Thika/Municipality Block 19/790, and whereas sufficient evidence has been adduced to show that the land register (green card) of the said piece of land is missing, and whereas all efforts made to locate the said land register (green card) have failed, notice is given that after the expiration of sixty (60) days from the date hereof, provided that no objection has been received within that period I intend to open another land register and upon such opening, the said missing land register shall be deemed obsolete and of no effect.","Dated the 14th August, 2014.","B. K. LEITICH,","MR/8055086 Land Registrar, Thika District.",Thika,Thika,KE.CE.TH,206.0,645713.0,1960.0,Thika,Thika District,Rosemary Nyambura Mukundi,,Thika/Municipality Block 19/790,
154914,http://kenyalaw.org/kenya_gazette/gazette/volume/MTE1MQ--/Vol.CXVII-No.34/,5,2015,http://kenyalaw.org/kenya_gazette/gazette/year/2015,http://kenyalaw.org/kenya_gazette/gazette/volume/MTE1MQ--/Vol.CXVII-No.34/,Vol.CXVII-No.34,"NAIROBI, \n\t\t \t02 April,2015",http://kenyalaw.org/kenya_gazette/gazette/volume/MTE1MQ--/Vol.CXVII-No.34/,Gazette Notice No. 2132,THE LAND REGISTRATION ACT,(No. 3 of 2012),ISSUE OF A NEW LAND TITLE DEED,"WHEREAS Kamau Ndatho, of P.O. Box 1914–30100, Eldoret in the Republic of Kenya, is registered as proprietor in absolute ownership interest of that piece of land containing 1.694 hectares or thereabout, situate in the district of Uasin Gishu, registered under title No. Ngeria/Megun Block 3 (Kimuri)/158, and whereas sufficient evidence has been adduced to show that the land title deed issued thereof has been lost, notice is given that after the expiration of sixty (60) days from the date hereof, I shall issue a new title deed provided that no objection has been received within that period.","Dated the 2nd April, 2015.","E. J. KETER,","MR/7054450 Land Registrar, Uasin Gishu District.",Uasin Gishu,Uasin Gishu,KE.RV.UG,716.0,622705.0,3328.0,Eldoret,Uasin Gishu District,Kamau Ndatho,1.694 hectares,Ngeria/Megun Block 3 (Kimuri)/158,
104021,http://kenyalaw.org/kenya_gazette/gazette/volume/MTE4Mw--/Vol.CXVII-No.70/,5,2015,http://kenyalaw.org/kenya_gazette/gazette/year/2015,http://kenyalaw.org/kenya_gazette/gazette/volume/MTE4Mw--/Vol.CXVII-No.70/,Vol.CXVII-No.70,"NAIROBI, \n\t\t \t03 July,2015",http://kenyalaw.org/kenya_gazette/gazette/volume/MTE4Mw--/Vol.CXVII-No.70/,GAZETTE NOTICE NO. 4803,THE LAND REGISTRATION ACT,(No. 3 of 2012),ISSUE OF A NEW LAND TITLE DEED,"WHEREAS Simion Singeet Mutunkei, of P.O. Box 44, Kitengela in the Republic of Kenya, is registered as proprietor in absolute ownership interest of that piece of land situate in the district of Kajiado, registered under title No. Kajiado/Kaputiei-North/30158, and whereas sufficient evidence have been adduced to show that the land title deed issued thereof has been lost, notice is given that after the expiration of sixty (60) days from the date hereof, I shall issue a new title deed provided that no objection has been received within that period.","Dated the 3rd July, 2015.","D. M. KYULE,","MR/7828667 Land Registrar, Kajiado District.",Kajiado,Kajiado,KE.RV.KJ,704.0,406054.0,21903.0,Kajiado,Kajiado District,Simion Singeet Mutunkei,,Kajiado/Kaputiei-North/30158,
119429,http://kenyalaw.org/kenya_gazette/gazette/volume/MTE3Ng--/Vol.CXVII-No.58/,5,2015,http://kenyalaw.org/kenya_gazette/gazette/year/2015,http://kenyalaw.org/kenya_gazette/gazette/volume/MTE3Ng--/Vol.CXVII-No.58/,Vol.CXVII-No.58,"NAIROBI, \n\t\t \t05 June,2015",http://kenyalaw.org/kenya_gazette/gazette/volume/MTE3Ng--/Vol.CXVII-No.58/,GAZETTE NOTICE NO. 4042,THE LAND REGISTRATION ACT,(No. 3 of 2012),ISSUE OF A NEW LAND TITLE DEED,"WHEREAS Samuel Kamau Muhia (ID/7248154), of P.O. Box 26, Limuru in the Republic of Kenya, is registered as proprietor in absolute ownership interest of that piece of land containing 2.34 hectares or thereabout, situate in the district of Kiambu, registered under title No. Nguirubi/Ndiuni/815, and whereas sufficient evidence has been adduced to show that the land title deed issued thereof has been lost, notice is given that after the expiration of sixty (60) days from the date hereof, I shall issue a new title deed provided that no objection has been received within that period.","Dated the 5th June, 2015.","K. G. NDEGWA,","MR/7413968 Land Registrar, Kiambu District.",Kiambu,Kiambu,KE.CE.KB,201.0,744010.0,1324.0,Kiambu,Kiambu District,Samuel Kamau Muhia,2.34 hectares,Nguirubi/Ndiuni/815,
55107,http://kenyalaw.org/kenya_gazette/gazette/volume/MTIwOQ--/Vol.CXVII-No.101/,5,2015,http://kenyalaw.org/kenya_gazette/gazette/year/2015,http://kenyalaw.org/kenya_gazette/gazette/volume/MTIwOQ--/Vol.CXVII-No.101/,Vol.CXVII-No.101,"NAIROBI, \n\t\t \t18 September,2015",http://kenyalaw.org/kenya_gazette/gazette/volume/MTIwOQ--/Vol.CXVII-No.101/,GAZETTE NOTICE NO. 6887,THE LAND REGISTRATION ACT,(No. 3 of 2012),ISSUE OF A NEW LAND TITLE DEED,"WHEREAS Magdalene Njeri Kabura, of P.O. Box 5–10106, Othaya in the Republic of Kenya, is registered as proprietor in absolute ownership interest of that piece of land containing 0.07 hectare or thereabouts, situate in the district of Nyeri, registered under title No. Othaya/Kihugiru/1444, and whereas sufficient evidence has been adduced to show that the land title deed issued thereof has been lost, notice is given that after the expiration of sixty (60) days from the date hereof, I shall issue a new title deed provided that no objection has been received within that period.","Dated the 18th September, 2015.","R. W. NGAANYI,","MR/8045905 Land Registrar, Nyeri District.",Nyeri,Nyeri,KE.CE.NY,205.0,661156.0,3356.0,Nyeri,Nyeri District,Magdalene Njeri Kabura,,Othaya/Kihugiru/1444,
87105,http://kenyalaw.org/kenya_gazette/gazette/volume/MTE5NA--/Vol.CXVII-No.83/,5,2015,http://kenyalaw.org/kenya_gazette/gazette/year/2015,http://kenyalaw.org/kenya_gazette/gazette/volume/MTE5NA--/Vol.CXVII-No.83/,Vol.CXVII-No.83,"NAIROBI, \n\t\t \t07 August,2015",http://kenyalaw.org/kenya_gazette/gazette/volume/MTE5NA--/Vol.CXVII-No.83/,GAZETTE NOTICE NO. 5742,THE LAND REGISTRATION ACT,(No. 3 of 2012),RECONSTRUCTION OF LOST OR DESTROYED LAND REGISTER,"WHEREAS Primrose Management Limited, of P.O. Box 45425–00100, Nairobi in the Republic of Kenya, is registered proprietor lessee of all that piece of land known as L.R. No. 209/410/5, situate in the city of Nairobi in the Nairobi Area, by virtue of a conveyance registered in Volume N58 Folio 395/6 File 18424, and whereas the land register in respect thereof is lost or destroyed, and whereas efforts made to locate the said land register have failed, notice is given that after the expiration of sixty (60) days from the date hereof, the property register shall be reconstructed under the provisions of section 33 (5) of the Act, provided that no objection has been received within that period.","Dated the 7th August, 2015.","G. M. MUYANGA,","MR/7769896 Land Registrar, Nairobi.",Nairobi,Nairobi,KE.NA.NB,101.0,2143254.0,696.0,Nairobi,Nairobi District,Primrose Management Limited,,,
174909,http://kenyalaw.org/kenya_gazette/gazette/volume/MTEyMQ--/Vol.CXVII-No.20/,5,2015,http://kenyalaw.org/kenya_gazette/gazette/year/2015,http://kenyalaw.org/kenya_gazette/gazette/volume/MTEyMQ--/Vol.CXVII-No.20/,Vol.CXVII-No.20,"NAIROBI, \n\t\t \t27 February,2015",http://kenyalaw.org/kenya_gazette/gazette/volume/MTEyMQ--/Vol.CXVII-No.20/,GAZETTE NOTICE NO. 1285,THE LAND REGISTRATION ACT,(No. 3 of 2012),ISSUE OF A NEW LAND TITLE DEED,"WHEREAS Jacob Kitsao Yaa (ID/5474374), of P.O. Box 29, Kilifi in the Republic of Kenya, is registered as proprietor in absolute ownership interest of that piece of land containing 3.0 hectares or thereabout, situate in the district of Kilifi, registered under title No. Kilifi/Ngerenyi/1156, and whereas sufficient evidence has been adduced to show that the land title deed issued thereof has been lost, notice is given that after the expiration of sixty (60) days from the date hereof, I shall issue a new title deed provided that no objection has been received within that period.","Dated the 27th February, 2015.","M. S. CHINYAKA,","MR/6901351 Land Registrar, Kilifi District.",Na,,,,,,,,Jacob Kitsao Yaa,3.0 hectares,Kilifi/Ngerenyi/1156,
150362,http://kenyalaw.org/kenya_gazette/gazette/volume/MTE1Mw--/Vol.CXVII-No.37/,5,2015,http://kenyalaw.org/kenya_gazette/gazette/year/2015,http://kenyalaw.org/kenya_gazette/gazette/volume/MTE1Mw--/Vol.CXVII-No.37/,Vol.CXVII-No.37,"NAIROBI, \n\t\t \t10 April,2015",http://kenyalaw.org/kenya_gazette/gazette/volume/MTE1Mw--/Vol.CXVII-No.37/,GAZETTE NOTICE NO. 2355,THE LAND REGISTRATION ACT,(No. 3 of 2012),ISSUE OF A NEW LAND TITLE DEED,"WHEREAS Cyrus Maina Kamau, of Embu in the Republic of Kenya, is registered as proprietor in absolute ownership interest of that piece of land containing 0.80 hectare or thereabouts, situate in the district of Embu, registered under title No. Gaturi/Weru/4795, and whereas sufficient evidence has been adduced to show that the land title deed issued thereof has been lost, notice is given that after the expiration of sixty (60) days from the date hereof, I shall issue a new title deed provided that no objection has been received within that period.","Dated the 10th April, 2015.","M. W. KARIUKI,","MR/7054346 Land Registrar, Embu District.",Embu,Embu,KE.EA.EB,401.0,278196.0,729.0,Embu,Embu District,Cyrus Maina Kamau,,Gaturi/Weru/4795,


In [33]:
df_large_districts['first_ID']=df_large_districts.notice_body.str.extract("ID[/]([\d]+)") 
df_large_districts['first_ID'].sample(10)

179763    16037840
223716         NaN
19880      8858117
153075         NaN
15621          NaN
55652      1844136
90454          NaN
136063     7670576
59539          NaN
223068         NaN
Name: first_ID, dtype: object

In [34]:
df_large_districts['first_address']=df_large_districts.notice_body.str.extract("of (P.O. Box [\d-]+, [\w]+) in the Republic of") 
df_large_districts['first_address'].sample(10)

# of P.O. Box 3190–20100, Nakuru in the Republic of Kenya



181337       P.O. Box 334, Kikuyu
45327                         NaN
157855                        NaN
159415      P.O. Box 84, Kakamega
186215       P.O. Box 26, Werugha
171919     P.O. Box 281, Naivasha
14822       P.O. Box 129, Kajiado
156982    P.O. Box 74628, Nairobi
208148     P.O. Box 3317, Nanyuki
149936                        NaN
Name: first_address, dtype: object

In [35]:
df_large_districts['lr_num']=df_large_districts.notice_body.str.extract("all that piece of land known as L.R. No[s?]. ([\d]+/[\d]+),") 
df_large_districts['lr_num'].sample(12)



# all that piece of land known as L.R. No. 12570/174,

153464    NaN
22938     NaN
225891    NaN
217923    NaN
200922    NaN
92485     NaN
165188    NaN
226641    NaN
173716    NaN
111021    NaN
10754     NaN
99463     NaN
Name: lr_num, dtype: object

In [36]:
df_large_districts['municipality']=df_large_districts.notice_body.str.extract("in ([\w]+) Municipality") 
df_large_districts['municipality'].sample(10)
#in Nakuru Municipality

58888     NaN
129471    NaN
25063     NaN
99511     NaN
213249    NaN
36314     NaN
230076    NaN
173095    NaN
19382     NaN
131154    NaN
Name: municipality, dtype: object

In [37]:
df_large_districts['second_person']=df_large_districts.notice_body.str.extract("and [(]2[)] ([\w ]+),") 
df_large_districts['second_person'].sample(12)
#and (2) Shobna Bachulal Shah,

58085                       NaN
159608                      NaN
139637                      NaN
72554                       NaN
167850    Kevin Wachira Njoroge
107748                      NaN
149485                      NaN
219643                      NaN
152266                      NaN
89745                       NaN
209025                      NaN
88380                       NaN
Name: second_person, dtype: object

In [38]:
df_large_districts['ir_num']=df_large_districts.notice_body.str.extract("I.R. ([\d]+)") 
df_large_districts['ir_num'].sample(12)
#certificate of title registered as I.R. 120880/1, 

97023     NaN
7683      NaN
49976     NaN
84237     NaN
203642    NaN
86760     NaN
218422    NaN
164772    NaN
16259     NaN
80223     NaN
77412     NaN
8393      NaN
Name: ir_num, dtype: object

In [39]:
df_large_districts['days']=df_large_districts.notice_body.str.extract("of [\w]+ [(]([\d]+)[)] days") 
df_large_districts['days'].sample(12)
#of sixty (60) days

48353     60
95065     60
175384    60
162774    30
178976    60
80993     60
172996    60
200492    30
86968     30
158332    60
162020    30
181546    60
Name: days, dtype: object

In [40]:
df_large_districts['succession']=df_large_districts.notice_body.str.extract("in succession cause No. ([\d]+ of [\d]+)") 
df_large_districts['succession'].sample(12)
#in succession cause No. 93 of 2013 has

65153     NaN
220316    NaN
142887    NaN
26830     NaN
14604     NaN
91507     NaN
30339     NaN
34156     NaN
64927     NaN
214346    NaN
112330    NaN
224696    NaN
Name: succession, dtype: object

In [41]:
df_large_districts['township']=df_large_districts.notice_body.str.extract("in ([\w ]+) Township") 
df_large_districts['township'].sample(12)
#situate in Gilgil Township in the district of Nakuru

96740     NaN
122796    NaN
88029     NaN
42374     NaN
129556    NaN
158200    NaN
228186    NaN
156465    NaN
21494     NaN
181449    NaN
49758     NaN
213888    NaN
Name: township, dtype: object

In [42]:
df_large_districts['in_the_district_of']=df_large_districts.notice_body.str.extract("district of ([\w ]+),") 
df_large_districts['in_the_district_of'].sample(12)
#situate in Gilgil Township in the district of Nakuru

119396          Siaya
86715         Nairobi
76224          Vihiga
27001     Trans Nzoia
69518           Nandi
86970           Siaya
201495         Kisumu
22727         Kajiado
178343          Busia
196989       Kakamega
94987          Nakuru
37394        Laikipia
Name: in_the_district_of, dtype: object

In [44]:
df_large_districts['province_code']=df_large_districts['HASC'].str.extract("[A-Z][A-Z].([A-Z][A-Z]).[A-Z][A-Z]") 
df_large_districts['province_code'].sample(12)
#KE.WE.KK

183021     EA
223909     RV
51486      RV
102937     CE
36798      WE
164755     RV
22656     NaN
190663    NaN
47708      RV
85874      WE
5186      NaN
65879      CO
Name: province_code, dtype: object

In [45]:
df_large_districts['province_code'].unique()

array(['NA', nan, 'CE', 'EA', 'RV', 'CO', 'NY', 'WE', 'NE'], dtype=object)

# Reading data on provinces and their codes 

In [46]:
df_provinces=pd.read_excel('/Users/ivynyayieka/Downloads/provinces.xlsx')
df_provinces.head()

Unnamed: 0,province,province_code
0,Central,CE
1,Coast,CO
2,Eastern,EA
3,Nairobi,
4,North Eastern,NE


# Merging province data to dataframe

In [51]:
df_large_provinces = pd.merge(df_large_districts, df_provinces, left_on='province_code', right_on='province_code', how='left')
df_large_provinces.sample(10)

Unnamed: 0,key_0,index,year,url_year,gazette_links,volume_num,volume_date,volume_url,notice_num_title,notice_act_title,notice_num_year,notice_sub_title,notice_body,notice_date,notice_registrar_name,notice_num_loc,notice_loc,District,HASC,Cod,Population,Area(km.²),Capital,district_named,first_named,acres_hectares,title_number,first_ID,first_address,lr_num,municipality,second_person,ir_num,days,succession,township,in_the_district_of,province_code,province
184393,http://kenyalaw.org/kenya_gazette/gazette/volume/MTExNw--/Vol.CXVII-No.12/,5,2015,http://kenyalaw.org/kenya_gazette/gazette/year/2015,http://kenyalaw.org/kenya_gazette/gazette/volume/MTExNw--/Vol.CXVII-No.12/,Vol.CXVII-No.12,"NAIROBI, \n\t\t \t06 February,2015",http://kenyalaw.org/kenya_gazette/gazette/volume/MTExNw--/Vol.CXVII-No.12/,GAZETTE NOTICE NO. 749,THE LAND REGISTRATION ACT,(No. 3 of 2012),ISSUE OF A NEW LAND TITLE DEED,"WHEREAS Esther Wanjiru Gachuhi (ID/2885213), is registered as proprietor in absolute ownership interest of that piece of land containing 0.0556 hectare or thereabouts, situate in the district of Nakuru, registered under title No. Dundori/Lanet Block 5 (New Gakoe)/2037, and whereas sufficient evidence has been adduced to show that the land title deed issued thereof has been lost, notice is given that after the expiration of sixty (60) days from the date hereof, I shall issue a new title deed provided that no objection has been received within that period.","Dated the 6th February, 2015.","M. V. BUNYOLI,","MR/6742458 Land Registrar, Nakuru District.",Nakuru,Nakuru,KE.RV.NK,709.0,1187039.0,7242.0,Nakuru,Nakuru District,Esther Wanjiru Gachuhi,,Dundori/Lanet Block 5 (New Gakoe)/2037,2885213.0,,,,,,60,,,Nakuru,RV,Rift Valley
198124,http://kenyalaw.org/kenya_gazette/gazette/volume/MTExMw--/Vol.CXVII-No.7/,5,2015,http://kenyalaw.org/kenya_gazette/gazette/year/2015,http://kenyalaw.org/kenya_gazette/gazette/volume/MTExMw--/Vol.CXVII-No.7/,Vol.CXVII-No.7,"NAIROBI, \n\t\t \t23 January,2015",http://kenyalaw.org/kenya_gazette/gazette/volume/MTExMw--/Vol.CXVII-No.7/,Gazette Notice No. 436,THE LAND REGISTRATION ACT,(No. 3 of 2012),LOSS OF A LAND REGISTER,"WHEREAS Julius Kariuki Mwathi, of P.O. Box 520, Thika in the Republic of Kenya, is registered as proprietor of that piece of land containing 0.2023 hectare or thereabouts, known as Makuyu/Kimorori Block III/2162, situate in the district of Murang’a, and whereas sufficient evidence has been adduced to show that the the land register in respect thereof is missing, and whereas all efforts made to locate the said land register have failed, notice is given that after the expiration of sixty (60) days from the date hereof, provided that no valid objection has been received within that period, I intend to open another land register and upon such opening the said missing land register shall be deemed to have been cancelled and of no effect.","Dated the 23rd January, 2015.","P. K. KIMANI,","MR/6724403 Land Registrar, Murang’a District.",Murang’a,,,,,,,,Julius Kariuki Mwathi,,,,"P.O. Box 520, Thika",,,,,60,,,,,Nairobi
44225,http://kenyalaw.org/kenya_gazette/gazette/volume/MTA3NA--/Vol.CXVI-No.110/,4,2014,http://kenyalaw.org/kenya_gazette/gazette/year/2014,http://kenyalaw.org/kenya_gazette/gazette/volume/MTA3NA--/Vol.CXVI-No.110/,Vol.CXVI-No.110,"NAIROBI, \n\t\t \t12 September,2014",http://kenyalaw.org/kenya_gazette/gazette/volume/MTA3NA--/Vol.CXVI-No.110/,GAZETTE NOTICE NO. 6331,THE LAND REGISTRATION ACT,(No. 3 of 2012),ISSUE OF A NEW LAND TITLE DEED,"WHEREAS Patrick Samuel Macharia Gachihi, of P.O. Box 377, Naromoro in the Republic of Kenya, is registered as proprietor in absolute ownership interest of that piece of land containing 1.45 hectares or thereabout, situate in the district of Nyeri, registered under title No. Naromoro/Naromoro Block 1/Kieni East/113, and whereas sufficient evidence has been adduced to show that the land title deed issued thereof has been lost, notice is given that after the expiration of sixty (60) days from the date hereof, I shall issue a new title deed provided that no objection has been received within that period.","Dated the 12th September, 2014.","S. N. NDIRANGU,","MR/5748961 Land Registrar, Nyeri District.",Nyeri,Nyeri,KE.CE.NY,205.0,661156.0,3356.0,Nyeri,Nyeri District,Patrick Samuel Macharia Gachihi,1.45 hectares,Naromoro/Naromoro Block 1/Kieni East/113,,"P.O. Box 377, Naromoro",,,,,60,,,Nyeri,CE,Central
43293,http://kenyalaw.org/kenya_gazette/gazette/volume/MTA3NA--/Vol.CXVI-No.110/,4,2014,http://kenyalaw.org/kenya_gazette/gazette/year/2014,http://kenyalaw.org/kenya_gazette/gazette/volume/MTA3NA--/Vol.CXVI-No.110/,Vol.CXVI-No.110,"NAIROBI, \n\t\t \t12 September,2014",http://kenyalaw.org/kenya_gazette/gazette/volume/MTA3NA--/Vol.CXVI-No.110/,GAZETTE NOTICE NO. 6353,THE LAND REGISTRATION ACT,(No. 3 of 2012),ISSUE OF A NEW LAND TITLE DEED,"WHEREAS Daniel Gatimu Kalulu (ID/13846820), of P.O. Box 113, Kianyaga in the Republic of Kenya, is registered as proprietor in absolute ownership interest of that piece of land containing 0.51 hectare or thereabouts, situate in the district of Kirinyaga, registered under title No. Baragwe/Raimu/1680, and whereas sufficient evidence has been adduced to show that the land title deed issued thereof has been lost, notice is given that after the expiration of sixty (60) days from the date hereof, I shall issue a new title deed provided that no objection has been received within that period.","Dated the 12th September, 2014.","C. W. NJAGI,","MR/5749040 Land Registrar, Kirinyaga District.",Kirinyaga,Kirinyaga,KE.CE.KY,202.0,457105.0,1478.0,Kerugoya/Kutus,Kirinyaga District,Daniel Gatimu Kalulu,,Baragwe/Raimu/1680,13846820.0,"P.O. Box 113, Kianyaga",,,,,60,,,Kirinyaga,CE,Central
203828,http://kenyalaw.org/kenya_gazette/gazette/volume/MTEwOQ--/Vol.CXVII-No.1/,5,2015,http://kenyalaw.org/kenya_gazette/gazette/year/2015,http://kenyalaw.org/kenya_gazette/gazette/volume/MTEwOQ--/Vol.CXVII-No.1/,Vol.CXVII-No.1,"NAIROBI, \n\t\t \t02 January,2015",http://kenyalaw.org/kenya_gazette/gazette/volume/MTEwOQ--/Vol.CXVII-No.1/,Gazette Notice No. 42,THE LAND REGISTRATION ACT,(No. 3 of 2012),ISSUE OF A NEW LAND TITLE DEED,"WHEREAS Wilson Aseka Shahasi, of P.O. Box 44899, Nairobi in the Republic of Kenya, is registered as proprietor in absolute ownership interest of that piece of land containing 0.88 hectare or thereabouts, situate in the district of Nandi, registered under title No. Nandi/Kamobo/1433, and whereas sufficient evidence has been adduced to show that the land title deed issued thereof has been lost, notice is given that after the expiration of sixty (60) days from the date hereof, I shall issue a new title deed provided that no objection has been received within that period.","Dated the 2nd January, 2015.","E. A. ODERO,","MR/6724114 Land Registrar, Nandi District.",Nandi,Nandi,KE.RV.NA,710.0,578751.0,2899.0,Kapsabet,Nandi District,Wilson Aseka Shahasi,,Nandi/Kamobo/1433,,"P.O. Box 44899, Nairobi",,,,,60,,,Nandi,RV,Rift Valley
90456,http://kenyalaw.org/kenya_gazette/gazette/volume/MTE5Mw--/Vol.CXVII-No.80/,5,2015,http://kenyalaw.org/kenya_gazette/gazette/year/2015,http://kenyalaw.org/kenya_gazette/gazette/volume/MTE5Mw--/Vol.CXVII-No.80/,Vol.CXVII-No.80,"NAIROBI, \n\t\t \t31 July,2015",http://kenyalaw.org/kenya_gazette/gazette/volume/MTE5Mw--/Vol.CXVII-No.80/,GAZETTE NOTICE NO. 5569,THE LAND REGISTRATION ACT,(No. 3 of 2012),ISSUE OF A NEW LAND TITLE DEED,"WHEREAS Michael Kingori Ndirangu, of P.O. Box 44334, Karen in the Republic of Kenya, is registered as proprietor in absolute ownership interest of that piece of land containing 2.075 hectares or thereabout, situate in the district of Machakos, registered under title No. Mavoko/Town Block 3/1203, and whereas sufficient evidence has been adduced to show that the land title deed issued thereof has been lost, notice is given that after the expiration of sixty (60) days from the date hereof, I shall issue a new title deed provided that no objection has been received within that period.","Dated the 31st July, 2015.","G. M. NJOROGE,","MR/7796992 Land Registrar, Machakos District.",Machakos,Machakos,KE.EA.MC,405.0,906644.0,6281.0,Machakos,Machakos District,Michael Kingori Ndirangu,2.075 hectares,Mavoko/Town Block 3/1203,,"P.O. Box 44334, Karen",,,,,60,,,Machakos,EA,Eastern
126584,http://kenyalaw.org/kenya_gazette/gazette/volume/MTE2OQ--/Vol.CXVII-No.52/,5,2015,http://kenyalaw.org/kenya_gazette/gazette/year/2015,http://kenyalaw.org/kenya_gazette/gazette/volume/MTE2OQ--/Vol.CXVII-No.52/,Vol.CXVII-No.52,"NAIROBI, \n\t\t \t22 May,2015",http://kenyalaw.org/kenya_gazette/gazette/volume/MTE2OQ--/Vol.CXVII-No.52/,GAZETTE NOTICE NO. 3531,THE LAND REGISTRATION ACT,(No. 3 of 2012),REGISTRATION OF INSTRUMENT,"WHEREAS Wilson Oduor Otieno alias Wilson Oduor Othieno (deceased), of Yala in the Republic of Kenya, is registered as proprietor of that piece of land known as Siaya/Karapul Ramba/2204, situate in the district of Siaya, and whereas the High Court at Kisumu in succession cause No. H.C./244 of 2014, has ordered that the piece of land be registered in the names of (1) Percila Auma Oduor and (2) Ellen Adhiambo Oduor, and whereas efforts made to recover the land title deed issued thereof by the land registrar have failed, notice is given that after the expiration of thirty (30) days from the date hereof, provided no valid objection has been received within that period, I intend to dispense with the production of the said land title deed and proceed with registration of the said grant document and issue a land title deed to the said (1) Percila Auma Oduor and (2) Ellen Adhiambo Oduor, and upon such registration the land title deed issued earlier to the said Wilson Oduor Otieno alias Wilson Oduor Othieno (deceased), shall be deemed to be cancelled and of no effect.","Dated the 22nd May, 2014.","P. A. OWEYA ,","MR/7413518 Land Registrar, Siaya District.",Siaya,Siaya,KE.NY.SI,609.0,480184.0,1520.0,Siaya,Siaya District,Wilson Oduor Otieno alias Wilson Oduor Othieno,,,,,,,Ellen Adhiambo Oduor,,30,,,Siaya,NY,Nyanza
73699,http://kenyalaw.org/kenya_gazette/gazette/volume/MTE5Nw--/Vol.CXVII-No.84/,5,2015,http://kenyalaw.org/kenya_gazette/gazette/year/2015,http://kenyalaw.org/kenya_gazette/gazette/volume/MTE5Nw--/Vol.CXVII-No.84/,Vol.CXVII-No.84,"NAIROBI, \n\t\t \t14 August,2015",http://kenyalaw.org/kenya_gazette/gazette/volume/MTE5Nw--/Vol.CXVII-No.84/,GAZETTE NOTICE NO. 5881,THE LAND REGISTRATION ACT,(No. 3 of 2012),ISSUE OF A NEW LAND TITLE DEED,"WHEREAS Stephen Mwaniki Magethe (ID/5172279), of P.O. Box 233, North Kinangop in the Republic of Kenya, is registered as proprietor in absolute ownership interest of that piece of land containing 0.405 hectare or thereabouts, situate in the district of Nyandarua, registered under title No. Nyandarua/Mkungi/1788, and whereas sufficient evidence has been adduced to show that the land title deed issued thereof has been lost, notice is given that after the expiration of sixty (60) days from the date hereof, I shall issue a new title deed provided that no objection has been received within that period.","Dated the 14th August, 2015.","J. W. KARANJA,","MR/7769841 Land Registrar, Nyandarua/Samburu Districts.",Nyandarua/Samburu,,,,,,,,Stephen Mwaniki Magethe,,Nyandarua/Mkungi/1788,5172279.0,,,,,,60,,,Nyandarua,,Nairobi
196301,http://kenyalaw.org/kenya_gazette/gazette/volume/MTExMw--/Vol.CXVII-No.7/,5,2015,http://kenyalaw.org/kenya_gazette/gazette/year/2015,http://kenyalaw.org/kenya_gazette/gazette/volume/MTExMw--/Vol.CXVII-No.7/,Vol.CXVII-No.7,"NAIROBI, \n\t\t \t23 January,2015",http://kenyalaw.org/kenya_gazette/gazette/volume/MTExMw--/Vol.CXVII-No.7/,GAZETTE NOTICE NO. 382,THE LAND REGISTRATION ACT,(No. 3 of 2012),ISSUE OF A PROVISIONAL CERTIFICATE,"WHEREAS (1) Robert Njuguna Njubi and (2) John Njenga Njubi, both of P.O. Box 1244–20117, Naivasha in the Republic of Kenya, are the registered proprietors lessees of all that piece of land known as L.R. No. 1556/39, situate in the south east of Naivasha Town in Nakuru District, by virtue of a certificate of title registered as I.R. 42592/1, and whereas sufficient evidence has been adduced to show that the said certificate of title issued thereof has been lost, notice is given that after the expiration of sixty (60) days from the date hereof, I shall issue a provisional certificate of title provided that no objection has been received within that period.","Dated the 23rd January, 2015.","C. N. KITUYI,","MR/6724492 Registrar of Titles, Nairobi.",Nairobi,Nairobi,KE.NA.NB,101.0,2143254.0,696.0,Nairobi,Nairobi District,,,,,,,,John Njenga Njubi,42592.0,60,,,,,
190819,http://kenyalaw.org/kenya_gazette/gazette/volume/MTExNQ--/Vol.CXVII-No.9/,5,2015,http://kenyalaw.org/kenya_gazette/gazette/year/2015,http://kenyalaw.org/kenya_gazette/gazette/volume/MTExNQ--/Vol.CXVII-No.9/,Vol.CXVII-No.9,"NAIROBI, \n\t\t \t30 January,2015",http://kenyalaw.org/kenya_gazette/gazette/volume/MTExNQ--/Vol.CXVII-No.9/,Gazette Notice No. 550,THE LAND REGISTRATION ACT,(No. 3 of 2012),ISSUE OF A NEW LAND TITLE DEED,"WHEREAS Joseph Mwangi Mbochere, is registered as proprietor in absolute ownership interest of that piece of land containing 3.92 hectares or thereabout, situate in the district of Kirinyaga, registered under title No. Kiine/Thigirichi/369, and whereas sufficient evidence has been adduced to show that the land title deed issued thereof has been lost, notice is given that after the expiration of sixty (60) days from the date hereof, I shall issue a new title deed provided that no objection has been received within that period.","Dated the 30th January, 2015.","J. K. MUTHEE,","MR/6742125 Land Registrar, Kirinyaga District.",Kirinyaga,Kirinyaga,KE.CE.KY,202.0,457105.0,1478.0,Kerugoya/Kutus,Kirinyaga District,Joseph Mwangi Mbochere,3.92 hectares,Kiine/Thigirichi/369,,,,,,,60,,,Kirinyaga,CE,Central


In [54]:
df_large_provinces.to_csv('titles_new_df_large_provinces.csv')