# 2022 Kenyan General Elections Through Tweets

#### Kenya, an East African country, will hold a general election this year, on August 9th, 2022. This mini project aims to understand Kenyans' sentiments on election-related topics, with the goal of identifying which presidential candidate is viewed most favourably and is most likely to clinch victory.

#### Data is acquired from Twitter, a social platform that has gained popularity among the masses in Kenya (and globally), by web scraping with the `Twint` library.
    

#### The data (tweets) limit is set to `5 million data points`.
    
    
    
    
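##### Scraping Library Choice Summary: 
- Twint scrapes Twitter without going through the official Twitter API, so it needs no API keys or developer account and is not bound by the API's limits. Tweepy, a Python wrapper for the official Twitter API, requires a developer account and inherits the API's limits; for example, a user timeline returns only the 3,200 most recent tweets.

- In this mini project, I will use the Twint scraping tool.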

In [8]:
# Import the Twint scraping library
import twint

# Patch asyncio to allow nested event loops (the notebook already runs one, and Twint needs its own)
import nest_asyncio
nest_asyncio.apply()

# Import libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

import os
import re

In [9]:
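# Scrape election tweets with Twint, or reuse the cached CSV


def get_kenya_elections_tweets(use_cache=True):
    """Return election tweets as a DataFrame, from the local CSV if cached, else via Twint."""
    raw_tweets = 'kenyan_2022_elections_sentiments.csv'

    # Reuse the local file when it exists and caching is allowed
    if os.path.exists(raw_tweets) and use_cache:
        print('Reading local csv file..')
        return pd.read_csv(raw_tweets)

    # Otherwise configure and run a Twint search, storing the results as CSV
    config = twint.Config()
    config.Search = ["kenya elections 2022"]
    config.Lang = "en"
    config.Limit = 5_000_000  # upper bound on the number of tweets to fetch
    config.Store_csv = True
    config.Output = raw_tweets
    print('Running search...')
    twint.run.Search(config)
    return pd.read_csv(raw_tweets)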

### Read Tweets

In [10]:
# Get tweets and save in csv format
tweets = get_kenya_elections_tweets()

Running search...
1549995632382951424 2022-07-21 00:52:10 -0500 <RealWiseNganga> This is Clarity on my Prophecy about the 2022 Kenyan Elections.   The Mad Man is God's Man. He is Kenya's Salvation or Way Out from What's To Come.  GO READ ISAIAH Chapter 45 Now. #Elections2022 #Prophecy
1549975786077229056 2022-07-20 23:33:19 -0500 <nicmuhando> How @Meta is Preparing for Kenya’s 2022 General Election: working closely with election authorities and trusted partners, and a dedicated Kenyan Elections Operation Centre activated as part of its ongoing work in supporting major elections around the world. #Brekko  https://t.co/wgfGULxd5v
1549974027669803008 2022-07-20 23:26:19 -0500 <NationFmKE> The resulting “Pro-Peace Kenya” Project supports the interventions of NCIC and aims at preventing election related conflict and violence in the general and presidential elections of 2022 and beyond.  #ElectionsBilaNoma   @DavidOyuke  #DavidThisMorning #MorningFix
1549968786056519685 2022-07-20 23:05:30 -

1549827497914765312 2022-07-20 13:44:04 -0500 <FDairysia> Kenya elections 2022: President Kenyatta halves cost of maize flour  https://t.co/Y2qaSqSYwW
1549827356444999687 2022-07-20 13:43:30 -0500 <techtoday468> Kenya elections 2022: President Kenyatta halves cost of maize flour  https://t.co/tGwUcO4hHG
1549827031227154432 2022-07-20 13:42:13 -0500 <OdhiamboRhoda> Kenya elections 2022: President Kenyatta halves cost of maize flour by @CharlieGitonga @BBCAfrica   https://t.co/DV3GzgEUfa
1549826809646178309 2022-07-20 13:41:20 -0500 <SahalFarhan1> Kenya elections 2022: President Kenyatta halves cost of maize flour  https://t.co/C9ZWAUuMz7
1549826540594249728 2022-07-20 13:40:16 -0500 <thedailyretina> Kenya elections 2022: President Kenyatta halves cost of maize flour  https://t.co/8AZPA8bXcO
1549826473120464898 2022-07-20 13:40:00 -0500 <CharlieGitonga> I enjoy my ugali! So I've been busy looking at whether the cost of maize flour will come down in Kenya and why the whole issue is so pol

1549416114072985600 2022-07-19 10:29:23 -0500 <BiSOEA2021> Conflict resolution based on coming presidential election 2022 this coming august and how we are going protect our fellow refugees mostly queer refugees through elections period and after elections  @HoymasK @UNHCR_Kenya @NGLHRC @Ke_swa @drcEA_GL  https://t.co/Ls6hgSruKX
1549414697442676737 2022-07-19 10:23:45 -0500 <WiLeadership_KE> Justina Wamae and Ruth Mucheru on the Debate stands, NOW. They are among the top three female candidates in the 2022 elections as they eye the Office of the Deputy President in Kenya; the third is Martha Karua. #PresidentialDebatesKe2022 #BecauseWomenCAN
1549407027021189121 2022-07-19 09:53:16 -0500 <CEOAfricanewss> Kenya 2022: 'I am the people's project,' Odinga boasts ahead of elections  Read More:  https://t.co/5ZEslXwxI5 Adeleke Ajah Ibadan GreatReset ImoState EricaNlewedim #WeMetOnTwitter Davido BillboardHot100 Hannibal Maguire ManchesterUnited ArriveCAN #Ukraine TenHag Forced Donny  https://t

1548718494216224772 2022-07-17 12:17:17 -0500 <Kuilean1> @KTNNewsKE Stupid things that Manyora has said before: That BBI would go to referendum before elections That mt Kenya would all follow uhuru June 2021- Ruto would withdraw from 2022 race March 2021- we should postpone elections because of covid ( Martha termed this as treasonable,)
1548714557031518208 2022-07-17 12:01:38 -0500 <MediaCouncilK> Kenya Elections 2022: Join our Director for Media Training and Development .@vicbwire tomorrow morning  on .@RadioCitizenFM as he discusses media's place in the transitional polls. #jambokenya  https://t.co/tbNmjSK669
1548709477750292480 2022-07-17 11:41:27 -0500 <Kuilean1> @HManyora @bruno_otiato Stupid things that Manyora has said before: That BBI would go to referendum before elections That mt Kenya would all follow uhuru June 2021- Ruto would withdraw from 2022 race March 2021- we should postpone elections because of covid ( Martha termed this as treasonable,)
1548675301735006209 2022-07

1547906839647379458 2022-07-15 06:32:03 -0500 <CGW_Kenya> Kajiado County Plenary Session on conflict mitigation toward peaceful elections 2022 @tendasasa @USAIDKenya @GermanyinKenya  @UKinKenya @NSCpeace @NCIC_Kenya  https://t.co/mIeFXxFHvo
1547900411390046209 2022-07-15 06:06:31 -0500 <CRECOKenya> Mis &amp; Disinformation undermines democracy &amp; peaceful elections. We continue to engage key Independent Commissions towards amplifying peace messaging ahead of 2022 GE. @CRECOKenya ED  @jkchangwony met with @NCIC_Kenya Chair Rev. Dr. Samuel Kobia discuss areas of collaborations.  https://t.co/wnQ9sxxjpk
1547898615665897472 2022-07-15 05:59:23 -0500 <article19eafric> ARTICLE19 has been working with Tiktok to promote #Tiktok4peace campaigns in Kenya 🇰🇪 as we near the general elections to be held on August 9th 2022. #TikTok4Peace  https://t.co/JcNpYy8urs
1547881927444533251 2022-07-15 04:53:04 -0500 <EmmaEric1997> Une mission d'observation dirigée par SE Dr. Mulatu Teshome, l'ancien prési

1547293257465532417 2022-07-13 13:53:54 -0500 <OdhiamboAlf> @kithurekindiki_ Emperically speeking, you should really hope and pray in Central Kenya that the voter apathy witnessed in 2017 repeat elections and 2022 UDA nominations is not replicated on August 8. Lawyers argues based on facts and evidence not cursings based on frustrations and panic. VIVA!
1547265799823495168 2022-07-13 12:04:48 -0500 <assooood> Kenya elections 2022: The misinformation circulating over academic qualifications  https://t.co/gooVxAlLaa
1547252501912735745 2022-07-13 11:11:57 -0500 <missyHaroona> Kijana Jihusishe: Youth Promoting Peaceful 2022 Elections in Kenya | EEAS Website ⁦@KeshoAlliance⁩ 🙏🙏🦾🦾   https://t.co/kQgC84Zpi1
1547231555482943488 2022-07-13 09:48:43 -0500 <ILoveEmbu> We are out here, enjoying the Sun as we Preach Peace in Embu county and Kenya at Large!  Tunasema Uchaguzi Bila Noma, Elections Bila Noma!   https://t.co/Uemm0ylWhK, supporting Peaceful Elections 2022!  #Iloveembu LEAKED AUDIO #Lan

1546761094521307137 2022-07-12 02:39:17 -0500 <Chief_Johnson1> As such, the recovery of the private sector business environment in Kenya is largely pegged on how quickly the global economy stabilizes with a key concern on the rising fuel prices as well as the upcoming August 2022 general elections. #CytonnReport
1546753451895447553 2022-07-12 02:08:54 -0500 <ConnectSoko> As such, the recovery of the private sector business environment in Kenya is largely pegged on how quickly the global economy stabilizes with a key concern on the rising fuel prices as well as the upcoming August 2022 general elections. ~Soko Directory #CytonnReport
1546737165232705536 2022-07-12 01:04:11 -0500 <lloydOnyango> 2022 General Elections is about CHARACTER and INTEGRITY. We shouldn't allow ourselves to be governed by wash wash gang and fraudulent people who have tainted images. Kenya deserves better!
1546722407003693057 2022-07-12 00:05:33 -0500 <kijana_yacopa> Apostle Paul's letter to Kenya's IEBC ... "and 

1546388727441809413 2022-07-11 01:59:37 -0500 <worldnews_guru> Kenya elections 2022: The misinformation circulating over academic qualifications  https://t.co/VAFYdi2DTC
1546379541937065984 2022-07-11 01:23:07 -0500 <KiprutoCorneli> Annabel Ndilo,who has predicted the shelter, food and Clothing Crisis in Rutonomincs -written  well in advances of the historic 2022 elections. A RUTO presidency based on Justice and  Cooperation Principles that have not held Sway in Kenya Politics, for quite sometime.#Rutonomics
1546374789417369600 2022-07-11 01:04:14 -0500 <worldnews_guru> Kenya elections 2022: The misinformation circulating over academic qualifications   https://t.co/Om60bsLCdG  https://t.co/Ku8gGJNr0n
1546373451551850496 2022-07-11 00:58:55 -0500 <worldnews_guru> Kenya elections 2022: The misinformation circulating over academic qualifications  https://t.co/8y4rru0Xpa
1546372265243299842 2022-07-11 00:54:12 -0500 <ProPeace_Kenya> Eces is pleased to support @IEBCKenya in the organisation

[!] No more data! Scraping will stop now.
found 0 deleted tweets in this search.


In [11]:
# Review dataframe
tweets[:3]

Unnamed: 0,id,conversation_id,created_at,date,time,timezone,user_id,username,name,place,...,geo,source,user_rt_id,user_rt,retweet_id,reply_to,retweet_date,translate,trans_src,trans_dest
0,1549995632382951424,1549995632382951424,2022-07-21 00:52:10 CDT,2022-07-21,00:52:10,-500,1549726926553681920,realwisenganga,real.wisenganga,,...,,,,,,[],,,,
1,1549975786077229056,1549975786077229056,2022-07-20 23:33:19 CDT,2022-07-20,23:33:19,-500,1382201755,nicmuhando,MUHANDO,,...,,,,,,[],,,,
2,1549974027669803008,1549970492022464514,2022-07-20 23:26:19 CDT,2022-07-20,23:26:19,-500,46354706,nationfmke,Nation FM,,...,,,,,,[],,,,


In [12]:
# Make copy of dataframe 
df = tweets.copy()

In [13]:
# Number of unique usernames in the scraped tweets
df.username.nunique()

254

### Check for Nulls

In [14]:
# Percentage of missing values in each column
missing_data_percent = df.isnull().sum() * 100 / len(df)
missing_data_percent

id                   0.000000
conversation_id      0.000000
created_at           0.000000
date                 0.000000
time                 0.000000
timezone             0.000000
user_id              0.000000
username             0.000000
name                 0.000000
place              100.000000
tweet                0.000000
language             0.000000
mentions             0.000000
urls                 0.000000
photos               0.000000
replies_count        0.000000
retweets_count       0.000000
likes_count          0.000000
hashtags             0.000000
cashtags             0.000000
link                 0.000000
retweet              0.000000
quote_url           94.857143
video                0.000000
thumbnail           68.571429
near               100.000000
geo                100.000000
source             100.000000
user_rt_id         100.000000
user_rt            100.000000
retweet_id         100.000000
reply_to             0.000000
retweet_date       100.000000
translate 

### Drop Nulls and Unnecessary Columns

In [15]:
cols_to_drop = ['place', 'quote_url','thumbnail', 
                'near','geo','source', 'user_rt_id', 
                'user_rt','retweet_id', 'retweet_date',
                'trans_src', 'trans_dest', 'translate',
                'cashtags', 'reply_to', 'retweet', 'created_at']

df = df.drop(columns = cols_to_drop)

### Data Preparation Notes:

- Dropped columns with more than 90% missing values. 
- `thumbnail` (about 69% missing) contains Twitter image links; it is dropped as non-essential information.
- The scrape was restricted to English, but tweets tagged with other languages still came through (Swahili, which is widely spoken in Kenya, was not among them). 
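
The 90% rule in the notes above can also be applied programmatically instead of listing the columns by hand. The following is a minimal sketch on a made-up toy frame; the helper name `columns_above_missing_threshold` and the toy values are assumptions for illustration, not part of the original notebook.

```python
import pandas as pd


def columns_above_missing_threshold(frame, threshold_percent=90.0):
    """Return names of columns whose share of missing values exceeds the threshold."""
    missing_percent = frame.isnull().mean() * 100
    return missing_percent[missing_percent > threshold_percent].index.tolist()


# Toy frame standing in for the scraped tweets (values are made up)
toy = pd.DataFrame({
    "tweet": ["a", "b", "c", "d"],
    "place": [None, None, None, None],    # 100% missing
    "thumbnail": ["x", None, None, None],  # 75% missing, below the threshold
})

mostly_missing = columns_above_missing_threshold(toy)
trimmed = toy.drop(columns=mostly_missing)
print(mostly_missing)
print(trimmed.columns.tolist())
```

Columns that fall below the threshold but are still unwanted (such as `thumbnail`) would still need to be dropped explicitly.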

#### Data Limitations
- The DataFrame contains no retweets.
- Although the limit was set to 5 million tweets, the scrape returned tweets from only 254 unique usernames (possibly with multiple tweets from the same accounts).
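
One way to check the "multiple tweets from the same accounts" suspicion, and to spot the same headline shared by several accounts (as seen in the search log), is sketched below. It runs on a made-up toy frame with hypothetical usernames and `example.com` links; the column names `username` and `tweet` match the scraped data, everything else is an assumption.

```python
import pandas as pd

# Toy frame standing in for the scraped tweets (values are made up)
toy = pd.DataFrame({
    "username": ["alice", "bob", "alice", "carol"],
    "tweet": [
        "Maize flour headline  https://example.com/1",
        "Maize flour headline  https://example.com/2",
        "My own take on the debate",
        "Maize flour headline  https://example.com/3",
    ],
})

# Accounts that appear more than once
tweets_per_user = toy["username"].value_counts()
repeat_accounts = tweets_per_user[tweets_per_user > 1]

# Shared headlines differ only in their shortened link, so strip URLs before comparing
text_without_urls = (
    toy["tweet"].str.replace(r"https?://\S+", "", regex=True).str.strip()
)
is_repeat_text = text_without_urls.duplicated(keep="first")

print(repeat_accounts)
print(is_repeat_text.tolist())
```

Rows flagged in `is_repeat_text` could be reviewed or removed before any sentiment scoring, so that a widely shared news headline does not dominate the result.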

In [18]:
# Language Check
df.language.unique()

array(['en', 'qht', 'ca', 'in', 'fr', 'tl', 'und'], dtype=object)
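
Since non-English language tags slipped through, a possible next step is to count tweets per language tag and keep only the rows tagged `en`. This is a sketch on a made-up toy frame, not the notebook's own code; only the `language` column name comes from the scraped data.

```python
import pandas as pd

# Toy frame standing in for the scraped tweets (values are made up)
toy = pd.DataFrame({
    "tweet": ["t1", "t2", "t3", "t4"],
    "language": ["en", "fr", "und", "en"],
})

# How many tweets carry each language tag
language_counts = toy["language"].value_counts()

# Keep only rows tagged as English
english_only = toy[toy["language"] == "en"].reset_index(drop=True)

print(language_counts)
print(english_only)
```

Rows tagged `und` (undetermined) may still contain usable English text, so dropping them is a judgment call rather than a requirement.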

### Univariate Features

### Bivariate Features

### Multi-variate Features