# Predicting the Norwegian Parliament Election 2025

* A model using Bayesian statistics to predict the outcome of the Parliament election.

### Notes
* 169 mandates: 150 direct mandates and 19 leveling mandates (utjevningsmandater), one from each election district

In [18]:
import numpy as np
import pandas as pd
from urllib.request import urlopen
import requests
from bs4 import BeautifulSoup
import re
pd.set_option('display.max_rows', None)

### Constants

In [28]:
valgdistrikt=np.array(["Østfold", "Akershus", "Oslo", "Hedemark", "Oppland", "Buskerud", "Vestfold","Telemark", "Aust-Agder", "Vest-Agder", "Rogaland","Hordaland","Sogn og Fjordane","Møre og Romsdal","Sør-Trøndelag","Nord-Trøndelag","Nordland","Troms", "Finnmark"])
partier= np.array(['Ap', 'Høyre', 'Frp', 'SV', 'Sp', 'KrF', 'Venstre', 'MDG', 'Rødt', 'Andre'])

In [21]:
# Latest local polling data
urls=["https://www.pollofpolls.no/?cmd=Stortinget&fylke=1",
      "https://www.pollofpolls.no/?cmd=Stortinget&fylke=2",
      "https://www.pollofpolls.no/?cmd=Stortinget&fylke=3",
      "https://www.pollofpolls.no/?cmd=Stortinget&fylke=4",
      "https://www.pollofpolls.no/?cmd=Stortinget&fylke=5",
      "https://www.pollofpolls.no/?cmd=Stortinget&fylke=6",
      "https://www.pollofpolls.no/?cmd=Stortinget&fylke=7",
      "https://www.pollofpolls.no/?cmd=Stortinget&fylke=8",
      "https://www.pollofpolls.no/?cmd=Stortinget&fylke=9",
      "https://www.pollofpolls.no/?cmd=Stortinget&fylke=10",
      "https://www.pollofpolls.no/?cmd=Stortinget&fylke=11",
      "https://www.pollofpolls.no/?cmd=Stortinget&fylke=12", #hordaland
      "https://www.pollofpolls.no/?cmd=Stortinget&fylke=14", #sogn
      "https://www.pollofpolls.no/?cmd=Stortinget&fylke=15",
      "https://www.pollofpolls.no/?cmd=Stortinget&fylke=16",
      "https://www.pollofpolls.no/?cmd=Stortinget&fylke=17",
      "https://www.pollofpolls.no/?cmd=Stortinget&fylke=18",
      "https://www.pollofpolls.no/?cmd=Stortinget&fylke=19",
      "https://www.pollofpolls.no/?cmd=Stortinget&fylke=20",
      ]

In [22]:
# 2021 Election data urls
urls_valg = ['https://www.pollofpolls.no/?cmd=Stortinget&do=visvalg&valg=2021&id=1',
             'https://www.pollofpolls.no/?cmd=Stortinget&do=visvalg&valg=2021&id=2',
             'https://www.pollofpolls.no/?cmd=Stortinget&do=visvalg&valg=2021&id=3',
             'https://www.pollofpolls.no/?cmd=Stortinget&do=visvalg&valg=2021&id=4',
             'https://www.pollofpolls.no/?cmd=Stortinget&do=visvalg&valg=2021&id=5',
             'https://www.pollofpolls.no/?cmd=Stortinget&do=visvalg&valg=2021&id=6',
             'https://www.pollofpolls.no/?cmd=Stortinget&do=visvalg&valg=2021&id=7',
             'https://www.pollofpolls.no/?cmd=Stortinget&do=visvalg&valg=2021&id=8',
             'https://www.pollofpolls.no/?cmd=Stortinget&do=visvalg&valg=2021&id=9',
             'https://www.pollofpolls.no/?cmd=Stortinget&do=visvalg&valg=2021&id=10',
             'https://www.pollofpolls.no/?cmd=Stortinget&do=visvalg&valg=2021&id=11',
             'https://www.pollofpolls.no/?cmd=Stortinget&do=visvalg&valg=2021&id=12', # id=13 is not included
             'https://www.pollofpolls.no/?cmd=Stortinget&do=visvalg&valg=2021&id=14',
             'https://www.pollofpolls.no/?cmd=Stortinget&do=visvalg&valg=2021&id=15',
             'https://www.pollofpolls.no/?cmd=Stortinget&do=visvalg&valg=2021&id=16',
             'https://www.pollofpolls.no/?cmd=Stortinget&do=visvalg&valg=2021&id=17',
             'https://www.pollofpolls.no/?cmd=Stortinget&do=visvalg&valg=2021&id=18',
             'https://www.pollofpolls.no/?cmd=Stortinget&do=visvalg&valg=2021&id=19',
             'https://www.pollofpolls.no/?cmd=Stortinget&do=visvalg&valg=2021&id=20']

### Data scraping Polling Data
Extracting the local polling data for each of the 19 election districts from https://www.pollofpolls.no. For each election district, the polling data consists of the distribution of votes (%) among the parties Ap, Høyre, Frp, SV, Sp, KrF, Venstre, MDG, Rødt, and Andre (others).

In [17]:
def extract_local_polling_data(url_list):
    valgdistrikt=np.array(["Østfold", "Akershus", "Oslo", "Hedemark", "Oppland", "Buskerud", "Vestfold","Telemark", "Aust-Agder", "Vest-Agder", "Rogaland","Hordaland","Sogn og Fjordane","Møre og Romsdal","Sør-Trøndelag","Nord-Trøndelag","Nordland","Troms", "Finnmark"])   
    partier= np.array(['Ap', 'Høyre', 'Frp', 'SV', 'Sp', 'KrF', 'Venstre', 'MDG', 'Rødt', 'Andre'])
    df_new = pd.DataFrame({"Valgdistrikt": np.repeat(valgdistrikt, 10, axis=0), "Partier": np.tile(partier,19),"Poll-dato":np.nan})

    prosent_lister=np.array([])
    dato_lister=np.array([])

    for url in url_list:  # iterate over the argument, not the global `urls`
        page = urlopen(url)
        html_bytes = page.read()
        html = html_bytes.decode("utf-8")
        soup = BeautifulSoup(html, 'html.parser')
        table = soup.find('table')  # first table on the page
        rows = table.find_all('tr')  # all rows of that table
        prosenter=[]
        for row in rows:
            cells = row.find_all(['th', 'td'])  # Find both header and data cells
            cell_texts = [cell.get_text(strip=True) for cell in cells]

            if cell_texts[0]=="Siste lokale målings":
                for i in range(1,11): #number of parties
                    tekst = cell_texts[i]
                    index = tekst.find(' (')
                    tall_string_format = tekst[0:index]
                    tall_string_med_komma = tall_string_format.replace(',','.')
                    tall = float(tall_string_med_komma)
                    prosenter.append(tall)
        prosent_lister=np.append(prosent_lister,prosenter)
        # add date to poll
        for li in soup.find_all('li'):
            sup_tag = li.find('sup')
            if sup_tag and sup_tag.text=='s':
                text = li.get_text()
                index1=text.find('(')
                index2=text.find(')')
                date=text[index1+1:index2]
                break
        dato_lister=np.append(dato_lister,[date for i in range(0,10)])


    df_new['Prosent-oppsluttning'] = prosent_lister
    df_new['Poll-dato'] = dato_lister

    return df_new
                

In [23]:
df_poll=extract_local_polling_data(urls)
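
The string handling inside `extract_local_polling_data` can be tried in isolation, without fetching any page. The sketch below assumes a cell looks like `'27,5 (3)'`, that is, a percentage with a decimal comma followed by a parenthesised number. Both that cell format and the helper name `parse_poll_cell` are assumptions made for illustration.

```python
def parse_poll_cell(tekst: str) -> float:
    """Return the percentage at the start of a poll cell as a float.

    Hypothetical helper: assumes the cell has the form
    '<percent> (<something>)' with a decimal comma, e.g. '27,5 (3)'.
    """
    index = tekst.find(' (')
    # str.find returns -1 when ' (' is missing; slicing with -1 would
    # silently drop the last character, so keep the whole string then.
    tall_string = tekst if index == -1 else tekst[:index]
    # Convert the decimal comma to a decimal point before parsing.
    return float(tall_string.replace(',', '.'))


print(parse_poll_cell('27,5 (3)'))
print(parse_poll_cell('17,8'))
```

Keeping the parsing in a small function of this kind makes it possible to check the edge case where a cell has no parenthesised suffix.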

### Data scraping 2021 Election results
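Extracting the 2021 election results for each of the 19 election districts from https://www.pollofpolls.no. For each election district and each party (Ap, Høyre, Frp, SV, Sp, KrF, Venstre, MDG, Rødt, Andre), the results consist of advance votes (Forhånd), election-day votes (Valgting), total votes (Sum), the distribution of votes in % (Fordeling), the number of mandates (Mandat), and the corresponding changes (Endr-sum, Endr-for, Endr-Mandat).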

In [25]:
def extract_2021_election(url_list):
    valgdistrikt=np.array(["Østfold", "Akershus", "Oslo", "Hedemark", "Oppland", "Buskerud", "Vestfold","Telemark", "Aust-Agder", "Vest-Agder", "Rogaland","Hordaland","Sogn og Fjordane","Møre og Romsdal","Sør-Trøndelag","Nord-Trøndelag","Nordland","Troms", "Finnmark"])   
    partier= np.array(['Ap', 'Høyre', 'Frp', 'SV', 'Sp', 'KrF', 'Venstre', 'MDG', 'Rødt', 'Andre'])
    df_new = pd.DataFrame({"Valgdistrikt": np.repeat(valgdistrikt, 10, axis=0), "Partier": np.tile(partier,19)})
    df_list = []

    for url in url_list:
        page = urlopen(url)
        html_bytes = page.read()
        html = html_bytes.decode("utf-8")
        soup = BeautifulSoup(html, 'html.parser')
        table = soup.find('table')  # first table on the page
        cells = table.find_all('td')  # all data cells, flattened row by row
        liste = []
        for cell in cells:
            tall = cell.text
            tall = tall.replace(" ", "")  # remove the space used as thousands separator
            tall = tall.replace(",", ".")  # decimal comma -> decimal point
            liste.append(float(tall))
        # Every table row holds 8 numeric cells; rebuild the rows from the flat list.
        records = []
        for i in range(0, len(liste), 8):
            records.append({'Forhånd': liste[i],
                            'Valgting': liste[i+1],
                            'Sum': liste[i+2],
                            'Endr-sum': liste[i+3],
                            'Fordeling': liste[i+4],
                            'Endr-for': liste[i+5],
                            'Mandat': liste[i+6],
                            'Endr-Mandat': liste[i+7]})
        df = pd.DataFrame(records)  # build once instead of calling the private df._append
        df_list.append(df)

    df_conc_ver = pd.concat(df_list, ignore_index=True, axis=0)
    df_concat_final = pd.concat([df_new, df_conc_ver], axis=1)

    return df_concat_final

    

In [26]:
df_election = extract_2021_election(urls_valg)

In [31]:
df_poll

| | Valgdistrikt | Partier | Poll-dato | Prosent-oppsluttning |
|---|---|---|---|---|
| 0 | Østfold | Ap | 10/9-2021 | 27.5 |
| 1 | Østfold | Høyre | 10/9-2021 | 17.8 |
| 2 | Østfold | Frp | 10/9-2021 | 15.9 |
| 3 | Østfold | SV | 10/9-2021 | 7.7 |
| 4 | Østfold | Sp | 10/9-2021 | 12.9 |
| 5 | Østfold | KrF | 10/9-2021 | 2.8 |
| 6 | Østfold | Venstre | 10/9-2021 | 3.1 |
| 7 | Østfold | MDG | 10/9-2021 | 4.4 |
| 8 | Østfold | Rødt | 10/9-2021 | 5.0 |
| 9 | Østfold | Andre | 10/9-2021 | 3.1 |


In [30]:
df_election

| | Valgdistrikt | Partier | Forhånd | Valgting | Sum | Endr-sum | Fordeling | Endr-for | Mandat | Endr-Mandat |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Østfold | Ap | 24817.0 | 24528.0 | 49345.0 | -2200.0 | 30.5 | -1.6 | 3.0 | 0.0 |
| 1 | Østfold | Høyre | 15598.0 | 14613.0 | 30211.0 | -8053.0 | 18.7 | -5.1 | 2.0 | 0.0 |
| 2 | Østfold | Frp | 9329.0 | 11198.0 | 20527.0 | -7654.0 | 12.7 | -4.8 | 1.0 | -1.0 |
| 3 | Østfold | SV | 5525.0 | 4315.0 | 9840.0 | 2804.0 | 6.1 | 1.7 | 1.0 | 0.0 |
| 4 | Østfold | Sp | 10082.0 | 12767.0 | 22849.0 | 8906.0 | 14.1 | 5.4 | 2.0 | 1.0 |
| 5 | Østfold | KrF | 2576.0 | 2838.0 | 5414.0 | -1397.0 | 3.3 | -0.9 | 0.0 | 0.0 |
| 6 | Østfold | Venstre | 2408.0 | 2363.0 | 4771.0 | 893.0 | 2.9 | 0.5 | 0.0 | 0.0 |
| 7 | Østfold | MDG | 2974.0 | 1808.0 | 4782.0 | 590.0 | 3.0 | 0.3 | 0.0 | 0.0 |
| 8 | Østfold | Rødt | 4258.0 | 3160.0 | 7418.0 | 3983.0 | 4.6 | 2.4 | 0.0 | 0.0 |
| 9 | Østfold | Andre | 3675.0 | 3064.0 | 6739.0 | 3204.0 | 4.2 | 2.0 | 0.0 | 0.0 |
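
Since both scrapers depend on the page layout, a quick consistency check on the percentage columns can catch a misaligned parse. The sketch below uses the Østfold `Fordeling` values shown in the `df_election` output; the function name `share_totals` and the tolerance of 0.5 percentage points (to allow for rounding on the site) are assumptions.

```python
import pandas as pd

# Østfold rows from the df_election output (Fordeling is in percent).
df_check = pd.DataFrame({
    "Valgdistrikt": ["Østfold"] * 10,
    "Partier": ['Ap', 'Høyre', 'Frp', 'SV', 'Sp', 'KrF',
                'Venstre', 'MDG', 'Rødt', 'Andre'],
    "Fordeling": [30.5, 18.7, 12.7, 6.1, 14.1, 3.3, 2.9, 3.0, 4.6, 4.2],
})


def share_totals(df, column, tolerance=0.5):
    """Sum a percentage column per district and flag totals far from 100.

    Hypothetical helper; `tolerance` is an assumed allowance for rounding.
    """
    totals = df.groupby("Valgdistrikt")[column].sum()
    out = totals.to_frame("total")
    out["ok"] = (out["total"] - 100).abs() <= tolerance
    return out


print(share_totals(df_check, "Fordeling"))
```

The same function could be applied to `df_poll` with `column="Prosent-oppsluttning"`, since that frame also has a `Valgdistrikt` column.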



### Load Election district Population and Mandate distribution

In [27]:
col_names = ["Valgdistrikt","Mandater","Befolkningstall"]
df = pd.read_csv('./data/mandate-distribution.csv',header=None,names=col_names) 
df 

| | Valgdistrikt | Mandater | Befolkningstall |
|---|---|---|---|
| 0 | Østfold | 9 | 312152 |
| 1 | Akershus | 20 | 728803 |
| 2 | Oslo | 20 | 717710 |
| 3 | Hedemark | 7 | 202048 |
| 4 | Oppland | 6 | 174256 |
| 5 | Buskerud | 8 | 269819 |
| 6 | Vestfold | 7 | 256432 |
| 7 | Telemark | 6 | 177093 |
| 8 | Aust-Agder | 4 | 122968 |
| 9 | Vest-Agder | 6 | 196882 |
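
The `Mandater` column gives the total number of mandates per district. Per the notes at the top, one of them is a leveling mandate, which leaves `Mandater - 1` direct mandates to distribute among the parties. Below is a minimal sketch of that distribution, assuming the modified Sainte-Laguë method with a first divisor of 1.4, the rule used in Norwegian parliamentary elections. This notebook does not specify its own allocation step, so treat this as one possible building block. The function name `allocate_direct_mandates` is hypothetical, and the aggregate `Andre` is left out because it is not a single party list.

```python
def allocate_direct_mandates(votes, seats, first_divisor=1.4):
    """Allocate `seats` among parties with modified Sainte-Laguë.

    votes: dict mapping party name -> vote count.
    Divisors are first_divisor, 3, 5, 7, ... for a party's 1st, 2nd, ... seat.
    """
    mandates = {party: 0 for party in votes}
    for _ in range(seats):
        quotients = {}
        for party, count in votes.items():
            won = mandates[party]
            divisor = first_divisor if won == 0 else 2 * won + 1
            quotients[party] = count / divisor
        # The party with the highest current quotient wins the next seat.
        winner = max(quotients, key=quotients.get)
        mandates[winner] += 1
    return mandates


# Østfold 2021 vote totals (the Sum column of df_election), without 'Andre'.
votes_ostfold = {'Ap': 49345, 'Høyre': 30211, 'Frp': 20527, 'SV': 9840,
                 'Sp': 22849, 'KrF': 5414, 'Venstre': 4771, 'MDG': 4782,
                 'Rødt': 7418}

# Østfold has 9 mandates in total, so 8 direct mandates.
print(allocate_direct_mandates(votes_ostfold, seats=8))
```

For these inputs the sketch gives Ap 3, Høyre 2, Sp 2 and Frp 1. The `Mandat` column in `df_election` additionally shows one mandate for SV, which would be consistent with SV holding the district's leveling mandate.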


### Idea of the simulation

$$ \mathrm{Dir}(\alpha) + \text{Polling} + \gamma = \mathrm{Dir}(\bar{\alpha}) $$

For each election district, we compute a vector $\alpha$ of length 10 (the number of party categories, including Andre). The $i$th element of the parameter belonging to the $k$th election district, $\alpha_i^k$, is computed as
$$
\alpha_i^k = \gamma_{\text{Valg}} n_i^k p_i^k + \gamma_{\text{Poll}} m_i^k b_i^k + 1
$$
where $n_i^k$ denotes the number of votes for political party $i$ in election district $k$, $p_i^k$ denotes the fraction of votes for the party in the election, $m_i^k$ denotes the number of votes for the party in the polling, and $b_i^k$ denotes the fraction of votes for the party in the polls. The factors $\gamma_{\text{Valg}}$ and $\gamma_{\text{Poll}}$ weight the election term and the polling term, respectively.
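
The formula for $\alpha_i^k$ can be sketched in NumPy for a single district. The example uses the Østfold election and polling figures shown earlier. Several inputs are assumptions, because the notebook does not state them: the poll size of 600 respondents (used to derive $m_i^k$), the weights `gamma_valg=0.01` and `gamma_poll=1.0`, and the function name `dirichlet_alpha`. Drawing samples with `numpy.random.Generator.dirichlet` is one way to use the resulting parameter vector.

```python
import numpy as np


def dirichlet_alpha(n, p, m, b, gamma_valg, gamma_poll):
    """alpha_i = gamma_valg * n_i * p_i + gamma_poll * m_i * b_i + 1."""
    n, p, m, b = (np.asarray(x, dtype=float) for x in (n, p, m, b))
    return gamma_valg * n * p + gamma_poll * m * b + 1.0


# Østfold, 2021 election: votes (Sum) and vote fractions (Fordeling / 100).
n = np.array([49345, 30211, 20527, 9840, 22849, 5414, 4771, 4782, 7418, 6739])
p = np.array([30.5, 18.7, 12.7, 6.1, 14.1, 3.3, 2.9, 3.0, 4.6, 4.2]) / 100

# Østfold, latest local poll: fractions (Prosent-oppsluttning / 100).
b = np.array([27.5, 17.8, 15.9, 7.7, 12.9, 2.8, 3.1, 4.4, 5.0, 3.1]) / 100
poll_size = 600        # ASSUMPTION: the poll's sample size is not in the data
m = poll_size * b      # poll "votes" per party under that assumption

# ASSUMPTION: illustrative weights, not values from the notebook.
alpha = dirichlet_alpha(n, p, m, b, gamma_valg=0.01, gamma_poll=1.0)

# Draw simulated vote-share vectors for the district.
rng = np.random.default_rng(0)
samples = rng.dirichlet(alpha, size=1000)

print(alpha.round(1))
print(samples.mean(axis=0).round(3))
```

Each row of `samples` is one simulated vector of vote fractions that sums to 1. Larger weights make the draws concentrate more tightly around the election and polling fractions.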

