# **Political Voting Survey Dashboard 🌏**

**Introduction**

This project showcases my skills in data visualization and dashboard development with Tableau, using generated dummy data for an example political survey. The Tableau visualizations read from CSV files because of Tableau Public limitations. I am also going to develop an ingestion process from the generated data into PostgreSQL to support real-time (or regularly updated) visualizations.

Because I currently have limited access to real data, I generate dummy data using APIs from multiple random generators, with locations set in Indonesia. This project does not represent any political fact about any region; the data is artificially created for learning and skill-showcasing purposes.

**Tables:**

- Votes (6000 Unique Values)
    - voter_name
    - voter_id
    - candidate_id
    - region_id
- Candidate (4 Unique Values)
    - Candidate_id
    - Name (create 4)
    - Party_id (create 4)
- region (33 Unique Values)
    - region_id
    - nama_wilayah
- Electoral Vote ← Voters (derived from the winner in each region)
    - Region (33 in total)
    - Vote Count
    - Candidate_name
    - Party
- Party
    - party_id
    - party_name
    - member_count
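
The table plan above could translate into a relational schema roughly as follows. This is only a minimal sketch: it uses Python's built-in `sqlite3` with an in-memory database as a stand-in for PostgreSQL, and the column types, constraints, and the `create_schema` helper are my assumptions, not a finalized design. The Electoral Vote table is left out because it is derived from the votes.

```python
import sqlite3

# Assumed schema derived from the table plan; types and constraints are illustrative.
SCHEMA = """
CREATE TABLE party (
    party_id     INTEGER PRIMARY KEY,
    party_name   TEXT NOT NULL,
    member_count INTEGER NOT NULL
);
CREATE TABLE candidate (
    candidate_id INTEGER PRIMARY KEY,
    name         TEXT NOT NULL,
    party_id     INTEGER NOT NULL REFERENCES party(party_id)
);
CREATE TABLE region (
    region_id    INTEGER PRIMARY KEY,
    nama_wilayah TEXT NOT NULL
);
CREATE TABLE votes (
    voter_id     INTEGER PRIMARY KEY,
    voter_name   TEXT NOT NULL,
    candidate_id INTEGER NOT NULL REFERENCES candidate(candidate_id),
    region_id    INTEGER NOT NULL REFERENCES region(region_id)
);
"""


def create_schema(conn: sqlite3.Connection) -> None:
    """Enable foreign-key checks and create the four base tables."""
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)


conn = sqlite3.connect(":memory:")
create_schema(conn)

# Insert one row per table, parents before children.
conn.execute("INSERT INTO party VALUES (3, 'Partai Amanat Indonesia', 3090)")
conn.execute("INSERT INTO candidate VALUES (3, 'Andre Brunet', 3)")
conn.execute("INSERT INTO region VALUES (6, 'Jambi')")
conn.execute("INSERT INTO votes VALUES (1, 'Lars Bourgeois', 3, 6)")
print(conn.execute("SELECT COUNT(*) FROM votes").fetchone()[0])
```

With foreign keys enabled, a vote that points at an unknown candidate or region is rejected at insert time instead of disappearing later in a join.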

In [3]:
import pandas as pd
import random
from helper.data_generate import generate_person

**Region** 

In [4]:
region_df = pd.read_csv('sources/regions_id.csv')
region_df = region_df.reset_index().rename(columns={"index": "id"})

region_df.rename(columns={"Provinsi di Indonesia":"province",
                          "Jumlah Penduduk Menurut Provinsi di Indonesia (Ribu Jiwa)":"population_count_in_thousands"},
                 inplace=True)

region_df = region_df[~region_df["province"].isin(["Indonesia"])]
region_df['id'] = region_df['id'] + 1

In [5]:
region_df

| | id | province | population_count_in_thousands |
|---|---|---|---|
| 0 | 1 | Aceh | 5554.8 |
| 1 | 2 | Sumatera Utara | 15588.5 |
| 2 | 3 | Sumatera Barat | 5836.2 |
| 3 | 4 | Riau | 6728.1 |
| 4 | 5 | Kep. Riau | 2183.3 |
| 5 | 6 | Jambi | 3724.3 |
| 6 | 7 | Sumatera Selatan | 8837.3 |
| 7 | 8 | Kep. Bangka Belitung | 1531.5 |
| 8 | 9 | Bengkulu | 2112.2 |
| 9 | 10 | Lampung | 9419.6 |


**Party** 

In [6]:
party_names = [
    "Partai Sosial Demokrasi Indonesia",
    "Partai Republik Nasionalis Nusantara",
    "Partai Amanat Indonesia",
    "Partai Kerjasama dan Utusan Rakyat",
]
party_list = []

# Sequential ids start at 1; enumerate avoids shadowing the built-in id()
for party_id, party_name in enumerate(party_names, start=1):
    party = {}
    party["id"] = party_id
    party["party_name"] = party_name
    party["total_members"] = random.randint(1500, 5000)

    party_list.append(party)
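
Because `random.randint` draws fresh values on every run, the member counts change each time the notebook is executed. A minimal sketch of a reproducible variant, assuming a dedicated `random.Random` instance with a fixed seed (the `build_parties` helper and the seed value are hypothetical, not part of this project):

```python
import random


def build_parties(names, seed=42):
    """Return party records with sequential ids and seeded random member counts."""
    rng = random.Random(seed)  # local generator; the global random state is untouched
    return [
        {"id": party_id, "party_name": name, "total_members": rng.randint(1500, 5000)}
        for party_id, name in enumerate(names, start=1)
    ]


parties = build_parties([
    "Partai Sosial Demokrasi Indonesia",
    "Partai Republik Nasionalis Nusantara",
    "Partai Amanat Indonesia",
    "Partai Kerjasama dan Utusan Rakyat",
])
for party in parties:
    print(party)
```

Calling `build_parties` twice with the same seed yields identical records, which keeps the exported CSV stable between runs.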

In [7]:
party_df = pd.DataFrame(party_list)

In [8]:
party_df

| | id | party_name | total_members |
|---|---|---|---|
| 0 | 1 | Partai Sosial Demokrasi Indonesia | 3089 |
| 1 | 2 | Partai Republik Nasionalis Nusantara | 2489 |
| 2 | 3 | Partai Amanat Indonesia | 3090 |
| 3 | 4 | Partai Kerjasama dan Utusan Rakyat | 2093 |


**Candidate** 

In [9]:
generate_person()

{'gender': 'male',
 'name': {'title': 'Monsieur', 'first': 'Enea', 'last': 'Aubert'},
 'location': {'street': {'number': 2564, 'name': 'Rue Duguesclin'},
  'city': 'Büetigen',
  'state': 'Schaffhausen',
  'country': 'Switzerland',
  'postcode': 9621,
  'coordinates': {'latitude': '-73.0704', 'longitude': '-90.4727'},
  'timezone': {'offset': '-5:00',
   'description': 'Eastern Time (US & Canada), Bogota, Lima'}},
 'email': 'enea.aubert@example.com',
 'login': {'uuid': '4034d294-9ed3-4586-9dea-aacdf8a8618f',
  'username': 'tinymouse706',
  'password': 'snapple',
  'salt': 'J4ZlvFyw',
  'md5': 'ec539ea35523c19df2d90d950047bac6',
  'sha1': 'bb878477720b87967dbbdebfe4d1265154509d5d',
  'sha256': '7fde6c221341ead9571f29d75c47e47373f2ba49d75ce8e01a0b57342c683cd8'},
 'dob': {'date': '1981-02-16T08:53:11.279Z', 'age': 44},
 'registered': {'date': '2018-05-04T01:48:17.433Z', 'age': 6},
 'phone': '077 654 05 98',
 'cell': '075 756 53 25',
 'id': {'name': 'AVS', 'value': '756.3082.4639.96'},
 'pi

In [10]:
candidates = []

# One candidate per party; the candidate id mirrors the party id
for i in party_df["id"]:
    candidate = {}
    person = generate_person()

    candidate['id'] = i
    candidate['name'] = person['name']['first'] + ' ' + person['name']['last']
    candidate['gender'] = person['gender']
    candidate['party_id'] = i

    candidates.append(candidate)

candidates_df = pd.DataFrame(candidates)

In [11]:
candidates_df

| | id | name | gender | party_id |
|---|---|---|---|---|
| 0 | 1 | Germaine Thomas | female | 1 |
| 1 | 2 | Rahel Da Silva | female | 2 |
| 2 | 3 | Andre Brunet | male | 3 |
| 3 | 4 | Claudio Guerin | male | 4 |


**Voters** 

In [12]:
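voters_csv_df = pd.read_csv("csv/voters.csv")
# The CSV was saved with its index, which is read back as 'Unnamed: 0'; drop it if present
voters_csv_df.drop(columns=['Unnamed: 0'], inplace=True, errors='ignore')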

In [13]:
voters_csv_df

| | name | gender | vote | region_id |
|---|---|---|---|---|
| 0 | Mohamad Olivier | male | 2 | 2 |
| 1 | Ulrich Dumont | male | 3 | 1 |
| 2 | Catherine Petit | female | 3 | 4 |
| 3 | Lars Bourgeois | male | 3 | 3 |
| 4 | Kerstin Robert | female | 2 | 2 |
| ... | ... | ... | ... | ... |
| 7050 | Georg Leroy | male | 2 | 4 |
| 7051 | Désirée Gaillard | female | 1 | 1 |
| 7052 | Yvan Andre | male | 2 | 2 |
| 7053 | Viviane Roche | female | 1 | 2 |


In [None]:
voters = []

def generate_voter():
    """Build one synthetic voter with a uniformly random vote and region."""
    person = generate_person()
    voter = {}

    voter['name'] = person['name']['first'] + ' ' + person['name']['last']
    voter['gender'] = person['gender']
    # Draw ids uniformly between 1 and the largest id in each table
    voter['vote'] = random.randint(1, max(candidates_df['id']))
    voter['region_id'] = random.randint(1, max(region_df['id']))

    print(voter)
    return voter
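
`generate_voter` picks a region uniformly, so a small province receives about as many voters as a large one. If a more realistic spread is wanted, one option is to weight the draw by `population_count_in_thousands`. The following is a minimal sketch using `random.choices`, with four rows copied from the region table; the `pick_region_id` helper is hypothetical and not part of this project.

```python
import random

# (id, province, population in thousands), copied from the region table
REGIONS = [
    (1, "Aceh", 5554.8),
    (2, "Sumatera Utara", 15588.5),
    (8, "Kep. Bangka Belitung", 1531.5),
    (10, "Lampung", 9419.6),
]


def pick_region_id(rng: random.Random) -> int:
    """Draw one region id with probability proportional to its population."""
    ids = [region_id for region_id, _, _ in REGIONS]
    weights = [population for _, _, population in REGIONS]
    return rng.choices(ids, weights=weights, k=1)[0]


rng = random.Random(0)
sample = [pick_region_id(rng) for _ in range(10_000)]
# Count how often each region was drawn
print({region_id: sample.count(region_id) for region_id, _, _ in REGIONS})
```

With these weights, Sumatera Utara should be drawn roughly ten times as often as Kep. Bangka Belitung, in line with their populations.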


In [15]:
# 7 batches of 250 voters each: 1,750 new voters per run
for x in range(7):
    for i in range(250):
        print(f"{i} {x}")
        voter = generate_voter()
        voters.append(voter)

0 0
{'name': 'Jean-Philippe Pierre', 'gender': 'male', 'vote': 1, 'region_id': 4}
1 0
{'name': 'Tiziano Hubert', 'gender': 'male', 'vote': 3, 'region_id': 2}
2 0
{'name': 'Umberto Leclerc', 'gender': 'male', 'vote': 3, 'region_id': 1}
3 0
{'name': 'Boris Meyer', 'gender': 'male', 'vote': 4, 'region_id': 4}
4 0
{'name': 'Jérémie Boyer', 'gender': 'male', 'vote': 4, 'region_id': 4}
5 0
{'name': 'Ella Roche', 'gender': 'female', 'vote': 4, 'region_id': 2}
6 0
{'name': 'Fatime Lopez', 'gender': 'female', 'vote': 2, 'region_id': 3}
7 0
{'name': 'Viktor Duval', 'gender': 'male', 'vote': 4, 'region_id': 2}
8 0
{'name': 'Emir Moulin', 'gender': 'male', 'vote': 2, 'region_id': 1}
9 0
{'name': 'Giorgia Denis', 'gender': 'female', 'vote': 3, 'region_id': 4}
10 0
{'name': 'Maurice Meyer', 'gender': 'male', 'vote': 4, 'region_id': 3}
11 0
{'name': 'Dominique Laurent', 'gender': 'female', 'vote': 4, 'region_id': 4}
12 0
{'name': 'Silas Colin', 'gender': 'male', 'vote': 3, 'region_id': 2}
13 0
{'name

In [16]:
voters_df = pd.DataFrame(voters)

In [None]:
voters_csv_df = pd.concat([voters_csv_df, voters_df], ignore_index=True)
voters_csv_df = voters_csv_df.drop_duplicates()

voters_csv_df.to_csv("csv/voters.csv")
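
Note that `drop_duplicates` compares `name`, `gender`, `vote`, and `region_id`, so two distinct voters who happen to share all four values collapse into one row. The table plan lists a `voter_id` for the Votes table; a minimal sketch of assigning one before deduplicating, assuming UUID strings are acceptable as ids (the `with_voter_ids` and `dedupe_by_id` helpers are hypothetical):

```python
import uuid


def with_voter_ids(voters):
    """Return copies of the voter records, adding a voter_id where one is missing."""
    return [
        {**voter, "voter_id": voter.get("voter_id") or str(uuid.uuid4())}
        for voter in voters
    ]


def dedupe_by_id(voters):
    """Keep the first record seen for each voter_id."""
    seen = {}
    for voter in voters:
        seen.setdefault(voter["voter_id"], voter)
    return list(seen.values())


raw = [
    {"name": "Ella Roche", "gender": "female", "vote": 4, "region_id": 2},
    {"name": "Ella Roche", "gender": "female", "vote": 4, "region_id": 2},
]
tagged = with_voter_ids(raw)
print(len(dedupe_by_id(tagged)))        # both voters survive
print(len(dedupe_by_id(tagged + tagged)))  # re-appended rows are removed
```

Deduplicating on the id removes rows that were appended twice while keeping genuinely different voters with identical attributes.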

In [65]:
# Re-draw region_id for every row so voters are spread over the full range of region ids
voters_csv_df['region_id'] = voters_csv_df.apply(lambda _: random.randint(1, max(region_df['id'])), axis=1)

In [66]:
voters_csv_df

| | name | gender | vote | region_id |
|---|---|---|---|---|
| 0 | Mohamad Olivier | male | 2 | 5 |
| 1 | Ulrich Dumont | male | 3 | 33 |
| 2 | Catherine Petit | female | 3 | 15 |
| 3 | Lars Bourgeois | male | 3 | 6 |
| 4 | Kerstin Robert | female | 2 | 32 |
| ... | ... | ... | ... | ... |
| 8800 | Annalise Perrin | female | 3 | 29 |
| 8801 | Tiago Guerin | male | 2 | 19 |
| 8802 | Myriam Colin | female | 4 | 34 |
| 8803 | Lotti Faure | female | 3 | 29 |


**Electoral** 

In [67]:
votes_with_candidates = pd.merge(voters_csv_df, candidates_df[['id', 'name', 'party_id']], left_on='vote', right_on='id', how='inner')

votes_with_region = pd.merge(votes_with_candidates, region_df[['id', 'province']], left_on='region_id', right_on='id', how='inner')

votes_cleansed = pd.merge(votes_with_region, party_df[['id','party_name']], left_on='party_id', right_on='id', how='inner')

votes_cleansed.drop(columns=['id_x','id_y','id'], inplace=True)

votes_cleansed.rename(columns={'name_x': 'voter_name', 'name_y':'candidate_name'}, inplace=True)
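
All three merges use `how='inner'`, so any vote whose `vote` or `region_id` has no match in the lookup table is dropped without warning. A small pre-check can surface such keys before merging. This is a plain-Python sketch on two rows taken from the voters table; the `unmatched_keys` helper and the deliberately incomplete id range are assumptions made for illustration.

```python
def unmatched_keys(rows, key, valid_ids):
    """Return the sorted key values in rows that have no match in valid_ids."""
    return sorted({row[key] for row in rows} - set(valid_ids))


votes = [
    {"name": "Mohamad Olivier", "vote": 2, "region_id": 5},
    {"name": "Myriam Colin", "vote": 4, "region_id": 34},
]

# Hypothetical lookup that stops at id 33, to show what a gap looks like
known_region_ids = range(1, 34)
print(unmatched_keys(votes, "region_id", known_region_ids))
print(unmatched_keys(votes, "vote", [1, 2, 3, 4]))
```

An empty list means the inner join keeps every row; anything else names the ids that would be lost.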

In [68]:
votes_cleansed

| | voter_name | gender | vote | region_id | candidate_name | party_id | province | party_name |
|---|---|---|---|---|---|---|---|---|
| 0 | Mohamad Olivier | male | 2 | 5 | Rahel Da Silva | 2 | Kep. Riau | Partai Republik Nasionalis Nusantara |
| 1 | Ulrich Dumont | male | 3 | 33 | Andre Brunet | 3 | Papua Barat | Partai Amanat Indonesia |
| 2 | Catherine Petit | female | 3 | 15 | Andre Brunet | 3 | DI Yogyakarta | Partai Amanat Indonesia |
| 3 | Lars Bourgeois | male | 3 | 6 | Andre Brunet | 3 | Jambi | Partai Amanat Indonesia |
| 4 | Kerstin Robert | female | 2 | 32 | Rahel Da Silva | 2 | Maluku Utara | Partai Republik Nasionalis Nusantara |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 8794 | Annalise Perrin | female | 3 | 29 | Andre Brunet | 3 | Nusa Tenggara Barat | Partai Amanat Indonesia |
| 8795 | Tiago Guerin | male | 2 | 19 | Rahel Da Silva | 2 | Kalimantan Selatan | Partai Republik Nasionalis Nusantara |
| 8796 | Myriam Colin | female | 4 | 34 | Claudio Guerin | 4 | Papua | Partai Kerjasama dan Utusan Rakyat |
| 8797 | Lotti Faure | female | 3 | 29 | Andre Brunet | 3 | Nusa Tenggara Barat | Partai Amanat Indonesia |


In [None]:
# Vote count per candidate within each province
details_province = votes_cleansed.groupby(['province', 'candidate_name']).agg({
    'voter_name': ['count']
})

details_province.rename(columns={'voter_name': 'vote_count'}, inplace=True)
details_province.columns = details_province.columns.droplevel(1)

# reset_index() returns a new frame: details_province keeps its
# (province, candidate_name) index and is only flattened here for display
details_province.reset_index()

| | province | candidate_name | vote_count |
|---|---|---|---|
| 0 | Aceh | Andre Brunet | 67 |
| 1 | Aceh | Claudio Guerin | 55 |
| 2 | Aceh | Germaine Thomas | 67 |
| 3 | Aceh | Rahel Da Silva | 62 |
| 4 | Bali | Andre Brunet | 56 |
| ... | ... | ... | ... |
| 131 | Sumatera Selatan | Rahel Da Silva | 62 |
| 132 | Sumatera Utara | Andre Brunet | 62 |
| 133 | Sumatera Utara | Claudio Guerin | 58 |
| 134 | Sumatera Utara | Germaine Thomas | 65 |
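
Conceptually, this grouping step just counts how often each (province, candidate_name) pair occurs. A plain-Python sketch with `collections.Counter`, using a handful of rows copied from the merged votes table, may make that easier to see; it is an illustration, not a replacement for the pandas code.

```python
from collections import Counter

# (province, candidate_name) pairs copied from the merged votes table
pairs = [
    ("Kep. Riau", "Rahel Da Silva"),
    ("Papua Barat", "Andre Brunet"),
    ("DI Yogyakarta", "Andre Brunet"),
    ("Jambi", "Andre Brunet"),
    ("Maluku Utara", "Rahel Da Silva"),
    ("Nusa Tenggara Barat", "Andre Brunet"),
    ("Nusa Tenggara Barat", "Andre Brunet"),
]

# One counter entry per pair, holding the number of votes for it
vote_counts = Counter(pairs)
for (province, candidate), count in sorted(vote_counts.items()):
    print(province, candidate, count)
```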


In [None]:
# Winner of each province (the candidate with the highest vote count)
winners = details_province.groupby('province')['vote_count'].idxmax()
winning_candidates = details_province.loc[winners].reset_index()

winning_candidates

| | province | candidate_name | vote_count |
|---|---|---|---|
| 0 | Aceh | Andre Brunet | 67 |
| 1 | Bali | Rahel Da Silva | 84 |
| 2 | Banten | Claudio Guerin | 75 |
| 3 | Bengkulu | Claudio Guerin | 81 |
| 4 | DI Yogyakarta | Claudio Guerin | 68 |
| 5 | DKI Jakarta | Claudio Guerin | 85 |
| 6 | Gorontalo | Rahel Da Silva | 72 |
| 7 | Jambi | Rahel Da Silva | 70 |
| 8 | Jawa Barat | Andre Brunet | 69 |
| 9 | Jawa Tengah | Germaine Thomas | 82 |
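
One caveat: `idxmax` returns the first row that holds the maximum, so ties are resolved by row order. In the per-province counts, Aceh has Andre Brunet and Germaine Thomas tied at 67 votes, and Andre Brunet is reported as the winner only because his row comes first. A minimal plain-Python sketch for surfacing such ties, using the Aceh counts from the table (the `province_winners` helper is hypothetical):

```python
def province_winners(counts):
    """Map each province to every candidate sharing its highest vote count."""
    best = {}
    for (province, candidate), votes in counts.items():
        top_votes, names = best.get(province, (-1, []))
        if votes > top_votes:
            best[province] = (votes, [candidate])
        elif votes == top_votes:
            best[province] = (top_votes, names + [candidate])
    return {province: names for province, (_, names) in best.items()}


# Vote counts for Aceh, copied from the per-province table
aceh_counts = {
    ("Aceh", "Andre Brunet"): 67,
    ("Aceh", "Claudio Guerin"): 55,
    ("Aceh", "Germaine Thomas"): 67,
    ("Aceh", "Rahel Da Silva"): 62,
}
print(province_winners(aceh_counts))
# {'Aceh': ['Andre Brunet', 'Germaine Thomas']}
```

A province that maps to more than one name needs an explicit tie-break rule before it is counted toward a candidate.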


In [None]:
# Overall result: provinces won per candidate, and total votes in those provinces
winning_candidates.groupby('candidate_name').agg({
    'province':'count',
    'vote_count': 'sum'
})

| candidate_name | province | vote_count |
|---|---|---|
| Andre Brunet | 12 | 850 |
| Claudio Guerin | 6 | 460 |
| Germaine Thomas | 6 | 423 |
| Rahel Da Silva | 10 | 713 |
