# IBB Wifi Data Inspection and Forecasting

## Summary
This dataset reports the number of users at each IBB Wi-Fi location per year. Locations are of two kinds: static locations (stations), identified by name, and moving public transport vehicles, identified by vehicle codes. The dataset has no geospatial data; every location is listed by name only. In addition, the 2021 figures appear incomplete, which should be taken into account when using them.

## Importing Libraries

In [1]:
import tensorflow as tf
from tensorflow import keras
from keras import layers
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
import tabpy
from sklearn import preprocessing
from sklearn.model_selection import train_test_split
import tableauserverclient as TSC
from pandleau import *

You are using the Extract API 2.0, please save the output as .hyper format


## Load Data

Find out how many locations are in the dataset.

In [2]:
df = pd.read_csv('https://data.ibb.gov.tr/dataset/2994fdcd-4710-4d9f-91eb-cbe7e915899f/resource/c4ff3626-4092-4669-a1da-04b7294553d7/download/ibb-wi-fi-noktalar-yllara-gore-kullanc-says.csv', encoding='latin1', sep=';')
print(f'Total Locations: {len(df)}')


Total Locations: 4475
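The call above decodes the file as `latin1`. As a side note on what that choice can imply for Turkish names, the sketch below shows how a single byte decodes differently under `latin1` and under the Turkish code page `iso-8859-9`. The byte string is a made-up example, and whether the source file is actually encoded as `iso-8859-9` is an assumption that would need to be checked against the file itself.

```python
# Byte 0xDD is 'Ý' in latin1 but 'İ' in the Turkish code page ISO-8859-9.
# `raw` is a hypothetical byte string, not taken from the real file.
raw = b'\xddETT-O3041'

as_latin1 = raw.decode('latin1')       # how the notebook's read_csv would see it
as_turkish = raw.decode('iso-8859-9')  # how a Turkish code page would see it

print(as_latin1)
print(as_turkish)
```

If the file does turn out to use a Turkish code page, location names containing letters such as İ would look different after loading, and any string matching on those names would have to use the garbled form.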


In [3]:
df.head(5)

| | Konum | 2019 | 2020 | 2021 |
|---|---|---|---|---|
| 0 | PDM Fatih | 8 | 0 | 0 |
| 1 | IETT-O3041 | 509771 | 71799 | 1710 |
| 2 | ISKI-Esenler | 1504313 | 1243715 | 11428 |
| 3 | Null | 25120646 | 4612 | 282 |
| 4 | IETT-B5060 | 129208 | 38182 | 635 |
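The summary notes that the 2021 figures look incomplete. One way to make that visible is to compare per-year totals. The sketch below does this on the five rows shown above, retyped by hand into a small frame called `sample` (a name introduced here for illustration); on the real data the same two lines would run on `df`.

```python
import pandas as pd

# The five rows shown by df.head(5), retyped by hand for illustration only.
sample = pd.DataFrame({
    'Konum': ['PDM Fatih', 'IETT-O3041', 'ISKI-Esenler', 'Null', 'IETT-B5060'],
    '2019': [8, 509771, 1504313, 25120646, 129208],
    '2020': [0, 71799, 1243715, 4612, 38182],
    '2021': [0, 1710, 11428, 282, 635],
})

year_cols = ['2019', '2020', '2021']

# Total users per year across all listed locations
yearly_totals = sample[year_cols].sum()

# Each year's total as a fraction of the 2019 total
relative_to_2019 = yearly_totals / yearly_totals['2019']

print(yearly_totals)
print(relative_to_2019.round(4))
```

On these five rows alone the totals fall sharply from year to year; the full dataset would need the same check before drawing conclusions about 2021.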


## Preprocess Data

One row has the literal string "Null" as its location name. We rename it to "Unknown Location" so the label is self-explanatory.

In [4]:
df['Konum'] = df['Konum'].replace(['Null'],'Unknown Location')
df.head(5)

| | Konum | 2019 | 2020 | 2021 |
|---|---|---|---|---|
| 0 | PDM Fatih | 8 | 0 | 0 |
| 1 | IETT-O3041 | 509771 | 71799 | 1710 |
| 2 | ISKI-Esenler | 1504313 | 1243715 | 11428 |
| 3 | Unknown Location | 25120646 | 4612 | 282 |
| 4 | IETT-B5060 | 129208 | 38182 | 635 |


Check for null, N/A, and empty-string values in the dataset.

In [5]:
print(df.isnull().sum())
print(f'Total Empty Strings:{(df["Konum"].values == "").sum()}')

Konum    0
2019     0
2020     0
2021     0
dtype: int64
Total Empty Strings:0
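`isnull()` only detects real missing values, and the equality test only catches exact empty strings. Whitespace-only names or text placeholders (such as the "Null" label renamed earlier) would pass both checks. The sketch below shows one possible way to count them; the `names` series and the placeholder list are made up for illustration and are not taken from the dataset.

```python
import pandas as pd

# Hypothetical location names, including a blank and two text placeholders.
names = pd.Series(['PDM Fatih', '   ', 'Null', 'N/A', 'IETT-O3041'])

# Remove surrounding whitespace so '   ' becomes ''
stripped = names.str.strip()

# Count names that are empty once whitespace is removed
whitespace_only = int(stripped.eq('').sum())

# Count names that are text stand-ins for a missing value (assumed list)
placeholder_values = ['null', 'n/a', 'nan', 'none']
placeholders = int(stripped.str.lower().isin(placeholder_values).sum())

print(f'Whitespace-only names: {whitespace_only}')
print(f'Placeholder names: {placeholders}')
```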


### Separating station data from public transport vehicle data by adding a `Type` column

In [6]:
# Substrings that mark a location name as a public transport vehicle.
# Misspelled variants ('Metretrobus-', 'Metobus-') are kept from the original checks.
# 'ÝETT-' is most likely 'İETT-' as it appears after decoding the file as latin1.
VEHICLE_MARKERS = (
    'IETT-', 'Metrobus-', 'Metretrobus-', 'HAVAIST-', 'BLNT-', 'BELNET-',
    'Belnet-', 'ÝETT-', 'HAVATAS-', 'Deniz Otobus', 'Metobus-',
)

def vehicleorstation(x):
    """Return 'vehicle' if the location name contains a vehicle marker, else 'station'."""
    return 'vehicle' if any(marker in x for marker in VEHICLE_MARKERS) else 'station'

df['Type'] = df['Konum'].apply(vehicleorstation)
df

| | Konum | 2019 | 2020 | 2021 | Type |
|---|---|---|---|---|---|
| 0 | PDM Fatih | 8 | 0 | 0 | station |
| 1 | IETT-O3041 | 509771 | 71799 | 1710 | vehicle |
| 2 | ISKI-Esenler | 1504313 | 1243715 | 11428 | station |
| 3 | Unknown Location | 25120646 | 4612 | 282 | station |
| 4 | IETT-B5060 | 129208 | 38182 | 635 | vehicle |
| ... | ... | ... | ... | ... | ... |
| 4470 | 017-Cemberlitas Nuriosmaniye Caddesi | 426 | 0 | 0 | station |
| 4471 | ErdemBeyazitKutuphanesi-2 | 100199 | 0 | 0 | station |
| 4472 | 018-IBB Tercih Merkezi Beyoglu | 0 | 28643 | 0 | station |
| 4473 | IETT-O3040 | 311873 | 82873 | 1070 | vehicle |
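The same labelling can also be done without a row-by-row `apply`. The sketch below is an alternative, not the notebook's method: it joins the marker substrings into one escaped regular expression and uses `Series.str.contains` with `np.where`. The `sample` frame holds the first five names from the output above, and the names `markers`, `pattern`, and `is_vehicle` are introduced here for illustration.

```python
import re

import numpy as np
import pandas as pd

# Marker substrings copied from the checks in vehicleorstation
markers = [
    'IETT-', 'Metrobus-', 'Metretrobus-', 'HAVAIST-', 'BLNT-', 'BELNET-',
    'Belnet-', 'ÝETT-', 'HAVATAS-', 'Deniz Otobus', 'Metobus-',
]

# Escape each marker so it is matched literally, then join with '|' (regex OR)
pattern = '|'.join(re.escape(m) for m in markers)

# First five location names from the output above
sample = pd.DataFrame({
    'Konum': ['PDM Fatih', 'IETT-O3041', 'ISKI-Esenler',
              'Unknown Location', 'IETT-B5060'],
})

# Case-sensitive substring search, matching the behavior of the `in` checks
is_vehicle = sample['Konum'].str.contains(pattern, regex=True)
sample['Type'] = np.where(is_vehicle, 'vehicle', 'station')

print(sample)
```

Keeping the markers in one list makes it easier to add a newly discovered spelling variant in a single place.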


## Export Dataset to Tableau

In [7]:
pand_df = pandleau(df)
pand_df.to_tableau('mydata.hyper', add_index=False)

Table 'Extract' does not exist in extract mydata.hyper, creating.


processing table: 4475it [00:00, 50852.93it/s]


## Data Inspection

### Total Vehicles and Stations