# The Modern Mamak: Data Filtering by Company Status

The dataset is split by decade of registration, and each decade is further separated into two dataframes: live stores (currently in operation) and non-live stores (no longer in operation). This gives a rough gauge of how many of the stores registered in each decade have since closed down. The data will also be used to approximate the termination date of some non-live shops and to visualise the survivability of shops in each decade.

<br>

### Content:

1. Separating Shops in the 1980s into Live and Non-live
2. Separating Shops in the 1990s into Live and Non-live
3. Separating Shops in the 2000s into Live and Non-live
4. Separating Shops in the 2010s into Live and Non-live
5. Separating Shops in the 2020s into Live and Non-live

# Libraries

In [3]:
import numpy as np
import pandas as pd
import geopandas as gpd
import shapefile as shp
import matplotlib.pyplot as plt
import seaborn as sns

In [4]:
#Google credentials
from pydrive2.auth import GoogleAuth
from pydrive2.drive import GoogleDrive
from google.colab import auth
from oauth2client.client import GoogleCredentials

from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


# Read Files

In [5]:
base_path = "/content/drive/MyDrive/Mamak Stores/ACRA data/"

decades = ["1980s", "1990s", "2000s", "2010s", "2020s"]

mamaks_geo = {decade: pd.read_csv(f"{base_path}mamaks_{decade}_geo.csv") for decade in decades}

mm_1980s_geo = mamaks_geo["1980s"]
mm_1990s_geo = mamaks_geo["1990s"]
mm_2000s_geo = mamaks_geo["2000s"]
mm_2010s_geo = mamaks_geo["2010s"]
mm_2020s_geo = mamaks_geo["2020s"]

# Definitions for Separating Live and Non-live Shops

In [6]:
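def get_live_shops(df):
  # Keep only rows whose entity status is 'Live' or 'Live Company'
  temp_df_live = df[
    (df['entity_status_description'] == 'Live') |
    (df['entity_status_description'] == 'Live Company')
  ]
  # Renumber rows from 0 so the result has a clean index
  temp_df_live = temp_df_live.reset_index(drop=True)

  return temp_df_live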


In [7]:
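def get_nlive_shops(df):
  # Keep every row whose entity status is neither 'Live' nor 'Live Company'
  # (rows with a missing status also end up here, since NaN != 'Live' is True)
  temp_df_nlive = df[
    (df['entity_status_description'] != 'Live') &
    (df['entity_status_description'] != 'Live Company')
  ]
  # Renumber rows from 0 so the result has a clean index
  temp_df_nlive = temp_df_nlive.reset_index(drop=True)

  return temp_df_nlive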
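As a quick sanity check on the two definitions above, the sketch below applies the same rule to a small made-up dataframe and confirms that the live and non-live subsets together account for every row. It uses `isin` as an alternative way of expressing the same condition. The UENs and the non-live status labels here are invented for illustration and are not taken from the ACRA data.

```python
import pandas as pd

# Status labels treated as "live", matching the two functions above
LIVE_STATUSES = ["Live", "Live Company"]

# Toy records: UENs and non-live status labels are hypothetical
toy = pd.DataFrame({
    "uen": ["A1", "B2", "C3", "D4", "E5"],
    "entity_status_description": [
        "Live", "Cancelled", "Live Company", "Struck Off", None,
    ],
})

# Boolean mask: True where the status is one of the live labels
is_live = toy["entity_status_description"].isin(LIVE_STATUSES)

toy_live = toy[is_live].reset_index(drop=True)
toy_nlive = toy[~is_live].reset_index(drop=True)

# Every row lands in exactly one of the two subsets
print(len(toy_live), len(toy_nlive), len(toy))
```

Note that the row with a missing status falls into the non-live subset, which is also what `get_nlive_shops` does. If missing statuses ever appear in the real data, they may deserve a separate look rather than being counted as closed.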

# Separating Shops into Live and Non-live

### 1980s

In [8]:
mm_1980s_live = get_live_shops(mm_1980s_geo)
mm_1980s_nlive = get_nlive_shops(mm_1980s_geo)

In [9]:
mm_1980s_live.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 47 entries, 0 to 46
Data columns (total 31 columns):
 #   Column                             Non-Null Count  Dtype  
---  ------                             --------------  -----  
 0   uen                                47 non-null     object 
 1   issuance_agency_id                 47 non-null     object 
 2   entity_name                        47 non-null     object 
 3   entity_type_description            47 non-null     object 
 4   business_constitution_description  47 non-null     object 
 5   company_type_description           47 non-null     object 
 6   paf_constitution_description       47 non-null     object 
 7   entity_status_description          47 non-null     object 
 8   registration_incorporation_date    47 non-null     object 
 9   uen_issue_date                     47 non-null     object 
 10  address_type                       47 non-null     object 
 11  block                              47 non-null     object 
 

In [10]:
mm_1980s_nlive.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 893 entries, 0 to 892
Data columns (total 31 columns):
 #   Column                             Non-Null Count  Dtype  
---  ------                             --------------  -----  
 0   uen                                893 non-null    object 
 1   issuance_agency_id                 893 non-null    object 
 2   entity_name                        893 non-null    object 
 3   entity_type_description            893 non-null    object 
 4   business_constitution_description  893 non-null    object 
 5   company_type_description           893 non-null    object 
 6   paf_constitution_description       893 non-null    object 
 7   entity_status_description          893 non-null    object 
 8   registration_incorporation_date    893 non-null    object 
 9   uen_issue_date                     893 non-null    object 
 10  address_type                       893 non-null    object 
 11  block                              893 non-null    object 

### 1990s

In [11]:
mm_1990s_live = get_live_shops(mm_1990s_geo)
mm_1990s_nlive = get_nlive_shops(mm_1990s_geo)

In [12]:
mm_1990s_live.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 76 entries, 0 to 75
Data columns (total 31 columns):
 #   Column                             Non-Null Count  Dtype  
---  ------                             --------------  -----  
 0   uen                                76 non-null     object 
 1   issuance_agency_id                 76 non-null     object 
 2   entity_name                        76 non-null     object 
 3   entity_type_description            76 non-null     object 
 4   business_constitution_description  76 non-null     object 
 5   company_type_description           76 non-null     object 
 6   paf_constitution_description       76 non-null     object 
 7   entity_status_description          76 non-null     object 
 8   registration_incorporation_date    76 non-null     object 
 9   uen_issue_date                     76 non-null     object 
 10  address_type                       76 non-null     object 
 11  block                              76 non-null     object 
 

In [13]:
mm_1990s_nlive.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1285 entries, 0 to 1284
Data columns (total 31 columns):
 #   Column                             Non-Null Count  Dtype  
---  ------                             --------------  -----  
 0   uen                                1285 non-null   object 
 1   issuance_agency_id                 1285 non-null   object 
 2   entity_name                        1285 non-null   object 
 3   entity_type_description            1285 non-null   object 
 4   business_constitution_description  1285 non-null   object 
 5   company_type_description           1285 non-null   object 
 6   paf_constitution_description       1285 non-null   object 
 7   entity_status_description          1285 non-null   object 
 8   registration_incorporation_date    1285 non-null   object 
 9   uen_issue_date                     1285 non-null   object 
 10  address_type                       1285 non-null   object 
 11  block                              1285 non-null   object 

### 2000s

In [14]:
mm_2000s_live = get_live_shops(mm_2000s_geo)
mm_2000s_nlive = get_nlive_shops(mm_2000s_geo)

In [15]:
mm_2000s_live.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 118 entries, 0 to 117
Data columns (total 31 columns):
 #   Column                             Non-Null Count  Dtype  
---  ------                             --------------  -----  
 0   uen                                118 non-null    object 
 1   issuance_agency_id                 118 non-null    object 
 2   entity_name                        118 non-null    object 
 3   entity_type_description            118 non-null    object 
 4   business_constitution_description  118 non-null    object 
 5   company_type_description           118 non-null    object 
 6   paf_constitution_description       118 non-null    object 
 7   entity_status_description          118 non-null    object 
 8   registration_incorporation_date    118 non-null    object 
 9   uen_issue_date                     118 non-null    object 
 10  address_type                       118 non-null    object 
 11  block                              118 non-null    object 

In [16]:
mm_2000s_nlive.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1381 entries, 0 to 1380
Data columns (total 31 columns):
 #   Column                             Non-Null Count  Dtype  
---  ------                             --------------  -----  
 0   uen                                1381 non-null   object 
 1   issuance_agency_id                 1381 non-null   object 
 2   entity_name                        1381 non-null   object 
 3   entity_type_description            1381 non-null   object 
 4   business_constitution_description  1381 non-null   object 
 5   company_type_description           1381 non-null   object 
 6   paf_constitution_description       1381 non-null   object 
 7   entity_status_description          1381 non-null   object 
 8   registration_incorporation_date    1381 non-null   object 
 9   uen_issue_date                     1381 non-null   object 
 10  address_type                       1381 non-null   object 
 11  block                              1381 non-null   object 

### 2010s

In [17]:
mm_2010s_live = get_live_shops(mm_2010s_geo)
mm_2010s_nlive = get_nlive_shops(mm_2010s_geo)

In [18]:
mm_2010s_live.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 166 entries, 0 to 165
Data columns (total 31 columns):
 #   Column                             Non-Null Count  Dtype  
---  ------                             --------------  -----  
 0   uen                                166 non-null    object 
 1   issuance_agency_id                 166 non-null    object 
 2   entity_name                        166 non-null    object 
 3   entity_type_description            166 non-null    object 
 4   business_constitution_description  166 non-null    object 
 5   company_type_description           166 non-null    object 
 6   paf_constitution_description       166 non-null    object 
 7   entity_status_description          166 non-null    object 
 8   registration_incorporation_date    166 non-null    object 
 9   uen_issue_date                     166 non-null    object 
 10  address_type                       166 non-null    object 
 11  block                              166 non-null    object 

In [19]:
mm_2010s_nlive.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 845 entries, 0 to 844
Data columns (total 31 columns):
 #   Column                             Non-Null Count  Dtype  
---  ------                             --------------  -----  
 0   uen                                845 non-null    object 
 1   issuance_agency_id                 845 non-null    object 
 2   entity_name                        844 non-null    object 
 3   entity_type_description            845 non-null    object 
 4   business_constitution_description  845 non-null    object 
 5   company_type_description           845 non-null    object 
 6   paf_constitution_description       845 non-null    object 
 7   entity_status_description          845 non-null    object 
 8   registration_incorporation_date    845 non-null    object 
 9   uen_issue_date                     845 non-null    object 
 10  address_type                       845 non-null    object 
 11  block                              845 non-null    object 
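The output above reports 844 non-null `entity_name` values out of 845 rows, so one non-live 2010s record has no name. A minimal sketch of how such rows could be located is shown below, using a made-up dataframe in place of `mm_2010s_nlive`. The UENs and names are hypothetical.

```python
import pandas as pd

# Toy stand-in for a non-live dataframe; values are invented for illustration
toy_nlive = pd.DataFrame({
    "uen": ["X1", "X2", "X3"],
    "entity_name": ["SHOP A", None, "SHOP C"],
})

# Rows where the name is missing
missing_name = toy_nlive[toy_nlive["entity_name"].isna()]
print(missing_name)

# Count of missing values per column, useful as a quick overview
print(toy_nlive.isna().sum())
```

Running the same `isna()` filter on the real dataframe would show which UEN is affected, so it can be handled before any step that relies on the shop name.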

### 2020s

In [20]:
mm_2020s_live = get_live_shops(mm_2020s_geo)
mm_2020s_nlive = get_nlive_shops(mm_2020s_geo)

In [21]:
mm_2020s_live.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 149 entries, 0 to 148
Data columns (total 31 columns):
 #   Column                             Non-Null Count  Dtype  
---  ------                             --------------  -----  
 0   uen                                149 non-null    object 
 1   issuance_agency_id                 149 non-null    object 
 2   entity_name                        149 non-null    object 
 3   entity_type_description            149 non-null    object 
 4   business_constitution_description  149 non-null    object 
 5   company_type_description           149 non-null    object 
 6   paf_constitution_description       149 non-null    object 
 7   entity_status_description          149 non-null    object 
 8   registration_incorporation_date    149 non-null    object 
 9   uen_issue_date                     149 non-null    object 
 10  address_type                       149 non-null    object 
 11  block                              149 non-null    object 

In [22]:
mm_2020s_nlive.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 115 entries, 0 to 114
Data columns (total 31 columns):
 #   Column                             Non-Null Count  Dtype  
---  ------                             --------------  -----  
 0   uen                                115 non-null    object 
 1   issuance_agency_id                 115 non-null    object 
 2   entity_name                        115 non-null    object 
 3   entity_type_description            115 non-null    object 
 4   business_constitution_description  115 non-null    object 
 5   company_type_description           115 non-null    object 
 6   paf_constitution_description       115 non-null    object 
 7   entity_status_description          115 non-null    object 
 8   registration_incorporation_date    115 non-null    object 
 9   uen_issue_date                     115 non-null    object 
 10  address_type                       115 non-null    object 
 11  block                              115 non-null    object 
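The row counts reported by `info()` above can be gathered into one table to compare decades at a glance. The sketch below hardcodes those counts and computes the share of shops registered in each decade that are still live. The column names `live`, `non_live`, `total` and `live_share_pct` are chosen here for illustration.

```python
import pandas as pd

# Row counts copied from the info() outputs above
counts = pd.DataFrame(
    {
        "live": [47, 76, 118, 166, 149],
        "non_live": [893, 1285, 1381, 845, 115],
    },
    index=["1980s", "1990s", "2000s", "2010s", "2020s"],
)

# Total registrations per decade and the percentage still live
counts["total"] = counts["live"] + counts["non_live"]
counts["live_share_pct"] = (100 * counts["live"] / counts["total"]).round(1)

print(counts)
```

The live share rises from about 5% for shops registered in the 1980s to about 56% for those registered in the 2020s. This is only a rough gauge: older cohorts have had more years in which to close, so the shares are not survival rates at a comparable age.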

### Save

In [23]:
#SAVE LIVE SHOPS
live_shops = {
  "1980s": mm_1980s_live,
  "1990s": mm_1990s_live,
  "2000s": mm_2000s_live,
  "2010s": mm_2010s_live,
  "2020s": mm_2020s_live,
}
# Reuse base_path from the Read Files cell instead of repeating the full path
for decade, df in live_shops.items():
  df.to_csv(f"{base_path}mamaks_{decade}_live.csv", index=False)

In [24]:
#SAVE NON-LIVE SHOPS
nlive_shops = {
  "1980s": mm_1980s_nlive,
  "1990s": mm_1990s_nlive,
  "2000s": mm_2000s_nlive,
  "2010s": mm_2010s_nlive,
  "2020s": mm_2020s_nlive,
}
# Reuse base_path from the Read Files cell instead of repeating the full path
for decade, df in nlive_shops.items():
  df.to_csv(f"{base_path}mamaks_{decade}_nlive.csv", index=False)