<h1 align='center'>
<br>
    <img src="./assets/ibm_x_taiwan.png" alt="IBM Course" width="400">
<br>
    Applied Data Science Capstone
</h1>


<h2 align='center'>
The Battle of Neighborhoods - A Look Around Universities in Taiwan
</h2>

<h5 align='center'>
    Brian L. Chen, Feb., 2021
</h5>


## 1. Introduction

### 1.1 Background
Taiwan is known for its outstanding performance throughout the COVID-19 pandemic, and students from all over the world are eager to learn more about this unique island. Taiwan strictly enforces its regulations and requires a 14-day quarantine for everyone who is allowed to enter the island, including Taiwanese citizens.

Besides the restrictions, Taiwan is also known for its high population density (ranked 17th worldwide in 2019 according to [statista](https://www.statista.com/statistics/264683/top-fifty-countries-with-the-highest-population-density/)), beautiful mountains, and convenience. 

As a student currently studying in Taiwan, I think an analysis of the local universities in terms of convenience would interest future students who are deciding which university to attend. Although it may be somewhat absurd to choose a university based only on convenience, it is still worth considering, especially for those who have never visited a specific university.

Moreover, as a gym-goer and a sports fanatic, I always find a university surrounded by sports facilities worth considering.

### 1.2 Interest 

The project therefore aims (1) **to examine the neighborhood of each university in Taiwan and explore what it is like to live there**, and (2) **to inspect the sports facilities around each university in Taiwan**.

The aim is to help future students choose their universities in terms of what the neighborhoods have to offer and what they would like to experience. The second part of this project could also give students who play sports a better perspective on which school to attend, based on their preferences.


<div class='alert alert-block alert-warning' style='margin-top: 20px'>
🗒️ Notice<br>
I am not here to get involved in any political issues. Taiwan is a beautiful place with beautiful people, and the project aims to provide insights regarding the universities on the island. Please do not add extra political perspective towards this project.
</div>

![](https://dynaimage.cdn.cnn.com/cnn/q_auto,w_900,c_fill,g_auto,h_506,ar_16:9/http%3A%2F%2Fcdn.cnn.com%2Fcnnnext%2Fdam%2Fassets%2F180719131350-beautiful-taiwan-popumon-alishan.jpg)

## 2. Data Acquisition and Cleaning

### 2.1 Data Sources
#### 2.1.1 List of Accredited Taiwan Universities
The data of listed universities is from the Ministry of Education, which can be found [here](https://ulist.moe.gov.tw/Download/FileDownload). The data contains the following information:

+ School ID
+ Private/Public Status
+ System/Institution
+ University Name (in Chinese)
+ University Name (in English)
+ Principal of the university
+ County
+ School District
+ Postal Code
+ School Address (in Chinese)
+ Tel
+ School Website

Since the longitude and latitude of each school are not listed, the **[HERE API](https://developer.here.com)** is used to fetch the coordinates, which are then passed to Foursquare for further processing. The final result is *stored locally* so that the API calls are not triggered every time I restart the kernel.

The final attributes include:

+ Serial Number 
+ ID
+ Public/Private
+ System
+ Name
+ Eng. Name
+ Postal Code
+ Latitude
+ Longitude
+ Address
+ City
+ County
+ Neighborhood
+ County (zh)
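
The local caching described above can be sketched as follows. This is only an illustration under stated assumptions, not the notebook's actual code: `load_or_geocode`, the cache file layout, and the `geocode` callable are hypothetical names introduced here, and in the notebook the lookup would be a HERE request.

```python
import csv
from pathlib import Path


def load_or_geocode(addresses, cache_path, geocode):
    """Return {address: (lat, lng)}, calling `geocode` only for uncached addresses.

    `geocode` is a hypothetical callable taking an address and returning (lat, lng).
    """
    cache_path = Path(cache_path)
    cache = {}
    # Read any coordinates saved by a previous session
    if cache_path.exists():
        with cache_path.open(newline='', encoding='utf-8') as f:
            for row in csv.DictReader(f):
                cache[row['address']] = (float(row['lat']), float(row['lng']))

    # Only addresses missing from the cache trigger a lookup
    missing = [a for a in addresses if a not in cache]
    for address in missing:
        cache[address] = geocode(address)

    # Rewrite the cache file only when something new was fetched
    if missing:
        with cache_path.open('w', newline='', encoding='utf-8') as f:
            writer = csv.writer(f)
            writer.writerow(['address', 'lat', 'lng'])
            for address, (lat, lng) in cache.items():
                writer.writerow([address, lat, lng])
    return cache
```

With a helper like this, restarting the kernel and running the cell again would read the CSV instead of spending API quota.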

#### 2.1.2 Geojson Data of Taiwan
The **geojson data** of Taiwan is obtained to plot the choropleth map. The data helps define the boundaries for each county in Taiwan. It is provided by **[g0v](https://github.com/g0v/twgeojson)**. Since the data hasn't been updated for nearly 6 years, some content from the dataset needs to be corrected.

I'm fetching [this](https://raw.githubusercontent.com/g0v/twgeojson/master/json/twCounty2010.geo.json) particular geojson data. As an example of the correction, 桃園縣 (Taoyuan County) needs to be corrected to 桃園市 (Taoyuan City). 
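
The kind of correction mentioned above can be sketched as a small in-place rename over the GeoJSON features. The property key `COUNTYNAME` and the helper name `rename_county` are assumptions made for illustration; check the actual key in the downloaded file before relying on it.

```python
import json


def rename_county(geojson, old, new, key='COUNTYNAME'):
    """Rename a county in place and return how many features were changed.

    `key` is an assumed property name, not confirmed against the g0v file.
    """
    changed = 0
    for feature in geojson.get('features', []):
        props = feature.get('properties', {})
        if props.get(key) == old:
            props[key] = new
            changed += 1
    return changed


# Tiny stand-in for the real file, with geometry omitted
sample = json.loads('''{"type": "FeatureCollection", "features": [
  {"type": "Feature", "properties": {"COUNTYNAME": "桃園縣"}, "geometry": null},
  {"type": "Feature", "properties": {"COUNTYNAME": "臺北市"}, "geometry": null}
]}''')
n_changed = rename_county(sample, '桃園縣', '桃園市')
```

Returning the number of changed features makes it easy to notice when a rename silently matches nothing.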
  
#### 2.1.3 Information of Venues Around Universities
[Foursquare API](https://developer.foursquare.com) is used to obtain the final list of venues to be examined. Note that the Foursquare API uses a special "category ID" for each category. Since I am collecting a series of fitness/gym sites, scraping the [venue category page](https://developer.foursquare.com/docs/build-with-foursquare/categories/) (instead of connecting through [its API](https://developer.foursquare.com/docs/api-reference/venues/categories/)) works better for me. I am collecting data with the following information:

+ University (zh)
+ University (eng)
+ University Latitude
+ University Longitude
+ Venue
+ Venue Latitude
+ Venue Longitude
+ Venue Category

#### 2.1.4 Results
In the end, there are a total of **five** datasets to analyze:

1. Venues within 1,500m of each university on the list, to examine the convenience of each university
2. Sports venues within 1,000m and within 2,000m (two datasets)
3. Gym-only venues within 1,000m and within 2,000m (two datasets)
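
The radii above are applied by Foursquare itself, but it can be useful to double-check how far a returned venue really is from a campus. The sketch below uses the haversine formula, which is my choice for illustration rather than anything the notebook computes. The helper name and the mean Earth radius are assumptions, and the sample coordinates are copied from the National Taiwan University rows shown later in the output.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres (an approximation)


def haversine_m(lat1, lng1, lat2, lng2):
    """Great-circle distance in metres between two (lat, lng) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


# University coordinates vs. one of its listed venues
distance = haversine_m(25.01697, 121.53369, 25.01447, 121.5346)
within_1000m = distance <= 1000
```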

### 2.2 Data Processing

In [13]:
# Import libraries

import json
import math

import numpy as np # library to handle data in a vectorized manner
import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import requests
import geocoder
from geopy.geocoders import Nominatim # module to convert an address into latitude and longitude values
from tqdm.notebook import tqdm
from tqdm.contrib import tzip

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors
import matplotlib.pyplot as plt # plotting library
%matplotlib inline

# import k-means from clustering stage
from sklearn.cluster import KMeans

import folium # map rendering library

# scraping libraries
from bs4 import BeautifulSoup

plt.style.use('ggplot') # refined style

Since the csv file is downloaded directly from the Ministry of Education, the contents are mainly in Chinese.

In [2]:
rdf.head()

Unnamed: 0,序號,學校代碼,公私立,體制,學校名稱,學校英文名稱,職稱,姓名,縣市別,第三級行政區,郵遞區號,學校地址,學校總機,學校傳真,網址
0,1,1,公立,一般大學,國立政治大學,National Chengchi Universerity,校長,郭明政,臺北市,文山區,11605,臺北市文山區指南路2段64號,02-2939-3091,02-2937-9611,http://www.nccu.edu.tw
1,2,2,公立,一般大學,國立清華大學,National Tsing Hua University,校長,賀陳弘,新竹市,新竹市,30013,新竹市光復路2段101號,03-571-5131,03-572-4038,http://www.nthu.edu.tw
2,3,3,公立,一般大學,國立臺灣大學,NATIONAL TAIWAN UNIVERSITY,校長,管中閔,臺北市,大安區,10617,臺北市大安區羅斯福路4段1號,02-3366-3366,02-2362-7651,http://www.ntu.edu.tw
3,4,4,公立,一般大學,國立臺灣師範大學,National Taiwan Normal University,校長,吳正己,臺北市,大安區,10610,臺北市大安區和平東路1段162號,02-7749-1111,無,http://www.ntnu.edu.tw
4,5,5,公立,一般大學,國立成功大學,National Cheng Kung University,校長,蘇慧貞,臺南市,東　區,70101,臺南市東區大學路1號,06-275-7575,06-276-6462,http://www.ncku.edu.tw


For this, we will select specific columns and use some of them in our [HERE API](https://developer.here.com) calls.

In [6]:
# Translate the Chinese column names that are used later on
pdf = rdf.rename(columns={
    '序號': 'Serial Number', '學校代碼': 'ID', '公私立': 'Public/Private', '體制': 'System',
    '學校名稱': 'Name', '學校英文名稱': 'Eng. Name', '郵遞區號': 'Postal Code',
    '學校總機': 'Tel', '學校傳真': 'Fax', '網址': 'URL'})
# Assign the result back instead of calling replace(..., inplace=True) on a column selection
pdf['Public/Private'] = pdf['Public/Private'].replace({'公立': 'Public', '私立': 'Private'})
selected = ['Serial Number',
 'ID',
 'Public/Private',
 'System',
 'Name',
 'Eng. Name',
 'Postal Code',
 '縣市別',
 '第三級行政區',
 '學校地址']

pdf = pdf[selected]
pdf.head()

Unnamed: 0,Serial Number,ID,Public/Private,System,Name,Eng. Name,Postal Code,縣市別,第三級行政區,學校地址
0,1,1,Public,一般大學,國立政治大學,National Chengchi Universerity,11605,臺北市,文山區,臺北市文山區指南路2段64號
1,2,2,Public,一般大學,國立清華大學,National Tsing Hua University,30013,新竹市,新竹市,新竹市光復路2段101號
2,3,3,Public,一般大學,國立臺灣大學,NATIONAL TAIWAN UNIVERSITY,10617,臺北市,大安區,臺北市大安區羅斯福路4段1號
3,4,4,Public,一般大學,國立臺灣師範大學,National Taiwan Normal University,10610,臺北市,大安區,臺北市大安區和平東路1段162號
4,5,5,Public,一般大學,國立成功大學,National Cheng Kung University,70101,臺南市,東　區,臺南市東區大學路1號


In [4]:
# define the dataframe columns
column_names = ['Serial Number',
 'ID',
 'Public/Private',
 'System',
 'Name',
 'Eng. Name',
 'Postal Code', 
 'Latitude', 
 'Longitude', 
 'Address',
 'City',
 'County',
 'Neighborhood'] 

# instantiate the dataframe
updf = pd.DataFrame(columns=column_names)
updf

Unnamed: 0,Serial Number,ID,Public/Private,System,Name,Eng. Name,Postal Code,Latitude,Longitude,Address,City,County,Neighborhood


In [5]:
# Credentials HERE API
APP_ID = ''
APP_CODE = ''

In [8]:
rows = []  # one dict per university; DataFrame.append was removed in pandas 2.0
for data in tqdm(pdf.to_dict(orient='records')):
    # Geocode the Chinese street address (學校地址) with HERE
    g = geocoder.here(data['學校地址'],
                      app_id=APP_ID,
                      app_code=APP_CODE)
    result = g.json

    rows.append({
        'Serial Number': data['Serial Number'],
        'ID': data['ID'],
        'Public/Private': data['Public/Private'],
        'System': data['System'],
        'Name': data['Name'],
        'Eng. Name': data['Eng. Name'],
        'Postal Code': data['Postal Code'],
        'Latitude': result['lat'],
        'Longitude': result['lng'],
        'Address': result['address'],
        'City': result['city'],
        'County': result['county'],
        # HERE does not return a neighborhood for every address
        'Neighborhood': result.get('neighborhood', np.nan),
        'County (zh)': data['縣市別'],  # county name in Chinese
    })

# Build the DataFrame once, keeping the same column order as before
updf = pd.DataFrame(rows, columns=column_names + ['County (zh)'])

updf.head()


Unnamed: 0,Serial Number,ID,Public/Private,System,Name,Eng. Name,Postal Code,Latitude,Longitude,Address,City,County,Neighborhood,County (zh)
0,1,1,Public,一般大學,國立政治大學,National Chengchi Universerity,11605,24.98741,121.57651,"No. 64, Sec 2, Zhi Nan Rd., Wenshan District, ...",Taipei City,Taipei City,Wenshan District,臺北市
1,2,2,Public,一般大學,國立清華大學,National Tsing Hua University,30013,24.79635,120.99686,"No. 101, Sec 2, Guang Fu Rd., East District, H...",Hsinchu City,Hsinchu City,East District,新竹市
2,3,3,Public,一般大學,國立臺灣大學,NATIONAL TAIWAN UNIVERSITY,10617,25.01697,121.53369,"No. 1, Sec 4, Luo Si Fu Rd., Daan District, Ta...",Taipei City,Taipei City,Daan District,臺北市
3,4,4,Public,一般大學,國立臺灣師範大學,National Taiwan Normal University,10610,25.02643,121.52756,"No. 162, Sec 1, He Ping E. Rd., Daan District,...",Taipei City,Taipei City,Daan District,臺北市
4,5,5,Public,一般大學,國立成功大學,National Cheng Kung University,70101,22.99632,120.21953,"No. 1, Da Syue Rd., East District, Tainan City...",Tainan City,Tainan City,East District,臺南市


In [9]:
updf.shape

(160, 14)

In [10]:
updf.to_csv('assets/data/p_ulist_2021-02-20.csv', index=False)

Now the data is ready to be sent to the Foursquare API for further exploration.

#### 2.2.1 Fetching Data with Foursquare API

In [2]:
df = pd.read_csv('assets/data/p_ulist_2021-02-20.csv')
udf = df.copy()
udf['Neighborhood'] = udf['Neighborhood'].fillna(udf['City'])
# check if we still have something that is null
udf.isnull().sum()

Serial Number     0
ID                0
Public/Private    0
System            0
Name              0
Eng. Name         0
Postal Code       0
Latitude          0
Longitude         0
Address           0
City              0
County            0
Neighborhood      0
County (zh)       0
dtype: int64

In [23]:
# Foursquare API Credentials
CODE = ''
url_to_access_token = ''
CLIENT_ID = '' # your Foursquare ID
CLIENT_SECRET = '' # your Foursquare Secret
ACCESS_TOKEN = '' # your FourSquare Access Token
VERSION = ''
LIMIT = 100 # A default Foursquare API limit value

In [6]:
def getNearbyVenues(namess, latitudes, longitudes, radius=1000):
    
    venues_list=[]
    full_list = [] # raw responses kept for future reference; handy when examining a different radius (no need to rerun the queries)
    (names, eng_names) = namess
    for name, eng_name, lat, lng in tzip(names, eng_names, latitudes, longitudes):
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        full_list.append(results)
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, eng_name,
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['University (zh)', 'University (eng)', 
                  'University Latitude', 
                  'University Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return nearby_venues, full_list
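
The row-building step inside `getNearbyVenues` indexes `categories[0]` directly, which would raise an `IndexError` if a venue ever came back without a category. A more defensive variant of that step might look like the sketch below. `flatten_items` is a hypothetical helper, and the mock `items` only imitate the fields the function above reads; they are not real API output.

```python
def flatten_items(name, eng_name, lat, lng, items):
    """Turn the 'items' list of one explore response into flat tuples.

    Missing fields become None instead of raising KeyError or IndexError.
    """
    rows = []
    for item in items:
        venue = item.get('venue', {})
        location = venue.get('location', {})
        categories = venue.get('categories') or []
        category = categories[0]['name'] if categories else None
        rows.append((name, eng_name, lat, lng,
                     venue.get('name'),
                     location.get('lat'),
                     location.get('lng'),
                     category))
    return rows


# Mock items shaped like the fields used above (illustrative values only)
items = [
    {'venue': {'name': 'World Gym',
               'location': {'lat': 25.01447, 'lng': 121.5346},
               'categories': [{'name': 'Gym / Fitness Center'}]}},
    {'venue': {'name': 'Uncategorized spot',
               'location': {'lat': 25.0, 'lng': 121.5},
               'categories': []}},
]
rows = flatten_items('國立臺灣大學', 'NATIONAL TAIWAN UNIVERSITY', 25.01697, 121.53369, items)
```

Rows with a `None` category can then be dropped or relabeled in pandas, rather than aborting a run of 160 requests partway through.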

In [48]:
uni_venues = pd.DataFrame()
uni_venues, full_list_raw = getNearbyVenues((np.asarray(udf['Name']), np.asarray(udf['Eng. Name'])), np.asarray(udf.Latitude), np.asarray(udf.Longitude), radius=1500)
print(uni_venues.shape)
uni_venues.head()
len(uni_venues['University (zh)'].unique())


(4119, 8)


159
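
The output above reports 159 unique universities although 160 were queried. That would be consistent with one university returning no venues, or with a duplicated name; this is an inference, not something the notebook confirms. Since a plain `groupby` drops universities with no rows, a count that keeps them might be sketched as below. `venue_counts` and the toy data are hypothetical.

```python
import pandas as pd


def venue_counts(universities, venues, key='University (zh)'):
    """Count venues per university, keeping universities with zero venues."""
    counts = venues.groupby(key).size()
    # reindex adds the universities that groupby never saw, with a count of 0
    return counts.reindex(universities, fill_value=0).sort_values(ascending=False)


# Toy data: university 'C' has no venues at all
toy_venues = pd.DataFrame({'University (zh)': ['A', 'A', 'B'],
                           'Venue': ['cafe', 'gym', 'park']})
toy_counts = venue_counts(['A', 'B', 'C'], toy_venues)
```

In the notebook, the first argument would be something like `udf['Name'].unique()`.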

Here, we export the resulting DataFrame as a CSV file for later use.

In [10]:
uni_venues.to_csv('assets/data/uni_venues_1500_2021-02-20.csv', index=False)

In [49]:
print(uni_venues.shape)
uni_venues.groupby('University (zh)').count()

(4119, 8)


University (zh),University (eng),University Latitude,University Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
一貫道崇德學院,4,4,4,4,4,4,4
世新大學,93,93,93,93,93,93,93
中信金融管理學院,1,1,1,1,1,1,1
中原大學,14,14,14,14,14,14,14
中國文化大學,36,36,36,36,36,36,36
中國科技大學,27,27,27,27,27,27,27
中國醫藥大學,45,45,45,45,45,45,45
中山醫學大學,9,9,9,9,9,9,9
中州學校財團法人中州科技大學,2,2,2,2,2,2,2
中臺科技大學,17,17,17,17,17,17,17


In [50]:
print('There are {} uniques categories.'.format(len(uni_venues['Venue Category'].unique())))

There are 262 uniques categories.


#### 2.2.2 Retrieving Gym Related Category IDs

In [14]:
page = requests.get("https://developer.foursquare.com/docs/build-with-foursquare/categories/")

In [15]:
soup = BeautifulSoup(page.content, 'html.parser')
all_categories_raw = soup.find_all('ul', class_='VenueCategories__Wrapper-sc-1ysxg0y-0 dikXMT')[0]

all_categories_name = all_categories_raw.find_all('h3')
all_categories_id = all_categories_raw.find_all('p')
all_categories_name = [cat.get_text() for cat in all_categories_name]
all_categories_id = [cat.get_text() for cat in all_categories_id]
all_categories_id = list(filter(lambda x: len(x) == 24, all_categories_id))

cat_df = pd.DataFrame({'Name': all_categories_name, 'ID': all_categories_id})
cat_df.head()

Unnamed: 0,Name,ID
0,Arts & Entertainment,4d4b7104d754a06370d81259
1,Amphitheater,56aa371be4b08b9a8d5734db
2,Aquarium,4fceea171983d5d06c3e9823
3,Arcade,4bf58dd8d48988d1e1931735
4,Art Gallery,4bf58dd8d48988d1e2931735


In [16]:
cat_df.to_csv('assets/data/categories_id_2021-02-20.csv', index=False)
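
The scraping step above keeps every `<p>` text of length 24 and assumes the result lines up one-to-one with the `<h3>` names. A slightly stricter check might look like the sketch below. It assumes category IDs are 24 lowercase hexadecimal characters, which matches the IDs shown in the output but is not taken from Foursquare's documentation, and `pair_names_with_ids` is a hypothetical helper.

```python
import re

# Assumption: a category ID is exactly 24 lowercase hex characters
CATEGORY_ID = re.compile(r'[0-9a-f]{24}')


def pair_names_with_ids(names, texts):
    """Pair category names with the texts that look like IDs, failing loudly on a mismatch."""
    ids = [t.strip() for t in texts if CATEGORY_ID.fullmatch(t.strip())]
    if len(ids) != len(names):
        raise ValueError(f'{len(names)} names but {len(ids)} IDs; the page layout may have changed')
    return list(zip(names, ids))


pairs = pair_names_with_ids(
    ['Arts & Entertainment', 'Amphitheater'],
    ['4d4b7104d754a06370d81259', 'A description that is not an ID', '56aa371be4b08b9a8d5734db'],
)
```

Raising on a length mismatch avoids silently building a DataFrame in which names and IDs are shifted against each other.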

After exporting the category names and IDs for future use, I'd like to pick out the ones that represent sports facilities. If you are more into gyms, you may prefer the schools with more gyms in the vicinity.

In [24]:
sports_categories = ['Gym', 'Gym / Fitness Center', 'Athletics & Sports', 'Climbing Gym', 'College Gym', 'Gymnastics Gym' , 'Beach', 'Bike Trail', 'Dive Spot', 'Fishing Spot', 'Park', 'Recreation Center', 'Rock Climbing Spot', 'Ski Area']

In [27]:
categories_id = ','.join(map(str, cat_df[cat_df['Name'].str.contains('Gym')].ID.to_list())) 
sports_categories_id = ','.join(map(str, cat_df[cat_df['Name'].isin(sports_categories)].ID.to_list())) 

In [29]:
def getNearbySportsVenues(namess, latitudes, longitudes, radius=1000):
    venues_list=[]
    full_list = [] # raw responses kept for future reference; handy when examining a different radius (no need to rerun the queries)
    (names, eng_names) = namess
    for name, eng_name, lat, lng in tzip(names, eng_names, latitudes, longitudes):
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&categoryId={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            sports_categories_id, # swap in categories_id for the gym-only run
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        full_list.append(results)
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, eng_name,
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['University (zh)', 'University (eng)', 
                  'University Latitude', 
                  'University Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return nearby_venues, full_list
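
The function above reads the category IDs from a global variable, so switching between the gym-only and all-sports runs means editing the function body. One alternative is to pass the IDs in and let the standard library build the query string, as sketched below. `build_explore_url` is a hypothetical helper; the endpoint and parameter names are the ones already used in the function above.

```python
from urllib.parse import urlencode

EXPLORE_URL = 'https://api.foursquare.com/v2/venues/explore'


def build_explore_url(lat, lng, radius, category_ids=None, *,
                      client_id, client_secret, version, limit=100):
    """Build an explore request URL, optionally restricted to some category IDs."""
    params = {
        'client_id': client_id,
        'client_secret': client_secret,
        'v': version,
        'll': f'{lat},{lng}',
        'radius': radius,
        'limit': limit,
    }
    if category_ids:
        # Several IDs are sent as one comma-separated value
        params['categoryId'] = ','.join(category_ids)
    # urlencode percent-escapes the values, including the commas
    return f'{EXPLORE_URL}?{urlencode(params)}'
```

A gym-only run and an all-sports run would then differ only in the `category_ids` argument, which keeps both results reproducible from the same notebook state.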

I am retrieving two datasets, one with a radius of 1,000m and the other with 2,000m. Biking 2km to a gym is not terribly far, and it can also serve as a warm-up. I personally go to a gym that is 1.7km away from my current location, and it takes 10 to 15 minutes to get there.

In [19]:
# Gym only (this run used categories_id as the categoryId inside getNearbySportsVenues)
uni_sports_venues = pd.DataFrame()
uni_sports_venues, full_sports_list_raw = getNearbySportsVenues((np.asarray(udf['Name']), np.asarray(udf['Eng. Name'])), np.asarray(udf.Latitude), np.asarray(udf.Longitude), radius=1000)
uni_sports_venues_2000, full_sports_list_raw_2000 = getNearbySportsVenues((np.asarray(udf['Name']), np.asarray(udf['Eng. Name'])), np.asarray(udf.Latitude), np.asarray(udf.Longitude), radius=2000)
print(uni_sports_venues.shape)
uni_sports_venues.head()


(136, 8)


Unnamed: 0,University (zh),University (eng),University Latitude,University Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,國立清華大學,National Tsing Hua University,24.79635,120.99686,清華大學新體育館 NTHU Stadium,24.793868,120.992318,College Stadium
1,國立清華大學,National Tsing Hua University,24.79635,120.99686,NCTU Gym,24.789599,120.995678,College Gym
2,國立清華大學,National Tsing Hua University,24.79635,120.99686,NCTU Swimming Pool,24.789435,120.9956,Gym Pool
3,國立臺灣大學,NATIONAL TAIWAN UNIVERSITY,25.01697,121.53369,World Gym,25.01447,121.5346,Gym / Fitness Center
4,國立臺灣大學,NATIONAL TAIWAN UNIVERSITY,25.01697,121.53369,台灣大學舊體育館,25.019733,121.535821,College Gym


In [20]:
print('Within 1000m:')
print(uni_sports_venues.shape)
print('There are {} uniques categories.'.format(len(uni_sports_venues['Venue Category'].unique())))

print('Within 2000m:')
print(uni_sports_venues_2000.shape)
print('There are {} uniques categories.'.format(len(uni_sports_venues_2000['Venue Category'].unique())))

Within 1000m:
(136, 8)
There are 16 uniques categories.
Within 2000m:
(487, 8)
There are 21 uniques categories.


In [26]:
uni_sports_venues.to_csv('assets/data/uni_gym_venues_2021-02-20.csv', index=False)
uni_sports_venues_2000.to_csv('assets/data/uni_gym_venues_2000_2021-02-20.csv', index=False)

In [30]:
# All Exercise Facilities
uni_sports_venues = pd.DataFrame()
uni_sports_venues, full_sports_list_raw = getNearbySportsVenues((np.asarray(udf['Name']), np.asarray(udf['Eng. Name'])), np.asarray(udf.Latitude), np.asarray(udf.Longitude), radius=1000)
uni_sports_venues_2000, full_sports_list_raw_2000 = getNearbySportsVenues((np.asarray(udf['Name']), np.asarray(udf['Eng. Name'])), np.asarray(udf.Latitude), np.asarray(udf.Longitude), radius=2000)
print(uni_sports_venues.shape)
uni_sports_venues.head()


(395, 8)


Unnamed: 0,University (zh),University (eng),University Latitude,University Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,國立政治大學,National Chengchi Universerity,24.98741,121.57651,Roman Square (羅馬廣場),24.986799,121.57454,Plaza
1,國立政治大學,National Chengchi Universerity,24.98741,121.57651,道南右岸河濱公園,24.995167,121.572406,Park
2,國立政治大學,National Chengchi Universerity,24.98741,121.57651,政大堤外球場,24.984681,121.572953,Baseball Field
3,國立政治大學,National Chengchi Universerity,24.98741,121.57651,政大網球場(山上) Tennis Courts,24.98197,121.578353,College Tennis Court
4,國立清華大學,National Tsing Hua University,24.79635,120.99686,清華大學新體育館 NTHU Stadium,24.793868,120.992318,College Stadium


In [31]:
print('Within 1000m:')
print(uni_sports_venues.shape)
print('There are {} uniques categories.'.format(len(uni_sports_venues['Venue Category'].unique())))

print('Within 2000m:')
print(uni_sports_venues_2000.shape)
print('There are {} uniques categories.'.format(len(uni_sports_venues_2000['Venue Category'].unique())))

Within 1000m:
(395, 8)
There are 45 uniques categories.
Within 2000m:
(1352, 8)
There are 64 uniques categories.


In [32]:
uni_sports_venues.to_csv('assets/data/uni_sports_venues_2021-02-20.csv', index=False)
uni_sports_venues_2000.to_csv('assets/data/uni_sports_venues_2000_2021-02-20.csv', index=False)