# <center>**Discover Affordable Housing and Venues in Tech Centers of Austin, Texas**</center>

# **Introduction:**

##### As the capital of Texas, Austin, is one of the most innovative tech hubs in the U.S. Silicon Hills is a nickname for the cluster of high-tech companies in the Austin metropolitan area. Austin has been ranked by U.S. News and World Report as the number one metro area to live in the U.S.  Other than the booming tech industry, this is also due to stable housing market, and the desirable lifestyle of the city.

##### So which Austin neighborhood is the tech center of the tech hub?  Ones that are looking to start a life in Austin, may question "Are there affordable housing available in the tech center of Austin?" 

##### In this analysis, we will discover the tech center of Austin, and any available affordable housing in Austin tech center. We will also discover the venues around the Austin Tech Center.

# **Data Sources**

#### 1. **Austin's top 10 most funded zip codes:** (https://www.builtinaustin.com/2016/08/09/austin-fundings-zipcode)
##### *(Any good tech scene needs plenty of funding. But what really makes a tech hub is large investments in a concentrated area, fostering growth through collaboration and drawing in talent from around the country.)*
##### **Data Feature:** Digital tech fundings between August 1, 2015 and July 31, 2016 going by zip code (top 10).

#### 2. **Austin Zip Code-Neighborhoods Search:** (http://www.greenlightaustinrealty.com/austin-zip-code-home-search.php)


#### 3. **Austin Comprehensive Affordable Housing Directory:** CSV Link (https://data.austintexas.gov/api/views/4syj-z4ky/rows.csv?accessType=DOWNLOAD)
##### *(This dataset contains all income-restricted housing within the Austin. This includes properties funded by the City of Austin along with the Housing Authority City of Austin, Housing Authority of Travis County, and Texas Department of Housing and Community Affairs. The property attributes are intended to help Austin residents find income-restricted housing that best suits their needs.)*
##### **Data Feature:** Property Name, Address, Zip Code, Latitude, Longitude, Unit Type, Students Only, Total Income Restricted Units, Has Waitlist, Total Units, Units Segmented by Bedroom, etc. 

#### 4. **Latitude and Longitude Finder:** Link (https://www.latlong.net/)
##### *(The csv file from Austin Government Website provides us a list of neighborhood names of Austin, TX)*
##### **Data Feature:**NEIGHNAME, BASE_ZONE, GENERAL_ZONING,ACRES, etc.

#### 5. **Foursquare API:**we will use Foursquare API to explore and obtain venues around the neighborhood.

# **Methodology**
##### We will follow below steps to discover Austin, TX. Hopefully we can find answers to the questions we raised in the Introduction.
##### **Step 1**: we will check the first source to find the top 3 most tech funded zip codes. The 3 zip codes identified would be the tech centers in Austin. 
##### **Step 2**: we will check the neighborhoods in the 3 zip codes, and creat a data frame for the neighborhoods considered as tech centers, which are associated with the 3 zip codes. We call it "df_TechNeigh".
##### **Step 3**: we will explore Austin Affordable Housing Directory dataset, and determine which high-tech neighborhood has the most affordable housing properties available.
##### **Step 4**: we will explore venues around the high-tech neighborhood with most affordable housing property by using Foursquare API. We will also determine top venue categories by frequency.
###### Now let's start the exploration!
***

#### Install Libraries

In [1]:
# Import pandas and numpy
import numpy as np # library to handle data in a vectorized manner
import pandas as pd # library for data analsysis

# install & import libraries
!conda install -c conda-forge geopy --yes 
from geopy.geocoders import Nominatim # module to convert an address into latitude and longitude values

import requests # library to handle requests
from pandas.io.json import json_normalize # tranform JSON file into a pandas dataframe

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

!conda install -c conda-forge folium=0.5.0 --yes 
import folium # map rendering library

print('Libraries imported.')

Collecting package metadata (current_repodata.json): done
Solving environment: done

## Package Plan ##

  environment location: /home/jupyterlab/conda/envs/python

  added / updated specs:
    - geopy


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    geographiclib-1.50         |             py_0          34 KB  conda-forge
    geopy-2.0.0                |     pyh9f0ad1d_0          63 KB  conda-forge
    ------------------------------------------------------------
                                           Total:          97 KB

The following NEW packages will be INSTALLED:

  geographiclib      conda-forge/noarch::geographiclib-1.50-py_0
  geopy              conda-forge/noarch::geopy-2.0.0-pyh9f0ad1d_0



Downloading and Extracting Packages
geopy-2.0.0          | 63 KB     | ##################################### | 100% 
geographiclib-1.50   | 34 KB     | ################################

***

## Step 1:
#### From Data Source 1, we obtained below information: 

<p dir="ltr"><strong>The top 10 most funded zip codes:</strong></p>

<p dir="ltr"><strong>78701: </strong>$286,225,628</p>

<p dir="ltr"><strong>78746: </strong>$132,497,667</p>

<p dir="ltr"><strong>78759: </strong>$73,154,062</p>

<p dir="ltr"><strong>78759: </strong>$73,154,062</p>

<p dir="ltr"><strong>78731: </strong>$41,996,000</p>

<p dir="ltr"><strong>78617: </strong>$33,750,000</p>

<p dir="ltr"><strong>78704: </strong>$30,775,436</p>

<p dir="ltr"><strong>78726: </strong>$30,000,000</p>

<p dir="ltr"><strong>78745: </strong>$26,000,000</p>

<p dir="ltr"><strong>78703: </strong>$22,174,242</p>


#### So, we know that the top 3 tech funded zip codes in Austin are: 78701, 78746, 78759
***

# Step 2:
#### With Data Source 2, we looked up the neighborhoods in the top 3 tech funded zip codes, then we found:
#### **Neighborhood in zip code 78701 is: DOWNTOWN AUSTIN**
#### **Neighborhood in zip code 78746 are: WESTLAKE HILLS, ROLLINGWOO**
#### **Neighborhood in zip code 78759 are: ARBORETUM, BALCONES, GREAT HILLS**

#### Now, let's create a data frame "df_TechNeigh" for neighborhoods that are considered as tech centers.


In [2]:
# Initialize Data list
data= {'Zip Code':[78701,78746,78746,78759,78759,78759],'Neighborhood':['DOWNTOWN AUSTIN','WEST LAKE HILLS','ROLLINGWOOD','ARBORETUM','BALCONES','GREAT HILLS']}

# Create DataFrame
df_TechNeigh= pd.DataFrame(data)

print(df_TechNeigh)

   Zip Code     Neighborhood
0     78701  DOWNTOWN AUSTIN
1     78746  WEST LAKE HILLS
2     78746      ROLLINGWOOD
3     78759        ARBORETUM
4     78759         BALCONES
5     78759      GREAT HILLS


***

# Step 3:
#### Explore affordable housing in the top tech neighborhoods in "df_TechNeigh".
#### First, let's take a look at Data Source 3.

In [3]:
# Use Pandas.Read_csv to check the first 5 rows of the Affordable Housing source dataset, we will creat a dataframe and name it df_ah
df_ah=pd.read_csv('http://data.austintexas.gov/api/views/4syj-z4ky/rows.csv?accessType=DOWNLOAD')
df_ah.head()

Unnamed: 0,Comp Affordable Housing ID,Property Name,Address,City,State,Zip Code,Latitude,Longitude,Unit Type,Census Tract,...,Low Income Housing Tax Credit,Source ATC Guide,Source TDHCA,Source AHI,Is Duplicate,TDHCA Funded,NHCD Funded,HACA Funded,HATC Funded,Geocolumn
0,2928,Hilltop-UNDER CONSTRUCTION,2402 San Gabriel Street,Austin,TX,78705,30.288769,-97.748248,Multifamily,,...,,,,,0,,True,,0.0,POINT (-97.748248 30.288769)
1,2557,E6,2400 E 6th Street,Austin,TX,78702,30.2598,-97.716003,Multifamily,,...,,True,False,True,0,False,True,,,POINT (-97.71600342 30.25979996)
2,2908,Heights on Parmer Phase II,1500 E Parmer Lane,Austin,TX,78753,30.390823,-97.650793,Multifamily,,...,,,,,0,,True,,,POINT (-97.650793 30.390823)
3,2795,Heights on Parmer,1500 E Parmer Lane,Austin,TX,78753,30.3908,-97.650803,Multifamily,,...,4.0,True,True,True,0,False,True,,,POINT (-97.65080261 30.39080048)
4,2391,Solaris Aparments,1601 Royal Crest Drive,Austin,TX,78741,30.239201,-97.729103,Multifamily,,...,,True,False,False,0,,False,False,0.0,POINT (-97.72910309 30.23920059)


#### It looks like the Affordable Housing dataframe has many data features that we will not need. Lets take a look at the list of data features.

In [4]:
# Obtain a list of features in df_ah
list(df_ah.columns)

['Comp Affordable Housing ID',
 'Property Name',
 'Address',
 'City',
 'State',
 'Zip Code',
 'Latitude',
 'Longitude',
 'Unit Type',
 'Census Tract',
 'Owner',
 'Developer',
 'Council District',
 'Phone',
 'Email',
 'Property Manager Or Landlord',
 'Website',
 'Students Only',
 'Community Elderly',
 'Community Disabled',
 'Cmty Domestic Abuse Survivor',
 'Community Mental',
 'Community Veteran',
 'Community Military',
 'Only Serves Designated Cmtys',
 'Cmty Served Descriptions',
 'Broken Lease',
 'Broken Lease Criteria',
 'Eviction History',
 'Eviction History Criteria',
 'Criminal History',
 'Criminal History Criteria',
 'Has Waitlist',
 'Total Units',
 'Total Permanent Support Units',
 'Has Permanent Support Units',
 'Total Income Restricted Units',
 'Has Allocated IR Units',
 'Total Housing Choice Units',
 'Accepts Housing Choice',
 'Total Accessible IR Units',
 'Has Allocated Acc IR Units',
 'Num Units MFI 30',
 'Num Units MFI 40',
 'Num Units MFI 50',
 'Num Units MFI 60',
 'Num U

# Data Cleansing
#### Let's update the dataframe by selecting data features that align with standards in MLS/real estate listings . (i.e. unit type, numbers of bed rooms)
###### **Primary Key = "Zip Code"** (this is to align with Primary Key in "df_TechNeigh")
###### **Other features:** Property Name, Address, Latitude, Longitude, Unit Type, Student Only, Has Waitlist, Has 0/1/2/3 Bed Unti, Elementary School, Middle School, High School.
###### *Students Only: Property serves student renters only. 
###### *Has Waitlist: Does this property have a waitlist.

In [5]:
# Select Features and define dataframe
df_ah=df_ah[['Zip Code','Property Name','Address','Latitude','Longitude','Unit Type','Students Only','Has Waitlist','Has 0 Bed Unit','Has 1 Bed Unit','Has 2 Bed Unit','Has 3 Bed Unit','Elementary School','Middle School','High School']]
df_ah.head()

Unnamed: 0,Zip Code,Property Name,Address,Latitude,Longitude,Unit Type,Students Only,Has Waitlist,Has 0 Bed Unit,Has 1 Bed Unit,Has 2 Bed Unit,Has 3 Bed Unit,Elementary School,Middle School,High School
0,78705,Hilltop-UNDER CONSTRUCTION,2402 San Gabriel Street,30.288769,-97.748248,Multifamily,,,,,,,,,
1,78702,E6,2400 E 6th Street,30.2598,-97.716003,Multifamily,False,False,True,True,True,,Zavala Elementary,Martin Middle,Eastside Memorial HS at Johnston
2,78753,Heights on Parmer Phase II,1500 E Parmer Lane,30.390823,-97.650793,Multifamily,,,,,,,,Dessau Middle School,John B Connaly High School
3,78753,Heights on Parmer,1500 E Parmer Lane,30.3908,-97.650803,Multifamily,False,True,,True,True,True,Copperfield Elementary School,Dessau Middle School,John B Connally High School
4,78741,Solaris Aparments,1601 Royal Crest Drive,30.239201,-97.729103,Multifamily,False,False,True,True,True,True,Sanchez Elementary,Martin Middle,Travis High


In [6]:
df_ah.shape

(391, 15)

#### We selected features for the Affordable Housing dataframe. Now let's make the dataframe shorter to only show data of the top 3 tech funded zip codes. We call the dataframe "df_ah_tc"

In [7]:
# Use isin function to select zip codes.
zip=['78701','78746','78759']
df_ah_tc=df_ah[df_ah['Zip Code'].isin(zip)]
print('Total number of Affordable Housing Property in Austin Tech Center:', len(df_ah_tc))# check the total number of affordable housing properties in Austin tech centers

Total number of Affordable Housing Property in Austin Tech Center: 10


In [8]:
# Let's take a look at the dataframe
df_ah_tc.head(10)

Unnamed: 0,Zip Code,Property Name,Address,Latitude,Longitude,Unit Type,Students Only,Has Waitlist,Has 0 Bed Unit,Has 1 Bed Unit,Has 2 Bed Unit,Has 3 Bed Unit,Elementary School,Middle School,High School
11,78701,700 E 11th Street-UNDER CONSTRUCTION,700 E 11th Street,30.271082,-97.733918,Multifamily,,,,,,,Matthews Elementary School,O Henry Middle School,Austin High School
36,78759,Bent Tree Apartments,8405 Bent Tree Road,30.3724,-97.743202,Multifamily,False,False,,True,True,,Hill Elementary School,Murchison Middle School,Anderson High School
38,78759,Bridge at Terracina,8100 N Mopac Expressway,30.365905,-97.743785,multi-family,False,False,,,,,Hill Elementary,Murchison Middle,Anderson High
114,78701,Lakeside Apartments,85 Trinity Street,30.261299,-97.740799,Multifamily,False,True,,True,True,,Mathews Elementary,O. Henry Middle,Austin High
136,78701,Capital Studios,309 E 11th Street,30.271601,-97.738197,Individual,False,True,True,False,False,False,Mathews Elementary,O. Henry Middle,Austin High
157,78701,AMLI on 2nd,421 W 3rd Street,30.266199,-97.7481,Multifamily,False,,,True,True,True,Mathews Elementary School,O. Henry Middle School,Austin High School
232,78701,44 East-UNDER CONSTRUCTION,44 East Avenue,30.255904,-97.739027,Multifamily,,,,,,,,,
301,78701,91 Red River-UNDER CONSTRUCTION,91 Red River St,30.260518,-97.739027,Multifamily,,,,,,,Matthews Elementary School,O Henry Middle Schol,Austin High School
347,78701,North Shore Apartments,110 San Antonio Street,30.265301,-97.749496,Multifamily,False,True,,True,True,,Mathews Elementary School,O. Henry Middle School,Austin High School
361,78759,Summit Oaks,11607 Sierra Nevada,30.4203,-97.754601,Multifamily,False,True,,True,True,,Davis Elementary,Murchison Middle,Anderson High


#### We see 3 properties have no data entries (NaN) in "Students Only", "Has Waitlist", and Numbers of Bed Unit data features. However, the property name clarrified the 3 properties are under construction. 
###### Before we decide what to do with the under construction properties, let's do a quick search on them since we only have 3 properties. 
###### **700 E 11th Street:** Apartment Building scheduled for completion in 2021. It has a total of 276 units. (source link: https://www.buzzbuzzhome.com/us/alexan-capitol)
###### **44 East Ave:** Apartment Building scheduled for completion in 2022. It has a total of 322 units (source link: https://www.buzzbuzzhome.com/us/44-east-ave)
###### **91 Red River Street:** Apartment Building scheduled for completion in 2021 (source link: https://atxrealestatenews.com/2019/01/30/rainey-district-has-its-head-in-the-clouds-with-another-100-million-project/)

#### **Our research show the 3 properties are scheduled to be completed in near future, instead of drop them, we will include them and update/correct the missing data.**
###### **Students Only:** update NaN to False as majority of the entries have "False"
###### **Has Waitlist:** update NaN to False as the buildings are under construction now
###### **Has 0/1/2/3 Bed Unit:** update NaN to True as the 3 new construction properties have all kinds of available bed room units
###### Note: AMLI on 2nd website does not indicate any waitlist (https://www.amli.com/apartments/austin/2nd-street-district-apartments/amli-on-2nd/floorplans)

In [9]:
# Correct Data based on above data decision
df_ah_tc['Students Only']=df_ah_tc['Students Only'].replace(np.nan,'False')
df_ah_tc['Has Waitlist']=df_ah_tc['Has Waitlist'].replace(np.nan,'False')
df_ah_tc.loc[df_ah_tc['Property Name'].str.contains('UNDER CONSTRUCTION'),'Has 0 Bed Unit'] = 'True'
df_ah_tc.loc[df_ah_tc['Property Name'].str.contains('UNDER CONSTRUCTION'),'Has 1 Bed Unit']= 'True'
df_ah_tc.loc[df_ah_tc['Property Name'].str.contains('UNDER CONSTRUCTION'),'Has 2 Bed Unit'] = 'True'
df_ah_tc.loc[df_ah_tc['Property Name'].str.contains('UNDER CONSTRUCTION'),'Has 3 Bed Unit'] = 'True'
df_ah_tc.head(10)

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  This is separate from the ipykernel package so we can avoid doing imports until
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  self.obj[item] = s


Unnamed: 0,Zip Code,Property Name,Address,Latitude,Longitude,Unit Type,Students Only,Has Waitlist,Has 0 Bed Unit,Has 1 Bed Unit,Has 2 Bed Unit,Has 3 Bed Unit,Elementary School,Middle School,High School
11,78701,700 E 11th Street-UNDER CONSTRUCTION,700 E 11th Street,30.271082,-97.733918,Multifamily,False,False,True,True,True,True,Matthews Elementary School,O Henry Middle School,Austin High School
36,78759,Bent Tree Apartments,8405 Bent Tree Road,30.3724,-97.743202,Multifamily,False,False,,True,True,,Hill Elementary School,Murchison Middle School,Anderson High School
38,78759,Bridge at Terracina,8100 N Mopac Expressway,30.365905,-97.743785,multi-family,False,False,,,,,Hill Elementary,Murchison Middle,Anderson High
114,78701,Lakeside Apartments,85 Trinity Street,30.261299,-97.740799,Multifamily,False,True,,True,True,,Mathews Elementary,O. Henry Middle,Austin High
136,78701,Capital Studios,309 E 11th Street,30.271601,-97.738197,Individual,False,True,True,False,False,False,Mathews Elementary,O. Henry Middle,Austin High
157,78701,AMLI on 2nd,421 W 3rd Street,30.266199,-97.7481,Multifamily,False,False,,True,True,True,Mathews Elementary School,O. Henry Middle School,Austin High School
232,78701,44 East-UNDER CONSTRUCTION,44 East Avenue,30.255904,-97.739027,Multifamily,False,False,True,True,True,True,,,
301,78701,91 Red River-UNDER CONSTRUCTION,91 Red River St,30.260518,-97.739027,Multifamily,False,False,True,True,True,True,Matthews Elementary School,O Henry Middle Schol,Austin High School
347,78701,North Shore Apartments,110 San Antonio Street,30.265301,-97.749496,Multifamily,False,True,,True,True,,Mathews Elementary School,O. Henry Middle School,Austin High School
361,78759,Summit Oaks,11607 Sierra Nevada,30.4203,-97.754601,Multifamily,False,True,,True,True,,Davis Elementary,Murchison Middle,Anderson High


In [10]:
# Check data types in the dataframe
df_ah_tc.dtypes

Zip Code              object
Property Name         object
Address               object
Latitude             float64
Longitude            float64
Unit Type             object
Students Only         object
Has Waitlist          object
Has 0 Bed Unit        object
Has 1 Bed Unit        object
Has 2 Bed Unit        object
Has 3 Bed Unit        object
Elementary School     object
Middle School         object
High School           object
dtype: object

##### The data cleaning is completed. Now let's add the Latitude, Longitude, and count of Affordable Housing properties to "df_TechNeigh".

In [11]:
# Find out number of Affordable Housing property in each zip code
df_ah_tc.groupby(['Zip Code']).size()

Zip Code
78701    7
78759    3
dtype: int64

In [12]:
# Add number of Affordable Housing property, "Property Number", to the "df_TechNeigh" dataframe
pn=[7,0,0,3,3,3] # number of affordable housing properties associated with each zip code
df_TechNeigh['Property Number']=pn

In [13]:
# Add columns (Latitude, Longitude) to df_TechNeigh by using data source 4-Latitude and Longitude Finder
lat=[30.2703,30.3070, 30.2683, 30.3915, 30.4415, 30.4088]
long=[-97.7434, -97.7858, -97.7772, -97.7521, -97.7985, -97.7716]
df_TechNeigh['Latitude']=lat
df_TechNeigh['Longitude']=long

In [14]:
print (df_TechNeigh)

   Zip Code     Neighborhood  Property Number  Latitude  Longitude
0     78701  DOWNTOWN AUSTIN                7   30.2703   -97.7434
1     78746  WEST LAKE HILLS                0   30.3070   -97.7858
2     78746      ROLLINGWOOD                0   30.2683   -97.7772
3     78759        ARBORETUM                3   30.3915   -97.7521
4     78759         BALCONES                3   30.4415   -97.7985
5     78759      GREAT HILLS                3   30.4088   -97.7716


#### Use geopy library to get latitude and longitude of Austin City, TX
###### We will define a user_agent for geocoder

In [15]:
address = 'Austin, TX'

geolocator = Nominatim(user_agent='Aus_explorer')
location=geolocator.geocode(address)
latitude1=location.latitude
longitude1=location.longitude
print('The geographical coordinate of Austin are{},{}.'.format(latitude1, longitude1))

The geographical coordinate of Austin are30.2711286,-97.7436995.


#### Create map of Austin, TX

In [16]:
# create map of Austin using latitude and longitude values
map_austin= folium.Map(location=[latitude1, longitude1],zoom_start=12)

map_austin

## Step 4:

#### Based on data in the "df_TechNeigh" dataframe, we can easily see that, out of the 5 high-tech neighborhoods, the most funded high-tech neighborhood in Austin is Downtown Austin (zip code 78701). Downtown Austin also has the highest number (7) of affordable housing properties.

#### Now, let's explore the venues around Downtown Austin by using Foursquare API!

#### Let's Forusquare Credentials and Version

In [17]:
# Define Foursquare  credentials
CLIENT_ID = 'JIVUD0KYY0XAUUWMOLZVVZU10EOS2IFCNQBFSQ1AIOJOIXJ4' # Foursquare ID
CLIENT_SECRET = 'GV5F13XPZW0YS1D04GEQH00E1PM3DVZQK2RLBOKJJV3TL3CT' # your Foursquare Secret
VERSION = '20180604'
LIMIT = 30

In [18]:
# Get Downtown's latitude and longitude values
neigh_lat = df_TechNeigh.loc[0,'Latitude'] # Latitude value
neigh_lng = df_TechNeigh.loc[0,'Longitude'] # Longitude value

neigh_name=df_TechNeigh.loc[0,'Neighborhood'] # Neighborhood Name

print('{} Latitude and Longitude are: {},{}.'.format(neigh_name,neigh_lat,neigh_lng))

DOWNTOWN AUSTIN Latitude and Longitude are: 30.2703,-97.7434.


#### Now, let's get the top 100 venues that are in Downtown Austin within a radius of 500 meters.

##### 1). We will create a GET request URL as "url"

In [19]:
LIMIT = 100
radius = 500
url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    neigh_lat, 
    neigh_lng, 
    radius, 
    LIMIT)

##### 2). Send the GET request

In [20]:
results = requests.get(url).json()
results

{'meta': {'code': 200, 'requestId': '5f0e5d2cba4b825abded2e56'},
 'response': {'suggestedFilters': {'header': 'Tap to show:',
   'filters': [{'name': 'Open now', 'key': 'openNow'},
    {'name': '$-$$$$', 'key': 'price'}]},
  'headerLocation': 'Downtown Austin',
  'headerFullLocation': 'Downtown Austin, Austin',
  'headerLocationGranularity': 'neighborhood',
  'totalResults': 61,
  'suggestedBounds': {'ne': {'lat': 30.274800004500005,
    'lng': -97.73819932117796},
   'sw': {'lat': 30.265799995499993, 'lng': -97.74860067882203}},
  'groups': [{'type': 'Recommended Places',
    'name': 'recommended',
    'items': [{'reasons': {'count': 0,
       'items': [{'summary': 'This spot is popular',
         'type': 'general',
         'reasonName': 'globalInteractionReason'}]},
      'venue': {'id': '44b75631f964a5206d351fe3',
       'name': 'Paramount Theatre',
       'location': {'address': '713 Congress Ave',
        'crossStreet': '7th St.',
        'lat': 30.269456913114563,
        'lng':

##### 3). Information we need is in the *item* key. We will define a get_category_type function 

In [21]:
def get_category_type(row):
    try:
        categories_list = row['categories']
    except:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

##### 4). clean the json file and structure it into pandas dataframe

In [22]:
venues = results['response']['groups'][0]['items']
    
nearby_venues = json_normalize(venues) # flatten JSON

# filter columns
filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
nearby_venues =nearby_venues.loc[:, filtered_columns]

# filter the category for each row
nearby_venues['venue.categories'] = nearby_venues.apply(get_category_type, axis=1)

# clean columns
nearby_venues.columns = [col.split(".")[-1] for col in nearby_venues.columns]

nearby_venues.head()

  This is separate from the ipykernel package so we can avoid doing imports until


Unnamed: 0,name,categories,lat,lng
0,Paramount Theatre,Movie Theater,30.269457,-97.742077
1,Upstairs at Caroline,Hotel,30.26881,-97.74231
2,The Hideout Theatre,Theater,30.268627,-97.742521
3,Perry's Steakhouse,Steakhouse,30.269374,-97.743676
4,The Townsend,Lounge,30.269611,-97.742448


##### 5). What's the total number of venues returned by Foursquare?

In [23]:
# Check numbers of venues returned by Foursquare
print ('Total number of venues in Downtown Austin within 500 meters radius:{}'.format(nearby_venues.shape[0]))

Total number of venues in Downtown Austin within 500 meters radius:61


##### 6). How many unique categories can be curated from returned venues?

In [24]:
# Check how many unique categories can be curated from returned venues
print('There are {} uniques categories.'.format(len(nearby_venues['categories'].unique())))

There are 39 uniques categories.


##### 7). Analyze Downtwon Austin Neighborhood

In [25]:
# one hot encoding
downtown_onehot = pd.get_dummies(nearby_venues[['categories']], prefix='', prefix_sep='')
downtown_onehot.head()

Unnamed: 0,American Restaurant,Arts & Crafts Store,Bar,Burger Joint,Café,Cajun / Creole Restaurant,Capitol Building,Chinese Restaurant,Cocktail Bar,Coffee Shop,...,New American Restaurant,Park,Pizza Place,Seafood Restaurant,Shipping Store,Speakeasy,Steakhouse,Sushi Restaurant,Thai Restaurant,Theater
0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,1
3,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,1,0,0,0
4,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [26]:
downtown_onehot.shape # check the size of the dataframe, the numbers should match numbers in 5) & 6)

(61, 39)

##### 8). Print top 10 most common venue categories

In [27]:
# Take the mean of the frequency of occurence of each category
CommonVenues = downtown_onehot.mean().reset_index()
CommonVenues = CommonVenues.sort_values(by=[0],ascending=False)
CommonVenues.head(10)

Unnamed: 0,index,0
9,Coffee Shop,0.098361
2,Bar,0.065574
8,Cocktail Bar,0.065574
17,Hotel,0.065574
0,American Restaurant,0.032787
20,Juice Bar,0.032787
36,Sushi Restaurant,0.032787
34,Speakeasy,0.032787
23,Lounge,0.032787
27,Movie Theater,0.032787


In [28]:
# Let's check all venues categories with decending frequency.
CommonVenues.head(39)

Unnamed: 0,index,0
9,Coffee Shop,0.098361
2,Bar,0.065574
8,Cocktail Bar,0.065574
17,Hotel,0.065574
0,American Restaurant,0.032787
20,Juice Bar,0.032787
36,Sushi Restaurant,0.032787
34,Speakeasy,0.032787
23,Lounge,0.032787
27,Movie Theater,0.032787


***

# **Results**

#### **I. The top 3 most tech funded zip codes in Austin and the top 6 most tech funded neighborhoods are:**
###### *Zip code 78701 was funded $286.225.628. Neighborhood is: DOWNTOWN AUSTIN

###### *Zip code 78746 was funded $132,497,667. Neighborhoods are: WESTLAKE HILLS, ROLLINGWOO

###### *Zip code 78759 was funded $73,154,062. Neighborhoods are: ARBORETUM, BALCONES, GREAT HILLS

#### **II. There are a total of 391 Affordable Housing properties in Austin, 10 out of the 391 properties are located in the top 3 tech funded zip codes or the top 6 tech funded neighborhoods. 7 out of the 10 properties are located in the top 1 tech funded zip code/neighborhood, which is Down Town Austin.**

###### •	In Down Town Austin (Zip Code: 78701), 3 affordable housing apartment buildings are currently under construction, scheduled to be completed in the near future between 2021 and 2022.
###### •	All 10 properties in the 6 high-tech neighborhoods are not “Student Only” properties. 

#### **III. The top 10 most common venue categories in Downtown Austin within 500 meters radius are: Coffee Shop, Bar, Cocktail Bar, Hotel, American Restaurant, Juce Bar, Sushi Restaurant, Speakeasy, Lounge, Movie Theater.**
###### •	Total number of venues in Downtown Austin within 500 meters radius is 61.
###### •	Total number of venue categories in Downtown Austin within 500 meters radius is 39.

# **Discusion**
##### Although we discovered that the Downtown Austin neighborhood has more than double of the total number of affordable housing properties in other high-tech neighborhoods (i.e. West Lake Hills, Rollingwood, Arboretum, Balcones, Great Hills), we do not have the data to compare the monthly cost of the same type of unit in each affordable housing property. A neighborhood's average monthly renting cost would be an important factor that people consider when making a moving decision.

# **Conclusion**
##### In Austin, TX, we discovered that the Downtown Austin neighborhood would be the tech center of the tech hub. We also discovered that there are many affordable housing properties in Down Town, Austin, including 3 apartment buildings currently under construction and scheduled to be completed in 2021-2022. 

##### Also, there are a variety of venues for you to explore in Downtown, Austin. Venue categories include different restaurant, coffee Shops, Bar, Movie Theater. There are also parks, grocery stores, Gym / Fitness Center, and theaters.  The top 3 frequent venues are Coffee Shop, Bar, and Hotel.