<a href="https://colab.research.google.com/github/SzuYing322/DSND_Term1/blob/master/Copy_of_city_search_starter.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# City Search Tool

There are a lot of factors that go into making a big move, and for many people, the top priority is either their job or their family. But if you’re on your own and you have job flexibility to go basically wherever you want (i.e. you work remotely), then what? In that case, you have the luxury of finding a place that suits you—and not necessarily just your career.

Many decisions go into picking the perfect place to call home: political leanings, crime rates, walkability, affordability, religious affiliations, weather, and more. Can you make a tool that helps Aggie graduates and others find their next move?

[HighSpeedInternet.com](https://www.highspeedinternet.com/best-cities-to-live-work-remotely) (of all people?!) made a tool to do this, but you can do better! Think of more factors, like the median income of a location, cuisine, primary ethnicity, pollution index, happiness index, or the number of coffee shops or microbreweries in the city. There's no end! Maybe you are an international student and want to build this tool for global placement. Go for it! Maybe you want to penalize distance from POIs (points of interest) like family. Do it! The world is your oyster!
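
One way to approach the distance penalty is to compute the great-circle distance from each city to a point of interest and subtract a scaled amount from that city's score. The sketch below uses the haversine formula. The function names, the `penalty_per_1000_km` parameter, the linear penalty rule, and the choice of Saint Louis as the point of interest are all assumptions made for illustration; only the coordinates come from the starter data.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance in km between two (lat, lng) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    # Clamp to 1.0 to guard against floating-point overshoot near antipodal points.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def penalized_score(score, city, poi, penalty_per_1000_km=0.5):
    """Subtract a linear distance penalty (a hypothetical rule) and clip at zero."""
    distance = haversine_km(city[0], city[1], poi[0], poi[1])
    return max(0.0, score - penalty_per_1000_km * distance / 1000)


# Coordinates taken from the lat/lng columns of the starter data.
saint_louis = (38.627003, -90.199404)
mexico_city = (19.432608, -99.133208)

# Hypothetical scenario: family lives in Saint Louis, candidate city is Mexico City.
print(round(haversine_km(*saint_louis, *mexico_city)), "km apart")
print(penalized_score(8.0, mexico_city, saint_louis))
```

A linear penalty is the simplest choice; a capped or stepwise penalty (for example, "within a two-hour flight") may match how people actually think about distance.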

#### Starter Datasets
- [MoveHub City Ratings](https://www.kaggle.com/blitzr/movehub-city-rankings?select=movehubqualityoflife.csv)
  - [Notebooks for ideas on how to use data](https://www.kaggle.com/blitzr/movehub-city-rankings/notebooks)
- [World City Populations](https://www.kaggle.com/max-mind/world-cities-database?select=worldcitiespop.csv)
- [Rental Price](https://www.kaggle.com/zillow/rent-index)

#### Where to Find More Data
- [Google Datasets](https://datasetsearch.research.google.com/)
- [US Census](https://data.census.gov/cedsci/?q=United%20States)
- [Kaggle Datasets](https://www.kaggle.com/datasets)


#### How We Judge
- *Data Use*: Used the provided data effectively and acquired additional data
- *Analytics*: Effective application of analytics (bonus points for ML/clustering techniques)
- *Visualization*: Solution is visually appealing and useful (bonus points if you create an interactive tool, application, or website)
- *Impact*: Clear impact of the solution on the problem
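
For the clustering bonus, one possible starting point is to group cities with similar profiles. The sketch below is a plain k-means (Lloyd's algorithm) written with NumPy on the first five rows of the starter data. It is not a method prescribed by the judges; `kmeans` is a hypothetical helper, the choice of three columns and `k=2` is arbitrary, and in practice a library such as scikit-learn would usually replace the hand-written loop.

```python
import numpy as np


def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's algorithm. Returns (labels, centroids)."""
    rng = np.random.default_rng(seed)
    # Start from k distinct data points chosen at random.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Distance from every point to every centroid, shape (n_points, k).
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each centroid; keep the old one if its cluster is empty.
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids


# Purchase Power, Health Care, Quality of Life for the first five starter rows.
cities = ["Caracas", "Johannesburg", "Fortaleza", "Saint Louis", "Mexico City"]
raw = np.array([
    [11.25, 44.44, 8.61],
    [53.99, 59.98, 51.26],
    [52.28, 45.46, 36.68],
    [80.40, 77.29, 87.51],
    [24.28, 61.76, 27.91],
])

# Standardize each column so no single feature dominates the distances.
X = (raw - raw.mean(axis=0)) / raw.std(axis=0)

labels, centroids = kmeans(X, k=2)
for city, label in zip(cities, labels):
    print(f"{city}: cluster {label}")
```

Cluster numbers are arbitrary and depend on the random start, so compare which cities share a label rather than the label values themselves.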

#### Helpful Workshops
- Intro to Python: Sat, 10:30-12:00
- Statistics for Data Scientists: Sat, 10:30-12:00
- How to Win TAMU Datathon: Sat, 13:00-14:00
- Data Wrangling: Sat, 17:00-18:15
- Data Visualization: Sat, 18:30-19:45
- Machine Learning Part 1 - Theory: Sat, 20:00-21:15
- Machine Learning Part 2 - Applied: Sat, 21:30-22:45


In [None]:
import pandas as pd
df = pd.read_csv('https://drive.google.com/uc?id=1hSMhl-JeTCX-t72KjhasTQoL1LdWSRhw')

In [None]:
df.head()

Unnamed: 0,City,Movehub Rating,Purchase Power,Health Care,Pollution,Quality of Life,Crime Rating,lat,lng
0,Caracas,65.18,11.25,44.44,83.45,8.61,85.7,10.480594,-66.903606
1,Johannesburg,84.08,53.99,59.98,47.39,51.26,83.93,-26.204103,28.047305
2,Fortaleza,80.17,52.28,45.46,66.32,36.68,78.65,-3.732714,-38.526998
3,Saint Louis,85.25,80.4,77.29,31.33,87.51,78.13,38.627003,-90.199404
4,Mexico City,75.07,24.28,61.76,18.95,27.91,77.86,19.432608,-99.133208


In [None]:
import plotly.express as px
# df = px.data.gapminder()
# fig = px.choropleth(df, locations="iso_alpha", color="lifeExp", hover_name="country", animation_frame="year", range_color=[20,80])
# fig.show()

In [None]:
#@title Rate importance of each of the following factors


movehub_rating = "None" #@param ["None", "Low", "Med", "High"]
purchase_power = "High" #@param ["None", "Low", "Med", "High"]
health_care = "Low" #@param ["None", "Low", "Med", "High"]
quality_of_life = "None" #@param ["None", "Low", "Med", "High"]
pollution = "None" #@param ["None", "Low", "Med", "High"]
crime_rating = "None" #@param ["None", "Low", "Med", "High"]

import numpy as np

weights = [
  movehub_rating,
  purchase_power,
  health_care,
  quality_of_life,
  pollution,
  crime_rating,
]
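# Map each importance label to a numeric weight.
replace = {'None': 0, 'Low': 1, 'Med': 2, 'High': 3}
weights = np.array([replace[x] for x in weights])
# Pollution and crime count against a city, so their weights are negated.
weights *= [1, 1, 1, 1, -1, -1]

# Column order must match the order of `weights` above.
features = ['Movehub Rating', 'Purchase Power', 'Health Care', 'Quality of Life', 'Pollution', 'Crime Rating']

def norm(xs):
    """Min-max scale to [0, 1]; return zeros if all values are equal (e.g. every weight is "None")."""
    span = xs.max() - xs.min()
    return (xs - xs.min()) / span if span else xs * 0.0

df['Score'] = norm(df[features].dot(weights)) * 10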

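# Round only the displayed numbers; rounding the whole frame would also round
# lat/lng to whole degrees and misplace the markers on the map.
rounded = df.sort_values('Score', ascending=False).round(dict.fromkeys(['Score'] + features, 0))
fig = px.scatter_mapbox(rounded,
                        lat="lat", lon="lng", color="Score", hover_name="City",
                        hover_data=features,
                        color_continuous_scale=px.colors.cyclical.IceFire, size_max=15, zoom=1,
                        mapbox_style="carto-positron")
fig.show()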

df.sort_values('Score', ascending=False)[['City', 'Score'] + features].round()

Unnamed: 0,City,Score,Movehub Rating,Purchase Power,Health Care,Quality of Life,Pollution,Crime Rating
29,Glasgow,10.0,84.0,85.0,91.0,80.0,0.0,60.0
188,Ottawa,10.0,88.0,92.0,66.0,86.0,34.0,22.0
129,Lausanne,10.0,87.0,91.0,66.0,73.0,88.0,36.0
156,Newark,10.0,85.0,84.0,80.0,73.0,62.0,30.0
209,Dresden,9.0,85.0,83.0,78.0,90.0,17.0,15.0
...,...,...,...,...,...,...,...,...
174,Addis Ababa,1.0,60.0,6.0,64.0,28.0,86.0,26.0
0,Caracas,1.0,65.0,11.0,44.0,9.0,83.0,86.0
66,Quito,1.0,67.0,14.0,32.0,46.0,15.0,48.0
124,Baku,0.0,66.0,11.0,29.0,17.0,49.0,37.0
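
The cell above weights the raw columns directly, so a feature with a wider spread of values can dominate the sum regardless of the importance chosen for it. One possible variation, shown here as a sketch and not as the notebook's own method, scales each feature to 0-1 before applying the weights and returns zeros when every weight is zero. `score_cities` and its `weights` dictionary are hypothetical names; the sample rows are the first five rows of the starter data.

```python
import pandas as pd

# First five rows of the starter data (three of the feature columns).
cities = pd.DataFrame({
    "City": ["Caracas", "Johannesburg", "Fortaleza", "Saint Louis", "Mexico City"],
    "Purchase Power": [11.25, 53.99, 52.28, 80.4, 24.28],
    "Health Care": [44.44, 59.98, 45.46, 77.29, 61.76],
    "Crime Rating": [85.7, 83.93, 78.65, 78.13, 77.86],
})


def score_cities(frame, weights):
    """Weighted sum of min-max scaled features, rescaled to 0-10.

    `weights` maps column name -> signed weight (negative: lower is better).
    """
    cols = list(weights)
    spans = frame[cols].max() - frame[cols].min()
    # Replace zero spans with 1 so constant columns scale to 0 instead of NaN.
    scaled = (frame[cols] - frame[cols].min()) / spans.replace(0, 1)
    raw = scaled.mul(pd.Series(weights)).sum(axis=1)
    span = raw.max() - raw.min()
    if span == 0:
        # All cities tie (for example, every weight is 0).
        return pd.Series(0.0, index=frame.index)
    return (raw - raw.min()) / span * 10


# High purchase power, low health care, low (negative) crime importance.
cities["Score"] = score_cities(
    cities, {"Purchase Power": 3, "Health Care": 1, "Crime Rating": -1}
)
print(cities.sort_values("Score", ascending=False)[["City", "Score"]].round(1))
```

Whether to normalize per feature is a design choice: it makes the weights comparable across columns, but it also stretches features whose raw values barely differ between cities.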


For more map types, see [Maps with Plotly Express](https://plotly.com/python/plotly-express/#maps).

# Host It as a Web App?
Show us all you've got by building a dashboard web app in Python with
[Streamlit](https://www.streamlit.io/)!