Andrew Yu | Final project for DS-SF-38 Data Science course (Fall 2017)
REWRITE THIS: See https://www.kaggle.com/aychando/predicting-employee-kernelover/editnb as an example
Overview: Experts across industries agree that the landscape and makeup of the modern city will change drastically, in large part due to the proliferation of internet-connected and smart devices. Taken together, these changes will shape our cities into "smart", "responsive" cities. Given existing data on cities and livability, the aim of this project is to predict where the next smart, responsive city will be: where is the ripest opportunity across the world?
Hypothesis: Existing city information (transportation, freight and trade, power, communications, waste management) will allow us to predict whether a city will become a smart, responsive city in the next 5 to 20 years and provide a high quality of living for its inhabitants.
Methods and models: This will be a regression problem: I will predict a continuous score for each city and use that score to rank cities.
Potential impact: Hopefully this is a first step toward a model that can predict where we, as a global community, should focus our efforts to improve the overall welfare of humankind.
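As a rough illustration of the "predict a score, then rank" idea, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the two features (a transit index and a broadband index), the toy training scores, the city names, and the choice of ordinary least squares are all hypothetical, since the actual predictors and model are still TBD.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Build an augmented matrix so row operations apply to both sides.
    m = [row[:] + [b_i] for row, b_i in zip(a, b)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back-substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_linear_regression(features, targets):
    """Ordinary least squares via the normal equations (X^T X) w = X^T y."""
    x = [[1.0] + list(row) for row in features]  # leading 1.0 is the intercept
    k = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(x, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(weights, row):
    """Score one city: intercept plus weighted sum of its features."""
    return weights[0] + sum(w * v for w, v in zip(weights[1:], row))


def rank_cities(weights, cities):
    """Return (name, score) pairs sorted from highest to lowest score."""
    scored = [(name, predict(weights, row)) for name, row in cities.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical training data: (transit_index, broadband_index) -> score.
# The scores are made up and follow score = 1 + 2*transit + 3*broadband.
train_features = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
train_scores = [1.0, 3.0, 4.0, 6.0, 8.0]

weights = fit_linear_regression(train_features, train_scores)

# Hypothetical cities to rank with the fitted model.
candidate_cities = {"City A": (1, 2), "City B": (3, 0), "City C": (0, 1)}
ranking = rank_cities(weights, candidate_cities)

for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

In the real project the hand-rolled solver would likely be replaced by a library regressor, and the feature columns would come from the datasets listed below; the sketch only shows how a continuous prediction turns into a ranking.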
Predictors/covariates: TBD
Available Data/Datasets: I will start with "featured" datasets on Kaggle (which are cleaner than raw data), then look for more robust, in-depth raw data:
- https://www.kaggle.com/okfn/world-cities/data
- https://www.kaggle.com/okfn/world-cities
- https://www.kaggle.com/blitzr/movehub-city-rankings
- Others TBD
Assumptions: I am working under the assumption that a 'smart, responsive' city has a definition that can be measured and predicted.
Risks: With sparse features, there is a danger of overgeneralizing and underfitting the model.
- It will be important to devote significant effort to feature engineering
- Cost of the model being wrong? TBD
- Benefit of the model being right? TBD
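One hedged way to watch for the underfitting risk noted above is to compare the model's R² on the training data against a mean-only baseline, which scores exactly 0. A model that barely beats 0 even on the data it was trained on is likely underfit. The numbers below are made up for illustration, and `r_squared` is a hypothetical helper, not part of any chosen library.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all targets are identical")
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    return 1 - ss_res / ss_tot


# Hypothetical city scores and two sets of predictions.
y_true = [1.0, 2.0, 3.0, 4.0]
mean_only = [2.5, 2.5, 2.5, 2.5]      # always predicts the average score
closer_fit = [1.1, 1.9, 3.2, 3.8]     # tracks the true scores more closely

baseline_r2 = r_squared(y_true, mean_only)
closer_r2 = r_squared(y_true, closer_fit)

print(f"mean-only baseline R^2: {baseline_r2:.2f}")
print(f"closer fit R^2: {closer_r2:.2f}")
```

Running the same comparison on held-out cities would separate underfitting (low R² everywhere) from overfitting (high R² on training data, low on held-out data).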
Caveats: TBD
Outcomes: TBD
I will largely draw on my domain knowledge from the following ETH Zurich (ETHx) courses I took on edX:
- ETHx: FC-01x Future Cities
- ETHx: FC-02x Livable Future Cities
- ETHx: FC-03x Smart Cities
- ETHx: FC-04x Responsive Cities
Existing research efforts:
- There is a fair amount of existing research on this, which I will list here: TBD
TBD, get an accurate prediction of smart cities
- Expectations – expected output
- Complexity
- Success criteria for the project
- Next steps if project is "a bust"