<div>
    <img style="float:right;" src="images/smi-logo.png"/>
    <div style="float:left;color:#58288C;"><h1>Introduction to Python for Data Science</h1></div>
</div>

---
# Notebook 4: INSIDER Task
This notebook introduces the instructions for the INSIDER task you have to complete.

---

### Your task as a group of 2-3 people:
- Briefly check out https://teleport.org to understand what data this assignment is about
- Pick 3-5 big cities that you're interested in benchmarking
- Use the [Teleport API](https://developers.teleport.org/api/) to compile a single table that lists all life quality scores for the respective urban areas. The resulting table should look like this:
   Name | City1 | City2 | City3 | ...
   :-----------|:--:|:--:|:--:|:--:
   Housing | 7.2 | 6.9 | ... | ...
   Cost of Living | 5.7 | 9.1 | ... | ...
   ... | ... | ... | ... | ...
- Compute the total score of the urban areas by averaging all individual scores of each area
- Round all scores to one decimal place and add the mean scores to your result dataframe
- Generate a horizontal bar chart to visualize your findings
- Make sure all key steps of your code have concise English comments
- Submit your commented code + the mean scores of your cities as solution to the INSIDER task

> *Remarks regarding scoring of this task*: The task is considered basically completed if your code produces a table of correct scores for each urban area. It is done well if your code generates a single table with all scores, displays the required bar chart, and computes the means successfully. It is done perfectly if you also round all values and append the mean values to the table, so that the mean scores appear in your bar chart.
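
The averaging and rounding steps could look roughly like the sketch below. It uses a tiny made-up table instead of real Teleport data; the values, the `Category3` row, and the row label `Mean` are assumptions chosen only for illustration.

```python
import pandas as pd

# Made-up example scores (NOT real Teleport data), one column per city
df = pd.DataFrame(
    {"City1": [7.234, 5.678, 4.0], "City2": [6.912, 9.1, 8.0]},
    index=["Housing", "Cost of Living", "Category3"],
)

# Average all individual scores of each city (column-wise mean)
means = df.mean()

# Append the means as an extra row, then round the whole table to one decimal
df.loc["Mean"] = means
df = df.round(1)

print(df)
```

Because the means end up as a regular row of the table, a chart drawn from the dataframe (for example with `df.plot.barh()`, which requires matplotlib) would include them alongside the individual scores.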

### Hints

#### General

- Develop your code in multiple cells so you can check the output of every step
- For the final solution you will need a loop to fetch data for all the cities, but don't start this way. First get the task working for a single city, then carefully introduce more complexity to your code.
- You have seen all required operations in the notebooks you worked through for this task; feel free to use them as a copy/paste reference. Where unsure, use Google to find more examples of a certain operation.
- Getting the quality scores from teleport API requires going through these steps: 
  * look up the city, get the link to the city data
  * use the city data link to get the link to the urban area of the city
  * use the urban area data link to get the link to the scores
  * retrieve the scores
- You can save your work by copy/pasting your code into an empty document on your computer or by downloading this notebook (File -> Download). After re-opening the Jupyter environment, you can re-upload the notebook by dragging and dropping the downloaded file onto the file explorer on the left-hand side.
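
The "follow one link after another" pattern from the steps above can be sketched without any network access. In the sketch below, `follow_links` and `FAKE_API` are hypothetical names, and the link keys `city:urban_area` and `ua:scores` as well as the `categories` field are assumptions about the response layout; check the real responses in your browser before relying on them.

```python
def follow_links(start_url, link_keys, fetch):
    """Fetch start_url, then follow one entry of "_links" per key."""
    data = fetch(start_url)
    for key in link_keys:
        # Each response is assumed to carry the next URL under _links -> key -> href
        data = fetch(data["_links"][key]["href"])
    return data


# Stand-in for the API: maps a URL to the (heavily shortened, made-up) JSON it returns
FAKE_API = {
    "city-url": {"_links": {"city:urban_area": {"href": "ua-url"}}},
    "ua-url": {"_links": {"ua:scores": {"href": "scores-url"}}},
    "scores-url": {"categories": [{"name": "Housing", "score_out_of_10": 7.2}]},
}

# Walk from the city data to the scores using the fake lookup
scores = follow_links("city-url", ["city:urban_area", "ua:scores"], FAKE_API.get)
print(scores["categories"])
```

In the notebook, the `fetch` argument could be something like `lambda url: requests.get(url).json()`, so the same function works against the live API once a single city works end to end.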

#### Working with API results

- Before writing the code, you might want to walk through the sequence of queries in a browser. Start with the city search and copy/paste the next required link. They look like this:
   * Link to city search: `http://api.teleport.org/api/cities/?search=CityName`
   * Link to city data: `https://api.teleport.org/api/cities/geonameid:???????/`
   * Link to urban area: `https://api.teleport.org/api/urban_areas/slug:???????/`
   * Link to scores: `https://api.teleport.org/api/urban_areas/slug:??????/scores/`
- Getting the required bits of information from the retrieved raw (JSON) data is a little cumbersome. The cell below contains an example query that demonstrates how to dig layer by layer into the first API response. If this feels confusing, revisit notebook #2 and recheck the sections about lists and dictionaries, especially the person example.
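
To see what "layer by layer" means, it can help to take the nested structure apart one step at a time. The sketch below uses a hand-written dictionary that only mimics the shape of the city search response; the variable names are illustrative and the link is the placeholder from the list above, not a real result.

```python
# Hand-written stand-in for a city search response (shape only, no real data)
sample = {
    "_embedded": {
        "city:search-results": [
            {
                "_links": {
                    "city:item": {
                        "href": "https://api.teleport.org/api/cities/geonameid:???????/"
                    }
                }
            }
        ]
    }
}

embedded = sample["_embedded"]              # dictionary -> access by key
results = embedded["city:search-results"]   # list -> access by position
first_hit = results[0]                      # first search result, again a dictionary
link = first_hit["_links"]["city:item"]["href"]

print(type(embedded).__name__, type(results).__name__, type(first_hit).__name__)
print(link)
```

Printing the type at each step shows whether the next access needs a key (dictionary) or an index (list).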

#### Preparing your dataframe

- Remember: operations like `drop`, `rename`, etc. create and return a modified copy of the dataframe. To make your changes persistent, either edit in place with `df.rename(..., inplace=True)` or assign the result back to the original name, as in `df = df.rename(...)`.
- To approach the problem stepwise, create and prepare separate dataframes for each city in your loop and join the dataframes at the end outside the loop. 
- It might be useful to drop the color column and rename the column "score_out_of_10" to the city name during dataframe preparation.
- Don't forget to set the index. 
- There are at least two options to join multiple dataframes:
   * Use the `.loc[]` indexer to piece your result dataframe together, like `df.loc[:,"columnX"] = ...` to add a column
   * Check the documentation of `pandas.concat()` function online and use it to generate your result
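
The preparation hints above can be tried out on made-up rows before touching the API. In this sketch the `raw` dictionary, its values, and the column name `name` are assumptions; only `color` and `score_out_of_10` are taken from the hints.

```python
import pandas as pd

# Made-up score rows per city (NOT real Teleport data)
raw = {
    "City1": [
        {"color": "#aaaaaa", "name": "Housing", "score_out_of_10": 7.2},
        {"color": "#bbbbbb", "name": "Cost of Living", "score_out_of_10": 5.7},
    ],
    "City2": [
        {"color": "#aaaaaa", "name": "Housing", "score_out_of_10": 6.9},
        {"color": "#bbbbbb", "name": "Cost of Living", "score_out_of_10": 9.1},
    ],
}

frames = []
for city, rows in raw.items():
    df = pd.DataFrame(rows)
    df = df.drop(columns="color")                        # returns a copy, so reassign
    df = df.rename(columns={"score_out_of_10": city})    # score column named after the city
    df = df.set_index("name")                            # category names become the index
    frames.append(df)

# Option 1: join all prepared dataframes side by side, aligned on the index
result = pd.concat(frames, axis=1)

# Option 2: start from one dataframe and add further columns with .loc
alt = frames[0].copy()
alt.loc[:, "City2"] = frames[1]["City2"]

print(result)
```

Both options align rows by index label, which is why setting the index before joining matters.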

## Some things to get you started...
You can start working directly in the cell below or create a new notebook with File->New->Notebook and copy the following code.

In [None]:
import requests
import pandas

# Entry link to teleport API
url_citysearch = "http://api.teleport.org/api/cities/?search="   # append the name of the city you want to search for at the end of this string

# Cities to analyze
cities = ["city1", "city2", "city3", ...]
scores = []                                  # empty list, your loop can add the results here later

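# Starting point: look up an example city
result = requests.get(url_citysearch + "Berlin")

# Dig into the JSON layer by layer:
# "_embedded" (dict) -> "city:search-results" (list) -> first hit -> "_links" -> "city:item" -> "href"
link_to_city_data = result.json()["_embedded"]["city:search-results"][0]["_links"]["city:item"]["href"]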
print(link_to_city_data)

# ...