### This lab has two parts:

**PART 1: Bicycle Hire data: Read two solutions and write a third:**

- There are two solved examples (Tasks 1 and 2) that use the Cycle Hire API, followed by a list of questions you can try to answer by looking at the data.
- If you run out of tasks, feel free to come up with your own.

**PART 2:**

Pick one of the listed APIs and:

1. Get data from it
2. Represent that data as dictionaries or objects
3. Invent 1-2 business questions and answer them using your code

Notes:
`response.json()` ---> returns the data parsed from JSON (basically a dictionary)
`response.text` ---> returns the same data as one plain string; useful only for printing everything, because you can't treat it as a dictionary
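
To see the difference without calling a real API, the sketch below imitates the two views of one response body. `sample_text` is a made-up stand-in for `response.text`, and `json.loads` stands in for what `response.json()` roughly does with the same body; neither name comes from the lab itself.

```python
import json

# Made-up stand-in for response.text: the raw body is just one long string
sample_text = '{"data": {"stations": [{"station_id": "1727", "capacity": 15}]}}'

# Roughly what response.json() gives you: the same body parsed into Python objects
sample_json = json.loads(sample_text)

print(type(sample_text))   # a str: you can print it, but not look things up by key
print(type(sample_json))   # a dict: keys and indexes work
print(sample_json["data"]["stations"][0]["capacity"])
```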

Notes: 
Use Chrome web browser and install this JSON Viewer extension to preview JSON in a more readable format: https://chrome.google.com/webstore/detail/json-viewer/gbmdgpbipfallnflgajpaliibnhdgobh

# Part 1

### Bicycle Scheme API Reference:

We'll use two API calls: 

**PERMANENT STATION INFORMATION**: detailed, rarely changing info about stations (e.g. their intended size, location). Use it to get info about all stations in this format:

```
{
        "station_id": "1727",
        "name": "Causewayside",
        "address": "Causewayside, Edinburgh, EH9 1PR",
        "lat": 55.93650603148236,
        "lon": -3.1801663476151134,
        "capacity": 15
      }
```
      
**LIVE-UPDATED STATION STATUS**: time-sensitive info (refreshed every 10 s) about the current status of the bicycle docking stations. Use it to get info about all stations in this format:

```
{
        "station_id": "1727",
        "is_installed": 1,
        "is_renting": 1,
        "is_returning": 1,
        "last_reported": 1571174615,
        "num_bikes_available": 3,
        "num_docks_available": 12
      }
      
```

**Documentation:**
 
https://edinburghcyclehire.com/open-data/realtime 

**Full responses of the calls:**

https://gbfs.urbansharing.com/edinburghcyclehire.com/station_information.json

```
{
  "last_updated": 1571174713,
  "ttl": 10,
  "data": {
    "stations": [
      {
        "station_id": "1727",
        "name": "Causewayside",
        "address": "Causewayside, Edinburgh, EH9 1PR",
        "lat": 55.93650603148236,
        "lon": -3.1801663476151134,
        "capacity": 15
      },
      {
        "station_id": "1726",
        "name": "Simon Square",
        "address": "Simon Square, Edinburgh, EH8 9HP",
        "lat": 55.94485886752089,
        "lon": -3.182589723460069,
        "capacity": 13
      },
```

https://gbfs.urbansharing.com/edinburghcyclehire.com/station_status.json   


```
  {
  "last_updated": 1571174615,
  "ttl": 10,
  "data": {
    "stations": [
      {
        "station_id": "1727",
        "is_installed": 1,
        "is_renting": 1,
        "is_returning": 1,
        "last_reported": 1571174615,
        "num_bikes_available": 3,
        "num_docks_available": 12
      },
      {
        "station_id": "1726",
        "is_installed": 1,
        "is_renting": 1,
        "is_returning": 1,
        "last_reported": 1571174615,
        "num_bikes_available": 5,
        "num_docks_available": 8
      },
```
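
Both responses wrap the list of stations in the same envelope, and the `last_updated` and `last_reported` values look like Unix timestamps (seconds since 1970). A minimal sketch, using a hard-coded and shortened copy of the sample above rather than a live call; the variable names are only illustrative, and reading the timestamp as seconds is an assumption:

```python
from datetime import datetime, timezone

# Hard-coded, shortened copy of the station_status sample shown above
sample_response = {
    "last_updated": 1571174615,
    "ttl": 10,
    "data": {"stations": [
        {"station_id": "1727", "num_bikes_available": 3, "num_docks_available": 12},
        {"station_id": "1726", "num_bikes_available": 5, "num_docks_available": 8},
    ]},
}

# The list we care about sits two levels down
stations = sample_response["data"]["stations"]
print(len(stations), "stations in this sample")

# Assumption: the timestamp is in seconds, so it converts directly to a date
updated_at = datetime.fromtimestamp(sample_response["last_updated"], tz=timezone.utc)
print("last updated:", updated_at.isoformat())
```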



## Some starting code: Here are two functions for you to get the data. 

No need to edit those, or to even understand them. Just run these cells, so that you can use `get_all_stations_info()` and `get_all_stations_status()` to get the lists of dictionaries, with data about bicycle docking stations.

In [1]:
import requests
import pprint as pp
    

# here are two functions that you do not need to understand just now (but do try later!)
# just have a look at their outputs below:

def get_all_stations_info():
    # you can see it in your browser by going to: 
    # https://gbfs.urbansharing.com/edinburghcyclehire.com/station_information.json

    response_station_information = requests.request("GET", "https://gbfs.urbansharing.com/edinburghcyclehire.com/station_information.json")
    stations_info = response_station_information.json()['data']['stations']
    return stations_info

def get_all_stations_status():
    # you can see it in your browser by going to: 
    # https://gbfs.urbansharing.com/edinburghcyclehire.com/station_status.json
    
    response_station_status = requests.request("GET", "https://gbfs.urbansharing.com/edinburghcyclehire.com/station_status.json")
    stations_status = response_station_status.json()['data']['stations']
    return stations_status


Let's call these functions, so that you see what they return:

In [2]:
infos = get_all_stations_info()
pp.pprint(infos)

[{'address': 'Picady Place',
  'capacity': 21,
  'lat': 55.95653524179326,
  'lon': -3.1862476120746805,
  'name': 'Picardy Place',
  'station_id': '2268'},
 {'address': 'Bridge St',
  'capacity': 29,
  'lat': 55.94396074176487,
  'lon': -3.058306836702485,
  'name': 'Musselburgh Brunton Hall',
  'station_id': '2265'},
 {'address': 'Musselborough North High Street opposite Harbour Road',
  'capacity': 24,
  'lat': 55.94388031687606,
  'lon': -3.0667539162178628,
  'name': 'Musselburgh Lidl',
  'station_id': '2263'},
 {'address': '165 Leith Walk',
  'capacity': 31,
  'lat': 55.96791807044289,
  'lon': -3.1735862970647304,
  'name': 'Leith Walk North',
  'station_id': '2259'},
 {'address': 'Shore Rd, South Queensferry EH30 9SQ',
  'capacity': 15,
  'lat': 55.992957267668345,
  'lon': -3.4071562055591187,
  'name': 'Port Edgar Marina',
  'station_id': '1877'},
 {'address': 'Ferrymuir, South Queensferry EH30 9QZ',
  'capacity': 12,
  'lat': 55.983766187891035,
  'lon': -3.401351801602175,


In [3]:
statuses = get_all_stations_status()
pp.pprint(statuses)

[{'is_installed': 1,
  'is_renting': 1,
  'is_returning': 1,
  'last_reported': 1603117743,
  'num_bikes_available': 0,
  'num_docks_available': 21,
  'station_id': '2268'},
 {'is_installed': 1,
  'is_renting': 1,
  'is_returning': 1,
  'last_reported': 1603117743,
  'num_bikes_available': 0,
  'num_docks_available': 10,
  'station_id': '2265'},
 {'is_installed': 1,
  'is_renting': 1,
  'is_returning': 1,
  'last_reported': 1603117743,
  'num_bikes_available': 5,
  'num_docks_available': 14,
  'station_id': '2263'},
 {'is_installed': 1,
  'is_renting': 1,
  'is_returning': 1,
  'last_reported': 1603117743,
  'num_bikes_available': 4,
  'num_docks_available': 12,
  'station_id': '2259'},
 {'is_installed': 1,
  'is_renting': 1,
  'is_returning': 1,
  'last_reported': 1603117743,
  'num_bikes_available': 0,
  'num_docks_available': 15,
  'station_id': '1877'},
 {'is_installed': 1,
  'is_renting': 1,
  'is_returning': 1,
  'last_reported': 1603117743,
  'num_bikes_available': 4,
  'num_doc

### SOLVED Task 1 - What is the total capacity of all docking stations in Edinburgh?  (use station info to find out)

In [4]:
def get_total_capacity():
    all_stations_infos = get_all_stations_info()
    capacities = [ station_info['capacity']
                    for station_info in all_stations_infos
                 ]
    return sum(capacities)

print( "Total Capacity: ", get_total_capacity() )

Total Capacity:  2474


### SOLVED Task 2 - How many bicycles are available right now (use station status to find out)

In [5]:
def get_total_bikes_parked_now():
    all_stations_statuses = get_all_stations_status()


    bikes_parked = [ station_status['num_bikes_available']
                    for station_status in all_stations_statuses
                 ]        
    return sum(bikes_parked)

print( "Bikes Parked now: ", get_total_bikes_parked_now() )

Bikes Parked now:  228


### SOLVED Task 3 - Business wants to place adverts on the largest stations (the ones larger than average). How many stations are we talking about? 

Find the number of stations that are above average capacity. Calculate total capacity (you did it already), then find out the average capacity. Finally filter only stations larger than that and count them. 

note: Prints are there to show you the process in a clearer way. But it's the return value that counts!

In [6]:
def get_stations_in_top_10_percent_largest():
    all_stations_infos = get_all_stations_info()

    total_capacity = get_total_capacity()
    print("total_capacity",total_capacity)

    number_of_stations = len(all_stations_infos)
    print("number_of_stations",number_of_stations)
    
    average_capacity = total_capacity / number_of_stations
    print("average capacity is",average_capacity)
    
    stations_in_top_10_percent_largest = [ station_info['capacity']
                    for station_info in all_stations_infos
                    if station_info['capacity'] > average_capacity
                 ]        
    
    return len(stations_in_top_10_percent_largest)

print( "Number of stations larger than average: ", get_stations_in_top_10_percent_largest() )

total_capacity 2474
number_of_stations 111
average capacity is 22.28828828828829
Number of stations larger than average:  43


Optional addition to task 3: once you understand the above code, can you change it so it shows the list of all capacities above average? It will look a bit like this: `[29, 24, 31, 26, 23, 36]`
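
One possible way to approach this optional step is sketched below. It is only an illustration: it works on a small hard-coded sample (`sample_infos`, taken from the first stations printed earlier) instead of calling the API, and the function name `capacities_above_average` is made up for this example.

```python
# Small hard-coded sample in the same shape that get_all_stations_info() returns
sample_infos = [
    {"name": "Picardy Place", "capacity": 21},
    {"name": "Musselburgh Brunton Hall", "capacity": 29},
    {"name": "Musselburgh Lidl", "capacity": 24},
    {"name": "Leith Walk North", "capacity": 31},
    {"name": "Port Edgar Marina", "capacity": 15},
]

def capacities_above_average(stations_infos):
    capacities = [station_info["capacity"] for station_info in stations_infos]
    average_capacity = sum(capacities) / len(capacities)
    # return the filtered list itself instead of its length
    return [capacity for capacity in capacities if capacity > average_capacity]

print(capacities_above_average(sample_infos))
```

The only change compared with the solved task is the last step: the list is returned as it is, rather than passed through `len()`.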

### Task 4 - What is the number of docks with unknown state? (not having a bike in them, and not waiting for a bike)


In the station status we know two things about each docking station: number of usable bikes available there now `num_bikes_available`  and number of empty parking spots available now `num_docks_available`. But we also know the `capacity` that the station was designed for. 

When you look at the data carefully, you will see that often the designed `capacity` does not add up to `num_bikes_available` plus `num_docks_available`. That's because a docking point may be broken, or may have a broken bicycle in it.

Find out how many of these 'zombie' spaces there are in the whole city. How would you do that?

In [10]:
def get_number_of_docks_not_waiting_and_not_with_a_bike():

    all_stations_infos = get_all_stations_info()
    all_stations_statuses = get_all_stations_status()

    bikes_parked_and_empty_docks = [ station_status['num_bikes_available'] + station_status['num_docks_available']
                    for station_status in all_stations_statuses
                 ]  
    
    nominal_capacity_of_stations = [ station_info['capacity']
                    for station_info in all_stations_infos
                 ]  
    total_bikes_parked_and_empty_docks = sum(bikes_parked_and_empty_docks)
    total_nominal_capacity_of_stations = sum(nominal_capacity_of_stations)
    
    print(total_nominal_capacity_of_stations, total_bikes_parked_and_empty_docks)
    return total_nominal_capacity_of_stations - total_bikes_parked_and_empty_docks

print( "Zombie docks:", get_number_of_docks_not_waiting_and_not_with_a_bike() )

2474 1874
Zombie docks: 600
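
The solution above compares city-wide totals. If you want to know which stations the zombie docks belong to, the two lists have to be matched by `station_id`. A minimal sketch under stated assumptions: the sample data is copied by hand from the outputs printed earlier, and `zombie_docks_per_station` is a hypothetical helper name.

```python
# Hand-copied sample in the shapes returned by the two starter functions
sample_infos = [
    {"station_id": "2268", "capacity": 21},
    {"station_id": "2265", "capacity": 29},
    {"station_id": "2263", "capacity": 24},
    {"station_id": "2259", "capacity": 31},
]
sample_statuses = [
    {"station_id": "2268", "num_bikes_available": 0, "num_docks_available": 21},
    {"station_id": "2265", "num_bikes_available": 0, "num_docks_available": 10},
    {"station_id": "2263", "num_bikes_available": 5, "num_docks_available": 14},
    {"station_id": "2259", "num_bikes_available": 4, "num_docks_available": 12},
]

def zombie_docks_per_station(infos, statuses):
    # index capacities by id, so each status can find its own station
    capacity_by_id = {info["station_id"]: info["capacity"] for info in infos}
    return {
        status["station_id"]: capacity_by_id[status["station_id"]]
                              - status["num_bikes_available"]
                              - status["num_docks_available"]
        for status in statuses
        if status["station_id"] in capacity_by_id  # skip stations missing from infos
    }

per_station = zombie_docks_per_station(sample_infos, sample_statuses)
print(per_station)
print("total in this sample:", sum(per_station.values()))
```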


### Task 5 - What is the average number of bikes available per station?


In [11]:
def get_avg_of_available_bikes_per_station():

    all_stations_statuses = get_all_stations_status()

    bikes_parked = [ station_status['num_bikes_available']
                    for station_status in all_stations_statuses
                 ]  

    total_bikes_parked = sum(bikes_parked)
    number_of_stations = len(all_stations_statuses)
    
    print(total_bikes_parked, number_of_stations)
    return total_bikes_parked / number_of_stations

print( "average number of bikes available per station:", get_avg_of_available_bikes_per_station() )

219 111
average number of bikes available per station: 1.972972972972973


### Task 6 - What percentage of stations is more than half-empty right now? (how would you calculate that? try to only use get_all_stations_status( ) )

In [18]:
def get_percent_of_half_empty_stations():

    all_stations_statuses = get_all_stations_status()

    all_half_empty_stations = [ station_status
                    for station_status in all_stations_statuses
                    # more than half-empty: fewer bikes parked than empty docks
                    if station_status['num_bikes_available'] < station_status['num_docks_available']
                 ]  
    
    number_of_half_empty_stations = len(all_half_empty_stations)
    number_of_stations = len(all_stations_statuses)

    print(number_of_half_empty_stations, number_of_stations)
    return 100 * number_of_half_empty_stations / number_of_stations

print( "percentage of stations more than half-empty right now:", get_percent_of_half_empty_stations() )



### Task 7 - Given a station id (e.g. 248 is the one next to Business School), print information about it FROM BOTH APIS

In [25]:
def info_about_station(station_id):
    
    all_stations_infos = get_all_stations_info()
    all_stations_statuses = get_all_stations_status()

    statuses_of_all_stations_with_this_id = [ station_status
                    for station_status in all_stations_statuses
                     if station_status['station_id'] == str(station_id)
                 ]  
    this_station_status = statuses_of_all_stations_with_this_id[0]
    print("this_station_status", this_station_status)
    
    infos_of_all_stations_with_this_id = [ station_info
                for station_info in all_stations_infos
                 if station_info['station_id'] == str(station_id)
         ]  
    this_station_info = infos_of_all_stations_with_this_id[0]
    print("this_station_info", this_station_info)
    print()

    this_station_status.update(this_station_info)
    return this_station_status

pp.pprint(info_about_station(248) )

this_station_status {'station_id': '248', 'is_installed': 1, 'is_renting': 1, 'is_returning': 1, 'last_reported': 1603118709, 'num_bikes_available': 0, 'num_docks_available': 35}
this_station_info {'station_id': '248', 'name': 'Bristo Square', 'address': 'Bristo Square, Potterrow', 'lat': 55.94583371599285, 'lon': -3.189053072075467, 'capacity': 43}

{'address': 'Bristo Square, Potterrow',
 'capacity': 43,
 'is_installed': 1,
 'is_renting': 1,
 'is_returning': 1,
 'last_reported': 1603118709,
 'lat': 55.94583371599285,
 'lon': -3.189053072075467,
 'name': 'Bristo Square',
 'num_bikes_available': 0,
 'num_docks_available': 35,
 'station_id': '248'}
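
Note that the solution above takes element `[0]` of a filtered list, which raises an `IndexError` when no station has the given id. A possible alternative, sketched here on hard-coded sample data, is to index the stations by id first; `find_by_id` is a made-up name for illustration.

```python
# Hand-copied sample in the same shape that get_all_stations_info() returns
sample_infos = [
    {"station_id": "248", "name": "Bristo Square", "capacity": 43},
    {"station_id": "2268", "name": "Picardy Place", "capacity": 21},
]

def find_by_id(stations, station_id):
    # build a lookup table: station_id -> the whole station dictionary
    by_id = {station["station_id"]: station for station in stations}
    # ids are stored as strings, so convert; .get() returns None if the id is unknown
    return by_id.get(str(station_id))

print(find_by_id(sample_infos, 248))
print(find_by_id(sample_infos, 9999))
```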


### Task 8 - Look at the available data about bicycle parking stations info and status and form a business problem that could be solved with this data. Then solve that problem with code

In [28]:
# another idea: print info about all stations, like "Bristo Square has 3 bikes right now"
# (note you'll have to combine both sources of info)
# the function below answers a different question: which stations are larger than average?

def names_of_all_stations_larger_than_average():
    
    all_stations_infos = get_all_stations_info()

    all_stations_capacities = [ station_info['capacity']
                for station_info in all_stations_infos
         ]  
    average_capacity = sum(all_stations_capacities) / len(all_stations_capacities)
    print("average_capacity", average_capacity)

    names_of_above_average_stations = [ station_info['name']
                for station_info in all_stations_infos
                if station_info['capacity'] > average_capacity
         ] 
    print("number of stations above average", len(names_of_above_average_stations))

    return names_of_above_average_stations

print(names_of_all_stations_larger_than_average() )

average_capacity 22.28828828828829
number of stations above average 43
['Musselburgh Brunton Hall', 'Musselburgh Lidl', 'Leith Walk North', 'Hawes Pier', 'Ingliston Park & Ride', 'Duke Street', 'Boroughmuir', 'Heriot Watt - Student Accommodation', 'Dynamic Earth', 'Royal Edinburgh Hospital', 'Gorgie Road', 'Joppa', 'Brunswick Place', 'Bruntsfield Links', 'Balgreen Road', 'Craigleith Road', 'IGMM - Western General', 'Corstorphine Road', 'Logie Green Road', 'Inverleith Row', 'East London Street', 'McDonald Road', 'Portobello - Kings Road', 'Cramond Foreshore', 'Tollcross', 'Marchmont Crescent', 'West Crosscauseway', 'Holyrood Road', 'Surgeons Hall', 'EICC', 'Lothian Road', 'South Trinity Road', 'Meadow Place', 'Castle Street', 'Ocean Terminal', 'Meadows East', 'Pollock Halls', 'Canonmills', 'Lauriston Place', 'Kings Building 3', 'Victoria Quay', 'Bristo Square', 'George Square']
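
The other idea mentioned in the cell above, printing a line such as "Bristo Square has 3 bikes right now" for every station, needs both sources of data: names come from the station info and bike counts from the station status. A rough sketch on hand-copied sample data follows; `describe_all_stations` is a hypothetical name.

```python
# Hand-copied samples in the shapes returned by the two starter functions
sample_infos = [
    {"station_id": "248", "name": "Bristo Square"},
    {"station_id": "2263", "name": "Musselburgh Lidl"},
]
sample_statuses = [
    {"station_id": "248", "num_bikes_available": 0},
    {"station_id": "2263", "num_bikes_available": 5},
]

def describe_all_stations(infos, statuses):
    # bike counts indexed by station id, so each name can find its count
    bikes_by_id = {status["station_id"]: status["num_bikes_available"]
                   for status in statuses}
    return ["{} has {} bikes right now".format(info["name"], bikes_by_id[info["station_id"]])
            for info in infos
            if info["station_id"] in bikes_by_id]

for line in describe_all_stations(sample_infos, sample_statuses):
    print(line)
```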


# Part 2: If you still have time: Pick one of the below APIs: Stocks or IMDB


Pick one of the listed APIs and:

1. Get data from it
2. Represent that data as dictionaries or objects
3. Invent 1-3 business questions and answer them using your code

Agree on your goals first, so that you are on the same page.

### Stock Information:

- this API will require you to create an account (you'll need to give them your email)
- you will be limited to 5 calls a minute and 500 calls a day

Get API key here:  https://www.alphavantage.co/support/#api-key

Docs: https://www.alphavantage.co/documentation/ 

Look carefully at the URLs: they contain all the details of the question you're asking the API.

For example, https://www.alphavantage.co/query?function=FX_DAILY&from_symbol=EUR&to_symbol=GBP&apikey=123654

contains this information:

- type of info: FX_DAILY
- from_symbol: EUR
- to_symbol: GBP
- apikey: 123654
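
The same breakdown can be done in code with Python's standard `urllib.parse` module. This is just a sketch to show how the parts of the example URL map to a dictionary; the `USD` variant at the end is an arbitrary illustration, not something the lab asks for.

```python
from urllib.parse import urlsplit, parse_qs, urlencode

url = "https://www.alphavantage.co/query?function=FX_DAILY&from_symbol=EUR&to_symbol=GBP&apikey=123654"

# parse_qs returns a list of values per key, so take the first value of each
params = {key: values[0] for key, values in parse_qs(urlsplit(url).query).items()}
print(params)

# Going the other way: change one detail and build the query string again
usd_params = dict(params, to_symbol="USD")
rebuilt_url = "https://www.alphavantage.co/query?" + urlencode(usd_params)
print(rebuilt_url)
```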

Some ideas:

- What was the opening price of Tesla TSLA yesterday?
- What percent of days was the open price higher than close price?
- How often is the up/down trend from yesterday a good predictor of today?
- Are some months worse than others every year?
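
As a starting point for the second idea, here is a small sketch that works on a hard-coded dictionary in the same shape as the daily time series this API returns. Only the 2019-10-16 entry comes from the sample response in the next cell; the other three days are invented for illustration, and `percent_of_days_open_above_close` is a made-up name. Note that the prices arrive as strings, so they need converting before comparing.

```python
# Same shape as response.json()["Time Series FX (Daily)"]
sample_series = {
    "2019-10-16": {"1. open": "0.8665", "4. close": "0.8644"},
    "2019-10-15": {"1. open": "0.8700", "4. close": "0.8710"},  # invented
    "2019-10-14": {"1. open": "0.8750", "4. close": "0.8720"},  # invented
    "2019-10-11": {"1. open": "0.8800", "4. close": "0.8810"},  # invented
}

def percent_of_days_open_above_close(series):
    days_open_higher = [date
                        for (date, values) in series.items()
                        # prices are strings, so convert them to numbers first
                        if float(values["1. open"]) > float(values["4. close"])]
    return 100 * len(days_open_higher) / len(series)

print(percent_of_days_open_above_close(sample_series))
```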

In [None]:
# Example: currency exchange rates between EUR and GBP in last 4 months
    
# {
#     "Time Series FX (Daily)": {
#         "2019-10-16": {
#             "1. open": "0.8665",
#             "2. high": "0.8674",
#             "3. low": "0.8623",
#             "4. close": "0.8644"
#         },
    
import requests
your_api_key = "abcde" #get it from the above website


url = "https://www.alphavantage.co/query?function=FX_DAILY&from_symbol=EUR&to_symbol=GBP&apikey="+your_api_key
response_currency = requests.request("GET", url)
stock_dictionary = response_currency.json()["Time Series FX (Daily)"]
pp.pprint(stock_dictionary)

In [None]:
# Example: daily stock value of GOOG since 2004-08-19

# {
#     "Time Series (Daily)": {
#         "2019-10-15": {
#             "1. open": "1220.4000",
#             "2. high": "1235.9200",
#             "3. low": "1220.4000",
#             "4. close": "1235.5700",
#             "5. volume": "495397"
#         },


import requests
import pprint as pp
your_api_key = "abcde" #get it from the above website

url = "https://www.alphavantage.co/query?function=TIME_SERIES_DAILY&outputsize=full&symbol=GOOG&apikey="+your_api_key
response = requests.request("GET", url)
stock_dictionary = response.json()['Time Series (Daily)']
pp.pprint(stock_dictionary)


In [None]:
# this is some advanced stuff. Look at the above data and try to figure out what's the data structure this api uses.

#to get just dates, or just values
all_dates = list(stock_dictionary.keys())
all_values = list(stock_dictionary.values())

# to drill into a particular date
print("Google opening for 2020-10-14:", stock_dictionary['2020-10-14']['1. open'])

In [None]:
# advanced: list comprehension to cycle through a dictionary
["date:" + key + " opened on " + value['1. open']
for (key, value) in stock_dictionary.items()]

In [None]:
# super extra even more advanced! double list comprehension! + conditions!

# get only open and close values, only for dates in October 2020

[ date + " had "+ key + " at " + value
for (date, values) in stock_dictionary.items()
for (key, value ) in values.items()
if date[0:7] == '2020-10' and  (key == '1. open' or key == '4. close' )
]

### IMDB Movie database

website: http://www.omdbapi.com/
get api key, 1000 calls limit: http://www.omdbapi.com/apikey.aspx (you'll need to click a link in email)


**SEARCH BY TITLE** - http://www.omdbapi.com/?s=Batman - will return basic information about many movies. Each response contains 10 results in a list. To see the next page (11-20) add ```&page=2```, to see 21-30 add ```&page=3```, etc.

**GET DETAILED INFO BY ID** - http://www.omdbapi.com/?i=tt0372784 - will return detailed information about one movie with the id you pass in the URL, in this example ```tt0372784```
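
Because each search response holds only 10 results, getting every match means requesting several pages. The sketch below only builds the list of URLs and makes no calls. It assumes the total number of matches is known from the `totalResults` field of the first response (the Batman sample further down reports `'393'`), and `search_page_urls` is a hypothetical helper name.

```python
import math
from urllib.parse import urlencode

def search_page_urls(title, total_results, api_key, page_size=10):
    # totalResults arrives as a string, e.g. '393', so convert it first
    number_of_pages = math.ceil(int(total_results) / page_size)
    return ["http://www.omdbapi.com/?" + urlencode({"s": title, "page": page, "apikey": api_key})
            for page in range(1, number_of_pages + 1)]

urls = search_page_urls("Batman", "393", "abcde")
print(len(urls), "pages")
print(urls[1])
```

Keep the call limit of your API key in mind before requesting every page.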

IDEA 1:
Ask the user for a movie title, search for that title in the API and print all titles in the format "Godzilla (1993)"

IDEA 2:
Display all movies in detailed format: "Godzilla (1993) - 112 minutes - earned $10.300.000 - Rating: 8.2"

You will need to make two types of calls: 

- first get all movies with a particular title (eg "Pokemon" or "Star Wars")
- in results for that call identify movie id's ```imdbID```
- FOR EACH imdbID request detailed movie data and display it in a nice format
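
The last step, displaying one movie in the detailed format, can be sketched without any API call. The dictionary below is a shortened, hand-copied version of the Batman Begins detail response shown at the end of this notebook; `describe_movie` is a made-up name, and using `.get()` with an "N/A" fallback is an assumption about fields that may be missing for some movies.

```python
# Shortened copy of the detailed response for imdbID tt0372784
sample_movie = {
    "Title": "Batman Begins",
    "Year": "2005",
    "Runtime": "140 min",
    "BoxOffice": "$204,100,000",
    "imdbRating": "8.2",
}

def describe_movie(movie):
    # every value in the response is a string, so no conversion is needed for display
    runtime = movie.get("Runtime", "N/A").replace(" min", " minutes")
    return "{} ({}) - {} - earned {} - Rating: {}".format(
        movie.get("Title", "N/A"),
        movie.get("Year", "N/A"),
        runtime,
        movie.get("BoxOffice", "N/A"),
        movie.get("imdbRating", "N/A"),
    )

print(describe_movie(sample_movie))
```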

Some further ideas:

- Are there more Star Wars or Star Trek movies?
- Have Star Wars movies got longer recently?

In [None]:
import requests
your_api_key = "abcde" #get it from the above website

url = "http://www.omdbapi.com/?s=Batman&page=1&apikey=" + your_api_key
response = requests.request("GET", url)

# try printing the whole response. If your API key is not correct, this might have an unexpected value
print(response.json()) 


# will give you a list of all Batman movies... all of them? No just first 10.
# there are  'totalResults': '393'! (including cartoons)
# notice that in the above URL there is an argument page=1; if you change it to page=2... you get another 10


# {
#   "Search": [
#     {
#       "Title": "Batman Begins",
#       "Year": "2005",
#       "imdbID": "tt0372784",
#       "Type": "movie",
#       "Poster": "https://m.media-amazon.com/images/M/MV5BZmUwNGU2ZmItMmRiNC00MjhlLTg5YWUtODMyNzkxODYzMmZlXkEyXkFqcGdeQXVyNTIzOTk5ODM@._V1_SX300.jpg"
#     },
#     {
#       "Title": "Batman v Superman: Dawn of Justice",
#       "Year": "2016",
#       "imdbID": "tt2975590",
#       "Type": "movie",
#       "Poster": "https://m.media-amazon.com/images/M/MV5BYThjYzcyYzItNTVjNy00NDk0LTgwMWQtYjMwNmNlNWJhMzMyXkEyXkFqcGdeQXVyMTQxNzMzNDI@._V1_SX300.jpg"
#     },

In [None]:
# grab just the movies and print just the title of the first one
movies = response.json()["Search"] 
print(movies[0]["Title"])

In [None]:
import requests
# your_api_key = "abcde" #get it from the above website

url = "http://www.omdbapi.com/?i=tt0372784&apikey={}".format(your_api_key)
response = requests.request("GET", url)
print(response.json()["Actors"])

# will give you the details of the movie with imdb id tt0372784, which is Batman Begins

# {
#   "Title": "Batman Begins",
#   "Year": "2005",
#   "Rated": "PG-13",
#   "Released": "15 Jun 2005",
#   "Runtime": "140 min",
#   "Genre": "Action, Adventure",
#   "Director": "Christopher Nolan",
#   "Writer": "Bob Kane (characters), David S. Goyer (story), Christopher Nolan (screenplay), David S. Goyer (screenplay)",
#   "Actors": "Christian Bale, Michael Caine, Liam Neeson, Katie Holmes",
#   "Plot": "After training with his mentor, Batman begins his fight to free crime-ridden Gotham City from corruption.",
#   "Language": "English, Urdu, Mandarin",
#   "Country": "USA, UK",
#   "Awards": "Nominated for 1 Oscar. Another 14 wins & 72 nominations.",
#   "Poster": "https://m.media-amazon.com/images/M/MV5BZmUwNGU2ZmItMmRiNC00MjhlLTg5YWUtODMyNzkxODYzMmZlXkEyXkFqcGdeQXVyNTIzOTk5ODM@._V1_SX300.jpg",
#   "Ratings": [
#     {
#       "Source": "Internet Movie Database",
#       "Value": "8.2/10"
#     },
#     {
#       "Source": "Rotten Tomatoes",
#       "Value": "84%"
#     },
#     {
#       "Source": "Metacritic",
#       "Value": "70/100"
#     }
#   ],
#   "Metascore": "70",
#   "imdbRating": "8.2",
#   "imdbVotes": "1,212,892",
#   "imdbID": "tt0372784",
#   "Type": "movie",
#   "DVD": "18 Oct 2005",
#   "BoxOffice": "$204,100,000",
#   "Production": "Warner Bros. Pictures",
#   "Website": "N/A",
#   "Response": "True"
# }