## Exercise 3 - Data sources
- All files used in this exercise can be found under the Exercises/data_files directory

1 Use gamedata.json for this task. This file contains information about games sold through Steam. Parse out the following information from the data:
- The three highest Metacritic scores. Present results using the following format: *Title* has metacritic score of *Score* (for example)
- Games discounted by 90 % or more. Present results using the following format: *Title* | Discount: *Savings* (for example Metal Gear Solid V: Ground Zeroes | Discount: 90.090090)
- Games whose Metacritic score is higher than their Steam score. Present results using the following format: *Title* has metacritic score of *MetacriticScore* and steam score of *SteamRatingPercent*
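
The *Savings* value in the example format looks like a percentage of the normal price. A minimal sketch of that calculation follows. The function name `savings_percent` and the two prices are assumptions chosen for illustration; they are not read from gamedata.json.

```python
def savings_percent(normal_price, sale_price):
    """Return the discount as a percentage of the normal price."""
    if normal_price == 0:
        # Free games have no meaningful discount; avoid dividing by zero.
        return 0.0
    return (normal_price - sale_price) / normal_price * 100


# Illustrative prices only (assumed, not taken from the data file).
normal, sale = 19.98, 1.98
print(f'Example Game | Discount: {savings_percent(normal, sale):.6f}')
```

Formatting with `:.6f` keeps six decimals, which matches the number of decimals in the example line of the task.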

In [4]:
import json

with open('data_files/gamedata.json') as f:
    gamedata = json.load(f)

score_list = []          # every Metacritic score as an int, used for sorting
score_and_title = []     # each score paired with its title
discount_and_title = []  # games discounted by 90 % or more
meta_over_steam = []     # games scoring higher on Metacritic than on Steam

for game in gamedata:
    tmp_title = game['title']
    # Convert to numbers so that scores and prices compare numerically, not as text.
    metaCritic = int(game['metacriticScore'])
    steamRating = int(game['steamRatingPercent'])
    discount = float(game['salePrice'])
    original = float(game['normalPrice'])

    score_list.append(metaCritic)
    score_and_title.append({'value': metaCritic, 'title': tmp_title})

    # A discount of 90 % or more means the sale price is at most 10 % of the normal price.
    if discount <= original * 0.1:
        discount_and_title.append({'title': tmp_title, 'original': original, 'discount': discount})
    if metaCritic > steamRating:
        meta_over_steam.append({'title': tmp_title, 'metaCritic': metaCritic, 'steamRating': steamRating})
            

score_list.sort(reverse=True)


print('--------- Highest Metacritic Scores ---------')
for i in score_list[:3]:
    for j in range(len(score_and_title)):
        if i == score_and_title[j]['value']:
            print(f'Title: {score_and_title[j]["title"]}, Score: {score_and_title[j]["value"]}')
            score_and_title[j]['value'] = 0
            
print('\n--------- Discount over 90 % ---------')
for i in discount_and_title:
    print(f'{i["title"]} | discount: {i["original"] - i["discount"]} €')
    
print('\n--------- MetaCritic over SteamRating ---------')
for i in meta_over_steam:
    print(f"{i['title']} has Metacritic score of '{i['metaCritic']}' and a Steam score of '{i['steamRating']}'")

--------- Highest Metacritic Scores ---------
Title: Star Wars: Knights of the Old Republic, Score: 93
Title: Metal Gear Solid V: The Phantom Pain, Score: 91
Title: Bayonetta, Score: 90

--------- Discount over 90 % ---------
Shadow Tactics: Blades of the Shogun | discount: 36.0 €
Airscape: The Fall of Gravity | discount: 4.5 €
Making History: The Calm and the Storm | discount: 4.5 €
Avencast: Rise of the Mage | discount: 9.0 €
Metal Gear Solid V: Ground Zeroes | discount: 18.0 €
The Way | discount: 13.5 €
Teslagrad | discount: 9.0 €
White Wings  | discount: 18.0 €
Phantaruk | discount: 4.5 €
Oozi Earth Adventure | discount: 4.5 €
Lucius | discount: 9.0 €
The Long Journey Home | discount: 18.0 €
NEON STRUCT | discount: 16.2 €
House of Caravan | discount: 4.5 €

--------- MetaCritic over SteamRating ---------
NBA 2K21 has Metacritic score of '67' and a Steam score of '39'
Commander 85 has Metacritic score of '45' and a Steam score of '35'
Inversion has Metacritic score of '59' and a Ste

2 Use earthquakes.csv for this task. This file contains information about earthquakes recorded between 1965 and 2016. The magnitude value describes how strong an earthquake is. Magnitudes can be categorized as shown in the table below (*Source: http://www.geo.mtu.edu/UPSeis/magnitude.html*).

| Magnitude      | Class | Effects |
|----------------|-------|---------|
| 2.5 or less    | Minor | Usually not felt, but can be recorded by seismograph. |
| 2.5 to 5.4     | Light | Often felt, but only causes minor damage. |
| 5.5 to 6.0     | Moderate | Slight damage to buildings and other structures. |
| 6.1 to 6.9     | Strong | May cause a lot of damage in very populated areas. |
| 7.0 to 7.9     | Major | Major earthquake. Serious damage. |
| 8.0 or greater | Great | Great earthquake. Can totally destroy communities near the epicenter. |

Count how many earthquakes have occurred in each class.
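
The table lists its bands with one decimal, so a value such as 6.05 sits between two rows. One possible reading, sketched below, treats each band as running up to the lower bound of the next one. This interpretation, the `classify` helper and the sample magnitudes are all assumptions made for illustration.

```python
from collections import Counter

# Exclusive upper bounds for the classes between 'Minor' and 'Great'.
# Assumption: each band extends up to the next band's lower bound.
UPPER_BOUNDS = [(5.5, 'Light'), (6.1, 'Moderate'), (7.0, 'Strong'), (8.0, 'Major')]


def classify(magnitude):
    """Map a magnitude to a class name from the table."""
    if magnitude <= 2.5:
        return 'Minor'
    for upper, name in UPPER_BOUNDS:
        if magnitude < upper:
            return name
    return 'Great'


# Made-up magnitudes for illustration, not rows from earthquakes.csv.
sample = [2.5, 5.4, 5.5, 6.05, 6.1, 7.95, 8.0]
counts = Counter(classify(m) for m in sample)
print(dict(counts))
```

With this reading every magnitude falls into exactly one class, and only values of 8.0 or more are counted as Great.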

In [5]:
import csv

with open('data_files/earthquakes.csv') as f:
    eq_csv = csv.reader(f)
    next(eq_csv)  # skip the header row
    eq_class_count = {'Minor': 0, 'Light': 0, 'Mod': 0, 'Strong': 0, 'Maj': 0, 'Great': 0}
    for quake in eq_csv:
        magnitude = float(quake[8])  # the magnitude is read from column index 8
        if magnitude <= 2.5:
            eq_class_count['Minor'] += 1
        elif magnitude <= 5.4:
            eq_class_count['Light'] += 1
        elif 5.5 <= magnitude <= 6.0:
            eq_class_count['Mod'] += 1
        elif 6.1 <= magnitude <= 6.9:
            eq_class_count['Strong'] += 1
        elif 7.0 <= magnitude <= 7.9:
            eq_class_count['Maj'] += 1
        else:
            # Reached for 8.0 and above, and also for any value that falls
            # between two of the ranges above (for example 6.05).
            eq_class_count['Great'] += 1

print(eq_class_count)

{'Minor': 0, 'Light': 0, 'Mod': 17638, 'Strong': 5035, 'Maj': 698, 'Great': 41}


3 Use netflix_titles.xml for this task. This file contains information about Netflix movies and TV shows. **Important:** Movie durations are given in minutes, while TV show durations are given as a number of seasons! Parse out the following information from the data:
- Movies released in 2017
- The number of TV shows and the number of movies (present the two counts on separate lines)
- Movies with a length between 15 and 20 minutes (values 15 and 20 included)
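
The three queries above can be sketched on a small inline document. The child tag names (`type`, `title`, `release_year`, `duration`) are the ones the solution code reads. The `catalog` and `entry` element names, the sample titles and the `18 min` duration format are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; it does not reproduce the contents of netflix_titles.xml.
SAMPLE = """
<catalog>
  <entry><type>Movie</type><title>Short Film</title><release_year>2017</release_year><duration>18 min</duration></entry>
  <entry><type>Movie</type><title>Long Film</title><release_year>2016</release_year><duration>95 min</duration></entry>
  <entry><type>TV Show</type><title>Some Series</title><release_year>2017</release_year><duration>2 Seasons</duration></entry>
</catalog>
""".strip()

root = ET.fromstring(SAMPLE)


def minutes(entry):
    """Return a movie's length in minutes from text such as '18 min'."""
    return int(entry.findtext('duration').split()[0])


# Split by type first, so that seasons are never interpreted as minutes.
movies = [entry for entry in root if entry.findtext('type') == 'Movie']
shows = [entry for entry in root if entry.findtext('type') == 'TV Show']

movies_2017 = [m.findtext('title') for m in movies if m.findtext('release_year') == '2017']
short_movies = [m.findtext('title') for m in movies if 15 <= minutes(m) <= 20]

print(f'Movies released in 2017: {movies_2017}')
print(f'TV-Shows: {len(shows)}\nMovies: {len(movies)}')
print(f'Movies between 15 and 20 minutes: {short_movies}')
```

Looking up children with `findtext` does not depend on the order in which the child elements appear inside an entry.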

In [6]:
import xml.etree.ElementTree as ET

tree = ET.parse('data_files/netflix_titles.xml')
root = tree.getroot()

movies_2017 = []
len_movie = []


tv_count = 0
mov_count = 0

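for i in root:
    # is_movie is set when the <type> child is seen, so the duration check
    # below relies on <type> appearing before <duration> in each entry.
    is_movie = False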
    for j in i:
        if j.tag == 'release_year' and j.text == '2017':
            movies_2017.append(i)
        if j.tag == 'type':
            if j.text == 'TV Show':
                tv_count += 1
            elif j.text == 'Movie':
                mov_count += 1
                is_movie = True
        if j.tag == 'duration' and is_movie:
            duration = int(j.text.split(' ')[0])
            if duration >= 15 and duration <= 20:
                len_movie.append(i)
        
new_movies_2017 = []
new_len_mov = []


# Collect the title of every entry released in 2017. No filtering by type
# happens here, so the list holds both movies and TV shows.
for i in movies_2017:
    for j in i:
        if j.tag == 'title':
            new_movies_2017.append(j.text)
            
            
print(f'\nMovies and TV-Shows released in 2017 | count: {len(new_movies_2017)}')
print('First ten movies released in 2017:')
for i in new_movies_2017[:10]:
    print(i)
# Since there would be so many of these, only the first few are printed

print(f'\nTV-Shows: {tv_count}\nMovies: {mov_count}')

print(f'\nMovies that are between 15 and 20 minutes: | count {len(len_movie)}')
for i in len_movie:
    for j in i:
        if j.tag == 'title':
            new_len_mov.append(j.text)
print(new_len_mov) 


Movies and TV-Shows released in 2017 | count: 1012
First ten movies released in 2017:
1922
'89
​Maj Rati ​​Keteki
​Mayurakshi
#realityhigh
Ég man þig
1 Mile to You
10 Days in Sun City
100% Hotter
12 ROUND GUN

TV-Shows: 2410
Movies: 5377

Movies that are between 15 and 20 minutes: | count 11
['A Love Song for Latasha', 'ANIMA', 'John Was Trying to Contact Aliens', 'Little Miss Sumo', 'Michael Lost and Found', 'Pocoyo & Cars', 'Sitara: Let Girls Dream', 'The Battle of Midway', 'The Claudia Kishi Club', 'WHAT DID JACK DO?', 'Ya no estoy aquí: Una conversación entre Guillermo del Toro y Alfonso Cuarón']


4 Use the following REST API for this task: https://tie.digitraffic.fi/api/v1/data/weather-data. Calculate the averages of the air temperature (ILMA) and humidity (ILMAN_KOSTEUS) values, rounded to two decimals.
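
The averaging step can be tried without a network call. The sketch below runs on a hand-written dictionary that mimics the keys the solution code reads (`weatherStations`, `sensorValues`, `name`, `sensorValue`). The sample values and the `sensor_values` helper are assumptions made for illustration.

```python
from statistics import mean

# Hypothetical response fragment; the numbers are made up.
data = {
    'weatherStations': [
        {'sensorValues': [{'name': 'ILMA', 'sensorValue': -1.5},
                          {'name': 'ILMAN_KOSTEUS', 'sensorValue': 80}]},
        {'sensorValues': [{'name': 'ILMA', 'sensorValue': 2.5},
                          {'name': 'ILMAN_KOSTEUS', 'sensorValue': 91}]},
    ]
}


def sensor_values(data, sensor_name):
    """Collect every reading of one sensor across all stations."""
    return [sensor['sensorValue']
            for station in data['weatherStations']
            for sensor in station['sensorValues']
            if sensor['name'] == sensor_name]


air = mean(sensor_values(data, 'ILMA'))
humidity = mean(sensor_values(data, 'ILMAN_KOSTEUS'))

# ':.2f' always prints two decimals, whereas round() drops a trailing zero.
print(f'AVG of air temp: {air:.2f}')
print(f'AVG of humidity: {humidity:.2f}')
```

Note that `mean` raises `statistics.StatisticsError` on an empty list, so a response without any matching readings would need separate handling.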

In [7]:
import requests

url = "https://tie.digitraffic.fi/api/v1/data/weather-data"

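# Use a timeout and fail early on HTTP errors instead of parsing an error page as JSON.
req = requests.get(url=url, timeout=30)
req.raise_for_status()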

sum_of_air = 0
count_of_air = 0

sum_of_hum = 0
count_of_hum = 0

data = req.json()
for i in data['weatherStations']:
    for sensorvalues in i['sensorValues']:
        if sensorvalues['name'] == 'ILMA':
            sum_of_air += sensorvalues['sensorValue']
            count_of_air += 1
        elif sensorvalues['name'] == 'ILMAN_KOSTEUS':
            sum_of_hum += sensorvalues['sensorValue']
            count_of_hum += 1
        
print(f'AVG of air temp: {round(sum_of_air/count_of_air, 2)}')
print(f'AVG of humidity: {round(sum_of_hum/count_of_hum, 2)}')

AVG of air temp: -19.51
AVG of humidity: 81.9


5 Use the following REST API for this task: https://open-api.myhelsinki.fi/v1/events/. This API provides information about events held in the Helsinki city area. List all music event names in Finnish (tip: the event tag must contain the string "music"). **Important:** Each event name should only be listed once!
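
The two requirements in the task, a tag that contains "music" and each name listed once, can be sketched on hand-written data. The event dictionaries below only mimic the keys the solution code reads (`tags`, `name`, `fi`). The event names, the tag names and the `has_music_tag` helper are assumptions made for illustration.

```python
# Hypothetical events; they are not taken from the API response.
events = [
    {'name': {'fi': 'Pianokonsertti'}, 'tags': [{'name': 'music'}]},
    {'name': {'fi': 'Pianokonsertti'}, 'tags': [{'name': 'live music'}]},
    {'name': {'fi': 'Kirjailijavierailu'}, 'tags': [{'name': 'literature'}]},
    {'name': {'fi': 'Runolaulujamit'}, 'tags': [{'name': 'folk music'}, {'name': 'music'}]},
]


def has_music_tag(event):
    """True if any tag name contains the substring 'music'."""
    return any('music' in tag['name'] for tag in event['tags'])


# dict.fromkeys keeps the first occurrence of each name and preserves order.
music_events = list(dict.fromkeys(
    event['name']['fi'] for event in events if has_music_tag(event)
))
print(music_events)
```

A substring test also matches tags such as the made-up `live music`, which an exact comparison with `'music'` would skip.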

In [8]:
import requests

url = "https://open-api.myhelsinki.fi/v1/events/"

req = requests.get(url=url)

music_events = []

data = req.json()['data']
for i in data:
    for j in i['tags']:
        if j['name'] == 'music':
            if i['name']['fi'] not in music_events:
                music_events.append(i['name']['fi'])
            break
print(music_events)

['Kitararyhmä', 'Harrastelijasoittajien jamit – avoin ryhmä', 'Avoin karaokeryhmä', 'Kibecon 2022', 'Pekka Streng -ilta', 'Musiikin teematiistai: kuuntelupiiri 2', 'Ukuleleryhmien konsertti', 'Bändi "Staroe kino" Oodissa', 'Mimmit Töölön kirjastossa', 'PERUTTU Joulukaraoke Ison Omenan kirjastossa', 'Joulukaraoke Sellon kirjastossa', 'Hyksetin joulukonsertti', 'Musiikkisatu', 'Raoul Björkenheim Trio Maagisessa paikassa', 'OodiSoi! ArtturiW', 'Joululauluja Herttoniemen kirjastossa', 'Café Barock: Vanhan musiikin uudet toivot - lounaskonsertti Kirjatornissa', 'Musiikin teematiistai: soitinesittely', 'Rihmasto - Äänen juurella', 'Töölön musiikkiopiston konsertti', 'Latinolaulujen ilta - Vamos a cantar!', 'KITKA KOLLEKTIIVI: Same same but different -Nykytanssiteos', 'Pianokonsertti', 'Runolaulujamit', 'Populaarimusiikin lukupiiri', 'Kohti joulua: joululauluja Suomesta ja maailmalta', 'Philomelan lauluyhtye Maunulan kirjastossa', 'Pamelan svengaavat swingit', 'Hetkiä helmien nauha', 'Ukulele