## Exercise 3 - Data sources
- All files used in this exercise can be found under the Exercises/data_files directory

1 Use gamedata.json for this task. This file contains information about games sold through Steam. Parse out the following information from the data (Important: do not combine these filters; apply each one separately):
- The top 3 games by Metacritic score. Present results using the following format: *Title* has metacritic score of *Score* (for example)
- Games with a price discount of 90 % or more. Present results using the following format: *Title* | Discount: *Savings* (for example Metal Gear Solid V: Ground Zeroes | Discount: 90.090090)
- Games with a Metacritic score higher than their Steam score. Present results using the following format: *Title* has metacritic score of *MetacriticScore* and steam score of *SteamRatingPercent*

In [10]:
# TOP 3 highest metacritic score
import json

with open('./data_files/gamedata.json', 'r') as f:
    games = json.load(f)

top_3_games = sorted(
    games,
    key=lambda i: int(i["metacriticScore"]),
    reverse=True
)[:3]

for game in top_3_games:
    print(f"{game['title']} has metacritic score of {game['metacriticScore']}")






Star Wars: Knights of the Old Republic has metacritic score of 93
Metal Gear Solid V: The Phantom Pain has metacritic score of 91
Bayonetta has metacritic score of 90
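The same top-N selection can also be sketched with `heapq.nlargest`, which avoids sorting the whole list when only a few entries are needed. The records below are made-up sample data, and `top_by_metacritic` is a hypothetical helper name; only the `title` and `metacriticScore` fields mirror the real file.

```python
import heapq

# Hypothetical sample records; only the two fields used by the solution are mirrored.
sample_games = [
    {"title": "Game A", "metacriticScore": "75"},
    {"title": "Game B", "metacriticScore": "93"},
    {"title": "Game C", "metacriticScore": "88"},
    {"title": "Game D", "metacriticScore": "91"},
]


def top_by_metacritic(games, n=3):
    """Return the n records with the highest integer metacritic score."""
    return heapq.nlargest(n, games, key=lambda g: int(g["metacriticScore"]))


top_titles = [g["title"] for g in top_by_metacritic(sample_games)]
print(top_titles)
```

For a file of this size the difference to `sorted(...)[:3]` is negligible, so either form is reasonable here.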


In [11]:
# Games with price discount being 90 % or more

for game in games:
    if float(game['savings']) >= 90:
        print(f"{game['title']} | Discount: {game['savings']}") 


Shadow Tactics: Blades of the Shogun | Discount: 90.022506
Airscape: The Fall of Gravity | Discount: 90.180361
Making History: The Calm and the Storm | Discount: 90.180361
Avencast: Rise of the Mage | Discount: 90.090090
Metal Gear Solid V: Ground Zeroes | Discount: 90.045023
The Way | Discount: 90.060040
Teslagrad | Discount: 90.090090
White Wings  | Discount: 90.045023
Phantaruk | Discount: 90.180361
Oozi Earth Adventure | Discount: 90.180361
Lucius | Discount: 90.090090
The Long Journey Home | Discount: 90.045023
NEON STRUCT | Discount: 90.050028
House of Caravan | Discount: 90.180361


In [17]:
# Games having metacritic score higher than steam score

for game in games:
    if float(game['metacriticScore']) > float(game['steamRatingPercent']):
        print(f"{game['title']} has metacritic score of {game['metacriticScore']} and steam score of {game['steamRatingPercent']}")

NBA 2K21 has metacritic score of 67 and steam score of 39
Commander 85 has metacritic score of 45 and steam score of 35
Inversion has metacritic score of 59 and steam score of 57
Bionic Commando: Rearmed has metacritic score of 86 and steam score of 71
Metal Gear Solid V: The Phantom Pain has metacritic score of 91 and steam score of 90
Port Royale 2 has metacritic score of 75 and steam score of 68
Project Cars 2 has metacritic score of 84 and steam score of 79
Full Spectrum Warrior has metacritic score of 80 and steam score of 65
The Long Journey Home has metacritic score of 68 and steam score of 60
Star Wars: Knights of the Old Republic has metacritic score of 93 and steam score of 90
Starpoint Gemini Warlords has metacritic score of 73 and steam score of 72
Tidalis has metacritic score of 75 and steam score of 70


2 Use earthquakes.csv for this task. This file contains information about earthquakes recorded between 1965 and 2016. The magnitude value describes how strong an earthquake is. Magnitudes can be categorized as shown in the table below (*Source: http://www.geo.mtu.edu/UPSeis/magnitude.html*).

| Magnitude       | Class | Effects |
|-----------------|-------|---------|
| 2.49 or less    | Minor | Usually not felt, but can be recorded by seismograph. |
| 2.50 to 5.49    | Light | Often felt, but only causes minor damage. |
| 5.50 to 6.09    | Moderate | Slight damage to buildings and other structures. |
| 6.10 to 6.99    | Strong | May cause a lot of damage in very populated areas. |
| 7.00 to 7.99    | Major | Major earthquake. Serious damage. |
| 8.00 or greater | Great | Great earthquake. Can totally destroy communities near the epicenter. |

Count how many earthquakes have occurred in each class.

<b style="color:red;">Notice:</b> The upper bound of the first class has been changed to 2.49 or less; the original source uses 2.5 or less.

In [73]:
import csv

with open('data_files/earthquakes.csv', 'r', newline='') as f:
    rows = list(csv.reader(f, delimiter=","))

# Locate the magnitude column from the header row
titles = rows[0]
magnitude_idx = titles.index('Magnitude')

counts = {
    "Minor": 0,
    "Light": 0,
    "Moderate": 0,
    "Strong": 0,
    "Major": 0,
    "Great": 0
}

for row in rows[1:]:
    magnitude = float(row[magnitude_idx])

    # Test upper bounds in ascending order so that no value
    # (for example 5.495) can fall between two classes
    if magnitude < 2.50:
        counts['Minor'] += 1
    elif magnitude < 5.50:
        counts['Light'] += 1
    elif magnitude < 6.10:
        counts['Moderate'] += 1
    elif magnitude < 7.00:
        counts['Strong'] += 1
    elif magnitude < 8.00:
        counts['Major'] += 1
    else:
        counts['Great'] += 1

print(counts)



{'Minor': 0, 'Light': 0, 'Moderate': 17639, 'Strong': 5035, 'Major': 698, 'Great': 40}
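As an alternative to a chain of comparisons, the class lookup can be sketched with `bisect`, using the lower bound of each class from the table. The magnitudes below are made-up sample values, not rows from earthquakes.csv, and `classify` is a hypothetical helper name.

```python
from bisect import bisect_right
from collections import Counter

# Lower bounds of every class after "Minor", taken from the table in this exercise.
LOWER_BOUNDS = [2.5, 5.5, 6.1, 7.0, 8.0]
CLASSES = ["Minor", "Light", "Moderate", "Strong", "Major", "Great"]


def classify(magnitude):
    """Map a magnitude to its class by counting how many lower bounds it reaches."""
    return CLASSES[bisect_right(LOWER_BOUNDS, magnitude)]


# Made-up magnitudes for illustration only.
sample_magnitudes = [2.0, 5.5, 6.0, 6.1, 7.3, 8.2]
class_counts = Counter(classify(m) for m in sample_magnitudes)
print(dict(class_counts))
```

Keeping the bounds in one list means a changed threshold only has to be edited in one place.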


3 Use netflix_titles.xml for this task. This file contains information about Netflix movies and TV shows. **Important:** Movie durations are given in minutes, while TV show durations are given as a number of seasons! Parse out the following information from the data and **show only counts** for these (how many instances are returned):
- Movies released in 2017
- Number of TV shows and number of movies (present the two counts on separate lines)
- Movies with a length between 15 and 20 minutes (values 15 and 20 included)

In [54]:
# Movies released in 2017
import xml.etree.ElementTree as et
tree = et.parse("data_files/netflix_titles.xml")
root = tree.getroot()


count_2017 = 0
for row in root:
    media_type = row.find('type').text
    release_year = row.find('release_year').text

    if media_type == "Movie" and release_year == "2017":
        count_2017 += 1

print(f"{count_2017} movies released in 2017")

744 movies released in 2017


In [None]:
# TV show and movie counts

show_count = 0
movie_count = 0

for item in root:
    # Read the <type> child directly instead of scanning every child element
    media_type = item.findtext('type')
    if media_type == 'Movie':
        movie_count += 1
    elif media_type == 'TV Show':
        show_count += 1

print(f"The Movie Count: {movie_count}")
print(f"The Show Count: {show_count}")

The Movie Count: 5377
The Show Count: 2410


In [68]:
# Movies with a length between 15 and 20 minutes

movies = []

for item in root:  
    is_movie = False
    duration = ""
    
    for detail in item: 
        if detail.tag == 'type' and detail.text == 'Movie':
            is_movie = True
        if detail.tag == 'duration':
            duration = detail.text
    
    if is_movie:
        movies.append(duration)  


count = 0
for movie_duration in movies:
    minutes = int(movie_duration.split(' ')[0])  
    if 15 <= minutes <= 20:
        count += 1

print(f"The count of movies duration 15-20 min: {count} ")   


The count of movies duration 15-20 min: 11 
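The filters in this task can also be sketched with the limited XPath predicates that `xml.etree.ElementTree` supports. The inline XML below is made-up sample data, and the element name `row` is an assumption; only the child tags `type`, `release_year` and `duration` are taken from the cells above.

```python
import xml.etree.ElementTree as et

# Made-up sample document; the <row> element name is an assumption.
SAMPLE_XML = """
<root>
  <row><type>Movie</type><release_year>2017</release_year><duration>18 min</duration></row>
  <row><type>Movie</type><release_year>2016</release_year><duration>95 min</duration></row>
  <row><type>TV Show</type><release_year>2017</release_year><duration>2 Seasons</duration></row>
  <row><type>Movie</type><release_year>2017</release_year><duration>20 min</duration></row>
</root>
""".strip()

sample_root = et.fromstring(SAMPLE_XML)

# [tag='text'] keeps elements whose child <tag> has exactly that text.
movies_2017 = sample_root.findall("row[type='Movie'][release_year='2017']")

# Durations need numeric comparison, so that part stays in Python.
short_movies = [
    row for row in sample_root.findall("row[type='Movie']")
    if 15 <= int(row.findtext("duration").split()[0]) <= 20
]

print(len(movies_2017), len(short_movies))
```

Filtering on `type` first matters for the duration check, because TV show durations such as "2 Seasons" would otherwise be read as minutes.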


4 Use the following REST API for this task: https://tie.digitraffic.fi/api/weather/v1/stations/data. Calculate the averages of the air temperature (ILMA) and humidity (ILMAN_KOSTEUS) values, rounded to two decimals.

In [71]:
import requests

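url = 'https://tie.digitraffic.fi/api/weather/v1/stations/data'

# Fail early on HTTP errors instead of trying to parse an error body
response = requests.get(url, timeout=30)
response.raise_for_status()

data = response.json()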

temperatures = []
humidity = []

for station in data['stations']:
    for sensor in station['sensorValues']:
        if sensor['name'] == 'ILMA':
            temperatures.append(sensor['value'])
        elif sensor['name'] == 'ILMAN_KOSTEUS':
            humidity.append(sensor['value'])

tmp_avg = sum(temperatures) / len(temperatures)
humid_avg = sum(humidity) / len(humidity)

print(f"Average temperature: {tmp_avg:.2f}°C")
print(f"Average humidity: {humid_avg:.2f}%")

Average temperature: -9.81°C
Average humidity: 87.81%
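The averaging step can be sketched offline on a small payload that mirrors the shape used above (`stations`, then `sensorValues`, each with `name` and `value`). The numbers, the sensor name `TIE_1` and the helper `sensor_mean` are made up for illustration; the sketch also returns `None` instead of dividing by zero when no station reports the requested sensor.

```python
from statistics import fmean

# Made-up payload in the same shape as the API response used above.
sample_payload = {
    "stations": [
        {"sensorValues": [{"name": "ILMA", "value": -10.0},
                          {"name": "ILMAN_KOSTEUS", "value": 90.0}]},
        {"sensorValues": [{"name": "ILMA", "value": -8.0},
                          {"name": "ILMAN_KOSTEUS", "value": 85.0}]},
        # Hypothetical station that reports neither of the two sensors.
        {"sensorValues": [{"name": "TIE_1", "value": 1.5}]},
    ]
}


def sensor_mean(payload, sensor_name):
    """Average all values of one sensor across stations, or None if there are none."""
    values = [
        sensor["value"]
        for station in payload["stations"]
        for sensor in station["sensorValues"]
        if sensor["name"] == sensor_name
    ]
    return fmean(values) if values else None


print(f"Average temperature: {sensor_mean(sample_payload, 'ILMA'):.2f}°C")
print(f"Average humidity: {sensor_mean(sample_payload, 'ILMAN_KOSTEUS'):.2f}%")
```

Because the live API returns current measurements, the averages printed by the real cell change from run to run.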


5 Use the following REST API for this task: https://api.hel.fi/linkedevents/v1/place/. Fetch the data from **the first page** into this notebook. Then filter the data so that only places with a postal code of 00100 or 00900 are included. Finally, present the Finnish name and street address of those places. For example: *Stoa | Turunlinnantie 1*.

In [72]:

url = 'https://api.hel.fi/linkedevents/v1/place/'
response = requests.get(url, timeout=30)
response.raise_for_status()
data = response.json()

# Only the first page is needed, so pagination links are not followed
for place in data['data']:
    if place['postal_code'] in ("00100", "00900"):
        finnish_name = place['name']['fi']
        street_address = place['street_address']['fi']

        print(f"{finnish_name} | {street_address}")


Keskustakirjasto Oodi | Töölönlahdenkatu 4
Kampin palvelukeskus | Salomonkatu 21 B
Itäkeskuksen kirjasto | Turunlinnantie 1
Kampin liikuntakeskus | Malminkatu 3
Stoa | Turunlinnantie 1
Suomen Kansallisteatteri | Läntinen Teatterikuja 1
Annantalo | Annankatu 30
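The filter above indexes straight into `name` and `street_address`. A more defensive variant is sketched below on made-up records; it assumes, without confirmation from the API documentation, that some places may have a missing postal code, address or Finnish translation. `format_places` and `WANTED_POSTAL_CODES` are hypothetical names.

```python
# Made-up records in the same shape as the entries in data['data'].
sample_places = [
    {"postal_code": "00100", "name": {"fi": "Paikka A"}, "street_address": {"fi": "Katu 1"}},
    {"postal_code": "00900", "name": {"fi": "Paikka B"}, "street_address": None},
    {"postal_code": "00200", "name": {"fi": "Paikka C"}, "street_address": {"fi": "Katu 3"}},
    {"postal_code": None, "name": {"sv": "Plats D"}, "street_address": {"fi": "Katu 4"}},
]

WANTED_POSTAL_CODES = {"00100", "00900"}


def format_places(places):
    """Return 'name | address' lines for wanted postal codes, skipping incomplete records."""
    lines = []
    for place in places:
        if place.get("postal_code") not in WANTED_POSTAL_CODES:
            continue
        # `or {}` covers both a missing key and an explicit None value.
        name = (place.get("name") or {}).get("fi")
        address = (place.get("street_address") or {}).get("fi")
        if name and address:
            lines.append(f"{name} | {address}")
    return lines


for line in format_places(sample_places):
    print(line)
```

Skipping incomplete records silently is one possible choice; printing a placeholder for the missing field would be another.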
