## Exercise 3 - Data sources
- All files used in this exercise can be found under the Exercises/data_files directory

1 Use gamedata.json for this task. This file contains information about games sold through Steam. Parse out the following information from the data:
- The three games with the highest Metacritic score. Present results using the following format: *Title* has metacritic score of *Score*
- Games with a price discount of 90 % or more. Present results using the following format: *Title* | Discount: *Savings* (for example, Metal Gear Solid V: Ground Zeroes | Discount: 90.090090)
- Games whose Metacritic score is higher than their Steam score. Present results using the following format: *Title* has metacritic score of *MetacriticScore* and steam score of *SteamRatingPercent*

In [25]:
import json

with open('data_files/gamedata.json', encoding='utf-8') as gamedata:
    games = json.load(gamedata)

# Convert the score to int so the sort is numeric rather than text-based
sorted_games = sorted(games, key=lambda x: int(x["metacriticScore"]), reverse=True)
top_3 = sorted_games[:3]

for game in top_3:
    print(f"{game['title']} has metacritic score of {game['metacriticScore']}")

print("\n-------------------------------------------------------------\n")

# Keep games whose discount is 90 % or more
discounted_games = [game for game in games if float(game["savings"]) >= 90.0]
for game in discounted_games:
    print(f"{game['title']} | Discount: {game['savings']}")

print("\n-------------------------------------------------------------\n")

# Keep games rated higher on Metacritic than on Steam
mc_higher_steam = [game for game in games if int(game["metacriticScore"]) > int(game["steamRatingPercent"])]
for game in mc_higher_steam:
    print(f"{game['title']} has metacritic score of {game['metacriticScore']} and steam score of {game['steamRatingPercent']}")
    


Star Wars: Knights of the Old Republic has metacritic score of 93
Metal Gear Solid V: The Phantom Pain has metacritic score of 91
Bayonetta has metacritic score of 90

-------------------------------------------------------------

Shadow Tactics: Blades of the Shogun | Discount: 90.022506
Airscape: The Fall of Gravity | Discount: 90.180361
Making History: The Calm and the Storm | Discount: 90.180361
Avencast: Rise of the Mage | Discount: 90.090090
Metal Gear Solid V: Ground Zeroes | Discount: 90.045023
The Way | Discount: 90.060040
Teslagrad | Discount: 90.090090
White Wings  | Discount: 90.045023
Phantaruk | Discount: 90.180361
Oozi Earth Adventure | Discount: 90.180361
Lucius | Discount: 90.090090
The Long Journey Home | Discount: 90.045023
NEON STRUCT | Discount: 90.050028
House of Caravan | Discount: 90.180361

-------------------------------------------------------------

NBA 2K21 has metacritic score of 67 and steam score of 39
Commander 85 has metacritic score of 45 and steam sc
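
The filters above cast `savings` and the score fields with `float()` and `int()`, and printed discounts such as 90.090090 keep their trailing zeros, which suggests (as an assumption, not something the file confirms here) that gamedata.json stores these numbers as strings. The sketch below uses made-up records to show why the cast matters when sorting; the titles and scores are hypothetical.

```python
# Hypothetical records; the real gamedata.json has more fields per game.
sample_games = [
    {"title": "Game A", "metacriticScore": "93"},
    {"title": "Game B", "metacriticScore": "100"},
    {"title": "Game C", "metacriticScore": "9"},
]

# Sorting the raw strings compares them character by character.
by_text = sorted(sample_games, key=lambda g: g["metacriticScore"], reverse=True)

# Converting to int compares them by numeric value.
by_number = sorted(sample_games, key=lambda g: int(g["metacriticScore"]), reverse=True)

text_order = [g["title"] for g in by_text]
number_order = [g["title"] for g in by_number]
print(text_order)
print(number_order)
```

With string keys the score "100" sorts below "9", so a top 3 taken from the text-based order could miss the highest-scoring game.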

2 Use earthquakes.csv for this task. This file contains information about earthquakes recorded between 1965 and 2016. The magnitude value describes how strong an earthquake is. Magnitudes can be grouped into the classes shown in the table below (*Source: http://www.geo.mtu.edu/UPSeis/magnitude.html*).

| Magnitude      | Class | Effects |
|----------------|-------|---------|
| 2.4 or less    | Minor | Usually not felt, but can be recorded by seismograph. |
| 2.5 to 5.4     | Light | Often felt, but only causes minor damage. |
| 5.5 to 6.0     | Moderate | Slight damage to buildings and other structures. |
| 6.1 to 6.9     | Strong | May cause a lot of damage in very populated areas. |
| 7.0 to 7.9     | Major | Major earthquake. Serious damage. |
| 8.0 or greater | Great | Great earthquake. Can totally destroy communities near the epicenter. |

Count how many earthquakes have occurred in each class.

<b style="color:red;">Notice:</b> The first class has been changed to "2.4 or less"; the original source lists it as "2.5 or less".

In [93]:
import csv

# One counter per magnitude class from the table above
minor = 0
light = 0
moderate = 0
strong = 0
major = 0
great = 0

with open("data_files/earthquakes.csv", "r", newline="") as earthquakes:
    information = csv.reader(earthquakes)

    # Skip the header row
    next(information)

    for row in information:
        # Column index 8 holds the magnitude
        magnitude = float(row[8])
        # The ranges follow the table literally, so a value that falls
        # between two classes (for example 6.05) is not counted
        if magnitude <= 2.4:
            minor += 1
        elif 2.5 <= magnitude <= 5.4:
            light += 1
        elif 5.5 <= magnitude <= 6.0:
            moderate += 1
        elif 6.1 <= magnitude <= 6.9:
            strong += 1
        elif 7.0 <= magnitude <= 7.9:
            major += 1
        elif magnitude >= 8.0:
            great += 1

print("Minor: ", minor)
print("Light: ", light)
print("Moderate: ", moderate)
print("Strong: ", strong)
print("Major: ", major)
print("Great: ", great)


Minor:  0
Light:  0
Moderate:  17638
Strong:  5035
Major:  698
Great:  40
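
As an alternative to a chain of range checks, the table can be treated as a list of lower bounds and searched with `bisect`. This is only a sketch: `classify`, `LOWER_BOUNDS`, and the sample magnitudes are hypothetical names and values, not part of the exercise data. Note that it assigns a value between two table rows (such as 6.05) to the lower class, which is an assumption about how to read the table.

```python
from bisect import bisect_right
from collections import Counter

# Lower bounds of every class after Minor, taken from the table.
LOWER_BOUNDS = [2.5, 5.5, 6.1, 7.0, 8.0]
CLASSES = ["Minor", "Light", "Moderate", "Strong", "Major", "Great"]

def classify(magnitude: float) -> str:
    """Map a magnitude to its class by locating it among the lower bounds."""
    return CLASSES[bisect_right(LOWER_BOUNDS, magnitude)]

# Hypothetical magnitudes, not taken from earthquakes.csv.
sample = [2.4, 2.5, 5.6, 6.0, 6.05, 6.1, 7.3, 8.2]
counts = Counter(classify(m) for m in sample)
for name in CLASSES:
    print(f"{name}: {counts[name]}")
```

Because every magnitude lands in exactly one class, the counts always add up to the number of rows, which makes a quick sanity check possible.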


3 Use netflix_titles.xml for this task. This file contains information about Netflix movies and TV shows. **Important:** The duration of a movie is given in minutes, while the duration of a TV show is given as a number of seasons! Parse out the following information from the data:
- Movies released in 2017
- The number of TV shows and the number of movies (print each count on its own line)
- Movies with a length between 15 and 20 minutes (values 15 and 20 included)

In [88]:
import xml.etree.ElementTree as ET

tree = ET.parse("data_files/netflix_titles.xml")
root = tree.getroot()

# Convert each child of the root into a dict that maps tag names to text
information = [{field.tag: field.text for field in elem} for elem in root]

movie_count = 0
tv_show_count = 0
movies_2017 = []
short_movies = []

for row in information:
    if row['type'] == 'Movie':
        movie_count += 1
        if row['release_year'] == '2017':
            movies_2017.append(row)
        # Movie durations are given in minutes; TV shows list seasons instead
        duration = row['duration']
        if 'min' in duration:
            minutes = int(duration.split(' ')[0])
            if 15 <= minutes <= 20:
                short_movies.append(row)
    elif row['type'] == 'TV Show':
        tv_show_count += 1

print(f"Number of movies: {movie_count}")
print(f"Number of TV shows: {tv_show_count}")
print(f"Number of movies released in 2017: {len(movies_2017)}")
print(f"Number of movies with duration between 15 and 20 minutes: {len(short_movies)}")


Number of movies: 5377
Number of TV shows: 2410
Number of movies released in 2017: 744
Number of movies with duration between 15 and 20 minutes: 11
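
The duration rule is the part of this task that is easiest to get wrong, so here is a small self-contained sketch of it. The inline XML is a hypothetical sample: the root and row element names, the `title` field, and the exact duration strings are assumptions; only `type`, `release_year`, and `duration` appear in the solution above. `movie_minutes` is likewise a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the real netflix_titles.xml may be structured differently.
SAMPLE_XML = """
<titles>
  <row><type>Movie</type><title>Short Film</title><release_year>2017</release_year><duration>18 min</duration></row>
  <row><type>TV Show</type><title>Some Series</title><release_year>2019</release_year><duration>2 Seasons</duration></row>
  <row><type>Movie</type><title>Feature Film</title><release_year>2017</release_year><duration>95 min</duration></row>
</titles>
""".strip()

def movie_minutes(record):
    """Return a movie's length in minutes, or None for TV shows and missing values."""
    duration = record.get("duration") or ""
    if record.get("type") != "Movie" or not duration.endswith("min"):
        return None
    return int(duration.split()[0])

root = ET.fromstring(SAMPLE_XML)
records = [{child.tag: child.text for child in row} for row in root]

short_titles = []
for record in records:
    minutes = movie_minutes(record)
    # Both 15 and 20 are included in the range.
    if minutes is not None and 15 <= minutes <= 20:
        short_titles.append(record["title"])

print(short_titles)
```

Returning `None` for anything that is not a movie with a minute value keeps season counts from being mistaken for minutes.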


4 Use the following REST API for this task: https://tie.digitraffic.fi/api/v1/data/weather-data. Calculate the average air temperature (ILMA) and the average humidity (ILMAN_KOSTEUS), rounded to two decimals.

In [91]:
import requests

# Fetch the latest weather data; fail early on HTTP errors or a hung connection
response = requests.get('https://tie.digitraffic.fi/api/v1/data/weather-data', timeout=30)
response.raise_for_status()
data = response.json()

# Collect the readings of the two sensors of interest from every station
air_temperature_values = []
humidity_values = []
for station in data['weatherStations']:
    for sensor in station['sensorValues']:
        if sensor['name'] == 'ILMA':
            air_temperature_values.append(sensor['sensorValue'])
        elif sensor['name'] == 'ILMAN_KOSTEUS':
            humidity_values.append(sensor['sensorValue'])

# Average each list and round to two decimals
air_temperature_average = round(sum(air_temperature_values) / len(air_temperature_values), 2)
humidity_average = round(sum(humidity_values) / len(humidity_values), 2)

print(f'Average air temperature: {air_temperature_average}C')
print(f'Average humidity: {humidity_average}%')


Average air temperature: 2.54C
Average humidity: 39.04%
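
Because the API returns live data, the averages change from run to run. The averaging logic can be checked offline with a hand-made payload, as in the sketch below. `sample_payload` mirrors only the keys the solution reads (`weatherStations`, `sensorValues`, `name`, `sensorValue`); its values, the `TUULI` sensor entry, and the `sensor_average` helper are hypothetical.

```python
from statistics import mean

# Hypothetical payload; the real response contains many more fields.
sample_payload = {
    "weatherStations": [
        {"sensorValues": [
            {"name": "ILMA", "sensorValue": 2.0},
            {"name": "ILMAN_KOSTEUS", "sensorValue": 80.0},
        ]},
        {"sensorValues": [
            {"name": "ILMA", "sensorValue": 3.5},
            {"name": "TUULI", "sensorValue": 4.1},  # unrelated sensor, ignored
        ]},
    ]
}

def sensor_average(payload, sensor_name):
    """Average one sensor over all stations, or None if no station reports it."""
    values = [
        sensor["sensorValue"]
        for station in payload["weatherStations"]
        for sensor in station["sensorValues"]
        if sensor["name"] == sensor_name
    ]
    return round(mean(values), 2) if values else None

print(sensor_average(sample_payload, "ILMA"))
print(sensor_average(sample_payload, "ILMAN_KOSTEUS"))
```

Returning `None` when no readings exist avoids the division by zero that a plain `sum(...) / len(...)` would raise on an empty list.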
