1 Use gamedata.json for this task. This file contains information about games sold through Steam. Parse out the following information from the data:
- The 3 games with the highest metacritic score. Present results using the following format: *Title* has metacritic score of *Score* (for example)
- Games with a price discount of 90 % or more. Present results using the following format: *Title* | Discount: *Savings* (for example Metal Gear Solid V: Ground Zeroes | Discount: 90.090090)
- Games with a metacritic score higher than their steam score. Present results using the following format: *Title* has metacritic score of *MetacriticScore* and steam score of *SteamRatingPercent*

In [6]:
import json

# Load the game data as a list of dictionaries. The path follows the
# ./data_files layout that the other exercises use.
with open("./data_files/gamedata.json", encoding="utf-8") as f:
    gamedata = json.load(f)

# Function to get metacritic score as a number (it may be stored as a string)
def get_metacritic_score(game):
    return int(game.get("metacriticScore"))

# Function to get steam score as a number
def get_steam_score(game):
    return int(game.get("steamRatingPercent"))

# Function to get game discount
def get_discount(game):
    return float(game.get("savings"))

# Sort a copy of the game data by metacritic score in descending order
topgames = sorted(gamedata, key=get_metacritic_score, reverse=True)

# Top 3 games with highest metacritic score
print("Top 3 Games with the Highest Metacritic Score:")
for game in topgames[:3]:
    print(f"{game.get('title')} has metacritic score of {get_metacritic_score(game)}")

print("\n")

# Games with 90% discount or more
print("Games with a discount of 90% or more:")
for game in gamedata:
    discount = get_discount(game)
    if discount >= 90:
        print(f"{game.get('title')} | Discount: {discount:f}")

print("\n")

# Games with higher metacritic score than Steam score
print("Games with a Higher Metacritic Score than Steam Score:")
for game in gamedata:
    steam_score = get_steam_score(game)
    metacritic_score = get_metacritic_score(game)
    if metacritic_score > steam_score:
        print(f"{game.get('title')} has metacritic score of {metacritic_score} and steam score of {steam_score}")
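
The three queries in this task can also be sketched with a pandas DataFrame. This is a minimal illustration, not a run against gamedata.json: the records in `sample_games` are made up, and it assumes the real file uses the same keys (`title`, `metacriticScore`, `steamRatingPercent`, `savings`) and may store the numbers as strings.

```python
import pandas as pd

# Hypothetical records for illustration; not taken from gamedata.json
sample_games = [
    {"title": "Game A", "metacriticScore": "91", "steamRatingPercent": "88", "savings": "90.090090"},
    {"title": "Game B", "metacriticScore": "75", "steamRatingPercent": "93", "savings": "50.000000"},
    {"title": "Game C", "metacriticScore": "100", "steamRatingPercent": "97", "savings": "95.000000"},
    {"title": "Game D", "metacriticScore": "64", "steamRatingPercent": "60", "savings": "10.000000"},
]

games = pd.DataFrame(sample_games)

# Convert to numbers so that "100" ranks above "91" instead of below it
numeric_columns = ["metacriticScore", "steamRatingPercent", "savings"]
games[numeric_columns] = games[numeric_columns].apply(pd.to_numeric)

# One expression per query
top3 = games.nlargest(3, "metacriticScore")
discounted = games[games["savings"] >= 90]
critics_favourites = games[games["metacriticScore"] > games["steamRatingPercent"]]

for row in top3.itertuples():
    print(f"{row.title} has metacritic score of {row.metacriticScore}")
for row in discounted.itertuples():
    print(f"{row.title} | Discount: {row.savings:f}")
for row in critics_favourites.itertuples():
    print(f"{row.title} has metacritic score of {row.metacriticScore} and steam score of {row.steamRatingPercent}")
```

Converting the columns first matters because comparing the scores as strings would order them alphabetically rather than numerically.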

2 Use earthquakes.csv for this task. This file contains information about earthquakes recorded between 1965 and 2016. The magnitude value describes how strong an earthquake is. Magnitudes can be categorized as shown in the table below (*Source: http://www.geo.mtu.edu/UPSeis/magnitude.html*).

| Magnitude      | Class | Effects |
|----------------|-------|---------|
| 2.4 or less    | Minor | Usually not felt, but can be recorded by seismograph. |
| 2.5 to 5.4     | Light | Often felt, but only causes minor damage. |
| 5.5 to 6.0     | Moderate | Slight damage to buildings and other structures. |
| 6.1 to 6.9     | Strong | May cause a lot of damage in very populated areas. |
| 7.0 to 7.9     | Major | Major earthquake. Serious damage. |
| 8.0 or greater | Great | Great earthquake. Can totally destroy communities near the epicenter. |

Count how many earthquakes have occurred in each class.

<b style="color:red;">Notice:</b> The first value has been changed to 2.4 or less; the original source gives 2.5 or less.

In [7]:
import pandas as pd
file = pd.read_csv("./data_files/earthquakes.csv",delimiter=",")

quakes = {
    "Minor": 0,
    "Light": 0,
    "Moderate": 0,
    "Strong": 0,
    "Major": 0,
    "Great": 0
}

# Walk through the rows and increment the counter of the class each magnitude falls into
for _, row in file.iterrows():
    mag = row.Magnitude
    if mag <= 2.4:
        quakes["Minor"] += 1
    elif mag <= 5.4:
        quakes["Light"] += 1
    elif mag <= 6.0:
        quakes["Moderate"] += 1
    elif mag <= 6.9:
        quakes["Strong"] += 1
    elif mag <= 7.9:
        quakes["Major"] += 1
    else:
        quakes["Great"] += 1

print(quakes)

{'Minor': 0, 'Light': 0, 'Moderate': 17638, 'Strong': 5036, 'Major': 698, 'Great': 40}
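
The class boundaries from the table can also be handed to `pd.cut` instead of being spelled out in an if/elif chain. The following is a minimal sketch on made-up magnitudes, not on earthquakes.csv; the `sample` frame and its values are assumptions for illustration only.

```python
import pandas as pd

# Made-up magnitudes for illustration; the real file has a "Magnitude" column
sample = pd.DataFrame({"Magnitude": [2.0, 5.4, 5.5, 6.0, 6.1, 6.9, 7.0, 8.2, 9.1]})

# Upper bound of each class; intervals are closed on the right, so 5.4 counts as Light
bins = [float("-inf"), 2.4, 5.4, 6.0, 6.9, 7.9, float("inf")]
labels = ["Minor", "Light", "Moderate", "Strong", "Major", "Great"]

# Assign a class label to every magnitude, then count the labels
classes = pd.cut(sample["Magnitude"], bins=bins, labels=labels)
quake_counts = classes.value_counts(sort=False).to_dict()

print(quake_counts)
```

Because the labels form a categorical column, classes without any earthquakes still appear in the result with a count of 0.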


3 Use netflix_titles.xml for this task. This file contains information about Netflix movies and TV shows. **Important:** Movies have their duration given in minutes, while TV shows have their duration given as a number of seasons! Parse out the following information from the data:
- Movies released in 2017
- Number of TV shows and number of movies (present the two counts on separate lines)
- Movies with a length between 15 and 20 minutes (values 15 and 20 included)

In [1]:
import xml.etree.ElementTree as ET

tree = ET.parse("./data_files/netflix_titles.xml")
root = tree.getroot()

# Turn every record under the root into a dict of {tag: text}
nftitles = [{child.tag: child.text for child in elem} for elem in root]

# Movie durations are strings ending in "min"; TV shows are counted in seasons
movielist = [x for x in nftitles if x.get("type") == "Movie"]

movies2017 = sum(1 for x in movielist if "2017" in x.get("release_year"))
tv = sum(1 for x in nftitles if x.get("type") == "TV Show")
movies = len(movielist)
checkmovies = sum(1 for x in movielist if 15 <= int(x.get("duration").replace("min", "")) <= 20)

print(f"""Movies released in 2017: {movies2017} 
TV Shows: {tv} 
Movies: {movies}
Movies with a length between 15 and 20 minutes: {checkmovies}""")


Movies released in 2017: 744 
TV Shows: 2410 
Movies: 5377
Movies with a length between 15 and 20 minutes: 11
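
Because movies and TV shows share the `duration` field, it helps to isolate the parsing in one small function. The sketch below runs on a made-up XML snippet rather than netflix_titles.xml; the `<titles>` and `<row>` element names, the sample records, and the `movie_minutes` helper are assumptions for illustration, and the real file may be structured differently.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; element names and records are not taken from the real file
sample_xml = """
<titles>
  <row><type>Movie</type><title>Short A</title><release_year>2017</release_year><duration>18 min</duration></row>
  <row><type>Movie</type><title>Feature B</title><release_year>2015</release_year><duration>95 min</duration></row>
  <row><type>TV Show</type><title>Series C</title><release_year>2017</release_year><duration>2 Seasons</duration></row>
  <row><type>Movie</type><title>Short D</title><release_year>2019</release_year><duration>15 min</duration></row>
</titles>
""".strip()

root = ET.fromstring(sample_xml)
records = [{child.tag: child.text for child in elem} for elem in root]


def movie_minutes(record):
    """Return a movie's length in minutes, or None for anything that is not a movie."""
    if record.get("type") != "Movie":
        return None
    # "18 min" -> "18" -> 18
    return int(record["duration"].split()[0])


movies = [r for r in records if r.get("type") == "Movie"]
tv_shows = [r for r in records if r.get("type") == "TV Show"]
movies_2017 = [r["title"] for r in movies if r["release_year"] == "2017"]
short_movies = [r["title"] for r in movies if 15 <= movie_minutes(r) <= 20]

print(f"TV Shows: {len(tv_shows)}")
print(f"Movies: {len(movies)}")
print(f"Movies released in 2017: {movies_2017}")
print(f"Movies with a length between 15 and 20 minutes: {short_movies}")
```

Filtering on `type` before parsing the duration keeps values such as "2 Seasons" from ever reaching `int()`.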


4 Use the following REST API for this task: https://tie.digitraffic.fi/api/v1/data/weather-data. Calculate the average air temperature (ILMA) and the average humidity (ILMAN_KOSTEUS), rounded to two decimals.

In [2]:
import requests

url = "https://tie.digitraffic.fi/api/v1/data/weather-data"
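response = requests.get(url, timeout=10)
response.raise_for_status()  # Stop on an HTTP error instead of trying to parse an error page
weatherdata = response.json()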

# Initialize empty lists
temperatures = []
humidities = []

for station in weatherdata["weatherStations"]:
    for sensor in station.get("sensorValues"):
        name = sensor.get("name")
        if name == "ILMA":
            temperatures.append(sensor.get("sensorValue"))
        elif name == "ILMAN_KOSTEUS":
            humidities.append(sensor.get("sensorValue"))

print(f"Average air temperature: {round((sum(temperatures)/len(temperatures)),2)}")
print(f"Average air humidity: {round((sum(humidities)/len(humidities)),2)}")


Average air temperature: 0.8
Average air humidity: 73.51
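
The averaging step can be tried without network access on a small payload shaped like the part of the response used here (`weatherStations`, `sensorValues`, `name`, `sensorValue`). This is a minimal sketch: the values in `sample_payload` are made up, and the missing reading (`None`) is an assumption added to show how such a value could be skipped, not something observed in the API.

```python
from statistics import mean

# Hypothetical payload for illustration; the numbers are not real measurements
sample_payload = {
    "weatherStations": [
        {"sensorValues": [
            {"name": "ILMA", "sensorValue": -1.5},
            {"name": "ILMAN_KOSTEUS", "sensorValue": 80},
            {"name": "TIE_1", "sensorValue": 0.3},
        ]},
        {"sensorValues": [
            {"name": "ILMA", "sensorValue": 2.5},
            {"name": "ILMAN_KOSTEUS", "sensorValue": None},
        ]},
        {"sensorValues": [
            {"name": "ILMA", "sensorValue": 2.0},
            {"name": "ILMAN_KOSTEUS", "sensorValue": 71},
        ]},
    ]
}


def sensor_readings(payload, sensor_name):
    """Collect all non-missing readings of one sensor across every station."""
    return [
        sensor["sensorValue"]
        for station in payload["weatherStations"]
        for sensor in station.get("sensorValues", [])
        if sensor.get("name") == sensor_name and sensor.get("sensorValue") is not None
    ]


avg_temperature = mean(sensor_readings(sample_payload, "ILMA"))
avg_humidity = mean(sensor_readings(sample_payload, "ILMAN_KOSTEUS"))

# The :.2f format always prints two decimals, even for a value such as 1.0
print(f"Average air temperature: {avg_temperature:.2f}")
print(f"Average air humidity: {avg_humidity:.2f}")
```

Collecting the readings through one helper avoids repeating the nested loop for each sensor name.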
