# Personal Information
Name: **Réka Mária Szabó**

StudentID: **15087972**

Email: [**reka.szabo@student.uva.nl**](mailto:reka.szabo@student.uva.nl)

Submitted on: **22.03.2024**

# Data Context
**In this section you should introduce the data sources and datasets that you will be working with. Explain where they come from and what their domain is, and give an overview of the context of the data. You should not spend more than 1 to 2 paragraphs here, as the core information will be in the next section.**

# Data Description

**Present here the results of your exploratory data analysis. Note that there is no need to have a "story line" - it is more important that you show your understanding of the data and the methods that you will be using in your experiments (i.e. your methodology).**

**As an example, you could show data, label, or group balances, skewness, and basic characterizations of the data. Information about data frequency and distributions as well as results from reduction mechanisms such as PCA could be useful. Furthermore, indicate outliers and how/why you are taking them out of your samples, if you do so.**

**The idea is that you conduct this analysis a) to understand the data better and b) to verify the shapes of the distributions and whether they meet the assumptions of the methods that you will attempt to use. Finally, make good use of images, diagrams, and tables to showcase what information you have extracted from your data.**

As you can see, you are in a Jupyter notebook environment here. This means that you should focus less on writing text and more on actually exploring your data. If you need to, you can use the amsmath environment in-line, as in $e=mc^2$, or in a separate equation such as this one:

\begin{equation}
    e = mc^2 \quad \text{where } e, m, c \in \mathbb{R}
\end{equation}

Furthermore, you can insert images such as your data aggregation diagrams like this:

![image](example.png)

In [1]:
# Imports
import os
import numpy as np
import pandas as pd

### Data Loading

In [2]:
# Load your data here

def get_video_url(v):
  """
  Get the video URL.

  Parameters
  ----------
  v : dict
      The dictionary with keys and values in the video dataset JSON file.

  Returns
  -------
  str
      The full URL of the video.
  """
  # camera_id is used as an index into this list of camera folder names
  camera_names = ["hoogovens", "kooksfabriek_1", "kooksfabriek_2"]
  return f'{v["url_root"]}{camera_names[v["camera_id"]]}/{v["url_part"]}/{v["file_name"]}.mp4'

In [4]:
def get_video_panorama_url(v):
  """
  Get the video panorama URL.

  Parameters
  ----------
  v : dict
      The dictionary with keys and values in the video dataset JSON file.

  Returns
  -------
  str
      The full URL of the panorama video.
  """
  return "https://www.youtube.com/watch?v=" + v["url_part"]
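As a quick sanity check, the two helpers can be applied to a single record. The sketch below is self-contained, so it restates both functions; the `sample` dictionary is a hand-written stand-in whose values are copied from the first URL pair printed in this notebook, not a record read from the JSON file.

```python
# Folder names per camera, indexed by camera_id (same list as in get_video_url).
CAMERA_NAMES = ["hoogovens", "kooksfabriek_1", "kooksfabriek_2"]


def get_video_url(v):
    """Build the clip URL from one metadata record."""
    return f'{v["url_root"]}{CAMERA_NAMES[v["camera_id"]]}/{v["url_part"]}/{v["file_name"]}.mp4'


def get_video_panorama_url(v):
    """Build the panorama (YouTube) URL from one metadata record."""
    return "https://www.youtube.com/watch?v=" + v["url_part"]


# Hypothetical sample record with only the keys the helpers need.
sample = {
    "url_root": "https://ijmondcam.multix.io/videos/",
    "camera_id": 1,
    "url_part": "A9W8G55JucU",
    "file_name": "A9W8G55JucU-3",
}

clip_url = get_video_url(sample)
panorama_url = get_video_panorama_url(sample)
print(clip_url)
print(panorama_url)
```

Note that a `camera_id` outside the range 0 to 2 would raise an `IndexError`, so it may be worth checking the values of that column before building URLs in bulk.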

In [6]:
import json

# Specify the path to the JSON file
json_file_path = "metadata_ijmond_jan_22_2024.json"

# Open the file and load its contents into a dictionary
with open(json_file_path, "r") as json_file:
    data_dict = json.load(json_file)

# Print the clip URL and the panorama URL for each video
for v in data_dict:
    print(get_video_url(v))
    print(get_video_panorama_url(v))

https://ijmondcam.multix.io/videos/kooksfabriek_1/A9W8G55JucU/A9W8G55JucU-3.mp4
https://www.youtube.com/watch?v=A9W8G55JucU
https://ijmondcam.multix.io/videos/kooksfabriek_1/23x-vGYMbec/23x-vGYMbec-2.mp4
https://www.youtube.com/watch?v=23x-vGYMbec
https://ijmondcam.multix.io/videos/hoogovens/mRjzTfzMS1I/mRjzTfzMS1I-4.mp4
https://www.youtube.com/watch?v=mRjzTfzMS1I
https://ijmondcam.multix.io/videos/hoogovens/aFhpESIPnXg/aFhpESIPnXg-4.mp4
https://www.youtube.com/watch?v=aFhpESIPnXg
https://ijmondcam.multix.io/videos/kooksfabriek_1/Qv9-nS5BloI/Qv9-nS5BloI-2.mp4
https://www.youtube.com/watch?v=Qv9-nS5BloI
https://ijmondcam.multix.io/videos/kooksfabriek_1/5lfLXLyZY_A/5lfLXLyZY_A-3.mp4
https://www.youtube.com/watch?v=5lfLXLyZY_A
https://ijmondcam.multix.io/videos/kooksfabriek_1/p5klKMpdREI/p5klKMpdREI-0.mp4
https://www.youtube.com/watch?v=p5klKMpdREI
https://ijmondcam.multix.io/videos/kooksfabriek_1/vQZz9ePv_vQ/vQZz9ePv_vQ-3.mp4
https://www.youtube.com/watch?v=vQZz9ePv_vQ
https://ijmondcam.

In [6]:
ijmond_df = pd.read_json('metadata_ijmond_jan_22_2024.json')
ijmond_df.head()

| | camera_id | file_name | id | label_state | label_state_admin | start_time | url_part | url_root | view_id |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | A9W8G55JucU-3 | 75 | 23 | 23 | 2023-03-24 17:47:46 | A9W8G55JucU | https://ijmondcam.multix.io/videos/ | 3 |
| 1 | 1 | 23x-vGYMbec-2 | 6 | 16 | 32 | 2023-04-12 08:42:50 | 23x-vGYMbec | https://ijmondcam.multix.io/videos/ | 2 |
| 2 | 0 | mRjzTfzMS1I-4 | 843 | 16 | 16 | 2023-07-18 16:56:58 | mRjzTfzMS1I | https://ijmondcam.multix.io/videos/ | 4 |
| 3 | 0 | aFhpESIPnXg-4 | 787 | 16 | 16 | 2023-06-17 08:53:21 | aFhpESIPnXg | https://ijmondcam.multix.io/videos/ | 4 |
| 4 | 1 | Qv9-nS5BloI-2 | 81 | 23 | 47 | 2023-06-07 12:13:22 | Qv9-nS5BloI | https://ijmondcam.multix.io/videos/ | 2 |
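A first look at label and camera balance can be sketched with the standard library alone. The `records` list below is a hand-copied stand-in for the five preview rows above (only three of the columns), so the counts say nothing about the full dataset; on the real data the same logic could be run over `data_dict`. The names `label_counts`, `label_share`, `camera_counts`, and `disagreements` are illustrative.

```python
from collections import Counter

# Hypothetical stand-in: the five preview rows, reduced to three columns.
records = [
    {"camera_id": 1, "label_state": 23, "label_state_admin": 23},
    {"camera_id": 1, "label_state": 16, "label_state_admin": 32},
    {"camera_id": 0, "label_state": 16, "label_state_admin": 16},
    {"camera_id": 0, "label_state": 16, "label_state_admin": 16},
    {"camera_id": 1, "label_state": 23, "label_state_admin": 47},
]

# How many videos carry each label code, and how many come from each camera.
label_counts = Counter(r["label_state"] for r in records)
camera_counts = Counter(r["camera_id"] for r in records)

# Relative frequency of each label code.
total = len(records)
label_share = {label: count / total for label, count in label_counts.items()}

# Rows where the two label columns hold different codes.
disagreements = [r for r in records if r["label_state"] != r["label_state_admin"]]

print("label counts:", dict(label_counts))
print("label share:", label_share)
print("camera counts:", dict(camera_counts))
print("rows with differing label codes:", len(disagreements))
```

Differing codes in `label_state` and `label_state_admin` do not necessarily mean the labels contradict each other, since the two columns may use different code sets; the meaning of each code should be checked against the dataset documentation before drawing conclusions.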


In [8]:
rise_df = pd.read_json('metadata_02242020.json')
rise_df.head()

| | camera_id | file_name | id | label_state | label_state_admin | start_time | url_part | url_root | view_id |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 0-7-2019-06-24-3504-1067-4125-1688-180-180-972... | 103169 | 23 | -1 | 2019-06-24 21:10:15 | 2019-06-24/0-7/0-7-2019-06-24-3504-1067-4125-1... | https://smoke.createlab.org/videos/180/ | 7 |
| 1 | 0 | 0-7-2019-02-03-3544-899-4026-1381-180-180-7424... | 22392 | 23 | -1 | 2019-02-03 18:09:35 | 2019-02-03/0-7/0-7-2019-02-03-3544-899-4026-13... | https://smoke.createlab.org/videos/180/ | 7 |
| 2 | 0 | 0-2-2018-07-07-5648-1004-6150-1506-180-180-598... | 35476 | 23 | -1 | 2018-07-07 15:20:35 | 2018-07-07/0-2/0-2-2018-07-07-5648-1004-6150-1... | https://smoke.createlab.org/videos/180/ | 2 |
| 3 | 0 | 0-6-2018-07-07-3981-1084-4484-1587-180-180-385... | 36353 | 23 | -1 | 2018-07-07 12:15:00 | 2018-07-07/0-6/0-6-2018-07-07-3981-1084-4484-1... | https://smoke.createlab.org/videos/180/ | 6 |
| 4 | 0 | 0-2-2018-09-19-5648-1004-6150-1506-180-180-100... | 42767 | 23 | -1 | 2018-09-19 21:43:05 | 2018-09-19/0-2/0-2-2018-09-19-5648-1004-6150-1... | https://smoke.createlab.org/videos/180/ | 2 |


### Analysis 1: 
Make sure to explain what your code is doing. This makes it much easier for you and your readers to follow each step.

In [3]:
# Also, don't forget to comment your code
# This makes it easier to spot logic errors along the way

### Analysis 2: 

In [4]:
# ...

### Analysis n:

In [5]:
# ...