# Loading Datasets

One of the first things you need to know to dive into the EdgeSimPy universe is how to load datasets. Once you understand how EdgeSimPy loads data, you can use existing datasets or build your own simulated scenarios to prototype resource management strategies. This tutorial walks through the ways of loading data that EdgeSimPy supports: public URLs, local files, and Python dictionaries.

Before digging into EdgeSimPy's dataset loading features, we must install and import the simulator modules. We can do that with the following commands:

In [None]:
# Installing EdgeSimPy from GitHub (the "-q" flag suppresses pip's output; remove it to see the full logs)
!pip install -q git+https://github.com/EdgeSimPy/EdgeSimPy.git@v1.0.0

# Importing EdgeSimPy components
from edge_sim_py import *

## Loading Datasets from URLs

With the rise of open science and reproducibility, researchers increasingly publish the artifacts of their papers online. With that in mind, EdgeSimPy lets you load datasets directly from public URLs.

To load an external dataset into EdgeSimPy, we simply call the `initialize()` method, passing the dataset's URL in the `input_file` parameter, as shown below.


In [None]:
# Creating a Simulator object
simulator = Simulator()

# Loading the dataset from the external JSON file
simulator.initialize(input_file="https://raw.githubusercontent.com/EdgeSimPy/edgesimpy-tutorials/master/datasets/sample_dataset1.json")

# Displaying some of the objects loaded from the dataset
for user in User.all():
    print(f"{user}. Coordinates: {user.coordinates}")
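
Since `input_file` receives the URL as a plain string, it can help to confirm that the string is a well-formed web address before handing it to `initialize()`. The sketch below uses only the standard library; `looks_like_url` is a hypothetical helper written for this tutorial, not part of EdgeSimPy, and it does not reflect how EdgeSimPy itself inspects its input.

```python
from urllib.parse import urlparse


def looks_like_url(source: str) -> bool:
    """Return True when `source` is an absolute http(s) URL.

    Hypothetical helper for illustration only; not an EdgeSimPy function.
    """
    parsed = urlparse(source)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


dataset_url = "https://raw.githubusercontent.com/EdgeSimPy/edgesimpy-tutorials/master/datasets/sample_dataset1.json"

print(looks_like_url(dataset_url))     # True
print(looks_like_url("dataset.json"))  # False
```

The check is purely syntactic: it does not contact the server, so a well-formed URL may still point to a file that does not exist.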

## Loading Datasets from Local Files

EdgeSimPy also supports loading data from local dataset files. In this case, we just call the `initialize()` method, passing the location of the local dataset file in the `input_file` parameter.

EdgeSimPy automatically identifies both absolute paths (e.g., `/home/user/my_research/dataset.json`) and relative paths (e.g., `my_research/dataset.json`). In the code below, EdgeSimPy loads a dataset from a local file called `dataset.json`.

Note that we must download the `dataset.json` file before calling the `initialize()` method; otherwise, there is no file to load.


In [None]:
# Downloading the sample dataset and saving it locally as "dataset.json"
!curl https://raw.githubusercontent.com/EdgeSimPy/edgesimpy-tutorials/master/datasets/sample_dataset1.json --output dataset.json

# Creating a Simulator object
simulator = Simulator()

# Loading the dataset from the local "dataset.json" file
simulator.initialize(input_file="dataset.json")

# Displaying some of the objects loaded from the dataset
for edge_server in EdgeServer.all():
    print(f"{edge_server}. CPU Capacity: {edge_server.cpu} cores")
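
Because a missing or malformed file is the most likely reason for a local load to fail, a small pre-check can make the problem obvious before the simulator is involved. The following is a minimal sketch using `pathlib` and `json` from the standard library. `check_dataset_file` is a hypothetical helper, not an EdgeSimPy function, and the temporary file it reads exists only to keep the example self-contained.

```python
import json
import tempfile
from pathlib import Path


def check_dataset_file(path_str: str) -> Path:
    """Resolve a relative or absolute path and confirm it holds a JSON object.

    Hypothetical helper for illustration only; not an EdgeSimPy function.
    """
    path = Path(path_str).expanduser().resolve()
    if not path.is_file():
        raise FileNotFoundError(f"Dataset file not found: {path}")
    with path.open(encoding="utf-8") as dataset_file:
        content = json.load(dataset_file)
    if not isinstance(content, dict):
        raise ValueError(f"Expected a JSON object at the top level of {path}")
    return path


# Demonstrating the check on a throwaway file inside a temporary directory
with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = Path(tmp_dir) / "dataset.json"
    sample_path.write_text(json.dumps({"User": []}), encoding="utf-8")

    resolved = check_dataset_file(str(sample_path))
    print(resolved.name)           # dataset.json
    print(resolved.is_absolute())  # True
```

In the notebook above, the same helper could be called as `check_dataset_file("dataset.json")` right after the `curl` command, and the returned path converted with `str()` before passing it to `initialize()`.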

## Loading Datasets from Python Dictionaries

In addition to loading datasets from external and local JSON files, EdgeSimPy also reads datasets encoded as Python dictionaries. To use that feature, we just pass a valid Python dictionary in the `input_file` parameter of the `initialize()` method. In the example below, EdgeSimPy reads a dataset from a Python dictionary containing a few users. For simplicity, users only have two attributes, `id` and `coordinates`; regular `User` objects would have other attributes.


In [None]:
# Creating a Python dictionary representing a sample dataset with a few users
my_simplified_dataset = {
    "User": [
        {
            "attributes": {
                "id": 1,
                "coordinates": [
                    1,
                    1
                ]
            },
            "relationships": {}
        },
        {
            "attributes": {
                "id": 2,
                "coordinates": [
                    3,
                    3
                ]
            },
            "relationships": {}
        },
        {
            "attributes": {
                "id": 3,
                "coordinates": [
                    5,
                    1
                ]
            },
            "relationships": {}
        },
        {
            "attributes": {
                "id": 4,
                "coordinates": [
                    0,
                    0
                ]
            },
            "relationships": {}
        }
    ]
}

# Creating a Simulator object
simulator = Simulator()

# Loading the dataset from the dictionary "my_simplified_dataset"
simulator.initialize(input_file=my_simplified_dataset)

# Displaying the objects loaded from the dataset
for user in User.all():
    print(f"{user}. Coordinates: {user.coordinates}")
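
Writing every entry by hand becomes tedious as a scenario grows, so a dictionary like the one above can also be generated programmatically. The sketch below assumes the same simplified layout used in this example (a `"User"` list whose entries carry `attributes` and an empty `relationships` dictionary). `build_user_dataset` is a hypothetical helper, not part of EdgeSimPy.

```python
import json


def build_user_dataset(coordinates):
    """Build a dataset dictionary with one user entry per (x, y) pair.

    Hypothetical helper for illustration only; IDs are assigned sequentially from 1.
    """
    return {
        "User": [
            {
                "attributes": {"id": user_id, "coordinates": [x, y]},
                "relationships": {},
            }
            for user_id, (x, y) in enumerate(coordinates, start=1)
        ]
    }


# Same four users as in "my_simplified_dataset"
generated_dataset = build_user_dataset([(1, 1), (3, 3), (5, 1), (0, 0)])

# Round-tripping through JSON text confirms the dictionary is JSON-serializable
restored_dataset = json.loads(json.dumps(generated_dataset))
print(restored_dataset == generated_dataset)  # True
print(len(generated_dataset["User"]))         # 4
```

The resulting dictionary can be passed to `initialize()` exactly like `my_simplified_dataset`. Because it survives a JSON round trip, it could also be written to disk with `json.dump()` and then loaded through the local-file route described earlier.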