# Toolkit and Data Sources Documentation  
Welcome to the toolkit and data sources documentation.  
In this tutorial, you will become more familiar with toolkits and data sources, how they are related, and how to connect and use them in Projects.

## Toolkits  
Let's start with toolkits. A toolkit is a set of tools and functionalities tailored for specific areas and their data.  
For example, the GIS Buildings toolkit is designed to work with Buildings data and includes functions for analyzing and managing this type of data.

Hera provides the following toolkits:  
- Experiment toolkit
- GIS Landcover toolkit
- GIS Buildings toolkit
- GIS Tiles toolkit
- GIS Topography toolkit
- GIS Demography toolkit
- Meteorology Low Frequency toolkit
- Meteorology High Frequency toolkit
- Riskassessment Agents toolkit
- Wind Profile toolkit
- LSM toolkit
- OpenFOAM toolkit

Each toolkit requires one or more data sources to operate, and each toolkit type works with its corresponding data source type.  
For example, the GIS Buildings toolkit works with a GIS Buildings data source.  
Data sources are stored inside projects.  
Data sources are covered in more detail later in this notebook.

### The Toolkit Structure
Each toolkit is designed as a three-tier application. In other words, each toolkit includes three layers:  
1) **Data Layer**: Manages loading and parsing the specific data domain.  
2) **Presentation Layer**: Manages the graphs and other types of presentation for the specific domain.  
3) **Analysis Layer**: Manages the analysis and computational functions for the specific domain.

In practice, users mainly work with the second and third layers (presentation and analysis).  
Some toolkits include only one or two of these layers, but this is rare.
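
To make the layering concrete, here is a minimal sketch of how three such layers could be wired together. All class names, method names, and the hard-coded heights are hypothetical and chosen for illustration only; they are not Hera's actual classes.

```python
from statistics import mean


class DataLayer:
    """Loads and parses domain data (here: a hard-coded list of building heights)."""

    def load(self):
        return [12.0, 30.5, 8.0]


class AnalysisLayer:
    """Computes results from whatever the data layer supplies."""

    def __init__(self, data_layer):
        self.data_layer = data_layer

    def mean_height(self):
        return mean(self.data_layer.load())


class PresentationLayer:
    """Formats results for display (a real toolkit would draw graphs instead)."""

    def __init__(self, analysis_layer):
        self.analysis_layer = analysis_layer

    def summary(self):
        return f"Mean building height: {self.analysis_layer.mean_height():.1f} m"


class ExampleToolkit:
    """Hypothetical toolkit that exposes the three layers as attributes."""

    def __init__(self):
        self.data = DataLayer()
        self.analysis = AnalysisLayer(self.data)
        self.presentation = PresentationLayer(self.analysis)


toolkit = ExampleToolkit()
print(toolkit.presentation.summary())
```

In this sketch the user only touches `toolkit.analysis` and `toolkit.presentation`, while the data layer stays behind the scenes, which mirrors how the layers are described above.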

### Getting Started with Toolkits  
Let's see how to initialize a toolkit and connect it to a project.  
First, we need to import the **toolkitHome** class from Hera:

In [None]:
from hera import toolkitHome

The **toolkitHome** class gives access to every toolkit type through the *getToolkit()* function.  
The *getToolkit()* function receives two main arguments:  
1) Toolkit Name: the type of toolkit you want to access.  
2) Project Name: the project you want to connect the toolkit to.

The project name is simply a user-defined string (no strict format).  
The toolkit name must match one of the predefined attributes in `toolkitHome`.  
When specifying a toolkit name, use only names from the following list:

In [None]:
for attr in dir(toolkitHome)[1:-2]:
    value = getattr(toolkitHome, attr)
    # Keep only non-callable, non-dunder attributes: these hold the toolkit name strings
    if not callable(value) and not attr.startswith("__") and "SAVEMODE" not in attr:
        print(f"{attr}: '{value}'")

For demonstration, let's initialize the GIS Buildings toolkit and connect it to a new project.  
Pay attention to the syntax and how we retrieve the toolkit name:

In [None]:
toolkitName = toolkitHome.GIS_BUILDINGS  # The string name of the GIS Buildings toolkit
projectName = "MY_PROJECT"

building_toolkit = toolkitHome.getToolkit(
    toolkitName=toolkitName,
    projectName=projectName
)

Writing *toolkitHome.GIS_BUILDINGS* retrieves the string that identifies the GIS Buildings toolkit.

In [None]:
toolkitName

Now we have a new variable *building_toolkit*, which represents a GIS Buildings toolkit connected to the project "MY_PROJECT".

**Important Note:**  
When initializing a toolkit:
- If no project with the specified name exists, a new empty project will be created.
- If the project already exists, the toolkit will connect to it.

Since the project is currently empty (no data sources added yet), let's set the toolkit aside for now and move on to data sources.

## Data Sources  
A data source is external data needed by a toolkit.  
Each toolkit works with one or more data sources, which provide the information it needs to operate.

Sources can include:  
- URLs
- File paths to files and folders
- Python Classes or Objects
- And more

Data sources also include versioning and metadata.  
Each toolkit knows how to handle its corresponding data source type.

### Data Source Structure

A data source is represented as a structured JSON object that defines all necessary information about a dataset.  
For example, a GIS Buildings data source could look like this:

```json
{
    "GIS_Buildings": {
        "DataSource": {
            "BNTL": {
                "item": {
                    "resource": "data/GIS_BUILDING/BNTL-JERUSALEM/JERU-BLDG.shp",
                    "dataFormat": "geopandas",
                    "desc": {
                        "crs": 2039,
                        "BuildingHeightColumn": "BLDG_HT",
                        "LandHeightColumns": "HT_LAND"
                    }
                }
            }
        }
    }
}
```

**Explanation of the Components:**  
- **Toolkit Type (GIS_Buildings)**: Defines which toolkit will use the data source.  
- **DataSource Section**: Contains named data source entries (like 'BNTL').  
- **Item Details**:  
  - **resource**: The path to the file.  
  - **dataFormat**: The data format (e.g., geopandas, rasterio).  
  - **desc**: Metadata such as coordinate system, important columns, etc.
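
As an illustration of this structure, the following sketch parses a data source description with Python's standard `json` module and walks through its entries. The helper `iter_datasources` is a hypothetical name introduced here, not part of Hera.

```python
import json

datasource_text = """
{
    "GIS_Buildings": {
        "DataSource": {
            "BNTL": {
                "item": {
                    "resource": "data/GIS_BUILDING/BNTL-JERUSALEM/JERU-BLDG.shp",
                    "dataFormat": "geopandas",
                    "desc": {
                        "crs": 2039,
                        "BuildingHeightColumn": "BLDG_HT",
                        "LandHeightColumns": "HT_LAND"
                    }
                }
            }
        }
    }
}
"""


def iter_datasources(description):
    """Yield (toolkit_type, name, item) for every data source entry (hypothetical helper)."""
    for toolkit_type, sections in description.items():
        for name, entry in sections.get("DataSource", {}).items():
            yield toolkit_type, name, entry["item"]


description = json.loads(datasource_text)
entries = list(iter_datasources(description))

for toolkit_type, name, item in entries:
    print(f"{toolkit_type}/{name}: {item['dataFormat']} <- {item['resource']}")
    print(f"  metadata: {item['desc']}")
```

Each yielded `item` carries exactly the three fields described above: `resource`, `dataFormat`, and `desc`.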

## Understanding the Role of JSON and Data Sources

In Hera, toolkits manage complex data, but they **do not inherently know** where the data is or how it is structured.  
That's why we need a **JSON description** that instructs the toolkit:

- Where the data file is located (`resource`).
- What format the data is in (`dataFormat`).
- How to interpret it (`desc`).

### Why is JSON Necessary?

The toolkit can't guess:
- Which file to open.
- Which library to use.
- How to map the data fields.

The JSON acts as a **contract**: a blueprint that tells the toolkit how to interact with the data properly.

### Where Does the Data Actually Come From?

The **Data Source** is the operational part that reads the JSON instructions and actually:
- Opens the specified file.
- Loads it using the correct handler (e.g., Rasterio, GeoPandas).
- Parses it according to the given metadata.

If the `resource` path is missing or incorrect, loading fails.
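
The loading steps above can be sketched as a small dispatch on `dataFormat`. This is an illustration under stated assumptions, not Hera's implementation: `HANDLERS` and `load_item` are hypothetical names, and the handlers return a label where a real loader would call a library such as GeoPandas or Rasterio.

```python
import tempfile
from pathlib import Path

# Hypothetical handler table. A real loader would read the file with
# GeoPandas or Rasterio here; these stand-ins only return a label.
HANDLERS = {
    "geopandas": lambda path, desc: f"vector data from {path}",
    "rasterio": lambda path, desc: f"raster data from {path}",
}


def load_item(item):
    """Validate a data source item and pass it to the handler for its format."""
    resource = item.get("resource")
    if not resource:
        raise ValueError("data source item has no 'resource'")
    data_format = item.get("dataFormat")
    if data_format not in HANDLERS:
        raise ValueError(f"unsupported dataFormat: {data_format!r}")
    path = Path(resource)
    if not path.exists():
        raise FileNotFoundError(f"resource not found: {path}")
    return HANDLERS[data_format](path, item.get("desc", {}))


# A resource that exists loads through the matching handler.
with tempfile.NamedTemporaryFile(suffix=".hgt") as tmp:
    ok_result = load_item(
        {"resource": tmp.name, "dataFormat": "rasterio", "desc": {"crs": 4326}}
    )
print(ok_result)

# A wrong path fails before any handler runs.
try:
    load_item({"resource": "no/such/file.hgt", "dataFormat": "rasterio"})
except FileNotFoundError as err:
    error_message = str(err)
    print(error_message)
```

Checking the path and format up front gives a clear error message instead of a failure deep inside the reading library.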

### Simple Example

Suppose we have an elevation file at `/home/user/elevation/N33E035.hgt`.  
The corresponding JSON would be:

```json
{
    "GIS_Topography": {
        "DataSource": {
            "SRTMGL1": {
                "item": {
                    "resource": "/home/user/elevation/N33E035.hgt",
                    "dataFormat": "rasterio",
                    "desc": {
                        "crs": 4326
                    }
                }
            }
        }
    }
}
```

The toolkit will:
- Open `/home/user/elevation/N33E035.hgt`.
- Use Rasterio to read it.
- Understand that it uses WGS84 coordinates (EPSG:4326).
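
Because every description follows the same nesting, it can also be generated in code rather than written by hand. The helper below, `make_datasource`, is a hypothetical convenience function (not a Hera API) that builds the same structure as a Python dictionary and serializes it with the standard `json` module.

```python
import json


def make_datasource(toolkit_type, name, resource, data_format, **desc):
    """Build a data source description with the nesting shown above (hypothetical helper)."""
    return {
        toolkit_type: {
            "DataSource": {
                name: {
                    "item": {
                        "resource": resource,
                        "dataFormat": data_format,
                        "desc": desc,
                    }
                }
            }
        }
    }


description = make_datasource(
    "GIS_Topography",
    "SRTMGL1",
    "/home/user/elevation/N33E035.hgt",
    "rasterio",
    crs=4326,
)

# Serialize to JSON text, e.g. for saving next to the data file.
print(json.dumps(description, indent=4))
```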

## Example: GIS Topography Toolkit and Elevation Data Source

Now, let's see a complete example using the **GIS Topography Toolkit**.

### Preparing the Data Source

We have a real elevation file `N33E035.hgt` containing SRTM data for a geographic region.

The JSON would be:

```json
{
    "GIS_Topography": {
        "DataSource": {
            "SRTMGL1": {
                "item": {
                    "resource": "path/to/N33E035.hgt",
                    "dataFormat": "rasterio",
                    "desc": {
                        "crs": 4326
                    }
                }
            }
        }
    }
}
```

### Loading the Toolkit and Adding the Data Source

In [None]:
from hera import toolkitHome

# Set the toolkit and project
toolkitName = toolkitHome.GIS_TOPOGRAPHY
projectName = "MY_TOPO_PROJECT"

# Initialize the toolkit
topography_toolkit = toolkitHome.getToolkit(
    toolkitName=toolkitName,
    projectName=projectName
)

# Define the datasource JSON
datasource_json = {
    "GIS_Topography": {
        "DataSource": {
            "SRTMGL1": {
                "item": {
                    "resource": "path/to/N33E035.hgt",
                    "dataFormat": "rasterio",
                    "desc": {
                        "crs": 4326
                    }
                }
            }
        }
    }
}

# Add the datasource to the toolkit
topography_toolkit.addDatasource(datasource_json)

### Querying Elevation Data

In [None]:
# Get elevation at latitude 33.8N and longitude 35.2E
elevation = topography_toolkit.getPointElevation(lat=33.8, lon=35.2)
print(f"Elevation at (33.8N, 35.2E): {elevation} meters")
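
Before querying a point, it can help to confirm that the point actually lies inside the tile. SRTM `.hgt` files are named after the southwest corner of the 1° x 1° tile they cover, so `N33E035` spans 33°N to 34°N and 35°E to 36°E. The helpers below are hypothetical, standalone functions (not part of Hera) that sketch this check.

```python
import re


def tile_bounds(tile_name):
    """Return (south, west, north, east) in degrees for a tile name such as 'N33E035'."""
    match = re.fullmatch(r"([NS])(\d{2})([EW])(\d{3})", tile_name)
    if match is None:
        raise ValueError(f"not an SRTM tile name: {tile_name!r}")
    ns, lat, ew, lon = match.groups()
    south = int(lat) if ns == "N" else -int(lat)
    west = int(lon) if ew == "E" else -int(lon)
    return south, west, south + 1, west + 1


def tile_contains(tile_name, lat, lon):
    """Check whether a latitude/longitude pair falls inside the named tile."""
    south, west, north, east = tile_bounds(tile_name)
    return south <= lat <= north and west <= lon <= east


print(tile_bounds("N33E035"))
print(tile_contains("N33E035", lat=33.8, lon=35.2))
print(tile_contains("N33E035", lat=34.1, lon=35.2))
```

The point queried above (33.8°N, 35.2°E) falls inside `N33E035`, while a point at 34.1°N would need the neighbouring tile to the north.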

### Summary

Toolkits in Hera give you a consistent way to load, analyze, and present domain-specific datasets.  
By defining clear JSON data sources, you can connect external files and run queries and analyses with concise Python code.