# Toolkits and Data Sources

## Toolkits  
A toolkit is a set of tools and functions tailored to a specific domain and its data.  
For example, the GIS Buildings toolkit is designed to work with buildings data and includes functions for analyzing and managing this type of data.

Each toolkit requires one or more data sources to operate, and each type of toolkit works with its corresponding data source type.  
For example, the GIS Buildings toolkit works with a GIS Buildings Data Source.  
Data sources are stored inside projects.  
Data sources are covered in more detail later in this notebook.

### The Toolkit Structure
Each toolkit is designed as a 3-tier application, meaning that it includes 3 layers:  
1) **Data Layer**: Manages loading and parsing the specific data domain.  
2) **Presentation Layer**: Manages the graphs and other types of presentation for the specific domain.  
3) **Analysis Layer**: Manages the analysis and computational functions for the specific domain.

In practice, users mainly work with the Presentation and Analysis layers.  
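
To make the layering concrete, here is a minimal sketch of how three such layers could be wired together. Every class and method name below is hypothetical and chosen only for illustration; this is not Hera's actual implementation.

```python
# Hypothetical sketch of a three-layer toolkit; none of these classes belong to Hera.

class DataLayer:
    """Loads and holds the domain data (here: a plain list of building heights)."""
    def __init__(self, heights):
        self.heights = list(heights)


class AnalysisLayer:
    """Computations over the data held by the data layer."""
    def __init__(self, data):
        self._data = data

    def mean_height(self):
        return sum(self._data.heights) / len(self._data.heights)


class PresentationLayer:
    """Turns the data into something viewable (here: a text summary)."""
    def __init__(self, data):
        self._data = data

    def summary(self):
        count = len(self._data.heights)
        return f"{count} buildings, tallest {max(self._data.heights)} m"


class SketchToolkit:
    """Exposes the analysis and presentation layers; the data layer stays internal."""
    def __init__(self, heights):
        self._data = DataLayer(heights)
        self.analysis = AnalysisLayer(self._data)
        self.presentation = PresentationLayer(self._data)


toolkit = SketchToolkit([10.0, 20.0, 30.0])
print(toolkit.analysis.mean_height())
print(toolkit.presentation.summary())
```

In this sketch the data layer is shared by the other two layers, which mirrors the idea that users mostly interact with analysis and presentation while loading happens behind the scenes.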

### Getting Started with Toolkits  
Let's see how we initiate a toolkit and connect it to a project.  
First, we need to import the **toolkitHome** class from Hera:

In [1]:
from hera import toolkitHome

Using the **toolkitHome** class, we can access and initialize any type of toolkit with the *getToolkit()* function.  
The *getToolkit()* function receives two main arguments:  
1) Toolkit Name: the type of toolkit you want to access.  
2) Project Name: the project you want to connect the toolkit to.

The project name is simply a user-defined string (no strict format).  
The toolkit name must match one of the predefined attributes in `toolkitHome`.  
When specifying a toolkit name, use only names from the following list:

In [2]:
for attr in dir(toolkitHome)[1:-2]:
    if not callable(getattr(toolkitHome, attr)) and not attr.startswith("__") and 'SAVEMODE' not in attr:
        print(f"{attr}: '{getattr(toolkitHome, attr)}'")

EXPERIMENT: 'experiment'
GAUSSIANDISPERSION: 'GaussianDispersion'
GIS_BUILDINGS: 'GIS_Buildings'
GIS_DEMOGRAPHY: 'GIS_Demography'
GIS_LANDCOVER: 'GIS_LandCover'
GIS_RASTER_TOPOGRAPHY: 'GIS_Raster_Topography'
GIS_SHAPES: 'GIS_Shapes'
GIS_TILES: 'GIS_Tiles'
GIS_VECTOR_TOPOGRAPHY: 'GIS_Vector_Topography'
LSM: 'LSM'
METEOROLOGY_HIGHFREQ: 'MeteoHighFreq'
METEOROLOGY_LOWFREQ: 'MeteoLowFreq'
RISKASSESSMENT: 'RiskAssessment'
SIMULATIONS_OPENFOAM: 'OpenFOAM'
SIMULATIONS_WORKFLOWS: 'hermesWorkflows'
WINDPROFILE: 'WindProfile'
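
Because the toolkit name must match one of these strings exactly (including case), a small check before calling *getToolkit()* can catch typos early. The sketch below is an assumption-laden illustration: `KNOWN_TOOLKITS` is copied by hand from the listing above, and `check_toolkit_name` is a hypothetical helper, not part of Hera.

```python
from difflib import get_close_matches

# Toolkit name strings copied from the listing above (not read from Hera at runtime).
KNOWN_TOOLKITS = [
    "experiment", "GaussianDispersion", "GIS_Buildings", "GIS_Demography",
    "GIS_LandCover", "GIS_Raster_Topography", "GIS_Shapes", "GIS_Tiles",
    "GIS_Vector_Topography", "LSM", "MeteoHighFreq", "MeteoLowFreq",
    "RiskAssessment", "OpenFOAM", "hermesWorkflows", "WindProfile",
]


def check_toolkit_name(name):
    """Hypothetical helper: return name if it is a known toolkit string, else raise."""
    if name in KNOWN_TOOLKITS:
        return name
    # Suggest the closest known name, if any is similar enough.
    hints = get_close_matches(name, KNOWN_TOOLKITS, n=1)
    suggestion = f" Did you mean '{hints[0]}'?" if hints else ""
    raise ValueError(f"Unknown toolkit name '{name}'.{suggestion}")


print(check_toolkit_name("GIS_Buildings"))

try:
    check_toolkit_name("GIS_Building")
except ValueError as err:
    print(err)
```

Using the attributes of `toolkitHome` directly (for example `toolkitHome.GIS_BUILDINGS`) avoids the problem altogether, since a misspelled attribute fails immediately.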


For demonstration, let's initiate the GIS Buildings toolkit and connect it to a new project.  
Pay attention to the syntax and how we retrieve the toolkit name:

In [3]:
toolkitName = toolkitHome.GIS_BUILDINGS  # Same as the string 'GIS_Buildings'
projectName = "MY_PROJECT"

building_toolkit = toolkitHome.getToolkit(
    toolkitName=toolkitName,
    projectName=projectName
)

Writing *toolkitHome.GIS_BUILDINGS* retrieves the string that identifies the GIS Buildings toolkit:

In [4]:
toolkitName

'GIS_Buildings'

Now we have a new variable *building_toolkit*, which represents a GIS Buildings toolkit connected to the project "MY_PROJECT".

**Important Note:**  
When initializing a toolkit:
- If no project with the specified name exists, a new empty project will be created.
- If the project already exists, the toolkit will connect to it.

Since the project is currently empty (no data sources added yet), let's set the toolkit aside for now and move on to data sources.

## Data Sources  
A data source is external data needed by a toolkit.  
Each toolkit works with one or more data sources, which provide the information it needs to operate.

Sources can include:  
- URLs
- File paths to files and folders
- Python classes or objects
- And more

Data sources also include versioning and metadata.  
Each toolkit knows how to handle its corresponding data source type.

### Data Source Structure

A data source is represented as a structured JSON object that defines all necessary information about a dataset.  
For example, a GIS Buildings Data Source could look like this:

```json
{
    "GIS_Buildings": {
        "DataSource": {
            "BNTL": {
                "item": {
                    "resource": "data/GIS_BUILDING/BNTL-JERUSALEM/JERU-BLDG.shp",
                    "dataFormat": "geopandas",
                    "desc": {
                        "crs": 2039,
                        "BuildingHeightColumn": "BLDG_HT",
                        "LandHeightColumns": "HT_LAND"
                    }
                }
            }
        }
    }
}
```

**Explanation of the Components:**  
- **Toolkit Type (GIS_Buildings)**: Defines which toolkit will use the data source.  
- **DataSource Section**: Contains named data source entries (like 'BNTL').  
- **Item Details**:  
  - **resource**: The path to the file.  
  - **dataFormat**: The data format (e.g., geopandas, rasterio).  
  - **desc**: Metadata such as coordinate system, important columns, etc.
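
As an illustration of how this structure can be read, the sketch below parses the JSON shown above with the standard `json` module and walks its levels. The helper `iter_datasources` is hypothetical and not part of Hera; it only assumes the nesting described in the list above.

```python
import json

# The example data source from above, as a JSON string.
text = """
{
    "GIS_Buildings": {
        "DataSource": {
            "BNTL": {
                "item": {
                    "resource": "data/GIS_BUILDING/BNTL-JERUSALEM/JERU-BLDG.shp",
                    "dataFormat": "geopandas",
                    "desc": {
                        "crs": 2039,
                        "BuildingHeightColumn": "BLDG_HT",
                        "LandHeightColumns": "HT_LAND"
                    }
                }
            }
        }
    }
}
"""


def iter_datasources(config):
    """Yield (toolkit, name, resource, dataFormat, desc) for every entry."""
    for toolkit, section in config.items():
        for name, entry in section["DataSource"].items():
            item = entry["item"]
            yield toolkit, name, item["resource"], item["dataFormat"], item.get("desc", {})


rows = list(iter_datasources(json.loads(text)))
for toolkit, name, resource, data_format, desc in rows:
    print(f"{toolkit}/{name}: {resource} ({data_format}), crs={desc.get('crs')}")
```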

### Example: Loading the Toolkit and Adding the Data Source

In [1]:
from hera import toolkitHome

# Set the toolkit and project
toolkitName = toolkitHome.GIS_BUILDINGS
projectName = "MY_PROJECT"

# Initialize the toolkit
building_toolkit = toolkitHome.getToolkit(
    toolkitName=toolkitName,
    projectName=projectName
)

# Define the data source JSON
datasource_json = {
    "GIS_Buildings": {
        "DataSource": {
            "BNTL": {
                "item": {
                    "resource": "data/GIS_BUILDING/BNTL-JERUSALEM/JERU-BLDG.shp",
                    "dataFormat": "geopandas",
                    "desc": {
                        "crs": 2039,
                        "BuildingHeightColumn": "BLDG_HT",
                        "LandHeightColumns": "HT_LAND"
                    }
                }
            }
        }
    }
}

# Add the data source to the toolkit
building_toolkit.addDataSource(
    datasource_json,
    "data/GIS_BUILDING/BNTL-JERUSALEM/JERU-BLDG.shp",
    "geopandas"
)

<Measurements: {
    "_cls": "Metadata.Measurements",
    "projectName": "MY_PROJECT",
    "desc": {
        "toolkit": "Buildings",
        "datasourceName": {
            "GIS_Buildings": {
                "DataSource": {
                    "BNTL": {
                        "item": {
                            "resource": "data/GIS_BUILDING/BNTL-JERUSALEM/JERU-BLDG.shp",
                            "dataFormat": "geopandas",
                            "desc": {
                                "crs": 2039,
                                "BuildingHeightColumn": "BLDG_HT",
                                "LandHeightColumns": "HT_LAND"
                            }
                        }
                    }
                }
            }
        },
        "version": [
            0,
            0,
            1
        ]
    },
    "type": "ToolkitDataSource",
    "resource": "data/GIS_BUILDING/BNTL-JERUSALEM/JERU-BLDG.shp",
    "dataFormat": "geopandas"
}>