# Toolkit and Data Sources Documentation  
Welcome to the toolkit and data sources documentation.  
In this tutorial, you will become more familiar with toolkits and data sources: how they are related, and how to connect and use them in projects.

## Toolkits  
Let's start with toolkits. A toolkit is a set of tools and functions tailored to a specific domain and its data. For example, the GIS Buildings toolkit is designed to work with buildings data and includes functions for analyzing and managing this type of data.  
The following list shows the toolkits that hera provides:  
- Experiment toolkit
- GIS Landcover toolkit
- GIS Buildings toolkit
- GIS Tiles toolkit
- GIS Topography toolkit
- GIS Demography toolkit
- Meteorology Low Frequency toolkit
- Meteorology High Frequency toolkit
- Risk Assessment Agents toolkit
- Wind Profile toolkit
- LSM toolkit
- OpenFOAM toolkit


Each toolkit requires one or more data sources to operate, and each toolkit type works with its corresponding data source type. For example, the GIS Buildings toolkit works with a GIS Buildings data source. Data sources are stored inside projects. They are covered in more detail later in this notebook.  

### The Toolkit Structure
Basically, each toolkit is designed as a 3-tier application. In simple terms, each toolkit includes 3 layers:  
1) **Data Layer**: Manages loading and parsing the specific data domain.
2) **Presentation Layer**: Manages the graphs and other types of presentation for the specific domain.
3) **Analysis Layer**: Manages the analysis and computational functions for the specific domain.

In practice, users will mainly work with the second and third layers.  
It is important to note that while some toolkits may include only one of these layers, this is a rare occurrence.
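
The three-layer layout can be pictured with a minimal sketch. The class names below (`DataLayer`, `AnalysisLayer`, `PresentationLayer`, `ExampleToolkit`) are hypothetical and are not hera's actual classes; they only illustrate how a toolkit might keep loading, computation, and presentation separate.

```python
from dataclasses import dataclass, field


class DataLayer:
    """Hypothetical data layer: loads and parses domain data."""

    def load(self, raw_heights):
        # Parse raw values into floats (a stand-in for reading a real file).
        return [float(h) for h in raw_heights]


class AnalysisLayer:
    """Hypothetical analysis layer: computations on the parsed data."""

    def mean_height(self, heights):
        return sum(heights) / len(heights)


class PresentationLayer:
    """Hypothetical presentation layer: turns results into a display form."""

    def summary(self, label, value):
        return f"{label}: {value:.1f}"


@dataclass
class ExampleToolkit:
    """Hypothetical toolkit that bundles the three layers."""
    data: DataLayer = field(default_factory=DataLayer)
    analysis: AnalysisLayer = field(default_factory=AnalysisLayer)
    presentation: PresentationLayer = field(default_factory=PresentationLayer)


toolkit = ExampleToolkit()
heights = toolkit.data.load(["10", "20", "30"])
mean = toolkit.analysis.mean_height(heights)
report = toolkit.presentation.summary("Mean building height", mean)
print(report)
```

In this sketch the user calls the analysis and presentation layers directly, while the data layer only prepares the input for them.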

### Getting Started with Toolkits  
Let's see how to initiate a toolkit and connect it to a project.  
First, we need to import the **toolkitHome** class from hera:

In [1]:
from hera import toolkitHome

Using the **toolkitHome** class, we can reach and initiate any type of toolkit with the *getToolkit()* function.  
The *getToolkit()* function receives two main arguments:  
1) Toolkit Name - This is the type of toolkit you want to reach.
2) The project Name - This is the project name you want to connect your toolkit to.

The project name is essentially a string that the user can specify for a new or existing project in the system, with no restrictions on its format.  
The toolkit name, however, must be one of the specific strings defined as attributes of the toolkitHome class. When specifying a toolkit name, use only strings from the following list:  

In [20]:
for attr in dir(toolkitHome)[1:-2]:
    if not callable(getattr(toolkitHome, attr)) and not attr.startswith("__") and 'SAVEMODE' not in attr:
        print(f"{attr}: '{getattr(toolkitHome, attr)}'")

EXPERIMENT: 'experiment'
GIS_BUILDINGS: 'GIS_Buildings'
GIS_DEMOGRAPHY: 'GIS_Demography'
GIS_LANDCOVER: 'GIS_LandCover'
GIS_RASTER_TOPOGRAPHY: 'GIS_Raster_Topography'
GIS_SHAPES: 'GIS_Shapes'
GIS_TILES: 'GIS_Tiles'
GIS_VECTOR_TOPOGRAPHY: 'GIS_Vector_Topography'
LSM: 'LSM'
METEOROLOGY_HIGHFREQ: 'MeteoHighFreq'
METEOROLOGY_LOWFREQ: 'MeteoLowFreq'
RISKASSESSMENT: 'RiskAssessment'
SIMULATIONS_OPENFOAM: 'OpenFOAM'
SIMULATIONS_WORKFLOWS: 'hermesWorkflows'
WINDPROFILE: 'WindProfile'
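
The listing above depends on slicing the result of `dir()`, which can break if attributes are added or renamed. A minimal sketch of a filter that selects the constants by name and type instead is shown below. It runs on a stand-in class rather than on hera itself: `ToolkitNames`, `list_toolkit_names`, and the `TOOLKIT_SAVEMODE_NOSAVE` constant are assumptions made for illustration, and only three of the names from the listing are copied in.

```python
class ToolkitNames:
    """Stand-in for toolkitHome, with a few constants from the listing above."""
    GIS_BUILDINGS = "GIS_Buildings"
    GIS_LANDCOVER = "GIS_LandCover"
    LSM = "LSM"
    TOOLKIT_SAVEMODE_NOSAVE = "nosave"  # hypothetical save-mode constant, skipped

    def getToolkit(self):  # methods are not upper-case strings, so they are skipped
        pass


def list_toolkit_names(cls):
    """Return {attribute: value} for upper-case string constants, without SAVEMODE ones."""
    return {
        name: value
        for name, value in vars(cls).items()
        if name.isupper() and isinstance(value, str) and "SAVEMODE" not in name
    }


names = list_toolkit_names(ToolkitNames)
for attr, value in names.items():
    print(f"{attr}: '{value}'")
```

Note that `vars(cls)` only sees attributes defined directly on the class, so constants inherited from a parent class would need `dir()` plus `getattr()` instead.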


For demonstration, let's initiate the GIS Buildings toolkit and connect it to a new project. Pay attention to the syntax and how I reach the toolkit name:

In [22]:
toolkitName = toolkitHome.GIS_BUILDINGS  # Same as 'GIS_Buildings'
projectName = "MY_PROJECT"

building_toolkit = toolkitHome.getToolkit(
                    toolkitName=toolkitName,
                    projectName=projectName)

You can see that by writing *toolkitHome.GIS_BUILDINGS*, I basically reach the string that corresponds to the GIS Buildings toolkit name:

In [23]:
toolkitName

'GIS_Buildings'

Now we have a new variable, *building_toolkit*, which is a GIS Buildings toolkit connected to a new project named "MY_PROJECT".  

You will soon learn more about Projects and how they work. For now, here's an important note:
When initializing a toolkit for a project, if no project with the specified name exists, a new empty project will be created. If the project already exists, the toolkit will be connected to it.

Since the project is currently empty with no data sources added, let's set the toolkit aside for now and move on to explaining more about data sources.

## Data Sources  
A data source is external data that a toolkit needs in order to operate. Each toolkit can work with several data sources, which provide the information it needs. A data source can be given as:  
- URLs.
- File paths to files and folders (the most common case).
- Paths to classes.
- The data source object itself.
- And more.
  
Data sources also include a version and metadata. Each toolkit type knows how to handle its corresponding data source type.

### Data Source Structure

A data source is a structured JSON object that defines all the necessary information about a particular dataset. Let's explain its structure and fields using an example.  
A GIS Buildings data source may look like this:  
```json
"GIS_Buildings": {
    "Config": {
        "defaultBuildingDataSource": "BNTL"
    },
    "DataSource": {
        "BNTL": {
            "isRelativePath": "True",
            "item": {
                "resource": "data/GIS_BUILDING/BNTL-JERUSALEM/JERU-BLDG.shp",
                "dataFormat": "geopandas",
                "desc": {
                    "crs": 2039,
                    "BuildingHeightColumn": "BLDG_HT",
                    "LandHeightColumns": "HT_LAND"
                }
            }
        }
    }
}
```
### Explanation of the Components
- **Toolkit Type (GIS_Buildings)**: The type of toolkit that handles the data source. Each toolkit corresponds to a specific type of data processing or management.  
    - **Config Section**: Contains configuration relevant to the toolkit. In this example, `defaultBuildingDataSource` specifies which data source is used by default.  
    - **DataSource Section**: Contains one or more data source entries (one in this example), each identified by a unique name (e.g., 'BNTL').  
    - **Datasource Details (BNTL)**:  
        - **isRelativePath**: Indicates whether the path to the data source is relative to the repository.
        - **item**: A JSON object that contains details about the data source.
            - **resource**: The path to the data file (a shapefile in this case).
            - **dataFormat**: The format of the data (e.g., geopandas).
            - **desc**: A JSON object holding additional metadata. Its fields vary from one data source to another.
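
To make the field roles concrete, here is a minimal sketch that reads the example definition with Python's standard `json` module and resolves the default data source. It is not hera's loader: the function `resolve_default_source` and the repository root `/path/to/repository` are hypothetical, and the fragment is wrapped in outer braces so that it parses as standalone JSON.

```python
import json
from pathlib import Path

# The example definition from above, wrapped in braces to form valid JSON.
raw = """
{
  "GIS_Buildings": {
    "Config": {"defaultBuildingDataSource": "BNTL"},
    "DataSource": {
      "BNTL": {
        "isRelativePath": "True",
        "item": {
          "resource": "data/GIS_BUILDING/BNTL-JERUSALEM/JERU-BLDG.shp",
          "dataFormat": "geopandas",
          "desc": {
            "crs": 2039,
            "BuildingHeightColumn": "BLDG_HT",
            "LandHeightColumns": "HT_LAND"
          }
        }
      }
    }
  }
}
"""


def resolve_default_source(definition, toolkit_type, repository_root):
    """Return (name, resource path, data format) of the default data source."""
    section = definition[toolkit_type]
    name = section["Config"]["defaultBuildingDataSource"]
    source = section["DataSource"][name]
    resource = Path(source["item"]["resource"])
    # The flag is stored as the string "True" in the example, so compare text.
    if str(source["isRelativePath"]).lower() == "true":
        resource = Path(repository_root) / resource
    return name, resource, source["item"]["dataFormat"]


name, resource, data_format = resolve_default_source(
    json.loads(raw), "GIS_Buildings", "/path/to/repository"  # hypothetical root
)
print(name, resource.as_posix(), data_format)
```

The sketch follows the same chain described above: the `Config` section names the default entry, the `DataSource` section holds that entry, and `isRelativePath` decides whether `resource` is joined to the repository location.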