# VAFT Library and VEST Database Guide

This notebook explains how to install the VAFT library and access data in the VEST database. It is a temporary guide and is subject to change. Not all data is stored in HSDS yet, so you may have to create the ODS yourself.
To use it, first install h5pyd, h5py, and omas (`pip install h5pyd h5py omas`).  
Then follow the configuration step in this [link](https://satelite2517.github.io/vest/guide/Installation/#configuration) from your command line; you do not need to do the other steps.

In addition, omas currently has to be used from its source code, because the required changes to omas are not yet available in the released package.

## Background Overview

The VEST integrated data analysis platform is structured according to FAIR (Findable, Accessible, Interoperable, and Reusable) data principles and the IMAS (Integrated Modeling and Analysis Suite) standards, offering a unified environment for experimental and simulation data management, analysis, and visualization. The core components include:

1. **VEST-IMAS Database**
2. **VAFT Python Library**
3. **Data Pipeline**

### 1. VEST-IMAS Database

The database is built around the OMAS (Ordered Multidimensional Array Structures) interface and uses HSDS (Highly Scalable Data Service) as its backend.

* **Data Storage**: HDF5 format via HSDS, enabling parallel I/O and cloud compatibility.
* **Data Structure**: Compatible with IMAS Interface Data Structure (IDS) through OMAS, ensuring standardization and interoperability.
* **Authentication and Access**: Managed through the VAFT Python library for remote and flexible querying.

### 2. VAFT Python Library

The VAFT (Versatile Analysis Framework for Tokamak) Python library provides modular and extensible Python routines for data access, analysis, modeling, and visualization.
Its primary modules include:

#### Database Module

* Accesses both legacy SQL and HSDS-based databases.
* Performs dynamic queries aligned with IMAS IDS.

#### Formula Module

* Offers reusable formulae for equilibrium parameters, tokamak circuit equations, and Green’s functions.
* Interacts with OMAS objects to provide derived parameters.

#### Plot Module

* Enables versatile visualization, including scatter plots, time-series, and spatial mappings.
* Facilitates comparative analysis between experiments and simulations.

#### Process Module

* Manages preprocessing of diagnostic data, including outlier detection, signal conditioning, and indexing based on plasma characteristics (e.g., plasma current onset).
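
As an illustration of indexing on plasma current onset, the following is a minimal sketch that finds the first sample whose magnitude exceeds a threshold. It is not the VAFT implementation; the function name `find_onset_index`, the threshold value, and the sample data are all hypothetical.

```python
from typing import Optional, Sequence


def find_onset_index(current: Sequence[float], threshold: float) -> Optional[int]:
    """Return the index of the first sample whose magnitude exceeds threshold.

    Returns None when no sample crosses the threshold.
    """
    for i, value in enumerate(current):
        if abs(value) > threshold:
            return i
    return None


# Synthetic plasma current trace in amperes (made-up numbers, not VEST data).
time_s = [0.300, 0.301, 0.302, 0.303, 0.304]
ip_a = [10.0, 40.0, 2500.0, 9000.0, 15000.0]

onset = find_onset_index(ip_a, threshold=1000.0)
if onset is not None:
    print(f"onset at t = {time_s[onset]} s")
```

A real preprocessing step would likely smooth the signal first so that a single noisy sample does not trigger the threshold.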

#### OMAS Module

* Simplifies interactions with OMAS APIs and IDS structures.
* Manages derived parameter updates, time conversions, and API abstractions.

#### Machine Mapping Module

* Translates raw diagnostic data into structured IDS-compliant ODS objects.

#### Code Module

* Interfaces with simulation tools (EFIT, CHEASE, GPEC).
* Handles simulation execution and data conversion to IMAS formats.

### 3. Data Processing Pipeline

The VEST integrated data analysis platform uses the Snakemake workflow management system to orchestrate the data pipeline, ensuring reproducibility and traceability across multiple processing steps:
* **Diagnostics Processing**: Automated trimming, calibration, integration, and storage of routinely acquired signals. Newly uploaded diagnostic files are also processed automatically.
* **Electromagnetic Modeling**: Eddy current calculation through 2D geometry representation and response matrix computation.
* **Magnetic Equilibrium Reconstruction (EFIT)**: Equilibrium fitting using magnetic diagnostics and eddy current profiles.
* **Equilibrium Refinement (CHEASE)**: Refinement of EFIT results for enhanced accuracy and convergence.
* **Linear MHD Stability Analysis (GPEC - DCON/RDCON/STRIDE)**: Evaluation of linear stability of plasma ($\delta W$ using DCON; $\Delta^\prime$ using RDCON and the STRIDE code).
* **Core Profile Fitting**: Kinetic diagnostics mapping onto equilibrium profiles for core electron temperature and density estimation.

In addition, data mining is conducted on these datasets to extract insights.
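
The ordering of the pipeline stages listed above can be pictured as a dependency graph. The sketch below uses Python's standard `graphlib` rather than Snakemake itself, and the stage names and some of the edges (for example, that GPEC runs on CHEASE output and that core profile fitting depends on EFIT) are assumptions made for illustration, not the actual workflow definition.

```python
from graphlib import TopologicalSorter

# Map each stage to the set of stages it depends on.
# Stage names and dependencies are illustrative assumptions.
dependencies = {
    "diagnostics": set(),
    "electromagnetic_modeling": {"diagnostics"},
    "efit": {"diagnostics", "electromagnetic_modeling"},
    "chease": {"efit"},
    "gpec": {"chease"},
    "core_profiles": {"efit"},
}

# static_order() yields every stage after all of its dependencies.
order = list(TopologicalSorter(dependencies).static_order())
print(order)
```

A workflow manager such as Snakemake derives an equivalent ordering from the input and output files declared by each rule.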

## Install VAFT Library and Connect to DB

This guide explains how to set up and install the VAFT library from source. It includes installing prerequisites, cloning the repository, and configuring the environment for either development or regular usage.

### Prerequisites

Before installing the VAFT library, ensure the following tools are installed on your system:

* **Anaconda**: [Installation Guide](https://www.anaconda.com/docs/getting-started/anaconda/install)
* **Git**: [Download Git](https://git-scm.com/downloads)

### Clone the Repository

Open a terminal application (Terminal, Anaconda Shell, PowerShell, PuTTY, etc.) and execute the following commands:

```bash
cd {desired_directory}
git clone https://github.com/VEST-Tokamak/vaft.git
```

### Install Required Packages and VAFT Library

Navigate to the cloned repository directory and create a new Anaconda environment:

```bash
cd vaft
conda create --name vaft python=3.10
conda activate vaft
```

Install the required dependencies:

```bash
pip install -r requirements.txt
pip install h5pyd==0.20.0 --no-deps  # Resolves version conflict (safe for usage)
```
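
To confirm which versions ended up in the environment after the pinned install, a small check like the following may help. It only uses the standard library; the helper name `installed_version` is hypothetical.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Report the packages this guide relies on.
for name in ("h5pyd", "h5py", "omas"):
    print(name, installed_version(name) or "not installed")
```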

Install the VAFT library:

```bash
pip install .
```

For development purposes, use editable mode:

```bash
pip install -e .
```

### Configure the Environment

Run the configuration tool:

```bash
hsconfigure
```

You will see the prompt: "Enter new values or accept defaults in brackets with Enter."

Provide the following values:

```
Server endpoint - http://147.46.36.244:5101
Username - reader
Password - test
```

If the configuration is successful, you will see the message:

```
Testing connection... connection ok
```
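
If the connection test fails, it can help to inspect what was actually stored. The sketch below assumes that `hsconfigure` writes a `.hscfg` file in the home directory with `key = value` lines such as `hs_endpoint` and `hs_username`; that location and format reflect h5pyd's usual behavior but should be treated as an assumption here. The helper `read_hscfg` is hypothetical.

```python
from pathlib import Path
from typing import Dict


def read_hscfg(path: Path) -> Dict[str, str]:
    """Parse 'key = value' lines, skipping blank lines and '#' comments."""
    settings: Dict[str, str] = {}
    for line in path.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


cfg_path = Path.home() / ".hscfg"  # assumed default location
if cfg_path.exists():
    cfg = read_hscfg(cfg_path)
    print("endpoint:", cfg.get("hs_endpoint"))
    print("username:", cfg.get("hs_username"))
else:
    print("No .hscfg found; run hsconfigure first.")
```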


## Checking the connection with the VEST database

If the cell below returns `True`, you are ready to use the data. If it returns `False` instead, check the guide linked above and its connection guide page.

In [7]:
import vaft
from omas import *
vaft.database.is_connect()

INFO:root:GET: http://147.46.36.145:5101/about [None] timeout: (10, 1000)
INFO:root:status: 200
INFO:root:GET: http://147.46.36.145:5101/about [None] timeout: (10, 1000)
INFO:root:status: 200


True
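
The log lines above show a `GET` request to the server's `/about` path answering with status 200. When `is_connect()` returns `False`, an independent probe of that same path can help separate network problems from library problems. This is a standard-library sketch, not how VAFT performs the check; `hsds_is_reachable` and its `opener` parameter are hypothetical.

```python
import urllib.request
from typing import Callable


def hsds_is_reachable(endpoint: str, timeout: float = 10.0,
                      opener: Callable = urllib.request.urlopen) -> bool:
    """Return True if GET <endpoint>/about answers with HTTP 200."""
    url = endpoint.rstrip("/") + "/about"
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # URLError, HTTPError, timeouts and refused connections
        # all derive from OSError.
        return False
```

Calling `hsds_is_reachable(...)` with the server endpoint you entered in `hsconfigure` should return `True` from a machine that can reach the database. The `opener` argument exists only so the function can be exercised without network access.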

## Get data from VEST database and Save it to your own DB

First, you can get the original data from the VEST system, which is stored in the public folder. (You can also save and load data in your own folder.)

- vaft.database.load(`shot number`, `folder`)
- vaft.database.save(`variable name of ods`, `shot number`, `filename`=None, `env`='server')
    - To use your own file name, pass it as `filename`; otherwise the file is saved under the default name `shotnumber.h5`.
    - To save the data locally, set `env` to 'local'.


Note that the default public database is named 'public'.
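
The naming rule described above (default `shotnumber.h5`, optional custom file name, `env` of either server or local) can be sketched as follows. This only mirrors the rule as written in this guide; it is not VAFT's code, and `resolve_save_target` is a hypothetical helper.

```python
from typing import Optional, Tuple


def resolve_save_target(shot: int, filename: Optional[str] = None,
                        env: str = "server") -> Tuple[str, str]:
    """Return (env, file name) following the rule described in the text."""
    if env not in ("server", "local"):
        raise ValueError(f"env must be 'server' or 'local', got {env!r}")
    # Fall back to '<shot>.h5' when no file name is given.
    name = filename if filename is not None else f"{shot}.h5"
    return env, name


print(resolve_save_target(39915))
print(resolve_save_target(39915, filename="my_run.h5", env="local"))
```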

In [2]:
shotnumbers=vaft.database.exist_file('public')

INFO:root:GET: http://147.46.36.145:5101/ [/public] timeout: (10, 1000)
INFO:root:status: 200
INFO:root:got domain json: 444 bytes
INFO:root:domain_json: {'class': 'folder', 'owner': 'admin', 'created': 1719987644.8596134, 'limits': {'min_chunk_size': 1048576, 'max_chunk_size': 4194304, 'max_request_size': 104857600}, 'compressors': ['blosclz', 'lz4', 'lz4hc', 'gzip', 'zstd', 'deflate'], 'version': '0.9.0.alpha0', 'lastModified': 1719987919.0932884, 'hrefs': [{'rel': 'self', 'href': 'http://147.46.36.145/?domain=/public'}, {'rel': 'acls', 'href': 'http://147.46.36.145/acls?domain=/public'}]}
INFO:root:GET: http://147.46.36.145:5101/domains [/public/] timeout: (10, 1000)
INFO:root:status: 200
INFO:root:GET: http://147.46.36.145:5101/domains [/public/] timeout: (10, 1000)
INFO:root:status: 200
INFO:root:GET: http://147.46.36.145:5101/domains [/public/] timeout: (10, 1000)
INFO:root:status: 200
INFO:root:GET: http://147.46.36.145:5101/domains [/public/] timeout: (10, 1000)
INFO:root:statu

Total number of shots: 3562
Shot numbers: [114, 12345, 38000, 38012, 38019, 38025, 38026, 38027, 38028, 38029, 38030, 38032, 38034, 38035, 38036, 38038, 38039, 38041, 38042, 38043, 38044, 38045, 38046, 38047, 38048, 38049, 38050, 38052, 38053, 38055, 38057, 38058, 38059, 38060, 38061, 38062, 38063, 38064, 38065, 38066, 38067, 38068, 38090, 38091, 38092, 38093, 38094, 38095, 38096, 38097, 38098, 38099, 38100, 38101, 38104, 38105, 38107, 38167, 38171, 38173, 38178, 38179, 38180, 38184, 38187, 38188, 38189, 38190, 38191, 38192, 38193, 38194, 38195, 38196, 38197, 38198, 38199, 38200, 38201, 38202, 38203, 38204, 38205, 38206, 38207, 38208, 38210, 38211, 38213, 38214, 38215, 38216, 38217, 38225, 38228, 38229, 38230, 38231, 38232, 38233, 38234, 38235, 38236, 38238, 38239, 38240, 38243, 38244, 38245, 38246, 38247, 38248, 38249, 38250, 38251, 38252, 38253, 38254, 38255, 38256, 38257, 38258, 38259, 38260, 38261, 38264, 38266, 38267, 38269, 38306, 38308, 38309, 38310, 38311, 38312, 38313, 38314, 
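
With several thousand shots available, it is often useful to narrow the list before loading anything. The sketch below assumes that `exist_file` returns a list of integer shot numbers, as the printed output suggests; `shots_in_range` is a hypothetical helper, and the sample list is a short excerpt of the values printed above.

```python
from typing import Iterable, List


def shots_in_range(shot_numbers: Iterable[int], first: int, last: int) -> List[int]:
    """Return the sorted shot numbers with first <= shot <= last."""
    return sorted(s for s in shot_numbers if first <= s <= last)


# A few values copied from the printed list above.
available = [114, 12345, 38000, 38012, 38019, 38025]
print(shots_in_range(available, 38000, 38020))
```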

You can also save files directly to your own directory and list which h5 files it contains.

In [3]:
ods=vaft.database.load(39915,'public')
# ods=vaft.database.load(39915) # same

INFO:root:HttpConn.init - timeout = 180
INFO:root:GET: http://147.46.36.145:5101/ [/public/39915.h5] timeout: 180
INFO:root:status: 200
INFO:root:got domain json: 1927 bytes
INFO:root:GET: http://147.46.36.145:5101/datasets/d-905c8edc-c3018d26-ac7a-76208f-722a8e [/public/39915.h5] timeout: 180
INFO:root:status: 200
INFO:root:selection: start (0,) stop [2097152] step (1,)
INFO:root:page_stop: 2097152
INFO:root:page_mshape: (2097152,)
INFO:root:GET: http://147.46.36.145:5101/datasets/d-905c8edc-c3018d26-ac7a-76208f-722a8e/value [/public/39915.h5] timeout: 180
INFO:root:status: 200
INFO:root:retrieved 512 http_chunks  2097152 total bytes
INFO:root:binary response, 2097152 bytes
INFO:root:got arr: (2097152,), cleaning up shape!
INFO:root:selection: start (81788928,) stop [81870184] step (1,)
INFO:root:page_stop: 81870184
INFO:root:page_mshape: (81256,)
INFO:root:GET: http://147.46.36.145:5101/datasets/d-905c8edc-c3018d26-ac7a-76208f-722a8e/value [/public/39915.h5] timeout: 180
INFO:root:st

Successfully loaded ODS data for shot: 39915


You can load multiple shots by passing a list of shot numbers, and you can combine their data into a single ODC (OMAS Data Collection).
An ODC provides a convenient way to compare multiple experimental shots.


In [8]:
[ods1, ods2, ods3]=vaft.database.load([39915, 39916, 39917], 'public')
odc=ODC()
odc['0'] = ods1
odc['1'] = ods2
odc['2'] = ods3

INFO:root:HttpConn.init - timeout = 180
INFO:root:GET: http://147.46.36.145:5101/ [/public/39915.h5] timeout: 180
INFO:root:status: 200
INFO:root:got domain json: 1927 bytes
INFO:root:GET: http://147.46.36.145:5101/datasets/d-905c8edc-c3018d26-ac7a-76208f-722a8e [/public/39915.h5] timeout: 180
INFO:root:status: 200
INFO:root:selection: start (0,) stop [2097152] step (1,)
INFO:root:page_stop: 2097152
INFO:root:page_mshape: (2097152,)
INFO:root:GET: http://147.46.36.145:5101/datasets/d-905c8edc-c3018d26-ac7a-76208f-722a8e/value [/public/39915.h5] timeout: 180
INFO:root:status: 200
INFO:root:retrieved 512 http_chunks  2097152 total bytes
INFO:root:binary response, 2097152 bytes
INFO:root:got arr: (2097152,), cleaning up shape!
INFO:root:selection: start (81788928,) stop [81870184] step (1,)
INFO:root:page_stop: 81870184
INFO:root:page_mshape: (81256,)
INFO:root:GET: http://147.46.36.145:5101/datasets/d-905c8edc-c3018d26-ac7a-76208f-722a8e/value [/public/39915.h5] timeout: 180
INFO:root:st

Successfully loaded ODS data for shot: 39915


INFO:root:selection: start (10485760,) stop [12582912] step (1,)
INFO:root:page_stop: 12582912
INFO:root:page_mshape: (2097152,)
INFO:root:GET: http://147.46.36.145:5101/datasets/d-d2e1f193-477d81c0-a1f1-2dcec4-189d1c/value [/public/39916.h5] timeout: 180
INFO:root:status: 200
INFO:root:retrieved 512 http_chunks  2097152 total bytes
INFO:root:binary response, 2097152 bytes
INFO:root:got arr: (2097152,), cleaning up shape!
INFO:root:selection: start (12582912,) stop [14680064] step (1,)
INFO:root:page_stop: 14680064
INFO:root:page_mshape: (2097152,)
INFO:root:GET: http://147.46.36.145:5101/datasets/d-d2e1f193-477d81c0-a1f1-2dcec4-189d1c/value [/public/39916.h5] timeout: 180
INFO:root:status: 200
INFO:root:retrieved 512 http_chunks  2097152 total bytes
INFO:root:binary response, 2097152 bytes
INFO:root:got arr: (2097152,), cleaning up shape!
INFO:root:selection: start (8388608,) stop [10485760] step (1,)
INFO:root:page_stop: 10485760
INFO:root:page_mshape: (2097152,)
INFO:root:GET: http:

Successfully loaded ODS data for shot: 39916


INFO:root:retrieved 512 http_chunks  2097152 total bytes
INFO:root:binary response, 2097152 bytes
INFO:root:got arr: (2097152,), cleaning up shape!
INFO:root:selection: start (6291456,) stop [8388608] step (1,)
INFO:root:page_stop: 8388608
INFO:root:page_mshape: (2097152,)
INFO:root:GET: http://147.46.36.145:5101/datasets/d-a06f6836-23b39f60-5729-c710d7-00c499/value [/public/39917.h5] timeout: 180
INFO:root:status: 200
INFO:root:retrieved 512 http_chunks  2097152 total bytes
INFO:root:binary response, 2097152 bytes
INFO:root:got arr: (2097152,), cleaning up shape!
INFO:root:selection: start (10485760,) stop [12582912] step (1,)
INFO:root:page_stop: 12582912
INFO:root:page_mshape: (2097152,)
INFO:root:GET: http://147.46.36.145:5101/datasets/d-a06f6836-23b39f60-5729-c710d7-00c499/value [/public/39917.h5] timeout: 180
INFO:root:status: 200
INFO:root:retrieved 512 http_chunks  2097152 total bytes
INFO:root:binary response, 2097152 bytes
INFO:root:got arr: (2097152,), cleaning up shape!
INF

Successfully loaded ODS data for shot: 39917
Successfully loaded a list of ODS data
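
Once several shots sit in one collection, comparing them usually means pulling the same quantity out of each entry. The sketch below uses plain dictionaries as stand-ins so that it runs without a database connection; whether an ODC behaves identically for iteration and membership tests has not been verified here. The helper `collect_across_shots`, the path `magnetics.ip.0.data`, and the numbers are illustrative assumptions.

```python
from typing import Any, Dict, Mapping


def collect_across_shots(collection: Mapping[str, Mapping[str, Any]],
                         path: str) -> Dict[str, Any]:
    """Gather the value stored at `path` from every entry that has it."""
    values: Dict[str, Any] = {}
    for key in collection:
        entry = collection[key]
        if path in entry:
            values[key] = entry[path]
    return values


# Stand-in data: dictionaries keyed like the collection built above.
fake_collection = {
    "0": {"magnetics.ip.0.data": [0.0, 1.0e4, 2.0e4]},
    "1": {"magnetics.ip.0.data": [0.0, 1.5e4, 3.0e4]},
    "2": {},  # an entry without this quantity is skipped
}

ip_by_shot = collect_across_shots(fake_collection, "magnetics.ip.0.data")
for key, ip in ip_by_shot.items():
    print(key, max(ip))
```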


In [9]:
# Saving to the server requires authentication and is now restricted to admin users.
vaft.database.save(ods, 39915)

INFO:root:GET: http://147.46.36.145:5101/about [None] timeout: (10, 1000)
INFO:root:status: 200
INFO:root:GET: http://147.46.36.145:5101/about [None] timeout: (10, 1000)
INFO:root:status: 200
INFO:root:GET: http://147.46.36.145:5101/about [None] timeout: (10, 1000)
INFO:root:status: 200
