# Intro To PyEntr and the ENTR Runtime

This notebook showcases the PyEntr connection classes, the OpenOA `from_entr()` constructor, and the OpenOA schema module, and describes the current implementation and future vision of this software.

## Directly querying the warehouse with PyEntr

PyEntr, which is developed [here](https://github.com/entralliance/py-entr), contains a module, `entr.connection`, that helps you connect to the ENTR warehouse and start querying it quickly. PyEntr is pre-installed in the ENTR Runtime, so if you are working inside the runtime you can simply run:

In [2]:
import entr.connection

### Querying the built-in Spark warehouse

In [3]:
conn = entr.connection.PySparkEntrConnection()

In [15]:
conn.pandas_query("show tables;")[:8]

| | namespace | tableName | isTemporary |
|---|---|---|---|
| 0 | entr_warehouse | dim_asset_reanalysis_dataset | False |
| 1 | entr_warehouse | dim_asset_wind_plant | False |
| 2 | entr_warehouse | dim_asset_wind_turbine | False |
| 3 | entr_warehouse | dim_entr_asset | False |
| 4 | entr_warehouse | dim_entr_tag_list | False |
| 5 | entr_warehouse | fct_entr_plant_data | False |
| 6 | entr_warehouse | fct_entr_reanalysis_data | False |
| 7 | entr_warehouse | fct_entr_time_series | False |


In [6]:
conn.pandas_query("select * from dim_asset_wind_plant limit 10;")

| | plant_id | plant_name | latitude | longitude | plant_capacity | number_of_turbines | turbine_capacity |
|---|---|---|---|---|---|---|---|
| 0 | 1 | La Haute Borne | 48.452 | 5.588 | 8.2 | 4 | 2.05 |
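Looking up a single plant by name is a natural next query. The sketch below illustrates the pattern with Python's built-in `sqlite3` module standing in for the Spark warehouse, so it runs anywhere. The in-memory table and the `lookup_plant` helper are assumptions made for illustration and are not part of PyEntr. Inside the ENTR Runtime, the equivalent would be a `WHERE plant_name = ...` clause passed to `conn.pandas_query`.

```python
import sqlite3

# Stand-in for the ENTR warehouse: an in-memory SQLite database holding one
# table with the same columns as dim_asset_wind_plant shown above.
db = sqlite3.connect(":memory:")
db.row_factory = sqlite3.Row  # lets rows be converted to dicts by column name
db.execute(
    "CREATE TABLE dim_asset_wind_plant ("
    "plant_id INTEGER, plant_name TEXT, latitude REAL, longitude REAL, "
    "plant_capacity REAL, number_of_turbines INTEGER, turbine_capacity REAL)"
)
db.execute(
    "INSERT INTO dim_asset_wind_plant VALUES (?, ?, ?, ?, ?, ?, ?)",
    (1, "La Haute Borne", 48.452, 5.588, 8.2, 4, 2.05),
)


def lookup_plant(connection, plant_name):
    """Hypothetical helper: return the row for plant_name as a dict, or None."""
    cursor = connection.execute(
        # A bound parameter keeps the plant name out of the SQL text itself.
        "SELECT * FROM dim_asset_wind_plant WHERE plant_name = ?",
        (plant_name,),
    )
    row = cursor.fetchone()
    return dict(row) if row is not None else None


plant_row = lookup_plant(db, "La Haute Borne")
print(plant_row["plant_id"], plant_row["plant_capacity"])
```

Returning `None` for an unknown name, rather than raising, leaves the caller free to decide how to handle a missing plant.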


## Using the openoa.PlantData.from_entr constructor

OpenOA, developed [here](https://github.com/NREL/OpenOA), is NREL's open-source operational analysis Python package for wind power plants. OpenOA is distributed with the ENTR Runtime, where it comes pre-installed and configured. A core component of OpenOA is the PlantData data model, which is implemented as a Python class containing several Pandas data frames. More information about the PlantData data model can be found in the [OpenOA documentation](http://openoa.readthedocs.io).

PyEntr provides a PlantData constructor that loads data directly from an ENTR warehouse into the OpenOA PlantData data model for analysis. This function is located at `entr.plantdata.from_entr` and, if PyEntr is installed, is automatically attached to `openoa.PlantData.from_entr` through a thin wrapper method.
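One way such an optional attachment could work is sketched below. This is not the actual OpenOA or PyEntr source: the `PlantData` class, the `from_entr` loader, and the `attach_if_available` helper are all hypothetical stand-ins, and the standard-library `json` module plays the role of an installed `entr` package so that the example runs without either library.

```python
import importlib.util


class PlantData:
    """Hypothetical stand-in for openoa.PlantData."""

    def __init__(self, name, schema=None):
        self.name = name
        self.schema = schema


def from_entr(cls, plant_name, schema=None):
    """Hypothetical loader standing in for entr.plantdata.from_entr."""
    return cls(plant_name, schema=schema)


def attach_if_available(target_cls, module_name, loader):
    """Attach loader as target_cls.from_entr only if module_name is importable.

    module_name is assumed to be a top-level module name.
    """
    if importlib.util.find_spec(module_name) is None:
        return False  # optional dependency missing: leave the class untouched
    target_cls.from_entr = classmethod(loader)
    return True


# "json" is always importable, so it stands in for an installed PyEntr here.
attached = attach_if_available(PlantData, "json", from_entr)
plant_sketch = PlantData.from_entr("La Haute Borne", schema="MonteCarloAEP")
print(attached, plant_sketch.name, plant_sketch.schema)
```

Checking for the module before attaching means the host class still imports cleanly when the optional package is absent.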

In [7]:
import openoa

In [8]:
plant = openoa.PlantData.from_entr("La Haute Borne", schema="MonteCarloAEP")

{'meter': {'columns': ['MMTR_SupWh'], 'freq': ('MS', 'W', 'D', 'H', 'T', 'min', 'S', 'L', 'ms', 'U', 'us', 'N')}, 'curtail': {'columns': ['IAVL_DnWh', 'IAVL_ExtPwrDnWh'], 'freq': ('MS', 'W', 'D', 'H', 'T', 'min', 'S', 'L', 'ms', 'U', 'us', 'N')}, 'reanalysis': {'columns': ['WMETR_HorWdSpd', 'WMETR_AirDen'], 'conditional_columns': {'reg_temperature': ['WMETR_EnvTmp'], 'reg_wind_direction': ['WMETR_HorWdSpdU', 'WMETR_HorWdSpdV']}, 'freq': ('MS', 'W', 'D', 'H', 'T', 'min', 'S', 'L', 'ms', 'U', 'us', 'N')}}
meter
'MMTR.SupWh'
SELECT interval_s, value_type, value_units FROM openoa_revenue_meter_tag_metadata WHERE entr_tag_name in ('MMTR.SupWh');
   interval_s value_type value_units
0  600.000000        sum         kWh
SELECT float(`MMTR.SupWh`) as MMTR_SupWh , date_time as time FROM openoa_revenue_meter WHERE plant_id = 1 ORDER BY time;
curtail
'IAVL.DnWh','IAVL.ExtPwrDnWh'
SELECT interval_s, value_type, value_units FROM openoa_curtailment_and_availability_tag_metadata WHERE entr_tag_name i
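The log above suggests a pattern: each schema column name such as `MMTR_SupWh` corresponds to a warehouse tag such as `MMTR.SupWh`, and each table is queried twice, once for tag metadata and once for the data itself. The following sketch reconstructs that pattern as inferred from the printed output only. The helpers `column_to_tag` and `build_queries` are hypothetical and may differ from how PyEntr actually builds its SQL.

```python
def column_to_tag(column):
    """Turn a column name like MMTR_SupWh into the tag MMTR.SupWh.

    Assumption: only the first underscore separates the two parts of the tag.
    """
    prefix, _, rest = column.partition("_")
    return f"{prefix}.{rest}"


def build_queries(table, columns, plant_id):
    """Return (metadata_sql, data_sql) for one warehouse table."""
    tags = [column_to_tag(c) for c in columns]
    tag_list = ",".join(f"'{t}'" for t in tags)
    metadata_sql = (
        "SELECT interval_s, value_type, value_units "
        f"FROM {table}_tag_metadata WHERE entr_tag_name in ({tag_list});"
    )
    # Each tag column is cast to float and renamed to its PlantData column.
    select_list = ", ".join(
        f"float(`{t}`) as {c}" for t, c in zip(tags, columns)
    )
    data_sql = (
        f"SELECT {select_list}, date_time as time "
        # int() guards against a non-numeric plant_id reaching the SQL text.
        f"FROM {table} WHERE plant_id = {int(plant_id)} ORDER BY time;"
    )
    return metadata_sql, data_sql


metadata_sql, data_sql = build_queries("openoa_revenue_meter", ["MMTR_SupWh"], 1)
print(metadata_sql)
print(data_sql)
```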

In [9]:
plant

PlantData(metadata=PlantMetaData(latitude=48.452, longitude=5.588, capacity=8.2, scada=SCADAMetaData(time='time', WTUR_TurNam='WTUR_TurNam', WTUR_W='WTUR_W', WMET_HorWdSpd='WMET_HorWdSpd', WMET_HorWdDir='WMET_HorWdDir', WTUR_TurSt='WTUR_TurSt', WROT_BlPthAngVal='WROT_BlPthAngVal', WMET_EnvTmp='WMET_EnvTmp', frequency='10T', name='scada', WTUR_SupWh='WTUR_SupWh', col_map={'time': 'time', 'WTUR_TurNam': 'WTUR_TurNam', 'WTUR_W': 'WTUR_W', 'WMET_HorWdSpd': 'WMET_HorWdSpd', 'WMET_HorWdDir': 'WMET_HorWdDir', 'WTUR_TurSt': 'WTUR_TurSt', 'WROT_BlPthAngVal': 'WROT_BlPthAngVal', 'WMET_EnvTmp': 'WMET_EnvTmp', 'WTUR_SupWh': 'WTUR_SupWh'}, col_map_reversed={'time': 'time', 'WTUR_TurNam': 'WTUR_TurNam', 'WTUR_W': 'WTUR_W', 'WMET_HorWdSpd': 'WMET_HorWdSpd', 'WMET_HorWdDir': 'WMET_HorWdDir', 'WTUR_TurSt': 'WTUR_TurSt', 'WROT_BlPthAngVal': 'WROT_BlPthAngVal', 'WMET_EnvTmp': 'WMET_EnvTmp', 'WTUR_SupWh': 'WTUR_SupWh'}, dtypes={'time': <class 'numpy.datetime64'>, 'WTUR_TurNam': <class 'str'>, 'WTUR_W': <c

In [11]:
plant.meter

| time | MMTR_SupWh |
|---|---|
| 2014-01-01 00:00:00 | 369.726013 |
| 2014-01-01 00:10:00 | 376.408997 |
| 2014-01-01 00:20:00 | 309.199005 |
| 2014-01-01 00:30:00 | 350.175995 |
| 2014-01-01 00:40:00 | 286.333008 |
| ... | ... |
| 2015-12-31 23:10:00 | 147.141006 |
| 2015-12-31 23:20:00 | 194.841995 |
| 2015-12-31 23:30:00 | 180.688004 |
| 2015-12-31 23:40:00 | 149.039001 |


## Future Directions

The roadmap includes several ideas for improving the PyEntr connection.

### More flexible schemas

### Directly ingest PySpark dataframes

### How to contribute?

Attend tomorrow's tutorial session!