# 00 Calculate Correlation Matrix
This notebook provides functionality to calculate correlation metrics for all EntryGroups currently available to metacatalog.

### Data available:

Status: 08.09.2021
1. LTZ
2. DWD
3. LUBW: split datasets, loaded as ImmutableResultSets -> correlations are calculated for only HALF of the Entries (all data is still included, because the split Entries are merged)
4. Buehlot
5. Eddy
6. Sapflow
7. Lysimeter

In [1]:
from metacatalog import ext, api
from metacatalog_corr import metrics, extension, manage

# import thesis_functions.py
import sys
sys.path.append('../')

import thesis_functions

#### Create database session:

In [2]:
# Local
CONNECTION = 'mc_corr_final'

In [3]:
session = api.connect_database(CONNECTION)
print(session.bind)

Engine(postgresql://postgres:***@localhost:5432/mc_corr_final)


#### Activate extension if not done before:

In [4]:
if 'corr' not in ext.EXTENSIONS.keys():
    ext.activate_extension('corr', 'metacatalog_corr.extension', 'CorrExtension')

#### Populate defaults in table CorrelationMetric
Always run this after a new metric has been implemented.

In [5]:
manage.install(CONNECTION)

# Upload Data

In [6]:
UPLOAD = False

All available data can be uploaded using the following cell.  
A metacatalog database connection named **mc_corr** must be saved and available before uploads are performed.  
Note that the upload notebooks in *./upload_scripts/* have been heavily modified to run within this notebook. Some paths have been changed and most of the output has been suppressed to make the output readable in this notebook. The original versions of the upload scripts can be found at https://github.com/VForWaTer/scripts (private).

This is an attempt to make the data upload as simple and easy as possible for anyone who wants to follow the practical part of the master thesis and may have less experience with metacatalog.

In [7]:
# LTZ and DWD
if UPLOAD and False:
    %run ./upload_scripts/ltz_dwd/upload_ltz_dwd.ipynb

# LUBW
if UPLOAD and False:
    %run ./upload_scripts/lubw_gauge/upload_lubw.ipynb
    
# Buehlot
if UPLOAD and False:
    %run ./upload_scripts/buehlot/upload_buehlot_kit.ipynb
    
# Eddy
if UPLOAD and False:
    %run ./upload_scripts/eddy/uploading_eddy.ipynb
    
# Sapflow
if UPLOAD and False:
    %run ./upload_scripts/sap_flow_upload/uploading_sap_flow.ipynb
    
# Lysimeter
if UPLOAD and False:
    %run ./upload_scripts/lysimeter/Fendt1.ipynb
    %run ./upload_scripts/lysimeter/Fendt2.ipynb
    %run ./upload_scripts/lysimeter/Fendt3.ipynb
    %run ./upload_scripts/lysimeter/Grasswang.ipynb

# EntryGroups

In [7]:
entry_groups = []
entry_groups.extend(api.find_group(session, type=1, title='LTZ Augustenberg'))
entry_groups.extend(api.find_group(session, type=1, title='DWD station Rheinstetten'))
entry_groups.extend(api.find_group(session, type=1, title='Bühlot Dataset'))
entry_groups.extend(api.find_group(session, type=1, title='Sap Flow - Hohes Holz'))

# LUBW gauge network: Split datasets -> get result set to merge Split datasets
entry_groups.extend(api.find_group(session, type=1, title='LUBW gauge network', as_result=True))

entry_groups.extend(api.find_group(session, type=2, title='*Eddy*'))
entry_groups.extend(api.find_group(session, type=4))


from metacatalog.util.results import ImmutableResultSet

for g in entry_groups:
    # result sets carry their title on the wrapped group
    if isinstance(g, ImmutableResultSet):
        print(g.group.title)
    else:
        print(g.title)

LTZ Augustenberg
DWD station Rheinstetten
Bühlot Dataset
Sap Flow - Hohes Holz
LUBW gauge network
Fendt dataset: Eddy covariance data
Fendt 1 TERENO preAlpine Observatory / SUSALPS
Fendt 2 TERENO preAlpine Observatory / SUSALPS
Fendt 3 TERENO preAlpine Observatory / SUSALPS
Grasswang TERENO preAlpine Observatory / SUSALPS


In [8]:
# save all grouped Entries to dict, use for calculation later on
entries_dict = {}
entries_dict['LTZ'] = entry_groups[0].entries
entries_dict['DWD'] = entry_groups[1].entries
entries_dict['Bühlot'] = entry_groups[2].entries
entries_dict['SapFlowHoH'] = entry_groups[3].entries
entries_dict['LUBW'] = entry_groups[4]._members
entries_dict['FendtEddy'] = entry_groups[5].entries
entries_dict['Fendt1'] = entry_groups[6].entries
entries_dict['Fendt2'] = entry_groups[7].entries
entries_dict['Fendt3'] = entry_groups[8].entries
entries_dict['Grasswang'] = entry_groups[9].entries
entries_dict

{'LTZ': [<metacatalog.models.entry.Entry at 0x7ff2e22c0fa0>,
  <metacatalog.models.entry.Entry at 0x7ff305812fa0>,
  <metacatalog.models.entry.Entry at 0x7ff30584c2b0>,
  <metacatalog.models.entry.Entry at 0x7ff3058773a0>,
  <metacatalog.models.entry.Entry at 0x7ff2e23be250>,
  <metacatalog.models.entry.Entry at 0x7ff2e23fdf40>],
 'DWD': [<metacatalog.models.entry.Entry at 0x7ff2e25fd910>,
  <metacatalog.models.entry.Entry at 0x7ff2e28285b0>,
  <metacatalog.models.entry.Entry at 0x7ff2e2853250>,
  <metacatalog.models.entry.Entry at 0x7ff2e2892ac0>,
  <metacatalog.models.entry.Entry at 0x7ff2e28fa340>,
  <metacatalog.models.entry.Entry at 0x7ff2e2964a30>,
  <metacatalog.models.entry.Entry at 0x7ff2e2974130>,
  <metacatalog.models.entry.Entry at 0x7ff2e29f6250>,
  <metacatalog.models.entry.Entry at 0x7ff2e2a1be20>,
  <metacatalog.models.entry.Entry at 0x7ff2e2a4f070>,
  <metacatalog.models.entry.Entry at 0x7ff2e2aa5130>,
  <metacatalog.models.entry.Entry at 0x7ff2e2afa040>,
  <metacatalo

In [9]:
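# LUBW: delete entries with no datasource
# Collect first, then filter: removing items from a list while iterating over it skips elements.
no_datasource = []
with_datasource = []

for e in entries_dict['LUBW']:
    if e._members[0].datasource is None or e._members[1].datasource is None:
        no_datasource.append(e)
    else:
        with_datasource.append(e)

# slice assignment changes the list in place, like the original remove() calls
entries_dict['LUBW'][:] = with_datasource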
        
print(f"LUBW ImmutableResultSets without datasource: {len(no_datasource)}")

LUBW ImmutableResultSets without datasource: 10


### Test: Upload was successful and complete?

In [9]:
# number of entries in entry_groups
from metacatalog.util.results import ImmutableResultSet

i = 0
for g in entry_groups:
    if isinstance(g, ImmutableResultSet):
        i += len(g._members) * 2  # 2 Entries / ImmutableResultSet
    else:
        i += len(g.entries)
    
# number of all entries in metacatalog
j = len(api.find_entry(session, include_partial=True))

# number of entries expected (09.09.2021)
k = 1153

# (i+2): 2 Entries are removed from the EntryGroups -> Bühlot Kernel Crash
if (i+2) == j == k:
    print('Upload successful: \n Number of Entries in the database equals the number of Entries expected (%d), all Entries are included in the EntryGroups.' % k)
else:
    print('Check upload: \n Expected %d Entries, database contains %d Entries, from which %d are contained in one of the EntryGroups' % (k,j,i))

Check upload: 
 Expected 1153 Entries, database contains 1153 Entries, from which 1133 are contained in one of the EntryGroups


# Calculate metrics

In [10]:
# these metrics are tested and should work
# (named tested_metrics so the imported metacatalog_corr.metrics module is not shadowed)
tested_metrics = ['pearson', 'spearman', 'dcor', 'mic', 'kendall_tau', 'weighted_tau', 
                  'hoeffdings_d', 'perc_bend', 'biweight_mid', 'shepherd',
                  'conditional_entropy', 'mutual_info', 'js_divergence', 'js_distance']

# not normalized metrics
not_norm_metrics = ['cross_entropy', 'kullback_leibler']

# these metrics are tested, produce results (?) but are very very slow!
slow_metrics = ['somers_d', 'skipped']

# select metrics: final master thesis
select_metrics = ['pearson', 'spearman', 'dcor', 'mic', 'kendall_tau', 'mutual_info', 'js_distance']

# ToDo
## Eddy Cov
15.11.21: possibly recalculate Eddy Covariance -> np.hstack() was missing, which led to odd cond_entropy values  
In general, check all Entries with an n-dimensional datasource

## Conditional Entropy:
Values >> 1: cond_entropy_normalized = cond_entropy / entropy(x) ->  
cond_entropy = -1.269206961751479e-12  
entropy(x) = -1.6017132519074597e-15  
--> **cond_entropy_normalized = 792.405856816129**  
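
The numbers in this note can be reproduced directly. The following is a minimal sketch, assuming the normalization is the plain ratio written above; the helper `normalized_cond_entropy` and its tolerance `eps` are hypothetical and not part of metacatalog_corr. Both inputs are numerically zero, so the ratio says nothing about the data, and a guard can flag that case instead of reporting a value far above 1.

```python
import math

def normalized_cond_entropy(cond_entropy, entropy_x, eps=1e-9):
    """Return cond_entropy / entropy_x, or NaN if entropy_x is numerically zero.

    Hypothetical helper; eps is an assumed tolerance, not a value from the thesis.
    """
    if abs(entropy_x) < eps:
        return math.nan
    return cond_entropy / entropy_x

# values copied from the note above
cond_entropy = -1.269206961751479e-12
entropy_x = -1.6017132519074597e-15

# plain ratio, as in the note: roughly 792.4
raw_ratio = cond_entropy / entropy_x
print(raw_ratio)

# with the guard, the degenerate case is flagged instead of reported as a value >> 1
guarded = normalized_cond_entropy(cond_entropy, entropy_x)
print(guarded)
```

Whether NaN, 0 or skipping the pair is the right treatment depends on how the metric is used later; the sketch only makes the degenerate case visible.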

# List
- [ ] 1) left: calculate LTZ
- [ ] 2) left: calculate DWD
- [ ] 3) left: calculate Bühlot -> Eddy table -> kernel crash -> left out for now
- [ ] 4) left: Sap Flow Hohes Holz -> Sap Flow - Sap Flow, id 1075 left out (kernel crash)
- [ ] 5) left: calculate LUBW
- [ ] 6) left: calculate Fendt 1
- [ ] 7) left: calculate Fendt 2
- [ ] 8) left: calculate Fendt 3
- [ ] 9) left: calculate Grasswang
- [ ] LUBW: the ImmutableResultSet implementation did not work -> the merge was missing -> recalculate LUBW 1, 2, 3, 4


## Dataset extents:
- **LTZ**: 2007 - 2011
- **DWD**: 2008 - 2020
- **Bühlot**: 2012 - 2021
- **Sap Flow HoH**: 2015 - 2015
- **LUBW**: 1826 - 2018, strongly varying
- **Fendt Eddy**: 2014 - 2018
- **Fendt 1 Tereno**: 2012 - 2014 (daily)
- **Fendt 2 Tereno**: 2012 - 2014 (daily)
- **Fendt 3 Tereno**: 2012 - 2014 (daily)
- **Grasswang Tereno**: 2012 - 2014 (daily)
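
The extents matter because two series can only be correlated over a common period. As a rough illustration, the sketch below takes the year ranges from the list above and reports which dataset pairs share no common year at all. The helper `overlap_years` is hypothetical, and year granularity is coarse: individual Entries (especially in LUBW) cover very different periods, so an overlap at group level does not guarantee one for each pair of Entries.

```python
from itertools import combinations

# year ranges copied from the list above (first year, last year)
extents = {
    'LTZ': (2007, 2011),
    'DWD': (2008, 2020),
    'Bühlot': (2012, 2021),
    'SapFlowHoH': (2015, 2015),
    'LUBW': (1826, 2018),
    'FendtEddy': (2014, 2018),
    'Fendt1': (2012, 2014),
    'Fendt2': (2012, 2014),
    'Fendt3': (2012, 2014),
    'Grasswang': (2012, 2014),
}

def overlap_years(a, b):
    """Return the common (first, last) year of two ranges, or None if they are disjoint."""
    start, end = max(a[0], b[0]), min(a[1], b[1])
    return (start, end) if start <= end else None

# all dataset pairs without a single common year
no_overlap = [(left, right) for left, right in combinations(extents, 2)
              if overlap_years(extents[left], extents[right]) is None]

for left, right in no_overlap:
    print(f"{left} - {right}: no common years")
```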

#### Function to calculate table correlation_matrix for one left group and a list of right groups:

In [12]:
from metacatalog.models import EntryGroup, Entry
from metacatalog.util.results import ImmutableResultSet

def calculate_corr_matrix(left_group, right_groups, metrics, identifier):
    """
    Calculate correlation metrics between left_group and every group in right_groups.
    
    Parameters
    ----------
    left_group : list of Entries or ImmutableResultSets
        e.g. EntryGroup.entries or ImmutableResultSet._members
    right_groups : list of list of Entries or ImmutableResultSets
        e.g. [EntryGroup.entries, EntryGroup.entries, ImmutableResultSet._members]
    metrics : list of str
        All metrics for calculation (metric.symbol)
    identifier : str
        Identifier in table correlation_matrix
    """
    for right_group in right_groups:
        # left: ImmutableResultSets, right: ImmutableResultSets
        if isinstance(left_group[0], ImmutableResultSet) and isinstance(right_group[0], ImmutableResultSet):
            print(left_group[0].group.title, '-', right_group[0].group.title)
        
        # left: Entries, right: ImmutableResultSets
        elif isinstance(left_group[0], Entry) and isinstance(right_group[0], ImmutableResultSet):
            print(left_group[0].associated_groups[0].title, '-', right_group[0]._members[0].associated_groups[1].title)
        
        # left: ImmutableResultSets, right: Entries
        elif isinstance(left_group[0], ImmutableResultSet) and isinstance(right_group[0], Entry):
            print(left_group[0]._members[0].associated_groups[1].title, '-', right_group[0].associated_groups[0].title)
        
        # left: Entries, right: Entries
        elif isinstance(left_group[0], Entry) and isinstance(right_group[0], Entry):
            print(left_group[0].associated_groups[0].title, '-', right_group[0].associated_groups[0].title)

        for left_entry in left_group:
            extension.index_correlation_matrix(left_entry, 
                                               right_group,
                                               metrics=metrics,
                                               harmonize=True, 
                                               identifier=identifier,
                                               if_exists='replace', commit=True, verbose=True)

# LTZ calculation
- **left**:
 - **LTZ**: 2007 - 2011

- **right**:
 - **LTZ**: 2007 - 2011
 - **DWD**: 2008 - 2020, without daily resolutions (harmonize data: LTZ only at midnight in this case)
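
Excluding the daily DWD Entries comes down to comparing resolution strings. The sketch below runs without a database and assumes that resolutions are ISO 8601 duration strings such as `P1DT0H0M0S`, as in the filter used in this notebook; the `resolutions` list and the `DAILY` constant are made up for illustration. It also shows a Python detail that is easy to miss: `('P1DT0H0M0S')` is a parenthesised string, not a tuple, so `in` performs a substring test on it.

```python
DAILY = 'P1DT0H0M0S'  # ISO 8601 duration: one day

# made-up example values, not taken from the database;
# the last one is deliberately malformed to expose the substring behaviour
resolutions = ['P1DT0H0M0S', 'P0DT1H0M0S', 'P0DT0H10M0S', 'T0H0M0S']

# equality keeps everything that is not exactly the daily resolution
sub_daily = [r for r in resolutions if r != DAILY]

# membership in a real one-element tuple (note the trailing comma) behaves the same
sub_daily_tuple = [r for r in resolutions if r not in (DAILY,)]

# without the comma the right-hand side is a plain string, so `in` tests substrings:
# 'T0H0M0S' is dropped as well, because it is contained in 'P1DT0H0M0S'
substring_test = [r for r in resolutions if r not in (DAILY)]

print(sub_daily)
print(substring_test)
```

For well-formed resolution strings the two variants usually agree, but only the equality test (or a real tuple) expresses the intended "not daily" condition.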

Start: Saturday, 22.01.2022, 20:05 

In [13]:
# recalculate MI and JSD afterwards
#select_metrics = ['mutual_info', 'js_distance']

# DWD: exclude entries with daily resolution
# (compare with !=: `not in ('P1DT0H0M0S')` would be a substring test on a string, not a tuple lookup)
dwd_entries = [e for e in entries_dict['DWD'] if e.datasource.temporal_scale.resolution != 'P1DT0H0M0S']

# list of right groups
right_groups = [entries_dict['LTZ'], dwd_entries]

# calculate
calculate_corr_matrix(left_group=entries_dict['LTZ'], 
                      right_groups=right_groups,
                      metrics=select_metrics, identifier='FINAL')

LTZ Augustenberg - LTZ Augustenberg


100%|██████████████████████████████████████████| 6/6 [00:12<00:00,  2.09s/cells]
100%|██████████████████████████████████████████| 6/6 [00:13<00:00,  2.19s/cells]
100%|██████████████████████████████████████████| 6/6 [00:12<00:00,  2.05s/cells]
100%|██████████████████████████████████████████| 6/6 [00:12<00:00,  2.09s/cells]
100%|██████████████████████████████████████████| 6/6 [00:10<00:00,  1.83s/cells]
100%|██████████████████████████████████████████| 6/6 [00:12<00:00,  2.14s/cells]


LTZ Augustenberg - DWD station Rheinstetten


100%|████████████████████████████████████████| 14/14 [03:40<00:00, 15.77s/cells]
100%|████████████████████████████████████████| 14/14 [03:51<00:00, 16.54s/cells]
100%|████████████████████████████████████████| 14/14 [04:06<00:00, 17.58s/cells]
100%|████████████████████████████████████████| 14/14 [03:41<00:00, 15.81s/cells]
100%|████████████████████████████████████████| 14/14 [04:50<00:00, 20.78s/cells]
100%|████████████████████████████████████████| 14/14 [03:29<00:00, 15.00s/cells]


# DWD calculation
- **left**
 - **DWD**: 2008 - 2020, exclude daily
- **right**
 - **DWD**: 2008 - 2020, exclude daily
 - **Bühlot**: 2012 - 2021
 - **Sap Flow HoH**: 2015 - 2015
 - **LUBW**: 1826 - 2018, varies strongly
 - **Fendt Eddy**: 2014 - 2018
 
right calculated before:
- **LTZ**: 2007 - 2011

exclude Lysimeters, calculate daily Lysimeter with daily DWD further below
- **Fendt 1 Tereno**: 2012 - 2014 (daily)
- **Fendt 2 Tereno**: 2012 - 2014 (daily)
- **Fendt 3 Tereno**: 2012 - 2014 (daily)
- **Grasswang Tereno**: 2012 - 2014 (daily)
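
The left/right lists in these sections follow a triangular scheme: each group is correlated with itself and with the groups that come after it, while pairs with earlier groups are listed under "right calculated before". A minimal sketch of that schedule, using only group names in the order of `entries_dict`; `pairing_schedule` is a hypothetical helper that does not call metacatalog, and it assumes each pair only needs to be computed in one direction. The notebook additionally drops some combinations (for example the daily lysimeter data), which the sketch ignores.

```python
def pairing_schedule(groups):
    """Map each left group to itself plus all groups that follow it in the list."""
    return {left: groups[i:] for i, left in enumerate(groups)}

# group keys as used in entries_dict, lysimeters left out
groups = ['LTZ', 'DWD', 'Bühlot', 'SapFlowHoH', 'LUBW', 'FendtEddy']
schedule = pairing_schedule(groups)

for left, rights in schedule.items():
    print(f"left: {left:<11} right: {', '.join(rights)}")

# number of group pairs: n * (n + 1) / 2 instead of n * n
n_pairs = sum(len(rights) for rights in schedule.values())
print(n_pairs)
```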

In [14]:
# recalculate MI and JSD afterwards
select_metrics = ['mutual_info', 'js_distance']

# DWD: exclude entries with daily resolution
# (compare with !=: `not in ('...')` would be a substring test on a string, not a tuple lookup)
dwd_entries = [e for e in entries_dict['DWD'] if e.datasource.temporal_scale.resolution != 'P1DT0H0M0S']
# Fendt Eddy: exclude Eddy Covariance table
fendt_eddy_entries = [e for e in entries_dict['FendtEddy'] if e.variable.name != 'Eddy Covariance']

right_groups = [dwd_entries, 
                entries_dict['Bühlot'], 
                entries_dict['SapFlowHoH'], 
                entries_dict['LUBW'],
                fendt_eddy_entries]

# calculate
calculate_corr_matrix(left_group=dwd_entries, 
                      right_groups=right_groups,
                      metrics=select_metrics, identifier='FINAL')

DWD station Rheinstetten - DWD station Rheinstetten


100%|████████████████████████████████████████| 14/14 [04:17<00:00, 18.41s/cells]
100%|████████████████████████████████████████| 14/14 [03:31<00:00, 15.10s/cells]
100%|████████████████████████████████████████| 14/14 [08:16<00:00, 35.46s/cells]
100%|████████████████████████████████████████| 14/14 [06:20<00:00, 27.20s/cells]
100%|████████████████████████████████████████| 14/14 [04:20<00:00, 18.59s/cells]
100%|████████████████████████████████████████| 14/14 [03:59<00:00, 17.13s/cells]
100%|████████████████████████████████████████| 14/14 [06:03<00:00, 25.98s/cells]
100%|████████████████████████████████████████| 14/14 [04:33<00:00, 19.55s/cells]
100%|████████████████████████████████████████| 14/14 [03:31<00:00, 15.11s/cells]
100%|████████████████████████████████████████| 14/14 [10:19<00:00, 44.28s/cells]
100%|████████████████████████████████████████| 14/14 [11:36<00:00, 49.74s/cells]
100%|████████████████████████████████████████| 14/14 [04:02<00:00, 17.33s/cells]
100%|███████████████████████

DWD station Rheinstetten - Bühlot Dataset


100%|████████████████████████████████████████| 44/44 [04:34<00:00,  6.24s/cells]
100%|████████████████████████████████████████| 44/44 [08:18<00:00, 11.33s/cells]
100%|████████████████████████████████████████| 44/44 [11:31<00:00, 15.71s/cells]
100%|████████████████████████████████████████| 44/44 [08:20<00:00, 11.37s/cells]
100%|████████████████████████████████████████| 44/44 [04:26<00:00,  6.05s/cells]
100%|████████████████████████████████████████| 44/44 [03:38<00:00,  4.98s/cells]
100%|████████████████████████████████████████| 44/44 [08:05<00:00, 11.04s/cells]
100%|████████████████████████████████████████| 44/44 [04:28<00:00,  6.11s/cells]
100%|████████████████████████████████████████| 44/44 [03:40<00:00,  5.00s/cells]
100%|████████████████████████████████████████| 44/44 [11:30<00:00, 15.70s/cells]
100%|████████████████████████████████████████| 44/44 [08:21<00:00, 11.40s/cells]
100%|████████████████████████████████████████| 44/44 [04:59<00:00,  6.81s/cells]
100%|███████████████████████

DWD station Rheinstetten - Sap Flow - Hohes Holz


100%|████████████████████████████████████████| 17/17 [17:14<00:00, 60.85s/cells]
100%|████████████████████████████████████████| 17/17 [10:18<00:00, 36.40s/cells]
100%|█████████████████████████████████████| 17/17 [1:13:03<00:00, 257.87s/cells]
100%|███████████████████████████████████████| 17/17 [41:09<00:00, 145.25s/cells]
100%|████████████████████████████████████████| 17/17 [08:38<00:00, 30.51s/cells]
100%|████████████████████████████████████████| 17/17 [01:26<00:00,  5.08s/cells]
100%|█████████████████████████████████████| 17/17 [1:27:49<00:00, 309.97s/cells]
100%|████████████████████████████████████████| 17/17 [15:00<00:00, 52.95s/cells]
100%|████████████████████████████████████████| 17/17 [02:56<00:00, 10.40s/cells]
100%|████████████████████████████████████| 17/17 [4:46:02<00:00, 1009.55s/cells]
100%|█████████████████████████████████████| 17/17 [4:11:58<00:00, 889.33s/cells]
100%|█████████████████████████████████████| 17/17 [1:43:06<00:00, 363.92s/cells]
100%|███████████████████████

DWD station Rheinstetten - LUBW gauge network


100%|██████████████████████████████████████| 484/484 [40:07<00:00,  4.97s/cells]
100%|██████████████████████████████████████| 484/484 [40:30<00:00,  5.02s/cells]
100%|██████████████████████████████████████| 484/484 [40:36<00:00,  5.03s/cells]
100%|██████████████████████████████████████| 484/484 [40:45<00:00,  5.05s/cells]
100%|██████████████████████████████████████| 484/484 [42:06<00:00,  5.22s/cells]
100%|██████████████████████████████████████| 484/484 [40:58<00:00,  5.08s/cells]
100%|██████████████████████████████████████| 484/484 [41:00<00:00,  5.08s/cells]
100%|██████████████████████████████████████| 484/484 [40:59<00:00,  5.08s/cells]
100%|██████████████████████████████████████| 484/484 [41:09<00:00,  5.10s/cells]
100%|██████████████████████████████████████| 484/484 [41:06<00:00,  5.10s/cells]
100%|██████████████████████████████████████| 484/484 [41:06<00:00,  5.10s/cells]
100%|██████████████████████████████████████| 484/484 [41:07<00:00,  5.10s/cells]
100%|███████████████████████

DWD station Rheinstetten - Fendt dataset: Eddy covariance data


100%|██████████████████████████████████████████| 6/6 [00:25<00:00,  4.22s/cells]
100%|██████████████████████████████████████████| 6/6 [00:08<00:00,  1.48s/cells]
100%|██████████████████████████████████████████| 6/6 [00:42<00:00,  7.09s/cells]
100%|██████████████████████████████████████████| 6/6 [00:43<00:00,  7.25s/cells]
100%|██████████████████████████████████████████| 6/6 [00:24<00:00,  4.11s/cells]
100%|██████████████████████████████████████████| 6/6 [00:09<00:00,  1.62s/cells]
100%|██████████████████████████████████████████| 6/6 [00:42<00:00,  7.05s/cells]
100%|██████████████████████████████████████████| 6/6 [00:25<00:00,  4.20s/cells]
100%|██████████████████████████████████████████| 6/6 [00:09<00:00,  1.67s/cells]
100%|██████████████████████████████████████████| 6/6 [00:40<00:00,  6.68s/cells]
100%|██████████████████████████████████████████| 6/6 [00:37<00:00,  6.18s/cells]
100%|██████████████████████████████████████████| 6/6 [00:22<00:00,  3.71s/cells]
100%|███████████████████████

# LTZ & DWD repeat calculation for MI and JSD!!
07.02.2022
- [ ] LTZ
- [ ] DWD  

done: 24.02.2022

## Bühlot calculation
Fendt EC: **leave out the Eddy table (kernel crash on one column)**
- **left**:
 - **Bühlot**: 2012 - 2021

- **right**
 - **Bühlot**: 2012 - 2021
 - **Sap Flow HoH**: 2015 - 2015
 - **LUBW**: 1826 - 2018, strongly varying
 - **Fendt Eddy**: 2014 - 2018
 
right calculated before:
- **LTZ**: 2007 - 2011
- **DWD**: 2008 - 2020

exclude Lysimeters (daily)
- **Fendt 1 Tereno**: 2012 - 2014 (daily)
- **Fendt 2 Tereno**: 2012 - 2014 (daily)
- **Fendt 3 Tereno**: 2012 - 2014 (daily)
- **Grasswang Tereno**: 2012 - 2014 (daily)

In [None]:
# Fendt Eddy: exclude Eddy Covariance table
# (compare with !=: `not in ('Eddy Covariance')` would be a substring test on a string)
fendt_eddy_entries = [e for e in entries_dict['FendtEddy'] if e.variable.name != 'Eddy Covariance']

right_groups = [entries_dict['Bühlot'], 
                entries_dict['SapFlowHoH'], 
                entries_dict['LUBW'],
                fendt_eddy_entries]

# calculate
calculate_corr_matrix(left_group=entries_dict['Bühlot'], 
                      right_groups=right_groups,
                      metrics=select_metrics, identifier='FINAL')

Bühlot Dataset - Bühlot Dataset


100%|███████████████████████████████████| 44/44 [49:56:10<00:00, 4085.70s/cells]
100%|█████████████████████████████████████| 44/44 [2:00:32<00:00, 164.38s/cells]
100%|███████████████████████████████████| 44/44 [70:03:14<00:00, 5731.70s/cells]
100%|█████████████████████████████████████| 44/44 [2:29:32<00:00, 203.93s/cells]
100%|███████████████████████████████████| 44/44 [73:50:03<00:00, 6040.99s/cells]
100%|█████████████████████████████████████| 44/44 [1:44:36<00:00, 142.64s/cells]
100%|███████████████████████████████████| 44/44 [58:10:21<00:00, 4759.58s/cells]
100%|█████████████████████████████████████| 44/44 [1:41:57<00:00, 139.04s/cells]
100%|███████████████████████████████████| 44/44 [57:58:41<00:00, 4743.68s/cells]
100%|█████████████████████████████████████| 44/44 [1:32:47<00:00, 126.54s/cells]
100%|███████████████████████████████████| 44/44 [23:47:59<00:00, 1947.26s/cells]
100%|███████████████████████████████████| 44/44 [29:59:44<00:00, 2454.19s/cells]
 18%|██████                 

BÜHLOT ABORTED!

**Bühlot is now included further below, together with LUBW!**

# Sap Flow HoH calculation
- **left**:
 - **Sap Flow HoH**: 2015 - 2015

- **right**
 - **Sap Flow HoH**: 2015 - 2015
 - **LUBW**: 1826 - 2018, strongly varying
 - **Fendt Eddy**: 2014 - 2018
 
right calculated before:
- **LTZ**: 2007 - 2011
- **DWD**: 2008 - 2020
- **Bühlot**: 2012 - 2021

exclude Lysimeters (daily)
- **Fendt 1 Tereno**: 2012 - 2014 (daily)
- **Fendt 2 Tereno**: 2012 - 2014 (daily)
- **Fendt 3 Tereno**: 2012 - 2014 (daily)
- **Grasswang Tereno**: 2012 - 2014 (daily)

In [None]:
# Fendt Eddy: exclude Eddy Covariance table
# (compare with !=: `not in ('Eddy Covariance')` would be a substring test on a string)
fendt_eddy_entries = [e for e in entries_dict['FendtEddy'] if e.variable.name != 'Eddy Covariance']

right_groups = [entries_dict['SapFlowHoH'], 
                entries_dict['LUBW'],
                fendt_eddy_entries]

# calculate
calculate_corr_matrix(left_group=entries_dict['SapFlowHoH'], 
                      right_groups=right_groups,
                      metrics=select_metrics, identifier='FINAL')

# LUBW calculation
- **left**:
 - **LUBW**: 1826 - 2018, varies greatly

- **right**
 - **LUBW**: 1826 - 2018, varies greatly
 - **Fendt Eddy**: 2014 - 2018
 
right calculated before:
- **LTZ**: 2007 - 2011
- **DWD**: 2008 - 2020
- **Bühlot**: 2012 - 2021 -> aborted above, now included here after all!
- **Sap Flow HoH**: 2015 - 2015

exclude Lysimeters (daily)
- **Fendt 1 Tereno**: 2012 - 2014 (daily)
- **Fendt 2 Tereno**: 2012 - 2014 (daily)
- **Fendt 3 Tereno**: 2012 - 2014 (daily)
- **Grasswang Tereno**: 2012 - 2014 (daily)

In [16]:
select_metrics = ['pearson', 'spearman', 'dcor', 'mic', 'kendall_tau', 'mutual_info', 'js_distance']

# Fendt Eddy: exclude Eddy Covariance table
# (compare with !=: `not in ('Eddy Covariance')` would be a substring test on a string)
fendt_eddy_entries = [e for e in entries_dict['FendtEddy'] if e.variable.name != 'Eddy Covariance']

right_groups = [entries_dict['LUBW']]

# calculate
calculate_corr_matrix(left_group=entries_dict['LUBW'], 
                      right_groups=right_groups,
                      metrics=select_metrics, identifier='FINAL')

LUBW gauge data: Möhringen - LUBW gauge data: Möhringen


100%|██████████████████████████████████████| 484/484 [41:33<00:00,  5.15s/cells]
100%|██████████████████████████████████████| 484/484 [41:50<00:00,  5.19s/cells]
100%|██████████████████████████████████████| 484/484 [41:43<00:00,  5.17s/cells]
100%|██████████████████████████████████████| 484/484 [41:46<00:00,  5.18s/cells]
100%|██████████████████████████████████████| 484/484 [41:43<00:00,  5.17s/cells]
100%|██████████████████████████████████████| 484/484 [41:40<00:00,  5.17s/cells]
100%|██████████████████████████████████████| 484/484 [41:49<00:00,  5.19s/cells]
100%|██████████████████████████████████████| 484/484 [41:44<00:00,  5.17s/cells]
100%|██████████████████████████████████████| 484/484 [41:47<00:00,  5.18s/cells]
100%|██████████████████████████████████████| 484/484 [41:48<00:00,  5.18s/cells]
100%|██████████████████████████████████████| 484/484 [41:33<00:00,  5.15s/cells]
100%|██████████████████████████████████████| 484/484 [41:48<00:00,  5.18s/cells]
100%|███████████████████████

100%|██████████████████████████████████████| 484/484 [42:30<00:00,  5.27s/cells]
100%|██████████████████████████████████████| 484/484 [42:30<00:00,  5.27s/cells]
100%|██████████████████████████████████████| 484/484 [42:30<00:00,  5.27s/cells]
100%|██████████████████████████████████████| 484/484 [42:31<00:00,  5.27s/cells]
100%|██████████████████████████████████████| 484/484 [42:33<00:00,  5.28s/cells]
100%|████████████████████████████████████| 484/484 [1:37:36<00:00, 12.10s/cells]
100%|██████████████████████████████████████| 484/484 [45:25<00:00,  5.63s/cells]
100%|██████████████████████████████████████| 484/484 [43:44<00:00,  5.42s/cells]
100%|██████████████████████████████████████| 484/484 [44:14<00:00,  5.49s/cells]
100%|██████████████████████████████████████| 484/484 [45:11<00:00,  5.60s/cells]
100%|██████████████████████████████████████| 484/484 [44:13<00:00,  5.48s/cells]
100%|██████████████████████████████████████| 484/484 [44:15<00:00,  5.49s/cells]
100%|███████████████████████

100%|██████████████████████████████████████| 484/484 [47:29<00:00,  5.89s/cells]
100%|██████████████████████████████████████| 484/484 [43:53<00:00,  5.44s/cells]
100%|██████████████████████████████████████| 484/484 [44:08<00:00,  5.47s/cells]
100%|██████████████████████████████████████| 484/484 [45:44<00:00,  5.67s/cells]
100%|██████████████████████████████████████| 484/484 [47:51<00:00,  5.93s/cells]
100%|██████████████████████████████████████| 484/484 [47:42<00:00,  5.91s/cells]
100%|██████████████████████████████████████| 484/484 [49:36<00:00,  6.15s/cells]
100%|██████████████████████████████████████| 484/484 [49:08<00:00,  6.09s/cells]
100%|██████████████████████████████████████| 484/484 [58:47<00:00,  7.29s/cells]
100%|██████████████████████████████████| 484/484 [14:37:29<00:00, 108.78s/cells]
100%|██████████████████████████████████████| 484/484 [45:56<00:00,  5.70s/cells]
100%|██████████████████████████████████████| 484/484 [47:31<00:00,  5.89s/cells]
100%|███████████████████████

ABORTED: left LUBW, right Bühlot  
CONTINUE WITH: left LUBW, right LUBW

# Fendt Eddy
- **DWD**: 2008 - 2020
- **Bühlot**: 2012 - 2021
- **Sap Flow HoH**: 2015 - 2015
- **LUBW**: 1826 - 2018, strongly varying
- **Fendt Eddy**: 2014 - 2018

# Lysimeter data
- **Fendt 1 Tereno**: 2012 - 2014 (daily)
- **Fendt 2 Tereno**: 2012 - 2014 (daily)
- **Fendt 3 Tereno**: 2012 - 2014 (daily)
- **Grasswang Tereno**: 2012 - 2014 (daily)
- **DWD**: daily data