<a href="https://colab.research.google.com/github/afonso-tiago/thesis-notebooks/blob/main/tables_and_figures.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Introduction

This is one of the complementary notebooks to the bachelor thesis titled "*Comparing Performance of Different Goal Functionals in Solving PDEs Using Neural Networks*". It contains the code with which all tables and figures of the thesis were created. The code is modular, so the same functions can produce many different plots. This also means that it should be fairly easy to create plots for new data generated with the [tests.ipynb](https://github.com/afonso-tiago/thesis-notebooks/blob/main/tests.ipynb) notebook.

> A good way to navigate Google Colaboratory notebooks is by using the built-in table of contents; the first item in the menu bar on the left side of the screen.

> Helpful keyboard shortcuts are: 
* <kbd>Shift</kbd> + <kbd>Enter</kbd>: executes a cell and jumps to the next one
* <kbd>Ctrl</kbd> + <kbd>Enter</kbd>: executes a cell but stays in that cell

# Preparation

In order to run all the cells in this notebook you will need to upload the [logs.zip](https://github.com/afonso-tiago/thesis-notebooks/blob/main/logs.zip) file from the [GitHub repository](https://github.com/afonso-tiago/thesis-notebooks) into your current Colab session.

For this you can simply run the next cell.

In [None]:
!wget https://raw.githubusercontent.com/afonso-tiago/thesis-notebooks/main/logs.zip
!unzip logs

To use your own data from your Google Drive, run the next cell.

If you have data stored elsewhere, you can upload it manually with the built-in file explorer; the last item in the menu bar on the left side of the screen.

In [None]:
from google.colab import drive
drive.mount('/content/drive')

Run the next cell to import all necessary libraries.

In [None]:
from os import listdir
from os.path import isfile, join

import tensorflow as tf
from tensorboard.backend.event_processing.event_accumulator import EventAccumulator

import math
import numpy as np

import pandas as pd
from decimal import Decimal

import matplotlib
import matplotlib.pyplot as plt
import seaborn as sns
sns.set_style('whitegrid')

import re
from difflib import SequenceMatcher

from IPython.display import clear_output

# API

The cells in this section of the notebook comprise a small API that helps in the creation of different plots.

The code can be split up into two parts: 
1. retrieving and preparing the desired data (find_minimizers, retrieve_alt_data, find_all, turn_into_x_y_hue)
2. plotting the data (plot_PINN_and_DRM_minimizer_separately, plot_PINN_and_DRM_minimizer_together, plot_x_y_hue)

In [None]:
# substitute the substring matching the i-th pattern in patterns 
# with the i-th replacement string in repls for all i
def sub_all(patterns, repls, string):
  for pattern, repl in zip(patterns, repls):
    string = re.sub(pattern, repl, string)
  return string
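
The substitutions are applied one after the other, so a later pattern sees the result of the earlier ones. A minimal sketch of this behaviour follows; the run name is hypothetical, and sub_all is repeated only so that the snippet runs on its own.

```python
import re

# repeated from the cell above so that this snippet is self-contained
def sub_all(patterns, repls, string):
  for pattern, repl in zip(patterns, repls):
    string = re.sub(pattern, repl, string)
  return string

# hypothetical run name; real names depend on the folder layout of the logs
name = 'activation=tanh/num_blocks=4'
patterns = ['activation=', '/', 'num_blocks']
repls = ['', ', ', 'blocks']

label = sub_all(patterns, repls, name)
print(label)
```

Note that re.sub replaces every occurrence of a pattern, not only the first one.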

In [None]:
def find_minimizers(dir_paths, groups, filter_include='.', filter_exclude='$impossible', 
                    metric_PINN='PINN relative_H1_error', metric_DRM='DRM relative_H1_error', 
                    calc_min=None, calc_PINN_min=lambda values: min(values), calc_DRM_min=lambda values: min(values), verbose=True):
  """Find all minimizers of the specified groups.

  Go through all paths in dir_paths and find for every group in groups "the best run"/"minimizer".
  A run is considered better if its minimum, calculated by calc_min over all the values belonging to the specified metric,
  is smaller than the minima of all other runs from the same group.

  Args: 
    dir_paths: A list of strings, or a single string,
               specifying the paths in which to search for "minimizers".
    groups: A list of regular expressions, i.e. a list of strings. 
            A run belongs to a group if the path to the run's file (relative from dir_path)
            matches the regex of the group.
            e.g. 
            if dir_path = '/content/logs/no weak 2nd derivative/n=10/'
            and groups = ['n=10', 'beta=100']
            then '/content/logs/no weak 2nd derivative/n=10/beta=100#1' belongs to the group 'beta=100' but NOT 'n=10'
    filter_include: A regular expression, which basically defines a white list of paths where we will look for minimizers.
                    The default value '.' matches every string.
    filter_exclude: A regular expression, which basically defines a black list of paths where we will NOT look for minimizers.
                    The default value '$impossible' can never match a string, as '$' signals the end of a string.
    metric_PINN: A string with the name of a metric that was recorded for a run using PINNs. 
                The minimizers of the runs using PINNs will be calculated for this metric.
    metric_DRM: A string with the name of a metric that was recorded for a run using the DRM. 
                The minimizers of the runs using the DRM will be calculated for this metric.
    calc_min: A function which takes values as its input and returns a minimum.
              This argument, if passed, will overwrite BOTH calc_PINN_min and calc_DRM_min.
    calc_PINN_min: A function which takes values as its input and returns a minimum.
                  This function is used to determine the minimizer of the runs using PINNs.
    calc_DRM_min: A function which takes values as its input and returns a minimum.
                  This function is used to determine the minimizer of the runs using the DRM.
    verbose: A boolean. When set to True, the group currently worked on and the found minimizers will be printed.

  Returns:
    groups: The groups passed as an argument.
    min: A list of floats. The i-th value is the minimum of the minimizer of the i-th group.
    data: A list of pd.DataFrames. The i-th entry is the raw data of the minimizer of the i-th group.
    name: A list of strings. The i-th string is the name of the run corresponding to the minimizer of the i-th group.
          The name is the path to the run starting from dir_path.
    path: A list of strings. The i-th string is the complete path to the run which corresponds to the i-th group."""
  # if dir_paths is a string convert it to a list of a single string
  # this way the remaining code can always expect a list of strings
  if isinstance(dir_paths, str):
    dir_paths = [dir_paths]
  # if calc_min is set overwrite calc_PINN_min and calc_DRM_min
  if calc_min is not None:
    calc_PINN_min = calc_min
    calc_DRM_min = calc_min

  min_PINN, min_DRM = len(groups)*[None], len(groups)*[None]
  data_PINN, data_DRM = len(groups)*[None], len(groups)*[None]
  name_PINN, name_DRM = len(groups)*[''], len(groups)*['']
  path_PINN, path_DRM = len(groups)*[''], len(groups)*['']
  # this method will recursively visit nested folders starting from dir_path
  # if a file is reached it will check if it belongs to the group specified by the argument group
  # it will then check if the minimum of the corresponding run calculated by calc_(PINN/DRM)_min is the smallest up to this point
  # if so, it will save its minimum, name, data and path
  def check_path_for_minimizer(path, dir_path, group, index):
    # in order to access the variables defined outside this function (and not create new local versions)
    # we need to add the keyword nonlocal
    nonlocal min_PINN, min_DRM, data_PINN, data_DRM, name_PINN, name_DRM, path_PINN, path_DRM
    for subdir_name in listdir(path):
      subpath = join(path, subdir_name)
      if not isfile(subpath):
        # recursively check nested folders
        check_path_for_minimizer(subpath, dir_path, group, index)
      else:
        name = path.replace(dir_path, '')
        # check if the name passes the filters and includes the group
        if re.search(filter_include, name) and not re.search(filter_exclude, name) and re.search(group, name):
          # gather all the data of the run
          event_acc = EventAccumulator(subpath, size_guidance={'tensors': 0}) 
                                                  # From the docs: 
                                                  # The size_guidance should be a map from a `tagType` string to an integer 
                                                  # representing the number of items to keep per tag for items of that `tagType`. 
                                                  # If the size is 0, all events are stored.
          event_acc.Reload()

          try:
            # save the data for the specified metric
            data = pd.DataFrame([(w, s, tf.make_ndarray(t)[()]) for w, s, t in event_acc.Tensors(metric_PINN)],
                                      columns=['wall_time', 'step', metric_PINN],)
          except Exception:
            # an exception occurs if the specified metric was not captured during the run
            print('WARNING: An error occurred while trying to load ' + metric_PINN + ' from the file ' + subpath)
            print('The available tags are ' + str(event_acc.Tags()))
            continue
          minimum = calc_PINN_min(data[metric_PINN].tolist())
          # check if the minimum is the smallest up to this point
          if min_PINN[index] is None or minimum < min_PINN[index]:
            min_PINN[index] = minimum
            data_PINN[index] = data
            name_PINN[index] = name
            path_PINN[index] = subpath

          # repeat the same steps above, this time for the DRM
          try:
            data = pd.DataFrame([(w, s, tf.make_ndarray(t)[()]) for w, s, t in event_acc.Tensors(metric_DRM)],
                                    columns=['wall_time', 'step', metric_DRM],)
          except Exception:
            print('WARNING: An error occurred while trying to load ' + metric_DRM + ' from the file ' + subpath)
            print('The available tags are ' + str(event_acc.Tags()))
            continue
          minimum = calc_DRM_min(data[metric_DRM].tolist())
          if min_DRM[index] is None or minimum < min_DRM[index]:
            min_DRM[index] = minimum
            data_DRM[index] = data
            name_DRM[index] = name
            path_DRM[index] = subpath
  for index, group in enumerate(groups):
    if verbose: print(f'determining minimizer of group {group} ...')
    for dir_path in dir_paths:
      check_path_for_minimizer(dir_path, dir_path, group, index)
    if verbose: print(f'minimizer determined! PINN: {name_PINN[index]} DRM: {name_DRM[index]}')
  return groups, min_PINN, min_DRM, data_PINN, data_DRM, name_PINN, name_DRM, path_PINN, path_DRM
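
Group membership and both filters are decided with re.search on the run's name, i.e. the path with dir_path stripped. The following sketch replays the example from the docstring and the two default filters with plain re calls; it only illustrates the matching rules and does not read any log files. The name 'beta=1000#2' is a hypothetical run added for illustration.

```python
import re

dir_path = '/content/logs/no weak 2nd derivative/n=10/'
run_path = '/content/logs/no weak 2nd derivative/n=10/beta=100#1'
# the name is the path relative to dir_path
name = run_path.replace(dir_path, '')

in_beta_group = bool(re.search('beta=100', name))  # the group regex occurs in the name
in_n_group = bool(re.search('n=10', name))         # 'n=10' was stripped together with dir_path

# the default filters: '.' matches any non-empty name, '$impossible' matches nothing
passes_include = bool(re.search('.', name))
hits_exclude = bool(re.search('$impossible', name))

# a group regex is a substring search, so 'beta=100' also matches 'beta=1000#2';
# appending '#' makes the group exact
loose = bool(re.search('beta=100', 'beta=1000#2'))
exact = bool(re.search('beta=100#', 'beta=1000#2'))
print(name, in_beta_group, in_n_group, passes_include, hits_exclude, loose, exact)
```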

In [None]:
def retrieve_alt_data(groups, min_PINN, min_DRM, data_PINN, data_DRM, name_PINN, name_DRM, path_PINN, path_DRM,
                     metric_PINN='PINN relative_H1_error', metric_DRM='DRM relative_H1_error', 
                     calc_min=None, calc_PINN_min=lambda values: min(values), calc_DRM_min=lambda values: min(values), verbose=True):
  """Retrieve alternative data for the runs with the specified paths.

  Get the complete data for every run saved at path_PINN/DRM
  and calculate using the (new) function calc_min the minimum of all the values belonging to the (new) metric specified by metric_PINN/DRM.

  Args: 
    groups, min_PINN, min_DRM, data_PINN, data_DRM, name_PINN, name_DRM, path_PINN, path_DRM: 
      The values returned by find_minimizers. Note that this function only really needs path_PINN and path_DRM,
      but by accepting all those arguments we can directly chain find_minimizers and retrieve_alt_data.
    metric_PINN: A string with the name of a metric that was recorded for a run using PINNs.
                 This function will use this (new) metric to retrieve alternative data from the runs specified by path_PINN.
    metric_DRM: A string with the name of a metric that was recorded for a run using the DRM.
                This function will use this (new) metric to retrieve alternative data from the runs specified by path_DRM.
    calc_min: A function which takes values as its input and returns a minimum.
              This argument, if passed, will overwrite BOTH calc_PINN_min and calc_DRM_min.
    calc_PINN_min: A function which takes values as its input and returns a minimum.
                   This function is used to determine the minimizer of the runs using PINNs.
    calc_DRM_min: A function which takes values as its input and returns a minimum.
                  This function is used to determine the minimizer of the runs using the DRM.
    verbose: A boolean. When set to True, prints the path from which alternative data is currently being fetched.

  Returns:
    groups: The groups passed as an argument.
    new_min: A list of floats. The i-th value is the minimum of the minimizer of the i-th group.
    new_data: A list of pd.DataFrames. The i-th entry is the raw data of the minimizer of the i-th group.
    new_name: A list of strings. The i-th string is the name of the run corresponding to the minimizer of the i-th group.
          The name is the path to the run starting from dir_path.
    new_path: A list of strings. The i-th string is the complete path to the run which corresponds to the i-th group."""
  new_min_PINN, new_min_DRM = len(groups)*[None], len(groups)*[None]
  new_data_PINN, new_data_DRM = len(groups)*[None], len(groups)*[None]
  # if calc_min is set overwrite calc_PINN_min and calc_DRM_min
  if calc_min is not None:
    calc_PINN_min = calc_min
    calc_DRM_min = calc_min
  # go over all of PINNs' paths
  for index, path in enumerate(path_PINN):
    if verbose: print(f'retrieving alternative PINN data from {path} ...')
    event_acc = EventAccumulator(path, size_guidance={'tensors': 0}) 
                                                    # From the docs: 
                                                    # The size_guidance should be a map from a `tagType` string to an integer 
                                                    # representing the number of items to keep per tag for items of that `tagType`. 
                                                    # If the size is 0, all events are stored.
    event_acc.Reload()

    data = pd.DataFrame([(w, s, tf.make_ndarray(t)[()]) for w, s, t in event_acc.Tensors(metric_PINN)],
                        columns=['wall_time', 'step', metric_PINN],)
    minimum = calc_PINN_min(data[metric_PINN].tolist())
    new_data_PINN[index] = data
    new_min_PINN[index] = minimum
  # go over all of DRM's paths
  for index, path in enumerate(path_DRM):
    if verbose: print(f'retrieving alternative DRM data from {path} ...')
    event_acc = EventAccumulator(path, size_guidance={'tensors': 0}) 
                                                    # From the docs: 
                                                    # The size_guidance should be a map from a `tagType` string to an integer 
                                                    # representing the number of items to keep per tag for items of that `tagType`. 
                                                    # If the size is 0, all events are stored.
    event_acc.Reload()

    data = pd.DataFrame([(w, s, tf.make_ndarray(t)[()]) for w, s, t in event_acc.Tensors(metric_DRM)],
                                columns=['wall_time', 'step', metric_DRM],)
    minimum = calc_DRM_min(data[metric_DRM].tolist())
    new_data_DRM[index] = data
    new_min_DRM[index] = minimum
  return groups, new_min_PINN, new_min_DRM, new_data_PINN, new_data_DRM, name_PINN, name_DRM, path_PINN, path_DRM

In [None]:
def plot_PINN_and_DRM_minimizer_separately(groups, min_PINN, min_DRM, data_PINN, data_DRM, name_PINN, name_DRM, path_PINN, path_DRM,
                                           xlabel='epoch', ylabel=None, figsize=None, suptitle='', patterns=['$impossible'], repls=[''], 
                                           PINN_visible=True, DRM_visible=True, sharey=True, xscale='linear', yscale='log', 
                                           markers=[None], colors=[None]): 
  """Create a figure where the "best run" of every group is shown in the same plot. 
  Create one plot for the runs using PINNs on the left and the DRM on the right.
  The y-axis is scaled logarithmically by default.

  Args: 
    groups, min_PINN, min_DRM, data_PINN, data_DRM, name_PINN, name_DRM, path_PINN, path_DRM: 
      The values returned by find_minimizers or retrieve_alt_data.
    ylabel: A string or None. If None is passed, the ylabel will be automatically constructed from 
            the longest overlap in the names of metric_PINN and metric_DRM, e.g. 
            if metric_PINN = 'metric PINN relative_L2_error' and metric_DRM = 'metric_DRM_relative_L2_error'
            ylabel will become 'relative_L2_error'.
    patterns: A list of regular expressions. In all names in name_PINN/DRM, every match of patterns[i] is replaced by repls[i].
    repls: A list of strings, which replace the matched patterns."""
  assert PINN_visible or DRM_visible
  # if no ylabel was specified change it to the longest overlapping string between metric_PINN and metric_DRM
  if ylabel is None:
    metric_PINN, metric_DRM = data_PINN[0].columns[-1], data_DRM[0].columns[-1]
    sm = SequenceMatcher(None, metric_PINN, metric_DRM)
    match = sm.find_longest_match(0,len(metric_PINN),0,len(metric_DRM))
    ylabel = metric_PINN[match[0]:match[0]+match[2]]
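  # repeat markers and colors until there is one entry per group
  # (a new list is built, so neither the default argument nor the caller's list is modified)
  if len(markers) < len(groups):
    markers = markers * math.ceil(len(groups)/len(markers))
  if len(colors) < len(groups):
    colors = colors * math.ceil(len(groups)/len(colors))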
  # rows is fixed, but by creating a variable for it the remaining code is easier to read
  rows, columns = 1, (PINN_visible + DRM_visible)
  if figsize is None:
    figsize = (6*columns, 4*rows)
  fig, ax = plt.subplots(rows, columns, figsize=figsize, sharey=sharey)
  # only set different values for ax0 and ax1 when two axes exist
  ax0 = ax[0] if PINN_visible and DRM_visible else ax
  ax1 = ax[1] if PINN_visible and DRM_visible else ax
  for index, group in enumerate(groups):
    if PINN_visible:
      name = sub_all(patterns, repls, name_PINN[index])
      ax0.plot('step', data_PINN[index].columns[-1], data=data_PINN[index], label=name, marker=markers[index], color=colors[index])
    if DRM_visible:
      name = sub_all(patterns, repls, name_DRM[index])
      ax1.plot('step', data_DRM[index].columns[-1], data=data_DRM[index], label=name, marker=markers[index], color=colors[index])
  if PINN_visible:
    ax0.set(title='PINN', xscale=xscale, yscale=yscale, xlabel=xlabel, ylabel=ylabel)
      # sort PINN labels by min
    handles, labels = ax0.get_legend_handles_labels()
    sorted_mhl = sorted(zip(min_PINN, handles, labels), key=lambda z:z[0], reverse=True)
    ax0.legend([h for m,h,l in sorted_mhl], [l for m,h,l in sorted_mhl]) 
  if DRM_visible:  
    ax1.set(title='DRM', xscale=xscale, yscale=yscale, xlabel=xlabel, ylabel=ylabel)
      # sort DRM labels by min
    handles, labels = ax1.get_legend_handles_labels()
    sorted_mhl = sorted(zip(min_DRM, handles, labels), key=lambda z:z[0], reverse=True)
    ax1.legend([h for m,h,l in sorted_mhl], [l for m,h,l in sorted_mhl]) 
  fig.suptitle(suptitle, fontsize=14)
  fig.tight_layout(rect=[0,0,1,0.95]) # tight_layout fixes overlapping labels and rect is necessary for clean suptitle
  fig.show()
  # return fig, ax
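
When no ylabel is passed, the label is the longest common substring of the two metric names. A small sketch of that step with the default metric names of find_minimizers follows. Because 'PINN' and 'DRM' are both followed by a space in these names, the space is part of the match.

```python
from difflib import SequenceMatcher

metric_PINN = 'PINN relative_H1_error'
metric_DRM = 'DRM relative_H1_error'

sm = SequenceMatcher(None, metric_PINN, metric_DRM)
# match is a named tuple (a, b, size): start in metric_PINN, start in metric_DRM, length
match = sm.find_longest_match(0, len(metric_PINN), 0, len(metric_DRM))
ylabel = metric_PINN[match.a:match.a + match.size]
print(repr(ylabel))
```

If the leading space is unwanted, passing ylabel explicitly (as done in later cells of this notebook) avoids it.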

In [None]:
def plot_PINN_and_DRM_minimizer_together(groups, min_PINN, min_DRM, data_PINN, data_DRM, name_PINN, name_DRM, path_PINN, path_DRM,
                                         rows = None, columns = None, xlabel='epoch', ylabel=None, figsize=None, suptitle='',
                                         patterns=['$impossible'], repls=[''], xscale='linear', yscale='log'):  
  """Create one plot for every group, where the "best run" of this group using PINNs and the DRM are both shown in the same plot.

  Args: 
    groups, min_PINN, min_DRM, data_PINN, data_DRM, name_PINN, name_DRM, path_PINN, path_DRM: 
      The values returned by find_minimizers or retrieve_alt_data.
    rows, columns: Two whole numbers or None. If neither is passed, columns will default to 1 and rows will grow as needed.
                   In general, if only one of the two values is passed, the other one will be chosen large enough
                   s.t. at least one subplot exists for every group in groups. Excess subplots created this way are left empty.
    ylabel: A string or None. If None is passed, the ylabel will be automatically constructed from 
            the longest overlap in the names of metric_PINN and metric_DRM, e.g. 
            if metric_PINN = 'metric PINN relative_L2_error' and metric_DRM = 'metric_DRM_relative_L2_error'
            ylabel will become 'relative_L2_error'.
    patterns: A list of regular expressions. In all names in name_PINN/DRM, every match of patterns[i] is replaced by repls[i].
    repls: A list of strings, which replace the matched patterns."""
  if rows is None and columns is None:
    columns = 1
  if rows is None and columns is not None:
    # choose rows just large enough s.t. rows*columns >= len(groups)
    rows = math.ceil(len(groups) / columns)
  if rows is not None and columns is None:
    # choose columns just large enough s.t. rows*columns >= len(groups)
    columns = math.ceil(len(groups) / rows)
  # if no ylabel was specified change it to the longest overlapping string between metric_PINN and metric_DRM
  if ylabel is None:
    metric_PINN, metric_DRM = data_PINN[0].columns[-1], data_DRM[0].columns[-1]
    sm = SequenceMatcher(None, metric_PINN, metric_DRM)
    match = sm.find_longest_match(0,len(metric_PINN),0,len(metric_DRM))
    ylabel = metric_PINN[match[0]:match[0]+match[2]]
  if figsize is None:
    # if no figure size was passed choose one based on the number of columns and rows
    figsize = (6*columns, 4*rows)
  fig, ax = plt.subplots(rows, columns, figsize=figsize)
  legend_PINN, legend_DRM = [], []
  for index in range(rows*columns): 
    # translate the (1d) index into (2d) row and column
    row = index // columns
    column = index % columns
    # if only one subplot exists (i.e. rows == columns == 1)
    # matplotlib returns a single Axes object instead of an array,
    # so ax[row, column] cannot be used; with a single row or a single column ax is a 1d array
    # the following lines select ax_i correctly in all of these cases
    if rows == columns == 1:
      ax_i = ax
    elif rows == 1 or columns == 1:
      ax_i = ax[index]
    else:
      ax_i = ax[row, column]
    
    if len(groups) <= index:
      ax_i.set_axis_off()
      continue
    group = groups[index]

    name = sub_all(patterns, repls, name_PINN[index])
    ax_i.plot('step', data_PINN[index].columns[-1], data=data_PINN[index], label='PINN: '+name)
    name = sub_all(patterns, repls, name_DRM[index])
    ax_i.plot('step', data_DRM[index].columns[-1], data=data_DRM[index], label='DRM: '+name)
    # reverse the legend order when PINN is below DRM
    if min_PINN[index] < min_DRM[index]:
      handles, labels = ax_i.get_legend_handles_labels()
      ax_i.legend(handles[::-1], labels[::-1]) 
    ax_i.set(title=group, xscale=xscale, yscale=yscale, xlabel=xlabel, ylabel=ylabel)
    fig.suptitle(suptitle, fontsize=14)
    fig.tight_layout(rect=[0,0,1,0.95]) # tight_layout fixes overlapping labels and rect is necessary for clean suptitle
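
The grid layout can be checked without plotting anything. The sketch below assumes five groups and two columns (both numbers are made up) and reproduces how the missing dimension is derived and how the flat subplot index is mapped to a row and a column.

```python
import math

num_groups = 5  # hypothetical number of groups
columns = 2
# choose rows just large enough s.t. rows*columns >= num_groups
rows = math.ceil(num_groups / columns)

cells = []
for index in range(rows * columns):
    row, column = index // columns, index % columns
    # subplots with index >= num_groups stay empty (their axes are switched off)
    used = index < num_groups
    cells.append((index, row, column, used))

for cell in cells:
    print(cell)
```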

In [None]:
def find_all(dir_paths, filter_include='.', filter_exclude='$impossible', metrics=[],  verbose=True):
  """For every metric in metrics get the corresponding values from all runs found anywhere under dir_paths.

  Args: 
    dir_paths: A list of strings, or a single string,
               specifying the paths in which to collect the data.
    filter_include: A regular expression, which basically defines a white list of paths where we will collect data.
                    The default value '.' matches every string.
    filter_exclude: A regular expression, which basically defines a black list of paths where we will NOT collect data.
                    The default value '$impossible' can never match a string, as '$' signals the end of a string.
    metrics: A list of strings, where every element is the name of a metric that was recorded during training. 
    verbose: A boolean. When set to True, the number of paths (passing the filters) is printed for every (sub) folder.

  Returns:
    list_vals_map: A list of maps. The i-th map maps the name of a metric to a list 
                   containing the values of the corresponding metric of the i-th run.
    name_list: A list of the names of the runs found.
    path_list: A list of the paths to the runs found."""
  # if dir_paths is a string convert it to a list of a single string
  # this way the remaining code can always expect a list of strings
  if isinstance(dir_paths, str):
    dir_paths = [dir_paths]

  list_vals_map, name_list, path_list = [], [], []
  # this method will recursively visit nested folders starting from dir_path
  # if a file is reached it will check if it passes the filters: filter_include and filter_exclude.
  # if so, it will save the values of every metric in metrics to list_vals_map, and its name and path to name_list, path_list
  def check_path(path, dir_path):
    # in order to access the variables defined outside this function (and not create new local versions)
    # we need to add the keyword nonlocal
    nonlocal list_vals_map, name_list, path_list
    for subdir_name in listdir(path):
      subpath = join(path, subdir_name)
      if not isfile(subpath):
        # save length before for verbose output
        len_before = len(path_list)
        check_path(subpath, dir_path)
        if len(listdir(subpath)) > 1:
          if verbose: print(f'found {len(path_list) - len_before} file(s) in {subpath.replace(dir_path, "")}')
      else:
        name = path.replace(dir_path, '')
        # check if the name passes the filters
        if re.search(filter_include, name) and not re.search(filter_exclude, name):
          # gather all the data of the run
          event_acc = EventAccumulator(subpath, size_guidance={'tensors': 0}) 
                                                          # From the docs: 
                                                          # The size_guidance should be a map from a `tagType` string to an integer 
                                                          # representing the number of items to keep per tag for items of that `tagType`. 
                                                          # If the size is 0, all events are stored.
          event_acc.Reload()
          vals_map = {}
          for metric in metrics:
            vals_map[metric] = [tf.make_ndarray(t)[()] for _, _, t in event_acc.Tensors(metric)] 
          list_vals_map.append(vals_map)
          name_list.append(name)
          path_list.append(subpath)

  for dir_path in dir_paths:
    check_path(dir_path, dir_path)
    # this info is always printed (independent of the value of verbose)
    print(f'found {len(path_list)} file(s) in {dir_path}')

  return list_vals_map, name_list, path_list
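
The traversal itself can be tried out on a throwaway folder tree. The sketch below uses os.walk from the standard library instead of the recursive helper, but applies the same rule: a run's name is the folder path with dir_path stripped, and the filters are applied to that name. The function list_run_files, the folder names and the file name events.out are hypothetical, and no TensorBoard data is read.

```python
import os
import re
import tempfile

def list_run_files(dir_path, filter_include='.', filter_exclude='$impossible'):
    """Return sorted (name, path) pairs of all files whose folder name passes the filters."""
    found = []
    for folder, _, files in os.walk(dir_path):
        # name of the run = folder path relative to dir_path
        name = folder.replace(dir_path, '')
        if re.search(filter_include, name) and not re.search(filter_exclude, name):
            found.extend((name, os.path.join(folder, f)) for f in files)
    return sorted(found)

with tempfile.TemporaryDirectory() as root:
    dir_path = root + os.sep
    # build a small hypothetical log tree with one empty file per run
    for run in [('tanh', 'beta=1#1'), ('tanh', 'beta=100#1'), ('swish', 'beta=1#1')]:
        run_dir = os.path.join(root, *run)
        os.makedirs(run_dir)
        open(os.path.join(run_dir, 'events.out'), 'w').close()

    runs = list_run_files(dir_path, filter_include='tanh', filter_exclude='beta=100#')
    names = [name for name, _ in runs]

print(names)
```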

In [None]:
def turn_into_x_y_hue(list_vals_map, name_list, path_list, x=None, y=None, hue=None,
                      calc_min=None, calc_min_x=lambda x: min(x), calc_min_y=lambda y: min(y), calc_min_hue=lambda h: min(h)):
  """Turn the data from find_all into x, y, hue values ready to be passed to plot_x_y_hue.

  Args: 
    list_vals_map, name_list, path_list: 
      The values returned by find_all.
    x: A string or a function. If x is a string it has to be one of the metrics in vals_map. 
       This method will then prepare data_list in such a way that plot_x_y_hue will plot the values 
       corresponding to this metric on the x-axis. 
       If x is a function it has to take a vals_map as input and return values, 
       e.g. if we collected both the 'L2' and 'H1' metric but we want to plot their sum on the x-axis
       we can do this by passing x=lambda vmap: np.add(vmap['L2'], vmap['H1'])
    y, hue: Similar to x. 
    calc_min: A function which takes values as its input and returns a minimum.
              This argument, if passed, will overwrite calc_min_x, calc_min_y and calc_min_hue.
    calc_min_x: A function which takes values as its input and returns a minimum.
                This function is used to turn the x values into a single number.
    calc_min_y, calc_min_hue: Similar to calc_min_x. 

  Returns:
    data_list: A list of pd.DataFrames, with a column for the x-axis, y-axis and hue-axis.
    name_list: A list of the names of the runs whose data is stored in data_list.
    path_list: A list of the paths to the runs whose data is stored in data_list."""  
  # if calc_min is set overwrite calc_min_x, calc_min_y and calc_min_hue.
  if calc_min is not None:
    calc_min_x = calc_min
    calc_min_y = calc_min
    calc_min_hue = calc_min
  x_name, y_name, hue_name = '', '', ''
  # if x,y or hue are strings convert them to functions that pick out the metric with their name
  # this way the remaining code can always expect x,y, hue to be functions
  if isinstance(x, str):
    x_name = x
    x = lambda vmap: vmap[x_name]
  if isinstance(y, str):
    y_name = y
    y = lambda vmap: vmap[y_name]
  if isinstance(hue, str):
    hue_name = hue
    hue = lambda vmap: vmap[hue_name]
  
  x_list, y_list, hue_list = [], [], []
  for vals_map in list_vals_map:
    x_list.append(calc_min_x( x(vals_map) ))
    y_list.append(calc_min_y( y(vals_map) ))
    hue_list.append(calc_min_hue( hue(vals_map) ))
  data_list = pd.DataFrame(zip(x_list, y_list, hue_list), columns=[x_name, y_name, hue_name],)
  return data_list, name_list, path_list
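
Each run contributes exactly one point: the x, y and hue series of a run are each reduced to a single number. A minimal sketch of this reduction in plain Python follows; the metric names 'L2', 'H1' and 'time' and all values are made up, and a list of tuples stands in for the pd.DataFrame.

```python
# hypothetical data in the shape returned by find_all: one dict per run
list_vals_map = [
    {'L2': [0.5, 0.2, 0.3], 'H1': [0.9, 0.6, 0.7], 'time': [1.0, 2.0, 3.0]},
    {'L2': [0.4, 0.1], 'H1': [0.8, 0.5], 'time': [1.5, 2.5]},
]

# selectors pick a series out of a run's dict (a string argument is turned into such a function)
x = lambda vmap: vmap['L2']
y = lambda vmap: vmap['H1']
hue = lambda vmap: vmap['time']

# the reducers do not have to be minima: here the hue is reduced with max
calc_min_x, calc_min_y, calc_min_hue = min, min, max

points = [(calc_min_x(x(v)), calc_min_y(y(v)), calc_min_hue(hue(v))) for v in list_vals_map]
print(points)
```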

In [None]:
def plot_x_y_hue(data_list, name_list, path_list,
                 xlabel=None, ylabel=None, huelabel=None, xscale='log', yscale='log', huescale='log',
                 figsize=(6,4), suptitle='', cmap='viridis', vmin=None, vmax=None, ax=None):  
  """Plot the result of turn_into_x_y_hue.

  The pd.DataFrame data_list has to contain three columns x, y, hue in this order.
  The x and y values are used to create hollow circles at the corresponding positions in the 2d plane,
  which are colored according to the values in hue.

  Args: 
    data_list, name_list, path_list:
      The values returned by turn_into_x_y_hue.
    ax: An instance of matplotlib.axes.Axes or None. If this argument is passed, no extra figure is created.
        It is assumed that the figure is created outside the method."""
  # remember whether the figure is created here, since ax is overwritten below
  created_fig = ax is None
  if created_fig:
    fig, ax = plt.subplots(1, 1, figsize=figsize)
  
  # the hue values have to be in the third column
  hue = data_list.iloc[:,2]
  if vmin is None: vmin = hue.min()
  if vmax is None: vmax = hue.max()
  # save the method of normalizing the colors inside the variable norm
  # vmin and vmax are passed to the norm, because scatter does not accept them together with a norm
  norm_class = matplotlib.colors.LogNorm if huescale != 'linear' else matplotlib.colors.Normalize
  norm = norm_class(vmin=vmin, vmax=vmax)
  s = ax.scatter(x=data_list.iloc[:,0], y=data_list.iloc[:,1], c=hue, cmap=cmap,
                 norm=norm, marker=matplotlib.textpath.TextPath((0, 0), "◯"), linewidths=0.1, s=500)
  # autolabel the x-,y- and hue-axis if no labels were passed
  if xlabel is None: xlabel = data_list.iloc[:,0].name
  if ylabel is None: ylabel = data_list.iloc[:,1].name
  if huelabel is None: huelabel = hue.name
  ax.set(xlabel=xlabel, ylabel=ylabel, xscale=xscale, yscale=yscale)
  cbar = plt.colorbar(mappable=s, ax=ax)
  cbar.set_label(huelabel)

  # show figure if it was created inside this function
  if created_fig:
    fig.show()

# Finite Differences vs Autodiff in High Dimensions

In [None]:
dir_path = '/content/logs/runtime/DRM paper d=100/'
calc_min = lambda values: sorted(values[int(len(values)*0.8):])[int(len(values)*0.1)]
patterns = ['#[0-9]']
repls = ['']

res = find_minimizers(dir_path, ['autodiff', 'finite diff'], metric_PINN='PINN relative_L2_error', metric_DRM='DRM relative_L2_error', calc_min=calc_min)
plot_PINN_and_DRM_minimizer_separately(*res, ylabel='relative L2 error', suptitle='', figsize=(12,5), patterns=patterns, repls=repls)
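
The calc_min used in the cell above does not take the global minimum of a run. It looks only at the last 20% of the recorded values, sorts them and picks the entry at index 10% of the total length, which is roughly the median of that tail. The sketch below compares it with the plain minimum on a made-up error curve of 100 values containing one lucky outlier.

```python
# same expression as calc_min in the cell above
calc_min = lambda values: sorted(values[int(len(values)*0.8):])[int(len(values)*0.1)]

# hypothetical error curve: 80 early values, then a plateau at 0.1 with a single outlier
values = [1.0]*80 + [0.1]*10 + [0.001] + [0.1]*9

plain_min = min(values)        # determined by the single outlier
robust_min = calc_min(values)  # median-like value of the last 20% of the curve
print(plain_min, robust_min)
```

With this choice a run is ranked by the level it settles at towards the end of training, not by a single favourable epoch.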

# Re-entrant Corners

In [None]:
dir_path = '/content/logs/re-entrant corner/'
calc_min = lambda values: sorted(values[int(len(values)*0.8):])[int(len(values)*0.1)]
patterns = ['nodes_per_layer=[0-9]*', 'activation=', '/', 'num_blocks']
repls = ['', '', ', ', 'blocks']
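
The `patterns` and `repls` lists are paired regular-expression substitutions that shorten run names for plot legends. The sketch below assumes that the pairs are applied one after another with `re.sub`; `apply_substitutions` is a hypothetical stand-in for the notebook's own helper, and the run name is made up:

```python
import re

def apply_substitutions(patterns, repls, name):
    """Apply each regex substitution in order (hypothetical stand-in, not the notebook's helper)."""
    for pattern, repl in zip(patterns, repls):
        name = re.sub(pattern, repl, name)
    return name

patterns = ['nodes_per_layer=[0-9]*', 'activation=', '/', 'num_blocks']
repls = ['', '', ', ', 'blocks']

# Made-up run name; the real folder names in logs.zip may look different.
shortened = apply_substitutions(patterns, repls, 'num_blocks=3, activation=tanh/beta=100#1')
print(shortened)
```

Because the substitutions run in order, a later pattern sees the output of the earlier ones, so the order of the two lists matters.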

## Overall best run

In [None]:
res_overall_best = find_minimizers(dir_path, ['0.5xpi', '1.0xpi', '1.5xpi', '2.0xpi'], calc_min=calc_min)

plot_PINN_and_DRM_minimizer_separately(*res_overall_best, ylabel='relative H1 error', suptitle='Best overall run', patterns=patterns, repls=repls)

plot_PINN_and_DRM_minimizer_together(*res_overall_best, rows = 2, columns = 2, suptitle='Best overall run', patterns=patterns, repls=repls)

## Fixed Activation

In [None]:
groups = ['relu_x_div_5_pow_3', 'swish', 'tanh']

res_fixed_activation_05 = find_minimizers(dir_path, groups, filter_include='0.5xpi', calc_min=calc_min)
plot_PINN_and_DRM_minimizer_separately(*res_fixed_activation_05, suptitle='Fixed activation best run 0.5xpi')

res_fixed_activation_10 = find_minimizers(dir_path, groups, filter_include='1.0xpi', calc_min=calc_min)
plot_PINN_and_DRM_minimizer_separately(*res_fixed_activation_10, suptitle='Fixed activation best run 1.0xpi')

res_fixed_activation_15 = find_minimizers(dir_path, groups, filter_include='1.5xpi', calc_min=calc_min)
plot_PINN_and_DRM_minimizer_separately(*res_fixed_activation_15, suptitle='Fixed activation best run 1.5xpi')

res_fixed_activation_20 = find_minimizers(dir_path, groups, filter_include='2.0xpi', calc_min=calc_min)
plot_PINN_and_DRM_minimizer_separately(*res_fixed_activation_20, suptitle='Fixed activation best run 2.0xpi')

In [None]:
res_relu_x_div_5_pow_3 = find_minimizers(dir_path, ['#1','#2','#3'], filter_include='(1.0xpi).*(relu_x_div_5_pow_3).*(beta=0.1#)', calc_min=calc_min)
ax = plot_PINN_and_DRM_minimizer_separately(*res_relu_x_div_5_pow_3, suptitle='relu_x_div_5_pow_3 beta=0.1', 
                                            DRM_visible=False, colors=['#8D9EBA', '#5050A8', '#5698C6'])

## Fixed Penalty Parameter $\beta$

In [None]:
groups = ['beta=0.1#', 'beta=1#', 'beta=100#', 'beta=1000#', 'beta=10000#', 'beta=100000#']
# note: the trailing # after beta=xyz is important; without it, beta=1 (for example) would also match beta=100, beta=10000 and beta=100000

res_05pi = find_minimizers(dir_path, groups, filter_include='0.5xpi', calc_min=calc_min)
# plot_PINN_and_DRM_minimizer_separately(*res_05pi, suptitle='Fixed beta best run 0.5xpi')

res_10pi = find_minimizers(dir_path, groups, filter_include='1.0xpi', calc_min=calc_min)
# plot_PINN_and_DRM_minimizer_separately(*res_10pi, suptitle='Fixed beta best run 1.0xpi')

res_15pi = find_minimizers(dir_path, groups, filter_include='1.5xpi', calc_min=calc_min)
# plot_PINN_and_DRM_minimizer_separately(*res_15pi, suptitle='Fixed beta best run 1.5xpi')

res_20pi = find_minimizers(dir_path, groups, filter_include='2.0xpi', calc_min=calc_min)
# plot_PINN_and_DRM_minimizer_separately(*res_20pi, suptitle='Fixed beta best run 2.0xpi')

# plot_PINN_and_DRM_minimizer_together(*res_20pi, rows=2, columns=3, suptitle='Fixed beta best run')
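
The note about the trailing `#` can be checked directly with Python's `re` module. This sketch assumes that the group strings are searched for inside the run names; the run names themselves are made up:

```python
import re

# Made-up run names in the style "beta=<value>#<repetition>".
runs = ['beta=1#1', 'beta=100#1', 'beta=10000#2']

without_hash = [r for r in runs if re.search('beta=1', r)]    # also hits beta=100 and beta=10000
with_hash = [r for r in runs if re.search('beta=1#', r)]      # hits only beta=1
print(without_hash)
print(with_hash)
```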

In [None]:
conv = lambda x: '{:.2e}'.format(Decimal(x)) # format as scientific notation with two decimals
min_PINN_05pi = [conv(x) for x in res_05pi[1]]
min_PINN_10pi = [conv(x) for x in res_10pi[1]]
min_PINN_15pi = [conv(x) for x in res_15pi[1]]
min_PINN_20pi = [conv(x) for x in res_20pi[1]]

min_DRM_05pi = [conv(x) for x in res_05pi[2]]
min_DRM_10pi = [conv(x) for x in res_10pi[2]]
min_DRM_15pi = [conv(x) for x in res_15pi[2]]
min_DRM_20pi = [conv(x) for x in res_20pi[2]]
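
The `conv` helper turns each minimum into a string in scientific notation with two decimal places, which keeps the tables below compact. A short sketch with made-up inputs, assuming that `Decimal` refers to `decimal.Decimal`:

```python
from decimal import Decimal

conv = lambda x: '{:.2e}'.format(Decimal(x))

# Made-up values, not taken from the logs.
for value in [0.5, 1234.5]:
    print(value, '->', conv(value))
```

The results are strings, so the resulting tables are meant for display and not for further arithmetic.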

In [None]:
pd.DataFrame(data={group: [min_PINN_05pi[index], min_PINN_10pi[index], min_PINN_15pi[index], min_PINN_20pi[index]] 
                   for index, group in enumerate(groups)}, index=['0.5xpi', '1.0xpi', '1.5xpi', '2.0xpi'])

In [None]:
pd.DataFrame(data={group: [min_DRM_05pi[index], min_DRM_10pi[index], min_DRM_15pi[index], min_DRM_20pi[index]] 
                   for index, group in enumerate(groups)}, index=['0.5xpi', '1.0xpi', '1.5xpi', '2.0xpi'])

## Relations between goal functions and errors

In [None]:
metrics=['PINN relative_H1_error', 'PINN relative_L2_error', 'PINN relative_L2_error boundary', 'PINN loss_inner', 'PINN loss_boundary',
         'DRM relative_H1_error', 'DRM relative_L2_error', 'DRM relative_L2_error boundary', 'DRM loss_inner', 'DRM loss_boundary',]
res_all = find_all(dir_path, filter_include='^[0-9].[0-9]xpi', metrics=metrics, verbose=True)

In [None]:
fig, ax = plt.subplots(1, 2, figsize=(12,4))

res_inner_vs_boundary_H1_color_PINN = turn_into_x_y_hue(*res_all, x='PINN loss_inner', y='PINN loss_boundary', hue='PINN relative_H1_error', calc_min=calc_min)
plot_x_y_hue(*res_inner_vs_boundary_H1_color_PINN, ax=ax[0], xlabel='PINN goal_inner', ylabel='PINN goal_boundary')
calc_min_x = lambda values: sorted(np.abs(math.pi/4 -np.array(values[int(len(values)*0.8):])))[int(len(values)*0.1)]
res_inner_vs_boundary_H1_color_DRM = turn_into_x_y_hue(*res_all, x='DRM loss_inner', y='DRM loss_boundary', hue='DRM relative_H1_error', 
                                                           calc_min_x=calc_min_x, calc_min_y=calc_min, calc_min_hue=calc_min)
plot_x_y_hue(*res_inner_vs_boundary_H1_color_DRM, ax=ax[1], xlabel='abs diff to optimal Dirichlet energy', ylabel='DRM goal_boundary')

fig.tight_layout(rect=[0,0,1,0.95]) # tight_layout fixes overlapping labels and rect is necessary for clean suptitle
fig.show()
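
For the DRM panel, `calc_min_x` differs from `calc_min` in one step: before sorting, every value in the last 20% is replaced by its absolute distance to `math.pi/4`, which the axis label calls the optimal Dirichlet energy. A NumPy-free sketch of the same reduction on made-up values (`dist_to_target` and `energies` are illustrative names):

```python
import math

def dist_to_target(values, target):
    """Middle entry of the sorted distances |target - v| over the last 20% of values."""
    tail = values[int(len(values) * 0.8):]
    return sorted(abs(target - v) for v in tail)[int(len(values) * 0.1)]

# Made-up energies that approach pi/4; the last two are off by 0.01 and 0.02.
energies = [2.0, 1.5, 1.2, 1.0, 0.9, 0.85, 0.8, 0.79, math.pi / 4 + 0.01, math.pi / 4 - 0.02]
print(dist_to_target(energies, math.pi / 4))
```

Taking the absolute distance means that runs which undershoot the target energy are treated the same way as runs which overshoot it.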

In [None]:
# fig, ax = plt.subplots(1, 2, figsize=(12,4))

# res_inner_vs_boundary_loss_i_color_PINN = turn_into_x_y_hue(*res_all, x='PINN relative_L2_error', y='PINN relative_L2_error boundary', hue='PINN loss_inner', calc_min=calc_min)
# plot_x_y_hue(*res_inner_vs_boundary_loss_i_color_PINN, ax=ax[0], huelabel='PINN goal_inner')
# res_inner_vs_boundary_loss_b_color_PINN = turn_into_x_y_hue(*res_all, x='PINN relative_L2_error', y='PINN relative_L2_error boundary', hue='PINN loss_boundary', calc_min=calc_min)
# plot_x_y_hue(*res_inner_vs_boundary_loss_b_color_PINN, ax=ax[1], huelabel='PINN goal_boundary')

# fig.tight_layout(rect=[0,0,1,0.95]) # tight_layout fixes overlapping labels and rect is necessary for clean suptitle
# fig.show()

In [None]:
# fig, ax = plt.subplots(1, 2, figsize=(12,4))

# res_inner_vs_boundary_L2_color_PINN = turn_into_x_y_hue(*res_all, x='PINN loss_inner', y='PINN loss_boundary', hue='PINN relative_L2_error', calc_min=calc_min)
# plot_x_y_hue(*res_inner_vs_boundary_L2_color_PINN, ax=ax[0], xlabel='PINN goal_inner', ylabel='PINN goal_boundary')
# res_inner_vs_boundary_L2_b_color_PINN = turn_into_x_y_hue(*res_all, x='PINN loss_inner', y='PINN loss_boundary', hue='PINN relative_L2_error boundary', calc_min=calc_min)
# plot_x_y_hue(*res_inner_vs_boundary_L2_b_color_PINN, ax=ax[1], xlabel='PINN goal_inner', ylabel='PINN goal_boundary')

# fig.tight_layout(rect=[0,0,1,0.95]) # tight_layout fixes overlapping labels and rect is necessary for clean suptitle
# fig.show()

In [None]:
# fig, ax = plt.subplots(1, 2, figsize=(12,4))

# calc_min_hue = lambda values: sorted(np.abs(math.pi/4 -np.array(values[int(len(values)*0.8):])))[int(len(values)*0.1)]
# res_inner_vs_boundary_loss_i_color_DRM = turn_into_x_y_hue(*res_all, x='DRM relative_L2_error', y='DRM relative_L2_error boundary', hue='DRM loss_inner', 
#                                                            calc_min_x=calc_min, calc_min_y=calc_min, calc_min_hue=calc_min_hue)
# plot_x_y_hue(*res_inner_vs_boundary_loss_i_color_DRM, ax=ax[0], huelabel='abs diff to optimal Dirichlet energy')
# res_inner_vs_boundary_loss_b_color_DRM = turn_into_x_y_hue(*res_all, x='DRM relative_L2_error', y='DRM relative_L2_error boundary', hue='DRM loss_boundary', calc_min=calc_min)
# plot_x_y_hue(*res_inner_vs_boundary_loss_b_color_DRM, ax=ax[1], huelabel='DRM goal_boundary')

# fig.tight_layout(rect=[0,0,1,0.95]) # tight_layout fixes overlapping labels and rect is necessary for clean suptitle
# fig.show()

In [None]:
# fig, ax = plt.subplots(1, 2, figsize=(12,4))

# calc_min_x = lambda values: sorted(np.abs(math.pi/4 -np.array(values[int(len(values)*0.8):])))[int(len(values)*0.1)]
# res_inner_vs_boundary_L2_color_DRM = turn_into_x_y_hue(*res_all, x='DRM loss_inner', y='DRM loss_boundary', hue='DRM relative_L2_error', 
#                                                            calc_min_x=calc_min_x, calc_min_y=calc_min, calc_min_hue=calc_min)
# plot_x_y_hue(*res_inner_vs_boundary_L2_color_DRM, ax=ax[0], xlabel='abs diff to optimal Dirichlet energy', ylabel='DRM goal_boundary')
# res_inner_vs_boundary_L2_b_color_DRM = turn_into_x_y_hue(*res_all, x='DRM loss_inner', y='DRM loss_boundary', hue='DRM relative_L2_error boundary', 
#                                                            calc_min_x=calc_min_x, calc_min_y=calc_min, calc_min_hue=calc_min)
# plot_x_y_hue(*res_inner_vs_boundary_L2_b_color_DRM, ax=ax[1], xlabel='abs diff to optimal Dirichlet energy', ylabel='DRM goal_boundary')

# fig.tight_layout(rect=[0,0,1,0.95]) # tight_layout fixes overlapping labels and rect is necessary for clean suptitle
# fig.show()

# No weak $2^\mathrm{nd}$ derivative

In [None]:
dir_path = '/content/logs/no weak 2nd derivatives/'
dir_paths = [dir_path+'n=10/', dir_path+'n=50/']
calc_min = lambda values: sorted(values[int(len(values)*0.8):])[int(len(values)*0.1)]
patterns = ['input_dim=', 'domain.*beta', 'interior.*beta', ', $']
repls = ['n=', 'beta', 'beta', '']

## Overall best

In [None]:
res_overall_best = find_minimizers(dir_paths, ['input_dim=10,', 'input_dim=50,'], calc_min=calc_min)
plot_PINN_and_DRM_minimizer_together(*res_overall_best, rows=1, suptitle='Best overall run', patterns=patterns, repls=repls)

In [None]:
PINN_last_err = res_overall_best[3][1]['PINN relative_H1_error'].iat[-1]
DRM_last_err = res_overall_best[4][1]['DRM relative_H1_error'].iat[-1]
print('PINN last err ', PINN_last_err) 
print('DRM last err ', DRM_last_err) 
print('Relative difference ', abs(PINN_last_err - DRM_last_err)/min(PINN_last_err, DRM_last_err))
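
The last line reports the gap between the two final errors relative to the smaller one, so a result of 1.0 means that the worse method ends with twice the error of the better one. A sketch with made-up numbers (the function name is illustrative):

```python
def relative_difference(a, b):
    """Absolute gap between two positive errors, relative to the smaller one."""
    return abs(a - b) / min(a, b)

# Made-up final errors, not taken from the logs.
print(relative_difference(0.5, 0.25))
```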

## Different batch sizes

In [None]:
res_few_points_n_10 = find_minimizers(dir_path+'n=10/', ['deep', 'few inner', 'few everywhere'], calc_min=calc_min)
plot_PINN_and_DRM_minimizer_separately(*res_few_points_n_10, suptitle='Different batch sizes, n=10', patterns=patterns+['deep'], repls=repls+['normal'])

res_few_points_n_50 = find_minimizers(dir_path+'n=50/', ['deep', 'few inner', 'few everywhere'], calc_min=calc_min)
plot_PINN_and_DRM_minimizer_separately(*res_few_points_n_50, suptitle='Different batch sizes, n=50', patterns=patterns+['deep'], repls=repls+['normal'])

In [None]:
alt_res_few_points_n_10 = retrieve_alt_data(*res_few_points_n_10, metric_PINN='PINN relative_L2_error boundary', metric_DRM='DRM relative_L2_error boundary', calc_min=calc_min)
plot_PINN_and_DRM_minimizer_separately(*alt_res_few_points_n_10, suptitle='Relative L2 error on boundary, n=10', patterns=patterns+['deep'], repls=repls+['normal'])

alt_res_few_points_n_50 = retrieve_alt_data(*res_few_points_n_50, metric_PINN='PINN relative_L2_error boundary', metric_DRM='DRM relative_L2_error boundary', calc_min=calc_min)
plot_PINN_and_DRM_minimizer_separately(*alt_res_few_points_n_50, suptitle='Relative L2 error on boundary, n=50', patterns=patterns+['deep'], repls=repls+['normal'])

## Difference between shallow and deep

In [None]:
# n=10
res_shallow_and_deep_n_10 = find_minimizers(dir_path+'n=10/', ['num_blocks=3', 'num_blocks=0'], filter_exclude='few', calc_min=calc_min)
plot_PINN_and_DRM_minimizer_separately(*res_shallow_and_deep_n_10, suptitle='Shallow vs Deep, n=10', patterns=patterns, repls=repls)

# n=50
res_shallow_and_deep_n_50 = find_minimizers(dir_path+'n=50/', ['num_blocks=3', 'num_blocks=0'], filter_exclude='few', calc_min=calc_min)
plot_PINN_and_DRM_minimizer_separately(*res_shallow_and_deep_n_50, suptitle='Shallow vs Deep, n=50', patterns=patterns, repls=repls)

## Beta table

In [None]:
groups = ['beta=0.1#', 'beta=1#', 'beta=100#', 'beta=10000#', 'beta=100000#']
# note: the trailing # after beta=xyz is important; without it, beta=1 (for example) would also match beta=100, beta=10000 and beta=100000

res_n_10 = find_minimizers(dir_paths, groups, filter_include='input_dim=10,', metric_PINN='PINN relative_H1_error', metric_DRM='DRM relative_H1_error', calc_min=calc_min)
res_n_50 = find_minimizers(dir_paths, groups, filter_include='input_dim=50,', metric_PINN='PINN relative_H1_error', metric_DRM='DRM relative_H1_error', calc_min=calc_min)

conv = lambda x: '{:.2e}'.format(Decimal(x)) # format as scientific notation with two decimals
min_PINN_n_10 = [conv(x) for x in res_n_10[1]]
min_PINN_n_50 = [conv(x) for x in res_n_50[1]]

min_DRM_n_10 = [conv(x) for x in res_n_10[2]]
min_DRM_n_50 = [conv(x) for x in res_n_50[2]]

In [None]:
pd.DataFrame(data={group: [min_PINN_n_10[index], min_PINN_n_50[index]] 
                   for index, group in enumerate(groups)}, index=['n=10', 'n=50'])

In [None]:
pd.DataFrame(data={group: [min_DRM_n_10[index], min_DRM_n_50[index]] 
                   for index, group in enumerate(groups)}, index=['n=10', 'n=50'])

## Relations between goal functions and errors

In [None]:
metrics=['PINN relative_H1_error', 'PINN relative_L2_error', 'PINN relative_L2_error boundary', 'PINN loss_inner', 'PINN loss_boundary',
         'DRM relative_H1_error', 'DRM relative_L2_error', 'DRM relative_L2_error boundary', 'DRM loss_inner', 'DRM loss_boundary',]
res_all = find_all(dir_path, filter_include='^n=[0-9]*/', metrics=metrics)

In [None]:
fig, ax = plt.subplots(1, 2, figsize=(12,4))

res_inner_vs_boundary_H1_color_PINN = turn_into_x_y_hue(*res_all, x='PINN loss_inner', y='PINN loss_boundary', hue='PINN relative_H1_error', calc_min=calc_min)
plot_x_y_hue(*res_inner_vs_boundary_H1_color_PINN, ax=ax[0], 
             huescale='linear', xlabel='PINN goal_inner', ylabel='PINN goal_boundary')
calc_min_x = lambda values: sorted(np.abs(5 -np.array(values[int(len(values)*0.8):])))[int(len(values)*0.1)]
res_inner_vs_boundary_H1_color_DRM_n_10 = turn_into_x_y_hue(*res_all, x='DRM loss_inner', y='DRM loss_boundary', hue='DRM relative_H1_error', 
                                                           calc_min_x=calc_min_x, calc_min_y=calc_min, calc_min_hue=calc_min)
calc_min_x = lambda values: sorted(np.abs(25 -np.array(values[int(len(values)*0.8):])))[int(len(values)*0.1)]
res_inner_vs_boundary_H1_color_DRM_n_50 = turn_into_x_y_hue(*res_all, x='DRM loss_inner', y='DRM loss_boundary', hue='DRM relative_H1_error', 
                                                           calc_min_x=calc_min_x, calc_min_y=calc_min, calc_min_hue=calc_min)
res_zip = list(zip(res_inner_vs_boundary_H1_color_DRM_n_10, res_inner_vs_boundary_H1_color_DRM_n_50))
res_inner_vs_boundary_H1_color_DRM = []
for el in res_zip:
  try:
    el = pd.concat(el)
  except:
    el = np.concatenate(el)
  finally:
    res_inner_vs_boundary_H1_color_DRM.append(el)
plot_x_y_hue(*res_inner_vs_boundary_H1_color_DRM, ax=ax[1], 
             huescale='linear', xlabel='abs diff to optimal Dirichlet energy', ylabel='DRM goal_boundary')

fig.tight_layout(rect=[0,0,1,0.95]) # tight_layout fixes overlapping labels and rect is necessary for clean suptitle
fig.show()

In [None]:
fig, ax = plt.subplots(1, 2, figsize=(12,4))

res_inner_vs_boundary_loss_i_color_PINN = turn_into_x_y_hue(*res_all, x='PINN relative_L2_error', y='PINN relative_L2_error boundary', hue='PINN loss_inner', calc_min=calc_min)
plot_x_y_hue(*res_inner_vs_boundary_loss_i_color_PINN, ax=ax[0], xscale='linear', yscale='linear', huelabel='PINN goal_inner')
res_inner_vs_boundary_loss_b_color_PINN = turn_into_x_y_hue(*res_all, x='PINN relative_L2_error', y='PINN relative_L2_error boundary', hue='PINN loss_boundary', calc_min=calc_min)
plot_x_y_hue(*res_inner_vs_boundary_loss_b_color_PINN, ax=ax[1], xscale='linear', yscale='linear', huelabel='PINN goal_boundary')

fig.tight_layout(rect=[0,0,1,0.95]) # tight_layout fixes overlapping labels and rect is necessary for clean suptitle
fig.show()

In [None]:
# fig, ax = plt.subplots(1, 2, figsize=(12,4))

# res_inner_vs_boundary_L2_color_PINN = turn_into_x_y_hue(*res_all, x='PINN loss_inner', y='PINN loss_boundary', hue='PINN relative_L2_error', calc_min=calc_min)
# plot_x_y_hue(*res_inner_vs_boundary_L2_color_PINN, ax=ax[0], huescale='linear')
# res_inner_vs_boundary_L2_b_color_PINN = turn_into_x_y_hue(*res_all, x='PINN loss_inner', y='PINN loss_boundary', hue='PINN relative_L2_error boundary', calc_min=calc_min)
# plot_x_y_hue(*res_inner_vs_boundary_L2_b_color_PINN, ax=ax[1], huescale='linear')
# fig.tight_layout(rect=[0,0,1,0.95]) # tight_layout fixes overlapping labels and rect is necessary for clean suptitle
# fig.show()

In [None]:
fig, ax = plt.subplots(1, 2, figsize=(12,4))

# hue = loss_inner
calc_min_hue = lambda values: sorted(np.abs(5 -np.array(values[int(len(values)*0.8):])))[int(len(values)*0.1)]
res_inner_vs_boundary_loss_i_color_DRM_n_10 = turn_into_x_y_hue(*res_all, x='DRM relative_L2_error', y='DRM relative_L2_error boundary', hue='DRM loss_inner', 
                                                           calc_min_x=calc_min, calc_min_y=calc_min, calc_min_hue=calc_min_hue)
calc_min_hue = lambda values: sorted(np.abs(25 -np.array(values[int(len(values)*0.8):])))[int(len(values)*0.1)]
res_inner_vs_boundary_loss_i_color_DRM_n_50 = turn_into_x_y_hue(*res_all, x='DRM relative_L2_error', y='DRM relative_L2_error boundary', hue='DRM loss_inner', 
                                                           calc_min_x=calc_min, calc_min_y=calc_min, calc_min_hue=calc_min_hue)
res_zip = list(zip(res_inner_vs_boundary_loss_i_color_DRM_n_10, res_inner_vs_boundary_loss_i_color_DRM_n_50))
res_inner_vs_boundary_loss_i_color_DRM = []
for el in res_zip:
  try:
    el = pd.concat(el)
  except:
    el = np.concatenate(el)
  finally:
    res_inner_vs_boundary_loss_i_color_DRM.append(el)
plot_x_y_hue(*res_inner_vs_boundary_loss_i_color_DRM, ax=ax[0], xscale='linear', yscale='linear', huescale='linear', huelabel='abs diff to optimal Dirichlet energy')

# hue = loss_boundary
calc_min_hue = lambda values: sorted(np.abs(5 -np.array(values[int(len(values)*0.8):])))[int(len(values)*0.1)]
res_inner_vs_boundary_loss_b_color_DRM_n_10 = turn_into_x_y_hue(*res_all, x='DRM relative_L2_error', y='DRM relative_L2_error boundary', hue='DRM loss_boundary', 
                                                           calc_min_x=calc_min, calc_min_y=calc_min, calc_min_hue=calc_min_hue)
calc_min_hue = lambda values: sorted(np.abs(25 -np.array(values[int(len(values)*0.8):])))[int(len(values)*0.1)]
res_inner_vs_boundary_loss_b_color_DRM_n_50 = turn_into_x_y_hue(*res_all, x='DRM relative_L2_error', y='DRM relative_L2_error boundary', hue='DRM loss_boundary', 
                                                           calc_min_x=calc_min, calc_min_y=calc_min, calc_min_hue=calc_min_hue)
res_zip = list(zip(res_inner_vs_boundary_loss_b_color_DRM_n_10, res_inner_vs_boundary_loss_b_color_DRM_n_50))
res_inner_vs_boundary_loss_b_color_DRM = []
for el in res_zip:
  try:
    el = pd.concat(el)
  except:
    el = np.concatenate(el)
  finally:
    res_inner_vs_boundary_loss_b_color_DRM.append(el)
plot_x_y_hue(*res_inner_vs_boundary_loss_b_color_DRM, ax=ax[1], xscale='linear', yscale='linear', huescale='linear', huelabel='DRM goal_boundary')

fig.tight_layout(rect=[0,0,1,0.95]) # tight_layout fixes overlapping labels and rect is necessary for clean suptitle
fig.show()

In [None]:
# fig, ax = plt.subplots(1, 2, figsize=(12,4))

# # hue = relative_L2_error
# calc_min_x = lambda values: sorted(np.abs(5 -np.array(values[int(len(values)*0.8):])))[int(len(values)*0.1)]
# res_inner_vs_boundary_L2_color_DRM_n_10 = turn_into_x_y_hue(*res_all, x='DRM loss_inner', y='DRM loss_boundary', hue='DRM relative_L2_error', 
#                                                            calc_min_x=calc_min_x, calc_min_y=calc_min, calc_min_hue=calc_min)
# calc_min_x = lambda values: sorted(np.abs(25 -np.array(values[int(len(values)*0.8):])))[int(len(values)*0.1)]
# res_inner_vs_boundary_L2_color_DRM_n_50 = turn_into_x_y_hue(*res_all, x='DRM loss_inner', y='DRM loss_boundary', hue='DRM relative_L2_error', 
#                                                            calc_min_x=calc_min_x, calc_min_y=calc_min, calc_min_hue=calc_min)
# res_zip = list(zip(res_inner_vs_boundary_L2_color_DRM_n_10, res_inner_vs_boundary_L2_color_DRM_n_50))
# res_inner_vs_boundary_L2_color_DRM = []
# for el in res_zip:
#   try:
#     el = pd.concat(el)
#   except:
#     el = np.concatenate(el)
#   finally:
#     res_inner_vs_boundary_L2_color_DRM.append(el)
# plot_x_y_hue(*res_inner_vs_boundary_L2_color_DRM, ax=ax[0], xscale='linear', yscale='linear', huescale='linear', xlabel='abs diff to optimal Dirichlet energy')

# # hue = relative_L2_error boundary
# calc_min_x = lambda values: sorted(np.abs(5 -np.array(values[int(len(values)*0.8):])))[int(len(values)*0.1)]
# res_inner_vs_boundary_L2_b_color_DRM_n_10 = turn_into_x_y_hue(*res_all, x='DRM loss_inner', y='DRM loss_boundary', hue='DRM relative_L2_error boundary', 
#                                                            calc_min_x=calc_min_x, calc_min_y=calc_min, calc_min_hue=calc_min)
# calc_min_x = lambda values: sorted(np.abs(25 -np.array(values[int(len(values)*0.8):])))[int(len(values)*0.1)]
# res_inner_vs_boundary_L2_b_color_DRM_n_50 = turn_into_x_y_hue(*res_all, x='DRM loss_inner', y='DRM loss_boundary', hue='DRM relative_L2_error boundary', 
#                                                            calc_min_x=calc_min_x, calc_min_y=calc_min, calc_min_hue=calc_min)
# res_zip = list(zip(res_inner_vs_boundary_L2_b_color_DRM_n_10, res_inner_vs_boundary_L2_b_color_DRM_n_50))
# res_inner_vs_boundary_L2_b_color_DRM = []
# for el in res_zip:
#   try:
#     el = pd.concat(el)
#   except:
#     el = np.concatenate(el)
#   finally:
#     res_inner_vs_boundary_L2_b_color_DRM.append(el)
# plot_x_y_hue(*res_inner_vs_boundary_L2_b_color_DRM, ax=ax[1], xscale='linear', yscale='linear', huescale='linear', xlabel='abs diff to optimal Dirichlet energy')

# fig.tight_layout(rect=[0,0,1,0.95]) # tight_layout fixes overlapping labels and rect is necessary for clean suptitle
# fig.show()

# The $2^\mathrm{nd}$ derivative explodes

In [None]:
dir_path = '/content/logs/exploding 2nd derivatives/'
dir_paths = [dir_path+'n=10/', dir_path+'n=100/']
calc_min = lambda values: sorted(values[int(len(values)*0.8):])[int(len(values)*0.1)]
patterns = ['input_dim=', 'num.*beta', 'interior.*beta', ', $']
repls = ['n=', 'beta', 'beta', '']

## Overall best

In [None]:
res_overall_best = find_minimizers(dir_paths, ['input_dim=10,', 'input_dim=100,'], metric_PINN='PINN relative_C1_error', metric_DRM='DRM relative_C1_error', calc_min=calc_min)
plot_PINN_and_DRM_minimizer_together(*res_overall_best, rows=1, suptitle='Best overall run', patterns=patterns, repls=repls)

In [None]:
PINN_last_err = res_overall_best[3][0]['PINN relative_C1_error'].iat[-1]
DRM_last_err = res_overall_best[4][0]['DRM relative_C1_error'].iat[-1]
print('PINN last err ', PINN_last_err) 
print('DRM last err ', DRM_last_err) 
print('Relative difference ', abs(PINN_last_err - DRM_last_err)/min(PINN_last_err, DRM_last_err))

## Different batch sizes

In [None]:
res_few_points_n_10 = find_minimizers(dir_path+'n=10/', ['shallow', 'few inner', 'few everywhere'], metric_PINN='PINN relative_C1_error', metric_DRM='DRM relative_C1_error', calc_min=calc_min)
plot_PINN_and_DRM_minimizer_separately(*res_few_points_n_10, suptitle='Different batch sizes, n=10', patterns=patterns+['shallow'], repls=repls+['normal'])

res_few_points_n_100 = find_minimizers(dir_path+'n=100/', ['shallow', 'few inner', 'few everywhere'], metric_PINN='PINN relative_C1_error', metric_DRM='DRM relative_C1_error', calc_min=calc_min)
plot_PINN_and_DRM_minimizer_separately(*res_few_points_n_100, suptitle='Different batch sizes, n=100', patterns=patterns+['shallow'], repls=repls+['normal'])

In [None]:
alt_res_few_points_n_10 = retrieve_alt_data(*res_few_points_n_10, metric_PINN='PINN relative_L_oo_error boundary', metric_DRM='DRM relative_L_oo_error boundary', calc_min=calc_min)
plot_PINN_and_DRM_minimizer_separately(*alt_res_few_points_n_10, suptitle='Relative L_inf error on boundary, n=10', ylabel='relative_L_inf_error boundary', patterns=patterns+['shallow'], repls=repls+['normal'])

alt_res_few_points_n_100 = retrieve_alt_data(*res_few_points_n_100, metric_PINN='PINN relative_L_oo_error boundary', metric_DRM='DRM relative_L_oo_error boundary', calc_min=calc_min)
plot_PINN_and_DRM_minimizer_separately(*alt_res_few_points_n_100, suptitle='Relative L_inf error on boundary, n=100', ylabel='relative_L_inf_error boundary', patterns=patterns+['shallow'], repls=repls+['normal'])

In [None]:
res_few_points_n_10_L_inf = find_minimizers(dir_path+'n=10/', ['shallow', 'few inner', 'few everywhere'], metric_PINN='PINN relative_L_oo_error', metric_DRM='DRM relative_L_oo_error', calc_min=calc_min)
plot_PINN_and_DRM_minimizer_separately(*res_few_points_n_10_L_inf, suptitle='L_inf, n=10', ylabel='relative_L_inf_error', DRM_visible=False, patterns=patterns+['shallow'], repls=repls+['normal'])

alt_res_few_points_n_10_L_inf = retrieve_alt_data(*res_few_points_n_10_L_inf, metric_PINN='PINN relative_L_oo_error boundary', metric_DRM='DRM relative_L_oo_error boundary', calc_min=calc_min)
plot_PINN_and_DRM_minimizer_separately(*alt_res_few_points_n_10_L_inf, suptitle='Relative L_inf error on boundary, n=10', ylabel='relative_L_inf_error boundary', DRM_visible=False, patterns=patterns+['shallow'], repls=repls+['normal'])

## Difference between shallow and deep

In [None]:
# n=10
res_shallow_and_deep_n_10 = find_minimizers(dir_path+'n=10/', ['num_blocks=3', 'num_blocks=0'], filter_exclude='few', metric_PINN='PINN relative_C1_error', metric_DRM='DRM relative_C1_error', calc_min=calc_min)
plot_PINN_and_DRM_minimizer_separately(*res_shallow_and_deep_n_10, suptitle='Shallow vs Deep, n=10', patterns=patterns, repls=repls)

# n=100
res_shallow_and_deep_n_100 = find_minimizers(dir_path+'n=100/', ['num_blocks=3', 'num_blocks=0'], filter_exclude='few', metric_PINN='PINN relative_C1_error', metric_DRM='DRM relative_C1_error', calc_min=calc_min)
plot_PINN_and_DRM_minimizer_separately(*res_shallow_and_deep_n_100, suptitle='Shallow vs Deep, n=100', patterns=patterns, repls=repls)

In [None]:
# long epochs tests
def plot_shallow_vs_deep_lots_of_epochs():
  dir_path = '/content/logs/exploding 2nd derivatives/'
  names = [('deep, input_dim=10, num_blocks=3, nodes_per_layer=10#1/beta=10000#1', 'shallow, input_dim=10, num_blocks=0, nodes_per_layer=100#1/beta=1#1',), 
           ('deep, input_dim=10, num_blocks=3, nodes_per_layer=10#1/beta=100#1', 'shallow, input_dim=10, num_blocks=0, nodes_per_layer=100#1/beta=100#1'), 
           ('deep, input_dim=100, num_blocks=3, nodes_per_layer=100#1/beta=100000#1', 'shallow, input_dim=100, num_blocks=0, nodes_per_layer=100#1/beta=10000#1'), 
           ('deep, input_dim=100, num_blocks=3, nodes_per_layer=100#1/beta=100000#1', 'shallow, input_dim=100, num_blocks=0, nodes_per_layer=100#1/beta=100#1'),]
  index = 0
  for n in [10, 100]:
    fig, ax = plt.subplots(1, 2, figsize=(12,4), sharey=True)
    for i, method in enumerate(['PINN', 'DRM']):
      metric = method+' relative_H1_error'
      for j, network in enumerate(['deep', 'shallow']):
        color = '#5698C6' if network == 'deep' else '#FF9E4A'
        for epochs in [f'n={n}/', 'epochs=50000/']:
          name = names[index][j]
          
          event_acc = EventAccumulator(dir_path + epochs + name, size_guidance={'tensors': 0}) 
          event_acc.Reload()
          data = pd.DataFrame([(w, s, tf.make_ndarray(t)[()]) for w, s, t in event_acc.Tensors(metric)], columns=['wall_time', 'step', metric],)

          if epochs == 'epochs=50000/':
            label = sub_all(patterns, repls, name)
            linestyle = 'solid'
          else:
            label = None
            linestyle = 'dotted'
          sns.lineplot(data=data, x='step', y=metric, ax=ax[i], label=label, linestyle=linestyle, color=color)
      ax[i].set(title=method, xlabel='epoch', ylabel='relative_C1_error')
      index += 1
    plt.yscale('log')
    fig.suptitle(f'Shallow vs Deep, lots of epochs, n={n}', fontsize=14)
    fig.tight_layout(rect=[0,0,1,0.95]) # tight_layout fixes overlapping labels and rect is necessary for clean suptitle
    fig.show()

plot_shallow_vs_deep_lots_of_epochs()

## Beta table

In [None]:
groups = ['beta=0.1#', 'beta=1#', 'beta=100#', 'beta=10000#', 'beta=100000#']
# note: the trailing # after beta=xyz is important; without it, beta=1 (for example) would also match beta=100, beta=10000 and beta=100000

res_n_10 = find_minimizers(dir_paths, groups, filter_include='input_dim=10,', metric_PINN='PINN relative_C1_error', metric_DRM='DRM relative_C1_error', calc_min=calc_min)
res_n_100 = find_minimizers(dir_paths, groups, filter_include='input_dim=100,', metric_PINN='PINN relative_C1_error', metric_DRM='DRM relative_C1_error', calc_min=calc_min)

conv = lambda x: '{:.2e}'.format(Decimal(x)) # format as scientific notation with two decimals
min_PINN_n_10 = [conv(x) for x in res_n_10[1]]
min_PINN_n_100 = [conv(x) for x in res_n_100[1]]

min_DRM_n_10 = [conv(x) for x in res_n_10[2]]
min_DRM_n_100 = [conv(x) for x in res_n_100[2]]

In [None]:
pd.DataFrame(data={group: [min_PINN_n_10[index], min_PINN_n_100[index]] 
                   for index, group in enumerate(groups)}, index=['n=10', 'n=100'])

In [None]:
pd.DataFrame(data={group: [min_DRM_n_10[index], min_DRM_n_100[index]] 
                   for index, group in enumerate(groups)}, index=['n=10', 'n=100'])