## Problem Statement

Use PINNs to solve the parameterized fluid flow for the given geometry and flow parameters.

Simulate the same chip flow as before, but now with variable chip height and width in a single parametric run.

<img src="chip_2d_parameterized.png" alt="Drawing" style="width: 800px;"/>

## Challenge

The main challenge in this problem is to formulate it correctly using PINNs. To achieve that, you will have to complete the following parts successfully:
1. Define the correct *parameterized* geometry for the problem
2. Set up the correct boundary conditions and equations
3. Create the neural network and solve the *parametric* problem

Successful completion of the problem should result in a network that can be evaluated at any geometry parameter combination of your choice. For example, the model should be able to create plots for a geometry configuration where the chip height is 0.4 and the chip width is 1.4. An example is shown below. Remember, once the model is trained, you can run inference for any geometry combination without solving the problem again. Quite cool, isn't it?

<img src="challenge_results_param_updated.png" alt="Drawing" style="width: 800px;"/>

In this template, we give you skeleton code that you will fill in to define and solve the parametric problem. If you have completed challenge 1 successfully, moving to this part should be fairly easy, as there are only a few minor edits. Let us start by importing the required packages.

**Note: You need to edit the `chip_2d_parameterized_template.py` script that is placed in the `../source_code/chip_2d/` directory.**

From the top menu, click on File, and open `chip_2d_parameterized_template.py` from the `../source_code/chip_2d` directory. Remember to SAVE your code after making changes and before running the cells below.

```python
from sympy import Symbol
import numpy as np
import tensorflow as tf
from modulus.solver import Solver
from modulus.dataset import TrainDomain, ValidationDomain, InferenceDomain
from modulus.data import Validation, Inference
from modulus.sympy_utils.geometry_2d import Rectangle, Line, Channel2D
from modulus.sympy_utils.functions import parabola
from modulus.csv_utils.csv_rw import csv_to_dict
from modulus.PDES.navier_stokes import IntegralContinuity, NavierStokes
from modulus.controller import ModulusController
from modulus.architecture import FourierNetArch
from modulus.learning_rate import ExponentialDecayLRWithWarmup
```

We now define the simulation parameters and the symbolic variables that parameterize the geometry. As seen in the earlier tutorials, we define a range for each variable from which data points will be sampled during training. Feel free to try different ranges for the variables.

```python
# simulation params
channel_length = (-2.5, 2.5)
channel_width = (-0.5, 0.5)
chip_pos = -1.0
#chip_height = 0.6         # Not fixed anymore
#chip_width = 1.0          # Not fixed anymore
inlet_vel = 1.5

# parametric variables
chip_height = Symbol('chip_height')
chip_width = Symbol('chip_width')

chip_height_range = (0.4, 0.8)
chip_width_range  = (0.6, 1.4)

param_ranges = {chip_height: chip_height_range, 
                chip_width: chip_width_range}
```
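To make the role of these ranges concrete, here is a minimal sketch of drawing one batch of design parameters. It assumes the values are sampled uniformly within each range, uses plain strings in place of the sympy symbols, and relies on a hypothetical helper `sample_params`; it is not Modulus's internal sampling code.

```python
import numpy as np

# Stand-in for the notebook's param_ranges; strings replace the sympy Symbols.
param_ranges_demo = {
    "chip_height": (0.4, 0.8),
    "chip_width": (0.6, 1.4),
}

def sample_params(ranges, batch_size, seed=0):
    """Draw one (batch_size, 1) column per parameter, uniformly within its range.

    Hypothetical helper for illustration; not part of the Modulus API.
    """
    rng = np.random.default_rng(seed)
    return {
        name: rng.uniform(low, high, size=(batch_size, 1))
        for name, (low, high) in ranges.items()
    }

samples = sample_params(param_ranges_demo, batch_size=4)
for name, values in samples.items():
    print(name, values.ravel())
```

Each training point then carries its own `(chip_height, chip_width)` pair alongside its coordinates, which is what lets a single network cover the whole design space.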

```python
#TODO: Replace all the placeholders with appropriate geometry constructions
# define geometry here
# you may use the geometry generated in the previous challenge problem as a reference

channel = placeholder
# define inlet and outlet
inlet = placeholder
outlet = placeholder
# define the chip
rec = placeholder
# create a geometry for higher sampling of point cloud near the fin
flow_rec = placeholder

# fluid area
geo = placeholder
geo_hr = placeholder
geo_lr = placeholder

x_pos = Symbol('x_pos')
integral_line = placeholder
x_pos_range = {x_pos: lambda batch_size: np.full((batch_size, 1), np.random.uniform(channel_length[0], channel_length[1]))}
```

Now complete the `Chip2DTrain` class below. Remember that this time, you will have to pass the dictionary of the parameter keys and their ranges to the `param_ranges` argument. An example for the inlet boundary condition is already shown. *Note: For the integral continuity planes, you have two dictionaries: one for the randomly sampled* `x_pos` *and the other for the parametric variables themselves. Make sure to include both of these as input to* `param_ranges`*.*

```python
#TODO: Replace all the placeholders with appropriate values 

# define sympy variables to parametrize domain curves
x, y = Symbol('x'), Symbol('y')

class Chip2DTrain(TrainDomain):
  def __init__(self, **config):
    super(Chip2DTrain, self).__init__()

    # inlet
    inlet_parabola = parabola(y, channel_width[0], channel_width[1], inlet_vel)
    inlet_bc = inlet.boundary_bc(outvar_sympy={'u': inlet_parabola, 'v': 0},
                                 batch_size_per_area=64,
                                 param_ranges=param_ranges)
    self.add(inlet_bc, name="Inlet")

    # outlet
    outlet_bc = outlet.boundary_bc(outvar_sympy={placeholder},
                                   batch_size_per_area=placeholder,
                                   param_ranges=placeholder)
    self.add(outlet_bc, name="Outlet")

    # noslip
    noslip = geo.boundary_bc(outvar_sympy={placeholder},
                             batch_size_per_area=placeholder,
                             param_ranges=placeholder)
    self.add(noslip, name="ChipNS")

    # interior lr
    interior_lr = geo_lr.interior_bc(outvar_sympy={placeholder},
                                     bounds={placeholder},
                                     lambda_sympy={placeholder},
                                     batch_size_per_area=placeholder,
                                     param_ranges=placeholder)
    self.add(interior_lr, name="InteriorLR")

    # interior hr
    interior_hr = geo_hr.interior_bc(outvar_sympy={placeholder},
                                     bounds={placeholder},
                                     lambda_sympy={placeholder},
                                     batch_size_per_area=placeholder,
                                     param_ranges=placeholder)
    self.add(interior_hr, name="InteriorHR")


    # integral continuity
    for i in range(4):
      IC = integral_line.boundary_bc(outvar_sympy={placeholder},
                                     batch_size_per_area=placeholder,
                                     lambda_sympy={placeholder},
                                     criteria=placeholder,
                                     param_ranges={placeholder},
                                     fixed_var=placeholder)
      self.add(IC, name="IntegralContinuity_"+str(i))
```
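The note about the two dictionaries can be sketched in plain Python. The snippet below uses string keys as stand-ins for the sympy symbols and a hypothetical name `ic_param_ranges`. It only shows how two dictionaries combine into one and how the `x_pos` sampler behaves; whether this is the exact form the template expects is an assumption to check against your challenge 1 solution.

```python
import numpy as np

channel_length = (-2.5, 2.5)

# Stand-ins for the notebook's dictionaries (strings instead of sympy Symbols).
param_ranges_demo = {"chip_height": (0.4, 0.8), "chip_width": (0.6, 1.4)}
x_pos_range_demo = {
    "x_pos": lambda batch_size: np.full(
        (batch_size, 1),
        np.random.uniform(channel_length[0], channel_length[1]),
    )
}

# Merge both dictionaries into one; later keys would override earlier ones,
# but here the key sets are disjoint.
ic_param_ranges = {**x_pos_range_demo, **param_ranges_demo}

# Each call places the whole batch on a single, randomly chosen x location.
x_column = ic_param_ranges["x_pos"](5)
print(sorted(ic_param_ranges))
print(x_column.ravel())
```

Because the sampler fills the whole column with one value, every point in a batch lies on the same vertical line, which is what an integral continuity plane needs.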

Now, add validation data to the problem. Similar to the tutorial examples, validate this parametric solution for a single configuration. Remember that the network for this problem has four inputs (`x`, `y`, `chip_height`, and `chip_width`), so your validation data must also have those corresponding inputs for each point in the CSV file.

```python
# validation data
mapping = {'Points:0': 'x', 'Points:1': 'y',
           'U:0': 'u', 'U:1': 'v', 'p': 'p'}
openfoam_var = csv_to_dict('openfoam/2D_chip_fluid0.csv', mapping)
openfoam_var['x'] -= 2.5 # normalize pos
openfoam_var['y'] -= 0.5

#TODO: Add the arrays for 'chip_height' and 'chip_width'

openfoam_invar_numpy = {key: value for key, value in openfoam_var.items() if key in ['x', 'y', 'chip_height', 'chip_width']}
openfoam_outvar_numpy = {key: value for key, value in openfoam_var.items() if key in ['u', 'v', 'p']}

class Chip2DVal(ValidationDomain):
  def __init__(self, **config):
    super(Chip2DVal, self).__init__()
    val = Validation.from_numpy(openfoam_invar_numpy, openfoam_outvar_numpy)
    self.add(val, name='Val')
```
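As a hint for the TODO above, the sketch below adds constant parameter columns to a stand-in dictionary. The random `x`/`y` data replaces the CSV contents (assumed here to hold one `(N, 1)` column per key), and the values 0.6 and 1.0 are assumptions taken from the fixed chip height and width commented out earlier. Use whatever configuration your OpenFOAM data was actually generated for.

```python
import numpy as np

# Stand-in for the dictionary returned by csv_to_dict: one (N, 1) column per key.
rng = np.random.default_rng(0)
demo_var = {
    "x": rng.uniform(-2.5, 2.5, size=(8, 1)),
    "y": rng.uniform(-0.5, 0.5, size=(8, 1)),
}

# Assumed configuration of the reference data (hypothetical values).
val_chip_height = 0.6
val_chip_width = 1.0

# Every validation point carries the same parameter values, so each new
# column is a constant array with the same shape as 'x'.
demo_var["chip_height"] = np.full_like(demo_var["x"], val_chip_height)
demo_var["chip_width"] = np.full_like(demo_var["x"], val_chip_width)

invar = {k: v for k, v in demo_var.items()
         if k in ["x", "y", "chip_height", "chip_width"]}
print({k: v.shape for k, v in invar.items()})
```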

As discussed in the beginning, we would like to visualize multiple designs without training again and again. Let's define an `InferenceDomain` to do this. Changing the values in this domain alone and executing the script with the `--run_mode=eval` command-line argument lets you visualize multiple designs using the already trained network.

```python
class Chip2DInf(InferenceDomain):
  def __init__(self, **config):
    super(Chip2DInf, self).__init__()
    inf = Inference(geo.sample_interior(2048, bounds={x: channel_length, y: channel_width}, 
                                        param_ranges={chip_height: 0.4, chip_width: 1.4}),
                    ['u', 'v', 'p'])
    self.add(inf, name='Inference')
```

Now finish the problem by defining the `ChipSolver` to solve our parametric problem. We use the same Fourier network as in the previous template. The important parameters of the neural network are defined for you. Feel free to tweak them and observe their effect on the results and the speed of convergence.

```python
#TODO: Replace all the placeholders with appropriate values
class ChipSolver(Solver):
  train_domain = placeholder
  val_domain = placeholder
  arch = FourierNetArch
  lr = ExponentialDecayLRWithWarmup
  inference_domain = placeholder

  def __init__(self, **config):
    super(ChipSolver, self).__init__(**config)

    self.frequencies = ('axis,diagonal', [i/5. for i in range(25)]) 

    self.equations = (placeholder)
    flow_net = self.arch.make_node(name='flow_net',
                                   inputs=[placeholder],
                                   outputs=[placeholder])
    self.nets = [flow_net]

  @classmethod
  def update_defaults(cls, defaults):
    defaults.update({
        'network_dir': './network_checkpoint_chip_2d_parameterized',
        'rec_results': True,
        'rec_results_freq': 5000,
        'max_steps': 20000,
        'decay_steps': 400,
        'warmup_type': 'gradual',
        'warmup_steps': 2000,
        'xla': True
        })

if __name__ == '__main__':
  ctr = ModulusController(ChipSolver)
  ctr.run()
```

## Bonus: Multi-GPU speedup

The NVIDIA Modulus library is optimized to achieve fast and scalable performance with features like accelerated linear algebra (XLA), automatic mixed precision (AMP) support, and multi-GPU/multi-node implementation. You may have already noticed that the `'xla'` option was set to `True` for most of our examples. Here, we discuss the multi-GPU capabilities. For a detailed discussion of the performance upgrades available in Modulus, please refer to the *Modulus User Guide Chapter 16: Case Study: Performance Upgrades and Parallel Processing using Multi-GPU Configurations*.

To boost performance and to run larger problems, Modulus supports multi-GPU and multi-node scaling using Horovod. This allows for multiple processes, each targeting a single GPU, to perform independent forward and backward passes and aggregate the gradients collectively before updating the model weights. The figure below shows the scaling performance of Modulus on an annular ring test problem up to 128 V100 GPUs on 16 nodes. The scaling efficiency from 1 to 32 GPUs is more than 95%.

<img src="multi_GPU_1.png" alt="Drawing" style="width: 800px;"/>

This data-parallel style of multi-GPU training keeps the number of points sampled per GPU constant while increasing the total effective batch size. We can use this to our advantage: increasing the number of GPUs increases the number of points sampled, allowing us to handle much larger problems.
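The aggregation step can be illustrated with a toy NumPy sketch. This is not Horovod or Modulus code; the linear model, the shard sizes, and the helper `mse_grad` are assumptions made for illustration. It shows that averaging gradients computed on equally sized per-worker batches gives the same result as one gradient over the combined global batch.

```python
import numpy as np

rng = np.random.default_rng(0)
n_workers, points_per_worker = 4, 16

# Toy linear model y = X @ w with a mean-squared-error loss.
w = rng.normal(size=(3, 1))
X = rng.normal(size=(n_workers * points_per_worker, 3))
y = rng.normal(size=(n_workers * points_per_worker, 1))

def mse_grad(X_part, y_part, w):
    """Gradient of mean((X w - y)^2) with respect to w."""
    residual = X_part @ w - y_part
    return 2.0 * X_part.T @ residual / len(X_part)

# Each worker computes a gradient on its own equally sized shard ...
shard_grads = [
    mse_grad(X_s, y_s, w)
    for X_s, y_s in zip(np.split(X, n_workers), np.split(y, n_workers))
]
# ... and the gradients are averaged across workers before the weight update.
averaged_grad = np.mean(shard_grads, axis=0)

# The result matches one gradient over the full (global) batch.
global_grad = mse_grad(X, y, w)
print("global batch size:", n_workers * points_per_worker)
print("match:", np.allclose(averaged_grad, global_grad))
```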

We can run the same scripts on multiple GPUs to achieve larger batch sizes. In doing so, the time per iteration remains fairly constant, and the benefit is mostly the ability to solve larger problems (refer to the figure above). But what if we also want a faster time to convergence? If you looked carefully, you may have noticed that we used a different learning rate schedule in this problem. It allows us to decrease the total time to convergence when training on multiple GPUs by scaling the learning rate linearly with the number of GPUs. As described in this [paper](https://arxiv.org/abs/1706.02677), simply increasing the learning rate can cause the model to diverge at large batch sizes. This can be fixed by using an initial learning rate warm-up. In this code, we make use of this Modulus feature (with gradual warm-up). The figure below shows the learning rate schedule and the loss function evolution as the number of GPUs is increased from 1 to 16 for the NVSwitch heatsink case. For the multi-GPU cases, the learning rate is gradually increased from the baseline value; this lets the model train without diverging early on and converge faster as a result of the increased global batch size coupled with the increased learning rate.

<img src="multi_GPU_2.png" alt="Drawing" style="width: 800px;"/>

For more details, refer to *Modulus User Guide Chapter 1 and Chapter 16*.
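The linear scaling rule with gradual warm-up described above can be sketched as a small function. This is an illustrative schedule, not the implementation behind `ExponentialDecayLRWithWarmup`: the function name `scaled_lr`, the base learning rate, the decay rate, and the GPU count are all assumptions, while `warmup_steps` and `decay_steps` reuse the values from the solver defaults in this notebook.

```python
def scaled_lr(step, base_lr=1e-3, n_gpus=4, warmup_steps=2000,
              decay_rate=0.95, decay_steps=400):
    """Gradual warm-up to base_lr * n_gpus, followed by exponential decay.

    All default values are assumptions for this sketch, not Modulus defaults.
    """
    target_lr = base_lr * n_gpus  # linear scaling rule
    if step < warmup_steps:
        # Gradual warm-up: ramp linearly from base_lr to target_lr.
        frac = step / warmup_steps
        return base_lr + frac * (target_lr - base_lr)
    # After warm-up: exponential decay starting from target_lr.
    return target_lr * decay_rate ** ((step - warmup_steps) / decay_steps)

for step in (0, 1000, 2000, 4000):
    print(step, scaled_lr(step))
```

With `n_gpus=1` the warm-up ramp is flat and the schedule reduces to a plain exponential decay, which corresponds to the single-GPU baseline.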

To run your code using multiple GPUs, simply execute the Python script with the `horovodrun` command as follows, replacing `#GPUs` with the number of GPUs to use:

`horovodrun -np #GPUs python_script.py `


# Licensing
This material is released by NVIDIA Corporation under the Creative Commons Attribution 4.0 International (CC BY 4.0) license.