<p style="font-size:32px; font-weight: bolder; text-align: center"> Rotations, equivariance and <br/> symmetry-adapted regression </p>
<p style="text-align: center"><i> authored by: <a href="mailto:michele.ceriotti@gmail.com"> Michele Ceriotti </a></i></p>

This notebook discusses the concept of equivariance, with a specific focus on the rotation group. We will learn about Cartesian rotations, spherical harmonics, and Wigner matrices.
We will see how these concepts apply to some of the equivariant descriptors used in machine learning, and how it is possible to build simple regression models that yield rotationally equivariant predictions for vectorial or tensorial properties.

### Packages and dependencies
This module uses some utility functions from `scipy` and `spherical` to handle rotations, and `rascaline` to compute descriptors. 

In [None]:
%matplotlib widget
# scwidgets import
import matplotlib as mpl
import matplotlib.pyplot as plt
import mpl_toolkits.mplot3d as mplot3d
import chemiscope

import ipywidgets
from ipywidgets import FloatSlider, IntSlider, Checkbox, Dropdown, HBox, Layout, HTML, Text

from markdown import markdown as mdwn

import scwidgets
from scwidgets.check import (
    Check,
    CheckRegistry,
    assert_numpy_allclose,
    assert_numpy_floating_sub_dtype,
    assert_shape,
    assert_type,
)
from scwidgets.code import ParameterPanel, CodeInput
from scwidgets.cue import CueObject, CueFigure
from scwidgets.exercise import CodeExercise, TextExercise, ExerciseRegistry

In [None]:
import numpy as np
import ase, ase.io
import itertools
from copy import deepcopy
from tqdm.notebook import tqdm

import rascaline
from metatensor import mean_over_samples, Labels, TensorMap, TensorBlock, slice_block

from sklearn.decomposition import PCA
from sklearn.linear_model import Ridge, RidgeCV

import scipy

from sphericart import SphericalHarmonics
sph = SphericalHarmonics(l_max=8) # initializes sph calculator

In [None]:
from rotutils import *

In [None]:
# set CSS style for code-hide
scwidgets.get_css_style()

### Answer settings

In [None]:
exercise_registry = ExerciseRegistry(filename_prefix="module_03")
exercise_registry

In [None]:
check_registry = CheckRegistry()
check_registry

In [None]:
module_summary = TextExercise(
    exercise_description="""You can use this box to make general considerations, 
    or keep track of your doubts and questions about this notebook.""",
    exercise_registry=exercise_registry,
    exercise_title="Module comments",
    exercise_key="00"
)
display(module_summary)

# The rotation group 

Rotations describe the changes in the orientation in space of a rigid body relative to a fixed coordinate system. The mathematical description of rotations is notoriously tedious, with a plethora of different conventions that are often applied inconsistently in different works. 
If you are the kind of person who enjoys this stuff, this [Wikipedia article](https://en.wikipedia.org/wiki/Rotation_formalisms_in_three_dimensions) provides a comprehensive overview. 

In this exercise we are going to define rotations in terms of [Euler angles](https://en.wikipedia.org/wiki/Euler_angles) in the so-called ZYZ convention, in which the rotation is identified by three angles $(\alpha, \beta, \gamma)$ where $\alpha$ and $\gamma$ are periodic and can be chosen in the interval $[-\pi,\pi]$, and $\beta$ in the interval $[0,\pi]$.
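
As a minimal sketch of what a ZYZ parameterization looks like, the three Euler angles can be turned into a matrix by composing elementary rotations about the z and y axes. The helper names (`rot_z`, `rot_y`, `zyz_rotation`) are illustrative and are not part of `rotutils`; the ordering $\mathbf{R}_z(\alpha)\mathbf{R}_y(\beta)\mathbf{R}_z(\gamma)$ is an assumption, and the `rotation_matrix` function used in this notebook may follow a different convention.

```python
import numpy as np

def rot_z(angle):
    """Rotation by `angle` radians about the z axis."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def rot_y(angle):
    """Rotation by `angle` radians about the y axis."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

def zyz_rotation(alpha, beta, gamma):
    """Assumed ZYZ composition: R = Rz(alpha) @ Ry(beta) @ Rz(gamma)."""
    return rot_z(alpha) @ rot_y(beta) @ rot_z(gamma)

# With beta = 0 the two z rotations merge, so only alpha + gamma matters
degenerate = np.allclose(zyz_rotation(0.2, 0.0, 0.5), zyz_rotation(0.7, 0.0, 0.0))

# With beta != 0, shifting angle from gamma to alpha gives a different orientation
distinct = not np.allclose(zyz_rotation(0.2, 1.0, 0.5), zyz_rotation(0.7, 1.0, 0.0))

print(degenerate, distinct)
```

The degenerate case at $\beta=0$ is one way to see why all three angles are needed in general, even though two of them become redundant for special orientations.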

To get a grasp of what Euler angles do, and why you need to define three angles to properly characterize the orientation of a structure, you can play around with the following visualization.

In [None]:
ex01_pb =  ParameterPanel(
    alpha = FloatSlider(value=0,min=-np.pi,max=np.pi,step=0.01,description=r'$\alpha$'),
    beta = FloatSlider(value=0,min=0,max=np.pi,step=0.01,description=r'$\beta$'),
    gamma = FloatSlider(value=0,min=-np.pi,max=np.pi,step=0.01,description=r'$\gamma$'))

In [None]:
ex01_fig = plt.figure(tight_layout=True)
ax01 = ex01_fig.add_subplot(111, projection='3d')
ex01_cuefig = CueFigure(ex01_fig) 

theta = np.linspace(0, 2 * np.pi, 20)
w = np.linspace(-0.5, 0.5, 10)
theta, w = np.meshgrid(theta, w)
R = 1
x = (R + w * np.cos(theta / 2)) * np.cos(theta)
y = (R + w * np.cos(theta / 2)) * np.sin(theta)
z = w * np.sin(theta / 2)

ex01_xyz =  np.array([x,y,z]).T

def update_01(code_exercise):    
    alpha, beta, gamma = code_exercise.parameters.values()
    cue_figure = code_exercise.cue_outputs[0]
    ax = cue_figure.figure.get_axes()[0]
    rot = rotation_matrix(alpha,beta,gamma)
    (x,y,z) = (ex01_xyz@rot.T).T
    ax.set_xlim([-2,2])
    ax.set_ylim([-2,2])
    ax.set_zlim([-2,2])
    dax = 2*np.eye(3)@rot.T    
    ax.quiver(0,0,0,*(dax[0]),color='r')
    ax.quiver(0,0,0,*(dax[1]),color='g')
    ax.quiver(0,0,0,*(dax[2]),color='b')
    ax.set_xlabel('X')
    ax.set_ylabel('Y')
    ax.set_zlabel('Z')
    ax.plot_surface(x, y, z, color='gray')
    
    ax.set_aspect('auto')
    cue_figure.figure.subplots_adjust(left=0.0, right=1, top=1, bottom=0.0)

ce01 = CodeExercise(
            parameters=ex01_pb,
            cue_outputs = [ex01_cuefig],
            update_func = update_01,
            update_mode="continuous")

display(ce01)
ce01.run_update()

A group is a set of elements endowed with a composition operation. The composition operation is associative, and the set has to be closed under composition, have an identity element, and contain the inverse of each of its elements. 

This means that it is possible to combine rotations, and that the composition of two rotations is itself a rotation $\hat{R}''=\hat{R}'\hat{R}$. Rotations are _not_ commutative, so the order of application of the rotation operators matters.
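
The closure and non-commutativity properties can be checked numerically. This is a small sketch with hand-written elementary rotations (the helpers `rot_z` and `rot_y` are illustrative names, not functions from `rotutils`).

```python
import numpy as np

def rot_z(angle):
    """Rotation by `angle` radians about the z axis."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def rot_y(angle):
    """Rotation by `angle` radians about the y axis."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

Rz = rot_z(np.pi / 2)
Ry = rot_y(np.pi / 2)

zy = Rz @ Ry  # acting on a column vector: rotate about y first, then about z
yz = Ry @ Rz  # rotate about z first, then about y

# Closure: the product is again orthogonal with determinant +1
closed = np.allclose(zy @ zy.T, np.eye(3)) and np.isclose(np.linalg.det(zy), 1.0)

# Non-commutativity: swapping the order gives a different rotation
commute = np.allclose(zy, yz)

# Rotations about the same axis are a special case that does commute
same_axis_commute = np.allclose(rot_z(0.3) @ rot_z(1.2), rot_z(1.2) @ rot_z(0.3))

print(closed, commute, same_axis_commute)
```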

## Cartesian rotations

In practical terms, a rotation operator $\hat{R}$ parameterized by the Euler angles acts on a 3D object by applying the corresponding rotation matrix $\mathbf{R}$ to the Cartesian coordinates of all its points: if a structure $A$ has atomic positions $\mathbf{r}_i$ (each of these being a 3-vector corresponding to the Cartesian coordinates $(x,y,z)$) then the rotated structure $\hat{R}A$ has atomic coordinates $\mathbf{R}\mathbf{r}_i$. The same transformation is applied to all the properties $\mathbf{y}$ of $A$ that have a vectorial character, e.g. the dipole moment - so that the dipole of $\hat{R}A$ is $\mathbf{R}\mathbf{y}$. 

Note also that $SO(3)$ is an _orthogonal_ group, meaning that if $\mathbf{R}$ is the rotation matrix associated with $\hat{R}$, the matrix associated with $\hat{R}^{-1}$ is $\mathbf{R}^T$, and $\mathbf{RR}^T=\mathbf{R}^T\mathbf{R}$ is the identity. 
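
A quick numerical check of these statements, as a sketch: the rotation below is generated from the QR decomposition of a random matrix (an illustrative choice, not how the notebook builds its rotations), and the vectors `v` and `w` are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

# QR of a random matrix gives an orthogonal Q; flip one column if needed
# so that the determinant is +1 (a proper rotation rather than a reflection)
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
if np.linalg.det(Q) < 0:
    Q[:, 0] *= -1.0
R = Q

v = np.array([1.0, -2.0, 0.5])
w = np.array([0.3, 0.1, 2.0])

# The transpose is the inverse
transpose_is_inverse = np.allclose(R.T, np.linalg.inv(R))

# Rotating and then applying the transpose recovers the original vector
round_trip = np.allclose(R.T @ (R @ v), v)

# Scalar products (hence lengths and angles) are preserved
dot_preserved = np.isclose((R @ v) @ (R @ w), v @ w)

print(transpose_is_inverse, round_trip, dot_preserved)
```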

Let's consider a dataset that contains a few organic molecules, and for each of them the computed dipole moment $\boldsymbol{\mu}$ and polarizability $\boldsymbol{\alpha}$. These are structures from the "showcase" dataset from ([Yang et al. (2019)](http://doi.org/10.1038/s41597-019-0157-8)).

In [None]:
frames_alphamu = ase.io.read('data/showcase.xyz', ":")

In [None]:
dipoles_show = chemiscope.ase_vectors_to_arrows(frames_alphamu, "dipole_ccsd", scale=0.5)
dipoles_show["parameters"]["global"]["color"]="0xff8000"

alphas_show = chemiscope.ase_tensors_to_ellipsoids(frames_alphamu, "ccsd_pol", scale=0.2)
alphas_show["parameters"]["global"]["color"]="0xff0080"

In [None]:
chemiscope.show(frames=frames_alphamu, 
                shapes={
                    "mu": dipoles_show,
                    "alpha": alphas_show
                       },
                mode="structure",
               settings=chemiscope.quick_settings(structure_settings={"shape":"mu"})
               )

In [None]:
ex02_wci = CodeInput(
        function_name="rotate_atoms", 
        function_parameters="positions, dipole, rotm",
        docstring="""takes the positions and dipole of a structure and transforms them
        according to the given rotation matrix 
        
        :param positions: a (n_atoms,3) array containing the atomic positions
        :param dipole: a (3) array containing the dipole components
        :param rotm: a (3,3) array containing the rotation matrix
        
        :returns: (positions, dipole) - a tuple containing the transformed positions and dipole
""",
        function_body="""

# NB: be careful with how you can apply the rotations to the positions array

new_positions = positions.copy()
new_dipole = dipole.copy()

# Apply the rotation here. Be careful with the shape and layout of the arrays

return new_positions, new_dipole
"""
        )

In [None]:
def update_02(code_exercise):
    output = code_exercise.cue_outputs[0]
    output.clear_output()
    rots = []
    f = frames_alphamu[0]
    for r in np.pi*np.array([[0,0,0],[0,0.125,0],[0,0.250,0],[0,0.375,0],[0,0.5,0],
                             [0.125,0.5,0],[0.25,0.5,0],[0.375,0.5,0],[0.5,0.5,0],
                             [0.5,0.5,0.125],[0.5,0.5,0.25],[0.5,0.5,0.375],[0.5,0.5,0.5]]):
        nf = deepcopy(f)
        nf.positions, nf.info["dipole_ccsd"] = ex02_wci.get_function_object()(
            f.positions, f.info["dipole_ccsd"], rotation_matrix(*r) )
        rots.append(nf)
    with output:
        dipoles_show = chemiscope.ase_vectors_to_arrows(rots, "dipole_ccsd", scale=0.5)
        dipoles_show["parameters"]["global"]["color"]="0xff8000"
        cs=chemiscope.show(rots, shapes={
                    "mu": dipoles_show,
                },
                mode="structure",
               settings=chemiscope.quick_settings(structure_settings={"shape":"mu", 
                                                                      "keepOrientation":True})
                          )
        cs.save("module_02-dipole_rotations.chemiscope.json.gz")
        display(cs)

ex02_reference_input = [{'positions':np.array([[0.,0,1],[1,2,0],[3,2,-1]]), 
                         'dipole':np.array([5.,6,7]),
                         'rotm': np.eye(3)},
                       {'positions':np.array([[0.,0,1],[1,2,0],[3,2,-1]]), 
                         'dipole':np.array([5.,6,7]),
                         'rotm': rotation_matrix(0,np.pi/2,0)}]
ex02_reference_output = [(np.array([[0.,0,1],[1,2,0],[3,2,-1]]),np.array([5.,6,7])),
                         (np.array([[ 1.00000000e+00,  0.00000000e+00,  2.22044605e-16],
         [ 2.22044605e-16,  2.00000000e+00, -1.00000000e+00],
         [-1.00000000e+00,  2.00000000e+00, -3.00000000e+00]]),
  np.array([ 7.,  6., -5.]))]
ex02_code_demo = CodeExercise(
    code= ex02_wci,
    check_registry=check_registry,
    cue_outputs = [CueObject()],
    update_func = update_02,
    exercise_key="02",
    exercise_registry=exercise_registry,
    exercise_title="Exercise 02: Molecular rotations",
    exercise_description=mdwn("""
Implement a function that gets positions and dipole of a molecule and rotates them according to
the provided rotation matrix.
""")
)

check_registry.add_check(ex02_code_demo,
    asserts= [
        assert_type,
        assert_shape,
        assert_numpy_allclose,
    ],
     inputs_parameters=ex02_reference_input,
     outputs_references =ex02_reference_output)

In [None]:
display(ex02_code_demo)

[Download chemiscope datafile](./module_02-dipole_rotations.chemiscope.json.gz)

There are however more complicated properties than those transforming as vectors. Go back to the dataset viewer, and change the visualizer settings to display the polarizability `alpha`. The polarizability describes the second order response of the energy of a molecule to an applied electric field, i.e.

$$
\alpha_{ab} = \frac{\partial^2 U}{\partial E_a \partial E_b}
$$

It is therefore a _tensor_ labeled by two Cartesian indices. In order to see how it transforms under rotations, you should consider that a rotation would affect the relation of the reference frame of the molecule to that of _both_ electric field vectors, so one needs to apply _two_ rotation matrices,

$$
\boldsymbol{\alpha}(\hat{R}A) = \mathbf{R}\boldsymbol{\alpha}(A)\mathbf{R}^T
$$
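
One way to convince yourself that two rotation matrices are needed is to look at a quantity built from the polarizability and a field. The sketch below assumes the usual linear-response relation $\boldsymbol{\mu}_\text{ind} = \boldsymbol{\alpha}\mathbf{E}$ and uses made-up numbers for the tensor and the field; it checks that the induced dipole rotates as a vector and that the quadratic form $\mathbf{E}^T\boldsymbol{\alpha}\mathbf{E}$ is unchanged when both the molecule and the field are rotated.

```python
import numpy as np

angle = 0.6
c, s = np.cos(angle), np.sin(angle)
R = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])  # rotation about z

# Illustrative symmetric tensor and applied field (not taken from the dataset)
alpha = np.array([[5.0, 1.0, 1.0], [1.0, 3.0, 0.0], [1.0, 0.0, 4.0]])
field = np.array([0.2, -0.1, 0.4])

mu = alpha @ field              # induced dipole in the original frame

alpha_rot = R @ alpha @ R.T     # tensor rule: one R for each Cartesian index
field_rot = R @ field           # vector rule
mu_rot = alpha_rot @ field_rot  # induced dipole in the rotated frame

# The induced dipole transforms like any other vector
vector_like = np.allclose(mu_rot, R @ mu)

# The scalar E^T alpha E does not change
scalar_invariant = np.isclose(field_rot @ alpha_rot @ field_rot, field @ alpha @ field)

print(vector_like, scalar_invariant)
```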

In [None]:
ex03_wci = CodeInput(
        function_name="rotate_atoms_pol", 
        function_parameters="positions, alpha, rotm",
        docstring="""takes the positions and polarizability of a structure and transforms 
        them according to the given rotation matrix 
        
        :param positions: a (n_atoms,3) array containing the atomic positions
        :param alpha: a (3,3) matrix containing the polarizability
        :param rotm: a (3,3) array containing the rotation matrix
        
        :returns: (positions, alpha) - a tuple containing the transformed positions and polarizability
""",
        function_body="""

# NB: be careful with how you can apply the rotations to the positions array

new_positions = positions.copy()
new_alpha = alpha.copy()

# Apply the rotation here. Be careful with the shape and layout of the arrays

return new_positions, new_alpha
"""
        )

In [None]:
def update_03(code_exercise):
    output = code_exercise.cue_outputs[0]
    output.clear_output()
    rots = []
    f = frames_alphamu[0]
    for r in np.pi*np.array([[0,0,0],[0,0.125,0],[0,0.250,0],[0,0.375,0],[0,0.5,0],
                             [0.125,0.5,0],[0.25,0.5,0],[0.375,0.5,0],[0.5,0.5,0],
                             [0.5,0.5,0.125],[0.5,0.5,0.25],[0.5,0.5,0.375],[0.5,0.5,0.5]]):
        nf = deepcopy(f)
        pol = nf.info["ccsd_pol"]
        pol = np.array([[pol[0], pol[3], pol[4]],[pol[3], pol[1], pol[5]],[pol[4], pol[5], pol[2]]])
        nf.positions, pol = ex03_wci.get_function_object()(
            nf.positions, pol, rotation_matrix(*r) )
        nf.info["ccsd_pol"][:] = [ pol[0,0], pol[1,1], pol[2,2], pol[0,1], pol[0,2], pol[1,2]] 
        rots.append(nf)
    with output:
        alphas_show = chemiscope.ase_tensors_to_ellipsoids(rots, "ccsd_pol", scale=0.2)
        alphas_show["parameters"]["global"]["color"]="0xff0080"
        cs=chemiscope.show(rots, shapes={
                    "alpha": alphas_show,
                },
                mode="structure",
               settings=chemiscope.quick_settings(structure_settings={"shape":"alpha", 
                                                                      "keepOrientation":True})
                          )
        cs.save("module_02-alpha_rotations.chemiscope.json.gz")
        display(cs)

ex03_reference_input = [{'positions':np.array([[0.,0,1],[1,2,0],[3,2,-1]]), 
                         'alpha':np.array([[5.,1,1],[1,3,0],[1,0,4]]),
                         'rotm': np.eye(3)},
                       {'positions':np.array([[0.,0,1],[1,2,0],[3,2,-1]]), 
                         'alpha':np.array([[8.,-1,1],[-1,9,0],[1,0,4]]),
                         'rotm': rotation_matrix(0,np.pi/2,0)}]
ex03_reference_output = [(np.array([[ 0.,  0.,  1.],
         [ 1.,  2.,  0.],
         [ 3.,  2., -1.]]),
  np.array([[5., 1., 1.],
         [1., 3., 0.],
         [1., 0., 4.]])),
 (np.array([[ 1.00000000e+00,  0.00000000e+00,  2.22044605e-16],
         [ 2.22044605e-16,  2.00000000e+00, -1.00000000e+00],
         [-1.00000000e+00,  2.00000000e+00, -3.00000000e+00]]),
  np.array([[ 4.00000000e+00, -2.22044605e-16, -1.00000000e+00],
         [-2.22044605e-16,  9.00000000e+00,  1.00000000e+00],
         [-1.00000000e+00,  1.00000000e+00,  8.00000000e+00]]))]

ex03_code_demo = CodeExercise(
    code= ex03_wci,
    check_registry=check_registry,
    cue_outputs = [CueObject()],
    update_func = update_03,
    exercise_key="03",
    exercise_registry=exercise_registry,
    exercise_title="Exercise 03: Tensor rotations",
    exercise_description=mdwn("""
Implement a function that gets positions and polarizability of a molecule and rotates 
them according to the provided rotation matrix.
""")
)

check_registry.add_check(ex03_code_demo,
    asserts= [
        assert_type,
        assert_shape,
        assert_numpy_allclose,
    ],
     inputs_parameters=ex03_reference_input,
     outputs_references =ex03_reference_output)

In [None]:
display(ex03_code_demo)

[Download chemiscope datafile](./module_02-alpha_rotations.chemiscope.json.gz)

## Rotating tensors

The action of rotations on a Cartesian tensorial quantity can always be formulated as a matrix-vector multiplication, by "unrolling" the tensor and combining multiple rotation matrices together, e.g.

$$
\alpha_{(ab)}(\hat{R}A) = \sum_{(a'b')} R_{(ab)(a'b')} \alpha_{(a'b')}(A)
$$

where $R_{(ab)(a'b')}=R_{aa'}R_{bb'}$.  
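
As a sketch, the combined $9\times 9$ matrix is the Kronecker product of the rotation matrix with itself, provided the tensor is flattened in row-major order so that the compound index is $(ab) \rightarrow 3a+b$. The tensor values below are illustrative.

```python
import numpy as np

angle = 0.8
c, s = np.cos(angle), np.sin(angle)
R = np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])  # rotation about x

alpha = np.array([[5.0, 1.0, 1.0], [1.0, 3.0, 0.0], [1.0, 0.0, 4.0]])

# big_R[(ab), (a'b')] = R[a, a'] * R[b, b']
big_R = np.kron(R, R)

# One 9x9 matrix-vector product on the unrolled tensor ...
unrolled = big_R @ alpha.reshape(9)

# ... gives the same numbers as the two-sided matrix product
two_sided = (R @ alpha @ R.T).reshape(9)
same_result = np.allclose(unrolled, two_sided)

# The same matrix can be written with einsum and a reshape
same_matrix = np.allclose(big_R, np.einsum("ab,cd->acbd", R, R).reshape(9, 9))

print(same_result, same_matrix)
```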

In the visualization below you can see how the elements of the combined rotation matrix change with the Euler angles. 

In [None]:
ex04_pb =  ParameterPanel(
    alpha = FloatSlider(value=0,min=-np.pi,max=np.pi,step=0.01,description=r'$\alpha$'),
    beta = FloatSlider(value=0,min=0,max=np.pi,step=0.01,description=r'$\beta$'),
    gamma = FloatSlider(value=0,min=-np.pi,max=np.pi,step=0.01,description=r'$\gamma$'))

ex04_fig = plt.figure(tight_layout=True)
ax04 = ex04_fig.add_subplot(111)
ex04_cuefig = CueFigure(ex04_fig) 

ex04_cbar = None
def update_04(code_exercise):
    global ex04_cbar
    alpha, beta, gamma = code_exercise.parameters.values()
    cue_figure = code_exercise.cue_outputs[0]
    ax = cue_figure.figure.get_axes()[0]
    rot = rotation_matrix(alpha,beta,gamma)
    ROT = np.einsum("ab,cd->acbd",rot, rot).reshape(9,9)
    
    fig = code_exercise.cue_outputs[0].figure
    ax = fig.get_axes()[0]

    cax=ax.matshow(ROT, cmap='seismic', vmin=-1, vmax=1)
    if ex04_cbar is None:
        ex04_cbar = fig.colorbar(cax, ax=ax, orientation='vertical' )
    else:
        ex04_cbar.update_normal(cax)

    ax.set_xlabel("(a'b')")
    ax.set_ylabel("(ab)")    

In [None]:
ex04_code_demo = CodeExercise(
            parameters= ex04_pb,            
            cue_outputs = [ex04_cuefig],
            update_func = update_04,
    update_mode="continuous",
    #exercise_key="04",
    #exercise_registry=exercise_registry,
    exercise_title="Exercise 04",
    exercise_description=mdwn("")
)

display(ex04_code_demo)
ex04_code_demo.run_update()

In [None]:
ex04_txt = TextExercise(
    exercise_description="""
Play around with the parameters. Does the matrix have any obvious structure? 
How many multiplications would you have to perform to rotate the polarizability 
using the tensorial (two rotations) notation? And how about the combined (single matrix)
case?""",
    exercise_registry=exercise_registry,
    exercise_key="04",
    exercise_title=""
)
display(ex04_txt)

In [None]:
ex04b_txt = TextExercise(
    exercise_description=mdwn(r"""
Given we have seen there are different ways to transform a tensor such as the 
polarizability, we can wonder if there are _better_ ways to apply a rotation to a tensor.
For example, consider the _trace_ of the polarizability, $\alpha_{xx}+\alpha_{yy}+\alpha_{zz}$.
How does it transform under rotations?"""),
    exercise_registry=exercise_registry,
    exercise_key="04b",
    exercise_title="Polarizability trace"
)
display(ex04b_txt)

## Rotations and spherical harmonics

[Spherical harmonics](https://en.wikipedia.org/wiki/Spherical_harmonics) are special functions of the polar angles $(\theta, \phi)$ that can be obtained as the orthogonal solutions of the Laplacian eigenvalue problem on the sphere $\nabla^2 Y^m_l(\theta, \phi) = \epsilon_l Y^m_l(\theta, \phi)$. The spherical harmonics are indexed by two integers, $l\ge 0$ and $-l\le m \le l$. 
Much as for rotations, there are many different conventions used when defining spherical harmonics. We use real-valued spherical harmonics, which are the same used in the construction of the density expansion features. 
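
To make the definition concrete, here is a sketch that writes out the real spherical harmonics for $l=0$ and $l=1$ by hand and checks their orthonormality on the sphere by numerical quadrature. The ordering $m=-1,0,1 \leftrightarrow (y, z, x)$ and the normalization are one common convention for real spherical harmonics and are assumed here; `real_sph_l01` is an illustrative helper, not a `sphericart` function.

```python
import numpy as np

def real_sph_l01(xyz):
    """Real spherical harmonics for l=0 and l=1 at unit vectors xyz, shape (n, 3).

    Columns are Y_0^0, Y_1^{-1}, Y_1^0, Y_1^1, proportional to 1, y, z, x.
    """
    x, y, z = xyz.T
    c0 = np.sqrt(1.0 / (4 * np.pi))
    c1 = np.sqrt(3.0 / (4 * np.pi))
    return np.stack([c0 * np.ones_like(x), c1 * y, c1 * z, c1 * x], axis=1)

# Quadrature grid: Gauss-Legendre nodes in cos(theta), uniform points in phi
n_theta, n_phi = 8, 16
cos_t, w_t = np.polynomial.legendre.leggauss(n_theta)
phi = 2 * np.pi * np.arange(n_phi) / n_phi
ct, ph = np.meshgrid(cos_t, phi, indexing="ij")
st = np.sqrt(1.0 - ct**2)

xyz = np.stack([st * np.cos(ph), st * np.sin(ph), ct], axis=-1).reshape(-1, 3)
weights = np.outer(w_t, np.full(n_phi, 2 * np.pi / n_phi)).reshape(-1)

ylm = real_sph_l01(xyz)

# overlap[i, j] approximates the integral of Y_i * Y_j over the sphere
overlap = ylm.T @ (weights[:, None] * ylm)
print(np.round(overlap, 6))
```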

In [None]:
# Define the resolution of the sphere
num_points = 40

# Create the angles
theta = np.linspace(0, np.pi, num_points)
phi =   np.linspace(0, 2*np.pi, 2*num_points)
theta, phi = np.meshgrid(theta, phi)

# Convert to Cartesian coordinates
x = np.sin(theta) * np.cos(phi)
y = np.sin(theta) * np.sin(phi)
z = np.cos(theta)

ex05_xyz =  np.array([x,y,z])

# Evaluate the spherical harmonics up to l_max on the grid points;
# the resulting array has shape (n_lm, n_phi, n_theta)
ex05_sph = sph.compute(ex05_xyz.T.reshape(-1,3)).reshape(theta.shape[1], theta.shape[0], -1).T

In [None]:
ex05_mslider = IntSlider(value=0,min=0,max=8,step=1,description=r'$m$')
ex05_pb =  ParameterPanel(
    l = IntSlider(value=1,min=0,max=8,step=1,description=r'$l$'),
    m = ex05_mslider
)

ex05_fig = plt.figure(tight_layout=True)
ax05 = ex05_fig.add_subplot(111, projection='3d')
ex05_cuefig = CueFigure(ex05_fig) 

def update_05(code_exercise):    
    l, m = code_exercise.parameters.values()
    # updates the range of the m slider
    ex05_mslider.min = -l 
    ex05_mslider.max = l
    if m>l: 
        m=l
    if m<-l:
        m=-l
    cue_figure = code_exercise.cue_outputs[0]
    ax = cue_figure.figure.get_axes()[0]
    ax.set_xlim([-1.0,1.0])
    ax.set_ylim([-1.0,1.0])
    ax.set_zlim([-1.0,1.0])
    
    color_map = lambda x:  mpl.colormaps['seismic']((x-x.min())/(1e-15+x.max()-x.min()))
    x,y,z = ex05_xyz
    lm=l*l+l+m
    # Plot the sphere with colors
    ax.plot_surface(x, y, z, rstride=1, antialiased=True,
                    cstride=1, shade=True, facecolors=color_map(ex05_sph[lm]) )
    ax.set_axis_off()
    ax.set_aspect('auto')
    cue_figure.figure.subplots_adjust(left=0.0, right=1, top=1, bottom=0.0)
    
ce05 = CodeExercise(
            parameters=ex05_pb,
            cue_outputs = [ex05_cuefig],
            update_func = update_05,
            update_mode="continuous")

Use the viewer below to visualize different spherical harmonics. Spherical harmonics are connected to the solution of the Schrödinger equation in a central potential, and more broadly to the definition of angular momentum in quantum mechanics. 

In [None]:
display(ce05)
ce05.run_update()

Spherical harmonics are also special because of their connection with the properties of the rotation group. In fact, it is possible to define special matrix-valued functions (the [Wigner D matrices](https://en.wikipedia.org/wiki/Wigner_D-matrix#Relation_to_spherical_harmonics_and_Legendre_polynomials) $D^l_{mm'}(\hat{R})$) that describe how spherical harmonics transform under rotations. 

Essentially, all the spherical harmonics with the same $l$ should be treated as a vector, and then 

$$
Y^m_l(\hat{R}\theta, \hat{R}\phi) = \sum_{m'}D^l_{mm'}(\hat{R}) Y^{m'}_l(\theta,\phi)
$$
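
For $l=1$ this relation can be verified without any special library, because the real $l=1$ spherical harmonics are proportional to the Cartesian components of the unit vector. The sketch below assumes the ordering $m=-1,0,1 \leftrightarrow (y,z,x)$, in which case the $l=1$ matrix is just the Cartesian rotation matrix with rows and columns permuted. The names `real_sph_l1`, `P` and `D1` are illustrative, and the index convention of the `wigner_d_real` function used in this notebook may differ (for example by a transpose).

```python
import numpy as np

def real_sph_l1(xyz):
    """l=1 real spherical harmonics at unit vectors, ordered m=-1,0,1 ~ (y, z, x)."""
    return np.sqrt(3.0 / (4 * np.pi)) * xyz[:, [1, 2, 0]]

# An arbitrary rotation: about y by 1.0 rad, followed by about z by 0.4 rad
cz, sz = np.cos(0.4), np.sin(0.4)
cy, sy = np.cos(1.0), np.sin(1.0)
Rz = np.array([[cz, -sz, 0.0], [sz, cz, 0.0], [0.0, 0.0, 1.0]])
Ry = np.array([[cy, 0.0, sy], [0.0, 1.0, 0.0], [-sy, 0.0, cy]])
R = Rz @ Ry

# Permutation that maps (x, y, z) to (y, z, x)
P = np.array([[0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 0.0, 0.0]])
D1 = P @ R @ P.T  # l=1 rotation matrix in the (y, z, x) ordering

# A few random points on the unit sphere, stored as rows
rng = np.random.default_rng(1)
points = rng.normal(size=(5, 3))
points /= np.linalg.norm(points, axis=1, keepdims=True)

# Route 1: rotate the points, then evaluate the harmonics
ylm_of_rotated_points = real_sph_l1(points @ R.T)

# Route 2: evaluate at the original points, then mix the m components
rotated_ylm = real_sph_l1(points) @ D1.T

print(np.allclose(ylm_of_rotated_points, rotated_ylm))
```

Because each point is stored as a row, applying a matrix to every point amounts to multiplying by its transpose from the right.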

We are going to see how this works in practice by writing a function that takes a list of 3-vectors (points on the surface of a sphere) and uses a utility function (a wrapper around the `sphericart` library) to compute the spherical harmonics at the initial and rotated points. The same function also applies the rotation directly to the spherical harmonics, using the Wigner matrix for the rotation.

In [None]:
def spherical_harmonics(l, r):
    return sph.compute(r)[:,l**2:(l+1)**2]
ex06_wci = CodeInput(
        function_name="rotate_ylm", 
        function_parameters="xyz, rotm, wigd, spherical_harmonics",
        docstring="""computes rotated spherical harmonics in two different ways, 
        first by rotating the input positions, and then by rotating the spherical 
        harmonics computed at the initial positions. 
        
        :param xyz: a (n,3) array containing the positions at which to 
             compute the spherical harmonics
        :param rotm: a (3,3) array containing the rotation matrix
        :param wigd: a (2l+1,2l+1) array containing the wigner D matrix
        :param spherical_harmonics: a function of (l, xyz) that computes the spherical
             harmonics, returning a (n, 2l+1) array
        :returns: (rylm1, rylm2) - a tuple containing the spherical harmonics computed
             using Ylm(R xyz) and D^l Ylm(xyz)
""",
        function_body="""

l = (wigd.shape[0]-1)//2   # guesses the order of Ylm from the shape of Dl
ylm = spherical_harmonics(l, xyz)

rxyz = xyz   # <--- apply rotation here
rylm_1 = spherical_harmonics(l, rxyz)

rylm_2 = ylm  # <--- apply rotation here

return rylm_1, rylm_2
"""
)

In [None]:
ex06_mslider = IntSlider(value=0,min=0,max=8,step=1,description=r'$m$')
ex06_pb =  ParameterPanel(
    l = IntSlider(value=1,min=0,max=8,step=1,description=r'$l$'),
    m = ex06_mslider,
    alpha = FloatSlider(value=0,min=-np.pi,max=np.pi,step=0.01,description=r'$\alpha$'),
    beta = FloatSlider(value=0,min=0,max=np.pi,step=0.01,description=r'$\beta$'),
    gamma = FloatSlider(value=0,min=-np.pi,max=np.pi,step=0.01,description=r'$\gamma$'),
)

ex06_fig = plt.figure(tight_layout=True)
ax061 = ex06_fig.add_subplot(121, projection='3d')
ax062 = ex06_fig.add_subplot(122, projection='3d')
ex06_cuefig = CueFigure(ex06_fig) 
for ax in [ax061, ax062]:
    ax.mouse_init(rotate_btn=None, pan_btn=None, zoom_btn=None)
    ax.set_xlim([-1.0,1.0])
    ax.set_ylim([-1.0,1.0])
    ax.set_zlim([-1.0,1.0])
    ax.set_axis_off()

ex06_xyz = ex05_xyz

def update_06(code_exercise):    
    l, m, alpha, beta, gamma = code_exercise.parameters.values()
    # updates the range of the m slider
    ex06_mslider.min = -l 
    ex06_mslider.max = l
    if m>l: 
        m=l
    if m<-l:
        m=-l
    cue_figure = code_exercise.cue_outputs[0]
    ax1, ax2 = cue_figure.figure.get_axes()[:2]
        
    color_map = lambda x:  mpl.colormaps['seismic']((x-x.min())/(1e-15+x.max()-x.min()))
    shape = ex06_xyz.shape 
    x, y, z = ex06_xyz
    xyz = ex06_xyz.reshape(3, -1).T
    rotm = rotation_matrix(alpha, beta, gamma)
    wigd = wigner_d_real(l, alpha, beta, gamma)
    
    rlm1, rlm2 = ex06_wci.get_function_object()(xyz, rotm, wigd, spherical_harmonics)
    
    ax1.plot_surface(x, y, z, rstride=1, antialiased=True,
                    cstride=1, shade=True, facecolors=color_map(rlm1[:,l+m].reshape(shape[1:]) ) )
    ax2.plot_surface(x, y, z, rstride=1, antialiased=True,
                    cstride=1, shade=True, facecolors=color_map(rlm2[:,l+m].reshape(shape[1:]) ) )
    for ax in [ax1, ax2]:
        ax.set_xlim([-1.0,1.0])
        ax.set_ylim([-1.0,1.0])
        ax.set_zlim([-1.0,1.0])
        ax.set_axis_off()
        
ce06 = CodeExercise(
            code=ex06_wci,
            parameters=ex06_pb,
            cue_outputs = [ex06_cuefig],
            update_func = update_06,
            update_mode="manual",
    exercise_key="06",
    exercise_registry=exercise_registry,
    exercise_title="Exercise 06: Two ways of rotating spherical harmonics",
    exercise_description=mdwn(r"""Write a function that computes spherical harmonics
    in a rotated reference frame, by rotating the position at which the $Y^m_l$ are 
    evaluated, and by rotating the set of spherical harmonics computed at the initial 
    location. """)
)

In [None]:
display(ce06)

## Irreducible representations

Wigner $D$ matrices have a very important property that makes them central in the theory and practice of $SO(3)$: given a set of objects that transform under rotations (e.g. the entries in the polarizability tensor), it is possible to determine linear combinations of those entries that transform under rotations through the application of a Wigner matrix. 

What is more, these special linear combinations are the _smallest_ possible sets of quantities that are mixed by a rotation, and it is not possible to simplify them any further. We have already encountered an example of such a linear combination: the trace of the polarizability is (proportional to) the linear combination that is constant (i.e. that is transformed by $D^0_{00}=1$). More generally, any product of Cartesian components can be transformed into blocks that transform as spherical harmonics.

Performing these transformations is rather tedious (as they depend on the convention used to define the Wigner matrices) so you'll have to trust the implementation provided with this exercise. `cart2sph` takes a Cartesian tensor and transforms it into a 9-vector in which the first element transforms as an $l=0$ spherical harmonic, the elements `[1:4]` as $l=1$, and `[4:9]` as $l=2$. 

```python
print(alpha)
print(cart2sph(alpha))
```

In [None]:
alpha = six2cart(frames_alphamu[0].info["ccsd_pol"])
print(alpha)
print(cart2sph(alpha))

Note that the $l=1$ terms are zero; this is because the polarizability is a _symmetric_ tensor, and has only six independent entries: the antisymmetric part of the tensor would be transformed into an $l=1$ block.
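
The same splitting can be illustrated directly in Cartesian form, without fixing a convention for the spherical components. The sketch below is not the `cart2sph` implementation: it separates a $3\times 3$ tensor into an isotropic part (1 independent entry, $l=0$), an antisymmetric part (3 entries, $l=1$) and a symmetric traceless part (5 entries, $l=2$), and checks that a rotation never mixes the three. The helper `decompose` and the random test tensor are illustrative.

```python
import numpy as np

def decompose(t):
    """Split a 3x3 tensor into isotropic, antisymmetric and symmetric-traceless parts."""
    iso = np.trace(t) / 3.0 * np.eye(3)
    antisym = 0.5 * (t - t.T)
    sym_traceless = 0.5 * (t + t.T) - iso
    return iso, antisym, sym_traceless

angle = 0.9
c, s = np.cos(angle), np.sin(angle)
R = np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])  # rotation about y

rng = np.random.default_rng(2)
t = rng.normal(size=(3, 3))  # a generic, non-symmetric tensor

parts = decompose(t)
parts_of_rotated = decompose(R @ t @ R.T)

# Each part of the rotated tensor is the rotation of the matching original part
blocks_do_not_mix = all(
    np.allclose(pr, R @ p @ R.T) for p, pr in zip(parts, parts_of_rotated)
)

# The l=0 part is unchanged by the rotation
trace_invariant = np.isclose(np.trace(R @ t @ R.T), np.trace(t))

# For a symmetric tensor the antisymmetric (l=1) part vanishes
alpha = np.array([[5.0, 1.0, 1.0], [1.0, 3.0, 0.0], [1.0, 0.0, 4.0]])
antisym_vanishes = np.allclose(decompose(alpha)[1], 0.0)

print(blocks_do_not_mix, trace_invariant, antisym_vanishes)
```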

Rotating a vector in this form can be achieved by a _block diagonal_ matrix with Wigner $D$ matrices on the diagonal. Obviously the whole point of expressing a tensor in its irreducible form is to manipulate separately the different blocks, but the following visualization displays the block-diagonal form, to be compared with the $9\times 9 $ transformation in Exercise 04. 

In [None]:
ex07_pb =  ParameterPanel(
    alpha = FloatSlider(value=0,min=-np.pi,max=np.pi,step=0.01,description=r'$\alpha$'),
    beta = FloatSlider(value=0,min=0,max=np.pi,step=0.01,description=r'$\beta$'),
    gamma = FloatSlider(value=0,min=-np.pi,max=np.pi,step=0.01,description=r'$\gamma$'))

ex07_fig = plt.figure(tight_layout=True)
ax07 = ex07_fig.add_subplot(111)
ex07_cuefig = CueFigure(ex07_fig) 

ex07_cbar = None
def update_07(code_exercise):
    global ex07_cbar
    alpha, beta, gamma = code_exercise.parameters.values()
    cue_figure = code_exercise.cue_outputs[0]
    ax = cue_figure.figure.get_axes()[0]

    ROT = scipy.linalg.block_diag(
        wigner_d_real(0, alpha, beta, gamma), 
        wigner_d_real(1, alpha, beta, gamma), 
        wigner_d_real(2, alpha, beta, gamma)
    )
    
    fig = code_exercise.cue_outputs[0].figure
    ax = fig.get_axes()[0]

    cax=ax.matshow(ROT, cmap='seismic', vmin=-1, vmax=1)
    if ex07_cbar is None:
        ex07_cbar = fig.colorbar(cax, ax=ax, orientation='vertical' )
    else:
        ex07_cbar.update_normal(cax)

    ax.set_xlabel("(lm)")
    ax.set_ylabel("(lm)")    

In [None]:
ex07_code_demo = CodeExercise(
            parameters= ex07_pb,            
            cue_outputs = [ex07_cuefig],
            update_func = update_07,
    update_mode="continuous",
    #exercise_key="04",
    #exercise_registry=exercise_registry,
    exercise_title="Exercise 07",
    exercise_description=mdwn("")
)

display(ex07_code_demo)
ex07_code_demo.run_update()

In [None]:
ex07_txt = TextExercise(
    exercise_description=mdwn(r"""
How many multiplications would you have to perform to apply the rotation exploiting 
the block structure? And how about if you also exploited knowledge that the $l=1$ 
block is zero?"""),
    exercise_registry=exercise_registry,
    exercise_key="07",
    exercise_title=""
)
display(ex07_txt)

# Equivariant features and equivariant regression

_Equivariance_ indicates the property of a function for which the inputs and outputs are subject to the action of the same symmetries, and which commutes with the application of the symmetries, that is: $f(\hat{S}A) = \hat{S} f(A)$. _Invariance_ can be seen as a special case, in which  $f(\hat{S}A) = f(A)$.
This module focuses in particular on the case of 3D rotations and inversion - in technical terms the $O(3)$ group symmetries, and their combination with translations - the three-dimensional Euclidean group $E(3)$.

Equivariant (or _symmetry-adapted_) regression refers to the construction of models that naturally obey these transformation rules. In the special case of _scalar_ targets, all that is required is to build invariant features and combine them with scalar functions, while in the general case there are further restrictions, as we shall see later.  
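
The two definitions can be tested numerically on toy functions. In the sketch below, `toy_dipole` (a point-charge dipole) and `toy_energy` (a sum of inverse pair distances) are hypothetical stand-ins for a vectorial and a scalar property; the geometry and charges are made up and are not taken from the dataset used in this module.

```python
import numpy as np

def toy_dipole(positions, charges):
    """Point-charge dipole, sum_i q_i r_i: a vector-valued function of the structure."""
    return charges @ positions

def toy_energy(positions):
    """Sum of inverse pair distances: a scalar function of the structure."""
    diff = positions[:, None, :] - positions[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    upper = np.triu_indices(len(positions), k=1)
    return np.sum(1.0 / dist[upper])

# Water-like geometry (one atom per row) with hypothetical charges
positions = np.array([[0.0, 0.0, 0.0], [0.76, 0.59, 0.0], [-0.76, 0.59, 0.0]])
charges = np.array([-0.8, 0.4, 0.4])

angle = 1.1
c, s = np.cos(angle), np.sin(angle)
R = np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])  # rotation about x

rotated = positions @ R.T  # apply R to every row

# Equivariance: f(R A) = R f(A)
equivariant = np.allclose(toy_dipole(rotated, charges), R @ toy_dipole(positions, charges))

# Invariance: f(R A) = f(A)
invariant = np.isclose(toy_energy(rotated), toy_energy(positions))

print(equivariant, invariant)
```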

We are going to use as an example a collection of configurations for a single water molecule. The configurations are generated by distorting an equilibrium configuration along the bending mode and the asymmetric stretching coordinate. Each frame also contains the energy and dipole moment, computed with the Partridge-Schwenke monomer potential ([Partridge, Schwenke, J. Chem. Phys. (1997)](http://doi.org/10.1063/1.473987)).  

In [None]:
h2o_frames = ase.io.read("data/water_energy-dipole.xyz", ":")

h2o_energy = np.zeros(len(h2o_frames))
h2o_dipole = np.zeros((len(h2o_frames),3))
h2o_force = np.zeros((len(h2o_frames),3,3))
for fi, f in enumerate(h2o_frames):
    h2o_energy[fi] = f.info['energy']
    h2o_dipole[fi] = f.info['dipole']
    h2o_force[fi] = f.arrays['force']

In [None]:
dipole_arrows = chemiscope.ase_vectors_to_arrows(h2o_frames, "dipole", scale=4, head_length_scale=3);
dipole_arrows["parameters"]["global"].update({ "color": 0x60A0FF })

cs = chemiscope.show(h2o_frames, properties = chemiscope.extract_properties(h2o_frames),
        shapes={ "dipole" : dipole_arrows},
        settings={
            'map' : { 'x':{'property' : "HOH"},  'y':{'property' : "OH1"}, 'color' : {'property' : 'energy'} },
            'structure': [{'axes': 'off','keepOrientation': True, 'shape': ['dipole']}]
        })

display(cs)

## Equivariance of the density coefficients

Because they are computed as an expansion in spherical harmonics, the density expansion coefficients are rotationally equivariant: each block with a given $l$ transforms with the corresponding Wigner matrix,

$$
\langle a nlm | \hat{R}A_i\rangle = 
\sum_{j\in A_i} Y^m_l(\mathbf{R}\hat{\mathbf{r}}_{ij}) \tilde{R}_{nl}(r_{ij}) =
$$
$$=\sum_{j\in A_i} \sum_{m'} D^l_{mm'}(\hat{R})Y^{m'}_l(\hat{\mathbf{r}}_{ij}) \tilde{R}_{nl}(r_{ij}) =
\sum_{m'} D^l_{mm'}(\hat{R})
\langle a nlm' | A_i\rangle
$$
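
The identity above can be checked with plain `numpy` in the simplest non-trivial case, $l=1$. The sketch below assumes real spherical harmonics ordered as $m=-1,0,1$ (proportional to $y, z, x$), for which the Wigner matrix is the Cartesian rotation matrix with rows and columns permuted accordingly. The Gaussian radial functions and the helper names are hypothetical stand-ins, not the basis used by `rascaline`, and the conventions of `wigner_d_real` in `rotutils` may differ.

```python
import numpy as np

def real_sph_l1(unit_vectors):
    """Real l=1 spherical harmonics, ordered m=-1,0,1, proportional to (y, z, x)."""
    prefactor = np.sqrt(3.0 / (4.0 * np.pi))
    return prefactor * unit_vectors[:, [1, 2, 0]]

def radial_basis(r, nmax, sigma=0.3):
    """Hypothetical radial functions: Gaussians centered on a grid of distances."""
    centers = np.linspace(0.5, 2.0, nmax)
    return np.exp(-0.5 * ((r[:, None] - centers[None, :]) / sigma) ** 2)

def l1_coefficients(neighbor_vectors, nmax):
    """l=1 coefficients of one environment, stored as an array of shape (3, nmax)."""
    r = np.linalg.norm(neighbor_vectors, axis=1)
    ylm = real_sph_l1(neighbor_vectors / r[:, None])  # (n_neighbors, 3)
    rnl = radial_basis(r, nmax)                       # (n_neighbors, nmax)
    return ylm.T @ rnl                                # sum over the neighbors

def rotation_z_then_x(a, b):
    """A proper rotation: by angle a around z, followed by angle b around x."""
    rz = np.array([[np.cos(a), -np.sin(a), 0], [np.sin(a), np.cos(a), 0], [0, 0, 1]])
    rx = np.array([[1, 0, 0], [0, np.cos(b), -np.sin(b)], [0, np.sin(b), np.cos(b)]])
    return rx @ rz

nmax = 4
neighbors = np.array([[0.76, 0.59, 0.0], [-0.76, 0.59, 0.0]])  # rough O->H vectors
rot = rotation_z_then_x(0.7, -1.2)

# real l=1 Wigner matrix: the rotation matrix re-indexed in the (y, z, x) order
perm = [1, 2, 0]
wigner_d1 = rot[np.ix_(perm, perm)]

# route (a): rotate the structure, then compute the coefficients
coeffs_rotated_structure = l1_coefficients(neighbors @ rot.T, nmax)
# route (b): compute the coefficients, then mix the m components with D^1
coeffs_rotated_by_wigner = wigner_d1 @ l1_coefficients(neighbors, nmax)

print(np.allclose(coeffs_rotated_structure, coeffs_rotated_by_wigner))
```

The radial part depends only on the distances $r_{ij}$, which a rotation does not change, so the Wigner matrix acts on the $m$ index alone and is the same for every radial channel $n$.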

In [None]:
ex08_wci = CodeInput(
        function_name="rotate_rhoi", 
        function_parameters="frame, rotm, wigd, compute_rhoi, nmax, lmax",
        docstring="""
        compute discretized density coefficients for a structure, and 
        for its rotated version. then also rotate the density coefficients
        by applying Wigner matrices. 
    
        
        :param frame: ase.Atoms frame to compute
        :param rotm: rotation matrix to be applied
        :param wigd: a list of Wigner matrices such that wigd[l] contains D^l
        :param compute_rhoi: a wrapper to `rascaline` that computes O-centered 
            H density coefficients - call as 
            compute_rhoi(frame, nmax, lmax)
            returns a TensorMap containing the expansion coefficients
        :param nmax: number of radial functions
        :param lmax: maximum angular momentum        
        
        :returns: (rhoi_a, rhoi_b) two TensorMaps containing the density coefficients
              for the rotated structure, computed in two ways
""",
        function_body="""

from copy import deepcopy
import numpy as np

# apply rotations to the positions of rot_frame
rot_frame = deepcopy(frame)
# compute density coefficients of the rotated frame
rhoi_a = compute_rhoi(rot_frame, nmax, lmax)


# computes density coefficients for the input frame
rhoi_b = compute_rhoi(frame, nmax, lmax)

for key, block in rhoi_b.items():
    l = key["o3_lambda"]
    dl = wigd[l]
    # this has dimensions nsamples, 2l+1, nmax
    block.values[:] = block.values # <-- apply the rotation 

return rhoi_a, rhoi_b
"""
)

In [None]:
def compute_rhoi_default(frame, nmax, lmax):
    hypers = {
        "cutoff": 2.0,
        "max_radial": nmax,
        "max_angular": lmax,
        "atomic_gaussian_width": 0.3,
        "cutoff_function": {"ShiftedCosine": {"width": 0.5}}, # type of cutoff and parameters
        "center_atom_weight": 1.0, # weight to include the central atom in the expansion
        "radial_basis": { "Gto": {}, }, # choice of radial basis
    }
    
    calculator = rascaline.SphericalExpansion(**hypers)
    
    rhoi = calculator.compute(frame,
                selected_keys=Labels(names=["o3_lambda",  "o3_sigma", "center_type", "neighbor_type"],
                                     values=np.array([[l,1,8,1] for l in range(0,lmax+1)]) )
                             )
    return rhoi
    
ex08_pb =  ParameterPanel(
    frame = IntSlider(value=0,min=0,max=len(h2o_frames)-1,description=r'frame'),
    nmax = IntSlider(value=4,min=1,max=8,description=r'$n_{max}$'),
    lmax = IntSlider(value=2,min=1,max=6,description=r'$l_{max}$'),
    alpha = FloatSlider(value=0,min=-np.pi,max=np.pi,step=0.01,description=r'$\alpha$'),
    beta = FloatSlider(value=0,min=0,max=np.pi,step=0.01,description=r'$\beta$'),
    gamma = FloatSlider(value=0,min=-np.pi,max=np.pi,step=0.01,description=r'$\gamma$'),
    )

def combine_l(tmap, i_env):
    feats=[]
    for b in tmap:
        feats.append(b.values[i_env])
    return np.vstack(feats)

from mpl_toolkits.axes_grid1 import make_axes_locatable
ex08_cbar = None
def update_08(code_exercise):
    global ex08_cbar
    iframe, nmax, lmax, alpha, beta, gamma = code_exercise.parameters.values()
    fig = code_exercise.cue_outputs[0].figure
    axa, axb = fig.get_axes()[:2]
    
    rotm = rotation_matrix(alpha, beta, gamma)
    dmats = [ wigner_d_real(l, alpha, beta, gamma) for l in range(lmax+1) ]
    rhoi_a, rhoi_b = ex08_wci.get_function_object()(h2o_frames[iframe],
                                                    rotm, dmats, 
                                                    compute_rhoi_default, 
                                                    nmax, lmax)
    feats_a = combine_l(rhoi_a, 0)
    feats_b = combine_l(rhoi_b, 0)
    frange = np.max(np.abs(feats_a))
    norm = mpl.colors.SymLogNorm(vmin=-frange, vmax=frange, linthresh=1e-1)
    
    ima=axa.matshow(feats_a.T, cmap='seismic', norm=norm)
    axb.matshow(feats_b.T, cmap='seismic', norm=norm)
    if ex08_cbar is None:
        ex08_cbar = fig.colorbar(ima, ax=[axa,axb],
                                 orientation='vertical' )
    else:
        ex08_cbar.update_normal(ima)

    for ax in [axa, axb]:
        ax.set_ylabel("n")
        ax.set_xlabel("(l,m)")
        xticklabels = []
        xtickpos = []
        for l in range(lmax+1):
            ax.add_patch(mpl.patches.Rectangle(
                (-0.5+l**2,-0.5), 2*l+1, nmax,
                edgecolor='black', facecolor='none', linewidth=3
            ))
            xticklabels.append(f"$l={l}$")
            xtickpos.append((l)**2+l)
        ax.set_xticks(xtickpos); ax.set_xticklabels(xticklabels)
    

In [None]:

ex08_figure, ex08_ax = plt.subplots(2, 1, figsize=(6,4))
ex08_output = CueFigure(ex08_figure)

ex08_code_demo = CodeExercise(
            code= ex08_wci,
            parameters= ex08_pb,
            cue_outputs = [ex08_output],
            update_func = update_08,
    update_mode="manual",
    exercise_key="08",
    exercise_registry=exercise_registry,
    exercise_title="Exercise 08: Rotating the expansion",
    exercise_description=mdwn("""
The following function gets an `ase.Atoms` frame, a rotation matrix, a list of 
Wigner matrices for different $l$, and a function to compute the density expansion
as a `TensorMap`. Your task is to compute the expansion coefficients for both
the original frame and the rotated structure, then apply Wigner rotations to the 
coefficients of the original frame and return both. The visualizer should show you
if you succeeded in performing correctly the two types of rotation. 

_NB:_ The function already shows much of the bookkeeping needed to manipulate the blocks 
of the density coefficients, and all you have to do is to apply the rotations, 
being careful about the storage order of the coefficients. 
""")
)

display(ex08_code_demo)

## Ridge regression for the dipole moments

We use the density expansion coefficients to train a regression model for the dipole of water molecules. For simplicity, we use coefficients for the hydrogen density centered on the O atom of each molecule. If $\mathbf{X}$ is the matrix that contains the density coefficients, with one row per molecule, and $\mathbf{Y}$ is an $n\times 3$ matrix that contains the Cartesian components of the dipoles, a naive regression model corresponds to the loss
$$
\ell = \|\mathbf{X}\mathbf{w} - \mathbf{Y}\|^2 + \alpha \|\mathbf{w}\|^2,
$$
which is minimized for 
$$
\mathbf{w} = [\mathbf{X}^T\mathbf{X}+\alpha \mathbf{1} ]^{-1}\mathbf{X}^T\mathbf{Y}.
$$

Note that this is an example of multi-target regression, because each component of the dipole is learned separately, resulting in a weight vector with three components for each feature. In this model, we ignore completely the symmetry properties of $\boldsymbol{\mu}$.
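
The closed-form expression for the weights can be verified numerically. The sketch below uses synthetic random data (not the water dataset) and the objective $\|\mathbf{X}\mathbf{w} - \mathbf{Y}\|^2 + \alpha \|\mathbf{w}\|^2$, which is also the convention followed by `sklearn`'s `Ridge` when `fit_intercept=False`. It checks that the gradient vanishes at the solution and that each target column is fitted independently of the others.

```python
import numpy as np

rng = np.random.default_rng(1)
n_samples, n_features = 50, 6
X = rng.normal(size=(n_samples, n_features))  # synthetic features, one row per sample
Y = rng.normal(size=(n_samples, 3))           # synthetic three-component targets
alpha = 0.1

def ridge_loss(W):
    """Squared error summed over samples and components, plus the ridge penalty."""
    return np.sum((X @ W - Y) ** 2) + alpha * np.sum(W ** 2)

# closed-form solution; solving the linear system avoids forming the inverse
regularized_cov = X.T @ X + alpha * np.eye(n_features)
W = np.linalg.solve(regularized_cov, X.T @ Y)

# the gradient of the loss with respect to W vanishes at the minimum
gradient = 2 * X.T @ (X @ W - Y) + 2 * alpha * W

# each column of W solves an independent single-target problem
W_x = np.linalg.solve(regularized_cov, X.T @ Y[:, 0])

print(W.shape, np.abs(gradient).max(), np.allclose(W[:, 0], W_x))
```

The weights form an `(n_features, 3)` array, and nothing in the fit couples the three columns to each other: this is the sense in which the symmetry of the target is ignored.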

In [None]:
ex09_wci = CodeInput(
        function_name="ridge_regression_rhoi", 
        function_parameters="frames, compute_rhoi, nmax, lmax, ftrain, alpha",
        docstring="""takes the structures, and a function that can compute density descriptors
        as TensorMaps, and computes ridge regression for the dipole moment, that is stored
        in the `info["dipole"]` member of the ase.Atoms frames. 
        also computes a train/test split and applies ridge regularization
        
        :param frames: a list of ase.Atoms structures
        :param compute_rhoi: a wrapper to `rascaline` that computes O-centered 
            H density coefficients - call as 
            compute_rhoi(frame, nmax, lmax)
            returns a TensorMap containing the expansion coefficients
        :param nmax: number of radial functions
        :param lmax: maximum angular momentum        
        :param ftrain: the fraction of the structures list to be used for training 
        :param alpha: the ridge regularization
        
        :returns: predicted dipoles, indices of the train/test split and fitted Ridge object
            (dipoles, train, test, ridge)
""",
        function_body="""

import numpy as np
from sklearn.linear_model import Ridge

# indices of the train set. NB: you can select rows from a numpy array X
# by writing X[itrain]
itrain = np.arange(len(frames))
np.random.shuffle(itrain)
ntrain = int(len(frames)*ftrain)
itest = itrain[ntrain:]
itrain = itrain[:ntrain]

# computes rhoi features and consolidates them in a dense array
rhoi = compute_rhoi(frames, nmax, lmax)
X = (
rhoi
.components_to_properties("o3_mu") # this moves the mu to the property axis (removing the equivariant interpretation)
.keys_to_properties("o3_lambda") # this consolidates the blocks into a single one with all lambda terms concatenated
.block(0).values # picks the value array
)
# makes a list of the targets, by extracting .info["dipole"] from each structure
y = np.zeros((len(X), 3)) 

# NB: if you use a Python list, convert it to a numpy array so you can index it with itrain
# e.g. if y = [1,2,3,4] you can't do y[itrain], but you can if y = np.asarray([1,2,3,4])

# initializes the Ridge object and calls fit on the training set
ridge = None

# predicts the property for ALL structures 
y_pred = y
return y_pred, itrain, itest, ridge
"""
)

In [None]:
ex09_ridge = None
ex09_pars = (6, 2)
def update_09(code_exercise):
    global ex09_ridge, ex09_pars
    nmax, lmax, ftrain, log10alpha = code_exercise.parameters.values()
    ex09_pars = (nmax, lmax)
    print_output = code_exercise.cue_outputs[0]
    print_output.clear_output()
    y = h2o_dipole
    yp, itrain, itest, ridge = ex09_wci.get_function_object()(h2o_frames, 
                                        compute_rhoi_default, nmax, lmax, 
                                        ftrain, 10**log10alpha)
    ex09_ridge = ridge
    
    with print_output:                        
        print("MAE train: ", np.mean(np.abs((y-yp)[itrain])))
        print("MAE test: ", np.mean(np.abs((y-yp)[itest])))
    
    frames = h2o_frames
    ftype = np.asarray([ "test " ] * len(frames)); ftype[itrain] = "train"
    properties={"|mu_ref|": np.sqrt((y[:]**2).sum(axis=1)), 
                "|mu_pred|" : np.sqrt((yp[:]**2).sum(axis=1)), 
                "||mu_ref|-|mu_pred||": np.abs(np.sqrt((y[:]**2).sum(axis=1))-
                                       np.sqrt((yp[:]**2).sum(axis=1))),
                "type": ftype[:]}
    
    settings={'map': {'x': {'property': "|mu_ref|"},
  'y': { 'property': '|mu_pred|'},
  'color': {'max': 1, 'min': 0, 'property': '||mu_ref|-|mu_pred||', 'scale': 'linear'},
  'symbol': 'type',
  'palette': 'inferno',
  'size': {'factor': 40}},
 'structure': [{'bonds': True,
                'shape': 'mu_ref,mu_pred',
   }]}
    y_arrows = chemiscope.ase_vectors_to_arrows(h2o_frames, "dipole", scale=4, head_length_scale=3);
    y_arrows["parameters"]["global"].update({ "color": 0xff8000 })
    yp_arrows = {'kind': 'arrow',
 'parameters': {'global': {'baseRadius': 0.1,
   'headRadius': 0.17500000000000002,
   'headLength': 0.30000000000000004,
   'color': 0xb000b0},
  'structure': [ {"vector": (mu*4).tolist()} for mu in yp ]}}
    
    chemiscope.write_input("module_03-ridge-regression.chemiscope.json.gz", 
                           frames=frames[:],
                           shapes={"mu_ref":y_arrows, "mu_pred": yp_arrows},
                           properties=properties,
                           settings=settings
                          )
                           
    with print_output:
        display(chemiscope.show_input("module_03-ridge-regression.chemiscope.json.gz"
                  ) )
        
ex09_pb =  ParameterPanel(
    nmax = IntSlider(value=4,min=1,max=8,description=r'$n_{max}$'),
    lmax = IntSlider(value=2,min=1,max=6,description=r'$l_{max}$'),
    ftrain = FloatSlider(value=0.5,min=0.1,max=0.9,step=0.1,description=r'$f_{\mathrm{train}}$'),
    log10alpha = FloatSlider(value=0,min=-20,max=5,step=0.1,description=r'$\log_{10}\alpha$'),
    )

ex09_code_demo = CodeExercise(
            code= ex09_wci,
            #check_registry=check_registry,
            parameters=ex09_pb,
            cue_outputs = [CueObject()],
            update_func = update_09,
    exercise_key="09",
    exercise_registry=exercise_registry,
    exercise_title="Exercise 09: Ridge regression of dipoles",
    exercise_description=mdwn("""
Implement a function that fits and evaluates a ridge regression model for the dipole
moment of water molecules. The descriptors should be density expansion coefficients
(computed by a utility function) and you should use `sklearn`'s `Ridge` class to
perform the regression.

NB: The function already contains a blurb performing some of the bookkeeping operations
such as flattening the density expansion `TensorMap` and generating the train/test split. 
""")
)

In [None]:
display(ex09_code_demo)

In [None]:
ex09b_txt = TextExercise(
    exercise_description=mdwn(r"""
Experiment with the training fraction, the regularization, and the 
number of density coefficients. How good are the train/test predictions?
Comment on the different overfitting/underfitting regimes, and then
run with your best parameters, as the model will be used in the next exercise."""),
    exercise_registry=exercise_registry,
    exercise_key="09b",
    exercise_title=""
)
display(ex09b_txt)

In [None]:
def update_10(code_exercise):
    output = code_exercise.cue_outputs[0]
    output.clear_output()
    rots = []
    for f in h2o_frames[::40]:
      for r in np.pi*np.array([[0,0,0],[0,0.125,0],[0,0.250,0],[0,0.375,0],[0,0.5,0],
                             [0.125,0.5,0],[0.25,0.5,0],[0.375,0.5,0],[0.5,0.5,0],
                             [0.5,0.5,0.125],[0.5,0.5,0.25],[0.5,0.5,0.375],[0.5,0.5,0.5]]):
        nf = deepcopy(f)
        nf.positions, nf.info["dipole"] = ex02_wci.get_function_object()(
            f.positions, f.info["dipole"], rotation_matrix(*r) )

        rhoi=compute_rhoi_default(nf, *ex09_pars)
        X = (
            rhoi
            .components_to_properties("o3_mu") # this moves the mu to the property axis (removing the equivariant interpretation)
            .keys_to_properties("o3_lambda") # this consolidates the blocks into a single one with all lambda terms concatenated
            .block(0).values # picks the value array
            )
        ypred = ex09_ridge.predict(X)
        nf.info["dipole_pred"] = ypred[0]
        rots.append(nf)
    properties={"|mu_ref|": np.sqrt(np.array([f.info["dipole"]**2 for f in rots]).sum(axis=1)), 
                "|mu_pred|" : np.sqrt(np.array([f.info["dipole_pred"]**2 for f in rots]).sum(axis=1))}
    properties.update(
        {"||mu_ref|-|mu_pred||": np.abs(properties["|mu_ref|"]-properties["|mu_pred|"])
        })
    dipoles_show = chemiscope.ase_vectors_to_arrows(rots, "dipole", scale=4)
    dipoles_show["parameters"]["global"]["color"]=0xff8000
    pred_show = chemiscope.ase_vectors_to_arrows(rots, "dipole_pred", scale=4)
    pred_show["parameters"]["global"]["color"]=0xb000b0
    with output:
        cs=chemiscope.show(rots, 
            properties=properties,
            shapes={
            "mu_ref": dipoles_show,
            "mu_pred": pred_show,
                },
                mode="default",
               settings=chemiscope.quick_settings(
                   x="|mu_ref|", y="|mu_pred|", color="||mu_ref|-|mu_pred||",
                   structure_settings={"shape": ["mu_ref", "mu_pred"], 
                        "keepOrientation":True}
                          )
                          )
        cs.save("module_03-dipole_rotated_base.chemiscope.json.gz")
        display(cs)

ex10_code_demo = CodeExercise(
    check_registry=check_registry,
    cue_outputs = [CueObject()],
    update_func = update_10,
    exercise_title="Exercise 10: Molecular rotations and dipole predictions",
    exercise_description=mdwn("""
After having tuned a model you are happy with, let's see how it performs for 
different molecular configurations. This demo takes a few configurations (every 40th frame) from the
dataset, and rotates them in different ways. Observe the results: the reference value
of the dipole is represented as an orange arrow, the prediction of the model as a purple
arrow.
""")
)

In [None]:
display(ex10_code_demo)

In [None]:
ex10b_txt = TextExercise(
    exercise_description=mdwn(r"""
Compare your observations for these rotated structures with those in the validation above.
What can you conclude about the ability of this model to handle rotations?"""),
    exercise_registry=exercise_registry,
    exercise_key="10b",
    exercise_title=""
)
display(ex10b_txt)

## Symmetry adapted regression

In order to perform regression in a way that is consistent with rotational symmetry one needs to use features that are themselves equivariant, such as the density coefficients.
To learn components of a spherical tensor of order $\lambda$, we need features of the form
$\xi^{\lambda\mu}_k(A)$. This is, however, not enough: the regression problem itself has to be formulated with equivariance in mind. We have seen in the previous section how mindlessly throwing density coefficients at `Ridge` leads to terrible extrapolation to rotated structures. 

First, we need to select only the components with the desired $\lambda$ value; second, we need to use the same regression weight irrespective of the symmetry component $\mu$:

$$
y_{\lambda\mu}(A) = \sum_k w_k \xi^{\lambda\mu}_k(A)
$$

It is then easy to see that the rotational transformation applied to the structure translates to the features and then the predicted quantity

$$
y_{\lambda\mu}(\hat{R}A) = \sum_k w_k \xi^{\lambda\mu}_k(\hat{R}A)
 = \sum_k w_k \sum_{\mu'} D^\lambda_{\mu\mu'}(\hat{R}) \xi^{\lambda\mu'}_k(A)=
$$
$$
 = \sum_{\mu'} D^\lambda_{\mu\mu'}(\hat{R}) y_{\lambda\mu'}(A) = \hat{R} y_{\lambda\mu}(A)
$$
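
The role of the shared weights can be seen in a small numerical experiment. The sketch below uses random toy features for $\lambda=1$, written in Cartesian ordering so that an ordinary rotation matrix plays the role of $D^1$ (real spherical harmonics would differ only by a permutation of the components). It compares a model with one weight per feature $k$, shared by all $\mu$, against a generic linear model acting on the flattened features. All names and data are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
n_features = 5

# toy lambda=1 features of one structure: xi[mu, k], every column k is a 3-vector
xi = rng.normal(size=(3, n_features))

# a rotation around z; in this Cartesian toy setting it acts as D^1
angle = 0.9
rot = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                [np.sin(angle),  np.cos(angle), 0.0],
                [0.0, 0.0, 1.0]])
xi_rotated = rot @ xi  # features of the rotated structure

# (a) symmetry-adapted model: a single weight per feature k, shared by all mu
w_shared = rng.normal(size=n_features)

def predict_shared(features):
    return features @ w_shared

# (b) generic linear model: every output component sees every (mu, k) entry
w_generic = rng.normal(size=(3, 3 * n_features))

def predict_generic(features):
    return w_generic @ features.reshape(-1)

shared_is_equivariant = np.allclose(
    predict_shared(xi_rotated), rot @ predict_shared(xi)
)
generic_is_equivariant = np.allclose(
    predict_generic(xi_rotated), rot @ predict_generic(xi)
)
print(shared_is_equivariant, generic_is_equivariant)
```

Only the shared-weight model commutes with the rotation. The generic model has the same form as the multi-target fit of the previous section, where nothing constrains the weights to respect the symmetry.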

It is also possible to show that this is the correct ansatz by starting from a generic regression expression, and then averaging the feature covariance over all possible rotations of the training set, see e.g. [Goscinski et al. (2021)](http://doi.org/10.1063/5.0057229).

Starting from this expression, and defining a ridge loss, one obtains an analytical expression for the weights
$$
\mathbf{w} = [\mathbf{C}+\alpha \mathbf{1} ]^{-1}\mathbf{z}.
$$
where both the covariance and the target vector are summed over the symmetry index $\mu$ as well as over the training structures:

$$
C^\lambda_{kk'} = \sum_{A\in \mathrm{train}, \mu} \xi^{\lambda\mu}_k (A) \xi^{\lambda\mu}_{k'} (A)
\quad
z_k = \sum_{A\in \mathrm{train}, \mu} \xi^{\lambda\mu}_k (A) y_{\lambda\mu}(A)
$$

_NB:_ an important detail is that the rotational average of any quantity with $\lambda>0$ is zero, and so the regression should be performed without including a constant term (`fit_intercept=False` when using the `Ridge` class from `sklearn`). 
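
The expressions for $\mathbf{C}$ and $\mathbf{z}$ can be evaluated directly with `numpy`. The sketch below uses random toy features and targets (assumed shapes: `xi[A, mu, k]` and `y[A, mu]`) and shows that the resulting weights coincide with those of an ordinary ridge fit without intercept, in which each $(A,\mu)$ pair is treated as a separate row of the design matrix.

```python
import numpy as np

rng = np.random.default_rng(3)
n_structures, n_features, alpha = 20, 4, 1e-2

# toy lambda=1 features xi[A, mu, k] and noisy equivariant targets y[A, mu]
xi = rng.normal(size=(n_structures, 3, n_features))
w_true = rng.normal(size=n_features)
y = xi @ w_true + 0.01 * rng.normal(size=(n_structures, 3))

# covariance and target vector, accumulated over structures A and components mu
C = np.einsum("amk,aml->kl", xi, xi)
z = np.einsum("amk,am->k", xi, y)
w = np.linalg.solve(C + alpha * np.eye(n_features), z)

# equivalent route: stack the mu components as additional rows, no intercept
X_stacked = xi.reshape(-1, n_features)  # shape (n_structures * 3, n_features)
y_stacked = y.reshape(-1)
w_stacked = np.linalg.solve(
    X_stacked.T @ X_stacked + alpha * np.eye(n_features),
    X_stacked.T @ y_stacked,
)

print(np.allclose(w, w_stacked))
```

Stacking the $\mu$ components as rows is what allows a standard ridge solver to be reused for the symmetry-adapted problem, provided that the features and the targets are stored with the same ordering of $\mu$.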

In [None]:
ex11_wci = CodeInput(
        function_name="sa_ridge_rhoi", 
        function_parameters="frames, compute_rhoi, nmax, ftrain, alpha",
        docstring="""takes the structures, and a function that can compute density descriptors
        as TensorMaps, and computes ridge regression for the dipole moment, that is stored
        in the `info["dipole"]` member of the ase.Atoms frames. 
        also computes a train/test split and applies ridge regularization
        
        :param frames: a list of ase.Atoms structures
        :param compute_rhoi: a wrapper to `rascaline` that computes O-centered 
            H density coefficients - call as 
            compute_rhoi(frame, nmax, lmax)
            returns a TensorMap containing the expansion coefficients
        :param nmax: number of radial functions
        :param ftrain: the fraction of the structures list to be used for training 
        :param alpha: the ridge regularization
        
        :returns: predicted dipoles, indices of the train/test split and fitted Ridge object
            (dipoles, train, test, ridge)
""",
        function_body="""

import numpy as np
from sklearn.linear_model import Ridge
from rotutils import xyz_to_spherical, spherical_to_xyz

# indices of the train set. NB: you can select rows from a numpy array X
# by writing X[itrain]
itrain = np.arange(len(frames))
np.random.shuffle(itrain)
ntrain = int(len(frames)*ftrain)
itest = itrain[ntrain:]
itrain = itrain[:ntrain]

# computes rhoi features.
rhoi = compute_rhoi(frames, nmax, 1)
X = (
rhoi.block(o3_lambda=1)  # comment here 
.values.reshape(-1,nmax)   
)
# makes a list of the targets, by extracting .info["dipole"] from each structure
y = np.zeros((len(X), 3)) # <-- modify to extract the targets, and convert to a numpy array

y = xyz_to_spherical(y).flatten()  # comment here 


# initializes the Ridge object and calls fit on the training set
ridge = Ridge(alpha=alpha, fit_intercept=False)   # comment here
   
# <-- training

# predicts the property for ALL structures 
y_pred = y  # <-- prediction

y_pred = spherical_to_xyz(y_pred.reshape(-1,3)) # comment here

return y_pred, itrain, itest, ridge
"""
)

In [None]:
ex11_ridge = None
ex11_pars = (6, 2)
def update_11(code_exercise):
    global ex11_ridge, ex11_pars
    nmax, ftrain, log10alpha = code_exercise.parameters.values()
    ex11_pars = (nmax, 1)
    print_output = code_exercise.cue_outputs[0]
    print_output.clear_output()
    y = h2o_dipole
    yp, itrain, itest, ridge = ex11_wci.get_function_object()(h2o_frames, 
                                        compute_rhoi_default, nmax, 
                                        ftrain, 10**log10alpha)
    ex11_ridge = ridge
    
    with print_output:                        
        print("MAE train: ", np.mean(np.abs((y-yp)[itrain])))
        print("MAE test: ", np.mean(np.abs((y-yp)[itest])))
    
    frames = h2o_frames
    ftype = np.asarray([ "test " ] * len(frames)); ftype[itrain] = "train"
    properties={"|mu_ref|": np.sqrt((y[:]**2).sum(axis=1)), 
                "|mu_pred|" : np.sqrt((yp[:]**2).sum(axis=1)), 
                "||mu_ref|-|mu_pred||": np.abs(np.sqrt((y[:]**2).sum(axis=1))-
                                       np.sqrt((yp[:]**2).sum(axis=1))),
                "type": ftype[:]}
    
    settings={'map': {'x': {'property': "|mu_ref|"},
  'y': { 'property': '|mu_pred|'},
  'color': {'max': 1, 'min': 0, 'property': '||mu_ref|-|mu_pred||', 'scale': 'linear'},
  'symbol': 'type',
  'palette': 'inferno',
  'size': {'factor': 40}},
 'structure': [{'bonds': True,
                'shape': 'mu_ref,mu_pred',
   }]}
    y_arrows = chemiscope.ase_vectors_to_arrows(h2o_frames, "dipole", scale=4, head_length_scale=3);
    y_arrows["parameters"]["global"].update({ "color": 0xff8000 })
    yp_arrows = {'kind': 'arrow',
 'parameters': {'global': {'baseRadius': 0.1,
   'headRadius': 0.17500000000000002,
   'headLength': 0.30000000000000004,
   'color': 0xb000b0},
  'structure': [ {"vector": (mu*4).tolist()} for mu in yp ]}}
    
    chemiscope.write_input("module_03-equivariant-regression.chemiscope.json.gz", 
                           frames=frames[:],
                           shapes={"mu_ref":y_arrows, "mu_pred": yp_arrows},
                           properties=properties,
                           settings=settings
                          )
                           
    with print_output:
        display(chemiscope.show_input("module_03-equivariant-regression.chemiscope.json.gz"
                  ) )
        
ex11_pb =  ParameterPanel(
    nmax = IntSlider(value=4,min=1,max=8,description=r'$n_{max}$'),
    ftrain = FloatSlider(value=0.5,min=0.1,max=0.9,step=0.1,description=r'$f_{\mathrm{train}}$'),
    log10alpha = FloatSlider(value=0,min=-20,max=5,step=0.1,description=r'$\log_{10}\alpha$'),
    )

ex11_code_demo = CodeExercise(
            code= ex11_wci,
            #check_registry=check_registry,
            parameters=ex11_pb,
            cue_outputs = [CueObject()],
            update_func = update_11,
    exercise_key="11",
    exercise_registry=exercise_registry,
    exercise_title="Exercise 11: Equivariant regression of dipoles",
    exercise_description=mdwn("""
Implement a function that fits and evaluates an equivariant regression model for the dipole
moment of water molecules. The descriptors should be density expansion coefficients
(computed by a utility function) and you should use `sklearn`'s `Ridge` class to
perform the regression.

NB: The stub of the function implements already several operations that are necessary to
make sure that `Ridge` does "the right thing" to perform equivariant regression. 
Add detailed comments explaining why these are necessary to implement the equations 
for equivariant regression. 
""")
)

In [None]:
display(ex11_code_demo)

In [None]:
ex11b_txt = TextExercise(
    exercise_description=mdwn(r"""
Experiment with the training fraction, the regularization, and the 
number of density coefficients. How does the accuracy compare with what 
you had observed when using all the density expansion coefficients? 
Make hypotheses as to why results differ.
"""),
    exercise_registry=exercise_registry,
    exercise_key="11b",
    exercise_title=""
)
display(ex11b_txt)

In [None]:
def update_12(code_exercise):
    output = code_exercise.cue_outputs[0]
    output.clear_output()
    rots = []
    for f in h2o_frames[::40]:
      for r in np.pi*np.array([[0,0,0],[0,0.125,0],[0,0.250,0],[0,0.375,0],[0,0.5,0],
                             [0.125,0.5,0],[0.25,0.5,0],[0.375,0.5,0],[0.5,0.5,0],
                             [0.5,0.5,0.125],[0.5,0.5,0.25],[0.5,0.5,0.375],[0.5,0.5,0.5]]):
        nf = deepcopy(f)
        nf.positions, nf.info["dipole"] = ex02_wci.get_function_object()(
            f.positions, f.info["dipole"], rotation_matrix(*r) )

        rhoi=compute_rhoi_default(nf, *ex11_pars)
        X = (
            rhoi
            .block(1).values.reshape(-1, ex11_pars[0]) # picks the second block (o3_lambda=1) and stacks the mu components as rows
            )
        ypred = ex11_ridge.predict(X)
        nf.info["dipole_pred"] = spherical_to_xyz(ypred.reshape(-1,3))[0]
        rots.append(nf)
    properties={"|mu_ref|": np.sqrt(np.array([f.info["dipole"]**2 for f in rots]).sum(axis=1)), 
                "|mu_pred|" : np.sqrt(np.array([f.info["dipole_pred"]**2 for f in rots]).sum(axis=1))}
    properties.update(
        {"||mu_ref|-|mu_pred||": np.abs(properties["|mu_ref|"]-properties["|mu_pred|"])
        })
    dipoles_show = chemiscope.ase_vectors_to_arrows(rots, "dipole", scale=4)
    dipoles_show["parameters"]["global"]["color"]=0xff8000
    pred_show = chemiscope.ase_vectors_to_arrows(rots, "dipole_pred", scale=4)
    pred_show["parameters"]["global"]["color"]=0xb000b0
    with output:
        cs=chemiscope.show(rots, 
            properties=properties,
            shapes={
            "mu_ref": dipoles_show,
            "mu_pred": pred_show,
                },
                mode="default",
               settings=chemiscope.quick_settings(
                   x="|mu_ref|", y="|mu_pred|", color="||mu_ref|-|mu_pred||",
                   structure_settings={"shape": ["mu_ref", "mu_pred"], 
                        "keepOrientation":True}
                          )
                          )
        cs.save("module_03-dipole_rotated_equiv.chemiscope.json.gz")
        display(cs)

ex12_code_demo = CodeExercise(
    check_registry=check_registry,
    cue_outputs = [CueObject()],
    update_func = update_12,
    exercise_title="Exercise 12: Molecular rotations and equivariant predictions",
    exercise_description=mdwn("""
Let's repeat the experiment of predicting the dipole of a few rigidly-rotated structures.
This uses the equivariant model you trained above.
""")
)

In [None]:
display(ex12_code_demo)

In [None]:
ex12b_txt = TextExercise(
    exercise_description=mdwn(r"""
Compare your results with those for the non-equivariant regression model. 
"""),
    exercise_registry=exercise_registry,
    exercise_key="12b",
    exercise_title=""
)
display(ex12b_txt)

The density expansion coefficients contain limited information about the atomic positions, which affects the accuracy that can be achieved, especially by a simple linear regression model. You can read about ways to systematically build [more informative equivariant representations](http://doi.org/10.1063/5.0021116), and how to construct [equivariant deep learning architectures](http://arxiv.org/abs/1802.08219v3). 