# Neo-Eulerian clustering

Duncan has a few datasets he's been working on with Robert that could benefit from a clustering approach. We've decided to run with DBSCAN because it does not need the number of clusters specified in advance and it suits the data.

1. Load in the euler-angle data
2. Convert to quaternions
3. Calculate grain boundaries (defined as neighbouring pixels misoriented by more than 5 degrees)
4. Compute homochoric (equal-volume) representation
5. Cluster using DBSCAN
6. Convert cluster means back to quaternions
7. Project into fundamental zone
8. Check overlap

In [None]:
%matplotlib inline
import numpy as np
from math import acos, pi
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
from transforms3d.euler import euler2quat
from transforms3d.quaternions import qmult, qinverse, quat2axangle

In [None]:
with open('/home/bm424/Desktop/upload_scripts_apollo/case_study_1_bainite_data.ctf') as f:
    lines = f.readlines()
lines = lines[17:]  # The first few lines are descriptive

In [None]:
# For each line, split on tabs and take columns 5-7 inclusive (zero-indexed): the three Euler angles in degrees.
data = np.radians(np.array([line.split('\t')[5:8] for line in lines]).astype(float))
# The data appears to be doubled somehow, so just take the first half.
data = data.reshape(223, 1190, 3)[:, :595, :].reshape(-1, 3)
data

In [None]:
# Visual representation using the first Euler angle
plt.imshow(data.reshape(223, 595, 3)[:, :, 0])

In [None]:
# Convert to quaternions. This is slow as 'euler2quat' only operates on one row at a time.
data_quat = np.array([euler2quat(*d, axes='rzyz') for d in data]).reshape(223, 595, 4)
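The row-by-row conversion can in principle be replaced by a closed-form, vectorised one. The sketch below assumes that `axes='rzyz'` means intrinsic z-y'-z'' rotations, i.e. the product q_z(a) q_y(b) q_z(c), and that quaternions are ordered (w, x, y, z). The function name `euler_zyz_to_quat` is hypothetical, and it is worth comparing its output against `euler2quat` on a few rows before relying on it (an overall sign difference would still represent the same rotation).

```python
import numpy as np

def euler_zyz_to_quat(angles):
    """Convert (N, 3) intrinsic z-y'-z'' Euler angles (radians) to (N, 4) unit quaternions (w, x, y, z)."""
    a, b, c = np.asarray(angles, dtype=float).T
    half_sum = 0.5 * (a + c)
    half_diff = 0.5 * (a - c)
    cb, sb = np.cos(0.5 * b), np.sin(0.5 * b)
    # Components of q_z(a) * q_y(b) * q_z(c), written out by hand
    return np.stack([cb * np.cos(half_sum),
                     -sb * np.sin(half_diff),
                     sb * np.cos(half_diff),
                     cb * np.sin(half_sum)], axis=-1)
```

Under those assumptions, `euler_zyz_to_quat(data).reshape(223, 595, 4)` would stand in for the list comprehension above.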

In [None]:
# Vertical misorientation angles
v = np.arccos(np.square(np.sum(data_quat[1:] * data_quat[:-1], axis=2)) * 2 - 1)[:, :-1]
# Grain boundaries picked out where the angle is greater than 5 degrees
plt.figure()
plt.imshow(v > 0.0873)

In [None]:
# Horizontal misorientation angles
h = np.arccos(np.square(np.sum(data_quat[:, 1:] * data_quat[:, :-1], axis=2)) * 2 - 1)[:-1, :]
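The expression used for `v` and `h` relies on the identity cos θ = 2 (q1 · q2)² − 1, which holds for unit quaternions because q1 · q2 = ±cos(θ/2); squaring the dot product also removes the q / −q sign ambiguity. A minimal sketch of the same calculation as a reusable function follows. The name `misorientation_angle` and the `np.clip` guard are additions made here: the guard only protects against rounding pushing the argument of `arccos` slightly outside [-1, 1].

```python
import numpy as np

def misorientation_angle(q1, q2):
    """Rotation angle (radians) between unit quaternions, broadcast over leading axes."""
    dot = np.sum(q1 * q2, axis=-1)
    # cos(theta) = 2 * cos^2(theta / 2) - 1; clip to guard against rounding error
    cos_theta = np.clip(2.0 * np.square(dot) - 1.0, -1.0, 1.0)
    return np.arccos(cos_theta)
```

For example, `misorientation_angle(data_quat[1:], data_quat[:-1])` would give the vertical angles before the final column is trimmed.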

In [None]:
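# For simplicity just consider the vertical pairs. (There are more of them.)
# v compares pixel (i, j) with the pixel directly below it, (i + 1, j), so pair up exactly those pixels.
to_compare = zip(data_quat[:-1, :-1][v > 0.0873], data_quat[1:, :-1][v > 0.0873])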

# Compute misorientation quaternions
misorientations = np.array([qmult(q2, qinverse(q1)) for q1, q2 in to_compare])
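The misorientation q2 q1⁻¹ can also be computed for all pairs at once. The sketch below assumes unit quaternions in (w, x, y, z) order, in which case the inverse is just the conjugate. `quat_multiply` and `quat_conjugate` are hypothetical helper names, not part of `transforms3d`.

```python
import numpy as np

def quat_conjugate(q):
    """Conjugate of quaternions stored as (..., 4) arrays in (w, x, y, z) order."""
    return q * np.array([1.0, -1.0, -1.0, -1.0])

def quat_multiply(p, q):
    """Hamilton product p * q, broadcast over the leading axes."""
    pw, px, py, pz = np.moveaxis(p, -1, 0)
    qw, qx, qy, qz = np.moveaxis(q, -1, 0)
    return np.stack([pw * qw - px * qx - py * qy - pz * qz,
                     pw * qx + px * qw + py * qz - pz * qy,
                     pw * qy - px * qz + py * qw + pz * qx,
                     pw * qz + px * qy - py * qx + pz * qw], axis=-1)
```

With two `(N, 4)` arrays `q1` and `q2` of paired quaternions, `quat_multiply(q2, quat_conjugate(q1))` would then correspond to the list comprehension above.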

In [None]:
# If computing distance matrix, need a subset of the data for the sake of RAM
# random_indices = np.random.choice(np.arange(len(misorientations)), 20000)
# misorientations = misorientations[random_indices]
# dmatrix = np.arccos(np.round(np.square(np.einsum('ik,jk->ij', misorientations, misorientations)), 8) * 2 - 1)

In [None]:
# Axis-angle form of each misorientation (one call to quat2axangle per quaternion)
axangles = [quat2axangle(d) for d in misorientations]
axes = np.array([axis for axis, _ in axangles])
angles = np.array([angle for _, angle in axangles])
radius = (0.75*(angles - np.sin(angles)))**(1/3)  # Homochoric scaling. Note there is no closed-form inverse for this '>_<
rf = axes * radius[:, np.newaxis]
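The homochoric vector is the rotation axis scaled by [3/4 (ω − sin ω)]^(1/3), where ω is the rotation angle. A vectorised sketch that goes straight from quaternions to homochoric vectors is given below. The function name `quat_to_homochoric` is hypothetical, and the sketch makes one choice that the cell above does not: it flips each quaternion to the representative with non-negative scalar part, so that ω stays in [0, π].

```python
import numpy as np

def quat_to_homochoric(q):
    """Map (..., 4) quaternions (w, x, y, z) to (..., 3) homochoric vectors."""
    q = np.asarray(q, dtype=float)
    q = q / np.linalg.norm(q, axis=-1, keepdims=True)
    # q and -q are the same rotation; keep w >= 0 so the angle lies in [0, pi]
    q = np.where(q[..., :1] < 0, -q, q)
    angle = 2.0 * np.arccos(np.clip(q[..., 0], -1.0, 1.0))
    vec = q[..., 1:]
    norm = np.linalg.norm(vec, axis=-1)
    safe_norm = np.where(norm > 0, norm, 1.0)  # identity rotation: axis is arbitrary, radius is 0
    radius = np.cbrt(0.75 * (angle - np.sin(angle)))
    return vec / safe_norm[..., None] * radius[..., None]
```

The largest possible radius is (3π/4)^(1/3) ≈ 1.33, reached for a 180 degree rotation.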

In [None]:
from sklearn.cluster import DBSCAN

In [None]:
labels = DBSCAN(eps=0.03, min_samples=80).fit_predict(rf)  # Play with parameters - there are a lot of clusters on different scales
print(len(set(labels)))  # Number of distinct labels found; this count includes the noise label (-1) if present
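DBSCAN marks noise points with the label -1, so the number of distinct labels overstates the number of clusters by one whenever noise is present. When tuning `eps` and `min_samples` it may help to look at the cluster sizes as well. A small sketch follows; the helper name `summarise_labels` is made up for illustration.

```python
import numpy as np

def summarise_labels(labels):
    """Return (number of clusters, number of noise points, {label: size}) for DBSCAN-style labels."""
    labels = np.asarray(labels)
    unique, counts = np.unique(labels, return_counts=True)
    # Exclude the noise label (-1) from the per-cluster sizes
    sizes = {int(l): int(c) for l, c in zip(unique, counts) if l != -1}
    n_noise = int(np.sum(labels == -1))
    return len(sizes), n_noise, sizes
```

Calling `summarise_labels(labels)` after the fit gives the cluster count without the noise label.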

In [None]:
# Visualise in 3d
ax = plt.figure().add_subplot(111, projection='3d', aspect='equal')
for label in set(labels):
    if label < 0:  # Plot the 'noise' cluster separately
        r = rf[labels == label]
        ax.scatter(r[:, 0], r[:, 1], r[:, 2], s=0.1, c='k')
        continue
    r = rf[labels==label]
    ax.scatter(r[:, 0], r[:, 1], r[:, 2], s=1)
ax.set_xlim(-0.4, 0.4)
ax.set_ylim(-0.4, 0.4)
ax.set_zlim(-0.4, 0.4)

In [None]:
# Visualise the clusters more easily by taking a thin slice around the x = 0 plane
ax = plt.figure().add_subplot(111, aspect='equal')
slicer = np.abs(rf[:, 0]) < 0.05
rf_slice = rf[slicer]
labels_slice = labels[slicer]
for label in set(labels_slice):
    if label == -1:
        r = rf_slice[labels_slice == label]
        plt.scatter(r[:, 1], r[:, 2], s=0.2, c='k')
        continue
    r = rf_slice[labels_slice == label]
    plt.scatter(r[:, 1], r[:, 2], s=1)

Plenty of clusters are found with the above parameters, but many small ones are missed. I think each individual grain boundary has its own little cluster, which demonstrates the importance of applying the symmetry reduction *before* doing this kind of clustering.

In [None]:
# Plot the real space distribution of clusters
# Labels are offset by 2 so that noise (-1) maps to 1 and stays distinct from the background (0)
blank = np.zeros_like(v)
blank[v > 0.0873] = labels + 2

plt.figure(figsize=(10, 4))
plt.imshow(blank)
plt.colorbar(label='cluster index')
plt.tight_layout()

In [None]:
# Get the misorientation quaternions associated with each cluster
misorientations_sorted = {label: misorientations[labels==label] for label in set(labels)}

In [None]:
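# Stack the clusters into one array with the cluster label as the first column.
# np.vstack needs a sequence (here a list), not a generator.
misorientations_labeled = np.vstack([
    np.hstack((np.full((len(quats), 1), label), quats))
    for label, quats in misorientations_sorted.items()
])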

In [None]:
np.savetxt('misorientations_labeled.txt', misorientations_labeled, fmt=['%03.2i', '%09.6f', '%09.6f', '%09.6f', '%09.6f'], delimiter='\t\t')

## To Do:

- Apply symmetry operations. Based on the above, this should probably be done before clustering, although in principle there is no problem with finding loads of clusters and merging them afterwards.
- Be more intelligent about finding the high misorientations
- Implement quaternion averaging. (It should be possible simply to take the mean of the homochoric points, but there is no closed-form inverse of $\omega - \sin\omega$ to get back to the axis-angle representation, so the inversion would have to be numerical.)
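Although $\omega - \sin\omega$ has no closed-form inverse, it increases monotonically on $[0, \pi]$, so it can be inverted numerically. The sketch below uses plain bisection, which is a choice made here for robustness rather than anything from the notebook, and the function names are hypothetical. It assumes the rotation angle lies in $[0, \pi]$ and that quaternions are ordered (w, x, y, z).

```python
import numpy as np

def homochoric_radius(angle):
    """Forward map: rotation angle (radians) to homochoric radius."""
    return np.cbrt(0.75 * (angle - np.sin(angle)))

def angle_from_homochoric_radius(radius, n_iter=60):
    """Invert the forward map by bisection on [0, pi]."""
    target = (4.0 / 3.0) * np.asarray(radius, dtype=float) ** 3  # equals angle - sin(angle)
    lo = np.zeros_like(target)
    hi = np.full_like(target, np.pi)
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        too_small = mid - np.sin(mid) < target
        lo = np.where(too_small, mid, lo)
        hi = np.where(too_small, hi, mid)
    return 0.5 * (lo + hi)

def homochoric_to_quat(h):
    """Map (..., 3) homochoric vectors back to (..., 4) unit quaternions."""
    h = np.asarray(h, dtype=float)
    radius = np.linalg.norm(h, axis=-1)
    angle = angle_from_homochoric_radius(radius)
    safe_radius = np.where(radius > 0, radius, 1.0)  # zero vector maps to the identity
    axis = h / safe_radius[..., None]
    return np.concatenate([np.cos(0.5 * angle)[..., None],
                           axis * np.sin(0.5 * angle)[..., None]], axis=-1)
```

A cluster mean could then be converted with something like `homochoric_to_quat(rf[labels == label].mean(axis=0))`. Whether the plain mean of homochoric points is a good enough average is still the open question in the list above.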