# **DESIGN OF EXPERIMENTS**

---

This notebook generates the dataset for the Finite Element Analysis (FEA) of a variable stiffness (VS) cylindrical shell. For this purpose, a Design of Experiments (DOE) based on Latin Hypercube Sampling (LHS) is used. We first import all the required libraries. Instead of the library smt used in the related paper, here we use the library pyDOE, since it makes it easy to obtain reproducible results.
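
To make the sampling idea concrete, here is a minimal NumPy-only sketch of Latin Hypercube Sampling in the unit hypercube. It is not pyDOE's implementation: the function name `lhs_unit` and the seeding through `np.random.default_rng` are assumptions made only for illustration. Each dimension is cut into `n_samples` equal strata, one point is drawn inside each stratum, and the strata are shuffled independently in every dimension.

```python
import numpy as np

def lhs_unit(n_samples, n_dims, seed=0):
    """Return an (n_samples, n_dims) Latin Hypercube design in [0, 1)."""
    rng = np.random.default_rng(seed)
    design = np.empty((n_samples, n_dims))
    for d in range(n_dims):
        # One random point inside each of the n_samples equal-width strata
        points = (np.arange(n_samples) + rng.random(n_samples)) / n_samples
        # Shuffle so that strata are paired randomly across dimensions
        design[:, d] = rng.permutation(points)
    return design

design = lhs_unit(n_samples=5, n_dims=2, seed=3)
# Index of the stratum hit by each sample, sorted per dimension:
# every dimension should visit each of the 5 strata exactly once
strata = np.sort(np.floor(design * 5).astype(int), axis=0)
print(strata.T)
```

The property checked at the end (one sample per stratum in every dimension) is what distinguishes LHS from plain uniform random sampling.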

In [1]:
import os
import sys
import time
import random
import pandas as pd
import seaborn as sns
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
# !pip install smt
# from smt.sampling_methods import LHS
# !pip install pyDOE
import pyDOE
from scipy.optimize import fsolve
#from google.colab import files

## **Geometry and design parameters**

---

We start by defining the geometric properties of the cylindrical shell to be analyzed, together with the mesh size of the Finite Element Model (FEM). The values of the radius and of the height are taken from a VS cylinder manufactured at TU Delft. The mesh size was chosen after a sensitivity analysis on the buckling load of various Constant Stiffness (CS) cylindrical shells.

In [2]:
h = 705  # height
r = 300  # radius
mesh_size = 10  # mesh size
x_elems = int(h / mesh_size)
deg2rad = np.pi / 180

To avoid mistakes, all the variables that can be changed to modify the DOE generation are grouped in a single cell below. They are:

* **sets**: specifies which sets to generate with the sampling method:

  1. *train*: training set
  
  2. *val*: validation set
  
  3. *test*: testing set
  
* **n_samples**: fix the number of samples to be generated. This number is the sum of the samples of all the sets:

  1. *small*
  
  2. *large*
  
  3. *int*: specific integer value
  
* **angles_fct**: specifies the function used to generate the fiber angles:

  1. *harmlin*
  
  2. *linear*
  
  3. *constant*
  
* **plies**: total number of plies:

  1. *int*
  
* **symmetric**: define symmetric stacking sequence:

  1. *True/False*
  
* **balanced**: define a balanced stacking sequence:

  1. *True/False*
  
* **radius_curvature**: defines the minimum radius of curvature allowed for the fiber path (its inverse is the maximum curvature):

  1. *int*

It is important to remark that the choice of the number of samples does not take into account important aspects such as the variability of the output inside the design space or the size of the design space. To obtain a statistical estimate of the number of samples required for correct learning, the VC dimension should be analyzed.

The generation of the dataset, and also the final optimization step, requires knowledge of the bounds of the problem. We have to specify the minimum and maximum value of each design variable. The units of measure of the features are reported below.

* **Harmlin**:

    * $A$: $[mm]$
    
    * $\phi$: $[deg]$
    
    * $\omega$: $[1/rad]$
    
    * $\beta$: $[deg]$
    
* **Linear**:

    * $T_i$: $[deg]$

* **Constant**:

    * $\vartheta$: $[deg]$

In [3]:
plies = 8 # number of plies

sets = ['Train', 'Test']  # sets to generate

load_case = 'torsion'

angles_fct = 'harmlin'  # formulation of the fiber

pieces = 2  # number of pieces of the piecewise linear formulation

symmetric = True  # symmetric stacking sequence
balanced = True  # balanced stacking sequence

radius_curvature = 635 # minimum radius of curvature

param = 8  # multiplicative factor for the sets size

In [4]:
if angles_fct == 'harmlin':
    features = ['Amplitude', 'PhaseShift', 'Omega', 'Beta']
    range_dict = {'Amplitude': [-200., 200.],
                  'PhaseShift': [-90., 90.],
                  'Omega': [0., 2],
                  'Beta': [-90., 90.]}
elif angles_fct == 'linear' and isinstance(pieces, int):
    features=[]
    range_dict = {}
    for p in range(pieces+1):
        features.append('T' + str(p))
        if p == 0 or p == pieces:
            range_dict['T' + str(p)] =  [-45., 45.]
        else:
            range_dict['T' + str(p)] = [-89., 89.]
elif angles_fct == 'linear' and not isinstance(pieces, int):
    raise Exception('Please specify an integer number of pieces')
    
elif angles_fct == 'constant':
    features= ['Theta']
    range_dict = {'Theta': [-89., 90.]}

if symmetric and balanced:  # symmetric and balanced
    eff_plies = plies // 4
    folder_ss = 'symmetric_balanced'
elif symmetric:  # symmetric and not balanced
    eff_plies = plies // 2
    folder_ss = 'symmetric'
elif balanced:  # balanced and not symmetric
    eff_plies = plies // 2
    folder_ss = 'balanced'
else:
    # eff_plies and folder_ss would otherwise be left undefined
    raise ValueError('At least one of symmetric and balanced must be True')

k_max = 1 / radius_curvature

Create the folder to store the dataset and the folders for the specific sets. 

In [5]:
directory = '../dataset/' + load_case + '/' + folder_ss + '/' + str(param) + 'x/' + angles_fct

# exist_ok=True makes the call a no-op when the folder is already there
os.makedirs(directory, exist_ok=True)

for set_name in sets:
    # Keep the set name unchanged so that the folder matches the paths
    # used when the files are written (case matters on Linux)
    os.makedirs(directory + '/' + set_name, exist_ok=True)

We can define a dictionary in which to store all the information about the model and the sampling. This dictionary will be useful in other steps of the overall framework.

In [6]:
model = {'Height': h,
         'Radius': r,
         'MaxCurvature': k_max,
         'MeshSize': mesh_size,
         'Plies': plies,
         'EffectivePlies': eff_plies,
         'Symmetric': symmetric,
         'Balanced': balanced,
         'AnglesFunction': angles_fct,
         'LoadCase': load_case}
col = ['Theta' + str(i) for i in range(1, x_elems + 1)]

Since the library pyDOE only generates samples in the range $[0, 1]$, we need to store the range of each feature for the rescaling.

In [7]:
interval = []
min_int = []
for feature in features:
    interval.append(range_dict[feature][1] - range_dict[feature][0])
    min_int.append(range_dict[feature][0])

range_df = pd.DataFrame(range_dict)
range_df.index = ['min', 'max']
range_df.to_csv(directory + '/features_min_max.csv', index=False)  
range_df

| | Amplitude | PhaseShift | Omega | Beta |
|---|---|---|---|---|
| min | -200.0 | -90.0 | 0.0 | -90.0 |
| max | 200.0 | 90.0 | 2.0 | 90.0 |
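
The rescaling from the unit hypercube to the physical ranges is a per-column affine map, `sample * (max - min) + min`. The sketch below illustrates it with the harmlin ranges listed above; the array `unit` is a hypothetical stand-in for the output of the sampler, chosen so that the result is easy to check by hand.

```python
import numpy as np

# Feature ranges copied from the harmlin case of this notebook
range_dict = {'Amplitude': [-200., 200.],
              'PhaseShift': [-90., 90.],
              'Omega': [0., 2.],
              'Beta': [-90., 90.]}
features = list(range_dict)

lower = np.array([range_dict[f][0] for f in features])
width = np.array([range_dict[f][1] - range_dict[f][0] for f in features])

# Hypothetical unit-cube samples standing in for the LHS output
unit = np.array([[0.0, 0.00, 0.00, 0.0],
                 [0.5, 0.50, 0.50, 0.5],
                 [1.0, 0.25, 0.75, 1.0]])

# Broadcasting applies each column's own width and offset
scaled = unit * width + lower
print(scaled)
```

A unit value of 0 maps to the lower bound of the feature, 1 to the upper bound, and 0.5 to the midpoint of the range.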


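The following cells set the total number of samples and split it among the sets. The rule based on *n_samples* is deprecated and kept commented out for reference. The active rule takes $10 \cdot$ *param* samples per design variable, where the number of design variables is the number of effective plies times the number of features per ply. The total is split 80/20 between training and testing when two sets are requested, and 60/20/20 when a validation set is added.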

In [8]:
# # DEPRECATED
# # From paper: ...
# if n_samples == 'small':
#     set_dim = 10 * eff_plies * len(features)
# elif n_samples == 'large':
#     set_dim = int(3 * (eff_plies * len(features) + 1) * (eff_plies * len(features) + 2) / 2)
# else:
#     set_dim = n_samples

In [9]:
set_dim = (eff_plies * len(features)) * param * 10
    
n_sets = len(sets)

if n_sets == 1:
    smpls = {'Train' : [set_dim]}
elif n_sets == 2:
    smpls = {'Train' : [round(set_dim * 0.8)],
            'Test': [round(set_dim * 0.2)]}
else:
    smpls = {'Train' : [round(set_dim * 0.6)],
            'Test': [round(set_dim * 0.2)],
            'Val': [round(set_dim * 0.2)]}

for key in smpls.keys():
    model.update({key: smpls[key]})

model_df = pd.DataFrame(model)
model_df.to_csv(directory + '/model_info.csv', index=False, float_format='%.6f')
model_df

| | Height | Radius | MaxCurvature | MeshSize | Plies | EffectivePlies | Symmetric | Balanced | AnglesFunction | LoadCase | Train | Test |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 705 | 300 | 0.001575 | 10 | 8 | 2 | True | True | harmlin | torsion | 512 | 128 |
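
As a worked check of the numbers in the table above, the sizing rule can be written as a small standalone function. The name `split_samples` is hypothetical and only mirrors the logic of the cell; it is a sketch, not part of the framework.

```python
def split_samples(eff_plies, n_features, param, n_sets):
    """Total sample count and its split among the sets."""
    # 10 * param samples for every design variable
    set_dim = eff_plies * n_features * param * 10
    if n_sets == 1:
        return {'Train': set_dim}
    if n_sets == 2:
        return {'Train': round(set_dim * 0.8),
                'Test': round(set_dim * 0.2)}
    return {'Train': round(set_dim * 0.6),
            'Test': round(set_dim * 0.2),
            'Val': round(set_dim * 0.2)}

# 8 plies, symmetric and balanced -> 2 effective plies, 4 harmlin features each
print(split_samples(eff_plies=2, n_features=4, param=8, n_sets=2))
# {'Train': 512, 'Test': 128}
```

With 2 effective plies and 4 features there are 8 design variables, so the total is 8 x 8 x 10 = 640 samples, of which 512 go to training and 128 to testing.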


## **Sampling**

---

The way the maximum curvature is imposed in the paper was too approximate. Here the nonlinear curvature equation is solved with *fsolve()*, so that we obtain an approximate solution of the exact equation rather than the exact solution of an approximate equation. Moreover, the amplitude is corrected instead of the frequency. The reason is that, since the nonlinear equation is solved iteratively, every correction of the frequency moves the position of the maximum curvature along the path. Correcting the amplitude is a simpler solution.

**FOR THE HARMLIN FUNCTION**
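
For reference, the expressions coded in `a_max` can be written out as follows. This is a transcription of the code of this notebook, with $h$ the height of the cylinder and $x$ the axial coordinate; the symbol $R_{min}$ for the minimum radius of curvature is introduced here only for readability.

$$
y(x) = A \sin\left(\omega \frac{2\pi}{h} x + \phi\right) + x \tan\beta
$$

$$
y'(x) = \omega \frac{2\pi}{h} A \cos\left(\omega \frac{2\pi}{h} x + \phi\right) + \tan\beta
$$

$$
y''(x) = -\omega^2 \frac{4\pi^2}{h^2} A \sin\left(\omega \frac{2\pi}{h} x + \phi\right)
$$

$$
\kappa(x) = \frac{y''(x)}{\left(1 + y'(x)^2\right)^{3/2}}, \qquad |\kappa(x)| \le k_{max} = \frac{1}{R_{min}}
$$

When the sampled amplitude violates the constraint, the corrected amplitude $\hat{A}$ is the root of $k_{max} - |\kappa(x^*; \hat{A})| = 0$, where $x^*$ is the location of the largest $|\kappa|$ found with the sampled amplitude.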

In [10]:
def a_max(height, mesh_size, amplitude, phase_shift, omega, beta, k_max):
    # Sample the path between the midpoints of the first and last element
    x = np.linspace(mesh_size / 2, height - mesh_size / 2, 1000)
    w = omega * (2 * np.pi / height)  # angular frequency along the axis

    def curvature_at(a, x_eval):
        # Signed curvature of y = a * sin(w * x + phase_shift) + x * tan(beta)
        dy = w * a * np.cos(w * x_eval + phase_shift) + np.tan(beta)
        ddy = -(w**2) * a * np.sin(w * x_eval + phase_shift)
        return ddy / ((1 + dy**2)**1.5)

    k = np.abs(curvature_at(amplitude, x))
    max_pos = np.argmax(k)  # first location of the largest |curvature|

    if k[max_pos] > k_max:
        # Find the amplitude that gives |curvature| = k_max at that location
        def residual(a_hat):
            return k_max - abs(curvature_at(a_hat, x[max_pos]))
        # fsolve returns an array: take its only element as a scalar
        new_a = fsolve(residual, amplitude)[0]
    else:
        new_a = amplitude

    return new_a
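
A quick way to see whether a given sample respects the constraint is to evaluate the largest absolute curvature along the path. The helper below is a hypothetical, NumPy-only sketch (the name `max_curvature` and the sampling density `n` are assumptions); it uses the same curvature expression as `a_max` and can be applied to a sample before or after the amplitude correction.

```python
import numpy as np

def max_curvature(height, mesh_size, amplitude, phase_shift, omega, beta, n=1000):
    """Largest |curvature| of the harmlin path, angles in radians."""
    x = np.linspace(mesh_size / 2, height - mesh_size / 2, n)
    w = omega * 2 * np.pi / height  # angular frequency along the axis
    dy = w * amplitude * np.cos(w * x + phase_shift) + np.tan(beta)
    ddy = -w**2 * amplitude * np.sin(w * x + phase_shift)
    return np.max(np.abs(ddy) / (1 + dy**2)**1.5)

k_max = 1 / 635  # inverse of the minimum radius of curvature used above

# Extreme corner of the design space: largest amplitude and frequency
k = max_curvature(705, 10, amplitude=200., phase_shift=0., omega=2., beta=0.)
print(k, k > k_max)
```

For this corner of the design space the curvature is roughly forty times larger than the limit, which shows why a large share of the raw samples needs the correction.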

Generate the input files containing the combinations of input features for each sample and each ply.

In [11]:
if angles_fct == 'harmlin':
    for act_set in sets:
        for ply in range(1, eff_plies + 1):
            np.random.seed(3 + ply) # for reproducibility
            design = pyDOE.lhs(len(features), smpls[act_set][0])
            x = design * interval + min_int
            for j in range(0, smpls[act_set][0]):
                # correct the amplitude where the curvature limit is exceeded
                new_a = a_max(h, mesh_size, x[j, 0], x[j, 1] * deg2rad, x[j, 2], x[j, 3] * deg2rad, k_max)
                # np.random.seed(5) # for reproducibility
                # x[j, 0] = new_a * np.random.uniform(0., 1)
                x[j, 0] = new_a
            x_df = pd.DataFrame(x, columns=features)
            x_df.to_csv(directory + '/' + act_set + '/' + 'ply' + str(ply) + '.csv', index=False, float_format='%.4f')
                
elif angles_fct == 'linear':
    for act_set in sets:
          for ply in range(1, eff_plies + 1):
                np.random.seed(3 + ply) # for reproducibility
                design = pyDOE.lhs(len(features), smpls[act_set][0])
                T = design * interval + min_int
                theta = np.zeros((T.shape[0], x_elems))
                for j in range(0, smpls[act_set][0]):
                    pieces_length = [np.ceil(x_elems/pieces)]
                    for piece in range(pieces-1):
                        pieces_length.append(np.floor(x_elems/pieces))
                    if pieces_length[0] == pieces_length[1]:
                        pieces_length[0] += 1

                    x_tmp = np.linspace(0,h/pieces,int(pieces_length[0]))
                    tht_tmp = 0 + T[j][0] + (T[j][1] - T[j][0]) * x_tmp / x_tmp[-1]
                    k_tmp = (T[j][1] - T[j][0]) * deg2rad / x_tmp[-1] * np.cos(tht_tmp * deg2rad)

                    max_k = np.max(k_tmp)
                    min_k= np.min(k_tmp)
                    if max_k >= abs(min_k):
                        maxim = max_k
                        flag_pos = True
                    else:
                        maxim = abs(min_k)
                        flag_pos = False

                    if maxim > k_max and flag_pos:
                        iterate = True
                        while iterate:
                            T[j][1] -= 1
                            tht_tmp = 0 + T[j][0] + (T[j][1] - T[j][0]) * x_tmp / x_tmp[-1]
                            k_tmp = (T[j][1] - T[j][0]) * deg2rad / x_tmp[-1] * np.cos(tht_tmp * deg2rad)
                            max_k = np.max(k_tmp)
                            if max_k < k_max:
                                iterate = False
                    elif maxim > k_max and not flag_pos:
                        iterate = True
                        while iterate:
                            T[j][1] += 1
                            tht_tmp = 0 + T[j][0] + (T[j][1] - T[j][0]) * x_tmp / x_tmp[-1]
                            k_tmp = (T[j][1] - T[j][0]) * deg2rad / x_tmp[-1] * np.cos(tht_tmp * deg2rad)
                            max_k = abs(np.min(k_tmp))
                            if max_k < k_max:
                                iterate = False

                    tht = tht_tmp[0:-1]

                    for i in range(1,pieces):
                        x_tmp = np.linspace(0,h/pieces,int(pieces_length[i]))
                        tht_tmp = 0 + T[j][i] + (T[j][i+1] - T[j][i]) * x_tmp / x_tmp[-1]
                        k_tmp = (T[j][i+1] - T[j][i]) * deg2rad / x_tmp[-1] * np.cos(tht_tmp * deg2rad)

                        max_k = np.max(k_tmp)
                        min_k= np.min(k_tmp)
                        if max_k >= abs(min_k):
                            maxim = max_k
                            flag_pos = True
                        else:
                            maxim = abs(min_k)
                            flag_pos = False

                        if maxim > k_max and flag_pos:
                            iterate = True
                            while iterate:
                                T[j][i+1] -= 1
                                tht_tmp = 0 + T[j][i] + (T[j][i+1] - T[j][i]) * x_tmp / x_tmp[-1]
                                k_tmp = (T[j][i+1] - T[j][i]) * deg2rad / x_tmp[-1] * np.cos(tht_tmp * deg2rad)
                                max_k = np.max(k_tmp)
                                if max_k < k_max:
                                    iterate = False
                        elif maxim > k_max and not flag_pos:
                            iterate = True
                            while iterate:
                                T[j][i+1] += 1
                                tht_tmp = 0 + T[j][i] + (T[j][i+1] - T[j][i]) * x_tmp / x_tmp[-1]
                                k_tmp = (T[j][i+1] - T[j][i]) * deg2rad / x_tmp[-1] * np.cos(tht_tmp * deg2rad)
                                max_k = abs(np.min(k_tmp))
                                if max_k < k_max:
                                    iterate = False
                        tht = np.concatenate([tht, tht_tmp])
                    # store the angles only once all the pieces are assembled
                    theta[j, :] = tht
                # write the files once per ply, after all samples are processed
                T_df = pd.DataFrame(T, columns=features)
                T_df.to_csv(directory + '/' + act_set + '/' + 'ply' + str(ply) + '.csv', index=False, float_format='%.4f')
                theta_df = pd.DataFrame(theta, columns=col)
                theta_df.to_csv(directory + '/' + act_set + '/' + 'theta' + str(ply) + '.csv', index=False, float_format='%.4f')
                    
elif angles_fct == 'constant':
    for act_set in sets:
        for ply in range(1, eff_plies + 1):
            np.random.seed(3 + ply) # for reproducibility
            design = pyDOE.lhs(len(features), smpls[act_set][0])
            x = design * interval + min_int
            x_df = pd.DataFrame(x, columns=features)
            x_df.to_csv(directory + '/' + act_set + '/' + 'ply' + str(ply) + '.csv', index=False, float_format='%.2f')
            # the same angle is repeated in every element
            theta = np.repeat(x, x_elems, axis=1)
            theta_df = pd.DataFrame(theta, columns=col)
            theta_df.to_csv(directory + '/' + act_set + '/' + 'theta' + str(ply) + '.csv', index=False, float_format='%.2f')
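
For the linear formulation, the cell above uses the curvature $\kappa = \frac{d\vartheta}{dx} \cos\vartheta$ on each piece, with the angle varying linearly between the control values $T_i$. The sketch below illustrates the same two ingredients in a compact form. It is an alternative written for illustration, not the construction used in the cell: the function names are hypothetical, the control points are assumed to be equally spaced along the height, and the angles are evaluated at the element midpoints with `np.interp`.

```python
import numpy as np

def piecewise_linear_angles(T, height, n_elems):
    """Fiber angle [deg] at the element midpoints for control angles T [deg]."""
    T = np.asarray(T, dtype=float)
    stations = np.linspace(0.0, height, T.size)  # assumed control-point positions
    mesh = height / n_elems
    x_mid = np.linspace(mesh / 2, height - mesh / 2, n_elems)
    return np.interp(x_mid, stations, T)

def max_segment_curvature(T, height, n=200):
    """Largest |dtheta/dx * cos(theta)| over all linear pieces."""
    T = np.radians(np.asarray(T, dtype=float))
    seg_len = height / (T.size - 1)
    s = np.linspace(0.0, 1.0, n)  # normalized coordinate inside a piece
    largest = 0.0
    for t0, t1 in zip(T[:-1], T[1:]):
        theta = t0 + (t1 - t0) * s
        k = (t1 - t0) / seg_len * np.cos(theta)
        largest = max(largest, float(np.max(np.abs(k))))
    return largest

angles = piecewise_linear_angles([0., 45., 0.], height=705, n_elems=70)
k = max_segment_curvature([0., 45., 0.], height=705)
print(angles[:3], k > 1 / 635)
```

With two pieces and control angles of 0, 45 and 0 degrees, the largest curvature occurs where the angle is zero and exceeds the limit of 1/635, so this combination would need its control angles adjusted.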

At this point we can also obtain the local angle for each ply in each element of the mesh. Remember that the continuous variation of the local angle of the fiber is approximated as piecewise constant inside each element. Let's first define the function that, given the input parameters of the fiber shapes, returns the local angle inside each element.

In [12]:
def harmlin(height, mesh_size, amplitude, phase_shift, omega, beta):
    # Element midpoints along the cylinder axis (one angle per element)
    n_elems = int(height / mesh_size)
    x = np.linspace(mesh_size / 2, height - mesh_size / 2, n_elems)
    # Slope of the path y = amplitude * sin(w * x + phase_shift) + x * tan(beta)
    w = omega * (2 * np.pi / height)
    dy = w * amplitude * np.cos(w * x + phase_shift) + np.tan(beta)
    # Local fiber angle in degrees
    return np.degrees(np.arctan(dy))

Now we can calculate the fiber angles in each element.

In [13]:
if angles_fct == 'harmlin':
    for act_set in sets:
        for ply in range(1, eff_plies + 1):
            file_name = directory + '/' + act_set + '/ply' + str(ply) + '.csv'
            x = pd.read_csv(file_name, names=features, sep=",", skiprows=1)
            # One row of element angles per sample (phase shift and beta in radians)
            tht = np.array([harmlin(h, mesh_size, a, phi * deg2rad, om, b * deg2rad)
                            for a, phi, om, b in x.values])
            tht_df = pd.DataFrame(tht, columns=col)
            tht_df.to_csv(directory + '/' + act_set + '/' + 'theta' + str(ply) + '.csv', index=False, float_format='%.3f')

In [14]:
#!zip -r dataset.zip dataset
#files.download("dataset.zip")