## The aim 

Optimise the Prussian White (PW) synthesis with respect to the final pH and the timing of the acid addition, using particle size and morphology as the responses.

## How?

Create a simple 2x2 DoE with a mid point to find an optimum for the output parameters in this pH-space. If time allows, some follow-up synthesis experiments will be done to get closer to a global optimum.

## Setting up DoE


In [2]:
#Import all packages 
import pandas as pd
import numpy as np
from numpy.random import rand
import itertools
from matplotlib.pyplot import *
import scipy.stats as stats
import statsmodels.api as sm
import statsmodels.formula.api as smf

In [15]:
# create dictionary for parameters
input_labels = {
    'A' : 'pH',                 #End pH of synthesis
    'B' : '%t'                   #Time in % of the total addition time 
}

# create list of low, center and high values for each factor
data = [
    ('A',3.5,4,4.5),
    ('B',0,10,20),
]

# store the factor levels in a pandas dataframe, indexed by factor name
inputs_df = pd.DataFrame(data,columns=['index', 'low', 'center', 'high'])
inputs_df = inputs_df.set_index(['index'])
inputs_df['label'] = inputs_df.index.map( lambda z : input_labels[z] )

#print dataframe
inputs_df

index,low,center,high,label
A,3.5,4,4.5,pH
B,0.0,10,20.0,%t


In [16]:
#encode the raw data

# compute averages and span
inputs_df['average'] = inputs_df.apply( lambda z : ( z['high'] + z['low'])/2 , axis=1)
inputs_df['span'] = inputs_df.apply( lambda z : ( z['high'] - z['low'])/2 , axis=1)

# encode the data
inputs_df['encoded_low'] = inputs_df.apply( lambda z : ( z['low']  - z['average'] )/( z['span'] ), axis=1)
inputs_df['encoded_center'] = inputs_df.apply( lambda z : ( z['center'] - z['average'] )/( z['span'] ), axis=1)
inputs_df['encoded_high'] = inputs_df.apply( lambda z : ( z['high'] - z['average'] )/( z['span'] ), axis=1)

inputs_df = inputs_df.drop(['average','span'],axis=1)

inputs_df

index,low,center,high,label,encoded_low,encoded_center,encoded_high
A,3.5,4,4.5,pH,-1.0,0.0,1.0
B,0.0,10,20.0,%t,-1.0,0.0,1.0
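
The encoding used in the cell above can be written as a single formula: subtract the midpoint of the factor range and divide by half the range. The sketch below restates it as a plain function, without pandas, so the arithmetic is easy to check by hand. The function name `encode` and the `levels` dictionary are my own illustrative names, not part of the notebook.

```python
def encode(value, low, high):
    """Map a real factor value onto the coded -1..+1 scale."""
    center = (high + low) / 2      # midpoint of the factor range
    half_range = (high - low) / 2  # distance from the midpoint to either extreme
    return (value - center) / half_range

# Factor levels copied from the table above: (low, center, high)
levels = {"pH": (3.5, 4.0, 4.5), "%t": (0.0, 10.0, 20.0)}

# Encode all three levels of each factor
coded = {
    name: [encode(v, low, high) for v in (low, center, high)]
    for name, (low, center, high) in levels.items()
}
print(coded)  # {'pH': [-1.0, 0.0, 1.0], '%t': [-1.0, 0.0, 1.0]}
```

Both factors end up on the same -1, 0, +1 scale, which matches the `encoded_low`, `encoded_center` and `encoded_high` columns.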


In [18]:
#Create the design matrix for the experiment

encoded_inputs= list(itertools.product([-1,1],[-1,1]))
encoded_inputs

results=pd.DataFrame(encoded_inputs)
results=results[results.columns[::-1]]
results.columns=['A','B']
results.loc[len(results.index)] = [0,0]         #Add mid point to experiment
results

Unnamed: 0,A,B
0,-1,-1
1,1,-1
2,-1,1
3,1,1
4,0,0


In [27]:
#Translate the design matrix into the experimental matrix,
#which shows the experiments needed for a full factorial

real_experiment = results  # note: an alias, not a copy, so the new columns are added to results as well

var_labels = []
for var in ['A','B']:
    var_label = inputs_df.loc[var]['label']
    var_labels.append(var_label)
    real_experiment[var_label] = results.apply(
        lambda z : inputs_df.loc[var]['low'] if z[var]<0 else (inputs_df.loc[var]['high'] if z[var]>0 else inputs_df.loc[var]['center']), 
        axis=1)

print("The values of each real variable in the experiment are:")

results


The values of each real variable in the experiment are:


Unnamed: 0,A,B,pH,%t
0,-1,-1,3.5,0.0
1,1,-1,4.5,0.0
2,-1,1,3.5,20.0
3,1,1,4.5,20.0
4,0,0,4.0,10.0
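
The cell above picks low, center or high from the sign of the coded value. An equivalent way to get the same table, assuming the center level is the exact midpoint of low and high (true for both factors here), is to invert the encoding: real = midpoint + coded * half-range. The sketch below shows this. `decode` and `design` are illustrative names of my own.

```python
def decode(coded, low, high):
    """Map a coded value (-1..+1) back to the real factor value."""
    center = (high + low) / 2      # assumes the center level is the midpoint
    half_range = (high - low) / 2
    return center + coded * half_range

# Coded design: the four corner points plus the mid point, as (A, B)
design = [(-1, -1), (1, -1), (-1, 1), (1, 1), (0, 0)]

# Real settings: A is pH (3.5 to 4.5), B is %t (0 to 20)
runs = [(decode(a, 3.5, 4.5), decode(b, 0.0, 20.0)) for a, b in design]
for ph, t in runs:
    print(ph, t)
```

Unlike the sign-based lookup, this also handles intermediate coded values, which could be useful for follow-up experiments: `decode(0.5, 3.5, 4.5)` gives a pH of 4.25.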


## Experiment and database building

The experiments are now being carried out and the powder is being synthesised. All samples will be analysed with XRD and SEM. The SEM images will then be processed visually to get numerical values for the mean particle size and the size distribution. (How will I handle the SEM data: only save the means, or save all the raw data as well?)

The experiments will not be done in random order, for safety reasons. Otherwise, randomly generating the order of the syntheses would be preferable, since it reduces the bias that could affect the outcome.
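
For reference, had a random order been possible, it could have been generated along these lines. This is only a sketch using the standard library. The seed value is an arbitrary choice of mine, used so that the same order can be reproduced later.

```python
import random

run_ids = [0, 1, 2, 3, 4]   # row indices of the design matrix (4 corners + mid point)

rng = random.Random(42)     # fixed seed (arbitrary) makes the order reproducible
run_order = rng.sample(run_ids, k=len(run_ids))  # shuffled copy, original list untouched

print(run_order)
```

Each run appears exactly once in `run_order`, and the list can be stored together with the results to document the order in which the syntheses were done.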



A database in SQLite will be built to accommodate all my synthesis data. 
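
A possible layout, as a rough sketch only: one table for the synthesis settings and one for the individual particle measurements from SEM, so that the raw data is kept and the means are computed by a query when needed. All table and column names below are hypothetical, the database lives in memory, and the three diameters are dummy placeholder values, not measurements.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory database, for illustration only

# Hypothetical schema: one row per synthesis, one row per measured particle
conn.executescript("""
CREATE TABLE synthesis (
    id                INTEGER PRIMARY KEY,
    final_ph          REAL NOT NULL,
    addition_time_pct REAL NOT NULL
);
CREATE TABLE sem_particle (
    id           INTEGER PRIMARY KEY,
    synthesis_id INTEGER NOT NULL REFERENCES synthesis(id),
    diameter_um  REAL NOT NULL
);
""")

# Mid point settings from the design, with dummy placeholder diameters
conn.execute(
    "INSERT INTO synthesis (id, final_ph, addition_time_pct) VALUES (1, 4.0, 10.0)"
)
conn.executemany(
    "INSERT INTO sem_particle (synthesis_id, diameter_um) VALUES (?, ?)",
    [(1, 1.0), (1, 2.0), (1, 3.0)],
)

# Mean particle size per synthesis, computed from the raw rows
row = conn.execute("""
    SELECT s.final_ph, s.addition_time_pct, COUNT(p.id), AVG(p.diameter_um)
    FROM synthesis AS s
    JOIN sem_particle AS p ON p.synthesis_id = s.id
    GROUP BY s.id
""").fetchone()
print(row)  # (4.0, 10.0, 3, 2.0)

conn.close()
```

Keeping the raw particle sizes means the distribution can be recomputed later, for example with a different statistic, without going back to the SEM images.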

In [None]:
#Add experimental data from database or from dataframe

