# Demo of `Z3` on a Materials Science Use Case

## Load Data

We use the same dataset as in the paper "Data-driven exploration and continuum modeling of dislocation
networks" (submitted to the MSMSE journal).

In [1]:
import pandas as pd # data frame
import random # constraint generation
from z3 import * # solver

dataset = pd.read_csv('C:/MyData/Versetzungsdaten/delta_sampled_merged_last_voxel_data_size2400_order2_speedUp2.csv')
dataset.drop(columns=list(dataset)[0], inplace=True) # drop 1st column (unnamed id column)

## Define Target and Features

We want to predict a certain type of reaction density summed over all slip systems.
All other physical quantities are used as features, i.e.,
we also include the same kind of reaction density measured in neighboring voxels.

In [2]:
target = 'rho_glissile'
dataset[target] = dataset[target + '_1']
for slip_system in range(2,13): # sum over slip systems
    dataset[target] = dataset[target] + dataset[target + '_' + str(slip_system)]
    
features = [x for x in list(dataset) if target not in x] # exclude if feature name contains the target string
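
The selection logic of the cell above can be checked on a toy list of column names. This is only a sketch: the column names below are hypothetical and merely mimic a `<quantity>_<slip system>` pattern, so the real dataset's columns may look different.

```python
target = 'rho_glissile'

# Hypothetical column names; the real dataset has many more columns.
columns = ['rho_glissile_1', 'rho_glissile_2', 'rho_lomer_1', 'velocity_1']

# Names of the per-slip-system columns that are summed into the target (12 slip systems).
target_parts = [f'{target}_{slip_system}' for slip_system in range(1, 13)]

# Substring test: drops every column whose name contains the target string.
features = [name for name in columns if target not in name]

print(target_parts[0], target_parts[-1])
print(features)
```

Because the filter is a substring match, it excludes every column whose name contains `rho_glissile`, not only the twelve columns that were summed.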

## Compute Feature Qualities

We use the absolute Pearson correlation of a feature with the prediction target as a measure for the feature's quality.
Rounding the qualities to two decimal places speeds up optimization considerably, because the solver uses ["infinite precision arithmetic by default"](http://theory.stanford.edu/~nikolaj/programmingz3.html#sec-solving-arithmetical-fragments) and represents real numbers as rational numbers.

In [3]:
featureQualities = [round(abs(dataset[x].corr(dataset[target])), 2) for x in features]
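
To get a feeling for why rounding helps, one can look at the exact rationals involved. The sketch below uses Python's `fractions.Fraction` as a stand-in for the solver's rational arithmetic and assumes that a Python float reaches the solver via its decimal string. The quality value is made up and not taken from the dataset.

```python
from fractions import Fraction

quality = 0.7318204915  # made-up correlation value, not taken from the dataset

exact = Fraction(str(quality))              # unrounded decimal as an exact rational
rounded = Fraction(str(round(quality, 2)))  # same value rounded to two decimal places

print('unrounded denominator:', exact.denominator)
print('rounded denominator:  ', rounded.denominator)
```

After rounding, every quality is a multiple of 1/100, so sums of qualities also keep a denominator of at most 100 instead of growing with the number of decimal digits.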

## Define Optimization Problem

The objective function sums the qualities of the selected features.
For the constraints, we use a simple generator that combines randomly chosen selection variables into logical conditions.
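
Written out, the problem has roughly the following form. The symbols are notation introduced here for illustration and do not appear in the source: $q_i$ is the rounded quality of feature $i$, $x_i$ its selection variable, $n$ the number of features, and $C$ the set of assignments that satisfy all generated constraints.

$$
\max_{x \in \{0,1\}^n} \; \sum_{i=1}^{n} q_i \, x_i \quad \text{subject to} \quad x \in C
$$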

In [4]:
selections = Bools(' '.join(['x' + str(i) for i in range(len(featureQualities))]))
optimizer = Optimize()
objective = optimizer.maximize(Sum([q * s for (q, s) in zip(featureQualities, selections)]))
random.seed(25) # Add some random constraints
for iteration in range(200):
    constraint_picker = random.random()
    if constraint_picker < 0.4:
        chosen_literals = random.sample(selections, k=2)
        optimizer.add(Xor(chosen_literals[0], chosen_literals[1])) # can only combine two literals
    else:
        num_literals = random.randint(1, 21) # can combine a set of literals
        chosen_literals = random.sample(selections, k=num_literals) # sample without replacement
        if constraint_picker < 0.55:
            optimizer.add(AtMost(*chosen_literals, random.randint(1, num_literals)))
        elif constraint_picker < 0.7:
            featureCosts = list(zip(chosen_literals, [random.randint(1, 21) for x in chosen_literals]))
            optimizer.add(PbLe(featureCosts, 10 * len(chosen_literals))) # sum of feature costs <= some_threshold
        elif constraint_picker < 0.85:
            optimizer.add(And(chosen_literals))
        else:
            optimizer.add(Or(chosen_literals))
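
To make the meaning of the five constraint types concrete, the following sketch checks one toy instance of each by brute force over five features. It uses plain Python rather than Z3, and the qualities, weights, and thresholds are made up. It illustrates what the constraints mean, not how the solver works internally.

```python
from itertools import product

qualities = [90, 50, 40, 30, 10]  # made-up qualities in hundredths (0.90, 0.50, ...)

def feasible(x):
    """Return True if the 0/1 assignment x satisfies all five toy constraints."""
    return (
        x[0] != x[1]                    # Xor(x0, x1): exactly one of the two
        and x[1] + x[2] + x[3] <= 2     # AtMost(x1, x2, x3, 2)
        and 3 * x[0] + 4 * x[2] <= 5    # PbLe([(x0, 3), (x2, 4)], 5)
        and x[4] == 1                   # And(x4): every listed literal is selected
        and x[2] + x[3] >= 1            # Or(x2, x3): at least one is selected
    )

def value(x):
    """Objective: summed quality of the selected features."""
    return sum(q * s for q, s in zip(qualities, x))

# Enumerate all 2^5 assignments and keep the best feasible one.
best = max((x for x in product((0, 1), repeat=len(qualities)) if feasible(x)), key=value)
print(best, value(best))  # prints (1, 0, 0, 1, 1) 130
```

Enumeration needs 2^n checks, which is only viable for a handful of variables. With thousands of features, as in this notebook, a solver is required.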

## Optimize

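Run the optimizer, then print the objective value, the number of selected features, the solver statistics, and the first five constraints.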

In [5]:
print('Satisfiable? ' + str(optimizer.check())) # runs the optimization
print('Objective value: ' + objective.value().as_decimal(prec = 0))
print('Value if selecting all features: ' + str(int(sum(featureQualities))))
print('Number of total features: ' + str(len(selections)))
print('Number of selected features: ' + str(sum([str(optimizer.model()[x]) == 'True' for x in selections])))
print('Optimizer statistics:')
print(optimizer.statistics())
print('First 5 constraints:')
for i in range(5):
    print(optimizer.assertions()[i])

Satisfiable? sat
Objective value: 1625.?
Value if selecting all features: 1656
Number of total features: 6389
Number of selected features: 6131
Optimizer statistics:
(:ba-conflicts            388
 :ba-cuts                 1
 :ba-lemmas               184
 :ba-propagations         92598
 :ba-resolves             1323
 :ba-subsumes             16
 :eliminated-vars         451
 :max-memory              411.63
 :maxres-cores            206
 :maxres-correction-sets  4
 :memory                  410.01
 :num-allocs              48901950353.00
 :rlimit-count            67402033
 :sat-backjumps           1316
 :sat-backtracks          42
 :sat-conflicts           1564
 :sat-decisions           6524247
 :sat-del-clause          31
 :sat-elim-literals       5
 :sat-minimized-lits      995
 :sat-mk-clause-2ary      2415
 :sat-mk-clause-3ary      1731
 :sat-mk-clause-nary      1441
 :sat-mk-var              8690
 :sat-probing-assigned    1
 :sat-propagations-2ary   871187
 :sat-propagations-3ary   3