<a href="https://colab.research.google.com/github/CsonVass/raman/blob/main/DissolutionFromRaman.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

<div class="markdown-google-sans">

# <strong>Preparation</strong>
</div>


<div class="markdown-google-sans">

## <strong>Importing the required libraries</strong>
</div>


In [127]:
import math
import matplotlib.pyplot as plt
import numpy as np
import os
import pandas as pd
import tensorflow as tf
from matplotlib.backends.backend_pdf import PdfPages
from sklearn.model_selection import train_test_split
from tensorflow.keras import layers

<div class="markdown-google-sans">

## <strong>Importing Raman maps and dissolution curves from Google Drive</strong>
</div>

* <u>map_curve_pairs</u>: Dictionary that pairs each Raman map with its dissolution curve
* <u>missing</u>: List that collects data that is missing or in the wrong format. These items are not used at all (currently nothing ends up here)
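
As a rough illustration of the target structure, the sketch below builds one entry of `map_curve_pairs` from plain Python lists. The key `"A1"`, the placeholder values, and the helper `rows_from_flat` are hypothetical; the 31x31 map size and the 37-point curve length come from the loading code in this notebook, which stores tensors rather than lists.

```python
MAP_SIZE = 31       # maps are 31x31 pixels
CURVE_POINTS = 37   # each dissolution curve has 37 points

def rows_from_flat(flat, size=MAP_SIZE):
    """Hypothetical helper: split a flat, row-major list into `size` rows."""
    if len(flat) != size * size:
        raise ValueError(f"expected {size * size} values, got {len(flat)}")
    return [flat[i:i + size] for i in range(0, size * size, size)]

# Placeholder intensities standing in for a map stored as a single row
flat_map = [float(v) for v in range(MAP_SIZE * MAP_SIZE)]

map_curve_pairs_example = {
    "A1": {  # hypothetical tablet name
        "map": rows_from_flat(flat_map),
        "curve": [0.0] * CURVE_POINTS,  # placeholder dissolution values
    }
}

entry = map_curve_pairs_example["A1"]
print(len(entry["map"]), len(entry["map"][0]), len(entry["curve"]))
```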

In [14]:
map_curve_pairs = {}
missing = []
path_to_hpmc = '/content/drive/MyDrive/Colab Notebooks/Raman/maps'

dir_list = os.listdir(path_to_hpmc)
for f in dir_list:
  path_to_map = path_to_hpmc + '/' + f
  # The map name is the second '_'-separated token of the part before the first '-'
  map_name = f.split('-')[0].split('_')[1]
  map_curve_pairs[map_name] = {}
  map_csv = pd.read_csv(path_to_map, header=None)
  # A map is stored either as 31 rows x 31 columns or as a single row of 31*31 values;
  # both layouts hold 961 values in row-major order, so one reshape covers them
  map_values = map_csv.to_numpy().reshape(31, 31)
  map_curve_pairs[map_name]['map'] = tf.convert_to_tensor(map_values, tf.float32)

# Row 0 of the CSV holds the tablet names, column 0 the time points, and curve values start at row 4
curves_csv = pd.read_csv('/content/drive/MyDrive/Colab Notebooks/Raman/Tablettak_minden_adat.csv', header=None)
names = curves_csv.iloc[0:1, 1:]
time = curves_csv.iloc[4:, 0:1]
# The values use a decimal comma; switch to a decimal point so they can be parsed as floats
curve_points = curves_csv.iloc[4:, 1:].apply(lambda x: x.str.replace(',', '.'))
for i in range(len(curve_points.columns)):
  name = names.iloc[0, i]
  curve_i = curve_points.iloc[:, i:i+1]
  if name in map_curve_pairs:
    map_curve_pairs[name]['curve'] = tf.reshape(tf.convert_to_tensor(curve_i, tf.float32), [1, 37])
  else:
    # No Raman map was loaded for this curve
    missing.append(name)

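# Drop entries that lack a curve or whose map does not have 31 rows;
# iterate over a copy of the keys because the dictionary is modified in the loop
for name in list(map_curve_pairs):
  if 'curve' not in map_curve_pairs[name] or len(map_curve_pairs[name]['map']) != 31:
    missing.append(name)
    map_curve_pairs.pop(name)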

<div class="markdown-google-sans">

## <strong>Splitting the loaded data into training and test sets</strong>
</div>

The curve values are normalized from the 0-100 range to the 0-1 range.
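
A minimal sketch of this scaling and its inverse, using plain Python lists instead of the tensors in this notebook. The sample curve values and the function names `to_unit_range` and `to_percent` are made up for illustration.

```python
def to_unit_range(curve_percent):
    """Scale dissolution values from percent (0-100) to fractions (0-1)."""
    return [v / 100 for v in curve_percent]

def to_percent(curve_unit):
    """Inverse scaling, applied before plotting or scoring predictions."""
    return [v * 100 for v in curve_unit]

curve = [0.0, 12.5, 50.0, 100.0]  # hypothetical measured values in percent
scaled = to_unit_range(curve)
restored = to_percent(scaled)
print(scaled)
print(restored)
```

Training on the 0-1 range keeps the targets on a scale similar to the network outputs; the predictions are scaled back to 0-100 before the accuracy is computed.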

In [15]:
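x = []
y = []

# Sort the keys so that the sample order is the same on every run
for key in sorted(map_curve_pairs):
  x.append(map_curve_pairs[key]['map'])
  y.append(map_curve_pairs[key]['curve'])

# Scale the curves from percent (0-100) to fractions (0-1)
y = [y_ / 100 for y_ in y]

# Default split is 75% train / 25% test; in this run that gave 108 training and 36 test samples
x_train, x_test, y_train, y_test = train_test_split(x, y)

x_train = np.array(x_train)
x_test = np.array(x_test)

# Each curve tensor has shape (1, 37); flatten to (n_samples, 37) without hard-coding the sample count
y_train = np.array(y_train).reshape((-1, 37))
y_test = np.array(y_test).reshape((-1, 37))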

<div class="markdown-google-sans">

# <strong>Neural networks</strong>
</div>

The networks are built with the Sequential API, which is more intuitive than the Functional API but less customizable.
The input layer of each network takes the 31x31 pixels of a Raman map, and the output layer produces the points of the dissolution curve.

Networks can be imported from Google Drive, new ones can be created, and they can be saved back to Google Drive.

<div class="markdown-google-sans">

## <strong>Helper functions for evaluation</strong>
</div>

* <u>model_loss_value</u>: Returns the loss value of the model
* <u>accuracy_for_one</u>: Returns the accuracy of the model on a single test case
* <u>accuracy_for_all</u>: Returns the mean accuracy over all test cases
* <u>visualize_results</u>: Plots the expected (blue) and predicted (red) results on top of each other
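
The accuracy computed by `accuracy_for_one` has the form of the f2 similarity factor commonly used to compare dissolution profiles: f2 = 50 · log10(100 / sqrt(1 + (1/n) · Σ (R_t − T_t)²)), where R_t and T_t are the measured and predicted values in percent. The sketch below is a standalone restatement under that reading; the function name `f2_similarity` and the sample curves are illustrative and not part of the notebook.

```python
import math

def f2_similarity(reference, predicted):
    """f2 similarity factor for two equally long curves given in percent."""
    if not reference or len(reference) != len(predicted):
        raise ValueError("curves must be non-empty and of equal length")
    mean_sq_diff = sum((r - p) ** 2 for r, p in zip(reference, predicted)) / len(reference)
    return 50 * math.log10(100 / math.sqrt(1 + mean_sq_diff))

measured = [10.0, 40.0, 70.0, 95.0]  # hypothetical curve in percent
identical = f2_similarity(measured, measured)
shifted = f2_similarity(measured, [v + 10 for v in measured])
print(identical, shifted)
```

Identical curves score 100, and a constant 10 percentage-point offset lands just below 50, which is why values around 50 are usually read as the boundary between similar and dissimilar profiles.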

In [157]:
def model_loss_value(model, x_, y_):
  score = model.evaluate(x_, y_, verbose=0)
  return score

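def accuracy_for_one(test_, pre_):
  # f2 similarity factor: 50 * log10(100 / sqrt(1 + mean squared difference)),
  # with both curves given in percent (0-100)
  s = 0
  for i in range(len(pre_)):
    s = s + (test_[i] - pre_[i]) ** 2

  f2 = 50 * math.log10(100 / math.sqrt(1 + s / len(pre_)))
  return f2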

def accuracy_for_all(test_, pre_):
  f = 0
  for i in range(len(pre_)):
    f = f + accuracy_for_one(test_[i], pre_[i])
  
  f2 = f / len(pre_)

  return f2

def visualize_results(time_, test_, pre_):
  x1 = np.array(time_).flatten()
  for t in range(len(test_)):  
    y1 = test_[t]
    if t % 2 == 0:
      plt.figure()
    plt.subplot(1, 2, t%2+1)

    plt.tick_params(  
      which='both',      
      bottom=True,      
      top=False,         
      left = True,
      right = False,
      labelbottom=True,
      labelleft=True,
      labelsize=5)
        
    plt.title('Accuracy')
    plt.ylabel('dissolution')
    plt.xlabel('time')
    plt.plot(x1, y1)
    plt.plot(x1, pre_[t], color="red")
    plt.legend(['measured', 'predicted'], loc='upper left')
    plt.grid()
    plt.xticks(x1[0::5])

def save_results_graph(time_, test_, pre_, fileName):
  pp = PdfPages("drive/MyDrive/Colab Notebooks/saved_model/" + fileName + "/results.pdf") 
  x1 = np.array(time_).flatten()
  for t in range(len(test_)):  
    y1 = test_[t]
    if t % 2 == 0:
      plt.figure()
    plt.subplot(1, 2, t%2+1)

    plt.tick_params(  
      which='both',      
      bottom=True,      
      top=False,         
      left = True,
      right = False,
      labelbottom=True,
      labelleft=True,
      labelsize=5)
        
    plt.title('Accuracy')
    plt.ylabel('dissolution')
    plt.xlabel('time')
    plt.plot(x1, y1)
    plt.plot(x1, pre_[t], color="red")
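    plt.legend(['measured', 'predicted'], loc='upper left')
    plt.grid()
    plt.xticks(x1[0::5])
    # Save after the second subplot, and also after the last curve when the count is odd
    if t % 2 == 1 or t == len(test_) - 1:
      plt.savefig(pp, format='pdf')
  pp.close()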

def visualize_loss(history):
  plt.plot(history.history['loss'])
  plt.plot(history.history['val_loss'])
  plt.title('model loss')
  plt.ylabel('loss')
  plt.xlabel('epoch')
  plt.legend(['train', 'validation'], loc='upper left')
  plt.grid()

def save_loss_graph(history, fileName):
  visualize_loss(history)
  plt.savefig("drive/MyDrive/Colab Notebooks/saved_model/" + fileName + "/loss.png")
  

<div class="markdown-google-sans">

## <strong>Model setup</strong>
</div>


<div class="markdown-google-sans">

### <strong>Importing from Google Drive</strong>
</div>

In [85]:
model_name = input('Enter a name for the model: ')
path_to_model = 'drive/MyDrive/Colab Notebooks/saved_model/' + model_name + '/model'
model = tf.keras.models.load_model(path_to_model)

Enter a name for the model: cnn_1


<div class="markdown-google-sans">

### <strong>Building the model</strong>
</div>

<div class="markdown-google-sans">

#### <strong>CNN 1</strong>
</div>

In [None]:
model = tf.keras.models.Sequential()
# Two Conv2D + MaxPool2D stages; only the first layer needs input_shape
model.add(layers.Conv2D(input_shape=(31, 31, 1), filters=64, kernel_size=(3, 3), activation="relu"))
model.add(layers.MaxPool2D((2,2), strides=(2,2)))
model.add(layers.Conv2D(filters=64, kernel_size=(3, 3), activation="relu"))
model.add(layers.MaxPool2D((2,2), strides=(2,2)))
model.add(layers.Flatten())
model.add(layers.Dropout(0.5))
# One output unit per point of the dissolution curve
model.add(layers.Dense(37, activation="relu"))

model.compile(optimizer="adam",
              loss="mean_squared_logarithmic_error",
              metrics=[])
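
The feature-map sizes in CNN 1 can be traced by hand. The sketch below assumes the Keras defaults that the cell above does not override: `padding="valid"` and stride 1 for `Conv2D`, and `padding="valid"` for `MaxPool2D`. The helper names `conv_out` and `pool_out` are illustrative.

```python
def conv_out(size, kernel, stride=1):
    """Output length of a 'valid' convolution along one axis."""
    return (size - kernel) // stride + 1

def pool_out(size, pool, stride):
    """Output length of a 'valid' max-pooling along one axis."""
    return (size - pool) // stride + 1

size = 31          # input maps are 31x31
filters = 64       # filters in the last Conv2D layer
trace = [size]
for _ in range(2):                           # two Conv2D + MaxPool2D stages
    size = conv_out(size, kernel=3)
    trace.append(size)
    size = pool_out(size, pool=2, stride=2)
    trace.append(size)

flattened = size * size * filters            # features after Flatten
dense_params = flattened * 37 + 37           # weights + biases of the output layer
print(trace, flattened, dense_params)
```

Under these assumptions the side length goes 31, 29, 14, 12, 6, so `Flatten` yields 6 · 6 · 64 = 2304 features feeding the 37-unit output layer.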

<div class="markdown-google-sans">

#### <strong>Dense 1</strong>
</div>

In [77]:
model = tf.keras.models.Sequential()
model.add(layers.Flatten())
model.add(layers.Dense(50, activation="relu"))
model.add(layers.Dense(50, activation="relu"))
model.add(layers.Dense(50, activation="relu"))
model.add(layers.Dropout(0.5))
model.add(layers.Dense(37, activation="relu"))

model.compile(optimizer="adam",
              loss="mean_squared_logarithmic_error",
              metrics=[])

<div class="markdown-google-sans">

## <strong>Training</strong>
</div>

<div class="markdown-google-sans">

### <strong>Training the model</strong>
</div>


In [86]:
history = model.fit(x_train, y_train, validation_split=0.33, batch_size=10, epochs=100,
          verbose=0)
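
`validation_split` holds out the last fraction of the training arrays, taken before any shuffling, and trains on the rest. The sketch below reproduces the arithmetic for the 108 training samples of this run. The rounding rule (floor of `n * (1 - validation_split)`) is an assumption about how Keras computes the split index, so treat the counts as approximate; `split_counts` is a hypothetical helper.

```python
import math

def split_counts(n_samples, validation_split):
    """Return (training count, validation count) for a tail hold-out split."""
    split_at = int(math.floor(n_samples * (1.0 - validation_split)))
    return split_at, n_samples - split_at

n_fit, n_val = split_counts(108, 0.33)
steps_per_epoch = math.ceil(n_fit / 10)   # batch_size=10 as in the fit call
print(n_fit, n_val, steps_per_epoch)
```

With so few samples left for validation, the validation loss curve can be noisy from epoch to epoch.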

<div class="markdown-google-sans">

### <strong>Saving the model to Google Drive</strong>
</div>


In [99]:
model_name = input('Enter a name for the model: ')
path_to_model = 'drive/MyDrive/Colab Notebooks/saved_model/' + model_name + '/model'
model.save(path_to_model) 

Enter a name for the model: test




<div class="markdown-google-sans">

## <strong>Evaluation</strong>
</div>


<div class="markdown-google-sans">

### <strong>Computing the network's loss on the test data</strong>
</div>

In [160]:
train_loss = model_loss_value(model, x_train, y_train) 
test_loss = model_loss_value(model, x_test, y_test)

print("Train loss: {0}".format(train_loss))
print("Test loss: {0}".format(test_loss))

Train loss: 0.004128384403884411
Test loss: 0.008509163744747639


<div class="markdown-google-sans">

### <strong>Rescaling the model results to the 0-100 range</strong>
</div>

In [None]:
pre = model.predict(x_test)
y_test100 = [y_ * 100 for y_ in y_test]
pre100 = [p * 100 for p in pre]

<div class="markdown-google-sans">

### <strong>Visualizing the training and validation loss</strong>
</div>


In [None]:
visualize_loss(history)

<div class="markdown-google-sans">

### <strong>Visualizing the dissolution curves</strong>
</div>


In [None]:
visualize_results(time, y_test100, pre100)

<div class="markdown-google-sans">

### <strong>Computing the accuracy</strong>
</div>


In [90]:
accuracy_for_all(y_test100, pre100)

46.77552079339473

<div class="markdown-google-sans">

## <strong>Saving statistics</strong>
</div>


<div class="markdown-google-sans">

### <strong>Curves</strong>
</div>


In [None]:
# Fall back to 'base' when no model name was entered earlier
if not globals().get('model_name'):
  model_name = 'base'
save_results_graph(time, y_test100, pre100, model_name)

In [None]:
# Fall back to 'base' when no model name was entered earlier
if not globals().get('model_name'):
  model_name = 'base'
save_loss_graph(history, model_name)

<div class="markdown-google-sans">

### <strong>Values</strong>
</div>


In [162]:
# Fall back to 'base' when no model name was entered earlier
if not globals().get('model_name'):
  model_name = 'base'
with open('drive/MyDrive/Colab Notebooks/saved_model/' + model_name + '/details.txt', 'w') as writefile:
    writefile.write("Accuracy (F2): {0}".format(accuracy_for_all(y_test100, pre100)) + "\n")
    writefile.write("Train loss: {0}".format(model_loss_value(model, x_train, y_train)) + "\n")
    writefile.write("Test loss: {0}".format(model_loss_value(model, x_test, y_test)))