# Creating a Keras model

## Keras model building steps
* Specify Architecture
    * How many layers?
    * How many nodes in each layer?
    * What activation function to use in each layer?
        * ReLU - Rectified Linear Unit
        * Identity function
        * Hyperbolic Tangent
* Compile
    * Specify loss function
    * Optimization parameters
* Fit
* Predict
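
As a rough illustration of the three activation functions listed above, here is a minimal NumPy sketch. The function names `relu`, `identity`, and `tanh` are chosen here for illustration only; Keras ships its own implementations, selected through the `activation` argument.

```python
import numpy as np

def relu(x):
    """Rectified Linear Unit: negative inputs become 0, positive inputs pass through."""
    return np.maximum(0.0, x)

def identity(x):
    """Identity: returns the input unchanged."""
    return x

def tanh(x):
    """Hyperbolic tangent: squashes inputs into the range (-1, 1)."""
    return np.tanh(x)

x = np.array([-2.0, 0.0, 3.0])
print(relu(x))
print(identity(x))
print(tanh(x))
```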

## Model specification

In [1]:
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '1'
import tensorflow as tf
print(f"{tf.config.list_physical_devices('GPU') = }")

tf.config.list_physical_devices('GPU') = [PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]


In [2]:
import numpy as np
# Read data: column 0 is the target, columns 1-9 are the predictors
predictors = np.loadtxt('hourly_wages.csv', delimiter=',', skiprows=1, usecols=range(1,10))
target = np.loadtxt('hourly_wages.csv', delimiter=',', skiprows=1, usecols=0)
n_cols = predictors.shape[1]
predictors.shape, target.shape

((534, 9), (534,))

* There are two ways to build a model; we will focus on `Sequential`, which is the easier of the two.
* Sequential models require that each layer has weights or connections only to the one layer coming directly after it in the network diagram.
* There are more exotic models out there with complex patterns of connections, but Sequential will do the trick for everything we need here.

In [3]:
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential

# easier way to create a model
model = Sequential()
model.add(Dense(100, activation='relu', input_shape=(n_cols,)))
model.add(Dense(100, activation='relu'))
model.add(Dense(1))

* We start adding layers using the add method of the model.
* The standard layer type you have seen so far is called a Dense layer. It is called Dense because every node in the previous layer connects to every node in the current layer.
* As you advance in deep learning, you may start using layers that aren't Dense.
* In each layer, we specify the number of nodes as the first positional argument, and the activation function we want to use in that layer using the keyword argument activation.
* Keras supports the activation functions you are likely to need in practice.
* In the first layer, we need to specify the input shape, as shown here. `(n_cols,)` says each input has n_cols columns; nothing follows the comma, meaning the input can have any number of rows, that is, any number of data points.
* You'll notice the last layer has 1 node. That is the output layer, and it matches those diagrams where we ended with only a single node as the output or prediction of the model.
* This model has 2 hidden layers, and an output layer.
* You may be struck that each hidden layer has 100 nodes. Keras and TensorFlow do the math for us, so don't be afraid to use much bigger networks than we've seen before. It's quite common to use hundreds or thousands of nodes in a layer.
* You'll learn more about choosing an appropriate number of nodes later. 
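
To make "every node connects to every node" concrete, here is a minimal NumPy sketch of the computation a Dense layer performs, `activation(x @ weights + bias)`, stacked in the same 9 -> 100 -> 100 -> 1 shape as the model above. The helper `dense`, the random weights, and the 5 made-up data points are assumptions for illustration; this is not how Keras initializes or stores its parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(0.0, z)

def dense(x, weights, bias, activation=None):
    """Compute one fully connected layer: activation(x @ weights + bias)."""
    z = x @ weights + bias
    return activation(z) if activation is not None else z

n_rows, n_cols = 5, 9  # 5 made-up data points with 9 predictors each
x = rng.normal(size=(n_rows, n_cols))

# Random parameters shaped like the model above: 9 -> 100 -> 100 -> 1
w1, b1 = rng.normal(size=(n_cols, 100)), np.zeros(100)
w2, b2 = rng.normal(size=(100, 100)), np.zeros(100)
w3, b3 = rng.normal(size=(100, 1)), np.zeros(1)

h1 = dense(x, w1, b1, relu)   # first hidden layer
h2 = dense(h1, w2, b2, relu)  # second hidden layer
out = dense(h2, w3, b3)       # output layer, no activation

print(h1.shape, h2.shape, out.shape)  # (5, 100) (5, 100) (5, 1)
```

The number of rows passes through every layer unchanged, which is why the input shape only needs to name the number of columns.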

In [4]:
model.summary()

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense (Dense)               (None, 100)               1000      
                                                                 
 dense_1 (Dense)             (None, 100)               10100     
                                                                 
 dense_2 (Dense)             (None, 1)                 101       
                                                                 
Total params: 11,201
Trainable params: 11,201
Non-trainable params: 0
_________________________________________________________________
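
The `Param #` column can be checked by hand: a Dense layer has one weight per input-unit pair plus one bias per unit. A small sketch of that arithmetic (the helper name `dense_params` is made up for this example):

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

layer_sizes = [9, 100, 100, 1]  # 9 predictors, two hidden layers, one output node
counts = [dense_params(a, b) for a, b in zip(layer_sizes[:-1], layer_sizes[1:])]
print(counts, sum(counts))  # [1000, 10100, 101] 11201
```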


# Compiling and fitting a model

## Why you need to compile your model
* Specify the optimizer
    * Many options and mathematically complex
    * "Adam" is usually a good choice: **it adapts the learning rate as it does gradient descent**
* Loss function
    * "mean_squared_error" common for regression

## Compiling a model

In [5]:
model.compile(optimizer='adam', loss='mean_squared_error')
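
For reference, the quantity that `'mean_squared_error'` asks the optimizer to minimize can be sketched in NumPy as follows. The wage values are made up for illustration.

```python
import numpy as np

def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean((y_true - y_pred) ** 2)

# Made-up hourly wages and predictions, for illustration only
y_true = [5.0, 10.0, 15.0]
y_pred = [6.0, 10.0, 12.0]
mse = mean_squared_error(y_true, y_pred)
print(mse)  # (1 + 0 + 9) / 3
```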

## What is fitting a model
* Applying backpropagation and gradient descent with your data to update the weights
* Scaling data before fitting can ease optimization
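
One common way to scale data is standardization: subtract each column's mean and divide by its standard deviation. The sketch below uses a hypothetical helper `standardize` and made-up predictor values; it is one option among several, not a step the notebook itself performs.

```python
import numpy as np

def standardize(x):
    """Scale each column to zero mean and unit standard deviation."""
    mean = x.mean(axis=0)
    std = x.std(axis=0)
    std[std == 0] = 1.0  # avoid dividing by zero; constant columns become all zeros
    return (x - mean) / std

# Made-up predictors on very different scales
x = np.array([[12.0, 0.0, 35.0],
              [16.0, 1.0, 52.0],
              [ 8.0, 0.0, 21.0],
              [18.0, 1.0, 44.0]])
x_scaled = standardize(x)
print(x_scaled.mean(axis=0).round(6))
print(x_scaled.std(axis=0).round(6))
```

When a separate test set is used, the mean and standard deviation are normally computed on the training data only and then reused for the test data.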

## Fitting a model

In [6]:
model.fit(predictors, target)

2023-09-02 16:14:06.812394: W tensorflow/compiler/xla/service/gpu/llvm_gpu_backend/gpu_backend_lib.cc:530] Can't find libdevice directory ${CUDA_DIR}/nvvm/libdevice. This may result in compilation or runtime failures, if the program we try to run uses routines from libdevice.
Searched for CUDA in the following directories:
  ./cuda_sdk_lib
  /usr/local/cuda-11.2
  /usr/local/cuda
  .
You can choose the search directory by setting xla_gpu_cuda_data_dir in HloModule's DebugOptions.  For most apps, setting the environment variable XLA_FLAGS=--xla_gpu_cuda_data_dir=/path/to/cuda will work.
2023-09-02 16:14:06.813956: W tensorflow/compiler/xla/service/gpu/llvm_gpu_backend/gpu_backend_lib.cc:274] libdevice is required by this HLO module but was not found at ./libdevice.10.bc
2023-09-02 16:14:06.814176: W tensorflow/core/framework/op_kernel.cc:1830] OP_REQUIRES failed at xla_ops.cc:362 : INTERNAL: libdevice not found at ./libdevice.10.bc
2023-09-02 16:14:06.834788: W tensorflow/compiler/xla/s

InternalError: Graph execution error:

Detected at node 'StatefulPartitionedCall_4' defined at (most recent call last):
    ...
    File "/tmp/ipykernel_22982/1750234257.py", line 1, in <module>
      model.fit(predictors, target)
    File "/home/mauricio/miniconda3/envs/dev/lib/python3.10/site-packages/keras/utils/traceback_utils.py", line 65, in error_handler
      return fn(*args, **kwargs)
    File "/home/mauricio/miniconda3/envs/dev/lib/python3.10/site-packages/keras/engine/training.py", line 1685, in fit
      tmp_logs = self.train_function(iterator)
    File "/home/mauricio/miniconda3/envs/dev/lib/python3.10/site-packages/keras/engine/training.py", line 1284, in train_function
      return step_function(self, iterator)
    File "/home/mauricio/miniconda3/envs/dev/lib/python3.10/site-packages/keras/engine/training.py", line 1268, in step_function
      outputs = model.distribute_strategy.run(run_step, args=(data,))
    File "/home/mauricio/miniconda3/envs/dev/lib/python3.10/site-packages/keras/engine/training.py", line 1249, in run_step
      outputs = model.train_step(data)
    File "/home/mauricio/miniconda3/envs/dev/lib/python3.10/site-packages/keras/engine/training.py", line 1054, in train_step
      self.optimizer.minimize(loss, self.trainable_variables, tape=tape)
    File "/home/mauricio/miniconda3/envs/dev/lib/python3.10/site-packages/keras/optimizers/optimizer.py", line 543, in minimize
      self.apply_gradients(grads_and_vars)
    File "/home/mauricio/miniconda3/envs/dev/lib/python3.10/site-packages/keras/optimizers/optimizer.py", line 1174, in apply_gradients
      return super().apply_gradients(grads_and_vars, name=name)
    File "/home/mauricio/miniconda3/envs/dev/lib/python3.10/site-packages/keras/optimizers/optimizer.py", line 650, in apply_gradients
      iteration = self._internal_apply_gradients(grads_and_vars)
    File "/home/mauricio/miniconda3/envs/dev/lib/python3.10/site-packages/keras/optimizers/optimizer.py", line 1200, in _internal_apply_gradients
      return tf.__internal__.distribute.interim.maybe_merge_call(
    File "/home/mauricio/miniconda3/envs/dev/lib/python3.10/site-packages/keras/optimizers/optimizer.py", line 1250, in _distributed_apply_gradients_fn
      distribution.extended.update(
    File "/home/mauricio/miniconda3/envs/dev/lib/python3.10/site-packages/keras/optimizers/optimizer.py", line 1245, in apply_grad_to_update_var
      return self._update_step_xla(grad, var, id(self._var_key(var)))
Node: 'StatefulPartitionedCall_4'
libdevice not found at ./libdevice.10.bc
	 [[{{node StatefulPartitionedCall_4}}]] [Op:__inference_train_function_880]
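
The warning in the log above suggests pointing XLA at the CUDA directory through the `XLA_FLAGS` environment variable. A minimal sketch of that suggestion follows. The path `/usr/local/cuda` is an assumption (it is one of the directories the log says were searched); replace it with the directory on your machine that actually contains `nvvm/libdevice`. Whether this resolves the error depends on the local CUDA installation.

```python
import os

# ASSUMPTION: hypothetical CUDA install location; adjust to your machine.
cuda_dir = "/usr/local/cuda"

# Set the flag at the top of the notebook, before TensorFlow is imported,
# so that it is in place when XLA reads its options.
os.environ["XLA_FLAGS"] = f"--xla_gpu_cuda_data_dir={cuda_dir}"

# The log reports that XLA looks for libdevice under <cuda_dir>/nvvm/libdevice.
libdevice = os.path.join(cuda_dir, "nvvm", "libdevice", "libdevice.10.bc")
print("libdevice found:", os.path.exists(libdevice))
```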

# Classification models

## Classification
* 'categorical_crossentropy' loss function: it's by far the most common
    * Similar to log loss: Lower is better
* Add `metrics = ['accuracy']` to compile step for easy-to-understand diagnostics
* Output layer has separate node for each possible outcome, and uses 'softmax' activation
    * The softmax activation function ensures the predictions sum to 1, so they can be interpreted as probabilities.
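
The two ideas above can be sketched in NumPy: softmax turns raw scores into probabilities that sum to 1, and categorical cross-entropy is the mean negative log-probability given to the true class. The function names, the clipping constant `eps`, and the example scores are assumptions for illustration, not Keras' exact implementation.

```python
import numpy as np

def softmax(z):
    """Turn each row of raw scores into probabilities that sum to 1."""
    z = z - z.max(axis=1, keepdims=True)  # subtract the row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def categorical_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean negative log-probability assigned to the true class (lower is better)."""
    y_prob = np.clip(y_prob, eps, 1.0)  # avoid log(0)
    return -np.mean(np.sum(y_true * np.log(y_prob), axis=1))

scores = np.array([[2.0, 0.5],   # made-up raw outputs: 2 samples, 2 classes
                   [0.1, 1.2]])
y_true = np.array([[1.0, 0.0],   # one-hot labels
                   [0.0, 1.0]])

probs = softmax(scores)
loss = categorical_crossentropy(y_true, probs)
print(probs.sum(axis=1))  # each row sums to 1
print(loss)
```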

In [None]:
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential
from tensorflow.keras.utils import to_categorical
import pandas as pd
from sklearn.model_selection import train_test_split

In [None]:
df = pd.read_csv('titanic_all_numeric.csv')
df

In [None]:
X = df.drop(['survived'], axis=1).values.astype('float64')
X

In [None]:
y = to_categorical(df.survived)
y
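
`to_categorical` converts integer class labels into one column per class, matching the one-node-per-outcome output layer. A rough NumPy equivalent, with a hypothetical helper `one_hot` and made-up labels:

```python
import numpy as np

def one_hot(labels, num_classes=None):
    """Convert integer class labels to a one-hot matrix (one column per class)."""
    labels = np.asarray(labels, dtype=int)
    if num_classes is None:
        num_classes = labels.max() + 1
    return np.eye(num_classes)[labels]  # pick row i of the identity matrix for label i

survived = [0, 1, 1, 0]  # made-up labels for illustration
encoded = one_hot(survived)
print(encoded)
```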

In [None]:
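# Hold out 91 rows for testing (an integer test_size is a row count, not a fraction)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=91, random_state=1)
model = Sequential()
model.add(Dense(32, activation='relu', input_shape=(X.shape[1],)))
# One output node per class; softmax turns the outputs into probabilities
model.add(Dense(2, activation='softmax'))
model.summary()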

In [None]:
model.compile(optimizer='sgd', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(X_train, y_train)

## Using models
* Save
* Reload
* Make predictions

## Saving, reloading, and using your Model

In [None]:
from tensorflow.keras.models import load_model

model.save('model_file.h5')
! ls -lh model_file.h5

In [None]:
model = load_model('model_file.h5')
yhat = model.predict(X_test)
# Column 1 holds the predicted probability of class 1 (survived)
yhat[:,1]
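
Predicted probabilities can be turned into class labels, and compared with the one-hot targets, using `argmax`. The arrays below are made-up stand-ins for `model.predict(X_test)` and `y_test`, so the sketch runs without a trained model.

```python
import numpy as np

# Made-up stand-ins for model.predict(X_test) and the one-hot y_test
yhat = np.array([[0.9, 0.1],
                 [0.3, 0.7],
                 [0.6, 0.4],
                 [0.2, 0.8]])
y_test = np.array([[1, 0],
                   [0, 1],
                   [0, 1],
                   [0, 1]])

prob_survived = yhat[:, 1]           # column 1 = probability of class 1
pred_class = yhat.argmax(axis=1)     # most probable class per row
true_class = y_test.argmax(axis=1)   # undo the one-hot encoding
accuracy = np.mean(pred_class == true_class)

print(pred_class, accuracy)  # [0 1 0 1] 0.75
```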

## Verifying model structure

In [None]:
model.summary()