**ANN.py** Trains and tests a simple artificial neural network

In [None]:
from sklearn.preprocessing import LabelEncoder
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.models import Model
# to_categorical lives in tensorflow.keras.utils; the alias keeps np_utils.to_categorical working
from tensorflow.keras import utils as np_utils
import pandas as pd
import numpy as np
import os

os.chdir(r"C:\Carabid_Data\Invert")

Import modules <br>
Set the dataset directory (adjust this to your own directory)

In [None]:
df = pd.read_csv("shuffletrain.csv")

Y = df['AllTaxa']
X = df.drop(["AllTaxa"], axis=1)
# convert to numpy arrays
X = np.array(X)
# work with labels
# encode class values as integers
encoder = LabelEncoder()
encoder.fit(Y)
encoded_Y = encoder.transform(Y)
# convert integers to dummy variables (i.e. one hot encoded)
dummy_y = np_utils.to_categorical(encoded_Y)

Read in the feature vector dataset <br>
Labels (`AllTaxa` for the LITL dataset or `Order` for the order-level dataset) are converted to one-hot encoded labels as `dummy_y` <br>
Numeric data from `df` is set to `X`. If contextual metadata or morphometric data is to be removed, the following lines of code can be used before `X = np.array(X)`, respectively:

In [None]:
# For removing contextual metadata (label-based slice, both end columns included)
X = X.drop(X.loc[:, 'decLat':'day'].columns, axis=1)
# For removing morphometric data (label-based slice, both end columns included)
X = X.drop(X.loc[:, 'Area':'rawIntDensBlue'].columns, axis=1)
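The one-hot encoding that produces `dummy_y` can be illustrated with NumPy alone. This is a minimal sketch, not the notebook's own code: the label values are made up, and `np.unique` stands in for `LabelEncoder` (both assign integer codes in sorted class order), while indexing an identity matrix stands in for `to_categorical`.

```python
import numpy as np

# Made-up labels standing in for the 'AllTaxa' column (assumption for illustration).
labels = np.array(["beetle", "spider", "beetle", "fly"])

# np.unique returns the sorted class names; return_inverse gives each
# label's integer code, i.e. its position in that sorted array.
classes, codes = np.unique(labels, return_inverse=True)

# Row i of the identity matrix is the one-hot vector for class i.
one_hot = np.eye(len(classes), dtype="float32")[codes]

print(classes)        # ['beetle' 'fly' 'spider']
print(codes)          # [0 2 0 1]
print(one_hot.shape)  # (4, 3): one row per sample, one column per class
```

Each row of `one_hot` contains a single 1 in the column of its class, which is the target format that `categorical_crossentropy` expects.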

In [None]:
ncol = X.shape[1]
num_class = dummy_y.shape[1]
inputs = Input(shape = (ncol,))
annx = Dense(128, activation = 'relu')(inputs)
predict = Dense(num_class, activation = "softmax")(annx)
ann = Model(inputs, predict)

ann.compile(optimizer = 'adam', loss = 'categorical_crossentropy', metrics = ['accuracy'])

The ANN has a single hidden dense layer of 128 ReLU units. The sizes of the input layer and the softmax output layer are determined by the number of training variables and classes, respectively.
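To make the shapes concrete, the forward pass of this architecture can be sketched in plain NumPy. The sizes (`n_samples`, `ncol`, `num_class`) and the random weights below are assumptions for illustration only; in the notebook, Keras creates and learns the weights.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 5 samples, 20 input variables, 3 classes.
n_samples, ncol, num_class = 5, 20, 3
X_demo = rng.normal(size=(n_samples, ncol))

# Random stand-in weights; Keras would learn these during fit().
W1, b1 = rng.normal(scale=0.1, size=(ncol, 128)), np.zeros(128)
W2, b2 = rng.normal(scale=0.1, size=(128, num_class)), np.zeros(num_class)

def relu(z):
    return np.maximum(z, 0.0)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

hidden = relu(X_demo @ W1 + b1)    # shape (n_samples, 128)
probs = softmax(hidden @ W2 + b2)  # shape (n_samples, num_class), rows sum to 1

# Trainable parameters: weights plus biases of both dense layers.
n_params = (ncol * 128 + 128) + (128 * num_class + num_class)
print(hidden.shape, probs.shape, n_params)  # (5, 128) (5, 3) 3075
```

With these hypothetical sizes the parameter count is 3075; for the real model, `ann.summary()` reports the same count.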

In [None]:
ann.fit(
    x=X, y=dummy_y,
    epochs=10, batch_size=128,
    verbose = 1)

The model is trained on `X` and `dummy_y` for 10 epochs with a batch size of 128.
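The `epochs` and `batch_size` settings determine how many weight updates the optimizer performs. A small sketch of the arithmetic follows; the sample count is a hypothetical value, not the size of the actual dataset.

```python
import math

n_samples = 1000   # hypothetical number of training rows
batch_size = 128   # as passed to ann.fit
epochs = 10        # as passed to ann.fit

# For array inputs, Keras processes ceil(n_samples / batch_size) batches per epoch;
# the final batch is smaller when the division is not exact.
steps_per_epoch = math.ceil(n_samples / batch_size)
last_batch_size = n_samples - (steps_per_epoch - 1) * batch_size
total_updates = steps_per_epoch * epochs

print(steps_per_epoch, last_batch_size, total_updates)  # 8 104 80
```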

In [None]:
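validdf = pd.read_csv("shufflevalidlitl.csv")
# apply the same column drops here as were applied to X, if any
validX = np.array(validdf.drop(["AllTaxa"], axis = 1))
validY = validdf["AllTaxa"]
# reuse the encoder fitted on the training labels so class indices match
# (transform raises an error if a label was not seen during training)
encoded_validY = encoder.transform(validY)
# convert integers to dummy variables (i.e. one hot encoded)
dummy_validY = np_utils.to_categorical(encoded_validY, num_classes = num_class)

# predicted class probabilities, one row per sample (the model is named `ann`)
preds = ann.predict(validX)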

The model is tested using test data generated by **Shuffle.R**; `preds` holds the predicted class probabilities for each sample.
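The predicted probabilities still need to be turned into class labels and compared with the true labels. The sketch below shows one way to do that with NumPy; the class names, probabilities and true labels are made-up stand-ins for `encoder.classes_`, `preds` and `dummy_validY`.

```python
import numpy as np

# Made-up stand-ins (assumptions for illustration only).
classes = np.array(["beetle", "fly", "spider"])   # plays the role of encoder.classes_
preds_demo = np.array([[0.7, 0.2, 0.1],
                       [0.1, 0.3, 0.6],
                       [0.2, 0.5, 0.3],
                       [0.6, 0.3, 0.1]])          # plays the role of preds
true_onehot = np.array([[1, 0, 0],
                        [0, 0, 1],
                        [0, 0, 1],
                        [1, 0, 0]])               # plays the role of dummy_validY

# The predicted class is the column with the highest probability.
pred_idx = preds_demo.argmax(axis=1)
true_idx = true_onehot.argmax(axis=1)

# Fraction of samples whose predicted class matches the true class.
accuracy = (pred_idx == true_idx).mean()

# Map integer codes back to label names.
pred_labels = classes[pred_idx]

print(pred_idx, accuracy)  # [0 2 1 0] 0.75
print(pred_labels)         # ['beetle' 'spider' 'fly' 'beetle']
```

With the real objects, the same steps would be `preds.argmax(axis=1)` followed by `encoder.inverse_transform(...)`, provided the encoder is the one fitted on the training labels. Alternatively, `ann.evaluate(validX, dummy_validY)` reports loss and accuracy directly.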