# Report on the Neural Network Model

## Overview of the analysis 

This analysis documents my attempts to build a deep learning neural network that predicts, from the Alphabet Soup dataset, which funding applicants will be successful.

## Results

**Data Preprocessing**

- Target variable: IS_SUCCESSFUL 

- Feature variables: APPLICATION_TYPE, AFFILIATION, CLASSIFICATION, USE_CASE, ORGANIZATION, STATUS, INCOME_AMT, SPECIAL_CONSIDERATIONS, ASK_AMT

- Dropped variables (identifiers, neither targets nor features): EIN, NAME
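
As a minimal sketch of this preprocessing step, assuming the raw data sits in a pandas DataFrame, the identifier columns can be dropped and the categorical columns one-hot encoded with `pd.get_dummies`. The helper name `preprocess` and the toy rows below are made up for illustration and are not taken from the Alphabet Soup data.

```python
import pandas as pd


def preprocess(raw: pd.DataFrame) -> pd.DataFrame:
    """Drop identifier columns and one-hot encode the categorical features."""
    # EIN and NAME only identify an organization; they are neither targets nor features
    trimmed = raw.drop(columns=["EIN", "NAME"])
    # get_dummies expands each text column into 0/1 indicator columns
    return pd.get_dummies(trimmed, dtype=int)


# Hypothetical rows, for illustration only
toy = pd.DataFrame({
    "EIN": [1, 2, 3],
    "NAME": ["A", "B", "C"],
    "APPLICATION_TYPE": ["T3", "T4", "T3"],
    "ASK_AMT": [5000, 10000, 5000],
    "IS_SUCCESSFUL": [1, 0, 1],
})
encoded = preprocess(toy)
print(encoded.columns.tolist())
```

The number of columns left after encoding, minus the target, is what determines the input dimension of the network.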

**Compiling, Training, and Evaluating the Model**

On my first attempt, the preprocessed data had 43 input features. I built a neural network with an input_dim of 43 and 2 hidden layers of 100 units each, and trained it for 10 epochs. I chose 100 units because a common rule of thumb is to start with 2 to 3 times as many units as there are input dimensions. Two layers and 10 epochs were a modest, sensible starting point for benchmarking. Every layer used the relu activation function except the output layer, which used sigmoid.
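
To get a feel for the size of this first network, its trainable parameters can be counted by hand: each dense layer has one weight per input-output pair plus one bias per unit. The helper below (`dense_param_count` is a hypothetical name, not part of Keras) is a small sketch of that arithmetic for the 43-input network with layers of 100, 100, and 1 units described above.

```python
def dense_param_count(input_dim: int, layer_units: list[int]) -> int:
    """Count weights and biases in a stack of fully connected layers."""
    total = 0
    fan_in = input_dim
    for units in layer_units:
        # one weight per (input, unit) pair, plus one bias per unit
        total += fan_in * units + units
        fan_in = units
    return total


# 43 inputs -> 100 units -> 100 units -> 1 output unit
first_attempt = dense_param_count(43, [100, 100, 1])
print(first_attempt)  # 4400 + 10100 + 101 = 14601
```

For a model built this way, the same figure should appear as the total parameter count in the output of `nn.summary()`.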

My best performance with these specifications was: 

- **Loss: 0.5517758131027222, Accuracy: 0.729912519454956**

While decent, this result fell short of the target of 75% accuracy. To address this, I gave keras-tuner, running in Google Colaboratory, the same specifications as above, hoping for a promising lead on the best model. I could not replicate the tuner's best result with the hyperparameters it suggested, but they led me to experiment with similar settings. I ended up training for 100 epochs with an input layer of 9 units using relu activation and 2 hidden layers of 9 units each, one using relu and one using tanh. The output layer used the sigmoid function. With these parameters, I achieved this performance:

- **Loss: 0.5544543266296387, Accuracy: 0.7315452098846436**
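
The loss and accuracy figures reported here come from a model compiled with binary cross-entropy loss and an accuracy metric. The sketch below computes both quantities in plain Python on made-up labels and raw outputs (not results from this model), assuming a 0.5 decision threshold. It also illustrates why the output layer uses sigmoid: the loss takes logarithms of the prediction, so it expects values strictly between 0 and 1.

```python
import math


def sigmoid(z: float) -> float:
    """Squash a raw output into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def binary_cross_entropy(y_true, y_prob) -> float:
    """Mean binary cross-entropy; every probability must lie strictly in (0, 1)."""
    losses = [
        -(y * math.log(p) + (1 - y) * math.log(1 - p))
        for y, p in zip(y_true, y_prob)
    ]
    return sum(losses) / len(losses)


def accuracy(y_true, y_prob, threshold: float = 0.5) -> float:
    """Fraction of predictions on the correct side of the threshold."""
    hits = sum(int(p > threshold) == y for y, p in zip(y_true, y_prob))
    return hits / len(y_true)


# Made-up labels and raw network outputs, for illustration only
labels = [1, 0, 1, 0]
raw_outputs = [2.0, -1.0, -0.5, 0.3]
probs = [sigmoid(z) for z in raw_outputs]

print(f"Loss: {binary_cross_entropy(labels, probs):.4f}, Accuracy: {accuracy(labels, probs):.2f}")
```

A tanh output ranges over (-1, 1), so it can produce negative values for which the logarithm is undefined. That makes sigmoid the natural final activation for this loss.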

**Reflections**

At this point I was at a loss as to how to improve the model's performance. I went back to the drawing board: which of these "features" were really contributing to the IS_SUCCESSFUL values?



In [None]:
# Imports used by the cells below; df is assumed to hold the preprocessed data
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
import tensorflow as tf
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.feature_selection import SelectFromModel, SelectKBest, chi2
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Split our preprocessed data into our features and target arrays

y = df["IS_SUCCESSFUL"]
X = df.drop(["IS_SUCCESSFUL"], axis=1)

In [None]:
best_features = SelectKBest(score_func=chi2, k=5)
fit = best_features.fit(X, y)

dfscores = pd.DataFrame(fit.scores_)
dfcolumns = pd.DataFrame(X.columns)

feature_Scores = pd.concat([dfcolumns, dfscores], axis=1)
feature_Scores.columns = ["Specs", "Score"]

print(feature_Scores.nlargest(5, "Score"))

In [None]:
model = ExtraTreesClassifier()
model.fit(X, y)

print("\n\n", model.feature_importances_)

feat_importances = pd.Series(model.feature_importances_, index=X.columns)

feat_importances.nlargest(5).plot(kind="barh")

plt.show()

In [None]:
# Correlation matrix of all preprocessed columns, including the target
corrmat = df.corr()

plt.figure(figsize=(20, 20))

sns.heatmap(corrmat, annot=False, cmap="RdYlGn")

In [None]:
# Split the preprocessed data into a training and testing dataset

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

In [None]:
# Create a StandardScaler instance

scaler = StandardScaler()

# Fit the StandardScaler

X_scaler = scaler.fit(X_train)

# Scale the data

X_train_scaled = X_scaler.transform(X_train)
X_test_scaled = X_scaler.transform(X_test)

print(X_train_scaled.shape)
print(X_test_scaled.shape)

In [None]:
clf = RandomForestClassifier(random_state=0, n_estimators=128).fit(X_train_scaled, y_train)

print(f'Training Score: {clf.score(X_train_scaled, y_train)}')
print(f'Testing Score: {clf.score(X_test_scaled, y_test)}')

In [None]:
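feature_importances = clf.feature_importances_

# Pair each importance with its feature name. Use X.columns rather than df.columns:
# df still contains the IS_SUCCESSFUL target, which can shift the labels out of alignment.
features = sorted(zip(X.columns, feature_importances), key=lambda x: x[1])
cols = [f[0] for f in features]
width = [f[1] for f in features]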

fig, ax = plt.subplots()

fig.set_size_inches(10,200)
plt.margins(y=0.001)

ax.barh(y=cols, width=width)

plt.show()

### AFFILIATION, ORGANIZATION, APPLICATION_TYPE

In [None]:
# Keep only the features with at least the mean importance (the SelectFromModel default)
sel = SelectFromModel(clf)
sel.fit(X_train_scaled, y_train)

# Use separate names for this split so the original y_train / y_test,
# which line up with X_train_scaled / X_test_scaled, are not overwritten
X_selected_train, X_selected_test, y_selected_train, y_selected_test = train_test_split(
    sel.transform(X), y, random_state=1
)
scaler = StandardScaler().fit(X_selected_train)
X_selected_train_scaled = scaler.transform(X_selected_train)
X_selected_test_scaled = scaler.transform(X_selected_test)

clf = RandomForestClassifier(random_state=1, n_estimators=500).fit(X_selected_train_scaled, y_selected_train)
print(f'Training Score: {clf.score(X_selected_train_scaled, y_selected_train)}')
print(f'Testing Score: {clf.score(X_selected_test_scaled, y_selected_test)}')

In [None]:
# Define the model - deep neural net, i.e., the number of input features and hidden nodes for each layer.

nn = tf.keras.models.Sequential()

# First hidden layer

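# input_dim must equal the number of columns in the data passed to nn.fit (21 here)
nn.add(tf.keras.layers.Dense(units=63, input_dim=X_train_scaled.shape[1], activation="tanh"))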

# Second hidden layer

nn.add(tf.keras.layers.Dense(units=63, activation="tanh"))

# Output layer: sigmoid keeps predictions in (0, 1), as binary_crossentropy expects

nn.add(tf.keras.layers.Dense(units=1, activation="sigmoid"))

# Check the structure of the model
nn.summary()

In [None]:
# Compile the model

nn.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"]) 

In [None]:
# Train the model

fit_model = nn.fit(X_train_scaled, y_train, epochs=25) 

In [None]:
# Evaluate the model using the test data

model_loss, model_accuracy = nn.evaluate(X_test_scaled, y_test, verbose=2)
print(f"Loss: {model_loss}, Accuracy: {model_accuracy}")

In [None]:
# Export our model to HDF5 file

# nn.save("nn_optimized.h5")