## Overview

This project focuses on building a binary classification model using deep learning techniques to determine the likelihood of success for organizations funded by Alphabet Soup. Leveraging a dataset containing information on over 34,000 funded organizations, the goal is to analyze the metadata and apply machine learning and neural network approaches to predict the outcomes of future funding applications.

## Alphabet Soup Analysis

### Analysis Summary

After testing four models, Model 4, which employed auto-optimization, achieved the highest accuracy of 75.96% on the test data, exceeding the 75% accuracy target. However, its loss remained relatively high at 0.68.

Model 1 had a lower accuracy of 0.72 and a loss of 0.62. It used two hidden layers, each with 100 neurons and ReLU activation, while the output layer used Sigmoid activation. ReLU was chosen because the data appeared to be non-linear.

Model 2 implemented LeakyReLU activation to mitigate the problem of dead neurons inhibiting further learning. It featured three hidden layers, each with 100 neurons, and Sigmoid activation for the output layer. Although the training accuracy was high at 0.96, the test accuracy dropped to 0.60, indicating overfitting.

Model 3 adopted the Tanh activation function in an attempt to improve accuracy. This model had three hidden layers with 50 neurons each. Reducing the number of neurons was intended to prevent overfitting, but the change did not improve results: its test accuracy reached only 0.72, below Model 4's.

In summary, Model 4 delivered the best performance with the highest test accuracy, but its loss was still high.
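Loss and accuracy measure different things, which is why a model can clear the accuracy target while its loss stays high. The sketch below assumes the models were trained with binary cross-entropy (a common pairing with a Sigmoid output; the loss function is not stated in this report) and uses invented probabilities purely for illustration.

```python
import math

def binary_cross_entropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

def accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of predictions that fall on the correct side of the threshold."""
    hits = sum((p >= threshold) == bool(y) for y, p in zip(y_true, y_prob))
    return hits / len(y_true)

# Illustrative labels and predicted probabilities (not taken from the dataset).
y_true = [1, 1, 1, 0]
barely_right = [0.55, 0.55, 0.55, 0.55]  # right on 3 of 4, but with low confidence
coin_flip = [0.5, 0.5, 0.5, 0.5]         # always predicts 0.5

print(accuracy(y_true, barely_right))              # share of correct predictions
print(binary_cross_entropy(y_true, barely_right))  # loss for the same predictions
print(binary_cross_entropy(y_true, coin_flip), math.log(2))
```

Under that assumption, a model that always outputs 0.5 has a loss of ln 2 (about 0.693), so a loss of 0.68 suggests that many predictions sit close to 0.5 even when they land on the correct side of the threshold.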


### Data Preprocessing

* What variable(s) are the target(s) for your model?

  * The target variable for each model was the IS_SUCCESSFUL column. The primary objective is to predict which applicants are likely to succeed. To achieve this, all other features in the dataset are utilized to build a predictive model.

* What variable(s) are the features for your model?
  * The available features are: EIN, NAME, APPLICATION_TYPE, AFFILIATION, CLASSIFICATION, USE_CASE, ORGANIZATION, STATUS, INCOME_AMT, SPECIAL_CONSIDERATIONS, and ASK_AMT.

* What variable(s) should be removed from the input data because they are neither targets nor features?
  * In the initial model, I excluded the EIN and NAME columns as instructed. However, during the first optimization attempt, I decided to reintroduce the NAME column. This was done to address potential bias that could arise from certain organizations appearing multiple times in the dataset.
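The preprocessing described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the notebook's actual code: the rows are invented, only two of the feature columns are shown, and grouping rarely seen `NAME` values into an `"Other"` bucket (with a hypothetical `min_count` threshold) is one possible way to keep the one-hot encoding manageable once `NAME` is reintroduced. A real notebook would more likely use pandas for this.

```python
from collections import Counter

def bucket_rare(values, min_count):
    """Replace values seen fewer than min_count times with 'Other'."""
    counts = Counter(values)
    return [v if counts[v] >= min_count else "Other" for v in values]

def one_hot(values):
    """Return (sorted categories, one row of 0/1 indicators per value)."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows

# Toy rows using column names from the dataset; the values are invented.
rows = [
    {"EIN": 1, "NAME": "ORG A", "APPLICATION_TYPE": "T3", "IS_SUCCESSFUL": 1},
    {"EIN": 2, "NAME": "ORG A", "APPLICATION_TYPE": "T4", "IS_SUCCESSFUL": 1},
    {"EIN": 3, "NAME": "ORG B", "APPLICATION_TYPE": "T3", "IS_SUCCESSFUL": 0},
    {"EIN": 4, "NAME": "ORG A", "APPLICATION_TYPE": "T3", "IS_SUCCESSFUL": 0},
    {"EIN": 5, "NAME": "ORG C", "APPLICATION_TYPE": "T4", "IS_SUCCESSFUL": 1},
]

# Target column.
y = [r["IS_SUCCESSFUL"] for r in rows]

# EIN is never read, which drops it from the features.
names = bucket_rare([r["NAME"] for r in rows], min_count=2)  # hypothetical threshold
name_cats, name_cols = one_hot(names)
type_cats, type_cols = one_hot([r["APPLICATION_TYPE"] for r in rows])

# Feature matrix: the one-hot NAME columns followed by the APPLICATION_TYPE columns.
X = [n + t for n, t in zip(name_cols, type_cols)]
print(name_cats + type_cats)
print(X)
```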


### Compiling, Training, and Evaluating the Model

* How many neurons, layers, and activation functions did you select for your neural network model, and why?

  * In the first model, I implemented two hidden layers, each containing 100 neurons, with ReLU activation. For the output layer, I applied the Sigmoid activation function. I selected ReLU because, upon examining the data, I noticed its non-linear characteristics, making it a logical choice for the activation function to begin with.

![image.png](attachment:image.png)
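To get a feel for model size, the sketch below counts the trainable parameters of a stack of fully connected layers, using the layer sizes of the first three models as described in this report. The input width `N_FEATURES` is a hypothetical value; the real number depends on how many columns the one-hot encoding produces.

```python
def dense_param_count(n_inputs, layer_sizes):
    """Total weights plus biases for a stack of fully connected layers."""
    total = 0
    fan_in = n_inputs
    for units in layer_sizes:
        total += fan_in * units + units  # weight matrix plus bias vector
        fan_in = units
    return total

N_FEATURES = 40  # hypothetical input width, for illustration only

model_1 = [100, 100, 1]       # two hidden layers of 100, one output neuron
model_2 = [100, 100, 100, 1]  # three hidden layers of 100
model_3 = [50, 50, 50, 1]     # three hidden layers of 50

print(dense_param_count(N_FEATURES, model_1))  # 14301
print(dense_param_count(N_FEATURES, model_2))  # 24401
print(dense_param_count(N_FEATURES, model_3))  # 7201
```

With these assumed inputs, the three-layer, 100-neuron network has the most parameters to fit, and halving the layer width to 50 cuts the count by more than two thirds.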


  * In the second model, I switched to LeakyReLU activation after observing that ReLU wasn't effective. One key issue I noticed in the first model was that accuracy and loss remained nearly constant from the first epoch to the last, indicating that the model had essentially stopped learning. To address this, I used LeakyReLU, which helps keep neurons active and allows the model to continue learning. I also added a third hidden layer, with 100 neurons in each of the three layers, and used Sigmoid for the output activation.

![image-2.png](attachment:image-2.png)
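The difference between ReLU and LeakyReLU is easiest to see in their gradients. In the sketch below, the slope `alpha=0.01` is an illustrative value and not necessarily the one used in the notebook. Dead neurons are also only one possible explanation for a flat training curve.

```python
def relu(x):
    """ReLU: passes positive inputs through, outputs zero otherwise."""
    return x if x > 0 else 0.0

def relu_grad(x):
    """Gradient of ReLU: zero for all non-positive inputs."""
    return 1.0 if x > 0 else 0.0

def leaky_relu(x, alpha=0.01):
    """LeakyReLU: keeps a small slope alpha for non-positive inputs."""
    return x if x > 0 else alpha * x

def leaky_relu_grad(x, alpha=0.01):
    """Gradient of LeakyReLU: never exactly zero when alpha > 0."""
    return 1.0 if x > 0 else alpha

for x in (-2.0, 0.5):
    print(x, relu(x), relu_grad(x), leaky_relu(x), leaky_relu_grad(x))
```

A neuron whose pre-activation is negative for every training sample receives a zero gradient under ReLU and stops updating; under LeakyReLU it still receives a small gradient and can recover.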


  * In the third model, I experimented with Tanh activation, as it is another non-linear function that could potentially improve accuracy. At this point, I was trying different activations to see which might work best. I used 3 hidden layers with 50 neurons each, selecting 50 neurons to avoid the overfitting issue I encountered in the previous model.

![image-3.png](attachment:image-3.png)


  * In the fourth and final model, I used auto-optimization to identify the best model and hyperparameters, aiming for an accuracy above 75%. The model consists of 3 layers, with the first layer having 2 neurons, followed by layers containing 9, 7, 1, 5, 3 and 3 neurons respectively. Sigmoid activation was applied consistently throughout the model.

![image-4.png](attachment:image-4.png)
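The tool behind the auto-optimization step is not named in this report; a library tuner such as Keras Tuner is a common choice. The sketch below shows the general idea with a plain random search. The search ranges, the `sample_config` helper, and `placeholder_evaluate` are all hypothetical: in a real run, the `evaluate` callback would build and train a network from the sampled configuration and return its validation accuracy.

```python
import random

# Hypothetical search space; the notebook's actual ranges are not given.
SEARCH_SPACE = {
    "activation": ["relu", "tanh", "sigmoid"],
    "first_units": list(range(1, 11)),
    "num_layers": list(range(1, 7)),
    "units": list(range(1, 11)),
}

def sample_config(rng):
    """Draw one random configuration from the search space."""
    n_layers = rng.choice(SEARCH_SPACE["num_layers"])
    return {
        "activation": rng.choice(SEARCH_SPACE["activation"]),
        "first_units": rng.choice(SEARCH_SPACE["first_units"]),
        "hidden_units": [rng.choice(SEARCH_SPACE["units"]) for _ in range(n_layers)],
    }

def random_search(evaluate, n_trials, seed=0):
    """Keep the configuration with the highest score returned by evaluate."""
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(n_trials):
        config = sample_config(rng)
        score = evaluate(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score

def placeholder_evaluate(config):
    """Stand-in objective so the sketch runs; NOT a real accuracy."""
    return -abs(config["first_units"] + sum(config["hidden_units"]) - 20)

best_config, best_score = random_search(placeholder_evaluate, n_trials=50)
print(best_config, best_score)
```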


* Were you able to achieve the target model performance?

  * I met the target performance of 75% accuracy only in the third optimization (Model 4), which reached 75.96% on the test data. The initial model and the first two optimizations fell short, with test accuracies of 0.72, 0.60, and 0.72 respectively.
  

* What steps did you take in your attempts to increase model performance?

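  * In my first optimization, I added a third hidden layer and switched the hidden-layer activation from ReLU to LeakyReLU.
  * In my second optimization, I switched to Tanh activation and reduced each of the three hidden layers from 100 to 50 neurons to limit overfitting.
  * In my third optimization, I used auto-optimization to search for the best architecture and hyperparameters.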




