Objective - Compile and evaluate a binary classification model, built with a neural network, that predicts whether applicants will be successful if funded by a venture capital firm.
Scenario - Given a historical dataset (CSV file) of more than 34,000 organizations that have received funding, use neural networks to evaluate the dataset's features and create a binary classifier that predicts whether an applicant will become a successful or failed business.
Product - Jupyter notebook with:
- Data preprocessing for a neural network model.
- Binary classification model using a deep neural network.
- Utilize model-fit-predict pattern to compile and evaluate.
- Model optimization.
- Data encoding with OneHotEncoder
- train_test_split()
- Feature scaling with StandardScaler
- keras.callbacks.EarlyStopping()
- keras.callbacks.ModelCheckpoint()
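The two callbacks listed above might be combined with a small network roughly as follows. This is a minimal sketch, not the notebook's actual architecture: the function name `build_and_train`, the layer sizes, the patience value, and the checkpoint file name are all illustrative assumptions.

```python
def build_and_train(X_train, y_train, checkpoint_path="best_model.h5", epochs=50):
    """Compile and fit a small binary classifier (hypothetical architecture).

    X_train is expected to be a 2-D NumPy array of scaled features and
    y_train a 1-D array of 0/1 labels.
    """
    # Imported here so that loading this snippet does not require TensorFlow.
    import tensorflow as tf

    n_features = X_train.shape[1]
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(n_features,)),
        tf.keras.layers.Dense(16, activation="relu"),
        tf.keras.layers.Dense(8, activation="relu"),
        # A single sigmoid unit outputs the probability of the positive class.
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(
        loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"]
    )

    callbacks = [
        # Stop once validation loss has not improved for 5 epochs.
        tf.keras.callbacks.EarlyStopping(
            monitor="val_loss", patience=5, restore_best_weights=True
        ),
        # Save the model to disk whenever validation loss improves.
        tf.keras.callbacks.ModelCheckpoint(
            checkpoint_path, monitor="val_loss", save_best_only=True
        ),
    ]
    history = model.fit(
        X_train,
        y_train,
        validation_split=0.2,  # hold out part of the training data
        epochs=epochs,
        callbacks=callbacks,
        verbose=0,
    )
    return model, history
```

Calling `model.evaluate(X_test_scaled, y_test)` on the returned model would then report loss and accuracy on held-out data, completing the fit-and-evaluate pattern.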
Supplemental processing and analysis:
Beyond the scope of the assignment, the author conducted additional analysis of the data; a supplemental script with model building follows the primary challenge. Additionally, supplemental experimental notebooks are included.
This project leverages Jupyter Lab v3.4.4 and Python version 3.9.13 packaged by conda-forge | (main, May 27 2022, 17:01:00) with the following packages:
- sys - module that provides access to some variables used or maintained by the interpreter and to functions that interact strongly with the interpreter.
- NumPy - an open source Python library for working with arrays; it contains multidimensional array and matrix data structures, with functions for linear algebra, Fourier transforms, and matrices.
- pandas - software library written for the Python programming language for data manipulation and analysis.
- Path - from pathlib, object-oriented filesystem paths; Path instantiates a concrete path for the platform the code is running on.
- Scikit-learn - an open source machine learning library that supports supervised and unsupervised learning; provides various tools for model fitting, data preprocessing, model selection, model evaluation, and many other utilities.
- tensorflow - an end-to-end machine learning platform.
- tf.keras - a compact, easy-to-learn, high-level Python API that runs on top of the TensorFlow framework; designed with a focus on understanding deep learning techniques, such as creating layers for neural networks while maintaining the concepts of shapes and mathematical details.
- keras - a deep learning API written in Python, running on top of the machine learning platform TensorFlow.
- train_test_split - from sklearn.model_selection, a quick utility that wraps input validation, next(ShuffleSplit().split(X, y)), and application to input data into a single call for splitting (and optionally subsampling) data in a one-liner.
- OneHotEncoder - from sklearn.preprocessing, encodes categorical features as a one-hot numeric array. Features are encoded using a one-hot (aka 'one-of-K' or 'dummy') encoding scheme; creates a binary column for each category and returns a sparse matrix or dense array.
- StandardScaler - from sklearn.preprocessing, standardizes features by removing the mean and scaling to unit variance.
- matplotlib.pyplot - a state-based interface to matplotlib. It provides an implicit, MATLAB-like way of plotting. It also opens figures on your screen and acts as the figure GUI manager.
MacBook Pro (16-inch, 2021)
Chip Apple M1 Max
macOS Monterey version 12.6
Homebrew 3.6.11
Homebrew/homebrew-core (git revision 01c7234a8be; last commit 2022-11-15)
Homebrew/homebrew-cask (git revision b177dd4992; last commit 2022-11-15)
Python Platform: macOS-13.0.1-arm64-arm-64bit
Python version 3.9.13 packaged by conda-forge
Scikit-Learn 1.1.3
TensorFlow Version: 2.10.0
Keras Version: 2.10.0
pandas 1.5.1
pip 22.3 from /opt/anaconda3/lib/python3.9/site-packages/pip (python 3.9)
git version 2.37.2
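To compare a local setup against the versions listed above, a short standard-library script along these lines can print the same kind of report. The helper name `report_versions` and the default package list are illustrative assumptions, not part of the repository.

```python
import platform
import sys
from importlib import metadata


def report_versions(packages=("scikit-learn", "tensorflow", "keras", "pandas")):
    """Return lines describing the platform, Python, and package versions."""
    lines = [
        f"Python Platform: {platform.platform()}",
        f"Python version {sys.version.split()[0]}",
    ]
    for name in packages:
        try:
            version = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed under this exact name.
            version = "not installed"
        lines.append(f"{name} {version}")
    return lines


for line in report_versions():
    print(line)
```

On Apple silicon, TensorFlow may be installed under a different distribution name (for example `tensorflow-macos`), in which case the lookup for `tensorflow` would report "not installed" even though the package imports correctly.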
In the terminal, navigate to the directory where you want to install this application from the repository and enter the following command:
git clone git@github.com:Billie-LS/DeepL_Adventure_Angels.git
From the terminal, run the installed application through the JupyterLab web-based interactive development environment (IDE) by typing at the prompt:
> jupyter lab
The file you will run is:
credit_risk_resampling.ipynb
Version control can be reviewed at:
https://github.com/Billie-LS/DeepL_Adventure_Angels
Loki 'billie' Skylizard LinkedIn @GitHub
Vinicio De Sola LinkedIn @GitHub
Jeff Heaton LinkedIn @GitHub YouTube
Santiago Pedemonte LinkedIn @GitHub
MIT License
Copyright (c) [2022] [Loki 'billie' Skylizard]
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.