National-Animal-Nutrition-Program/2023ADSA_Chen

ADSA NANP Modeling Workshop - Model Validation

Author: James Chen

Repository: https://github.com/Niche-Squad/ADSA_modelvalidation

⚠️IMPORTANT⚠️

Make sure you can successfully run the verification code block in the notebook verify.ipynb before the workshop on June 25, 2023.

0. Overview

This repository contains materials for the ADSA NANP Modeling Workshop on model validation. The workshop offers a hands-on introduction to common pitfalls in evaluating predictive models. We will run through several simulations that demonstrate the importance of model validation and answer the following questions:

  • Why is it necessary to split our data in model validation?
  • How do we appropriately tune hyperparameters in a model?
  • How do different splitting methods influence our conclusions during model validation?

1. Environment Setup

Please make sure to set up your environment before the workshop. You can choose to run the workshop materials on your local machine or on Anaconda Cloud.

1.1 Option 1 (recommended): Run on your local machine

To get the most out of this workshop and to avoid internet connectivity issues, we recommend that you run the workshop materials on your local machine. To do so, you will need to install Python and Jupyter Notebook, an interactive coding environment.

1.1.1: Install Anaconda Distribution

Anaconda is the easiest way to get started with Python and Jupyter Notebook. Go to https://www.anaconda.com/download to download the Anaconda distribution and install it on your machine. You DO NOT need to change any settings during the installation.

1.1.2: Download the workshop repository

Go to https://github.com/Niche-Squad/ADSA_modelvalidation and click the green button Code -> Download ZIP to download the workshop repository. Unzip the downloaded file and save it to a location on your machine.

1.1.3: Verify your environment

Open the Anaconda Navigator and launch Jupyter Notebook.

In the Jupyter Notebook, navigate to the workshop repository folder that you just unzipped.

Open the file verify.ipynb. Follow the instructions in the notebook to run the verification code block. If you can run the code block without any errors, you are all set for the workshop!
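
If the verification fails, a quick check of which packages are visible to your Python environment can help narrow down the problem. The snippet below is only a sketch of such a check, not the content of verify.ipynb: the package list in ASSUMED_PACKAGES is a guess at typical data-science dependencies, and the notebook remains the authoritative test.

```python
import importlib.util
import sys

# Hypothetical package list for illustration only; the actual requirements
# are whatever verify.ipynb imports.
ASSUMED_PACKAGES = ["numpy", "pandas", "sklearn", "matplotlib"]


def check_environment(packages):
    """Return a dict mapping each package name to True if it can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in packages}


status = check_environment(ASSUMED_PACKAGES)
print("Python", sys.version.split()[0])
for name, found in status.items():
    print(f"{name:12s} {'OK' if found else 'MISSING'}")
```

A package reported as MISSING usually means Jupyter was launched from a different Python installation than the one Anaconda installed.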

1.2 Option 2: Run on Anaconda Cloud

If you encounter any issues with the local environment setup, you can choose to run the workshop materials online. The advantage of this option is that you do not need to install any software on your machine. However, you will need to have a stable internet connection during the workshop.

1.2.1: Register an Anaconda Cloud account

Anaconda offers a free cloud service that allows you to run Jupyter Notebook on the cloud. It requires you to register an account for the service. Go to https://www.anaconda.com/code-in-the-cloud and follow the instructions to register an account.

1.2.2: Clone the workshop repository

Once you log in to your Anaconda Cloud account, you will see a dashboard. Go to the Other section and open Terminal.

In Terminal, run the following command to clone the workshop repository to your cloud account:

git clone https://github.com/Niche-Squad/ADSA_modelvalidation

1.2.3: Verify the online environment

On the left panel of the dashboard, click ADSA_modelvalidation and open the file verify.ipynb. Follow the instructions in the notebook to run the verification code block. If you can run the code block without any errors, you are all set for the workshop!

2. Getting Started

Now you are ready to start the workshop! We will cover three important topics in model validation. Please use the Jupyter notebooks in this repository to follow along:

2.1 note01_train_test_split.ipynb

Why is it necessary to split our data in model validation?
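
As a preview of this question, the sketch below shows how a model can look perfect on the data it was fitted to and still predict poorly on new data. It is an illustration under assumed conditions, not the notebook's simulation: the data-generating function, the 1-nearest-neighbor predictor, and the 150/50 split are all chosen here for brevity, and only the Python standard library is used.

```python
import random


def make_data(n, rng):
    """Simulate y = 2x + Gaussian noise (an assumed toy relationship)."""
    xs = [rng.uniform(0, 10) for _ in range(n)]
    return [(x, 2 * x + rng.gauss(0, 3)) for x in xs]


def predict_1nn(train, x):
    """Predict with the response of the single closest training point."""
    return min(train, key=lambda p: abs(p[0] - x))[1]


def mse(train, data):
    """Mean squared error of the 1-NN predictor fitted on `train`."""
    return sum((predict_1nn(train, x) - y) ** 2 for x, y in data) / len(data)


rng = random.Random(0)
data = make_data(200, rng)
rng.shuffle(data)
train, test = data[:150], data[150:]  # hold out 50 records for testing

train_mse = mse(train, train)  # each point is its own nearest neighbor
test_mse = mse(train, test)    # unseen points expose the noise the model memorized
print(f"train MSE: {train_mse:.2f}")
print(f"test MSE:  {test_mse:.2f}")
```

The training error is zero because the model memorizes every record, so only the held-out error says anything about predictive performance.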

2.2 note02_hyperparameter.ipynb

How do we appropriately tune hyperparameters in a model?
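
One common answer is to choose hyperparameters on a validation set and keep a separate test set untouched until the final evaluation. The sketch below illustrates that idea with a k-nearest-neighbor regressor written in plain Python. The simulated data, the candidate values of k, and the 180/60/60 split are assumptions made for this example, and the notebook may use a different procedure such as cross-validation.

```python
import random


def make_data(n, rng):
    """Simulate y = 2x + Gaussian noise (an assumed toy relationship)."""
    xs = [rng.uniform(0, 10) for _ in range(n)]
    return [(x, 2 * x + rng.gauss(0, 3)) for x in xs]


def predict_knn(train, x, k):
    """Average the responses of the k training points closest to x."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / k


def mse(train, data, k):
    """Mean squared error of the k-NN predictor fitted on `train`."""
    return sum((predict_knn(train, x, k) - y) ** 2 for x, y in data) / len(data)


rng = random.Random(1)
data = make_data(300, rng)
rng.shuffle(data)
train, valid, test = data[:180], data[180:240], data[240:]

# Tune k using the validation set only.
candidates = [1, 3, 5, 10, 20]
valid_scores = {k: mse(train, valid, k) for k in candidates}
best_k = min(valid_scores, key=valid_scores.get)

# The test set is used once, after k has been fixed.
test_mse = mse(train, test, best_k)
print("validation MSE by k:", {k: round(v, 2) for k, v in valid_scores.items()})
print("selected k:", best_k)
print(f"test MSE with selected k: {test_mse:.2f}")
```

Selecting k by its test-set error instead would make the reported test error optimistic, because the test data would then have influenced the model.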

2.3 note03_factor_validation.ipynb

How do different splitting methods influence our conclusions during model validation?
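
To make the idea of a "splitting method" concrete, the sketch below contrasts a random record-level split with a split by a grouping factor. The scenario is hypothetical (20 cows with 10 records each) and is not taken from the notebook, which may study different factors. The point is only that a random split puts records from the same cow on both sides, while a group-wise split keeps each cow entirely in one set.

```python
import random

rng = random.Random(2)

# Hypothetical data layout: (cow_id, record_id) for 20 cows, 10 records per cow.
records = [(cow, rec) for cow in range(20) for rec in range(10)]


def random_split(records, test_frac, rng):
    """Assign individual records to train/test at random."""
    shuffled = records[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_frac)
    return shuffled[n_test:], shuffled[:n_test]


def group_split(records, test_frac, rng):
    """Assign whole groups (cows) to train/test at random."""
    groups = sorted({g for g, _ in records})
    rng.shuffle(groups)
    n_test = int(len(groups) * test_frac)
    test_groups = set(groups[:n_test])
    train = [r for r in records if r[0] not in test_groups]
    test = [r for r in records if r[0] in test_groups]
    return train, test


def shared_groups(train, test):
    """Groups that contribute records to both the training and test sets."""
    return {g for g, _ in train} & {g for g, _ in test}


r_train, r_test = random_split(records, 0.25, rng)
g_train, g_test = group_split(records, 0.25, rng)
print("cows in both sets, random split:", len(shared_groups(r_train, r_test)))
print("cows in both sets, group split: ", len(shared_groups(g_train, g_test)))
```

Which split is appropriate depends on the question being asked: predicting new records for known animals and predicting for animals never seen before are different tasks and can lead to different conclusions about the same model.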
