# BIOEE 4940 : **Introduction to Quantitative Analysis in Ecology**
### ***Spring 2021***
### Instructor: **Xiangtao Xu** (✉️ xx286@cornell.edu)
### Teaching Assistant: **Yanqiu (Autumn) Zhou** (✉️ yz399@cornell.edu)

---

## <span style="color:royalblue">Lecture 1</span> *Welcome and Course Overview*

### Introduction

Welcome to BIOEE 4940 *Introduction to Quantitative Analysis in Ecology*.

Some quick facts about me:

- **Name**: Xu, Xiangtao (徐湘涛 in Chinese Characters, he/him/his)
- **Affiliation**: Department of Ecology and Evolutionary Biology
- **Academic Background**: Undergrad in Geography/Ecology (minor in Computer Science), Ph.D. in Geosciences, Post-doc in Ecology/Ecophysiology
- **Research Keywords**: Plant Ecology, Global Change, Biosphere Modeling, Ecological Remote Sensing
- **Experience in Quantitative Analysis**: Statistical/Spatial/Timeseries Analyses in Matlab/Python/R. Terrestrial Biosphere Modeling/High-Performance Computing in Fortran/Shell. Currently learning Markdown and Julia.
- Very excited to share my experience and lessons on quantitative analysis in ecology with you, and to learn together with you!

Now, let's go around and have a quick intro, including a few keywords for your research, your experience in quantitative analysis, and your expectations for the course.

| Name  | Keyword | Experience |
|:---   | :----:  | ---: |
| Xiangtao Xu   | Plant Ecology  | Python (main) / Matlab/R/Fortran. Regression/Modeling/Spatial/Timeseries |
| Yanqiu Zhou   |  | |


[comment]: ~10min?

---
### Quantitative Analysis in Ecology: Goals/Challenges/Opportunities
* #### What is quantitative ecology?

##### From Wikipedia:
>Quantitative ecology is the application of ~advanced~ mathematical and statistical tools to any number of problems in the field of ecology. It is a small but growing subfield in ecology, reflecting the demand among practicing ecologists to interpret ever larger and more complex data sets using quantitative reasoning. Quantitative ecologists might apply some combination of deterministic or stochastic mathematical models to theoretical questions or they might use sophisticated methods in applied statistics for experimental design and hypothesis testing. Typical problems in quantitative ecology include ***estimating*** the dynamics and status of wild populations, ***modeling*** the impacts of anthropogenic or climatic change on ecological communities, and ***predicting*** the spread of invasive species or disease outbreaks.
>
>Quantitative ecology, ***which mainly focuses on statistical and computational methods for addressing applied problems***, is distinct from theoretical ecology, which tends to focus on understanding the dynamics of simple mechanistic models and their implications for a general set of biological systems using mathematical arguments.

    
##### Example 1: How leaf maximum carboxylation rate changes with temperature. ([Slot & Winter 2017](http://doi.wiley.com/10.1111/pce.13071))

<img src="./QE_example1_Vcmax.png" alt="Example 1: How leaf photosynthetic potential changes with temperature; Slot & Winter 2017 PCE" style="width: 400px;"/>

##### Example 2: The global spectrum of plant form and function. ([Díaz et al. 2015](http://dx.doi.org/10.1038/nature16489))

<img src="./QE_example2_plant_function_spectrum.png" alt="The global spectrum of plant form and function (Díaz et al. 2015)" style="width: 400px;"/>

##### Example 3: Fitting a hierarchical semi-mechanistic model to tree growth/mortality data ([Muller-Landau et al. 2006](http://doi.wiley.com/10.1111/j.1461-0248.2006.00904.x), [Camac et al. 2018](https://www.pnas.org/content/115/49/12459.short?rss=1))

<img src="./QE_example3_tree_demographics.png" alt="Fitting a hierarchical semi-mechanistic model to tree growth/mortality data" style="width: 400px;"/>

##### Example 4: Inferring how the basal area fraction of trees associated with ectomycorrhizae (EM) varies with environmental factors using machine learning ([Steidinger et al. 2019](http://www.nature.com/articles/s41586-019-1128-0))

<img src="./QE_example4_EM_abundance.png" alt="Inferring how the basal area fraction of trees associated with ectomycorrhizae (EM) varies with environmental factors using machine learning" style="width: 400px;"/>

##### Example 5: Predictions of land carbon uptake from process-based terrestrial biosphere models ([Friedlingstein et al. 2014](http://journals.ametsoc.org/doi/abs/10.1175/JCLI-D-12-00579.1))

<img src="./QE_example5_TBM_predictions.png" alt="Predictions of land carbon uptake from process-based terrestrial biosphere models" style="width: 400px;"/>

* #### What is special about quantitative *ecology*?

##### Situating ecology as a big-data science ([Farley et al. 2018](https://academic.oup.com/bioscience/article/68/8/563/5049569))

<img src="./QE_big_data.png" alt="Situating ecology as a big-data science" style="width: 400px;"/>

* Rapidly increasing data availability, yet with heterogeneous formats and quality --> How can we assimilate data to enhance mechanistic understanding and predictive ability?


* #### Discussion: What is a good/bad quantitative analysis?
> All happy families are alike; each unhappy family is unhappy in its own way.

A simple example: what is the relationship between remotely sensed vegetation optical depth (VOD) and model-simulated canopy water content (CWC)?

<img src="./QE_example_good_bad.png" alt="Example of different line fits between VOD and CWC" style="width: 600px;"/>

A common problem in the big-data era:
<img src="./overfitting-comics.jpg" alt="comics on overfitting" style="width: 600px;"/>
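
To make the overfitting problem concrete, here is a minimal Python sketch. It does not use the data behind the figures above: the VOD and CWC values are synthetic and made up for illustration, assuming a simple linear relationship plus random noise. The slope, intercept, noise level, and the `rmse` helper are all hypothetical choices. The snippet fits polynomials of increasing degree to half of the points and compares the error on the points used for fitting with the error on the held-out points.

```python
import numpy as np

# Synthetic, made-up data (NOT the data in the figure):
# assume CWC increases linearly with VOD, plus Gaussian noise.
rng = np.random.default_rng(seed=0)
vod = np.linspace(0.1, 1.0, 30)
cwc = 2.0 * vod + 0.5 + rng.normal(scale=0.2, size=vod.size)

# Use every other point for fitting and hold out the rest for evaluation.
train = np.arange(vod.size) % 2 == 0
test = ~train


def rmse(obs, pred):
    """Root-mean-square error between observations and predictions."""
    return float(np.sqrt(np.mean((obs - pred) ** 2)))


# Fit polynomials of increasing flexibility to the training points only.
results = {}
for degree in (1, 3, 9):
    poly = np.polynomial.Polynomial.fit(vod[train], cwc[train], deg=degree)
    results[degree] = (
        rmse(cwc[train], poly(vod[train])),  # error on points used for fitting
        rmse(cwc[test], poly(vod[test])),    # error on held-out points
    )
    print(f"degree {degree}: train RMSE = {results[degree][0]:.3f}, "
          f"held-out RMSE = {results[degree][1]:.3f}")
```

The training error can only decrease as the degree increases, because every lower-degree polynomial is a special case of a higher-degree one. The held-out error is what tells us whether the extra flexibility captures the underlying relationship or just the noise.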



 

[comment]: ~20min?

---
### Learning Objectives and Contents

#### Learning Objectives:
1. Understand and explain the basic concepts and theory behind common statistical/computational methods (motivation/requirements/limitations/pitfalls).
2. Gain hands-on experience conducting common quantitative analyses (statistical inference, regression, Bayesian methods) in an ecological context.
3. Become familiar with public ecological data (particularly remote sensing) and basic data visualization techniques.
4. Design and carry out a quantitative project (in the format of a presentation and/or short report).


#### Contents
* ##### Section 1: Introduction and Overview 

    1.1. Introduction to ecological data analysis (today)

    1.2. Access and visualization of ecological data in Jupyter Notebook

* ##### Section 2: Know Your Data

    2.1. How to describe your data (Revisit Statistics 101)
        
    2.2. Statistical inference and hypothesis testing 

* ##### Assignment 1

* ##### Section 3: Regression Models 

    3.1. Covariance, correlation, and regression
        * Linear Model
        * Discussion: Correlation and Causation

    3.2. More complex regressions
        * GLM/GAM?
        * Hierarchical models
        * Quantile regression
        * Non-linear fit

    3.3. Model selection and variable importance
        * Information Criterion
        * Regularization (Ridge/LASSO)


* ##### Assignment 2

* ##### Section 4: Dealing with High-dimensional Data

    4.1. Dimension Reduction (PCA, CCA, Classification, ...)
    
    4.2. Remote sensing data and spatio-temporal analysis

* ##### Assignment 3

* ##### Section 5: A Primer for Bayesian Inference (TBD)

    5.1. Basics of Bayesian statistics
        * Re-do linear regression using Bayesian methods

    5.2. Inverse problems and MCMC

* ##### Section 6: Other Special Topics (TBD)

    6.1. Computational and Data-driven Methods (Machine Learning)

    6.2. Introduction to ecosystem modeling

* ##### Final Project


---
### Grading and Logistics 

* ##### Office Hours:
    Monday 10:55AM-11:30AM (right after Monday lectures)

* ##### Grading
    * Assignments (45%)
    * Course performance and discussion (25%)
    * Final Project (30%)

