Set up documentation #59

Merged
merged 19 commits into from
Oct 17, 2023
13 changes: 13 additions & 0 deletions .readthedocs.yaml
@@ -0,0 +1,13 @@
version: 2

build:
  os: ubuntu-22.04
  tools:
    python: "3.11"

mkdocs:
  configuration: docs/mkdocs.yml

python:
  install:
    - requirements: docs/requirements.txt
80 changes: 11 additions & 69 deletions README.md
@@ -1,77 +1,19 @@
# scikit-learn-knn-regression

This package is in active development.
> [!WARNING]
> This package is in active development!

## Developer Guide
`scikit-learn-knn-regression` (a.k.a. `sknnr`) is a package for running k-nearest neighbor imputation methods, including GNN (Ohmann & Gregory, 2002), MSN (Moeur & Stage, 1995), and RFNN (Crookston & Finley, 2008), using estimators that are fully compatible with [`scikit-learn`](https://scikit-learn.org/stable/).

### Setup

This project uses [hatch](https://hatch.pypa.io/latest/) to manage the development environment and build and publish releases. Make sure `hatch` is [installed](https://hatch.pypa.io/latest/install/) first:
## Acknowledgements

```bash
$ pip install hatch
```
`sknnr` was inspired by the [yaImpute](https://cran.r-project.org/web/packages/yaImpute/index.html) package for R (Crookston & Finley, 2008). Thanks to Andrew Hudak for the inclusion of the [Moscow Mountain / St. Joes dataset](api/datasets/moscow_stjoes.md) (Hudak, 2010), and Tom DeMeo for the inclusion of the [SWO Ecoplot dataset](api/datasets/swo_ecoplot.md) (Atzet et al., 1996). Development of this package was funded in part by an appointment to the United States Forest Service (USFS) Research Participation Program administered by the Oak Ridge Institute for Science and Education (ORISE) through an interagency agreement between the U.S. Department of Energy (DOE) and the U.S. Department of Agriculture (USDA).

Now you can [enter the development environment](https://hatch.pypa.io/latest/environment/#entering-environments) using:
## References

```bash
$ hatch shell
```

This will install development dependencies in an isolated environment and drop you into a shell (use `exit` to leave).

### Pre-commit

Use [pre-commit](https://pre-commit.com/) to run linting, type-checking, and formatting:

```bash
$ pre-commit run --all-files
```

...or install it to run automatically before every commit with:

```bash
$ pre-commit install
```

You can run pre-commit hooks separately and pass additional arguments to them. For example, to run `black` on a single file:

```bash
$ pre-commit run black --files=src/sknnr/_base.py
```

### Testing

Unit tests are *not* run by `pre-commit`, but can be run manually using `hatch` [scripts](https://hatch.pypa.io/latest/config/environment/overview/#scripts):

```bash
$ hatch run test:all
```

Measure test coverage with:

```bash
$ hatch run test:coverage
```

Any additional arguments are passed to `pytest`. For example, to run a subset of tests matching a keyword:

```bash
$ hatch run test:all -k gnn
```

### Releasing

First, use `hatch` to [update the version number](https://hatch.pypa.io/latest/version/#updating).

```bash
$ hatch version [major|minor|patch]
```

Then, [build](https://hatch.pypa.io/latest/build/#building) and [publish](https://hatch.pypa.io/latest/publish/#publishing) the release to PyPI with:

```bash
$ hatch clean
$ hatch build
$ hatch publish
```
- Atzet T, White DE, McCrimmon LA, Martinez PA, Fong PR, Randall VD. 1996. Field guide to the forested plant associations of southwestern Oregon. USDA Forest Service. Pacific Northwest Region, Technical Paper R6-NR-ECOL-TP-17-96.
- Crookston NL, Finley AO. 2008. yaImpute: An R package for kNN imputation. Journal of Statistical Software, 23(10), 16.
- Hudak AT. 2010. Field plot measures and predictive maps for "Nearest neighbor imputation of species-level, plot-scale forest structure attributes from LiDAR data". Fort Collins, CO: U.S. Department of Agriculture, Forest Service, Rocky Mountain Research Station. https://www.fs.usda.gov/rds/archive/Catalog/RDS-2010-0012.
- Moeur M, Stage AR. 1995. Most Similar Neighbor: An Improved Sampling Inference Procedure for Natural Resources Planning. Forest Science, 41(2), 337–359.
- Ohmann JL, Gregory MJ. 2002. Predictive Mapping of Forest Composition and Structure with Direct Gradient Analysis and Nearest Neighbor Imputation in Coastal Oregon, USA. Canadian Journal of Forest Research, 32, 725–741.
4 changes: 4 additions & 0 deletions docs/abbreviations.md
@@ -0,0 +1,4 @@
*[GNN]: Gradient Nearest Neighbor
*[MSN]: Most Similar Neighbor
*[kNN]: k-Nearest Neighbor
*[RFNN]: Random Forest Nearest Neighbor
82 changes: 82 additions & 0 deletions docs/mkdocs.yml
@@ -0,0 +1,82 @@
site_name: SKNNR Documentation
repo_url: https://github.com/lemma-osu/scikit-learn-knn-regression
repo_name: lemma-osu/scikit-learn-knn-regression
docs_dir: pages/

nav:
  - Home: index.md
  - Installation: installation.md
  - Usage: usage.md
  - "API Reference":
    - Estimators:
      - RawKNNRegressor: api/estimators/raw.md
      - EuclideanKNNRegressor: api/estimators/euclidean.md
      - MahalanobisKNNRegressor: api/estimators/mahalanobis.md
      - GNNRegressor: api/estimators/gnn.md
      - MSNRegressor: api/estimators/msn.md
    - Transformers:
      - StandardScalerWithDOF: api/transformers/standardscalerwithdof.md
      - MahalanobisTransformer: api/transformers/mahalanobis.md
      - CCATransformer: api/transformers/cca.md
      - CCorATransformer: api/transformers/ccora.md
    - Datasets:
      - Dataset: api/datasets/dataset.md
      - "Moscow Mountain / St. Joes": api/datasets/moscow_stjoes.md
      - "SWO Ecoplot": api/datasets/swo_ecoplot.md
  - Contributing: contributing.md

theme:
  name: material
  features:
    - search.suggest
    - search.highlight
    - navigation.instant
    - navigation.path
    - content.code.copy
    - content.code.annotate
  palette:
    - media: "(prefers-color-scheme: light)"
      scheme: default
      toggle:
        icon: material/weather-night
        name: Dark mode
    - media: "(prefers-color-scheme: dark)"
      scheme: slate
      toggle:
        icon: material/weather-sunny
        name: Light mode

plugins:
  - search
  - mkdocstrings:
      handlers:
        python:
          paths: [../src]
          options:
            show_source: false
            inherited_members: true
            undoc_members: true
            docstring_style: numpy
            show_if_no_docstring: true
            show_signature_annotations: true
            show_root_heading: true
            show_category_heading: true
            merge_init_into_class: true
            signature_crossrefs: true

markdown_extensions:
  - abbr
  - admonition
  - tables
  - footnotes
  - toc:
      permalink: true
  - pymdownx.snippets:
      auto_append:
        - docs/abbreviations.md
  - pymdownx.highlight:
      anchor_linenums: true
      line_spans: __span
      pygments_lang_class: true
  - pymdownx.inlinehilite
  - pymdownx.superfences
1 change: 1 addition & 0 deletions docs/pages/api/datasets/dataset.md
@@ -0,0 +1 @@
::: sknnr.datasets._base.Dataset
1 change: 1 addition & 0 deletions docs/pages/api/datasets/moscow_stjoes.md
@@ -0,0 +1 @@
::: sknnr.datasets.load_moscow_stjoes
1 change: 1 addition & 0 deletions docs/pages/api/datasets/swo_ecoplot.md
@@ -0,0 +1 @@
::: sknnr.datasets.load_swo_ecoplot
1 change: 1 addition & 0 deletions docs/pages/api/estimators/euclidean.md
@@ -0,0 +1 @@
::: sknnr.EuclideanKNNRegressor
1 change: 1 addition & 0 deletions docs/pages/api/estimators/gnn.md
@@ -0,0 +1 @@
::: sknnr.GNNRegressor
1 change: 1 addition & 0 deletions docs/pages/api/estimators/mahalanobis.md
@@ -0,0 +1 @@
::: sknnr.MahalanobisKNNRegressor
1 change: 1 addition & 0 deletions docs/pages/api/estimators/msn.md
@@ -0,0 +1 @@
::: sknnr.MSNRegressor
1 change: 1 addition & 0 deletions docs/pages/api/estimators/raw.md
@@ -0,0 +1 @@
::: sknnr.RawKNNRegressor
1 change: 1 addition & 0 deletions docs/pages/api/transformers/cca.md
@@ -0,0 +1 @@
::: sknnr.transformers.CCATransformer
1 change: 1 addition & 0 deletions docs/pages/api/transformers/ccora.md
@@ -0,0 +1 @@
::: sknnr.transformers.CCorATransformer
1 change: 1 addition & 0 deletions docs/pages/api/transformers/mahalanobis.md
@@ -0,0 +1 @@
::: sknnr.transformers.MahalanobisTransformer
1 change: 1 addition & 0 deletions docs/pages/api/transformers/standardscalerwithdof.md
@@ -0,0 +1 @@
::: sknnr.transformers.StandardScalerWithDOF
91 changes: 91 additions & 0 deletions docs/pages/contributing.md
@@ -0,0 +1,91 @@
# Contributing

## Developer Guide

### Setup

This project uses [hatch](https://hatch.pypa.io/latest/) to manage the development environment and build and publish releases. Make sure `hatch` is [installed](https://hatch.pypa.io/latest/install/) first:

```bash
$ pip install hatch
```

Now you can [enter the development environment](https://hatch.pypa.io/latest/environment/#entering-environments) using:

```bash
$ hatch shell
```

This will install development dependencies in an isolated environment and drop you into a shell (use `exit` to leave).

### Pre-commit

Use [pre-commit](https://pre-commit.com/) to run linting, type-checking, and formatting:

```bash
$ pre-commit run --all-files
```

...or install it to run automatically before every commit with:

```bash
$ pre-commit install
```

You can run pre-commit hooks separately and pass additional arguments to them. For example, to run `black` on a single file:

```bash
$ pre-commit run black --files=src/sknnr/_base.py
```

### Testing

Unit tests are *not* run by `pre-commit`, but can be run manually using `hatch` [scripts](https://hatch.pypa.io/latest/config/environment/overview/#scripts):

```bash
$ hatch run test:all
```

Measure test coverage with:

```bash
$ hatch run test:coverage
```

Any additional arguments are passed to `pytest`. For example, to run a subset of tests matching a keyword:

```bash
$ hatch run test:all -k gnn
```

### Documentation

Documentation is built with [mkdocs](https://www.mkdocs.org/). During development, you can run a live-reloading server with:

```bash
$ hatch run docs:serve
```

The API reference is generated from NumPy-style docstrings using [mkdocstrings](https://mkdocstrings.github.io/). New classes can be added to the API reference by creating a new markdown file in the `docs/pages/api` directory, adding that file to the [`nav` tree](https://www.mkdocs.org/user-guide/configuration/#nav) in `docs/mkdocs.yml`, and [including the docstring](https://mkdocstrings.github.io/python/usage/#injecting-documentation) in the markdown file:

```markdown
::: sknnr.module.class
```
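
As a rough sketch of the first and last steps above, a stub page could be generated with a few lines of Python. The helper name `write_api_page` and the temporary output directory are illustrative assumptions, not part of the project's tooling; the `nav` entry in `docs/mkdocs.yml` would still need to be added by hand.

```python
import tempfile
from pathlib import Path


def write_api_page(api_dir: Path, relative_page: str, object_path: str) -> Path:
    """Create a one-line mkdocstrings stub page (hypothetical helper)."""
    page = api_dir / relative_page
    page.parent.mkdir(parents=True, exist_ok=True)
    # The ":::" line asks mkdocstrings to inject the docstring of `object_path`.
    page.write_text(f"::: {object_path}\n", encoding="utf-8")
    return page


# Demo in a throwaway directory; in the repository the target would be docs/pages/api.
with tempfile.TemporaryDirectory() as tmp:
    page = write_api_page(
        Path(tmp) / "api", "transformers/cca.md", "sknnr.transformers.CCATransformer"
    )
    print(page.read_text(encoding="utf-8"), end="")
```

The demo writes the same single line that the existing `docs/pages/api/transformers/cca.md` page contains.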

Whenever the docs are updated, they will be automatically rebuilt and deployed by [ReadTheDocs](https://about.readthedocs.com). Build status can be monitored [here](https://readthedocs.org/projects/sknnr/builds/).

### Releasing

First, use `hatch` to [update the version number](https://hatch.pypa.io/latest/version/#updating):

```bash
$ hatch version [major|minor|patch]
```

Then, [build](https://hatch.pypa.io/latest/build/#building) and [publish](https://hatch.pypa.io/latest/publish/#publishing) the release to PyPI with:

```bash
$ hatch clean
$ hatch build
$ hatch publish
```
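
For readers unfamiliar with the `major|minor|patch` segments passed to `hatch version`, the sketch below shows how each one changes a version string. It is an illustration only, assuming a plain `MAJOR.MINOR.PATCH` version; `bump_version` is a hypothetical function, `hatch` performs the real update, and pre-release or other segments are not covered.

```python
def bump_version(version: str, segment: str) -> str:
    """Bump one segment of a MAJOR.MINOR.PATCH string (illustrative only)."""
    major, minor, patch = (int(part) for part in version.split("."))
    if segment == "major":
        # A major bump resets the lower segments to zero.
        return f"{major + 1}.0.0"
    if segment == "minor":
        return f"{major}.{minor + 1}.0"
    if segment == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown segment: {segment!r}")


for segment in ("major", "minor", "patch"):
    print(segment, "->", bump_version("0.1.0", segment))
```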
51 changes: 51 additions & 0 deletions docs/pages/index.md
@@ -0,0 +1,51 @@
# SKNNR Docs

`scikit-learn-knn-regression` (a.k.a. `sknnr`) is a package for running k-nearest neighbor imputation methods, including GNN (Ohmann & Gregory, 2002), MSN (Moeur & Stage, 1995), and RFNN (Crookston & Finley, 2008), using estimators that are fully compatible with [`scikit-learn`](https://scikit-learn.org/stable/).

## Quick-Start

1. Follow the [installation guide](installation.md).
2. Import any `sknnr` estimator, like [MSNRegressor](api/estimators/msn.md), as a drop-in replacement for a `scikit-learn` regressor.
    ```python
    from sknnr import MSNRegressor

    est = MSNRegressor()
    ```
3. Load a built-in dataset like [SWO Ecoplot](api/datasets/swo_ecoplot.md) (or bring your own).
    ```python
    from sknnr.datasets import load_swo_ecoplot

    X, y = load_swo_ecoplot(return_X_y=True, as_frame=True)
    ```
4. Train, predict, and score [as usual](https://scikit-learn.org/stable/getting_started.html#fitting-and-predicting-estimator-basics).
    ```python
    from sklearn.model_selection import train_test_split

    X_train, X_test, y_train, y_test = train_test_split(X, y)

    est = est.fit(X_train, y_train)
    est.score(X_test, y_test)
    ```
5. Check out the additional features like [independent scoring](usage.md#independent-scores-and-predictions), [dataframe indexing](usage.md#retrieving-dataframe-indexes), and [dimensionality reduction](usage.md#dimensionality-reduction).
    ```python
    # Score the model against its own training plots, skipping each plot's nearest neighbor (itself)
    print(est.fit(X, y).independent_score_)

    # Get the dataframe index of the nearest neighbor to each plot
    print(est.kneighbors(return_dataframe_index=True, return_distance=False))

    # Apply dimensionality reduction using CCorA ordination
    MSNRegressor(n_components=3).fit(X_train, y_train)
    ```
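
To illustrate the idea behind the independent scoring mentioned in step 5, here is a minimal pure-Python sketch. It is not `sknnr`'s implementation: it assumes a single neighbor (k = 1), Euclidean distance, and made-up toy data, and the function names are hypothetical. When a fitted model is queried with its own training plots, each plot's nearest neighbor is itself, so an independent prediction skips that match and uses the next-nearest plot.

```python
import math


def nearest_other(index, X):
    """Return the index of the row closest to X[index], excluding X[index] itself."""
    candidates = [i for i in range(len(X)) if i != index]
    return min(candidates, key=lambda i: math.dist(X[index], X[i]))


def independent_predictions(X, y):
    """Predict each training target from its nearest *other* training sample."""
    return [y[nearest_other(i, X)] for i in range(len(X))]


# Toy data (made up for illustration): two features and one target per plot.
X = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.2, 5.0)]
y = [10.0, 12.0, 40.0, 44.0]

print(independent_predictions(X, y))  # [12.0, 10.0, 44.0, 40.0]
```

Each plot borrows the value of its closest other plot, so no prediction simply echoes the plot's own target.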

## Acknowledgements

`sknnr` was inspired by the [yaImpute](https://cran.r-project.org/web/packages/yaImpute/index.html) package for R (Crookston & Finley, 2008). Thanks to Andrew Hudak for the inclusion of the [Moscow Mountain / St. Joes dataset](api/datasets/moscow_stjoes.md) (Hudak, 2010), and Tom DeMeo for the inclusion of the [SWO Ecoplot dataset](api/datasets/swo_ecoplot.md) (Atzet et al., 1996). Development of this package was funded in part by an appointment to the United States Forest Service (USFS) Research Participation Program administered by the Oak Ridge Institute for Science and Education (ORISE) through an interagency agreement between the U.S. Department of Energy (DOE) and the U.S. Department of Agriculture (USDA).

## References

- Atzet T, White DE, McCrimmon LA, Martinez PA, Fong PR, Randall VD. 1996. Field guide to the forested plant associations of southwestern Oregon. USDA Forest Service. Pacific Northwest Region, Technical Paper R6-NR-ECOL-TP-17-96.
- Crookston NL, Finley AO. 2008. yaImpute: An R package for kNN imputation. Journal of Statistical Software, 23(10), 16.
- Hudak AT. 2010. Field plot measures and predictive maps for "Nearest neighbor imputation of species-level, plot-scale forest structure attributes from LiDAR data". Fort Collins, CO: U.S. Department of Agriculture, Forest Service, Rocky Mountain Research Station. https://www.fs.usda.gov/rds/archive/Catalog/RDS-2010-0012.
- Moeur M, Stage AR. 1995. Most Similar Neighbor: An Improved Sampling Inference Procedure for Natural Resources Planning. Forest Science, 41(2), 337–359.
- Ohmann JL, Gregory MJ. 2002. Predictive Mapping of Forest Composition and Structure with Direct Gradient Analysis and Nearest Neighbor Imputation in Coastal Oregon, USA. Canadian Journal of Forest Research, 32, 725–741.
17 changes: 17 additions & 0 deletions docs/pages/installation.md
@@ -0,0 +1,17 @@
# Installation

!!! info
    `sknnr` will be available through PyPI and conda-forge once it is ready for release. Until then, you can install it from source.

## From source

```bash
pip install git+https://github.com/lemma-osu/scikit-learn-knn-regression@main
```

## Dependencies

- Python >= 3.8
- scikit-learn >= 1.2
- numpy
- scipy
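
The minimum versions listed above can be checked before installing. The following is a minimal sketch using only the standard library; the helper names `leading_version` and `meets_minimum` are hypothetical, and the parser only looks at the leading numeric components of a version string, so it is a rough check and not a full version comparison.

```python
import sys
from importlib import metadata


def leading_version(version: str, parts: int = 2) -> tuple:
    """Parse the leading numeric components, e.g. '1.3.2' -> (1, 3)."""
    numbers = []
    for piece in version.split(".")[:parts]:
        digits = ""
        for char in piece:
            if not char.isdigit():
                break  # stop at suffixes such as 'rc1'
            digits += char
        numbers.append(int(digits or 0))
    return tuple(numbers)


def meets_minimum(version: str, minimum: tuple) -> bool:
    """Return True if `version` is at least `minimum`, e.g. (1, 2)."""
    return leading_version(version, len(minimum)) >= minimum


print("Python >= 3.8:", sys.version_info >= (3, 8))
try:
    installed = metadata.version("scikit-learn")
    print("scikit-learn >= 1.2:", meets_minimum(installed, (1, 2)))
except metadata.PackageNotFoundError:
    print("scikit-learn is not installed")
```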