Update references to the github organization name. (#304)
jtibshirani authored and swager committed Sep 21, 2018
1 parent 369a8e4 commit 43600d5
Showing 5 changed files with 14 additions and 14 deletions.
16 changes: 8 additions & 8 deletions DEVELOPING.md
@@ -8,22 +8,22 @@ The core forest implementation is written in C++, with an R interface powered by

### Code structure

![GRF Architecture Diagram](https://github.com/swager/grf/blob/master/documentation/arch_diagram.png)
![GRF Architecture Diagram](https://github.com/grf-labs/grf/blob/master/documentation/arch_diagram.png)

The forest implementation is composed of two top-level components, [ForestTrainer](https://github.com/swager/grf/blob/master/core/src/forest/ForestTrainer.h) and [ForestPredictor](https://github.com/swager/grf/blob/master/core/src/forest/ForestPredictor.h).
The forest implementation is composed of two top-level components, [ForestTrainer](https://github.com/grf-labs/grf/blob/master/core/src/forest/ForestTrainer.h) and [ForestPredictor](https://github.com/grf-labs/grf/blob/master/core/src/forest/ForestPredictor.h).

ForestTrainer drives the tree-growing process, and has two pluggable components.
* [RelabelingStrategy](https://github.com/swager/grf/blob/master/core/src/relabeling/RelabelingStrategy.h) is applied before every split, and produces a set of relabelled outcomes given the observations for a group of samples. In the case of quantile forests, for example, this strategy computes the quantiles for the group of samples, then relabels them with a factor corresponding to the quantile they belong to.
* [SplittingRule](https://github.com/swager/grf/blob/master/core/src/splitting/SplittingRule.h) is called to find the best split for a particular node, given a set of outcomes. There are currently implementations for standard regression and multinomial splitting.
* [RelabelingStrategy](https://github.com/grf-labs/grf/blob/master/core/src/relabeling/RelabelingStrategy.h) is applied before every split, and produces a set of relabelled outcomes given the observations for a group of samples. In the case of quantile forests, for example, this strategy computes the quantiles for the group of samples, then relabels them with a factor corresponding to the quantile they belong to.
* [SplittingRule](https://github.com/grf-labs/grf/blob/master/core/src/splitting/SplittingRule.h) is called to find the best split for a particular node, given a set of outcomes. There are currently implementations for standard regression and multinomial splitting.

The trained forest produces a [Forest](https://github.com/swager/grf/blob/master/core/src/forest/Forest.h) object. This can then be passed to the ForestPredictor to predict on test samples. The predictor has a pluggable 'prediction strategy', which computes a prediction given a test sample. Prediction strategies can be one of two types:
* [DefaultPredictionStrategy](https://github.com/swager/grf/blob/master/core/src/prediction/DefaultPredictionStrategy.h) computes a prediction given a weighted list of training sample IDs that share a leaf with the test sample. Taking quantile forests as an example, this strategy would compute the quantiles of the weighted leaf samples.
* [OptimizedPredictionStrategy](https://github.com/swager/grf/blob/master/core/src/prediction/OptimizedPredictionStrategy.h) does not predict using a list of neighboring samples and weights, but instead precomputes summary values for each leaf during training, and uses these during prediction. This type of strategy will also be passed to ForestTrainer, so it can define how the summary values are computed.
The trained forest produces a [Forest](https://github.com/grf-labs/grf/blob/master/core/src/forest/Forest.h) object. This can then be passed to the ForestPredictor to predict on test samples. The predictor has a pluggable 'prediction strategy', which computes a prediction given a test sample. Prediction strategies can be one of two types:
* [DefaultPredictionStrategy](https://github.com/grf-labs/grf/blob/master/core/src/prediction/DefaultPredictionStrategy.h) computes a prediction given a weighted list of training sample IDs that share a leaf with the test sample. Taking quantile forests as an example, this strategy would compute the quantiles of the weighted leaf samples.
* [OptimizedPredictionStrategy](https://github.com/grf-labs/grf/blob/master/core/src/prediction/OptimizedPredictionStrategy.h) does not predict using a list of neighboring samples and weights, but instead precomputes summary values for each leaf during training, and uses these during prediction. This type of strategy will also be passed to ForestTrainer, so it can define how the summary values are computed.

Prediction strategies can also compute variance estimates for the predictions, given a forest trained with grouped trees. Because of performance constraints, only 'optimized' prediction strategies can provide variance estimates.

A particular type of forest is created by pulling together a set of pluggable components. As an example, a quantile forest is composed of a QuantileRelabelingStrategy, ProbabilitySplittingRule, and QuantilePredictionStrategy.
The factory classes [ForestTrainers](https://github.com/swager/grf/blob/master/core/src/forest/ForestTrainers.h) and [ForestPredictors](https://github.com/swager/grf/blob/master/core/src/forest/ForestPredictors.h) define the common types of forests like regression, quantile, and causal forests.
The factory classes [ForestTrainers](https://github.com/grf-labs/grf/blob/master/core/src/forest/ForestTrainers.h) and [ForestPredictors](https://github.com/grf-labs/grf/blob/master/core/src/forest/ForestPredictors.h) define the common types of forests like regression, quantile, and causal forests.

### Creating a custom forest

4 changes: 2 additions & 2 deletions README.md
@@ -1,4 +1,4 @@
[![Build Status](https://travis-ci.org/swager/grf.svg?branch=master)](https://travis-ci.org/swager/grf)
[![Build Status](https://travis-ci.org/grf-labs/grf.svg?branch=master)](https://travis-ci.org/grf-labs/grf)
![CRAN Downloads overall](http://cranlogs.r-pkg.org/badges/grand-total/grf)

# grf: generalized random forests
@@ -26,7 +26,7 @@ install.packages("grf")
Any published release can also be installed from source:

```R
install.packages("https://raw.github.com/swager/grf/master/releases/grf_0.10.0.tar.gz", repos = NULL, type = "source")
install.packages("https://raw.github.com/grf-labs/grf/master/releases/grf_0.10.0.tar.gz", repos = NULL, type = "source")
```

Note that to install from source, a compiler that implements C++11 is required (clang 3.3 or higher, or g++ 4.8 or higher). If installing on Windows, the RTools toolchain is also required.
2 changes: 1 addition & 1 deletion REFERENCE.md
@@ -107,7 +107,7 @@ The parameter `min.node.size` relates to the minimum size a leaf node is allowed

There are several important caveats to this parameter:
- When honesty is enabled, the leaf nodes are 'repopulated' after splitting with a fresh subsample. This means that the final tree may contain leaf nodes smaller than the `min.node.size` setting.
- For regression forests, the splitting will only stop once a node has become smaller than `min.node.size`. Because of this, trees can have leaf nodes that violate the `min.node.size` setting. We initially chose this behavior to match that of other random forest packages like `randomForest` and `ranger`, but will likely be changed as it is misleading (see [#143](https://github.com/swager/grf/issues/143)).
- For regression forests, the splitting will only stop once a node has become smaller than `min.node.size`. Because of this, trees can have leaf nodes that violate the `min.node.size` setting. We initially chose this behavior to match that of other random forest packages like `randomForest` and `ranger`, but will likely be changed as it is misleading (see [#143](https://github.com/grf-labs/grf/issues/143)).
- When training a causal forest, `min.node.size` takes on a slightly different notion related to the number of treatment and control samples. More detail can be found in the 'Split Penalization' section below, under the 'Causal Forests' heading.

#### `alpha`
4 changes: 2 additions & 2 deletions r-package/grf/DESCRIPTION
@@ -7,7 +7,7 @@ Author: Julie Tibshirani [aut, cre],
Rina Friedberg [ctb],
Luke Miner [ctb],
Marvin Wright [ctb]
BugReports: https://github.com/swager/grf/issues
BugReports: https://github.com/grf-labs/grf/issues
Maintainer: Julie Tibshirani <jtibs@cs.stanford.edu>
Description: A pluggable package for forest-based statistical estimation and inference.
GRF currently provides methods for non-parametric least-squares regression,
@@ -30,4 +30,4 @@ RoxygenNote: 6.0.1.9000
Suggests:
testthat
SystemRequirements: GNU make
URL: https://github.com/swager/grf
URL: https://github.com/grf-labs/grf
2 changes: 1 addition & 1 deletion r-package/grf/man/grf.Rd
@@ -6,7 +6,7 @@ A pluggable package for forest-based statistical estimation and inference. GRF c

In addition, GRF supports 'honest' estimation (where one subset of the data is used for choosing splits, and another for populating the leaves of the tree), and confidence intervals for least-squares regression and treatment effect estimation.

This package is currently in beta, and we expect to make continual improvements to its performance and usability. For a practical description of the GRF algorithm, including explanations of model parameters and troubleshooting suggestions, please see the [GRF reference](https://github.com/swager/grf/blob/master/REFERENCE.md).
This package is currently in beta, and we expect to make continual improvements to its performance and usability. For a practical description of the GRF algorithm, including explanations of model parameters and troubleshooting suggestions, please see the [GRF reference](https://github.com/grf-labs/grf/blob/master/REFERENCE.md).
}

\examples{
