@aloctavodia aloctavodia commented Nov 12, 2021

The main change introduced in this PR is how particle weights are computed. All particles except the first one (the previous tree) are grown from scratch and re-weighted at each step. After all particles stop growing, we compute the weights of all particles, including the previous tree, and normalize them. This improves accuracy but at the same time makes the sampler "stickier". In a follow-up PR I will introduce better "tree movements" that should improve sampling.
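The final weighting step described above can be sketched roughly as follows. This is not the code from `pymc/bart/pgbart.py`. The function names are hypothetical, and using a log-likelihood as each particle's unnormalized log-weight is an assumption made only for illustration.

```python
import numpy as np


def normalize_log_weights(log_weights):
    """Turn unnormalized log-weights into probabilities that sum to 1."""
    log_weights = np.asarray(log_weights, dtype=float)
    # Subtract the maximum before exponentiating to avoid underflow.
    shifted = log_weights - log_weights.max()
    weights = np.exp(shifted)
    return weights / weights.sum()


def select_particle(prev_tree_log_weight, new_particle_log_weights, rng):
    """Pick one particle once all particles have stopped growing.

    Index 0 is the previous tree; the remaining indices are the
    particles that were grown from scratch (hypothetical layout).
    """
    log_weights = np.concatenate(
        ([prev_tree_log_weight], new_particle_log_weights)
    )
    probs = normalize_log_weights(log_weights)
    index = rng.choice(len(probs), p=probs)
    return index, probs


rng = np.random.default_rng(0)
# Made-up log-weights: one previous tree and three freshly grown particles.
index, probs = select_particle(-10.0, [-12.0, -9.5, -11.0], rng)
```

In this sketch, drawing `index == 0` means the previous tree is kept, which is one way to picture the "stickier" behavior mentioned above.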

To illustrate, this is an example of fitting a function with 5 covariates related to the output variable and then progressively adding more covariates unrelated to the output variable (pure noise).
[Figure: bart_fit_covariantes_m200]

And a partial dependence plot for 10 covariates (only the first 5 are related to the output).
[Figure: pdp_friedman_200_10]
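For readers who want to build a comparable test problem, here is a minimal sketch of a data generator with 5 informative covariates plus pure-noise covariates. It assumes the benchmark is Friedman's five-dimensional function, which the plot name suggests but the PR does not confirm. The function name `friedman_data`, the sample size, and the noise level are all hypothetical.

```python
import numpy as np


def friedman_data(n_samples, n_noise, rng, sigma=1.0):
    """Simulate Friedman's function with extra noise covariates.

    Only the first 5 columns of X influence y; the remaining
    `n_noise` columns are unrelated to the output.
    """
    n_covariates = 5 + n_noise
    X = rng.uniform(0.0, 1.0, size=(n_samples, n_covariates))
    f = (
        10.0 * np.sin(np.pi * X[:, 0] * X[:, 1])
        + 20.0 * (X[:, 2] - 0.5) ** 2
        + 10.0 * X[:, 3]
        + 5.0 * X[:, 4]
    )
    y = f + rng.normal(0.0, sigma, size=n_samples)
    return X, y


rng = np.random.default_rng(42)
# 10 covariates in total: 5 informative and 5 pure noise (assumed sizes).
X, y = friedman_data(n_samples=200, n_noise=5, rng=rng)
```

Increasing `n_noise` while keeping the first 5 columns informative mimics the "progressively adding more covariates" setup described above.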

codecov bot commented Nov 12, 2021

Codecov Report

Merging #5177 (9d49feb) into main (a11eaa2) will decrease coverage by 0.01%.
The diff coverage is 90.00%.

@@            Coverage Diff             @@
##             main    #5177      +/-   ##
==========================================
- Coverage   78.11%   78.10%   -0.02%     
==========================================
  Files          88       88              
  Lines       14159    14161       +2     
==========================================
- Hits        11061    11060       -1     
- Misses       3098     3101       +3     

| Impacted Files | Coverage Δ | |
|---|---|---|
| pymc/bart/pgbart.py | 95.14% <90.00%> (-1.10%) | ⬇️ |

@michaelosthege michaelosthege left a comment

Apart from some variable renaming, there seem to be some algorithmic changes.
I can't really review those.

Do you maybe want to list them in the PR description and link to it from RELEASE-NOTES.md?

RELEASE-NOTES.md Outdated
- Added partial dependence plots and individual conditional expectation plots [5091](https://github.com/pymc-devs/pymc3/pull/5091).
- Added linear response, increased number of trees fitted per step [5044](https://github.com/pymc-devs/pymc3/pull/5044).
- Added partial dependence plots and individual conditional expectation plots [5091](https://github.com/pymc-devs/pymc3/pull/5091).
- Modify how particle weights are computed. This imporves accuracy of the modeled function (see [5177](https://github.com/pymc-devs/pymc3/pull/5177)).

Suggested change
- Modify how particle weights are computed. This imporves accuracy of the modeled function (see [5177](https://github.com/pymc-devs/pymc3/pull/5177)).
- Modify how particle weights are computed. This improves accuracy of the modeled function (see [5177](https://github.com/pymc-devs/pymc3/pull/5177)).

@aloctavodia aloctavodia merged commit c0c5a80 into pymc-devs:main Nov 17, 2021
@aloctavodia aloctavodia deleted the bart_wt branch November 17, 2021 13:25
morganstrom pushed a commit to morganstrom/pymc that referenced this pull request Nov 17, 2021
* improve accuracy and other minor fixes

* update release notes

* fix typo
3 participants