sdual Merge pull request #39 from sdual/skstan-docs
Create docs by using Sphinx
Latest commit b27b799 Oct 18, 2018

scikit-stan

What is scikit-stan

scikit-stan lets you use various Bayesian models built on Stan (http://mc-stan.org) and PyStan through an elegant interface similar to scikit-learn or Keras.

Demo

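import numpy as np

from skstan.regression.linear_models import LogisticRegression

if __name__ == '__main__':
    # Five observations with three features each.
    x = np.array(
        [
            [1, 2, 3],
            [1, 2, 7],
            [1, 0, 3],
            [1, 1, 3],
            [3, 7, 3],
        ]
    )
    # Binary labels, one per row of x.
    y = np.array([0, 0, 0, 0, 1])

    # Build the model and fit it; fit() returns a result object.
    glm = LogisticRegression(shrinkage=10, chains=8)
    fit = glm.fit(x, y)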

This returns a result object, fit. Its stanfit field holds the underlying PyStan fit object.

print(fit.stanfit)

This prints the following summary:

Inference for Stan model: anon_model_f63cd5ccdd67c22034b2490ae4c9cdd1.
4 chains, each with iter=2000; warmup=1000; thin=1; 
post-warmup draws per chain=1000, total post-warmup draws=4000.

           mean se_mean     sd   2.5%    25%    50%    75%  97.5%  n_eff   Rhat
alpha[0]   2.23    0.29   8.88 -14.39  -3.87   2.02   8.07  20.67    966    1.0
alpha[1]   7.81    0.18   5.29  -1.08   4.01   7.48  11.23  19.01    880    1.0
alpha[2]  -9.79    0.22   5.87 -22.91 -13.41  -9.37  -5.38  -0.17    728    1.0
beta      -2.48    0.29    9.8 -22.63  -9.03   -2.3   3.99  16.91   1146    1.0
yp[0]    -13.99    0.32  11.19 -40.69 -20.24 -11.35  -5.42    0.3   1259    1.0
yp[1]    -53.15    1.14  32.08 -128.4 -71.99 -48.46 -29.35  -5.24    790    1.0
yp[2]    -29.61     0.6  16.66 -67.24 -39.44 -27.97  -17.0  -4.37    771    1.0
yp[3]     -21.8    0.44  13.17 -52.98 -29.03 -19.57 -11.91  -3.23    894    1.0
yp[4]     29.51    0.69  24.68   0.58   10.3  23.36  42.72  90.17   1276    1.0
lp__      -2.16    0.05   1.48  -5.93   -2.9  -1.81  -1.07  -0.32    956    1.0

Samples were drawn using NUTS at Thu Apr 13 07:52:33 2017.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at 
convergence, Rhat=1).
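
To make the Rhat column more concrete, here is a minimal sketch of a split-chain potential scale reduction factor computed with NumPy. It is a simplified illustration, not the code Stan or PyStan runs: the function name split_rhat and the synthetic chains are assumptions made for this example, and Stan's own estimator differs in its details.

```python
import numpy as np


def split_rhat(chains):
    """Simplified split-chain Rhat for draws shaped (n_chains, n_draws).

    Hypothetical helper for illustration; not part of skstan or PyStan.
    """
    chains = np.asarray(chains, dtype=float)
    half = chains.shape[1] // 2
    # Split every chain into two halves and treat each half as its own chain.
    halves = np.concatenate([chains[:, :half], chains[:, half:2 * half]], axis=0)
    n = halves.shape[1]
    within = halves.var(axis=1, ddof=1).mean()     # W: mean within-chain variance
    between = n * halves.mean(axis=1).var(ddof=1)  # B: scaled variance of chain means
    pooled = (n - 1) / n * within + between / n    # pooled variance estimate
    return float(np.sqrt(pooled / within))


rng = np.random.default_rng(0)
mixed = rng.normal(size=(4, 1000))                        # four chains, same target
stuck = mixed + np.array([[0.0], [5.0], [10.0], [15.0]])  # chains that disagree

rhat_mixed = split_rhat(mixed)
rhat_stuck = split_rhat(stuck)
print(rhat_mixed, rhat_stuck)
```

Chains that explore the same distribution give a value close to 1, as in the summary above, while chains that sit in different regions give a value well above 1.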

The skstan result object also has prediction methods. Because the model is Bayesian, the predict_dist method returns predicted values as samples from a distribution rather than as single point estimates.

yp_dist = fit.predict_dist(x)
print(yp_dist)

This prints:

array([[  2.63886682e-08,   5.23976746e-04,   5.54863097e-05, ...,
          2.46008578e-08,   3.74830192e-01,   3.45994043e-03],
       [  1.07746578e-22,   1.01664809e-18,   4.12813154e-26, ...,
          5.64992544e-19,   7.24386097e-12,   1.75795155e-23],
       [  8.04688037e-22,   4.44522113e-12,   1.42920488e-11, ...,
          7.71565191e-13,   5.13118658e-05,   4.26331280e-05],
       [  4.60810657e-15,   4.82743551e-08,   2.81612678e-08, ...,
          1.37772153e-10,   5.51614998e-03,   3.84594197e-04],
       [  9.99999998e-01,   1.00000000e+00,   1.00000000e+00, ...,
          9.99965378e-01,   1.00000000e+00,   1.00000000e+00]])
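
A sample array like this can be summarised row by row. The sketch below assumes the layout suggested by the output above, one row per observation and one column per draw, and uses a synthetic stand-in array (the name samples and its Beta-distributed values are made up for illustration) so that it runs without Stan installed.

```python
import numpy as np

# Synthetic stand-in for yp_dist: 5 observations x 4000 draws.
# The values are arbitrary probabilities, not output from skstan.
rng = np.random.default_rng(0)
samples = rng.beta(a=2.0, b=5.0, size=(5, 4000))

# Percentiles along axis=1 summarise the draws of each observation.
lower, median, upper = np.percentile(samples, [2.5, 50.0, 97.5], axis=1)

for i in range(samples.shape[0]):
    print(f"obs {i}: median={median[i]:.3f}, "
          f"95% interval=({lower[i]:.3f}, {upper[i]:.3f})")
```

With a real result, passing yp_dist in place of samples should give a median and a 95% interval per observation, provided the array really has that layout.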

Let's plot a histogram of the first row with pandas.Series.

import pandas as pd
pd.Series(yp_dist[0]).hist(bins=20)

Histogram of first row

If you only need the median of the samples, use the predict method:

yp = fit.predict(x)
print(yp)

This prints:

array([  1.17280235e-05,   9.01419773e-22,   7.16023732e-13,
         3.18368664e-09,   1.00000000e+00])
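
These values look consistent with applying the inverse-logit function to the median (50% column) of each yp row in the summary printed earlier. The sketch below checks that reading with NumPy alone. It is an observation about the numbers shown here, not a statement of how skstan computes predict internally, and the helper inverse_logit is defined only for this example.

```python
import numpy as np


def inverse_logit(z):
    """Map a log-odds value to a probability in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


# 50% column of yp[0]..yp[4], copied from the summary table above.
yp_median = np.array([-11.35, -48.46, -27.97, -19.57, 23.36])

probabilities = inverse_logit(yp_median)
print(probabilities)
```

Small differences from the predict output are expected, because the summary table rounds the medians to two decimal places.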

How to install

Install

Installers for the latest released version are available on PyPI.

pip3 install skstan

Install from sources

git clone https://github.com/BayesianFreaks/scikit-stan
cd scikit-stan
python3 setup.py install

Uninstall

pip3 uninstall skstan

Using python2?

Are you joking?

We can't reach you: we are living in your future, and you are living in a past age. Please say hello to Nobunaga Oda.

We will always use the newest features of the latest version of Python, so you should use the latest version of Python too.

Models

Ready

Regression Models

  • Linear Regression
  • Poisson Regression
  • Logistic Regression

Next Steps

Regression Models

  • Gamma Regression
  • GLMM
  • etc...

Time Series Models

  • AR Model
  • MA Model
  • ARMA Model
  • ARIMA Model
  • ARCH Model
  • GARCH Model
  • TAR Model
  • State Space Model
  • Some Dynamic Regression Models
  • etc...

Clustering Models

  • Gaussian Mixture Model
  • Latent Dirichlet Allocation
  • etc...

Particular Applications

  • Online advertisement modeling
  • Time series decomposition
  • Empirical Bayesian Estimation