julia_bayes_ml_suyama

Reproducing the figures in "Introduction to Machine Learning by Bayesian Inference" by Atsushi Suyama, using JuliaLang.


(ISBN 9784061538320)

Environment

  • Julia: 1.3.1

    • "LaTeXStrings" => v"1.0.3"
    • "Combinatorics" => v"1.0.0"
    • "Makie" => v"0.9.5"
    • "IJulia" => v"1.20.2"
    • "AbstractPlotting" => v"0.9.17"
    • "Plots" => v"0.28.4"
    • "Colors" => v"0.9.6"

Fig 2.1

2.1.5 Approximate calculation of expectation by sampling

fig_2_1
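
As a supplement, here is a minimal sketch of the idea, assuming Distributions.jl; the target distribution p, the integrand f, and the sample size are illustrative choices, not necessarily those of the script.

using Distributions, Statistics

# approximate E[f(x)] under p by a sample average of f over draws from p
p = Normal(0, 1)               # illustrative target distribution
f(x) = x^2                     # illustrative integrand; E[x^2] = 1 here
samples = rand(p, 10_000)
estimate = mean(f.(samples))   # converges to 1 as the sample size grows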


Fig 2.2

2.2.1 Bernoulli distribution

Entropy of Bernoulli distribution
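
For reference, the Bernoulli distribution and the entropy plotted here are:

$$\mathrm{Bern}(x \mid \mu) = \mu^{x} (1-\mu)^{1-x}$$

$$H[\mathrm{Bern}(x \mid \mu)] = -\mu \ln \mu - (1-\mu) \ln (1-\mu)$$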

fig_2_2


Fig 2.3

2.2.2 Binomial distribution
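
For reference, the binomial distribution over m successes in N trials is (notation may differ slightly from the book):

$$\mathrm{Bin}(m \mid N, \mu) = \binom{N}{m} \mu^{m} (1-\mu)^{N-m}$$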

fig_2_3

Julia tips

layout of divided figures

using Plots

l = @layout [a; b c]       # one full-width panel on top, two side by side below
a = bar(rand(5))           # placeholder data, just to make the example run
b = bar(rand(5))
c = bar(rand(5))
plot(a, b, c, layout=l)

Fig 2.4

2.2.4 Multinomial distribution
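
For reference, the multinomial distribution is (notation may differ slightly from the book):

$$\mathrm{Mult}(\mathbf{m} \mid \boldsymbol{\pi}, M) = \frac{M!}{\prod_{k=1}^{K} m_k!} \prod_{k=1}^{K} \pi_k^{m_k}, \qquad \sum_{k=1}^{K} m_k = M$$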

fig_2_4

Julia tips

  • 3D barplot

Makie.jl provides a 3D scatter function called meshscatter. A 3D bar plot can be obtained from meshscatter as shown below.

using Makie
using AbstractPlotting

# mplot holds the matrix of probabilities being plotted (defined in the script);
# each marker is a 1×1 tile whose z-size stretches it down to the (x, y) plane
markersize = Vec3f0.(1, 1, -vec(mplot))

Here, the 1×1 tile at each scatter point is extended down to the (x, y) plane by setting the marker's z-size to -1 * z_value (in this case, -vec(mplot)). The result looks like bars.

  • layout

Makie's layout mechanism is used to display the three bar graphs. Both the entire figure region and the sub-regions containing each bar plot are called a "scene", and the size of each region can be defined (see the script).

Once the scenes are defined, we can draw into each one by calling a plotting function on it, like this:

meshscatter!(scene1, ...)

Fig 2.6

2.2.5 Poisson distribution
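
For reference, the Poisson distribution is:

$$\mathrm{Poi}(x \mid \lambda) = \frac{\lambda^{x} e^{-\lambda}}{x!}$$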

fig_2_6


Fig 2.7

2.3.1 Beta distribution

Beta distribution:

$$\mathrm{Beta}(\mu \mid a, b) = C_B(a, b)\, \mu^{a-1} (1-\mu)^{b-1}, \qquad C_B(a, b) = \frac{\Gamma(a+b)}{\Gamma(a)\,\Gamma(b)}$$

Gamma function:

$$\Gamma(x) = \int_0^{\infty} t^{x-1} e^{-t}\, dt$$

fig_2_7

Fig 2.8

2.3.2 Dirichlet distribution

Dirichlet distribution:

$$\mathrm{Dir}(\boldsymbol{\pi} \mid \boldsymbol{\alpha}) = C_D(\boldsymbol{\alpha}) \prod_{k=1}^{K} \pi_k^{\alpha_k - 1}$$

where

$$C_D(\boldsymbol{\alpha}) = \frac{\Gamma\left(\sum_{k=1}^{K} \alpha_k\right)}{\prod_{k=1}^{K} \Gamma(\alpha_k)}$$

and $\Gamma(\cdot)$ is the Gamma function.

fig_2_8

Julia tips

If you want to omit the color bar in a plot, use the legend=:none option of the plot function:

legend=:none

Fig 2.9

2.3.3 Gamma distribution

Gamma distribution:

$$\mathrm{Gam}(\lambda \mid a, b) = C_G(a, b)\, \lambda^{a-1} e^{-b\lambda}$$

where

$$C_G(a, b) = \frac{b^{a}}{\Gamma(a)}$$

fig_2_9

Fig 2.10

2.3.4 One-dimensional Gaussian distribution

1D Gaussian distribution
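
For reference:

$$\mathcal{N}(x \mid \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$$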

fig_2_10

Fig 2.11

2.3.4 One-dimensional Gaussian distribution

Kullback-Leibler divergence

KL divergence for 1D Gaussian distribution
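
For reference, the closed form between two 1D Gaussians is:

$$\mathrm{KL}\left[\mathcal{N}(\mu_1, \sigma_1^2) \,\|\, \mathcal{N}(\mu_2, \sigma_2^2)\right] = \ln\frac{\sigma_2}{\sigma_1} + \frac{\sigma_1^2 + (\mu_1 - \mu_2)^2}{2\sigma_2^2} - \frac{1}{2}$$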

fig_2_11


Fig 2.12

2.3.5 Multivariate Gaussian distribution

Multivariate Gaussian distribution
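
For reference, with D-dimensional x:

$$\mathcal{N}(\mathbf{x} \mid \boldsymbol{\mu}, \boldsymbol{\Sigma}) = \frac{1}{(2\pi)^{D/2} |\boldsymbol{\Sigma}|^{1/2}} \exp\left(-\frac{1}{2} (\mathbf{x} - \boldsymbol{\mu})^{\top} \boldsymbol{\Sigma}^{-1} (\mathbf{x} - \boldsymbol{\mu})\right)$$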

fig_2_12

Fig 2.13

2.3.6 Wishart distribution

Wishart distribution
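
For reference, writing the normalizing constant as $C_{\mathcal{W}}(\nu, \mathbf{W})$:

$$\mathcal{W}(\boldsymbol{\Lambda} \mid \nu, \mathbf{W}) = C_{\mathcal{W}}(\nu, \mathbf{W})\, |\boldsymbol{\Lambda}|^{(\nu - D - 1)/2} \exp\left(-\frac{1}{2} \mathrm{Tr}(\mathbf{W}^{-1} \boldsymbol{\Lambda})\right)$$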

fig_2_13


Fig 3.3

3.2 Learning and prediction of discrete probability distributions

3.2.1 Learning and prediction of the Bernoulli distribution

Consider a Bernoulli distribution over a binary variable x.

Here we want to learn the parameter µ, so we place a Beta distribution over µ as the prior.

After observing N data points, the posterior distribution over µ is again a Beta distribution with updated hyperparameters:

$$p(\mu \mid \mathbf{X}) = \mathrm{Beta}(\mu \mid \hat{a}, \hat{b})$$

where

$$\hat{a} = \sum_{n=1}^{N} x_n + a, \qquad \hat{b} = N - \sum_{n=1}^{N} x_n + b$$
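
A minimal Julia sketch of this update, assuming Distributions.jl (the prior values and data are illustrative):

using Distributions

a, b = 1.0, 1.0                    # prior hyperparameters: Beta(a, b)
X = rand(Bernoulli(0.3), 100)      # illustrative data: N = 100 coin flips
a_hat = sum(X) + a                 # posterior hyperparameters
b_hat = length(X) - sum(X) + b
posterior = Beta(a_hat, b_hat)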

fig_3_03


Fig 3.4

Learning and prediction of 1-D Gaussian distribution

The mean is known and the precision (λ) is unknown, so a Gamma prior is placed over λ. The posterior is again a Gamma distribution:

$$p(\lambda \mid \mathbf{X}) = \mathrm{Gam}(\lambda \mid \hat{a}, \hat{b})$$

where

$$\hat{a} = \frac{N}{2} + a, \qquad \hat{b} = \frac{1}{2} \sum_{n=1}^{N} (x_n - \mu)^2 + b$$

Writing out the logarithm of the predictive distribution and comparing it with the logarithm of a Student's t distribution shows that the predictive distribution is Student's t:

$$p(x_{*} \mid \mathbf{X}) = \mathrm{St}\left(x_{*} \,\middle|\, \mu, \frac{\hat{a}}{\hat{b}}, 2\hat{a}\right)$$

fig_3_04

Fig 3.6 & 3.7

3.5 Linear Regression

3.5.1 Model creation

fig_3_06

Fig 3.6: third-order (cubic) functions sampled from the model prior to training.

fig_3_07

Fig 3.7: synthetic data points (y_n) sampled from the function.
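
A minimal sketch of the function-sampling step, assuming Distributions.jl and Plots.jl (the prior parameters and number of draws are illustrative):

using LinearAlgebra, Distributions, Plots

M = 4                                            # cubic model: 4 weights
prior = MvNormal(zeros(M), Matrix(1.0I, M, M))   # p(w) = N(w | 0, I)
xs = range(-1, 1, length=100)
plt = plot(legend=:none)
for _ in 1:5                                     # draw and plot 5 random cubics
    w = rand(prior)
    plot!(plt, xs, [dot(w, [1, x, x^2, x^3]) for x in xs])
end
display(plt)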

Fig 3.8

Calculation of posterior distribution and predictive distribution

Posterior distribution of the parameter w in the linear regression model:

$$p(\mathbf{w} \mid \mathbf{Y}, \mathbf{X}) = \mathcal{N}(\mathbf{w} \mid \hat{\mathbf{m}}, \hat{\boldsymbol{\Lambda}}^{-1})$$

where

$$\hat{\boldsymbol{\Lambda}} = \lambda \sum_{n=1}^{N} \mathbf{x}_n \mathbf{x}_n^{\top} + \boldsymbol{\Lambda}, \qquad \hat{\mathbf{m}} = \hat{\boldsymbol{\Lambda}}^{-1} \left( \lambda \sum_{n=1}^{N} y_n \mathbf{x}_n + \boldsymbol{\Lambda} \mathbf{m} \right)$$

Predictive distribution:

$$p(y_{*} \mid \mathbf{x}_{*}) = \mathcal{N}(y_{*} \mid \mu_{*}, \lambda_{*}^{-1})$$

where

$$\mu_{*} = \hat{\mathbf{m}}^{\top} \mathbf{x}_{*}, \qquad \lambda_{*}^{-1} = \lambda^{-1} + \mathbf{x}_{*}^{\top} \hat{\boldsymbol{\Lambda}}^{-1} \mathbf{x}_{*}$$

fig_3_08

Fig 3.9

Comparison between models

marginal likelihood
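
For reference, the marginal likelihood (model evidence) integrates the parameter w out of the likelihood:

$$p(\mathbf{Y} \mid \mathbf{X}) = \int p(\mathbf{Y} \mid \mathbf{X}, \mathbf{w})\, p(\mathbf{w})\, d\mathbf{w}$$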

fig_3_09

Fig 3.10

Comparison between models

fig_3_10

Fig 3.11

Comparison between models

fig_3_11

Fig 4.1 & 4.2

The reason to adopt a mixture model

A single Gaussian distribution cannot represent a sample distribution that contains multiple classes (clusters).

fig_4_1

Similarly, a single polynomial regression curve cannot fit two separate trends. When M (the polynomial degree) is 4, the fitted curve shows the average of the two trends; when M is 30, the curve oscillates back and forth between them. For data like this we should assume multiple (here, two) regression functions.

fig_4_2

Fig 4.4

Gibbs sampling
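
The book applies Gibbs sampling to a mixture model; as a generic illustration of the technique only, here is a minimal sketch that Gibbs-samples a correlated 2D standard Gaussian by alternating its two full conditionals, assuming Distributions.jl:

using Distributions

# Gibbs sampling alternately draws each variable from its full conditional;
# for a bivariate standard normal with correlation ρ, each conditional is 1D normal.
function gibbs_2d_gaussian(ρ, n)
    samples = zeros(n, 2)
    x1, x2 = 0.0, 0.0
    for i in 1:n
        x1 = rand(Normal(ρ * x2, sqrt(1 - ρ^2)))   # draw from p(x1 | x2)
        x2 = rand(Normal(ρ * x1, sqrt(1 - ρ^2)))   # draw from p(x2 | x1)
        samples[i, 1], samples[i, 2] = x1, x2
    end
    return samples
end

samples = gibbs_2d_gaussian(0.8, 5_000)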

fig_4_4
