
MADE #39

Open · tejank10 wants to merge 8 commits into master

Conversation

@tejank10 (Contributor) commented May 6, 2018

MADE is a Masked Autoencoder for Distribution Estimation (Germain et al., 2015). @MikeInnes @jekbradbury

@jekbradbury left a comment

This looks awesome, Tejan! Thanks so much!

MADE/made.jl Outdated
rng = MersenneTwister(made.seed)
made.seed = (made.seed + 1) % made.num_masks

# sample the order of the inpdimensionsuts and the connectivity of all neurons

typo: "inpdimensionsuts" should read "input dimensions".

MADE/made.jl Outdated
#Getting data. The data used here is binarized MNIST dataset

X = npzread("/path/to/your/data.npy")
X = X'


Punning on adjoint is not really recommended, right? And deprecated in 0.7, so maybe use permutedims?
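
A minimal sketch of the suggested change, assuming the .npy file stores the data samples-by-features and the model expects features-by-samples:

using NPZ

X = npzread("/path/to/your/data.npy")
X = permutedims(X, (2, 1))  # materialize the transpose instead of punning on adjoint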

hs = push!([in], hs...)
layers = [MaskedDense(hs[i], hs[i + 1]; σ = relu) for i = 1:length(hs) - 1]

net = Chain(layers..., MaskedDense(hs[end], out))


Couldn't this have a special case for the first and last layers (rather than just the last one), allowing you to avoid the push! in line 57?

@tejank10 (author):

Wouldn't it be an issue if hs = []? In that case, Chain(MaskedDense(in, hs[1]), layers..., MaskedDense(hs[end], out)) gives a BoundsError.
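
For reference, a hedged sketch of one way to special-case both the first and last layers while still handling hs = [] (names as in the quoted constructor):

if isempty(hs)
    # no hidden layers: a single masked layer from input straight to output
    net = Chain(MaskedDense(in, out))
else
    hidden = [MaskedDense(hs[i], hs[i + 1]; σ = relu) for i = 1:length(hs) - 1]
    net = Chain(MaskedDense(in, hs[1]; σ = relu), hidden..., MaskedDense(hs[end], out))
end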

MADE/made.jl Outdated

function set_mask(a::MaskedDense, mask)
a.mask = mask
end

Unless there's validation that you need to do, this probably doesn't need to be a setter method and could just be replaced at its call locations by directly setting the attribute?
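
A sketch of that simplification; masked_layers and masks are hypothetical names for whatever the surrounding mask-update code iterates over:

# Set the field directly at each call site instead of going through set_mask.
# Assumes MaskedDense is a mutable struct (or mask is an array mutated in place).
for (layer, mask) in zip(masked_layers, masks)
    layer.mask = mask
end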

end

function MaskedDense(in::Integer, out::Integer; σ = identity)
return MaskedDense(param(glorot_uniform(out, in)), param(zeros(out)), ones(out, in), σ)


This could probably be modified to be more type-generic? Would be nice to be able to support GPU and/or Float32

@tejank10 (author):

Could you please explain a bit more on that? in and out have to be integers, since MaskedDense is modeled after Dense.
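
To illustrate what type-generic could mean here (a sketch, not necessarily the reviewer's exact proposal): in and out stay integers, but the element type of the parameters becomes a keyword argument, so Float32 and, via gpu, CuArrays fall out naturally. T is the only new name introduced:

function MaskedDense(in::Integer, out::Integer; σ = identity, T = Float32)
    W = param(T.(glorot_uniform(out, in)))  # convert the init to the requested eltype
    b = param(zeros(T, out))
    mask = ones(T, out, in)
    return MaskedDense(W, b, mask, σ)
end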

MADE/made.jl Outdated
nin::Integer
nout::Integer
hidden_sizes::Array{Integer, 1}
net::Chain


These should probably be concretely typed (Int and either Vector{Int} or a tuple)
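
That is, something like the following for the quoted fields (shown mutable since seed is reassigned elsewhere in the file; remaining fields omitted):

mutable struct MADE
    nin::Int
    nout::Int
    hidden_sizes::Vector{Int}  # concrete element type, unlike Array{Integer, 1}
    net::Chain
    # ... remaining fields ...
end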

L = length(made.hidden_sizes)

# fetch the next seed and construct a random stream
rng = MersenneTwister(made.seed)


Is this reinitialization going to be slow?
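
One hedged way to sidestep the question: construct the RNG once and re-seed it in place on each call, avoiding the per-call allocation. made.rng is a hypothetical extra field, not in the current code:

# Re-seed a stored RNG instead of allocating a fresh MersenneTwister per call.
srand(made.rng, made.seed)  # Julia 0.6 spelling; Random.seed!(made.rng, made.seed) on 0.7+
made.seed = (made.seed + 1) % made.num_masks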

MADE/made.jl Outdated
num_masks
seed::UInt # for cycling through num_masks orderings

m::Dict


Maybe give this dict a concrete type for clarity about what it's going to hold?
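
If, as in the reference MADE implementation, the dict maps a layer index to that layer's per-unit connectivity integers, the concrete type would be something like:

m::Dict{Int, Vector{Int}}  # layer index => connectivity numbers for that layer's units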

@MikeInnes (Member) commented Jul 5, 2018

Can you update this to use Flux.MNIST rather than the numpy file? We'll need this to work out of the box.

Also, this line seemed sketchy: given that MaskedDense is immutable, what's going on there?
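
For the MNIST point, a minimal sketch using the Flux.Data.MNIST loader that shipped with Flux at the time (shapes assumed, not taken from the PR):

using Flux.Data.MNIST

imgs = MNIST.images()            # vector of 28×28 grayscale images
X = hcat(float.(vec.(imgs))...)  # 784 × 60000 matrix, one column per image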

@tejank10 (author) commented Jul 5, 2018

I have updated the model to use MNIST, rather than the binarized MNIST used by the original paper. I have also fixed the mentioned line. It's just that the masks are getting updated there, similar to how the weights are updated after running through the optimizer (someone please correct me if I am wrong there).

@MikeInnes force-pushed the master branch 2 times, most recently from d7061a7 to 98fe429, on July 12, 2018.