
I have some trouble translating a model from PyMC3 #109

Closed
laygr opened this issue Jun 19, 2018 · 4 comments

@laygr

laygr commented Jun 19, 2018

I'm very excited about this library, so I really want to learn how to use it. I'm translating the "Cheating among students" example from the book Probabilistic Programming & Bayesian Methods for Hackers from PyMC3 to ZhuSuan.

It is a clever way of inferring how many students cheated on an exam. It involves a "privacy algorithm" in which each student answers whether they cheated either truthfully or at random, depending on two coin flips:

First coin flip:

  • Heads => they answer truthfully, that is, whether or not they cheated
  • Tails => they answer according to the second coin flip

Second coin flip:

  • Heads => they answer that they cheated
  • Tails => they answer that they didn't cheat

This way, we can't point to an individual student and claim that they cheated, since only the student knows the results of the coin flips, so their privacy remains protected.

After the survey, we observe that 35 students out of 100 answered that they cheated. The inference problem is to recover the true probability of cheating. Note that even if no student cheated, we would still expect about 25 students to answer that they cheated (100 × 0.5 × 0.5 = 25, from the students whose first flip was tails and second flip was heads).
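
For intuition, here is a quick NumPy simulation of the scheme (the true cheating frequency of 0.05 is just a made-up value for illustration):

import numpy as np

np.random.seed(0)
N = 100
true_p = 0.05  # hypothetical true cheating frequency, illustration only

cheated = np.random.rand(N) < true_p
first_flip = np.random.rand(N) < 0.5    # heads => answer truthfully
second_flip = np.random.rand(N) < 0.5   # heads => answer "I cheated"
answers = np.where(first_flip, cheated, second_flip)

# After marginalizing the coin flips, P("yes") = 0.5 * true_p + 0.25,
# so with true_p = 0 we would still expect a "yes" proportion of 0.25.
print(answers.mean(), 0.5 * true_p + 0.25)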

However, it seems I'm getting something wrong since my results don't look similar to the results obtained in the original code.

This is my graphical model (a running Jupyter notebook for this code can be found here):

def create_model(observed):
  with zs.BayesianNet(observed=observed) as model:

    # This is what we want to infer: the probability that a student cheats.
    cheating_freq = zs.Uniform('cheating_freq', minval=0.0, maxval=1.0, group_ndims=1)

    # Per-student cheating indicators and the two coin flips.
    # (N, n_ones -- presumably a float vector of N ones -- and the logit
    # helper are defined earlier in the notebook.)
    not_cheated = zs.Bernoulli('not_cheated', logits=n_ones * logit(cheating_freq), group_ndims=1)
    first_coin_flips = zs.Bernoulli('first_coin_flips', logits=n_ones * logit(0.5), group_ndims=1)
    second_coin_flips = zs.Bernoulli('second_coin_flips', logits=n_ones * logit(0.5), group_ndims=1)

    # Each student answers truthfully on heads, and with the second flip on tails.
    answers = first_coin_flips * not_cheated + (1 - first_coin_flips) * second_coin_flips

    observed_proportion = tf.cast(tf.reduce_sum(answers) / N, tf.float32)
    observations = zs.Binomial('observations', logits=logit(observed_proportion),
                               n_experiments=tf.constant(N), group_ndims=1)
  return model

This is the code that sets up the HMC sampler:

n_chains = 1
n_iters = 40000
burnin = 20000
n_leapfrogs = 5

# Boolean placeholders so step-size / mass adaptation can be switched off
# partway through sampling.
adapt_step_size = tf.placeholder(tf.bool, shape=[], name="adapt_step_size")
adapt_mass = tf.placeholder(tf.bool, shape=[], name="adapt_mass")
hmc = zs.HMC(step_size=1e-3, n_leapfrogs=n_leapfrogs,
             adapt_step_size=adapt_step_size, adapt_mass=adapt_mass,
             target_acceptance_rate=0.9)

def log_joint(observed):
  model = create_model(observed)
  cf, nc = model.local_log_prob(['cheating_freq', 'not_cheated'])
  return cf + nc

qcheating_freq = tf.Variable(0.1 * tf.ones([n_chains, 1]), trainable=True)
obs = tf.constant([35])
sample_op, hmc_info = hmc.sample(log_joint,
                                 observed={'observations': obs},
                                 latent={'cheating_freq': qcheating_freq})

And this is the code for running the inference:

with tf.Session() as sess:
  sess.run(tf.global_variables_initializer())
  samples = []
  print('Sampling...')
  for i in range(n_iters):
    # Adapt the step size and mass only during the first half of burn-in.
    _, fc_sample, acc, ss = sess.run(
        [sample_op, hmc_info.samples['cheating_freq'],
         hmc_info.acceptance_rate, hmc_info.updated_step_size],
        feed_dict={adapt_step_size: i < burnin // 2,
                   adapt_mass: i < burnin // 2})
    if i % 250 == 0:
      print('Sample {}: Acceptance rate = {}, updated step size = {}'
            .format(i, np.mean(acc), ss))
    if i >= burnin:
      samples.append(fc_sample)
  print('Finished.')
  samples = np.vstack(samples)

If you have the time, I'd really appreciate your help.

@laygr
Author

laygr commented Jun 20, 2018

A better name for the variable not_cheated would be cheated, since a 1 means that the student cheated and a 0 means that they didn't.

@meta-inf
Member

Basically, you specified a joint distribution over cheating_freq, not_cheated, and observations (which you forgot to include in log_joint), but only sampled a subset of the variables (cheating_freq). You could:

  • include not_cheated in the sampler, but it's discrete, and HMC can't work on it without further augmentation;
  • derive the joint distribution in which not_cheated is marginalized out, but in general that's hard (for this particular coin-flip scheme it happens to have a closed form; see the sketch after this list).

So I guess instead of a single HMC that samples everything at once, you should use Gibbs like the PyMC3 tutorials did.
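
For reference, integrating out both coin flips and not_cheated gives P(answer = "yes" | cheating_freq) = 0.5 * cheating_freq + 0.25, so the count of "yes" answers is Binomial(N, 0.5 * cheating_freq + 0.25). A minimal, untested sketch of that marginalized model, reusing the logit helper and N from the code above:

def create_marginalized_model(observed):
  with zs.BayesianNet(observed=observed) as model:
    cheating_freq = zs.Uniform('cheating_freq', minval=0.0, maxval=1.0, group_ndims=1)
    # Marginal probability of a "yes" answer under the two-coin scheme.
    p_yes = 0.5 * cheating_freq + 0.25
    observations = zs.Binomial('observations', logits=logit(p_yes),
                               n_experiments=tf.constant(N), group_ndims=1)
  return model

def marginal_log_joint(observed):
  model = create_marginalized_model(observed)
  # Note that the observations term is included here; it was the missing
  # piece in the original log_joint.
  cf, ob = model.local_log_prob(['cheating_freq', 'observations'])
  return cf + ob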

@laygr
Author

laygr commented Jun 20, 2018

I did try several configurations, but I kept getting Python errors; it's good to know that there was no way of succeeding with HMC alone, thanks! I had considered it an almighty algorithm. Perhaps I should take the Probabilistic Graphical Models specialization on Coursera.
I didn't see a Gibbs sampler in the docs.

@meta-inf
Member

meta-inf commented Jun 20, 2018 via email
