##### Copyright 2018 The TensorFlow Authors.

Licensed under the Apache License, Version 2.0 (the "License");

In [0]:
#@title Licensed under the Apache License, Version 2.0 (the "License"); { display-mode: "form" }
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Rasch Model [(Rasch, 1960)](https://en.wikipedia.org/wiki/Rasch_model) (with TFP)

<table class="tfo-notebook-buttons" align="left">
  <td>
    <a target="_blank" href="https://drive.google.com/file/d/1lacnVcpnS-A4Ye9jZ0cl9oY21xIP6yNk/view?usp=sharing"><img src="https://www.tensorflow.org/images/colab_logo_32px.png" />Run in Google Colab</a>
  </td>
  <td>
    <a target="_blank" href=""><img src="https://www.tensorflow.org/images/GitHub-Mark-32px.png" />View source on GitHub</a>
  </td>
</table>
<br>
<br>
<br>

Original content from [this repository](https://github.com/blei-lab/edward), created by [the Blei Lab](http://www.cs.columbia.edu/~blei/).

Ported to TensorFlow Probability by Matthew McAteer ([`@MatthewMcAteer0`](https://twitter.com/MatthewMcAteer0)), with help from the TFP team at Google ([`tfprobability@tensorflow.org`](mailto:tfprobability@tensorflow.org)).

---

>[Dependencies & Prerequisites](#scrollTo=2ZtWUjXYRXQi)

>[Introduction](#scrollTo=2ZtWUjXYRXQi)

>>[Data](#scrollTo=2ZtWUjXYRXQi)

>>[Model](#scrollTo=2ZtWUjXYRXQi)

>>[Inference](#scrollTo=2ZtWUjXYRXQi)

>>[Criticism](#scrollTo=2ZtWUjXYRXQi)

>[References](#scrollTo=2ZtWUjXYRXQi)

## Dependencies & Prerequisites

In [0]:
!pip3 install -q tfp-nightly
!pip3 install -q observations

In [0]:
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

# import edward as ed
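import IPython.display  # used by draw_graph below
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
import tensorflow_probability as tfp

# `Empirical` is provided by TFP's distributions module, not `tf.distributions`.
tfd = tfp.distributions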

# from edward.models import Bernoulli, Normal, Empirical
from scipy.special import expit


In [0]:
def session_options(enable_gpu_ram_resizing=True, enable_xla=True):
    """
    Allowing the notebook to make use of GPUs if they're available.
    
    XLA (Accelerated Linear Algebra) is a domain-specific compiler for linear 
    algebra that optimizes TensorFlow computations.
    """
    config = tf.ConfigProto()
    config.log_device_placement = True
    if enable_gpu_ram_resizing:
        # `allow_growth=True` makes it possible to connect multiple colabs to your
        # GPU. Otherwise the colab malloc's all GPU ram.
        config.gpu_options.allow_growth = True
    if enable_xla:
        # Enable XLA. https://www.tensorflow.org/performance/xla/.
        config.graph_options.optimizer_options.global_jit_level = (
            tf.OptimizerOptions.ON_1)
    return config


def reset_sess(config=None):
    """
    Convenience function to create the TF graph & session or reset them.
    """
    if config is None:
        config = session_options()
    global sess
    tf.reset_default_graph()
    try:
        sess.close()
    except NameError:  # no session has been created yet
        pass
    sess = tf.InteractiveSession(config=config)

    
def evaluate(tensors):
    """
    A "universal" evaluate function for running in either Graph mode (default)
    or Eager mode (https://www.tensorflow.org/guide/eager) in TensorFlow.
    """
    if tf.executing_eagerly():
        return [t.numpy() for t in tensors]
    # Do not use the session as a context manager here: that would close it.
    return tf.get_default_session().run(tensors)

reset_sess()


def strip_consts(graph_def, max_const_size=32):
  """
  Strip large constant values from graph_def.
  """
  strip_def = tf.GraphDef()
  for n0 in graph_def.node:
    n = strip_def.node.add()
    n.MergeFrom(n0)
    if n.op == 'Const':
      tensor = n.attr['value'].tensor
      size = len(tensor.tensor_content)
      if size > max_const_size:
        tensor.tensor_content = bytes("<stripped %d bytes>"%size, 'utf-8')
  return strip_def


def draw_graph(model, *args, **kwargs):
  """
  Visualize TensorFlow graph.
  """
  graph = tf.Graph()
  with graph.as_default():
    model(*args, **kwargs)
  graph_def = graph.as_graph_def()
  strip_def = strip_consts(graph_def, max_const_size=32)
  code = """
      <script>
        function load() {{
          document.getElementById("{id}").pbtxt = {data};
        }}
      </script>
      <link rel="import" href="https://tensorboard.appspot.com/tf-graph-basic.build.html" onload=load()>
      <div style="height:600px">
        <tf-graph-basic id="{id}"></tf-graph-basic>
      </div>
  """.format(data=repr(str(strip_def)), id='graph'+str(np.random.rand()))

  iframe = """
      <iframe seamless style="width:1200px;height:620px;border:0" srcdoc="{}"></iframe>
  """.format(code.replace('"', '&quot;'))
  IPython.display.display(IPython.display.HTML(iframe))

## Introduction

The Rasch model, named after Georg Rasch, is a family of psychometric models for creating measurements from categorical data, such as answers to questions on a reading assessment or questionnaire responses, as a function of the trade-off between (a) the respondent's abilities, attitudes, or personality traits and (b) the item difficulty.[1] For example, they may be used to estimate a student's reading ability or the extremity of a person's attitude to capital punishment from responses on a questionnaire. In addition to psychometrics and educational research, the Rasch model and its extensions are used in other areas, including the health profession[2] and market research[3] because of their general applicability.[4]

The mathematical theory underlying Rasch models is a special case of item response theory and, more generally, a special case of a generalized linear model. However, there are important differences in the interpretation of the model parameters and its philosophical implications[5] that separate proponents of the Rasch model from the item response modeling tradition. A central aspect of this divide relates to the role of specific objectivity,[6] a defining property of the Rasch model according to Georg Rasch, as a requirement for successful measurement.
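For the dichotomous (right/wrong) responses simulated in this notebook, the model can be written compactly. The notation is an assumption of this note: $\theta_i$ is the trait of subject $i$ (`trait` in the code), $b_j$ is the difficulty of item $j$ (`thresh` in the code), and $\sigma$ is the logistic function.

$$
P(X_{ij} = 1 \mid \theta_i, b_j) = \frac{\exp(\theta_i - b_j)}{1 + \exp(\theta_i - b_j)} = \sigma(\theta_i - b_j)
$$

A subject whose trait equals the item difficulty therefore answers correctly with probability $1/2$.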

In [0]:
# tf.flags.DEFINE_integer("nsubj", default=200, help="")
# tf.flags.DEFINE_integer("nitem", default=25, help="")
# tf.flags.DEFINE_integer("T", default=5000, help="Number of posterior samples.")
# FLAGS = tf.flags.FLAGS

nsubj = 200
nitem = 25
T = 5000 # Number of posterior samples

### Data

In [0]:
trait_true = np.random.normal(size=[nsubj, 1])
thresh_true = np.random.normal(size=[1, nitem])
X_data = np.random.binomial(1, expit(trait_true - thresh_true))
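The simulation above relies on NumPy broadcasting: subtracting a `[1, nitem]` array from an `[nsubj, 1]` array gives one logit per (subject, item) pair. The sketch below shows the same steps at a tiny scale. The helper name `simulate_rasch`, its `seed` argument, and the `*_demo` variables are illustrative and not part of the original notebook.

```python
import numpy as np

def simulate_rasch(nsubj, nitem, seed=0):
    """Simulate binary responses from a Rasch model (illustrative helper)."""
    rng = np.random.RandomState(seed)
    trait = rng.normal(size=[nsubj, 1])    # one ability per subject
    thresh = rng.normal(size=[1, nitem])   # one difficulty per item
    # Broadcasting [nsubj, 1] - [1, nitem] yields an [nsubj, nitem] logit matrix.
    logits = trait - thresh
    probs = 1.0 / (1.0 + np.exp(-logits))  # logistic function, as in scipy's expit
    responses = rng.binomial(1, probs)     # one Bernoulli draw per cell
    return trait, thresh, probs, responses

trait_demo, thresh_demo, probs_demo, X_demo = simulate_rasch(nsubj=4, nitem=3)
print(X_demo.shape)  # (4, 3)
```

Fixing the seed makes the toy data reproducible, which the notebook's own data cell does not do.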

### Model

In [0]:
trait = tfd.Normal(loc=0., scale=1.).sample(sample_shape=[nsubj, 1])
thresh = tfd.Normal(loc=0., scale=1.).sample(sample_shape=[1, nitem])
X = tfd.Bernoulli(logits=(trait - thresh))
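To make the likelihood behind `tfd.Bernoulli(logits=trait - thresh)` concrete, here is a minimal NumPy sketch of the same Bernoulli log-likelihood. The function name `rasch_log_likelihood` is hypothetical and only meant for illustration; it is not a TFP API.

```python
import numpy as np

def rasch_log_likelihood(responses, trait, thresh):
    """Bernoulli log-likelihood of binary responses under the Rasch model."""
    logits = trait - thresh  # broadcasts to [nsubj, nitem]
    # log p(x) = x * logit - log(1 + exp(logit)); logaddexp keeps this stable.
    return np.sum(responses * logits - np.logaddexp(0.0, logits))

# One subject with ability 0 answering one item of difficulty 0 correctly:
ll = rasch_log_likelihood(np.array([[1]]), np.array([[0.0]]), np.array([[0.0]]))
print(np.isclose(ll, np.log(0.5)))  # True
```

Writing the log-probability in terms of logits avoids computing `log(sigmoid(...))` directly, which can underflow for large negative logits.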

### Inference

In [0]:
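def target_log_prob_fn(trait, thresh):
    # Unnormalized log posterior: N(0, 1) priors plus the Bernoulli likelihood.
    prior = tfd.Normal(loc=0., scale=1.)
    likelihood = tfd.Bernoulli(logits=trait - thresh)
    return (tf.reduce_sum(prior.log_prob(trait)) +
            tf.reduce_sum(prior.log_prob(thresh)) +
            tf.reduce_sum(likelihood.log_prob(X_data)))
# TFP stand-in for Edward's `ed.HMC`; `num_leapfrog_steps=2` is an assumed setting.
(trait_samples, thresh_samples), _ = tfp.mcmc.sample_chain(
    num_results=T, current_state=[tf.zeros([nsubj, 1]), tf.zeros([1, nitem])],
    kernel=tfp.mcmc.HamiltonianMonteCarlo(
        target_log_prob_fn, step_size=0.1, num_leapfrog_steps=2))
q_trait = tfp.distributions.Empirical(samples=trait_samples, event_ndims=2)
q_thresh = tfp.distributions.Empirical(samples=thresh_samples, event_ndims=2)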


In [0]:
# Alternatively, use variational inference.
q_trait = tfd.Normal(
    loc=tf.get_variable("q_trait/loc", [nsubj, 1]),
    scale=tf.nn.softplus(
        tf.get_variable("q_trait/scale", [nsubj, 1])))
q_thresh = tfd.Normal(
    loc=tf.get_variable("q_thresh/loc", [1, nitem]),
    scale=tf.nn.softplus(
        tf.get_variable("q_thresh/scale", [1, nitem])))

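# TFP stand-in for Edward's `ed.KLqp`: maximize a Monte Carlo ELBO estimate from
# 10 reparameterized samples. Adam and its learning rate are assumed settings.
trait_s, thresh_s = q_trait.sample(10), q_thresh.sample(10)
prior = tfd.Normal(loc=0., scale=1.)
log_joint = (tf.reduce_sum(prior.log_prob(trait_s), axis=[1, 2]) +
             tf.reduce_sum(prior.log_prob(thresh_s), axis=[1, 2]) +
             tf.reduce_sum(tfd.Bernoulli(logits=trait_s - thresh_s)
                           .log_prob(X_data), axis=[1, 2]))
log_q = (tf.reduce_sum(q_trait.log_prob(trait_s), axis=[1, 2]) +
         tf.reduce_sum(q_thresh.log_prob(thresh_s), axis=[1, 2]))
train_op = tf.train.AdamOptimizer(0.1).minimize(-tf.reduce_mean(log_joint - log_q))
sess.run(tf.global_variables_initializer())
for _ in range(2500):
    sess.run(train_op)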

### Criticism

In [0]:
# Check that the inferred posterior mean captures the true traits.
plt.scatter(trait_true, q_trait.mean().eval())
plt.show()

print("MSE between true traits and inferred posterior mean:")
print(np.mean(np.square(trait_true - q_trait.mean().eval())))
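The MSE check above can be complemented with a correlation, which shows whether subjects are ranked correctly even when the estimates are offset from the true values. The following is a small sketch on made-up numbers; `summarize_recovery` is a hypothetical helper name, and the inputs are toy values rather than output from the inference cells.

```python
import numpy as np

def summarize_recovery(true_values, estimates):
    """Return (MSE, Pearson correlation) between true and estimated values."""
    true_values = np.ravel(true_values)
    estimates = np.ravel(estimates)
    mse = np.mean(np.square(true_values - estimates))
    corr = np.corrcoef(true_values, estimates)[0, 1]
    return mse, corr

# Toy example: every estimate is shifted upward by 0.5.
mse, corr = summarize_recovery([[0.0], [1.0], [2.0]], [[0.5], [1.5], [2.5]])
print(np.isclose(mse, 0.25), np.isclose(corr, 1.0))  # True True
```

In the notebook, the same helper could be called as `summarize_recovery(trait_true, q_trait.mean().eval())`, assuming `q_trait` has been fitted by one of the inference cells.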

In [0]:
# Visualizing the graph we've constructed
# draw_graph(linear_mixed_effects_model, features_train)

## References

1. Rasch, G. (1960/1980). Probabilistic models for some intelligence and attainment tests. Copenhagen: Danish Institute for Educational Research. Expanded edition (1980) with foreword and afterword by B.D. Wright. Chicago: The University of Chicago Press.
2. Bezruczko, N. (2005). Rasch measurement in health sciences. Maple Grove, MN: Jam Press.
3. Bechtel, G. G. (1985). Generalizing the Rasch model for consumer rating scales. Marketing Science, 4(1), 62-73.
4. Wright, B. D. (1977). Solving measurement problems with the Rasch model. Journal of Educational Measurement, 14(2), 97-116.
5. Linacre, J.M. (2005). Rasch dichotomous model vs. One-parameter Logistic Model. Rasch Measurement Transactions, 19(3), 1032.
6. Rasch, G. (1977). On Specific Objectivity: An attempt at formalizing the request for generality and validity of scientific statements. The Danish Yearbook of Philosophy, 14, 58-93.
7. Thurstone, L. L. (1927). A law of comparative judgment. Psychological Review, 34(4), 273.
8. Thurstone and sensory scaling: Then and now. (1994). Psychological Review, 101(2), 271–277. doi:10.1037/0033-295X.101.2.271
9. Andrich, D. (1978b). Relationships between the Thurstone and Rasch approaches to item scaling. Applied Psychological Measurement, 2, 449–460.
10. Kuhn, T. S. (1961). The function of measurement in modern physical science. Isis, 161-193.
11. Bond, T.G. & Fox, C.M. (2007). Applying the Rasch Model: Fundamental measurement in the human sciences. 2nd Edn (includes Rasch software on CD-ROM). Lawrence Erlbaum. Page 265.
12. Rasch, G. (1961). On general laws and the meaning of measurement in psychology, pp. 321–334 in Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, IV. Berkeley, California: University of California Press.
13. Andersen, E.B. (1977). Sufficient statistics and latent trait models, Psychometrika, 42, 69–81.
14. Andrich, D. (2010). Sufficiency and conditional estimation of person parameters in the polytomous Rasch model. Psychometrika, 75(2), 292-308.
15. Birnbaum, A. (1968). Some latent trait models and their use in inferring an examinee’s ability. In Lord, F.M. & Novick, M.R. (Eds.), Statistical theories of mental test scores. Reading, MA: Addison–Wesley.
16. Holster, Trevor A.; Lake, J. W. (2016). "Guessing and the Rasch model". Language Assessment Quarterly. 13 (2): 124-141. doi:10.1080/15434303.2016.1160096.
17. Byrka, Katarzyna; Jȩdrzejewski, Arkadiusz; Sznajd-Weron, Katarzyna; Weron, Rafał (2016-09-01). "Difficulty is critical: The importance of social factors in modeling diffusion of green products and practices". Renewable and Sustainable Energy Reviews. 62: 723–735. doi:10.1016/j.rser.2016.04.063.

In [0]:
from IPython.core.display import HTML
def css_styling():
    styles = open("../styles/custom.css", "r").read()
    return HTML(styles)
css_styling()

#  "#F15854",  // red
#  "#5DA5DA",  // blue
#  "#FAA43A",  // orange
#  "#60BD68",  // green
#  "#F17CB0",  // pink
#  "#B2912F",  // brown
#  "#B276B2",  // purple
#  "#DECF3F",  // yellow
#  "#4D4D4D",  // gray