Updating the similarity module functions in the Apps layer (#390)
* update similarity.py

* updating test file for new functions

* update similarity tutorial

* First round of comments

* small changes

* gaussian backend error

* small change to n_mean

* change event numbers in tutorial

* adding non-trivial tests

* Tom review and comments

* updating tests from JM comments

* fixing some tests for travis CI

* Fix merge

* Update feature_vector_orbits example

* Update feature_vector_events example

* Fix line widths

* Update changelog

* Revert "Fix line widths"

This reverts commit 56b07d5.

* remove hafnian expression

Co-authored-by: trbromley <brotho02@gmail.com>
Co-authored-by: Tom Bromley <49409390+trbromley@users.noreply.github.com>
3 people committed Jun 18, 2020
1 parent 6e2c338 commit 9713d05
Showing 4 changed files with 855 additions and 315 deletions.
4 changes: 4 additions & 0 deletions .github/CHANGELOG.md
@@ -2,6 +2,10 @@

<h3>New features since last release</h3>

* Feature vectors of graphs can now be calculated exactly in the `similarity` module of the
applications layer.
[(#390)](https://github.com/XanaduAI/strawberryfields/pull/390)

* Adds the `apps.qchem.dynamics` module for simulating vibrational quantum dynamics in molecules.
The `dynamics.evolution()` function provides a custom operation that encodes the input chemical
information for use in a Strawberry Fields `Program`. The `sample_fock()` function allows for
102 changes: 63 additions & 39 deletions examples_apps/run_tutorial_similarity.py
@@ -144,7 +144,7 @@
##############################################################################
# Now that we have mastered orbits and events, how can we make a feature vector? It was shown in
# :cite:`schuld2019quantum` that one way of making a feature vector of a graph is through the
# frequencies of orbits or events. For example, for a :math:`k` photon event :math:`E_{k, n_{\max}}`
# with maximum count per mode :math:`n_{\max}` and corresponding probability :math:`p_{k,
# n_{\max}}:=p_{E_{k, n_{\max}}}(G)` with respect to a graph :math:`G`, a feature vector can be
# written as
#
# .. math::
#     f_{\mathbf{k}, n_{\max}} = \left(p_{k_1, n_{\max}}, p_{k_2, n_{\max}}, \ldots, p_{k_K, n_{\max}}\right),
#
# where :math:`\mathbf{k} := (k_{1}, k_{2}, \ldots , k_{K})` is a list of different total photon
# numbers.
#
# For example, if :math:`\mathbf{k} := (2, 4, 6, 8)` and :math:`n_{\max} = 2`, we have
#
# .. math::
#     f_{(2, 4, 6, 8), 2} = (p_{2, 2}, p_{4, 2}, p_{6, 2}, p_{8, 2}).
#
# In this case, we are interested in the probabilities of events :math:`E_{2, 2}`, :math:`E_{4,
# 2}`, :math:`E_{6, 2}`, and :math:`E_{8, 2}`. Suppose we are sampling from a four-mode device
# and have the samples
# ``[0, 3, 0, 1]`` and ``[1, 2, 0, 1]``. These samples are part of the orbits ``[3, 1]`` and
# ``[2, 1, 1]``, respectively. However, ``[3, 1]`` is not part of the :math:`E_{4, 2}` event while
# ``[2, 1, 1]`` is.
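#
# As a quick check, these classifications can be reproduced directly. This is a minimal
# sketch assuming the :func:`~.sample_to_orbit` and :func:`~.sample_to_event` helpers of
# the :mod:`~.apps.similarity` module:

print(similarity.sample_to_orbit([0, 3, 0, 1]))  # expected orbit: [3, 1]
print(similarity.sample_to_event([0, 3, 0, 1], max_count_per_mode=2))  # None: a mode has 3 photons
print(similarity.sample_to_event([1, 2, 0, 1], max_count_per_mode=2))  # 4: the sample is in E_{4, 2}

##############################################################################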
#
# Calculating a feature vector
# ----------------------------
#
# We provide three methods for calculating a feature vector in the :mod:`~.apps.similarity` module
# of Strawberry Fields:
#
# 1. Through sampling.
# 2. Using exact probability calculations.
# 3. Using a Monte Carlo estimate of the probability.
#
# In the first method, all one needs to do is generate some GBS samples from the graph of
# interest and fix the composition of the feature vector. For example, to obtain feature vector
# :math:`f_{\mathbf{k} = (2, 4), n_{\max}=2}` for the first MUTAG graph, we use:

print(similarity.feature_vector_events_sampling(m0, [2, 4], 2))

##############################################################################
# We can also use any orbits of our choice instead of events:
print(similarity.feature_vector_orbits_sampling(m0, [[1, 1], [2], [1, 1, 1, 1], [2, 1, 1]]))

##############################################################################
# For the second method, we calculate the orbit probabilities exactly rather than through
# sampling. For a feature vector of orbit probabilities, the probability of a single orbit
# :math:`p(O)` is given by:
#
# .. math::
#     p(O) = \sum_{S \in O} p(S)
#
# where :math:`S` represents a GBS output click pattern. Calculating each :math:`p(S)` requires
# computing a `hafnian <https://the-walrus.readthedocs.io/en/latest/hafnian.html>`__, which
# becomes exponentially more difficult as the photon number increases. The exact probability of
# an event :math:`p_{k,n_{\max}}` can be calculated in a similar manner.
#
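# As a concrete sketch, the probability of a single small orbit can be computed directly,
# here assuming the :func:`~.prob_orbit_exact` function and an illustrative ``n_mean``:

print(similarity.prob_orbit_exact(nx.Graph(m0_a), [1, 1], n_mean=6))

##############################################################################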
# The built-in functions :func:`~.feature_vector_orbits` and :func:`~.feature_vector_events`
# can be used to obtain exact feature vectors. Both functions accept a keyword argument
# ``samples`` that switches between exact and Monte Carlo estimated probabilities, as shown
# later. By default, ``samples`` is set to ``None``, which gives an exact feature vector. To
# use Monte Carlo estimation, set ``samples`` to the desired number of samples. For example,
# to get the exact event probabilities in the feature vector
# :math:`f_{\mathbf{k} = (2, 4), n_{\max}=2}` seen previously, we use:

print(similarity.feature_vector_events(nx.Graph(m0_a), [2, 4], 2))

##############################################################################
# Although precise, exact calculations for large matrices can be difficult to evaluate.
# Moreover, what makes calculating :math:`p_{k, n_{\max}}` particularly challenging is the
# number of samples the corresponding event contains. For example, the 6-photon event over 17
# modes, :math:`E_{k=6, n_{\max}=2}`, contains the following number of samples:

print(similarity.event_cardinality(6, 2, 17))

##############################################################################
# To avoid calculating a large number of sample probabilities, an alternative is to perform
# Monte Carlo estimation. Here, samples within an orbit or event are selected uniformly
# at random and their resultant probabilities are calculated. For example, for an event
# :math:`E_{k, n_{\max}}`, if :math:`N` samples :math:`\{S_{1}, S_{2}, \ldots , S_{N}\}`
# are generated, then the event probability can be approximated as
#
# .. math::
#     p(E_{k, n_{\max}}) \approx \frac{1}{N}\sum_{i=1}^N p(S_i) |E_{k, n_{\max}}|,
#
# with :math:`|E_{k, n_{\max}}|` denoting the cardinality of the event.
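#
# For a single event, this estimate is available through the :func:`~.prob_event_mc`
# function; a minimal sketch with illustrative ``n_mean`` and ``samples`` values:

print(similarity.prob_event_mc(nx.Graph(m0_a), 4, max_count_per_mode=2, n_mean=6, samples=1000))

##############################################################################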
#
# This method can be accessed using the :func:`~.feature_vector_events` function with
# ``samples`` set to the desired number of samples. For example, to get Monte Carlo estimated
# probabilities for our example feature vector :math:`f_{\mathbf{k} = (2, 4), n_{\max}=2}`,
# we use:

print(similarity.feature_vector_events(nx.Graph(m0_a), [2, 4], 2, samples=1000))

##############################################################################
# .. note::
#     The results of using Monte Carlo estimation with :func:`~.feature_vector_orbits` and
#     :func:`~.feature_vector_events` are probabilistic and may vary between runs. Increasing
#     the ``samples`` parameter will increase the precision but slow down the calculation.
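#
# Running the same estimate twice makes this concrete; the two printed vectors will
# generally differ slightly from run to run:

print(similarity.feature_vector_events(nx.Graph(m0_a), [2, 4], 2, samples=1000))
print(similarity.feature_vector_events(nx.Graph(m0_a), [2, 4], 2, samples=1000))

##############################################################################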
#
# Machine learning with GBS graph kernels
# ---------------------------------------
@@ -250,10 +273,10 @@
events = [8, 10]
max_count = 2

f1 = similarity.feature_vector_events_sampling(m0, events, max_count)
f2 = similarity.feature_vector_events_sampling(m1, events, max_count)
f3 = similarity.feature_vector_events_sampling(m2, events, max_count)
f4 = similarity.feature_vector_events_sampling(m3, events, max_count)

import numpy as np

# Stack the four feature vectors as rows of a matrix
R = np.array([f1, f2, f3, f4])

print(R)

##############################################################################
# The choice of ``events`` composing the feature vectors can be significant, and we encourage
# the reader to explore different combinations. Orbits of our choice can also be used instead
# of events. Note, however, that GBS samples with an odd total number of photons have zero
# probability in ideal GBS, which generates and outputs photons only in pairs.
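#
# As a quick sketch, the exact probability of the odd event :math:`E_{3, 2}` should
# therefore vanish for ideal GBS:

print(similarity.feature_vector_events(nx.Graph(m0_a), [3], 2))  # expected: [0.0]

##############################################################################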
#
# Given our points in the feature space and their target labels, we can use
# scikit-learn's Support Vector Machine `LinearSVC <https://scikit-learn.org/stable/modules/generated/sklearn.svm
