Let's have a look at the data from Tutorial 3 (Density Matrix), where the aim is to reconstruct the 2-qubit W state:
$$\vert\psi \rangle = \frac{1}{\sqrt{2}}(\vert 01\rangle + \vert10\rangle)$$
The data can be found in the files "N2_W_state_100_samples_data.txt" and "N2_W_state_1000_samples_data.txt".

Following Tutorial 3, we import the modules:

In [10]:
import numpy as np
import matplotlib.pyplot as plt

import torch

from qucumber.nn_states import DensityMatrix

from qucumber.callbacks import MetricEvaluator
import qucumber.utils.unitaries as unitaries

import qucumber.utils.training_statistics as ts
import qucumber.utils.data as data
import qucumber

# set random seed on cpu but not gpu, since we won't use gpu for this tutorial
qucumber.set_random_seed(1234, cpu=True, gpu=False)

Now we can import the data:

In [22]:
train_path = "N2_W_state_1000_samples_data.txt"
train_bases_path = "N2_W_state_1000_samples_bases.txt"
matrix_path_real = "N2_W_state_target_real.txt"
matrix_path_imag = "N2_W_state_target_imag.txt"
bases_path = "N2_IC_bases.txt"


train_samples, true_matrix, train_bases, bases = data.load_data_DM(train_path, matrix_path_real, matrix_path_imag, train_bases_path, bases_path)

Next we generate the DensityMatrix together with its Hilbert space, again following Tutorial 3:

In [23]:
unitary_dict = unitaries.create_dict()
nv = train_samples.shape[-1]
nh = na = nv

nn_state = DensityMatrix(num_visible=nv, num_hidden=nh, num_aux=na, unitary_dict=unitary_dict, gpu=False)
space = nn_state.generate_hilbert_space()
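The counting below relies on `space[0]` to `space[3]` being the states 00, 01, 10, 11 in that order. As a rough sketch of what such an enumeration looks like, here is a plain-Python version; `enumerate_basis_states` is a hypothetical helper, and the assumption that `generate_hilbert_space` uses the same binary counting order is mine, so it is worth printing `space` once to confirm.

```python
from itertools import product


def enumerate_basis_states(num_qubits):
    """Hypothetical helper: list all bit strings in binary counting order."""
    return [list(bits) for bits in product([0, 1], repeat=num_qubits)]


basis_states = enumerate_basis_states(2)
print(basis_states)
```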

With the Hilbert space in hand, we can now have a quick look at the frequency of each basis element in the data:

In [24]:
c1 = 0  # counts 00
c2 = 0  # counts 01
c3 = 0  # counts 10
c4 = 0  # counts 11

# If there is a way of doing list comprehensions with torch.Tensors, please enlighten me :p

# Compare each sample against the four basis states and tally the matches.
for sample in train_samples:
    if torch.equal(sample, space[0]):
        c1 += 1
    elif torch.equal(sample, space[1]):
        c2 += 1
    elif torch.equal(sample, space[2]):
        c3 += 1
    elif torch.equal(sample, space[3]):
        c4 += 1
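On the list-comprehension question in the comment above: one possible way is to compare whole tensors at once instead of looping over samples. The sketch below uses small toy tensors as stand-ins for `train_samples` and `space` (the names `samples` and `basis_states` and their shapes are assumptions for illustration), and shows both a broadcasting version and a list comprehension.

```python
import torch

# Toy stand-ins for `train_samples` (N x 2) and `space` (4 x 2); assumed shapes.
samples = torch.tensor(
    [[0.0, 1.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0], [0.0, 1.0]]
)
basis_states = torch.tensor([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])

# Option 1: broadcasting. matches[i, j] is True when sample i equals basis state j.
matches = (samples[:, None, :] == basis_states[None, :, :]).all(dim=-1)
counts = matches.sum(dim=0)

# Option 2: a list comprehension over the basis states (iterating a 2D tensor yields its rows).
counts_lc = [int((samples == b).all(dim=1).sum()) for b in basis_states]

print(counts.tolist())
print(counts_lc)
```

Both variants give one count per basis state, in the same order as the rows of `basis_states`.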

Tabulating the counts, we get:

In [25]:
import pandas as pd

count = {
    "basis vector": ["0 0", "0 1", "1 0", "1 1"],
    "occurrences in data": [c1, c2, c3, c4],
}
df = pd.DataFrame(count)
df

Unnamed: 0,basis vector,occurrences in data
0,0 0,2336
1,0 1,2178
2,1 0,2115
3,1 1,2371


So, we see that each basis element occurs in roughly equal numbers.

But now recall that the state we are interested in reconstructing is
$$\vert\psi \rangle = \frac{1}{\sqrt{2}}(\vert 01\rangle + \vert10\rangle)$$
So, shouldn't the data be evenly distributed between the $\vert01\rangle$ and $\vert10\rangle$ basis elements?
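For reference, the expectation behind that question is just the Born rule, assuming every sample was measured in the computational ($Z$) basis:
$$p(01) = \vert\langle 01 \vert \psi\rangle\vert^2 = \frac{1}{2}, \qquad p(10) = \vert\langle 10 \vert \psi\rangle\vert^2 = \frac{1}{2}, \qquad p(00) = p(11) = 0$$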

New day.
With a fresh look, I realized I forgot to extract the samples measured in the reference (Z) basis, which are the ones relevant for training the amplitudes.

In [29]:
z_samples = data.extract_refbasis_samples(train_samples, train_bases)

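To make the filtering step concrete, here is a minimal NumPy sketch of the idea: keep only the samples whose measurement basis is Z on every qubit. The helper `extract_z_samples` is hypothetical and is not QuCumber's implementation; it also assumes the bases are stored as one single-character label per qubit, which may differ from the actual format of `train_bases`.

```python
import numpy as np

# Toy data: four 2-qubit samples and the basis each one was measured in (assumed layout).
samples = np.array([[0, 1], [1, 0], [1, 1], [0, 1]])
bases = np.array([["Z", "Z"], ["X", "Z"], ["Z", "Z"], ["Y", "X"]])


def extract_z_samples(samples, bases):
    """Hypothetical helper: keep rows measured in the Z basis on all qubits."""
    keep = (bases == "Z").all(axis=1)
    return samples[keep]


z_only = extract_z_samples(samples, bases)
print(z_only)
```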


In [27]:
d1 = 0  # counts 00
d2 = 0  # counts 01
d3 = 0  # counts 10
d4 = 0  # counts 11

# Same tally as before, now restricted to the Z-basis samples.
for sample in z_samples:
    if torch.equal(sample, space[0]):
        d1 += 1
    elif torch.equal(sample, space[1]):
        d2 += 1
    elif torch.equal(sample, space[2]):
        d3 += 1
    elif torch.equal(sample, space[3]):
        d4 += 1

z_count = {
    "basis vector": ["0 0", "0 1", "1 0", "1 1"],
    "occurrences in data": [d1, d2, d3, d4],
}
z_df = pd.DataFrame(z_count)
z_df

Unnamed: 0,basis vector,occurrences in data
0,0 0,120
1,0 1,373
2,1 0,369
3,1 1,138


Hmm. Still not quite what I was expecting.
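To make the mismatch concrete, the sketch below turns the Z-basis counts from the table above into frequencies and sets them next to the pure-state prediction $\vert\langle x\vert\psi\rangle\vert^2$. If the target in Tutorial 3 is a mixed (for example noisy) version of the W state rather than the pure state, some weight on 00 and 11 would be expected; that is an assumption here, and the diagonal of the loaded `true_matrix` would be the thing to check.

```python
import numpy as np

labels = ["00", "01", "10", "11"]
counts = np.array([120, 373, 369, 138])  # Z-basis counts from the table above
observed = counts / counts.sum()

# Pure W state (|01> + |10>) / sqrt(2) in the computational basis.
psi = np.array([0.0, 1.0, 1.0, 0.0]) / np.sqrt(2)
expected = np.abs(psi) ** 2

for label, obs, exp in zip(labels, observed, expected):
    print(f"{label}: observed {obs:.3f}, pure-state prediction {exp:.3f}")
```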
I also wonder why the data here contained X and Y basis measurements anyway. After all, there was no phase to learn, i.e. the target wavefunction was not
$$\vert\psi \rangle = \frac{1}{\sqrt{2}}(e^{i\theta_1} \vert 01\rangle + e^{i\theta_2} \vert10\rangle)$$
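One possible reason for the X and Y measurements, offered as a sketch rather than a statement about how the tutorial data was designed: when reconstructing a density matrix, Z-basis data alone cannot tell the pure W state apart from a classical 50/50 mixture of $\vert01\rangle$ and $\vert10\rangle$. The NumPy example below compares the two in the Z basis and in the XX basis; `outcome_probs` is a hypothetical helper.

```python
import numpy as np

ket01 = np.array([0.0, 1.0, 0.0, 0.0])
ket10 = np.array([0.0, 0.0, 1.0, 0.0])
psi = (ket01 + ket10) / np.sqrt(2)

rho_pure = np.outer(psi, psi)  # |psi><psi|
rho_mixed = 0.5 * (np.outer(ket01, ket01) + np.outer(ket10, ket10))

# Hadamard on each qubit rotates the X basis onto the computational basis.
H = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)
HH = np.kron(H, H)


def outcome_probs(rho, rotation=None):
    """Hypothetical helper: outcome probabilities, optionally after a basis rotation."""
    if rotation is not None:
        rho = rotation @ rho @ rotation.conj().T
    return np.real(np.diag(rho))


z_pure, z_mixed = outcome_probs(rho_pure), outcome_probs(rho_mixed)
x_pure, x_mixed = outcome_probs(rho_pure, HH), outcome_probs(rho_mixed, HH)

print("Z basis:", z_pure, z_mixed)
print("XX basis:", x_pure, x_mixed)
```

The Z-basis distributions coincide, while the XX-basis distributions differ, so the extra bases carry information about the coherences even when there is no relative phase to learn.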