# Data Sets Tutorial
This tutorial demonstrates how to create and use `DataSet` objects.  At its core, Gate Set Tomography finds a gate set which best fits some experimental data, and in pyGSTi a `DataSet` is used to hold that data.  `DataSet`s are essentially nested dictionaries which associate a count (a number, typically an integer) with (gate string, SPAM label) pairs, so that `dataset[gateString][spamLabel]` can be used to read and write the number of `spamLabel` outcomes of the experiment given by the sequence `gateString`.

There are a few important differences between a `DataSet` and a dictionary-of-dictionaries:
- `DataSet` objects can be in one of two modes: *static* or *non-static*.  When in *non-static* mode, data can be freely modified within the set, making this the mode to use during data entry.  In *static* mode, data cannot be modified and the `DataSet` is essentially read-only.  The `done_adding_data` method of a `DataSet` switches from non-static to static mode, and should be called, as the name implies, once all desired data has been added (or modified).  Once a `DataSet` is static, it is read-only for the rest of its life; to modify its data the best one can do is make a non-static *copy* via the `copy_nonstatic` member and modify the copy.

- When data for a gate string is present in a `DataSet`, counts must exist for *all* SPAM labels.  That is, for a given gate string, you cannot store counts for only a subset of the SPAM labels.  Because of this condition, dictionary-access syntax for the SPAM label (i.e. `dataset[gateString][spamLabel]`) *cannot* be used to write counts for new `gateString` keys; one must either assign an entire dictionary of SPAM label-count pairs to `dataset[gateString]` or use the `add_`*xxx* methods (these methods add data for *all* SPAM labels at once).
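
The two rules above can be illustrated with a toy stand-in written in plain Python.  `ToyDataSet` is a hypothetical class made up for this illustration and is not pyGSTi's `DataSet`; only the method name `done_adding_data` mirrors the real API, and the error types are assumptions.

```python
class ToyDataSet:
    """Toy model of the two DataSet rules described above (not pyGSTi code)."""

    def __init__(self, spam_labels):
        self.spam_labels = list(spam_labels)
        self._rows = {}        # gate string (tuple) -> {spam label: count}
        self._static = False

    def __setitem__(self, gate_string, count_dict):
        if self._static:
            raise ValueError("static data set is read-only")
        if set(count_dict) != set(self.spam_labels):
            raise ValueError("counts are required for every SPAM label")
        self._rows[tuple(gate_string)] = dict(count_dict)

    def __getitem__(self, gate_string):
        row = self._rows[tuple(gate_string)]
        # Hand out a copy once static, so callers cannot modify stored counts
        return dict(row) if self._static else row

    def done_adding_data(self):
        self._static = True


toy = ToyDataSet(['plus', 'minus'])
toy[('Gx',)] = {'plus': 10, 'minus': 90}   # whole-row assignment: allowed
toy[('Gx',)]['plus'] = 15                  # editing an existing row: allowed

try:
    toy[('Gy',)] = {'plus': 10}            # missing 'minus': rejected
except ValueError as err:
    print("rejected:", err)

toy.done_adding_data()
try:
    toy[('Gx',)] = {'plus': 0, 'minus': 100}   # static: rejected
except ValueError as err:
    print("rejected:", err)
```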

Once a `DataSet` is constructed, filled with data, and made *static*, it is typically passed as a parameter to one of pyGSTi's algorithm or driver routines to find a `GateSet` estimate based on the data.  This tutorial focuses on how to construct a `DataSet` and modify its data.  Later tutorials will demonstrate the different GST algorithms.

In [1]:
from __future__ import print_function
import pygsti

## Creating a `DataSet`
There are three basic ways to create `DataSet` objects in `pygsti`:
* By creating an empty `DataSet` object and manually adding counts corresponding to gate strings.  Remember that the `add_`*xxx* methods must be used to add data for gate strings not yet in the `DataSet`.  Once the data is added, be sure to call `done_adding_data`, as this restructures the internal storage of the `DataSet` to optimize the access operations used by algorithms.
* By loading from a text-format dataset file via `pygsti.io.load_dataset`.  The result is a ready-to-use-in-algorithms *static* `DataSet`, so there's no need to call `done_adding_data` this time.
* By using a `GateSet` to generate "fake" data via `generate_fake_data`. This can be useful for doing simulations of GST, and comparing to your experimental results.

We do each of these in turn in the cells below.

In [2]:
#1) Creating a data set from scratch
#    Note that tuples may be used in lieu of GateString objects
ds1 = pygsti.objects.DataSet(spamLabels=['plus','minus'])
ds1.add_count_dict( ('Gx',), {'plus': 10, 'minus': 90} )
ds1.add_count_dict( ('Gx','Gy'), {'plus': 40, 'minus': 60} )
ds1[('Gy',)] = {'plus': 10, 'minus': 90} # dictionary assignment

#Modify existing data using dictionary-like access
ds1[('Gx',)]['plus'] = 15
ds1[('Gx',)]['minus'] = 85

#GateString objects can be used.
gs = pygsti.objects.GateString( ('Gx','Gy'))
ds1[gs]['plus'] = 45
ds1[gs]['minus'] = 55

ds1.done_adding_data()

In [3]:
#2) By creating and loading a text-format dataset file.  The first
#    row is a directive which specifies what the columns (after the
#    first one) hold.  Other allowed values are "plus frequency", 
#    "minus count", etc.  Note that "plus" and "minus" are the 
#    SPAM labels and must match those of any GateSet used in 
#    conjunction with this DataSet.
dataset_txt = \
"""## Columns = plus count, count total
{} 0 100
Gx 10 90
GxGy 40 60
Gx^4 20 90
"""
with open("tutorial_files/Example_TinyDataset.txt","w") as tinydataset:
    tinydataset.write(dataset_txt)
ds2 = pygsti.io.load_dataset("tutorial_files/Example_TinyDataset.txt")

Loading tutorial_files/Example_TinyDataset.txt: 100%
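
The column directive determines how each row of numbers is interpreted.  The sketch below is a stand-alone toy parser, not `pygsti.io.load_dataset`; `parse_tiny_dataset` is a hypothetical helper that understands only the `plus count, count total` directive used above.  It shows how the `minus` count follows from the two columns given.

```python
def parse_tiny_dataset(text):
    """Parse rows of 'gatestring plus_count count_total' into count dicts.

    Hypothetical helper for illustration; it only handles the directive
    '## Columns = plus count, count total'.
    """
    rows = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('#'):
            continue  # skip blank lines and the '## Columns = ...' directive
        gate_string, plus, total = line.split()
        plus, total = float(plus), float(total)
        rows[gate_string] = {'plus': plus, 'minus': total - plus}
    return rows


example_txt = """## Columns = plus count, count total
{} 0 100
Gx 10 90
GxGy 40 60
Gx^4 20 90
"""
parsed = parse_tiny_dataset(example_txt)
for gate_string, counts in parsed.items():
    print(gate_string, counts)
```

Under this reading the `Gx` row holds 10 `plus` and 80 `minus` counts, because the second number is a total rather than a `minus` count.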


In [4]:
#3) By generating fake data (using the std1Q_XYI standard gate set module)
from pygsti.construction import std1Q_XYI

#Depolarize the perfect X,Y,I gate set
depol_gateset = std1Q_XYI.gs_target.depolarize(gate_noise=0.1)

#Compute the sequences needed to perform Long Sequence GST on 
# this GateSet with sequences up to length 512
gatestring_list = pygsti.construction.make_lsgst_experiment_list(
    std1Q_XYI.gs_target, std1Q_XYI.prepStrs, std1Q_XYI.effectStrs,
    std1Q_XYI.germs, [1,2,4,8,16,32,64,128,256,512])

#Generate fake data (Tutorial 00)
ds3 = pygsti.construction.generate_fake_data(depol_gateset, gatestring_list, nSamples=1000,
                                             sampleError='binomial', seed=100)
ds3b = pygsti.construction.generate_fake_data(depol_gateset, gatestring_list, nSamples=50,
                                              sampleError='binomial', seed=100)

#Write the ds3 and ds3b datasets to a file for later tutorials
pygsti.io.write_dataset("tutorial_files/Example_Dataset.txt", ds3, spamLabelOrder=['plus','minus']) 
pygsti.io.write_dataset("tutorial_files/Example_Dataset_LowCnts.txt", ds3b) 
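
To make the `sampleError='binomial'` option more concrete, here is a minimal sketch of binomial sampling for a single gate string with two outcomes.  It is not pyGSTi's implementation and uses Python's own random number generator, so it will not reproduce the counts in `ds3` or `ds3b`; the function name and the probability `p_plus` are assumptions chosen for illustration.

```python
import random

def sample_binomial_counts(p_plus, n_samples, rng):
    """Draw 'plus'/'minus' counts for one gate string from Binomial(n, p)."""
    # Each sample is an independent trial that yields 'plus' with probability p_plus
    n_plus = sum(1 for _ in range(n_samples) if rng.random() < p_plus)
    return {'plus': n_plus, 'minus': n_samples - n_plus}

rng = random.Random(100)   # fixed seed, so repeated runs give the same counts
p_plus = 0.5               # hypothetical outcome probability predicted by a gate set
many = sample_binomial_counts(p_plus, 1000, rng)
few = sample_binomial_counts(p_plus, 50, rng)
print("1000 samples:", many)
print("  50 samples:", few)
```

The observed frequency `plus / total` scatters about `p_plus` with a standard deviation of roughly `sqrt(p(1-p)/n)`, so the 50-sample data is noticeably noisier than the 1000-sample data.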

## Viewing `DataSets`

In [5]:
#It's easy to just print them:
print("Dataset1:\n", ds1)
print("Dataset2:\n", ds2)
print("Dataset3 is too big to print, so here it is truncated to Dataset2's strings\n", ds3.truncate(ds2.keys()))

Dataset1:
 Gx  :  {'minus': 85.0, 'plus': 15.0}
GxGy  :  {'minus': 55.0, 'plus': 45.0}
Gy  :  {'minus': 90.0, 'plus': 10.0}


Dataset2:
 {}  :  {'minus': 100.0, 'plus': 0.0}
Gx  :  {'minus': 80.0, 'plus': 10.0}
GxGy  :  {'minus': 20.0, 'plus': 40.0}
Gx^4  :  {'minus': 70.0, 'plus': 20.0}


Dataset3 is too big to print, so here it is truncated to Dataset2's strings
 {}  :  {'minus': 1000.0, 'plus': 0.0}
Gx  :  {'minus': 501.0, 'plus': 499.0}
GxGy  :  {'minus': 504.0, 'plus': 496.0}
Gx^4  :  {'minus': 829.0, 'plus': 171.0}




## Iteration over data sets

In [6]:
# A DataSet's keys() method returns a list of GateString objects
ds1.keys()

[GateString(Gx), GateString(GxGy), GateString(Gy)]

In [7]:
# There are many ways to iterate over a DataSet.  Here's one:
for gatestring in ds1.keys():
    dsRow = ds1[gatestring]
    for spamlabel in dsRow.keys():
        print("Gatestring = %s, SPAM label = %s, count = %d" % \
            (str(gatestring).ljust(5), str(spamlabel).ljust(6), dsRow[spamlabel]))

Gatestring = Gx   , SPAM label = plus  , count = 15
Gatestring = Gx   , SPAM label = minus , count = 85
Gatestring = GxGy , SPAM label = plus  , count = 45
Gatestring = GxGy , SPAM label = minus , count = 55
Gatestring = Gy   , SPAM label = plus  , count = 10
Gatestring = Gy   , SPAM label = minus , count = 90


## Advanced features of data sets

### `collisionAction` argument
When creating a `DataSet` one may specify the `collisionAction` argument as either `"aggregate"` (the default) or `"keepseparate"`.  The former instructs the `DataSet` to simply add the counts of like outcomes when counts are added for an already existing gate sequence.  `"keepseparate"`, on the other hand, causes the `DataSet` to tag added count data by appending a fictitious `"#<n>"` gate label to a gate sequence that already exists, where `<n>` is an integer.  When retrieving the keys of a `keepseparate` data set, the `stripOccuranceTags` argument to `keys()` determines whether the `"#<n>"` labels are included in the output (if they're not - the default - duplicate keys may be returned).  Access to different occurrences of the same data is provided via the `occurrance` argument of the `get_row` and `set_row` functions, which should be used instead of the usual bracket indexing.

In [8]:
ds_agg = pygsti.objects.DataSet(spamLabels=['plus','minus'], collisionAction="aggregate") #the default
ds_agg.add_count_dict( ('Gx','Gy'), {'plus': 10, 'minus': 90} )
ds_agg.add_count_dict( ('Gx','Gy'), {'plus': 40, 'minus': 60} )
print("Aggregate-mode Dataset:\n",ds_agg)

ds_sep = pygsti.objects.DataSet(spamLabels=['plus','minus'], collisionAction="keepseparate")
ds_sep.add_count_dict( ('Gx','Gy'), {'plus': 10, 'minus': 90} )
ds_sep.add_count_dict( ('Gx','Gy'), {'plus': 40, 'minus': 60} )
print("Keepseparate-mode Dataset:\n",ds_sep)


Aggregate-mode Dataset:
 GxGy  :  {'minus': 150.0, 'plus': 50.0}


Keepseparate-mode Dataset:
 GxGy  :  {'minus': 90.0, 'plus': 10.0}
GxGy#1  :  {'minus': 60.0, 'plus': 40.0}
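
The bookkeeping behind the two modes can be mimicked with an ordinary dictionary.  This is a sketch of the behaviour described above rather than pyGSTi's internal storage; `add_counts` and its argument names are hypothetical.

```python
def add_counts(store, gate_string, counts, collision_action="aggregate"):
    """Add counts to a plain dict, mimicking the two collision behaviours."""
    key = tuple(gate_string)
    if key not in store:
        store[key] = dict(counts)
    elif collision_action == "aggregate":
        # Same sequence seen again: add counts outcome by outcome
        for label, n in counts.items():
            store[key][label] = store[key].get(label, 0) + n
    elif collision_action == "keepseparate":
        # Same sequence seen again: store it under a key tagged with '#<n>'
        n_tag = 1
        while key + ("#%d" % n_tag,) in store:
            n_tag += 1
        store[key + ("#%d" % n_tag,)] = dict(counts)
    else:
        raise ValueError("unknown collision action: %r" % collision_action)

agg, sep = {}, {}
for counts in ({'plus': 10, 'minus': 90}, {'plus': 40, 'minus': 60}):
    add_counts(agg, ('Gx', 'Gy'), counts, "aggregate")
    add_counts(sep, ('Gx', 'Gy'), counts, "keepseparate")

print(agg)
print(sep)
```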




### `measurementGates` argument
When creating a `DataSet` one may facilitate the analysis of **intermediate measurements** within a gate sequence by specifying an association among gate labels via the `measurementGates` argument.  We'll use the term "measurement-gate" to mean a gate that also has classical outputs.  The `measurementGates` argument may be set to a dictionary whose keys are measurement-gate names (user defined) and whose values are lists of gate labels which correspond to performing the measurement-gate and getting different outcomes.  Note that it doesn't matter how many outcomes the gate has (it doesn't need to correspond to the end-measurement outcomes, for instance).

Internally, when computing the total number of counts for a gate sequence, the `DataSet` will sum the counts for *all* of the possible outcomes of a measurement gate.  This allows standard GST protocols to estimate the gates corresponding to a single outcome as *trace-reducing* maps.  Finally, note that `generate_fake_data` also accepts a similar `measurementGates` argument for generating simulated intermediate-measurement data (see its docstring for more info).

Below we provide a brief demonstration of a `DataSet` created with a measurement gate.

In [9]:
ds_mg = pygsti.obj.DataSet(spamLabels=('plus','minus'),
                           measurementGates={'Zmeas': ['Gmz_plus', 'Gmz_minus']})

ds_mg.add_count_dict(('Gx',), {'plus': 10, 'minus': 90})
ds_mg.add_count_dict(('Gy',), {'plus': 20, 'minus': 80})
ds_mg.add_count_dict(('Gmz_plus',), {'plus': 10, 'minus': 90})
ds_mg.add_count_dict(('Gmz_minus',), {'plus': 50, 'minus': 50})
ds_mg.add_count_dict(('Gmz_plus','Gx'), {'plus': 100, 'minus': 800})
ds_mg.add_count_dict(('Gmz_minus','Gx'), {'plus': 50, 'minus': 50})
ds_mg.add_count_dict(('Gmz_minus','Gy'), {'plus': 20, 'minus': 20})

#Print the contents of ds_mg in detail -- note the totals for the
# gate sequences containing the measurement gate elements Gmz_*
# have aggregated total counts.
for g in ds_mg:
    counts = {}
    fracs = {}
    for sl in ds_mg.get_spam_labels():
        counts[sl] = ds_mg[g][sl]
        fracs[sl] = ds_mg[g].fraction(sl)
    print( str(tuple(g)) + ":" + "\ncounts = " + str(counts) 
          + "\nfracs = " + str(fracs)
          + "\ntotal = %d\n" % ds_mg[g].total())

('Gx',):
counts = {'minus': 90.0, 'plus': 10.0}
fracs = {'minus': 0.90000000000000002, 'plus': 0.10000000000000001}
total = 100

('Gy',):
counts = {'minus': 80.0, 'plus': 20.0}
fracs = {'minus': 0.80000000000000004, 'plus': 0.20000000000000001}
total = 100

('Gmz_plus',):
counts = {'minus': 90.0, 'plus': 10.0}
fracs = {'minus': 0.45000000000000001, 'plus': 0.050000000000000003}
total = 200

('Gmz_minus',):
counts = {'minus': 50.0, 'plus': 50.0}
fracs = {'minus': 0.25, 'plus': 0.25}
total = 200

('Gmz_plus', 'Gx'):
counts = {'minus': 800.0, 'plus': 100.0}
fracs = {'minus': 0.80000000000000004, 'plus': 0.10000000000000001}
total = 1000

('Gmz_minus', 'Gx'):
counts = {'minus': 50.0, 'plus': 50.0}
fracs = {'minus': 0.050000000000000003, 'plus': 0.050000000000000003}
total = 1000

('Gmz_minus', 'Gy'):
counts = {'minus': 20.0, 'plus': 20.0}
fracs = {'minus': 0.5, 'plus': 0.5}
total = 40
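
The totals printed above can be reproduced by hand.  The sketch below is a plain-Python reconstruction of the summing rule described earlier, not pyGSTi's implementation; `sibling_total` is a hypothetical helper.  For each sequence it replaces every measurement-gate label by each of its alternatives and sums the counts of all resulting sequences that are present in the data.

```python
from itertools import product

measurement_gates = {'Zmeas': ['Gmz_plus', 'Gmz_minus']}
counts = {
    ('Gx',): {'plus': 10, 'minus': 90},
    ('Gy',): {'plus': 20, 'minus': 80},
    ('Gmz_plus',): {'plus': 10, 'minus': 90},
    ('Gmz_minus',): {'plus': 50, 'minus': 50},
    ('Gmz_plus', 'Gx'): {'plus': 100, 'minus': 800},
    ('Gmz_minus', 'Gx'): {'plus': 50, 'minus': 50},
    ('Gmz_minus', 'Gy'): {'plus': 20, 'minus': 20},
}

# Map each outcome label to the full list of outcomes of its measurement gate
alternatives = {}
for outcome_labels in measurement_gates.values():
    for label in outcome_labels:
        alternatives[label] = outcome_labels

def sibling_total(gate_string):
    """Sum counts over every sequence that differs only in measurement outcomes."""
    options = [alternatives.get(label, [label]) for label in gate_string]
    return sum(sum(counts[s].values()) for s in product(*options) if s in counts)

for gate_string in counts:
    print(gate_string, sibling_total(gate_string))
```

This is why `('Gmz_plus',)` reports a `plus` fraction of 10/200 = 0.05 rather than 10/100, and why `('Gmz_minus', 'Gy')` keeps a total of 40: no `('Gmz_plus', 'Gy')` data was added.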



## The `MultiDataSet` object: a dictionary of `DataSet`s
Sometimes it is useful to deal with several sets of data all of which hold counts for the *same* set of gate sequences.  For example, collecting data to perform GST on Monday and then again on Tuesday, or making an adjustment to an experimental system and re-taking data, could create two separate data sets with the same sequences.  PyGSTi has a separate data type, `pygsti.objects.MultiDataSet`, for this purpose.  A `MultiDataSet` looks and acts like a simple dictionary of `DataSet` objects, but underneath implements certain optimizations that reduce the amount of space and memory required to store the data.  Primarily, it holds just a *single* list of the gate sequences - as opposed to an actual dictionary of `DataSet`s in which each `DataSet` contains its own copy of the gate sequences.  In addition to being more space efficient, a `MultiDataSet` is able to aggregate all of its data into a single "summed" `DataSet` via `get_datasets_sum(...)`, which can be useful for combining several "passes" of experimental data.

Several remarks regarding a `MultiDataSet` are worth mentioning:
- you add `DataSets` to a `MultiDataSet` using the `add_dataset` method.  However only *static* `DataSet` objects can be added.  This is because the `MultiDataSet` must keep all of its `DataSet`s locked to the same set of sequences, and a non-static `DataSet` would allow sequences to be added to or removed from that `DataSet` alone.  (If the `DataSet` you want to add isn't in static-mode, call its `done_adding_data` method.)
- you can also create and add a `DataSet` via the `add_dataset_counts` method, which takes a 2D numpy array of count data instead of a `DataSet` object.  This can be more convenient than `add_dataset` in certain circumstances.
- square-bracket indexing accesses the `MultiDataSet` as if it were a dictionary of `DataSets`.
- `MultiDataSets` can be loaded from and saved to a single text-format file with columns for each contained `DataSet` - see `pygsti.io.load_multidataset`.

Here's a brief example of using a `MultiDataSet`:

In [10]:
multiDS = pygsti.objects.MultiDataSet()

#Create some datasets                                           
ds = pygsti.objects.DataSet(spamLabels=['plus','minus'])
ds.add_count_dict( (), {'plus': 10, 'minus': 90} )
ds.add_count_dict( ('Gx',), {'plus': 10, 'minus': 90} )
ds.add_counts_1q( ('Gx','Gy'), 20, 80 )
ds.add_counts_1q( ('Gx','Gx','Gx','Gx'), 20, 80 )
ds.done_adding_data()

ds2 = pygsti.objects.DataSet(spamLabels=['plus','minus'])            
ds2.add_count_dict( (), {'plus': 15, 'minus': 85} )
ds2.add_count_dict( ('Gx',), {'plus': 5, 'minus': 95} )
ds2.add_count_dict( ('Gx','Gy'), {'plus': 30, 'minus': 70} )
ds2.add_count_dict( ('Gx','Gx','Gx','Gx'), {'plus': 40, 'minus': 60} )
ds2.done_adding_data()

multiDS['myDS'] = ds
multiDS['myDS2'] = ds2

nDatasets = len(multiDS)  # number of contained DataSets (not gate strings)
dslabels = list(multiDS.keys())
print("MultiDataSet has %d DataSets with labels %s" % (nDatasets, dslabels))

for dslabel in multiDS:
    ds = multiDS[dslabel]
    print("Empty string data for %s = " % dslabel, ds[()])

for ds in multiDS.itervalues():
    print("Gx string data (no label) =", ds[('Gx',)])

for dslabel,ds in multiDS.iteritems():
    print("GxGy string data for %s =" % dslabel, ds[('Gx','Gy')])

dsSum = multiDS.get_datasets_sum('myDS','myDS2')
print("\nSummed data:")
print(dsSum)


MultiDataSet has 2 DataSets with labels ['myDS', 'myDS2']
Empty string data for myDS =  {'minus': 90.0, 'plus': 10.0}
Empty string data for myDS2 =  {'minus': 85.0, 'plus': 15.0}
Gx string data (no label) = {'minus': 90.0, 'plus': 10.0}
Gx string data (no label) = {'minus': 95.0, 'plus': 5.0}
GxGy string data for myDS = {'minus': 80.0, 'plus': 20.0}
GxGy string data for myDS2 = {'minus': 70.0, 'plus': 30.0}

Summed data:
{}  :  {'minus': 175.0, 'plus': 25.0}
Gx  :  {'minus': 185.0, 'plus': 15.0}
GxGy  :  {'minus': 150.0, 'plus': 50.0}
GxGxGxGx  :  {'minus': 140.0, 'plus': 60.0}
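
The storage idea, one shared list of gate sequences plus one table of counts per data set, can be sketched with plain Python containers.  This is only an illustration of the layout described above, not the actual `MultiDataSet` internals; the names `gate_strings`, `count_tables`, `row` and `summed` are hypothetical.

```python
# One shared list of gate sequences ...
gate_strings = [(), ('Gx',), ('Gx', 'Gy'), ('Gx', 'Gx', 'Gx', 'Gx')]
spam_labels = ['plus', 'minus']

# ... and, per data set label, one row of counts for each sequence above
count_tables = {
    'myDS':  [[10, 90], [10, 90], [20, 80], [20, 80]],
    'myDS2': [[15, 85], [5, 95], [30, 70], [40, 60]],
}

def row(ds_label, gate_string):
    """Look up one row as a {spam label: count} dictionary."""
    i = gate_strings.index(gate_string)
    return dict(zip(spam_labels, count_tables[ds_label][i]))

def summed(*ds_labels):
    """Element-wise sum of the count tables of the named data sets."""
    return [[sum(count_tables[lbl][i][j] for lbl in ds_labels)
             for j in range(len(spam_labels))]
            for i in range(len(gate_strings))]

print(row('myDS2', ('Gx', 'Gy')))
for gate_string, total in zip(gate_strings, summed('myDS', 'myDS2')):
    print(gate_string, dict(zip(spam_labels, total)))
```

Because every table is indexed by the same list, summing data sets reduces to adding rows position by position, which gives the same totals as the summed data printed above.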




In [11]:
multi_dataset_txt = \
"""## Columns = DS0 plus count, DS0 minus count, DS1 plus frequency, DS1 count total                                
{} 0 100 0 100                                                                                                      
Gx 10 90 0.1 100                                                                                                    
GxGy 40 60 0.4 100                                                                                                  
Gx^4 20 80 0.2 100                                                                                                  
"""

with open("tutorial_files/TinyMultiDataset.txt","w") as output:
    output.write(multi_dataset_txt)
multiDS_fromFile = pygsti.io.load_multidataset("tutorial_files/TinyMultiDataset.txt", cache=True)

print("\nLoaded from file:\n")
print(multiDS_fromFile)

Loading tutorial_files/TinyMultiDataset.txt: 100%
Writing cache file (to speed future loads): tutorial_files/TinyMultiDataset.txt.cache

Loaded from file:

MultiDataSet containing: 2 datasets, each with 4 strings
 Dataset names = DS0, DS1
 SPAM labels = plus, minus
Gate strings: 
{}
Gx
GxGy
Gx^4
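
The file above mixes column types: `DS0` is given as two count columns, while `DS1` is given as a frequency and a total.  The following sketch shows the arithmetic that turns the `DS1` columns into counts.  It is a hypothetical helper, not the loader's code, and rounding to whole counts is an assumption of this sketch.

```python
def counts_from_frequency(plus_frequency, count_total):
    """Convert a 'plus frequency' and a 'count total' into plus/minus counts."""
    plus = round(plus_frequency * count_total)   # assumption: whole counts
    return {'plus': plus, 'minus': count_total - plus}

# DS1 columns of the file above: (plus frequency, count total)
ds1_columns = {'{}': (0, 100), 'Gx': (0.1, 100), 'GxGy': (0.4, 100), 'Gx^4': (0.2, 100)}
ds1_counts = {gs: counts_from_frequency(f, n) for gs, (f, n) in ds1_columns.items()}
print(ds1_counts)
```

With these numbers `DS1` carries the same counts as the `DS0` columns.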



## The `TDDataSet` object: time dependent data
When your data is time-stamped, either for each individual count or by groups of counts, there are additional (richer) options for analysis.  pyGSTi interacts with time-stamped data through the `TDDataSet` object.  A `TDDataSet` holds *series* of count data, added via its `add_series_data` method, rather than binned numbers-of-counts.  Outcome counts are input by giving at least two parallel arrays of 1) SPAM labels (i.e. outcome labels) and 2) time stamps.  Optionally, one can provide a third array of repetitions, specifying how many times the corresponding outcome occurred at the time stamp.  While in reality no two outcomes are taken at exactly the same time, a `TDDataSet` allows for arbitrarily *coarse-grained* time-dependent data in which multiple outcomes are all tagged with the *same* time stamp.

Below we demonstrate how to create and initialize a `TDDataSet`.

In [12]:
#Create an empty dataset                                                                       
tdds = pygsti.objects.TDDataSet(spamLabels=['plus','minus'])

#Add a "single-shot" series of outcomes, where each spam label (outcome) has a separate time stamp
tdds.add_series_data( ('Gx',), #gate sequence                                                                 
                      ['plus','plus','minus','plus','minus','plus','minus','minus','minus','plus'], #spam labels                                                                                                                 
                      [0.0, 0.2, 0.5, 0.6, 0.7, 0.9, 1.1, 1.3, 1.35, 1.5]) #time stamps                                                                                              

#Add a "coarse-grained" series of outcomes: 
# 3 'plus' outcomes at time 0.0, followed by 2 'minus' outcomes at time 1.0
tdds.add_series_data( ('Gy',),  #gate sequence                                                               
                      ['plus','minus'], #spam labels                                                         
                      [0.0, 1.0], #time stamps                                                               
                      [3,2]) #repeats     

#The above coarse-grained addition is logically identical to:
# tdds.add_series_data( ('Gy',),  #gate sequence                                                               
#                       ['plus','plus','plus','minus','minus'], #spam labels                                                         
#                       [0.0, 0.0, 0.0, 1.0, 1.0]) #time stamps                                                               
# (However, the TDDataSet will store the coarse-grained addition more efficiently.)
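
The equivalence between the coarse-grained and single-shot forms can be checked with a few lines of plain Python.  `expand_series` is a hypothetical helper written for this illustration, not a `TDDataSet` method.

```python
def expand_series(labels, times, reps=None):
    """Expand (label, time, repetition) triples into one entry per outcome."""
    if reps is None:
        reps = [1] * len(labels)
    expanded_labels, expanded_times = [], []
    for label, time, rep in zip(labels, times, reps):
        expanded_labels.extend([label] * rep)
        expanded_times.extend([time] * rep)
    return expanded_labels, expanded_times

labels, times = expand_series(['plus', 'minus'], [0.0, 1.0], [3, 2])
print(labels)   # ['plus', 'plus', 'plus', 'minus', 'minus']
print(times)    # [0.0, 0.0, 0.0, 1.0, 1.0]

# Binned counts, as a time-independent DataSet would store them
binned = {label: labels.count(label) for label in ('plus', 'minus')}
print(binned)   # {'plus': 3, 'minus': 2}
```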

`TDDataSet`s have the same "static" and "non-static" modes as a `DataSet`, so when one is done populating the `TDDataSet` with data one should call `done_adding_data`:

In [13]:
tdds.done_adding_data()

Access to the underlying time series data is done by indexing on the gate sequence to get a `TDDataSetRow` object, which has various methods for retrieving its underlying data: 

In [14]:
tdds_row = tdds[('Gx',)]
print("INFO for Gx string:\n")
print( tdds_row )
      
print( "Raw spam label indices:", tdds_row.sli )
print( "Raw time stamps:", tdds_row.time )
print( "Raw repetitions:", tdds_row.reps )
print( "Number of entries in raw arrays:", len(tdds_row) )

print( "Spam Labels:", tdds_row.get_sl() )
print( "Repetition-expanded spam labels:", tdds_row.get_expanded_sl() )
print( "Repetition-expanded spam label indices:", tdds_row.get_expanded_sli() )
print( "Repetition-expanded time stamps:", tdds_row.get_expanded_times() )
print( "DataSet-like counts per spam label:", tdds_row.get_counts() )
print( "DataSet-like total counts:", tdds_row.total() )
print( "DataSet-like spam label fraction:", tdds_row.fraction('plus') )

print("\n")

tdds_row = tdds[('Gy',)]
print("INFO for Gy string:\n")
print( tdds_row )
      
print( "Raw spam label indices:", tdds_row.sli )
print( "Raw time stamps:", tdds_row.time )
print( "Raw repetitions:", tdds_row.reps )
print( "Number of entries in raw arrays:", len(tdds_row) )

print( "Spam Labels:", tdds_row.get_sl() )
print( "Repetition-expanded spam labels:", tdds_row.get_expanded_sl() )
print( "Repetition-expanded spam label indices:", tdds_row.get_expanded_sli() )
print( "Repetition-expanded time stamps:", tdds_row.get_expanded_times() )
print( "DataSet-like counts per spam label:", tdds_row.get_counts() )
print( "DataSet-like total counts:", tdds_row.total() )
print( "DataSet-like spam label fraction:", tdds_row.fraction('plus') )



INFO for Gx string:

Spam Label Indices = [0 0 1 0 1 0 1 1 1 0]
Time stamps = [ 0.    0.2   0.5   0.6   0.7   0.9   1.1   1.3   1.35  1.5 ]
Repetitions = [1 1 1 1 1 1 1 1 1 1]

Raw spam label indices: [0 0 1 0 1 0 1 1 1 0]
Raw time stamps: [ 0.    0.2   0.5   0.6   0.7   0.9   1.1   1.3   1.35  1.5 ]
Raw repetitions: [1 1 1 1 1 1 1 1 1 1]
Number of entries in raw arrays: 10
Spam Labels: ['plus', 'plus', 'minus', 'plus', 'minus', 'plus', 'minus', 'minus', 'minus', 'plus']
Repetition-expanded spam labels: ['plus', 'plus', 'minus', 'plus', 'minus', 'plus', 'minus', 'minus', 'minus', 'plus']
Repetition-expanded spam label indices: [0 0 1 0 1 0 1 1 1 0]
Repetition-expanded time stamps: [ 0.    0.2   0.5   0.6   0.7   0.9   1.1   1.3   1.35  1.5 ]
DataSet-like counts per spam label: OrderedDict([('plus', 5.0), ('minus', 5.0)])
DataSet-like total counts: 10
DataSet-like spam label fraction: 0.5


INFO for Gy string:

Spam Label Indices = [0 1]
Time stamps = [ 0.  1.]
Repetitions = [3 2]

Raw 

Finally, it is possible to read text-formatted time-dependent data in the special case when

1. the outcomes are all single-shot
2. the time stamps of the outcomes are the integers (starting at zero) for *all* of the gate sequences.

This corresponds to the case when each sequence is performed and measured simultaneously at equally spaced intervals.  We realize this is a bit fictitious and more text-format input options will be created in the future.

In [15]:
tddataset_txt = \
"""## 0 = minus
## 1 = plus
{} 011001
Gx 111000111
Gy 11001100
"""
with open("tutorial_files/TDDataset.txt","w") as output:
    output.write(tddataset_txt)
    
tdds_fromfile = pygsti.io.load_tddataset("tutorial_files/TDDataset.txt")

print(tdds_fromfile)

Loading tutorial_files/TDDataset.txt: 100%
{}  :  Spam Label Indices = [0 1 1 0 0 1]
Time stamps = [ 0.  1.  2.  3.  4.  5.]
( no repetitions )

Gx  :  Spam Label Indices = [1 1 1 0 0 0 1 1 1]
Time stamps = [ 0.  1.  2.  3.  4.  5.  6.  7.  8.]
( no repetitions )

Gy  :  Spam Label Indices = [1 1 0 0 1 1 0 0]
Time stamps = [ 0.  1.  2.  3.  4.  5.  6.  7.]
( no repetitions )
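
The text format read above is simple enough to parse by hand.  The sketch below is a stand-alone toy reader, not `pygsti.io.load_tddataset`; `parse_td_text` is hypothetical and assumes the format is exactly as in the example: `## <symbol> = <label>` header lines followed by rows of a gate string and a run of outcome symbols.

```python
def parse_td_text(text):
    """Parse the single-shot text format into label and time-stamp series."""
    symbol_to_label = {}
    series = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith('##'):
            # A header line such as '## 0 = minus' defines an outcome symbol
            symbol, label = [part.strip() for part in line[2:].split('=')]
            symbol_to_label[symbol] = label
            continue
        gate_string, outcomes = line.split()
        series[gate_string] = {
            'labels': [symbol_to_label[ch] for ch in outcomes],
            # Time stamps are simply 0, 1, 2, ... for every sequence
            'times': [float(t) for t in range(len(outcomes))],
        }
    return series

td_txt = """## 0 = minus
## 1 = plus
{} 011001
Gx 111000111
Gy 11001100
"""
parsed_td = parse_td_text(td_txt)
print(parsed_td['{}'])
```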



