# Creating Data
Use this file to create and alter datasets.

## Instructions

Enter the following parameters as desired:

N - the number of ions in the lattice

data_size - the size of the data set to generate before filtration. The right value depends on N: if data_size is too large the program will crash, and if it is too small you won't have enough data left after filtration. The code at the end of this notebook combines compatible data sets as a solution to the second problem.

trapping_strengths - specify the desired trapping strengths in the ion chain
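
As a rough guide when choosing data_size, the sketch below estimates the memory taken by the two largest arrays the notebook creates explicitly: Omega (data_size x N x m) and the vectorized Jijs (data_size x N(N-1)/2). It assumes 4-byte float32 entries and m = N (no detunings removed), and it ignores any intermediate tensors that classesmu allocates internally, so treat the result as a lower bound. The function name `estimate_bytes` is hypothetical.

```python
def estimate_bytes(data_size, N, m, bytes_per_value=4):
    """Lower-bound memory estimate (in bytes) for Omega and the vectorized Jijs."""
    omega_bytes = data_size * N * m * bytes_per_value
    # Number of entries above the main diagonal of an N x N matrix.
    size_Jij = N * (N - 1) // 2
    jij_bytes = data_size * size_Jij * bytes_per_value
    return omega_bytes, jij_bytes


# Values used in the first code cell of this notebook, with m = N.
omega_bytes, jij_bytes = estimate_bytes(data_size=200000, N=4, m=4)
print(f"Omega: {omega_bytes / 1e6:.1f} MB, Jijs: {jij_bytes / 1e6:.1f} MB")
```

Because Omega grows with N*m and the Jijs with N(N-1)/2, a data_size that is comfortable for N=4 may need to be reduced for larger lattices.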

In [1]:
import sys
sys.path.insert(0, '/Users/marinadrygala/Desktop/Marina/')
import classesmu
from classesmu import BatchSimulatedSpinLattice as bsslmu
import pickle
import numpy as np
import torch
import math
import ic_functions
from ic_functions import chain, circle
import time


"""Enter the desired values for the variables below as mentioned in the instructions """
data_size=200000
N=4
trapping_strengths = [5,1]

"""Generates the desired instance of the ionchain object."""
ic = classesmu.IonChain(N,trapping_strengths)


use_cuda = torch.cuda.is_available()
device = torch.device("cuda" if use_cuda else "cpu")



## Instructions Cont'd
Alter the code for $\mu$ as desired.

In [2]:
"""The transverse mode frequencies"""
omega_m = torch.tensor(ic.w_x)

"""The number of entries in the Jij matrix that are above the main diagonal"""
size_Jij = N*(N-1)/2

"""Calculating a predictably useful value for mu"""
mu = torch.zeros(N)
for i in range(N-1):
    mu[i] = omega_m[i] + (omega_m[i+1]-omega_m[i])*0.1
mu[-1] = (omega_m[-1]*1.01)
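
To see what the loop above produces, here is a small worked example with made-up mode frequencies (the values in `omega_m` are hypothetical, not output from IonChain). Each detuning sits 10% of the way from one mode frequency towards the next, and the last one sits 1% above the final mode. The vectorized form with `np.diff` is an equivalent alternative to the loop, written in NumPy rather than torch for brevity.

```python
import numpy as np

# Hypothetical transverse mode frequencies (illustrative values only).
omega_m = np.array([1.0, 2.0, 3.0, 4.0])

mu = np.empty_like(omega_m)
# 10% of the way from each mode frequency to the next one.
mu[:-1] = omega_m[:-1] + 0.1 * np.diff(omega_m)
# 1% above the last mode frequency.
mu[-1] = omega_m[-1] * 1.01

print(mu)
```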



In [None]:
"""For removing desired values of mu"""
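# Boolean mask over the N entries of mu; True means the entry is kept.
indices=torch.ones(N,dtype=torch.bool)
# List the positions of the mu values to drop here (an empty list keeps all of them).
removed_mus=[]
for i in removed_mus:
    indices[i]=False
mu = mu[indices]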

## Instructions Cont'd
We use the functions in the following cell to produce a single number that summarizes each $J_{i,j}$ matrix. Either the gaussian or the linear function could be used; we have used the linear function. The theta function gives an indication of how the weight of the values in the $J_{i,j}$ matrix is distributed in relation to how close ions $i$ and $j$ are. If we pick a minimum threshold value for theta(Jij), we can filter out data that doesn't closely represent the physical system.
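
Written out, the quantity computed in the next cell is the following, as read from the code, with $f_{ij}$ the weight assigned to the pair $i,j$ and $\sigma$ the width parameter of the gaussian:

$$\theta(J) = \frac{2}{N(N-1)} \sum_{i<j} |J_{ij}|\, f_{ij}, \qquad f_{ij}^{\mathrm{linear}} = \frac{1}{|i-j|}, \qquad f_{ij}^{\mathrm{gaussian}} = \exp\!\left(-\frac{(i-j)^2}{2\sigma^2}\right)$$

The prefactor is one over the number of entries above the main diagonal, $N(N-1)/2$.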

In [3]:
"""m is the number of detunings, an important value for training and prediction"""
m=len(mu)

"""The row and column indices for the entries of an NxN matrix that fall above the diagonal"""
row_indices, col_indices = np.triu_indices(N,1)

"""A parameter that can be set as desired if using the gaussian function to filter the data."""
sigma=2.5

"""The gaussian and linear functions serve a similar purpose and, as stated in the instructions, either can be used.
They output a vector whose entries are weights for the corresponding entries of a vectorized Jij matrix.
For an i,j pair, the weight shrinks as the distance between i and j grows (for the linear function it is
exactly 1/|i-j|). This property is then utilized by theta. A higher theta(Jij), where Jij is normalized as described,
indicates that more of the interactions are between ions that are close together, and these Jijs more closely
represent the physical properties we are interested in."""
def gaussian():
    fij = np.exp(-(row_indices-col_indices)**2/(2*sigma**2))
    return fij

def linear():
    fij = np.abs(row_indices-col_indices)
    return 1/fij

filter_func = linear()

def theta(Jij):
    theta = 1/size_Jij*np.sum(np.abs(Jij)*filter_func)
    return theta
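
As an illustration of how the two weightings differ, the sketch below prints both sets of weights for N=4 and the sigma used above. It recomputes the indices itself so that it runs on its own; the variable names `distance`, `linear_weights` and `gaussian_weights` are introduced here for illustration only.

```python
import numpy as np

N = 4
sigma = 2.5  # same value as in the cell above
row_indices, col_indices = np.triu_indices(N, 1)
# Separation |i - j| for every pair above the diagonal.
distance = np.abs(row_indices - col_indices)

linear_weights = 1 / distance
gaussian_weights = np.exp(-distance**2 / (2 * sigma**2))

for i, j, lw, gw in zip(row_indices, col_indices, linear_weights, gaussian_weights):
    print(f"pair ({i},{j}): linear={lw:.3f}, gaussian={gw:.3f}")
```

With these settings the linear weight for the most distant pair (0,3) is 1/3, while the gaussian weight for the same pair is about 0.49, so the linear function penalizes long-range interactions more strongly.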


## Instructions Cont'd
We use the code in the following cell to produce a test set containing the Jijs that we are interested in.
In the cell below that, we apply theta to the Jijs in our test set to find an appropriate minimum threshold value, which we call epsilon.

In [None]:
"""This code makes a test set that includes a chain lattice, a circular lattice and some circular lattices modified
to contain one additional interaction (if N>3). In each of these cases all of the interactions are of the same strength.
Many of the possible interactions are just rotations or reflections of one another and so produce the same theta value.
Only some of these are included, and thus a formula for a reasonable test set size is as follows."""

test_set_size=math.floor(N/2)+1

"""chain is a function located in ic_functions and produces a vectorized form of a chain lattice."""
chain_lattice=chain(N)

"""stacking the necessary copies of the chain lattice to be modified as needed, to produce the desired test set."""
test_set = np.tile(chain_lattice,(test_set_size,1))
"""modifying the entries of the matrix to produce desired test set"""
test_set[:,N-2]=np.ones((test_set_size))
test_set[0,N-2]=0
id_size=test_set_size-2
test_set[2:2+id_size,1:1+id_size]=np.identity(id_size)

"""converting to a torch tensor"""
test_set=torch.from_numpy(test_set)

"""Normalizing the test set"""
norms=torch.sqrt((test_set**2).sum(1))
test_set=torch.einsum('ij,i->ij',[test_set,1/norms])



In [None]:
"""Applying theta function along each row of the test set. The threshold value epsilon is set to be the minimum value
of theta produced by any Jij in the test set, rounded down to 2 decimal places. Epsilon will be used later for filtration"""
np_test_set = test_set.numpy()
thetas=np.apply_along_axis(theta,1,np_test_set)
epsilon = min(thetas) // 0.01 / 100
print(thetas)
print(epsilon)
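
The threshold calculation can be checked by hand on a tiny case. The sketch below uses N=4 with a chain and a circular lattice written out explicitly; these hand-written vectors are an assumption about what the test-set construction produces, not output from ic_functions. It uses the linear weights and the same flooring expression as the cell above.

```python
import numpy as np

N = 4
size_Jij = N * (N - 1) // 2
row_indices, col_indices = np.triu_indices(N, 1)
filter_func = 1 / np.abs(row_indices - col_indices)  # linear weights


def theta(Jij):
    return np.sum(np.abs(Jij) * filter_func) / size_Jij


# Pairs are ordered (0,1), (0,2), (0,3), (1,2), (1,3), (2,3).
chain_lattice = np.array([1.0, 0.0, 0.0, 1.0, 0.0, 1.0])   # nearest neighbours only
circle_lattice = np.array([1.0, 0.0, 1.0, 1.0, 0.0, 1.0])  # chain plus the (0,3) link
test_set = np.stack([chain_lattice, circle_lattice])
# Normalize each row to unit Euclidean norm, as the notebook does.
test_set = test_set / np.linalg.norm(test_set, axis=1, keepdims=True)

thetas = np.apply_along_axis(theta, 1, test_set)
epsilon = thetas.min() // 0.01 / 100  # floor to two decimal places
print(thetas, epsilon)
```

Flooring rather than rounding keeps epsilon at or below the smallest theta in the test set, which matters because the filter later uses a strict `theta > epsilon` comparison.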

## Instructions Cont'd
The following cells are used to generate and store data.

If more data is needed than can be generated with the available memory, the last cell is used to combine multiple data files into a single one.

In [None]:
"""Omega should be a tensor of dimension data_size x N x m. We draw the values of Omega from a uniform random
distribution on the interval [-1,1)"""
#torch.manual_seed(2)
Omega = 2*torch.rand(data_size,N,m)-1
"""bsslmu is imported from the classesmu file. Calling .normalize() on the object it creates returns a tensor of
dimension data_size x size_Jij, where the ith row of the output is the vectorized form of the Jij produced by the
ith slice of Omega."""
Jijs = bsslmu(ic, mu, Omega, dev=device).normalize()
"""We reshape Omega to be of dimension data_size x (N*m) for training purposes"""
Omegas=Omega.reshape(data_size,-1)



In [None]:
"""Filtering Data so that any Jij,Omega pair in our data set that does not satisfy theta(Jij)>epsilon is removed"""
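# NumPy cannot read a tensor that lives on the GPU or tracks gradients, so move a plain copy to the CPU first.
thetas=torch.from_numpy(np.apply_along_axis(theta,1,Jijs.detach().cpu().numpy()))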
indices=thetas>epsilon
Jijs_filt=Jijs[indices]
Omegas_filt=Omegas[indices]
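
The filtering step is ordinary boolean-mask indexing applied to both arrays with the same mask, so each Jij stays paired with its Omega. Below is a minimal NumPy sketch on made-up data (the arrays and theta values are hypothetical stand-ins for the tensors above); it also reports the fraction kept, which is a quick way to judge whether data_size was large enough.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stand-ins: 5 samples, 6 Jij entries and 16 Omega entries each.
Jijs = rng.random((5, 6))
Omegas = rng.random((5, 16))
thetas = np.array([0.31, 0.12, 0.28, 0.27, 0.05])
epsilon = 0.27

keep = thetas > epsilon  # strict inequality, as in the cell above
Jijs_filt = Jijs[keep]
Omegas_filt = Omegas[keep]

print(f"kept {keep.sum()} of {len(keep)} samples ({keep.mean():.0%})")
```

Note that the sample with theta exactly equal to epsilon is dropped, since the comparison is strict.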

In [None]:
"""Filtering this way is inefficient and may not produce as much data as we would like. The length of the filtered
data set helps us determine if we should run the program again."""
len(Jijs_filt)

In [None]:
"""Saving the data into a dictionary. When we train our network the inputs will be the Jijs and the outputs the Omegas.
At each training epoch we will be interested in how well our network can predict an appropriate Omega for the Jijs in
our test set. To generate network sizes and network predictions it will be necessary to save the ionchain and the
values of mu, as well as m and the list of mus removed from our original mu."""
d={'inputs': Jijs_filt,
  'outputs': Omegas_filt,
   'test': test_set,
  'ic': ic,
  'm':m,
  'mu':mu,
  'removed_mus': removed_mus}

"""The program will save the data into the specified filename"""
filename = 'Data_N={}_m={}_Epsilon={}_Size={}.pickle'.format(N,m,epsilon,len(Jijs_filt))
# The with-block closes the file even if pickling fails.
with open(filename, 'wb') as f:
    pickle.dump(d, f)
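
A quick way to confirm that a saved file is usable is to load it back and check its keys. The sketch below does this round trip on a miniature, made-up dictionary with the same keys, written to a temporary directory; the values and the file name `example.pickle` are hypothetical.

```python
import os
import pickle
import tempfile

# Hypothetical miniature data set with the same keys as the dictionary above.
d = {'inputs': [[0.5, 0.5]], 'outputs': [[0.1, -0.2]], 'test': [[1.0, 0.0]],
     'ic': None, 'm': 2, 'mu': [1.1, 2.02], 'removed_mus': []}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'example.pickle')
    with open(path, 'wb') as f:
        pickle.dump(d, f)
    with open(path, 'rb') as f:
        loaded = pickle.load(f)

print(sorted(loaded.keys()))
```

When loading a real data file, the module that defines the stored ionchain object (classesmu here) must be importable, because pickle stores a reference to the class rather than its code.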

In [None]:
"""Code to combine data sets created using the same ionchain and mu."""
# d1 = pickle.load(open('Data_N=10_m=9_Epsilon=0.06_Size=1715.pickle', "rb"))
# d2 = pickle.load(open('Data_N=10_m=9_Epsilon=0.06_Size=96413.pickle', "rb"))
# d3 = pickle.load(open('Data_N=10_m=9_Epsilon=0.06_Size=1687.pickle', "rb"))



# d={'inputs': torch.cat((d1['inputs'],d2['inputs'],d3['inputs'])),
#   'outputs': torch.cat((d1['outputs'],d2['outputs'],d3['outputs'])),
#    'test': d1['test'],
#   'ic': d1['ic'],
#   'm':d1['m'],
#   'mu':d1['mu'], 
#   'removed_mus':d1['removed_mus']}
# filename = 'Data_N={}_m={}_Epsilon={}_Size={}.pickle'.format(N,m,epsilon,
#                                                          len(d1['inputs'])+len(d2['inputs'])+len(d3['inputs']))
# f = open(filename, 'wb')
# pickle.dump(d, f)
# f.close()

# len(d['inputs'])
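
Since data sets are only compatible when they were created with the same ionchain and mu, it can help to check this before concatenating. The sketch below is one possible way to do that; the helper `combine` is hypothetical, and it uses NumPy arrays as stand-ins, so with the torch tensors stored by this notebook `np.concatenate` would be replaced by `torch.cat`. It compares m, mu and removed_mus only and does not attempt to compare the ionchain objects.

```python
import numpy as np


def combine(datasets):
    """Concatenate data sets after checking they share m, mu and removed_mus."""
    first = datasets[0]
    for other in datasets[1:]:
        if other['m'] != first['m'] or list(other['removed_mus']) != list(first['removed_mus']):
            raise ValueError("data sets use different m or removed_mus")
        if not np.allclose(other['mu'], first['mu']):
            raise ValueError("data sets use different mu")
    combined = dict(first)  # copies 'test', 'ic', 'm', 'mu' and 'removed_mus'
    combined['inputs'] = np.concatenate([d['inputs'] for d in datasets])
    combined['outputs'] = np.concatenate([d['outputs'] for d in datasets])
    return combined


# Hypothetical miniature data sets.
d1 = {'inputs': np.zeros((2, 6)), 'outputs': np.zeros((2, 16)), 'test': None,
      'ic': None, 'm': 4, 'mu': np.array([1.1, 2.1, 3.1, 4.04]), 'removed_mus': []}
d2 = dict(d1, inputs=np.ones((3, 6)), outputs=np.ones((3, 16)))
d = combine([d1, d2])
print(len(d['inputs']))
```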

In [8]:
with open('/Users/marinadrygala/Desktop/Marina/mu_fixed/Data/Data_N=4_Epsillon=0.27_Size=180984.pickle',"rb") as f:
    d=pickle.load(f)
d['m']=m
d['mu']=mu
#d['removed_mus']=removed_mus

with open('/Users/marinadrygala/Desktop/Marina/mu_fixed/Data/Data_N=4_Epsillon=0.27_Size=180984_2.pickle', 'wb') as f:
    pickle.dump(d, f)