# Lab 2 - NumPy

Read through the following notebook to get an introduction to numpy: [Numpy Intro](jrjohansson-lectures/Lecture-2-Numpy.ipynb)

## Exercise 2.1

Let's start with some basic reshape manipulations. Consider a classification task. We can imagine the training data X consisting of N examples, each with M inputs, so the shape of X is (N,M). We usually express the target output of the neural network, which for the training sample encodes the true class of each of the N examples in X, in a "one-hot" matrix Y of shape (N,C), where C is the number of classes and each row corresponds to the true class of the corresponding example in X. So for a given row Y[i], all elements are 0 except for a 1 in the column corresponding to the true class.

For example, consider a classification task that separates 4 classes. We'll call them A, B, C, and D.


In [3]:
import numpy as np

Y=np.array( [ [0, 1, 0, 0], # Class B
              [1, 0, 0, 0], # Class A
              [0, 0, 1, 0], # Class C
              [0, 0, 0, 1]  # Class D
            ])

print("Shape of Y:", Y.shape)

Shape of Y: (4, 4)
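
As a minimal sketch of how such a one-hot matrix can be built from integer class labels, one can index the rows of an identity matrix. The mapping A=0, B=1, C=2, D=3 and the names `labels`, `num_classes`, `Y_from_labels`, and `recovered` are illustrative assumptions, not part of the original lab.

```python
import numpy as np

# Assumed integer encoding of the classes: A=0, B=1, C=2, D=3.
labels = np.array([1, 0, 2, 3])  # B, A, C, D, matching the rows of Y above

# Row i of the identity matrix is the one-hot vector for class i,
# so indexing with `labels` picks one such row per example.
num_classes = 4
Y_from_labels = np.eye(num_classes, dtype=int)[labels]

print(Y_from_labels)
print("Shape:", Y_from_labels.shape)

# np.argmax along axis 1 recovers the integer labels from the one-hot rows.
recovered = np.argmax(Y_from_labels, axis=1)
print(recovered)
```

Going back and forth between integer labels and one-hot rows this way is a convenient check that a reshaped target matrix still encodes the classes you expect.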


Let's imagine that we want to reduce this to 2 classes by combining class A with B and class C with D. Use np.reshape and np.sum to create a new array Y1. Hint: change the shape of Y into (8,2), sum along the correct axis, and change the shape to (4,2).

In [4]:
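# Y1 = Y  # Replace Y with operations on Y which result in the requested answer.

Y1 = np.reshape(Y, (8, 2))   # each row of Y becomes two rows: (A, B), then (C, D)
Y1 = np.sum(Y1, axis=1)      # sum each pair, giving 8 values
Y1 = np.reshape(Y1, (4, 2))  # one row per example: [A or B, C or D]
print(Y1)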


[[1 0]
 [1 0]
 [0 1]
 [0 1]]
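
The same merge can also be written in a single step. The sketch below is one possible alternative rather than the required solution: it views each row of Y as 2 groups of 2 columns and sums within each group. The name `Y1_alt` is illustrative.

```python
import numpy as np

Y = np.array([[0, 1, 0, 0],   # Class B
              [1, 0, 0, 0],   # Class A
              [0, 0, 1, 0],   # Class C
              [0, 0, 0, 1]])  # Class D

# Shape (4, 2, 2): example, merged class (A+B or C+D), original class within the pair.
# Summing over the last axis collapses each pair into a single column.
Y1_alt = Y.reshape(4, 2, 2).sum(axis=2)
print(Y1_alt)
```

This works because reshape keeps the row-major element order, so columns (A, B) and (C, D) of each example stay adjacent.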


## Exercise 2.2

Oftentimes we find that neural networks work best when their input is mostly between 0 and 1. Below, we create a random dataset that is normally distributed (mean of 4, sigma of 10). Shift and scale the data so that the mean is 0.5 and about 68% of the data lies between 0 and 1.

In [5]:
X=np.random.normal(4,10,1000)
print(np.mean(X))
print(np.min(X))
print(np.max(X))

4.21414303669
-29.914494204
41.2006876759


In [6]:
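# Map mean 4 -> 0.5 and sigma 10 -> 0.5, so that mean +/- 1 sigma spans [0, 1]:
# X1 = (X - mean) / (2 * sigma) + 0.5 = (X + 6) / 20
X1 = (X + 6) / 20  # vectorized; leaves X itself unchanged
print(np.mean(X1))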

0.510707151834
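
The factors 6 and 20 follow from the nominal mean and sigma. As a hedged sketch, the same transformation can be written in terms of the sample mean and standard deviation, and the 68% requirement can be checked directly. The seed and the names `mu`, `sigma`, `X_scaled`, and `frac_inside` are illustrative choices.

```python
import numpy as np

np.random.seed(0)  # illustrative seed, only to make the sketch repeatable
X = np.random.normal(4, 10, 1000)

# Map mean -> 0.5 and sigma -> 0.5, so that [mean - sigma, mean + sigma] maps to [0, 1].
mu = np.mean(X)
sigma = np.std(X)
X_scaled = (X - mu) / (2 * sigma) + 0.5

# Fraction of entries between 0 and 1; about 0.68 is expected for normal data.
frac_inside = np.mean((X_scaled >= 0) & (X_scaled <= 1))

print("mean:", np.mean(X_scaled))
print("sigma:", np.std(X_scaled))
print("fraction in [0, 1]:", frac_inside)
```

Using the sample statistics makes the mean come out at 0.5 up to rounding, whereas the nominal values give a result that fluctuates with the particular random draw.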


## Exercise 2.3

Use np.random.random and np.random.normal to generate two datasets. Then use np.where to repeat exercise 1.4, showing that one creates a flat distribution and the other does not.

In [7]:
A1 = np.random.random(1000)            # uniform on [0, 1)
A2 = np.random.normal(0.5, 0.5, 1000)  # normal with mean 0.5, sigma 0.5

N1 = np.where(A1<0.25)
N4 = np.where(A1>0.75)
c1 = np.where(A1<0.5)
N2 = np.setdiff1d(c1, N1)
c2 = np.where(A1>0.5)
N3 = np.setdiff1d(c2, N4)
SN1 = np.size(N1)
SN2 = np.size(N2)
SN3 = np.size(N3)
SN4 = np.size(N4)

P1 = np.where(A2<0.25)
P4 = np.where(A2>0.75)
d1 = np.where(A2<0.5)
P2 = np.setdiff1d(d1, P1)
d2 = np.where(A2>0.5)
P3 = np.setdiff1d(d2, P4)
SP1 = np.size(P1)
SP2 = np.size(P2)
SP3 = np.size(P3)
SP4 = np.size(P4)

print("Number of Entries passing N1:", SN1)
print("Number of Entries passing N2:", SN2)
print("Number of Entries passing N3:", SN3)
print("Number of Entries passing N4:", SN4)

print("Number of Entries passing P1:", SP1)
print("Number of Entries passing P2:", SP2)
print("Number of Entries passing P3:", SP3)
print("Number of Entries passing P4:", SP4)

Number of Entries passing N1: 260
Number of Entries passing N2: 231
Number of Entries passing N3: 239
Number of Entries passing N4: 270
Number of Entries passing P1: 297
Number of Entries passing P2: 207
Number of Entries passing P3: 202
Number of Entries passing P4: 294
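
The four counts are roughly equal for the uniform sample. For the normal sample the two outer counts are larger, because they also collect the tails below 0 and above 1. A more compact way to get the same counts is sketched below, combining boolean masks inside np.where. The helper `count_quarters`, the seed, and the variable names are hypothetical and only serve this illustration.

```python
import numpy as np

def count_quarters(a):
    """Count entries of `a` below 0.25, in [0.25, 0.5), in [0.5, 0.75), and at or above 0.75."""
    masks = [
        a < 0.25,
        (a >= 0.25) & (a < 0.5),
        (a >= 0.5) & (a < 0.75),
        a >= 0.75,
    ]
    # np.where(mask) returns a tuple holding the array of passing indices.
    return [np.where(m)[0].size for m in masks]

np.random.seed(0)  # illustrative seed
flat = np.random.random(1000)
peaked = np.random.normal(0.5, 0.5, 1000)

flat_counts = count_quarters(flat)
peaked_counts = count_quarters(peaked)
print("uniform:", flat_counts)
print("normal: ", peaked_counts)
```

Because the four masks cover every value exactly once, each list of counts sums to the number of entries.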


## Exercise 2.4

Now let's play with some real data. We will load a file of example neutrino interactions in a LArTPC detector. There are 2 readout planes in the detector with 240 wires each, sampled 4096 times. Shift the images in the same way as in exercise 2.2.

In [8]:
import h5py
f=h5py.File("/data/LArIAT/h5_files/nue_CC_3-1469384613.h5","r")
print(list(f.keys()))
images=f["features"] #interested in features
print(images.shape)

[u'Eng', u'Track_length', u'enu_truth', u'features', u'lep_mom_truth', u'mode_truth', u'pdg']
(2500, 2, 240, 4096)


In [9]:
print(images[0])

# Min-max scaling: map the smallest value to 0 and the largest to 1.
imagmin = np.min(images)
imagmax = np.max(images)
imagrange = imagmax - imagmin
normal_images = (images - imagmin)/imagrange

print("done")

[[[ 0.  0.  0. ...,  0.  0.  0.]
  [ 0.  0.  0. ...,  0.  0.  0.]
  [ 0. -1. -1. ...,  0.  0.  0.]
  ..., 
  [ 0.  1.  1. ...,  0.  0.  0.]
  [ 0.  0.  0. ...,  0.  0.  0.]
  [ 0.  0.  0. ...,  0.  0.  0.]]

 [[ 0.  0.  0. ...,  0.  0.  0.]
  [-1. -1.  0. ..., -1. -1. -1.]
  [ 0.  0.  0. ...,  0.  0.  0.]
  ..., 
  [-1. -1. -1. ..., -1. -1. -1.]
  [ 0.  0.  0. ...,  0.  0.  0.]
  [ 0.  0.  0. ...,  0.  0.  0.]]]
done
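
Note that the cell above rescales by the minimum and the range, which maps the values onto [0, 1] but does not fix the mean and sigma the way exercise 2.2 asked. A minimal sketch of the mean/sigma version follows. It uses a small synthetic array as a stand-in for the HDF5 data, so the shape, the random contents, and the names `fake_images` and `shifted_images` are assumptions chosen only so the example runs without the data file.

```python
import numpy as np

np.random.seed(0)  # illustrative seed
# Synthetic stand-in for f["features"]: (events, planes, wires, samples).
# The real dataset has shape (2500, 2, 240, 4096); a much smaller one is used here.
fake_images = np.random.normal(0.0, 3.0, size=(10, 2, 240, 64))

# Same transformation as in exercise 2.2: mean -> 0.5, sigma -> 0.5.
mu = fake_images.mean()
sigma = fake_images.std()
shifted_images = (fake_images - mu) / (2 * sigma) + 0.5

print("mean:", shifted_images.mean())
print("sigma:", shifted_images.std())
```

For the full file it may be necessary to work on slices such as `images[i:j]`, since slicing an h5py dataset reads only that part into a NumPy array.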
