# Common Excel tasks demonstrated in NumPy

In [1]:
import numpy as np

## Mean of a list/array:

In [2]:
x = np.arange(1,1001)
print(np.mean(x))

500.5


## Sum of a list/array:

In [3]:
npsum = np.sum(np.arange(1,100))
print(npsum)

4950
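In a spreadsheet we usually want a total or an average per column or per row, not just one grand total. A minimal sketch of how that could look, assuming a small made-up table called `sales` (the name and the values are only for illustration) and using the `axis` argument of `np.sum` and `np.mean`:

```python
import numpy as np

# Hypothetical 3x4 table: 3 rows, 4 columns, holding the values 1..12
sales = np.arange(1, 13).reshape(3, 4)

col_totals = np.sum(sales, axis=0)   # collapse the rows: one total per column
row_totals = np.sum(sales, axis=1)   # collapse the columns: one total per row
col_means = np.mean(sales, axis=0)   # one average per column

print(col_totals)
print(row_totals)
print(col_means)
```

`axis=0` works down the rows (like a total at the bottom of each column), while `axis=1` works across the columns (like a total at the end of each row).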


## Rounding off all elements of a list/array:

In [4]:
unrounded = np.random.uniform(10,20,[3,3])
print("Before rounding:\n",unrounded)

# rounding it to the nearest integer
roundedint = np.around(unrounded)
print("\nNearest integer:\n",roundedint)

# rounding it off to 3 decimal places:
rounded3dp = np.around(unrounded, decimals = 3)
print("\n3 decimal points:\n",rounded3dp)

# rounding it off to nearest 10s:
rounded10s = np.around(unrounded, decimals = -1)
print("\nNearest 10s:\n",rounded10s)

Before rounding:
 [[12.9952399  10.55737258 19.18578246]
 [16.89150082 10.70374339 15.17674215]
 [17.7017954  17.79360967 16.41009571]]

Nearest integer:
 [[13. 11. 19.]
 [17. 11. 15.]
 [18. 18. 16.]]

3 decimal points:
 [[12.995 10.557 19.186]
 [16.892 10.704 15.177]
 [17.702 17.794 16.41 ]]

Nearest 10s:
 [[10. 10. 20.]
 [20. 10. 20.]
 [20. 20. 20.]]
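One thing worth knowing if you are coming from Excel: `np.around` rounds values that sit exactly halfway to the nearest even number, whereas Excel's `ROUND` rounds them away from zero. A small sketch of the difference (the array `halves` and the `floor(x + 0.5)` workaround are only illustrative, and the workaround mimics Excel for non-negative values only):

```python
import numpy as np

halves = np.array([0.5, 1.5, 2.5, 3.5])

# NumPy sends ties to the nearest even integer
numpy_style = np.around(halves)

# Illustrative workaround for non-negative values: ties always go up
half_up = np.floor(halves + 0.5)

print(numpy_style)
print(half_up)
```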


## Generating random data (floating point values):

### 1. Generating random data according to required dimensions: 

`np.random.rand(d0,d1,...,dn)`

<sub>The dimensions are written as d0, d1, d2, ..., dn. We can pass as many dimensions as we want.</sub>

In [5]:
x = np.random.rand(2,3,4) #since we specified 3 numbers, the array produced will be 3-dimensional
print(x)

[[[0.19512557 0.79023975 0.41637689 0.55327087]
  [0.61091747 0.41512634 0.60482207 0.91526135]
  [0.55413882 0.18982628 0.56261327 0.76621825]]

 [[0.43174439 0.68229849 0.36019239 0.69302464]
  [0.09338781 0.6424747  0.37738869 0.71534625]
  [0.06558965 0.60220068 0.53581662 0.73774429]]]


In [6]:
y = np.random.rand(2,2,2,2) # now you will see a 4-dimensional array:
print(y)

[[[[0.90997892 0.37497747]
   [0.16848162 0.36186025]]

  [[0.20581691 0.01776121]
   [0.93013559 0.62603002]]]


 [[[0.68246436 0.11797568]
   [0.69325955 0.78287718]]

  [[0.7056686  0.30346444]
   [0.61357739 0.73141731]]]]
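Counting brackets in the printed output gets tedious quickly. As a quick check, the `ndim`, `shape` and `size` attributes of the array report the same information directly (a small sketch reusing the dimensions from the first example):

```python
import numpy as np

x = np.random.rand(2, 3, 4)

print(x.ndim)    # number of dimensions
print(x.shape)   # length along each dimension
print(x.size)    # total number of elements (2 * 3 * 4)
```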


### 2. Sampling random data from a **standard normal distribution** (mean 0, standard deviation 1) according to required dimensions: 

`np.random.randn(d0,d1,...,dn)`

In [7]:
z = np.random.randn(5,4)
print(z)

print(np.mean(z))

[[-2.94122512e-01 -9.92312532e-01  2.55492974e-01 -1.57450530e+00]
 [-3.16507764e-01 -1.28064444e+00 -2.08129294e-01 -3.42077988e-01]
 [-4.96958026e-01  1.59949880e+00 -7.42711134e-02  6.25542708e-01]
 [-4.45861522e-01  1.56017706e+00 -6.04815782e-01  2.58192050e-01]
 [-4.20462689e-01  1.06142680e+00 -1.73619624e+00  1.42941130e-03]]
-0.17125526956676176
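`np.random.randn` always draws from the normal distribution with mean 0 and standard deviation 1, which is why the mean of a small sample lands near 0 but not exactly on it. To get a different mean and spread, we can scale and shift the samples. A minimal sketch, where `mu` and `sigma` are hypothetical values picked just for this example:

```python
import numpy as np

mu, sigma = 50.0, 5.0   # hypothetical target mean and standard deviation

# scale by sigma, then shift by mu
samples = mu + sigma * np.random.randn(100000)

print(samples.mean())   # should be close to mu
print(samples.std())    # should be close to sigma
```

With this many samples the printed mean and standard deviation should be close to 50 and 5, though the exact digits change on every run.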


### 3. Generate random integers from low (inclusive) to high (exclusive): 

`np.random.randint(low[, high, size, dtype])` 

<sub>(the parameters in square brackets are optional)</sub>

According to the [official numpy documentation](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.random.randint.html):

> Return random integers from the “discrete uniform” distribution of the specified dtype in the “half-open” interval \[low, high). If high is None (the default), then results are from \[0, low).

In [8]:
w = np.random.randint(256)
print(w)

153


In [9]:
v = np.random.randint(128,256) #returns a single int between 128 and 256 (256 excluded)
print(v)

240
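To convince ourselves that `high` really is excluded, we can draw a lot of values and look at the smallest and largest ones. A quick sketch (the sample size of 10000 is arbitrary):

```python
import numpy as np

draws = np.random.randint(128, 256, size=10000)

print(draws.min() >= 128)   # low is included, so nothing falls below 128
print(draws.max() <= 255)   # high is excluded, so 256 never appears
```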


In [10]:
u = np.random.randint(128,256,size = [12,3,4]) # returns a 12*3*4 3D array populated with numbers from 128 to 256 (256 excluded)

# this is the same as writing np.random.randint(128,256,[12,3,4]) or np.random.randint(128,256,(12,3,4))
# Writing `size = [...]` explicitly makes the intent clear to the reader.
# We could also write `low=128, high=256` to make the code even more explicit.
print(u)

[[[162 169 245 220]
  [200 238 129 143]
  [210 166 246 238]]

 [[143 243 198 239]
  [128 206 179 216]
  [255 227 227 240]]

 [[183 129 226 147]
  [201 199 173 186]
  [161 180 197 161]]

 [[222 186 222 233]
  [179 201 147 151]
  [212 196 130 228]]

 [[238 159 179 140]
  [199 252 173 163]
  [232 171 199 129]]

 [[213 142 185 135]
  [137 204 145 228]
  [143 219 172 252]]

 [[225 228 175 131]
  [232 180 135 235]
  [139 235 189 243]]

 [[211 187 234 248]
  [236 168 182 195]
  [155 240 145 242]]

 [[159 151 168 171]
  [133 213 133 232]
  [197 172 141 233]]

 [[169 233 216 251]
  [134 236 172 165]
  [225 238 214 129]]

 [[225 215 215 145]
  [173 171 221 247]
  [253 221 183 186]]

 [[147 214 202 145]
  [226 201 131 185]
  [179 196 155 242]]]


The `dtype` is the data type of the elements in the returned array. You may be wondering, if we are already asking for integers, then what is the use of a data type?

The thing is, NumPy has several fixed-width integer types (int8, int16, int32, int64, and unsigned counterparts such as uint8), each of which takes up a specific number of bytes per element. The `dtype` option lets us choose one of them. It is rarely needed unless memory is tight or some other code expects a particular type.
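To see what the choice of `dtype` costs in memory, here is a small sketch comparing two arrays with the same shape and the same value range (the names `small` and `big` and the 100x100 shape are only for illustration):

```python
import numpy as np

# Same shape and value range, different integer widths
small = np.random.randint(0, 256, size=(100, 100), dtype=np.uint8)
big = np.random.randint(0, 256, size=(100, 100), dtype=np.int64)

print(small.dtype, small.nbytes)   # 1 byte per element
print(big.dtype, big.nbytes)       # 8 bytes per element
```

Values from 0 to 255 fit exactly into `uint8`, which is why that type is common for image data.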

### 3.1 Generate random integers of a fixed data type:

`np.random.random_integers(low[, high, size])` => Similar to the previous function, with two differences: `high` is inclusive (the interval is closed, \[low, high]), and we can't choose the data type, which is always NumPy's default integer type (`np.int_`).

It is deprecated (i.e. it still runs in the NumPy version used here, but it is discouraged in favor of `np.random.randint` and may be removed later), which is why the call below triggers a `DeprecationWarning`:

In [11]:
t = np.random.random_integers(0,12,[3,4])

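Since `np.random.random_integers` is deprecated, the same kind of array can be produced with `np.random.randint`. A minimal sketch, assuming (as the NumPy documentation describes) that `random_integers` includes `high`, so we add 1 to the upper bound to keep it in range:

```python
import numpy as np

low, high = 0, 12

# randint excludes its upper bound, so high + 1 keeps 12 possible
t = np.random.randint(low, high + 1, size=[3, 4])
print(t)
```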

### 4. Generate random floats in the half-open interval \[0.0,1.0):

<sub>Half open means that 0.0 is included but 1.0 is excluded</sub>

`np.random.ranf([size])` or 
`np.random.random_sample([size])` or 
`np.random.random([size])` or 
`np.random.sample([size])`

(This time, NumPy provides 4 names for the exact same thing: `ranf`, `random` and `sample` are simply aliases of `random_sample`. Kinda silly in my opinion, but they are most likely kept so that older code keeps working.)

In [12]:
p = np.random.ranf() # if you skip the [size] parameter, then it just returns a single value. 
# Try it out in the other ones by yourself
print(p)
q = np.random.random([2,3])
print(q)
r = np.random.random_sample([2,3])
print(r)
s = np.random.sample([2,3])
print(s)

0.3083429261932271
[[0.19599775 0.37591648 0.39400465]
 [0.19062593 0.36517083 0.7660336 ]]
[[0.95735662 0.25739196 0.53281486]
 [0.6116408  0.25729716 0.18546406]]
[[0.35690429 0.96086701 0.16854734]
 [0.13814293 0.5566175  0.7557197 ]]
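One way to check that these names really do the same thing is to reset the random seed before each call and compare the results. A small sketch using two of the four names (the seed value 42 is arbitrary):

```python
import numpy as np

np.random.seed(42)                    # fix the starting state
a = np.random.random_sample([2, 3])

np.random.seed(42)                    # reset to the same state
b = np.random.random([2, 3])

print(np.array_equal(a, b))           # identical arrays from both names
```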


### 5. Generate random floats in a chosen interval:

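`np.random.uniform([low, high, size])`

<sub>Values are drawn from the half-open interval \[low, high). If omitted, `low` defaults to 0.0 and `high` to 1.0</sub>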

In [13]:
ubiquity = np.random.uniform(0,100, [2,5])
print(ubiquity)

[[92.06969948 98.56645669 28.74051742 98.37955603 51.82740969]
 [68.92030661 56.95085797 79.67224607  4.73077008 44.43947337]]
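If you are used to writing `=RAND()*(b-a)+a` in Excel, the same idea works in NumPy: take floats from \[0.0, 1.0) and stretch them to the interval you want. A minimal sketch (the names `low`, `high` and `scaled` are only for illustration), which gives the same kind of result as `np.random.uniform(0, 100, [2, 5])`:

```python
import numpy as np

low, high = 0, 100

# stretch [0.0, 1.0) to [low, high)
scaled = low + (high - low) * np.random.random_sample([2, 5])
print(scaled)
```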


### 6. Choose values randomly from a list you enter yourself:

`np.random.choice(a[, size, replace, p])`

`a`: The 1-D array you want to choose values from. If you enter a single int `n`, it is treated as the array `np.arange(n)`.

`size`: The shape of the output array. If omitted, a single value is returned.

`replace`: Whether sampling is done with replacement. If `replace=True` (the default), the same element may be drawn more than once; if `replace=False`, each element is drawn at most once, so `size` cannot ask for more values than `a` contains.

`p`: The probability associated with each element. If you want some elements to occur more often than the others, use the `p` array. It must have the same length as `a` and its entries must sum to 1. If omitted, all elements are equally likely.

In [14]:
l = np.random.choice(12)
print(l) # it will choose a single number from np.arange(12), i.e. from 0 to 11

6


In [15]:
m = np.random.choice([123,12,1,0],size = [2,4])
print(m)

[[  1   0   0   0]
 [  1  12  12 123]]


In [16]:
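# mixing numbers with the string "Hi" makes NumPy convert every element to a string (note the quotes in the output)
n = np.random.choice([123,12,1,0,45,65,78,90.5,"Hi"],size=[2,2], replace=True)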
print(n)

o = np.random.choice([123,12,1,0,45,65,78,90.5,"Hi"],size=[2,2], replace=False)
print(o)

[['12' '0']
 ['12' '0']]
[['Hi' '65']
 ['0' '1']]
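The examples above leave out the `p` parameter, so here is a minimal sketch of it together with `replace=False`. It assumes a made-up loaded die (the names `faces` and `weights` and the probabilities are hypothetical) where a 6 comes up half the time:

```python
import numpy as np

faces = [1, 2, 3, 4, 5, 6]
weights = [0.1, 0.1, 0.1, 0.1, 0.1, 0.5]   # hypothetical loaded die, sums to 1

rolls = np.random.choice(faces, size=10000, p=weights)
print(np.mean(rolls == 6))   # fraction of sixes, should be close to 0.5

# replace=False: every face is drawn at most once
unique_draw = np.random.choice(faces, size=6, replace=False)
print(np.sort(unique_draw))
```

Because `replace=False` never repeats an element, drawing all 6 faces simply returns them in a random order.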


### 7. Shuffle a sequence:

`np.random.shuffle(X)`

In [17]:
a = np.arange(20)
print(a)
np.random.shuffle(a)
print(a)

# Note that shuffle changes the array in place, so nothing needs to be assigned.
# You could use this to shuffle your music if you use it along with the ffmpeg library. Definitely go check it out!

[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19]
[ 9  8  4  5  3 12 15 14 13  2 19  6 18  1  0 16 11  7 10 17]
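`np.random.shuffle` rearranges the array it is given and returns nothing. If we want to keep the original order around, `np.random.permutation` returns a shuffled copy instead. A small sketch of the difference (the variable names are only for illustration):

```python
import numpy as np

playlist = np.arange(10)

# permutation leaves the input alone and returns a shuffled copy
shuffled_copy = np.random.permutation(playlist)
print(playlist)        # still in the original order
print(shuffled_copy)

# shuffle works in place and returns None
in_place = np.arange(10)
result = np.random.shuffle(in_place)
print(result)
print(in_place)
```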
