---
> # **Statistical methods, sorting, and set operations**
---

## Basic statistical operations

In [2]:
import numpy as np

**.randn syntax**
<br>
_Return a sample (or samples) from the “standard normal” distribution._
```Python
numpy.random.randn(d0, d1, ..., dn)
```
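
As a rough sketch of how the arguments map to the output shape: each integer passed to `randn` becomes one dimension of the result. The seed value and the variable names below are assumptions for illustration only; seeding just makes the run repeatable.

```python
import numpy as np

np.random.seed(0)  # assumed seed, only so the run is repeatable

scalar = np.random.randn()      # no arguments: a single float
vector = np.random.randn(4)     # one dimension: shape (4,)
matrix = np.random.randn(3, 5)  # two dimensions: shape (3, 5)

# Multiplying by 10 rescales the standard deviation from about 1 to about 10
scaled = 10 * np.random.randn(100_000)

print(vector.shape, matrix.shape)
print(scaled.std())
```

The values themselves differ on every unseeded run, which is why printed outputs in a notebook change whenever a cell is re-executed.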

In [3]:
# set up a random 3 x 5 matrix (standard normal samples scaled by 10)
arr = 10 * np.random.randn(3,5)

print(arr)

[[ -4.01821021  -8.02840718  13.4340279    9.24213162   8.82117019]
 [  9.24321467  17.415843   -19.87294963   3.44523364 -10.18563121]
 [ 11.3455357   10.45743597   5.94204754   3.50693092  -1.79008925]]


In [3]:
# compute the mean for all elements
print(arr.mean())

4.608381514764186


In [4]:
# compute the mean of each row
print(arr.mean(axis = 1))

[ 0.20809493  3.56855041 10.0484992 ]


In [5]:
# compute the mean of each column
print(arr.mean(axis = 0))

[-0.88424358 -3.30283919 -2.63711228  8.80173138 21.06437124]
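
Because the random matrix changes on every run, the `axis` argument is easier to check on a small fixed array. The sketch below uses a hypothetical 2 x 3 array named `small`: `axis=1` collapses the columns and leaves one value per row, while `axis=0` collapses the rows and leaves one value per column.

```python
import numpy as np

# Small fixed array so the results can be checked by hand
small = np.array([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0]])

row_means = small.mean(axis=1)  # collapse columns: one value per row
col_means = small.mean(axis=0)  # collapse rows: one value per column

print(row_means)  # [2. 5.]
print(col_means)  # [2.5 3.5 4.5]
```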


In [6]:
# sum all the elements
print(arr.sum())

69.12572272146278


In [7]:
# compute the median of each row
print(np.median(arr, axis = 1))

[-0.4679778   9.21530399 13.68508501]
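
Unlike `mean` and `sum`, the median is called as a function, `np.median(arr)`, because `ndarray` has no `median` method. A minimal sketch with made-up values, showing that the median is far less sensitive to a single outlier than the mean:

```python
import numpy as np

values = np.array([1.0, 2.0, 3.0, 4.0, 100.0])  # 100.0 is an outlier

print(values.mean())      # 22.0
print(np.median(values))  # 3.0

# With an even number of elements, the median is the mean of the two middle values
even_median = np.median(np.array([1.0, 2.0, 3.0, 4.0]))
print(even_median)  # 2.5
```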


## Sorting

In [8]:
# create a 10-element array of random values
unsorted = np.random.randn(10)

print(unsorted)

[-0.63690019  0.5511895  -1.29509757 -0.99119479 -1.56292721  0.62776371
  0.35226564  0.35532227  0.70438139 -1.5259828 ]


In [9]:
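# create a copy and sort it; the original stays untouched
# (named sorted_arr to avoid shadowing the built-in sorted())
sorted_arr = np.array(unsorted)
sorted_arr.sort()

print(sorted_arr)
print()
print(unsorted)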

[-1.56292721 -1.5259828  -1.29509757 -0.99119479 -0.63690019  0.35226564
  0.35532227  0.5511895   0.62776371  0.70438139]

[-0.63690019  0.5511895  -1.29509757 -0.99119479 -1.56292721  0.62776371
  0.35226564  0.35532227  0.70438139 -1.5259828 ]


In [10]:
# in-place sorting: modifies the array itself
unsorted.sort()

print(unsorted)

[-1.56292721 -1.5259828  -1.29509757 -0.99119479 -0.63690019  0.35226564
  0.35532227  0.5511895   0.62776371  0.70438139]


## Finding unique elements

In [11]:
# find all unique elements
array = np.array([1,2,1,4,2,1,4,2])

print(np.unique(array))

[1 2 4]
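
`np.unique` can also report how often each value occurs. A short sketch using the same array with the `return_counts` option:

```python
import numpy as np

array = np.array([1, 2, 1, 4, 2, 1, 4, 2])

# values are sorted; counts[i] is the number of occurrences of values[i]
values, counts = np.unique(array, return_counts=True)

print(values)  # [1 2 4]
print(counts)  # [3 3 2]
```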


## Set operations with np.array data type

In [12]:
s1 = np.array(['desk','chair','bulb'])
s2 = np.array(['lamp','bulb','chair'])

print(s1,s2)

['desk' 'chair' 'bulb'] ['lamp' 'bulb' 'chair']


**numpy.intersect1d**<br>
_Find the intersection of two arrays._<br>
_Return the sorted, unique values that are in both of the input arrays._<br>
_The input arrays will be flattened if not already 1-D._

**numpy.intersect1d syntax**

```Python
numpy.intersect1d(ar1, ar2, assume_unique=False, return_indices=False)
```

**Ex:**
```Python
np.intersect1d([1, 3, 4, 3], [3, 1, 2, 1])
# array([1, 3])
```


In [13]:
print(np.intersect1d(s1, s2))

['bulb' 'chair']
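
The `return_indices` parameter from the signature above can be sketched as follows: besides the common values, the call also returns the position of each common value in the first and in the second array.

```python
import numpy as np

s1 = np.array(['desk', 'chair', 'bulb'])
s2 = np.array(['lamp', 'bulb', 'chair'])

common, idx1, idx2 = np.intersect1d(s1, s2, return_indices=True)

print(common)  # ['bulb' 'chair']
print(idx1)    # [2 1]  positions of 'bulb' and 'chair' in s1
print(idx2)    # [1 2]  positions of 'bulb' and 'chair' in s2
```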


**numpy.union1d**<br>
_Find the union of two arrays._<br>
_Return the unique, sorted array of values that are in either of the two input arrays_

**numpy.union1d syntax**
```Python
numpy.union1d(ar1, ar2)
```

In [14]:
print(np.union1d(s1,s2))

['bulb' 'chair' 'desk' 'lamp']
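
`union1d` takes exactly two arrays. One way to combine more than two is to fold the call over a sequence with `functools.reduce`. In this sketch, `s3` is a hypothetical third array added only for illustration.

```python
from functools import reduce

import numpy as np

s1 = np.array(['desk', 'chair', 'bulb'])
s2 = np.array(['lamp', 'bulb', 'chair'])
s3 = np.array(['rug', 'desk'])  # hypothetical third array

# Equivalent to np.union1d(np.union1d(s1, s2), s3)
all_items = reduce(np.union1d, (s1, s2, s3))

print(all_items)  # ['bulb' 'chair' 'desk' 'lamp' 'rug']
```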


**numpy.setdiff1d**<br>
_Find the set difference of two arrays._<br>
_Return the sorted, unique values in ar1 that are not in ar2._

**numpy.setdiff1d syntax**
```Python
numpy.setdiff1d(ar1, ar2, assume_unique=False)
```

In [15]:
print(np.setdiff1d(s1,s2)) # elements in s1 that are not in s2

['desk']
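
The set difference is not symmetric, so the order of the arguments matters. A brief sketch with the same two arrays, which also shows `np.setxor1d` for the values that appear in exactly one of the arrays:

```python
import numpy as np

s1 = np.array(['desk', 'chair', 'bulb'])
s2 = np.array(['lamp', 'bulb', 'chair'])

only_s1 = np.setdiff1d(s1, s2)  # in s1 but not in s2
only_s2 = np.setdiff1d(s2, s1)  # in s2 but not in s1
either = np.setxor1d(s1, s2)    # in exactly one of the two arrays

print(only_s1)  # ['desk']
print(only_s2)  # ['lamp']
print(either)   # ['desk' 'lamp']
```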


**numpy.in1d**<br>
_Test whether each element of a 1-D array is also present in a second array._<br>
_Returns a boolean array the same length as ar1 that is True where an element of ar1 is in ar2 and False otherwise._

**numpy.in1d syntax**
```Python
numpy.in1d(ar1, ar2, assume_unique=False, invert=False)
```

**Recommended:** <br>
_use **isin** instead of **in1d** in new code._<br>
_For 1-D inputs, **numpy.isin** gives the same result as **numpy.in1d**; **in1d** is deprecated as of NumPy 2.0._

In [16]:
print(np.isin(s1,s2)) # which elements of s1 are also in s2
# Logic: the second array has no desk, but it does contain the chair and the bulb.

[False  True  True]
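
The boolean result is most useful as a mask for selecting elements. A minimal sketch with the same arrays; note that, unlike `intersect1d` and `setdiff1d`, masking keeps the original order of `s1` and does not remove duplicates.

```python
import numpy as np

s1 = np.array(['desk', 'chair', 'bulb'])
s2 = np.array(['lamp', 'bulb', 'chair'])

mask = np.isin(s1, s2)

print(s1[mask])   # ['chair' 'bulb']  elements of s1 found in s2
print(s1[~mask])  # ['desk']          elements of s1 not found in s2

# invert=True returns the negated mask directly
inverted = np.isin(s1, s2, invert=True)
print(inverted)   # [ True False False]
```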
