# NUMPY CAPSTONE PROJECT - BLOOD DONATION
![blood_donation.png](blood_donation.png)
<p>Blood transfusion saves lives - from replacing lost blood during major surgery or a serious injury to treating various illnesses and blood disorders. Ensuring that enough blood is in supply whenever it is needed is a serious challenge for health professionals. According to <a href="https://www.webmd.com/a-to-z-guides/blood-transfusion-what-to-know#1">WebMD</a>, "about 5 million Americans need a blood transfusion every year".</p>
<p>Our dataset is from a mobile blood donation vehicle in Taiwan.</p>
<p>The data is stored in <code>datasets/transfusion.data</code> and is structured according to the RFMTC marketing model (a variation of RFM).</p>
<p>In this project, you are going to inspect the data using NumPy.</p>

#### IMPORTING LIBRARIES AND DATA

* Import `numpy` as `np` and import `genfromtxt` as follows: `from numpy import genfromtxt`

* Load the data with `genfromtxt` as follows: `genfromtxt("YourDirectory", delimiter=",")`

In [49]:
import numpy as np

from numpy import genfromtxt 

my_data = genfromtxt("datasets/transfusion.data",
                    delimiter = ",")
my_data

array([[     nan,      nan,      nan,      nan,      nan],
       [2.00e+00, 5.00e+01, 1.25e+04, 9.80e+01, 1.00e+00],
       [0.00e+00, 1.30e+01, 3.25e+03, 2.80e+01, 1.00e+00],
       ...,
       [2.30e+01, 3.00e+00, 7.50e+02, 6.20e+01, 0.00e+00],
       [3.90e+01, 1.00e+00, 2.50e+02, 3.90e+01, 0.00e+00],
       [7.20e+01, 1.00e+00, 2.50e+02, 7.20e+01, 0.00e+00]])

* Inspect the type of our data with `type(my_data)`

In [2]:
type(my_data)

numpy.ndarray

* Use `ndim` to see how many dimensions data has.

In [4]:
my_data.ndim

2

* Return the first row of our data.

In [5]:
my_data[0]

array([nan, nan, nan, nan, nan])

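* The first row contains `nan` values, most likely because the header line of the file is text and cannot be parsed as numbers. Delete that row with `np.delete()`
* Note: the `nan` values are all in row `0`, so delete index `0` along `axis=0`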

In [50]:
my_data=np.delete(my_data,0, axis=0) 

* Return `my_data` to check whether you removed `nan` values or not.

In [51]:
my_data

array([[2.00e+00, 5.00e+01, 1.25e+04, 9.80e+01, 1.00e+00],
       [0.00e+00, 1.30e+01, 3.25e+03, 2.80e+01, 1.00e+00],
       [1.00e+00, 1.60e+01, 4.00e+03, 3.50e+01, 1.00e+00],
       ...,
       [2.30e+01, 3.00e+00, 7.50e+02, 6.20e+01, 0.00e+00],
       [3.90e+01, 1.00e+00, 2.50e+02, 3.90e+01, 0.00e+00],
       [7.20e+01, 1.00e+00, 2.50e+02, 7.20e+01, 0.00e+00]])
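
As an alternative to deleting the row afterwards, the header line can be skipped while loading. The sketch below is only an illustration: it reads from an in-memory string instead of `datasets/transfusion.data`, and the column names in the header line are made up for the example.

```python
import io
import numpy as np

# Hypothetical stand-in for the data file: one text header line
# followed by numeric rows (values taken from the output above).
sample = io.StringIO(
    "recency,frequency,monetary,time,donated\n"
    "2,50,12500,98,1\n"
    "0,13,3250,28,1\n"
)

# skip_header=1 makes genfromtxt ignore the first line,
# so no all-nan row is produced in the first place.
clean = np.genfromtxt(sample, delimiter=",", skip_header=1)

print(clean.shape)            # (2, 5)
print(np.isnan(clean).any())  # False
```

With the real file, the same `skip_header=1` argument would be passed next to `delimiter=","`.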

* To see the dimensions of the data, use `shape`

In [8]:
my_data.shape

(748, 5)

* To see how many elements the data has, use `size`

In [9]:
my_data.size

3740

* To see the data type inside `my_data`, use `dtype`

In [10]:
my_data.dtype

dtype('float64')

* To see the size in bytes of each element, use `itemsize`

In [11]:
my_data.itemsize

8
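
The attributes above are related to each other. A minimal sketch, using a zero-filled stand-in array (`arr` is a hypothetical name) with the same shape and dtype as `my_data`:

```python
import numpy as np

# Stand-in with the same shape and dtype as my_data after the nan row is removed.
arr = np.zeros((748, 5), dtype=np.float64)

rows, cols = arr.shape
print(arr.size == rows * cols)                # True: 748 * 5 = 3740
print(arr.itemsize)                           # 8 bytes per float64 element
print(arr.nbytes == arr.size * arr.itemsize)  # True: 29920 bytes in total
```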

* Create a matrix with 2 rows and 5 columns filled with zeros using `np.zeros`. Name it `sifir`

In [12]:
sifir=np.zeros((2,5))
sifir

array([[0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.]])

* Create a matrix with 2 rows and 5 columns filled with ones using `np.ones`. Name it `bir`

In [13]:
bir=np.ones((2,5))
bir

array([[1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.]])

* Create a matrix with 2 rows and 5 columns filled with the value 38 using `np.full`. Name it `otuzsekiz`

In [14]:
otuzsekiz=np.full((2,5),38)
otuzsekiz

array([[38, 38, 38, 38, 38],
       [38, 38, 38, 38, 38]])
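
Note that the output above has no decimal points: `np.full` takes its dtype from the fill value unless told otherwise. A short sketch of that behaviour (the variable names are only for illustration):

```python
import numpy as np

int_full = np.full((2, 5), 38)             # integer fill value -> integer dtype
float_full = np.full((2, 5), 38.0)         # float fill value -> float64
forced = np.full((2, 5), 38, dtype=float)  # explicit dtype overrides the inference

print(int_full.dtype, float_full.dtype, forced.dtype)
```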

* Create an identity matrix with 5 rows and 5 columns using `np.eye`. Name it `eye`

In [19]:
eye=np.eye(5,5)
eye

array([[1., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0.],
       [0., 0., 1., 0., 0.],
       [0., 0., 0., 1., 0.],
       [0., 0., 0., 0., 1.]])

* Create a matrix with 2 rows and 5 columns that contains random values between 0 and 1 using `np.random.random` (the solution below uses the equivalent `np.random.rand`). Name it `random`

In [16]:
random=np.random.rand(2,5)
#random=np.random.random((2,5))
random

array([[0.16087781, 0.6556656 , 0.87932452, 0.04118184, 0.15654564],
       [0.5692953 , 0.29320858, 0.03507791, 0.91911427, 0.25096885]])
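
The random values above change on every run. If reproducible numbers are wanted, one option is NumPy's `Generator` API with a fixed seed. This is a sketch of an alternative, not the method used in this notebook; the seed value `0` is arbitrary.

```python
import numpy as np

# Two generators created with the same seed produce the same sequence.
rng_a = np.random.default_rng(seed=0)
rng_b = np.random.default_rng(seed=0)

sample_a = rng_a.random((2, 5))  # uniform values in [0, 1)
sample_b = rng_b.random((2, 5))

print(np.array_equal(sample_a, sample_b))        # True
print(((sample_a >= 0) & (sample_a < 1)).all())  # True
```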

* Create a matrix with 2 rows and 5 columns (use `reshape` for that) whose values run from 1 to 10 in steps of 1, using `np.linspace`. Name it `linsp`

In [22]:
linsp=np.linspace(1,10,10).reshape(2,5)
linsp

array([[ 1.,  2.,  3.,  4.,  5.],
       [ 6.,  7.,  8.,  9., 10.]])
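
For comparison, the same values can be produced with `np.arange`. The sketch below only illustrates the difference between the two functions: `np.linspace` takes the number of points and includes the endpoint, while `np.arange` takes a step and excludes the stop value.

```python
import numpy as np

from_linspace = np.linspace(1, 10, 10)  # 10 evenly spaced points, endpoint included
from_arange = np.arange(1, 11)          # integers 1..10, stop value 11 excluded

print(np.array_equal(from_linspace, from_arange))  # True: the values match
print(from_linspace.dtype, from_arange.dtype)      # float64 and an integer dtype

grid = from_arange.reshape(2, 5)
print(grid.shape)  # (2, 5)
```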

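* Take the element-wise square root of `linsp` with `np.sqrt` and name the result `linsp`
* Note: because the result overwrites `linsp`, every re-run of this cell takes the square root again. The output below shows fourth roots (for example `1.18920712` for 2), which indicates that the cell was executed twice.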

In [25]:
linsp=np.sqrt(linsp)
linsp

array([[1.        , 1.18920712, 1.31607401, 1.41421356, 1.49534878],
       [1.56508458, 1.62657656, 1.68179283, 1.73205081, 1.77827941]])
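
Because the cell assigns the result back to `linsp`, running it more than once keeps shrinking the values. A small sketch with separate names (`once` and `twice` are illustrative) makes the effect visible:

```python
import numpy as np

values = np.linspace(1, 10, 10).reshape(2, 5)

once = np.sqrt(values)  # square root applied one time
twice = np.sqrt(once)   # what a second run of the cell produces

print(once[0, 1])   # about 1.41421356, the square root of 2
print(twice[0, 1])  # about 1.18920712, the fourth root of 2
```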

* Square `random` (raise it to the power of 2) and name the result `random`

In [26]:
random=random**2
random

array([[0.02588167, 0.42989738, 0.77321161, 0.00169594, 0.02450654],
       [0.32409714, 0.08597127, 0.00123046, 0.84477105, 0.06298536]])

* Add `linsp` and `random` element-wise and name the result `toplam`

In [28]:
toplam=linsp+random
toplam

array([[1.02588167, 1.6191045 , 2.08928563, 1.41590951, 1.51985532],
       [1.88918172, 1.71254783, 1.68302329, 2.57682185, 1.84126477]])

* Divide `bir` by `sifir` and name the result `bolme`
* If you receive a warning or an error, briefly explain why

In [29]:
bolme=bir/sifir
# NumPy reports "divide by zero" here as a RuntimeWarning, not as an exception.
# The warning is caused by trying to divide a number by zero.
# Division by zero is undefined in mathematics, so NumPy fills the result with inf and warns.
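
To see what the division actually returns, and to silence the warning when it is expected, `np.errstate` can be used as a context manager. A minimal sketch with stand-in arrays (`ones` and `zeros` play the roles of `bir` and `sifir`):

```python
import numpy as np

ones = np.ones((2, 5))
zeros = np.zeros((2, 5))

# Suppress the divide-by-zero warning only inside this block.
with np.errstate(divide="ignore"):
    quotient = ones / zeros

print(np.isinf(quotient).all())  # True: every element is inf
```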


* Subtract `sifir` from `bir` and name the result `cikarma`

In [31]:
cikarma=bir-sifir
cikarma

array([[1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.]])

* Divide `cikarma` by `toplam`. Then name the result `bolme`

In [32]:
bolme=cikarma/toplam
bolme

array([[0.97477129, 0.61762536, 0.4786325 , 0.70625982, 0.65795737],
       [0.5293297 , 0.5839253 , 0.59416884, 0.38807495, 0.54310494]])

* Multiply `toplam` and `bolme` element-wise and name the result `ecarpma`

In [34]:
ecarpma=toplam*bolme
ecarpma

array([[1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.]])

* Multiply `ecarpma` and `eye` using matrix multiplication and name the result `mcarpma`

In [36]:
mcarpma=ecarpma@eye
mcarpma

array([[1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.]])
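
The two kinds of multiplication follow different shape rules. A short sketch with a matrix whose entries differ, so the two results can be told apart (`m` and `identity` are illustrative names):

```python
import numpy as np

m = np.arange(1, 11, dtype=float).reshape(2, 5)
identity = np.eye(5)

# Element-wise: shapes must match (or broadcast), (2, 5) * (2, 5) -> (2, 5).
elementwise = m * m

# Matrix product: inner dimensions must agree, (2, 5) @ (5, 5) -> (2, 5).
matrix_prod = m @ identity

print(elementwise[0])                  # squares of 1..5
print(np.array_equal(matrix_prod, m))  # True: multiplying by the identity changes nothing
```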

* Create matrix `a` that has following values:

`[[ 1 2 3 4 5]
  [ 6 7 8 9 10]]`

In [38]:
a=np.array([[1,2,3,4,5],[6,7,8,9,10]])
a

array([[ 1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10]])

* Return a **boolean array** that shows which values are greater than 3

In [39]:
a>3

array([[False, False, False,  True,  True],
       [ True,  True,  True,  True,  True]])

* Return the values that are more than 3

In [40]:
a[a>3]

array([ 4,  5,  6,  7,  8,  9, 10])

* Set the values that are more than 3 to 0 and name the result as `a`

In [41]:
a[a>3]=0
a

array([[1, 2, 3, 0, 0],
       [0, 0, 0, 0, 0]])
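
The assignment above modifies `a` in place. If the original values should be kept, `np.where` builds a new array instead. A sketch of that alternative (`original` and `replaced` are illustrative names):

```python
import numpy as np

original = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])

# Where the condition holds take 0, otherwise keep the original value.
replaced = np.where(original > 3, 0, original)

print(replaced)
print(original[1, 4])  # 10: the source array is untouched
```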

* Join `a` and `mcarpma` with `np.stack` (`axis=1`) and name the result `stc`

In [42]:
stc=np.stack((a,mcarpma),axis=1)
stc



array([[[1., 2., 3., 0., 0.],
        [1., 1., 1., 1., 1.]],

       [[0., 0., 0., 0., 0.],
        [1., 1., 1., 1., 1.]]])
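
`np.stack` inserts a new axis, which is why the result above is 3-dimensional, and the integers of `a` are shown as floats because the two inputs are upcast to a common dtype. The sketch below contrasts this with `np.concatenate`, using stand-in arrays named `left` and `right`:

```python
import numpy as np

left = np.array([[1, 2, 3, 0, 0], [0, 0, 0, 0, 0]])  # integer array, like a
right = np.ones((2, 5))                              # float array, like mcarpma

stacked = np.stack((left, right), axis=1)       # new axis inserted -> (2, 2, 5)
joined = np.concatenate((left, right), axis=1)  # existing axis extended -> (2, 10)

print(stacked.shape, joined.shape)
print(stacked.dtype)  # float64: the integers are upcast
```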

* Take the 1st and 3rd rows of `stc` (the first row of each 2x5 block, i.e. index `0` along axis 1) and assign them to a new matrix. Name this new matrix `guncel`

In [44]:
guncel=stc[:, 0:1]
guncel

array([[[1., 2., 3., 0., 0.]],

       [[0., 0., 0., 0., 0.]]])

* Make `guncel` a 2-dimensional array.

In [45]:
guncel=guncel.reshape(2,5)
guncel.ndim

2
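
The extra dimension appears because a slice (`0:1`) keeps the axis it selects from, while a plain integer index drops it. A sketch of the difference, rebuilding an array like `stc` under the hypothetical name `stc_like`:

```python
import numpy as np

stc_like = np.stack(
    (np.array([[1, 2, 3, 0, 0], [0, 0, 0, 0, 0]]), np.ones((2, 5))),
    axis=1,
)

sliced = stc_like[:, 0:1]  # slice keeps the axis -> shape (2, 1, 5)
indexed = stc_like[:, 0]   # integer index drops the axis -> shape (2, 5)

print(sliced.shape, indexed.shape)
print(np.array_equal(indexed, sliced.reshape(2, 5)))  # True
```

Indexing with `0` would therefore give the 2-dimensional result directly, without the `reshape` step.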

* Do you remember the `my_data` array that we defined above?
* Join `my_data` and `guncel` vertically (one below the other) with `np.concatenate`. Name the result `data`

In [52]:
data=np.concatenate([my_data,guncel])
data

array([[2.00e+00, 5.00e+01, 1.25e+04, 9.80e+01, 1.00e+00],
       [0.00e+00, 1.30e+01, 3.25e+03, 2.80e+01, 1.00e+00],
       [1.00e+00, 1.60e+01, 4.00e+03, 3.50e+01, 1.00e+00],
       ...,
       [7.20e+01, 1.00e+00, 2.50e+02, 7.20e+01, 0.00e+00],
       [1.00e+00, 2.00e+00, 3.00e+00, 0.00e+00, 0.00e+00],
       [0.00e+00, 0.00e+00, 0.00e+00, 0.00e+00, 0.00e+00]])

* Sum the columns of `data`

In [53]:
data.sum(axis=0)

array([7.112000e+03, 4.127000e+03, 1.031253e+06, 2.564300e+04,
       1.780000e+02])
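
The `axis` argument names the axis that is collapsed: `axis=0` adds up the rows and leaves one value per column, while `axis=1` adds up the columns and leaves one value per row. A tiny sketch with made-up numbers:

```python
import numpy as np

small = np.array([[1, 2, 3],
                  [4, 5, 6]])

col_sums = small.sum(axis=0)  # one value per column: [5, 7, 9]
row_sums = small.sum(axis=1)  # one value per row: [6, 15]

print(col_sums, row_sums)
```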

* Sum the rows of `data`

In [58]:
data.sum(axis=1)


array([1.2651e+04, 3.2920e+03, 4.0530e+03, 5.0680e+03, 6.1020e+03,
       1.0120e+03, 1.7740e+03, 3.0480e+03, 2.2840e+03, 1.1650e+04,
       5.8350e+03, 7.5700e+02, 2.5410e+03, 3.3110e+03, 1.5240e+03,
       1.2690e+03, 3.5650e+03, 3.8170e+03, 1.5240e+03, 7.6000e+02,
       7.6000e+02, 2.7930e+03, 1.5250e+03, 1.5250e+03, 2.2840e+03,
       3.5580e+03, 1.5240e+03, 3.0510e+03, 1.2710e+03, 2.0330e+03,
       3.5730e+03, 2.5430e+03, 2.5430e+03, 2.2900e+03, 4.0820e+03,
       2.0390e+03, 3.0620e+03, 1.5270e+03, 3.5740e+03, 1.7840e+03,
       3.3190e+03, 1.2730e+03, 1.2740e+03, 1.2730e+03, 5.0940e+03,
       2.2920e+03, 2.2970e+03, 5.0600e+02, 5.0600e+02, 5.0600e+02,
       2.8090e+03, 2.8100e+03, 1.5300e+03, 3.0660e+03, 1.2740e+03,
       4.8430e+03, 2.0390e+03, 1.7880e+03, 4.0990e+03, 1.5300e+03,
       1.7880e+03, 2.0460e+03, 2.5610e+03, 1.2760e+03, 7.6500e+02,
       4.0930e+03, 1.0210e+03, 5.0600e+02, 1.7860e+03, 2.3110e+03,
       1.0220e+03, 1.0220e+03, 4.3430e+03, 5.0800e+02, 5.0900e

* Return the maximum values of each column

In [59]:
data.max(axis=0)

array([7.40e+01, 5.00e+01, 1.25e+04, 9.80e+01, 1.00e+00])

* Return the maximum values of each row

In [60]:
data.max(axis=1)

array([1.250e+04, 3.250e+03, 4.000e+03, 5.000e+03, 6.000e+03, 1.000e+03,
       1.750e+03, 3.000e+03, 2.250e+03, 1.150e+04, 5.750e+03, 7.500e+02,
       2.500e+03, 3.250e+03, 1.500e+03, 1.250e+03, 3.500e+03, 3.750e+03,
       1.500e+03, 7.500e+02, 7.500e+02, 2.750e+03, 1.500e+03, 1.500e+03,
       2.250e+03, 3.500e+03, 1.500e+03, 3.000e+03, 1.250e+03, 2.000e+03,
       3.500e+03, 2.500e+03, 2.500e+03, 2.250e+03, 4.000e+03, 2.000e+03,
       3.000e+03, 1.500e+03, 3.500e+03, 1.750e+03, 3.250e+03, 1.250e+03,
       1.250e+03, 1.250e+03, 5.000e+03, 2.250e+03, 2.250e+03, 5.000e+02,
       5.000e+02, 5.000e+02, 2.750e+03, 2.750e+03, 1.500e+03, 3.000e+03,
       1.250e+03, 4.750e+03, 2.000e+03, 1.750e+03, 4.000e+03, 1.500e+03,
       1.750e+03, 2.000e+03, 2.500e+03, 1.250e+03, 7.500e+02, 4.000e+03,
       1.000e+03, 5.000e+02, 1.750e+03, 2.250e+03, 1.000e+03, 1.000e+03,
       4.250e+03, 5.000e+02, 5.000e+02, 5.000e+02, 1.000e+03, 5.000e+02,
       5.000e+02, 5.000e+02, 1.500e+03, 1.000e+03, 

* Return the minimum values of each column

In [61]:
data.min(axis=0)

array([0., 0., 0., 0., 0.])

* Return the minimum values of each row

In [63]:
data.min(axis=1)

array([1., 0., 1., 1., 0., 0., 1., 0., 1., 1., 0., 0., 1., 0., 1., 1., 1.,
       1., 1., 1., 1., 0., 1., 1., 0., 0., 0., 1., 1., 0., 0., 1., 1., 1.,
       0., 1., 1., 1., 1., 1., 1., 0., 1., 0., 1., 1., 0., 0., 0., 0., 0.,
       1., 0., 0., 1., 1., 1., 1., 0., 0., 0., 1., 0., 1., 1., 0., 1., 0.,
       0., 0., 0., 0., 1., 0., 1., 1., 1., 0., 0., 0., 1., 0., 0., 0., 1.,
       0., 0., 0., 0., 1., 1., 0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 1.,
       0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 1., 0., 0., 1., 0., 0.,
       1., 1., 1., 1., 1., 0., 0., 1., 0., 1., 1., 0., 0., 0., 0., 0., 0.,
       0., 0., 0., 0., 0., 0., 0., 0., 1., 1., 0., 0., 1., 0., 1., 0., 0.,
       1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 1., 1., 0., 0., 0., 0.,
       0., 0., 0., 0., 0., 0., 1., 1., 1., 0., 1., 0., 0., 0., 0., 0., 0.,
       0., 0., 1., 0., 0., 0., 0., 0., 1., 1., 0., 0., 0., 0., 0., 1., 0.,
       0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0.,
       1., 1., 0., 1., 1.

* Find the index of the largest value
* Note: `np.argmax` returns a position in the flattened (1-D) version of `data`, not a (row, column) pair.

In [64]:
np.argmax(data)

2

* Find the index of the smallest value

In [65]:
np.argmin(data)

5
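
A flat index can be converted back to a (row, column) position with `np.unravel_index`. The sketch below assumes `data` has shape `(750, 5)`, i.e. the 748 rows of `my_data` plus the 2 rows of `guncel`:

```python
import numpy as np

shape = (750, 5)  # assumed shape of data: 748 + 2 rows, 5 columns

max_pos = np.unravel_index(2, shape)  # flat index 2 -> row 0, column 2
min_pos = np.unravel_index(5, shape)  # flat index 5 -> row 1, column 0

print(max_pos, min_pos)
```

Row 0, column 2 holds `1.25e+04` and row 1, column 0 holds `0.00e+00` in the output of `data` shown earlier, which agrees with the two indices.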

* Transpose `data` and name the result `datat`

In [67]:
datat=data.T
datat

array([[2.00e+00, 0.00e+00, 1.00e+00, ..., 7.20e+01, 1.00e+00, 0.00e+00],
       [5.00e+01, 1.30e+01, 1.60e+01, ..., 1.00e+00, 2.00e+00, 0.00e+00],
       [1.25e+04, 3.25e+03, 4.00e+03, ..., 2.50e+02, 3.00e+00, 0.00e+00],
       [9.80e+01, 2.80e+01, 3.50e+01, ..., 7.20e+01, 0.00e+00, 0.00e+00],
       [1.00e+00, 1.00e+00, 1.00e+00, ..., 0.00e+00, 0.00e+00, 0.00e+00]])
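
One detail worth knowing is that `.T` returns a view, not a copy, so writing to the transpose also changes the original array. A small sketch with a made-up matrix (`m` and `mt` are illustrative names):

```python
import numpy as np

m = np.arange(6).reshape(2, 3)
mt = m.T

print(mt.shape)                 # (3, 2): rows and columns are swapped
print(np.shares_memory(m, mt))  # True: both names refer to the same data

mt[0, 1] = 99   # write through the transposed view
print(m[1, 0])  # 99: the original array changed as well
```

If an independent array is needed, `m.T.copy()` avoids this coupling.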