# NUMPY CAPSTONE PROJECT - BLOOD DONATION
![blood_donation.png](blood_donation.png)
<p>Blood transfusion saves lives, from replacing lost blood during major surgery or a serious injury to treating various illnesses and blood disorders. Ensuring that enough blood is in supply whenever it is needed is a serious challenge for health professionals. According to <a href="https://www.webmd.com/a-to-z-guides/blood-transfusion-what-to-know#1">WebMD</a>, "about 5 million Americans need a blood transfusion every year".</p>
<p>Our dataset is from a mobile blood donation vehicle in Taiwan.</p>
<p>The data is stored in <code>datasets/transfusion.data</code> and is structured according to the RFMTC marketing model (a variation of RFM).</p>
<p>In this project, you are going to inspect the data using NumPy.</p>

#### IMPORTING LIBRARIES AND DATA

* Import `numpy` as `np` and `genfromtxt` as follows: `from numpy import genfromtxt`

* Load the data with `genfromtxt` as follows: `genfromtxt("YourDirectory", delimiter = ",")`

In [3]:
import numpy as np
from numpy import genfromtxt     # tip: a raw string (r"...") lets you write single backslashes
my_data = genfromtxt("C:\\Users\\DELL\\numpycapstone\\datasets\\transfusion.data",
                    delimiter = ",")
my_data

array([[     nan,      nan,      nan,      nan,      nan],
       [2.00e+00, 5.00e+01, 1.25e+04, 9.80e+01, 1.00e+00],
       [0.00e+00, 1.30e+01, 3.25e+03, 2.80e+01, 1.00e+00],
       ...,
       [2.30e+01, 3.00e+00, 7.50e+02, 6.20e+01, 0.00e+00],
       [3.90e+01, 1.00e+00, 2.50e+02, 3.90e+01, 0.00e+00],
       [7.20e+01, 1.00e+00, 2.50e+02, 7.20e+01, 0.00e+00]])
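The all-`nan` first row in the output above is what `genfromtxt` produces when a line of text (most likely the file's header) cannot be parsed as numbers. As a minimal sketch, assuming a made-up header (`c1,...,c5`) and two rows copied from the output above in place of the real file, the `skip_header` parameter avoids the `nan` row at load time:

```python
import io
import numpy as np

# Hypothetical stand-in for the file: a placeholder header plus two data rows
text = "c1,c2,c3,c4,c5\n2,50,12500,98,1\n0,13,3250,28,1\n"

# Default call: the header cannot be converted to float, so row 0 is all nan
with_header = np.genfromtxt(io.StringIO(text), delimiter=",")

# skip_header=1 drops the first line before parsing
without_header = np.genfromtxt(io.StringIO(text), delimiter=",", skip_header=1)

print(with_header.shape, without_header.shape)
```

The notebook removes the row afterwards with `np.delete`, which gives the same array.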

* Inspect our data's type with `type(my_data)`

In [7]:
type(my_data)   # NumPy format

numpy.ndarray

* Use `ndim` to see how many dimensions data has.

In [146]:
my_data.ndim

2

* Return the first row of our data.

In [147]:
my_data

array([[     nan,      nan,      nan,      nan,      nan],
       [2.00e+00, 5.00e+01, 1.25e+04, 9.80e+01, 1.00e+00],
       [0.00e+00, 1.30e+01, 3.25e+03, 2.80e+01, 1.00e+00],
       ...,
       [2.30e+01, 3.00e+00, 7.50e+02, 6.20e+01, 0.00e+00],
       [3.90e+01, 1.00e+00, 2.50e+02, 3.90e+01, 0.00e+00],
       [7.20e+01, 1.00e+00, 2.50e+02, 7.20e+01, 0.00e+00]])

In [10]:
my_data[0]

array([nan, nan, nan, nan, nan])

* The first row contains `nan` values. Delete that row with `np.delete()`
* Note: the `nan` values are in row `0`, so pass index `0` and axis `0` (the `0,0` arguments)

In [10]:
my_data=np.delete(my_data,0,0)

In [11]:
my_data

array([[2.00e+00, 5.00e+01, 1.25e+04, 9.80e+01, 1.00e+00],
       [0.00e+00, 1.30e+01, 3.25e+03, 2.80e+01, 1.00e+00],
       [1.00e+00, 1.60e+01, 4.00e+03, 3.50e+01, 1.00e+00],
       ...,
       [2.30e+01, 3.00e+00, 7.50e+02, 6.20e+01, 0.00e+00],
       [3.90e+01, 1.00e+00, 2.50e+02, 3.90e+01, 0.00e+00],
       [7.20e+01, 1.00e+00, 2.50e+02, 7.20e+01, 0.00e+00]])

* Check `my_data` to confirm that the `nan` values were removed.

In [12]:
np.isnan(my_data)   # could hasnans be used here instead?

array([[False, False, False, False, False],
       [False, False, False, False, False],
       [False, False, False, False, False],
       ...,
       [False, False, False, False, False],
       [False, False, False, False, False],
       [False, False, False, False, False]])
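Because the printed mask elides the middle rows, it cannot fully confirm that no `nan` remains. A more compact check reduces the mask to a single value with `.any()`. This is a small sketch on a made-up array `arr`, not the transfusion data:

```python
import numpy as np

# Made-up 3x2 array whose first row is nan, mimicking the unparsed header row
arr = np.array([[np.nan, np.nan], [2.0, 50.0], [0.0, 13.0]])

has_nan_before = np.isnan(arr).any()     # True if at least one nan is present
nan_count = int(np.isnan(arr).sum())     # how many entries are nan

cleaned = np.delete(arr, 0, axis=0)      # drop row 0 along axis 0
has_nan_after = np.isnan(cleaned).any()

print(has_nan_before, nan_count, has_nan_after)
```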

* To see the dimensions of the data, use `shape`

In [13]:
my_data.shape

(748, 5)

* To see how many elements the data has, use `size`

In [152]:
np.size(my_data)

3740

* To see the data type inside `my_data`, use `dtype`

In [14]:
my_data.dtype

dtype('float64')

* To see the size in bytes of each element, use `itemsize`

In [15]:
my_data.itemsize

8
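The attributes inspected above are related: `size` is the product of the entries in `shape`, and the total memory of the array is `size * itemsize`, which NumPy exposes as `nbytes`. A short sketch, using a zero-filled stand-in with the same shape and dtype as the cleaned `my_data`:

```python
import numpy as np

# Stand-in array: same shape (748, 5) and dtype (float64) as the cleaned data
arr = np.zeros((748, 5), dtype=np.float64)

n_elements = arr.size              # 748 * 5 elements
bytes_per_element = arr.itemsize   # bytes used by one float64
total_bytes = arr.nbytes           # size * itemsize

print(n_elements, bytes_per_element, total_bytes)
```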

* Create a matrix that has 2 rows and 5 columns and contains 0 by `np.zeros`. Name it as `sifir`

In [17]:
sifir=np.zeros((2,5))
sifir

array([[0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.]])

* Create a matrix that has 2 rows and 5 columns and contains 1 by `np.ones`. Name it as `bir`

In [18]:
bir=np.ones((2,5))
bir

array([[1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.]])

* Create a matrix that has 2 rows and 5 columns and contains 38 by `np.full`. Name it as `otuzsekiz`

In [19]:
otuzsekiz=np.full((2,5),38)
otuzsekiz

array([[38, 38, 38, 38, 38],
       [38, 38, 38, 38, 38]])

* Create an identity matrix that has 5 rows and 5 columns with `np.eye`. Name it `eye`

In [49]:
eye=np.eye(5)
eye

array([[1., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0.],
       [0., 0., 1., 0., 0.],
       [0., 0., 0., 1., 0.],
       [0., 0., 0., 0., 1.]])

* Create a matrix that has 2 rows and 5 columns and contains random values between 0 and 1 by `np.random.random`. Name it as `random`

In [32]:
random=np.random.random((2,5))
random

array([[0.18997941, 0.71925704, 0.65695029, 0.00794711, 0.0212093 ],
       [0.6912165 , 0.06436199, 0.26527288, 0.36572647, 0.9524153 ]])

* Create a matrix that has 2 rows and 5 columns (use `reshape` for that) and contains the values from 1 to 10 in steps of 1, using `np.linspace`. Name it `linsp`

In [21]:
linsp=np.linspace(1,10,10).reshape((2,5))
linsp

array([[ 1.,  2.,  3.,  4.,  5.],
       [ 6.,  7.,  8.,  9., 10.]])
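`np.linspace(start, stop, num)` includes both endpoints, so the spacing is `(stop - start) / (num - 1)`. The sketch below uses the `retstep=True` option to return that spacing and compares the result with `np.arange`:

```python
import numpy as np

# 10 evenly spaced points from 1 to 10, both ends included
values, step = np.linspace(1, 10, 10, retstep=True)
grid = values.reshape(2, 5)

# np.arange excludes its stop value, hence 11
same_as_arange = np.array_equal(values, np.arange(1, 11))

print(step, grid.shape, same_as_arange)
```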

* Take the element-wise square root of `linsp` with `np.sqrt` and name the result `linsp`

In [22]:
linsp=np.sqrt(linsp)
linsp

array([[1.        , 1.41421356, 1.73205081, 2.        , 2.23606798],
       [2.44948974, 2.64575131, 2.82842712, 3.        , 3.16227766]])

* Square `random` (raise each element to the power of 2) and name the result `random`

In [34]:
random=random**2
random

array([[3.60921745e-02, 5.17330684e-01, 4.31583684e-01, 6.31564904e-05,
        4.49834317e-04],
       [4.77780249e-01, 4.14246592e-03, 7.03697032e-02, 1.33755848e-01,
        9.07094897e-01]])

* Sum `linsp` and `random` and name it as `toplam`

In [35]:
toplam=linsp+random
toplam

array([[1.03609217, 1.93154425, 2.16363449, 2.00006316, 2.23651781],
       [2.92726999, 2.64989378, 2.89879683, 3.13375585, 4.06937256]])

* Divide `toplam` by `bir`, then by `sifir`, and name the result `bolme`
* If you receive any warning or error, briefly explain why

In [41]:
bolme=toplam/bir
bolme

array([[1.03609217, 1.93154425, 2.16363449, 2.00006316, 2.23651781],
       [2.92726999, 2.64989378, 2.89879683, 3.13375585, 4.06937256]])

In [43]:
bolme=toplam/sifir
bolme

  bolme=toplam/sifir


array([[inf, inf, inf, inf, inf],
       [inf, inf, inf, inf, inf]])
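The second cell produces a warning rather than an error because NumPy follows IEEE 754 floating-point rules: a non-zero number divided by `0.0` gives `inf` (or `-inf`), and `0.0 / 0.0` gives `nan`. The sketch below reproduces this on made-up values. It shows `np.errstate` for silencing the warning locally and `np.divide` with `where=` for skipping zero denominators. Both are optional techniques, not part of the original exercise:

```python
import numpy as np

num = np.array([1.0, -2.0, 0.0])   # made-up numerators
den = np.zeros(3)                  # all-zero denominators

# Silence the RuntimeWarning inside this block; the results are unchanged
with np.errstate(divide="ignore", invalid="ignore"):
    result = num / den

# Alternative: divide only where den is non-zero, keep 0 elsewhere
safe = np.divide(num, den, out=np.zeros_like(num), where=den != 0)

print(result, safe)
```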

* Subtract `sifir` from `bir` and name the result `cikarma`

In [38]:
cikarma=bir-sifir
cikarma

array([[1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.]])

* Divide `cikarma` by `toplam`. Then name the result `bolme`

In [46]:
bolme=cikarma/toplam
bolme

array([[0.96516509, 0.51772047, 0.46218527, 0.49998421, 0.44712365],
       [0.34161523, 0.37737362, 0.34497071, 0.31910591, 0.24573813]])

* Multiply `toplam` and `bolme` element-wise and name the result `ecarpma`

In [47]:
ecarpma=toplam*bolme
ecarpma

array([[1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.]])

* Multiply `ecarpma` and `eye` as matrices (matrix product) and name the result `mcarpma`

In [55]:
mcarpma=ecarpma @ eye
mcarpma

array([[1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.]])
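The two products above follow different shape rules. `*` multiplies entries at matching positions and needs equal (or broadcastable) shapes, while `@` needs the inner dimensions to agree: `(2, 5) @ (5, 5)` gives `(2, 5)`. A brief sketch on a made-up matrix `m`:

```python
import numpy as np

m = np.arange(1, 11, dtype=float).reshape(2, 5)   # made-up 2x5 matrix
identity = np.eye(5)

elementwise = m * m        # entry-by-entry product, shape stays (2, 5)
matmul = m @ identity      # matrix product; the identity leaves m unchanged

print(elementwise.shape, matmul.shape)
```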

* Create matrix `a` that has following values:

`[[ 1 2 3 4 5]
  [ 6 7 8 9 10]]`

In [61]:
a=np.array([[ 1, 2, 3, 4, 5],[6 ,7, 8, 9, 10]])
a

array([[ 1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10]])

* Return the **boolean mask** marking the values that are greater than 3

In [62]:
a>3

array([[False, False, False,  True,  True],
       [ True,  True,  True,  True,  True]])

* Return the values that are more than 3

In [63]:
a[a>3]

array([ 4,  5,  6,  7,  8,  9, 10])

* Set the values that are more than 3 to 0 and name the result as `a`

In [73]:
a[a>3]==0   # caution: == only compares, so a stays unchanged; a[a>3]=0 would assign
a

array([[ 1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10]])
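The output above still shows the original values because `==` is a comparison and does not modify `a`. The sketch below contrasts the comparison with an actual masked assignment. It works on a separate copy, `b`, so the later cells that reuse the unchanged `a` are not affected. The names `a_demo`, `b` and `clipped` are illustrative only:

```python
import numpy as np

a_demo = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])

# Comparison: builds a new boolean array, a_demo is untouched
comparison = a_demo[a_demo > 3] == 0

# Assignment through the mask, done on a copy
b = a_demo.copy()
b[b > 3] = 0

# Non-mutating equivalent with np.where
clipped = np.where(a_demo > 3, 0, a_demo)

print(b)
```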

* Join `a` and `mcarpma` using the stack function `np.stack` with `axis=1`, and name the result `stc`

In [74]:
stc=np.stack((a,mcarpma),axis=1)
stc

array([[[ 1.,  2.,  3.,  4.,  5.],
        [ 1.,  1.,  1.,  1.,  1.]],

       [[ 6.,  7.,  8.,  9., 10.],
        [ 1.,  1.,  1.,  1.,  1.]]])
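`np.stack` always adds a new axis, which is why two `(2, 5)` inputs become a 3-dimensional result. The `axis` argument only decides where that new axis goes. A small sketch with made-up inputs `x` and `y`, with `np.vstack` shown for contrast:

```python
import numpy as np

x = np.arange(1, 11).reshape(2, 5)   # made-up integer matrix
y = np.ones((2, 5))                  # made-up float matrix

stacked0 = np.stack((x, y), axis=0)  # new axis first: stacked0[0] is x
stacked1 = np.stack((x, y), axis=1)  # new axis in the middle: stacked1[:, 0] is x
vstacked = np.vstack((x, y))         # no new axis: rows appended, shape (4, 5)

print(stacked0.shape, stacked1.shape, vstacked.shape)
```

Mixing integer and float inputs yields a float result, which matches the `1.`, `2.` formatting in the output above.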

* Take the 1st and 3rd rows of `stc` as printed above (the first row of each block) and assign them to a new matrix. Name this new matrix `guncel`

In [68]:
guncel=stc[:,0:1]
guncel

array([[[ 1.,  2.,  3.,  4.,  5.]],

       [[ 6.,  7.,  8.,  9., 10.]]])

* Make `guncel` a 2-dimensional array.

In [71]:
guncel=guncel.reshape(2,5)
guncel.ndim

2
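The reshape is needed because a slice such as `0:1` keeps the axis it slices, so `stc[:, 0:1]` has shape `(2, 1, 5)`. Indexing with a plain integer drops that axis instead. The sketch below rebuilds a stand-in for `stc` (named `stc_demo` here) and compares the two approaches:

```python
import numpy as np

# Stand-in for stc, built the same way as in the cells above
stc_demo = np.stack((np.arange(1, 11).reshape(2, 5), np.ones((2, 5))), axis=1)

kept_axis = stc_demo[:, 0:1]         # slice keeps the axis: 3-D
dropped_axis = stc_demo[:, 0]        # integer index drops it: 2-D
reshaped = kept_axis.reshape(2, 5)   # what the notebook does

print(kept_axis.shape, dropped_axis.shape)
```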

* Do you remember the `my_data` array that we defined above?
* Join `my_data` with `guncel` vertically (one below the other) using `np.concatenate`. Name the result `data`

In [80]:
data=np.concatenate((my_data,guncel),axis=0)
data

array([[2.00e+00, 5.00e+01, 1.25e+04, 9.80e+01, 1.00e+00],
       [0.00e+00, 1.30e+01, 3.25e+03, 2.80e+01, 1.00e+00],
       [1.00e+00, 1.60e+01, 4.00e+03, 3.50e+01, 1.00e+00],
       ...,
       [7.20e+01, 1.00e+00, 2.50e+02, 7.20e+01, 0.00e+00],
       [1.00e+00, 2.00e+00, 3.00e+00, 4.00e+00, 5.00e+00],
       [6.00e+00, 7.00e+00, 8.00e+00, 9.00e+00, 1.00e+01]])

* Sum the columns of `data`

In [82]:
data.sum(axis=0)

array([7.118000e+03, 4.134000e+03, 1.031261e+06, 2.565600e+04,
       1.930000e+02])
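The `axis` argument names the dimension that is collapsed by the reduction. With `axis=0` the rows are summed away, leaving one value per column. With `axis=1` the columns are summed away, leaving one value per row. A tiny sketch on a made-up 2x3 matrix:

```python
import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6]])

col_sums = m.sum(axis=0)   # one value per column -> [5, 7, 9]
row_sums = m.sum(axis=1)   # one value per row    -> [6, 15]

print(col_sums, row_sums)
```

The same convention applies to `max` and `min`.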

* Sum the rows of `data`

In [84]:
data.sum(axis=1)

array([12651.,  3292.,  4053.,  5068.,  6102.,  1012.,  1774.,  3048.,
        2284., 11650.,  5835.,   757.,  2541.,  3311.,  1524.,  1269.,
        3565.,  3817.,  1524.,   760.,   760.,  2793.,  1525.,  1525.,
        2284.,  3558.,  1524.,  3051.,  1271.,  2033.,  3573.,  2543.,
        2543.,  2290.,  4082.,  2039.,  3062.,  1527.,  3574.,  1784.,
        3319.,  1273.,  1274.,  1273.,  5094.,  2292.,  2297.,   506.,
         506.,   506.,  2809.,  2810.,  1530.,  3066.,  1274.,  4843.,
        2039.,  1788.,  4099.,  1530.,  1788.,  2046.,  2561.,  1276.,
         765.,  4093.,  1021.,   506.,  1786.,  2311.,  1022.,  1022.,
        4343.,   508.,   509.,   509.,  1023.,   508.,   508.,   508.,
        1534.,  1022.,  1022.,  1022.,  1537.,  1536.,   510.,   510.,
         510.,  1795.,   511.,   510.,   510.,   510.,  2796.,  1789.,
        4356.,  2302.,  1023.,  1789.,  2047.,  3342.,  2303.,  1283.,
        1283.,  4343.,  2067.,  1284.,   769.,  2576.,  1283.,  2309.,
      

* Return the maximum values of each column

In [85]:
#np.amax(data,axis=0)
data.max(axis=0)

array([7.40e+01, 5.00e+01, 1.25e+04, 9.80e+01, 1.00e+01])

* Return the maximum values of each row

In [86]:
data.max(axis=1)
#np.amax(data,axis=1)

array([1.250e+04, 3.250e+03, 4.000e+03, 5.000e+03, 6.000e+03, 1.000e+03,
       1.750e+03, 3.000e+03, 2.250e+03, 1.150e+04, 5.750e+03, 7.500e+02,
       2.500e+03, 3.250e+03, 1.500e+03, 1.250e+03, 3.500e+03, 3.750e+03,
       1.500e+03, 7.500e+02, 7.500e+02, 2.750e+03, 1.500e+03, 1.500e+03,
       2.250e+03, 3.500e+03, 1.500e+03, 3.000e+03, 1.250e+03, 2.000e+03,
       3.500e+03, 2.500e+03, 2.500e+03, 2.250e+03, 4.000e+03, 2.000e+03,
       3.000e+03, 1.500e+03, 3.500e+03, 1.750e+03, 3.250e+03, 1.250e+03,
       1.250e+03, 1.250e+03, 5.000e+03, 2.250e+03, 2.250e+03, 5.000e+02,
       5.000e+02, 5.000e+02, 2.750e+03, 2.750e+03, 1.500e+03, 3.000e+03,
       1.250e+03, 4.750e+03, 2.000e+03, 1.750e+03, 4.000e+03, 1.500e+03,
       1.750e+03, 2.000e+03, 2.500e+03, 1.250e+03, 7.500e+02, 4.000e+03,
       1.000e+03, 5.000e+02, 1.750e+03, 2.250e+03, 1.000e+03, 1.000e+03,
       4.250e+03, 5.000e+02, 5.000e+02, 5.000e+02, 1.000e+03, 5.000e+02,
       5.000e+02, 5.000e+02, 1.500e+03, 1.000e+03, 

* Return the minimum values of each column

In [87]:
data.min(axis=0)

array([0., 1., 3., 2., 0.])

* Return the minimum values of each row

In [88]:
data.min(axis=1)

array([1., 0., 1., 1., 0., 0., 1., 0., 1., 1., 0., 0., 1., 0., 1., 1., 1.,
       1., 1., 1., 1., 0., 1., 1., 0., 0., 0., 1., 1., 0., 0., 1., 1., 1.,
       0., 1., 1., 1., 1., 1., 1., 0., 1., 0., 1., 1., 0., 0., 0., 0., 0.,
       1., 0., 0., 1., 1., 1., 1., 0., 0., 0., 1., 0., 1., 1., 0., 1., 0.,
       0., 0., 0., 0., 1., 0., 1., 1., 1., 0., 0., 0., 1., 0., 0., 0., 1.,
       0., 0., 0., 0., 1., 1., 0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 1.,
       0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 1., 0., 0., 1., 0., 0.,
       1., 1., 1., 1., 1., 0., 0., 1., 0., 1., 1., 0., 0., 0., 0., 0., 0.,
       0., 0., 0., 0., 0., 0., 0., 0., 1., 1., 0., 0., 1., 0., 1., 0., 0.,
       1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 1., 1., 0., 0., 0., 0.,
       0., 0., 0., 0., 0., 0., 1., 1., 1., 0., 1., 0., 0., 0., 0., 0., 0.,
       0., 0., 1., 0., 0., 0., 0., 0., 1., 1., 0., 0., 0., 0., 0., 1., 0.,
       0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0.,
       1., 1., 0., 1., 1.

* Find the index of the largest value
* Note: The returned value is an index into the flattened `data` array.

In [89]:
np.argmax(data)   # flat index 2, i.e. the 3rd element

2

* Find the index of the smallest value

In [90]:
np.argmin(data)

5
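A flat index can be converted back to a (row, column) position with `np.unravel_index`. The sketch below uses only the first two rows of the data shown earlier as a stand-in, which is enough to reproduce the flat indices 2 and 5:

```python
import numpy as np

# First two rows of the cleaned data, copied from the output above
m = np.array([[2.0, 50.0, 12500.0, 98.0, 1.0],
              [0.0, 13.0, 3250.0, 28.0, 1.0]])

flat_max = np.argmax(m)                          # index into the flattened array
row, col = np.unravel_index(flat_max, m.shape)   # back to (row, column)

flat_min = np.argmin(m)
min_pos = np.unravel_index(flat_min, m.shape)

print(flat_max, (row, col), flat_min, min_pos)
```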

* Transpose `data` and name the result `datat`

In [93]:
datat=np.transpose(data)   # equivalently: data.T
datat



array([[2.00e+00, 0.00e+00, 1.00e+00, ..., 7.20e+01, 1.00e+00, 6.00e+00],
       [5.00e+01, 1.30e+01, 1.60e+01, ..., 1.00e+00, 2.00e+00, 7.00e+00],
       [1.25e+04, 3.25e+03, 4.00e+03, ..., 2.50e+02, 3.00e+00, 8.00e+00],
       [9.80e+01, 2.80e+01, 3.50e+01, ..., 7.20e+01, 4.00e+00, 9.00e+00],
       [1.00e+00, 1.00e+00, 1.00e+00, ..., 0.00e+00, 5.00e+00, 1.00e+01]])
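Transposing swaps the axes, so an array of shape `(rows, cols)` becomes `(cols, rows)`. For an ordinary array like the ones here, NumPy returns a view rather than a copy, which means writing to the transpose also changes the original. A minimal sketch on a made-up matrix:

```python
import numpy as np

m = np.arange(6).reshape(2, 3)   # made-up 2x3 matrix
t = m.T                          # same result as np.transpose(m)

shares_memory = np.shares_memory(m, t)   # True: t is a view of m
t[0, 1] = 99                             # writes through to m[1, 0]

print(t.shape, shares_memory, m[1, 0])
```

If an independent array is needed, `m.T.copy()` makes one.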