### Getting Started With Python & Data Analysis
#### Data is the core of data science, so scoping and collecting the right data for a project is crucial to achieving the required results. A complete data science pipeline involves:
#### 1. Data Scoping
#### 2. Data Review
#### 3. Feature Engineering
#### 4. Feature Review
#### 5. Model Selection and Review
#### 6. Model Evaluation and Insights
#### 7. Interaction Production 
#### 8. Feedback

#### Conducting Exploratory Data Analysis (EDA) on the cleaned data, using visualisations and statistical methods, gives quick insight into the patterns and relationships between features in the dataset. Modelling involves using statistical and machine learning methods to classify and cluster the processed data and to create predictive models. Several evaluation methods are used to compare the performance of these models and to improve them continuously before a final model is selected.

#### For the most part, the data science pipeline is iterative rather than linear: the stages are revisited as the project progresses.

#### Data comes in different forms, such as CSV, JSON and Excel files or database tables. Python handles most of these formats well through its libraries, which include NumPy, pandas, Matplotlib, scikit-learn and TensorFlow.
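
#### As a minimal sketch of what different formats mean in practice, the snippet below parses the same two records from CSV text and from JSON text using only Python's standard library. The sample records and variable names are made up for illustration; in a real project a library such as pandas would typically handle this step.

```python
import csv
import io
import json

# Hypothetical sample data: the same two records in two formats.
csv_text = "name,score\nAda,90\nLinus,85\n"
json_text = '[{"name": "Ada", "score": 90}, {"name": "Linus", "score": 85}]'

# csv.DictReader yields one mapping per row; every value is read as a string.
csv_rows = list(csv.DictReader(io.StringIO(csv_text)))

# json.loads keeps the types written in the JSON text (here, integers).
json_rows = json.loads(json_text)

print(csv_rows[0]["score"], type(csv_rows[0]["score"]))
print(json_rows[0]["score"], type(json_rows[0]["score"]))
```

#### The difference in value types (strings from CSV, integers from JSON) is one reason a data review step comes early in the pipeline.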

#### Jupyter Notebook is an interactive web-based environment that supports many programming languages, including Python and R, and lets code sit alongside explanatory text, images and visualisations.

### Introduction to NumPy & Creating Arrays

#### NumPy is a library whose basic data structure, the ndarray, is used to handle arrays and matrices. A NumPy array is a grid of values that all share the same data type, most commonly integers or floats. Arrays can be created from Python lists.

In [2]:
# Import the NumPy library
import numpy as np

arr = [1, 2, 3, 4]  # Create a simple list and assign it to a variable

print(arr)
print(type(arr))

[1, 2, 3, 4]
<class 'list'>


In [5]:
# Convert the list arr to a NumPy array
a = np.array(arr)

print(type(a))  # The type of the object, which is a NumPy ndarray

print(a.shape)  # The shape of the array, which is (4,): one dimension of length 4

print(a.dtype)  # The type of the data in the array, which is an integer type (int32 here; platform dependent)

print(a.ndim)  # The number of dimensions, which is 1

<class 'numpy.ndarray'>
(4,)
int32
1
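
#### The `dtype` reported above depends on the input values and on the platform's default integer size. The sketch below, with illustrative variable names, shows how mixing integers with a float changes the element type and how a type can be requested explicitly.

```python
import numpy as np

# Mixing integers and a float: NumPy stores every element as float64.
mixed = np.array([1, 2, 3.5])
print(mixed.dtype)

# Passing dtype explicitly makes the result independent of the platform default.
ints = np.array([1, 2, 3, 4], dtype=np.int64)
print(ints.dtype)
print(ints.shape)
```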


In [8]:
# Let's create a two-dimensional array
b = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])

print(b.shape)  # The shape of the array: the first dimension has 2 elements (rows) and the second has 4 (columns)

print(b.ndim)  # The number of dimensions of the array

(2, 4)
2
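
#### To make the meaning of `shape` more concrete, here is a small sketch that unpacks the shape tuple and contrasts a one-dimensional array with a two-dimensional array that has a single row. The names `flat` and `row` are chosen for illustration only.

```python
import numpy as np

b = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])

n_rows, n_cols = b.shape   # unpack the shape tuple
print(n_rows, n_cols)
print(b.size)              # total number of elements (rows * columns)

flat = np.array([1, 2, 3, 4])    # 1-D: shape (4,)
row = np.array([[1, 2, 3, 4]])   # 2-D with one row: shape (1, 4)
print(flat.shape, flat.ndim)
print(row.shape, row.ndim)
```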


#### NumPy also provides built-in functions for creating arrays of a given shape, including empty(), zeros(), ones(), full() and random.random().

In [11]:
zero_array = np.zeros(5) #Takes the number of zeros as an argument 

print(zero_array)

[0. 0. 0. 0. 0.]


In [14]:
empty_array = np.empty([2, 2])  # Takes the shape as an integer or a sequence of integers; the values are uninitialised

print(empty_array)

[[4.24399158e-314 8.48798317e-314]
 [1.27319747e-313 1.69759663e-313]]
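
#### The values printed above are whatever happened to be in memory, so they will differ from run to run. A minimal sketch of the usual pattern, assuming the array is meant to be filled before it is read (the name `buffer` is illustrative):

```python
import numpy as np

buffer = np.empty((2, 2))   # contents are arbitrary until assigned

# Overwrite every element before reading from the array.
buffer.fill(0.0)
print(buffer)

# When zeros are wanted from the start, np.zeros states that intent directly.
print(np.array_equal(buffer, np.zeros((2, 2))))
```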


In [19]:
one_array = np.ones([2,3])

one_array2 = np.ones(5)

print(one_array)
print('-----------')
print(one_array2)

[[1. 1. 1.]
 [1. 1. 1.]]
-----------
[1. 1. 1. 1. 1.]


In [21]:
np.full((2,2),10)

array([[10, 10],
       [10, 10]])

In [22]:
# A list fill value is repeated (broadcast) across each row
np.full((2,2),[1,2])

array([[1, 2],
       [1, 2]])
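
#### The two `np.full` calls above can be explored a little further. The sketch below shows how the fill value influences the element type and what happens when a list fill value does not fit the requested shape. The variable names are illustrative.

```python
import numpy as np

ints = np.full((2, 2), 10)                       # integer fill value gives an integer array
floats = np.full((2, 2), 10.0)                   # float fill value gives a float64 array
forced = np.full((2, 2), 10, dtype=np.float64)   # dtype requested explicitly

print(ints.dtype.kind, floats.dtype, forced.dtype)

# A list fill value is broadcast against the shape, so its length must
# match the last dimension (or be 1). Here 2 does not match 3.
broadcast_failed = False
try:
    np.full((2, 3), [1, 2])
except ValueError as err:
    broadcast_failed = True
    print("cannot broadcast:", err)
```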

In [26]:
np.random.random((2,3))

array([[0.34631824, 0.08418694, 0.23452772],
       [0.17089944, 0.41017402, 0.64441591]])
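
#### Random arrays change on every run, which can make results hard to reproduce. One option, sketched here as an alternative to `np.random.random`, is NumPy's `default_rng` generator with a fixed seed. The seed value 42 is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(seed=42)   # the seed value is an arbitrary choice
sample = rng.random((2, 3))            # uniform floats in [0, 1)

# Re-creating the generator with the same seed reproduces the same numbers.
repeat = np.random.default_rng(seed=42).random((2, 3))

print(sample.shape)
print(np.array_equal(sample, repeat))
```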