![Quote](http://darlington.infinityfreeapp.com/images/quote7.png)

# 1. Introduction

Welcome to another lesson in our Python Essentials course! In this session, we will explore the basics of **using Python for data analysis**. Thanks to its simple syntax and mature libraries, Python is one of the most widely used languages for data analysis. As data analysis grows in importance across fields and industries, individuals and organizations increasingly rely on Python to make decisions grounded in data. By the end of this lesson, you will know the core libraries involved and how to use NumPy to compute basic statistics on a small dataset.

# 2. What is Data Analysis?

Data analysis is the systematic process of **extracting valuable insights and knowledge from raw data**. In today's data-driven world, organizations across various sectors such as *business*, *finance*, *healthcare*, *marketing*, and *research* heavily depend on it to make informed decisions. By analyzing data, organizations and individuals can effectively solve real-world problems, identify emerging trends, detect anomalies, make accurate predictions, and optimize their business processes.

# 3. Python Libraries for Data Analysis

Python offers a wide range of libraries, such as **Pandas**, **NumPy**, and **Matplotlib**, for data analysis. These libraries provide powerful tools and functions to efficiently manipulate, clean, visualize, and analyze data.

### 3.2 NumPy

NumPy (**Numerical Python**) is a powerful Python library used for **working with arrays and performing mathematical operations** on them.

#### Using NumPy

**Install NumPy**: Before using NumPy, you need to install it. You can install NumPy using the following command:

In [60]:
# Installing numpy using Jupyter notebook
!pip install numpy

Defaulting to user installation because normal site-packages is not writeable


**Import NumPy**: Once NumPy is installed, you can import it into your Python program using the import statement. It is common practice to import NumPy with the alias **np** for better readability.

In [61]:
# Importing numpy library
import numpy as np

**Performing Operations**: NumPy offers a wide range of functions and methods for performing numerical operations on arrays efficiently, including on data loaded from a CSV file. With NumPy, we can analyze numerical data such as sales figures, run calculations over whole arrays at once, and handle many other numerical tasks.
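As a minimal sketch of what such operations look like, the cell below computes a total, a mean, and an element-wise product. The `sales` values are hypothetical and exist only for illustration; they do not come from the lesson's data.

```python
import numpy as np

# Hypothetical daily sales figures, used only for illustration
sales = np.array([120.0, 135.5, 128.0, 142.5])

total_sales = sales.sum()    # sum of all elements
mean_sales = sales.mean()    # arithmetic mean
with_tax = sales * 1.1       # element-wise multiplication, no explicit loop

print(total_sales, mean_sales)
print(with_tax)
```

Each call works on the whole array at once, which is the main reason NumPy code tends to be shorter than the equivalent loop over a Python list.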

### 3.3 Matplotlib

Matplotlib is a **plotting and visualization library** that enables you to create a wide variety of static, animated, and interactive visualizations in Python. With Matplotlib, you can generate plots, histograms, scatterplots, and more to explore and communicate your data effectively.

#### Using Matplotlib

**Install Matplotlib**: Before you can use Matplotlib, you need to install it by running the following command:

In [62]:
# Installing matplotlib using Jupyter notebook
!pip install matplotlib

Defaulting to user installation because normal site-packages is not writeable
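Once Matplotlib is installed, a first plot might look like the sketch below. The `days` and `prices` values are hypothetical, the file name `prices.png` is an arbitrary choice, and the `Agg` backend is selected only so the code also runs outside a notebook; in Jupyter you would typically call `plt.show()` instead of saving to a file.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is required
import matplotlib.pyplot as plt

# Hypothetical values, used only for illustration
days = [1, 2, 3, 4, 5]
prices = [150.5, 152.3, 151.2, 154.1, 153.7]

fig, ax = plt.subplots()
ax.plot(days, prices, marker="o")  # line plot with a marker at each point
ax.set_xlabel("Day")
ax.set_ylabel("Price")
ax.set_title("Price over time")
fig.savefig("prices.png")          # write the figure to an image file
```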


At the core of the NumPy package is the ndarray object. This encapsulates n-dimensional arrays of homogeneous data types, with many operations being performed in compiled code for performance. There are several important differences between NumPy arrays and the standard Python sequences:

- NumPy arrays have a fixed size at creation, unlike Python lists (which can grow dynamically). Changing the size of an ndarray will create a new array and delete the original.

- The elements in a NumPy array are all required to be of the same data type, and thus will be the same size in memory. The exception: one can have arrays of (Python, including NumPy) objects, thereby allowing for arrays of different sized elements.

- NumPy arrays facilitate advanced mathematical and other types of operations on large amounts of data. Typically, such operations are executed more efficiently and with less code than is possible using Python’s built-in sequences.
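The three differences above can be observed directly. The following is a small sketch with made-up values; the variable names are illustrative only.

```python
import numpy as np

a = np.array([1, 2, 3])

# Fixed size: np.append does not grow `a` in place, it returns a new array
b = np.append(a, 4)
print(a.size, b.size)

# Homogeneous type: mixing ints with a float upcasts every element to float
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)

# Vectorized arithmetic: one expression instead of an explicit loop
squares = a ** 2
print(squares)
```

Note that `a` still has three elements after the call to `np.append`, which is why repeatedly appending inside a loop is usually slower than building a list first and converting it once.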

A growing plethora of scientific and mathematical Python-based packages are using NumPy arrays; though these typically support Python-sequence input, they convert such input to NumPy arrays prior to processing, and they often output NumPy arrays. In other words, in order to efficiently use much (perhaps even most) of today’s scientific/mathematical Python-based software, just knowing how to use Python’s built-in sequence types is insufficient - one also needs to know how to use NumPy arrays.
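NumPy's own functions behave the same way, which makes the point easy to check: they accept a plain Python list, convert it internally, and return NumPy objects. A brief sketch, using a short illustrative list:

```python
import numpy as np

prices_list = [150.5, 152.3, 151.2]      # plain Python list

prices_array = np.asarray(prices_list)   # explicit conversion to an ndarray
print(type(prices_array).__name__)

# The list is accepted directly, and the result comes back as an ndarray
running_total = np.cumsum(prices_list)
print(type(running_total).__name__)

# Passing the list or the array gives the same mean
print(np.mean(prices_list) == prices_array.mean())
```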


Data Analysis and Visualization

Financial Mathematics

### Exercise

You have daily stock prices of a company for the last month, and you want to calculate important statistics like the average price, highest and lowest prices, and the percentage changes day-to-day.

[150.5, 152.3, 151.2, 154.1, 153.7, 152.8, 151.0, 149.8, 151.7,
 152.4, 153.1, 154.7, 155.2, 157.0, 156.5, 158.1, 159.2, 160.0,
 161.5, 162.7, 161.2, 160.8, 159.6, 158.2, 157.3, 156.1, 157.9,
 159.0, 158.4, 157.7]

In [3]:
import numpy as np

In [4]:
stock_price = np.array([150.5, 152.3, 151.2, 154.1, 153.7, 152.8, 151.0, 149.8, 151.7,
                         152.4, 153.1, 154.7, 155.2, 157.0, 156.5, 158.1, 159.2, 160.0,
                         161.5, 162.7, 161.2, 160.8, 159.6, 158.2, 157.3, 156.1, 157.9,
                         159.0, 158.4, 157.7])

In [8]:
average_price = np.mean(stock_price)
np.round(average_price, 2)

156.12

In [10]:
highest_price = np.max(stock_price)
highest_price

162.7

In [11]:
lowest_price = np.min(stock_price)
lowest_price

149.8

In [30]:
for i in range(len(stock_price) - 1):  # stop one short so that i + 1 stays in bounds
    # (next price - current price) / current price * 100
    percentage_change = ((stock_price[i+1] - stock_price[i]) / stock_price[i]) * 100
    print(stock_price[i+1], np.round(percentage_change, 2))  # next price and its percentage change

152.3 1.2
151.2 -0.72
154.1 1.92
153.7 -0.26
152.8 -0.59
151.0 -1.18
149.8 -0.79
151.7 1.27
152.4 0.46
153.1 0.46
154.7 1.05
155.2 0.32
157.0 1.16
156.5 -0.32
158.1 1.02
159.2 0.7
160.0 0.5
161.5 0.94
162.7 0.74
161.2 -0.92
160.8 -0.25
159.6 -0.75
158.2 -0.88
157.3 -0.57
156.1 -0.76
157.9 1.15
159.0 0.7
158.4 -0.38
157.7 -0.44
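The same day-to-day percentage changes can also be computed without a loop. The sketch below is one possible vectorized alternative using `np.diff`, not a replacement for the exercise solution; it repeats the price array so the cell runs on its own.

```python
import numpy as np

stock_price = np.array([150.5, 152.3, 151.2, 154.1, 153.7, 152.8, 151.0, 149.8, 151.7,
                        152.4, 153.1, 154.7, 155.2, 157.0, 156.5, 158.1, 159.2, 160.0,
                        161.5, 162.7, 161.2, 160.8, 159.6, 158.2, 157.3, 156.1, 157.9,
                        159.0, 158.4, 157.7])

# np.diff gives price[i+1] - price[i]; dividing by stock_price[:-1]
# (every price except the last) yields the relative change, scaled to percent
percentage_change = np.diff(stock_price) / stock_price[:-1] * 100

print(np.round(percentage_change, 2))
```

The result has 29 entries for 30 prices, one per consecutive pair of days, matching the number of lines printed by the loop.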
