# NUMPY. Numerical Computing with Python

Python is an excellent tool for general-purpose programming, with a highly readable syntax, rich and powerful data types, and a totally Zen philosophy.

However, it was not designed specifically for mathematical and scientific computing. In particular, Python lists are very flexible containers, but they are poorly suited to representing common mathematical constructs such as vectors and matrices efficiently.

Fortunately, there is the **numpy** package, which provides high-performance vector, matrix, and higher-dimensional data structures for Python. It is implemented in C and Fortran, so when calculations are vectorized (formulated with vectors and matrices), performance is very good. It is used in almost all numerical computation in Python.

**Why not simply use Python lists for computations instead of creating a new array type?**

There are several reasons:

* Python lists are very general. They can contain any kind of object, and they are dynamically typed. They do not support mathematical operations such as matrix and dot multiplication. Implementing such functions for Python lists would not be very efficient because of the dynamic typing.
* Numpy arrays are statically typed and homogeneous. The type of the elements is determined when the array is created.
* Numpy arrays are memory efficient.
* Because of the static typing, mathematical functions such as multiplication and addition of numpy arrays can be implemented efficiently in a compiled language (C and Fortran are used).
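
The points above can be checked with a small sketch. The variable names below are illustrative only and do not come from the rest of this notebook; the sketch shows that an array has one element type fixed at creation and a predictable memory footprint.

```python
import numpy as np

# A Python list can mix types freely; each element is a separate object.
mixed_list = [1, 2.5, "three"]

# A NumPy array has a single dtype, fixed when the array is created.
ints = np.array([1, 2, 3], dtype=np.int64)
floats = np.array([1, 2.5, 3])  # the integers are upcast to float64

print(ints.dtype, floats.dtype)

# Memory use is predictable: number of elements times bytes per element.
big = np.zeros(1000, dtype=np.int64)
print(big.itemsize, big.nbytes)  # 8 bytes per element, 8000 bytes in total
```

Because every element has the same size and type, compiled code can loop over the raw memory without checking the type of each item.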

![elgif](https://media.giphy.com/media/VHqZtsoHHPo58EL7Ua/giphy.gif)

<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#What-is-Numpy?" data-toc-modified-id="What-is-Numpy?-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>What is Numpy?</a></span></li><li><span><a href="#Work-with-numeric-data" data-toc-modified-id="Work-with-numeric-data-2"><span class="toc-item-num">2&nbsp;&nbsp;</span>Work with numeric data</a></span></li><li><span><a href="#Go-from-Python-lists-to-Numpy-arrays" data-toc-modified-id="Go-from-Python-lists-to-Numpy-arrays-3"><span class="toc-item-num">3&nbsp;&nbsp;</span>Go from Python lists to Numpy arrays</a></span></li><li><span><a href="#Operating-with-Numpy-Arrays" data-toc-modified-id="Operating-with-Numpy-Arrays-4"><span class="toc-item-num">4&nbsp;&nbsp;</span>Operating with Numpy Arrays</a></span></li><li><span><a href="#Advantages-of-using-Numpy-arrays" data-toc-modified-id="Advantages-of-using-Numpy-arrays-5"><span class="toc-item-num">5&nbsp;&nbsp;</span>Advantages of using Numpy arrays</a></span></li><li><span><a href="#Create-Arrays" data-toc-modified-id="Create-Arrays-6"><span class="toc-item-num">6&nbsp;&nbsp;</span>Create Arrays</a></span><ul class="toc-item"><li><span><a href="#random-numbers" data-toc-modified-id="random-numbers-6.1"><span class="toc-item-num">6.1&nbsp;&nbsp;</span>random numbers</a></span></li><li><span><a href="#Range" data-toc-modified-id="Range-6.2"><span class="toc-item-num">6.2&nbsp;&nbsp;</span>Range</a></span></li><li><span><a href="#Linespace" data-toc-modified-id="Linespace-6.3"><span class="toc-item-num">6.3&nbsp;&nbsp;</span>Linespace</a></span></li><li><span><a href="#around" data-toc-modified-id="around-6.4"><span class="toc-item-num">6.4&nbsp;&nbsp;</span>around</a></span></li><li><span><a href="#Arrays-from-data" data-toc-modified-id="Arrays-from-data-6.5"><span class="toc-item-num">6.5&nbsp;&nbsp;</span>Arrays from data</a></span></li></ul></li><li><span><a href="#Return-a-numpy-array-to-list" data-toc-modified-id="Return-a-numpy-array-to-list-7"><span 
class="toc-item-num">7&nbsp;&nbsp;</span>Return a numpy array to list</a></span><ul class="toc-item"><li><span><a href="#Array-of-ZEROs" data-toc-modified-id="Array-of-ZEROs-7.1"><span class="toc-item-num">7.1&nbsp;&nbsp;</span>Array of ZEROs</a></span></li><li><span><a href="#Array-of-ONES" data-toc-modified-id="Array-of-ONES-7.2"><span class="toc-item-num">7.2&nbsp;&nbsp;</span>Array of ONES</a></span></li><li><span><a href="#DIAGONAL-Array" data-toc-modified-id="DIAGONAL-Array-7.3"><span class="toc-item-num">7.3&nbsp;&nbsp;</span>DIAGONAL Array</a></span></li></ul></li><li><span><a href="#Dtypes" data-toc-modified-id="Dtypes-8"><span class="toc-item-num">8&nbsp;&nbsp;</span>Dtypes</a></span></li><li><span><a href="#Explore-arrays-and-their-properties" data-toc-modified-id="Explore-arrays-and-their-properties-9"><span class="toc-item-num">9&nbsp;&nbsp;</span>Explore arrays and their properties</a></span></li><li><span><a href="#Manipulate-arrays" data-toc-modified-id="Manipulate-arrays-10"><span class="toc-item-num">10&nbsp;&nbsp;</span>Manipulate arrays</a></span><ul class="toc-item"><li><ul class="toc-item"><li><span><a href="#Transpose" data-toc-modified-id="Transpose-10.0.1"><span class="toc-item-num">10.0.1&nbsp;&nbsp;</span><a href="https://numpy.org/doc/stable/reference/generated/numpy.transpose.html" rel="nofollow" target="_blank">Transpose</a></a></span></li></ul></li><li><span><a href="#Cannot-transpose-a-list" data-toc-modified-id="Cannot-transpose-a-list-10.1"><span class="toc-item-num">10.1&nbsp;&nbsp;</span>Cannot transpose a list</a></span></li></ul></li><li><span><a href="#Updating-the-value-of-a-ndim-array" data-toc-modified-id="Updating-the-value-of-a-ndim-array-11"><span class="toc-item-num">11&nbsp;&nbsp;</span>Updating the value of a ndim array</a></span></li><li><span><a href="#Array-comparison" data-toc-modified-id="Array-comparison-12"><span class="toc-item-num">12&nbsp;&nbsp;</span>Array comparison</a></span></li><li><span><a 
href="#Array-indexing-and-slicing" data-toc-modified-id="Array-indexing-and-slicing-13"><span class="toc-item-num">13&nbsp;&nbsp;</span>Array indexing and slicing</a></span><ul class="toc-item"><li><span><a href="#Mix-of-indexes-and-ranges" data-toc-modified-id="Mix-of-indexes-and-ranges-13.1"><span class="toc-item-num">13.1&nbsp;&nbsp;</span>Mix of indexes and ranges</a></span></li></ul></li><li><span><a href="#Methods-of-np.arrays" data-toc-modified-id="Methods-of-np.arrays-14"><span class="toc-item-num">14&nbsp;&nbsp;</span>Methods of np.arrays</a></span><ul class="toc-item"><li><span><a href="#Numpy-methods" data-toc-modified-id="Numpy-methods-14.1"><span class="toc-item-num">14.1&nbsp;&nbsp;</span>Numpy methods</a></span></li></ul></li><li><span><a href="#Fun-things-that-Numpy-allows-us-to-do" data-toc-modified-id="Fun-things-that-Numpy-allows-us-to-do-15"><span class="toc-item-num">15&nbsp;&nbsp;</span>Fun things that Numpy allows us to do</a></span></li><li><span><a href="#Some-methods:" data-toc-modified-id="Some-methods:-16"><span class="toc-item-num">16&nbsp;&nbsp;</span>Some methods:</a></span></li><li><span><a href="#Summary" data-toc-modified-id="Summary-17"><span class="toc-item-num">17&nbsp;&nbsp;</span>Summary</a></span></li><li><span><a href="#Further-materials" data-toc-modified-id="Further-materials-18"><span class="toc-item-num">18&nbsp;&nbsp;</span>Further materials</a></span></li></ul></div>

## What is Numpy?

In [1]:
import numpy as np

NumPy, short for "Numerical Python," is a powerful library for the Python programming language. It is specifically designed to facilitate the creation and manipulation of large multidimensional arrays and matrices, along with an extensive collection of high-level mathematical functions to operate on these arrays. NumPy is widely used in scientific and data-related applications for its efficient numerical computing capabilities.

The origins of NumPy can be traced back to the "Numeric" library, initially developed by Jim Hugunin, with contributions from several other developers. Over time, NumPy evolved and expanded its capabilities. In 2005, Travis Oliphant played a crucial role in merging features from a competing library called Numarray into Numeric, resulting in the birth of NumPy. Since then, NumPy has gained tremendous popularity and has become an integral part of the Python ecosystem. It is an open-source project with a vibrant community of contributors.

At the heart of the NumPy package lies the `ndarray` (short for "n-dimensional array") object. This fundamental data structure allows users to create arrays with multiple dimensions, all containing homogeneous data types. One of the key strengths of NumPy is its ability to execute many operations efficiently by using code that is compiled for performance. This makes NumPy well-suited for tasks involving numerical computation and data manipulation.

There are several noteworthy differences between NumPy arrays and standard Python sequences:


![numpy.png](attachment:numpy.png)

**Numpy** is here to make our life easier.

## Work with numeric data

In the realm of data analysis, "data" often implies numerical information, such as stock prices, sales figures, sensor readings, sports scores, database tables, and more. For handling numerical computations in Python, the NumPy library offers specialized data structures, functions, and tools. Let's delve into an example to understand why and how NumPy is instrumental in working with numerical data.

Consider a scenario where we want to assess whether a particular region is suitable for apple cultivation based on climate-related data, including temperature, rainfall, and humidity. A straightforward approach is to establish a relationship between the annual apple yield (measured in tonnes per hectare) and climatic factors like mean temperature (in degrees Fahrenheit), rainfall (in millimeters), and mean relative humidity (in percentage). We can express this relationship as a linear equation:

`yield_of_apples = w1 * temperature + w2 * rainfall + w3 * humidity`

Here, we represent apple yield as a weighted sum of temperature, rainfall, and humidity. This equation is a simplification: the actual relationship may not be strictly linear, and additional factors could come into play. However, a basic linear model often provides practical insights.

By analyzing historical data statistically, we can obtain reasonable values for the coefficients w1, w2, and w3. Here's an example of possible coefficient values:

![dat_np.png](attachment:dat_np.png)

We can now substitute these variables into the linear equation to predict the yield of the apples.

In [None]:
w1, w2, w3 = 0.3, 0.2, 0.5

`yield_of_apples = w1 * temperature + w2 * rainfall + w3 * humidity`

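For example, plugging in the values for the Kanto region (temperature 73, rainfall 67, humidity 43) gives the calculation below. To make this calculation a bit easier for multiple regions, we can represent the climate data for each region as a vector, i.e. a list of numbers.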

In [None]:
yield_of_apples = w1 * 73 +  w2 * 67 + w3 * 43

The three numbers in each vector represent the temperature, precipitation, and humidity data, respectively.
We can also represent the set of weights used in the formula as a vector.

In [None]:
weights = [w1, w2, w3]

How can we calculate the apple yield for each region?

In [None]:
# Define the weights as a vector
weights = [w1, w2, w3]

# Define the climate data for Kanto as a vector
kanto_as_list = [73, 67, 43]

In [None]:
# Calculate the yield of apples for Kanto using the dot product
yield_kanto = sum(w * x for w, x in zip(weights, kanto_as_list))

In [None]:
# Similarly, you can calculate the yield for other regions like Johto
johto = [91, 88, 64]
yield_johto = sum(w * x for w, x in zip(weights, johto))

## Go from Python lists to Numpy arrays

The operation we performed, which involved elementwise multiplication of two vectors and then summing the results, is commonly known as the dot product. If you'd like to explore the dot product in more detail, you can refer to the official NumPy documentation here: [NumPy Dot Product](https://numpy.org/doc/stable/reference/generated/numpy.dot.html).

Before we can utilize NumPy's built-in function to compute the dot product, we need to convert our Python lists into NumPy arrays. To do this, we'll start by installing the NumPy library using the pip package manager (`pip install numpy`).

Next, we import the numpy module. It is common practice to import numpy with the alias `np`. This is only a convention: the import works with any alias, but virtually all programmers use the same names when importing popular libraries, and for numpy that name is always `np`.

In [None]:
import numpy as np

Let's start with a list, as before:

In [None]:
kanto_as_list = [73, 67, 43]

In [None]:
type(kanto_as_list)

Now, we can convert it into a numpy array with the following operation:

In [None]:
kanto_array = np.array(kanto_as_list)

In [None]:
type(kanto_array)

Let's do the same for the weights:

In [None]:
weights_array = np.array([0.3, 0.2, 0.5])

Now, we can perform the previous elementwise multiplication with just this:

In [None]:
kanto_array * weights_array

And the final output is:

In [None]:
sum(kanto_array * weights_array)

## Operating with Numpy Arrays

The `*` operator performs an elementwise multiplication of two arrays if they have the same shape. Adding up the products, either with Python's built-in `sum` as above or with the array's own `.sum()` method, gives the weighted total.

weights_array * kanto_array

We can now calculate the dot product of the two vectors using the np.dot function.

np.dot(weights_array, kanto_array)

You can find all mathematical functions for numpy arrays [here](https://numpy.org/doc/stable/reference/routines.math.html).
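
As a minimal sketch of these operations, the block below reuses the Kanto and Johto figures from earlier. The names `johto_array`, `climate`, and `yields` are introduced here for illustration; stacking regions into a matrix is one possible way to handle several regions at once, not necessarily how the rest of the notebook does it.

```python
import numpy as np

weights_array = np.array([0.3, 0.2, 0.5])
kanto_array = np.array([73, 67, 43])
johto_array = np.array([91, 88, 64])

# Elementwise arithmetic works on whole arrays at once.
difference = johto_array - kanto_array  # [18 21 21]
scaled = kanto_array / 100              # every element divided by 100

# Three equivalent ways to compute the dot product.
via_sum = (kanto_array * weights_array).sum()
via_dot = np.dot(weights_array, kanto_array)
via_matmul = weights_array @ kanto_array

# Stack the regions as rows of a matrix: one matrix-vector product
# then gives the yield of every region.
climate = np.array([kanto_array, johto_array])  # shape (2, 3)
yields = climate @ weights_array
print(yields)  # approximately [56.8, 76.9]
```

The `@` operator is Python's matrix multiplication operator; for two 1-D arrays it gives the same result as `np.dot`.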

## Advantages of using Numpy arrays

NumPy arrays offer several advantages over Python lists, especially when working with numeric data in data analysis and scientific computing:

**Ease of Use**

One of the primary advantages of NumPy arrays is their ease of use for performing mathematical operations. You can write concise and intuitive expressions that operate on entire arrays, making your code more readable and reducing the need for explicit loops.

For example, suppose you have two arrays, `kanto` and `weights`, representing climate data and corresponding weights. You can easily calculate the yield of apples for the Kanto region as:

```python
yield_of_apples = (kanto * weights).sum()
```

**Performance**

Another significant advantage of NumPy arrays is their performance. NumPy's operations and functions are implemented internally in C or C++, which makes them considerably faster than using equivalent Python statements and loops that are interpreted at runtime.

In [None]:
import time
import numpy as np

Now, let's measure the time it takes to calculate the yield using Python lists:

In [None]:
# First, let's define the weights and climate data for the Kanto region
weights = [0.3, 0.2, 0.5]  # Weights for temperature, rainfall, and humidity
kanto_as_list = [73, 67, 43]  # Climate data for Kanto as a list

# Function to calculate apple yield manually using Python lists
def apples_per_region(lst_):
    counter = 0
    for j, k in zip(weights, lst_):
        counter += j * k
    return counter

start_time = time.time()  # Record the start time
result = apples_per_region(kanto_as_list)  # Calculate yield using Python lists
end_time = time.time()  # Record the end time

elapsed_time = end_time - start_time  # Calculate the elapsed time
print(f"Yield calculated using Python lists: {result}")
print(f"Time taken: {elapsed_time} seconds")

In [None]:
# Now, let's perform the same calculation using NumPy arrays
kanto_array = np.array([73, 67, 43])  # Climate data for Kanto as a NumPy array
weights_array = np.array([0.3, 0.2, 0.5])  # Weights as a NumPy array

# Measure the time it takes to calculate the yield using NumPy arrays
start_time = time.time()  # Record the start time
result = np.dot(weights_array, kanto_array)  # Calculate yield using NumPy arrays
end_time = time.time()  # Record the end time

elapsed_time = end_time - start_time  # Calculate the elapsed time
print(f"Yield calculated using NumPy arrays: {result}")
print(f"Time taken: {elapsed_time} seconds")

The slight difference in execution time, where NumPy appears to take slightly longer, is primarily due to the additional overhead introduced by NumPy's internal operations for handling arrays. NumPy is optimized for complex mathematical operations on large datasets, which might involve additional setup and memory management. For very simple calculations like this example, the overhead of NumPy can sometimes make it appear slightly slower.

However, this fixed overhead is negligible in most data-intensive tasks. NumPy truly shines when performing more complex operations or working with larger datasets, where its performance advantages become much more apparent.

So, while in this specific case, NumPy might appear slightly slower, it's essential to consider the broader context where NumPy's efficiency becomes a significant advantage.

NumPy provides a more convenient and efficient way to work with numerical data, enabling you to write cleaner, more expressive code and achieve better performance in data analysis and scientific computing tasks.
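
To see the effect of array size, here is a rough sketch that repeats the comparison on much larger inputs. The size `n`, the repetition count, and the helper names are arbitrary choices made for this illustration, and the exact timings depend on your machine, so none are quoted here.

```python
import timeit
import numpy as np

n = 1_000_000  # arbitrary size, large enough for the fixed overhead to vanish

list_a = list(range(n))
list_b = list(range(n))
array_a = np.arange(n, dtype=np.float64)
array_b = np.arange(n, dtype=np.float64)

def dot_with_lists():
    # Pure Python: one interpreted multiplication and addition per element.
    return sum(x * y for x, y in zip(list_a, list_b))

def dot_with_numpy():
    # NumPy: the whole loop runs in compiled code.
    return np.dot(array_a, array_b)

# timeit runs each function several times and returns the total time in seconds.
list_seconds = timeit.timeit(dot_with_lists, number=3)
numpy_seconds = timeit.timeit(dot_with_numpy, number=3)

print(f"Python lists: {list_seconds:.4f} s for 3 runs")
print(f"NumPy arrays: {numpy_seconds:.4f} s for 3 runs")
```

On typical hardware the NumPy version should come out far ahead at this size. `timeit` is used instead of a single `time.time()` difference because repeating the measurement reduces noise.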

## Create Arrays
In data analysis and scientific computing, generating random data is a common task. NumPy provides powerful tools to create arrays filled with random numbers, which can be particularly useful for tasks like simulating experiments or generating synthetic datasets.

### random numbers

**Random Randint vs. Random Random**

In this section, we'll explore two essential functions for generating random numbers in NumPy: [**`numpy.random.randint`**](https://numpy.org/doc/stable/reference/random/generated/numpy.random.randint.html) and [**`numpy.random.random`**](https://numpy.org/doc/stable/reference/random/generated/numpy.random.random.html).

- **`numpy.random.randint`**: This function generates random integers within a specified range. It's useful for creating arrays of whole numbers when simulating scenarios that involve discrete values.

- **`numpy.random.random`**: This function produces random floating-point numbers in the half-open interval [0, 1), meaning that 0 can occur but 1 cannot. It's suitable for tasks that require continuous random variables, such as modeling uncertainty in scientific experiments.

By understanding these two functions, you'll gain the capability to generate diverse sets of random data tailored to your specific needs. Whether you're conducting statistical simulations, testing algorithms, or exploring probability distributions, NumPy's random number generation capabilities will prove invaluable.
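
As a minimal sketch of how the two functions are called (the seed value and the variable names are illustrative and not part of the original notebook):

```python
import numpy as np

np.random.seed(42)  # optional: makes the "random" numbers repeatable between runs

# randint(low, high, size): integers from low (included) up to high (excluded).
dice_rolls = np.random.randint(1, 7, size=10)   # ten values between 1 and 6

# With a single bound, the range starts at 0.
small_ints = np.random.randint(3, size=(2, 4))  # values 0, 1 or 2 in a 2x4 array

# random(size): floats in the half-open interval [0, 1).
unit_floats = np.random.random(5)

# Scale and shift to cover another interval, here [60, 90).
temperatures = 60 + 30 * np.random.random(5)

print(dice_rolls)
print(small_ints)
print(unit_floats)
print(temperatures)
```

Note that the upper bound of `randint` is exclusive, just like `range` in plain Python.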

Let's dive into each of these functions to see how they work and how to use them effectively.


![Screenshot%202022-10-11%20at%2009.45.54.png](attachment:Screenshot%202022-10-11%20at%2009.45.54.png)

In [None]:
# Import the NumPy library as np
import numpy as np

# Generate an array of 10 random floating-point numbers between 0 and 1
random_numbers = np.random.random(10)

# Print the generated array
print("Random Numbers Array:")
print(random_numbers)

**OFF-TOPIC**
- **Randomness**: Randomness is a concept related to unpredictability and chance in data or events.
- **Time in Seconds**: Measuring time in seconds can be useful for timing events or processes.
- **Coordinates**: Coordinates represent points in space or on a plane, often used in geometry and mapping.

In [None]:
# Create a function to display information about a NumPy array
def numpy_info(arr):
    print("\nThe dimension is:", arr.ndim)
    print("The shape is:", arr.shape)
    print("The size is:", arr.size)

The `numpy_info` function is designed to provide information about a NumPy array. It takes an input array `arr` and displays three key characteristics of the array:

1. **Dimension (`arr.ndim`):** This attribute holds the number of dimensions of the input array `arr`. For example, if `arr` is a 1-dimensional array, it is `1` (vector). If it's a 2-dimensional array, it is `2` (matrix), and so on.

2. **Shape (`arr.shape`):** This attribute is a tuple representing the dimensions of the array. The tuple contains the length of the array along each dimension. For example, if `arr` is a 1-dimensional array with 7 elements, it is `(7,)`. If it's a 2-dimensional array with dimensions 3x4, it is `(3, 4)`.

3. **Size (`arr.size`):** This attribute holds the total number of elements in the array. It's essentially the product of the lengths of all dimensions. For example, if `arr` is a 2-dimensional array with dimensions 3x4, it is `12` because there are a total of 12 elements in the array.

The `numpy_info` function's purpose is to quickly summarize the structure and size of a NumPy array for better understanding and analysis.


1. `1darray: vector`

In [None]:
# Generate an array of 7 random floating-point numbers between 0 and 1
vector = np.random.random(7)

# Print the generated vector
print("\nRandom Vector:")
print(vector)

In [None]:
# Call the numpy_info function to display information about the 'vector' array
print("\nArray Information:")
numpy_info(vector)

2. `2darray: matrix`

In [None]:
# Generate a random 2D NumPy array with dimensions 3x4
matrix = np.random.random((3, 4)) # (rows, columns)

# Print the generated matrix
print("\nRandom Matrix:")
print(matrix)

In [None]:
# Call the numpy_info function to display information about the 'matrix' array
print("\nArray Information:")
numpy_info(matrix)

3. `3darray: tensor`

In [None]:
# Generate a random 3D NumPy array of integers (0, 1 or 2) with dimensions 3 x 4 x 5
tensor = np.random.randint(3, size=(3, 4, 5)) # (blocks, rows, columns)

# Print the generated tensor
print("\nRandom tensor:")
print(tensor)

In [None]:
# Call the numpy_info function to display information about the 'tensor' array
print("\nArray Information:")
numpy_info(tensor)

![enter-the-matrix-10-638.jpeg](attachment:enter-the-matrix-10-638.jpeg)

Numpy also provides a variety of convenient functions for creating arrays with specific shapes and containing either fixed or random values. You can explore these array creation functions further by referring to the official [documentation](https://numpy.org/doc/stable/reference/routines.array-creation.html) or by using the built-in `help` function for more comprehensive information and examples. These functions simplify the process of initializing arrays to suit your specific data manipulation needs.

### Range
You can use NumPy to easily create a range of values using the `np.arange()` function. This function takes up to three arguments: `start`, `stop`, and `step` (`start` defaults to 0 and `step` to 1), and it returns an array of values starting from `start`, up to (but not including) `stop`, with increments of `step`.

Here's an example:

In [None]:
# Similar to Python's built-in range(), but returns a NumPy array
range_of_values = np.arange(0, 100)
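
The cell above only uses `start` and `stop`. A short sketch of the other call patterns, with illustrative variable names:

```python
import numpy as np

first_five = np.arange(5)        # only stop given: [0 1 2 3 4]
evens = np.arange(0, 10, 2)      # step of 2: [0 2 4 6 8]
countdown = np.arange(5, 0, -1)  # negative step: [5 4 3 2 1]

print(first_five, evens, countdown)
```

In every case `stop` itself is left out of the result.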

### Linspace

The NumPy `linspace` function is a versatile tool for creating sequences of evenly spaced values within a specified interval. It allows you to define a start point and an end point for a given range and determine the total number of evenly spaced values you want within that interval. Importantly, this sequence includes both the start and end points.

Here's how you can use the `np.linspace()` function:

In [None]:
np.linspace(0, 100, 50)

In this function, `start` represents the beginning of the interval, `stop` is the endpoint, and `num` is the total number of values you want in the sequence. NumPy will generate a sequence of values that are evenly distributed between `start` and `stop`, inclusive of both endpoints.

The `linspace` function is particularly useful for tasks like defining values along a continuous axis for plotting graphs or creating evenly spaced intervals for numerical calculations.
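
A small sketch of how the endpoints behave. The optional `endpoint` and `retstep` parameters are part of `np.linspace`; the variable names are illustrative.

```python
import numpy as np

# Five values from 0 to 1, both endpoints included.
points = np.linspace(0, 1, 5)
print(points)  # [0.   0.25 0.5  0.75 1.  ]

# endpoint=False leaves the stop value out, so the spacing becomes 0.2.
without_stop = np.linspace(0, 1, 5, endpoint=False)
print(without_stop)

# retstep=True also returns the spacing between consecutive values.
values, step = np.linspace(0, 100, 50, retstep=True)
print(step)  # 100 / 49, roughly 2.04
```

Unlike `np.arange`, you choose how many values you want rather than the step size, which avoids surprises with floating-point steps.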

For more details and options, you can refer to the official [NumPy `linspace` documentation](https://numpy.org/doc/stable/reference/generated/numpy.linspace.html).

### Around
`np.around` (also available as `np.round`, which is used below) rounds every element of an array to the number of decimals that we tell it 🙃

In [None]:
np.round(np.linspace(0, 100, 50), 3)

### Arrays from data
In NumPy, you can easily create arrays from existing data structures like lists. This section explores various ways to convert between Python lists and NumPy arrays.

**Converting a List to a NumPy Array**

Suppose you have a Python list a with some values:

In [None]:
a = [90, 50, 0]

You can convert this list into a NumPy array using `np.array()`:

In [None]:
a = np.array(a)

Now, the variable `a` holds a NumPy array with the same values:

**Converting an Array to a List**: If you have a NumPy array and want to convert it back to a Python list, you have a few options:

1. Using the `tolist()` method:

In [None]:
b = a.tolist()

This will create a new list `b` containing the values from the NumPy array.

2. Using the `list()` constructor:

In [None]:
list(a)

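You can also pass the NumPy array to the `list()` constructor. Note that this only unpacks the first dimension and leaves the elements as NumPy scalars, whereas `tolist()` converts them to plain Python numbers.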

**Multi-Dimensional Arrays**: NumPy can also handle multi-dimensional arrays with ease. For instance, if you want to create a 3D array with random integers:

In [None]:
a = np.random.randint(10, size=(1, 2, 3))

This will result in a 3D NumPy array. Be cautious when converting such arrays with `list()`, as only the outermost dimension is converted.

In [None]:
c_badly_done = list(a)

In the example above, `c_badly_done` is a list whose single element is still a 2D NumPy array of shape `(2, 3)`, because `list()` only unpacks the first dimension. Use `a.tolist()` to get a fully nested Python list.
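
To compare the two conversions side by side, here is a small sketch. It uses a fixed 3D array instead of random integers so the result is reproducible; the variable names are illustrative.

```python
import numpy as np

# A fixed 3D array with the same shape as above: (1, 2, 3).
a = np.array([[[0, 1, 2],
               [3, 4, 5]]])

as_nested_list = a.tolist()  # converts every level to plain Python lists
outer_only = list(a)         # converts only the first dimension

print(as_nested_list)            # [[[0, 1, 2], [3, 4, 5]]]
print(type(as_nested_list[0]))   # <class 'list'>
print(type(outer_only[0]))       # <class 'numpy.ndarray'>
print(outer_only[0].shape)       # (2, 3)
```

For multi-dimensional arrays, `tolist()` is usually the conversion you want.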

**Conclusion**: Understanding how to convert data between Python lists and NumPy arrays is essential for working with data efficiently. Depending on your needs, you can easily switch between these data structures while taking care to manage the dimensions properly.

Always be mindful of the data structure you are working with, especially when handling multi-dimensional arrays, as converting them to lists may result in nested structures that require additional handling.

### Array of ZEROs
`np.zeros` creates an array in which all elements are 0. Its syntax is:
`np.zeros(shape, dtype=float)`

Where:
- `shape` is an integer or a tuple of integers giving the length of the array along each dimension, so the result can be 1-D, 2-D, or higher-dimensional.
- `dtype` is `float64` by default, but can be set to any NumPy data type.

In [None]:
np.zeros((2, 5, 10))

### Array of ONES
Similar to creating arrays of zeros, you can create an array where all elements are set to 1 using `np.ones()`. The syntax and parameters of `np.ones()` are identical to those of `np.zeros()`:

In [None]:
np.ones((2, 5, 10))
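
A short sketch of the `shape` and `dtype` parameters in action. The variable names are illustrative, and multiplying an array of ones is just one simple way to get another constant.

```python
import numpy as np

zeros_int = np.zeros((2, 3), dtype=int)  # 2x3 array of integer zeros
ones_vec = np.ones(4)                    # 1-D array of four float ones

# Any other constant can be obtained by scaling an array of ones.
sevens = 7 * np.ones((2, 2))

print(zeros_int)
print(ones_vec, ones_vec.dtype)
print(sevens)
```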

### Diagonal Array

The `np.eye()` function in NumPy is used to generate a two-dimensional array with ones along the main diagonal and zeros elsewhere. This can be especially useful when you need to create identity matrices, which are square matrices with ones on the diagonal and zeros everywhere else.

Here's how you can use `np.eye()` with its parameters:

- **N**: An integer that specifies the number of rows in the resulting array. This determines the size of the square matrix.

- **M** (optional): An integer that specifies the number of columns in the array. By default, it is set to `None`, which means it will be equal to `N`, resulting in a square matrix. However, you can set `M` to a different value to create a rectangular matrix.

- **k** (optional): An integer, the default value is 0. It determines the position of the diagonal. When `k` is 0, it places ones on the main diagonal. If `k` is positive, it shifts the diagonal upwards, creating an upper diagonal with the offset of `k`. Conversely, if `k` is negative, it shifts the diagonal downwards, forming a lower diagonal with the offset of `-k`.

- **dtype** (optional): The data type of the array elements. By default, it is set to float, but you can specify other data types as needed.

In [None]:
np.eye(5, 5, dtype=int)

In [None]:
np.eye(5, 5, 2) # k=2: ones on the second diagonal above the main one, zeros elsewhere
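
The effect of `M` and `k` is easier to see on small arrays. A minimal sketch, with illustrative variable names:

```python
import numpy as np

identity = np.eye(3, dtype=int)     # ones on the main diagonal
upper = np.eye(3, k=1, dtype=int)   # diagonal shifted one step up
lower = np.eye(3, k=-1, dtype=int)  # diagonal shifted one step down
rect = np.eye(2, 4, dtype=int)      # 2 rows, 4 columns

print(upper)
# [[0 1 0]
#  [0 0 1]
#  [0 0 0]]
print(lower)
# [[0 0 0]
#  [1 0 0]
#  [0 1 0]]
print(rect)
# [[1 0 0 0]
#  [0 1 0 0]]
```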

## Dtypes

NumPy provides various data types (dtypes) to represent and work with different kinds of data efficiently. These data types are essential for specifying the type of data an array can hold, and they are a crucial aspect of NumPy's functionality.

Here's a list of common NumPy data types:

- **int8, int16, int32, int64**: Signed integer types with different bit sizes.
- **uint8, uint16, uint32, uint64**: Unsigned (non-negative) integer types with different bit sizes.
- **float16, float32, float64**: Floating-point types with varying precision.
- **complex64, complex128**: Complex number types.
- **bool**: Boolean type.
- **object**: Generic Python object type.
- **bytes_**: Byte-string type (fixed-size ASCII strings); it was called `string_` before NumPy 2.0.
- **str_**: Unicode string type (fixed-size Unicode strings); it was called `unicode_` before NumPy 2.0.
- **datetime64**: Date and time type.
- **timedelta64**: Differences between two datetime values.
- **void**: Raw data type (for structured arrays).
- **structured dtypes**: User-defined data types built from named fields, each with its own dtype.
- **single characters**: A single Unicode character is simply a `str_` of length 1 (dtype `<U1`).

These data types allow you to work with a wide range of data efficiently, making NumPy a powerful library for numerical and scientific computing. You can explore the [official NumPy documentation](https://numpy.org/doc/stable/user/basics.types.html) for more detailed information about these data types and their usage.

In [None]:
np.eye(5, 5) # type: float

In [None]:
np.eye(5, 5, dtype="int") # type: int

In [None]:
np.eye(5, 5, dtype="str") # type: str

In [None]:
# In a pandas DataFrame, a column with dtype=object usually holds Python strings
np.eye(5, 5, dtype="object") # type: object
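
Beyond passing `dtype` at creation time, you can inspect the dtype of an existing array and convert it with `astype`. A minimal sketch with illustrative variable names; the default integer dtype depends on the platform, so it is printed rather than assumed.

```python
import numpy as np

ints = np.array([1, 2, 3])
floats = np.array([1.5, 2.5, 3.5])
flags = np.array([True, False])
words = np.array(["apple", "kiwi"])

# NumPy infers the dtype from the data.
print(ints.dtype, floats.dtype, flags.dtype, words.dtype)

# astype returns a converted copy; the original array is unchanged.
small = ints.astype(np.int8)    # 1 byte per element instead of the default
truncated = floats.astype(int)  # the fractional part is dropped: [1 2 3]

print(small.itemsize, truncated)
```

The string array gets the dtype `<U5`: fixed-size Unicode strings of up to 5 characters, the length of the longest entry.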

## Business challenge: Forex Data Analysis Challenge

In this exercise, we'll embark on a Forex data analysis challenge, simulating a scenario where you need to analyze historical foreign exchange (Forex) rate data for a currency pair. You've been provided with a dataset containing daily Forex rates for a specific currency pair over a month. Your objective is to leverage NumPy to extract valuable insights from this financial data.

### Dataset Description

The dataset comprises two NumPy arrays:
- `dates`: An array containing date values representing each trading day of the month.
- `exchange_rates`: An array of exchange rate values for a specific currency pair corresponding to each date.

Your tasks will involve:

1. Calculating the average exchange rate for the entire month.
2. Identifying the date with the highest exchange rate.
3. Determining the total trading volume for the month (the sample data below has no volume column, so assume an equal volume each day).

By the end of this exercise, you'll gain hands-on experience in utilizing NumPy for financial data analysis, a valuable skill for professionals in the world of Forex trading and finance.

Let's kick things off by calculating the average exchange rate for the entire month!

**Hint!**: check the documentation [here](https://numpy.org/doc/stable/reference/routines.html).

In [None]:
import numpy as np

# Sample Forex data (you can replace this with your dataset)
dates = np.array(['2023-09-01', '2023-09-02', '2023-09-03', '2023-09-04', '2023-09-05'])
exchange_rates = np.array([1.15, 1.16, 1.14, 1.17, 1.15])

# 1. Calculating the average exchange rate for the entire month
# [your code here]
print(f"Average Exchange Rate for the Month: {average_rate:.2f}")

# 2. Identifying the date with the highest exchange rate
# [your code here]
print(f"Highest Exchange Rate Date: {max_rate_date}, Rate: {max_rate:.2f}")

# 3. Determining the total trading volume for the month (assuming equal volume each day)
# [your code here]
print(f"Total Trading Volume for the Month: {total_volume:.2f}")

## [Transpose](https://numpy.org/doc/stable/reference/generated/numpy.transpose.html)

In data analysis and linear algebra, the transpose of a matrix is a fundamental operation that flips the matrix over its diagonal. This operation swaps the rows and columns of the matrix, effectively changing the orientation of the data. Transposition is commonly used in various mathematical operations and transformations, making it a crucial concept in numerical computing.

You can think of the transpose as a way to represent the same data from a different perspective, especially when dealing with multi-dimensional arrays or matrices. It is often used to align data correctly for mathematical operations, including matrix multiplication, solving linear equations, and more.

In NumPy, the `numpy.transpose()` function allows you to easily transpose arrays and matrices, providing a versatile tool for data manipulation and analysis.

![image.png](attachment:image.png)

Please note that the image above is a simplified representation of matrix transposition, showing rows becoming columns and vice versa. For arrays with more than two dimensions, `numpy.transpose()` reverses the order of the axes by default, and its optional `axes` argument lets you choose a different permutation.

In [None]:
# Create a sample 2D array (matrix)
matrix = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])

In [None]:
# Display the original matrix
print("Original Matrix:")
print(matrix)

In [None]:
# Use numpy.transpose() to transpose the matrix
transposed_matrix = np.transpose(matrix)

In [None]:
# Display the transposed matrix
print("\nTransposed Matrix:")
print(transposed_matrix)

In [None]:
# Alternatively, you can use the T attribute for transposition
# This achieves the same result as np.transpose()
transposed_matrix_alt = matrix.T

In [None]:
# Display the alternative transposed matrix
print("\nAlternative Transposed Matrix:")
print(transposed_matrix_alt)

In [None]:
# Check if the two methods produce the same result
print("\nAre the two transposed matrices equal?")
print(np.array_equal(transposed_matrix, transposed_matrix_alt))
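
With a square matrix it is easy to miss that transposing also swaps the shape. The following is a minimal sketch with a non-square array (the name `rect` is only illustrative, it is not part of the notebook) showing the shape going from `(2, 3)` to `(3, 2)`, and that `.T` gives a view on the same data rather than a copy:

```python
import numpy as np

rect = np.array([[1, 2, 3],
                 [4, 5, 6]])

rect_t = rect.T
print(rect.shape)    # (2, 3)
print(rect_t.shape)  # (3, 2)

# The element at row i, column j moves to row j, column i
print(rect[0, 2] == rect_t[2, 0])  # True

# .T is a view on the same data: changing the original changes the transpose
rect[0, 0] = 100
print(rect_t[0, 0])  # 100
```

If you need a transposed array that is independent of the original, calling `.copy()` on the result is one way to get it.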

## Reshaping Arrays in NumPy

Reshaping arrays is a fundamental operation in data manipulation and analysis. It involves changing the dimensions or shape of an array while preserving its original data. NumPy, a powerful library for numerical computing in Python, provides a versatile function called `numpy.reshape()` for this purpose.

Reshaping allows you to convert 1D arrays into 2D arrays (and vice versa), rearrange the dimensions of multi-dimensional arrays, and efficiently prepare data for various data analysis and machine learning tasks.

In this section, we'll explore how to reshape arrays in NumPy, including transforming arrays between different dimensions and examining the properties of reshaped arrays.

Let's dive into the world of array reshaping and discover how it can be a valuable tool in your data analysis toolkit.

![Screenshot%202023-01-24%20at%2011.21.15.png](attachment:Screenshot%202023-01-24%20at%2011.21.15.png)

In [None]:
# Create a 1D array
array_1d = np.array([1, 2, 3, 4, 5, 6])

In [None]:
# Display the original 1D array
print("Original 1D Array:")
print(array_1d)

In [None]:
# Reshape the 1D array into a 2D array with 2 rows and 3 columns
array_2d = np.reshape(array_1d, (2, 3))

In [None]:
# Display the reshaped 2D array
print("\nReshaped 2D Array:")
print(array_2d)

In [None]:
# Check the shape of the reshaped array
print("\nShape of Reshaped Array:", array_2d.shape)

In [None]:
# Reshape the 2D array back to a 1D array
array_1d_reshaped = np.reshape(array_2d, -1)

In [None]:
# Display the reshaped 1D array
print("\nReshaped 1D Array:")
print(array_1d_reshaped)
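
The `-1` used above is not limited to flattening. As a small sketch (the array `values` is made up for illustration), `-1` can stand in for any single dimension and NumPy will infer it from the array size, while a shape whose total size does not match raises a `ValueError`:

```python
import numpy as np

values = np.arange(12)          # 0, 1, ..., 11

# -1 asks NumPy to infer that dimension from the array size
three_rows = values.reshape(3, -1)
print(three_rows.shape)         # (3, 4)

two_cols = values.reshape(-1, 2)
print(two_cols.shape)           # (6, 2)

# The total number of elements must stay the same
try:
    values.reshape(5, 3)        # 15 slots for 12 elements
except ValueError as error:
    print("Cannot reshape:", error)
```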

## Updating the value of a ndim array

In data analysis and numerical computing, it's often necessary to modify individual elements within multi-dimensional arrays. NumPy provides several ways to update the values of n-dimensional arrays efficiently.

In this section, we'll explore techniques for updating the values of n-dimensional arrays in NumPy. Whether you're cleaning, transforming, or otherwise manipulating data, knowing how to update array values is an essential skill.

Let's delve into the world of array value updates and discover how to perform these operations effectively using NumPy.


1. `overwriting`

In [None]:
# Create a sample 2D array
original_array = np.array([[1, 2, 3],
                            [4, 5, 6],
                            [7, 8, 9]])

print("Original Array:")
print(original_array)

In [None]:
# Update a specific element by overwriting it
# Work on a copy so that original_array keeps its values
updated_array = original_array.copy()
updated_array[1, 1] = 99

print("Updated Array:")
print(updated_array)

2. `using its index`

In [None]:
# Update a specific element using its index
# [0][-1] selects the first row and then its last element; [0, -1] is equivalent
updated_array = original_array.copy()
updated_array[0][-1] = 60

print("Updated Array:")
print(updated_array)
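
Since these cells call `.copy()`, it may help to see what it changes. A minimal sketch (the names `original`, `alias` and `independent` are illustrative only): plain assignment gives a second name for the same array, whereas `.copy()` creates an independent array.

```python
import numpy as np

original = np.array([[1, 2, 3],
                     [4, 5, 6]])

alias = original               # same array, second name
independent = original.copy()  # new array with its own data

alias[0, 0] = 99               # also changes `original`
independent[1, 1] = -5         # leaves `original` untouched

print(original[0, 0])          # 99
print(original[1, 1])          # 5
print(independent[1, 1])       # -5
```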

3. `using a boolean mask`

In [7]:
# Create a 2D NumPy array
array_new = np.array(([1, 2, 3], [2, 2, 0]))

In [8]:
array_new

array([[1, 2, 3],
       [2, 2, 0]])

We create a 2D NumPy array called `array_new` with two rows and three columns.

In [9]:
array_new == 2 # mask

array([[False,  True, False],
       [ True,  True, False]])

In [11]:
array_new[array_new==2]  # elements that are equal to 2

array([2, 2, 2])

In [12]:
# Spanish labels: "Aquí sí" = "yes here", "Aquí no" = "not here"
prueba = np.array((["Aquí no", "Aquí sí", "Aquí no"], ["Aquí sí", "Aquí sí", "Aquí no"]))

In [13]:
prueba[array_new==2]  # values at the positions where the mask is True

array(['Aquí sí', 'Aquí sí', 'Aquí sí'], dtype='<U7')

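We use boolean indexing to select the elements of `array_new` where the condition `array_new == 2` is true; the result is a 1D array containing every element equal to 2. The same mask can index any other array of the same shape, such as `prueba`, returning the values at the positions where the mask is `True`.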

In [14]:
array_new[array_new==2] = 20

In [15]:
array_new

array([[ 1, 20,  3],
       [20, 20,  0]])

We update the elements in `array_new` where the condition `array_new == 2` is true. In this case, we set those elements to 20.

In [16]:
array_new == 20 # mask

array([[False,  True, False],
       [ True,  True, False]])

We create a boolean mask by comparing each element of `array_new` with 20. This results in a boolean array where each element is `True` if the corresponding element in array_new is equal to 20, and `False` otherwise.
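
Masks can also be combined. The sketch below (the array `scores` is made up, reusing the values from above) uses `|` for "or" and `&` for "and"; each condition needs its own parentheses because these operators bind more tightly than the comparisons.

```python
import numpy as np

scores = np.array([[1, 20, 3],
                   [20, 20, 0]])

# True where the value is 0 or 3
zero_or_three = (scores == 0) | (scores == 3)
print(zero_or_three.tolist())   # [[False, False, True], [False, False, True]]

# True where the value is between 1 and 3 (inclusive)
in_range = (scores >= 1) & (scores <= 3)

# Update only the selected elements
scores[in_range] = -1
print(scores.tolist())          # [[-1, 20, -1], [20, 20, 0]]
```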

## Array comparison

NumPy arrays not only store data efficiently but also support a wide range of comparison operations. These operations allow you to compare elements within arrays, resulting in new arrays of Boolean values. These Boolean arrays serve as masks, enabling you to extract or manipulate elements based on specific conditions.

In [17]:
# Create two NumPy arrays
array1 = np.array([1, 2, 3, 4, 5])
array2 = np.array([3, 4, 5, 6, 7])

In [18]:
# Equality comparison
equal_result = (array1 == array2)
print("Equality Comparison Result:")
print(equal_result)

Equality Comparison Result:
[False False False False False]


In [20]:
# Inequality comparison
not_equal_result = (array1 != array2)
print("\nInequality Comparison Result:")
print(not_equal_result)


Inequality Comparison Result:
[ True  True  True  True  True]


In [21]:
# Greater than comparison
greater_than_result = (array1 > array2)
print("\nGreater Than Comparison Result:")
print(greater_than_result)


Greater Than Comparison Result:
[False False False False False]


In [22]:
# Less than comparison
less_than_result = (array1 < array2)
print("\nLess Than Comparison Result:")
print(less_than_result)


Less Than Comparison Result:
[ True  True  True  True  True]
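
Beyond element-wise results, a few helpers reduce a comparison to a single answer. This is a short sketch using the same `array1` and `array2` values as above:

```python
import numpy as np

array1 = np.array([1, 2, 3, 4, 5])
array2 = np.array([3, 4, 5, 6, 7])

# Comparing against a single number checks every element
print(array1 >= 3)                         # [False False  True  True  True]

# Reduce a Boolean array to one answer
print(np.any(array1 == array2))            # False
print(np.all(array1 < array2))             # True

# Count how many elements satisfy a condition (True counts as 1)
print(np.sum(array1 >= 3))                 # 3

# Compare whole arrays at once
print(np.array_equal(array1 + 2, array2))  # True
```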


## Array indexing and slicing
NumPy extends Python's list indexing notation using [] to multiple dimensions in an intuitive way. With NumPy, you can provide a comma-separated list of indices or ranges to select specific elements or subarrays, also known as slices, from a NumPy array. This powerful feature allows you to access and manipulate data within multi-dimensional arrays efficiently.

In [23]:
# Create a 2D NumPy array
arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
arr

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [24]:
# Accessing individual elements
element = arr[1, 2]  # Accesses the element in the second row and third column
print("Individual Element:", element)

Individual Element: 6


In [27]:
# Accessing individual elements
element = arr[1, -1]  # Accesses the element in the second row and last column
print("Individual Element:", element)

Individual Element: 6


In [28]:
# Accessing individual elements
element = arr[1, -2]  # Accesses the element in the second row and second-to-last column
print("Individual Element:", element)

Individual Element: 5


In [29]:
# Slicing subarrays
subarray = arr[0:2, 1:3]  # Selects rows 0 to 1 and columns 1 to 2 (the stop index is excluded)
print("\nSubarray:")
print(subarray)


Subarray:
[[2 3]
 [5 6]]


In [26]:
# Using step in slicing
step_array = arr[::2, ::2]  # Selects every second row and every second column (start:stop:step)
print("\nStep Slicing:")
print(step_array)


Step Slicing:
[[1 3]
 [7 9]]
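
Two details are worth trying out with the same `arr`: a bare `:` selects a whole row or column, and a slice is a view on the original data rather than a copy. A minimal sketch (the names `corner` and `safe_corner` are illustrative):

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

print(arr[0, :])     # first row: [1 2 3]
print(arr[:, -1])    # last column: [3 6 9]

# A slice is a view: writing through it changes `arr`
corner = arr[:2, :2]
corner[0, 0] = 100
print(arr[0, 0])     # 100

# Use .copy() when the original must stay unchanged
safe_corner = arr[:2, :2].copy()
safe_corner[0, 0] = -1
print(arr[0, 0])     # still 100
```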


## Methods of np.arrays

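NumPy arrays come equipped with a wide range of built-in methods that make performing various operations on arrays more convenient and efficient. These methods provide functionality for tasks such as mathematical operations, aggregation, statistics, and data manipulation. The examples below mostly use the function form (for example `np.max(arr)`); several of these, such as `max`, `min`, `sum`, `mean` and `reshape`, are also available as methods (`arr.max()`), while others, such as `np.sqrt`, `np.exp` and `np.unique`, exist only as functions.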

To explore the extensive list of methods available for NumPy arrays, you can refer to the official [NumPy documentation](https://numpy.org/doc/stable/reference/arrays.ndarray.html). This documentation provides detailed information about each method along with examples of their usage.


In [30]:
# Creating a NumPy array
arr = np.array([3, 1, 2, 5, 4])
arr

array([3, 1, 2, 5, 4])

In [31]:
# Method 1: Sorting the array
sorted_arr = np.sort(arr)
print("Sorted Array:", sorted_arr)

Sorted Array: [1 2 3 4 5]


In [32]:
# Method 2: Finding the maximum and minimum values
max_value = np.max(arr)
min_value = np.min(arr)
print("Max Value:", max_value)
print("Min Value:", min_value)

Max Value: 5
Min Value: 1


In [33]:
# Method 3: Calculating the sum and mean
sum_values = np.sum(arr)
mean_value = np.mean(arr)
print("Sum of Values:", sum_values)
print("Mean of Values:", mean_value)

Sum of Values: 15
Mean of Values: 3.0


In [34]:
# Method 4: Reshaping the array
reshaped_arr = arr.reshape(1, 5)
print("Reshaped Array:", reshaped_arr)

Reshaped Array: [[3 1 2 5 4]]


In [35]:
# Method 5: Calculating the square root
sqrt_arr = np.sqrt(arr)
print("Square Root of Array:", sqrt_arr)

Square Root of Array: [1.73205081 1.         1.41421356 2.23606798 2.        ]


In [36]:
# Method 6: Applying a mathematical function element-wise
exp_arr = np.exp(arr)
print("Exponential of Array:", exp_arr)

Exponential of Array: [ 20.08553692   2.71828183   7.3890561  148.4131591   54.59815003]


In [37]:
# Method 7: Finding unique values and their counts
unique_values, counts = np.unique(arr, return_counts=True)
print("Unique Values:", unique_values)
print("Counts:", counts)

Unique Values: [1 2 3 4 5]
Counts: [1 1 1 1 1]
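
As a complement, here is a short sketch of the method form of some of these operations, plus the `axis` argument on a 2D array (the array `grid` is made up for illustration):

```python
import numpy as np

arr = np.array([3, 1, 2, 5, 4])

# Many functions have an equivalent array method
print(arr.max(), arr.min())      # 5 1
print(arr.sum(), arr.mean())     # 15 3.0

# np.sort returns a sorted copy; arr.sort() sorts in place and returns None
sorted_copy = np.sort(arr)
print(arr)                       # [3 1 2 5 4]  (unchanged)
arr.sort()
print(arr)                       # [1 2 3 4 5]

# On 2D arrays, `axis` picks the direction of the aggregation
grid = np.array([[1, 2, 3],
                 [4, 5, 6]])
print(grid.sum(axis=0))          # [5 7 9]   column totals
print(grid.sum(axis=1))          # [ 6 15]   row totals
```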


## Business challenge: E-commerce Customer Data Analysis

You work for an e-commerce company, and your manager has asked you to analyze customer data to gain insights into customer behavior. Your task is to calculate some key metrics and generate reports based on the provided data.

### Customer Data

You are provided with a NumPy array named `customer_data`, which contains information about each customer. Each row in the array represents a customer, and the columns contain the following information:

1. Customer ID
2. Total Orders Placed
3. Total Amount Spent (in dollars)
4. Days Since Last Purchase
5. Email Subscribed (1 for subscribed, 0 for not subscribed)

Your goal is to create a set of functions to analyze this data and generate reports for your manager.

### Functions to Implement

You need to implement the following functions:

1. **`average_order_value(data)`**: Calculate the average order value (AOV) for all customers. AOV is calculated as the total amount spent divided by the total number of orders.

2. **`customer_lifetime_value(data)`**: Calculate the customer lifetime value (CLV) for each customer. CLV is calculated as the product of the average order value and the number of days since the last purchase.

3. **`high_value_customers(data, threshold)`**: Identify high-value customers whose CLV exceeds a given threshold.

4. **`email_subscription_rate(data)`**: Calculate the email subscription rate, which is the percentage of customers who have subscribed to emails.

5. **`inactive_customers(data, days)`**: Identify customers who have not made a purchase in a specified number of days.

### Instructions

You are provided with a NumPy array named `customer_data` containing customer information. Your task is to implement the functions mentioned above to perform the following tasks:

- Calculate the average order value (AOV) for all customers.

- Calculate the customer lifetime value (CLV) for each customer.

- Identify high-value customers whose CLV exceeds a given threshold. (Use 300 as the threshold.)

- Calculate the email subscription rate.

- Identify customers who have not made a purchase in a specified number of days. (Use 30 days.)

You should test your functions with the provided data and display the results in a clear and organized manner.

Feel free to use NumPy, lambda functions, list comprehension, and any other Python techniques you find suitable for this analysis.

**Note**: You can assume that the `customer_data` array is already loaded with data.


In [65]:
import numpy as np

# Sample customer data (replace with your actual data)
customer_data = np.array([
    [101, 5, 250.00, 30, 1],
    [102, 3, 150.00, 15, 0],
    [103, 7, 350.00, 45, 1],
    [104, 2, 100.00, 60, 0],
    [105, 4, 200.00, 10, 1]
])
# COLUMNS:
# 0 - Customer ID
# 1 - Total Orders Placed
# 2 - Total Amount Spent (in dollars)
# 3 - Days Since Last Purchase
# 4 - Email Subscribed (1 for subscribed, 0 for not subscribed)

def average_order_value(data):
  """AOV is calculated as the total amount spent divided by the total number of orders."""
  total_amount_spent = data[:,2]
  total_number_of_orders = data[:,1]
  # Element-wise division: one AOV per customer
  aov = total_amount_spent / total_number_of_orders
  return aov

def customer_lifetime_value(data):
  """CLV is calculated as the product of the average order value and the number of days since the last purchase."""
  days_since_last_purchase = data[:,3]
  clv = average_order_value(data) * days_since_last_purchase
  return clv

def high_value_customers(data, threshold):
  """Identify high-value customers whose CLV exceeds a given threshold."""
  # Compute the CLV with the function above.
  clv = customer_lifetime_value(data)
  # Boolean mask: True where the CLV exceeds the threshold.
  mask = clv > threshold
  # Keep the rows selected by the mask and return their customer IDs.
  high_value = data[mask][:,0]
  return high_value

def email_subscription_rate(data):
  """Calculate the email subscription rate, which is the percentage of customers who have subscribed to emails."""
  subscribed = data[:,4]
  # The column holds 1 (subscribed) or 0 (not subscribed), so its mean is the
  # fraction of subscribers. np.nanmean ignores missing values (NaN), if any.
  subscription_rate = np.nanmean(subscribed) * 100
  return subscription_rate

def inactive_customers(data, days):
  """Identify customers who have not made a purchase in a specified number of days."""
  # Select the column with the days since the last purchase.
  days_since_last_purchase = data[:,3]
  # Boolean mask: True where more than `days` days have passed since the last purchase.
  mask = days_since_last_purchase > days
  # Keep the rows selected by the mask and return their customer IDs.
  inactive = data[mask][:,0]
  return inactive

In [56]:
average_order_value(customer_data)

array([50., 50., 50., 50., 50.])

In [57]:
customer_lifetime_value(customer_data)

array([1500.,  750., 2250., 3000.,  500.])

In [58]:
high_value_customers(customer_data, threshold=300)

array([101., 102., 103., 104., 105.])

In [66]:
email_subscription_rate(customer_data)

60.0

In [61]:
inactive_customers(customer_data, days=30)

array([103., 104.])

In [67]:
customer_data = np.array([
    [101, 5, 250.00, 30, 1],
    [102, 3, 150.00, 15, 0],
    [103, 7, 350.00, 45, 1],
    [104, 2, 100.00, 60, 0],
    [105, 4, 200.00, 10, 1]
])

# Example: select every customer ID except the one in the middle
mask = [True, True, False, True, True]
customer_data[mask][:,0]

array([101., 102., 104., 105.])

## Fun things that Numpy allows us to do

Numpy is a versatile library that offers a wide range of functionalities beyond numerical computing. It opens up opportunities for various data manipulation tasks, including image processing. In this section, we'll explore some exciting things that you can do with Numpy, from analyzing images to other fun applications.

### Exploring Numpy's Versatility

To discover the myriad of functionalities that Numpy offers, you can use the `dir(object)` function, which displays all possible methods that can be applied to a Numpy object. This allows you to explore the extensive capabilities of Numpy for different data manipulation tasks.
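
As a quick sketch of that idea (the array `sample` and the filtering step are only illustrative), `dir()` can be combined with a list comprehension to hide the internal names that start with an underscore:

```python
import numpy as np

sample = np.array([1, 2, 3])

# Keep only the public names (those not starting with an underscore)
public_names = [name for name in dir(sample) if not name.startswith("_")]

print(len(public_names))
print(public_names[:10])

# Check whether a given method is available on the array
print("reshape" in public_names)   # True
```

Once you spot an interesting name, `help(sample.reshape)` (or the online documentation) explains what it does.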

### Image Processing with Pillow and scikit-image

In addition to numerical data, Numpy can be used to work with images through libraries like Pillow and scikit-image:

- [Pillow](https://pillow.readthedocs.io/en/stable/): Pillow, the friendly fork of the Python Imaging Library (PIL), provides comprehensive image processing capabilities. You can open an image with `Image.open("path")` and then manipulate it.

- [scikit-image](https://scikit-image.org/docs/stable/): Scikit-image extends your image processing capabilities in Python. With functions like `io.imread("path")`, you can read and process images efficiently for various applications.

In the upcoming sections, we'll delve into practical examples and explore the fun side of Numpy, demonstrating its versatility and usefulness in diverse data-related tasks.


We can process images with Numpy!

In [None]:
from skimage import io

In [None]:
photo = io.imread("../images/bcn.jpeg")

In [None]:
numpy_info(photo)

In [None]:
photo

In [None]:
# pixel: square w/color
# color: [r, g, b]
# [0, 255, 89]

![RGB cube](https://upload.wikimedia.org/wikipedia/commons/d/d6/RGB_color_cube.svg)

`exploring the picture numerically`

In [None]:
photo.shape[0] # FIRST DIMENSION: 800, height, y axis
photo.shape[1] # SECOND DIMENSION: 1200, width, x axis
photo.shape[2] # THIRD DIMENSION: 3, color channels (r, g, b) -> depth

print(f'First dimension is {photo.shape[0]}, the second dimension is {photo.shape[1]} and the third dimension is {photo.shape[2]}')

In [None]:
# Color grading
photo[0][0]

# RED: photo[0][0][0] is 227
# GREEN: photo[0][0][1] is 208
# BLUE: photo[0][0][2] is 201

pic = io.imshow(photo)

1. `reversing a picture`

In [None]:
# Do you remember we can reverse an array with array[::-1]? Let's try it
reversed_pic = io.imshow(photo[::-1])
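
To see exactly what the reversal does without loading a picture, here is a sketch on a tiny made-up array (`tiny` is not part of the notebook). Reversing the first axis flips the rows (upside down), while reversing the second axis mirrors the columns (left to right):

```python
import numpy as np

tiny = np.array([[1, 2, 3],
                 [4, 5, 6]])

upside_down = tiny[::-1]        # reverse the rows (vertical flip)
mirrored = tiny[:, ::-1]        # reverse the columns (horizontal flip)
rotated_180 = tiny[::-1, ::-1]  # both at once

print(upside_down.tolist())     # [[4, 5, 6], [1, 2, 3]]
print(mirrored.tolist())        # [[3, 2, 1], [6, 5, 4]]
print(rotated_180.tolist())     # [[6, 5, 4], [3, 2, 1]]
```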

2. `cropping a picture`

In [None]:
# We can also select a specific part
cathedral = io.imshow(photo[200:500, 400:800])

3. `changing color`

In [None]:
# Keep every row (Y axis) and every column (X axis),
# but only the blue channel (index 2 of the third dimension).
# The result is a 2D array holding the blue intensity of each pixel.
io.imshow(photo[:, :, 2]);

4. `np.where`

In [None]:
# np.where: how it works
# https://numpy.org/doc/stable/reference/generated/numpy.where.html

# np.where(condition, value_if_true, value_if_false) returns a new array
# (for example, filter_2); the original photo is not modified.

# For every element of photo:
#     if the value is greater than 100, the result holds 255
#     otherwise, the result holds 0

# np.where checks the condition on every single element
# (every channel of every pixel).
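
The same rule can be tried on a small made-up array before applying it to the photo. This sketch (the array `values` is illustrative) also shows that the replacement can be an array, which keeps the original values where the condition holds:

```python
import numpy as np

values = np.array([[ 30, 120, 250],
                   [100, 101,   0]])

# np.where(condition, value_if_true, value_if_false) works element by element
binary = np.where(values > 100, 255, 0)
print(binary.tolist())     # [[0, 255, 255], [0, 255, 0]]

# The replacement can also be an array: keep bright values, zero out the rest
kept = np.where(values > 100, values, 0)
print(kept.tolist())       # [[0, 120, 250], [0, 101, 0]]
```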


4.1. `brightness: high`

In [None]:
# Let's test with 200
brightness_high = np.where(photo > 100, 200, 0) # brightness
io.imshow(brightness_high);

In [None]:
# Let's test with 255
brightness_high = np.where(photo > 100, 255, 0) # brightness
io.imshow(brightness_high);

4.2. `brightness: low`

In [None]:
# Let's test with 50
brightness_low = np.where(photo > 70, 50, 30)
io.imshow(brightness_low);

![RGB cube](https://www.aseprite.org/docs/color-profile/rgb-cube.png)

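A digital color image is stored on our computer as a 3-dimensional array (height × width × color channels). Each pixel holds three values, one per channel (red, green, blue), and each value ranges from 0 to 255: `[0, 0, 0]` is black and `[255, 255, 255]` is white.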
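
To make this concrete, the sketch below builds a hypothetical 2×2 image by hand (the pixel values are made up) and inspects it the same way as the photo. Averaging the three channels is just one simple way to get a grayscale value, used here as an assumption for illustration:

```python
import numpy as np

# A 2x2 image: shape is (height, width, channels); uint8 holds 0..255
tiny_image = np.array([[[0, 0, 0],   [255, 255, 255]],   # black, white
                       [[255, 0, 0], [0, 0, 255]]],      # red, blue
                      dtype=np.uint8)

print(tiny_image.shape)        # (2, 2, 3)
print(tiny_image[1, 0])        # [255   0   0]  (the red pixel)

# One channel at a time: a 2D array with one value per pixel
red_channel = tiny_image[:, :, 0]
print(red_channel.tolist())    # [[0, 255], [255, 0]]

# A simple grayscale: average the three channels of each pixel
gray = tiny_image.mean(axis=2)
print(gray.tolist())           # [[0.0, 255.0], [85.0, 85.0]]
```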

## Some methods:

- [Multiply](https://numpy.org/doc/stable/reference/generated/numpy.multiply.html)
- [Reshape](https://numpy.org/doc/stable/reference/generated/numpy.reshape.html)
- [Transpose](https://numpy.org/doc/stable/reference/generated/numpy.transpose.html)

## Summary
It's your turn. What have we learned today?


## Further materials
- [NumPy Cheatsheet](https://s3.amazonaws.com/assets.datacamp.com/blog_assets/Numpy_Python_Cheat_Sheet.pdf)
- [Master Numpy](https://medium.com/analytics-vidhya/master-numpy-in-45-minutes-74b2460ecb00)
- [Numpy Tricks](https://github.com/patidarparas13/Numpy-Tricks/blob/master/Numpy%2BTricks(Zero%2Bto%2BHero).ipynb)
- [101 Numpy exercises](https://www.machinelearningplus.com/python/101-numpy-exercises-python/)

## Solutions

In [None]:
# Business challenge: Forex Data Analysis challenge
import numpy as np

# Sample Forex data (you can replace this with your dataset)
dates = np.array(['2023-09-01', '2023-09-02', '2023-09-03', '2023-09-04', '2023-09-05'])
exchange_rates = np.array([1.15, 1.16, 1.14, 1.17, 1.15])

# 1. Calculating the average exchange rate for the entire month
average_rate = np.mean(exchange_rates)
print(f"Average Exchange Rate for the Month: {average_rate:.2f}")

# 2. Identifying the date with the highest exchange rate
max_rate_index = np.argmax(exchange_rates)
max_rate_date = dates[max_rate_index]
max_rate = exchange_rates[max_rate_index]
print(f"Highest Exchange Rate Date: {max_rate_date}, Rate: {max_rate:.2f}")

# 3. Determining the total trading volume for the month (assuming equal volume each day)
# NOTE: the sample data has no volume column, so the sum of the rates is only a placeholder.
# Replace exchange_rates with the actual daily trading volumes when they are available.
total_volume = np.sum(exchange_rates)
print(f"Total Trading Volume for the Month: {total_volume:.2f}")

# 4. Detecting significant fluctuations or anomalies in the exchange rate (custom logic)
# You can implement custom logic here to detect fluctuations or anomalies based on your criteria.

# For example, identifying days where the rate is significantly lower than the average:
anomalies = dates[exchange_rates < (average_rate - 0.1)]  # Adjust the threshold as needed
print("Dates with Significant Rate Anomalies:")
for date in anomalies:
    print(date)

In [None]:
# Business challenge: E-commerce Customer Data Analysis
import numpy as np

# Sample customer data (replace with your actual data)
customer_data = np.array([
    [101, 5, 250.00, 30, 1],
    [102, 3, 150.00, 15, 0],
    [103, 7, 350.00, 45, 1],
    [104, 2, 100.00, 60, 0],
    [105, 4, 200.00, 10, 1]
])

# Function to calculate the average order value (AOV)
def average_order_value(data):
    total_orders = data[:, 1].sum()
    total_amount_spent = data[:, 2].sum()
    aov = total_amount_spent / total_orders
    return aov

# Function to calculate the customer lifetime value (CLV)
# CLV = average order value * days since the last purchase (as defined in the challenge)
def customer_lifetime_value(data):
    clv = average_order_value(data) * data[:, 3]
    return clv

# Function to identify high-value customers above a threshold CLV
def high_value_customers(data, threshold):
    clv = customer_lifetime_value(data)
    high_value = data[clv > threshold]
    return high_value

# Function to calculate the email subscription rate
def email_subscription_rate(data):
    subscribed_count = data[:, 4].sum()
    total_customers = len(data)
    subscription_rate = (subscribed_count / total_customers) * 100
    return subscription_rate

# Function to identify inactive customers based on days since last purchase
def inactive_customers(data, days):
    inactive = data[data[:, 3] > days]
    return inactive

# Test the functions and display results
print(f"Average Order Value (AOV): ${average_order_value(customer_data):.2f}")
print("\nCustomer Lifetime Value (CLV):")
print(customer_lifetime_value(customer_data))
print("\nHigh-Value Customers (CLV > $300):")
print(high_value_customers(customer_data, 300))
print(f"\nEmail Subscription Rate: {email_subscription_rate(customer_data):.1f}%")
print("\nInactive Customers (No purchase in > 30 days):")
print(inactive_customers(customer_data, 30))