
In [None]:
import numpy as np

# Numpy Practice III

## Q1. Product Sales

A company has sales data (in $) for several of its products across multiple years. The sales data for each product is stored in a separate 1D array. Suppose that the company has data for 4 products for the years 2016-2022 inclusive:

In [None]:
a = np.random.randint(low=10, high=100, size=7)
b = np.random.randint(low=20, high=90, size=7)
c = np.random.randint(low=30, high=70, size=7)
d = np.random.randint(low=10, high=80, size=7)

Please answer the following questions using the given data:

1.   Give a list of all products sorted in descending order by their average yearly sales
2.   The average yearly sales of products a & c from years 2018-2020 inclusive
3.   For each product, the difference between the maximum sales revenue and minimum sales revenue
4.   The percentage difference in sales for all products across all years
5.   The company wants to calculate the average sales for each year. However, the company prefers some products over others. Therefore, it wants to take a *weighted* average. The weights for each product are given in a dictionary. Note that the weights sum to 1. Calculate the *weighted average* sales for all years




In [None]:
weights = {
    'a': 0.2,
    'b': 0.1,
    'c': 0.4,
    'd': 0.3
}

In [None]:
sales = np.vstack((a, b, c, d))
sales.shape

(4, 7)

In [None]:
sales

array([[35, 46, 20, 86, 93, 16, 86],
       [82, 21, 44, 40, 27, 27, 28],
       [69, 57, 32, 63, 56, 39, 40],
       [30, 33, 64, 50, 10, 17, 52]])
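
The five questions can be approached with axis-wise reductions on the stacked `sales` matrix. The following is a minimal sketch, not a definitive solution: it hard-codes the sample matrix printed above so the numbers are reproducible, and it assumes that "percentage difference" in question 4 means the year-over-year percentage change. The names `names`, `years`, `ranking`, `mean_a_c`, `sales_range`, `pct_change` and `weighted_avg` are illustrative.

```python
import numpy as np

# Sample matrix copied from the output above: one row per product
# (a, b, c, d), one column per year (2016 to 2022).
sales = np.array([[35, 46, 20, 86, 93, 16, 86],
                  [82, 21, 44, 40, 27, 27, 28],
                  [69, 57, 32, 63, 56, 39, 40],
                  [30, 33, 64, 50, 10, 17, 52]])
names = ['a', 'b', 'c', 'd']
years = np.arange(2016, 2023)
weights = {'a': 0.2, 'b': 0.1, 'c': 0.4, 'd': 0.3}

# 1. Products sorted by average yearly sales, highest first.
mean_sales = sales.mean(axis=1)
ranking = [names[i] for i in np.argsort(mean_sales)[::-1]]

# 2. Average sales of products a and c over 2018-2020 inclusive.
year_mask = (years >= 2018) & (years <= 2020)
mean_a_c = sales[[0, 2]][:, year_mask].mean(axis=1)

# 3. Maximum minus minimum sales for each product.
sales_range = sales.max(axis=1) - sales.min(axis=1)

# 4. Assumed interpretation: percentage change from each year to the next.
pct_change = np.diff(sales, axis=1) / sales[:, :-1] * 100

# 5. Weighted average across products for each year.
w = np.array([weights[name] for name in names])
weighted_avg = w @ sales  # shape (7,), one value per year

print(ranking)
print(mean_a_c)
print(sales_range)
print(pct_change.shape)
print(weighted_avg)
```

Because the weights sum to 1, the matrix product `w @ sales` is already the weighted average; `np.average(sales, axis=0, weights=w)` should give the same values.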

## Q2. Offsets
You have a 3D NumPy array `data` (named `x` in the code below) with shape (2, 3, 1) and a 1D NumPy array `offset` with shape (3,).

Add `offset` to each slice of `data` along the second axis (axis 1) using broadcasting. What is the shape of the result?

In [None]:
x = np.random.randint(0, 100, size=(2, 3, 1))
offset = np.array([61, 93, 28])



print(offset[None, :, None].shape)

y = offset[None, :, None] + x
y

(1, 3, 1)


array([[[ 88],
        [154],
        [ 35]],

       [[136],
        [161],
        [112]]])
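
To see why the reshape matters, it may help to compare the result with and without it. The sketch below uses a small deterministic stand-in array (an assumption, in place of the random data) and the names `naive` and `aligned` are illustrative.

```python
import numpy as np

x = np.arange(6).reshape(2, 3, 1)   # deterministic stand-in for the data
offset = np.array([61, 93, 28])

# Broadcasting aligns shapes from the right: (2, 3, 1) against (3,).
# The offset lines up with the LAST axis, so the result grows to (2, 3, 3).
naive = x + offset

# Reshaping offset to (1, 3, 1) lines it up with the second axis instead.
aligned = x + offset[None, :, None]

print(naive.shape)    # (2, 3, 3)
print(aligned.shape)  # (2, 3, 1)
```

With the reshape, the result keeps the shape of the data, (2, 3, 1), and each position along the second axis receives its own offset.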

## Q3. Radius

Produce a 3-dimensional array containing the distance (radius) from the origin of every point in an (X, Y, Z) grid of shape (100, 100, 100), with X, Y, and Z each ranging from -10 to 10:

$R = \sqrt{X^2 + Y^2 + Z^2}$ where $R$ is the radius from the origin

In [None]:
X = np.linspace(-10, 10, 100)
Y = np.linspace(-10, 10, 100)
Z = np.linspace(-10, 10, 100)

In [None]:
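# Place X, Y and Z on separate axes so that broadcasting covers every (x, y, z) combination
R = np.sqrt(X[:, None, None] ** 2 + Y[None, :, None] ** 2 + Z[None, None, :] ** 2)
R.shape

(100, 100, 100)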
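
One way to check a broadcast-based radius computation is to compare it against explicit coordinate grids. The sketch below assumes all three axes share the same sample points and uses `np.meshgrid` with `indexing='ij'` so that axis 0 corresponds to X; the names `axis`, `R_broadcast` and `R_grid` are illustrative.

```python
import numpy as np

axis = np.linspace(-10, 10, 100)

# Broadcasting version: each coordinate occupies its own axis.
R_broadcast = np.sqrt(axis[:, None, None] ** 2
                      + axis[None, :, None] ** 2
                      + axis[None, None, :] ** 2)

# Explicit version: build full (100, 100, 100) coordinate grids first.
X, Y, Z = np.meshgrid(axis, axis, axis, indexing='ij')
R_grid = np.sqrt(X ** 2 + Y ** 2 + Z ** 2)

print(R_broadcast.shape)                  # (100, 100, 100)
print(np.allclose(R_broadcast, R_grid))   # True
```

The broadcasting version avoids materializing three full coordinate grids, which is the main reason to prefer it for larger grids.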

## Q4. Basic Normalization

Consider the shape-(2, 3, 4) array `x` defined below.

Normalize `x` so that each of its rows, within each sheet, sums to 1. Use the sequential function `np.sum`, called only once, together with broadcast division.

In [None]:
x = np.array([[[ 0,  1,  2,  3],
               [ 4,  5,  6,  7],
               [ 8,  9, 10, 11]],
              [[12, 13, 14, 15],
               [16, 17, 18, 19],
               [20, 21, 22, 23]]])

In [None]:
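sums = x.sum(axis=2)  # row sums, shape (2, 3)
normalized_x = x / sums[..., np.newaxis]  # (2, 3, 4) divided by (2, 3, 1)

# Check: every row of every sheet now sums to 1
normalized_x.sum(axis=2, keepdims=True)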

array([[[1.],
        [1.],
        [1.]],

       [[1.],
        [1.],
        [1.]]])
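
The trailing size-1 axis on the row sums is what makes the division broadcast. As a hedged illustration, the sketch below rebuilds the same values with `np.arange` (an assumption made for brevity) and shows that dividing by the unreshaped sums fails; `mismatch_raised` and `normalized` are illustrative names.

```python
import numpy as np

x = np.arange(24).reshape(2, 3, 4)   # same values as the array above
sums = x.sum(axis=2)                 # shape (2, 3)

# (2, 3, 4) against (2, 3): the trailing sizes 4 and 3 do not match,
# so NumPy refuses to broadcast.
try:
    x / sums
    mismatch_raised = False
except ValueError:
    mismatch_raised = True

# Adding a trailing axis gives (2, 3, 1), which broadcasts across each row.
normalized = x / sums[..., np.newaxis]

print(mismatch_raised)    # True
print(normalized[0, 0])   # first row [0, 1, 2, 3] divided by its sum, 6
```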

## Q5. Pairwise differences

Suppose you have the following data collected from multiple sensors:

`a_1 = np.random.uniform(0, 90, size=10)` representing the recorded humidity values at location A for the first 10 years

`a_2 = np.random.uniform(0, 90, size=5)` representing the recorded humidity values at location A for the last 5 years

`b_1 = np.random.uniform(10, 100, size=10)` representing the recorded humidity values at location B for the first 10 years

`b_2 = np.random.uniform(10, 100, size=5)` representing the recorded humidity values at location B for the last 5 years


Please calculate the *root sum of squared pairwise differences* in the recorded humidity values between locations A and B over all 15 years.

Root sum of squared pairwise differences $= \sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2 + \ldots + (x_N - y_N)^2}$, where $x_i$ and $y_i$ are the values recorded at locations A and B in year $i$, and $N$ is the number of years.

In [None]:
a_1 = np.random.uniform(0, 90, size=10)
a_2 = np.random.uniform(0, 90, size=5)
b_1 = np.random.uniform(10, 100, size=10)
b_2 = np.random.uniform(10, 100, size=5)


a = np.hstack((a_1, a_2))
b = np.hstack((b_1, b_2))

diff = np.sqrt(np.sum((a - b) ** 2))
diff

130.48001727345212
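
The root sum of squared differences is the Euclidean (L2) norm of the difference vector, so it can be cross-checked with `np.linalg.norm`. The sketch below uses a seeded generator so the run is repeatable; the seed and the names `rng`, `manual` and `via_norm` are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, only for repeatability

# Join the first 10 and last 5 years for each location.
a = np.concatenate((rng.uniform(0, 90, size=10), rng.uniform(0, 90, size=5)))
b = np.concatenate((rng.uniform(10, 100, size=10), rng.uniform(10, 100, size=5)))

manual = np.sqrt(np.sum((a - b) ** 2))   # formula written out
via_norm = np.linalg.norm(a - b)         # same quantity as an L2 norm

print(np.isclose(manual, via_norm))      # True
```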

## Q6. Image Normalization

A digital image is simply an array of numbers, which instructs a grid of pixels on a monitor to shine light of specific colors, according to the numerical values in that array.

An RGB image can thus be stored as a 3D NumPy array of shape (V, H, 3). V is the number of pixels along the vertical direction, H is the number of pixels along the horizontal direction, and the size-3 dimension stores the red, green, and blue color values for a given pixel. Thus a shape-(32, 32, 3) array would be a 32x32 RGB image.

You often work with a collection of images. Suppose we want to store N images in a single array; thus we now consider a 4D shape-(N, V, H, 3) array. For the sake of convenience, let’s simply generate a 4D-array of random numbers as a placeholder for real image data. We will generate 500, 48x48 RGB images:

`images = np.random.rand(500, 48, 48, 3)`

Using the function `np.max` and broadcasting, normalize images such that the largest value within each color-channel of each image is 1.

In [None]:
images = np.random.rand(500, 48, 48, 3)

In [None]:
# Maximum over the vertical and horizontal axes, per image and per color channel
images_max_channels = np.max(images, axis=(1, 2))  # shape (500, 3)
images_max_channels.shape

(500, 3)

In [None]:
normalized_images = images / images_max_channels[:, np.newaxis, np.newaxis, :]

In [None]:
# Equivalent to indexing with np.newaxis: expand (500, 3) to (500, 1, 1, 3)
normalized_images = images / np.expand_dims(images_max_channels, axis=(1, 2))

In [None]:
normalized_images.max(axis=(1, 2))

array([[1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.],
       ...,
       [1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.]])
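
Another option is to let `np.max` keep the reduced axes as size-1 dimensions, so the result broadcasts against the images without any reshaping. The sketch below uses a smaller batch of 8 images (an assumption, to keep it quick) and illustrative names `channel_max` and `normalized`.

```python
import numpy as np

rng = np.random.default_rng(0)          # arbitrary seed
images = rng.random((8, 48, 48, 3))     # small stand-in batch of RGB images

# keepdims=True leaves the reduced axes in place with size 1.
channel_max = images.max(axis=(1, 2), keepdims=True)   # shape (8, 1, 1, 3)
normalized = images / channel_max

print(channel_max.shape)                                # (8, 1, 1, 3)
print(np.allclose(normalized.max(axis=(1, 2)), 1.0))    # True
```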