<a href="https://colab.research.google.com/github/Manish927/Algorithm/blob/patch1/NumPy_Array_Broadcasting.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [2]:
# Import NumPy, which provides the arrays and broadcasting used below
import numpy as np


In [12]:
# Sample data to feed into a machine learning algorithm.
# Columns: age, income, years of service, gender, and whether the person owns a house.
# The algorithm works best when the numeric features are on a comparable scale, roughly in the range [-1, 1].
data = np.array([[25, 50000, 4.5, 1, 0],
                [45, 80000, 12.2, 0, 1],
                [35, 60000, 5, 1, 1]])
data

array([[2.50e+01, 5.00e+04, 4.50e+00, 1.00e+00, 0.00e+00],
       [4.50e+01, 8.00e+04, 1.22e+01, 0.00e+00, 1.00e+00],
       [3.50e+01, 6.00e+04, 5.00e+00, 1.00e+00, 1.00e+00]])

In [13]:
# We want to scale age, income and years of service, but not gender or house ownership.
# Standardise each of those columns: subtract the column mean and divide by the column standard deviation.

# axis=0 gives one mean and one standard deviation per column; without it NumPy pools
# all three columns into a single mean/std, and income swamps the other two features.
data[:, :3] = (data[:, :3] - np.mean(data[:, :3], axis=0)) / np.std(data[:, :3], axis=0)
data


array([[-1.22474487, -1.06904497, -0.77698073,  1.        ,  0.        ],
       [ 1.22474487,  1.33630621,  1.41183083,  0.        ,  1.        ],
       [ 0.        , -0.26726124, -0.63485011,  1.        ,  1.        ]])
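
Per-column standardisation depends on broadcasting: reducing with `axis=0` yields one mean and one standard deviation per column, and NumPy stretches those length-3 arrays across every row. The following is a minimal sketch on a fresh copy of the three numeric columns; the names `raw`, `col_mean`, `col_std` and `scaled` are illustrative and not part of the notebook.

```python
import numpy as np

# Fresh copy of the three numeric columns (age, income, years of service)
raw = np.array([[25, 50000, 4.5],
                [45, 80000, 12.2],
                [35, 60000, 5.0]])

col_mean = raw.mean(axis=0)   # shape (3,): one mean per column
col_std = raw.std(axis=0)     # shape (3,): one population std per column

# (3, 3) - (3,) broadcasts the per-column statistics across every row
scaled = (raw - col_mean) / col_std

print(scaled.mean(axis=0))  # approximately 0 for each column
print(scaled.std(axis=0))   # approximately 1 for each column
```

After this transformation each column has mean close to 0 and standard deviation close to 1, so no single feature dominates because of its units.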

Imagine you have a list of prices for various products, and you want to calculate the final price of each product in different cities. Each city has its own tax rate.
- You have five products A, B, C, D, E priced at 100, 250, 500, 750 and 1000 respectively.
- There are three cities X, Y, Z with tax rates of 5%, 7.5% and 10% respectively.

In [14]:
product_prices = np.array([100, 250, 500, 750, 1000])
tax_rates = np.array([0.05, 0.075, 0.1]) + 1
tax_rates

array([1.05 , 1.075, 1.1  ])

In [17]:
final_prices = product_prices[:, np.newaxis] * tax_rates  # shape (5, 3): 5 rows (products) x 3 columns (cities)
final_prices

array([[ 105.  ,  107.5 ,  110.  ],
       [ 262.5 ,  268.75,  275.  ],
       [ 525.  ,  537.5 ,  550.  ],
       [ 787.5 ,  806.25,  825.  ],
       [1050.  , 1075.  , 1100.  ]])
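
The multiplication works because `product_prices[:, np.newaxis]` has shape `(5, 1)` and `tax_rates` has shape `(3,)`: broadcasting aligns the trailing dimensions and stretches any dimension of size 1, which gives a `(5, 3)` result. The sketch below checks the shapes and compares the result with `np.outer`, which builds the same table for 1-D inputs. The names `column` and `via_outer` are illustrative, and `np.broadcast_shapes` assumes NumPy 1.20 or later.

```python
import numpy as np

product_prices = np.array([100, 250, 500, 750, 1000])
tax_rates = np.array([0.05, 0.075, 0.1]) + 1

column = product_prices[:, np.newaxis]   # shape (5, 1)
print(column.shape, tax_rates.shape)     # (5, 1) (3,)

# np.broadcast_shapes reports the result shape without doing the arithmetic
print(np.broadcast_shapes(column.shape, tax_rates.shape))  # (5, 3)

final_prices = column * tax_rates
via_outer = np.outer(product_prices, tax_rates)
print(np.allclose(final_prices, via_outer))  # True
```

Without `np.newaxis` the shapes would be `(5,)` and `(3,)`, which are not compatible, and NumPy would raise a `ValueError`.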

In [18]:
# Average price per city after tax: axis=0 averages over the products (rows)
average_prices = np.mean(final_prices, axis=0)
average_prices

array([546., 559., 572.])

In [19]:
# Average price of each product after tax: axis=1 averages over the cities (columns)
average_prices = np.mean(final_prices, axis=1)
average_prices

array([ 107.5 ,  268.75,  537.5 ,  806.25, 1075.  ])
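
The `axis` argument names the dimension that is collapsed: `axis=0` removes the product rows and leaves one value per city, while `axis=1` removes the city columns and leaves one value per product. As a hedged sketch, the code below rebuilds the price table and also uses `keepdims=True`, which keeps the reduced axis with length 1 so the result broadcasts back against the full table. The names `per_city`, `per_product`, `per_product_col` and `relative` are illustrative.

```python
import numpy as np

product_prices = np.array([100, 250, 500, 750, 1000])
tax_rates = np.array([0.05, 0.075, 0.1]) + 1
final_prices = product_prices[:, np.newaxis] * tax_rates  # shape (5, 3)

per_city = final_prices.mean(axis=0)      # shape (3,): rows (products) collapsed
per_product = final_prices.mean(axis=1)   # shape (5,): columns (cities) collapsed
print(per_city.shape, per_product.shape)  # (3,) (5,)

# keepdims=True keeps the reduced axis with length 1, so the (5, 1) result
# broadcasts against the original (5, 3) table
per_product_col = final_prices.mean(axis=1, keepdims=True)
relative = final_prices / per_product_col  # each price relative to its product's average
print(relative.shape)  # (5, 3)
```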

In [25]:
# Which city has the highest average price after tax? np.argmax returns the index of that city (0 = X, 1 = Y, 2 = Z)
city_with_highest_average = np.argmax(final_prices.mean(axis = 0))
city_with_highest_average

np.int64(2)

In [26]:
# np.max returns the highest average price itself rather than the city's index
highest_average_price = np.max(final_prices.mean(axis=0))
highest_average_price

np.float64(572.0)
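
An index on its own is easy to misread, so it can help to map it back to a city label. This is a minimal sketch that assumes the cities are listed in the same order as the tax rates (X, Y, Z, as in the problem statement); the names `cities`, `city_averages` and `best` are illustrative.

```python
import numpy as np

cities = ["X", "Y", "Z"]  # assumed to match the order of tax_rates
product_prices = np.array([100, 250, 500, 750, 1000])
tax_rates = np.array([0.05, 0.075, 0.1]) + 1
final_prices = product_prices[:, np.newaxis] * tax_rates

city_averages = final_prices.mean(axis=0)
best = int(np.argmax(city_averages))  # index of the highest average

print(cities[best])         # Z
print(city_averages[best])  # the highest average price after tax
```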