## Business Scenario

MallCo, a bustling shopping hub, has noticed a decline in customer retention despite steady footfall. To address this, the management team wants to unlock actionable insights from their customer data to improve loyalty programs and identify high-value customers.

The dataset includes the following information about 200 customers:
CustomerID: Unique identifier for each customer.
Gender: Male (0) or Female (1).
Age: Customer's age in years.
Annual Income (k$): Annual income in thousands of dollars.
Spending Score (1-100): A score that indicates spending habits and mall engagement.
Your role is to analyze the data, answer critical business questions, and recommend strategies for improving customer retention.


In [2]:
import numpy as np

## 1. Loading the dataset


In [3]:
# Load the CSV with the default dtype (float), skipping the header row
dataset = 'C:/Users/anjal/Downloads/Mall_Customers.csv'
data = np.genfromtxt(dataset, delimiter=',', skip_header=1)
data

array([[  1.,  nan,  19.,  15.,  39.],
       [  2.,  nan,  21.,  15.,  81.],
       [  3.,  nan,  20.,  16.,   6.],
       [  4.,  nan,  23.,  16.,  77.],
       [  5.,  nan,  31.,  17.,  40.],
       [  6.,  nan,  22.,  17.,  76.],
       [  7.,  nan,  35.,  18.,   6.],
       [  8.,  nan,  23.,  18.,  94.],
       [  9.,  nan,  64.,  19.,   3.],
       [ 10.,  nan,  30.,  19.,  72.],
       [ 11.,  nan,  67.,  19.,  14.],
       [ 12.,  nan,  35.,  19.,  99.],
       [ 13.,  nan,  58.,  20.,  15.],
       [ 14.,  nan,  24.,  20.,  77.],
       [ 15.,  nan,  37.,  20.,  13.],
       [ 16.,  nan,  22.,  20.,  79.],
       [ 17.,  nan,  35.,  21.,  35.],
       [ 18.,  nan,  20.,  21.,  66.],
       [ 19.,  nan,  52.,  23.,  29.],
       [ 20.,  nan,  35.,  23.,  98.],
       [ 21.,  nan,  35.,  24.,  35.],
       [ 22.,  nan,  25.,  24.,  73.],
       [ 23.,  nan,  46.,  25.,   5.],
       [ 24.,  nan,  31.,  25.,  73.],
       [ 25.,  nan,  54.,  28.,  14.],
       [ 26.,  nan,  29.,

### You will notice the Gender column is all NaNs. Why is that? How can you fix it? <br>
All elements in a plain NumPy array must be of the same type. By default, np.genfromtxt reads every column as float, 
and because the strings "Male" and "Female" cannot be converted to float, the Gender column shows NaN.<br> 

We can explicitly specify the data type for each column using the dtype parameter: <br> 
#### dtype = [('CustomerID', int), ('Gender', 'U10'), ('Age', int), ('Annual Income (k$)', int), ('Spending Score (1-100)', int)] <br>

#### dtype=None
With dtype=None, np.genfromtxt scans the data and determines the most appropriate type for each column from the values it encounters, i.e.
numerical columns are inferred as integer/float types and text columns are inferred as string types.<br> 
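
As a small illustration of `dtype=None`, the sketch below parses a three-row sample held in memory rather than the real file. The rows mirror the first customers shown in this notebook; the names `sample` and `sample_data` and the `encoding="utf-8"` choice are assumptions made for the example.

```python
import io
import numpy as np

# Hypothetical in-memory sample with the same layout as Mall_Customers.csv
sample = io.StringIO(
    "CustomerID,Gender,Age,Annual Income (k$),Spending Score (1-100)\n"
    "1,Male,19,15,39\n"
    "2,Male,21,15,81\n"
    "3,Female,20,16,6\n"
)

# dtype=None infers one type per column; names=True reads field names from the header
sample_data = np.genfromtxt(
    sample, delimiter=",", dtype=None, names=True, encoding="utf-8"
)

print(sample_data.dtype)       # one entry per column, each with its own inferred type
print(sample_data["Gender"])   # the text column survives as strings instead of NaN
```

The result is a one-dimensional structured array, so columns are accessed by field name rather than by column index.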

#### dtype='<i4'
Reads every column as a 32-bit integer. A converter maps the "Gender" strings to integers first, so that this column is stored as integers like the others for consistency, efficiency and convenience.

In [4]:
# convert Male to 0 and Female to 1
def gender_converter(value):
    if value == 'Male':
        return 0
    elif value == 'Female':
        return 1

data = np.genfromtxt(dataset, delimiter=",", dtype='<i4', encoding=None, names=True, converters={1: gender_converter})
data[:10], data.shape

(array([( 1, 0, 19, 15, 39), ( 2, 0, 21, 15, 81), ( 3, 1, 20, 16,  6),
        ( 4, 1, 23, 16, 77), ( 5, 1, 31, 17, 40), ( 6, 1, 22, 17, 76),
        ( 7, 1, 35, 18,  6), ( 8, 1, 23, 18, 94), ( 9, 0, 64, 19,  3),
        (10, 1, 30, 19, 72)],
       dtype=[('CustomerID', '<i4'), ('Gender', '<i4'), ('Age', '<i4'), ('Annual_Income_k', '<i4'), ('Spending_Score_1100', '<i4')]),
 (200,))

In [5]:
# Convert to a two-dimensional NumPy array
mall_data = np.array([list(item) for item in data])
mall_data.shape

(200, 5)
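
The list comprehension above works well at this size. As an alternative sketch, NumPy's `numpy.lib.recfunctions.structured_to_unstructured` performs the same conversion without a Python-level loop. The three records below are copied from the output above; the names `records` and `plain` are hypothetical.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

# Three records with the same field layout as the structured array shown above
records = np.array(
    [(1, 0, 19, 15, 39), (2, 0, 21, 15, 81), (3, 1, 20, 16, 6)],
    dtype=[
        ("CustomerID", "<i4"),
        ("Gender", "<i4"),
        ("Age", "<i4"),
        ("Annual_Income_k", "<i4"),
        ("Spending_Score_1100", "<i4"),
    ],
)

# Convert the 1-D structured array into a plain 2-D array (one column per field)
plain = rfn.structured_to_unstructured(records)
print(plain.shape, plain.dtype)
```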

## 2. Understand Customer Demographics
#### Calculate the average Age, Annual Income, and Spending Score to understand the typical customer profile.
#### Analyze customer distribution by gender. Does one gender tend to spend more or earn more on average?

In [6]:
genders = mall_data[:, 1] 
genders

array([0, 0, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 0, 1, 0, 0,
       1, 0, 1, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1,
       1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 1, 0, 0,
       1, 1, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 1, 1, 0, 1, 1,
       1, 1, 1, 0, 0, 1, 1, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0,
       0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0,
       1, 1, 0, 1, 1, 0, 0, 1, 1, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1,
       1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 1,
       0, 0, 0, 0, 1, 1, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1, 0,
       0, 0])

In [7]:
males = np.sum(genders == 0)
females = np.sum(genders == 1)
males, females

(88, 112)

In [8]:
ages = mall_data[:,2]
ages

array([19, 21, 20, 23, 31, 22, 35, 23, 64, 30, 67, 35, 58, 24, 37, 22, 35,
       20, 52, 35, 35, 25, 46, 31, 54, 29, 45, 35, 40, 23, 60, 21, 53, 18,
       49, 21, 42, 30, 36, 20, 65, 24, 48, 31, 49, 24, 50, 27, 29, 31, 49,
       33, 31, 59, 50, 47, 51, 69, 27, 53, 70, 19, 67, 54, 63, 18, 43, 68,
       19, 32, 70, 47, 60, 60, 59, 26, 45, 40, 23, 49, 57, 38, 67, 46, 21,
       48, 55, 22, 34, 50, 68, 18, 48, 40, 32, 24, 47, 27, 48, 20, 23, 49,
       67, 26, 49, 21, 66, 54, 68, 66, 65, 19, 38, 19, 18, 19, 63, 49, 51,
       50, 27, 38, 40, 39, 23, 31, 43, 40, 59, 38, 47, 39, 25, 31, 20, 29,
       44, 32, 19, 35, 57, 32, 28, 32, 25, 28, 48, 32, 34, 34, 43, 39, 44,
       38, 47, 27, 37, 30, 34, 30, 56, 29, 19, 31, 50, 36, 42, 33, 36, 32,
       40, 28, 36, 36, 52, 30, 58, 27, 59, 35, 37, 32, 46, 29, 41, 30, 54,
       28, 41, 36, 34, 32, 33, 38, 47, 35, 45, 32, 32, 30])

In [9]:
np.min(ages), np.max(ages), np.mean(ages)

(18, 70, 38.85)

In [10]:
# Males age data
male_data = mall_data[genders == 0]
np.min(male_data[:,2]), np.max(male_data[:,2]), np.mean(male_data[:,2])

(18, 70, 39.80681818181818)

In [11]:
# Females age data
female_data = mall_data[genders == 1]
np.min(female_data[:,2]), np.max(female_data[:,2]), np.mean(female_data[:,2])

(18, 68, 38.098214285714285)

In [12]:
annual_incomes = mall_data[:,3]
annual_incomes

array([ 15,  15,  16,  16,  17,  17,  18,  18,  19,  19,  19,  19,  20,
        20,  20,  20,  21,  21,  23,  23,  24,  24,  25,  25,  28,  28,
        28,  28,  29,  29,  30,  30,  33,  33,  33,  33,  34,  34,  37,
        37,  38,  38,  39,  39,  39,  39,  40,  40,  40,  40,  42,  42,
        43,  43,  43,  43,  44,  44,  46,  46,  46,  46,  47,  47,  48,
        48,  48,  48,  48,  48,  49,  49,  50,  50,  54,  54,  54,  54,
        54,  54,  54,  54,  54,  54,  54,  54,  57,  57,  58,  58,  59,
        59,  60,  60,  60,  60,  60,  60,  61,  61,  62,  62,  62,  62,
        62,  62,  63,  63,  63,  63,  63,  63,  64,  64,  65,  65,  65,
        65,  67,  67,  67,  67,  69,  69,  70,  70,  71,  71,  71,  71,
        71,  71,  72,  72,  73,  73,  73,  73,  74,  74,  75,  75,  76,
        76,  77,  77,  77,  77,  78,  78,  78,  78,  78,  78,  78,  78,
        78,  78,  78,  78,  79,  79,  81,  81,  85,  85,  86,  86,  87,
        87,  87,  87,  87,  87,  88,  88,  88,  88,  93,  93,  9

In [13]:
average_annual_income = np.mean(annual_incomes)
np.min(annual_incomes), np.max(annual_incomes), average_annual_income

(15, 137, 60.56)

In [14]:
# Males annual income data
np.min(male_data[:,3]), np.max(male_data[:,3]), np.mean(male_data[:,3])

(15, 137, 62.22727272727273)

In [15]:
# Females annual income data
np.min(female_data[:,3]), np.max(female_data[:,3]), np.mean(female_data[:,3])

(16, 126, 59.25)

In [16]:
spending_scores = mall_data[:,4]
spending_scores

array([39, 81,  6, 77, 40, 76,  6, 94,  3, 72, 14, 99, 15, 77, 13, 79, 35,
       66, 29, 98, 35, 73,  5, 73, 14, 82, 32, 61, 31, 87,  4, 73,  4, 92,
       14, 81, 17, 73, 26, 75, 35, 92, 36, 61, 28, 65, 55, 47, 42, 42, 52,
       60, 54, 60, 45, 41, 50, 46, 51, 46, 56, 55, 52, 59, 51, 59, 50, 48,
       59, 47, 55, 42, 49, 56, 47, 54, 53, 48, 52, 42, 51, 55, 41, 44, 57,
       46, 58, 55, 60, 46, 55, 41, 49, 40, 42, 52, 47, 50, 42, 49, 41, 48,
       59, 55, 56, 42, 50, 46, 43, 48, 52, 54, 42, 46, 48, 50, 43, 59, 43,
       57, 56, 40, 58, 91, 29, 77, 35, 95, 11, 75,  9, 75, 34, 71,  5, 88,
        7, 73, 10, 72,  5, 93, 40, 87, 12, 97, 36, 74, 22, 90, 17, 88, 20,
       76, 16, 89,  1, 78,  1, 73, 35, 83,  5, 93, 26, 75, 20, 95, 27, 63,
       13, 75, 10, 92, 13, 86, 15, 69, 14, 90, 32, 86, 15, 88, 39, 97, 24,
       68, 17, 85, 23, 69,  8, 91, 16, 79, 28, 74, 18, 83])

In [17]:
np.mean(spending_scores)

50.2

In [18]:
# Males spending data
np.min(male_data[:,4]), np.max(male_data[:,4]), np.mean(male_data[:,4])

(1, 97, 48.51136363636363)

In [19]:
# Females spending data
np.min(female_data[:,4]), np.max(female_data[:,4]), np.mean(female_data[:,4])

(5, 99, 51.526785714285715)

__The dataset has 200 rows and 5 columns, covering 88 men and 112 women. Customer ages range from 18 to 70 years, with an overall average of 38.85 years. The average age is 39.8 years for men and 38.1 years for women.__

__Annual income ranges from 15,000 to 137,000 dollars, with an overall average of 60,560. By gender, the average annual income is about 62,230 for men and 59,250 for women.__ 

__Spending scores are defined on a scale from 1 to 100; the observed values range from 1 to 99. The averages are close: 48.51 for men and 51.53 for women. On average, men earn slightly more, while women have slightly higher spending scores.__
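
The per-gender cells above repeat the same min/max/mean pattern. A minimal sketch of how that pattern could be wrapped in a helper is shown below, using only the first ten customers from this notebook as sample data. The names `sample`, `COLUMNS`, `summarize` and `by_gender` are hypothetical and not part of the original analysis.

```python
import numpy as np

# First ten customers from the notebook: CustomerID, Gender, Age, Income (k$), Spending Score
sample = np.array([
    [1, 0, 19, 15, 39], [2, 0, 21, 15, 81], [3, 1, 20, 16, 6],
    [4, 1, 23, 16, 77], [5, 1, 31, 17, 40], [6, 1, 22, 17, 76],
    [7, 1, 35, 18, 6], [8, 1, 23, 18, 94], [9, 0, 64, 19, 3],
    [10, 1, 30, 19, 72],
])

# Column index of each feature in the 2-D array
COLUMNS = {"Age": 2, "Annual Income (k$)": 3, "Spending Score (1-100)": 4}

def summarize(rows):
    """Return {feature: (min, max, mean)} for the given rows."""
    return {
        name: (int(rows[:, col].min()), int(rows[:, col].max()), float(rows[:, col].mean()))
        for name, col in COLUMNS.items()
    }

# Boolean masks on the Gender column select each group (0 = Male, 1 = Female)
by_gender = {
    label: summarize(sample[sample[:, 1] == code])
    for code, label in ((0, "Male"), (1, "Female"))
}

for label, stats in by_gender.items():
    print(label, stats)
```

With only ten rows the figures differ from the full-dataset results above; the point is the masking and aggregation pattern.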

## 2: Identify High-Value Customers

#### Identify the top 10 customers by Spending Score. What do they have in common (e.g., age group, gender)?

In [84]:
# Sorting based on the spending score
top_ten_spenders = mall_data[mall_data[:, 4].argsort()[::-1]][:10]
top_ten_spenders


array([[ 12,   1,  35,  19,  99],
       [ 20,   1,  35,  23,  98],
       [186,   0,  30,  99,  97],
       [146,   0,  28,  77,  97],
       [168,   1,  33,  86,  95],
       [128,   0,  40,  71,  95],
       [  8,   1,  23,  18,  94],
       [142,   0,  32,  75,  93],
       [164,   1,  31,  81,  93],
       [ 42,   0,  24,  38,  92]])

In [34]:
# No. of males and females in top spenders
np.sum(top_ten_spenders[:,1] == 0), np.sum(top_ten_spenders[:,1] == 1) 

(5, 5)

In [35]:
# Average age of top spenders
np.mean(top_ten_spenders[:,2]), np.mean(top_ten_spenders[top_ten_spenders[:,1]==0][:,2]), np.mean(top_ten_spenders[top_ten_spenders[:,1]==1][:,2]) 

(31.1, 30.8, 31.4)

In [86]:
# Average annual income of top spenders
np.mean(top_ten_spenders[:,3]), np.mean(top_ten_spenders[top_ten_spenders[:,1]==0][:,3]), np.mean(top_ten_spenders[top_ten_spenders[:,1]==1][:,3])


(58.7, 72.0, 45.4)

### Conclusion:
The top ten spenders comprise 5 men and 5 women, with an average age of about 31 years. Their average annual income is about 59k (72k for the men and 45k for the women).

__The top 10 spenders are evenly split between men and women. Despite the income gap between the two groups, both are represented among the top spenders. All ten are aged between 23 and 40, with an average of 31.1 years against 38.85 overall, which suggests that younger customers are more likely to be high spenders.__
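
One caveat on the ranking: the spending-score array above contains the value 92 more than once, so which customer takes tenth place depends on how the sort orders ties. The sketch below shows one way to make that order deterministic, using a stable sort on the first ten customers from this notebook; `sample`, `k`, `top_k_idx` and `top_k` are hypothetical names.

```python
import numpy as np

# First ten customers from the notebook: CustomerID, Gender, Age, Income (k$), Spending Score
sample = np.array([
    [1, 0, 19, 15, 39], [2, 0, 21, 15, 81], [3, 1, 20, 16, 6],
    [4, 1, 23, 16, 77], [5, 1, 31, 17, 40], [6, 1, 22, 17, 76],
    [7, 1, 35, 18, 6], [8, 1, 23, 18, 94], [9, 0, 64, 19, 3],
    [10, 1, 30, 19, 72],
])

scores = sample[:, 4]
k = 3

# Negating the scores turns an ascending sort into a descending one;
# kind="stable" keeps customers with equal scores in their original row order.
top_k_idx = np.argsort(-scores, kind="stable")[:k]
top_k = sample[top_k_idx]

print(top_k[:, [0, 4]])  # CustomerID and Spending Score of the top k
```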

## 3: Explore Relationships Between Features
#### Compute the pairwise correlations between Age, Annual Income, and Spending Score to uncover key drivers of spending.



In [37]:
features = mall_data[:, 2:]
features

array([[ 19,  15,  39],
       [ 21,  15,  81],
       [ 20,  16,   6],
       [ 23,  16,  77],
       [ 31,  17,  40],
       [ 22,  17,  76],
       [ 35,  18,   6],
       [ 23,  18,  94],
       [ 64,  19,   3],
       [ 30,  19,  72],
       [ 67,  19,  14],
       [ 35,  19,  99],
       [ 58,  20,  15],
       [ 24,  20,  77],
       [ 37,  20,  13],
       [ 22,  20,  79],
       [ 35,  21,  35],
       [ 20,  21,  66],
       [ 52,  23,  29],
       [ 35,  23,  98],
       [ 35,  24,  35],
       [ 25,  24,  73],
       [ 46,  25,   5],
       [ 31,  25,  73],
       [ 54,  28,  14],
       [ 29,  28,  82],
       [ 45,  28,  32],
       [ 35,  28,  61],
       [ 40,  29,  31],
       [ 23,  29,  87],
       [ 60,  30,   4],
       [ 21,  30,  73],
       [ 53,  33,   4],
       [ 18,  33,  92],
       [ 49,  33,  14],
       [ 21,  33,  81],
       [ 42,  34,  17],
       [ 30,  34,  73],
       [ 36,  37,  26],
       [ 20,  37,  75],
       [ 65,  38,  35],
       [ 24,  38

In [38]:
# Transpose the data so that each row represents a variable
features_transposed = features.T
features_transposed.shape

(3, 200)

In [39]:
# Compute the correlation coefficient matrix
correlation_matrix = np.corrcoef(features_transposed)
correlation_matrix 

array([[ 1.        , -0.01239804, -0.32722685],
       [-0.01239804,  1.        ,  0.00990285],
       [-0.32722685,  0.00990285,  1.        ]])

This output shows the pairwise Pearson correlations between Age, Annual Income, and Spending Score:<br>
The correlation between Age and Annual Income is -0.01239804.<br>
The correlation between Age and Spending Score is -0.32722685.<br>
The correlation between Annual Income and Spending Score is 0.00990285.

A correlation coefficient of -0.01239804 between Age and Annual Income is very close to 0, which indicates essentially no linear relationship between the two variables in this dataset. At this magnitude the negative sign carries no practical meaning. Age does not appear to be a useful linear predictor of annual income.

A correlation coefficient of -0.32722685 between Age and Spending Score indicates a weak-to-moderate negative linear relationship. Older customers tend to have lower spending scores, while younger customers tend to have higher spending scores. Of the three pairs, this is the only one with a noticeable linear association.

A correlation coefficient of 0.00990285 between Annual Income and Spending Score is likewise very close to 0, which indicates essentially no linear relationship between these two variables. Annual income does not appear to be a useful linear predictor of spending score. Note that a near-zero Pearson coefficient only rules out a linear relationship; it does not rule out a non-linear one.
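
To make explicit what `np.corrcoef` computes, the sketch below builds the Pearson matrix by hand (covariance divided by the product of standard deviations) and checks it against the library call. It uses the first ten customers from this notebook, so the values differ from the full-dataset matrix above; `sample`, `manual_corr` and `library_corr` are hypothetical names.

```python
import numpy as np

# First ten customers from the notebook: CustomerID, Gender, Age, Income (k$), Spending Score
sample = np.array([
    [1, 0, 19, 15, 39], [2, 0, 21, 15, 81], [3, 1, 20, 16, 6],
    [4, 1, 23, 16, 77], [5, 1, 31, 17, 40], [6, 1, 22, 17, 76],
    [7, 1, 35, 18, 6], [8, 1, 23, 18, 94], [9, 0, 64, 19, 3],
    [10, 1, 30, 19, 72],
])

# Age, Annual Income and Spending Score as floats
features = sample[:, 2:].astype(float)

# Pearson r = cov(x, y) / (std(x) * std(y)), computed for every pair of columns
centered = features - features.mean(axis=0)
covariance = centered.T @ centered / len(features)
std = features.std(axis=0)
manual_corr = covariance / np.outer(std, std)

# rowvar=False treats columns as variables, so no transpose is needed
library_corr = np.corrcoef(features, rowvar=False)

print(np.allclose(manual_corr, library_corr))
```

Passing `rowvar=False` is equivalent to transposing the array first, as done in the cell above.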

#### Filter young adults (18-25) and calculate their average Spending Score. Compare this with older age groups.

In [40]:
young_filtered_data = mall_data[(mall_data[:, 2] >= 18) & (mall_data[:, 2] <= 25)]
young_filtered_data.shape

(38, 5)

In [41]:
spending_score_young_adults = young_filtered_data[:,4]
average_spending_score_young_adults = np.mean(spending_score_young_adults)
average_spending_score_young_adults

54.94736842105263

In [42]:
np.sum(young_filtered_data[:,1] == 0), np.sum(young_filtered_data[:,1] == 1) 

(18, 20)

In [43]:
np.mean(young_filtered_data[:,4]), np.mean(young_filtered_data[young_filtered_data[:,1] == 0][:,4]), np.mean(young_filtered_data[young_filtered_data[:,1] == 1][:,4])


(54.94736842105263, 50.833333333333336, 58.65)

In [44]:
np.mean(young_filtered_data[:,3]), np.mean(young_filtered_data[young_filtered_data[:,1] == 0][:,3]), np.mean(young_filtered_data[young_filtered_data[:,1] == 1][:,3])


(45.68421052631579, 47.611111111111114, 43.95)

#### The average spending score of the young adults is about 55. 
There are 18 men with an average spending score of 51 and 20 women with an average spending score of 59. The average annual income of young adults is 46k.

In [45]:
older_filtered_data = mall_data[mall_data[:, 2] > 25]
older_filtered_data.shape

(162, 5)

In [46]:
spending_score_older_adults = older_filtered_data[:,4]
average_spending_score_older_adults = np.mean(spending_score_older_adults)
average_spending_score_older_adults

49.08641975308642

In [47]:
# no. of men and women
np.sum(older_filtered_data[:,1] == 0), np.sum(older_filtered_data[:,1] == 1) 

(70, 92)

In [48]:
# Average spending score
np.mean(older_filtered_data[:,4]), np.mean(older_filtered_data[older_filtered_data[:,1] == 0][:,4]), np.mean(older_filtered_data[older_filtered_data[:,1] == 1][:,4])


(49.08641975308642, 47.91428571428571, 49.97826086956522)

In [49]:
# Average annual income
np.mean(older_filtered_data[:,3]), np.mean(older_filtered_data[older_filtered_data[:,1] == 0][:,3]), np.mean(older_filtered_data[older_filtered_data[:,1] == 1][:,3])


(64.04938271604938, 65.98571428571428, 62.57608695652174)

#### The average spending score of the older adults is 49.
There are 70 men with an average spending score of 48 and 92 women with an average spending score of 50. The average annual income of older adults is 64k.

### Conclusion: Young adults (18-25) have a higher average spending score than older customers (54.9 vs. 49.1)

## 4: Customer Segmentation

Group customers into three segments based on their Spending Score:
○ Low (1-33)
○ Medium (34-66)
○ High (67-100)
Compute the average Age and Annual Income for each segment. What are the characteristics of each group?
● Propose marketing strategies for each segment.
● Hint: Use NumPy’s slicing and aggregation techniques.

In [55]:
# Let us consider three categories of customers based on their spending scores
low_spenders = mall_data[spending_scores <= 33]
medium_spenders = mall_data[(spending_scores > 33) & (spending_scores <= 66)]
high_spenders = mall_data[spending_scores > 66]
len(low_spenders), len(medium_spenders), len(high_spenders)

(49, 94, 57)

In [56]:
# Low spenders: No. of male and female customers
len(low_spenders[low_spenders[:,1]== 0]), len(low_spenders[low_spenders[:,1]==1])

(24, 25)

In [92]:
# Average age of customers, males and females in low spenders category
np.mean(low_spenders[:,2]), np.mean(low_spenders[low_spenders[:,1]==0][:,2]), np.mean(low_spenders[low_spenders[:,1]==1][:,2])


(42.87755102040816, 43.0, 42.76)

In [58]:
# low spenders average annual income, low spender male's annual income, low spenders female's annual income
np.mean(low_spenders[:,3]), np.mean(low_spenders[low_spenders[:,1]==0][:,3]), np.mean(low_spenders[low_spenders[:,1]==1][:,3])


(67.0, 70.79166666666667, 63.36)

In [59]:
# Medium spenders: No. of male and female customers
len(medium_spenders[medium_spenders[:,1]== 0]), len(medium_spenders[medium_spenders[:,1]==1])

(40, 54)

In [61]:
# Average age of customers, males and females in medium spenders category
np.mean(medium_spenders[:,2]), np.mean(medium_spenders[medium_spenders[:,1]==0][:,2]), np.mean(medium_spenders[medium_spenders[:,1]==1][:,2])


(42.01063829787234, 43.35, 41.01851851851852)

In [69]:
# medium spenders average annual income, medium spender male's annual income, medium spenders female's annual income
np.mean(medium_spenders[:,3]), np.mean(medium_spenders[medium_spenders[:,1]==0][:,3]), np.mean(medium_spenders[medium_spenders[:,1]==1][:,3])


(53.861702127659576, 52.825, 54.629629629629626)

In [64]:
# High spenders: No. of male and female customers
len(high_spenders[high_spenders[:,1]== 0]), len(high_spenders[high_spenders[:,1]==1])

(24, 33)

In [65]:
# Average age of customers, males and females in high spenders category 
np.mean(high_spenders[:,2]), np.mean(high_spenders[high_spenders[:,1]==0][:,2]), np.mean(high_spenders[high_spenders[:,1]==1][:,2])


(30.17543859649123, 30.708333333333332, 29.78787878787879)

In [67]:
# high spenders average annual income, high spender male's annual income, high spenders female's annual income
np.mean(high_spenders[:,3]), np.mean(high_spenders[high_spenders[:,1]==0][:,3]), np.mean(high_spenders[high_spenders[:,1]==1][:,3])


(66.0701754385965, 69.33333333333333, 63.696969696969695)

The average age is 43 years for low spenders, 42 years for medium spenders and 30 years for high spenders.<br>
The average annual income is 67k for low spenders, 54k for medium spenders and 66k for high spenders.<br>
__High spenders are markedly younger than the other two segments. Average annual income does not rise or fall consistently across the segments, so income alone does not explain the spending score.__
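
The three boolean masks used above can also be expressed as a single binning step. The sketch below is one possible variant using `np.digitize` on the first ten customers from this notebook, so the counts and averages differ from the full-dataset results; `sample`, `segment` and `labels` are hypothetical names.

```python
import numpy as np

# First ten customers from the notebook: CustomerID, Gender, Age, Income (k$), Spending Score
sample = np.array([
    [1, 0, 19, 15, 39], [2, 0, 21, 15, 81], [3, 1, 20, 16, 6],
    [4, 1, 23, 16, 77], [5, 1, 31, 17, 40], [6, 1, 22, 17, 76],
    [7, 1, 35, 18, 6], [8, 1, 23, 18, 94], [9, 0, 64, 19, 3],
    [10, 1, 30, 19, 72],
])

scores = sample[:, 4]

# Bin edges 34 and 67 give: 0 = Low (1-33), 1 = Medium (34-66), 2 = High (67-100)
segment = np.digitize(scores, bins=[34, 67])
labels = ["Low", "Medium", "High"]

counts = [int(np.sum(segment == s)) for s in range(3)]
mean_age = [float(sample[segment == s, 2].mean()) for s in range(3)]
mean_income = [float(sample[segment == s, 3].mean()) for s in range(3)]

for s, label in enumerate(labels):
    print(label, counts[s], round(mean_age[s], 2), round(mean_income[s], 2))
```

Keeping the segment as an integer code per customer makes it straightforward to loop over segments instead of writing one cell per segment.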


## 5: Find Behavioral Similarities

In [71]:
# Calculate the mean and standard deviation for each column
mean = np.mean(mall_data, axis=0)
std = np.std(mall_data, axis=0)

# Normalize the dataset
normalized_data = (mall_data - mean) / std
normalized_data

array([[-1.7234121 , -1.12815215, -1.42456879, -1.73899919, -0.43480148],
       [-1.70609137, -1.12815215, -1.28103541, -1.73899919,  1.19570407],
       [-1.68877065,  0.88640526, -1.3528021 , -1.70082976, -1.71591298],
       [-1.67144992,  0.88640526, -1.13750203, -1.70082976,  1.04041783],
       [-1.6541292 ,  0.88640526, -0.56336851, -1.66266033, -0.39597992],
       [-1.63680847,  0.88640526, -1.20926872, -1.66266033,  1.00159627],
       [-1.61948775,  0.88640526, -0.27630176, -1.62449091, -1.71591298],
       [-1.60216702,  0.88640526, -1.13750203, -1.62449091,  1.70038436],
       [-1.5848463 , -1.12815215,  1.80493225, -1.58632148, -1.83237767],
       [-1.56752558,  0.88640526, -0.6351352 , -1.58632148,  0.84631002],
       [-1.55020485, -1.12815215,  2.02023231, -1.58632148, -1.4053405 ],
       [-1.53288413,  0.88640526, -0.27630176, -1.58632148,  1.89449216],
       [-1.5155634 ,  0.88640526,  1.37433211, -1.54815205, -1.36651894],
       [-1.49824268,  0.88640526, -1.0

In [87]:
normalized_data.shape

(200, 5)

In [89]:
# max_spending_customer is defined in cell In [83] below (the cells were run out of order)
max_spending_customer.shape

(5,)

In [83]:
# Use a separate name so the raw spending_scores array defined earlier is not overwritten
normalized_spending_scores = normalized_data[:, 4]

# Find the customer with the highest Spending Score
max_spending_score_idx = np.argmax(normalized_spending_scores)
max_spending_customer = normalized_data[max_spending_score_idx]

# Compute Euclidean distance between each customer and the customer with the highest Spending Score
# (all five standardized columns are used, including CustomerID and Gender)
distances = np.linalg.norm(normalized_data - max_spending_customer, axis=1)
distances


array([3.295782  , 2.36844829, 3.77243835, 1.2261409 , 2.31283326,
       1.2978129 , 3.61164539, 0.88634123, 4.72040627, 1.10844363,
       4.49685511, 0.        , 3.65520758, 1.1641758 , 3.90253922,
       2.35328948, 2.48726054, 2.62205437, 3.60135964, 0.20980486,
       3.20816178, 2.37880632, 3.74550537, 2.29244571, 3.59401189,
       2.20368434, 2.73239632, 2.53565077, 2.70740601, 1.09616752,
       4.60040932, 1.52465965, 4.44373116, 2.45998567, 3.51320975,
       1.39862514, 3.30174386, 1.29544548, 2.95420065, 1.65353972,
       3.40393865, 2.35615048, 3.43244403, 1.77444333, 3.08485325,
       1.81521334, 2.2553557 , 2.33155943, 2.47694045, 2.46063516,
       2.35917738, 2.76081112, 2.11608185, 3.26891659, 2.6358087 ,
       3.36006392, 2.54070421, 3.97381862, 2.35091887, 3.42256821,
       3.86481659, 3.17912462, 3.24435138, 2.49482084, 3.69279956,
       3.17149567, 2.46596615, 3.41978534, 3.1604307 , 2.52112511,
       3.95487864, 2.83364072, 3.08273157, 2.92557754, 3.75177

In [91]:
distances.shape

(200,)
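
The distances above are computed over all five standardized columns, so CustomerID (an arbitrary identifier) and Gender also contribute to the result. As a hedged alternative, the sketch below restricts the comparison to Age, Annual Income and Spending Score and then ranks customers by distance to the top spender. It uses the first ten customers from this notebook; restricting the columns is an assumption about what "behavioral similarity" should mean here, and `sample`, `z`, `ref_idx`, `order` and `nearest_ids` are hypothetical names.

```python
import numpy as np

# First ten customers from the notebook: CustomerID, Gender, Age, Income (k$), Spending Score
sample = np.array([
    [1, 0, 19, 15, 39], [2, 0, 21, 15, 81], [3, 1, 20, 16, 6],
    [4, 1, 23, 16, 77], [5, 1, 31, 17, 40], [6, 1, 22, 17, 76],
    [7, 1, 35, 18, 6], [8, 1, 23, 18, 94], [9, 0, 64, 19, 3],
    [10, 1, 30, 19, 72],
])

# Keep only Age, Annual Income and Spending Score (assumption: ID and Gender are excluded)
features = sample[:, 2:].astype(float)

# Z-score standardization so that each feature contributes on a comparable scale
z = (features - features.mean(axis=0)) / features.std(axis=0)

# Reference customer: the one with the highest raw Spending Score
ref_idx = int(np.argmax(sample[:, 4]))

# Euclidean distance from every customer to the reference customer
distances = np.linalg.norm(z - z[ref_idx], axis=1)

# Sort by distance; position 0 is the reference customer itself (distance 0)
order = np.argsort(distances)
nearest_ids = sample[order[1:4], 0]

print("Reference customer:", sample[ref_idx, 0])
print("Three most similar customers:", nearest_ids)
```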