# Problem-Solving with NumPy


These assessment questions are designed so that you solve each one by writing a Python function (`def`) that uses NumPy functions.

### 1. Write a function, `foo_1`, that takes a NumPy array `arr` and returns the indices of the elements that are closest to a given target. If the target is not in the given array, it should return 'no item found'.


In [1]:
import numpy as np

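def foo_1(arr, target):
    # The task requires the target itself to be present in the array.
    if target not in arr:
        return 'no item found'
    # Rank all indices by distance from the target, nearest first.
    # kind="stable" keeps tied elements in their original order.
    indices = np.abs(arr - target).argsort(kind="stable")
    return indices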

In [2]:
arr1 = np.array([3, 8, 7, 7, 7, 7, 7, 8, 7, 10])
target1 = 7
output1 = foo_1(arr1, target1)
print(output1)


[2 3 4 5 6 8 1 7 9 0]


In [3]:
arr2 = np.array([3, 8, 7, 7, 7, 7, 7, 8, 7, 10])
target2 = 11
output2 = foo_1(arr2, target2)
print(output2)

no item found


Explanation

This function first checks whether the target is in the array. If it is not, the function returns 'no item found'.

If the target is in the array, the function calculates the absolute difference between each element and the target, then sorts the indices by these differences.

Finally, it returns the sorted indices, so the positions of the closest elements come first.
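
The ranking step can be inspected on its own. The sketch below recomputes the intermediate values for the first test array. The names `diffs`, `order`, and `closest` are illustrative, and passing `kind="stable"` is a choice made here so that tied elements keep their original order.

```python
import numpy as np

arr = np.array([3, 8, 7, 7, 7, 7, 7, 8, 7, 10])
target = 7

# Distance of every element from the target.
diffs = np.abs(arr - target)
print(diffs)    # [4 1 0 0 0 0 0 1 0 3]

# Indices ordered from nearest to farthest; a stable sort keeps ties in array order.
order = np.argsort(diffs, kind="stable")
print(order)    # [2 3 4 5 6 8 1 7 9 0]

# Illustrative extra: only the positions that share the smallest distance.
closest = np.flatnonzero(diffs == diffs.min())
print(closest)  # [2 3 4 5 6 8]
```

If only the nearest elements are needed rather than a full ranking, the last step is one way to get them.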

### 2. Write a user-defined function `foo_2` that takes a 1- or 2-dimensional NumPy array `arr` and returns a new (n-1)-dimensional array in which each element is the count of non-zero elements along the given axis.


In [7]:
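def foo_2(arr, axis):
    # This version accepts only 2-D input; anything else prints a message and returns None.
    if arr.ndim != 2:
        print("Input array must be 2-dimensional.")
    else:
        # Mark non-zero entries with 1, then sum the marks along the requested axis.
        x = np.where(arr != 0, 1, 0)
        counts = np.sum(x, axis=axis)
        return counts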

def foo_3(arr, axis):
    if arr.ndim != 3:
        print("Input array must be 3-dimensional.")
    else:
        x=np.where(arr != 0, 1, 0)
        counts=np.sum(x,axis=axis)
        return counts
    
arr = np.array([[1, 2, 3, 4],
                [0, 5, 0, 0],
                [2, 0, 1, 0]])

foo_2(arr, axis = 1)

 

array([4, 1, 2])

In [8]:
foo_2(arr, axis = 0)

array([2, 2, 2, 1])

In [10]:
arr1 = np.array([[[1, 2, 3, 4], [0, 5, 0, 0], [2, 0, 1, 0]],
                 [[2, 0, 1, 0], [0, 5, 0, 0], [1, 2, 3, 4]]])

foo_3(arr1,axis=0)

array([[2, 1, 2, 1],
       [0, 2, 0, 0],
       [2, 1, 2, 1]])

In [11]:
foo_2(arr1 , axis = 1)

Input array must be 2-dimensional.


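This function marks each non-zero element with 1 using np.where and sums the marks along the specified axis with np.sum, which removes that axis from the result. foo_2 accepts only 2-dimensional input and foo_3 only 3-dimensional input; any other input prints a message and returns None.

The test cases above show the counts for a 2-dimensional array along each axis, the counts for a 3-dimensional array along axis 0, and the message printed when foo_2 receives a 3-dimensional array.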
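
NumPy also provides `np.count_nonzero`, which accepts an `axis` argument and works for any number of dimensions. The sketch below shows one possible alternative built on it; `count_nonzero_along` is a hypothetical name, not part of the original solution.

```python
import numpy as np

def count_nonzero_along(arr, axis):
    """Hypothetical helper: count non-zero entries along `axis` for an array of any dimension."""
    return np.count_nonzero(arr, axis=axis)

arr = np.array([[1, 2, 3, 4],
                [0, 5, 0, 0],
                [2, 0, 1, 0]])

print(count_nonzero_along(arr, axis=1))  # [4 1 2]
print(count_nonzero_along(arr, axis=0))  # [2 2 2 1]

# A 1-D input collapses to a single count.
print(count_nonzero_along(np.array([0, 3, 0, 7]), axis=0))  # 2
```

With this approach a single function covers the 1-, 2-, and 3-dimensional cases, so separate per-dimension functions would not be needed.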

## 3.
### 3.1 You are given a 2D NumPy array of data representing student scores. Each row represents a student, and each column represents a subject. Extract the scores of all students who scored above 90 in at least one subject. Provide the resulting array.


In [13]:
data = np.array([[85, 89, 88], [78, 89, 94], [91, 87, 90], [87, 95, 84]])
indices = np.where(data > 90)
print("Row and column indices of scores above 90:")
print(indices)


unique_row_indices = np.unique(indices[0])
result = data[unique_row_indices, :]
print("\nScores of students who scored above 90 in at least one subject:")
print(result)

Row and column indices of scores above 90:
(array([1, 2, 3], dtype=int64), array([2, 0, 1], dtype=int64))

Scores of students who scored above 90 in at least one subject:
[[78 89 94]
 [91 87 90]
 [87 95 84]]


Explanation

* np.where(data > 90): This returns a tuple of two index arrays, the row indices and the column indices of every score greater than 90.

* np.unique(indices[0]): This takes the row indices and removes duplicates, so a student who scored above 90 in several subjects is listed only once.

* data[unique_row_indices, :]: This uses integer indexing to extract those rows (students) from the original data array.

* This gives the resulting array containing the scores of students who scored above 90 in at least one subject.
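
The same selection can also be written with a boolean row mask. This is a minimal sketch using the data from the cell above; `above_90_mask` is an illustrative name.

```python
import numpy as np

data = np.array([[85, 89, 88], [78, 89, 94], [91, 87, 90], [87, 95, 84]])

# True for each row (student) with at least one score above 90.
above_90_mask = np.any(data > 90, axis=1)
print(above_90_mask)  # [False  True  True  True]

# Boolean indexing keeps only the rows where the mask is True.
result = data[above_90_mask]
print(result)
```

The mask has one entry per student, so it avoids the intermediate index arrays and the deduplication step.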

### 3.2 You are provided with a 2D NumPy array of data representing sales data for different products. Each row corresponds to a product, and each column represents the monthly sales for a specific month. Calculate the following statistics for each product: mean, median, standard deviation, and the month with the highest sales.


In [9]:
import numpy as np

sales_data = np.array([[120, 150, 130, 110],
                       [200, 180, 220, 250],
                       [90, 100, 95, 110],
                       [300, 280, 320, 340]])

mean_values = np.mean(sales_data, axis=1)
median_values = np.median(sales_data, axis=1)
std_dev_values = np.std(sales_data, axis=1)

highest_sales_month = np.argmax(sales_data, axis=1)

for i in range(len(mean_values)):
    print(f"Product {i+1}:")
    print(f"  Mean: {mean_values[i]}")
    print(f"  Median: {median_values[i]}")
    print(f"  Standard Deviation: {std_dev_values[i]}")
    print(f"  Month with Highest Sales: Month {highest_sales_month[i] + 1}")
    print()

Product 1:
  Mean: 127.5
  Median: 125.0
  Standard Deviation: 14.79019945774904
  Month with Highest Sales: Month 2

Product 2:
  Mean: 212.5
  Median: 210.0
  Standard Deviation: 25.8602010819715
  Month with Highest Sales: Month 4

Product 3:
  Mean: 98.75
  Median: 97.5
  Standard Deviation: 7.39509972887452
  Month with Highest Sales: Month 4

Product 4:
  Mean: 310.0
  Median: 310.0
  Standard Deviation: 22.360679774997898
  Month with Highest Sales: Month 4



### 3.3 Create a NumPy array of size 1000 with random integers between 1 and 100. Then, use NumPy functions to find the top 10 most frequently occurring values in the array.


In [14]:
z = np.random.randint(1, 101, 1000)  # the upper bound is exclusive, so 101 is needed to include 100
z

array([89, 41, 59, 95, 31, 58, 60, 98, 11, 15, 82, 39, 76, 98,  6, 25, 44,
       70,  7, 93, 44,  9, 78, 33, 53, 83, 79, 34, 67, 99, 17, 70, 21, 18,
       80,  2, 37, 78, 96, 41, 96, 46, 92, 66, 70, 27, 18, 75, 62, 38, 70,
       24, 14, 58, 39, 75, 97, 65, 50, 38, 11, 50,  3, 20, 29, 60, 62, 42,
       69, 23,  8, 18, 31, 21, 85,  9, 78, 71,  1, 30, 70, 13, 83, 53, 82,
       92, 21, 94, 34, 16, 68, 18, 58, 74, 75, 39, 81, 22, 24,  7,  4, 48,
       31, 42, 64, 45, 59, 29, 51, 45, 35, 59,  6,  5, 72, 58, 90, 95, 41,
       66, 88, 31, 62, 92, 65, 99, 12, 43, 48, 90, 26, 36, 34, 18, 34, 12,
       51, 91,  4,  8, 74, 72, 81, 75,  2, 71, 22, 99, 38, 31, 89, 55, 76,
       18,  4,  5, 14, 42, 78, 10, 51, 18, 90, 24, 17, 47, 26, 96, 87, 71,
       27, 86, 53, 58,  8, 52, 95, 52, 28, 74, 85, 62, 75, 37, 99, 31, 63,
       36, 62, 86, 94, 87, 40, 22, 80, 46, 49, 49, 14, 67, 88, 65, 15, 53,
       59, 23, 57, 76, 23, 38, 89, 96, 25, 99, 62, 93, 24, 62, 12, 46, 75,
       71, 20, 53, 76, 92

In [15]:
counts = np.bincount(z)  # counts[v] is the number of times the value v occurs in z

top_indices = np.argsort(-counts)[:10]  # the 10 values with the largest counts

print("Top 10 most frequently occurring values:")
for index in top_indices:
    print(f"Value: {index}, Count: {counts[index]}")

Top 10 most frequently occurring values:
Value: 21, Count: 20
Value: 62, Count: 17
Value: 12, Count: 16
Value: 94, Count: 16
Value: 51, Count: 15
Value: 80, Count: 15
Value: 49, Count: 15
Value: 31, Count: 14
Value: 82, Count: 14
Value: 89, Count: 14


Explanation

np.random.randint(1, 101, size=1000) generates an array of 1000 random integers between 1 and 100; the upper bound is exclusive, so 101 is needed to include 100.

np.bincount(z) returns an array in which counts[v] is the number of times the value v occurs in z.

np.argsort(-counts)[:10] sorts the values by their counts in descending order and keeps the first 10.

Inside the loop, index is the value itself and counts[index] is how often it occurs.
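
The top-10 lookup can also be done with `np.unique`, which returns only the values that actually occur. The sketch below is one way to write it; the names `random_array`, `unique_values`, and `sorted_indices` are illustrative, and the seeded generator is an assumption added so the run is repeatable.

```python
import numpy as np

# Seeded generator (assumption, for repeatability); the upper bound 101 is exclusive.
rng = np.random.default_rng(0)
random_array = rng.integers(1, 101, size=1000)

# Distinct values in ascending order, with the number of occurrences of each.
unique_values, counts = np.unique(random_array, return_counts=True)

# Positions that order the counts from largest to smallest.
sorted_indices = np.argsort(counts)[::-1]

top_values = unique_values[sorted_indices[:10]]
top_counts = counts[sorted_indices[:10]]

for value, count in zip(top_values, top_counts):
    print(f"Value: {value}, Count: {count}")
```

Unlike `np.bincount`, this approach also works for arrays that contain negative integers or non-integer values.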