<a href="https://www.kaggle.com/code/matinmahmoudi/numpy-fun-problems-proficient?scriptVersionId=166679069" target="_blank"><img align="left" alt="Kaggle" title="Open in Kaggle" src="https://kaggle.com/static/images/open-in-kaggle.svg"></a>

# My Solutions for Fun Practice Problems - Proficient

In this notebook, I am working through the "Proficient" level practice problems from the following source:

- [Practice Problems Source](https://www.practiceprobs.com/problemsets/python-numpy/proficient/)

For additional learning and insights, I also recommend watching related content on YouTube. One particularly helpful resource I've found is:

- [YouTube Tutorial](https://www.youtube.com/watch?v=eClQWW_gbFk)

If you're following along with my series of practice problem solutions, you may also be interested in my previous work, which focused on the "Intermediate" level problems. You can find that notebook here:

- [Previous Notebook: Numpy Fun Problems - Intermediate](https://www.kaggle.com/code/matinmahmoodi/numpy-fun-problems-intermediate)

If you have any improved solutions or ideas, please feel free to share them in the comments!

### **Please show your support by upvoting if you find this notebook helpful!**


# Q1 - Movie Ratings Problem

https://www.practiceprobs.com/problemsets/python-numpy/proficient/movie-ratings/

You’re given a 10x2 array of floats where each row represents a movie. The first column represents the movie’s rating and the second column represents the director’s rating.

Create a third column that represents the overall rating. The overall rating is equal to the movie rating if it exists, otherwise the director’s rating.




In [1]:
import numpy as np

generator = np.random.default_rng(123)
ratings = np.round(generator.uniform(low=0.0, high=10.0, size=(10, 2)))
ratings[[1,2,7,9], [0,0,0,0]] = np.nan

print(ratings)
# [[ 7.  1.]
#  [nan  2.]
#  [nan  8.]
#  [ 9.  3.]
#  [ 8.  9.]
#  [ 5.  2.]
#  [ 8.  2.]
#  [nan  6.]
#  [ 9.  2.]
#  [nan  5.]]

[[ 7.  1.]
 [nan  2.]
 [nan  8.]
 [ 9.  3.]
 [ 8.  9.]
 [ 5.  2.]
 [ 8.  2.]
 [nan  6.]
 [ 9.  2.]
 [nan  5.]]


## Solution 1

In [2]:
np.isnan(ratings[:, 0])

array([False,  True,  True, False, False, False, False,  True, False,
        True])

In [3]:
overall_rating = np.where(np.isnan(ratings[:, 0]), ratings[:, 1], ratings[:, 0])
overall_rating

array([7., 2., 8., 9., 8., 5., 8., 6., 9., 5.])

In [4]:
ratings = np.column_stack((ratings, overall_rating))
ratings

array([[ 7.,  1.,  7.],
       [nan,  2.,  2.],
       [nan,  8.,  8.],
       [ 9.,  3.,  9.],
       [ 8.,  9.,  8.],
       [ 5.,  2.,  5.],
       [ 8.,  2.,  8.],
       [nan,  6.,  6.],
       [ 9.,  2.,  9.],
       [nan,  5.,  5.]])

## Solution 2

In [5]:
# Solution 1 already appended a column to `ratings`, so start again from the two original columns
base = ratings[:, :2]
overall_rating = np.where(np.isnan(base[:, 0]), base[:, 1], base[:, 0])
ratings = np.insert(arr=base, obj=2, values=overall_rating, axis=1)
ratings

array([[ 7.,  1.,  7.],
       [nan,  2.,  2.],
       [nan,  8.,  8.],
       [ 9.,  3.,  9.],
       [ 8.,  9.,  8.],
       [ 5.,  2.,  5.],
       [ 8.,  2.,  8.],
       [nan,  6.,  6.],
       [ 9.,  2.,  9.],
       [nan,  5.,  5.]])
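
A third variation, sketched here on a small made-up array rather than the problem data: copy the movie ratings and overwrite only the missing entries through a boolean mask. The names `toy`, `overall`, `missing` and `toy_with_overall` are illustrative and not part of the original problem.

```python
import numpy as np

# Toy data (hypothetical values): column 0 = movie rating, column 1 = director rating
toy = np.array([
    [7.0, 1.0],
    [np.nan, 2.0],
    [9.0, 3.0],
    [np.nan, 5.0],
])

# Start from a copy of the movie ratings, then overwrite the missing entries
overall = toy[:, 0].copy()
missing = np.isnan(overall)
overall[missing] = toy[missing, 1]

# Append the result as a third column without modifying the original array
toy_with_overall = np.column_stack((toy, overall))
print(toy_with_overall)
```

Compared with `np.where`, this only touches the rows that actually need the fallback value, which may read more naturally when the fallback is the exception.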

# Q2 - Big Fish Problem

https://www.practiceprobs.com/problemsets/python-numpy/proficient/big-fish/

10 fish occupy a 5x5x5 grid of water 🐟. Each fish decides to move to a new (i,j,k) location given by the 2-d array below. If multiple fish end up occupying the same cell, the biggest fish eats the smaller fish. Determine which fish survive.


Use np.set_printoptions(precision=3) to show values with just three decimal places.

In [6]:
import numpy as np

locs = np.array([
    [0,0,0],
    [1,1,2],
    [0,0,0],
    [2,1,3],
    [5,5,4],
    [5,0,0],
    [5,0,0],
    [0,0,0],
    [2,1,3],
    [1,3,1]
])

generator = np.random.default_rng(1010)
weights = generator.normal(size=10)

print(weights)
# [-1.699  0.538 -0.226 -1.09   0.554 -1.501  0.445  1.345 -1.124  0.212]

[-1.69870017  0.53799701 -0.22561399 -1.09020894  0.55391264 -1.50115445
  0.44545933  1.3448172  -1.12364327  0.21216015]


## Solution 1

In [7]:
np.set_printoptions(precision=3)

In [8]:
# Find the unique locations
unique_locs = np.unique(locs, axis=0)
unique_locs

array([[0, 0, 0],
       [1, 1, 2],
       [1, 3, 1],
       [2, 1, 3],
       [5, 0, 0],
       [5, 5, 4]])

In [9]:
# Create a dictionary to store the surviving fish at each location
surviving_fish = {}

# Iterate over the unique locations
for loc in unique_locs:
    # Convert the numpy array to a tuple
    loc = tuple(loc)
    
    # Get the indices of fish at the current location
    indices = np.where(np.all(locs == loc, axis=1))[0]
    
    # Get the weights of the fish at the current location
    fish_weights = weights[indices]
    
    # Find the index of the biggest fish
    biggest_fish_index = np.argmax(fish_weights)
    
    # Add the biggest fish to the surviving fish dictionary
    surviving_fish[loc] = indices[biggest_fish_index]

In [10]:
# Print the surviving fish
for loc, fish_index in surviving_fish.items():
    print(f"Fish at location {loc} survives. (Fish index: {fish_index})")

Fish at location (0, 0, 0) survives. (Fish index: 7)
Fish at location (1, 1, 2) survives. (Fish index: 1)
Fish at location (1, 3, 1) survives. (Fish index: 9)
Fish at location (2, 1, 3) survives. (Fish index: 3)
Fish at location (5, 0, 0) survives. (Fish index: 6)
Fish at location (5, 5, 4) survives. (Fish index: 4)


## Solution 2 - more professional

In [11]:
sorted_fish = np.argsort(weights)[::-1]
uniques, first_idxs = np.unique(locs[sorted_fish], axis=0, return_index=True)
survivors = sorted_fish[first_idxs]

print(survivors)
# [7 1 9 3 6 4]

[7 1 9 3 6 4]
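
Why this works is easier to see on a tiny example. The sketch below uses made-up data (`toy_locs`, `toy_weights`) and relies on `np.unique(..., return_index=True)` returning the position of the first occurrence of each distinct row; once the rows are ordered from heaviest to lightest fish, that first occurrence is the heaviest fish in each cell.

```python
import numpy as np

# Toy data (hypothetical): four fish, two of which land on the same cell
toy_locs = np.array([[0, 0, 0], [1, 1, 1], [0, 0, 0], [2, 2, 2]])
toy_weights = np.array([0.5, 1.0, 2.0, -1.0])

# Fish indices from heaviest to lightest
order = np.argsort(toy_weights)[::-1]

# For each distinct location, first_pos is the position of its first
# occurrence in the weight-ordered rows, i.e. the heaviest fish there
_, first_pos = np.unique(toy_locs[order], axis=0, return_index=True)

# Map positions in the ordered array back to the original fish indices
toy_survivors = order[first_pos]
print(toy_survivors)
```

Fish 0 and fish 2 share cell (0, 0, 0); fish 2 is heavier, so it is the one that appears in `toy_survivors`.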


# Q3 - Taco Truck Problem

https://www.practiceprobs.com/problemsets/python-numpy/proficient/taco-truck/

You own a taco truck that’s open 24/7 and manage five employees who run it. Employees work solo, eight-hour shifts. You decide the best way to set their schedule for the upcoming week is to create a bunch of random schedules and select one that looks best.

You build a 1000x21 array of random employee ids where element (i,j) gives the employee id working shift j for schedule i.

A schedule is valid as long as no employee works two consecutive shifts. Get the row indices of all valid schedules.

In [12]:
import numpy as np

generator = np.random.default_rng(999)
schedules = generator.integers(low=0, high=5, size=(1000, 21))

print(schedules)


[[4 3 0 ... 2 0 0]
 [2 4 3 ... 3 3 2]
 [1 0 1 ... 1 2 1]
 ...
 [2 2 1 ... 3 1 4]
 [1 0 3 ... 2 3 2]
 [1 1 4 ... 2 4 2]]


## Solution 1

In [13]:
valid_schedule_indices = []

for i in range(schedules.shape[0]):
    valid = True
    for j in range(schedules.shape[1] - 1):
        if schedules[i, j] == schedules[i, j+1]:
            valid = False
            break
    if valid:
        valid_schedule_indices.append(i)

valid_schedule_indices

[25, 138, 188, 289, 375, 426, 533, 886, 975, 982]

## Solution 2 - more professional

In [14]:
is_valid = np.all(schedules[:, :-1] != schedules[:, 1:], axis=1)
np.nonzero(is_valid)[0]

array([ 25, 138, 188, 289, 375, 426, 533, 886, 975, 982])
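
The same check can also be phrased with `np.diff`: a zero difference between neighbouring shifts means one employee worked both. This is a minimal sketch on a made-up 4x4 array (`toy_schedules`), not on the problem data.

```python
import numpy as np

# Toy schedules (hypothetical): rows are schedules, columns are consecutive shifts
toy_schedules = np.array([
    [0, 1, 0, 2],   # valid: no employee appears twice in a row
    [3, 3, 1, 4],   # invalid: employee 3 works shifts 0 and 1
    [2, 4, 4, 0],   # invalid: employee 4 works shifts 1 and 2
    [1, 2, 3, 4],   # valid
])

# A schedule is valid when every neighbouring difference is non-zero
no_repeats = np.all(np.diff(toy_schedules, axis=1) != 0, axis=1)
valid_rows = np.flatnonzero(no_repeats)
print(valid_rows)
```

The slicing comparison in Solution 2 and the `np.diff` form should give the same rows for integer ids; the slicing form avoids the subtraction.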

# Q4 - Defraud The Investors Problem

https://www.practiceprobs.com/problemsets/python-numpy/proficient/defraud-the-investors/

You've developed a model that predicts the probability a 🏠 house for sale can be flipped for a profit 💸. Your model isn't very good, as indicated by its predictions on historic data.



In [15]:
import numpy as np

rng = np.random.default_rng(123)
targets = rng.uniform(low=0, high=1, size=20) >= 0.6
preds = np.round(rng.uniform(low=0, high=1, size=20), 2)

print(targets)
print(preds)

[ True False False False False  True  True False  True  True False False
  True False  True  True  True False  True False]
[0.23 0.17 0.5  0.58 0.18 0.01 0.47 0.73 0.92 0.63 0.92 0.86 0.22 0.87
 0.73 0.28 0.8  0.87 0.3  0.53]


Your investors want to see these results, but you're afraid to share them. You devise the following algorithm to make your predictions look better without looking artificial.

Step 1: 
  Choose 5 random indexes (without replacement)

Step 2: 
  Perfectly reorder the prediction scores at these indexes 
  to optimize the accuracy of these 5 predictions


Here's some code to help you evaluate the accuracy of your predictions before and after your changes.

In [16]:
def accuracy_rate(preds, targets):
    return np.mean((preds >= 0.5) == targets)

# Accuracy before finagling
accuracy_rate(preds, targets)  # 0.3

0.3
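
To make the 0.5 threshold in `accuracy_rate` concrete, here is a small check on made-up values (`toy_preds` and `toy_targets` are hypothetical): a prediction counts as correct when `pred >= 0.5` agrees with the target.

```python
import numpy as np

def accuracy_rate(preds, targets):
    # Fraction of predictions whose thresholded value matches the target
    return np.mean((preds >= 0.5) == targets)

# Toy data (hypothetical): the first two predictions are right, the last two wrong
toy_preds = np.array([0.9, 0.2, 0.6, 0.4])
toy_targets = np.array([True, False, False, True])

toy_accuracy = accuracy_rate(toy_preds, toy_targets)
print(toy_accuracy)
```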

## Solution

In [17]:
idxs = np.sort(rng.choice(len(preds), size=5, replace=False)) 

print(idxs) 

[ 1  6  8 11 18]


In [18]:
print(targets[idxs]) 
print(preds[idxs])

[False  True  True False  True]
[0.17 0.47 0.92 0.86 0.3 ]


In [19]:
temp = np.argsort(targets[idxs])
print(targets[idxs]) 
print(temp)          

[False  True  True False  True]
[0 3 1 2 4]


In [20]:
# preds[idxs[temp]] lists the chosen scores ordered from 'should be False' to 'should be True'
print(targets[idxs])
print(preds[idxs])
print(preds[idxs[temp]])

[False  True  True False  True]
[0.17 0.47 0.92 0.86 0.3 ]
[0.17 0.86 0.47 0.92 0.3 ]


In [21]:
preds[idxs[temp]] = np.sort(preds[idxs])

print(targets[idxs]) 
print(preds[idxs])  

[False  True  True False  True]
[0.17 0.47 0.86 0.3  0.92]


In [22]:
# Accuracy after finagling
accuracy_rate(preds, targets) 

0.4
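
The reordering step can be wrapped in a small helper. This is only a sketch: the function name `reorder_for_accuracy` and the toy arrays are made up for illustration, and it works on a copy instead of modifying `preds` in place. It also requests a stable sort, which is an extra choice not present in the cells above.

```python
import numpy as np

def reorder_for_accuracy(preds, targets, idxs):
    """Return a copy of preds in which the scores at idxs are rearranged so
    the lowest scores sit on False targets and the highest on True targets."""
    out = preds.copy()
    # Positions of the chosen targets, False ones first and True ones last
    by_target = np.argsort(targets[idxs], kind="stable")
    # Hand out the chosen scores in ascending order along that ordering
    out[idxs[by_target]] = np.sort(preds[idxs])
    return out

# Toy data (hypothetical values): every prediction starts on the wrong side of 0.5
toy_targets = np.array([True, False, True, False])
toy_preds = np.array([0.2, 0.9, 0.4, 0.6])
toy_idxs = np.array([0, 1, 2, 3])

new_preds = reorder_for_accuracy(toy_preds, toy_targets, toy_idxs)
print(new_preds)
```

Because the scores are only permuted among the chosen indexes, the set of prediction values stays the same, which is what keeps the result from looking artificial.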

# Q5 - Pixel Artist Problem

https://www.practiceprobs.com/problemsets/python-numpy/proficient/pixel-artist/

You're a pixel artist 👩‍🎨. You have a collection of five 10x10-pixel images stored in a 5x3x10x10 array.


In [23]:
import numpy as np

rng = np.random.default_rng(1234)
imgs = rng.uniform(low=0, high=1, size=(5,3,10,10))

The (i,j,k,l) element corresponds to the ith image, jth color channel (RGB), kth row of pixels, lth column of pixels.

You want to plot these images, but your plotting software expects color channel to be the last dimension of the array. Rearrange your array accordingly.



## Solution 1

In [24]:
imgs_rearranged = np.transpose(imgs, (0, 2, 3, 1))

## Solution 2 - more professional

In [25]:
imgs_rearranged = np.moveaxis(imgs, source=1, destination=-1)
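
A quick sanity check, sketched with illustrative variable names (`via_transpose`, `via_moveaxis`): both approaches should produce the same 5x10x10x3 result, and both return views of the original data instead of copies.

```python
import numpy as np

rng = np.random.default_rng(1234)
imgs = rng.uniform(low=0, high=1, size=(5, 3, 10, 10))

via_transpose = np.transpose(imgs, (0, 2, 3, 1))
via_moveaxis = np.moveaxis(imgs, source=1, destination=-1)

# Channel axis is now last
print(via_transpose.shape)

# Both rearrangements hold the same values in the same positions
print(np.array_equal(via_transpose, via_moveaxis))

# The result is a view: no pixel data was copied
print(np.shares_memory(imgs, via_moveaxis))

# Element (i, j, k, l) of the original is element (i, k, l, j) of the result
print(imgs[2, 1, 4, 7] == via_moveaxis[2, 4, 7, 1])
```

`np.moveaxis` only needs the axis being moved, while `np.transpose` needs the full permutation, so the former tends to be less error-prone when the array has many dimensions.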