Welcome back! We hope you had a nice lunch break. :)

We're going to dive back into our tour of array operations. Again, **try to focus on the broad concepts rather than memorizing syntax**: the cheat sheet will always be available to you!

Each time that we start up a new runtime (opening up a new notebook, or restarting a notebook's runtime after being idle for too long), we need to make sure that we import the packages that we want to use.

In [None]:
# make sure to run this cell, or you won't have access to numpy functions or arrays
import numpy as np

## Generating defined arrays

So far, we've only used `np.array()` to make arrays from existing data structures. However, `numpy` also includes useful functions for generating arrays of defined shapes and sizes "from scratch": that is, without having to pre-write the columns and rows ourselves.

The simplest array generator functions are:
 * `np.ones()`: Given a **tuple** of `(rows, columns)`, returns an array of the desired shape where each value is a `1.0`.
 * `np.zeros()`: Given a **tuple** of `(rows, columns)`, returns an array of the desired shape where each value is a `0.0`.
 * `np.full()`: Given a **tuple** of `(rows, columns)` and an input value, returns an array of the desired shape where each value is the input value.
 * `np.arange()`: Given three inputs: `start, stop, step`, returns a sequence of numerics spaced by `step` in the range between `start` and `stop`, *exclusive* of `stop`.
  * Example: `np.arange(0, 6, 3)` would return an array with `0, 3`.
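
As a quick illustration of the four generators above, here is a minimal sketch. The shapes, values, and variable names are arbitrary choices for this example, so feel free to swap in your own.

```python
import numpy as np

ones_3x2 = np.ones((3, 2))        # 3 rows, 2 columns, every value is 1.0
zeros_1x5 = np.zeros((1, 5))      # 1 row, 5 columns, every value is 0.0
sevens_2x2 = np.full((2, 2), 7)   # 2 x 2 array where every value is 7
by_fives = np.arange(5, 30, 5)    # 5, 10, 15, 20, 25 (the stop value, 30, is excluded)

print(ones_3x2)
print(zeros_1x5)
print(sevens_2x2)
print(by_fives)
```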

In [None]:
# try it out:
# create a 2 x 4 array of ones


In [None]:
# try it out:
# create a 4 x 2 array of zeros


In [None]:
# try it out:
# create a 4 x 4 array, where each value is the integer 4


In [None]:
# try it out:
# create an array with even integers from 2 to 20


## Array operations

In the introduction to arrays, we mentioned that arrays in `numpy` can be quite powerful because they allow you to perform Excel-like operations. We'll spend the rest of our afternoon lesson going over some of these operations, which will set us up for our first data exploration project tomorrow.

### Simple array arithmetic

Given arrays of the same shape, we can perform simple arithmetic element-by-element, using the very same operators we use for numerics.
- Addition with `+`
- Subtraction with `-`
- Multiplication with `*`
- Division with `/`

In [None]:
# addition with two 3 x 3 arrays

np.ones((3, 3)) + np.full((3,3), 3)

In [None]:
# subtraction with two 2 x 4 arrays

np.full((2, 4), 10) - np.full((2, 4), 4)

In [None]:
# Try it out:
# create a 5 x 3 array of 10s (integers) and multiply it by a 5 x 3 array of 2.5s (floats)


### Appending items to arrays

Not every array has to stay in the shape that it was created in. We can combine 1D arrays using `np.append()`:

In [None]:
sample_array = np.array([1, 2, 3, 4, 5, 6])
np.append(sample_array, np.array([7, 8, 9]))

Note that this doesn't update `sample_array`. If we want to update `sample_array` with the appended values, we need to save it:

In [None]:
print('Before:', sample_array)
sample_array = np.append(sample_array, np.array([7, 8, 9]))
print('After:', sample_array)

### Basic methods & functions

As we've discussed, `numpy` arrays are immensely useful for numerical operations. Many common summary calculations are available as methods for arrays.

* `.min()` and `.max()`: Returns the min or max value.
* `.sum()`: Returns the sum of values.
* `.mean()`: Returns the mean of values.
* `.std()`: Calculates the standard deviation of values.
* `.var()`: Calculates the variance of values.

These methods are solely for performing summary calculations using the elements in the array.

<img src = 'https://github.com/ccbskillssem/pythonbootcamp/raw/main/day_2/methods.png'>

The example figure above shows methods being used on the whole array, but we can *also* use these methods on sliced sections of an array, such as a single row.
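
Here is a small sketch of the same idea on a made-up 2 x 3 array (the name `small` is just for illustration): the method is applied to whatever comes before the dot, whether that's the whole array or a slice of it.

```python
import numpy as np

small = np.array([[1, 2, 3],
                  [4, 5, 6]])

print(small.sum())          # sum of all six values: 21
print(small[0].mean())      # mean of the first row only: 2.0
print(small[:, 2].max())    # max of the last column only: 6
```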

In [None]:
one_to_fifteen_array = np.array([[1, 2, 3, 4, 5],
                                [6, 7, 8, 9, 10],
                                [11, 12, 13, 14, 15]])
print('Array:\n', one_to_fifteen_array)

# Try it out:
# print the max value across the whole array

# print the sum of the second row


As we just discussed in the last section, one of the major strengths of arrays is that we can access them along axes. Many of these methods can be calculated along a specific axis.

We can specify the axis by providing it as a parameter for the method. We'll provide the axis values for 2D arrays:
* `0` refers to the "y axis" (column-wise).
* `1` refers to the "x axis" (row-wise).

<img src = 'https://github.com/ccbskillssem/pythonbootcamp/raw/main/day_2/axis_values.png'>

In [None]:
# an example of calculating a column-wise mean

print('Array:\n', one_to_fifteen_array)
print('\nColumn means:', one_to_fifteen_array.mean(axis = 0))

In [None]:
# Try it out: print the min value in each row of one_to_fifteen_array


Lastly, almost all of the calculation methods for arrays are also available in function form.

* `.min()` and `.max()` are equivalent to `np.min()` and `np.max()`.
* `.sum()` is equivalent to `np.sum()`.
* `.mean()` is equivalent to `np.mean()`.
* `.std()` is equivalent to `np.std()`.
* `.var()` is equivalent to `np.var()`.

> Strangely, median is only available as a function (`np.median`). We can't explain why this is, it's a bit of a head-scratcher.
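
To see the method and function forms side by side, here is a short sketch using a made-up array called `values`. Both forms accept the same `axis` parameter.

```python
import numpy as np

values = np.array([[1, 2, 3],
                   [4, 5, 60]])

# method form and function form give the same answer
print(values.mean(), np.mean(values))

# the axis parameter works in both forms
print(values.sum(axis=1), np.sum(values, axis=1))

# median is only available in function form
print(np.median(values))
```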

Functions and methods both look similar and perform in almost identical ways, with the key difference that methods are dependent on a class (in this case, an array). The choice of method versus function is up to you: whatever feels more intuitive for you, go for it! That being said, there are some *minor* convenience benefits to using methods. Given that most of the array methods also return arrays, we can actually use multiple methods in a sequential manner: this is called **chaining** methods together.

In [None]:
# this will give us the standard deviation of the column-wise means
# we are "chaining" .std() right after .mean(axis = 0)

print('Column wise means:', one_to_fifteen_array.mean(axis = 0))
print('Standard deviation of col-wise means:', one_to_fifteen_array.mean(axis = 0).std())

Chaining is convenient for performing sequential `numpy` operations that would otherwise be clunky to write in nested function form.
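
For comparison, here is a sketch of the same two-step calculation written both ways, using a made-up `scores` array. The chained version reads left to right, while the nested version has to be read from the inside out.

```python
import numpy as np

scores = np.array([[2, 4, 6],
                   [8, 10, 12]])

chained = scores.max(axis=1).mean()         # row maxes first, then their mean
nested = np.mean(np.max(scores, axis=1))    # same steps, nested as functions

print(chained, nested)   # both are 9.0
```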

In [None]:
# try it out, using methods only:
# find the minimum value in each column of one_to_fifteen_array
# then calculate the standard deviation of the minimums


In [None]:
# try it out, using functions only:
# find the minimum value in each column of one_to_fifteen_array
# then calculate the standard deviation of the minimums


### Broadcasting
We actually introduced you to **broadcasting** at the very beginning of our intro to `numpy`: we used this to perform a simple numerical calculation over all values in an array.

In [None]:
# review of broadcasting

np.array([1, 2, 3]) * 2

Broadcasting is a fairly complex operation in the back end of `numpy`, so we won't discuss exactly how it works. For our purposes, we can imagine that this is what's going on when we broadcast with a *single* value.

<img src = 'https://github.com/ccbskillssem/pythonbootcamp/raw/main/day_2/broadcasting1.png'>

As shown above, broadcasting allows us to "stretch" our single value from 1 x 1 to "1 x 3" for computing purposes. This is akin to having values in one column of a spreadsheet, then multiplying each value by 2 to generate values in the right adjacent column.
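
One way to convince yourself of this mental model is to write out the "stretched" value by hand and compare. This is only a sketch of the idea, not a description of how `numpy` does it internally.

```python
import numpy as np

values = np.array([1, 2, 3])

broadcasted = values * 2                    # numpy stretches the 2 for us
stretched = values * np.array([2, 2, 2])    # the stretched value, written out by hand

print(broadcasted)
print(stretched)
print(np.array_equal(broadcasted, stretched))   # True
```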

Broadcasting can also be used to generate larger 2D arrays from smaller ones: for example, if we add a 4 x 1 array and a 1 x 3 array together, `numpy` will "fill in the blanks" with broadcasting to give us a 4 x 3 array.

In [None]:
array_a = np.array([[0],
                    [10],
                    [20],
                    [30]]) # a 4 x 1 array

array_b = np.array([[1, 2, 3]]) # a 1 x 3 array

array_a + array_b

<img src = 'https://github.com/ccbskillssem/pythonbootcamp/raw/main/day_2/broadcasting2.png'>
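
If you'd like to see the "stretched" arrays themselves, `np.broadcast_to()` will stretch an array to a target shape. The sketch below uses our own names (`column` and `row`) and is meant as a way to picture the result, not as something you need to do in practice.

```python
import numpy as np

column = np.array([[0], [10], [20], [30]])   # a 4 x 1 array
row = np.array([[1, 2, 3]])                  # a 1 x 3 array

# stretch both arrays to 4 x 3 explicitly
column_stretched = np.broadcast_to(column, (4, 3))
row_stretched = np.broadcast_to(row, (4, 3))

print(column_stretched)
print(row_stretched)
print(column_stretched + row_stretched)      # same values as column + row
```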

In [None]:
# try it out:
# multiply array_a by array_b
# what do you think will be the output?


Broadcasting can get pretty complex, so we'll leave it at that. If you think that broadcasting would be useful in your own work and you'd like to learn more about the back end of it, you can read more at the numpy documentation [here](https://numpy.org/doc/stable/user/basics.broadcasting.html#basics-broadcasting).

### Modifying array values

After we've created an array, we can modify the array by simply indexing/slicing and assigning new values.

In [None]:
zeros_6x6 = np.zeros((6,6))
zeros_6x6

For example, we can change one element:

In [None]:
# change top left element to -3
zeros_6x6[0,0] = -3
zeros_6x6

Or we can *broadcast* changes across an entire row or column.

In [None]:
# first row to 8
zeros_6x6[0] = 8

# first column to 8
# remember that putting the slice (:) operator in the row index means "all rows"
# [:, 0] means "for all rows, the 0th index", which is equivalent to the first column
zeros_6x6[:, 0] = 8

zeros_6x6

Lastly, we can assign values across a slice of an array as well.

In [None]:
# bottom right quadrant to fives

zeros_6x6[3:,3:] = 5
zeros_6x6

### Random number generation

On top of providing an extensive infrastructure for array-based manipulations and computations, `numpy` also provides functionality for generating random numbers from ranges and distributions.

> This [quick article](https://engineering.mit.edu/engage/ask-an-engineer/can-a-computer-generate-a-truly-random-number/) explains why computationally generated random numbers aren't *really* random. Nevertheless, we'll just go ahead and call them random numbers for simplicity's sake.

Any time you want to generate random numbers with `numpy`, you have to instantiate a **random number generator object** using the function `np.random.default_rng()`. This generator is typically assigned to a variable called `rng` by convention. You'll see the same code copy-pasted any time we need to generate random numbers.

```
rng = np.random.default_rng()
```

The essential methods for `rng` (in our opinion, of course) allow you to draw random numbers from common distributions.

* `.integers()`: Takes three inputs: a lower bound, an upper bound, and the number of desired outputs. Draws from the set of *integers*.
* `.uniform()`: Takes three inputs: a lower bound, an upper bound, and the number of desired outputs. Each value is equally likely to be drawn from the uniform distribution in the given range.
* `.normal()`: Takes three inputs: the center (mean) of the distribution, an optional scale factor (the standard deviation, which defaults to `1`), and the number of desired outputs. Values near the center are the most likely to be drawn, following the normal distribution.

As we've come to expect, the input ranges for `.integers()` and `.uniform()` are *exclusive* of the upper bound. `.normal()` has no bounds at all: any value can be drawn, although values far from the center are rare.

In [None]:
# try running this a few times
# notice how the numbers are different each time

rng = np.random.default_rng()

print('Integers:\n', rng.integers(0, 11, 5))
print('Uniformly distributed floats:\n', rng.uniform(0, 1, 5))
print('Normally distributed floats:\n', rng.normal(0, 1, 5))

You can provide a **seed value** when instantiating your `rng` to ensure that your results are reproducible by others. We'll be using `2023` as a seed value for our random number draws, simply because it's an easy to remember number. This will allow us (as a class) to generate the same "random" numbers in all examples.

In [None]:
# if we provide a number as the "seed"...
# our numbers will be the same each time
rng = np.random.default_rng(2023)

# try running this a few times!
rng.integers(0, 10, 10)

---

**VERY IMPORTANT NOTE**: You must always copy-paste the `rng = np.random.default_rng(SEED_VALUE_HERE)` line into any cell where you want to generate reproducible random numbers. If you instantiate `rng` in one cell and then draw random numbers in another cell, re-running the drawing cell picks up wherever the generator left off, so your numbers will not be reproducible. We will generally do this for you during the bootcamp, but just keep that in mind for when you write code on your own.

---
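
To see why this matters, here is a minimal sketch (the variable names are ours). A generator remembers how many numbers it has already handed out, so a second draw from the same `rng` continues onward instead of starting over.

```python
import numpy as np

rng = np.random.default_rng(2023)
first_draw = rng.integers(0, 1000, 20)
second_draw = rng.integers(0, 1000, 20)    # same generator: continues where it left off

rng = np.random.default_rng(2023)          # re-create the generator with the same seed
fresh_draw = rng.integers(0, 1000, 20)

# True: same seed, same starting point
print(np.array_equal(first_draw, fresh_draw))

# expected to be False: the second draw came from later in the sequence
print(np.array_equal(first_draw, second_draw))
```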

We can also use `rng` to draw values from existing arrays, which can be useful for obtaining a random sample of values.

* `.choice()`: Takes three inputs: an existing array, the number of desired elements, and a Boolean value indicating drawing with (`True`) or without (`False`) replacement.
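
As a sketch of the difference between the two settings, here is `.choice()` applied to a made-up array called `odd_ints`. No seed is used, so the output will change on every run.

```python
import numpy as np

rng = np.random.default_rng()          # no seed: results differ each time
odd_ints = np.array([1, 3, 5, 7, 9])

with_replacement = rng.choice(odd_ints, 5, True)       # the same value can be drawn more than once
without_replacement = rng.choice(odd_ints, 5, False)   # each value can be drawn only once

print(with_replacement)
print(without_replacement)
```

Because we asked for 5 values out of 5 *without* replacement, the second result is always a shuffled version of `odd_ints`.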

In [None]:
# create an array of even integers from 0 to 22, excluding 22
even_ints = np.arange(0,22,2)
even_ints

In [None]:
# try it out
# use .choice() without a seed value to draw 5 random integers from even_ints with replacement
# run this cell multiple times: you will see a different set of numbers each time you run this cell


`numpy` supports many more distributions and methods for random number generation, including but not limited to:

* Exponential
* Binomial
* Poisson
* Dirichlet
* Beta
* Permutations of existing arrays

Click [here](https://numpy.org/doc/stable/reference/random/generator.html) to read more about all the different types of random numbers that can be generated using `numpy`.

# Shapeshifting arrays

Let's say that we want to modify some aspect of our array:
* Maybe we want to *transpose* the rows and columns.
* Perhaps we want to *reshape* the numbers of rows and columns.
* We might even want to *flatten* a multi-dimensional array into a flat (1D) array.

In numpy, there are a variety of methods that you can use to accomplish this with arrays. We will be focusing on 2D arrays for simplicity.

* `.reshape()`: Given a desired number of rows and columns, returns a reshaped version of the array.
* `.transpose()`: Returns a transposed version of the array (`i` rows and `j` columns becomes `j` rows and `i` columns).
* `.flatten()`: Returns a 1D copy of the array, where former rows are provided in sequential order.

These methods *do not* modify the array in place: if you'd like your array to remain in its new form, you'll need to update the existing array variable or create a new variable.
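
Here is a short sketch of what that looks like in practice, using a made-up array called `grid`:

```python
import numpy as np

grid = np.array([[1, 2, 3],
                 [4, 5, 6]])

grid.reshape(3, 2)            # the result isn't saved anywhere
print(grid.shape)             # still (2, 3)

grid = grid.reshape(3, 2)     # reassign the variable to keep the new shape
print(grid.shape)             # now (3, 2)
```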

Let's start by creating an array with integers from 1 to 6.

In [None]:
sample_array = np.arange(1, 7, 1)
print(sample_array) # a 1D array with 6 elements
print('sample_array has', sample_array.size, 'elements.')
print('sample_array has the following shape:', sample_array.shape)

## Reshaping
`.reshape()` can be used to alter the way that the elements of an array are distributed across rows and columns. For example, we can use `.reshape()` to reconfigure `sample_array` into a 6 x 1 version of itself.

In [None]:
print(sample_array.reshape(6, 1))
print('Shape: ', sample_array.reshape(6, 1).shape)

Note that `.reshape()` will only work if the number of rows multiplied by the number of columns exactly matches the number of elements present in the referenced array.
<img src = 'https://github.com/ccbskillssem/pythonbootcamp/raw/main/day_2/reshape.png'>

In [None]:
# try it out:
# what happens if you try to reshape sample_array into a 2 x 2 array


In [None]:
# Try it out:
# use .reshape() to update sample_array to a 3 x 2 array


## Transposing
Using `.transpose()`, we can swap the rows and columns, essentially flipping the array across its diagonal: the first row becomes the first column, and so on.

<img src = 'https://github.com/ccbskillssem/pythonbootcamp/raw/main/day_2/transpose.png'>

In [None]:
print('Before .transpose():\n', sample_array)

print('\nAfter .transpose():\n', sample_array.transpose())
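
To see what swapping rows and columns does to individual elements, here is a small sketch with a made-up 2 x 3 array called `grid`:

```python
import numpy as np

grid = np.array([[1, 2, 3],
                 [4, 5, 6]])      # 2 rows, 3 columns
flipped = grid.transpose()       # 3 rows, 2 columns

print(flipped)
print(grid.shape, flipped.shape)     # (2, 3) (3, 2)
print(grid[0, 2], flipped[2, 0])     # 3 3: the row and column indices trade places
```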

## Flattening
Finally, we can use `.flatten()` to return `sample_array` to its original form.

In [None]:
# try it out:
# use .flatten() to make sample_array into a 1D array again


## Stacking arrays
We can also **stack** existing arrays along shared dimensions to create larger arrays using two convenient functions:
* `np.hstack()`: Given a tuple of arrays that have the same number of rows, returns a larger array with input arrays "stacked" from left to right.
* `np.vstack()`: Given a tuple of arrays that have the same number of columns, returns a larger array with input arrays "stacked" from top to bottom.
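
The arrays being stacked only need to match along the shared dimension. As a sketch (with made-up array names), the arrays below have different shapes but still stack without complaint:

```python
import numpy as np

left = np.zeros((2, 2))
right = np.ones((2, 3))                  # same number of rows as left
print(np.hstack((left, right)).shape)    # (2, 5)

top = np.zeros((1, 3))
bottom = np.ones((2, 3))                 # same number of columns as top
print(np.vstack((top, bottom)).shape)    # (3, 3)
```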

In [None]:
array_a = np.zeros((3, 2))
array_b = np.ones((3, 2))
array_c = np.full((3, 2), 3)

np.hstack((array_a, array_b, array_c))

In [None]:
# Try it out:
# *vertically* stack array_a, array_b, and array_c (in that order)

# *horizontally* stack array_a, array_b, and array_c (in that order)


# Boolean masking and filtering

We've done a ton with arrays so far: we've created them, assigned values to them and changed them every which way, massaged them into different shapes and sizes, performed some simple math with them, and even experimented with broadcasting math over arrays of different shapes.

The last thing we'll learn is how to *filter* arrays by their values.

In [None]:
# we're going to quickly generate an array of 25 random integers
# ranging from -10 to 9 (the upper bound, 10, is excluded)
# then we'll reshape them into a 5 x 5 array

rng = np.random.default_rng(2023)
random_ints = rng.integers(-10, 10, 25)
random_ints = random_ints.reshape((5,5))

random_ints

We can use logical operators to filter arrays by certain conditions: by doing so, we broadcast the logic check over each element and obtain an array of Booleans (`True` or `False`).

In [None]:
# which of the elements in random_ints are greater than 3?

random_ints > 3

In [None]:
# this works with multiple logic checks too
# note: you have to use the & operator, it won't work with the literal "and"

(random_ints > 3) & (random_ints < 8)

This array of Booleans can then be applied to select only values that are `True` in the Boolean array, thereby filtering values in accordance with our logic check. This method of using a Boolean array as a filter is also called **Boolean masking**.
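
Here is the whole process on a small, fixed array so that the result is easy to check by eye. The `temps` array and the cutoff of 20 are made up for this sketch.

```python
import numpy as np

temps = np.array([[12, 25, 31],
                  [18, 29, 8]])

mask = temps > 20      # Boolean array with the same shape as temps
print(mask)
print(temps[mask])     # 1D array of the values where mask is True: [25 31 29]
print(mask.sum())      # True counts as 1, so this counts the matches: 3
```

Note that the filtered result is always 1D, even though `temps` is 2D.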

In [None]:
# All values that are greater than 3:

random_ints[random_ints > 3]

Lastly, we can also use this method to modify array values according to logic checks.

In [None]:
# Set all values less than 0 to be 0

random_ints[random_ints < 0] = 0
random_ints

# Exercises

**1**: Here is the `even_3x4` array that we saw this morning.

In [None]:
even_3x4 = np.array([[2, 4, 6, 8],
                     [10, 12, 14, 16],
                     [18, 20, 22, 24]])

print(even_3x4)

**1A**: Using both indexing/slicing and array methods, print the values for the following operations with `even_3x4`:

1. The average value across the last three columns
2. The standard deviation for each column
3. The average of the standard deviation values in #2

In [None]:
### write your code below ###

print(even_3x4[:,1:].mean())
print(even_3x4.std(axis=0))
print(even_3x4.std(axis=0).mean())

**1B**: Using broadcasting, take the square root of each element in `even_3x4`, then generate the row-wise sum.

In [None]:
### write your code below ###

(even_3x4 ** 0.5).sum(axis = 1)

**1C**: Use broadcasting to perform an operation with `even_3x4` that will yield the following array:

```
[[ 4,  8, 12, 16],
 [12, 16, 20, 24],
 [20, 24, 28, 32]]
```

In [None]:
# hint: this is a single arithmetic operation with a 1D array

### write your code below ###

even_3x4 + np.array([2, 4, 6, 8])

**1D**: Create a 3 x 1 array and a 1D array, then use broadcasting to perform an operation that will yield the following array:

```
[[-1,  1, -1],
 [-1,  1, -1],
 [-1,  1, -1]]

```

In [None]:
### write your code below ###
print(np.array([[1],
                [1],
                [1]]) * np.array([-1, 1, -1]))

print(np.ones((3,3)) * np.array([-1, 1, -1]))

**2A**: Generate an array of 100 random numbers drawn from the normal distribution centered at `5.0` with a scale factor (standard deviation) of `3`, using the seed value `2023`. Assign the array to a variable called `random_normal`.

In [None]:
### write your code below ###

rng = np.random.default_rng(2023)
random_normal = rng.normal(5.0, 3, 100)
random_normal

**2C**: Write a function called `interval_range()` that takes in 3 arguments:
1. An array of 100 randomly-distributed numbers
2. A lower bound (integer)
3. An upper bound (integer)

The function should return all values in the array that fall inside the range between the lower and upper bound, inclusive of the upper bound. Test it on `random_normal` with `2` as the lower bound and `5` as the upper bound.

In [None]:
### write your code below ###

def interval_range(random_numbers, lo, hi):
  return random_numbers[(random_numbers > lo) & (random_numbers <= hi)]

interval_range(random_normal, 2, 5)

**3**: So far, we've mainly been working with 2D arrays. Color images are a good example of objects that are represented with 3D arrays in `numpy`. Run the cell below to read and display a sample image provided by the sklearn Python package.

In [None]:
from sklearn.datasets import load_sample_image
import matplotlib.pyplot as plt
import matplotlib

img_data = load_sample_image("flower.jpg")
plt.imshow(img_data)

**3A**: Print the dimension and shape of the image above. What do you think each axis represents? How many pixels are in this image?

💡 **Hint**: An image is represented here with a 3D numpy array. Since the image is in color, each pixel needs to have 3 color coordinates for red, green and blue, abbreviated as RGB.

> *Note*: Sometimes, you'll see a 4th column across the 3rd dimension, that controls the opacity of each pixel (this parameter is called alpha).

In [None]:
### write your code below ###
print(img_data.ndim)
print(img_data.shape)

# Axes 0 and 1 correspond to the y and x coordinates of each pixel, respectively. Axis 2 (3rd axis) contains
# the RGB values of each pixel. There are 427*640=273280 pixels in this image.

**3B**: Try cropping the image to focus on just the flower by defining a new 3D array, `crop_data`. Then display your cropped image.

In [None]:
### write your code below ###

crop_data=img_data[75:375, 150:475, :]

#display your cropped image
plt.imshow(crop_data)

**3D**: We can transform this image into a grayscale image where each pixel is represented by a single grayscale value rather than three RGB values. To do this, we can represent each pixel as a weighted sum of its RGB values: each red, green, and blue value of the pixel is multiplied by a certain weight, then the three values are summed to produce a single grayscale value.

Using this formula, transform the image above into a grayscale image. Save the new numpy array into the `grayscale_img` variable. Lastly, run the cell further below to display your grayscale image.

💡 **Hint**: One way to do this is: `np.dot(img_data[:, :, :], rgb_weights)`. Try computing the weighted sum yourself.
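
To check that the two approaches agree, here is a sketch on a made-up 2 x 2 "image" called `tiny_img` (not the flower image), so the numbers stay small enough to inspect.

```python
import numpy as np

rgb_weights = [0.2989, 0.5870, 0.1140]

# a made-up 2 x 2 "image": each pixel holds three RGB values
tiny_img = np.array([[[255, 0, 0], [0, 255, 0]],
                     [[0, 0, 255], [100, 100, 100]]])

# broadcast the weights across every pixel, then sum over the color axis
by_hand = (tiny_img * rgb_weights).sum(axis=2)

# the same weighted sum in a single call
with_dot = np.dot(tiny_img, rgb_weights)

print(by_hand.shape)                    # (2, 2): one gray value per pixel
print(np.allclose(by_hand, with_dot))   # True
```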

In [None]:
rgb_weights = [0.2989, 0.5870, 0.1140]

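### write your code below ###
# store the weighted values in a new variable so that re-running this cell
# doesn't apply the weights to crop_data a second time
weighted_data = crop_data * rgb_weights
grayscale_img = weighted_data.sum(axis=2)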

# display your grayscale image
plt.imshow(grayscale_img, cmap=plt.get_cmap("gray"))

This exercise was in part inspired by this post and uses the RGB weights it provides: https://www.adamsmith.haus/python/answers/how-to-convert-an-image-from-rgb-to-grayscale-in-python

**Challenge 1**: Write a function called `exp_prob()` that returns the probability that an exponential random variable `X` falls in the interval `(a,b)`. The function should take three inputs: `sample_size`, `a` and `b`.

We've pre-written some skeleton code for generating samples from the exponential distribution:
  - `.exponential()` is a method that operates on our random number generator (`rng`) to generate an array of `sample_size` numbers drawn from the [exponential distribution](https://numpy.org/doc/stable/reference/random/generated/numpy.random.Generator.exponential.html). You can use these numbers in your code by referencing the variable `random_exponentials`. (See the random number generation section above for more info on `numpy` pseudo-random generation.)

💡 **Hint**: There are also multiple ways to assess if sampled numbers are within your interval, but not all of them are *efficient*. Try to see if you can accomplish this using Boolean masking.

To check your work, you can set the *seed value* for `rng`:

```
rng = np.random.default_rng(2023)
```

Your function's output should look like this for `sample_size = 1000` and the interval `(0, 3)`:
```
out = exp_prob(1000, 0, 3)
print(out) # prints 0.944
```

In [None]:
def exp_prob(sample_size, a, b):
  rng = np.random.default_rng(2023) # use np.random.default_rng(2023) for reproducibility
  random_exponentials = rng.exponential(scale = 1.0, size = sample_size)

  ### write your code below ###
  within_interval = random_exponentials[(random_exponentials > a) & (random_exponentials < b)].size
  return within_interval/sample_size

exp_prob(1000, 0, 3)

**Challenge 2**: Write a function called `exp_prob_trials()` that performs the same operation as `exp_prob()`, but for a defined number of trials (multiple draws of the same sample size).

This function should take four inputs: `trials, sample_size, a, b`. `trials` should be an integer number of trials.

This function should return a tuple of the mean and standard deviation of the probability values generated across the number of trials.

As mentioned above, you can set the *seed value* of `rng` to `2023` to check your work.

Your function's output should look like this for `trials = 1000`, `sample_size = 10`, and the interval `(0, 3)`:
```
out = exp_prob_trials(1000, 10, 0, 3)
print(out) # prints (0.9509000000000001, 0.06869636089342722)
```

In [None]:
def exp_prob_trials(trials, sample_size, a, b):
  rng = np.random.default_rng(2023) # use np.random.default_rng(2023) for reproducibility
  # we can provide a two-element tuple to size to create multiple generated arrays
  random_exponentials = rng.exponential(scale = 1.0, size = (trials, sample_size))

  ### write your code below ###
  trial_results = np.array([])
  for sample in random_exponentials:
    within_interval = sample[(sample > a) & (sample < b)].size
    trial_results = np.append(trial_results, within_interval/sample_size)
  return trial_results.mean(), trial_results.std()

exp_prob_trials(1000, 10, 0, 3)
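
The loop above works well. As an alternative, here is a sketch that leans on Boolean masking and the `axis` parameter instead of a loop. The function name `exp_prob_trials_vectorized` is our own, and this is just one possible approach: it relies on the fact that the mean of a row of Booleans is the fraction of `True` values in that row.

```python
import numpy as np

def exp_prob_trials_vectorized(trials, sample_size, a, b):
    rng = np.random.default_rng(2023)   # seeded for reproducibility
    random_exponentials = rng.exponential(scale=1.0, size=(trials, sample_size))

    # Boolean array with one row per trial
    in_interval = (random_exponentials > a) & (random_exponentials < b)

    # the row-wise mean of Booleans is the fraction of True values per trial
    trial_results = in_interval.mean(axis=1)
    return trial_results.mean(), trial_results.std()

print(exp_prob_trials_vectorized(1000, 10, 0, 3))
```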

The following topics are useful, but we will not use them heavily in the remainder of the week. This material may be covered at the lecturer's discretion.

# [Optional] Sorting arrays

Although we can typically use `.min()`/`np.min()` and `.max()`/`np.max()` to retrieve important min/max values, there may be some occasions in which you want to sort your array's values. For that, there's `.sort()` and `np.sort()`.

However, unlike with all the other summary calculations, `.sort()` and `np.sort()` do have a significant difference in their functionality:
* `np.sort()`: A function that takes in an array and axis value, then returns a sorted copy of the input.
* `.sort()`: A method that references an existing array, takes an axis value, and **sorts the referenced array in place**.

With that in mind, it's best to consult your cheat sheet or look up the documentation if you can't remember which one does which.
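
Here is a quick sketch of the difference, using a made-up array called `jumbled`:

```python
import numpy as np

jumbled = np.array([3, 1, 2])

sorted_copy = np.sort(jumbled)    # function: returns a new, sorted array
print(sorted_copy)                # [1 2 3]
print(jumbled)                    # [3 1 2]: the original is untouched

jumbled.sort()                    # method: sorts in place and returns None
print(jumbled)                    # [1 2 3]
```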

In [None]:
unordered_array = np.array([2, 9, 4, 6, 11, -4 , 2, 0, 23])

# try it out
# use np.sort() on unordered_array


Both the function and method also work for 2D arrays: however, you have to take extra care if you're trying to preserve any sort of row-wise or column-wise relationship.

In [None]:
# a test array
unordered_2D = np.array([[1, 2, 3, 4],
                         [4, 3, 2, 1],
                         [0, 5, 0, 5]])

unordered_2D

In [None]:
# try it out:
# use np.sort() to sort unordered_2D row-wise


Sorting by rows operates as you might expect: row elements are reshuffled, such that the lowest value in the row goes to the first element of the row.

In [None]:
# now try it column-wise

np.sort(unordered_2D, axis = 0)

Columns are a little trickier: the lowest value in each column will be moved to the top row, and so on for the remaining sorted elements. This will result in rows having different elements compared to the starting array, so don't sort like this if your row values are meant to stay together (i.e. in a tabular format).

# [Optional] Splitting arrays
Similarly, we can also **split** an existing array into smaller arrays.
* `np.hsplit()`: Takes an array and a number of desired arrays, then equally(\*) splits the array column-wise into the desired number of sub-arrays, returning them as a **list** of arrays.
* `np.vsplit()`: Takes an array and a number of desired arrays, then equally(\*) splits the array row-wise into the desired number of sub-arrays, returning them as a **list** of arrays.

> (\*) Note that this won't work if the array can't be divided up equally. If you need smaller arrays of different sizes, you can either slice the array or look at the more advanced documentation for `np.hsplit()` ([here](https://numpy.org/doc/stable/reference/generated/numpy.hsplit.html#numpy.hsplit)) and `np.vsplit()` ([here](https://numpy.org/doc/stable/reference/generated/numpy.vsplit.html#numpy.vsplit)).
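
As a small sketch of what the returned list looks like, here is `np.hsplit()` applied to a made-up 2 x 4 array called `wide`:

```python
import numpy as np

wide = np.array([[1, 2, 3, 4],
                 [5, 6, 7, 8]])    # a 2 x 4 array

halves = np.hsplit(wide, 2)        # a list of two 2 x 2 arrays
print(len(halves))                 # 2
print(halves[0])                   # left half: columns 0 and 1
print(halves[1])                   # right half: columns 2 and 3
```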

In [None]:
# let's work with the horizontally stacked a, b, and c

stacked_abc = np.hstack((array_a, array_b, array_c))
stacked_abc

In [None]:
# split into two sub-arrays, column-wise

np.hsplit(stacked_abc, 2) # the output looks bad, but this is a list of two 3 x 3 arrays

In [None]:
# try it out:
# split stacked_abc into three sub-arrays, row-wise
