# <p style="background-color: #f5df18; padding: 10px;">Programming & Plotting in Python | **Arrays** </p>


<div style="display: flex;">
    <div style="flex: 1; margin-right: 20px;">
        <h2>Questions</h2>
        <ul>
            <li>What is an array, and how does it differ from a regular Python list?</li>
            <li>What are some common array manipulation functions provided by NumPy?</li>
        </ul>
    </div>
    <div style="flex: 1;">
        <h2>Learning Objectives</h2>
        <ul>
            <li>Understand the concept of arrays and their importance in programming.</li>
            <li>Explore different operations that can be performed on arrays, such as indexing, slicing, and reshaping.</li>
            <li>Understand the advantages of using NumPy arrays over Python lists.</li>
            <li>Learn how to perform basic statistical operations on arrays using NumPy.</li>
        </ul>
    </div>
</div>


# **1-dimensional lists and arrays**
---

When handling data, it's common to encounter not just individual numbers but rather sets of values. These sets can include various data types like floats, integers, strings, or a mixture of these. In this context, we identify two primary data structures: `lists` and `arrays`.



`Lists` are written with square brackets `[...]`, which enclose the list's contents. An easy way to picture a list is as a row in an Excel spreadsheet, where multiple cells each hold an individual piece of information.



The following is an example of a 1D list:

In [4]:
# List of famous astronomers
astronomers = ['Vera Rubin', 'Nancy Roman', 'Galileo Galilei', 'Carl Sagan', 'Edwin Hubble', 'Cecilia Payne-Gaposchkin']
print(astronomers)

# **Indexing**
To access the data within a list, you use indexing. An index is the position of an element in a list or array. For a 1-D list a single index is enough; for a 2-D array (covered later) you give both a row index and a column index.

## 🔔 **Python uses zero-based indexing!**

This means that the initial value of a list is assigned the 0th index.

To reiterate: **the initial value in a list is indexed as zero.** Consequently, the second element is indexed as 1, the third as 2, and so forth.

<p align="center">
  <img width="750" src="https://www.thecrazyprogrammer.com/wp-content/uploads/2015/05/Array-in-Java.gif">
</p>

&nbsp;

To call a certain value from a list, call the list name followed by brackets containing the index of the value you want:
```python
astronomers[index]
```
Here is a quick example:

In [1]:
astronomers[0]

The index has to be an **integer**; you cannot have the 1.5th element of a list. You are not limited to counting forward, either: you can also count backward from the end. For example, `astronomers[-1]` gives you the last entry in the list, and `astronomers[-2]` gives you the second-to-last entry.

In [2]:
astronomers[-1]

Take a look below at what happens when we try to access an index that doesn't exist in the list. This is a common coding error, often called an "out of bounds" error; Python reports it as an `IndexError`.

In [3]:
astronomers[90]
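
One way to guard against this error is to compare the index with the length of the list, or to catch the `IndexError` when it happens. The snippet below is a minimal sketch of both ideas; it repeats the `astronomers` list so that it runs on its own, and the variable names `n`, `last_valid_index`, and `message` are only illustrative.

```python
# Same list as above, repeated so this snippet runs on its own
astronomers = ['Vera Rubin', 'Nancy Roman', 'Galileo Galilei',
               'Carl Sagan', 'Edwin Hubble', 'Cecilia Payne-Gaposchkin']

n = len(astronomers)          # number of elements in the list: 6
last_valid_index = n - 1      # largest valid positive index: 5

try:
    astronomers[90]           # there is no element at index 90
except IndexError as err:
    message = str(err)        # Python's description of what went wrong
    print('IndexError:', message)
```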

# **Slicing arrays and lists**
---

You can also extract **slices** of lists by specifying certain index values. This feature is useful when you need to visualize only a portion of your data, such as plotting a small subset of it.
&nbsp;
   

For slicing, put a colon inside the brackets to give the range of indices you'd like to return. Syntax:
```
    :x - from the beginning up to (but not including) index x
    x: - from index x until the end
    a:b - from index a up to (but not including) index b
    a:b:c - every c'th entry from index a up to (but not including) index b
```

These can be combined, for example:

```
a::c - goes from index a until the end in steps of c.
```



Slicing is *exclusive*, so the last index of a range isn't included. For example, if you want to take indices two through six of a list called `my_list`, you should write:

    my_list[2:7]
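
As a quick illustration of each slice form described above, here is a short sketch using a made-up list called `letters` (the list and the variable names are only for demonstration).

```python
letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']   # indices 0 through 7

first_three = letters[:3]           # ['a', 'b', 'c']
from_five = letters[5:]             # ['f', 'g', 'h']
two_to_six = letters[2:7]           # ['c', 'd', 'e', 'f', 'g'] (index 7 is excluded)
every_other = letters[1:7:2]        # ['b', 'd', 'f']
from_two_by_three = letters[2::3]   # ['c', 'f']

print(first_three, from_five, two_to_six, every_other, from_two_by_three)
```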


Take a second to use the code below to test out how slicing works.

In [None]:
test_list = [42,67,21,33,90,26]   # A list with 6 elements.

print('1st four elements:',test_list[:4])

print('Test out slices below!')
### Try out different slices below ####

print(test_list[:])

## <p style="background-color: #f5df18; padding: 10px;"> 🛑 Slicing Exercise </p>

---

Using the list defined below called `nums`, use slicing to print out only the even values from 4 to 12.

In [None]:
nums = [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20] # don't change this line!

### your solution here ####

[4, 6, 8, 10, 12]

# **Numpy arrays**
---

While working with lists is incredibly useful, they have some properties that can become a problem. For example, if you want to do a mathematical operation on a list, it doesn't work quite as easily as you might expect.

For example, the following cell:

In [5]:
teehee = [1.0, 2.0, 3.0, 4.0, 5.0]
teehee5 = 5 * teehee

print(teehee5)

This won't multiply all the values within the list by 5; instead, it repeats the original list 5 times. This is clearly a problem when working with data, which we often must manipulate and do math with!

To get around this issue we can create **arrays** from our lists using the `array()` command from the `numpy` package. **Arrays are able to store data like lists AND do math with it!**


We start off by importing the `numpy` library and renaming it as `np` (so we don't have to type as much). Next we make two lists of numbers and try adding them. Then we convert these lists to arrays using `np.array(my_list)` for each and add the arrays together.

In [6]:
# make a couple of lists
a = [1, 2, 3, 4, 5]
b = [6, 7, 8, 9, 10]

# add lists together


In [7]:
#convert lists into numpy arrays!
import numpy as np

a1 = #
b1 = #

# add arrays together


As you can see, adding two `numpy` arrays adds the corresponding elements together, whereas adding two lists simply joins them end to end.
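
To make the contrast concrete, the sketch below runs both kinds of addition side by side. It reuses the lists `a` and `b` from above; the names `list_sum` and `array_sum` are only illustrative.

```python
import numpy as np

a = [1, 2, 3, 4, 5]
b = [6, 7, 8, 9, 10]

list_sum = a + b        # lists: + joins the two lists into one list of 10 items

a1 = np.array(a)        # convert each list into a numpy array
b1 = np.array(b)
array_sum = a1 + b1     # arrays: + adds element by element (1+6, 2+7, ...)

print(list_sum)
print(array_sum)
```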

You can use all the basic math operations on arrays, and they will always apply to each element. For example, multiplying all the elements in an array by 7 is this easy:

In [9]:
# now, let's multiply the array 'a1'  by seven



Here's a representation of basic mathematical operations:

```
1. +           add
2. -           subtract
3. *           multiply
4. /           divide
5. **          power
6. np.log()    log-base e (natural log)
7. np.log10()  log-base 10
8. np.exp()    exponential
```

All of these operations apply to each element of the array, including NumPy functions like `np.log`, `np.sin`, and others. This demonstrates another way in which NumPy simplifies and enhances the use of Python for scientific purposes, making it more straightforward and convenient.
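
Here is a short sketch of several of these operations applied to one small array. The array `x` and its values are made up for illustration; the comments give the resulting values rather than the exact printed format.

```python
import numpy as np

x = np.array([1.0, 10.0, 100.0])

plus_two = x + 2                # values 3, 12, 102
times_seven = x * 7             # values 7, 70, 700
halved = x / 2                  # values 0.5, 5, 50
squared = x ** 2                # values 1, 100, 10000
log_base_10 = np.log10(x)       # values 0, 1, 2
round_trip = np.exp(np.log(x))  # exp undoes log, so this recovers x (up to rounding)

print(plus_two, times_seven, halved, squared, log_base_10, round_trip)
```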

In [None]:
# let's try out some of the possible numpy mathematical operations

### **Functions that make arrays for you: `linspace` and `arange`**
---

NumPy provides two functions for generating arrays of evenly spaced numbers: [`np.linspace()`](https://numpy.org/doc/stable/reference/generated/numpy.linspace.html) and [`np.arange()`](https://numpy.org/doc/stable/reference/generated/numpy.arange.html). Both give you an array of numbers between two values that are linearly spaced (e.g. 2, 4, 6, 8, 10), but they do it slightly differently.


The syntax is the following:

```python
np.linspace(beginning_number, end_number, number_of_points)
np.arange(beginning_number, end_number, step_size)
```


The `step_size` here corresponds to the difference between one point and the next in your array of values.

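Typically `np.linspace` is used when you know how many data points you need, and `np.arange` is used when you want to jump by a certain amount between each number. The latter is typically used for stepping by whole numbers. You can still use it with decimal steps, although floating-point rounding can make the length of the result hard to predict, so `np.linspace` is usually the safer choice in that case.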


An important note is that `np.linspace` is an *inclusive* function, meaning that the end number you give is included in the output array. However, `np.arange` is *exclusive*, meaning the end number is not included. Keep this in mind as we move forward!
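
The difference is easiest to see by giving both functions the same start and end values. The sketch below uses arbitrary numbers (0 and 10) chosen only for illustration.

```python
import numpy as np

# 5 evenly spaced points from 0 to 10; the end value 10 IS included
evenly_spaced = np.linspace(0, 10, 5)    # values 0, 2.5, 5, 7.5, 10

# steps of 2 starting at 0; the end value 10 is NOT included
stepped = np.arange(0, 10, 2)            # values 0, 2, 4, 6, 8

print(evenly_spaced, evenly_spaced.dtype)   # linspace returns floats
print(stepped, stepped.dtype)               # arange with whole numbers returns integers
```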


In [10]:
#Makes an array which has 5 entries and goes from 0 to 4. Notice the data type of the elements!



## <p style="background-color: #f5df18; padding: 10px;"> 🛑 array division </p>

Create two `numpy` arrays with 5 entries each, one containing even numbers and one containing odd numbers. Now divide one by the other and print the result.

In [11]:
### your answer here ##


## <p style="background-color: #f5df18; padding: 10px;"> 🛑 `linspace` exercise </p>

Create a linear array (using `np.linspace`) with 10 values between 0 and 1.  Write an `if` statement that checks whether the last entry in the array is less than one. If so, have it print out the last entry. If not, have it print out "[last entry] is not less than one."

In [12]:
### your answer here ##

## <p style="background-color: #f5df18; padding: 10px;"> 🛑 `arange` exercise </p>

Create a `linspace` array with ten entries between 1 and 100. Create an `arange` array from 100 to 200 (with 200 included!) spaced by 10s. Print them to check.

In [13]:
### your answer here ##

# **2-D arrays (and more)**
Your data might not be a 1-D list of numbers or letters. Instead, it may sometimes look like an Excel spreadsheet, with rows and columns. This is a 2-D array, which is what we typically use for storing data.

In [None]:
hello = np.array([["who", "what", "when"],
                  ["where", "why", "how"]])

To index a specific element in a 2D array, you would give the row and column indices like a set of coordinates:

    arrayname[0,0] # again in row, col notation
    


Refer to this helpful diagram below that visually shows each index of a 2-D array.

<p align="center">
  <img width="350" src="https://iq.opengenus.org/content/images/2020/04/index.png">
</p>



Again, each index can be given as a variable, so long as the variable is an **integer**. You can't have the 1.5th element of a list.

Here is an example of how to index the same element in 3 different ways:

In [None]:
print(hello[1,2]) #get the element in the 2nd row & 3rd column
print(hello[1][2]) #This is another way to get the same value with different syntax

i = 1
j = 2
print(hello[i,j]) #A third way to get the same value, but this time using variables as the indexing value

how
how
how


You can also do similar slicing work with the 2-D arrays.

In [14]:
#example lists
a = [0, 1, 2, 3]
b = [4, 5, 6, 7]
c = [8, 9, 10, 11]
d = [12, 13, 14, 15]

#a 2D array made of these lists
e = np.array([a, b, c, d])

e

In [15]:
# Let's print rows 3 and 4

In [16]:
# Let's print row 3 and the first 3 elements
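
Slicing a 2-D array uses one slice per axis, separated by a comma: rows first, then columns. The sketch below rebuilds the same values as `e` under the hypothetical name `grid` so that it runs on its own, and shows a few slices other than the ones asked for above.

```python
import numpy as np

# Same values as the array `e` above
grid = np.array([[0, 1, 2, 3],
                 [4, 5, 6, 7],
                 [8, 9, 10, 11],
                 [12, 13, 14, 15]])

first_two_rows = grid[:2]       # rows 0 and 1, all columns
second_column = grid[:, 1]      # every row, column 1: values 1, 5, 9, 13
top_right = grid[:2, 2:]        # rows 0-1 and columns 2-3: [[2, 3], [6, 7]]

print(first_two_rows)
print(second_column)
print(top_right)
```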

To loop over every element of a 2-D array you need not just one loop but two, since there's a row index AND a column index. We have returned to nested loops!

The outer loop steps through the rows with index `i`: when `i = 1`, we are looking at the `b` list from above (since it is the second row). The inner loop steps through the columns with index `j`: when `j = 3`, for example, we reach the 4th entry in that row (in this case, 7).
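
As a minimal sketch of this pattern, the snippet below loops over a smaller 2-D array made from just the first two rows of the example (the name `small` and the list `visited` are only illustrative). It uses `np.shape` to get the number of rows and columns.

```python
import numpy as np

small = np.array([[0, 1, 2, 3],
                  [4, 5, 6, 7]])

n_rows, n_cols = np.shape(small)   # (2, 4): 2 rows, 4 columns
print(len(small))                  # len only counts the rows: 2

visited = []                       # keep a record of (i, j, value) as we go
for i in range(n_rows):            # outer loop: one pass per row
    for j in range(n_cols):        # inner loop: one pass per column in that row
        print(i, j, small[i, j])
        visited.append((i, j, int(small[i, j])))
```

The last combination visited is `i = 1`, `j = 3`, which is the value 7 described above.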


In [17]:
# Let's print the length of e

In [18]:
# Let's use np.shape to print the shape of the array e


In [19]:
## Create a nested loop to print all values in the array


## <p style="background-color: #f5df18; padding: 10px;"> 🛑 looping through a 2D array </p>

A.) Make a 2D array of numbers that is 4 rows and 4 columns.

B.) Make a nested for loop that goes first through each row and prints the row number, then prints the element at each column in that row.

In [None]:
### your solution here ####

&nbsp;

# **Array Manipulation and Attributes**
---
Before we end for today, let's cover how to change an array once it's been created, as well as another useful way to get an array's properties.

## **Manipulating existing arrays**

You can **add values to the end of an array** using `np.append()`. Use it like this:
```python
new_array = np.append(array, something_appended)
```


You can even append another array, like this:

```python
array_appended = np.append(array, [5, 6])
two_arrays_appended_together = np.append(array1, array2)
```


Here is an example:

In [20]:
friends = np.array([4,5,6])
enemies = np.array([7,8,9])
everyone = np.append(friends,enemies)
print(everyone)

test_array = np.array([np.array([4,5,6]), np.array([7,8,9])])

print(test_array.shape)
test_array.flatten()
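
Two details of `np.append` are worth checking for yourself: it returns a new array rather than changing the original, and when no `axis` argument is given it flattens a multi-dimensional input before appending. The sketch below illustrates both; the names `more_friends`, `grid`, and `flat` are only for demonstration.

```python
import numpy as np

friends = np.array([4, 5, 6])
more_friends = np.append(friends, [7, 8])

print(friends)         # the original array is unchanged: 4, 5, 6
print(more_friends)    # the new array holds 4, 5, 6, 7, 8

grid = np.array([[4, 5, 6],
                 [7, 8, 9]])
flat = np.append(grid, [10])   # no axis given, so the 2-D array is flattened first
print(flat.shape)              # (7,)
```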

The last part of today covers some useful functions for working with arrays. Most are called as `np.command(nameofarray)`; the two exceptions (`len` and `flatten`) are noted in the table:


| Command       | Description                                                                 |
|----------------|-----------------------------------------------------------------------------|
| `sum`          | Sum all the elements in the array.                                          |
| `min`          | Return the minimum value in the array.                                      |
| `max`          | Return the maximum value in the array.                                      |
| `sort`         | Return a sorted copy of the array in ascending order.                       |
| `argsort`      | Return the indices that would sort an array. Useful for ranking values.     |
| `where`        | Return the indices where a condition is `True`. Often used for filtering.   |
| `len`          | Python built-in, called as `len(array)`. Return the number of elements along the first axis (like number of rows). |
| `delete`       | Return a copy of the array with the specified index or slice removed.       |
| `unique`       | Return the sorted unique values in an array.                                |
| `dot`          | Compute the dot product of two arrays (matrix multiplication for 2-D arrays). |
| `concatenate`  | Join two or more arrays along a specified axis.                             |
| `flatten`      | Array method, called as `array.flatten()`. Return a copy of a multi-dimensional array collapsed into one dimension. |

    

## example using `np.delete`

In [None]:

astronomers = ['Vera Rubin', 'Nancy Roman', 'Galileo Galilei', 'Carl Sagan', 'Edwin Hubble', 'Cecilia Payne-Gaposchkin']

new_array = np.delete(np.array(astronomers),0)
print(new_array)
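
Several of the other functions in the table above can be tried out the same way. The sketch below uses a made-up array called `scores`; note that `len` is Python's built-in function rather than a NumPy one.

```python
import numpy as np

scores = np.array([30, 10, 20, 10, 40])   # made-up values for illustration

total = np.sum(scores)             # 110
lowest = np.min(scores)            # 10
highest = np.max(scores)           # 40
ordered = np.sort(scores)          # sorted copy: 10, 10, 20, 30, 40
order = np.argsort(scores)         # indices such that scores[order] is sorted
distinct = np.unique(scores)       # each value once: 10, 20, 30, 40
joined = np.concatenate([scores, np.array([50, 60])])   # 7 elements in total
count = len(scores)                # built-in len: 5

print(total, lowest, highest, count)
print(ordered, scores[order], distinct, joined)
```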

## example using `np.where` with tabular data


In [21]:
## Let's initialize the pandas library



# load the file `national-pokedex.csv` stored in the data directory




In [None]:
# let's start off by looking at the columns and getting a description of the data



- Let's use `np.where()` to find the index of the [Pokémon with the highest HP](https://pokemondb.net/pokedex/blissey).
- The syntax is: `np.where(array == value)`
- Note: `np.where()` returns a tuple containing arrays of indices that match the condition.
- To extract the actual array of indices, use `[0]` after the function, like this: `np.where(...)[0]`.


In [None]:
# use np.where to find the index in the national dex of the pokemon with the highest HP


# use the index to find the pokemon within the national dex with the highest HP
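
The same pattern can be tried on a tiny example before applying it to the full table. The sketch below uses two made-up arrays, `names` and `hp`, as hypothetical stand-ins for two columns of the data; they are not values from `national-pokedex.csv`.

```python
import numpy as np

# Hypothetical stand-ins for a name column and an HP column
names = np.array(['Alpha', 'Bravo', 'Charlie', 'Delta'])
hp = np.array([45, 255, 60, 255])

result = np.where(hp == np.max(hp))   # a tuple holding one array of indices
indices = result[0]                   # the array itself: positions 1 and 3

print(indices)
print(names[indices])                 # the names at those positions
```

Because two entries share the highest value here, `np.where` returns both positions, which is why the result is an array of indices rather than a single number.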


Let’s combine `np.where` with `np.sort` to write a `for` loop that finds the **top five Pokémon with the highest HP**.

> 💡 `np.sort` returns a sorted copy of an array, in ascending order by default.  
You can use it with `np.argsort` if you want the indices of the sorted values instead.


## <p style="background-color: #f5df18; padding: 10px;"> 🛑 Gotta Sort 'Em All: Mastering NumPy to Rank Pokémon </p>

Use your NumPy skills to build a function that finds the top five Pokémon based on a selected stat (like HP, Attack, or Defense).

### What to do:
- Fill in the blanks (`___`) in the `find_top_five` function with the correct NumPy commands (`np.sort`, `np.unique`, `np.where`, etc.).
- Run a `for` loop that prints the **top five Pokémon** for each of the available stats listed in the `stats` list.


In [None]:
def find_top_five(stat, pokedex=national_dex):
    """This function prints the top five Pokémon for a given stat."""

    # Fill in the `___` with the appropriate NumPy commands
    top_five = ___(___(pokedex[stat]))[-5:][::-1]  # Use np.unique and np.sort

    i = 0
    rank = ['1st', '2nd', '3rd', '4th', '5th', '6th', '7th', '8th']

    for value in top_five:
        if i <= 5:
            idx = ___(pokedex[stat] == value)[0]  # Use np.where

            if len(idx) > 1:
                for loc in idx:
                    if i >= 5:
                        break
                    else:
                        print(pokedex['Pokemon'][loc], 'has the', rank[i], 'highest', stat.replace('_', ' ').lower())
                        i += 1
            else:
                if i >= 5:
                    break
                else:
                    print(pokedex['Pokemon'][idx[0]], 'has the', rank[i], 'highest', stat.replace('_', ' ').lower())
                    i += 1
        else:
            return


In [22]:
# List of available stats to explore
stats = ['HP', 'Attack', 'Defense', 'Special_Attack', 'Special_Defense']

# Write a loop to print the top five Pokémon for each stat
for stat in stats:
    print(f'Top 5 Pokémon with highest {stat} stat!')
    find_top_five( # complete this line
    print('*' * 10)

## **Array Attributes**

The functions above are called as `np.command(array)`. Arrays also carry attributes, which describe the array itself. Attributes need neither the `np.` prefix nor parentheses, and are accessed as `array_name.attribute`:

```
ndim - the number of dimensions (axes) of the array
shape - the shape of the array (like 2x2, 4x5)
size - the total number of elements in the array
dtype - the data type of the array's elements
```
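
A short sketch of these attributes on a small 2-D array follows (the array `table` is made up for illustration). Notice that there are no parentheses after the attribute names.

```python
import numpy as np

table = np.array([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0]])

print(table.ndim)     # 2 (rows and columns)
print(table.shape)    # (2, 3): 2 rows, 3 columns
print(table.size)     # 6 elements in total
print(table.dtype)    # float64
```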



If you'd like to know more about any of these commands or other functions not listed here (there are LOTS of numpy commands!) check out the documentation [here](https://numpy.org/doc/stable/index.html).

# <p style="background-color: #f5df18; padding: 10px;"> 🗝️ Key points</p>

---

- NumPy supports multidimensional arrays, allowing you to work with data of higher dimensions efficiently.
- NumPy provides functions for array manipulation, such as reshaping, transposing, concatenating, and splitting arrays.
- NumPy arrays seamlessly integrate with other scientific libraries in Python, such as SciPy, pandas, and matplotlib, making it a fundamental building block for scientific computing in Python.
- NumPy provides a wide range of mathematical functions that operate element-wise on arrays, known as universal functions or ufuncs.
