# NumPy, Matrices, and Datafiles

*ENGR 1330 | Computational Thinking with Data Science | Texas Tech University*

**Developed By:** Derek Johnston @ Texas Tech University

In [2]:
import numpy as np
import csv

## Exercise 1 | Creating a NumPy Array

The core datatype of NumPy is the NumPy array. NumPy arrays work much like basic Python lists but add functionality for vector math and common statistical calculations. The easiest way to create a NumPy array is to first create a list and then pass it as the argument to the *np.array()* function. In this exercise, you'll create some NumPy arrays and then perform some simple operations on them.
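
To make the difference between a list and an array concrete, here is a minimal sketch. The names *values*, *arr*, *doubled_list*, and *doubled_arr* are made up for illustration and are not part of the exercise.

```python
import numpy as np

values = [1, 2, 3, 4]        # a plain Python list (example data)
arr = np.array(values)       # convert the list into a NumPy array

doubled_list = values * 2    # on a list, * repeats the elements
doubled_arr = arr * 2        # on an array, * multiplies element by element

print(doubled_list)          # [1, 2, 3, 4, 1, 2, 3, 4]
print(doubled_arr)           # [2 4 6 8]
print(arr.mean())            # 2.5
```

The same operator behaves differently on the two types, which is why the exercise asks you to convert your list before doing any math on it.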

### Resources

- [Creating a NumPy Array](https://www.w3schools.com/PYTHON/numpy/numpy_creating_arrays.asp)
- [NumPy Array Shape](https://www.w3schools.com/PYTHON/numpy/numpy_array_shape.asp)
- [Sorting a NumPy Array](https://www.w3schools.com/PYTHON/numpy/numpy_array_sort.asp)

### Instructions

1. Create a 1-dimensional Array, named *my_arr* containing the elements **[9, 8, 7, 6, 5, 4, 3, 2, 1]**.
2. Print the shape of *my_arr*.
3. Create a 2-dimensional Array, named *my_mat*, containing three rows, each of which is a copy of *my_arr*.
4. Print the shape of *my_mat*.
5. Create a new Array named *my_prod* and use it to store the result of the operation **my_arr * my_arr**.
6. Print out *my_prod*.
7. Use the **np.sort** function to sort *my_prod* and store the result in a new Array named *my_prod_sorted*.
8. Print out *my_prod_sorted*.

In [12]:
# Exercise 1 Code Goes Here
my_arr = np.array([9, 8, 7, 6, 5, 4, 3, 2, 1])
print(my_arr.shape)

my_mat = np.array([my_arr, my_arr, my_arr])
print(my_mat.shape)

my_prod = my_arr * my_arr
print(my_prod)

my_prod_sorted = np.sort(my_prod)
print(my_prod_sorted)

(9,)
(3, 9)
[81 64 49 36 25 16  9  4  1]
[ 1  4  9 16 25 36 49 64 81]
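
The sorting step works because **np.sort** returns a new, sorted array and leaves its input unchanged. A small sketch of that behavior, using a made-up array named *scores* (not part of the exercise), and contrasting it with the array's own **sort** method, which sorts in place:

```python
import numpy as np

scores = np.array([30, 10, 20])   # example data, not from the exercise

sorted_copy = np.sort(scores)     # returns a sorted copy
print(scores)                     # [30 10 20]  (original is unchanged)
print(sorted_copy)                # [10 20 30]

scores.sort()                     # the method form sorts the array in place
print(scores)                     # [10 20 30]
```

Use **np.sort** when you still need the original order, as in this exercise where *my_prod* and *my_prod_sorted* are both printed.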


## Exercise 2 | Working with Matrices

NumPy really shines when it's time to work with vectors and matrices. A lot of complex operations work right out of the box and are very easy to use. In this exercise, we'll look at how to manipulate the shape of matrices, access their rows and columns, and perform some matrix math operations. 

### Resources

- [Reshaping an Array](https://www.w3schools.com/PYTHON/numpy/numpy_array_reshape.asp)
- [Array Indexing](https://www.w3schools.com/PYTHON/numpy/numpy_array_indexing.asp)
- [NumPy Dot Product](https://pythonexamples.org/python-numpy-dot-product/)

### Instructions

- In Exercise 1, you created an array called *my_arr*. Use its **reshape** method to create a 3x3 matrix named *arr_reshaped*.
- Print out *arr_reshaped* to see the result.
- Use indexing to print out the last row of the matrix and print out the element at the center of the matrix.
- Create two 1-dimensional arrays named *X* and *Y* with the elements *[3, 9, 5, 4]* and *[1, 8, 7, 6]*, respectively.
- Compute the dot product of *X* and *Y* and print the result.


In [15]:
# Exercise 2 Code Goes Here

arr_reshaped = my_arr.reshape(3, 3)
print(arr_reshaped)
print(arr_reshaped[2, :])
print(arr_reshaped[1, 1])

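# The instructions ask for 1-dimensional arrays, so wrap each list in np.array.
X = np.array([3, 9, 5, 4])
Y = np.array([1, 8, 7, 6])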

dot_prod = np.dot(X, Y)
print(dot_prod)

[[9 8 7]
 [6 5 4]
 [3 2 1]]
[3 2 1]
5
134
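
The reshaping, indexing, and dot product steps above can also be tried on a smaller example. This is only a sketch: the names *grid*, *u*, and *v* and their values are made up for illustration.

```python
import numpy as np

# Build the values 1..6 and arrange them as 2 rows by 3 columns.
grid = np.arange(1, 7).reshape(2, 3)
print(grid)
# [[1 2 3]
#  [4 5 6]]

print(grid[-1, :])    # last row: [4 5 6]
print(grid[:, 1])     # middle column: [2 5]
print(grid[0, 2])     # single element at row 0, column 2: 3

u = np.array([1, 2, 3])
v = np.array([4, 5, 6])

# The dot product multiplies matching elements and sums the results:
# 1*4 + 2*5 + 3*6 = 32
dot_uv = np.dot(u, v)
print(dot_uv)         # 32
```

Negative indices count from the end, so *grid[-1, :]* selects the last row without needing to know how many rows the matrix has.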


## Exercise 3 | Importing a Datafile

Eventually, we'll want to put our data science skills to work on data from the real world. To do this, we first need to get that data into our Python scripts. It's also a good idea to save our data once we're finished working with it. In this exercise, you'll practice reading in a *comma-separated values (CSV)* file containing some data. You'll also look at how to copy data into a CSV file of your own.

### Resources
- [Working with CSV Files in Python](https://realpython.com/python-csv/)


### Instructions

1. Use a **with** statement to open the *airtravel.csv* file for **reading** as *file*.
2. Create a csv reader object for *file* with a *,* delimiter.
3. Use a **for** loop to print each row of the CSV file.
4. Before the loop, create an empty List called *data*. Inside the same **for** loop that you created in step 3, append each row of the CSV file to *data*.
5. Print the row of *data* that contains the entries for October.
6. Use a **with open** statement to create a new file called *airtravel_copy.csv* for writing. **Hint:** Use the argument *newline=''* to keep your output file clean.
7. Create a *csv.writer* object with the arguments *file, quotechar="'", quoting=csv.QUOTE_MINIMAL*.
8. Using a **for** loop, write each row in your *data* list to *airtravel_copy.csv*.
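
The write-then-read pattern in these steps can be sketched on a tiny, made-up dataset. Everything named here (*rows*, *sample.csv*, *read_back*) is hypothetical, and the file is written to a temporary folder so that the sketch does not touch your working directory.

```python
import csv
import os
import tempfile

# Hypothetical sample data standing in for a real CSV file.
rows = [["Month", "Passengers"], ["JAN", "340"], ["FEB", "318"]]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "sample.csv")

    # Write each row; newline='' prevents extra blank lines on some systems.
    with open(path, "w", newline="") as file:
        writer = csv.writer(file, quotechar="'", quoting=csv.QUOTE_MINIMAL)
        for row in rows:
            writer.writerow(row)

    # Read the file back, collecting every row into a list.
    with open(path, "r", newline="") as file:
        reader = csv.reader(file, delimiter=",")
        read_back = []
        for row in reader:
            read_back.append(row)

print(read_back)
# [['Month', 'Passengers'], ['JAN', '340'], ['FEB', '318']]
```

Note that every value comes back as a string, even the ones that look like numbers.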

In [28]:
# Exercise 3 Code Goes Here
with open('airtravel.csv', 'r') as file:
    csv_reader = csv.reader(file, delimiter=',')
    data = []
    for row in csv_reader:
        print(row)
        data.append(row)
    print(data[10])

with open('airtravel_copy.csv', 'w', newline='') as file:
    csv_writer = csv.writer(file, quotechar="'", quoting=csv.QUOTE_MINIMAL)
    for d in data:
        csv_writer.writerow(d)

['Month', ' "1958"', ' "1959"', ' "1960"']
['JAN', '  340', '  360', '  417']
['FEB', '  318', '  342', '  391']
['MAR', '  362', '  406', '  419']
['APR', '  348', '  396', '  461']
['MAY', '  363', '  420', '  472']
['JUN', '  435', '  472', '  535']
['JUL', '  491', '  548', '  622']
['AUG', '  505', '  559', '  606']
['SEP', '  404', '  463', '  508']
['OCT', '  359', '  407', '  461']
['NOV', '  310', '  362', '  390']
['DEC', '  337', '  405', '  432']
[]
['OCT', '  359', '  407', '  461']


## Challenge | Putting It All Together

The *airtravel.csv* file from Exercise 3 contains the average number of passengers each month for the years 1958, 1959, and 1960. In this challenge, you will use all the skills that you've learned in the previous exercises.

### Resources
- [Calculating the Mean using NumPy](https://www.geeksforgeeks.org/numpy-mean-in-python/)

### Instructions

1. Read the data contained in *airtravel.csv* into a 2-dimensional list.
2. Create a 2x3 matrix in which the first row contains the headers *["1958", "1959", "1960"]* and the second row contains the **mean** number of passengers flown during that year.
3. Save your averages matrix in a CSV file named *averages.csv*.

### Hints

- The data that comes out of the CSV file will all be of the *string* datatype. You'll need to come up with a way to convert the numbers to a *numeric* datatype such as *int64* before you can calculate the mean.
- When you read the CSV file into your script, you'll get an empty row at the end of the matrix. Be sure to remove this row from the matrix.
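
Both hints can be sketched on a small, made-up table. The names *raw_rows*, *rows*, *numbers*, and *column_means* are hypothetical, and converting each cell with **int** is only one possible approach; the built-in **int** function ignores leading and trailing spaces in the string it converts.

```python
import numpy as np

# Hypothetical rows as a csv reader might return them:
# string values with leading spaces and an empty row at the end.
raw_rows = [
    ["Month", "A", "B"],
    ["JAN", "  10", "  20"],
    ["FEB", "  30", "  40"],
    [],
]

# Keep only non-empty rows (an empty list is treated as False).
rows = [row for row in raw_rows if row]

# Skip the header row and the month column, converting each cell to an integer.
numbers = np.array(
    [[int(cell) for cell in row[1:]] for row in rows[1:]],
    dtype=np.int64,
)

# axis=0 averages down each column, giving one mean per column.
column_means = numbers.mean(axis=0)
print(column_means)   # [20. 30.]
```

Filtering out empty rows works no matter where they appear, while deleting the last row assumes the empty row is always at the end of the file.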

In [84]:
# Challenge Code Goes Here
data = []
with open("airtravel.csv", 'r') as file:
    csv_reader = csv.reader(file)
    for row in csv_reader:
        data.append(row)

del data[-1]
data = np.array(data)

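# Drop the header row and the month column, then convert the strings to integers.
passengers = data[1:, 1:].astype(np.int64)

# passengers no longer contains the header row, so average over every row (JAN through DEC).
averages = [
    ["1958", "1959", "1960"],
    [
        round(np.mean(passengers[:, 0])),
        round(np.mean(passengers[:, 1])),
        round(np.mean(passengers[:, 2]))
    ]
]
print(averages)

with open('averages.csv', 'w', newline='') as file:
    csv_writer = csv.writer(file)
    csv_writer.writerows(averages)

[['1958', '1959', '1960'], [381, 428, 476]]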
