# CS 345 Exercise 01:  working with CSV files

**Instructions:** Complete the exercises in this notebook and submit it via Canvas.


**CSV** (comma-separated values) is one of the basic formats for distributing data, and the main one we will use this semester.  CSV files represent data in the form of a two-dimensional array, i.e. a matrix.  For example:

$$
\begin{pmatrix}
12 & 13 & 1\\
3 & 5 & 2
\end{pmatrix}
$$

Let's create a CSV file that contains this matrix:

In [22]:
data = """12,13,1
3,5,2
"""
# the with statement closes the file automatically
with open("data.csv", "w") as file_handle:
    file_handle.write(data)

The following will print the contents of the file to see that we have indeed created it:

In [24]:
# plain Python works on every platform, unlike shell-based magics
with open("data.csv") as file_handle:
    print(file_handle.read())

12,13,1
3,5,2


## Exercises


### Reading a CSV file

Write a function called `csv_read(file_name)` that reads the data stored in the given file and returns a matrix as a list of lists.  For the file created above, the call

```python
matrix = csv_read("data.csv")
```

should give you the matrix

```python
[[12.0, 13.0, 1.0], [3.0, 5.0, 2.0]]
```

and

```python
>>> matrix[0]
[12.0, 13.0, 1.0]
```

```python
>>> matrix[1][2]
2.0
```

In [48]:
# fill in the implementation of the following function:

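def csv_read(file_name):
    matrix = []
    # the with statement closes the file even if parsing fails
    with open(file_name) as handle:
        for line in handle:
            # split on commas; float() ignores the trailing newline
            row = [float(value) for value in line.split(',')]
            matrix.append(row)

    return matrix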

Let's verify that your code works correctly:

In [50]:
# the following code cell should return True if your code reads the
# provided test case correctly

matrix = [[12.0, 13.0, 1.0], [3.0, 5.0, 2.0]]
result = csv_read("data.csv")
result == matrix

True

Some pointers to get you started:


First, here's the Pythonic way of reading a file:

```Python
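def read_file(file_name):  # hypothetical wrapper so that `return` is valid
    try:
        # file_name is the name of the file
        file_handle = open(file_name)
    except OSError:
        # the file is missing or cannot be opened
        return -1
    with file_handle:
        for line in file_handle:
            # process each line
            pass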
```

The `try`/`except` block handles the case where the file name does not correspond to a file that can be opened, for example because the file does not exist.

For processing each line, we recommend using a string's [split](https://docs.python.org/3.7/library/stdtypes.html?highlight=split#str.split) method.
To convert the resulting strings to floating-point numbers, use the `float` function.
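
As a small illustration of these two pointers, the sketch below splits a single line and converts its fields. The variable names (`line`, `fields`, `row`) are chosen here for illustration only.

```python
line = "12,13,1\n"           # one line as read from the file, newline included

fields = line.split(',')
print(fields)                # ['12', '13', '1\n']

# float() ignores surrounding whitespace, so the trailing newline is harmless
row = [float(field) for field in fields]
print(row)                   # [12.0, 13.0, 1.0]
```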

### Operations on matrices

As a second exercise, write two functions that return the sum of the elements in the rows/columns of the matrix:

In [52]:
def sum_columns(matrix):
    totals = []

    for col in range(len(matrix[0])):
        result = 0
        for row in range(len(matrix)):
            result += matrix[row][col]

        totals.append(result)

    return totals

def sum_rows(matrix):
    totals = []

    for row in range(len(matrix)):
        result = 0
        for col in range(len(matrix[0])):
            result += matrix[row][col]

        totals.append(result)

    return totals

In [6]:
# code for verifying your implementation
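
One possible way to verify the two functions is to compare their results against totals computed by hand for the matrix from the first exercise. The sketch below is self-contained, so it redefines `sum_rows` and `sum_columns` in a compact form using `sum` and `zip`; these compact definitions are an assumption made for illustration, not the required implementation.

```python
# Compact stand-ins (assumption): same results as the loop versions for a
# non-empty rectangular matrix.
def sum_rows(matrix):
    return [sum(row) for row in matrix]

def sum_columns(matrix):
    # zip(*matrix) groups the i-th element of every row, i.e. the columns
    return [sum(column) for column in zip(*matrix)]

matrix = [[12.0, 13.0, 1.0], [3.0, 5.0, 2.0]]

# expected totals computed by hand from the matrix above
print(sum_rows(matrix) == [26.0, 10.0])
print(sum_columns(matrix) == [15.0, 18.0, 3.0])
```

Both lines should print `True` if the functions behave as described.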

### CSV files in practice

CSV files are so common that the Python standard library includes a module called `csv`.  Details are in the [Python documentation](https://docs.python.org/3/library/csv.html).
Reading CSV files is such a common task that there are other options in [NumPy](https://docs.scipy.org/doc/numpy/reference/generated/numpy.genfromtxt.html) and [pandas](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html).  Do not use these in your implementation!
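
For comparison only (the exercise asks for your own implementation), here is a minimal sketch of reading the same data with the standard library's `csv` module. It uses `io.StringIO` in place of a file on disk so that it runs on its own; that substitution is an assumption of this example.

```python
import csv
import io

# io.StringIO stands in for an open file so the example needs no file on disk
text = "12,13,1\n3,5,2\n"
reader = csv.reader(io.StringIO(text))

# csv.reader yields each row as a list of strings; convert them to floats
matrix = [[float(value) for value in row] for row in reader]
print(matrix)  # [[12.0, 13.0, 1.0], [3.0, 5.0, 2.0]]
```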

### Slicing Python lists

**Slices** allow you to create sublists of existing lists.  

The syntax for a slice is as follows:

```Python
sequence[start:stop[:step]]
```

```
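start (optional): Index of the first element in the slice. Defaults to 0.
stop (optional): Index at which the slice ends; the element at this index is not included. Defaults to the length of the list.
step (optional): The step/stride value of the slice. Defaults to 1. With a negative step, start and stop default to the end and the beginning of the list instead.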
```

For example:

In [56]:
values = [1,2,3,4,5,6,7,8]
values[1:5]

[2, 3, 4, 5]

Next, try out the following commands:

```python
values[1:3]  
values[2:-1] 
values[:2]   
values[2:]   
values[::2] # the last value is the step/stride
```

In [95]:
values[1:3]  
values[2:-1] 
values[:2]   
values[2:]   
values[::2]

[1, 3, 5, 7]
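
A notebook cell displays only the value of its last expression, which is why the cell above shows a single list. To see every slice, one option is to wrap each expression in `print`, as in this small sketch:

```python
values = [1, 2, 3, 4, 5, 6, 7, 8]

print(values[1:3])   # [2, 3]
print(values[2:-1])  # [3, 4, 5, 6, 7]
print(values[:2])    # [1, 2]
print(values[2:])    # [3, 4, 5, 6, 7, 8]
print(values[::2])   # [1, 3, 5, 7]
```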

Based on your experiments, answer the following:

* What happens if you omit the start/end index?
* What is the effect of using negative indices for the start or end index?
* What is the effect of using a negative step/stride?

- If the start index is omitted, the slice begins at the start of the list and runs up to (but not including) the end index. If the end index is omitted, the slice begins at the start index and runs through the end of the list.
- A negative index counts from the end of the list, so `-1` refers to the last element.
- A negative step traverses the list in reverse order, taking every `|step|`-th element.
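
The answers above can be checked with a few extra slices. This is a small sketch using the same `values` list; the particular indices are chosen here for illustration.

```python
values = [1, 2, 3, 4, 5, 6, 7, 8]

print(values[-3:])     # [6, 7, 8]           last three elements
print(values[:-2])     # [1, 2, 3, 4, 5, 6]  everything except the last two
print(values[::-2])    # [8, 6, 4, 2]        every second element, reversed
print(values[6:1:-2])  # [7, 5, 3]           from index 6 down to, not including, index 1
```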

* Write code that reverses a list using a slice (hint:  negative strides).

In [97]:
values[::-1]

[8, 7, 6, 5, 4, 3, 2, 1]

Are slices limited to lists?  Can you apply them to strings as well?

- No, slices are not limited to lists. Strings are sequences, so they can be indexed and sliced in the same way.
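
A short sketch of slicing applied to other sequence types; the example string and tuple are made up for illustration.

```python
word = "comma,separated"

print(word[:5])    # comma
print(word[6:])    # separated
print(word[::-1])  # detarapes,ammoc

# tuples are sequences too, and a slice of a tuple is a tuple
row = (12.0, 13.0, 1.0)
print(row[1:])     # (13.0, 1.0)
```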

### List comprehension

List comprehension is a very convenient Python syntax for creating lists.  Here are a couple of quick exercises to help you familiarize (or re-familiarize) yourself with this useful piece of Python syntax.
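
As a quick reminder of the syntax, the sketch below builds the same list twice, once with an explicit loop and once with a list comprehension. The example (squares of small integers) is chosen here for illustration.

```python
# loop version
squares_loop = []
for x in range(5):
    squares_loop.append(x * x)

# equivalent list comprehension
squares = [x * x for x in range(5)]

print(squares)                  # [0, 1, 4, 9, 16]
print(squares == squares_loop)  # True
```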

* In the next cell, write code that produces the first 10 positive integers that are multiples of 3.  Use a list comprehension for this task.  Make sure to print your list or have it be the output of the cell.

In [103]:
mults_of_3 = [x*3 for x in range(1,11)]
print(mults_of_3)

[3, 6, 9, 12, 15, 18, 21, 24, 27, 30]


### More list comprehension

* You are given a list of integers.  Using a list comprehension, create a sublist of the original list that contains only the even numbers from the list.

For example, given the list
```Python
values = [2,8,11,3,6,2]
```
The result should be the list with the values
```Python
[2,8,6,2]
```

In [111]:
values = [2,8,11,3,6,2]
evens = [x for x in values if x % 2 == 0]
print(evens)

[2, 8, 6, 2]
