***
# Reading Data Files
- Comma-Separated Value (CSV) files - a generic text format for tabular data
 - (https://en.wikipedia.org/wiki/Comma-separated_values)


- Use the csv standard library - https://docs.python.org/3/library/csv.html

---

The data within a CSV file can be separated by characters other than a comma:

comma: 1.0 , 2.0 , 3.0

semicolon: 1.0 ; 2.0 ; 3.0

includes quotes (for groupings): 1 , 2 , "a, b, c"
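
As a minimal sketch of how the `csv` library handles these three cases, the cell below parses short in-memory samples instead of real files. The sample strings and variable names are hypothetical, and the samples are written without spaces around the separators so that the quoted group is recognized without extra options.

```python
import csv
import io

# Hypothetical in-memory samples standing in for real files
comma_text = '1.0,2.0,3.0\n'
semicolon_text = '1.0;2.0;3.0\n'
quoted_text = '1,2,"a, b, c"\n'

# The delimiter argument tells the reader which character separates the fields
comma_rows = list(csv.reader(io.StringIO(comma_text), delimiter=','))
semicolon_rows = list(csv.reader(io.StringIO(semicolon_text), delimiter=';'))

# Quotes keep "a, b, c" together as a single field
quoted_rows = list(csv.reader(io.StringIO(quoted_text), delimiter=','))

print(comma_rows)      # [['1.0', '2.0', '3.0']]
print(semicolon_rows)  # [['1.0', '2.0', '3.0']]
print(quoted_rows)     # [['1', '2', 'a, b, c']]
```

Note that every field comes back as a string, regardless of the delimiter.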

---

We will use the following file for reading in some three-dimensional data.

**data_3d.csv** contains the following information:
- A header in the first line
- 28 data rows
- 3 data columns (Time, Exp and Theory)

In [None]:
import csv

In [None]:
## The CSV data file can be found at
## https://github.com/karlkirschner/2020_Scientific_Programming/blob/master/data_3d.csv

## For Colabs

## In order to upload data
#from google.colab import files
#uploaded = files.upload()

## In order to show the plots
#%matplotlib inline

Example of a true comma-separated value file.

In [None]:
## A Jupyter and Linux bash trick for looking into a file
!head data_3d.csv --lines=10

Example of a semicolon separated value file.

In [None]:
!head data_3d_semi.csv --lines=10

For the cells that follow - we will use the data_3d.csv file.

In [None]:
time = []
exp = []
sim = []

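## newline='' is what the csv documentation recommends when opening files
with open('data_3d.csv', newline='') as file:
    read_it = csv.reader(file, delimiter=',')

    next(read_it)  ## skips header (i.e. the first row)

    for row in read_it:
        ## each field is read as a string, so convert it to a float
        time.append(float(row[0]))
        exp.append(float(row[1]))
        sim.append(float(row[2]))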

Now we have created three lists that contain floats, giving us data that we can work with.
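
An alternative from the same library is `csv.DictReader`, which uses the header row as keys so that columns are addressed by name rather than by position. The sketch below is only illustrative: it reads a small in-memory sample, and the header spellings (`Time`, `Exp`, `Theory`), the values, and the `_demo` variable names are assumptions rather than the actual contents of data_3d.csv.

```python
import csv
import io

# Hypothetical stand-in for data_3d.csv (header names and values are assumed)
sample_text = 'Time,Exp,Theory\n0.0,1.0,1.5\n1.0,2.0,2.5\n'

time_demo = []
exp_demo = []
theory_demo = []

# DictReader consumes the header itself, so no next() call is needed
reader = csv.DictReader(io.StringIO(sample_text))

for row in reader:
    time_demo.append(float(row['Time']))
    exp_demo.append(float(row['Exp']))
    theory_demo.append(float(row['Theory']))

print(time_demo)    # [0.0, 1.0]
print(exp_demo)     # [1.0, 2.0]
print(theory_demo)  # [1.5, 2.5]
```

Addressing columns by name keeps the code working if the column order in the file changes, at the cost of depending on the exact header text.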

---
#### Slicing
A quick note about getting a slice of data from a list.

Let's say that **a** is a list, then we can get different ranges via:

a[:]          ## a (shallow) copy of the whole list; plain a is the list itself, not a copy

a[start:stop] ## items start to stop-1

a[start:]     ## items start to end of list

a[:stop]      ## items start of list to stop-1

a[item]       ## a single item
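
Before applying these to the time list, here is a small sketch on a made-up list (the values are hypothetical) that also shows the difference between copying with `a[:]` and simply assigning another name to the same list.

```python
a = [10, 20, 30, 40, 50]  # hypothetical example list

print(a[1:3])  # [20, 30]      items at index 1 and 2 (stop is excluded)
print(a[2:])   # [30, 40, 50]  index 2 to the end of the list
print(a[:2])   # [10, 20]      start of the list to index 1
print(a[3])    # 40            a single item

b = a[:]       # a new list holding the same items
b[0] = -1
print(a[0])    # 10  (changing the copy leaves a untouched)

c = a          # a second name for the same list
c[0] = 99
print(a[0])    # 99  (changing c also changes a)
```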

In [None]:
print(time)

In [None]:
print(time[:])

In [None]:
print(time[5:15])

In [None]:
print(time[5:])

In [None]:
print(time[:15])

In [None]:
print(time[1])

---
Okay, back to our original focus.

Let's verify that our items are indeed numbers (we will come back to this in a bit).

In [None]:
type(time[0])

We can treat the items within the time list as floats.

In [None]:
time[1] - time[0]

---
### Okay, so let's demo how we could use this data.

(Again, we take a peek into our future and do some plotting.)

In [None]:
import matplotlib.pyplot as plt

In [None]:
## For Colabs
## In order to show the plots

#%matplotlib inline

In [None]:
## Matplotlib 3.6 renamed the seaborn styles; on older versions use 'seaborn-whitegrid'
plt.style.use('seaborn-v0_8-whitegrid')
plt.figure(figsize=(15,5))

plt.plot(time, exp, linewidth=5, linestyle='solid', label='Experimental')
plt.plot(time, sim, linewidth=5, linestyle='dashed', label='Simulated')

plt.xlabel('Time (seconds)')
plt.ylabel('Y-Axis (Unit)')
plt.title('Experimental and Simulated Results')
plt.grid(True)
plt.legend()  ## displays the label given to each line

plt.show()

Notice that we have gaps in the tick marks for the x-axis.

How can we easily change this for plotting?
- Read the x-axis data as: **strings** (versus a float as above) and plot again.

    ```time.append(str(row[0]))```

In [None]:
time=[]
exp=[]
sim=[]

with open('data_3d.csv', newline='') as file:
    read_it = csv.reader(file, delimiter=',')

    next(read_it)  ## skips header (i.e. the first row)

    for row in read_it:
        time.append(str(row[0]))  ## keep the time values as strings
        exp.append(float(row[1]))
        sim.append(float(row[2]))

In [None]:
type(time[0])

In [None]:
## Matplotlib 3.6 renamed the seaborn styles; on older versions use 'seaborn-whitegrid'
plt.style.use('seaborn-v0_8-whitegrid')
plt.figure(figsize=(15,5))

plt.plot(time, exp, linewidth=5, linestyle='solid', label='Experimental')
plt.plot(time, sim, linewidth=5, linestyle='dashed', label='Simulated')

plt.xlabel('Time (seconds)')
plt.ylabel('Y-Axis (Unit)')
plt.title('Experimental and Simulated Results')
plt.grid(True)
plt.legend()  ## displays the label given to each line
plt.show()

**Notice**: the x-axis ticks have changed - every data point is now labeled.

**Also notice**: we can no longer treat them as numbers.

In [None]:
## Raises a TypeError: strings do not support subtraction
time[1] - time[0]

In [None]:
float(time[1]) - float(time[0])
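
If the numeric values are needed again for more than one item, one option is to convert the whole list at once with a list comprehension while keeping the strings for the axis labels. This is a minimal sketch; the list contents and the names `time_labels` and `time_numbers` are hypothetical.

```python
# Hypothetical string values, as they would be after reading with str()
time_labels = ['0.0', '0.5', '1.0']

# Build a separate list of floats; the original strings stay available for plotting
time_numbers = [float(label) for label in time_labels]

step = time_numbers[1] - time_numbers[0]

print(time_numbers)  # [0.0, 0.5, 1.0]
print(step)          # 0.5
```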