# Loading a csv into NumPy -- MCEN 1030 -- 12 Nov
Today we will...
- Reminisce about matrix math from last class
- Learn how to import *.csv files into NumPy, and export them too.
- In-class problem: examining a data set for exponential growth

## Last class: matrix math

Let's warm up by reminding ourselves of some matrix math. Remember that  

    np.array([[1,2,3]]) # makes a row vector
    np.array([[4,5,6],[7,8,9]]) # makes a matrix (2 rows, 3 columns)
    np.array([[11],[12],[13]]) # makes a column vector

and to do some matrix multiplication, $Ax = ?$:
    
    np.dot(A,x)
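
As a quick check of how the shapes work out, here is a minimal sketch that multiplies the matrix and the column vector from the lines above. The names `A`, `x`, and `v` are just illustrative choices.

```python
import numpy as np

# Example values reused from the arrays above, only to show the shapes involved
A = np.array([[4, 5, 6], [7, 8, 9]])   # 2 rows, 3 columns
x = np.array([[11], [12], [13]])       # 3 rows, 1 column

v = np.dot(A, x)   # matrix product; A @ x gives the same result
print(v.shape)     # (rows, columns) of the result
print(v)
```

The inner dimensions must match (3 columns in `A`, 3 rows in `x`), and the result takes its row count from `A` and its column count from `x`.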

Below, let's code the following:
- create a set of x values from 0 to 1 (50 should do)
- then create some y values based on the equation $y=x^2$
- then we will rotate the points via the matrix transformation (recall HW4):
$$\begin{bmatrix}\cos\theta & -\sin\theta\\ \sin\theta & \cos\theta\end{bmatrix}\begin{bmatrix}x\\ y \end{bmatrix}=\begin{bmatrix}x_\text{new}\\ y_\text{new}\end{bmatrix}$$
- plot both data sets

Note: because the output of np.dot(A,x) will be a 2-row, 1-column vector, we will need to use v[0,0] when accessing the information in the output. I'll leave it to you to troubleshoot whether the other component is v[1,0] or v[0,1]. Think about it, then check with your neighbor.

In [None]:
import numpy as np
import matplotlib.pyplot as plt
pi=np.pi
theta=pi/4 # 45 degrees

# code here


## Importing comma-separated value files into NumPy

Here is the command that does the job:

    data = np.loadtxt("some_data.csv", delimiter=",")

This creates a numpy array called "data" which we can slice up however we like. E.g.
    
    col0=data[:,0]

gets you the zeroth (first) column.
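
Here is a minimal, self-contained sketch of the load-then-slice pattern. It writes a tiny stand-in file first so that it runs anywhere; the temporary path and the numbers are made up for illustration and are not the class data.

```python
import os
import tempfile
import numpy as np

# Write a small stand-in csv (made-up numbers) to a temporary folder
path = os.path.join(tempfile.mkdtemp(), "some_data.csv")
with open(path, "w") as f:
    f.write("0.0,1.0\n0.5,1.25\n1.0,2.0\n")

data = np.loadtxt(path, delimiter=",")
print(data.shape)   # (number of rows, number of columns)

col0 = data[:, 0]   # every row, column 0
col1 = data[:, 1]   # every row, column 1
print(col0)
print(col1)
```

If a file starts with a header row of text, `np.loadtxt` accepts `skiprows=1` to skip it.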

Let's dive straight in and do a problem. Put 11-12_data1.csv in the folder you are working in, then write code that...
- imports the csv into a 2D array
- splits that 2D array into two 1D arrays: x for the zeroth column and y for the other column
- plots, to see what that initial data looks like
- then, in a second cell, performs the following matrix manipulation, with $m=4.1$
$$\begin{bmatrix}1 & m\\ 0 & 1\end{bmatrix}\begin{bmatrix}x\\ y \end{bmatrix}=\begin{bmatrix}x_\text{new}\\ y_\text{new}\end{bmatrix}$$
- plots the resulting collection of points

In [1]:
# code here

In [None]:
# process the data and make a plot of the processed data

## Saving a csv

If you'd like to save data as a csv, you might find the command 

    M=np.c_[col0,col1]
    
useful: it stacks the 1D arrays side by side as the columns of a 2D array ("concatenate column 0 and column 1"). Then:

    np.savetxt("processed_data.csv", M, delimiter=",")
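
To see the two commands working together, here is a small round-trip sketch: build the 2D array, save it, and load it back to confirm nothing changed. The array values and the temporary path are made up for illustration.

```python
import os
import tempfile
import numpy as np

# Made-up columns standing in for processed data
col0 = np.array([0.0, 0.5, 1.0])
col1 = np.array([1.0, 1.25, 2.0])

M = np.c_[col0, col1]   # stack the 1D arrays as columns
print(M.shape)          # one row per data point, one column per array

# Save to a temporary folder so the example does not clutter your working folder
path = os.path.join(tempfile.mkdtemp(), "processed_data.csv")
np.savetxt(path, M, delimiter=",")

# Load it back and compare with the original
check = np.loadtxt(path, delimiter=",")
print(np.allclose(check, M))
```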

## In-class problem
"Moore's Law", proposed in 1965, predicts that computing power will double every two years. More precisely: the number of transistors on a chip will double every two years... so if there were 5,000 in a chip from 1970, it would be 10,000 by 1972, then 20,000 by 1974, then 40,000 by 1976. The equation that describes this is
$$y = y_0\cdot 2^{(n-n_0)/2}$$
where $y$ is the number of transistors in year $n$, with $y_0$ the value at some reference year, $n_0$.
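
As a sanity check on the equation, the sketch below reproduces the doubling sequence from the paragraph above (5,000 in 1970, then 10,000, 20,000, 40,000). The function name `moore` is a hypothetical helper chosen for this example.

```python
import numpy as np

def moore(n, y0, n0):
    """Transistor count in year n, doubling every two years from (n0, y0)."""
    return y0 * 2.0 ** ((n - n0) / 2)

years = np.array([1970, 1972, 1974, 1976])
print(moore(years, 5000, 1970))   # 5000, 10000, 20000, 40000
```

Because `n` can be a NumPy array, the same expression evaluates a whole range of years at once.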

Put 11-12_moore_data.csv in your working folder. It includes data (taken from https://en.wikipedia.org/wiki/Moore%27s_law via https://web.eecs.utk.edu/~dcostine/personal/PowerDeviceLib/DigiTest/index.html) on the historical number of transistors in microprocessors.

In the code space below:
- Import the data set and break it up into two (1D) numpy arrays.
- Create a set of "theoretical" values for years from 1971 to 2019 based on the above equation. Use the zeroth data point from the data set as the values for $y_0$ and $n_0$.
- Plot the data using plt.semilogy(...,'.') and the theoretical curve using plt.semilogy(...,'-'), where you'll fill in the ...'s.
- Report to the Canvas quiz the relative difference between the theoretical prediction in 2019 and the final data point in the data set: subtract the final data point from the prediction, then divide by the final data point. (Thus, it is something like a percent error, though don't multiply it by 100.) E.g., if your theoretical curve predicts 518, and the last data point is 550, you would find
$$ \frac{518-550}{550} \approx -0.0582$$
(Report to four digits past the decimal.)
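
The arithmetic for the reported number can be sketched as follows, using the made-up values 518 and 550 from the example rather than the real data. The variable names are illustrative.

```python
theory = 518.0      # made-up theoretical prediction from the example
last_data = 550.0   # made-up final data point from the example

rel_diff = (theory - last_data) / last_data
print(f"{rel_diff:.4f}")   # formatted to four digits past the decimal
```

A negative result means the theoretical curve sits below the final data point.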

In [None]:
# code in here

## A big-picture question
Your phone is something like 100,000 times more powerful than the navigation computer on Apollo 11. Increased computational power, and increased access to that power, have led to amazing innovations.

But is it concerning that machine learning/AI data centers are consuming about as much power as GERMANY?! How can we balance the development/use of this technology with the need to move away from cheap fossil fuels?

"Fixing AI’s energy crisis" https://www.nature.com/articles/d41586-024-03408-z 