## File Input and Output (IO)

<img src = "EosDAQTestStand.PNG" width ="400" alt="Alt Text"/>

An experiment will often collect data and save it to a file.<br>
Python can open almost any type of file, provided you have the right library for it.<br>
For this class, we will just use simple text files, sometimes called raw data files.<br>
The functions we need for this are loadtxt() and savetxt() from numpy.

In [10]:
# Typically arrays are read in from a file (data file)
from numpy import loadtxt
#data = loadtxt("../PHYS4102_Sp2023/values.txt")
#print(data)
# Note that by default the values are floats, but we can force them to be ints...
data = loadtxt("values.txt", int)
#print(data)

# Notice that the values are delimited by a space 
# However, you can tell python to use other delimiters such as , : / ;
data = loadtxt("values.txt", delimiter = " ")
print(data)

[[ 1  2  3  4]
 [ 5  6 17  8]
 [ 9 10 11 12]]
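
As a minimal sketch of the delimiter option, the example below reads comma-separated values. The in-memory `StringIO` text and the names `csv_text` and `csv_data` are assumptions made here so the example runs without a file on disk; they are not part of the class files.

```python
from io import StringIO
from numpy import loadtxt

# Hypothetical stand-in for a comma-separated data file on disk.
csv_text = StringIO("1,2,3,4\n5,6,17,8\n9,10,11,12\n")

# delimiter="," splits each line on commas;
# dtype=int forces integers instead of the default floats.
csv_data = loadtxt(csv_text, dtype=int, delimiter=",")
print(csv_data)
```

With a real file you would pass the file name (for example "values.txt") in place of `csv_text`.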


#### Arrays
Notice that the printed data is wrapped in square brackets. In Python, square brackets denote lists or arrays. You can think of a 2D array as a matrix from linear algebra or as the rows and columns of an Excel sheet.

<img src = "numpy_arrays-1024x572.png" width=400/>

In [131]:
# Often you don't know how many rows and columns a data file contains.
# You can find out with the array's shape attribute,
# which gives (number of rows, number of columns).
print(data.shape)

(3, 4)


In [132]:
# If you use len() instead, it gives you only the number of ROWS, not the total number of elements.
# That's because Python considers each row (a list of 4 things) to be one element.
print(len(data))

3


In [5]:
# To get the total number of elements, multiply the number of rows by the number of columns
print(data.shape[0]*data.shape[1]) # This will tell you how many elements total.

12
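
A short sketch of two shortcuts for the same bookkeeping: unpacking `shape` into two variables, and the array's `size` attribute. The array `example` and the names `n_rows` and `n_cols` are made up here for illustration.

```python
import numpy as np

# Hypothetical array with the same values as values.txt.
example = np.array([[1, 2, 3, 4],
                    [5, 6, 17, 8],
                    [9, 10, 11, 12]])

# shape is a tuple, so it can be unpacked into two variables.
n_rows, n_cols = example.shape
print(n_rows, n_cols)

# size is the total number of elements (rows * columns for a 2D array).
print(example.size)
```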


### Getting Maximum / Minimum and Index of Arrays

After loading a file, it is often useful to find the maximum or minimum value, for example to set the range of your plots.<br>

We would also like to know in which row and column the min and max are located.<br>

In [140]:
from numpy import amax, amin, where
print(data)
#Print out the maximum value of the data
print("Max of the array is ",amax(data))
#Print out the minimum value of the data
print("Min of the array is ",amin(data))

#Get the position of the maximum value
index = where(data == amax(data))
#index = where(data == 17)
print("The index is ",index)
#print("The data is ",data)
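# Note that where() returns a tuple of index arrays, one array per dimension.
# For a 2D array this is a 2-tuple: index[0] holds the row indices and index[1] the column indices.
# Each array lists every match, so we print just the first element of each:
# index[0][0] is the row of the maximum and index[1][0] is its column.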

print("The first row and column of the index array ",index[0][0])

print("The 2nd row and 1st column of the index array ", index[1][0])

[[ 1.  2.  3.  4.]
 [ 5.  6. 17.  8.]
 [ 9. 10. 11. 12.]]
Max of the array is  17.0
Min of the array is  1.0
The index is  (array([1], dtype=int64), array([2], dtype=int64))
The first row and column of the index array  1
The 2nd row and 1st column of the index array  2
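
An alternative sketch for locating the maximum uses `argmax` together with `unravel_index`. It is not the method used above, only another option. The array `example` and the names `flat_position`, `row` and `col` are assumptions for illustration.

```python
import numpy as np

# Hypothetical array with the same values as values.txt.
example = np.array([[1., 2., 3., 4.],
                    [5., 6., 17., 8.],
                    [9., 10., 11., 12.]])

# argmax gives the position of the maximum in the flattened array.
flat_position = np.argmax(example)

# unravel_index converts that flat position back to (row, column).
row, col = np.unravel_index(flat_position, example.shape)
print("Row:", row, "Column:", col)
print("Value there:", example[row, col])
```

Unlike `where`, `argmax` reports only the first occurrence if the maximum appears more than once.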


### Writing a Data file:
You will sometimes also need to save data to a file.<br>
This can be done using numpy's savetxt function.<br>
The format string (for example fmt="%6.2f") is often the hardest part to remember.<br>
The documentation is here: https://numpy.org/doc/stable/reference/generated/numpy.savetxt.html<br>
You can also find it by searching for "numpy savetxt".<br>

In [12]:
from numpy import savetxt
# Save the data array loaded above; uncomment a line to try a different delimiter or format
#savetxt("mysave.txt", data, delimiter = " ")
#savetxt("mysave.txt", data, delimiter = ",")
#savetxt("mysave.txt", data, fmt="%6.2e", delimiter = " ")
savetxt("mysave.txt", data, fmt="%1.4e", delimiter = ",")
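
To see what a format string does, here is a small round-trip sketch. It writes to an in-memory buffer instead of a file so that it is self-contained; the names `example`, `buffer` and `reloaded` are hypothetical.

```python
from io import StringIO
import numpy as np

example = np.array([[1.0, 2.5],
                    [3.25, 1234.5678]])

# "%8.3f" means: at least 8 characters wide, 3 digits after the decimal point.
buffer = StringIO()
np.savetxt(buffer, example, fmt="%8.3f", delimiter=",")
print(buffer.getvalue())

# Rewind the buffer and read the text back in with the same delimiter.
buffer.seek(0)
reloaded = np.loadtxt(buffer, delimiter=",")
print(reloaded)
```

Only three decimals are written, so the reloaded values agree with the originals to about 0.001. Choose the format to match the precision of your data.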

## Ch 2.5: For Loops

<h4> The most useful loop you need for coding is the for loop. <br>
Most loops you will write can be expressed as a for loop; a while loop is mainly needed when you do not know in advance how many iterations are required.<br></h4>

In [149]:
# For repetitive tasks like these you can use a for loop
# for    changing_item       in       the_list_of_stuff_you_are_looping_through :
    # tab over once to let python know you want to be inside the loop
    # More code
    # More code
# Once you're done you remove the tab; this line will be outside the loop.

# For example if we want to sum up all the elements of this vector
vec = [1 , 2 , 3, 4, 5]
Sum = 0
for eachElement in vec :
    Sum = Sum+eachElement
    #print(eachElement)

print(Sum)

15


In [14]:
# We can do a computation on each element inside the for loop,
# but the result is only printed, not stored, so it is gone once the loop moves on.
from numpy import sin
vec = [1 , 2 , 3, 4, 5]
for bob in vec :
    print (sin(bob))

0.8414709848078965
0.9092974268256817
0.1411200080598672
-0.7568024953079282
-0.9589242746631385
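
If you want to keep the results, one option is to append them to a list inside the loop. This is a minimal sketch; the names `sines` and `value` are chosen here for illustration.

```python
from numpy import sin

vec = [1, 2, 3, 4, 5]

# Start with an empty list and add one result per pass through the loop.
sines = []
for value in vec:
    sines.append(sin(value))

# The stored results are still available after the loop has finished.
print(sines)

# The loop variable also still exists; it holds the last element visited.
print(value)
```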


### Iterators
The examples so far looped through EVERY element in the list or array.<br>
But what if we want to control which elements we go through?<br>
An iterator is a sequence of numbers (typically running from 0 to N-1) that we can use as indices.<br>

In [154]:
# create a range of numbers from 0-9 and print it out one at a time
#print(range(10))

for ival in range(10):
    print(ival)

0
1
2
3
4
5
6
7
8
9


In [11]:
# create a range of numbers from 5-9 and print it out one at a time
for ival in range(5,10):
    print(ival)

5
6
7
8
9


In [155]:
# create a range of numbers from 0-9 iterating by 2 instead and print it out one at a time
for ival in range(0,10,2):
    print(ival)

0
2
4
6
8


In [15]:
# The built-in range function has limitations: it only accepts integer steps.
# Trying to go from 0 to 10 in steps of 1.5 raises a TypeError.
for ival in range(0,10,1.5):
    print(ival)

TypeError: 'float' object cannot be interpreted as an integer

In [16]:
# numpy's arange function allows for non-integer steps.
from numpy import arange
for ival in arange(0,10,1.5):
    print(ival)

0.0
1.5
3.0
4.5
6.0
7.5
9.0
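
A related numpy function, `linspace`, is sketched below as an alternative when you know how many points you want rather than the step size. The names `stepped` and `spaced` are hypothetical.

```python
import numpy as np

# arange takes a step size; the stop value (10) is excluded.
stepped = np.arange(0, 10, 1.5)

# linspace takes the number of points instead; both end points are included.
spaced = np.linspace(0, 9, 7)

for ival in spaced:
    print(ival)
```

Both calls produce the same seven values here. With non-integer steps, `linspace` makes it easier to be sure the last point is the one you expect.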


In [157]:
# Let's look at how we can use iterator in our 2D data array
# Print out the array

print("This is the data matrix:\n",data)
for irow in range(0,data.shape[0],1):
    for icol in range(0,data.shape[1],1):
        print( data [ irow , icol ] )

This is the data matrix:
 [[ 1.  2.  3.  4.]
 [ 5.  6. 17.  8.]
 [ 9. 10. 11. 12.]]
1.0
2.0
3.0
4.0
5.0
6.0
17.0
8.0
9.0
10.0
11.0
12.0
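
The nested loops above can also be written with numpy's `ndenumerate`, which hands back the index and the value together. This is only a sketch of an alternative; the small array `example` and the list `visited` are made up for illustration.

```python
import numpy as np

example = np.array([[1.0, 2.0],
                    [3.0, 4.0]])

# ndenumerate walks through the array row by row and yields
# ((row, column), value) for every element.
visited = []
for (irow, icol), value in np.ndenumerate(example):
    print(irow, icol, value)
    visited.append((irow, icol, value))
```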


### Parsing Data (slice and dice)
Not all data will be written in nice arrays like values.txt.<br>
Let's look at values2.txt.<br>

You can see that some of the rows aren't even numbers.<br>
There is also a mix of float and integer values.<br>
There's also a row with NAN (Not A Number). This typically happens when a calculation such as 0/0 has no defined result, or when an experiment records a value that makes no sense.<br>

In [17]:
import numpy as np
#data = np.loadtxt("values2.txt")

# Skipped the documentation rows.
data = np.loadtxt("values2.txt",skiprows=3)
#print("Skipped the first three rows\n",data)

#print("The invalid number is:", data[8,1])
#data[8,1]=0
#print("The changed number is:", data[8,1])

# you can manually count the number of rows down and skip that row
#print("Skipped the nan row\n", data[0:8,0:2], "\n", data[9,0:2])

# Print which entries are NAN (True where the value is NAN).
#print(np.isnan(data))
# But we actually want the entries that are NOT NAN
#print(~np.isnan(data))

print(data)
# The problem is that row 9 has the NAN value, but only in column 2.
# Calling any(axis=1) on the boolean array flags every row
# that contains ANY NAN; the ~ then inverts it to mark the good rows.
#print(~np.isnan(data).any(axis=1))

# Now we can make a copy of the data that does not contain that row.
data2 = data[~np.isnan(data).any(axis=1)]
#print(data2)

# You can do the same for columns, but here it would throw away the whole second column
data3 = data[:,~np.isnan(data).any(axis=0)]
print(data3)


[[ 1.000e+00  1.500e+00]
 [ 2.000e+00  2.300e+00]
 [ 3.000e+00 -1.000e-01]
 [ 4.000e+00  1.400e-02]
 [ 5.000e+00  8.900e+00]
 [ 6.000e+00  1.252e+01]
 [ 7.000e+00  3.000e-03]
 [ 8.000e+00  1.570e+01]
 [ 9.000e+00        nan]
 [ 1.000e+01  1.230e+00]]
[[ 1.]
 [ 2.]
 [ 3.]
 [ 4.]
 [ 5.]
 [ 6.]
 [ 7.]
 [ 8.]
 [ 9.]
 [10.]]
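
The row filter can be easier to follow when split into steps. Below is a sketch on a small made-up array; `example`, `has_nan` and `clean` are hypothetical names.

```python
import numpy as np

# Hypothetical three-row array with one NAN, similar to the end of values2.txt.
example = np.array([[8.0, 15.7],
                    [9.0, np.nan],
                    [10.0, 1.23]])

# Step 1: True for every row that contains at least one NAN.
has_nan = np.isnan(example).any(axis=1)
print(has_nan)

# Step 2: keep only the rows where has_nan is False.
clean = example[~has_nan]
print(clean)
```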


In [24]:
# The where function is also extremely useful.
print(data)
#print(data[np.where(data > 5)])
# Notice that this only gives you the values bigger than 5 anywhere in the array (as a flat list).

# This picks out every row whose column 1 value is bigger than 10.
print(data[np.where(data[ : , 1 ] > 10)])

# You can do the opposite cut of course...
#print(data[np.where(data[:,1] < 10)])


[[ 1.000e+00  1.500e+00]
 [ 2.000e+00  2.300e+00]
 [ 3.000e+00 -1.000e-01]
 [ 4.000e+00  1.400e-02]
 [ 5.000e+00  8.900e+00]
 [ 6.000e+00  1.252e+01]
 [ 7.000e+00  3.000e-03]
 [ 8.000e+00  1.570e+01]
 [ 9.000e+00        nan]
 [ 1.000e+01  1.230e+00]]
[[ 6.   12.52]
 [ 8.   15.7 ]]
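
The same selection also works with the boolean array directly, without calling `where`, and conditions can be combined with `&`. This is a sketch on made-up data; the names `example`, `mask`, `selected` and `both` are assumptions.

```python
import numpy as np

example = np.array([[5.0, 8.9],
                    [6.0, 12.52],
                    [8.0, 15.7],
                    [9.0, np.nan]])

# A comparison on a column gives one True/False per row.
# A NAN compares as False, so that row is never selected.
mask = example[:, 1] > 10
selected = example[mask]
print(selected)

# Combine two conditions with & and put each one in parentheses.
both = example[(example[:, 1] > 10) & (example[:, 0] < 8)]
print(both)
```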


### Parsing Data Part 2 (if else method)
The slice and dice method is very fast, but it lacks some of the control you might want. <br>
For example, you might want to replace the NAN value above with some default value. <br>
With slicing alone, you would have to do some extra array manipulation to get to it; looping with if/else lets you handle each element individually.<br>


In [25]:
print(data)
for irow in arange(0,data.shape[0],1):
    for icol in arange(0,data.shape[1],1):
        if( np.isnan(data[ irow , icol ])):
            print("Found a NAN so I should replace it! [",irow,",",icol,"]")
            #data [ irow , icol ] = 0


[[ 1.000e+00  1.500e+00]
 [ 2.000e+00  2.300e+00]
 [ 3.000e+00 -1.000e-01]
 [ 4.000e+00  1.400e-02]
 [ 5.000e+00  8.900e+00]
 [ 6.000e+00  1.252e+01]
 [ 7.000e+00  3.000e-03]
 [ 8.000e+00  1.570e+01]
 [ 9.000e+00        nan]
 [ 1.000e+01  1.230e+00]]
Found a NAN so I should replace it! [ 8 , 1 ]
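
For comparison, here is a sketch of the replacement done without loops, using a boolean mask and numpy's `nan_to_num`. The array `example` and the choice of `default_value = 0.0` are assumptions for illustration; pick a default that makes sense for your data.

```python
import numpy as np

example = np.array([[8.0, 15.7],
                    [9.0, np.nan],
                    [10.0, 1.23]])
default_value = 0.0  # hypothetical default

# Option 1: work on a copy and assign to every position where isnan is True.
cleaned = example.copy()
cleaned[np.isnan(cleaned)] = default_value
print(cleaned)

# Option 2: nan_to_num returns a new array with NAN replaced.
also_cleaned = np.nan_to_num(example, nan=default_value)
print(also_cleaned)
```

Working on a copy leaves the original array untouched, so you can still see where the NAN was.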


In [26]:
# You can make more cuts on each data value using if / elif / else
for irow in arange(0,data.shape[0],1):
    for icol in arange(0,data.shape[1],1):
        if( np.isnan(data[ irow , icol ])):
            print("Found a NAN so I should replace it! [",irow,",",icol,"]")
            #data [ irow , icol ] = 0
        elif(data[ irow , icol ] > 10):
            print("This number is greater than 10")
            
        elif(data[ irow , icol ] < 10):
            x =1  # placeholder: do nothing for values below 10
        else:
            print("Otherwise")  # only reached when the value is exactly 10
            


This number is greater than 10
This number is greater than 10
Found a NAN so I should replace it! [ 8 , 1 ]
Otherwise


In [117]:
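# You can also do something similar with while loops.
# Note that the loop stops at the first row that fails the test, so later rows below 10 are not printed.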
print(data)
irow = 0
while(data[irow,1] < 10):
    print(data[irow,:])
    irow=irow+1


[[ 1.000e+00  1.500e+00]
 [ 2.000e+00  2.300e+00]
 [ 3.000e+00 -1.000e-01]
 [ 4.000e+00  1.400e-02]
 [ 5.000e+00  8.900e+00]
 [ 6.000e+00  1.252e+01]
 [ 7.000e+00  3.000e-03]
 [ 8.000e+00  1.570e+01]
 [ 9.000e+00        nan]
 [ 1.000e+01  1.230e+00]]
[1.  1.5]
[2.  2.3]
[ 3.  -0.1]
[4.    0.014]
[5.  8.9]
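
One thing to watch with this while loop: if every row passes the test, `irow` runs past the end of the array and Python raises an IndexError. A minimal sketch of a guard, on a made-up array where all rows are below 10 (the names `example` and `printed_rows` are hypothetical):

```python
import numpy as np

example = np.array([[1.0, 1.5],
                    [2.0, 2.3],
                    [3.0, -0.1]])

irow = 0
printed_rows = []
# Check the row counter first; "and" stops evaluating as soon as
# irow reaches the number of rows, so example[irow, 1] is never out of range.
while irow < example.shape[0] and example[irow, 1] < 10:
    print(example[irow, :])
    printed_rows.append(example[irow, :])
    irow = irow + 1
```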


## Ch 2.6: User Defined Function

<h4> Computational operations often repeat themselves. Some of the programs you have already written could be useful in other programs.</h4>

Those of you who have taken Thermal know that we compute combinatorics a lot.

<p style="text-align: center;"> $ n! = \prod_{k=1}^{n} k$ </p><br>




In [106]:
# Let's do it first the old-fashioned way.

# variable to keep the result after each loop. Start with 1 because 0*anything = 0
result = 1.0

# Set which factorial we want to calculate.
n = 5
k = 1

while (k < n):
    k=k+1
    result = result*k
    
print(result)

120.0


In [190]:
# Now let's write a User Defined Function for it.
# To define a function
def factorial ( n ) :
    # variable to keep the running product. Start with 1 because 0*anything = 0
    result = 1.0
    # k counts up from 1 to n
    k = 1

    while (k < n):
        k=k+1
        result = result*k
    
    return result

In [193]:
print(factorial(0))

1.0


In [108]:
# Since we're in the same python program, we can use factorial function in other boxes.
print(factorial(8))

40320.0
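
A quick way to gain confidence in a user defined function is to compare it against a trusted implementation. The sketch below repeats the function so it runs on its own and checks it against `math.factorial` from the standard library; the name `all_match` is hypothetical.

```python
import math

def factorial(n):
    # Same logic as the function defined above.
    result = 1.0
    k = 1
    while k < n:
        k = k + 1
        result = result * k
    return result

# Compare with the standard library for n = 0 ... 9.
all_match = True
for n in range(0, 10):
    if factorial(n) != math.factorial(n):
        all_match = False
        print("Mismatch at n =", n)

print("All values match:", all_match)
```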


User defined functions will also be used when you fit data to a custom function.<br> 
Suppose you want a function that finds the solutions of:<br>
 
 $ 0  = Ax^2+Bx+C$
 
We can define this function.

In [194]:
from numpy import sqrt
def QuadEq(A,B,C):
    # Note the parentheses around 2*A: writing /2*A would divide by 2 and then multiply by A
    return  (-B+sqrt(B**2-4*A*C))/(2*A) , (-B-sqrt(B**2-4*A*C))/(2*A)

In [195]:
# Now you can input any parameter you need for the function to find the result.
print(QuadEq(1,9,18))

(-3.0, -6.0)


In [196]:
# But now there's a problem..
print(QuadEq(1,2,3))

(nan, nan)



In [197]:
# You have to account for every possibility in your function.
from numpy import sqrt

def QuadEq(A,B,C):
    if(B**2-4*A*C >= 0): # Check that the discriminant is non-negative
        return  (-B+sqrt(B**2-4*A*C))/(2*A) , (-B-sqrt(B**2-4*A*C))/(2*A)
    else:
        return 0,0 # no real roots: return an obviously wrong placeholder

In [198]:
print(QuadEq(1,2,3))

(0, 0)
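
Returning (0, 0) is one design choice. Another, sketched below, is to return the complex roots using `cmath.sqrt` from the standard library, which accepts negative numbers. The function name `quad_roots` is hypothetical and is not the `QuadEq` defined above.

```python
import cmath

def quad_roots(A, B, C):
    """Hypothetical variant of QuadEq that also handles a negative discriminant."""
    # cmath.sqrt returns a complex number instead of nan for negative input.
    root_disc = cmath.sqrt(B**2 - 4*A*C)
    return (-B + root_disc) / (2*A), (-B - root_disc) / (2*A)

# Real roots come back as complex numbers with zero imaginary part.
print(quad_roots(1, 9, 18))

# No real roots: the two solutions form a complex-conjugate pair.
print(quad_roots(1, 2, 3))
```

Which behaviour is better depends on the problem: for a fit or a physical length, complex roots may be just as unusable as nan, so a clear flag value can still be the right choice.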


## Ch 2.7 Programming Style

<h4>  

1) Comment a lot!!!<br>

2) Use meaningful variable names. A common trap is x_new ... x_newnew ...; instead use x_20180910 or x_TrapMethod.<br>

3) Use the right type of variable. As you read your code, use type(x) to remind yourself of the type.<br>

4) Import functions first. It is common courtesy; otherwise the imports are hidden and people won't know what the program depends on.

5) Name your constants. <br>
G_Newton = 6.67e-11<br>
R = 6471e3<br>
M=5.97e24<br>

6) Use functions when possible. 

7) Print out partial results.
</h4> 
8) Layout: make your code readable, at least to yourself! 

```python
distance_satelite = sqrt( position_satelite[0]**2+position_satelite[1]**2+position_satelite[2]**2)
```
You can break long lines 
```python
distance_satelite = sqrt( position_satelite[0]**2
                         +position_satelite[1]**2
                         +position_satelite[2]**2)
``` 
9) Make simple and clear programs. Don't do stuff like this:

result = func1(func2(func3(x[i:j,k,l])))

10) The IPython (Jupyter) Notebook is a new way to work through your problems and document your results.

11) ALWAYS use Markdown to insert the process, derivation or "thoughts" into your notebook.

12) Show your code to someone else before turning it in. See if they can follow your work and predict what the result will be. Then run the code and see whether what they predicted actually happens.

13) Replicating someone's code.
When you google something, always type the code in yourself instead of copying and pasting. This builds muscle memory, and it helps you notice syntax and logic you might otherwise miss.


14) If you're using someone's entire idea to do something, reference the code. Otherwise it is just like plagiarizing someone's paper. It also reminds you where you got the code from so you can look it up yourself later.

# Github
Git was created in 2005 by Linus Torvalds. It is a distributed version control system.

It allows users to "upload" multiple versions of file(s), and it keeps track of who uploaded each version, when, and a comment on what the revision is about. You will see this very often in any software you use.

![image.png](attachment:2d88f5f7-2aba-4bb2-aa78-7c96c799fa23.png)


# Github is an extremely useful and flexible tool, but it can be complicated.

![image.png](attachment:a9f6db7d-5e4b-40e0-b7b3-9cc5bca67995.png)

# Easy (using just the basic functions) Starter Guide for Github.

## Go to github.com and create an account / login.

## Create a new repository named phys4102 and make it public for now. 

![image.png](attachment:00d4ab25-cb0e-4ef4-8ae4-fe4b631de0ed.png)

## Create the README file for your repo by clicking on "Add Readme File" in your repository.
### Use your README file to tell your visitors what this repo contains and who you are, and let them know they can contact you through github messaging.

## Create/Find your Jupyter Notebook for Code 1 or others. 

## Click on Add File-> Upload File
![image.png](attachment:ff2670ee-680b-43ff-b3a9-87d47d0ba2cc.png)

## Select the file or drag it into your browser. Make sure to comment your commit: "This is the first commit of [[Filename.ipynb]]"
![image.png](attachment:15b30dd2-b72a-4f81-affc-d15a76b26a38.png)

## When you want to update the file, you can upload the file with EXACTLY the same name. It will keep the old version of the file but the site will display the latest version.

## When you are submitting the link to Canvas or sending the link to others, use the PERMA LINK for the file.
![image.png](attachment:1a212b21-3795-4605-80ea-458f8b92dab1.png)


