# Table of Contents
* [Common Python containers for data](#Common-Python-containers-for-data)
  * [Lists](#Lists)
    * [Exercise Set 1](#Exercise-Set-1)
    * [List Comprehensions](#List-Comprehensions)
    * [Exercise Set 2](#Exercise-Set-2)
  * [Dictionaries](#Dictionaries)
    * [Exercise Set 3](#Exercise-Set-3)
* [Numerical data types: NumPy](#Numerical-data-types:-NumPy)
  * [Exercise Set 4](#Exercise-Set-4)
* [Reading and writing text files in Python](#Reading-and-writing-text-files-in-Python)
  * [Exercise Set 5](#Exercise-Set-5)
 
<font size="1">Notebook by Nick Sherer and Matt Zhang</font>

## Introduction

In today's workshop, we'll be covering the most common ways to combine different pieces of data in Python and how to read and write files. When programming, you'll often find you have a collection of related items, like a collection of numbers describing the position of a particle over time or a collection of words making up a sentence. Because this occurs a lot, Python comes with datatypes for very general but useful ways you might want to aggregate data. However, sometimes you need to do more specific work like matrix algebra or numerics, and for this there is a Python library called NumPy that is less generic but much better suited for expressing math and doing computations quickly.

Python objects, though, only live as long as your program is running. And even if you kept your programs running forever, they might crash, or your battery might die, or the power might go out. To get around this, you'll need to read data in from files and write it out to files, so we'll cover that too.

## Common Python containers for data

### Lists
<font size="1">[Return to Table of Contents](#Table-of-Contents)</font>

Lists are the simplest data structures in Python, and you'll use them for absolutely everything. Let's start with a 1D list first.

In [1]:
myNumbers = [1123, 875, 3, 9, 188, 1]

This list stores six numbers, and we can get each entry in the list by using its index. One thing that sometimes trips people up is that _**Python starts counting at 0!**_ In this way it is like C and most other programming languages but different from Matlab.

In [2]:
print("What we'd call the first item in myNumbers has a 0 index:", myNumbers[0])
print("What we'd call the third item in myNumbers has a 2 index:", myNumbers[2])
print("We can also count down from the end of a list. The last element of a list has a -1 index:", myNumbers[-1])
print("And the second to last item in a list has a -2 index:", myNumbers[-2])

What we'd call the first item in myNumbers has a 0 index: 1123
What we'd call the third item in myNumbers has a 2 index: 3
We can also count down from the end of a list. The last element of a list has a -1 index: 1
And the second to last item in a list has a -2 index: 188


You can choose subsets of a list using a syntax called slicing, which involves lots of colons. What's returned is itself a list. Note that the first index in the slice is included, but the second is not.

In [3]:
a = list(range(10))
b = a[2:6] # from index 2 up to (but not including) index 6
c = a[:4] # if you skip the first number, it's assumed to be 0, so this is the same as a[0:4]
d = a[5:] # if you skip the last number, the slice runs through the end of the list
e = a[2:6:2] # a second colon and a third number z take only every zth element of the slice
f = a[::3] # a double colon followed by a number x takes every xth element of the whole list
g = a[2::3] # every 3rd element, starting at index 2
print('a is', a)
print('b is', b)
print('c is', c)
print('d is', d)
print('e is', e)
print('f is', f)
print('g is', g)

a is [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
b is [2, 3, 4, 5]
c is [0, 1, 2, 3]
d is [5, 6, 7, 8, 9]
e is [2, 4]
f is [0, 3, 6, 9]
g is [2, 5, 8]
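
Slices also accept negative indices and a negative step, which the examples above don't cover. Here is a small sketch; the list `letters` and the other names are made up for this illustration.

```python
# Hypothetical example list; any list slices the same way.
letters = ['a', 'b', 'c', 'd', 'e', 'f']

last_two = letters[-2:]      # from the second-to-last item through the end
all_but_last = letters[:-1]  # everything except the final item
backwards = letters[::-1]    # a step of -1 walks the list in reverse

print('last_two is', last_two)
print('all_but_last is', all_but_last)
print('backwards is', backwards)
print('letters is unchanged:', letters)
```

Like the slices above, each of these returns a new list and leaves the original alone.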


Lists can be modified after being created. There are special methods (functions) for doing this.

In [4]:
print('At first a is', a)
a.pop() # remove the last element from a list
print('Now a is', a)
a.pop(3) # remove the element at index 3 (the fourth element) from a list
print('Now a is', a)
a.append(36) # put a 36 at the end of a list
print('Now a is', a)
a.insert(2, 80) # insert 80 at index 2, shifting the later elements to the right
print('Now a is', a)
a.remove(4) # remove the element 4 from the list. This will throw an error if the element is not in the list.
print('Finally a is', a)

At first a is [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
Now a is [0, 1, 2, 3, 4, 5, 6, 7, 8]
Now a is [0, 1, 2, 4, 5, 6, 7, 8]
Now a is [0, 1, 2, 4, 5, 6, 7, 8, 36]
Now a is [0, 1, 80, 2, 4, 5, 6, 7, 8, 36]
Finally a is [0, 1, 80, 2, 5, 6, 7, 8, 36]


A list can be used to store any kind of data. Another common type of data to work with is a list of strings.

In [5]:
dogNames = ['Fido', 'Fluffy', 'Spot', 'Romulus', "Qo'noS"]

It's also common that you may want to sort a list of strings to be in alphabetical order.

In [6]:
dogNames.sort()
print('In alphabetical order our dogs are named', dogNames)

In alphabetical order our dogs are named ['Fido', 'Fluffy', "Qo'noS", 'Romulus', 'Spot']


Notice that calling sort on a list changes the list itself. This may or may not be the behavior you want from your program. If you don't want to change a list, you should first make a copy of it.

In [7]:
import copy
catNames = ['Cheshire', 'Aslan', 'Sphinx', 'Minx']
sortedCatNames = copy.deepcopy(catNames)
sortedCatNames.sort()
print("catNames is", catNames)
print("sortedcatNames is", sortedCatNames)

catNames is ['Cheshire', 'Aslan', 'Sphinx', 'Minx']
sortedcatNames is ['Aslan', 'Cheshire', 'Minx', 'Sphinx']
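
If all you need is a sorted copy, Python's built-in `sorted` function is a shorter route than copying and then sorting. A minimal sketch, using a made-up list `birdNames` so it doesn't disturb the lists above:

```python
# Hypothetical list used only for this illustration.
birdNames = ['Wren', 'Albatross', 'Sparrow', 'Kiwi']

# sorted() builds and returns a new sorted list; birdNames is not modified.
sortedBirdNames = sorted(birdNames)

print('birdNames is', birdNames)
print('sortedBirdNames is', sortedBirdNames)
```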


Lists can not only be used to store any kind of data, they can store arbitrary combinations of different kinds of data including other lists.

In [8]:
fours = [4, 'four', [1, 2, 3, 4], [[[4]]] ]
print('fours[2][2] is', fours[2][2])
print('fours[3][0][0] is', fours[3][0][0])

fours[2][2] is 3
fours[3][0][0] is [4]
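
Nested lists are where the choice of copy function starts to matter. The sketch below (with a made-up list `nested`) compares `copy.copy`, which copies only the outer list, with `copy.deepcopy`, which copies the inner lists as well.

```python
import copy

# Hypothetical nested list, in the spirit of `fours` above.
nested = [1, [2, 3]]

shallow = copy.copy(nested)    # new outer list, but the inner list is shared
deep = copy.deepcopy(nested)   # the inner list is copied too

nested[1].append(4)            # modify the inner list of the original

print('nested is', nested)     # [1, [2, 3, 4]]
print('shallow is', shallow)   # [1, [2, 3, 4]] because the inner list is shared
print('deep is', deep)         # [1, [2, 3]]
```

For a flat list of numbers or strings either kind of copy behaves the same; the difference only shows up when the list contains other mutable objects.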


***
#### **<span style="color:blue">Exercise Set 1</span>**
<font size="1">[Return to Table of Contents](#Table-of-Contents)</font>

**Practice with Indexing**

Access the number 7 in the following lists. Remember to print out the number you access so that you see your answer. Don't assume you are right!

In [9]:
sevenList0 = [0, 3, 4, 7, 3]
sevenList1 = [0, 9, 5, 2, [7, 1]]
sevenList2 = [1, [4], ['cat', [5, [8, 7] ] ], 13]
#your code here
print(sevenList0[3])
print(sevenList1[4][0])
print(sevenList2[2][1][1][1])

7
7
7


**Math with Lists? Not really**

What happens if you multiply a list by an integer? Try it out.

What happens if you add two lists together? Try it out.

Were these the behaviors you expected? Are these the behaviors you would want if you wanted lists to work like mathematical vectors?

In [10]:
#your code here
list_multiply = 5*[6,1,4]
print(list_multiply)
list_add = ['Tom', 'Dick', 'Harry'] + ['Alice', 'Bob', 'Eve']
print(list_add)

[6, 1, 4, 6, 1, 4, 6, 1, 4, 6, 1, 4, 6, 1, 4]
['Tom', 'Dick', 'Harry', 'Alice', 'Bob', 'Eve']


***

#### List Comprehensions
<font size="1">[Return to Table of Contents](#Table-of-Contents)</font>

Because making new lists is such a common operation in Python, there is a special syntax for it. These are called list comprehensions, and they look like this:

In [11]:
evens = [i*2 for i in range(10)]
print('even is', evens)
x = [1, 6, 2, 9, 5]
y = [i*2+6 for i in x]
print('y is', y)
z = [i if i>3 else 0 for i in x]
print('z is',z)

even is [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
y is [8, 18, 10, 24, 16]
z is [0, 6, 0, 9, 5]


It may help to know the code for z is equivalent to the following much longer code

In [12]:
z = []
for i in x:
    if i > 3:
        z.append(i)
    else:
        z.append(0)
print(z)

[0, 6, 0, 9, 5]
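
The `if ... else` used for z picks a value for every item, so the result has the same length as x. A comprehension can also drop items by putting an `if` at the end. A short sketch contrasting the two forms (the names `big_only` and `replaced` are just for illustration):

```python
x = [1, 6, 2, 9, 5]  # same values as the x used above

# `if` at the end filters: items that fail the test are left out entirely.
big_only = [i for i in x if i > 3]

# `if ... else` before `for` keeps every item and chooses a value for it.
replaced = [i if i > 3 else 0 for i in x]

print('big_only is', big_only)   # [6, 9, 5]
print('replaced is', replaced)   # [0, 6, 0, 9, 5]
```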


Anyways, this syntax is so handy that it's important to get very comfortable with it, so here are some more exercises.

***

#### **<span style="color:blue">Exercise Set 2</span>**
<font size="1">[Return to Table of Contents](#Table-of-Contents)</font>

**Practice with list comprehensions**

Problem 1: Create a list of odd numbers from 1 to 57.

In [13]:
# your code here
list1 = [i*2+1 for i in range(29)]
print(list1)

[1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57]


Problem 2: Make a list of squares from 1 to 10,000.

In [14]:
# your code here
list2 = [i*i for i in range(1, 101)]
print(list2)

[1, 4, 9, 16, 25, 36, 49, 64, 81, 100, 121, 144, 169, 196, 225, 256, 289, 324, 361, 400, 441, 484, 529, 576, 625, 676, 729, 784, 841, 900, 961, 1024, 1089, 1156, 1225, 1296, 1369, 1444, 1521, 1600, 1681, 1764, 1849, 1936, 2025, 2116, 2209, 2304, 2401, 2500, 2601, 2704, 2809, 2916, 3025, 3136, 3249, 3364, 3481, 3600, 3721, 3844, 3969, 4096, 4225, 4356, 4489, 4624, 4761, 4900, 5041, 5184, 5329, 5476, 5625, 5776, 5929, 6084, 6241, 6400, 6561, 6724, 6889, 7056, 7225, 7396, 7569, 7744, 7921, 8100, 8281, 8464, 8649, 8836, 9025, 9216, 9409, 9604, 9801, 10000]


Problem 3: Make a new list from <code>alter_list</code> below, where every even value is left the same and every odd value is multiplied by -1.

In [15]:
# your code here
alter_list = [12, 56, 23, 35, 123, 32, 65, 131, 54]
list3 = [i*-1 if i%2==1 else i for i in alter_list]
print(list3)

[12, 56, -23, -35, -123, 32, -65, -131, 54]


***

### Dictionaries
<font size="1">[Return to Table of Contents](#Table-of-Contents)</font>

Lists are probably the Python datatype you'll see the most. They're very simple to work with and very general. However, they can be very awkward for some purposes. For example, what if you wanted a data structure with the masses of various planets? You could make a list of their masses in order of average orbital distance starting with Mercury, then going to Venus, then Earth, etc., but then anyone reading your code would have to know the order of the planets to understand it. And what if you had ordered the planets by mass instead of distance? Or what if whatever you were working with didn't have any convenient ordering? Well, as long as you can name your items, dictionaries may be a better choice than a list.

In [16]:
planet_masses = {'Venus': 0.815, 'Earth': 1, 'Mars': 0.11,
                 'Jupiter': 317.8, 'Saturn': 95.2, 'Uranus': 14.6, 'Neptune': 17.2}
# The masses are in units of earth masses
print("The planets of our solar system and their masses in multiples of the earth's mass are", planet_masses)

The planets of our solar system and their masses in multiples of the earth's mass are {'Neptune': 17.2, 'Jupiter': 317.8, 'Earth': 1, 'Venus': 0.815, 'Uranus': 14.6, 'Saturn': 95.2, 'Mars': 0.11}


One thing that might strike you as odd (or not, I was surprised but maybe you aren't) is that the planets are not printed out in the order you put them in the dictionary. The output above comes from an older version of Python, where dictionaries did not preserve order. Since Python 3.7, dictionaries are guaranteed to keep their keys in insertion order, so on a current version you will see the planets in the order they were written. (On older versions, the <code>OrderedDict</code> class in the standard <code>collections</code> module provides that guarantee.) Either way, when you need to access elements of a dictionary you access them by their "key", i.e. name, not by position.

In [17]:
print('The mass of Jupiter is', planet_masses['Jupiter'], 'earth masses.')

The mass of Jupiter is 317.8 earth masses.


If you try to access a dictionary with a key that isn't in it, you'll get a KeyError.

In [18]:
try:
    print('The mass of Pluto is', planet_masses['Pluto'], 'earth masses.') #Pluto'sNotAPlanet
except KeyError as e:
    print('KeyError:', e, 'is not a planet.')

KeyError: 'Pluto' is not a planet.
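
If you would rather check for a key than catch the error, dictionaries support the `in` operator and a `get` method. A minimal sketch, using a small made-up dictionary `moon_counts` so it runs on its own:

```python
# A small hypothetical dictionary so this cell runs on its own.
moon_counts = {'Earth': 1, 'Mars': 2}

# `in` tests whether a key is present.
has_mars = 'Mars' in moon_counts
has_pluto = 'Pluto' in moon_counts
print(has_mars, has_pluto)

# .get() returns None (or a default you supply) instead of raising KeyError.
missing = moon_counts.get('Pluto')
missing_with_default = moon_counts.get('Pluto', 0)
print(missing, missing_with_default)
```

Which style to use is a judgment call: `try`/`except` fits when a missing key is a real error, while `get` fits when a missing key is expected and has a sensible default.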


However, it's perfectly fine to add a new key and value to the dictionary. For example, I forgot to put the value of the mass of Mercury in the dictionary.

In [19]:
planet_masses['Mercury'] = .0553
print(planet_masses['Mercury'])

0.0553


If you iterate across a dictionary, you'll iterate across the keys of the dictionary. If you want to iterate across the values of some dictionary <code>foo</code> call <code>foo.values()</code> and if you want to iterate across the keys and values call <code>foo.items()</code>.

In [20]:
for thing in planet_masses:
    print(thing)

Mercury
Neptune
Jupiter
Earth
Venus
Uranus
Saturn
Mars


In [21]:
for thing in planet_masses.values():
    print(thing)

0.0553
17.2
317.8
1
0.815
14.6
95.2
0.11


In [22]:
for key, value in planet_masses.items():
    print('The mass of', key, 'is', value, 'earth masses.')

The mass of Mercury is 0.0553 earth masses.
The mass of Neptune is 17.2 earth masses.
The mass of Jupiter is 317.8 earth masses.
The mass of Earth is 1 earth masses.
The mass of Venus is 0.815 earth masses.
The mass of Uranus is 14.6 earth masses.
The mass of Saturn is 95.2 earth masses.
The mass of Mars is 0.11 earth masses.
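
When the order of the output matters, it is usually clearer to sort explicitly than to rely on the order of the dictionary. A sketch under the assumption that you want the planets either alphabetically or by mass (the dictionary `masses` is a made-up subset so the cell is self-contained):

```python
# Hypothetical data so the example is self-contained (masses in earth masses).
masses = {'Jupiter': 317.8, 'Earth': 1, 'Saturn': 95.2}

# sorted() on a dictionary returns its keys in sorted order.
names_in_order = sorted(masses)
for name in names_in_order:
    print(name, masses[name])

# Sort the (key, value) pairs by value, largest first.
by_mass = sorted(masses.items(), key=lambda pair: pair[1], reverse=True)
print(by_mass)
```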


***

#### **<span style="color:blue">Exercise Set 3</span>**
<font size="1">[Return to Table of Contents](#Table-of-Contents)</font>

Make a dictionary of the planets' average radius (distance from the center to the surface). Use units of earth radius. You can get the data from [Wikipedia](https://en.wikipedia.org/wiki/List_of_Solar_System_objects_by_size).

Then calculate the acceleration due to gravity at each planet's surface in units of the acceleration due to gravity at the surface of the earth, i.e. in units of *g*.

In [23]:
#your code here
planet_radii = {'Mercury': .3829, 'Venus': 0.9499, 'Earth': 1, 'Mars': 0.5320,
                 'Jupiter': 10.97, 'Saturn': 9.14, 'Uranus': 3.981, 'Neptune': 3.865}

planet_gravity = {}
for key in planet_masses:
    planet_gravity[key] = planet_masses[key]/planet_radii[key]**2
    print("In units of g, the acceleration due to gravity on", key, "is", planet_gravity[key])

In units of g, the acceleration due to gravity on Mercury is 0.3771849872735875
In units of g, the acceleration due to gravity on Neptune is 1.151408550882049
In units of g, the acceleration due to gravity on Jupiter is 2.640831172111892
In units of g, the acceleration due to gravity on Earth is 1.0
In units of g, the acceleration due to gravity on Venus is 0.9032372366122815
In units of g, the acceleration due to gravity on Uranus is 0.9212309083570219
In units of g, the acceleration due to gravity on Saturn is 1.13957931328376
In units of g, the acceleration due to gravity on Mars is 0.38865961897224255


***

## Numerical data types: NumPy
<font size="1">[Return to Table of Contents](#Table-of-Contents)</font>

Lists and dictionaries are very useful general datatypes, but they aren't well suited to writing clear mathematical code or doing mathematical computations quickly.

Arrays work like lists in some ways, but add a lot of extra features. They're not in basic Python, so you'll have to import the NumPy (numeric Python) package to use them.

Because we use NumPy so much for so many different things, it's common to just call NumPy "np" when you import it. You can turn a list into a numpy array with the np.array() function. A list of lists of the same length will be turned into a 2-dimensional array. Numpy arrays have their elements indexed the same way as nested lists would and have the same slicing syntax.


In [24]:
import numpy as np
array_1d = np.array([2, 3, 5])
array_2d = np.array([[3.3, -7, 1],
                     [2.1, 45, 3],
                     [0.9, 23, 1]])
print('array_1d is', array_1d)
print('array_2d is\n', array_2d)
print('array_1d[1] is', array_1d[1])
print('array_2d[1][1] is', array_2d[1][1])

array_1d is [2 3 5]
array_2d is
 [[  3.3  -7.    1. ]
 [  2.1  45.    3. ]
 [  0.9  23.    1. ]]
array_1d[1] is 3
array_2d[1][1] is 45.0
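
Besides the nested-list style `array_2d[1][1]`, NumPy arrays accept a single comma-separated index, which also makes it easy to slice out whole rows or columns. A short sketch (the array `grid` repeats the values of `array_2d` so the cell runs on its own):

```python
import numpy as np

# Same values as array_2d above, repeated so this cell runs on its own.
grid = np.array([[3.3, -7, 1],
                 [2.1, 45, 3],
                 [0.9, 23, 1]])

center = grid[1, 1]        # same element as grid[1][1]
first_column = grid[:, 0]  # every row, column 0
top_right = grid[:2, 1:]   # rows 0-1, columns 1-2

print('center is', center)
print('first_column is', first_column)
print('top_right is\n', top_right)
```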


However, there are some things lists can do which numpy arrays cannot do. For example, numpy handles non-rectangular arrays in a very strange way. So this is legal numpy syntax although this array won't behave very well if you use it in operations...

In [25]:
legal_jagged_array = np.array([[1],
                              [3, 4, 9],
                              'cat'])
print(legal_jagged_array)

[[1] [3, 4, 9] 'cat']


but this is not

In [26]:
try:
    illegal_jagged_array = np.array([1,
                                    [3, 4, 9],
                                    'cat'])
    print(illegal_jagged_array)
except ValueError:
    print("You can't do that.")

You can't do that.


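You'll notice that the legal and illegal arrays have *almost* identical syntax. Personally, I have no idea why one should be legal and the other illegal. (The output above comes from an older NumPy; newer releases appear to be stricter and may reject the first example too unless you pass <code>dtype=object</code>.) But you shouldn't try making weird arrays of inhomogeneous objects or inconsistent dimensions anyways. Stick to rectangular arrays. Stay inside the box.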

Numpy has lots of great features for constructing common types of arrays and matrices. It also has ways to return the basic properties of an array.

In [27]:
zeros = np.zeros(5) # a 1D array of five zeros
ones = np.ones((3,3)) # a 3x3 matrix of all ones
line = np.linspace(0,10, 41) # 41 numbers from 0 to 10 including the endpoints (0 and 10)
print('zeros is', zeros)
print('The shape of zeros is', zeros.shape)
print('ones is\n', ones)
print('The shape of ones is', ones.shape)
print('line is', line)
print('The size of line is', line.size)

zeros is [ 0.  0.  0.  0.  0.]
The shape of zeros is (5,)
ones is
 [[ 1.  1.  1.]
 [ 1.  1.  1.]
 [ 1.  1.  1.]]
The shape of ones is (3, 3)
line is [  0.     0.25   0.5    0.75   1.     1.25   1.5    1.75   2.     2.25
   2.5    2.75   3.     3.25   3.5    3.75   4.     4.25   4.5    4.75   5.
   5.25   5.5    5.75   6.     6.25   6.5    6.75   7.     7.25   7.5
   7.75   8.     8.25   8.5    8.75   9.     9.25   9.5    9.75  10.  ]
The size of line is 41
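
A close relative of `np.linspace` is `np.arange`, and the two are easy to mix up. The sketch below contrasts them on a small range; the variable names are made up for illustration.

```python
import numpy as np

# linspace: you choose how many points; both endpoints are included.
by_count = np.linspace(0, 1, 5)

# arange: you choose the step; the stop value is excluded, like range().
by_step = np.arange(0, 1, 0.25)

print('by_count is', by_count)
print('by_step is', by_step)
```

With a non-integer step, rounding can make the length of an `arange` result hard to predict, so `linspace` is often the safer choice when you know how many points you want.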


Numpy is great for linear algebra. You can also do vector and matrix addition and multiplication, as well as transposes, inverses, etc. very easily. Note that the command for concatenating arrays is different than the command for concatenating lists.

In [28]:
matrix_1 = np.array([[3, 2, 8, 1], [12, 4, 0, 6]])
matrix_2 = np.array([[1, 6, 3, 9], [2, 5, 1, 6]])
print("matrix_1")
print(matrix_1)
print("matrix_2")
print(matrix_2)
print("Sum of these two matrices")
print(matrix_1+matrix_2)
print("Difference of these two matrices")
print(matrix_1-matrix_2)
print("Element-wise multiplication")
print(matrix_1*matrix_2)
print("Transpose of matrix 1")
print(matrix_1.transpose())
print("Dot product")
print(matrix_1.transpose().dot(matrix_2))

matrix_1
[[ 3  2  8  1]
 [12  4  0  6]]
matrix_2
[[1 6 3 9]
 [2 5 1 6]]
Sum of these two matrices
[[ 4  8 11 10]
 [14  9  1 12]]
Difference of these two matrices
[[ 2 -4  5 -8]
 [10 -1 -1  0]]
Element-wise multiplication
[[ 3 12 24  9]
 [24 20  0 36]]
Transpose of matrix 1
[[ 3 12]
 [ 2  4]
 [ 8  0]
 [ 1  6]]
Dot product
[[27 78 21 99]
 [10 32 10 42]
 [ 8 48 24 72]
 [13 36  9 45]]
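
The remark about concatenation deserves its own example, since `+` means different things for lists and arrays. A minimal sketch with made-up names, using `np.concatenate` and `np.vstack`:

```python
import numpy as np

list_1 = [1, 2]
list_2 = [3, 4]
joined_lists = list_1 + list_2           # lists: + concatenates
print(joined_lists)

row_1 = np.array([1, 2])
row_2 = np.array([3, 4])
summed = row_1 + row_2                   # arrays: + adds element by element
joined = np.concatenate((row_1, row_2))  # arrays: joining needs a function call
stacked = np.vstack((row_1, row_2))      # stack the arrays as rows of a 2D array

print(summed)
print(joined)
print(stacked)
```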


NumPy has many more matrix operations. Whenever you need one, search online for the right NumPy function. NumPy has a detailed user guide and documentation [here](https://docs.scipy.org/doc/numpy-1.13.0/index.html).

In [29]:
np.random.seed(42) # This sets the pseudorandom number generator to a fixed seed so this notebook will be more replicable.
random_matrix = np.random.rand(4, 4) # A 4x4 matrix of random numbers uniformly distributed over 0 to 1
print("random_matrix")
print(random_matrix)
print("Determinant of random_matrix")
print(np.linalg.det(random_matrix))
print("Inverse of random_matrix")
print(np.linalg.inv(random_matrix))
print("Trace of random_matrix")
print(np.trace(random_matrix))

random_matrix
[[ 0.37454012  0.95071431  0.73199394  0.59865848]
 [ 0.15601864  0.15599452  0.05808361  0.86617615]
 [ 0.60111501  0.70807258  0.02058449  0.96990985]
 [ 0.83244264  0.21233911  0.18182497  0.18340451]]
Determinant of random_matrix
-0.265699516555
Inverse of random_matrix
[[-0.32076901 -0.12766508  0.06141427  1.32518674]
 [ 0.35151041 -1.88500014  1.65560045 -1.0003883 ]
 [ 1.14080312  1.3467702  -2.0407373   0.70794071]
 [-0.08202687  1.42666425 -0.17238177 -0.10600441]]
Trace of random_matrix
0.734523643333


***

#### **<span style="color:blue">Exercise Set 4</span>**
<font size="1">[Return to Table of Contents](#Table-of-Contents)</font>

**Finding the functions you need in NumPy documentation or from google-fu**
A big part of using any programming language efficiently is being able to find the functions you need and how they work. So for this exercise, you'll mostly be looking up the appropriate functions from google or from the [documentation](https://docs.scipy.org/doc/numpy-1.13.0/index.html).

Problem 1: Create a 100x5x5 array of Gaussian random numbers with a mean of 5 and a standard deviation of 1.5. Think of this as 100 measurements from a 5x5 grid in space. Find the mean and standard deviation at each point on the grid (these should be 5x5 grids).

In [30]:
# your code here
gaussian_array = np.random.normal(5, 1.5, (100, 5, 5))
print(gaussian_array.mean(axis=0))
print(gaussian_array.std(axis=0))

[[ 4.99559726  4.9451385   5.11595957  5.02964755  5.14883433]
 [ 5.02170852  4.85458451  5.0325048   5.28816063  4.84422793]
 [ 4.94351385  5.16772104  5.03828609  4.99953728  4.99455275]
 [ 5.06109731  5.09684161  5.1458458   4.96412878  4.9348862 ]
 [ 5.15689286  4.98355319  5.29667706  5.07902869  5.09246461]]
[[ 1.53890203  1.4753707   1.49307679  1.35015338  1.60510101]
 [ 1.33779126  1.34214927  1.45851996  1.33283605  1.61625641]
 [ 1.38249491  1.32108042  1.55640239  1.57801591  1.6093316 ]
 [ 1.34081312  1.45363323  1.53613089  1.46397731  1.53142359]
 [ 1.50168925  1.52885708  1.44838856  1.5053435   1.44750539]]


Problem 2: Find the difference between the sum of the squares of the first one hundred natural numbers (starting from 1) and the square of the sum (https://projecteuler.net/problem=6).

In [31]:
# your code here
first_100 = np.arange(1, 101) # the natural numbers 1 through 100
print(np.sum(first_100**2)-np.sum(first_100)**2)

-25164150


***

## Reading and writing text files in Python
<font size="1">[Return to Table of Contents](#Table-of-Contents)</font>

When you write a script or program to do some scientific work, it's generally going to have some results. You'll often want to save these results so they can later maybe be the inputs to another analysis. One possibility is you can save the results inside a lab notebook but whether your notebook is physical or virtual, it's a pain to get data from it into other programs.

So you're better off saving your data and results in a computer friendly format in the first place i.e. in a file.

The most basic and still useful type of file is a text file. An important warning I will give is that **a Microsoft Word document is not a text file, so don't use it to store scientific data.** It has all sorts of formatting information, can support images, and can save document history and comments. All of these features make extracting data from it programmatically unreliable and painful. A text file is plain text. It often has the extension ".txt" or ".csv". There are no fonts or font sizes or styles or formatting. Python supports reading and writing text files.

There are some text files in the folder with this notebook that will be used for an example.

Opening a text file is simple

In [32]:
open_file = open('first_file.txt','r') # 'r' opens the file in read-only mode. This prevents changes to the file.
print(open_file.read())
open_file.close()

This is the first file for this lesson!
It's just plain text.
Python can read this file all at once or line-by-line.
You've printed the whole file with print(open_file.read()).
To print one line from the file write print(open_file.read_line()).
What happens if you open this file again and call open_file.readline() three times?
Try it!


In [33]:
open_file = open('first_file.txt','r')
print(open_file.readline())
print(open_file.readline())
print(open_file.readline())
open_file.close()

This is the first file for this lesson!

It's just plain text.

Python can read this file all at once or line-by-line.



It's important that when you are done working with a file in Python you call its <code>close</code> method (function). Until a file is closed, text you have written may not have been saved to disk yet, and a file left open can later be written to by accident. That could corrupt it or, worse, *change the file without causing an error*. At least if a file is corrupted you know something went wrong; if your data is changed but still looks OK, you could end up working with false data without realizing it!

Writing to a text file is pretty simple too. While a file is open, each call to <code>write</code> adds its text after whatever was written before it. Be aware that opening an existing file in 'w' mode erases its contents first; to add to the end of an existing file instead, open it in 'a' (append) mode.

In [34]:
open_file = open('testfile.txt','w') # 'w' is for write mode. If another file with the same name exists, it will be erased first.
open_file.write('It is time to make a new file.\n') # \n means to go to a newline. There isn't literally a \n in the text file
open_file.write('Why? Because it is not forbidden.\n')
open_file.write('We are very daring like that.\n')
open_file.close()
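
As an aside, Python can call <code>close</code> for you if you open the file in a `with` block. The sketch below writes to a throwaway file in a temporary folder (the path and file name are made up so it doesn't touch the workshop files) and also shows the difference between 'w' and 'a' modes.

```python
import os
import tempfile

# Hypothetical throwaway file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), 'with_demo.txt')

with open(path, 'w') as f:   # 'w' starts from an empty file
    f.write('first line\n')
# the file is closed here, even if an error occurred inside the block

with open(path, 'a') as f:   # 'a' appends to the existing contents
    f.write('second line\n')

with open(path, 'r') as f:
    contents = f.read()

print(contents)
print('Is the file closed?', f.closed)
```

The explicit `open` and `close` calls used in this notebook do the same job; `with` just makes it harder to forget the `close`.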

You should open testfile.txt in Notepad or a similar editor to make sure it looks like you expect. Text files are a nice way to save scientific data because they are easily read by humans on any computer and their format is unlikely to become unreadable for a few more decades at least. For example, let's take our earlier dictionary of the acceleration due to gravity on the planets of the solar system and save that data to a new file. We'll make sure to close the file after we're done writing the data. After running this code, go look at the file in your text editor and make sure it looks correct.

In [35]:
g_forces_by_planet = open('g-forces by planet.txt', 'w') # make a file to save our data to
for key, value in planet_gravity.items(): # iterate over the keys (planets) and values (g-forces) of our dictionary
    g_forces_by_planet.write(key + ',' + str(value) + '\n') # separate the planet name from the g-force value by a comma
                                                          # and make sure to turn the value into a string (text)
g_forces_by_planet.close() #always close the file when done

Reading data back in from this file into a dictionary identical to the one we used to write it is pretty simple. We're going to take advantage of a few nice python features here. We can loop over the lines of a textfile automatically, and any strings can be split into a list of strings with the <code>split</code> method.

In [36]:
g_forces_by_planet = open('g-forces by planet.txt', 'r') # first open our text file in read-only mode
planet_gravity2 = {} # we're going to make a new dictionary identical to the old one we computed
for line in g_forces_by_planet: # loop over the lines of the file
    [key,value]=line.split(',') # each line has the key and the value separated by a comma, so assign the two pieces to key and value
    planet_gravity2[key]=float(value) # convert the g-forces back into a floating point number and assign to the key
g_forces_by_planet.close() # always close the file when done

Let's inspect these two dictionaries to make sure they're the same like they should be.

In [37]:
print('planet_gravity is', planet_gravity)
print('planet_gravity2 is', planet_gravity2)

planet_gravity is {'Mercury': 0.3771849872735875, 'Neptune': 1.151408550882049, 'Jupiter': 2.640831172111892, 'Earth': 1.0, 'Venus': 0.9032372366122815, 'Saturn': 1.13957931328376, 'Uranus': 0.9212309083570219, 'Mars': 0.38865961897224255}
planet_gravity2 is {'Mercury': 0.3771849872735875, 'Neptune': 1.151408550882049, 'Jupiter': 2.640831172111892, 'Earth': 1.0, 'Venus': 0.9032372366122815, 'Uranus': 0.9212309083570219, 'Saturn': 1.13957931328376, 'Mars': 0.38865961897224255}


A more rigorous check is to make sure they share the same key-value pairs

In [38]:
check = True
for key in planet_gravity:
    check = check and (planet_gravity[key]==planet_gravity2[key]) #if check or the part after and is false, check is false
print(check)

True
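
Two shortcuts are worth knowing here. Dictionaries can be compared directly with `==`, which checks the keys and values in both directions, and the standard library's `csv` module handles the comma splitting for you. A minimal round-trip sketch, assuming a made-up dictionary and a throwaway file name:

```python
import csv
import os
import tempfile

# Hypothetical values; any dictionary of name -> number works the same way.
original = {'Earth': 1.0, 'Mars': 0.38865961897224255}

path = os.path.join(tempfile.mkdtemp(), 'csv_demo.csv')

# The csv module's documentation recommends newline='' when opening files.
with open(path, 'w', newline='') as f:
    writer = csv.writer(f)
    for name, value in original.items():
        writer.writerow([name, value])

restored = {}
with open(path, 'r', newline='') as f:
    for name, value in csv.reader(f):
        restored[name] = float(value)  # csv gives back strings, so convert

# == on dictionaries is True only if both have the same keys and equal values.
same = (restored == original)
print(same)
```

Unlike the loop above, `==` would also notice if the second dictionary had extra keys that the first one lacks.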


***

#### **<span style="color:blue">Exercise Set 5</span>**
<font size="1">[Return to Table of Contents](#Table-of-Contents)</font>

Alright, this is the final exercise! For this exercise, compute the density of each planet in units of the density of earth and save these results to a file with the same layout as the "g-forces by planet.txt" file.

In [39]:
# your code here
planet_densities = {}
for key in planet_masses:
    planet_densities[key] = planet_masses[key] / planet_radii[key]**3
densities_by_planet = open('densities by planet.txt','w')
for key, value in planet_densities.items():
    densities_by_planet.write(key + ',' + str(value) + '\n')
densities_by_planet.close()