## <p style="text-align: center">COMP10001 Foundations of Computing<br>Semester 2, 2022<br>Tutorial Questions: Week 9 </p>
### <p style="text-align: center">Tutor: [Jiyu Chen](https://jiyuc.live)</p>

### 1. What is a “list comprehension”? How do we write one and how do they make our code simpler?

A **list comprehension** is a shortcut notation for simple iteration tasks that build a collection (usually a list, but possibly a set or dictionary) in one line of code. A list comprehension is formed by wrapping a pair of square brackets around `<expression> <for iteration statement> <optional if filter condition>`. The iteration statement is run and, for each iteration, the result of the expression is added to the collection. If a filter condition is included, the result is added only if that condition evaluates to `True`.

List comprehensions are useful to shorten repetitive, simple loops into readable single lines of code. They are especially useful for **initialising lists**. Avoid cramming too much into list comprehensions or nesting them inside each other: we want them to remain readable.

#### For example, using a traditional for-loop or a list comprehension to create a list of odd numbers

In [17]:
nums = []
for i in range(10):  # for iteration statement
    if i % 2 == 1:  # if filter condition
        nums.append(i)  # expression
print(nums)

# list comprehension: [<expression> <for iteration statement> <optional if filter condition>]
nums = [i for i in range(10) if i % 2 == 1]
print(nums)

[1, 3, 5, 7, 9]
[1, 3, 5, 7, 9]


#### Note: In the exam, you might be asked to write a **SINGLE**-line expression that uses a list comprehension to produce a given output.

### 2. What happens if we use curly brackets instead of square brackets around a “list” comprehension? What happens if we use parentheses?
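
Curly brackets around the same syntax produce a set comprehension (or a dictionary comprehension when the expression is a `key: value` pair). Parentheses produce a generator expression, which yields its items lazily instead of building a collection. A minimal sketch follows; the variable names and sample words are made up for illustration:

```python
words = ["apple", "bob", "cat", "apple"]  # illustrative data, not from the tutorial

# Square brackets: a list, keeping order and duplicates
lengths_list = [len(w) for w in words]

# Curly brackets: a set, so duplicate values collapse
lengths_set = {len(w) for w in words}

# Curly brackets with key: value pairs: a dictionary
lengths_dict = {w: len(w) for w in words}

# Parentheses: a generator expression, evaluated lazily
lengths_gen = (len(w) for w in words)

print(lengths_list)                # [5, 3, 3, 5]
print(lengths_set == {3, 5})       # True
print(lengths_dict)                # {'apple': 5, 'bob': 3, 'cat': 3}
print(type(lengths_gen).__name__)  # generator
print(list(lengths_gen))           # [5, 3, 3, 5]
print(list(lengths_gen))           # [] because a generator can be consumed only once
```

Printing a generator expression directly shows only a `<generator object ...>` description, so convert it with `list()` (or loop over it) to see the values.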

### Exercise 1. Evaluate the following list comprehensions. For each one, also write some Python code to generate the same list without using a comprehension.

```
(a) [(name, 0) for name in ("evelyn", "alex", "sam")]

(b) [i**2 for i in range(5) if i % 2 == 1]

(c) "".join([letter.upper() for letter in "python"])
```

In [2]:
#(a)
temp = []
for name in ("evelyn", "alex", "sam"):
    temp.append((name, 0))
print('(a)', temp)

#(b)
temp = []
for i in range(5):
    if i % 2 == 1:
        temp.append(i**2)
print('(b)', temp)

#(c)
temp = []
for letter in "python":
    temp.append(letter.upper())
print('(c)', "".join(temp))

(a) [('evelyn', 0), ('alex', 0), ('sam', 0)]
(b) [1, 9]
(c) PYTHON


### 3. Why do we use “files”? Could we use computers without them?

Files allow us to store data on a computer **permanently**. Files **persist** on storage media after a program terminates, as opposed to in-program data structures such as lists and dictionaries, which exist in the computer’s **temporary memory** and are erased when the program finishes. (Suppose you are writing a Word document and forget to hit the `save` button. If your computer suddenly dies, your writing will be gone after you reboot.)

Files are also useful for storing large amounts of data in a structured way and sharing it with others. It would not be possible to use code for solving many big data problems without them. In fact, it would not be possible to write code without them!

**Why do computers need memory? Why not let a program read and write directly on the hard disk?**
<img src="memory_hier.png" width="600" height="300">



### 4. What are the steps to reading and writing files?

In Python, we open a file with the `open()` function, which takes two arguments: the file’s `filename` as a string, and another string that represents the access `mode` (optional, defaulting to `'r'`). Some common modes are `'r'` for reading; `'w'` for writing (erasing all file contents if the file already exists); and `'a'` for appending to the end of a file.

`open()` returns a “file object” which is used to access the file.

Some common ways to **read** a file are the `file.read()` method, which reads the whole file and returns a string; `file.readline()`, which reads one line of the file and returns a string; and `file.readlines()`, which reads the entire file and returns a list with each line of the file as a separate element.

To **write**, use the `file.write()` method to write a string to the file.
When finished with a file, be sure to close it with the `file.close()` method, so that any buffered output is actually written to disk and the file is released.
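
The three modes can be compared with a short sketch. It assumes a throwaway file named `demo.txt` (a hypothetical name) inside a temporary directory, so that no real file is touched, and it uses `with` blocks, which close each file automatically:

```python
import os
import tempfile

# Work in a temporary directory so no real files are touched
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "demo.txt")  # hypothetical filename

    # 'w' creates the file, or erases it if it already exists
    with open(path, "w") as f:
        f.write("first line\n")

    # 'a' appends to the end of the existing contents
    with open(path, "a") as f:
        f.write("second line\n")

    # 'r' reads; readlines() keeps the trailing newline on each line
    with open(path, "r") as f:
        lines = f.readlines()

    # Opening with 'w' again wipes the previous contents
    with open(path, "w") as f:
        f.write("overwritten\n")
    with open(path, "r") as f:
        final_text = f.read()

print(lines)       # ['first line\n', 'second line\n']
print(final_text)  # overwritten
```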

In [25]:
file_object = open('./in.txt','r')

In [22]:
print("\n>>Reading file using read():\n")
print(file_object.read())
file_object.close()  # close the file object!


>>Reading file using read():

To study the representational power of a GNN, we analyze when a GNN maps two nodes to the same location in the embedding space.
Intuitively, a maximally powerful GNN maps two nodes to the same location only if they have identical subtree structures with identical features on the corresponding nodes.
Since subtree structures are defined recursively via node neighborhoods (Figure 1), we can reduce our analysis to the question whether a GNN maps two neighborhoods (i.e., two multisets) to the same embedding or representation.
A maximally powerful GNN would never map two different neighborhoods, i.e., multisets of feature vectors, to the same representation.
This means its aggregation scheme must be injective.
Thus, we abstract a GNN’s aggregation scheme as a class of functions over multisets that their neural networks can represent, and analyze whether they are able to represent injective multiset functions.


In [24]:
file_object = open('./in.txt','r')  # reopen the file, since it was closed above
print("\n>>Reading file using readline():\n")
print(file_object.readline())
file_object.close()  # close the file object!


>>Reading file using readline():

To study the representational power of a GNN, we analyze when a GNN maps two nodes to the same location in the embedding space.



In [26]:
file_object = open('./in.txt','r')  # keep a reference so that the file can be closed
print("\n>>Reading file using readlines():\n")
print(file_object.readlines())
file_object.close()  # close the file object!


>>Reading file using readlines():

['To study the representational power of a GNN, we analyze when a GNN maps two nodes to the same location in the embedding space.\n', 'Intuitively, a maximally powerful GNN maps two nodes to the same location only if they have identical subtree structures with identical features on the corresponding nodes.\n', 'Since subtree structures are defined recursively via node neighborhoods (Figure 1), we can reduce our analysis to the question whether a GNN maps two neighborhoods (i.e., two multisets) to the same embedding or representation.\n', 'A maximally powerful GNN would never map two different neighborhoods, i.e., multisets of feature vectors, to the same representation.\n', 'This means its aggregation scheme must be injective.\n', 'Thus, we abstract a GNN’s aggregation scheme as a class of functions over multisets that their neural networks can represent, and analyze whether they are able to represent injective multiset functions.']


In [27]:
with open('./in.txt','r') as f:  # the with block closes the file automatically
    lines = [line.strip('\n') for line in f.readlines()]
lines

['To study the representational power of a GNN, we analyze when a GNN maps two nodes to the same location in the embedding space.',
 'Intuitively, a maximally powerful GNN maps two nodes to the same location only if they have identical subtree structures with identical features on the corresponding nodes.',
 'Since subtree structures are defined recursively via node neighborhoods (Figure 1), we can reduce our analysis to the question whether a GNN maps two neighborhoods (i.e., two multisets) to the same embedding or representation.',
 'A maximally powerful GNN would never map two different neighborhoods, i.e., multisets of feature vectors, to the same representation.',
 'This means its aggregation scheme must be injective.',
 'Thus, we abstract a GNN’s aggregation scheme as a class of functions over multisets that their neural networks can represent, and analyze whether they are able to represent injective multiset functions.']

In [11]:
print("\n>>Reading file using for-loop:\n")
with open('./in.txt','r') as f:  # like f = open('./in.txt','r'), but closes f automatically
    for line in f:  # iterates line by line, like looping over f.readlines()
        print(line)  # each line keeps its '\n', so the output is double-spaced


>>Reading file using for-loop:

To study the representational power of a GNN, we analyze when a GNN maps two nodes to the same location in the embedding space.

Intuitively, a maximally powerful GNN maps two nodes to the same location only if they have identical subtree structures with identical features on the corresponding nodes.

Since subtree structures are defined recursively via node neighborhoods (Figure 1), we can reduce our analysis to the question whether a GNN maps two neighborhoods (i.e., two multisets) to the same embedding or representation.

A maximally powerful GNN would never map two different neighborhoods, i.e., multisets of feature vectors, to the same representation.

This means its aggregation scheme must be injective.

Thus, we abstract a GNN’s aggregation scheme as a class of functions over multisets that their neural networks can represent, and analyze whether they are able to represent injective multiset functions.


### 5. What is a “csv” file and why is it useful for storing and manipulating data?

A csv (comma-separated values) file is a text file stored in a particular format, similar to that of a spreadsheet: rows of data with individual values separated by a comma (`,`) and rows separated by a newline character (`\n`).

csv data is useful for storing statistics or measurement data because the structure of the file allows Python to read and process the data in its spreadsheet-like format. Python’s `csv` library is very useful for processing csv files.
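
To make the format concrete, the sketch below parses the same text three ways: with plain string splitting, with `csv.reader`, and with `csv.DictReader`. The data is made up for illustration and is held in a string (via `io.StringIO`) so that no file is needed:

```python
import csv
import io

# A small in-memory "file"; the contents are made up for illustration
raw_text = "name,score\nalex,7\nsam,9\n"

# Plain string handling: split into rows on newlines, then cells on commas
manual_rows = [line.split(",") for line in raw_text.strip().split("\n")]

# csv.reader produces the same rows and also copes with quoted commas
reader_rows = list(csv.reader(io.StringIO(raw_text)))

# csv.DictReader uses the header row as dictionary keys
dict_rows = list(csv.DictReader(io.StringIO(raw_text)))

print(manual_rows)                 # [['name', 'score'], ['alex', '7'], ['sam', '9']]
print(reader_rows == manual_rows)  # True
print(dict_rows[1]["score"])       # 9
```

Note that every cell is read as a string, so numeric values need converting with `int()` or `float()` before doing arithmetic.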

In [28]:
import csv

COLUMN_NAME = 'Train'

with open("travel.csv") as fp:  # the with block closes the file automatically
    for row in csv.DictReader(fp):  # iterate through the rows
        #print(row)
        print(row[COLUMN_NAME])  # access a cell in a row by specifying the column name



242969
368572
13715
62069
56417


### Exercise 2. Fill in the blanks `...` in the program below which reads from `in.txt` and writes to `out.txt`.

In [30]:

outfile = open("out.txt", "w")  # 1
with open("in.txt", "r") as infile:  # 2
    line_no = 1
    for line in infile:  # 3
        outfile.write(f"line: {line_no}, length: {len(line)}\n")  # 4
        line_no += 1
outfile.write("The End")
outfile.close()  # 5
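
An alternative sketch of the same program opens both files in a single `with` statement, so neither needs an explicit `close()`, and uses `enumerate` to count the lines. The temporary directory and the two-line sample text are assumptions made so that the example runs on its own:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    in_path = os.path.join(tmp_dir, "in.txt")
    out_path = os.path.join(tmp_dir, "out.txt")

    # Create a small stand-in for in.txt (the sample text is made up)
    with open(in_path, "w") as f:
        f.write("hello\nworld!\n")

    # Both files are closed automatically when the with block ends
    with open(in_path, "r") as infile, open(out_path, "w") as outfile:
        for line_no, line in enumerate(infile, start=1):
            # len(line) counts the trailing newline character as well
            outfile.write(f"line: {line_no}, length: {len(line)}\n")
        outfile.write("The End")

    with open(out_path, "r") as f:
        result = f.read()

print(result)
```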


### Exercise 3. `travel.csv` is a csv file containing data on how people get to work in different cities in Australia. `process.py` is a Python program which processes this data. What information does `process.py` attempt to find and print?


```
"travel.csv"

City,Train,Tram,Bus,Ferry,Car,Total
Melbourne,242969,55169,31937,783,1282997,1613855
Sydney,368572,3210,138340,9007,1206350,1725482
Adelaide,13715,4137,33673,211,390360,442102
Brisbane,62069,229,58228,3761,663353,787650
Perth,56417,223,37899,373,594571,689489

```

Data Source: Census of Population and Housing, 2016, TableBuilder

In [32]:
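''' process.py '''
import csv

city = ''
curr_max = 0.0
with open("travel.csv") as fp:  # the with block closes the file automatically
    for row in csv.DictReader(fp):
        tram = int(row["Tram"])  # commuters travelling by tram
        total = int(row["Total"])  # all commuters in this city
        # keep the city with the highest proportion of tram commuters so far
        if tram / total > curr_max:
            city = row["City"]
            curr_max = tram / total
print(city)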

Melbourne
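
The same calculation can be wrapped in a function that works for any column. The sketch below is not part of `process.py`: the function name `city_with_highest_share` is hypothetical, and the data is the `travel.csv` table above held in a string so that the example runs without the file.

```python
import csv
import io

# Contents of travel.csv, copied from the table above
TRAVEL_CSV = """City,Train,Tram,Bus,Ferry,Car,Total
Melbourne,242969,55169,31937,783,1282997,1613855
Sydney,368572,3210,138340,9007,1206350,1725482
Adelaide,13715,4137,33673,211,390360,442102
Brisbane,62069,229,58228,3761,663353,787650
Perth,56417,223,37899,373,594571,689489
"""


def city_with_highest_share(csv_text, column):
    """Return the city where `column` makes up the largest fraction of Total."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    # max() with a key function replaces the manual curr_max bookkeeping
    best = max(rows, key=lambda row: int(row[column]) / int(row["Total"]))
    return best["City"]


for mode in ("Train", "Tram", "Bus", "Ferry", "Car"):
    print(mode, city_with_highest_share(TRAVEL_CSV, mode))
```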
