<div style="text-align: right">
    <i>
        LIN 537: Computational Linguistics 1 <br>
        Fall 2019 <br>
        Alëna Aksënova
    </i>
</div>

# Notebook 3: lists and for loops

This notebook introduces a new data type, `list`, and explains what methods are defined for lists (`append`, `extend`, `insert`, and others) and what issues appear when the lists are copied. Then it shows the way to access sub-elements of larger elements individually via a `for-loop`.

## Lists

The new data type, `list`, is a collection of items.
These items can be of very different types: integers, strings, booleans, floats, and other lists as well.

In [0]:
a = [1, 2, 3]
print(a)

In [0]:
type(a)

A list below contains another list as its sub-list:

In [0]:
b = [1, "a", ["cab", True]]

The items in a list are _ordered_, just like the characters in a string. Recall that strings are ordered: a simple proof is that "devil" and "lived" are different words. Lists behave the same way: `[1, 2, 3]` is not the same as `[3, 1, 2]`.

In [0]:
[1, 2, 3] == [3, 1, 2]

They are ordered, and therefore we can use indexing with lists:

In [0]:
sample_list = [1, "linguistics", ["physics", 15], 3.14, False]
print("Element at the index 1 is", sample_list[1])
print("Element at the index 2 is", sample_list[2])
print("Element at the index 4 is", sample_list[4])

In [0]:
len(sample_list)

Accessing an element of the sublist is possible by indicating its address in several layers of indexing:

In [0]:
print(sample_list[2][0])
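
The layered indexing above can be unpacked one step at a time. The sketch below is only an illustration: the names `inner` and `first` are introduced here and are not part of the original notebook. It also shows that a third layer of indexing reaches inside the string itself.

```python
sample_list = [1, "linguistics", ["physics", 15], 3.14, False]

inner = sample_list[2]   # the sub-list ["physics", 15]
first = inner[0]         # the string "physics"
print(first)

# The same address written in one go, plus one more layer
# that picks the first character of the string.
print(sample_list[2][0][0])
```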

An empty list can be created in the following way:

In [0]:
empty_list = []
print("This is an empty list:", empty_list)
print("Its length is", len(empty_list))

### Modifying lists

One can add new elements to lists using the methods `append`, `extend`, and `insert`.

#### `append`

This method appends a new item to an already existing list. It uses the following syntax: `list_to_append_to.append(what_to_append)`.

In [0]:
list_1 = [1, 2, 3]
list_1.append("new item")
print(list_1)

However, this is a way to add one item, and it cannot be used directly if we want to add all items from one list to another.

In [0]:
one_list = [1, 2, 3]
another_list = [True, "linguistics"]
one_list.append(another_list)
print(one_list)

#### `extend`

The `extend` method adds the elements of the second list in a _flat_ way:

In [0]:
one_list = [1, 2, 3]
another_list = [True, "linguistics"]
one_list.extend(another_list)
print(one_list)
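
To see the two methods side by side, here is a small sketch that starts from two identical lists and compares the results. The names `nested`, `flat`, and `extra` are illustrative and do not come from the cells above.

```python
nested = [1, 2, 3]
flat = [1, 2, 3]
extra = [True, "linguistics"]

nested.append(extra)   # adds the whole list as a single item
flat.extend(extra)     # adds every item of the list separately

print(nested, "has length", len(nested))
print(flat, "has length", len(flat))
```

After `append` the list grows by exactly one item (the sub-list), while after `extend` it grows by the number of items in `extra`.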

#### `insert`

If an element needs to be inserted at a specific position, `insert` can be used with the index of that position:

In [0]:
states = ["California", "New York", "Arizona"]
states.insert(1, "Colorado")
print(states)
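
The inserted item takes the given index, and everything that was at that index or later shifts one position to the right. A minimal sketch of the two boundary cases, using extra state names chosen only for illustration:

```python
states = ["California", "New York", "Arizona"]

states.insert(0, "Colorado")          # index 0: becomes the new first item
print(states)

states.insert(len(states), "Texas")   # index equal to the length: same effect as append
print(states)
```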

#### `remove` and `del`

The method `remove` removes a specific item from the list.

In [0]:
states = ["California", "New York", "Arizona"]
states.remove("Arizona")
print(states)

However, notice that `remove` only removes the first instance of the item:

In [0]:
states = ["California", "New York", "Arizona", "New York"]
states.remove("New York")
print(states)

If an item needs to be removed by position, one should use the `del` statement. Notice its unusual syntax!

In [0]:
print(states)
del states[1]
print(states)
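
The difference between the two can be summarized as "remove by value" versus "delete by position". The sketch below uses two illustrative lists, `by_value` and `by_index`, that are not part of the original notebook. It also guards a `remove` call with `in`, because `remove` raises a `ValueError` when the item is not in the list.

```python
by_value = ["a", "b", "c", "b"]
by_index = ["a", "b", "c", "b"]

by_value.remove("b")   # removes the first "b" it finds
del by_index[3]        # deletes whatever sits at index 3

print(by_value)
print(by_index)

# "z" is not in the list, so the check prevents an error.
if "z" in by_value:
    by_value.remove("z")
```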

Rewriting an element of a list by some other element can be done directly by accessing that element by index and changing it.

In [0]:
cities = ["NYC", "LA"]
cities[0] = "SF"
print(cities)

**Practice.** You are given the following list of letters.

In [0]:
letters = ["d", "b", "c", "n"]

Insert "x" at the position 3 in the list `letters`. Then remove "c" from it. Append "e". Delete the element at the index 2, and, finally, rewrite the letter at the position 1 as 'o'. Print `letters`.

## Copying lists

One needs to be careful when copying lists because it is a bit tricky. Consider the following code and its behavior.

In [0]:
states_1 = ["California", "New York", "Arizona"]
states_2 = states_1
del states_2[2]
print("states_1:", states_1)
print("states_2:", states_2)

All variables, lists, strings, and so on are stored in memory. When you run a cell that initializes a variable, it reserves a spot in memory for that variable, or, in other words, it _allocates_ memory for that variable. **Memory allocation** is the process of reserving space in memory for some object.

<img src="images/3_1.png" width="300">

If a list contains another list (i.e. if the list is _nested_), then the "main" and the "internal" lists are contained in different memory positions. The "main" list contains an indicator where to look for the "internal" list, and this indicator is called a **pointer**.

<img src="images/3_2.png" width="350">

Examples of pointers in real life:
  * address written on an envelope (address is a string that points to your apartment);
  * your debit card (it doesn't have money, it points to an account with your money);
  * URL (it doesn't have any info _inside_ it, it points to a source of that information), etc.
  

In [0]:
states_1 = ["California", "New York", "Arizona"]
states_2 = states_1

### "Straightforward" copy

When we copy a list by just assigning its value to another list, i.e. as `list_2 = list_1`, what happens in memory is the following:

<img src="images/3_3.png" width="700">

The new list now is _pointing to the same slot in memory_ as the old one, i.e. they are linked.

**Analogy:** Mary and Jack have a shared bank account. Even though they have separate debit cards, they still point to the same bank account!

In [0]:
states_1 = ["California", "New York", "Arizona"]
states_2 = states_1
del states_2[2]
print("states_1:", states_1)
print("states_2:", states_2)

Even though we removed "Arizona" from the copy of the list, the original list was modified! It happens because if you copy a list in this direct way, the copy and the original list share the same _reference_, or, in other words, they occupy the same location in the memory.
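
One way to check this directly is the `is` operator, which tests whether two names refer to the very same object in memory. It has not been introduced in this notebook so far, so treat the sketch below as an optional illustration.

```python
states_1 = ["California", "New York", "Arizona"]
states_2 = states_1

# Both names refer to one and the same list object.
print(states_2 is states_1)

# A change made through one name is visible through the other.
states_2.append("Colorado")
print("states_1:", states_1)
```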

### Shallow copy

One way to copy the list and avoid that problem is to take a full slice of that list. This creates a **shallow copy**.

In [0]:
states_1 = ["California", "New York", "Arizona"]
states_2 = states_1[:]
del states_2[2]
print("states_1:", states_1)
print("states_2:", states_2)

The state of memory now looks different than before: these two lists are not linked anymore.

<img src="images/3_4.png" width="700">

However, this works only for _flat_ lists. Let's try to make a shallow copy of a nested list.

In [0]:
states_1 = ["CA", ["NY", "NV"]]
states_2 = states_1[:]
del states_2[1][0]
print("states_1:", states_1)
print("states_2:", states_2)

Remember that a "nested" list has a different location in memory than the "main" list:

<img src="images/3_2.png" width="350">

When we create a shallow copy of a nested list, we reserve a separate memory location for the "main" list, but the pointers to the "nested" lists still point to the same locations as in the original list!

<img src="images/3_5.png" width="700">

**Analogy:** Mary and Jack have different bank accounts, but they are still sharing a sub-account.
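
The sharing can be verified with a short sketch. It assumes the `is` operator, which checks whether two names refer to the same object in memory, and the extra state names are chosen only for illustration.

```python
states_1 = ["CA", ["NY", "NV"]]
states_2 = states_1[:]

print(states_2 is states_1)         # the "main" lists are different objects
print(states_2[1] is states_1[1])   # the "nested" list is still shared

states_2[0] = "CO"          # changes only the copy
states_2[1].append("NJ")    # changes the shared sub-list, so both lists are affected

print("states_1:", states_1)
print("states_2:", states_2)
```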

### Deep copy

A **deep copy** recursively copies all elements and sub-elements of the original list, instead of copying only the references to them. However, to access this function (`deepcopy`), we need to import it from `copy`, a module for copying different data structures.

In [0]:
from copy import deepcopy

After `from` comes the name of the module, and after `import` we name the function that is being imported.

In [0]:
states_1 = ["CA", "CO", ["NY", "NV"], "RI"]
states_2 = deepcopy(states_1)
del states_2[2][0]
del states_2[1]
print("states_1:", states_1)
print("states_2:", states_2)

In this case, we are copying the architecture of the list we had originally, and allocating different memory locations for every single sub-list of the original list.

<img src="images/3_6.png" width="750">

**Analogy:** Mary and Jack have completely separate bank accounts and have nothing to do with each other.
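
As a quick check, the sketch below confirms that a deep copy starts out equal to the original but shares nothing with it. The `is` operator (a test for "same object in memory") and the added state name are assumptions made for this illustration.

```python
from copy import deepcopy

states_1 = ["CA", ["NY", "NV"]]
states_2 = deepcopy(states_1)

print(states_2 == states_1)         # same content
print(states_2 is states_1)         # different "main" lists
print(states_2[1] is states_1[1])   # different "nested" lists as well

states_2[1].append("NJ")    # changes only the copy
print("states_1:", states_1)
print("states_2:", states_2)
```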

**Question:** why does the following slightly modified code produce an error message?

In [0]:
states_1 = ["CA", "CO", ["NY", "NV"], "RI"]
states_2 = deepcopy(states_1)
del states_2[1]
del states_2[2][0]
print("states_1:", states_1)
print("states_2:", states_2)

## For-loops

`for` loops allow us to _iterate_ over elements of containers such as lists or strings, and to access items of those containers individually, in order. The syntax is as follows:

    for item in container:
        # the variable "item" now refers to the next item of the container

In [0]:
for char in "linguistics":
    print("The current symbol is", char)

We can combine `if`-`elif`-`else` statements and `for` loops.

In [0]:
vowels = ["a", "e", "i", "o", "u"]
for char in "linguistics":
    if char in vowels:
        print("I found a vowel! It is", char)
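
Instead of printing each vowel, the loop can also collect them into a list with `append`, which ties together the two topics of this notebook. This is only a sketch, and the list name `found` is introduced here for illustration.

```python
vowels = ["a", "e", "i", "o", "u"]
found = []   # start with an empty list

for char in "linguistics":
    if char in vowels:
        found.append(char)   # save the vowel instead of printing it

print("Vowels found:", found)
print("Number of vowels:", len(found))
```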

Of course, `for` loops can be contained within `for` loops:

In [0]:
cities = ["NYC", "LA", "SF"]

for city in cities:
    print("The current city is", city)
    print("Its letters are:")
    
    for letter in city:
        print("\t", letter)

**Practice.** We are given lists of questions and possible answers.

In [0]:
questions = ["How are you?", "What are you doing?", "What's your name?"]
answers = ["Fine!", "Nothing much", "Jen"]

Ask the user each of these questions, and if the user's answer is present in our list of answers, print "I knew it!".

**Practice.** You are given the following two lists.

In [0]:
cities = ["NYC", "LA", "SF"]
small_cities = ["Stony Brook", "Port Jeff"]

Add cities from `small_cities` to `cities` using the `append` method and a `for` loop, so that it would yield the following list:

    ["NYC", "LA", "SF", "Stony Brook", "Port Jeff"]

# Homework 3

**Due on Thursday, September 26th, 11.59pm**

Send your notebook (don't forget to save your solutions!) to <alena.aksenova@stonybrook.edu> with the subject **\[CompLing1\] Homework 3**.

**Problem 1.** You are given two lists, `cities` and `small_cities`. Insert the elements from `small_cities` into `cities` so that the list `cities` contains the items in the following order:
    
    ["NYC", "LA", "Stony Brook", "Port Jeff", "SF"]
    
Use any method or way you want.

In [45]:
cities = ["NYC", "LA", "SF"]
small_cities = ["Stony Brook", "Port Jeff"]

# your code here
city1=small_cities[0]
city2=small_cities[1]
cities.insert(2,city1)
cities.insert(3,city2)
print(cities)

['NYC', 'LA', 'Stony Brook', 'Port Jeff', 'SF']


**Problem 2.** Using the given list `cities`, produce the following output:

    NYC NYC
    NYC LA
    NYC SF
    LA NYC
    LA LA
    LA SF
    SF NYC
    SF LA
    SF SF

In [48]:
cities = ["NYC", "LA", "SF"]
# your code here
for city in cities:
  # the inner loop also goes over cities, so every pair is printed
  for other_city in cities:
    print(city, other_city)

NYC NYC
NYC LA
NYC SF
LA NYC
LA LA
LA SF
SF NYC
SF LA
SF SF


**Problem 3.** You are given some words from the [Swadesh list](https://en.wikipedia.org/wiki/Swadesh_list).

In [51]:
words = ["sun", "moon", "earth", "water", "food", "sky"]
translations = []  # the list name required by the problem statement

for word in words:
  print("The word is", word)
  translation = input("Please translate the word:")
  translations.append(translation)
print(translations)

The word is sun
Please translate the word:太陽
The word is moon
Please translate the word:月亮
The word is earth
Please translate the word:地球
The word is water
Please translate the word:水
The word is food
Please translate the word:食物
The word is sky
Please translate the word:天空
['太陽', '月亮', '地球', '水', '食物', '天空']


Imagine that you are working with a native speaker of some language other than English. Create a new (empty) list for words of that language and call it `translations`. Then, for every word of the Swadesh list (`words`), ask the user to provide its translation, and save each translation into the `translations` list. After all the words have been translated, print `translations`.