# First Steps: Data Types, Lists, and Built-In Functions

Remember, completing the worksheet is great for your understanding but not necessary from a grading perspective. Whatever you are able to complete in the discussion section will usually be fine. That said, you are heartily encouraged to make additional time to complete any parts that you may not have gotten to during the scheduled Discussion. 

## Introduction

The purpose of this Discussion activity is to (gently) help you get up to speed writing Python code. Please pay special attention to Problem 1, as it illustrates some important theoretical points which can help you if you run into bugs later in the course. 

Don't rush through this worksheet and copy-paste everything I wrote into the code cells. Read each snippet and type it in yourself, because that will help you remember the syntax. Working through the problems productively (in a way that makes you learn something) is more important than finishing every one of them.

## § Problem 1: Variables and Objects in Python

When you learned C++, you learned that a given variable had to keep the same type for the entire program. That is, if you declared an `int` named `x` at the beginning of your program, `x` had to stay an `int` throughout. The story is quite different in Python. In this exercise, we will explore these differences.

*a)* In the code cell below, write the three lines:
```python
x = 2
x = [5]
x = "a"
```
**Print the value of `x` after each line.** Run the cell. Try running it multiple times. Observe the output. In a few words, what does this tell you about what happens, in terms of objects and variables? Write your answer in the space below the code cell. If you had known nothing about Python, would you have expected this code to work based on your knowledge of C++?

In [5]:
# code here
x = 2
print(x)
print(type(x))
x = [5]
print(x)
print(type(x))
x = "a"
print(x)
print(type(x))

2
<class 'int'>
[5]
<class 'list'>
a
<class 'str'>


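*\[I noticed that each assignment rebinds the name `x` to a new object, and the type travels with the object rather than with the name: `x` is first an `int`, then a `list`, then a `str`. Unlike in C++, there is no need to create a new variable for each type.\].*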
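
A minimal sketch of the same idea, assuming nothing beyond built-in Python: one name is rebound to three objects in turn, and the type is read off the object each time. The list `seen_types` is a hypothetical helper introduced only for this illustration.

```python
# Rebind a single name to objects of three different types.
seen_types = []  # hypothetical helper, not part of the worksheet
for value in (2, [5], "a"):
    x = value                            # x now refers to a different object
    seen_types.append(type(x).__name__)  # the type belongs to the object

print(seen_types)  # ['int', 'list', 'str']
```

The name `x` itself carries no type information, which is why the three assignments in part a) run without complaint.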

*b)* Now write the three lines:
```python
y = [1, 2, 3]
y[0] = 8
y.sort()
```
Print the value and `id()` of `y` after each line. What does this tell you about what happens, in terms of objects and variables?

In [7]:
# code here
y = [1, 2, 3]
print(id(y))
print(y)
y[0] = 8
print(id(y))
print(y)
y.sort()
print(id(y))
print(y)

4574059136
[1, 2, 3]
4574059136
[8, 2, 3]
4574059136
[2, 3, 8]


*\[What I noticed is that lists are mutable: item assignment and `sort()` change the list in place, so `y` keeps the same id even after being changed.\].*
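
One way to check this without reading long id numbers is to store the id once and compare. The sketch below also contrasts `list.sort()` with the built-in `sorted()`, which builds a new list. The names `original_id`, `same_object`, and `z` are made up for the example.

```python
y = [1, 2, 3]
original_id = id(y)

y[0] = 8      # item assignment mutates the existing list
y.sort()      # sort() reorders in place and returns None

same_object = id(y) == original_id
print(y, same_object)  # [2, 3, 8] True

z = sorted(y, reverse=True)  # sorted() returns a separate list
print(z is y)                # False
```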

*c)* Write the four lines:
```python
a = 0
b = a
a += 1
a += 2
```
After each line, print `a`, `b`, and their identities (once they've been defined). Is this the same as what you might have expected based on your previous programming experience in a language such as C++? 

In [11]:
# code here
a = 0
print("a: ", a)
print("The id of a is: ", id(a))
b = a
print("a: ", a)
print("b: ", b)
print("The id of a is: ", id(a))
print("The id of b is: ", id(b))
a += 1
print("a: ", a)
print("b: ", b)
print("The id of a is: ", id(a))
print("The id of b is: ", id(b))
a += 2
print("a: ", a)
print("b: ", b)
print("The id of a is: ", id(a))
print("The id of b is: ", id(b))

a:  0
The id of a is:  4338134088
a:  0
b:  0
The id of a is:  4338134088
The id of b is:  4338134088
a:  1
b:  0
The id of a is:  4338134120
The id of b is:  4338134088
a:  3
b:  0
The id of a is:  4338134184
The id of b is:  4338134088


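*\[The values behave as I would expect in C++: `b` stays 0 while `a` changes. The identities do not. After `b = a`, both names refer to the same object, and each `a += ...` binds `a` to a different integer object with a new id, while `b` still has the id that the original `a` had.\].*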
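
The output above can be condensed with the `is` operator, which compares identities directly. This is only a sketch of the same experiment. It relies on integers being immutable, so `+=` has to produce a new object and rebind the name.

```python
a = 0
b = a
print(a is b)   # True: two names, one object

a += 1          # ints are immutable, so a is rebound to a new int object
print(a is b)   # False
print(a, b)     # 1 0
```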

*d)* Write the four lines:
```python
c = [0, 1, 2]
d = c
c += [3]
c += [4, 5]
```
After each line, print `c`, `d`, and their identities (once they've been defined). Is this the same as what you might have expected based on your previous programming experience in a language such as C++? How does it differ from your answer to the previous part? (It might help to google the phrase "mutable variable".)

In [13]:
# code here
c = [0, 1, 2]
print("c: ", c)
print("The id of c is: ", id(c))
d = c
print("c: ", c)
print("d: ", d)
print("The id of c is: ", id(c))
print("The id of d is: ", id(d))
c += [3]
print("c: ", c)
print("d: ", d)
print("The id of c is: ", id(c))
print("The id of d is: ", id(d))
c += [4, 5]
print("c: ", c)
print("d: ", d)
print("The id of c is: ", id(c))
print("The id of d is: ", id(d))

c:  [0, 1, 2]
The id of c is:  4574065344
c:  [0, 1, 2]
d:  [0, 1, 2]
The id of c is:  4574065344
The id of d is:  4574065344
c:  [0, 1, 2, 3]
d:  [0, 1, 2, 3]
The id of c is:  4574065344
The id of d is:  4574065344
c:  [0, 1, 2, 3, 4, 5]
d:  [0, 1, 2, 3, 4, 5]
The id of c is:  4574065344
The id of d is:  4574065344


*\[This is different from the last part because as c changes, d also changes even though I never redefine d. The ids for both lists stay the same and they both change as I change c.\].*
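
A short sketch of why `d` follows `c` here, and of two ways to avoid it. It assumes the standard behaviour of lists: `+=` extends the existing list in place, while `c = c + [...]` builds a new list and rebinds `c`. The name `e` is introduced only for this example.

```python
c = [0, 1, 2]
d = c            # alias: d refers to the same list as c
e = c.copy()     # shallow copy: a separate list with the same elements

c += [3]         # in-place extend, visible through d but not through e
print(d)         # [0, 1, 2, 3]
print(e)         # [0, 1, 2]

c = c + [4, 5]   # builds a new list and rebinds c, leaving d behind
print(c)         # [0, 1, 2, 3, 4, 5]
print(d)         # [0, 1, 2, 3]
print(c is d)    # False
```

If an independent list is what you want, copying (as with `e`) is usually clearer than relying on the difference between `+=` and `+`.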

## § Problem 2: List and String Slicing

This problem will ask you to create your own code cells. Recall that you can do so by selecting the cell below which you want the new one to appear and then clicking the `+` sign in the toolbar. Even better, press `b` while the cell is selected (this works in Jupyter Notebook, not Jupyter Lab).

### Part (A)

Add a code cell below this one. In that cell, assign the string `"string"` to a new variable named `s`. Then run the cell.

For each of the following items, **first guess what the output should be**. Once you decide, add a new code cell, run the corresponding item inside it, and compare your guess with the output:

```python
s[2]
s[-2]
s[:4]
s[4:]
s[1:5]
s[:]
s[2:-1]
s[::2]
```

Make sure you run each of these in a separate cell, and don't clear the output -- else we won't know that you did the work.

Think about which of these would also work the same way if we had declared `s` to be a list (of the same length). Write your hypothesis in a new *markdown* cell below. (Markdown cells like this one don't run their contents through the Python interpreter -- they just display nicely whatever you put inside them. Use the drop-down menu in the toolbar to turn a code cell into a markdown cell. Double-click a markdown cell to edit it, and run it like you would a code cell.) 

In [26]:
s = "string"

In [15]:
s[2]

'r'

In [16]:
s[-2]

'n'

In [17]:
s[:4]

'stri'

In [18]:
s[4:]

'ng'

In [19]:
s[1:5]

'trin'

In [20]:
s[:]

'string'

In [21]:
s[2:-1]

'rin'

In [22]:
s[::2]

'srn'
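
To test the question about lists, one option is to turn the string into a list of characters and repeat a few of the expressions. This is a sketch, and the name `t` is made up for the example. Indexing and slicing use the same syntax, but a slice of a list is a list rather than a string.

```python
s = "string"
t = list(s)     # ['s', 't', 'r', 'i', 'n', 'g']

print(t[2])     # r
print(t[:4])    # ['s', 't', 'r', 'i']
print(t[2:-1])  # ['r', 'i', 'n']
print(t[::2])   # ['s', 'r', 'n']

# Joining a list slice reproduces the corresponding string slice.
print("".join(t[1:5]) == s[1:5])  # True
```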

### Part (B)

Based on what you know about strings, what do you think will happen if you run the two lines:
```python
s[1] = 'p'
print(s)
```
Add one new cell right below this one with both lines, and run it to test your hypothesis. As always, keep the output so that we see your work.

Think about what would happen if we had declared `s` to be a list of the same length (for instance, the list `[1, 2, 3, 4, 5, 6]`) and ran the same code. Write your hypothesis in a new markdown cell below.

In [23]:
s[1] = 'p'
print(s)

TypeError: 'str' object does not support item assignment

I hypothesize that if we had declared s to be a list of the same length, the 2nd element of the list would be replaced with the character "p".
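
The hypothesis can be checked with a short sketch like the one below. The names `nums` and `s_new` are assumptions introduced for illustration. Because strings are immutable, the usual way to "change" one character is to build a new string from slices.

```python
s = "string"
try:
    s[1] = "p"          # strings do not support item assignment
except TypeError as err:
    print(err)

nums = [1, 2, 3, 4, 5, 6]
nums[1] = "p"           # lists do, and elements may have mixed types
print(nums)             # [1, 'p', 3, 4, 5, 6]

s_new = s[:1] + "p" + s[2:]   # build a new string instead of mutating s
print(s_new)                  # spring
```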

### Part (C)

Take a moment to review string indexing. Edit the next cell to fill in the missing positive and negative index labels. Don't forget to render the cell. 

```
| s | t | r | i | n | g |
  0   1   2   3   4   5    <-- positive indices
 -6  -5  -4  -3  -2  -1    <-- negative indices
```
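
The two rows of the diagram are related by the length of the string: the character at positive index `i` is also at negative index `i - len(s)`. A small sketch to confirm this, where `n` is just a local name for `len(s)`:

```python
s = "string"
n = len(s)
for i in range(n):
    # Position i from the left is position i - n from the right.
    assert s[i] == s[i - n]
    print(i, i - n, s[i])
```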

## § Problem 3: Built-In Functions

### Part (A)

In the code cell below we have declared two lists. Run this cell now. 

In [27]:
m1 = [0, 3, 6, 9]
m2 = [0, 2, 4, 6]

Make a new code cell right below this one. Inside it, define a new list that consists of two copies of `m1` followed by one copy of `m2`. Do not re-type the right-hand sides of the `=` signs in the code cell above! Using the same single code cell, perform the following operations on the list, printing the list after each operation:

1. Sort this list
2. Reverse the resulting list
3. Throw out the first value of the list
4. Throw out the last three values of the list
5. Append the number 5 to the list
6. Replace the list with two copies of itself put together (use `=`)
7. Sort the list in reverse order (using just one function call)

Don't guess what your code will do -- **run the cell regularly** to see what happens! 

You can find several useful functions in [these lecture notes](https://nbviewer.jupyter.org/github/PhilChodrow/PIC16A/blob/master/content/basics/lists.ipynb). 

In [49]:
m3 = m1 + m1 + m2
m3.sort()
print(m3)
m3.reverse()
print(m3)
m3.pop(0)
print(m3)
m3.pop()
m3.pop()
m3.pop()
print(m3)
m3.append(5)
print(m3)
m3 = m3 + m3
print(m3)
m3.sort(reverse = True)
print(m3)

[0, 0, 0, 2, 3, 3, 4, 6, 6, 6, 9, 9]
[9, 9, 6, 6, 6, 4, 3, 3, 2, 0, 0, 0]
[9, 6, 6, 6, 4, 3, 3, 2, 0, 0, 0]
[9, 6, 6, 6, 4, 3, 3, 2]
[9, 6, 6, 6, 4, 3, 3, 2, 5]
[9, 6, 6, 6, 4, 3, 3, 2, 5, 9, 6, 6, 6, 4, 3, 3, 2, 5]
[9, 9, 6, 6, 6, 6, 6, 6, 5, 5, 4, 4, 3, 3, 3, 3, 2, 2]
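
For comparison, here is a more compact sketch of steps 1 through 4. It is one possible variant, not the expected solution: it uses list repetition (`2 * m1`) and `del` with a slice in place of repeated `pop()` calls.

```python
m1 = [0, 3, 6, 9]
m2 = [0, 2, 4, 6]

m3 = 2 * m1 + m2        # two copies of m1 followed by one copy of m2
m3.sort(reverse=True)   # steps 1 and 2 combined into one call
del m3[0]               # step 3: drop the first value
del m3[-3:]             # step 4: drop the last three values at once
print(m3)               # [9, 6, 6, 6, 4, 3, 3, 2]
```

Unlike `pop()`, `del` does not return the removed values, so it fits when they are simply being discarded.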


### Part (B)

For each of the following code cells: 

1. Make a markdown cell above the code cell. Inside that markdown cell, write your hypothesis describing what the cell will print or otherwise achieve. There are several functions here that we haven't learned yet -- take your best guess! A couple of words per function (not per line of code) will suffice.
2. Then, run the cell to check your hypothesis. 
3. After checking, feel free to modify the code cell to experiment with the functions and make sure you are clear on their operation. 

My hypothesis for the cell below is that three variables 'p', 'q', and 'r' will be defined with three different lists.

In [41]:
p = ['t', 'e', 's', 't']
q = ['a', 'b', 'c', 'd']
r = [10, 20, 8, 17]

My hypothesis for the cell below is that first the length of `p` will be printed, then the length of `p` and `q` concatenated, then the length of an empty list.

In [42]:
print(len(p))
print(len(p + q))
print(len([]))

4
8
0


My hypothesis for the cell below is that `t1` is a string made of all the elements of `q` joined together with `', '`, and `t2` is a similar string made of the elements of `p` joined together with the empty string `''`. Both strings are printed, then `t1` is printed in upper case, then in all lower case. Finally, the elements of `q` are joined with `''` and the resulting string is split at `'b'` into a list.

In [43]:
t1 = ', '.join(q)
t2 = ''.join(p)
print(t1)
print(t2)
print(t1.upper())
print(t1.upper().lower())
print(''.join(q).split('b'))

a, b, c, d
test
A, B, C, D
a, b, c, d
['a', 'cd']
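
As a follow-up experiment, the sketch below shows that `split` can undo `join` when the same separator is used, and that `join` accepts only strings, so numbers have to be converted first. The name `joined` is made up for the example.

```python
q = ['a', 'b', 'c', 'd']
r = [10, 20, 8, 17]

joined = ', '.join(q)
print(joined.split(', ') == q)       # True: split undoes join here

# join() requires strings, so convert the numbers before joining.
print(', '.join(str(n) for n in r))  # 10, 20, 8, 17
```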


My hypothesis for the code below is that the first thing printed will be the number of `','` characters in `t1` and the second the number of `'t'` characters in `t2`.

In [44]:
print(t1.count(','))
print(t2.count('t'))

3
2


My hypothesis for the code below is that first the minimum (the alphabetically smallest element) of `p` will be printed, then the max of `q`, then the max of `r` and the sum of `r`.

In [45]:
print(min(p))
print(max(q))
print(max(r))
print(sum(r))

e
d
20
55
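
"Alphabetical" is a slight simplification. As a hedged sketch: `min` and `max` compare strings by character code (see `ord`), so upper-case letters sort before lower-case ones, and longer strings are compared character by character. The example words are made up.

```python
p = ['t', 'e', 's', 't']
print(min(p), ord(min(p)))        # e 101
print(min(['b', 'B']))            # B, since ord('B') is 66 and ord('b') is 98
print(max(['apple', 'apricot']))  # apricot: first difference is 'p' < 'r'
```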


My hypothesis for the `range` cell at the end of this part (the last code cell) is that first the `range` object itself will be printed, then the list of integers from 0 to 9. Next, a list from 5 to 7 will be printed, then an empty list, and finally a list counting down from 10 to 1.

My hypothesis for the cell directly below is that first the list `p` sorted in ascending order will be printed, then a list of tuples that groups the elements of `p`, `q`, and `r` sharing the same index. Finally, those tuples will be regrouped so that each original list comes back as one tuple inside a larger list.

In [46]:
print(sorted(p))
print(list(zip(p,q,r)))
print(list(zip(*zip(p,q,r))))

['e', 's', 't', 't']
[('t', 'a', 10), ('e', 'b', 20), ('s', 'c', 8), ('t', 'd', 17)]
[('t', 'e', 's', 't'), ('a', 'b', 'c', 'd'), (10, 20, 8, 17)]
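
The last line is worth unpacking. In the sketch below, `rows` and `columns` are names chosen for the example: `zip(*rows)` passes each tuple in `rows` as a separate argument, which regroups the data by position and so recovers the original lists as tuples. `zip` also stops at its shortest input.

```python
p = ['t', 'e', 's', 't']
q = ['a', 'b', 'c', 'd']
r = [10, 20, 8, 17]

rows = list(zip(p, q, r))     # one tuple per index position
columns = list(zip(*rows))    # * unpacks rows into separate arguments

print(columns[0] == tuple(p))  # True
print(columns[2])              # (10, 20, 8, 17)

# zip stops as soon as the shortest input runs out.
print(list(zip([1, 2, 3], 'ab')))  # [(1, 'a'), (2, 'b')]
```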


In [47]:
print(range(10))
print(list(range(10)))
print(list(range(5,8)))
print(list(range(10,0)))
print(list(range(10,0,-1)))

range(0, 10)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
[5, 6, 7]
[]
[10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
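
A few more experiments with `range`, offered as a sketch: a `range` is a lazy sequence rather than a list, so it supports `len`, indexing, and `in` without being converted first. The name `countdown` is made up for the example.

```python
countdown = range(10, 0, -1)   # start, stop (excluded), step

print(len(countdown))                # 10
print(countdown[0], countdown[-1])   # 10 1
print(5 in countdown)                # True

# With the default step of +1, counting from 10 toward 0 yields nothing.
print(list(range(10, 0)))            # []
print(list(range(0, 10, 3)))         # [0, 3, 6, 9]
```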
