## String indexing

- Unlike int and float, the str data type has **internal** structure
- One can access an individual character, or a group of characters, in a string using **indexing**
- One can compute the **length** of a string


In [None]:
### EXAMPLE

st="EECE231 sections 1,2,3"

# indexing starts at 0
print(st[0])  # access the first element
print(st[1])  # access the second element
print(st[len(st)-1])  # access the last element



### Slicing
- We can also access a **slice** of the string (a substring)

In [None]:
st="EECE231 sections 1,2,3"
print(st[0:4]) # first index inclusive, the second exclusive
print(st[4:7])

In [None]:
## We can omit one or both of the indices
st="EECE231 sections 1,2,3"
print(st[8:])
print(st[:3])

### Reverse indexing

In [None]:
### Reverse indexing
st="EECE231 sections 1,2,3"
# the last element is at index -1,
# the element before the last is at -2, and so on
print(st[-1])
print(st[-2])
print(st[-3:]) # the last three characters

### Extended slicing

- So far the slicing "increment" was 1
- We can extend that to arbitrary increments

In [None]:
st="EECE231 sections 1,2,3"
a=st[::2]
print(a)
print(st[-1:-4:-1])
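
The general form of a slice is `st[start:stop:step]`, where any of the three parts may be omitted. The sketch below uses the same string to show a step larger than 1 and a negative step; the variable names `every_third`, `reversed_st` and `digits` are chosen here for illustration only.

```python
st = "EECE231 sections 1,2,3"

every_third = st[::3]     # start at index 0 and take every third character
reversed_st = st[::-1]    # a negative step walks the string backwards
digits = st[4:7]          # the step defaults to 1 when it is omitted

print(every_third)
print(reversed_st)
print(digits)
```

With a negative step, the default start is the end of the string, which is why `st[::-1]` yields the whole string reversed.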

### Immutability

- Once created, a string cannot be changed

In [None]:
st="EECE231 sections 1,2,3"
st[0]='A'
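
The assignment in the cell above raises a `TypeError`. To obtain a modified string, one builds a new string from pieces of the old one. The following is a minimal sketch; the name `new_st` is made up for this example, and `try`/`except` is used only so that the cell runs to completion.

```python
st = "EECE231 sections 1,2,3"

try:
    st[0] = 'A'               # strings do not support item assignment
except TypeError as err:
    print("TypeError:", err)

# Build a new string from a literal and a slice of the old one
new_st = 'A' + st[1:]
print(new_st)
print(st)                     # the original string is unchanged
```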

## Lists

- Lists are similar to strings but can be composed of arbitrary objects
- **Many** of the operations on strings carry over to lists

In [None]:
from IPython.display import IFrame,HTML
src="https://pythontutor.com/visualize.html#code=L%3D%5B10,2,%22ab%22,4.4%5D%0Aprint%28L%29%0AL%5B1%5D%3D7%0Aprint%28L%29%0Aprint%28L%5B0%5D%29%0Aprint%28len%28L%29%29%0A&cumulative=false&curInstr=0&heapPrimitives=nevernest&mode=display&origin=opt-frontend.js&py=3&rawInputLstJSON=%5B%5D&textReferences=false"
IFrame(width=900, height=800, src=src)


In [8]:
A=[1,18,4,39,5]
print(A[0])
print(A[2:])
print(A[-1])
print(A[-2:])
sum(A)/len(A)

1
[4, 39, 5]
5
[39, 5]


13.4

### Creating a list from a string

- Given a string, the **split** method returns a list of **tokens**
- By default, it splits on whitespace (spaces, tabs, newlines) and returns the substrings in between

In [9]:
st="one two three"
print(st.split())
## we can use any separator
st="four,five,six"
print(st.split(','))


['one', 'two', 'three']
['four', 'five', 'six']
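
Calling `split()` with no argument is not quite the same as calling `split(' ')`. The sketch below, with an input string invented for illustration, shows the difference when the text contains consecutive spaces or a tab.

```python
st = "one  two\tthree"            # two spaces, then a tab

default_tokens = st.split()       # splits on any run of whitespace
space_tokens = st.split(' ')      # splits at every single space character

print(default_tokens)
print(space_tokens)
```

With an explicit separator, two adjacent separators produce an empty string in the result, and the tab is not treated as a separator.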


### Inhomogeneous lists

- The elements of a list **don't have** to be of the same type

In [10]:
B=['a',2.2,'hello']
print(B[0])
print(B[1:])
print(B[-1::-1])

a
[2.2, 'hello']
['hello', 2.2, 'a']


### Elements can be lists or lists of lists

In [11]:
A=[['a',3.1,17],'string']
print(A[0])
# think of it as index operator applied sequentially
# e.g. (A[0])[1]
print(A[0][1]) 

['a', 3.1, 17]
3.1


### (some) Operations on lists

In [12]:
A=[1,18,4,39,5]
B=[200,210,220,230]
C=2*A  ## repetition
print(C)
D=A+B ## concatenation
print(D)  
A.reverse()
print(A)

[1, 18, 4, 39, 5, 1, 18, 4, 39, 5]
[1, 18, 4, 39, 5, 200, 210, 220, 230]
[5, 39, 4, 18, 1]


#### Click on the link below to see a visualization
[visualize it](https://pythontutor.com/visualize.html#code=A%3D%5B1,18,4,39,5%5D%0AB%3D%5B200,210,220,230%5D%0AC%3D2*A%20%20%23%23%20repetition%0Aprint%28C%29%0AD%3DA%2BB%20%23%23%20concatenation%0Aprint%28D%29%20%20%0AA.reverse%28%29%0Aprint%28A%29&cumulative=false&curInstr=7&heapPrimitives=nevernest&mode=display&origin=opt-frontend.js&py=3&rawInputLstJSON=%5B%5D&textReferences=false)

### Iteration

- One can iterate over the elements of the list with a for loop

In [13]:
A=[1,18,4,39,5]
for i in range(len(A)):
    A[i]+=1
for v in A: 
    v+=1    #does not change values of A
print(A)

[2, 19, 5, 40, 6]


In [14]:
for i,v in enumerate(A):
    print(f'A[{i}]={v}')

A[0]=2
A[1]=19
A[2]=5
A[3]=40
A[4]=6


### Exercise

- Create a list containing 5 elements
- Each element is a list with 3 elements equal to one
- i.e. [ [1,1,1],[1,1,1],[1,1,1],[1,1,1],[1,1,1] ]

In [15]:
A=5*[3*[1]]
print(A)
for i in range(len(A)):
    print(sum(A[i]))
# 3-d list
#each element of B is a list of list
B=5*[2*[2*[1]]]
print(B)


[[1, 1, 1], [1, 1, 1], [1, 1, 1], [1, 1, 1], [1, 1, 1]]
3
3
3
3
3
[[[1, 1], [1, 1]], [[1, 1], [1, 1]], [[1, 1], [1, 1]], [[1, 1], [1, 1]], [[1, 1], [1, 1]]]
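
One caveat about building nested lists with `*`: repetition copies references, so all rows end up being the same inner list. This does not matter as long as the rows are only read, but it does once a row is modified. The sketch below contrasts it with one possible alternative, a list comprehension that builds a fresh inner list per row.

```python
A = 5 * [3 * [1]]                    # five references to ONE inner list
A[0][0] = 99
print(A)                             # the change shows up in every row

B = [3 * [1] for _ in range(5)]      # a fresh inner list for each row
B[0][0] = 99
print(B)                             # only the first row changes
```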


### Exercise

- We are given a list of students' grades
- Each student has grades for 3 exams
- We want to compute the average grade for each student
- Store the result in a new list

In [7]:
G=[[60,94,88],[93,76,72],[100,67,93]]
average=[]
for g in G:
    x=sum(g)/len(g)
    average.append(x)
print(average)

[80.66666666666667, 80.33333333333333, 86.66666666666667]


### Two sum problem (Special case of subset sum)

- You are given
   1. A list of integers x
   1. A number target
- You need to check if two **distinct** elements of x add up to target
- For example
  - x=[17,3,22,5,13,2], target=8
  - the answer is the pair of indices [1,3], since x[1]+x[3]=3+5=8

In [None]:
x=[17,3,22,5,13,2]
target=8
low,high=None,None
for i in range(0,len(x)):
    for j in range(i+1,len(x)):
        if x[i]+x[j]==target:
            low,high=i,j
print(low,high)

#### Discussion

- The above code keeps looping after it finds a solution, if any
- It would be better to stop as soon as a solution is found
- `break` exits only the innermost loop, so a single `break` cannot leave a nested loop
- One way around this is to use a flag

In [None]:
x=[17,3,22,5,13,2]
target=9
low,high,done=None,None,False
for i in range(0,len(x)):
    for j in range(i+1,len(x)):
        if x[i]+x[j]==target:
            low,high,done=i,j,True
            break
    if done:
        break
print(low,high)
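
An alternative to the flag, sketched here on the assumption that functions are available at this point, is to wrap the search in a function: a `return` leaves both loops at once. The name `two_sum` is hypothetical and not part of the original notes.

```python
def two_sum(x, target):
    """Return indices (i, j) with i < j and x[i] + x[j] == target, or None."""
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            if x[i] + x[j] == target:
                return i, j          # leaves both loops immediately
    return None                      # no pair adds up to target

print(two_sum([17, 3, 22, 5, 13, 2], 8))
print(two_sum([17, 3, 22, 5, 13, 2], 9))
```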

### Longest common prefix

- Given a list of strings, e.g. ["flower","flow","flout"]
- Find the longest common prefix, in the above it is "flo"
- Strategy
    1. Consider the characters of the first string ("flower") in sequence
    1. For each such character check that all the other strings have the same character at that position
    1. if not then stop
    1. otherwise consider the next character


- Example ["flower","flow","flout"]
  1. **'f'**: 2nd has 'f'? yes, 3rd has 'f'? yes, then **continue**
  1. **'l'**: 2nd has 'l'? yes, 3rd has 'l'? yes, then **continue**
  1. **'o'**: 2nd has 'o'? yes, 3rd has 'o'? yes, then **continue**
  1. **'w'**: 2nd has 'w'? yes, 3rd has 'w'? **NO**, then **STOP**
- <u>**Make sure we don't go out of bounds in any of the strings**</u>

In [None]:
strs=["flower","flow","flout"]
mismatch=False
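# the underscore marks a value we ignore: only the index i is needed
# enumerate is used here for practice; range(len(strs[0])) also works
for i,_ in enumerate(strs[0]):
    for s in strs[1:]:
        # stop if s is too short or differs at position i
        if i>= len(s) or s[i]!=strs[0][i]:
            mismatch=True
            break
    if mismatch:
        break
if not mismatch:
    result=strs[0]
else:
    result=strs[0][0:i]
print(result)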

#### Edge cases

In [None]:
strs=["flower","flow","flight"]
mismatch=False
if len(strs)==0:
    result=""
elif len(strs)==1:
    result=strs[0]
else:
    for i,v in enumerate(strs[0]):
        for s in strs[1:]:
            if i>= len(s) or s[i]!=strs[0][i]:
                mismatch=True
                break
        # leave the outer loop as well once a mismatch is found
        if mismatch:
            break
    if not mismatch:
        result=strs[0]    
    else:
        result=strs[0][0:i]
print(f'result={result}')

#### Mutability

- Unlike strings, lists are mutable

In [None]:
A=[1,18,4,39,5]
print(A)
A[3]=999
print(A)
A[2:4]=111,111,111,111,111,111
print(A)

In [None]:
A=[3.4,77,'eece','231']
print(A)
A[2]='EECE'
print(A)
print(A[-1::-1])
A[1:2]=1111,1111,111,111
print(A)

## Tuples

- Tuples are similar to lists but are **immutable**: once a tuple is created, its elements cannot be **changed**

In [None]:
tupleA=(7,'a')
print(tupleA[0])
print(len(tupleA))
tupleB=("EECE",231)
tupleC=tupleA+tupleB
print(tupleC)

In [2]:
### cannot change the value of elements
tupleA[0]=8

TypeError: 'tuple' object does not support item assignment

In [3]:
## empty tuple
tupleA=()
print(type(tupleA))
## (1) is not a tuple: parentheses around a single expression only group it
x=(1)
print(type(x))
## a tuple with a single item is created with a trailing comma
x=1,
print(type(x))

<class 'tuple'>
<class 'int'>
<class 'tuple'>


### Packing and unpacking tuples

In [14]:
#packing
section1="EECE",231,1
print(section1)
#unpacking
dept,course,sec=section1
print(dept)
print(course)
print(sec)

('EECE', 231, 1)
EECE
231
1


**NOTE**: unpacking can be performed, but **usually is not**, on lists 

In [16]:
listA=["EECE",231,1]
dept,course,sec=listA
print(dept)
print(course)
print(sec)

EECE
231
1


### Alias and clone

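- The assignment $y=x$ makes $y$ refer to the same object as $x$; what differs is whether that object can be changed in place
1. If $x$ is an int, float or str (immutable types), $y$ behaves like a **clone** (a copy) of $x$: giving either name a new value later never affects the other
2. If $x$ is a mutable object such as a list, or a tuple that contains lists, $y$ is an **alias**: an in-place change made through $x$ or $y$ is visible through **both**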

In [None]:
x=17
p=3.14
s="EECE"
listA=[1,2]
tupleA=([7,'a'],[8,'b'])
### assignments
y=x
q=p
t=s
listB=listA
tupleB=tupleA
listA[0]=99
print(listA)
print(listB)

tupleB[0][1]='Z'
print(tupleA)
print(tupleB)

[visualize it](https://pythontutor.com/visualize.html#code=x%3D17%0Ap%3D3.14%0As%3D%22EECE%22%0AlistA%3D%5B1,2%5D%0AtupleA%3D%28%5B7,'a'%5D,%5B8,'b'%5D%29%0A%23%23%23%20assignments%0Ay%3Dx%0Aq%3Dp%0At%3Ds%0AlistB%3DlistA%0AtupleB%3DtupleA%0AlistA%5B0%5D%3D99%0Aprint%28listA%29%0Aprint%28listB%29%0A%0AtupleB%5B0%5D%5B1%5D%3D'Z'%0Aprint%28tupleA%29%0Aprint%28tupleB%29&cumulative=false&curInstr=0&heapPrimitives=nevernest&mode=display&origin=opt-frontend.js&py=3&rawInputLstJSON=%5B%5D&textReferences=false)
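
One way to check whether two names are aliases is the `is` operator, which tests whether they refer to the same object. The sketch below also contrasts an in-place change with a reassignment; it reuses the names `listA` and `listB` from the cell above, but the values are chosen for illustration.

```python
listA = [1, 2]
listB = listA                 # alias: both names refer to one list
print(listB is listA)

listB.append(3)               # in-place change, visible through both names
print(listA)

listB = listB + [4]           # builds a new list and rebinds listB only
print(listB is listA)
print(listA, listB)
```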

### copy

- If we want a clone of a list we use the copy method
```
listA=["EECE",231]
listB=listA.copy()
```


In [20]:
listA=["EEXE",231]
listB=listA.copy()
listA[0]="EECE"
print(listA)
print(listB)
## no copy method (not needed, why?) exists for tuples

['EECE', 231]
['EEXE', 231]


### Slicing creates new objects

In [22]:
listA=['a','b',3]
listB=listA[0:2]
print(id(listA),id(listB))
listB[0]='Z'
print(listA,listB)
tupleA="EEXE",231,1
tupleB=tupleA[0:2]
print(id(tupleA),id(tupleB))

2357652548416 2357653509248
['a', 'b', 3] ['Z', 'b']
2357631423168 2357652534592


[visualize it](https://pythontutor.com/visualize.html#code=listA%3D%5B'a','b',3%5D%0AlistB%3DlistA%5B0%3A2%5D%0Aprint%28id%28listA%29,id%28listB%29%29%0AlistB%5B0%5D%3D'Z'%0Aprint%28listA,listB%29%0AtupleA%3D%22EEXE%22,231,1%0AtupleB%3DtupleA%5B0%3A2%5D%0Aprint%28id%28tupleA%29,id%28tupleB%29%29&cumulative=false&curInstr=0&heapPrimitives=nevernest&mode=display&origin=opt-frontend.js&py=3&rawInputLstJSON=%5B%5D&textReferences=false)

In [None]:
#shallow copy
listA=[['a','b',3],[7,'d',5],["hello","there"]]
listB=listA[0:2]
listB[0][0]='Z'
print(listA)
print(listB)
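
The cell above changes both lists because a slice copies only the outer list; the inner lists are shared between the original and the slice. A short sketch using `is` makes this visible (the replacement value `['new']` is arbitrary):

```python
listA = [['a', 'b', 3], [7, 'd', 5], ["hello", "there"]]
listB = listA[0:2]

print(listB is listA)           # the outer list is a new object
print(listB[0] is listA[0])     # the inner lists are shared

listB[0] = ['new']              # rebinding a slot affects only listB
print(listA[0], listB[0])
```

Replacing a whole element of `listB` leaves `listA` alone, whereas modifying a shared inner list in place is seen through both.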

#### Deep copy

In [None]:
from copy import deepcopy
listA=[['a','b',3],[7,'d',5],["hello","there"]]
listAcopy=deepcopy(listA)

listA[0][0]='Z'
print(listA)
print(listAcopy)

[visualize](https://pythontutor.com/visualize.html#code=from%20copy%20import%20deepcopy%0AlistA%3D%5B%5B'a','b',3%5D,%5B7,'d',5%5D,%5B%22hello%22,%22there%22%5D%5D%0AlistAcopy%3Ddeepcopy%28listA%29%0A%0AlistA%5B0%5D%5B0%5D%3D'Z'%0Aprint%28listA%29%0Aprint%28listAcopy%29&cumulative=false&curInstr=0&heapPrimitives=nevernest&mode=display&origin=opt-frontend.js&py=3&rawInputLstJSON=%5B%5D&textReferences=false)

#### List comprehension

- A concise way of creating lists


In [8]:
squares=[x**2 for x in range(10)]
print(squares)
even_squares=[x**2 for x in range(10) if x%2==0 and x>4]
print(even_squares)

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
[36, 64]


In [12]:
listA=[(x,y) for x in range(6) if x%2==0 for y in range(10) if y%3==0]
print(listA)

[(0, 0), (0, 3), (0, 6), (0, 9), (2, 0), (2, 3), (2, 6), (2, 9), (4, 0), (4, 3), (4, 6), (4, 9)]


In [14]:
listA=[[(x,y) for x in range(6) if x%2==0] for y in range(6) if y%3==0]
print(listA)

[[(0, 0), (2, 0), (4, 0)], [(0, 3), (2, 3), (4, 3)]]


### Removing elements from lists

In [11]:
A=['a',3,17,'b']
## remove deletes the first element equal to 3; it has no return value
A.remove(3)
print(A)
## pop removes the element at the given index and returns it
result=A.pop(0)
print(result)
print(A)

['a', 17, 'b']
a
[17, 'b']


### Common mistake

In [None]:
A=[0,1,2,3,4,5,6]
for i in range(len(A)):
    if A[i]%2==0:
        A.pop(i)
print(A)

[visualize](https://pythontutor.com/visualize.html#code=A%3D%5B0,1,2,3,4,5,6%5D%0Afor%20i%20in%20range%28len%28A%29%29%3A%0A%20%20%20%20if%20A%5Bi%5D%252%3D%3D0%3A%0A%20%20%20%20%20%20%20%20A.pop%28i%29%0Aprint%28A%29&cumulative=false&curInstr=0&heapPrimitives=nevernest&mode=display&origin=opt-frontend.js&py=3&rawInputLstJSON=%5B%5D&textReferences=false)
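
The loop above fails because `range(len(A))` is evaluated once, with the original length, while each `pop` shortens the list: eventually `i` passes the last valid index and `A[i]` raises an `IndexError`. In addition, after a `pop` the next element slides into position `i` and is never examined. If the list has to be modified in place, one possible fix is to walk the indices from the end, as in this sketch:

```python
A = [0, 1, 2, 3, 4, 5, 6]

# Walk the indices from the end so that a pop never shifts
# an element that has not been examined yet.
for i in range(len(A) - 1, -1, -1):
    if A[i] % 2 == 0:
        A.pop(i)
print(A)
```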

In [15]:
# a list comprehension builds a new list instead of popping while looping,
# which avoids the problem and is also more efficient

A=[0,1,2,3,4,5,6,7,8,9,10]
A=[i for i in A if i%2==1]
print(A)

[1, 3, 5, 7, 9]
