## MY470 Computer Programming
# Working with Strings and Lists in Python
### Week 2 Lab

## Variables

Variables associate objects (values) with a name. Objects have types (belong to classes). Here are the rules for naming variables:
* Variables must begin with a letter (a - z, A - Z) or underscore (_)
* Variables can contain letters, underscore, and numbers

Notes:
- Names should be informative but not too long; use abbreviations where they stay readable. Do not start names with an underscore.
- There is a trade-off between short and readable names; avoid using too many underscores.
- If you try to use a reserved word as a variable name, you will get an error. If you reuse another existing name (e.g., a built-in function) and do not get an error, you are overriding it, so be careful.

* Watch out for reserved words and names of functions! They are highlighted differently in code to remind you of these.

In [13]:
# List of reserved words in Python 3: False, None, True, and, as, assert,
# async, await, break, class, continue, def, del, elif, else, except,
# finally, for, from, global, if, import, in, is, lambda, nonlocal, not,
# or, pass, raise, return, try, while, with, yield
 
# trial = 2
# try = 3

# Note that the error below explains what the error is, and where exactly it is located.

ls = [1, 2]
# Example: the built-in function list()
# "ls" and "list" are highlighted differently because list is the name of a built-in function.
# If you execute list = [1, 2], you redefine what the name list refers to. A later call such as
# list((1, 2)), which would normally create a list from a tuple, then fails with
# "'list' object is not callable". To fix this, delete the code and restart the kernel.

ls = [1, 2, 3, 4]
print(ls)
list = [1, 2, 3, 4]
print(list)
# list((1, 5, 6)) # this creates an error because list object is not callable

[1, 2, 3, 4]
[1, 2, 3, 4]


In [23]:
#list = [1, 2, 3] # Note the color of "list" - Python recognizes this but you are redefining it!
#list((10, 20, 30)) # The in-built function will no longer work

[1, 2, 3]


TypeError: 'list' object is not callable

**If you catch such an error, delete/edit the code and restart the Kernel!** This will reload all built-in functions, so that you can use them as intended.
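
The naming rules above can be checked before a name is used. The sketch below is only an illustration: `check_name` is a hypothetical helper (not part of the lab), built on the standard-library `keyword` and `builtins` modules. It distinguishes names that raise an error (reserved words) from names that silently override something (built-ins).

```python
import builtins
import keyword


def check_name(name):
    """Classify a candidate variable name (hypothetical helper)."""
    if not name.isidentifier():
        return "invalid"       # e.g. starts with a digit
    if keyword.iskeyword(name):
        return "keyword"       # assigning to it is a SyntaxError
    if hasattr(builtins, name):
        return "built-in"      # assigning to it silently shadows the built-in
    return "ok"


for candidate in ["trial", "try", "list", "ls", "2nd"]:
    print(candidate, "->", check_name(candidate))
```

A result of `"built-in"` is the case to watch for, because Python will not warn you about it.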

## [Style Guide for Python Code (PEP 8)](https://www.python.org/dev/peps/pep-0008/)

Get familiar with the guide. We will not enforce it to the letter but expect compliance with most common-sense aspects (e.g., variable names, space around operators, line length, function documentation, ...).

* 📖 Use **`UPPERCASE_WITH_UNDERSCORES`** for constants, like passwords or secret keys
* 📖 Use **`lowercase_with_underscore`** for variable names, functions, and methods
* 📖 Use **`UpperCamelCase`** for classes (coming in Week 5!) 

- **Read through as much of the guide as you can; we have not covered everything in it.**
- For variable names, start with a lowercase letter, use lowercase throughout, and separate words with underscores.
    - **BUT**: Follow the `lowercase_with_underscore` pattern from the second bullet above, but note that a name of that length is already too long; use abbreviations and do not make names too long.
    - Use the code we give you as examples.
- Put spaces around equals signs (R is less strict about this).
- Separate your code into blocks (e.g., loading data, processing) to make it more legible.
- Use comments for humans to read. Explain in plain words the logic of what you are doing; you do not have to narrate your code. (For this week's assignment you only need about 3 lines of code per solution, but later, with algorithms, it will be more.)
- ==Lines should not be too long (PEP 8 limits lines to 79 characters): you want to avoid horizontal scrolling and problems when printing.==


## Resources

In addition to the Python resources online, you can query any object to get help on what methods are available

In [15]:
dir(dict)
help(dict.popitem)

# dir(dict) lists everything you can do with a dict; ignore the names with underscores for now.
# For more information on a specific method, call help() on it, as with help(dict.popitem) above.

Help on method_descriptor:

popitem(self, /) unbound builtins.dict method
    Remove and return a (key, value) pair as a 2-tuple.

    Pairs are returned in LIFO (last-in, first-out) order.
    Raises KeyError if the dict is empty.
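
Since the names with underscores can be ignored for now, one way to get a shorter overview is to filter them out of the `dir()` listing. This is a minimal sketch; the variable name `public_methods` is chosen here for illustration.

```python
# Keep only the names that do not start with an underscore
public_methods = [name for name in dir(dict) if not name.startswith("_")]
print(public_methods)
```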



# Strings

* Ordered sequences of characters ==> Can index strings
* **Immutable**: you cannot do `mystring[4] = "a"` to replace the character that is already there

- **Empty space is part of a string**
- You can use single or double quotation marks
- You can call methods on strings to transform them
- String methods return a new string. **You need to save the result to a variable, otherwise it is gone.** You can also overwrite the original variable to save space.
E.g., here the last line still shows the original lowercase string:
x = "my string"
x.upper()
x

- `a[::-1]` reverses the entire string
- `example[::-2]`: with a negative step you start at the end and move towards the beginning, so the end is the beginning and vice versa
- **Always check your results**
- **While still testing your code, always define a small variable so you can check visually what is happening, e.g., count whether you have 10 elements**
- Print function: a Jupyter notebook cell displays only the value of its last statement, so you do not need `print()` for that one. If you want to see the result of an earlier statement, you need `print()` for it too.
    - Compare `print("lödölagkj")` with a bare `"lödölaghk"` when the two are written on top of each other in a cell
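
The two points about immutability and saving results can be checked directly. The sketch below uses illustrative variable names (`x_upper`, `message`) that are not part of the lab.

```python
x = "my string"

x.upper()            # the result is computed but thrown away
x_upper = x.upper()  # the result is saved under a new name

# Item assignment is not allowed because strings are immutable
try:
    x[4] = "a"
except TypeError as err:
    message = str(err)

print(x)        # the original string is unchanged
print(x_upper)
print(message)
```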

In [36]:
a = "my string"
a[::-1]
a[-2::-4]



'ns'
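
To see why `a[-2::-4]` gives `'ns'`, it can help to list the index positions a slice visits. The sketch below is an illustration only: `slice_positions` is a hypothetical helper that relies on the built-in `slice.indices()` method to resolve negative and omitted values.

```python
def slice_positions(text, start=None, stop=None, step=None):
    """Return the index positions visited by text[start:stop:step]."""
    return list(range(*slice(start, stop, step).indices(len(text))))


a = "my string"
positions = slice_positions(a, -2, None, -4)
print(positions)                          # indices visited, counted from 0
print("".join(a[i] for i in positions))   # same characters as a[-2::-4]
```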

In [2]:
x = 'my string'

# Capitalises the first character (chr) of the string
x = x.capitalize()

# prints the string
print(x)

# prints the chr at index 3
print(x[3])
print(x[2]) # whitespace is part of the string, so this prints a single space

# prints the last chr
print(x[-1])

# print a range 
# NOTE: not inclusive of the last index, 4 chrs because python starts at 0
print(x[0:4])

# Index one to the last index
# Again, not inclusive
print(x[1:-1])

# EXTENDED SLICING 
# Get every other (2) item in the string.
print(x[::2])

# Reverse steps, every other chr
print(x[::-2])





My string
s
 
g
My s
y strin
M tig
git M


In [42]:
# Exercise 1: Make three new strings from the first and last, 
# second and second to last, and third and third to last letters 
# in the string below. Print the three strings.

p = 'redder'
# print(p[:])
# print(p[1:-1])
# print(p[2:-2])

p1 = p[0] + p[-1]
p2 = p[1] + p[-2]
p3 = p[2] + p[-3]
print(p1, p2, p3)




rr ee dd


In [12]:
# Exercise 2: Make a new string that is the same as string1 but 
# with the 8th and 22nd characters missing.

# A space counts as a character in a string.
# The 8th character is the first "l" in "cancelled" and the 22nd character
# is the second "l" in "travelling"; both should be missing in the result.
string1 = 'I cancelled my travelling plans.'
string1[:7] + string1[8:21] + string1[22:] # This is the efficient solution

# Less efficient solution because it takes four steps; then again, code that is too compressed is not good either
ls = list(string1)
print(ls)
ls.pop(21) # removing an element shifts the index of every character after it, so pop the higher index first; otherwise the second index would have to be adjusted
print(ls)
ls.pop(7)
print(ls)
''.join(ls) # versus ' '.join(ls), which would put a space between the individual characters
# The string that join() is called on is the separator placed between the elements:
# use the empty string '' to glue them together directly, or ' ' to separate them with spaces.

# Now you can see that essentially we are asking you to change 
# from British English to American English. 
# If you knew that, you could have done it another way:
new = string1.replace('travelling', 'traveling').replace('cancelled', 'canceled')
print(new)

['I', ' ', 'c', 'a', 'n', 'c', 'e', 'l', 'l', 'e', 'd', ' ', 'm', 'y', ' ', 't', 'r', 'a', 'v', 'e', 'l', 'l', 'i', 'n', 'g', ' ', 'p', 'l', 'a', 'n', 's', '.']
['I', ' ', 'c', 'a', 'n', 'c', 'e', 'l', 'l', 'e', 'd', ' ', 'm', 'y', ' ', 't', 'r', 'a', 'v', 'e', 'l', 'i', 'n', 'g', ' ', 'p', 'l', 'a', 'n', 's', '.']
['I', ' ', 'c', 'a', 'n', 'c', 'e', 'l', 'e', 'd', ' ', 'm', 'y', ' ', 't', 'r', 'a', 'v', 'e', 'l', 'i', 'n', 'g', ' ', 'p', 'l', 'a', 'n', 's', '.']


'I canceled my traveling plans.'
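
The index shifting caused by `pop()` can be avoided by never modifying the sequence in place. The following is a sketch of an alternative, not the seminar solution: `drop_positions` is a hypothetical helper that takes 1-based positions, as in the exercise text, and builds a new string without them.

```python
def drop_positions(text, positions):
    """Return text without the characters at the given 1-based positions."""
    skip = {p - 1 for p in positions}  # convert to 0-based indices
    return "".join(ch for i, ch in enumerate(text) if i not in skip)


string1 = 'I cancelled my travelling plans.'
print(drop_positions(string1, [8, 22]))
```

Because the original string is only read, the order in which the positions are listed does not matter.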

## String Methods

* `S.upper()`
* `S.lower()`
* `S.capitalize()`
* `S.find(S1)`
* `S.replace(S1, S2)`
* `S.strip(S1)`
* `S.split(S1)`
* `S.join(L)`

- `lstrip` and `rstrip` strip only the left or right end; `strip` strips both
- It is better to break long chains of method calls down into separate steps
- To save changes, assign the result to a new variable. If you apply these methods over and over,
  write multiple lines for better legibility and overwrite the previous definition of the variable with each addition

## Methods Can Be "Stringed"

`sls = s.strip().replace('  ', ' ').upper().split()`

**However, be aware that this may reduce the clarity of your code.**
**Chain at most 3 operations of this kind; then start a new line and redefine or overwrite the variable.**

📖 It is largely a question of code legibility. 


⚡️ Except when you are working with large data — it is then also a question of memory.
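
As an illustration of the trade-off, the sketch below applies the same four methods once as a chain and once step by step. The input string `s` is made up for this example.

```python
s = "  hello  world \n"

# Chained: compact, but harder to inspect when something goes wrong
chained = s.strip().replace('  ', ' ').upper().split()

# Step by step: the variable is overwritten on each line
stepwise = s.strip()
stepwise = stepwise.replace('  ', ' ')
stepwise = stepwise.upper()
stepwise = stepwise.split()

print(chained)
print(stepwise)
```

Both versions give the same result; the stepwise version lets you print the intermediate value after any line.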

In [22]:
# Exercise 3: Remove the trailing white space in the string below, 
# replace all double spaces with single space, and format to a sentence 
# with proper punctuation. Print the resulting string.

string1 = '  this  is a very badly.  formatted string -  I would  like to make it cleaner\n'
#''.join(string1)
string1.strip().capitalize().replace(" i ", "I ").replace("  ", " ").replace(".","") + "." # my solution, but better to split this up into two lines?

#print(list(string1))
#''.join(string1).strip().capitalize().replace(".", "").replace("  ", " ").upper() + "."

# Answer: This exercise asks you to practice string methods
new = string1.strip().replace('  ', ' ').capitalize().replace('.', '') + '.'
print(new)
# Hmm, it looks like capitalize() makes everything in the sentence 
# lower case, which is bad for the pronoun 'I'
# We will need to fix this.
# Make sure you have the spaces around "i", as we will otherwise replace all i's 
new = new.replace(' i ', ' I ') 
print(new)


'This is a very badly formatted string - I would like to make it cleaner.'

In [11]:
# Exercise 4: Convert the string below to a list

s = "['apple', 'orange', 'pear', 'cherry']"
# print(s)
# print(type(s)) # s is a string
# ls = s.replace("'", "")#.lstrip("[").rstrip("]").split(", ")
# # With .split() commented out, ls is still of type str: replace() returns a string,
# # and only split() returns a list. The brackets shown around the result of split()
# # are how Python prints a list; they are not the characters we stripped earlier.
# print(ls)
# type(ls) # str for the line above; list once .split(", ") is applied


# To Python, s is a string. To parse it into a list, use string methods to format it
# into a string that can easily be split: the result should be a list of four elements,
# each of which is a string, the first being apple.
# Clean it up before splitting: strip the brackets with lstrip and rstrip, and replace the apostrophes.
# replace() goes through the whole string and replaces every occurrence it finds, while
# strip() only looks at the ends (lstrip at the left, rstrip at the right), so strip is more efficient.
# join() is the odd one out: you call it on the string that acts as the connector and pass it
# the list to join. With the other methods, you name the object and then perform the operation on it.
# join() is a string method only.


# HER SOLUTION
# ls_solution = s.lstrip('[') # the quotes at the beginning and end are not part of the string; they only mark where it starts and ends, so print() does not show them
# print(ls_solution)

ls_solution = s.lstrip("[").rstrip("]").replace("'", "").split(", ") # split on comma + space to avoid leading spaces in the elements
print(ls_solution)

['apple', 'orange', 'pear', 'cherry']
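
The exercise is meant to practice string methods, but outside the exercise there is a standard-library shortcut for strings that look like Python literals. The sketch below uses `ast.literal_eval`, which is not covered in the lab; it evaluates literals only (lists, strings, numbers, and so on) and does not execute arbitrary code.

```python
import ast

s = "['apple', 'orange', 'pear', 'cherry']"

# Parse the string as a Python literal
fruits = ast.literal_eval(s)

print(fruits)
print(type(fruits))
```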


In [20]:
# Exercise 5: Reverse the strings below.

s1 = 'stressed'
s2 = 'drawer'
semordnilap1 = 'stressed'
semordnilap2 = 'drawer'

# One solution is to use the extended slicing we covered in class
# It is simpler and more efficient
print(semordnilap1[::-1], semordnilap2[::-1])
# If you want to save the result, assign it to a variable
stressed_reversed = s1[::-1]
drawer_reversed = s2[::-1]


# Her solution
''.join(list(reversed(s1)))

# Answer: Unfortunately, strings don't have a reverse() method. 

# One solution is to create a list from the string, reverse the list and then join
# new1 = ''.join(reversed(list(semordnilap1)))
# new2 = ''.join(reversed(list(semordnilap2)))
# print(new1, new2)

new1 = ''.join(reversed(list(semordnilap1)))
print(new1)

desserts reward
desserts


# Lists

* Ordered sequence of values
* Mutable

In [48]:
mylist = [1, 2, 3, 4]
mylist.append(5)
print(mylist)

[1, 2, 3, 4, 5]


## List Methods

* `L.append(e)`
* `L.extend(L1)`
* `L.insert(i, e)`
* `L.remove(e)`
* `L.pop(i)`
* `L.sort()`
* `L.reverse()`
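
Unlike string methods, these list methods change the list in place, and most of them return `None` rather than a new list. The short sketch below walks through all seven on a made-up list `nums`.

```python
nums = [3, 1, 2]

result = nums.sort()   # sorts in place; the return value is None
nums.append(4)         # add one element at the end
nums.extend([5, 6])    # add every element of another list
nums.insert(0, 0)      # insert the value 0 at index 0
nums.remove(3)         # remove the first element equal to 3
last = nums.pop()      # remove and return the last element
nums.reverse()         # reverse in place

print(result, last, nums)
```

Writing `nums = nums.sort()` is therefore a common mistake: it replaces the list with `None`.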

In [53]:
# Exercise 6: Use a list operation to create a list of ten elements, 
# each of which is '*'
# Use [] to wrap the element in a list, then multiply the list
print(list(10 * '*')) # my solution
10 * ['x'] # solution in seminar; multiplying a list repeats its elements inside one list, so there is only one pair of brackets

# Answer: Multiply a list with one '*' by 10
print(10*['*'])


['*', '*', '*', '*', '*', '*', '*', '*', '*', '*']


['x', 'x', 'x', 'x', 'x', 'x', 'x', 'x', 'x', 'x']

In [65]:
# Exercise 7: Assign each of the three elements in the list below 
# to three variables a, b, c
ls = [['dogs', 'cows', 'rabbits', 'cats'], 'eat', {'meat', 'grass'}]
a, b, c = ls[0], ls[1], ls[2]
print(a, b, c)
# Seminar solution: a, b, c = ls

# Answer: You can do multiple assignment in Python. 
# This is a useful way to unpack tuples and lists.
a, b, c = ls
print(a)
print(b)
print(c)


['dogs', 'cows', 'rabbits', 'cats'] eat {'meat', 'grass'}


In [25]:
# Exercise 8: Replace the last element in ls1 with ls2
ls1 = [0, 0, 0, 1]
ls2 = [1, 2, 3]

ls1[3] = ls2 # my solution
print(ls1)
print(ls2)

# Be aware of aliasing
ls2.append(100)
ls2
ls1

# ls.replace(1, ls2) does not work: lists have no replace() method


# Note on the GitHub workflow: open the whole repository in your editor, not just the file.
# The lectures repository contains all the lectures, and each can be opened individually.
# It is a git repository: the files that keep the version tracking are hidden
# (Command + Shift + . shows the hidden files).
# You can also just work on a single file via File > Open File, but to use the GitHub
# capabilities (commit and push) you have to open the entire repository.

[0, 0, 0, [1, 2, 3]]
[1, 2, 3]


[0, 0, 0, [1, 2, 3, 100]]
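
If the aliasing seen in Exercise 8 is not wanted, one option is to store a copy of `ls2` instead of `ls2` itself. This is a minimal sketch using a full slice `[:]` to make the copy.

```python
ls1 = [0, 0, 0, 1]
ls2 = [1, 2, 3]

ls1[-1] = ls2[:]   # store a copy, not the same object
ls2.append(100)    # later changes to ls2 no longer show up in ls1

print(ls1)
print(ls2)
```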

In [37]:
# Exercise 9: Create a new list that contains only unique elements from list x

x = [1, 5, 4, 5, 6, 2, 3, 2, 9, 9, 9, 0, 2, 5, 7]
x = list(set(x)) # my solution
# print(x)

# Answer: Create a set from the list and then convert to list again. 
# Can sort, if necessary.
unique_ls = list(set(x))
unique_ls.sort()
# print(unique_ls)

test_ls = list(set(x))
test_ls.sort()
print(unique_ls)



[0, 1, 2, 3, 4, 5, 6, 7, 9]
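
Converting to a set does not keep the elements in their original order. If the order of first appearance matters, one alternative (not part of the seminar solution) is `dict.fromkeys`, which relies on dictionaries preserving insertion order in Python 3.7 and later.

```python
x = [1, 5, 4, 5, 6, 2, 3, 2, 9, 9, 9, 0, 2, 5, 7]

# Dictionary keys are unique and keep the order in which they were inserted
unique_in_order = list(dict.fromkeys(x))
print(unique_in_order)
```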


In [21]:
# Exercise 10: Print the elements that occur both in list a and list b

a = ['red', 'orange', 'brown', 'blue', 'purple', 'green']
b = ['blue', 'cyan', 'green', 'pink', 'red', 'yellow']

# Answer: Use sets and their intersection
print(set(a) & set(b))

In [92]:
# Exercise 11: Print the second smallest and the second largest numbers 
# in this list of unique numbers

x = [2, 5, 0.7, 0.2, 0.1, 6, 7, 3, 1, 0, 0.3]
x.sort()
second_smallest, second_largest = x[1], x[-2]
print(second_smallest, second_largest)

print(second_largest)
print(second_smallest)

# Answer: This one requires some ingenuity. One solution is to identify 
# the min() and max() in the list, remove them and then do it again. 
# This would work because the numbers don't repeat.
newx = x[:]
newx.remove(max(x))
newx.remove(min(x))
print(min(newx), max(newx))

# Another option is to sort the list and return the second and
# second-to-last elements. Again, this works here only because
# no number occurs more than once in the list.
x.sort()
print(x[1], x[-2])

# Which solution is preferable will depend on the data because sorting
# could be computationally expensive for large lists. We will learn
# about this in Week 10.


0.1 6
6
0.1
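
Both answers rely on the numbers being unique. If a list could contain repeated values, one possible adjustment is to remove the duplicates before sorting. The list `y` below is made up for illustration and is not part of the exercise.

```python
y = [2, 5, 5, 0.1, 0.1, 7, 7, 3]

# Remove duplicates first, then sort the distinct values
distinct = sorted(set(y))
second_smallest, second_largest = distinct[1], distinct[-2]

print(second_smallest, second_largest)
```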


In [63]:
# Exercise 12: Create a new list c that contains the elements of 
# list a and b. Watch out for aliasing - you need to avoid it here.

a = [1, 2, 3, 4, 5]
b = ['a', 'b', 'c', 'd']
c = a + b # my solution; + creates a new list, so c is not an alias of a or b
print(c) 

# Check for aliasing - no issue
a.append("a")
print(c)

print() # need some space between prints for legibility

# Another way is to use the extend() method but then you need to 
# account for aliasing
# Here is the wrong way to do it
c = a
c.extend(b)
print(c)
# Here is why it is wrong: 
print("Modifying c has changed a: a =", a)
a.append(7)
print(a)
print("And modifying a changes c: c =", c)

print() # space for legibility 

# Here is the right way: 
a = [1, 2, 3, 4, 5]
b = ['a', 'b', 'c', 'd']
c = a[:]
c.extend(b)
print(c)
# Cloned, not aliased
a.append([3, 3])
#print("Modify a but not c: a =" a ", c =" c)
print('Modify a but not c: a =', a, ', c =', c)




[1, 2, 3, 4, 5, 'a', 'b', 'c', 'd']
[1, 2, 3, 4, 5, 'a', 'b', 'c', 'd']

[1, 2, 3, 4, 5, 'a', 'a', 'b', 'c', 'd']
Modifying c has changed a: a = [1, 2, 3, 4, 5, 'a', 'a', 'b', 'c', 'd']
[1, 2, 3, 4, 5, 'a', 'a', 'b', 'c', 'd', 7]
And modifying a changes c: c = [1, 2, 3, 4, 5, 'a', 'a', 'b', 'c', 'd', 7]

[1, 2, 3, 4, 5, 'a', 'b', 'c', 'd']
Modify a but not c: a = [1, 2, 3, 4, 5, [3, 3]] , c = [1, 2, 3, 4, 5, 'a', 'b', 'c', 'd']


## Problem Set 0 (Formative)

* Practice string and list manipulations
* Practice working with data
  * Data is available in the `data` repository and you should use relative paths to access it
    * If you do not have access to the `data` repository, you need to fill out the Moodle survey so that we can add you to the organization. Once we add you, you need to accept our invitation and then you will be able to access the data.
  * Do not copy the data into your repository!