# Listiness, or things that behave like lists

Python puts great stock in the idea of **protocols**, or mechanisms of behavior, and in identifying cases in which this behavior is common.

One of the most important ideas is that of things that behave like a *list of items*. 

## Lists

The first of these that we should consider is, well, lists themselves :-). Let's see how Python lists behave.

A list in Python is a sequence of anything!

Lists are mutable; you can insert and delete elements anytime.

In [1]:
# CREATING A LIST

# A list is made from zero or more elements, separated by commas, and surrounded by square brackets
empty_list = []
working_days = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday']
print (working_days[2])

Notice that Python lists are indexed from 0. The first element gets index 0, the second element gets index 1, and so on. Thus `working_days[2]` gives us the 3rd element.

In [2]:
# An empty list can be created using the list() function:
next_empty_list = list()
print (type(next_empty_list))

There is a class of objects in Python that are not lists but, like lists, have a sequential existence. We don't want to generate their elements until we need them, because the elements are easy to generate and would otherwise take up memory. An example is the object created by the function `range(start, stop)`, which gives you sequential numbers from `start` up to 1 before `stop`. To generate all of these numbers at once, we can pass the range object to the `list` function:

In [4]:
num_list = list(range(1,10)) 
print(num_list) 
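
Here is a minimal sketch of the laziness described above, assuming Python 3 (where `range` returns a range object rather than a list). The names `big_range` and `small_list` are made up for illustration.

```python
# A range object stores only start, stop, and step, not the numbers themselves.
big_range = range(1, 1_000_000)

# Length, indexing, and membership tests work without building a list.
print(len(big_range))
print(big_range[0], big_range[-1])
print(500 in big_range)

# Only list() actually generates every number, so reserve it for small ranges.
small_list = list(range(1, 10))
print(small_list)
```

This is why a range over a million numbers costs about as much memory as a range over ten.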

In [5]:
# Lists are mutable (elements are changeable) 
num_list = [1, 2, 3, 4]
num_list[1] = 8 
print (num_list)

In [6]:
# Indexing a list
# - Any integer expression can be used as an index
# - If an index has a negative value, it counts backward from the end of the list
lst = list(range(1,5))
print(lst)
print (lst[-1])

You can get slices of lists:

In [30]:
lst[0:3] # from index 0, not including what's at index 3

In [31]:
lst[-3:-1]
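
The two slices above can be spelled out a little further. This is only a sketch; the variable names are illustrative, and `lst` is rebuilt so the block runs on its own.

```python
lst = list(range(1, 5))      # [1, 2, 3, 4]

first_three = lst[0:3]       # indices 0, 1, 2
middle = lst[-3:-1]          # third-from-last up to, but not including, the last
from_two = lst[2:]           # an omitted stop means "to the end"
every_other = lst[::2]       # a third value sets the step size
shallow_copy = lst[:]        # a slice is a new list, not a view of the old one

shallow_copy[0] = 99         # changing the copy leaves lst untouched
print(first_three, middle, from_two, every_other)
print(lst, shallow_copy)
```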

In [7]:
# A list can have another list, or anything else as its element
numbers = [1, 2, 3, 4, 5]
courses = ['PP', 'BDA', "USP", 'WTA'] 
new_list = [numbers, courses, '6th sem'] 
print (new_list, len(new_list))

In [8]:
# List Operations
# '+' for concatenation
list_a = ['a','b','c'] 
list_no = [1, 2, 3, 4] 
list_new = list_a + list_no
print (list_new)

In [9]:
#Membership (using 'in' operator)
'a' in list_a

## Operations on lists

Lists support many methods which make using them simple. My favorite is `append`.

### Growing lists

In [10]:
# GROWING METHODS
# (i) append(x) - Adds a new element 'x' at the end of the list
alist = ['a', 'b', 'c'] 
alist.append('d') 
print (alist)

In [11]:
# (ii) extend(list) - Adds each element of 'list' to the end of another list
alist.extend(list_no)
print (alist)

In [12]:
# (iii) insert(i, x) - inserts a new element 'x' at specified index 'i' 
alist.insert(4,'e')
print(alist)
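
A common point of confusion is the difference between `append` and `extend` when the argument is itself a list. The sketch below uses made-up names (`nested`, `flat`) to show the contrast.

```python
nested = ['a', 'b', 'c']
nested.append(['d', 'e'])    # the whole list becomes ONE new element
print(nested, len(nested))

flat = ['a', 'b', 'c']
flat.extend(['d', 'e'])      # each item is added separately
print(flat, len(flat))

flat.insert(0, 'z')          # index 0 puts the new item at the front
print(flat)
```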

### Searching and Sorting

In [13]:
# Searching methods
# (i) index(x) - returns the index of the first occurrence of 'x' in the list
alist = [1, 2, 3, 4, 5] 
alist.index(3)

In [14]:
# (ii) count(x) - returns the number of occurrences of 'x' in the list eg:
alist.append(3)
print (alist) 
alist.count(3)

In [15]:
# SORTING METHODS
# (a) sort() - Orders a list 'in place'. Default ordering - ascending order
num = [4, 2, 7, 3, 9, 1] 
num.sort()
num

In [16]:
# To change the order of sorting, use the keyword 'reverse' as an argument to sort()
num = [4, 2, 7, 3, 9, 1] 
num.sort(reverse = True)
num

In [17]:
# (b) sorted(list) - returns a new sorted list, but does not change the order of the original list
num = [4, 6, 2]
sorted_num = sorted(num)
print(num, sorted_num)
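
One pitfall worth sketching: `sort()` works in place and returns `None`, so assigning its result to a variable loses the list. The `key` argument shown at the end is accepted by both `sort()` and `sorted()`; the sample data is invented.

```python
num = [4, 6, 2]

# sorted() builds a new list; the original keeps its order.
ascending = sorted(num)
print(num, ascending)

# list.sort() reorders in place and returns None.
result = num.sort()
print(num, result)

# `key` names a function applied to each item before comparing;
# here the words are ordered by their length.
words = ['pear', 'fig', 'banana']
by_length = sorted(words, key=len)
print(by_length)
```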

### Changing and shortening

You can change any list by individually changing an element. There are some other things you can do as well...

In [18]:
# Reverse a list

# reversed(list) - returns an iterator over 'list' in reverse order eg:
L = list(range(1,5))
print (L)
print (list(reversed(L)))
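
For comparison, here is a sketch of three ways to reverse a list. Only `reversed` appears in the cell above; the slice and the `reverse()` method are shown as alternatives.

```python
L = list(range(1, 5))        # [1, 2, 3, 4]

# reversed() returns an iterator; wrap it in list() to see the items.
rev_copy = list(reversed(L))

# A slice with step -1 also gives a reversed copy.
rev_slice = L[::-1]

# list.reverse() flips the list in place and returns None.
L.reverse()
print(rev_copy, rev_slice, L)
```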

In [19]:
# SHRINKING METHODS
# (a) remove(x) - removes the element 'x' from the list
num = [4, 7, 2, 6, 3, 9] 
num.remove(2)
num

In [20]:
# (b) pop(i) - removes and returns the element in index position 'i'
num = [4, 7, 2, 6, 3, 9] 
num.pop(3)
num

In [21]:
# (c) pop() - removes and returns the last element of the list
num = [4, 7, 2, 6, 3, 9] 
num.pop()
num

In [22]:
# (d) clear() - removes all the elements from a list.
num = [4, 7, 2, 6, 3, 9]
num.clear()
num
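
A few details of the shrinking methods are easy to miss, so here is a small sketch. The `del` statement is an addition not covered in the cells above, and the list `dupes` is made up.

```python
num = [4, 7, 2, 6, 3, 9]

# pop() hands back the element it removes, so it can be stored.
last = num.pop()             # removes the final element, 9
fourth = num.pop(3)          # removes the element at index 3, which is 6
print(last, fourth, num)

# remove() deletes only the first matching value ...
dupes = [1, 2, 1, 2]
dupes.remove(2)
print(dupes)

# ... and raises ValueError when the value is absent.
try:
    num.remove(100)
except ValueError:
    print("100 is not in the list")

# The del statement removes by index or by slice.
del num[0:2]
print(num)
```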

## Iterating over lists

Using a for loop to do iteration is quite simple.

In [23]:
num = [4, 7, 2, 6, 3, 9]
for ele in num:
    print(ele)

You can now mix in conditionals to filter your iteration:

In [24]:
for ele in num:
    if ele % 2 == 0: #even numbers only
        print(ele)

There is a short-cut iteration syntax called a list comprehension, often used to construct new lists:

In [25]:
list_with_same_as_num = [e for e in num]
list_with_same_as_num

This kind of syntax is really useful when combined with conditionals:

In [26]:
list_with_evens = [e for e in num if e % 2 == 0]
list_with_evens
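
To make the shortcut concrete, the sketch below writes the same filter as an explicit loop and as a comprehension, then adds a transformation. The variable names are illustrative.

```python
num = [4, 7, 2, 6, 3, 9]

# Loop version: start empty and append the items that pass the test.
evens_loop = []
for e in num:
    if e % 2 == 0:
        evens_loop.append(e)

# Comprehension version: the same result in one expression.
evens_comp = [e for e in num if e % 2 == 0]

# The leading expression can transform each item as well.
squares_of_evens = [e * e for e in num if e % 2 == 0]
print(evens_loop, evens_comp, squares_of_evens)
```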

## Strings

Strings such as `"hello world"` in Python behave much like lists, and a lot of what you learn about lists applies to them: they are **iterable**, and they have a length! But they have one critical additional property: they are **immutable**, that is, they can't be changed!


In [28]:
var = "This is a string"
print (var, len(var))
print (var[3]) # the s of This

In [32]:
for char in var:
    print(char)

In [29]:
var[3] = "t" # this will fail with a TypeError because strings are immutable

Strings can be sliced just as lists can:

In [27]:
#String slicing
print (var[6:])
print (var[1:3])
print (var[:-1])
print (var[2:10:2]) # the last parameter is for changing the step size
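
Slicing also gives a way around immutability: instead of changing a string, you build a new one from pieces of the old. A minimal sketch, with `new_var` as a made-up name:

```python
var = "This is a string"

# Item assignment on a str raises TypeError.
try:
    var[3] = "t"
except TypeError as err:
    print("TypeError:", err)

# Build a new string: everything before index 3, the new letter, everything after.
new_var = var[:3] + "t" + var[4:]
print(var)       # the original is unchanged
print(new_var)
```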

### String Methods

Some built-in string methods for string manipulation:

*   capitalize()
*   center()
*   count()
*   encode()
*   endswith()
*   expandtabs()
*   find()
*   format()
*   index()
*   isalpha()
*   isdigit()
*   islower()
*   isspace()
*   isupper()
*   join()
*   ljust()
*   lower()
*   isalnum()
*   istitle()
*   lstrip()
*   rjust()
*   splitlines()
*   startswith()
*   strip()
*   partition()
*   replace()
*   rfind()
*   rindex()
*   rpartition()
*   rsplit()
*   rstrip()
*   split()
*   swapcase()
*   title()
*   translate()
*   upper()
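
A quick sketch of a handful of the methods listed above. Each one returns a new value and leaves the original string alone; the sample text is invented.

```python
text = "  hello world  "

stripped = text.strip()                    # remove surrounding whitespace
print(stripped.upper())
print(stripped.title())
print(stripped.replace("world", "there"))
print(stripped.find("world"))              # index of first match, -1 if absent
print(stripped.startswith("hello"), stripped.isalpha())

# split() breaks a string into a list of words; join() goes the other way.
words = stripped.split()
rejoined = "-".join(words)
print(words, rejoined)
```

Note that `isalpha()` is false here because the space between the words is not a letter.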


**[PROGRAMMING EXERCISE]**

Consider the string "HelloWorld,123,ThisIsUniv.Ai". Find the number of uppercase, lowercase, special, and numeric characters.


In [None]:
#Try your code here

## Files

The built-in `open()` function creates a Python file object, which serves as a link to a file residing on your machine. After calling `open()`, strings of data can be transferred to and from the associated external file by calling the returned file object's methods.

At this point, you can read data from the file as a whole (`read()`) or `n` characters at a time (`read(n)`; in binary mode, `n` bytes). You can read a line at a time with `readline()`, and all the lines into a list of strings with `readlines()`. Similar methods exist for writing.

You must close the file after you finish using it.

But as you might have expected, you can treat a file just like a list even more idiomatically, as we shall see.

In [37]:
fd = open("data/Julius Caesar.txt")
counter = 0
for line in fd:
    if counter < 10: # print first 10 lines
        print("<<", line, ">>")
    counter = counter + 1 # also writeable as counter += 1
fd.close()

Notice that the newlines remain. You can use the string method `strip` to remove them.

In [41]:
fd = open("data/Julius Caesar.txt")
counter = 0
for line in fd:
    if counter < 10: # print first 10 lines
        print("<<", line.strip(), ">>")
    else:
        break # break out of for loop
    counter = counter + 1 # also writeable as counter += 1
fd.close()
print(counter)

Above we added a `break` statement in the for loop which ended our iteration through the file. You can use `readlines()` here but it will read the entire file into memory.

In [42]:
fd = open("data/Julius Caesar.txt")
lines = [line.strip() for line in fd.readlines()]
fd.close()
print(len(lines))
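
Since forgetting `close()` is easy, here is a sketch of the `with` statement, which closes the file for you when the block ends. It also uses `enumerate` in place of a hand-kept counter. The file `sample.txt` is a hypothetical stand-in created in a temporary folder so the block runs anywhere.

```python
import os
import tempfile

# Setup only: create a small stand-in file (hypothetical name and contents).
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(path, "w") as fd:
    fd.write("line one\nline two\nline three\n")

# The with block closes fd automatically, even if an error occurs inside it.
first_two = []
with open(path) as fd:
    for counter, line in enumerate(fd):
        if counter >= 2:         # stop after the first 2 lines
            break
        first_two.append(line.strip())

print(first_two)
print(fd.closed)                 # True once the block has exited
```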

### What about writing?

Let's write the first ten lines out...

In [44]:
fd = open("data/Julius Caesar.txt")
fd2 = open("data/julfirst10.txt", "w")
counter = 0
for line in fd:
    if counter < 10: # print and write first 10 lines
        print("<<", line.strip(), ">>")
        fd2.write(line)
    else:
        break # break out of for loop
    counter = counter + 1 # also writeable as counter += 1
fd.close()
fd2.close()
print(counter)
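
The mode string passed to `open()` decides what happens to existing contents. The sketch below contrasts `"w"` with `"a"`, using `with` so each file is closed when its block ends. The file `notes.txt` is a hypothetical stand-in in a temporary folder.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "notes.txt")   # hypothetical file

# Mode "w" creates the file, or empties it if it already exists.
with open(path, "w") as fd:
    fd.write("first line\n")     # write() adds no newline on its own

# Mode "a" appends to the end instead of overwriting.
with open(path, "a") as fd:
    fd.writelines(["second line\n", "third line\n"])

with open(path) as fd:
    contents = fd.readlines()
print(contents)
```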

## A working example

In [48]:
## Read a file, parse lines, and get all words

# make a list with all words in documents
# the words can occur more than once
wordlist = []  
fd = open("data/Julius Caesar.txt")
lines = fd.readlines()
fd.close()
# strip newline characters and other whitespace off the edges
cleaned_lines = [line.strip() for line in lines]
# make a list of lists.
# each inner list is the list of words on that line
list_of_lines_words = [line.split() for line in cleaned_lines]
# Take each list of words, and get all the words
for lines_words in list_of_lines_words:
    wordlist.extend(lines_words) # add this line's words to the wordlist in place
print(wordlist[:1000]) # first 1000 words
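
The split-then-flatten steps above can also be written as one nested list comprehension. This is a sketch only: the `lines` list is a hypothetical stand-in for what `readlines()` would return, so the block runs without the data file.

```python
# Hypothetical stand-in for the lines read from a file.
lines = ["Friends, Romans, countrymen,\n", "lend me your ears;\n", "\n"]

# Read it left to right: for each line, for each word on that line, keep the word.
# split() with no arguments ignores leading/trailing whitespace and newlines.
wordlist = [word for line in lines for word in line.split()]
print(wordlist)
print(len(wordlist))
```

Blank lines contribute nothing, because `split()` on a whitespace-only string returns an empty list.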