# (3D) Lists and words
## (3D-1) Lists

In this notebook, we'll learn:

* What **lists** are
* How to create lists
* How to find things in lists
* How to edit lists
* How to add/remove things from lists

## Intro to lists

So far we've been working primarily with strings. Strings are the starting point for text mining: they represent the sequences of letters, spaces, and punctuation marks that form the very body of the text.

But we've also seen some limitations of strings. When we try to count "words" in strings, we realize that Python has no concept of a "word." All strings are Greek to Python; for Python, every string is just a sequence of hieroglyphs (to shift metaphors across the Mediterranean). So when we count the occurrences of "she" in a string, Python has no idea that "she" is a word that should be counted, while the "she" inside "sheer" should not.

For that, we need **lists**: a *sequence* of elements, each of which is distinct.

So instead of the string representation of: 

```python
text_as_string = "Perhaps people thought that doom could be pushed forward and away..."
```
    
We can move to a list representation:

```python
text_as_list = ['Perhaps', 'people', 'thought', 'that', 'doom', 'could', 'be', 'pushed', 'forward', 'and', 'away', '...']
```

Each element in this list is distinct. So while counting "she" in `text_as_string` yields 1 (in "pu**she**d"), counting "she" in `text_as_list` correctly returns 0.

In [None]:
# Let's test this out
text_as_string = "Perhaps people thought that doom could be pushed forward and away..."
text_as_list = ['Perhaps', 'people', 'thought', 'that', 'doom', 'could', 'be', 'pushed', 'forward', 'and', 'away', '...']

print('Number of "she" in text_as_string:',text_as_string.count('she'))
print('Number of "she" in text_as_list:',text_as_list.count('she'))

## How to create lists

To create a list, wrap its elements in **square brackets**.

```python
my_list = [thing1, thing2, thing3, ..., thingN]
```

Use commas to separate elements in the list.

In [None]:
# Let's define a list and print it

the_question = ['to','be','or','not','to','be']

print(the_question)

In [None]:
## @TODO: Make a new list and print it
#


In [3]:
# Lists can include any kind of data

my_list = [1, 2, 3, 'hello', 4.21414]
my_list

[1, 2, 3, 'hello', 4.21414]

In [4]:
# Lists can even include other lists!

my_meta_list = ['A','B',my_list,'C']
my_meta_list

['A', 'B', [1, 2, 3, 'hello', 4.21414], 'C']
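
To reach something inside a nested list, one option is to index twice: first into the outer list, then into the inner one. The sketch below rebuilds the two lists from the cells above so it runs on its own; the names `inner` and `inner_word` are made up for illustration.

```python
# Rebuild the lists from the cells above so this sketch is self-contained
my_list = [1, 2, 3, 'hello', 4.21414]
my_meta_list = ['A', 'B', my_list, 'C']

inner = my_meta_list[2]          # the third element is itself a list
inner_word = my_meta_list[2][3]  # index the outer list, then the inner one

print(inner)              # [1, 2, 3, 'hello', 4.21414]
print(inner_word)         # hello
print(len(my_meta_list))  # 4: the nested list counts as a single element
```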

## How to slice lists

Lists are indexed and sliced the same way as strings!

In [5]:
# Slice the first word from Hamlet

the_question[0]

'to'

In [6]:
## @TODO: Slice the last word from Hamlet
#

the_question[-1]

'be'

In [7]:
## @TODO: Slice the first two words
#

the_question[:2]

['to', 'be']

In [8]:
## @TODO: Slice the last two words
#
the_question[-2:]

['to', 'be']
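
One detail worth noticing in the outputs above: a single index gives back the element itself, while a slice gives back a list, even when that list holds only one element. A small sketch (the names `first_word` and `first_as_list` are illustrative):

```python
the_question = ['to', 'be', 'or', 'not', 'to', 'be']

first_word = the_question[0]      # a single index returns the element itself
first_as_list = the_question[:1]  # a slice always returns a list
middle = the_question[2:4]        # elements at index 2 and 3 (4 is excluded)

print(first_word)     # to
print(first_as_list)  # ['to']
print(middle)         # ['or', 'not']
```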

## Finding things in lists

With `index()`, the same way as in strings! An optional second argument tells Python which position to start searching from.

In [10]:
the_question.index('to')

0

In [11]:
the_question[0]

'to'

In [13]:
the_question.index('to',1)

4

In [14]:
the_question[4]

'to'

In [15]:
the_question.index('be')

1

In [16]:
the_question.index('be',2)

5

In [18]:
## @TODO: Find the index of 'not'
# Then print the two words to left and right of 'not'
#

index = the_question.index('not')

the_question[index-2 : index+2+1]

['be', 'or', 'not', 'to', 'be']
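
If the thing we ask for is not in the list, `index()` raises a `ValueError`. Unlike strings, lists have no `find()` method to fall back on, so a common way around this is to test membership with `in` first. A minimal sketch, assuming conditionals are already familiar; `position` and its `None` placeholder are choices made for this example only.

```python
the_question = ['to', 'be', 'or', 'not', 'to', 'be']

has_not = 'not' in the_question      # True: 'not' is an element of the list
has_maybe = 'maybe' in the_question  # False: no element equals 'maybe'

# Only call index() once we know the word is there
if has_maybe:
    position = the_question.index('maybe')
else:
    position = None  # placeholder chosen for this sketch

print(has_not, has_maybe, position)  # True False None
```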

## How to edit lists

In [19]:
# Let's change what Hamlet said

the_question[-1]='pee'

print(the_question)

['to', 'be', 'or', 'not', 'to', 'pee']


In [21]:
## @TODO: Change the other 'be' in `the_question` and print the list
#

index_be = the_question.index('be')
the_question[index_be] = 'pee'

the_question

['to', 'pee', 'or', 'not', 'to', 'pee']

## How to add things to lists

### Method 1: `list.append(thing)`

We can use `append()` to add individual things to lists.

In [22]:
the_abyss = ['And','if','you','gaze','long','into','an','abyss',',']
print(the_abyss)

the_abyss.append('the')
print(the_abyss)

the_abyss.append('abyss')
print(the_abyss)

the_abyss.append('also')
print(the_abyss)

the_abyss.append('gazes')
print(the_abyss)

the_abyss.append('into')
print(the_abyss)

the_abyss.append('you')
print(the_abyss)

the_abyss.append('.')
print(the_abyss)

['And', 'if', 'you', 'gaze', 'long', 'into', 'an', 'abyss', ',']
['And', 'if', 'you', 'gaze', 'long', 'into', 'an', 'abyss', ',', 'the']
['And', 'if', 'you', 'gaze', 'long', 'into', 'an', 'abyss', ',', 'the', 'abyss']
['And', 'if', 'you', 'gaze', 'long', 'into', 'an', 'abyss', ',', 'the', 'abyss', 'also']
['And', 'if', 'you', 'gaze', 'long', 'into', 'an', 'abyss', ',', 'the', 'abyss', 'also', 'gazes']
['And', 'if', 'you', 'gaze', 'long', 'into', 'an', 'abyss', ',', 'the', 'abyss', 'also', 'gazes', 'into']
['And', 'if', 'you', 'gaze', 'long', 'into', 'an', 'abyss', ',', 'the', 'abyss', 'also', 'gazes', 'into', 'you']
['And', 'if', 'you', 'gaze', 'long', 'into', 'an', 'abyss', ',', 'the', 'abyss', 'also', 'gazes', 'into', 'you', '.']


In [23]:
## @TODO: Make a knock knock joke list,
# adding each element of the joke to the list using append(),
# and printing after each append
# 
# choose from:
# 1) Knock knock! | Who's there? | Cash | Cash who? | No thanks, I'll have some peanuts.
# 2) Knock, Knock | Who’s there? | A little old lady | A little old lady who? | Wow, I didn’t know you could yodel!
# 3) ?

joke = ['Knock Knock!']
print(joke)

joke.append("Who's there?")
print(joke)

joke.append("Cash")
print(joke)

joke.append("Cash who?")
print(joke)

joke.append("No thanks, I'll have some peanuts")
print(joke)

['Knock Knock!']
['Knock Knock!', "Who's there?"]
['Knock Knock!', "Who's there?", 'Cash']
['Knock Knock!', "Who's there?", 'Cash', 'Cash who?']
['Knock Knock!', "Who's there?", 'Cash', 'Cash who?', "No thanks, I'll have some peanuts"]


### Method 2: `list.extend([multiple, things])`

We can use `extend()` to add *multiple* things to a list.

In [24]:
# We can use extend

the_abyss = ['And','if','you','gaze','long','into','an','abyss',',']
print(the_abyss)

the_abyss.extend(['the', 'abyss', 'also', 'gazes'])
print(the_abyss)

the_abyss.extend(['into', 'you', '.'])
print(the_abyss)

['And', 'if', 'you', 'gaze', 'long', 'into', 'an', 'abyss', ',']
['And', 'if', 'you', 'gaze', 'long', 'into', 'an', 'abyss', ',', 'the', 'abyss', 'also', 'gazes']
['And', 'if', 'you', 'gaze', 'long', 'into', 'an', 'abyss', ',', 'the', 'abyss', 'also', 'gazes', 'into', 'you', '.']


In [26]:
## @TODO: Extend "rhyme_scheme" by each quatrain of a Shakespearean sonnet and print after each extend

quatrain1=['a','b','a','b']
quatrain2=['c','d','c','d']
quatrain3=['e','f','e','f']
couplet  =['g','g']

rhyme_scheme = []

rhyme_scheme.extend(quatrain1)
rhyme_scheme.extend(quatrain2)
rhyme_scheme.extend(quatrain3)
rhyme_scheme.extend(couplet)

print(rhyme_scheme)

['a', 'b', 'a', 'b', 'c', 'd', 'c', 'd', 'e', 'f', 'e', 'f', 'g', 'g']
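
The difference between `append()` and `extend()` shows most clearly when both are given a list. The sketch below uses two throwaway lists (`with_append` and `with_extend` are illustrative names) to compare them side by side.

```python
with_append = ['a', 'b']
with_append.append(['c', 'd'])  # adds the list itself as ONE new element

with_extend = ['a', 'b']
with_extend.extend(['c', 'd'])  # adds each element of the list separately

print(with_append)  # ['a', 'b', ['c', 'd']]
print(with_extend)  # ['a', 'b', 'c', 'd']
print(len(with_append), len(with_extend))  # 3 4
```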


### Method 3: list addition

We can also add lists: `AB = A + B`

In [27]:
A = [1,2,3,4,5]
B = [6,7,8,9,10]

print(A)
print(B)
print(A + B)

[1, 2, 3, 4, 5]
[6, 7, 8, 9, 10]
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]


In [28]:
## @TODO: Rebuild rhyme_scheme using addition
#

rhyme_scheme = quatrain1 + quatrain2 + quatrain3 + couplet

print(rhyme_scheme)

['a', 'b', 'a', 'b', 'c', 'd', 'c', 'd', 'e', 'f', 'e', 'f', 'g', 'g']


In [30]:
# There's also IN-PLACE addition using "+="

rhyming_couplets=[]
print(rhyming_couplets)

rhyming_couplets+=['a','a']
print(rhyming_couplets)

rhyming_couplets+=['b','b']
print(rhyming_couplets)

rhyming_couplets+=['c','c']
print(rhyming_couplets)

[]
['a', 'a']
['a', 'a', 'b', 'b']
['a', 'a', 'b', 'b', 'c', 'c']
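
To see what "in-place" means here, it may help to compare `+` and `+=` directly. In this sketch (all names are illustrative), `A + B` builds a new list and leaves `A` alone, while `A += B` changes `A` itself, much like `A.extend(B)`.

```python
A = [1, 2, 3]
B = [4, 5]

C = A + B  # builds a new list; A and B are untouched
print(A)   # [1, 2, 3]
print(C)   # [1, 2, 3, 4, 5]

A += B     # changes A itself, like A.extend(B)
print(A)   # [1, 2, 3, 4, 5]
print(B)   # [4, 5]
```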


In [35]:
string='hello'

In [40]:
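# "+=" works on strings too. This cell was run several times,
# so each run added another 'hello' to the end of `string`.
string+='hello'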

print(string)

hellohellohellohellohellohello


## How to remove things from lists

In [None]:
hamlet_again = ['to','be','or','not','to','be']

print(hamlet_again)

In [None]:
hamlet_again.remove('be')
print(hamlet_again)
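
Note that `remove()` deletes only the first matching element. Two other standard ways to take things out of a list are `pop()`, which removes an element and hands it back, and `del`, which deletes by position. A short sketch (the name `last_word` is illustrative):

```python
hamlet_again = ['to', 'be', 'or', 'not', 'to', 'be']

hamlet_again.remove('be')       # removes only the FIRST 'be'
print(hamlet_again)             # ['to', 'or', 'not', 'to', 'be']

last_word = hamlet_again.pop()  # removes the last element and returns it
print(last_word)                # be
print(hamlet_again)             # ['to', 'or', 'not', 'to']

del hamlet_again[0]             # deletes the element at index 0
print(hamlet_again)             # ['or', 'not', 'to']
```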

## Careful! Lists are "mutable"

In [43]:
###
# Careful! Lists are "mutable". This means they can change in-place and still be themselves
###

# Make a list called rose
rose = ['A', 'rose', 'by', 'any', 'other', 'name', 'would', 'smell', 'as', 'sweet']

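# Copy the contents of 'rose' into a new list called 'buttflower'.
# list(rose) makes a separate copy; a plain `buttflower = rose` would
# instead give the SAME list a second name, so edits would show up in both.
buttflower = list(rose)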

# Print the contents of rose and buttflower
print('Before editing:')
print('rose =',rose)
print('buttflower =',buttflower)

# Change only buttflower
buttflower.append('AS')
buttflower.append('FEET')

# Print the contents of rose and buttflower
print()
print('After editing buttflower:')
print('rose =',rose)
print('buttflower =',buttflower)

Before editing:
rose = ['A', 'rose', 'by', 'any', 'other', 'name', 'would', 'smell', 'as', 'sweet']
buttflower = ['A', 'rose', 'by', 'any', 'other', 'name', 'would', 'smell', 'as', 'sweet']

After editing buttflower:
rose = ['A', 'rose', 'by', 'any', 'other', 'name', 'would', 'smell', 'as', 'sweet']
buttflower = ['A', 'rose', 'by', 'any', 'other', 'name', 'would', 'smell', 'as', 'sweet', 'AS', 'FEET']
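
The reason for the "Careful!" is what happens without a copy. In the sketch below (`same_list` and `copied` are illustrative names, and the list is shortened for readability), plain assignment gives one list two names, so appending through either name changes what both names show.

```python
rose = ['A', 'rose']
same_list = rose     # a second name for the SAME list
copied = list(rose)  # a new list with the same contents

same_list.append('!')

print(rose)       # ['A', 'rose', '!']
print(same_list)  # ['A', 'rose', '!']
print(copied)     # ['A', 'rose']
print(same_list is rose, copied is rose)  # True False
```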


In [42]:
# Compare this to strings! Strings are 'immutable'

rose_str = 'A rose by any other name would smell as sweet'
buttflower_str = rose_str

# Print the contents of rose_str and buttflower_str
print('Before editing:')
print('rose_str =',rose_str)
print('buttflower_str =',buttflower_str)

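# "Change" only buttflower_str: this builds a NEW string and points
# the name buttflower_str at it, leaving the original string untouched
buttflower_str = buttflower_str + ' AS FEET'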

# Print the contents of rose_str and buttflower_str
print()
print('After editing buttflower_str:')
print('rose_str =',rose_str)
print('buttflower_str =',buttflower_str)

Before editing:
rose_str = A rose by any other name would smell as sweet
buttflower_str = A rose by any other name would smell as sweet

After editing buttflower_str:
rose_str = A rose by any other name would smell as sweet
buttflower_str = A rose by any other name would smell as sweet AS FEET
