In [1]:
# Remember to execute this cell with Shift+Enter

import jupman

# Lists 4 - Search methods


## [Download exercises zip](../_static/generated/lists.zip)

[Browse files online](https://github.com/DavidLeoni/softpython-en/tree/master/lists)

Lists offer several methods to search and transform their contents, but beware: power is nothing without control! These methods are tempting, but they often hide traps you will regret later. So whenever you write code with one of them, **always ask yourself the questions we stress in this notebook**.

|Method|Returns|Description|
|-------|------|-----------|
|[str1.split(str2)](#split-method---from-strings-to-lists)|`list`|Produces a list with all the pieces of str1 that are separated by str2|
|[list.count(obj)](#count-method)|`int`|Counts the occurrences of an element|
|[list.index(obj)](#index-method)|`int`|Searches for the first occurrence of an element and returns its position|
|[list.remove(obj)](#remove-method)|`None`|Removes the first occurrence of an element|

## What to do

1. Unzip the [exercises zip](../_static/generated/lists.zip) in a folder. You should obtain something like this:

```
lists
    lists1.ipynb    
    lists1-sol.ipynb         
    lists2.ipynb
    lists2-sol.ipynb
    lists3.ipynb
    lists3-sol.ipynb    
    lists4.ipynb
    lists4-sol.ipynb    
    lists5-chal.ipynb
    jupman.py         
```

<div class="alert alert-warning">

**WARNING: to correctly visualize the notebook, it MUST be in an unzipped folder!**
</div>

2. Open Jupyter Notebook from that folder. Two things should open: first a console and then a browser. The browser should show a file list: navigate the list and open the notebook `lists4.ipynb`

3. Go on reading the exercises file. Sometimes you will find paragraphs marked **Exercises** which will ask you to write Python commands in the following cells.

Shortcut keys:

- to execute Python code inside a Jupyter cell, press `Control + Enter`

- to execute Python code inside a Jupyter cell AND select next cell, press `Shift + Enter`

- to execute Python code inside a Jupyter cell AND create a new cell afterwards, press `Alt + Enter`

- if the notebook looks stuck, try selecting `Kernel -> Restart`

## `split` method - from strings to lists

The `split` method of strings does the opposite of `join`: it's called on a string, and a separator is passed as parameter, which can be a single character or a substring. The result is a list of strings without the separator.

In [2]:
"Finally the pirates shared the treasure".split("the")

['Finally ', ' pirates shared ', ' treasure']
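To see in what sense `split` is the opposite of `join`, here is a minimal sketch of a round trip: we split on an explicit separator and then join the pieces back with the same separator. The variable names `text`, `sep`, `pieces` and `rebuilt` are only illustrative.

```python
# Illustrative names, not part of the original notebook
text = "Finally the pirates shared the treasure"
sep = "the"

pieces = text.split(sep)      # list of substrings, separator removed
rebuilt = sep.join(pieces)    # join puts the separator back between the pieces

print(pieces)                 # ['Finally ', ' pirates shared ', ' treasure']
print(rebuilt == text)        # True
```

Note the round trip is shown here only for an explicit separator: when `split` is called without arguments, the original blanks are lost and cannot be rebuilt.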

When `split` is called without arguments, generic _blanks_ are used as separators (space, newline `\n`, tab `\t`, etc.), and consecutive blanks count as a single separator:

In [3]:
s = "Finally the\npirates\tshared     the treasure"
print(s)

Finally the
pirates	shared     the treasure


In [4]:
s.split()

['Finally', 'the', 'pirates', 'shared', 'the', 'treasure']

It's also possible to limit the number of splits performed by specifying the parameter `maxsplit`. Whatever remains of the string is left untouched as the last element:

In [5]:
s.split(maxsplit=2)

['Finally', 'the', 'pirates\tshared     the treasure']
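As a small sketch of how `maxsplit` behaves with an explicit separator, consider the hypothetical comma-separated string below (the name `record` is an assumption made for the example). `maxsplit` counts the cuts performed, so the resulting list has at most `maxsplit + 1` elements.

```python
# Hypothetical comma-separated record, used only for illustration
record = "shovel,hoe,plow,wheelbarrow"

one_cut = record.split(',', maxsplit=1)
two_cuts = record.split(',', maxsplit=2)

print(one_cut)          # ['shovel', 'hoe,plow,wheelbarrow']
print(two_cuts)         # ['shovel', 'hoe', 'plow,wheelbarrow']
print(len(two_cuts))    # 3, that is maxsplit + 1
```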

<div class="alert alert-warning">
    
**WARNING**: What happens if the string does _not_ contain the separator? Remember to also consider this case!
</div>

In [6]:
"I talk and overtalk and I never ever take a break".split(',')

['I talk and overtalk and I never ever take a break']
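One possible way to take this case into account is sketched below: when the separator is missing we get a list with a single element, so accessing index `1` would raise an `IndexError`. The names `line` and `parts` are chosen just for this example.

```python
line = "I talk and overtalk and I never ever take a break"   # no comma inside
parts = line.split(',')

# When the separator is absent, split returns a list holding the whole string
print(len(parts))          # 1
print(parts[0] == line)    # True

# A membership test tells us beforehand whether a cut will really happen
print(',' in line)         # False
```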

**QUESTION**: Look at this code. Will it print something? Or will it produce an error?

1.  ```python
    "revolving\tdoor".split()
    ```
1.  ```python
    "take great\t\ncare".split()    
    ```
1.  ```python
    "do not\tforget\nabout\tme".split('\t')    
    ```    
1.  ```python
    "non ti scordar\ndi\tme".split(' ')
    ```
1.  ```python
    "The Guardian of the Abyss stared at us".split('abyss')[1]
    ```    
1.  ```python
    "".split('abyss')[0]
    ```    
1.  ```python
    "abyss_OOOO_abyss".split('abyss')[0]
    ```

### Exercise - trash dance

You've been hired to dance in the latest video of the notorious band _Melodic Trash_. You can't miss this golden opportunity. Excited, you start reading the score, but you find a lot of errors (of course, the band doesn't need to know how to write scores to get TV time). There are strange symbols, and the last bar (the one after the sixth bar) is too long: each of its words must go on a separate row. Write some code which puts the fixed score in a list `dance`.

* **DO NOT** write string constants from the input in your code (so no `"Ra Ta Pam"` ...)

Example - given:

```python
music = "Zam Dam\tZa Bum Bum\tZam\tBam To Tum\tRa Ta Pam\tBar Ra\tRammaGumma  Unza\n\t\nTACAUACA \n BOOMBOOM!"
```

after your code it must result:

```python
>>> print(dance)
['Zam Dam',
 'Za Bum Bum',
 'Zam',
 'Bam To Tum',
 'Ra Ta Pam',
 'Bar Ra',
 'RammaGumma',
 'Unza',
 'TACAUACA',
 'BOOMBOOM!']
```

In [7]:


music = "Zam Dam\tZa Bum Bum\tZam\tBam To Tum\tRa Ta Pam\tBar Ra\tRammaGumma  Unza\n\t\nTACAUACA \n BOOMBOOM!"

# write here

dance = music.split('\t',maxsplit=6)
dance = dance[:-1] + dance[-1].split()
dance

['Zam Dam',
 'Za Bum Bum',
 'Zam',
 'Bam To Tum',
 'Ra Ta Pam',
 'Bar Ra',
 'RammaGumma',
 'Unza',
 'TACAUACA',
 'BOOMBOOM!']


### Exercise - Trash in tour

The _Melodic Trash_ band strikes again! In a new tour they present their summer hits. The record company only provides the sales numbers in Anglo-Saxon format, so before communicating them to Italian media we need a conversion.

Write some code which, given the `hits` and a `position` in the hit parade (from `1` to `4`), prints the sales number.

- **NOTE**: commas must be substituted with dots

Example - given:

```python
hits = """6,230,650 - I love you like the moldy tomatoes in the fridge
2,000,123 - The pain of living filthy rich
100,000 - Groupies are never enough
837 - Do you remember the trashcans in the summer..."""

position = 1   # the tomatoes
#position = 4  # the trashcans
```

Prints:

```
Number 1 in hit parade "I love you like the moldy tomatoes in the fridge" sold 6.230.650 copies
```

In [8]:

hits = """6,230,650 - I love you like the moldy tomatoes in the fridge
2,000,123 - The pain of living filthy rich
100,000 - Groupies are never enough
837 - Do you remember the trashcans in the summer..."""

position = 1   # the tomatoes
#position = 4  # the trashcans

# write here

lst = hits.split('\n')
ext = lst[position-1].split(' - ')

print("Number", position, "in hit parade", '"' + ext[1] + '"', 
      'sold', '.'.join(ext[0].split(',')), 'copies')

Number 1 in hit parade "I love you like the moldy tomatoes in the fridge" sold 6.230.650 copies



### Exercise - manylines

Given the following string of text:

```python
"""This is a string
of text on
several lines which tells nothing."""
```

1. print it

2. print how many lines, words and characters it contains

3. sort the words in alphabetical order and print the first and last ones in lexicographical order

You should obtain:

```
This is a string
of text on
several lines which tells nothing.

Lines: 3   words: 12   chars: 62

['T', 'h', 'i', 's', ' ', 'i', 's', ' ', 'a', ' ', 's', 't', 'r', 'i', 'n', 'g', '\n', 'o', 'f', ' ', 't', 'e', 'x', 't', ' ', 'o', 'n', '\n', 's', 'e', 'v', 'e', 'r', 'a', 'l', ' ', 'l', 'i', 'n', 'e', 's', ' ', 'w', 'h', 'i', 'c', 'h', ' ', 't', 'e', 'l', 'l', 's', ' ', 'n', 'o', 't', 'h', 'i', 'n', 'g', '.']
62

First word: This
Last word : which
['This', 'a', 'is', 'lines', 'nothing.', 'of', 'on', 'several', 'string', 'tells', 'text', 'which']
```

In [9]:


s = """This is a string
of text on
several lines which tells nothing."""

# write here

# 1) print
print(s)
print("")

# 2) print the number of lines, words and characters
lines = s.split('\n')

# NOTE: words are separated by a space or a newline

words = lines[0].split(' ') + lines[1].split(' ') + lines[2].split(' ')
num_chars = len(s)
print("Lines:", len(lines), "  words:", len(words), "  chars:", num_chars)

# alternative method for number of characters
print("")
characters = list(s)
num_chars2 = len(characters)
print(characters)
print(num_chars2)

# 3) sort the words alphabetically and print the first and last ones in lexicographical order

words.sort() # NOTE: sort modifies the list in place and returns None !!!!
print("")
print("First word:", words[0])
print("Last word :", words[-1])
print(words)

This is a string
of text on
several lines which tells nothing.

Lines: 3   words: 12   chars: 62

['T', 'h', 'i', 's', ' ', 'i', 's', ' ', 'a', ' ', 's', 't', 'r', 'i', 'n', 'g', '\n', 'o', 'f', ' ', 't', 'e', 'x', 't', ' ', 'o', 'n', '\n', 's', 'e', 'v', 'e', 'r', 'a', 'l', ' ', 'l', 'i', 'n', 'e', 's', ' ', 'w', 'h', 'i', 'c', 'h', ' ', 't', 'e', 'l', 'l', 's', ' ', 'n', 'o', 't', 'h', 'i', 'n', 'g', '.']
62

First word: This
Last word : which
['This', 'a', 'is', 'lines', 'nothing.', 'of', 'on', 'several', 'string', 'tells', 'text', 'which']



### Exercise - takechars

✪ Given a `phrase` which contains **exactly** 3 words, where the middle word is **always** a number $n$, write some code which PRINTS the first $n$ characters of the third word.

Example - given:

```python
phrase = "Take 4 letters"
```

your code must print:

```
lett
```


In [10]:

phrase = "Take 4 letters"        # lett
#phrase= "Getting 5 caratters"   # carat
#phrase= "Take 10 characters"    # characters

# write here
words = phrase.split()
n = int(words[1])
print(words[2][:n])

lett



## `count` method

We can find the number of occurrences of a certain element in a list by using the method `count`:

In [11]:
la = ['a', 'n', 'a', 'c', 'o', 'n', 'd', 'a']

In [12]:
la.count('n')

2

In [13]:
la.count('a')

3

In [14]:
la.count('d')

1

### Do not abuse count

<div class="alert alert-warning">
    
**WARNING**: `count` **is often used in wrong / inefficient ways**

Always ask yourself:
    
1. Could the list contain duplicates? Remember they will get counted!
2. Could the list contain _no_ duplicates? Remember to also handle this case!
3. `count` searches the whole list, which could be inefficient: is it really needed, or do we already know the interval where to search?
</div>
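The questions above can be illustrated with a minimal sketch. It assumes we only want to know whether an element is present, or how many times it appears in a known portion of the list. Keep in mind that slicing builds a new, smaller list, so it is a readability aid more than an optimization.

```python
# Illustrative list, the same letters used in the count examples
la = ['a', 'n', 'a', 'c', 'o', 'n', 'd', 'a']

# To know whether an element is present, `in` states the intent directly
print('c' in la)             # True
print(la.count('c') > 0)     # True as well, but count always scans the whole list

# If only the first four cells matter, count on a slice
print(la[:4].count('a'))     # 2
```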

**QUESTION**: Look at the following code fragments, and for each of them try guessing the result (or if it produces an error)

1.  ```python
    ['A','aa','a','aaAah',"a", "aaaa"[1], " a "].count("a")
    ```    
1.  ```python
    ["the", "punishment", "of", "the","fools"].count('Fools') == 1
    ```
1.  ```python
    lst = ['oasis','date','oasis','coconut','date','coconut']
    print(lst.count('date') == 1)
    ```
1.  ```python
    lst = ['oasis','date','oasis','coconut','date','coconut']
    print(lst[4] == 'date')
    ```
1.  ```python
    ['2',2,"2",2,float("2"),2.0, 4/2,"1+1",int('3')-float('1')].count(2)
    ```    
1.  ```python
    [].count([])
    ```
1.  ```python
    [[],[],[]].count([])
    ```    

### Exercise - country life

Given a list `country`, write some code which prints `True` if the number of elements `el1` in the first half equals the number of elements `el2` in the second half.

In [15]:

el1,el2 = 'shovels', 'hoes'          # True
#el1,el2 = 'shovels', 'shovels'      # False
#el1,el2 = 'wheelbarrows', 'plows'   # True
#el1,el2 = 'shovels', 'wheelbarrows' # False

country = ['plows','wheelbarrows', 'shovels',      'wheelbarrows', 'shovels','hoes', 'wheelbarrows',
           'hoes', 'plows',        'wheelbarrows', 'plows',        'shovels','plows','hoes']

# write here
mid = len(country)//2
country[:mid].count(el1) == country[mid:].count(el2)

True


## `index` method

The `index` method allows us to find the index of the FIRST occurrence of an element.

In [16]:
#      0   1   2   3   4
la = ['p','a','e','s','e']

In [17]:
la.index('p')

0

In [18]:
la.index('a')  

1

In [19]:
la.index('e')  # we find the FIRST occurrence

2

If the element we're looking for is not present, we will get an error:



```python
>>> la.index('z')

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-303-32d9c064ebe0> in <module>
----> 1 la.index('z')

ValueError: 'z' is not in list

```
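A possible way to avoid the error is sketched below, under the assumption that we are free to pick our own "not found" value (here `None`, which `index` itself never returns). The second variant uses `try` / `except`, which you may not have met yet in the course. The names `target`, `position` and `position2` are only illustrative.

```python
la = ['p', 'a', 'e', 's', 'e']
target = 'z'   # hypothetical element, not present in the list

# Option 1: test membership first
if target in la:
    position = la.index(target)
else:
    position = None   # our own convention for "not found"

# Option 2: let index fail and catch the error
try:
    position2 = la.index(target)
except ValueError:
    position2 = None

print(position, position2)   # None None
```

Option 1 scans the list twice when the element is present (once for `in`, once for `index`), which is usually acceptable for short lists.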

Optionally, you can specify an index to start from (**included**):

In [20]:
# 0   1   2   3   4   5   6   7   8   9   10
['a','c','c','a','p','a','r','r','a','r','e'].index('a',6)

8

And also where to end (**excluded**):

```python
# 0   1   2   3   4   5   6   7   8   9   10
['a','c','c','a','p','a','r','r','a','r','e'].index('a',6,8)

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-17-7f344c26b62e> in <module>
      1 # 0   1   2   3   4   5   6   7   8   9   10
----> 2 ['a','c','c','a','p','a','r','r','a','r','e'].index('a',6,8)

ValueError: 'a' is not in list
```


### Do not abuse index

<div class="alert alert-warning">
    
**WARNING**: `index` **is often used in wrong / inefficient ways**

Always ask yourself:
    
1. Could the list contain duplicates? Remember only the _first_ will be found!
2. Could the list _not_ contain  the searched element?  Remember to also handle this case!
3. `index` searches the whole list, which could be inefficient: is it really needed, or do we already know the interval where to search?
4. If we want to know whether an `element` is in a position we already know, `index` is useless: it's enough to write `my_list[3] == element`. If you used `index`, it could find duplicate elements which are _before_ the one we are interested in!
</div>
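As a small sketch of the last point, the hypothetical list below contains a duplicate `'e'`: direct access answers the question about a known position, while `index` gets distracted by the earlier duplicate.

```python
# Illustrative list with a duplicate 'e'
la = ['p', 'a', 'e', 's', 'e']

# Is there an 'e' at position 4? Direct access answers exactly that
print(la[4] == 'e')          # True

# index stops at the FIRST 'e' (position 2), so this comparison is False
print(la.index('e') == 4)    # False

# To search only from position 3 onwards, pass the start index
print(la.index('e', 3))      # 4
```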

**QUESTION**: Look at the following code fragments, and for each one try guessing the result it produces (or whether it gives an error).

1.  ```python
    ['arc','boat','hollow','dune'].index('hollow') == ['arc','boat','hollow','dune'].index('hollow',1)
    ```
1.  ```python
    ['azure','blue','sky blue','smurfs'][-1:].index('sky blue')
    ```    
1.  ```python
    road = ['asphalt','bitumen','cement','gravel']
    print('mortar' in road or road.index('mortar'))
    ```    
1.  ```python
    road = ['asphalt','bitumen','cement','gravel']
    print('mortar' in road and road.index('mortar'))
    ```
1.  ```python
    road = ['asphalt','bitumen','mortar','gravel']
    print('mortar' in road and road.index('mortar'))
    ```
1.  ```python
    la = [0,5,10]
    la.reverse()
    print(la.index(5) > la.index(10))
    ```

### Exercise - Spatoč

In the past you met the Slavic painter Spatoč when he was still dirt poor. He gifted you with 2 or 3 paintings (you don't remember) of dubious artistic value that you hid in the attic, but now watching TV you just noticed that Spatoč has gained international fame. You run to the attic to retrieve the paintings, which are lost among junk. Every painting is contained in a `[ ]` box, but you don't know in which rack it is. Write some code which prints where they are.

- racks are **numbered from 1**. If the third painting was not found, print `0`.
- **DO NOT** use loops nor `if`
- **HINT**: printing the first two is easy. To print the last one, have a look at [Booleans - evaluation order](https://en.softpython.org/basics/basics2-bools-sol.html#Evaluation-order)

Example 1 - given:

In [21]:
      #  1      2           3             4             5          
attic = [3,    '\\',       ['painting'], '---',        ['painting'], 
      #  6      7           8             9             10
         5.23, ['shovel'], ['ski'],      ["painting"], ['lamp']]

prints:

```
rack of first painting : 3
rack of second painting: 5
rack of third painting : 9
```

Example 2 - given:

In [22]:
        # 1           2     3       4            5          6          7
attic = [['painting'],'--',['ski'],['painting'],['statue'],['shovel'],['boots']]

prints

```
rack of first painting : 1
rack of second painting: 4
rack of third painting : 0
```

In [23]:

      #  1 2     3           4      5           6     7          8       9             10
attic = [3,'\\',['painting'],'---',['painting'],5.23,['shovel'],['ski'],['painting'], ['lamp']]
#  3,5,9
         # 1           2     3       4            5          6          7
#attic = [['painting'],'--',['ski'],['painting'],['statue'],['shovel'],['boots']]
#  1,4,0

# write here

i1 = attic.index(['painting'])
print("rack of first painting :", i1+1)
i2 = attic.index(['painting'], i1+1)
print("rack of second painting:", i2+1)
i3 = int(['painting'] in attic[i2+1:]) and (attic.index(['painting'], i2+1) + 1)
print("rack of third painting :", i3)

rack of first painting : 3
rack of second painting: 5
rack of third painting : 9



## `remove` method

`remove` takes an object as parameter, searches for the FIRST cell containing that object and eliminates it:

In [24]:
#     0 1 2 3 4 5
la = [6,7,9,5,9,8]   # 9 is in the cells with index 2 and 4

In [25]:
la.remove(9)   # searches first cell containing 9

In [26]:
la

[6, 7, 5, 9, 8]

As you can see, the cell which was at index 2 and that contained the FIRST occurrence of `9` has been eliminated. The cell containing the SECOND occurrence of `9` is still there.

If you try removing an object which is not present, you will receive an error:


```python
la.remove(666)

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-121-5d04a71f9d33> in <module>
----> 1 la.remove(666)

ValueError: list.remove(x): x not in list
```
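One way to protect against this error is to check for membership before removing, as in the minimal sketch below (the name `unwanted` is an assumption made for the example).

```python
la = [6, 7, 5, 9, 8]
unwanted = 666        # hypothetical value, not present in the list

# Remove only when the element is actually there
if unwanted in la:
    la.remove(unwanted)

print(la)   # [6, 7, 5, 9, 8], unchanged and no error raised
```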

### Do not abuse remove

<div class="alert alert-warning">
    
**WARNING**: `remove` **is often used in wrong / inefficient ways**

Always ask yourself:
    
1. Could the list contain duplicates? Remember only the _first_ will be removed!
2. Could the list _not_ contain  the searched element?  Remember to also handle this case!
3. `remove` searches the whole list, which could be inefficient: is it really needed, or do we already know the position `i` of the element to be removed? In that case it's much better to use `.pop(i)`
</div>
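The difference between `remove` and `.pop(i)` can be sketched as follows, assuming we want to delete the `9` which sits at index `4` and not the one at index `2`. The list names are only illustrative.

```python
#            0  1  2  3  4  5
by_value  = [6, 7, 9, 5, 9, 8]
by_index  = [6, 7, 9, 5, 9, 8]

# remove searches by value and deletes the FIRST 9, the one at index 2
by_value.remove(9)
print(by_value)     # [6, 7, 5, 9, 8]

# pop deletes exactly the cell at the given index and returns its content
removed = by_index.pop(4)
print(removed)      # 9
print(by_index)     # [6, 7, 9, 5, 8]
```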

**QUESTION**: Look at the following code fragments, and for each try guessing the result (or if it produces an error).

1.  ```python
    la = ['a','b','c','b']
    la.remove('b')
    print(la)
    ```
1.  ```python
    la = ['a','b','c','b']
    x = la.remove('b')
    print(x)
    print(la)
    ```    
1.  ```python
    la = ['a','d','c','d']
    la.remove('b')
    print(la)
    ```
1.  ```python
    la = ['a','bb','c','bbb']
    la.remove('b')
    print(la)
    ```        
1.  ```python
    la = ['a','b','c','b']
    la.remove('B')    
    print(la)
    ```    
1.  ```python
    la = ['a',9,'99',9,'c',str(9),'999']
    la.remove("9")    
    print(la)
    ```
1.  ```python    
    la = ["don't", "trick","me"]
    la.remove("don't").remove("trick").remove("me")
    print(la)
    ```
1.  ```python
    la = ["don't", "trick","me"]
    la.remove("don't")
    la.remove("trick")
    la.remove("me")
    print(la)
    ```    
1.  ```python
    la = [4,5,7,10]
    11 in la or la.remove(11)    
    print(la)
    ```     
1.  ```python
    la = [4,5,7,10]
    11 in la and la.remove(11)    
    print(la)
    ``` 
1.  ```python
    la = [4,5,7,10]
    5 in la and la.remove(5)
    print(la)
    ```    
1.  ```python
    la = [9, [9], [[9]], [[[9]]] ]
    la.remove([9])
    print(la)
    ```        
1.  ```python
    la = [9, [9], [[9]], [[[9]]] ]
    la.remove([[9]])
    print(la)
    ```        

### Exercise - nob

Write some code which removes from list `la` all the numbers contained in the 3-element list `lb`.

* your code must work with any list `la` and any list `lb` of three elements

* you can assume that list `la` contains exactly TWO occurrences of all the elements of `lb` (plus also other numbers)

Example -  given:

```python
lb = [8,7,4]
la = [7,8,11,8,7,4,5,4]
```

after your code it must result:

```python
>>> print(la)
[11, 5]
```

In [27]:

lb = [8,7,4]
la = [7,8,11,8,7,4,5,4]

# write here

la.remove(lb[0])
la.remove(lb[0])
la.remove(lb[1])
la.remove(lb[1])
la.remove(lb[2])
la.remove(lb[2])
print(la)

[11, 5]



## Continue

Go on with the [first challenges](https://en.softpython.org/lists/lists5-chal.html)