## Strings

Strings were introduced in Chapter 02, but they can also be treated as collections, much like lists and tuples.
For example:

In [1]:
S = 'The Taj Mahal is beautiful'
print([x for x in S if x.islower()]) # list of lower case characters
words = S.split() # list of words
print("Words are:",words)
print("--".join(words)) # hyphenated 
" ".join(w.capitalize() for w in words) # capitalise words

['h', 'e', 'a', 'j', 'a', 'h', 'a', 'l', 'i', 's', 'b', 'e', 'a', 'u', 't', 'i', 'f', 'u', 'l']
Words are: ['The', 'Taj', 'Mahal', 'is', 'beautiful']
The--Taj--Mahal--is--beautiful


'The Taj Mahal Is Beautiful'

String indexing and slicing work the same way as for lists, which were explained in detail earlier.

In [2]:
print(S[4])
print(S[4:])

T
Taj Mahal is beautiful
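
The slicing forms used for lists carry over to strings unchanged. The following is a small sketch of a few of them; the variable names (`first_word`, `last_word`, `reversed_s`, `assignable`) are chosen here for illustration only. It also shows one difference from lists: a string is immutable, so assigning to an index raises a `TypeError`.

```python
S = 'The Taj Mahal is beautiful'

first_word = S[:3]     # characters at positions 0, 1 and 2
last_word = S[-9:]     # the final nine characters
reversed_s = S[::-1]   # the whole string, read backwards

print(first_word)
print(last_word)

# Unlike a list, a string cannot be modified in place
try:
    S[0] = 't'
    assignable = True
except TypeError:
    assignable = False
print("Item assignment allowed:", assignable)
```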


## Set

A **set** is an unordered collection of unique items. A set is defined by comma-separated values inside braces `{ }`.

In [3]:
S = {5,2,3,1,4}

# printing set variable
print("S = ", S)

# data type of variable S
print(type(S))

S =  {1, 2, 3, 4, 5}
<class 'set'>


In [4]:
P = {2,5,1,4,3}
print(S==P)

True


In [5]:
R = {1,1,2,3,3,4,5,5,5}
print(R==S)

True


We can perform set operations such as union and intersection on two sets. Sets hold unique values, so duplicates are eliminated.

In [6]:
a = {1,2,2,3,3,3}
print(a)

{1, 2, 3}


Since sets are unordered collections, indexing has no meaning. Hence, the indexing operator `[]` does not work.

In [7]:
a = {1,2,3}
a[1]

TypeError: 'set' object is not subscriptable
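
Although a set cannot be indexed, its contents can still be reached in other ways. The sketch below shows a membership test with `in` and a conversion to a sorted list when a position is really needed. The names `has_two`, `ordered` and `message` are illustrative only.

```python
a = {3, 1, 2}

has_two = 2 in a        # membership test, no index needed
ordered = sorted(a)     # a new list in ascending order
second_smallest = ordered[1]

# Indexing the set itself fails
try:
    a[1]
    message = ""
except TypeError as err:
    message = str(err)

print(has_two, ordered, second_smallest)
print(message)
```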

To add one item to a set use the `add()` method.

To add more than one item to a set use the `update()` method.

In [8]:
thisset = {"apple", "banana", "cherry"}

thisset.add("orange")

print(thisset)

{'banana', 'cherry', 'orange', 'apple'}


In [9]:
thisset = {"apple", "banana", "cherry"}

thisset.update(["orange", "mango", "grapes"])

print(thisset)

{'mango', 'apple', 'orange', 'grapes', 'banana', 'cherry'}


To determine how many items a set has, use the `len()` function.

In [10]:
thisset = {"apple", "banana", "cherry"}

print(len(thisset))

3


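To remove an item from a set, use the `remove()` or the `discard()` method. They differ only when the item is absent: `remove()` raises a `KeyError`, while `discard()` does nothing.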

In [11]:
thisset = {"apple", "banana", "cherry"}

thisset.remove("banana")

print(thisset)

{'cherry', 'apple'}


In [12]:
thisset = {"apple", "banana", "cherry"}

thisset.discard("banana")

print(thisset)

{'cherry', 'apple'}
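
The two examples above give the same result because `"banana"` is present. The methods behave differently for an item that is not in the set, as this short sketch illustrates (the flag `removed` is a name introduced here for the example):

```python
thisset = {"apple", "banana", "cherry"}

# discard() silently does nothing when the item is missing
thisset.discard("mango")

# remove() raises KeyError when the item is missing
try:
    thisset.remove("mango")
    removed = True
except KeyError:
    removed = False

print(removed)
print(len(thisset))
```

Use `discard()` when a missing item is not an error, and `remove()` when it should be reported.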


Remove an arbitrary item by using the `pop()` method. Because sets are unordered, you cannot know in advance which item will be removed:

In [13]:
thisset = {"apple", "banana", "cherry"}

x = thisset.pop()

print(x)
print(thisset)

banana
{'cherry', 'apple'}


As with lists, the `clear()` method empties the set:

In [14]:
thisset = {"apple", "banana", "cherry"}

thisset.clear()

print(thisset)

set()


The `del` keyword deletes the set completely, so any later reference to the name raises an error:

In [15]:
thisset = {"apple", "banana", "cherry"}

del thisset

print(thisset)

NameError: name 'thisset' is not defined

The `union()` method returns a new set containing all items from both sets, while the `update()` method inserts all the items from one set into another in place:

In [16]:
set1 = {"a", "b" , "c", 1, 3}
set2 = {1, 2, 3, "b", "a", "f"}

set3 = set1.union(set2)
print(set3)

{1, 2, 3, 'c', 'a', 'f', 'b'}


In [17]:
set1 = {"a", "b" , "c", 1, 3}
set2 = {1, 2, 3, "b", "a", "f"}

set1.update(set2)
print(set1)

{1, 2, 3, 'c', 'a', 'f', 'b'}
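
Both operations also have an operator form: `|` corresponds to `union()` and `|=` to `update()`. A brief sketch, where `combined` and `unchanged` are names used only for this illustration:

```python
set1 = {"a", "b", "c", 1, 3}
set2 = {1, 2, 3, "b", "a", "f"}

combined = set1 | set2                       # new set, like set1.union(set2)
unchanged = set1 == {"a", "b", "c", 1, 3}    # set1 itself was not modified

set1 |= set2                                 # in place, like set1.update(set2)

print(unchanged)
print(set1 == combined)
```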


Intersection of Sets:
The intersection operation on two sets produces a new set containing only the elements common to both sets.

In [19]:
DaysA = set(["Mon","Tue","Wed"])
DaysB = set(["Wed","Thu","Fri","Sat","Sun"])
AllDays = DaysA & DaysB
alldays = DaysA.intersection(DaysB)
print(AllDays)
print(alldays)

{'Wed'}
{'Wed'}


Difference of Sets:
The difference operation on two sets produces a new set containing only the elements of the first set that are not in the second set.

In [20]:
DaysA = set(["Mon","Tue","Wed"])
DaysB = set(["Wed","Thu","Fri","Sat","Sun"])
AllDays = DaysA - DaysB
print(AllDays)

{'Tue', 'Mon'}
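
Unlike union and intersection, difference depends on the order of its operands. The following sketch uses the `difference()` method, the method form of `-`, and swaps the operands to show this. The names `only_a` and `only_b` are illustrative.

```python
DaysA = {"Mon", "Tue", "Wed"}
DaysB = {"Wed", "Thu", "Fri", "Sat", "Sun"}

only_a = DaysA.difference(DaysB)   # same as DaysA - DaysB
only_b = DaysB - DaysA             # operands swapped

print(only_a == {"Mon", "Tue"})
print(only_b == {"Thu", "Fri", "Sat", "Sun"})
```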


## Dictionaries

Dictionaries are mappings between keys and the values stored against them. Alternatively, one can think of a dictionary as a set of keys in which a value is stored against every element of the set.

To define a dictionary, assign `{ }` or `dict()` to a variable:

In [21]:
d = dict() # or equivalently d={}
print(type(d))
d['abc'] = 3
d[4] = "A string"
print(d)

<class 'dict'>
{'abc': 3, 4: 'A string'}


As can be guessed from the output above, dictionaries can also be defined using the `{ key : value }` syntax. The following dictionary has three elements:

In [22]:
d = { 1: 'One', 2 : 'Two', 100 : 'Hundred'}
len(d)

3

Now you can access 'One' through its key, 1:

In [None]:
print(d[1])

One
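
Looking up a key that is not in the dictionary raises a `KeyError`. When a missing key is expected, the `get()` method can be used instead, since it returns a default value rather than raising. A minimal sketch, with `found`, `default_none` and `default_text` as illustrative names:

```python
d = {1: 'One', 2: 'Two', 100: 'Hundred'}

# Square-bracket access fails for a missing key
try:
    d[3]
    found = True
except KeyError:
    found = False

default_none = d.get(3)              # None when no default is given
default_text = d.get(3, 'Unknown')   # the supplied default

print(found, default_none, default_text)
```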


There are a number of alternative ways to specify a dictionary, including as a list of `(key,value)` tuples.
Two related lists can be merged to form a dictionary in this way.
To illustrate this, we will start with two lists and form a list of tuples from them using the **zip()** function:

In [23]:
names = ['One', 'Two', 'Three', 'Four', 'Five']
numbers = [1, 2, 3, 4, 5]
[ (name,number) for name,number in zip(names,numbers)] # create (name,number) pairs

[('One', 1), ('Two', 2), ('Three', 3), ('Four', 4), ('Five', 5)]

Now we can create a dictionary that maps the name to the number as follows.

In [24]:
a1 = dict((name,number) for name,number in zip(names,numbers))
print(a1)

{'One': 1, 'Two': 2, 'Three': 3, 'Four': 4, 'Five': 5}
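
Because `dict()` accepts any iterable of `(key, value)` pairs, the result of `zip()` can also be passed to it directly. This shorter form is a sketch of an equivalent construction, with `a1_short` as a name chosen for the example:

```python
names = ['One', 'Two', 'Three', 'Four', 'Five']
numbers = [1, 2, 3, 4, 5]

# zip() already yields (name, number) pairs, so no explicit loop is needed
a1_short = dict(zip(names, numbers))
print(a1_short)
```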


Note that in Python 3.7 and later a dictionary preserves the order in which its keys were inserted, as the output above shows. In older versions (before Python 3.6) the ordering depended on hash index ordering rather than insertion order, so code that must also run there should never assume an ordering when iterating over the elements of a dictionary.

**Note:** Any value used as a key must be hashable, which for the built-in types means _immutable_. That means that _tuples_ can be used as keys (because they can't be changed, provided their elements are themselves immutable) but lists are not allowed. As an aside for more advanced readers, instances of user-defined classes can also be used as keys; by default the object's identity (its reference) is then used as the key, not the "value" of the object.

The use of tuples as keys is very common and allows for a (sparse) matrix type data structure:

In [None]:
matrix = { (0,1): 3.5, (2,17): 0.1}
matrix[2,2] = matrix[0,1] + matrix[2,17]
# matrix[2,2] is equivalent to matrix[ (2,2) ]
print(matrix)

{(0, 1): 3.5, (2, 17): 0.1, (2, 2): 3.6}
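
The restriction on keys can be seen by trying a list where a tuple is expected. In this sketch the variable `position` and the flag `list_key_ok` are hypothetical names used only for illustration. Converting the list with `tuple()` gives a usable key.

```python
matrix = {(0, 1): 3.5}
position = [2, 2]          # a list: mutable, so not usable as a key

try:
    matrix[position] = 1.0
    list_key_ok = True
except TypeError:
    list_key_ok = False

# Converting to a tuple makes the same coordinates usable
matrix[tuple(position)] = 1.0

print(list_key_ok)
print(matrix)
```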


A dictionary can also be built using a loop-style definition (a dictionary comprehension):

In [25]:
names = ['One', 'Two', 'Three', 'Four', 'Five']
a2 = { name : len(name) for name in names}
print(a2)

{'One': 3, 'Two': 3, 'Three': 5, 'Four': 4, 'Five': 4}


### Built-in Functions

The `len()` function and `in` operator have the obvious meaning:

In [26]:
print("a1 has",len(a1),"elements")
print("One is in a1",'One' in a1,"but not 2:", 2 in a1) # 'in' checks keys only

a1 has 5 elements
One is in a1 True but not 2: False


The `clear( )` method erases all elements.

In [27]:
a2.clear()
print(a2)

{}


The `values( )` method returns all the values stored in the dictionary. (Actually not quite a list, but something that we can iterate over just like a list to construct a list, tuple or any other collection):

In [28]:
names = ['One', 'Two', 'Three', 'Four', 'Five']
numbers = [1, 2, 3, 4, 5]
a1 = dict((name,number) for name,number in zip(names,numbers))
[ v for v in a1.values() ]

[1, 2, 3, 4, 5]

The `keys( )` method returns all the keys against which values are stored in the dictionary.

In [29]:
{ k for k in a1.keys() }

{'Five', 'Four', 'One', 'Three', 'Two'}

The `items( )` method returns both the keys and the values, with each element of the dictionary given as a `(key, value)` tuple. This is the same as the result that was obtained when the zip function was used (although in Python versions before 3.7 the ordering may be 'shuffled' by the dictionary).

In [None]:
",  ".join( "%s = %d" % (name,val) for name,val in a1.items())

'One = 1,  Two = 2,  Three = 3,  Four = 4,  Five = 5'

The `pop( )` method removes the element with the given key and returns its value, which can be assigned to a new variable. Remember that only the value is returned and not the key, because the key is just an index value.

In [30]:
val = a1.pop('Four')
print(a1)
print("Removed",val)

{'One': 1, 'Two': 2, 'Three': 3, 'Five': 5}
Removed 4
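
If the key is not present, `pop()` raises a `KeyError` unless a second argument is supplied as a default. A short sketch on a smaller dictionary, with `missing` and `raised` as illustrative names:

```python
a1 = {'One': 1, 'Two': 2, 'Three': 3}

val = a1.pop('Two')             # removes the entry and returns its value
missing = a1.pop('Ten', None)   # key absent: the default is returned instead

# Without a default, a missing key is an error
try:
    a1.pop('Ten')
    raised = False
except KeyError:
    raised = True

print(val, missing, raised)
print(a1)
```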


## When to use Dictionaries vs Lists

The choice of whether to store data in a list or dictionary (or set) may seem a bit arbitrary at times. Here is a brief summary of some of the pros and cons of these:

* Finding elements in a set vs a list:  `x in C` is valid whether the collection `C` is a list, set or dictionary. However, for large collections this is computationally much slower with lists than with sets or dictionaries. On the other hand, if all items are indexed by an integer then `x[45672]` is much faster to look up if x is a list than if it is a dictionary.
* If all your items are indexed by integers but with some indices unused you could use lists and assign some dummy value (e.g. "") whenever there is no corresponding item. For very sparse collections this could consume significant additional memory compared to a dictionary. On the other hand if most values are present, then storing the indices explicitly (as is done in a dictionary) could consume significant additional memory compared to the list representation.
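
The sparse case described in the second point can be sketched as follows. The data, the names `sparse` and `dense`, and the choice of `""` as the dummy value are assumptions made for this illustration. `sys.getsizeof` reports only the size of the container itself, and the exact byte counts depend on the Python build, so they are printed rather than quoted here.

```python
import sys

# Two items stored at widely separated integer indices
sparse = {3: "a", 1000: "b"}

# The list representation needs a slot for every index up to 1000
dense = [""] * 1001
dense[3] = "a"
dense[1000] = "b"

print("Entries stored:", len(sparse), "vs", len(dense))
print("Container bytes:", sys.getsizeof(sparse), "vs", sys.getsizeof(dense))
```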



In [None]:
import time
bigList = [i for i in range(0,100000)]
bigSet = set(bigList)
# time.perf_counter() replaces time.clock(), which was removed in Python 3.8
start = time.perf_counter()  # how long to find the last number out of 100,000 items?
99999 in bigList
print("List lookup time: %.6f ms" % (1000*(time.perf_counter()-start)))
start = time.perf_counter()
99999 in bigSet
print("Set lookup time:  %.6f ms" % (1000*(time.perf_counter()-start)))

List lookup time: 3.775000 ms
Set lookup time:  0.224000 ms
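
A single timed lookup is easily disturbed by other activity on the machine. One way to get a steadier comparison is the standard `timeit` module, which repeats the statement many times. The following is a sketch only: the repeat count of 100 and the names `big_list`, `big_set`, `list_time` and `set_time` are choices made for this example, and the measured times will differ from machine to machine.

```python
import timeit

big_list = list(range(100000))
big_set = set(big_list)

# Total time for 100 lookups of the last element in each collection
list_time = timeit.timeit(lambda: 99999 in big_list, number=100)
set_time = timeit.timeit(lambda: 99999 in big_set, number=100)

print("List: %.6f s for 100 lookups" % list_time)
print("Set:  %.6f s for 100 lookups" % set_time)
```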



## Exercises

1. Create a dictionary of integer values, then write a script to sort (ascending and descending) this dictionary by value.

2. Combine these two lists into a dictionary:
```python
keys = ['Ten', 'Twenty', 'Thirty']
values = [10, 20, 30]
```
3. Merge these two dictionaries into one dictionary:
```python
dictA = {'Ten': 10, 'Twenty': 20, 'Thirty': 30}
dictB = {'Thirty': 30, 'Forty': 40, 'Fifty': 50}
``` 

4. Access the value of the key `history` in the `sample_dict` below, then replace that value with `100`:
```python
sample_dict = { 
   "class":{ 
      "student":{ 
         "name":"Mike",
         "marks":{ 
            "physics":70,
            "history":80
                }
            }
        }
    }
```

5. Write a Python script to check if a given key already exists in a dictionary or not.

6. Write a Python script to print a dictionary where the keys are the numbers between 1 and 15 (both included) and the values are the squares of the keys. <br>
Sample dictionary:
`{1: 1, 2: 4, 3: 9, 4: 16, 5: 25, 6: 36, 7: 49, 8: 64, 9: 81, 10: 100, 11: 121, 12: 144, 13: 169, 14: 196, 15: 225}`

7. Given the following dictionary: 
```python
inventory = {
    'gold' : 500,
    'pouch' : ['flint', 'twine', 'gemstone'],
    'backpack' : ['xylophone','dagger', 'bedroll','bread loaf']
}
```
Do the following:
- Add a key to inventory called `pocket`.
- Set the value of `pocket` to be a list consisting of the strings `seashell`, `strange berry`, and `lint`.
- `.sort()` the items in the list stored under the `backpack` key.
- Then `.remove('dagger')` from the list of items stored under the `backpack` key.
- Add 50 to the number stored under the `gold` key.