# Dictionary & List (Data Structures)

A data structure is a way of organizing, managing, and storing data in a format that enables efficient access and modification.

Programming languages provide several types of data structures (`Array`, `Stack`, `Queue`, `Tree`, `Object`, and so on).

Some of them are interchangeable, and some are language-specific.

![](https://img-blog.csdnimg.cn/20190910131153958.jpg) An example of the `Stack` data structure



Here we focus on `List` and `Dictionary` that are commonly used in Python.

*A Python `List` plays much the same role as an `Array` in other programming languages. It's worth noting that a Python `List` can store values of different data types at once.*

To initialize a `List`, we use `[]`.

In [1]:
sequence_list = ['a','!',0.13,1]
print(sequence_list)

['a', '!', 0.13, 1]


<img src="https://imgur.com/ljqtx7F.png" width="500">
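
Once a list exists, its elements are read by position. The following is a minimal sketch of indexing and slicing on the same list; the variable names `first`, `last`, and `middle` are only illustrative.

```python
sequence_list = ['a', '!', 0.13, 1]

first = sequence_list[0]      # indexes start at 0
last = sequence_list[-1]      # negative indexes count from the end
middle = sequence_list[1:3]   # slice: index 1 up to, but not including, 3

print(first, last, middle)
print(len(sequence_list))     # number of elements in the list
```

A slice always returns a new list, so `middle` can be changed without affecting `sequence_list`.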

We can also add elements to a list or delete elements from it:

`append` Adds a new element to the end of the list

`remove` Removes the first item in the list whose value is equal to the given value

`insert` Adds an item at a specific index

`pop` Removes the element at a specific index and returns it

`del` Deletes one element or a slice of elements from the list

In [2]:
sequence_list = ['a','p','p','l']
print(sequence_list)

sequence_list.append('e')
print(sequence_list)

sequence_list.remove('p')
print(sequence_list)

sequence_list.insert(2, 'x')
print(sequence_list)

sequence_list.pop(1)
print(sequence_list)

#notice that del is a statement, not a method: here it deletes the slice [1:3]
del sequence_list[1:3]
print(sequence_list)

['a', 'p', 'p', 'l']
['a', 'p', 'p', 'l', 'e']
['a', 'p', 'l', 'e']
['a', 'p', 'x', 'l', 'e']
['a', 'x', 'l', 'e']
['a', 'e']
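
Two details of these operations are easy to miss: `pop` hands back the element it removes, and `remove` raises a `ValueError` when the value is not in the list. A small sketch, using an illustrative list called `letters`:

```python
letters = ['a', 'p', 'p', 'l', 'e']

removed = letters.pop(1)   # pop returns the element it removes
print(removed, letters)

last = letters.pop()       # without an index, pop removes the last element
print(last, letters)

try:
    letters.remove('z')    # 'z' is not in the list, so this raises ValueError
except ValueError:
    print('z is not in the list')
```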


We can also sort a list, in ascending or descending order, or shuffle it. The output of `random.shuffle` differs from run to run.

In [3]:
text = ['This','is','an','example','of','words','in','list']
print(text)

text.sort(reverse=True)
print(text)

text.sort()
print(text)

import random # import the random library
random.shuffle(text)
print(text)

['This', 'is', 'an', 'example', 'of', 'words', 'in', 'list']
['words', 'of', 'list', 'is', 'in', 'example', 'an', 'This']
['This', 'an', 'example', 'in', 'is', 'list', 'of', 'words']
['in', 'This', 'example', 'is', 'list', 'of', 'an', 'words']
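
In the sorted output above, `'This'` comes before `'an'` because uppercase letters sort before lowercase ones. The sketch below shows a few related tools that may be useful here: `sorted()` returns a new list instead of changing the original, `key=str.lower` gives a case-insensitive order, and `reverse()` flips the current order in place. The variable names are only illustrative.

```python
text = ['This', 'is', 'an', 'example', 'of', 'words', 'in', 'list']

ordered = sorted(text)                          # new list, text is unchanged
case_insensitive = sorted(text, key=str.lower)  # compare lowercase versions
print(ordered)
print(case_insensitive)

text.reverse()                                  # reverses the list in place
print(text)
```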


## Preprocess Text to List

Now that we understand how a `list` works, we can try to preprocess our text into a `list`.

In [4]:
text = '''
If the weather's nice, 
we'll meet at 13?
'''

temp_text = text.split(' ')
print(temp_text)

#replacing characters
text = text.replace("'s"," is").replace("'ll"," will")

#removing numbers
remove_digits = str.maketrans('', '','0123456789')
text = text.translate(remove_digits)

#replacing line breaks, tabs, and curly quotes
text = text.replace('\n',' ').replace('\t',' ').replace('“', ' " ').replace('”', ' " ')

#keep punctuation, padded with spaces so each mark becomes its own item
for punctuation in ['.','-',',','!','?','(','—',')']:
    text = text.replace(punctuation, ' {0} '.format(punctuation))

#split it!
text_list = text.split(' ')
print(text_list)


#get rid of blanks
text_list= [word for word in text_list if word != '']
print(text_list)

['\nIf', 'the', "weather's", 'nice,', "\nwe'll", 'meet', 'at', '13?\n']
['', 'If', 'the', 'weather', 'is', 'nice', ',', '', '', 'we', 'will', 'meet', 'at', '', '?', '', '']
['If', 'the', 'weather', 'is', 'nice', ',', 'we', 'will', 'meet', 'at', '?']
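
As an alternative to chaining `replace` and `split`, the same kind of result can be sketched with a regular expression. This is not the method used above, only a possible shortcut: it assumes that words consist of the letters A to Z only, that every other non-space symbol is punctuation, and that digits should be dropped.

```python
import re

text = '''
If the weather's nice, 
we'll meet at 13?
'''

# expand the contractions first, as in the cell above
text = text.replace("'s", " is").replace("'ll", " will")

# a token is either a run of letters or a single punctuation character;
# digits and whitespace match neither pattern, so they are skipped
text_list = re.findall(r"[A-Za-z]+|[^\w\s]", text)
print(text_list)
```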


# Dictionary

`Dictionary` is a built-in Python data type that stores key-value pairs.

<img src="https://imgur.com/kM3nFy0.png" width="500"> An example of a dictionary

It works similarly to a `Structure` in lower-level programming languages, or to a simpler version of an `Object`.

The core concept of a dictionary is the `key`. It plays the same role as an `index` in a `List`, but it is not limited to `int`: a key can also be, for example, a string.

Just remember that every `key` comes with a `value`.

To initialize a dictionary, we use **{}**. But to access it, we use **[]**.

In [5]:
dic = {'apple':10,'milk':20,'egg':100}
print(dic['apple'])

10
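
The same **[]** syntax is also used to add, update, and delete entries. A minimal sketch; the key `'bread'` and the new numbers are made up for illustration.

```python
dic = {'apple': 10, 'milk': 20, 'egg': 100}

dic['bread'] = 35   # a new key adds a new entry
dic['apple'] = 12   # an existing key overwrites its value
del dic['egg']      # del removes the key together with its value

print(dic)
print(len(dic))     # number of key-value pairs
```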


Not only the key can be a `string`; the value can be one as well.

In [6]:
dic = {'apple':'ten','milk':20,'egg':'unknown'}
print(dic['apple'])

ten
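
Keys and values are not limited to strings and numbers. As a rough sketch: a key can be any immutable value such as an `int`, a string, or a tuple, while a value can be anything, including a list. A list cannot be used as a key. The dictionary `mixed` below is only an illustration.

```python
mixed = {1: 'one', 'two': 2, (3, 4): [3, 4]}

print(mixed[1])
print(mixed['two'])
print(mixed[(3, 4)])

try:
    bad = {[1, 2]: 'a list as key'}   # lists are mutable, so this fails
except TypeError:
    print('a list cannot be used as a key')
```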


### Loops in Dictionary

We can also loop over a dictionary. The method `items()` gives us each key together with its value.

In [7]:
dic = {'a':100,'b':30,'c':'unknown'}
for index,value in dic.items():
    print (index,value)

a 100
b 30
c unknown


### Keys 

To access the keys of a dictionary, we use the method `keys()`.

**Keep in mind that looping directly over a dictionary only yields its keys.**

In [8]:
dic = {'a':100,'b':30,'c':'unknown'}
print(dic.keys())

for c in dic:
    print(c)


dict_keys(['a', 'b', 'c'])
a
b
c


Now that we know how to use `keys()`, we can also loop over `keys()`.

This allows us to go through the dictionary and access each value by its key.

In [9]:
dic = {'a':100,'b':30,'c':'unknown', 'd':'apple'}

for key in dic.keys():
    print(dic[key])

100
30
unknown
apple


We can also check whether a `key` is in our dictionary with `in` or `not in`.

In [10]:
dic = {'a':100,'b':30,'c':'unknown', 'd':'apple'}

if 'e' not in dic:
    print('there is no e')

there is no e
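
Checking first matters because reading a missing key with **[]** raises a `KeyError`. One common alternative is the `get` method, which returns `None` or a default of our choice instead. A small sketch with illustrative variable names:

```python
dic = {'a': 100, 'b': 30, 'c': 'unknown', 'd': 'apple'}

value_a = dic.get('a')           # the key exists, so we get its value
value_e = dic.get('e')           # missing key: get returns None
value_default = dic.get('e', 0)  # missing key: get returns the default 0
print(value_a, value_e, value_default)

try:
    dic['e']                     # [] on a missing key raises KeyError
except KeyError:
    print('there is no e')
```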


### Calculating the probability of a character

In [11]:
text = '''Rose is a rose is a rose is a rose'''

dic = {}
for c in text:
    if c not in dic.keys():
        dic[c] =1
    else:
        dic[c] +=1
        
for index,value in dic.items():
    print (index,value,value/len(text))
    
print()
    
#we could format the number as percentage with the following line
print("{:.2%}".format(0.001))

#so a more readable way to print the result is

print()
for index,value in dic.items():
    print (index,value,"{:.2%}".format(value/len(text)))
    

R 1 0.029411764705882353
o 4 0.11764705882352941
s 7 0.20588235294117646
e 4 0.11764705882352941
  9 0.2647058823529412
i 3 0.08823529411764706
a 3 0.08823529411764706
r 3 0.08823529411764706

0.10%

R 1 2.94%
o 4 11.76%
s 7 20.59%
e 4 11.76%
  9 26.47%
i 3 8.82%
a 3 8.82%
r 3 8.82%
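
The counting loop above can also be sketched with `Counter` from the standard library's `collections` module, which builds the same kind of dictionary in one step. This is an alternative to the loop, not a replacement for understanding it.

```python
from collections import Counter

text = '''Rose is a rose is a rose is a rose'''

counts = Counter(text)   # maps each character to how often it appears

# most_common(2) lists the two most frequent characters with their counts
for char, count in counts.most_common(2):
    print(repr(char), count, "{:.2%}".format(count / len(text)))
```

`repr(char)` is used so that the space character is visible in the printed output.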


### Calculating the probability of a word

In [16]:
text = '''Rose is a rose is a rose is a rose'''


#split the text into a list of words so that we count words instead of characters
text = text.split(' ')

dic = {}
for c in text:
    if c not in dic.keys():
        dic[c] =1
    else:
        dic[c] +=1
        
for index,value in dic.items():
    print (index,value,"{:.2%}".format(value/len(text)))
    

Rose 1 10.00%
is 3 30.00%
a 3 30.00%
rose 3 30.00%
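
In the result above, `Rose` and `rose` are counted as two different words because string comparison is case-sensitive. If they should count as the same word, one option is to lowercase the text before splitting it. The sketch below assumes that is what we want; it also uses `get` with a default of 0 as a shorter form of the `if`/`else` counting. The names `words` and `counts` are only illustrative.

```python
text = '''Rose is a rose is a rose is a rose'''

words = text.lower().split()   # lowercase first, then split on whitespace

counts = {}
for word in words:
    # get returns 0 for a word we have not seen yet
    counts[word] = counts.get(word, 0) + 1

for word, count in counts.items():
    print(word, count, "{:.2%}".format(count / len(words)))
```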
