#Data Structures: String and Dictionary

##String

Strings are used to store text (non-numeric information). A string is one type of sequence in Python. In general, a sequence is a positionally ordered collection of other objects. A sequence maintains a left-to-right order among the items it contains. These items can be stored and fetched by their relative positions (i.e., index).

Strings in Python are surrounded by either single quotation marks or double quotation marks. 'hello' is the same as "hello".

Triple quotes (""" or ''') support multi-line strings. A triple-quoted string that is not assigned to a variable is often used as a block of comments.





In [39]:
'''This is a demo
    for multi-line
    comments among the source code'''

comments = '''This is a demo
    for multi-line
    comments as a string value'''

print(comments)

This is a demo
    for multi-line
    comments as a string value


####Strings are Arrays
Like many other popular programming languages, Python treats a string as an array-like sequence of characters. In Python 3, each item of a string is a Unicode character, not a byte.

However, Python does not have a character data type; a single character is simply a string with a length of 1.

Square brackets can be used to access elements of the string.

In [40]:
s = "Hello, World!"
print(s[1])

e


####Looping Through a String
Since strings are sequences, we can loop through the characters in a string with a for loop.

In [41]:
for a in "Python":
  print(a)

P
y
t
h
o
n


###Escape sequences (backslash + a character)

To insert characters that are illegal in a string, use an escape character. An escape character is a backslash \ followed by the character you want to insert. A common example of an illegal character is a double quote inside a string that is itself surrounded by double quotes (or a single quote inside a single-quoted string).
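
The quote case described above can be sketched as follows. The variable names and the sample sentence are made up for illustration only.

```python
# Escape the inner double quotes so they do not end the string early
quoted = "She said \"Python\" is fun."
print(quoted)

# Alternatively, use the other kind of quote on the outside; no escape is needed
same_text = 'She said "Python" is fun.'
print(quoted == same_text)  # True: both hold exactly the same characters
```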

Here is the list of commonly used escape characters.
```
Escape Character    Result
===================================
\'                  Single Quote
\\                  Backslash
\n                  New Line
\r                  Carriage Return
\t                  Tab
\b                  Backspace
\f                  Form Feed
\ooo                Octal value
\xhh                Hex value
```




In [None]:
a = "\'Python\'"
b = "Python\tPython"
c = "Python\nPython"
d = "Python\\Python"
print(a)
print(b)
print(c)
print(d)

'Python'
Python	Python
Python
Python
Python\Python


Another option is a raw string, written with an r prefix. In a raw string, Python keeps each backslash as is instead of treating it as the start of an escape sequence.

In [None]:
#path0 = "C:\user\test.py" #SyntaxError if uncommented: \u starts a Unicode escape and must be followed by four hex digits
path1 = "C:\\user\\test.py" #use escape character \\
path2 = r"C:\user\test.py" #use raw string to suppress escape characters.
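
As a small check of the two working forms above, the sketch below compares the escaped string with the raw string. The variable names are chosen here for illustration.

```python
escaped_path = "C:\\user\\test.py"   # each \\ stands for one backslash
raw_path = r"C:\user\test.py"        # raw string: backslashes are kept as written

print(escaped_path == raw_path)  # True: both hold the same characters
print(len("\n"))                 # 1: a single newline character
print(len(r"\n"))                # 2: a backslash followed by the letter n
```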

##String in action:
###String basic operations: concatenation; length; repetition

In [None]:
s = "I like " + "Python." #concatenation
print(len(s)) #length
print(s*2) #repetition

14
I like Python.I like Python.


###Using "in" to check if a string contains a substring

In [None]:
s = "I like " + "Python."
if "Python" in s:
    print("Yes, Python is found in the string.")

Yes, Python is found in the string.


###Using index to access one item or a slice of items in a string

In [None]:
s = "abcdefg"
print(s[0]) #access the first item
print(s[:]) #slicing from the first (inclusive) to the last item (inclusive).
print(s[0:-1]) #slicing from the first (inclusive) to the last item (exclusive).

a
abcdefg
abcdef


###We can use the third index as stride to slice a string.

In [None]:
s = "abcdefg"
print(s[::2]) #slicing the whole string with a stride 2
print(s[::-1]) #reversing the items in the string
print(s[6:1:-1]) #reversing a string partially, stop by the item with index 1 (exclusive)

aceg
gfedcba
gfedc


Please note that strings are immutable. You cannot change the value of any item in a given string.
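
A minimal sketch of what immutability means in practice: assigning to an index fails, so the usual approach is to build a new string and rebind the name.

```python
s = "abcdefg"

try:
    s[0] = "A"              # item assignment is not supported for strings
except TypeError as err:
    print("TypeError:", err)

# Build a new string from pieces of the old one and rebind the name
s = "A" + s[1:]
print(s)  # Abcdefg
```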

###There are many String methods provided for various text processing tasks as shown in the examples below.

In [None]:
s = "I love Java!"
s = s.replace('Java', 'Python') #replace all 'Java' with 'Python'
print(s)
location = s.find("Python")
print(location)

I love Python!
7


In [None]:
s = '---'.join(['Java','Python', 'C++'])
print(s)

Java---Python---C++


In [None]:
s1 = 'I love Python.'
wordList = s1.split()
print(wordList)

s2 = 'Java ok. Python ok. C++ ok'
result = s2.split('ok')
print(result)

['I', 'love', 'Python.']
['Java ', '. Python ', '. C++ ', '']


####The upper() method returns a new string in upper case. The original string is not changed, because strings are immutable.

In [42]:
a = "Hello, World!"
print(a.upper())
print(a)

HELLO, WORLD!
Hello, World!


##Dictionary

Dictionaries are used to store data values in **key:value** pairs.

A dictionary is a collection which is ordered*, changeable and does not allow duplicate keys. (*As of Python version 3.7, dictionary order is guaranteed to be insertion order. In Python 3.6 and earlier, this order is not guaranteed by the language. Please note the difference between "ordered" and "sorted": ordered means that items keep the order in which they were inserted, not that they are arranged by key or value.)

Dictionaries are written with curly brackets, and have keys and values. Dictionary items are ordered and mutable, and duplicate keys are not allowed.

Dictionary items are presented in key:value pairs, and can be referred to by using the key name.

In [9]:
stuRecord = {1:"Alice", 2:"Bob", 3:"Cindy", 4:"David"}
print(stuRecord)

{1: 'Alice', 2: 'Bob', 3: 'Cindy', 4: 'David'}
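
The difference between "ordered" and "sorted" can be illustrated with a short sketch. The dictionary `scores` and its contents are hypothetical example data.

```python
scores = {}                 # hypothetical example data
scores["Cindy"] = 95
scores["Alice"] = 100
scores["Bob"] = 85

print(list(scores))         # insertion order: ['Cindy', 'Alice', 'Bob']
print(sorted(scores))       # sorted keys:     ['Alice', 'Bob', 'Cindy']
```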


####Duplicates Not Allowed
If the same key appears more than once, the new value overrides the old one.


In [10]:
stuRecord = {1:"Alice", 2:"Bob", 2:"Cindy", 4:"David"} 
print(stuRecord)

{1: 'Alice', 2: 'Cindy', 4: 'David'}


####Mutable (or Changeable)

Dictionaries are changeable, meaning that we can change, add or remove items after the dictionary has been created. Please note, in dictionary, we use Key (not index) to access Value.



In [11]:
stuRecord = {1:"Alice", 2:"Bob", 3:"Cindy", 4:"David"}
stuRecord[1] = "Alex"
print(stuRecord)

{1: 'Alex', 2: 'Bob', 3: 'Cindy', 4: 'David'}


####Dictionary Length
To determine how many items a dictionary has, use the len() function:

In [12]:
print(len(stuRecord))

4


####Dictionary Items - Data Types
The values in dictionary items can be of any data type:

In [13]:
inventoryDict = {
  "Type": "Faculty",
  "Tenured": True,
  "Year": 2000,
  "Major": ["Computer Science", "Computer Networking", "Cybersecurity"]
}

####Python Collections (Arrays)
As a quick summary, there are four collection data types in the Python programming language:

* List is a collection which is ordered and changeable. Allows duplicate members.
* Tuple is a collection which is ordered and unchangeable. Allows duplicate members.
* Set is a collection which is unordered and unindexed. No duplicate members.
* Dictionary is a collection which is ordered (by insertion) and changeable. No duplicate members.

When choosing a collection type, it is useful to understand the properties of that type. Choosing the right type for a particular data set can preserve its meaning and can improve efficiency or security.

####Accessing items in a dictionary

In [15]:
stuRecord = {1:"Alice", 2:"Bob", 3:"Cindy", 4:"David"}
print(stuRecord[3])
print(stuRecord.get(3))

Cindy
Cindy
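
The two access styles behave differently when the key is missing. The sketch below uses key 5, which is not in the dictionary, and a fallback string "Unknown" chosen here for illustration.

```python
stuRecord = {1: "Alice", 2: "Bob", 3: "Cindy", 4: "David"}

print(stuRecord.get(5))              # None: key 5 is absent, no error is raised
print(stuRecord.get(5, "Unknown"))   # a fallback value can be supplied

try:
    print(stuRecord[5])              # square brackets raise KeyError instead
except KeyError as err:
    print("KeyError:", err)
```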


####Get keys

The keys() method returns a view object (dict_keys) containing all the keys in the dictionary. Pass it to list() if an actual list is needed.

In [17]:
stuRecord = {10:"Alice", 20:"Bob", 30:"Cindy", 40:"David"}
print(stuRecord.keys())

dict_keys([10, 20, 30, 40])
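
The object printed above is a view rather than a list. As a rough sketch of the difference, a view follows later changes to the dictionary, while a list made from it is a snapshot. The names `keys_view` and `key_list` are chosen here for illustration.

```python
stuRecord = {10: "Alice", 20: "Bob"}
keys_view = stuRecord.keys()     # a view onto the dictionary's keys
key_list = list(keys_view)       # a list snapshot taken now

stuRecord[30] = "Cindy"

print(keys_view)   # dict_keys([10, 20, 30]): the view reflects the new key
print(key_list)    # [10, 20]: the list does not change
```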


We can add more items in Key:Value pair.

In [18]:
stuRecord = {10:"Alice", 20:"Bob", 30:"Cindy", 40:"David"}
stuRecord[50]="Ellen"
print(stuRecord.keys())

dict_keys([10, 20, 30, 40, 50])


####Get Values
The values() method returns a view object (dict_values) containing all the values in the dictionary.

In [19]:
stuRecord = {10:"Alice", 20:"Bob", 30:"Cindy", 40:"David"}
print(stuRecord.values())

dict_values(['Alice', 'Bob', 'Cindy', 'David'])


####Get Items
The items() method returns a view object (dict_items) containing each item in the dictionary as a (key, value) tuple.

In [20]:
stuRecord = {10:"Alice", 20:"Bob", 30:"Cindy", 40:"David"}
print(stuRecord.items())

dict_items([(10, 'Alice'), (20, 'Bob'), (30, 'Cindy'), (40, 'David')])


####Check if Key Exists
To determine if a specified key is present in a dictionary use the in keyword:

In [21]:
stuRecord = {10:"Alice", 20:"Bob", 30:"Cindy", 40:"David"}
if 40 in stuRecord:
  print("Yes, '40' is one of the keys in this dictionary")

Yes, '40' is one of the keys in this dictionary


####Change Values
You can change the value of a specific item by referring to its key name:

In [22]:
stuRecord = {10:"Alice", 20:"Bob", 30:"Cindy", 40:"David"}
stuRecord[40] = "Derek"
print(stuRecord)


{10: 'Alice', 20: 'Bob', 30: 'Cindy', 40: 'Derek'}


####Update Dictionary
The update() method will update the dictionary with the items from the given argument.

The argument must be a dictionary, or an iterable object with key:value pairs.

In [23]:
stuRecord = {10:"Alice", 20:"Bob", 30:"Cindy", 40:"David"}
stuRecord.update({40:"Derek"})
print(stuRecord)

{10: 'Alice', 20: 'Bob', 30: 'Cindy', 40: 'Derek'}
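
update() also accepts an iterable of key:value pairs, as mentioned above. A minimal sketch using a list of tuples, with made-up replacement data:

```python
stuRecord = {10: "Alice", 20: "Bob"}

# A list of (key, value) tuples works as well as a dictionary:
# key 20 is overwritten and key 30 is added
stuRecord.update([(20, "Ben"), (30, "Cindy")])
print(stuRecord)  # {10: 'Alice', 20: 'Ben', 30: 'Cindy'}
```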


####Removing Items
There are several ways to remove items from a dictionary. The pop() method removes the item with the specified key:

In [24]:
stuRecord = {10:"Alice", 20:"Bob", 30:"Cindy", 40:"David"}
stuRecord.pop(40)
print(stuRecord)

{10: 'Alice', 20: 'Bob', 30: 'Cindy'}
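
pop() also hands back the value it removed, and it accepts an optional default for keys that may be absent. A short sketch, where the key 99 and the names `removed` and `missing` are used only for illustration:

```python
stuRecord = {10: "Alice", 20: "Bob", 30: "Cindy", 40: "David"}

removed = stuRecord.pop(40)          # returns the value that was removed
print(removed)                       # David

missing = stuRecord.pop(99, None)    # the default avoids a KeyError for an absent key
print(missing)                       # None
```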


The popitem() method removes the last inserted item (in versions before 3.7, a random item is removed instead):

In [25]:
stuRecord = {10:"Alice", 20:"Bob", 30:"Cindy", 40:"David"}
stuRecord.popitem()
print(stuRecord)

{10: 'Alice', 20: 'Bob', 30: 'Cindy'}


The del keyword removes the item with the specified key name:

In [26]:
stuRecord = {10:"Alice", 20:"Bob", 30:"Cindy", 40:"David"}
del stuRecord[40]
print(stuRecord)

{10: 'Alice', 20: 'Bob', 30: 'Cindy'}


The del keyword can also delete the dictionary completely:

In [27]:
stuRecord = {10:"Alice", 20:"Bob", 30:"Cindy", 40:"David"}
del stuRecord
print(stuRecord) #raises NameError because stuRecord no longer exists

NameError: name 'stuRecord' is not defined

The clear() method empties the dictionary:

In [28]:
stuRecord = {10:"Alice", 20:"Bob", 30:"Cindy", 40:"David"}
stuRecord.clear()
print(stuRecord)

{}


####Loop Through a Dictionary
You can loop through a dictionary by using a for loop.

When looping through a dictionary directly, the loop variable takes the keys of the dictionary, but there are methods to loop over the values as well.

In [33]:
stuRecord = {10:"Alice", 20:"Bob", 30:"Cindy", 40:"David"}

#Print all key names in the dictionary, one by one:
for k in stuRecord: 
    print(k)

#You can use the keys() method to return the keys of a dictionary:
for k in stuRecord.keys(): 
    print(k)

#Print all values based on their keys, one by one:
for k in stuRecord: 
    print(stuRecord[k])

#You can also use the values() method to return values of a dictionary:
for v in stuRecord.values():
    print(v)

#Loop through both keys and values, by using the items() method:
for k, v in stuRecord.items(): 
    print(k, v)


10
20
30
40
10
20
30
40
Alice
Bob
Cindy
David
Alice
Bob
Cindy
David
10 Alice
20 Bob
30 Cindy
40 David


####Copy a Dictionary
You cannot copy a dictionary simply by typing dict2 = dict1, because dict2 will only be another reference to the same dictionary, and changes made through dict1 will also be visible through dict2.
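
A minimal sketch of this aliasing behavior, with made-up data:

```python
dict1 = {10: "Alice", 20: "Bob"}
dict2 = dict1              # no copy is made; both names refer to one dictionary

dict1[20] = "Ben"
print(dict2)               # {10: 'Alice', 20: 'Ben'}: the change shows up here too
print(dict2 is dict1)      # True: the same object
```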

There are several ways to make a copy. One way is to use the built-in dictionary method copy().

In [34]:
stuRecord1 = {10:"Alice", 20:"Bob", 30:"Cindy", 40:"David"}
stuRecord2 = stuRecord1.copy()
stuRecord1[40] = "Derek"
print(stuRecord1)
print(stuRecord2)

{10: 'Alice', 20: 'Bob', 30: 'Cindy', 40: 'Derek'}
{10: 'Alice', 20: 'Bob', 30: 'Cindy', 40: 'David'}


Another way to make a copy is to use the built-in function dict().

In [35]:
stuRecord1 = {10:"Alice", 20:"Bob", 30:"Cindy", 40:"David"}
stuRecord2 = dict(stuRecord1)
stuRecord1[40] = "Derek"
print(stuRecord1)
print(stuRecord2)

{10: 'Alice', 20: 'Bob', 30: 'Cindy', 40: 'Derek'}
{10: 'Alice', 20: 'Bob', 30: 'Cindy', 40: 'David'}


####Nested Dictionaries
A dictionary can contain other dictionaries; these are called nested dictionaries.



In [37]:
gradeBook = {"Alice" : {"Quiz1":100, "Quiz2":90, "Quiz3":95}, 
             "Bob" : {"Quiz1":90, "Quiz2":80, "Quiz3":85}, 
             "Cindy" : {"Quiz1":100, "Quiz2":95, "Quiz3":95}, 
             "David" : {"Quiz1":100, "Quiz2":90, "Quiz3":100}}

print(gradeBook)

print(gradeBook["Alice"]["Quiz2"])

{'Alice': {'Quiz1': 100, 'Quiz2': 90, 'Quiz3': 95}, 'Bob': {'Quiz1': 90, 'Quiz2': 80, 'Quiz3': 85}, 'Cindy': {'Quiz1': 100, 'Quiz2': 95, 'Quiz3': 95}, 'David': {'Quiz1': 100, 'Quiz2': 90, 'Quiz3': 100}}
90
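
Nested dictionaries can be looped through with the same items() and values() methods shown earlier. The sketch below computes an average per student on a shortened copy of the grade book. The `averages` dictionary is introduced here for illustration.

```python
gradeBook = {"Alice": {"Quiz1": 100, "Quiz2": 90, "Quiz3": 95},
             "Bob": {"Quiz1": 90, "Quiz2": 80, "Quiz3": 85}}

averages = {}
for student, quizzes in gradeBook.items():
    # quizzes is the inner dictionary for this student
    averages[student] = sum(quizzes.values()) / len(quizzes)

print(averages)  # {'Alice': 95.0, 'Bob': 85.0}
```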




---



###The format method
Sometimes we may want to construct strings from other information. This is where the format() method is useful.

In [None]:
courseNum = 170
name = 'Alice'

print('{0} is taking IT {1}'.format(name, courseNum))
print('I am impressed to see how {0} is playing with Python now'.format(name))

Alice is taking IT 170
I am impressed to see how Alice is playing with Python now


##How It Works

str.format() is one of the string formatting methods in Python 3. It allows multiple substitutions and value formatting, and lets us insert values into a string through positional formatting.

Using a Single Formatter :
Formatters work by putting one or more replacement fields (placeholders), each defined by a pair of curly braces { }, into a string and calling str.format(). The values we wish to put into the placeholders are passed as arguments to the format() method.

A string can use certain specifications and subsequently, the format method can be called to substitute those specifications with corresponding arguments to the format method.

Observe the first usage, where {0} corresponds to the variable name, which is the first argument to the format method. Similarly, the second specification is {1}, corresponding to courseNum, which is the second argument to the format method. Note that Python starts counting from 0, which means that the first position is at index 0, the second position is at index 1, and so on.

Please note, if we don't use position index, Python will match {} in order to the values in the format method. 
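
Both styles can be compared side by side. This sketch reuses the variables from the example above. The second template string is made up to show that an index may appear in any order and more than once.

```python
name = 'Alice'
courseNum = 170

# Empty braces are matched to the arguments in order
auto = '{} is taking IT {}'.format(name, courseNum)

# Explicit indexes can be reordered and repeated
explicit = 'IT {1} has a student named {0}. {0} likes Python.'.format(name, courseNum)

print(auto)      # Alice is taking IT 170
print(explicit)  # IT 170 has a student named Alice. Alice likes Python.
```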

Syntax : "{}".format(value)

Parameters :
(value) : Can be an integer, floating point numeric constant, string, or even variables.

Return type : Returns a formatted string with the value passed as parameter in the placeholder position.

Using Multiple Formatters :
Multiple pairs of curly braces can be used while formatting the string. If another variable substitution is needed in the sentence, add a second pair of curly braces and pass a second value into the method. Python will replace the placeholders by values in order.

Syntax : "{} {}".format(value1, value2)

Parameters :
(value1, value2) : Can be integers, floating point numeric constants, strings, and even variables. The only difference is that format() must receive at least as many values as there are placeholders in the string; too few values raise an IndexError, while extra values are ignored.
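
A short sketch of what happens when the number of values and the number of placeholders do not match. The template string and the names `too_few` and `extra_ok` are made up for illustration.

```python
try:
    too_few = '{} scored {}'.format('Alice')   # two placeholders, only one value
except IndexError as err:
    too_few = None
    print("IndexError:", err)

# A surplus value is simply not used
extra_ok = '{} scored {}'.format('Alice', 95, 'unused')
print(extra_ok)  # Alice scored 95
```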

**Inside the placeholders you can add a formatting type to format the result:** (Optional Reading)

:<		Left aligns the result (within the available space)

:>		Right aligns the result (within the available space)

:^		Center aligns the result (within the available space)

:=		Places the sign to the left most position

:+		Use a plus sign to indicate if the result is positive or negative

:-		Use a minus sign for negative values only

: 		Use a space to insert an extra space before positive numbers (and a minus sign before negative numbers)

:,		Use a comma as a thousand separator

:_		Use an underscore as a thousand separator

:b		Binary format

:c		Converts the value into the corresponding unicode character

:d		Decimal format

:e		Scientific format, with a lower case e

:E		Scientific format, with an upper case E

:f		Fixed-point number format

:F		Fixed-point number format, in uppercase format (show inf and nan as INF and NAN)

:g		General format

:G		General format (using an upper case E for scientific notations)

:o		Octal format

:x		Hex format, lower case

:X		Hex format, upper case

:n		Number format

:%		Percentage format
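
A few of the formatting types above in action. Note that the width (8) and the precision (.2 and .1) used in this sketch are additional parts of the format specification that are not listed in the table; the square brackets are only there to make the padding visible.

```python
print('[{:<8}]'.format('left'))     # [left    ]
print('[{:>8}]'.format('right'))    # [   right]
print('[{:^8}]'.format('mid'))      # [  mid   ]
print('{:,}'.format(1234567))       # 1,234,567
print('{:.2f}'.format(3.14159))     # 3.14
print('{:b}'.format(10))            # 1010
print('{:x}'.format(255))           # ff
print('{:.1%}'.format(0.256))       # 25.6%
```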