## Strings




Previously, we explored how numbers are stored in memory. This was
achieved by converting a number from its representation in the decimal
system to its representation in the binary system.

There is no simple way to do the same for letters and symbols.
However, nothing prevents us from mapping letters to numbers and then
storing the number. The only extra step required is to tell the
program that this number is meant to be a letter. Python will then use
a lookup table to figure out which letter to print as output.
Obviously, this will only work if everyone agrees to use the same
lookup table...

We can use the `chr()` function to show the character associated with
a given number:



In [1]:
print(chr(65))

But how do we tell Python that a given value should be interpreted as
a character? Using `chr()` each time we want to see a character
would be tedious.

In our last module, we discovered that each variable has a type (int,
float, etc.). The type is determined when you assign a value to the
variable. E.g., if you type `a=12`, Python will assign an integer
type. If you type `a=12.1`, Python will assign a float type.

However, we can't simply write `a=b` to assign the letter `b` to the
variable `a` because `b` could also be the name of a variable. Most
programming languages thus use quotation marks to indicate that you
are assigning a character to a variable. Consider the following:



In [1]:
b = 12     # the value of the variable b is 12
print(b)
a = b      # the value of the variable a is 12
print(a)
a = "b"    # the value of the variable a is the letter b
print(a)
print("a") # print the character "a"

You can use either single or double quotation marks:



In [1]:
a = 'b'  # OK
print(a)
a = "b"  # OK
print(a)

However, the opening and closing marks must match, so this will not work:



In [1]:
a = "a'
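Having two kinds of quotation marks is handy when the text itself contains one of them. The following is a minimal sketch; the example sentences are made up for illustration:

```python
# Double quotes on the outside let us keep an apostrophe inside the string
title = "Earth's crust"
print(title)

# Single quotes on the outside let us keep double quotes inside the string
quote = 'She said "hello"'
print(quote)
```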

Lastly, we can use the `ord()` function to show which number belongs
to a given letter:



In [1]:
print(ord("a"))

### The ASCII table



ASCII stands for "American Standard Code for Information Interchange"
and defines a way to map characters to numbers (see the table below
for a short excerpt).

| Dec|Hex|Char|Description|
|---|---|---|---|
| 65|41|A|Capital A|
| 66|42|B|Capital B|
| 67|43|C|Capital C|
| 68|44|D|Capital D|
| 69|45|E|Capital E|
| 70|46|F|Capital F|
| 71|47|G|Capital G|
| 72|48|H|Capital H|
| 73|49|I|Capital I|

The complete table is available at [https://en.wikipedia.org/wiki/ASCII](https://en.wikipedia.org/wiki/ASCII).

The ASCII standard is one of the oldest and most widely accepted ways
to map characters to numbers. However, due to its age and country of
origin, it comes with considerable limitations. It was initially
designed as a 7-bit code, i.e., it only uses numbers between 0 and
127. This leaves no room for special characters, like umlauts, let
alone other alphabets. Later 8-bit extensions (numbers between 0 and
255) added some of these characters, but no single extension covers
them all.
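As a small sketch, we can rebuild the table excerpt above with `chr()` and check that `ord()` reverses the mapping. The last lines look at an umlaut, which is not part of the original ASCII table. Note that `str.isascii()` requires Python 3.7 or newer, and that the variable names are chosen for illustration only:

```python
# Rebuild the table excerpt: decimal value, hexadecimal value, character
for number in range(65, 74):
    print(number, format(number, "X"), chr(number))

# ord() and chr() undo each other
letter = chr(ord("A"))
print(letter)

# The umlaut "ü" (written here as an escape sequence) is not an ASCII character
umlaut = "\u00fc"
print(umlaut, ord(umlaut), umlaut.isascii())
```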

Only since the 1990s has a globally accepted mapping between numbers
and text been available
([https://en.wikipedia.org/wiki/Unicode](https://en.wikipedia.org/wiki/Unicode)), but even there, different
variants (encodings) exist, and not every operating system handles
them in the same way. This is one of the reasons why we will only use
letters that are defined in the original ASCII table.

Fun fact: The name of the artificial intelligence in the movie "2001:
A Space Odyssey" is HAL. Subtracting 1 from the ASCII value of each
letter in IBM yields exactly this name, although the film's creators
have described this as a coincidence.
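We can check this letter shift ourselves by combining `ord()` and `chr()`. This is just an illustrative sketch; the variable names are arbitrary:

```python
company = "IBM"
shifted = ""

# Shift each letter back by one position in the ASCII table
for letter in company:
    shifted = shifted + chr(ord(letter) - 1)

print(shifted)  # prints HAL
```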



### Strings



Working with single letters is not convenient. Thus, virtually every
programming language supports sequences of letters, which are called
strings. We can think of strings simply as a special type of list. It
should, therefore, be no problem for you to print, e.g., the 3rd
letter of this string.



In [1]:
a = "This is an example of a string"
print(a)
# now print the 3rd letter of this string

Unlike lists, strings are immutable, i.e., you cannot modify their elements:



In [1]:
a = "Joe"
a = "Jessie" # this is ok
a[2] = "x"   # this will not work
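If you do need a modified version of a string, one common approach is to build a new string from pieces of the old one. The sketch below first shows the error Python raises on item assignment (caught here so the example keeps running) and then uses range (slice) operations to assemble the new string:

```python
name = "Joe"

try:
    name[2] = "x"  # strings do not support item assignment
except TypeError as error:
    print("Not allowed:", error)

# Build a new string from slices of the old one instead
new_name = name[:2] + "x" + name[3:]
print(new_name)
print(name)  # the original string is unchanged
```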

Similarly to lists, you can work with ranges and query a
string object to see which methods it provides. We can do this with the `dir()` function as follows:



In [1]:
a = "Joe"
dir(a)

Remember that any names that start with an underscore are best
ignored, unless you want to change the internals of the Python
language.
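If the long output of `dir()` is hard to read, we can filter out the names that start with an underscore. This is only a convenience sketch, and `public_names` is a name made up for this example:

```python
a = "Joe"

# Collect only the names that do not start with an underscore
public_names = []
for name in dir(a):
    if not name.startswith("_"):
        public_names.append(name)

print(public_names)
```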

You can get further information on these methods with the `help()`
function by combining the variable name with the method name, e.g.,



In [1]:
a = "Joe"
help(a.upper)

## Other Data Types



Python knows various data types, many of which we may never
use. However, the most important ones should at least be mentioned:



### Vectors versus Lists



If you have used MATLAB before, you will be familiar with
vectors. Unlike MATLAB, Python does not have a native vector type.
Lists can contain almost anything (e.g., other lists, strings, and any
of the types below). Vectors, however, can only contain numbers and
support mathematical operations (e.g., adding two vectors element by
element, or computing their dot or cross product). We will learn how
to use vectors in a later module.
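To see why a list is not a vector, consider what the `+` and `*` operators do with lists. The sketch below uses two made-up lists and shows that element-wise addition has to be written out by hand:

```python
a = [1, 2, 3]
b = [4, 5, 6]

print(a + b)  # joins the lists: [1, 2, 3, 4, 5, 6]
print(a * 2)  # repeats the list: [1, 2, 3, 1, 2, 3]

# Element-wise addition has to be spelled out explicitly
total = []
for i in range(len(a)):
    total.append(a[i] + b[i])
print(total)  # [5, 7, 9]
```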



### Tuples




Python tuples are a special version of a list whose elements cannot be
modified. We define a tuple by enclosing the elements in parentheses
(round brackets) rather than square brackets.



In [1]:
my_list = [1, 2, 4]   # a regular list
my_tuple = (1, 2, 3)  # a tuple

We access elements of a tuple via index or range operations, similar to
a regular list:



In [1]:
print(my_tuple[1]) # note the square brackets for the index expression!

However, unlike lists, **you cannot change the values in a tuple!**



In [1]:
my_tuple[1] = 3  # this will not work (TypeError)

While you cannot change the value of a tuple, it is possible to join two tuples to create a **new** tuple.



In [1]:
my_first_tuple = (1, 2, 3)
my_second_tuple = (3, 8, 7)
new_tuple = my_first_tuple + my_second_tuple
print(new_tuple)

# or replace an existing tuple
my_first_tuple = my_first_tuple + my_second_tuple
print(my_first_tuple)

Tuples are an immutable type (similar to strings). So you can create
them, you can delete them, and you can re-assign them. But you cannot
change the values of the tuple elements.
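The difference between re-assigning a tuple and changing it can be made visible with a second variable name. In this sketch, `original` is a helper name introduced for illustration, and the `is` operator checks whether two names refer to the very same object:

```python
my_first_tuple = (1, 2, 3)
my_second_tuple = (3, 8, 7)

original = my_first_tuple  # a second name for the same tuple
my_first_tuple = my_first_tuple + my_second_tuple

print(my_first_tuple)              # the name now refers to a new tuple
print(original)                    # the old tuple still holds (1, 2, 3)
print(original is my_first_tuple)  # False: these are two different objects
```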



### Sets




Sets are declared with curly braces. Unlike lists, you can only add unique
elements. If there are duplicates, they will simply be ignored.



In [1]:
my_set = {1, 2, 3, 3}
print(my_set)

Another critical difference is that sets are not indexed, i.e., you
cannot write `my_set[1]`. Set elements can be added and removed, but
not changed in place. Furthermore, sets provide all sorts of useful
operations (e.g., union, difference, intersection, and subset tests).
A practical example of how to use sets would be the following case:
You have the list of students enrolled in ESS224H1 Introduction To
Mineralogy And Petrology, and the list of students who are enrolled in
ESS262H1 Earth System Processes. If both are sets, you can use a
single command to find out which students are enrolled in both courses:



In [1]:
ESS224 = {"Maria", "Stuart", "Andy", "Drew"}
ESS262 = {"James", "Mark", "Alex", "Maria", "Silvy", "Stuart"}
ESS224.intersection(ESS262)

Or, imagine that you have two e-mail lists, and you want to join both
lists while making sure that nobody receives the e-mail twice:



In [1]:
ESS224 = {"Maria", "Stuart", "Andy", "Drew"}
ESS262 = {"James", "Mark", "Alex", "Maria", "Silvy", "Stuart"}
print(ESS224.union(ESS262))
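The other set operations follow the same pattern. The sketch below reuses the two course sets; the set `group` and the student "Lena" are made up for this example:

```python
ESS224 = {"Maria", "Stuart", "Andy", "Drew"}
ESS262 = {"James", "Mark", "Alex", "Maria", "Silvy", "Stuart"}

# Students who take ESS224 but not ESS262
only_224 = ESS224.difference(ESS262)
print(only_224)

# Is every member of a small group enrolled in ESS224?
group = {"Andy", "Drew"}
group_enrolled = group.issubset(ESS224)
print(group_enrolled)

# Add and remove elements; adding a duplicate has no effect
ESS224.add("Lena")
ESS224.add("Maria")
ESS224.remove("Drew")
print(len(ESS224))
```

Since sets are unordered, the order in which the names are printed may differ from the order in which they were entered.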

### Dictionaries




Dictionaries are a data type that enables you to look up values based
on a key rather than an index. Each dictionary entry consists of a
key-value pair. The key can be a number, a string, or a tuple, and the
value can be pretty much anything (e.g., a number, a string, a list,
another dictionary, etc.). Let's consider this simple example:



In [1]:
in_class_quiz = {"Domenica" : 72,
                 "Brian" : 77,
                 "George" : 95,
                 "Liz" : 81}

Note that dictionaries use curly braces similar to sets, but
the internal structure is different.

We can query the dictionary to see how individual students performed
in the quiz. However, rather than referring to the result by
a numeric index, we use the key:



In [1]:
print(in_class_quiz["Brian"])

Note that keys need to be unique; otherwise, a later entry will
overwrite the earlier one.



In [1]:
in_class_quiz = {"Domenica" : 72,
                 "Brian" : 77,
                 "George" : 95,
                 "Brian" : 50,  # Brian appears twice, so this entry overwrites the earlier one
                 "Liz" : 81}
print(in_class_quiz["Brian"])

Similar to lists, we can change/update individual values by referring to
their key.



In [1]:
in_class_quiz["Brian"] = 70
print(in_class_quiz["Brian"])
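A few more dictionary operations are sketched below: adding an entry, testing whether a key exists, and looping over all key-value pairs. The student "Ana" and her score are invented for this example:

```python
in_class_quiz = {"Domenica": 72, "Brian": 77, "George": 95, "Liz": 81}

# Assigning to a key that does not exist yet adds a new entry
in_class_quiz["Ana"] = 88

# Check whether a key exists before looking it up
print("Liz" in in_class_quiz)
print("Tom" in in_class_quiz)

# Loop over all key-value pairs
for student, score in in_class_quiz.items():
    print(student, score)
```

Checking with `in` first avoids the `KeyError` that Python raises when you look up a key that is not in the dictionary.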

### Random bits and pieces



All of these data types are ultimately ways of storing collections of
elements. One of the most often used operations on a collection is
determining the number of elements it contains. The Python function
`len()` will do exactly this:



In [1]:
my_string = "Test"
my_list = [1, 2, 3]
my_set = {1, 2}          # use distinct names so that no variable is overwritten
print(len(my_string))
print(len(my_list))
print(len(my_set))

You may have stumbled upon this before, but it is possible to convert
many of these data types into one another:



In [1]:
t = (1, 2, 3, 3)
l = list(t)
s = set(l)
print(t)
print(l)
print(s)
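The same conversion functions also work on strings, and they can be chained. The following sketch uses made-up values and relies only on the built-in functions `list()`, `set()`, `tuple()`, and `sorted()`:

```python
word = "Test"
letters = list(word)        # split the string into single characters
print(letters)

unique_letters = set(word)  # "T" and "t" count as different characters
print(len(unique_letters))

scores = [72, 77, 95, 77]
unique_scores = sorted(set(scores))  # drop duplicates, then sort into a list
print(unique_scores)

as_tuple = tuple(unique_scores)
print(as_tuple)
```

Converting to a set discards both the duplicates and the original order, which is why `sorted()` is used here to get a predictable result.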