# Sequences

[Strings](#strings)
  
- [Basics](#string-basics)

- [Indexing and slicing](#indexing-slicing)

- [String methods](#string-methods)

- [Conversion and formatting](#conversion-formatting)

[Lists](#lists)

- [Basics](#lists-basics)

- [List methods](#list-methods)

- [Sorting](#4.2.3)

[Dictionaries](#dictionaries)

- [Basics](#dictionary-basics)

- [Setting and retrieving values](#dict-use)

## Strings
<a id='strings'></a>

### Basics
<a id='string-basics'></a>

The string type in Python represents text as a sequence of Unicode characters. String literals are enclosed in single or double quotes. The two are freely interchangeable, although some people prefer single quotes.

In [None]:
"penguin" == 'penguin'

In [None]:
'ű'

Strings are immutable types:

In [None]:
s1 = "a penguin and a giraffe"
s2 = s1

In [None]:
id(s1)

In [None]:
id(s2)

In [None]:
s1 += ' '; id(s1)

In [None]:
s1

In [None]:
s2
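As a minimal sketch of what immutability means in practice (the names `word` and `plural` are illustrative and not part of the cells above): item assignment on a string raises a `TypeError`, so "changing" a string always means building a new one.

```python
word = "penguin"

try:
    word[0] = "P"          # strings do not support item assignment
    assignment_failed = False
except TypeError:
    assignment_failed = True

plural = word + "s"        # concatenation builds a new string object

print(assignment_failed)   # True
print(plural)              # penguins
print(word)                # penguin (the original is unchanged)
```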

Some control characters are represented in literals with escape sequences (prefixed with a backslash "\"):

In [None]:
"\n" # newline
"\t" # tab
"\r" # CR
'\'' # single quote
"\"" # double quote
"\\" # backslash
# "\ooo" # character with octal value ooo
# "\xhh" # character with hex value hh

print("This line \t has tabs \t and a newline\n....")

### Indexing and slicing
<a id='indexing-slicing'></a>

Individual characters in a string can be accessed via __indexing__. Indexing in Python is zero-based, i.e. numbering starts at zero:

In [None]:
s = "penguin"

In [None]:
s[0]                 # element at index 0 (the first character)

In [None]:
s[3]                # element at index 3

In [None]:
s[-2]        # counts from the end of the string

Negative indices count characters from the back of the string:

In [None]:
s[-1]

In [None]:
s[-3]

It is also possible to refer to a range of characters; this is known as slicing. Note that the second number is the index of the first element that is _not_ included. This takes getting used to, but has the advantage that the length of a slice i:j is j-i, as long as both indices lie within the string and i is not larger than j.

In [None]:
print(s);s[1:3]

In [None]:
s[2:-1]

If either number is omitted from the slicing syntax, it refers to the beginning or end of the string:

In [None]:
s[:3]

In [None]:
s[3:]

In [None]:
s[:-3]

In [None]:
s[-3:]

In a slice, an index that is larger than the length of the string is treated as equal to the length. (Plain indexing with such an index raises an IndexError instead.)

In [None]:
s[:100]

In [None]:
IP="192.168.52.101"
print(IP[0:3]); print(IP[4:7]); print(IP[8:11])

In other words, $[i:j]$ really means: "from the i-th element to the last element before the j-th, or to the end of the string".

It is also possible to include only every n-th character in a slice. This is achieved via a second colon and a third number, the step:

In [None]:
s[1:6:2]

In [None]:
s[::2]
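The step may also be negative, in which case the string is traversed backwards. The following sketch (variable names are illustrative) puts the three-number slice syntax side by side with the length rule stated earlier:

```python
s = "penguin"

every_second = s[::2]   # indices 0, 2, 4, 6
middle = s[1:6:2]       # indices 1, 3, 5
reversed_s = s[::-1]    # a step of -1 walks the string backwards

print(every_second)     # pnun
print(middle)           # egi
print(reversed_s)       # niugnep

# Without a step, the length of s[i:j] is j - i for in-range indices.
print(len(s[1:4]))      # 3
```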

### String methods
<a id='string-methods'></a>

String objects offer a wide variety of built-in methods for string manipulation. We will now list a few of them.

The __replace__ method of strings will replace all occurrences of some substring in a string with another:

In [None]:
s = "This\ttext\tcontains\ttabs\tinstead\tof\tspaces"

In [None]:
print(s)

In [None]:
s.replace('\t', ' ')

The _replace_ method can also be used to delete all occurrences of some character:

In [None]:
s.replace('\t', '')

<hr />

The __strip__ method removes all occurrences of the given characters from both ends of a string, or all whitespace if no argument is given:

In [306]:
s = " Here's a text with whitespace at each end\n"

In [307]:
s.strip()

"Here's a text with whitespace at each end"

In [None]:
s = "*** This text has stars at each end ***"

In [None]:
s.strip("*")

We can pass multiple characters to _strip_ in the form of a string, and it will remove every occurrence of each of those characters from both ends:

In [None]:
s.strip("* ")

<hr />

The __split__ method splits the string on all whitespace if no arguments are passed, or on a specific string, and returns a _list_ of the resulting substrings:

In [296]:
s = "This line\tcontains both\t\tspaces and tabs\n"
ss = r"This line\tcontains both\t\tspaces and tabs\n"

In [302]:
print(s)

This line	contains both		spaces and tabs



In [308]:
s.split()

['This', 'line', 'contains', 'both', 'spaces', 'and', 'tabs']

In [309]:
s.split('\t')

['This line', 'contains both', '', 'spaces and tabs\n']

In [310]:
s.split('\t\t')

['This line\tcontains both', 'spaces and tabs\n']

As we have seen earlier, the _split()_ method is limited and sometimes using _re.split()_ is a better option.
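One such limitation, sketched here with the same string as above: with an explicit separator, _split_ produces an empty string between two consecutive tabs, whereas a regular expression can treat a run of tabs as a single separator. The pattern `\t+` is just one possible choice.

```python
import re

s = "This line\tcontains both\t\tspaces and tabs\n"

# str.split with an explicit separator yields an empty string
# between the two consecutive tabs.
by_tab = s.split("\t")

# re.split with the pattern \t+ treats a run of tabs as one separator;
# strip() removes the trailing newline first.
by_tab_runs = re.split(r"\t+", s.strip())

print(by_tab)       # ['This line', 'contains both', '', 'spaces and tabs\n']
print(by_tab_runs)  # ['This line', 'contains both', 'spaces and tabs']
```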

<hr />

The inverse operation of split, __join__, is also a string method:

In [299]:
l = ["Here's", "a", "list", "of", "words", "waiting", "to", "be", "joined"]

In [300]:
" ".join(l)

"Here's a list of words waiting to be joined"

In [301]:
"_".join(l)

"Here's_a_list_of_words_waiting_to_be_joined"

<hr />

The methods __lower__ and __upper__ return lowercased and uppercased versions of a string:

In [None]:
"Penguin".lower()

In [None]:
"Penguin".upper()

The __title__ method capitalizes the first letter of each word and lowercases the rest:

In [None]:
"penguin".title()

These methods leave a string unchanged if it is already uppercased/lowercased/titlecased:

In [None]:
"PENGUIN".upper()

There are also functions to check for these properties of strings:

In [None]:
"PENGUIN".isupper()

In [None]:
"Penguin".isupper()

In [None]:
"Penguin".istitle()

In [None]:
"Penguin".islower()

<hr />

Some more boolean functions on strings include __isalpha__ and __isalnum__, which check whether a string contains alphabetical characters only, or alphanumeric characters only:

In [None]:
"penguin".isalpha()

In [None]:
"penguin221".isalpha()

In [None]:
"penguin".isalnum()    # is it alphanumeric?

In [None]:
"penguin221".isalnum()    # is it alphanumeric?

To find out if some string is contained in another, the keyword __in__ may be used:

In [None]:
"gui" in "penguin"

In [None]:
"gi" in "penguin"

The methods __startswith__ and __endswith__ check only the beginning and the end of a string, respectively:

In [None]:
"penguin".startswith("pen")

In [None]:
"penguin".endswith("in")

These are more convenient than indexing, since strings that are too short will not raise errors:

In [None]:
s = "penguin"
t = ""

In [None]:
s[-1] == "n"

In [None]:
t[-1] == "n"   # the string is empty, so this raises an IndexError

In [None]:
t.endswith("n")
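Both methods also accept a tuple of candidates, which makes them handy for filtering. A small sketch, using a made-up list of file names that includes an empty string:

```python
# Hypothetical file names; the empty string would break name[-1] indexing.
filenames = ["notes.txt", "photo.png", "data.csv", ""]

# endswith accepts a tuple of suffixes and is safe on empty strings.
text_like = [name for name in filenames if name.endswith((".txt", ".csv"))]

print(text_like)  # ['notes.txt', 'data.csv']
```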

### Conversion and formatting
<a id='conversion-formatting'></a>

Most Python types can be converted to strings without any problems:

In [None]:
str(3)

In [None]:
str([1, 2])

In [None]:
str(True)

Together with string concatenation, this already provides a straightforward way of printing any value:

In [None]:
i = 3
print("The value of i is " + str(i) + ".")

A much more convenient way to achieve this is __string formatting__. The format method of a string will substitute arguments of arbitrary type into the string, provided that the string conforms to the syntax of __format strings__.

In [None]:
name = "John"
age = 8

In [None]:
print("{0} is {1} years old".format(name, age))

Format instructions in a string are enclosed in curly braces. Numbers indicate the position of the argument that is to be substituted at the given position in the string. Compare:

In [None]:
print("{1} is {0} years old".format(name, age))

If numbers are omitted, variables will be substituted in left-to-right order:

In [None]:
print("{} is {} years old".format(name, age))

Format strings may also refer to named (keyword) arguments by their names:

In [None]:
print("{name} is {age} years old".format(name="John", age=8))

The most useful thing about format strings is that they allow us to specify the way in which non-string types are formatted. This is achieved by writing a colon (:) after the number or name of the variable in the format string, and then providing a __format specification__.

Format specifications can control many different properties of how a string is displayed, including alignment, number of decimal places, and a choice of number formats. All options are documented in the official Python manual under the section "[Format Specification Mini-language](https://docs.python.org/3.7/library/string.html#formatspec)". Below we show only a few examples.

This example instructs the _format_ method to print a float with exactly 4 decimal places:

In [None]:
import math

In [None]:
print("The value of Pi is {0:.4f}".format(math.pi))

The $.4$ part of the above example configures precision, the $f$ following it sets the number type to __fixed point__. The number is correctly rounded to the specified number of decimal digits.

Another example treats the value as a percentage (multiplying it by 100 and printing a % sign):

In [None]:
print("The current annual interest rate is {0:.2%}".format(0.0525))

The last two examples show format strings that cause a string to be centered or right-aligned (a display width must be specified):

In [None]:
print('The text below is\n{0:^30}'.format('centered'))

In [None]:
print('The text below is\n{0:>30}'.format('right-aligned'))
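Alignment, width, and precision can be combined in a single specification. The sketch below lines up a small table; the rows are made-up illustrative data, not values from this notebook.

```python
# Illustrative data: (name, age, some rate between 0 and 1).
rows = [("John", 8, 0.5), ("Mary", 11, 0.125), ("Susan", 15, 0.0525)]

# {0:<8}     left-align in a field of width 8
# {1:>3}     right-align in a field of width 3
# {2:>8.2%}  percentage with 2 decimals, right-aligned in width 8
lines = ["{0:<8}|{1:>3}|{2:>8.2%}".format(name, age, rate)
         for name, age, rate in rows]

for line in lines:
    print(line)
```

Each line has the same total width, so the columns line up when printed in a monospaced font.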

## Lists
<a id='lists'></a>

### Basics
<a id='lists-basics'></a>

We've seen earlier that lists can hold elements of arbitrary types, and that they are mutable, i.e. their elements may be changed without creating a new list object.

Indexing and slicing work for lists the same way they do for strings:

In [None]:
l = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l']

In [None]:
l[3]

In [None]:
l[2:5]

In [None]:
l[-2:]

In [None]:
l[:4]

Since lists are mutable, indexing and slicing can also be used to change one or several elements in the list:

In [None]:
l[1] = 'x'

In [None]:
l

In [None]:
l[2:4] = ['y', 'z']

In [None]:
l
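Mutability has a consequence worth seeing once, in contrast to the string example at the beginning of this notebook: two names bound to the same list observe each other's changes. A minimal sketch with illustrative names `l1`, `l2`, `l3`:

```python
l1 = ['a', 'b', 'c']
l2 = l1            # both names refer to the same list object
l3 = l1[:]         # a full slice creates a (shallow) copy

l1[0] = 'x'        # in-place change through the name l1

print(l2)                  # ['x', 'b', 'c']  (changed together with l1)
print(l3)                  # ['a', 'b', 'c']  (independent copy)
print(l1 is l2, l1 is l3)  # True False
```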

### List methods
<a id='list-methods'></a>

The __append__ method can be used to add a new element to the end of a list:

In [None]:
l.append("m")

In [None]:
l

The __extend__ method adds all elements from another list to the end of a list:

In [None]:
m = ['n', 'o', 'p', 'q']

In [None]:
l.extend(m)

In [None]:
l

The addition operation (+) creates a new list using all elements of the first list, then all elements of the second list:

In [None]:
l + m

The __insert__ method of lists can be used to insert an element at a given position:

In [None]:
l.insert(1, 'b')

In [None]:
l

The __remove__ method removes the first occurrence of a given element from a list (an error is raised if the element is not in the list at all):

In [None]:
l.remove('x')

In [None]:
l

The __pop__ method will return a list element at a given position and also remove it from the list. If no index is specified, the last element is removed:

In [None]:
l.pop()    # removes and returns the last element

In [None]:
l

In [None]:
l.pop(3)

In [None]:
l

Alternatively, the __del__ statement will remove an element based on its index, but not return it:

In [None]:
del l[2]

In [None]:
l

This also works on entire slices of a list:

In [None]:
del l[2:5]

In [None]:
l

The __index__ method will return the position in the list where a given element occurs for the first time. An error is raised if the element is not in the list.

In [None]:
l.index('b')

In [None]:
l.index(9)

The __count__ method of lists returns the number of times a given element occurs in a list:

In [None]:
l.count('m')

### Sorting
<a id='4.2.3'></a>

The __sort__ method will sort the list __in place__:

In [None]:
l = [ "f", "a", "z", "l", "b", "e"]
l.sort()

In [None]:
l

Its counterpart, the built-in function __sorted__, returns a new list object; the original unsorted list remains unaffected:

In [None]:
l = [3,8,2,7,3,1,7,13,1]

In [None]:
sorted(l)

In [None]:
l

NOTE: both _sort_ and _sorted_ accept a _key_ parameter: a function that is applied to each element to compute the value used for comparison:

In [None]:
sorted(l, key=lambda x: -x)

In [None]:
sorted(l, key=lambda x: x%7)
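The _key_ function does not have to be a lambda. The following sketch, using an illustrative list of words, sorts by length with the built-in `len`, reverses the order with the `reverse` parameter, and breaks ties alphabetically by returning a tuple:

```python
words = ["giraffe", "ox", "penguin", "cat", "bee"]

# Sorting is stable: 'cat' stays before 'bee' because they have the
# same length and 'cat' comes first in the input.
by_length = sorted(words, key=len)

# reverse=True sorts in descending order (ties keep their input order).
by_length_desc = sorted(words, key=len, reverse=True)

# A tuple key sorts by length first, then alphabetically.
by_length_then_alpha = sorted(words, key=lambda w: (len(w), w))

print(by_length)             # ['ox', 'cat', 'bee', 'giraffe', 'penguin']
print(by_length_desc)        # ['giraffe', 'penguin', 'cat', 'bee', 'ox']
print(by_length_then_alpha)  # ['ox', 'bee', 'cat', 'giraffe', 'penguin']
```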

## Dictionaries
<a id='dictionaries'></a>

### Basics
<a id='dictionary-basics'></a>

__Dictionaries__ in Python are one of the most commonly used types for storing structured data. A dictionary is a map from keys to values; in some programming languages it is known as an associative array.

Here's a sample dictionary mapping persons to their age:

In [None]:
d = {"John": 8, "Mary": 11, "Susan": 15}

Any value can be retrieved by indexing the dictionary with a key:

In [None]:
d['John']

New key-value pairs are added in a similar fashion:

In [None]:
d['Jack'] = 12

In [None]:
d

Dictionaries can be constructed from any sequence of pairs using the constructor __dict__, e.g.:

In [None]:
dict([('Jack', 12), ('John', 8), ('Mary', 11), ('Susan', 15)])

The keyword __in__, which we have so far used to test membership of an element in lists, tuples or sets, also works with dictionaries and determines whether a key is present (values are not considered):

In [None]:
"Jack" in d

In [None]:
12 in d

Values of a dictionary may be of arbitrary type, but keys must be hashable, e.g. we can't use lists:

In [None]:
d[[1,2]] = 3

Tuples work, however, so they can be used if keys need to have structure:

In [None]:
d[(1,2)] = 3
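As a sketch of where structured keys are useful, consider a hypothetical sparse grid keyed by (row, column) tuples. The name `grid` and its contents are invented for illustration.

```python
# Hypothetical example: a sparse grid keyed by (row, column) tuples.
grid = {(0, 0): "start", (2, 3): "goal"}
grid[(1, 1)] = "wall"

print(grid[(2, 3)])      # goal
print((5, 5) in grid)    # False

# A list cannot be a key: it is mutable and therefore unhashable.
try:
    grid[[1, 2]] = "oops"
    list_key_rejected = False
except TypeError:
    list_key_rejected = True

print(list_key_rejected)  # True
```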

Dictionary elements can be removed using the keyword __del__:

In [None]:
del d[(1,2)]

There are methods to retrieve all keys, all values, or all key-value pairs of a dictionary. In Python 3 these return view objects rather than lists:

In [None]:
d.keys()

In [None]:
d.values()

In [None]:
d.items()

There are multiple ways to iterate over elements of a dictionary. If used as the iterable in a for loop, a dictionary yields its keys:

In [None]:
for k in d:
    print(k)

Additionally, the views returned by _values_ and _items_ can be iterated over to visit the values or the key-value pairs:

In [None]:
for v in d.values():
    print(v)

In [None]:
for k, v in d.items():
    print(k, v)

### Setting and retrieving values
<a id='dict-use'></a>

It is very common to have to access a dictionary without knowing whether some key is present in it or not. There are several different ways to do this:

In [None]:
def lookup(d, key):
    if key not in d:
        print("Key not in dict!")
    else:
        return d[key]

In [None]:
lookup(d, "John")

In [None]:
lookup(d, "Jill")

In [None]:
def lookup2(d, key):
    try:
        return d[key]
    except KeyError:
        print("Key not in dict!")

In [None]:
lookup2(d, "John")

In [None]:
lookup2(d, "Jill")

Dictionaries also offer the __get__ method, which returns the value for some key if the key is present in the dictionary and a default element otherwise:

In [None]:
def lookup3(d, key):
    return d.get(key, "default")

In [None]:
lookup3(d, "John")

In [None]:
lookup3(d, "Jill")

Without a second argument, _get_ returns None for keys that are not in the dictionary:

In [None]:
print(d.get("Jill"))

Often we'd like to ensure that once a new key is encountered, it is added to the dictionary with some default value. When combining this with lookup, a straightforward implementation would be the following:

In [None]:
def lookup4(d, key):
    if key not in d:
        d[key] = "default"
    return d[key]

In [None]:
d

In [None]:
lookup4(d, "Jill")

In [None]:
d

This behaviour is implemented by the dictionary method __setdefault__, which takes an optional second argument specifying the default value to be set (and returned):

In [None]:
d.setdefault("John", "default")

In [None]:
d.setdefault("Michelle", "default")

In [None]:
d

Finally, dictionaries that exhibit this behaviour upon plain lookup are available via the __defaultdict__ type of the collections module.
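A minimal sketch of _defaultdict_, counting characters in a word. The constructor takes a factory function that is called to produce the value for a missing key; here `int`, which returns 0. The name `counts` is illustrative.

```python
from collections import defaultdict

# int() returns 0, so every missing key starts out with the value 0.
counts = defaultdict(int)
for ch in "penguin":
    counts[ch] += 1

print(counts["n"])    # 2
print(counts["z"])    # 0
print("z" in counts)  # True: the lookup above inserted the key
```

Note that a plain lookup of a missing key inserts it, which differs from _get_, where the dictionary is left unchanged.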

A dictionary's items can also be changed by calling its __update__ method with another dictionary or any iterable of pairs. Values are overwritten for keys that were already present in the dictionary, and new keys are added.

In [None]:
d

In [None]:
e = {"John": 16, "Jill": 17, "Roger": 9}

In [None]:
d.update(e)

In [None]:
d
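The same method also works with an iterable of pairs, as this short sketch shows (the dictionary `ages` is a fresh illustrative example, separate from `d` above):

```python
ages = {"John": 8, "Mary": 11}

# update accepts any iterable of (key, value) pairs.
ages.update([("John", 9), ("Jill", 17)])

print(ages)  # {'John': 9, 'Mary': 11, 'Jill': 17}
```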