# Data structures 1

**References**:
+ [ThinkPython (book)](https://allendowney.github.io/ThinkPython/)

**Content**:
+ Strings
    + Indexing
    + Slices
    + Immutable
    + Comparisons
    + Methods
    + Regular expressions
+ Lists
    + list operations
    + list methods
    + list comprehensions
+ Working with lists and strings
+ Objects, Values, & Aliasing 

## Strings

+ A string is a **sequence** of characters
+ A character can be a letter, a digit, a punctuation mark, or whitespace

### Indexing
+ select a single character via indexing (e.g., `word[2]`)
+ the index can be an integer literal, a variable, or an expression that evaluates to an integer
+ Note: indexing in Python starts at `0`
+ use a negative index to count backward from the end

In [4]:
# select a letter using indexing
# index as integer
fruit = "banana"
print(fruit[1])

# index as variable
i = 1
print(fruit[i])

# index as expression
print(fruit[i+1])

# negative index: get last letter 
print(fruit[-1])

a
a
n
a
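
The relation between negative and non-negative indices can be made explicit with `len`. The following is a minimal sketch; the variable names `n`, `last_positive`, `last_negative`, and `out_of_range_message` are chosen here for illustration only.

```python
fruit = "banana"
n = len(fruit)  # 6 characters, so the valid indices are 0..5

# A negative index -k refers to the same character as index n - k
last_positive = fruit[n - 1]
last_negative = fruit[-1]
print(last_positive, last_negative)  # a a

# Indexing outside the valid range raises an IndexError
try:
    fruit[n]
except IndexError as err:
    out_of_range_message = str(err)
    print("IndexError:", out_of_range_message)
```

Since indexing starts at `0`, the last valid index is `len(fruit) - 1`, not `len(fruit)`.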


### Slices
+ A segment of a string is called a slice
+ different types of slices:
    + closed form: `[n:m]` returns the characters from index `n` up to, but not including, index `m`
    + open start: `[:m]` starts at the beginning of the string and stops just before index `m`
    + open end: `[n:]` starts at index `n` and runs to the end of the string
    + empty slice: `[n:n]` yields an empty string

In [11]:
# slices
# select the letters at indices 1, 2, 3
print(fruit[1:4])

# select first three letters
print(fruit[:3])

# select last 3 letters
print(fruit[-3:])

# empty string
print(fruit[3:3])

ana
ban
ana
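
The boundary rules of slices can be checked with a small sketch. The names `middle`, `rejoined`, and `clipped` are illustrative and not part of the original notes.

```python
fruit = "banana"
n, m = 1, 4

# fruit[n:m] includes index n and excludes index m, so it has m - n characters
middle = fruit[n:m]
print(middle, len(middle))  # ana 3

# Splitting at one position and concatenating both parts restores the original
rejoined = fruit[:m] + fruit[m:]
print(rejoined == fruit)  # True

# Unlike indexing, slicing past the end does not raise an error
clipped = fruit[4:100]
print(clipped)  # na
```

Excluding the end index is what makes `fruit[:m] + fruit[m:]` reproduce the string without duplicating a character.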



### Immutable
+ Strings are immutable: you can’t change a character of an existing string by assigning a new value to it

In [14]:
# strings are immutable
# the next line is commented out because it raises
# TypeError: 'str' object does not support item assignment
# fruit[0] = "P"

# working alternative
new_fruit = "P" + fruit[1:]
print(new_fruit)

Panana
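
To see the error without stopping the cell, the assignment can be wrapped in `try`/`except`. This is a sketch only; `error_message` is a name introduced here for illustration.

```python
fruit = "banana"

# Item assignment is rejected because str objects are immutable
try:
    fruit[0] = "P"
except TypeError as err:
    error_message = str(err)
    print("TypeError:", error_message)

# Building a new string leaves the original untouched
new_fruit = "P" + fruit[1:]
print(new_fruit)  # Panana
print(fruit)      # banana
```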


### Comparisons
+ evaluate whether
    + two strings are equal (`==`)
    + one string comes before (`<`) or after (`>`) another one in alphabetical order
+ Note: comparisons are case-sensitive, and all uppercase letters sort before all lowercase letters

In [23]:
# check whether two strings are equal
print( "Hello" == "hello" )
print( "hello" == "hello" )

# check whether one string comes after (>) or before (<) another in the alphabet
print( "a" > "c" )
print( "a" < "b" )
print( "ba" < "bb" )

# uppercase comes before lowercase
print( "A" < "a" )

False
True
False
True
True
True
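
Why uppercase sorts before lowercase becomes clearer when looking at the code points that Python compares. The sketch below uses the built-in `ord` and `sorted`; the word list is made up for illustration.

```python
# ord() returns the Unicode code point of a single character
print(ord("A"), ord("Z"), ord("a"))  # 65 90 97

# Because 90 < 97, every uppercase ASCII letter sorts before every lowercase one
print("Zebra" < "apple")  # True

# Converting both sides to lowercase gives a case-insensitive comparison
print("Zebra".lower() < "apple".lower())  # False

words = ["banana", "Cherry", "apple"]
default_order = sorted(words)
ignore_case_order = sorted(words, key=str.lower)
print(default_order)      # ['Cherry', 'apple', 'banana']
print(ignore_case_order)  # ['apple', 'banana', 'Cherry']
```

Normalising the case before comparing is a common way to get the ordering a reader would expect from a dictionary.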


### Methods
+ Strings provide methods that perform a variety of useful operations (overview: `dir("string")`)
+ A method call is called an **invocation** (e.g., in the case of `fruit.upper()`, we would say that we are invoking `upper` on `fruit`)
+ Example methods:
    + `lower`, `upper`
    + `replace`
    + `split`, `join`
    + `startswith`, `endswith`

In [37]:
# have a look into all methods of strings
dir("this is a string")

# check out how a method works
help(fruit.startswith)

# example methods
print( fruit.startswith("b") )
print( fruit.upper() )
print( fruit.replace("b","p") )
print( fruit.split("n") )

Help on built-in function startswith:

startswith(...) method of builtins.str instance
    S.startswith(prefix[, start[, end]]) -> bool
    
    Return True if S starts with the specified prefix, False otherwise.
    With optional start, test S beginning at that position.
    With optional end, stop comparing S at that position.
    prefix can also be a tuple of strings to try.

True
BANANA
panana
['ba', 'a', 'a']
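
The methods `join`, `lower`, and `endswith` from the list above are not used in the cell, so here is a short sketch of how they might be combined. The example sentence and variable names are assumptions made for illustration.

```python
sentence = "Priors are a key feature"

# split() without an argument splits on whitespace and returns a list
words = sentence.split()
print(words)  # ['Priors', 'are', 'a', 'key', 'feature']

# join() is invoked on the separator and glues the list items back together
dashed = "-".join(words)
print(dashed)  # Priors-are-a-key-feature

# lower() returns a string, so endswith() can be invoked on its result
ends_with_feature = sentence.lower().endswith("feature")
print(ends_with_feature)  # True
```

Note that `join` is invoked on the separator string, not on the list.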


### Regular expressions
+ The `re` module provides functions for working with regular expressions
+ it offers tools to
    + check whether a specific pattern appears in a text: `re.search(pattern, text)` returns a match object
    + if the pattern is not in the text, `re.search` returns `None`
    + match alternative spellings of one pattern (e.g., `re.search("col(o|ou)r", text)`)
    + substitute strings with `re.sub(pattern, repl, string)`

In [62]:
import re
abstract = "Priors are a key feature of the Bayesian paradigm."

# check whether "Bayes" appears in abstract
result = re.search("Bayes", abstract)
print( result )
print( result.string ) # return the entire text string that was searched
print( result.group() ) # return the matched text
print( result.span() ) # return the (start, end) positions of the match in the text string
# using slicing to check span
print( abstract[32:37] )

# returns None if pattern is not in string (the search is case-sensitive)
null_result = re.search("bayes", abstract)
print( null_result )
# check whether null_result is None
null_result is None

# check for different types of patterns
description = "The sky has a blue color."

print( re.search("col(o|ou)r", description) )

# string substitution
print( re.sub("sky", "car", description) )

<re.Match object; span=(32, 37), match='Bayes'>
Priors are a key feature of the Bayesian paradigm.
Bayes
(32, 37)
Bayes
None
<re.Match object; span=(19, 24), match='color'>
The car has a blue color.
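
Because `re.search` returns `None` when nothing matches, the result should be checked before calling `group()` or `span()` on it. The sketch below illustrates this and unifies the spelling with `re.sub`; the list `texts` and the other variable names are made up for illustration.

```python
import re

texts = [
    "The sky has a blue color.",
    "The sea has a grey colour.",
    "No match here.",
]
pattern = r"col(o|ou)r"  # raw string; matches "color" or "colour"

for text in texts:
    match = re.search(pattern, text)
    if match is None:
        # calling match.group() here would raise an AttributeError
        print("no match:", text)
    else:
        print(match.group(), match.span())

# replace both spellings with "color"
unified = [re.sub(pattern, "color", text) for text in texts]
print(unified[1])  # The sea has a grey color.
```

Texts without a match pass through `re.sub` unchanged, so no `None` check is needed for the substitution.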


