<a href="https://colab.research.google.com/github/justalge/another_python_totorial/blob/main/week2/Lecture_3_types_sequential_copy.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Data types and variables

A variable can be seen as a container (or some say a pigeonhole) to store certain values. While the program is running, variables are accessed and sometimes changed, i.e., a new value will be assigned to a variable.

Putting values into the variables can be realized with assignments. The way you assign values to variables is nearly the same in all programming languages. In most cases, the equal "=" sign is used. 

There is no declaration of variables required in Python, which makes it quite easy. It's not even possible to declare the variables. If there is need for a variable, you should think of a name and start using it as a variable. Another remarkable aspect of Python: Not only the value of a variable may change during program execution, but the type as well.

In [None]:
i = 5
print(i, type(i))

i = 42 + 0.11
print(i, type(i))

i = "Alice"
print(i, type(i))

# When Python executes an assignment like "i = 42", it evaluates the right side
# of the assignment and recognizes that it corresponds to the integer number 42.
# It creates an object of the integer class to save this data.

# In other words, Python automatically takes care of the physical representation
# for the different data types.

5 <class 'int'>
42.11 <class 'float'>
Alice <class 'str'>


#### Object References

Python variables are references to objects, but the actual data is contained in the objects:

![](https://www.python-course.eu/images/python_variable_1_600w.webp)

As variables are pointing to objects and objects can be of arbitrary data types, variables cannot have types associated with them. This is a huge difference from C, C++ or Java, where a variable is associated with a fixed data type. This association can't be changed as long as the program is running.

In [None]:
# We want to demonstrate something else now. Let's look at the following code:

x = 42
y = x

# We created an integer object 42 and assigned it to the variable x. After this
# we assigned x to the variable y. This means that both variables reference the
# same object. The following picture illustrates this:


![](https://www.python-course.eu/images/python_variable_2_800w.webp)

In [None]:
# What will happen when we execute

y = 78

# after the previous code?

# Python will create a new integer object with the content 78 and then the
# variable y will reference this newly created object, as we can see in the
# following picture:

![](https://www.python-course.eu/images/python_variable_3_800w.webp)

Most probably, we will see further changes to the variables in the flow of our program. There might be, for example, a string assignment to the variable x. The previous integer object 42 will be orphaned after this assignment. It can then be removed by Python, because no variable references it any more.

![](https://www.python-course.eu/images/python_variable_4_800w.webp)

How can we see or prove that x and y really reference the same object after the assignment y = x of our previous example?

In [None]:
# The identity function id() can be used for this purpose. Every object has an
# identity, i.e., an integer which is unique among all objects that exist at the
# same time in the running program. So, let's have a look at our previous
# example and how the identities will change:

x = 42
print(id(x))

y = x
print(id(x), id(y))

y = 78
print(id(x), id(y))

94636915965728
94636915965728 94636915965728
94636915965728 94636915966880


#### Valid variable names

The naming of variables follows the more general concept of an identifier. A Python identifier is a name used to identify a variable, function, class, module or other object.

A variable name and an identifier can consist of the uppercase letters "A" through "Z", the lowercase letters "a" through "z", the underscore _ and, except for the first character, the digits 0 through 9. Python 3.x is based on Unicode, so identifier names can additionally contain non-ASCII Unicode letters.

In [None]:
# Identifiers are unlimited in length. Case is significant. The fact that
# identifier names are case-sensitive can cause problems for some Windows users,
# who are used to case-insensitive file names, for example.

# The following variable definitions are all valid:

я_переменная = 7

height = 10

υψος = 10

μεγιστη_υψος = 100

MinimumHeight = 1

**No identifier can have the same name as one of the Python keywords, although they are obeying the above naming conventions:**

and, as, assert, async, await, break, class, continue, def, del,
elif, else, except, False, finally, for, from, global, if, import,
in, is, lambda, None, nonlocal, not, or, pass, raise, return,
True, try, while, with, yield

There is no need to learn them by heart. You can get the list of Python keywords in the interactive shell by using help. Type help() in the interactive shell, but please don't forget the parentheses:

In [None]:
help()


Welcome to Python 3.7's help utility!

If this is your first time using Python, you should definitely check out
the tutorial on the Internet at https://docs.python.org/3.7/tutorial/.

Enter the name of any module, keyword, or topic to get help on writing
Python programs and using Python modules.  To quit this help utility and
return to the interpreter, just type "quit".

To get a list of available modules, keywords, symbols, or topics, type
"modules", "keywords", "symbols", or "topics".  Each module also comes
with a one-line summary of what it does; to list the modules whose name
or summary contain a given string such as "spam", type "modules spam".

help> keywords

Here is a list of the Python keywords.  Enter any keyword to get more help.

False               class               from                or
None                continue            global              pass
True                def                 if                  raise
and                 del                 import      
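
The keyword list can also be obtained programmatically. The following is a minimal sketch using the standard-library module `keyword` together with `str.isidentifier()`. The helper name `is_valid_variable_name` is made up for this illustration and is not part of Python.

```python
import keyword

# keyword.kwlist holds all reserved words of the running Python version
print(keyword.kwlist)


def is_valid_variable_name(name):
    """Hypothetical helper: True if name is a legal, non-reserved identifier."""
    return name.isidentifier() and not keyword.iskeyword(name)


print(is_valid_variable_name("height"))   # True
print(is_valid_variable_name("υψος"))     # True
print(is_valid_variable_name("2fast"))    # False, starts with a digit
print(is_valid_variable_name("class"))    # False, reserved keyword
```

Because the list comes from the interpreter itself, it always matches the Python version that is actually running.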

#### Naming Conventions `the_natural_way_of_naming_things` vs `TheNaturalWayOfNamingThings`

Certain names should be avoided for variable names: Never use the characters 'l' (lowercase letter "L"), 'O' ("O" like in "Ontario"), or 'I' (like in "Indiana") as single character variable names. They should be avoided, because these characters are indistinguishable from the numerals one and zero in some fonts. When tempted to use 'l', use 'L' instead, if you cannot think of a better name anyway. 

The Style Guide has to say the following about the naming of identifiers in standard modules: "All identifiers in the Python standard library MUST use ASCII-only identifiers, and SHOULD use English words wherever feasible (in many cases, abbreviations and technical terms are used which aren't English). In addition, string literals and comments must also be in ASCII. The only exceptions are (a) test cases testing the non-ASCII features, and (b) names of authors. Authors whose names are not based on the latin alphabet MUST provide a latin transliteration of their names."

#### Types of objects in Python



*   built-in, i.e., objects provided by Python
*   objects from extension libraries
*   created in the application by the programmer



#### Numbers

##### Integers

In [None]:
# There are four ways to write integer literals in Python:

# 1) Normal integers

num = 123
print(num, type(num))

# 2) Octal literals (base 8)
# A number prefixed by 0o (zero and a lowercase "o" or uppercase "O") will be
# interpreted as an octal number

oct_num = 0o10
print(oct_num, type(oct_num))

# 3) Hexadecimal literals (base 16)
# Hexadecimal literals have to be prefixed either by "0x" or "0X"

hex_number = 0xA0F
print(hex_number, type(hex_number))

# 4) Binary literals (base 2)
# Binary literals can easily be written as well. They have to be prefixed by a
# leading "0", followed by a "b" or "B"

bin_num = 0b101010
print(bin_num, type(bin_num))


# the functions hex, bin, oct can be used to convert an integer number into the
# corresponding string representation of the integer number:

print()
x = hex(19)
print(x, type(x))

123 <class 'int'>
8 <class 'int'>
2575 <class 'int'>
42 <class 'int'>

0x13 <class 'str'>
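
The conversion also works in the other direction. As a small sketch, the built-in `int()` accepts a string together with the base in which the string is written:

```python
# int() accepts a string plus the base it is written in
print(int("13", 16))      # 19
print(int("0x13", 16))    # 19, the prefix is allowed when it matches the base
print(int("101010", 2))   # 42
print(int("0o10", 0))     # 8, base 0 means "infer the base from the prefix"

# round trip: integer -> string -> integer
n = 2575
print(int(hex(n), 16) == n, int(oct(n), 8) == n, int(bin(n), 2) == n)
```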


##### Floating point numbers

In [None]:
a = 42.11
b = 3.1415e-10
c = 4.24E-7

##### Complex numbers

In [None]:
x = 3 + 4j

y = 2 - 3j

z = x + y

print(z)

(5+1j)


##### Long integers (only for Python2!)

Python 2 has two integer types: int and long. There is no "long int" in Python 3 anymore. There is only one "int" type, which covers both "int" and "long" from Python 2.
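
A short sketch of what this means in practice: a Python 3 integer simply grows as needed, so very large values keep the type `int`. The variable name `big` is chosen only for this illustration.

```python
big = 2 ** 100
print(big, type(big))      # still <class 'int'>
print(big.bit_length())    # number of bits needed to represent the value
```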

# Strings

#### Introduction

Before Unicode came into usage, there was a one-to-one relationship between bytes and characters, i.e., every character of a national variant (not every character of the world) was represented by a single byte.

ASCII is restricted to 128 characters and "Extended ASCII" is still limited to 256 characters. This is good enough for languages like English, German and French, but by far not sufficient for Chinese, Japanese and Korean. That's where Unicode comes into play. Unicode is a standard designed to represent every character from every language, i.e., it can handle any text of the world's writing systems. These writing systems can also be used simultaneously, i.e., the Roman alphabet mixed with Cyrillic or even Chinese characters.

Unicode works differently. A character maps to a code point, which is an abstract number rather than a particular byte sequence. For example, the character "A" is assigned the code point U+0041. The "U+" means "Unicode" and the "0041" is a hexadecimal number, 65 in decimal notation.

In [None]:
hex(65)

'0x41'

In [None]:
# get the Unicode code point of the character `A` (identical to its ASCII code):

print(ord('A'))

# back:

print(chr(65))

65
A


Up to four bytes are possible per character in Unicode. Theoretically, this means a huge number of 4294967296 possible characters. Due to restrictions from UTF-16 encoding, there are "only" 1,112,064 characters possible. Unicode version 8.0 has assigned 120,737 characters. This means that there are slightly more than 10 % of all possible characters assigned, in other words, we can still add nearly a million characters to Unicode.

#### Unicode Encodings

**UTF-32** It's a one to one encoding, i.e., it takes each Unicode code point (a number that fits into 4 bytes) and stores it in 4 bytes. One advantage of this encoding is that you can find the Nth character of a string in constant time, because the Nth character starts at the 4×Nth byte. A serious disadvantage of this approach is that it needs four bytes for every character.

**UTF-16** UTF-16 (16-bit Unicode Transformation Format) is a character encoding capable of encoding all 1,112,064 valid code points of Unicode. The encoding is variable-length, as code points are encoded with one or two 16-bit code units.

**UTF-8** UTF-8 is a variable-length encoding system for Unicode, i.e., different characters take up a different number of bytes. ASCII characters use only one byte per character. This means that the first 128 characters of UTF-8 are indistinguishable from ASCII. But the so-called "Extended Latin" characters like the umlauts ä, ö and so on take up two bytes. Chinese characters need three bytes. Finally, the very seldom used characters of the "astral plane" need four bytes to be encoded in UTF-8. W3Techs (Web Technology Surveys) writes that "UTF-8 is used by 94.3% of all the websites whose character encoding we know."
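
The byte counts described above can be checked directly with `str.encode`. This is a small sketch; the four sample characters are chosen for illustration, and the `-le` codec variants are used only so that no byte order mark is added to the result.

```python
samples = ["A", "ä", "中", "😀"]

for ch in samples:
    utf8 = ch.encode("utf-8")
    utf16 = ch.encode("utf-16-le")   # "-le": little endian, no byte order mark
    utf32 = ch.encode("utf-32-le")   # "-le": little endian, no byte order mark
    print(f"U+{ord(ch):04X}  utf-8: {len(utf8)}  utf-16: {len(utf16)}  utf-32: {len(utf32)}")

sizes_utf8 = [len(ch.encode("utf-8")) for ch in samples]
print(sizes_utf8)   # [1, 2, 3, 4]
```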

#### String, Unicode and Python

After this lengthy but necessary introduction, we finally come to Python and the way it deals with strings. All strings in Python 3 are sequences of "pure" Unicode characters, no specific encoding like UTF-8.

In [None]:
# There are different ways to define strings in Python:

s = 'I am a string enclosed in single quotes.'
s2 = "I am another string, but I am enclosed in double quotes."

# Single quotes will have to be escaped with a backslash \, if the string is
# defined with single quotes:

s3 = 'It doesn\'t matter!'

# This is not necessary, if the string is represented by double quotes:

s3 = "It doesn't matter!"

# Analogously, we will have to escape a double quote inside a double quoted string:

txt = "He said: \"It doesn't matter, if you enclose a string in single or double quotes!\""
print(txt)

# They can also be enclosed in matching groups of three single or double quotes.
# In this case they are called triple-quoted strings. The backslash character is
# used to escape characters that otherwise have a special meaning, such as
# newline, backslash itself, or the quote character:

txt = '''A string in triple quotes can extend


over multiple lines like this one, and can contain
'single' and "double" quotes.'''
print()
print(txt)

# In triple-quoted strings, unescaped newlines and quotes are allowed (and are
# retained), except that three unescaped quotes in a row terminate the string

He said: "It doesn't matter, if you enclose a string in single or double quotes!"

A string in triple quotes can extend


over multiple lines like this one, and can contain
'single' and "double" quotes.


A string in Python consists of a series or sequence of characters - letters, numbers, and special characters. Strings can be subscripted or indexed. Similar to C, the first character of a string has the index 0.

In [None]:
s = "Hello World"

print(s[0])
print(s[5])

# The last character of a string can be accessed this way:

print(s[len(s)-1])

# And the easier variant:

print(s[-1])

H
 
d
d


**By the way, there is no character type in Python. A character is simply a string of size one**

#### Some operators and functions for strings

1. **Concatenation**. Strings can be glued together (concatenated) with the + operator: "Hello" + "World" will result in "HelloWorld"

In [None]:
"Hello" + "World"

'HelloWorld'

2. **Repetition**. Strings can be repeated or repeatedly concatenated with the asterisk operator "*": "-" * 3 will result in "---"

In [None]:
'*' * 3

'***'

3. **Indexing**. "Python"[0] will result in "P"

In [None]:
"Python"[0]

'P'

4. **Slicing**. Substrings can be created with the slice or slicing notation, i.e., two indices in square brackets separated by a colon: "Python"[2:4] will result in "th"

In [None]:
print("Python"[2:4])

# the third number is step:
print("Python"[0:100:2])

th
Pto


In [None]:
"Python"[-100:-1:-1]

''
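
The empty result above follows from how a negative step works: the slice is walked from right to left, so the start index has to lie to the right of the stop index. A short sketch (the variable name `word` is chosen for illustration):

```python
word = "Python"

print(word[::-1])         # whole string reversed: 'nohtyP'
print(word[-1:-100:-1])   # same result, the out-of-range stop is clipped
print(word[4:1:-1])       # indices 4, 3, 2: 'oht'
print(word[-100:-1:-1])   # start is clipped to the left edge, nothing to walk: ''
```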

5. **Size** len("Python") will result in 6

In [None]:
len('Python')

6

#### Immutable strings

In [None]:
# Like strings in Java and unlike C or C++, Python strings cannot be changed.
# Trying to change an indexed position will raise an error:

s = "Some things are immutable!"
s[-1] = "." 

TypeError: 'str' object does not support item assignment

In [None]:
# Beginners in Python are often confused when they see the following lines of code:

txt = "He lives in Berlin!"
txt = "He lives in Hamburg!"

# The variable "txt" is a reference to a string object. We define a completely
# new string object in the second assignment. So, you shouldn't confuse the
# variable name with the referenced object!

#### A String Peculiarity

In [None]:
# Strings show a special effect, which we will illustrate in the following
# example. We will need the "is"-Operator. If both a and b are strings, "a is b"
# checks if they have the same identity, i.e., share the same memory location.
# If "a is b" is True, then it trivially follows that "a == b" has to be True as
# well. Yet, "a == b" True doesn't imply that "a is b" is True as well!

a = "Linux"
b = "Linux"

print(id(a), id(b))
print(a == b)
a is b

140277119413616 140277119413616
True


True

In [None]:
a = "Baden-Wurttemberg"
b = "Baden-Wurttemberg"

print(id(a), id(b))
print(a == b)
print(a is b)

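# The special character, i.e. the hyphen, is to blame: CPython only interns
# (shares) string literals that look like identifiers. This is an implementation
# detail, so always compare strings with ==, not with "is". More information:
# https://stackoverflow.com/questions/25183031/different-storage-position-of-equal-strings-with-special-characters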

# Also small integer caching: Python caches small integers, which are integers
# between -5 and 256. These numbers are used so frequently that it’s better for
# performance to already have these objects available:

a = 5
b = 5
print(id(a), id(b))

a = 55555555555555555
b = 55555555555555555
print(id(a), id(b))

140614532739680 140614532741840
True
False
94600937544320 94600937544320
140614535818000 140614535818480


In [None]:
a = "Baden!"
b = "Baden!"
a is b

False

In [None]:
a = "Baden1"
b = "Baden1"
a is b

True

#### Escape Sequences in Strings

To end our coverage of strings in this chapter, we will introduce some escape characters and sequences. The backslash (`\`) character is used to escape characters, i.e., to "escape" the special meaning which the character would otherwise have. Examples for such characters are newline, backslash itself, or the quote character. String literals may optionally be prefixed with a letter 'r' or 'R'; these strings are called raw strings. In raw strings, backslashes are kept literally instead of starting an escape sequence:




In [None]:
print('Alice wrote \n to Bob')
print()
print(r'Alice wrote \n to Bob')

Alice wrote 
 to Bob

Alice wrote \n to Bob


Escape sequences and their meaning:

\newline - Ignored

\\ - Backslash(\)

\' - Single quote(')

\" - Double quote(")

\a - ASCII Bell (BEL)

\b - ASCII Backspace (BS)

\f - ASCII Formfeed (FF)

\n - ASCII Linefeed (LF)

\N{name} - Character named name in the Unicode database (Unicode only)

\r - ASCII Carriage Return (CR)

\t - ASCII Horizontal Tab (TAB)

\uxxxx - Character with 16-bit hex value xxxx (Unicode only)

\Uxxxxxxxx - Character with 32-bit hex value xxxxxxxx (Unicode only)

\v - ASCII Vertical Tab (VT)

\ooo - Character with octal value ooo

\xhh - Character with hex value hh
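
A few of these escape sequences in action. This is a small sketch; the variable names are chosen for illustration only.

```python
newline = "\n"   # one character: a line feed
raw = r"\n"      # two characters: a backslash and the letter n
print(len(newline), len(raw))

# three spellings of the same character "A": hex, octal and 16-bit Unicode
print("\x41", "\101", "\u0041")

# a character selected by its name in the Unicode database
alpha = "\N{GREEK SMALL LETTER ALPHA}"
print(alpha, hex(ord(alpha)))

# a doubled backslash and a raw string give the same result
print("C:\\temp" == r"C:\temp")
```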

#### Byte Strings

Python 3 uses the concepts of text and (binary) data instead of Unicode strings and 8-bit strings. Every string or text in Python 3 is Unicode, but encoded Unicode is represented as binary data. The type used to hold text is str, the type used to hold data is bytes. It's not possible to mix text and data in Python 3; trying to do so raises a TypeError. While a string object holds a sequence of characters (in Unicode), a bytes object holds a sequence of integers in the range 0 to 255. Bytes in the ASCII range are displayed as the corresponding ASCII characters. Encoding a string into a bytes object:

In [None]:
x = "Hallo"
t = str(x)
u = t.encode("UTF-8")
print(u)

b'Hallo'
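
The separation of text and data can be sketched as follows: `encode` turns a str into bytes, `decode` turns bytes back into a str, and mixing the two types fails. The variable names are chosen for illustration.

```python
text = "Hallo"
data = text.encode("utf-8")     # str -> bytes
print(type(text), type(data))

try:
    text + data                 # mixing text and binary data
except TypeError as err:
    print("TypeError:", err)

back = data.decode("utf-8")     # bytes -> str
print(back == text)
```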


# Sequential data types

#### Bytes

A bytes object is a sequence of small integers. Its elements are in the range 0 to 255. When a bytes object is printed, values that correspond to printable ASCII characters are shown as those characters; all other values are shown as escape sequences such as \xc3.

In [None]:
s = "Glückliche Fügung"
s_bytes = s.encode('utf-8') 
s_bytes

b'Gl\xc3\xbcckliche F\xc3\xbcgung'
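
A short sketch that makes the "sequence of small integers" visible: indexing a bytes object yields an int, and the number of bytes can differ from the number of characters.

```python
s = "Glückliche Fügung"
s_bytes = s.encode("utf-8")

print(len(s), len(s_bytes))      # 17 characters, 19 bytes
print(s_bytes[0])                # 71, the code of "G"
print(list(s_bytes[2:4]))        # [195, 188], the two bytes of "ü"
print(s_bytes.decode("utf-8"))   # back to the original text
```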

#### Lists

1. They are ordered
2. They contain arbitrary objects
3. Elements of a list can be accessed by an index
4. They are arbitrarily nestable, i.e. they can contain other lists as sublists
5. Variable size
6. They are mutable, i.e. the elements of a list can be changed


In [None]:
[]  # an empty list
[42, "What's the question?", 3.1415]  # A list of mixed data types
["High up", ["further down", ["and down", ["deep down", "the answer", 42]]]]  # A deeply nested list

['High up', ['further down', ['and down', ['deep down', 'the answer', 42]]]]

In [None]:
# We go to a virtual supermarket. Fetch a cart and start shopping:

shopping_list = ['milk', 'yoghurt', 'egg', 'butter', 'bread', 'bananas']
cart = []
# "pop()" removes the last element of the list and returns it
article = shopping_list.pop()  
print(article, shopping_list)
cart.append(article)
print(cart)

# we go on like this:
article = shopping_list.pop()  
print("shopping_list:", shopping_list)
cart.append(article)
print("cart: ", cart)

bananas ['milk', 'yoghurt', 'egg', 'butter', 'bread']
['bananas']
shopping_list: ['milk', 'yoghurt', 'egg', 'butter']
cart:  ['bananas', 'bread']


In [None]:
# With a while loop:

shopping_list = ['milk', 'yoghurt', 'egg', 'butter', 'bread', 'bananas']
cart = []
while shopping_list:  # an empty list counts as False
    article = shopping_list.pop()  
    cart.append(article)
    print(article, shopping_list, cart)

print("shopping_list: ", shopping_list)
print("cart: ", cart)

bananas ['milk', 'yoghurt', 'egg', 'butter', 'bread'] ['bananas']
bread ['milk', 'yoghurt', 'egg', 'butter'] ['bananas', 'bread']
butter ['milk', 'yoghurt', 'egg'] ['bananas', 'bread', 'butter']
egg ['milk', 'yoghurt'] ['bananas', 'bread', 'butter', 'egg']
yoghurt ['milk'] ['bananas', 'bread', 'butter', 'egg', 'yoghurt']
milk [] ['bananas', 'bread', 'butter', 'egg', 'yoghurt', 'milk']
shopping_list:  []
cart:  ['bananas', 'bread', 'butter', 'egg', 'yoghurt', 'milk']


#### Tuples

A tuple is an immutable list, i.e. a tuple cannot be changed in any way, once it has been created. A tuple is defined analogously to lists, except the set of elements is enclosed in parentheses instead of square brackets. The rules for indices are the same as for lists. Once a tuple has been created, you can't add elements to a tuple or remove elements from a tuple.

Benefits of tuples:

1. Tuples are usually slightly faster and more memory-efficient than lists.
2. If you know that some data doesn't have to be changed, you should use tuples instead of lists, because this protects your data against accidental changes.
3. The main advantage of tuples is that tuples can be used as keys in dictionaries, while lists can't.
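
The third point can be sketched as follows. The dictionary `board` and its contents are invented for this example.

```python
# (row, column) coordinates as dictionary keys
board = {(0, 0): "rook", (0, 1): "knight"}
print(board[(0, 1)])

try:
    board[[0, 2]] = "bishop"   # a list is mutable and therefore not hashable
except TypeError as err:
    print("TypeError:", err)
```

Note that a tuple only works as a key if all of its elements are hashable themselves.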

In [None]:
t = ("tuples", "are", "immutable")
t[0]

'tuples'

In [None]:
t[0] = "assignments to elements are not possible"

TypeError: 'tuple' object does not support item assignment

#### Slicing

In [None]:
slogan = "Python is great"
first_six = slogan[0:6]
first_six

'Python'

In [None]:
starting_at_five = slogan[5:]
starting_at_five

'n is great'

In [None]:
a_copy = slogan[:]
without_last_five = slogan[0:-5]
without_last_five

'Python is '

Slicing works with three arguments as well. If the third argument is for example 3, only every third element of the list, string or tuple from the range of the first two arguments will be taken.

If s is a sequential data type, it works like this:

s[begin: end: step]

In [None]:
# In the following example we define a string and we print every third character
# of this string:

slogan = "Python under Linux is great"
slogan[::3]

'Ph d n  e'

In [None]:
# hidden sentence:

s = "TPoyrtohnotno  ciosu rtshees  lianr gTeosrto nCtiot yb yi nB oCdaennasdeao"
print(s)

TPoyrtohnotno  ciosu rtshees  lianr gTeosrto nCtiot yb yi nB oCdaennasdeao


In [None]:
# solution:

s[::2]

'Toronto is the largest City in Canada'

In [None]:
# You may be interested in how we created the string. You have to understand
# list comprehension to understand the following:

s = "Toronto is the largest City in Canada"
t = "Python courses in Toronto by Bodenseo"
s = "".join(["".join(x) for x in zip(s,t)])
s

'TPoyrtohnotno  ciosu rtshees  lianr gTeosrto nCtiot yb yi nB oCdaennasdeao'

In [None]:
# length:

txt = "Hello World"
print(len(txt))

# concatenation:

# strings
print()
firstname = "Homer"
surname = "Simpson"
name = firstname + " " + surname
print(name)

# lists
print()
colours1 = ["red", "green","blue"]
colours2 = ["black", "white"]
colours = colours1 + colours2
print(colours)

# checking if an element is contained in list

print()
abc = ["a","b","c","d","e"]
print("a" in abc)
print("a" not in abc)

# repetitions:

print()
print(3 * "xyz-")
print(3 * ["a","b","c"])

11

Homer Simpson

['red', 'green', 'blue', 'black', 'white']

True
False

xyz-xyz-xyz-
['a', 'b', 'c', 'a', 'b', 'c', 'a', 'b', 'c']


#### The Pitfalls of Repetitions

In [None]:
x = ["a","b","c"]
y = [x] * 4
y

[['a', 'b', 'c'], ['a', 'b', 'c'], ['a', 'b', 'c'], ['a', 'b', 'c']]

In [None]:
y[0][0] = "p"
y

[['p', 'b', 'c'], ['p', 'b', 'c'], ['p', 'b', 'c'], ['p', 'b', 'c']]

![](https://www.python-course.eu/images/repetitions.webp)

This result is quite astonishing for beginners of Python programming. We have assigned a new value to the first element of the first sublist of y, i.e. y[0][0], and we have "automatically" changed the first elements of all the sublists in y, i.e. y[1][0], y[2][0], y[3][0].
The reason is that the repetition operator "* 4" creates 4 references to the same list x. Every element of y is therefore changed when we assign a new value to y[0][0].
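
The shared references can be made visible with the `is` operator. The following sketch also builds independent sublists with `list.copy()`; the name `z` is chosen for illustration.

```python
x = ["a", "b", "c"]
y = [x] * 4

# all four entries are the very same object
print(all(item is x for item in y))

# independent sublists: one shallow copy of x per entry
z = [x.copy() for _ in range(4)]
z[0][0] = "p"
print(z)
print(x)   # unchanged
```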

#### List comprehensions

In [None]:
t = [i**2 for i in range(10)]
t

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

In [None]:
t = [i if i % 2 == 0 else None for i in range(10)]
t

[0, None, 2, None, 4, None, 6, None, 8, None]
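
A list comprehension can also filter elements with a trailing `if`. As a sketch, here is such a comprehension next to the equivalent explicit loop (the names `evens` and `evens_loop` are chosen for illustration):

```python
# filter: keep only the even numbers
evens = [i for i in range(10) if i % 2 == 0]
print(evens)

# the same thing written as an explicit loop
evens_loop = []
for i in range(10):
    if i % 2 == 0:
        evens_loop.append(i)

print(evens == evens_loop)
```

The difference to the previous cell is the position of the condition: an `if ... else` before `for` chooses a value for every element, while an `if` after the `for` clause drops elements.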

#### How to create a list of lists: a template for many exercises

In [None]:
table = [[0]*8]*8  # this is the WRONG way!
table

[[0, 0, 0, 0, 0, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 0, 0]]

In [None]:
# because:
table[4][3] = 1
table

[[0, 0, 0, 1, 0, 0, 0, 0],
 [0, 0, 0, 1, 0, 0, 0, 0],
 [0, 0, 0, 1, 0, 0, 0, 0],
 [0, 0, 0, 1, 0, 0, 0, 0],
 [0, 0, 0, 1, 0, 0, 0, 0],
 [0, 0, 0, 1, 0, 0, 0, 0],
 [0, 0, 0, 1, 0, 0, 0, 0],
 [0, 0, 0, 1, 0, 0, 0, 0]]

In [None]:
s = [0]* 8
s[3] = 1
s

[0, 0, 0, 1, 0, 0, 0, 0]

In [None]:
# correct way:

table = [[0]*8 for _ in range(8)]
table[4][3] = 1
table

[[0, 0, 0, 0, 0, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 0, 0],
 [0, 0, 0, 1, 0, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 0, 0]]

In [None]:
# if the repeated value is of an IMMUTABLE type, the following is OK:

line = [0]*20
line[5] = 5
line


[0, 0, 0, 0, 0, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

# List manipulations

A list can be seen as a stack. A stack in computer science is a data structure which has at least two operations: one to put or push data onto the stack, and another one to take away the topmost element of the stack.

In [None]:
# append

lst = [3, 5, 7]
lst.append(42)
print(lst)

# It's important to understand that append returns "None". In other words, it
# usually doesn't make sense to reassign the return value:

print()
lst = [3, 5, 7]
lst = lst.append(42)
print(lst)

# pop

print()
cities = ["Hamburg", "Linz", "Salzburg", "Vienna"]
print(cities.pop(0), cities)
print(cities.pop())

[3, 5, 7, 42]

None

Hamburg ['Linz', 'Salzburg', 'Vienna']
Vienna


**But popping elements from anywhere other than the end of a list is not efficient, because all following elements have to be shifted!**
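
If elements have to be removed from the front regularly, one common alternative is `collections.deque` from the standard library. This is only a sketch of that option; the city "Graz" is added purely for illustration.

```python
from collections import deque

queue = deque(["Hamburg", "Linz", "Salzburg", "Vienna"])

first = queue.popleft()   # removes from the left end without shifting elements
queue.append("Graz")      # appending on the right works as with a list

print(first, list(queue))
```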

In [None]:
# extend
#
# append would add lst2 as ONE nested element:

lst = [42,98,77]
lst2 = [8,69]
lst.append(lst2)
print(lst)

# correct: extend adds the elements of lst2 one by one:

lst = [42,98,77]
lst2 = [8,69]
lst.extend(lst2)
print(lst)

# The argument of 'extend' doesn't have to be a list. It can be any kind of
# iterable. That is, we can use tuples and strings as well:

lst = ["a", "b", "c"]
programming_language = "Python"
lst.extend(programming_language)
print(lst)

[42, 98, 77, [8, 69]]
[42, 98, 77, 8, 69]
['a', 'b', 'c', 'P', 'y', 't', 'h', 'o', 'n']


#### Extending and Appending Lists with the '+' Operator

In [None]:
# There is an alternative to 'append' and 'extend'.
# '+' can be used to combine lists.

In [None]:
level = ["beginner", "intermediate", "advanced"]
other_words = ["novice", "expert"]
print(level + other_words)

# Be careful. Avoid the following for adding elements, especially inside loops:

print()
L = [3, 4]
L = L + [42]
print(L)

# Even though we get the same result, "L = L + [42]" builds a completely new
# list, so it is not an efficient alternative to 'append' and 'extend'.
# The augmented assignment (+=) extends the list in place and is an alternative:

print()
L = [3, 4]
L += [42]
print(L)

['beginner', 'intermediate', 'advanced', 'novice', 'expert']

[3, 4, 42]

[3, 4, 42]
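
The difference between the two forms can be observed with a second reference to the list. In this sketch, `alias` is a name chosen for illustration.

```python
L = [3, 4]
alias = L          # second reference to the same list

L += [42]          # in place: the existing object is extended
print(L is alias, alias)

L = L + [99]       # builds a new list and rebinds the name L
print(L is alias, alias)
```

After `+=` both names still refer to one list; after `L = L + [99]` the name `alias` keeps pointing to the old list.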


In [None]:
# In the following example, we will compare the different approaches and
# calculate their run times. To understand the following program, you need
# to know that time.time() returns a float number, the time in seconds since the
# so-called ``The Epoch`` (Epoch time (also known as Unix time or POSIX time) is
# a system for describing instants in time, defined as the number of seconds that
# have elapsed since 00:00:00 Coordinated Universal Time (UTC), Thursday, 1 January 1970,
# not counting leap seconds). time.time() - start_time calculates the time in
# seconds used for the for loops:

import time

n = 100000

start_time = time.time()
l = []
for i in range(n):
    l = l + [i * 2]
print(time.time() - start_time)

28.44809627532959


In [None]:
start_time = time.time()
l = []
for i in range(n):
    l += [i * 2]
print(time.time() - start_time)

0.026547670364379883


In [None]:
start_time = time.time()
l = []
for i in range(n):
    l.append(i * 2)
print(time.time() - start_time)

0.02409672737121582


We can see that the "+" operator is about 1,180 times slower than the append method in this run (28.45 s versus 0.024 s). The explanation is easy: If we use the append method, we will simply append a further element to the list in each loop pass. Now we come to the first loop, in which we use l = l + [i * 2]. The list will be copied in every loop pass, so the total work grows quadratically with n. The new element will be added to the copy of the list and the result will be reassigned to the variable l. After that, the old list will have to be removed by Python, because it is not referenced anymore. We can also see that the version with the augmented assignment ("+="), the loop in the middle, is only slightly slower than the version using "append".
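
For repeatable measurements, the standard-library module `timeit` is an alternative to calling `time.time()` by hand. The following is a sketch under assumptions: the function names are made up, and `n` is kept small so that the slow variant finishes quickly. The absolute numbers depend on the machine.

```python
import timeit


def build_with_plus(n):
    result = []
    for i in range(n):
        result = result + [i * 2]   # copies the whole list in every pass
    return result


def build_with_append(n):
    result = []
    for i in range(n):
        result.append(i * 2)        # adds one element to the existing list
    return result


n = 2000  # assumption: small enough to run quickly

# both variants produce the same list, only the cost differs
print(build_with_plus(n) == build_with_append(n))

for func in (build_with_plus, build_with_append):
    seconds = timeit.timeit(lambda: func(n), number=5)
    print(func.__name__, seconds)
```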

In [None]:
# remove

# It is possible to remove a certain value from a list without knowing the
# position with the method "remove":

L = [3, 4, 5, 4]
L.remove(4)
print(L)

# This call will remove the first occurrence of '4' from the list 'L'

[3, 5, 4]


In [None]:
# index - linear search in a list

# The method "index" can be used to find the position of an element within a list:

colours = ["red", "green", "blue", "green", "yellow"]
print(colours.index("green"))
print(colours.index('green', 2, 5))

# insert

# The functionality of the method "append" can be simulated with insert in the
# following way:

abc = ["a","b","c"]
abc.insert(len(abc),"d")
abc

1
3


['a', 'b', 'c', 'd']
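
One detail worth knowing: `index` raises a `ValueError` if the element is not in the list (and so does `remove`). A minimal sketch; using `-1` as the "not found" marker is an assumption made for this example, not a Python convention for lists.

```python
colours = ["red", "green", "blue"]

try:
    colours.index("purple")
except ValueError as err:
    print("ValueError:", err)

# checking with "in" first avoids the exception
position = colours.index("purple") if "purple" in colours else -1
print(position)
```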

# Shallow and Deep copy

**The problems which we will encounter are general problems of mutable data types.**

In [None]:
a1 = [1, 2, 3, 4, 5]
b1 = a1  # not a copy!

a1[0] = 0
print(a1)
print(b1)

[0, 2, 3, 4, 5]
[0, 2, 3, 4, 5]


In [None]:
# this is because:

print(id(a1), id(b1))  # same objects!

140614531682704 140614531682704


In [None]:
# for integers the result is different:

x1 = 1
x2 = x1
print(id(x1), id(x2))

print()
x1 = 5
print(x1, x2)
print(id(x1), id(x2))

# but this is because the int type is immutable! The assignment `x1 = 5` does
# not change the object `1`; it binds the variable `x1` to the object `5`.
# The old object `1` still exists and is still referenced by `x2`.

94600937544192 94600937544192

5 1
94600937544320 94600937544192


![](https://www.python-course.eu/images/deep_copy_detailed1_800w.webp)

In [None]:
colours1 = ["red", "blue"]
colours2 = colours1
print(colours1, colours2)

['red', 'blue'] ['red', 'blue']


#### Copying lists

We have finally arrived at the topic of copying lists. The list class offers the method copy for this purpose; slicing the whole list with `[:]` produces the same kind of copy. Even though the name seems to be clear, this path also has a catch.

In [None]:
L = [1,2,3]
L[:]

[1, 2, 3]

In [None]:
# The catch can be seen, if we use help on copy:

help(list.copy)

Help on method_descriptor:

copy(self, /)
    Return a shallow copy of the list.



First you should remember what a list is in Python. A list in Python is an object consisting of an ordered sequence of references to Python objects. The following is a list of strings:

![](https://www.python-course.eu/images/list_firstnames_800w.webp)

Basically, the list object is solely the blue box with the arrows, i.e. the references to the strings. The strings themselves are not part of the list. The following list whatever is a more general example of such a list:

![](https://www.python-course.eu/images/list_whatever_800w.webp)

When a list is copied, we copy the references.

In [None]:
person1 = ["Swen", ["Seestrasse", "Konstanz"]]

person2 = person1.copy()
person2[0] = "Sarah"
print(person1)
print(person2)

# Till now everything is OK

['Swen', ['Seestrasse', 'Konstanz']]
['Sarah', ['Seestrasse', 'Konstanz']]


![](https://www.python-course.eu/images/copy_nested_list_2_800w.webp)

In [None]:
person2[1][0] = "Bücklestraße"

print(person1)
print(person2)

# NOT OK: the nested list is shared, so person1 has changed as well

['Swen', ['Bücklestraße', 'Konstanz']]
['Sarah', ['Bücklestraße', 'Konstanz']]


![](https://www.python-course.eu/images/copy_nested_list_3_800w.webp)

#### Deepcopy

A solution to the described problem is provided by the module copy. This module provides the function "deepcopy", which creates a complete or deep copy of an arbitrary list, i.e. of flat as well as nested lists.

In [None]:
from copy import deepcopy
person1 = ["Swen", ["Seestrasse", "Konstanz"]]

person2 = deepcopy(person1)
person2[0] = "Sarah"
person2[1][0] = "Bücklestrasse"

print(person1)
print(person2)

# OK

['Swen', ['Seestrasse', 'Konstanz']]
['Sarah', ['Bücklestrasse', 'Konstanz']]


![](https://www.python-course.eu/images/copy_with_deepcopy_2_800w.webp)
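
The difference between the two kinds of copies can be checked with the `is` operator. In this sketch, `shallow` and `deep` are names chosen for illustration.

```python
import copy

person1 = ["Swen", ["Seestrasse", "Konstanz"]]

shallow = copy.copy(person1)    # same effect as person1.copy() or person1[:]
deep = copy.deepcopy(person1)

# the outer list is new in both cases, but only deepcopy duplicates the inner list
print(shallow is person1, shallow[1] is person1[1])   # False True
print(deep is person1, deep[1] is person1[1])         # False False
```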

#### Method split: string -> list of words

In [None]:
words = 'Sasha went through the highway and did something'.split()
words

['Sasha', 'went', 'through', 'the', 'highway', 'and', 'did', 'something']
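
`split` also accepts an explicit separator, and `str.join` is its counterpart for putting the pieces back together. A short sketch with invented sample strings:

```python
line = "red,green,blue"
parts = line.split(",")
print(parts)

# without an argument, runs of whitespace count as one separator
print("a  b".split())      # ['a', 'b']
# with an explicit separator, every single occurrence splits
print("a  b".split(" "))   # ['a', '', 'b']

# join is the inverse operation
print("-".join(parts))
```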

#### del in Python

In [None]:
del words[3:7]
words

['Sasha', 'went', 'through', 'something']
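
Besides slices, `del` also works on single indices, on slices with a step, and on names. A minimal sketch with an invented list `numbers`:

```python
numbers = [10, 20, 30, 40, 50]

del numbers[0]       # remove a single element by index
print(numbers)       # [20, 30, 40, 50]

del numbers[::2]     # remove every second remaining element
print(numbers)       # [30, 50]

alias = numbers
del alias            # removes only the name; the list itself is untouched
print(numbers)       # [30, 50]
```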