# infoShare Academy Python

## Knowledge test overview with answers

### [Very Easy] Question 1: Hello World

```
1. How to correctly write "Hello World"?
* print("Hello world!")
* print "Hello world!"
* print(Hello world!)
* print Hello world
```

#### Answer

Correct answer is

In [1]:
print("Hello World")

Hello World


In Python, the easiest way to print text is to use the built-in function `print`. Its first argument is the object we want to print. It can be a string (an `str` object, as in our case) or an object of any other type; in the latter case it will be converted to a string first. With `print` we can also print several objects at once. For example:

In [2]:
print('Ala', 'Jan', 'Adam')

Ala Jan Adam


The `print` function also accepts optional keyword arguments, which specify:
* `sep`: the separator placed between printed objects
* `end`: the string printed after the last object
* `file`: the place (stream) where the text should be printed
* `flush`: a flag that forces the output buffer to be flushed
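
The optional arguments listed above correspond to the keyword parameters `sep`, `end`, `file` and `flush` of `print`. The sketch below shows one possible use of each; the sample texts and the in-memory buffer are chosen only for illustration.

```python
import io
import sys

# sep: string placed between the printed objects (default: a single space)
print("Ala", "Jan", "Adam", sep=", ")

# end: string appended after the last object (default: a newline)
print("Loading", end="...")
print("done")

# file: any object with a write() method; here an in-memory text buffer
buffer = io.StringIO()
print("Ala", "Jan", sep="-", file=buffer)
captured = buffer.getvalue()

# flush: ask print to flush the stream right after writing
print("Flushed immediately", file=sys.stderr, flush=True)
```

Writing to a `StringIO` buffer is just one way to show that `file` does not have to be the screen; a file opened for writing would work the same way.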


Because `print` is a built-in function, you don't need to import it. In most cases the basic usage:

In [3]:
print("Text to print out")

Text to print out


will be enough, but it is useful to know all the features.

In Python 2, `print` was a statement (a keyword, like `return`). This made it possible to write:

In [4]:
# This line results in a SyntaxError in Python 3, because 'print' is not a keyword anymore.
print "Text"

SyntaxError: Missing parentheses in call to 'print'. Did you mean print("Text")? (<ipython-input-4-52e779ea1e92>, line 2)

This was changed in Python 3, where `print` is only a built-in function.


The `print` function in Python 3 is described in detail here:
https://docs.python.org/3/library/functions.html#print

Information about differences between Python 3 and 2:
https://docs.python.org/3/whatsnew/3.0.html

Information about `print` in Python 2:  
https://docs.python.org/2.7/reference/simple_stmts.html#print  
https://docs.python.org/2.7/library/functions.html#print

### [Very Easy] Question 2: String concatenation (joining)

```
2. Which of the following commands will assign a string of characters to the variable 'result'?
* result = "Py" + "thon"
* result = "Py" . "thon"
* result = "Py" x "thon"
* result = {"Py": "thon"}
```

#### Answer

Correct answer is:

In [5]:
result = "Py" + "thon"

In Python, one way to concatenate (join) strings is to use the `+` operator.
This is possible because the built-in type `str` implements the methods behind the common sequence operations, such as concatenation and indexing with operators.
This kind of statement is clear and easy to understand. To append a new string to an existing value (stored in a variable) we can use the `+=` operator:

In [6]:
welcome = "Hello"
welcome += " World!"
print(welcome)

Hello World!


When using the `+` operator to concatenate strings, we should keep in mind a restriction that follows from the nature of this type: `str` in Python is immutable (see also the question _Which of the following types is mutable?_).
Using `+` or `+=` with `str` creates a new `str` object.
In most cases the `+` operator is a good solution because it is easy to read.

However, remember that combining a lot of strings this way can hurt performance. In such cases it is better to use the `join()` method. For example:

In [7]:
r = ['r', 'r', 't', 's', 'd']
joined = ' '.join(r)
print("Using space (' ') character as separator:")
print(joined)

r = ['r', 'r', 't', 's', 'd']
joined = '\n'.join(r)
# Note that a raw string (e.g. r"raw string") is used below so that '\n' is not turned
# into a real newline when printed. Within raw strings, escape sequences (like \n, \t, etc.)
# are not interpreted.
print(r"Using new line ('\n') character as separator:")
print(joined)

r = ['r', 'r', 't', 's', 'd']
joined = '--+--'.join(r)
print("Using multiple character string ('--+--') as separator:")
print(joined)

Using space (' ') character as separator:
r r t s d
Using new line ('\n') character as separator:
r
r
t
s
d
Using multiple character string ('--+--') as separator:
r--+--r--+--t--+--s--+--d


Information about `str` type:  
https://docs.python.org/3/library/stdtypes.html#str  
https://docs.python.org/3/library/stdtypes.html#typesseq-common

Article about various string operations:  
https://realpython.com/python-string-split-concatenate-join/

### [Very Easy] Question 3: `pass` keyword

```
3. Which of the following instructions does not perform any operation?
* noop
* wait
* pass
* sleep
```

#### Answer
Correct answer: **`pass`**

The `pass` statement does not perform any operation. It can be used as a placeholder, e.g. for a function body, to keep the code structurally correct:

In [8]:
def not_sure_about_it():
    pass

`pass` can also be used when creating empty classes:

In [9]:
class MyNewClass:
    pass

Or when prototyping a class with a list of methods that we are not ready to implement yet:

In [10]:
class Basic:
    def new_basic_object(self):
        pass
    
    def addmethod(self):
        pass

**NOTE:** the need for this unusual keyword arises because Python uses indentation (typically 4 spaces) to delimit code blocks. A block cannot be empty, so without `pass` there would be no way to write a block that does nothing. With this in mind, it is not surprising that `pass` is also used within `for` loops and `if` statements.
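
As a small illustration of `pass` inside a loop and a conditional, consider the sketch below. The variable name `odd_numbers` and the "skip even numbers" scenario are made up for this example.

```python
odd_numbers = []
for number in range(5):
    if number % 2 == 0:
        # Nothing to do for even numbers yet; 'pass' keeps the block valid.
        pass
    else:
        odd_numbers.append(number)

print(odd_numbers)
```

Removing the `pass` line (and leaving only the comment) would make the `if` block empty, which Python rejects with an `IndentationError`.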

Information about `pass` statement in documentation:  
https://docs.python.org/3/reference/simple_stmts.html#the-pass-statement

### [Very Easy] Question 4: conditional statements (`if`)

```
4. Which of the following keywords is used to define a conditional statement?
* check
* when
* switch
* if
```

#### Answer

Correct answer: **`if`**

In Python, conditional statements are written according to the following pattern:
```
if logical_condition:
    code that will be executed when logical_condition is true
```

After this section we can put any number of optional `elif` blocks to define
alternative conditions:

```
elif another_condition:
    code that will be executed when none of the previous conditions was true but this one is true
elif yet_another_condition:
    ...
```

At the very end we can define an optional `else` block, which is executed if none of the previous conditions was true:

```
else:
    code that will be executed when none of the previous conditions was true
```
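
Put together, the three parts (`if`, `elif`, `else`) might look like the sketch below. The function name `describe_temperature` and the thresholds are hypothetical, chosen only to show the order in which the conditions are checked.

```python
def describe_temperature(degrees):
    # Conditions are checked from top to bottom; the first true one wins.
    if degrees < 0:
        return "freezing"
    elif degrees < 15:
        return "cold"
    elif degrees < 25:
        return "warm"
    else:
        # Reached only when none of the conditions above was true.
        return "hot"

print(describe_temperature(10))
```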
 

In practice, with a simple condition, this could look like this:

In [11]:
if 2 == 1:
    print("That's interesting!")
else:
    print("No Matrix here.")

No Matrix here.


It's possible to create more complex conditions using logical operators, e.g. `or` or `and`. Look at how the evaluation of the condition changes when we extend it with `or`:

In [12]:
if (2 == 1) or (2 > 1):
    print("That's interesting!")
else:
    print("No Matrix here.")

That's interesting!


In Python it is also possible (but **not recommended**) to write an `if` statement on one line:

In [13]:
if 1 > 0: print("Something is better than nothing.")

Something is better than nothing.


Or use a conditional expression (also called the *ternary operator*):

In [14]:
print("1 > 0") if 1 > 0 else print("1 <= 0")

1 > 0


More information:
https://realpython.com/python-conditional-statements/

### [Very Easy] Question 5: function definitions

```
5. How to correctly declare a function called 'get_full_name', having two 
arguments: 'first_name' and 'last_name'? 

* function get_full_name(first_name, last_name): 
* def get_full_name: first_name, last_name → 
* def get_full_name(first_name, last_name): 
* def get_full_name = (first_name, last_name) => 
```

#### Answer

Correct answer is:
```python
def get_full_name(first_name, last_name):
```

In Python, a function is declared using the keyword `def` followed by the name
of the function (we use this name to refer to the function when calling it). Next, inside
the round brackets, we list the names of the function arguments. They are
optional: a function may have no arguments at all, in which case the brackets are empty
(`()`).

The line ends with a colon, and the next line starts the definition of the function
body (the instructions to be executed within it).

In [15]:
def get_full_name(first_name, last_name):
    return first_name + " " + last_name

print(get_full_name("Krzysztof", "Jarzyna"))

Krzysztof Jarzyna


A rule of thumb is not to pass too many arguments to a function. When a function receives a lot of arguments, this might mean that its scope is too big. In such a case, consider dividing the function into several smaller ones.

Documentation:  
https://docs.python.org/3/tutorial/controlflow.html#defining-functions  
Basic information about functions in Python:  
https://www.programiz.com/python-programming/function

#### Example: string formatting made easy


The function above could be reimplemented using the string formatting syntax (f-strings) introduced in Python 3.6.

In short, in a string literal prefixed with `f`, anything inside curly braces is evaluated as a Python expression and its result is inserted into the string.

For example (note the `f` before the opening quote):

In [16]:
def get_full_name(first_name, last_name):
    return f"{first_name} {last_name}"

print(get_full_name("Krzysztof", "Jarzyna"))

Krzysztof Jarzyna


As you can see, the result is the same. However, the expressions within curly braces can now easily be extended beyond simple references to local variables.

Let's use the `*` operator, which for strings creates a new string containing the text repeated a given number of times, e.g.:


In [17]:
'a' * 5

'aaaaa'

New function using the string multiplication:

In [18]:
def get_full_name2(first_name, last_name):
    return f"{first_name*2} {last_name*2}"

print(get_full_name2("Krzysztof", "Jarzyna"))

KrzysztofKrzysztof JarzynaJarzyna


For more details about this type of string formatting, see PEP 498:  
https://www.python.org/dev/peps/pep-0498/ 

It's also worth noting at this point that there are other interesting Python Enhancement Proposal (PEP) documents, like PEP 8, which describes good code style practices:  
https://www.python.org/dev/peps/pep-0008/

#### Example: passing arguments to functions as tuples or lists

Sometimes, as the result of previous processing, data is stored in a tuple or a list. Often it comes from queries to an external data source, such as a database.

In such a case, a shorthand syntax exists to pass each value as a separate argument to an existing function. For example, we have a tuple:

In [19]:
the_name = ("Krzysztof", "Jarzyna")

Instead of manually indexing each value and passing it into a previously defined `get_full_name` function, like this:

In [20]:
print(get_full_name(the_name[0], the_name[1]))

Krzysztof Jarzyna


we can use the `*` notation to unpack the values automatically:

In [21]:
print(get_full_name(*the_name))

Krzysztof Jarzyna


Of course, when using this notation, the size of the tuple must be exactly the same as the number of arguments required to call the function.

For example, if our tuple has only one element:

In [22]:
the_name_too_short = ("Krzysztof", )

using the above-mentioned syntax will result in `TypeError`:

In [23]:
print(get_full_name(*the_name_too_short))

TypeError: get_full_name() missing 1 required positional argument: 'last_name'

Similarly, if the tuple is too long, `TypeError` will also occur. For example:

In [24]:
the_name_too_long = ("Krzysztof", "Marek", "Jarzyna")

cannot be passed to the same function this way:

In [25]:
print(get_full_name(*the_name_too_long))

TypeError: get_full_name() takes 2 positional arguments but 3 were given

#### Example: passing arguments to functions as dictionaries


The examples above dealt with positional arguments. However, there is also a way to pass keyword (named) arguments directly from a collection. Such a collection needs to hold both a *key* and a *value* for each argument, so dictionaries are used. In Python, the basic dictionary type (known as a *map* in some other languages) is the class `dict`.

Let's assume that we redefined our function to use default values:

In [26]:
def get_full_name_kwargs(first_name=None, last_name=None):
    return f"{first_name} {last_name}"

It's worth noting that you can still pass arguments from a tuple to such a function as positional ones:

In [27]:
print(get_full_name_kwargs(*the_name))

Krzysztof Jarzyna


However, in this case our data is stored as a dictionary:

In [28]:
the_name_dict = {"first_name" : "Krzysztof", "last_name" : "Jarzyna"}

In dictionaries we can access the value for each individual key using the square brackets syntax (`[]`):

In [30]:
print(the_name_dict["first_name"])

Krzysztof


We could manually retrieve the value for each key and pass it as the respective argument:

In [31]:
print(get_full_name_kwargs(first_name=the_name_dict["first_name"],
                           last_name=the_name_dict["last_name"]))

Krzysztof Jarzyna


This works, but it's not very *pythonic*. It turns out that, similarly to the single star (`*`) syntax for ordered collections (tuples, lists), there is also a double star (`**`) syntax for mappings, such as dictionaries.

Shortening the code above to its `**` equivalent results in:

In [32]:
print(get_full_name_kwargs(**the_name_dict))

Krzysztof Jarzyna


What's interesting is that you can also use this notation to call the original function, which did not have any default values:

In [33]:
print(get_full_name(**the_name_dict))

Krzysztof Jarzyna


This works because you can always pass values to function arguments by name.
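
A short sketch of what passing by name implies: the order of keyword arguments does not matter, but every key must match an argument name. The dictionary `the_name_extra` with its `middle_name` key is a hypothetical example, not part of the original data.

```python
def get_full_name(first_name, last_name):
    return f"{first_name} {last_name}"

# Keyword arguments can be given in any order.
swapped_order = get_full_name(last_name="Jarzyna", first_name="Krzysztof")
print(swapped_order)

# A key that does not match any argument name results in TypeError.
the_name_extra = {
    "first_name": "Krzysztof",
    "middle_name": "Marek",  # hypothetical extra key
    "last_name": "Jarzyna",
}
try:
    get_full_name(**the_name_extra)
except TypeError as error:
    error_message = str(error)
    print(error_message)
```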

### [Easy] Question 6: class method arguments

```
6. What is the conventional name of the first argument of a method (a function in a class)? 
* this 
* self 
* that 
* arg 
```

#### Answer

Correct answer: `self`

The first argument a method receives is a reference to the instance of the class (object) on which the method was called. This argument is passed implicitly: when calling the method we do not pass it ourselves, Python adds it automatically when making the call.

Method example:

In [34]:
class Contract:
    def generate(self, law_policy_name):
        pass

and the method call:

In [35]:
employee_contract = Contract()
employee_contract.generate("polish_law")

In this example, in the body of the `generate` method, the `self` argument gives access to the object stored in the `employee_contract` variable. The `law_policy_name` argument holds the value *polish_law*.  
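
To make the implicit passing of `self` visible, here is a minimal sketch. It uses a variant of `Contract` whose `generate` method stores its argument and returns `self`; that body is an assumption made for this illustration only (the original method just uses `pass`).

```python
class Contract:
    def generate(self, law_policy_name):
        # 'self' is the object on which the method was called.
        self.law_policy_name = law_policy_name
        return self

employee_contract = Contract()

# Usual call: Python passes employee_contract as 'self' automatically.
returned = employee_contract.generate("polish_law")

# Equivalent explicit call through the class, passing 'self' by hand.
also_returned = Contract.generate(employee_contract, "polish_law")

print(returned is employee_contract)
print(employee_contract.law_policy_name)
```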

Naming the first argument of the method differently (e.g. `object_reference`)
would be syntactically correct (it would work). However, it is inconsistent with the
conventions and style rules of Python code, defined by PEP 8:
https://www.python.org/dev/peps/pep-0008/#function-and-method-arguments 

For example, the following method definition uses a different name for the first argument:

In [36]:
class More:
    def read(i_should_not_be_called_this_way):
        print(i_should_not_be_called_this_way)

Still, by calling this method we can see that the variable `i_should_not_be_called_this_way` contains a reference to the object (instance) itself:

In [37]:
More().read()

<__main__.More object at 0x108017f60>


**NOTE:** in the single line above we have both:
1. created an object of class `More` using the default constructor (no custom `__init__` method was called) *without naming it*
2. directly called the method `read()` on the newly created object

This combination might be helpful when there is no need to keep a reference to the object in a variable after the method has executed.

### [Easy] Question 7: underscores within numbers

```
7. What is the result of the expression: 1_2 + 3_4 ? 
* 4.6 
* -2 
* 46 
* this action will cause an error 
```

#### Answer

Correct answer is  `46`.

The notation `1_2` is equivalent to `12`: it is just the number `12`. Underscores are ignored by
the interpreter. The same applies to `3_4`.  
The underscore notation was introduced in Python 3.6 to make it easier to write large numbers in particular, e.g. instead of:

In [38]:
money = 10000000

we can write:

In [39]:
money = 10_000_000

which is easier to read, but the value is the same:

In [40]:
print(money)

10000000


This idea was introduced by PEP 515:  
https://www.python.org/dev/peps/pep-0515/
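
Underscores are not limited to decimal integers. The sketch below shows a few other places where they are accepted; the variable names and values are arbitrary examples.

```python
# Hexadecimal literal with digits grouped by an underscore
hex_value = 0xFF_FF

# Floating point literal
float_value = 1_000.5

# int() also accepts underscores in strings (Python 3.6+)
parsed = int("1_000")

print(hex_value, float_value, parsed)

# Note: an underscore must sit between digits, so literals such as
# 1__000 or 1000_ are syntax errors.
```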

**NOTE:** if you want to print big numbers in a more readable way, you can use the grouping options of the (previously introduced) string formatting.

To separate the thousands, it's possible to use either `,` (comma):

In [41]:
print(f"{money:,}")

10,000,000


or `_` (underscore):

In [42]:
print(f"{money:_}")

10_000_000


For more details, look at *Format Specification Mini-Language* :   
https://docs.python.org/3.7/library/string.html#format-specification-mini-language

### [Easy] Question 8: context managers and `with` keyword

```
8. Which of the following keywords is used to enable a context manager? 
* with 
* using 
* go 
* run_manager 
```

#### Answer

Correct answer is `with`.

A context manager allows you to create a readable and reusable implementation of
a situation in which some instructions should always be executed before and after
another, varying part. A common example is working with files:

```python
with open("path/to/file") as data_file:
    for line in data_file:
        print(line)
```


When working with files we always want to:
* open the indicated file
* perform specific operations on the file
* close the file  

What is important, the file should always be closed, regardless of whether the
operations performed on it were successful or were aborted by an exception. Using
a context manager guarantees this.


Here's an example of creating a file in one context and then opening and reading it in a second one:

In [43]:
with open('message.txt', 'w') as f:
    f.write("OK Boomer")


with open('message.txt', 'r') as f:
    print(f.read())

OK Boomer


An alternative would be to use `try (…) except (…) finally`. However, when working
with files, for example, this would be less readable and would duplicate the logic of
error handling and file closing.
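
For comparison, here is a minimal sketch of the same file handling written once with `try`/`finally` and once with `with`. The file name `message_example.txt` and its location in the temporary directory are assumptions made so that the example can run anywhere.

```python
import os
import tempfile

# Hypothetical file location used only for this example.
path = os.path.join(tempfile.gettempdir(), "message_example.txt")

# Manual version: we have to remember to close the file ourselves.
data_file = open(path, "w")
try:
    data_file.write("OK Boomer")
finally:
    # Executed whether or not write() raised an exception.
    data_file.close()

# Context manager version: closing happens automatically.
with open(path) as data_file:
    content = data_file.read()

closed_after_with = data_file.closed
print(content, closed_after_with)
```

After the `with` block ends, the file object still exists but is already closed, which is what the `closed` attribute shows.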

More information about context manager:  
https://docs.python.org/3/reference/compound_stmts.html#with

### [Easy] Question 9: importing modules

```
9. Which of the following instructions is used to enable a csv module in our code? 
* require('csv') 
* use csv 
* use('csv') 
* import csv 
```

#### Answer

Correct answer is `import csv`.

In Python, a module is just a file containing Python code. To use a different module
(and the functions, classes, variables, etc. it provides) in your own code, it must
be imported. The import statement tells the interpreter to find and load the module
and to bind it to a name that the following code can use.
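
As a quick illustration of using the imported `csv` module from the question, the sketch below parses a small piece of CSV text. The sample data and the use of `io.StringIO` in place of a real file are assumptions made to keep the example self-contained.

```python
import csv
import io

# In-memory text standing in for a CSV file (made-up sample data).
data = io.StringIO("name,grade\nAla,5\nJan,4\n")

# csv.reader yields each line as a list of strings.
rows = list(csv.reader(data))
for row in rows:
    print(row)
```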


Note that the import statement conveys both where the module is located and under
what name it will be available. The same import system is used for:
* a module from the standard library
* a module from an external, previously installed library
* another module implemented within the same application (e.g. a class or function from another file)


Modules are organized into hierarchical structures, like files on a hard disk. This
structure consists of modules called packages (the equivalent of a directory
on your hard disk), which may contain other packages (subdirectories) or modules
that aren't packages (files). When importing a specific module, its location is
indicated using dotted notation to separate subsequent levels of depth:

In [44]:
import os.path
print(os.path.join("directory", "file"))

directory/file


To refer to an imported module, function etc. without the full dotted path, use the `from (...) import (...)` construct. For example:

In [45]:
from os import path
print(path.join("directory", "file"))

directory/file


It is also possible to give an alias to an imported module and refer to it via this alias:

In [46]:
from os import path as system_path
print(system_path.join("directory", "file"))

directory/file


The path to the module can be given in both absolute (preferred in most cases) and
relative form.

More details about importing modules in Python:  
https://docs.python.org/3/reference/import.html

PEP 8 (import):  
https://www.python.org/dev/peps/pep-0008/#imports

Articles:  
https://www.digitalocean.com/community/tutorials/how-to-import-modules-in-python-3  
https://realpython.com/absolute-vs-relative-python-imports/

### Question 10: `len` function

```
10. Which of the following instructions will return the length of the string "abc" 
(number of letters)? 
* len("abc") 
* length("abc") 
* "abc".length() 
* "abc".get_length() 
```

#### Answer

Correct answer is

In [47]:
len("abc")

3

The built-in `len` function returns the number of elements (length) of an object. The
type `str` is a sequence of characters, so in this case `len` returns the
length of the sequence (number of characters / letters).

For a list, the function
returns the number of elements in the list:

In [48]:
len([1,2,3,4])

4

For a dictionary, it returns the number of `key: value` pairs:

In [49]:
len({'a': 1, 'b': 2, 'c': 3})

3

To make the `len` function work for your own class, you must follow the
expected protocol; in this case, implement the `__len__` method. When someone
calls `len` on an object of this class, the `__len__` method is called
and its return value becomes the result of `len`.
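
A minimal sketch of this protocol is shown below. The class `Classroom` and its attribute `student_names` are hypothetical names invented for this example.

```python
class Classroom:
    def __init__(self, student_names):
        self.student_names = list(student_names)

    def __len__(self):
        # Called by the built-in len(); must return a non-negative integer.
        return len(self.student_names)

classroom = Classroom(["Ala", "Jan", "Adam"])
classroom_size = len(classroom)
print(classroom_size)
```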

Also note that `len` is unusual in that it is a built-in function rather than a method of the object. This is a rare example (in Python) of favoring convenience of use over strict design consistency.

Information about `len` function in documentation:  
https://docs.python.org/3/library/functions.html#len

Basic information about `__len__`:  
https://stackoverflow.com/questions/2481421/difference-between-len-and-len

### [Medium] Question 11: list comprehensions

```
11. The following code snippet:

some_variable = [number for number in range(10)]

is an example of usage of: 

* List comprehension 
* Dict comprehension 
* Generator expression 
* Tuple expression 
```

#### Answer

Correct answer is  *list comprehension*.

A list comprehension is a concise way of creating a new list based on an existing one
(a list or another sequence). It is an alternative to using a standard `for` loop.

In [50]:
some_variable = [number for number in range(10)]
print(some_variable)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]


The statement above can also be written in this (longer) way:

In [51]:
some_variable = []
for number in range(10):
    some_variable.append(number)
print(some_variable)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]


In this case, the list comprehension gives a concise and easy-to-read
form. In this example, the role of the existing sequence is played by `range(10)`, which is a
sequence of numbers from 0 to 9 inclusive. A list comprehension may also have a
condition that allows you to filter the elements: 

In [52]:
some_variable = [number for number in range(10) if number > 5]
print(some_variable)

[6, 7, 8, 9]


A more extensive example:

In [53]:
class Student(object):
    def __init__(self, final_grade):
        self.final_grade = final_grade
        
    def __repr__(self):
        return f"Student<grade={self.final_grade}>"

students = [Student(3), Student(5), Student(6), Student(5)]
best_students = [student for student in students if student.final_grade > 4]
print(best_students)

[Student<grade=5>, Student<grade=6>, Student<grade=5>]


Here's an example of making all names in a list upper case, using a list comprehension to create a new list:

In [54]:
all_user_names = ['John Johnny', 'Mat Mattly', 'Steve Steveney']
for upcase_name in [name.upper() for name in all_user_names]:
    print(upcase_name)

JOHN JOHNNY
MAT MATTLY
STEVE STEVENEY


Similarly to nested loops, list comprehensions can also be nested.
However, keep in mind that for more complex expressions it is better to
use a loop, because complex list comprehensions quickly become hard to
read.

In Python, there are also dict comprehensions and set comprehensions (working
analogously to list comprehensions, but producing other data types), as well as
generator expressions (working in a similar way).
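
The other forms mentioned above differ mainly in the brackets used. A short sketch follows; the variable names are chosen only for illustration.

```python
# Dict comprehension: curly braces with key: value pairs
squares_by_number = {number: number ** 2 for number in range(5)}

# Set comprehension: curly braces with single values (duplicates collapse)
remainders = {number % 3 for number in range(10)}

# Generator expression: round brackets; values are produced lazily,
# one at a time, instead of building a whole list in memory
lazy_squares = (number ** 2 for number in range(5))
total = sum(lazy_squares)

print(squares_by_number)
print(remainders)
print(total)
```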


More details:  
https://docs.python.org/3/tutorial/datastructures.html#list-comprehensions  
and more:  
https://djangostars.com/blog/list-comprehensions-and-generator-expressions/

Information about ranges:  
https://docs.python.org/3.7/library/stdtypes.html#ranges


### [Medium] Question 12: mutable and immutable types

```
12. Which of the following types is mutable? 
* bool 
* str 
* list 
* int
```

#### Answer

Correct answer is `list`.


In Python, as in most programming languages, data types can be mutable or
immutable. Immutable objects can't be changed. 

The type `str` is an example, as its concatenation operation shows (see Question 2:
String concatenation). To
complete a name with its missing part, the `+=` operator can be used:

In [55]:
name = "Mik"
print(id(name))
name += "olaj"
print(id(name))

4430046968
4430047584


In the example above we use the `id` function, which returns the identity of an object (in CPython, its address in memory). If `+=` were to modify the existing object instead of creating a new one, both values would be the same. Here the values differ, so it is clear that Python creates a new `str` object with the modified value.

This can also be illustrated using the `is` operator, which checks whether two variables point to exactly the same object (not merely an equal one). The check could look like this:

In [56]:
part_name = "Mik"
name = part_name
name is part_name

True

In [57]:
name += "olaj"
name is part_name

False

In [58]:
print(name)

Mikolaj


In [59]:
print(part_name)

Mik


Therefore, modifying an object of an immutable type is not possible. Each time a new
object is created and the "old" one remains intact. The immutable built-in Python
types include:
* bool
* int
* float
* tuple
* frozenset
* str

On the other hand, a mutable object can be modified. Examples of mutable types:
* list
* set
* dict

Instances of newly defined classes are also (typically) mutable by default.

A similar example to the previous one, but for `list` object, would look like this:

In [60]:
numbers = [1, 2, 3]
new_numbers = numbers
new_numbers is numbers

True

In [61]:
new_numbers += [4]
new_numbers is numbers

True

In [62]:
print(new_numbers)

[1, 2, 3, 4]


In [63]:
print(numbers)

[1, 2, 3, 4]


In this example, the list object is modified, and the change is visible through both the
"new" and the "old" variable, because both point to the same object (no new object was
created, unlike in the case of type `str`).

NOTE: This property has implications when defining a function whose argument has a mutable default value. 
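
The implication mentioned in the note can be sketched as follows. The function names `add_grade_shared` and `add_grade` are hypothetical; the point is that a default value is created once, when the function is defined, and then shared between calls.

```python
def add_grade_shared(grade, grades=[]):
    # The same default list object is reused on every call.
    grades.append(grade)
    return grades

first = add_grade_shared(5)
second = add_grade_shared(3)
print(first is second)  # both names point to the one shared list
print(first)

def add_grade(grade, grades=None):
    # Common workaround: use None as the default and create a new list inside.
    if grades is None:
        grades = []
    grades.append(grade)
    return grades

print(add_grade(5))
print(add_grade(3))
```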

More details:  
https://docs.python.org/3/library/stdtypes.html#immutable-sequence-types  
https://docs.python.org/3/glossary.html#term-immutable  
https://docs.python.org/3/glossary.html#term-mutable  

Article:  
https://medium.com/@meghamohan/mutable-and-immutable-side-of-python-c2145cf72747


### [Medium] Question 13: sequential collections slicing

```
13. How to return the last element from the list: grades = [5, 3, 2, 5]?
* grades[last] 
* grades[-1] 
* grades[len(grades)] 
* grades[4] 
```

#### Answer

Correct answer is `grades[-1]`:

In [64]:
grades = [5, 3, 2, 5]
grades[-1]

5

In Python we can refer to individual items of a list by index,
providing the item number in square brackets. The first item in the list has an index
of 0.

So calling:

In [65]:
print(grades[1])

3


will print `3`. 

To index from the end of the list we use negative values. So `-1` is the index
of the last item in the list, `-2` of the second to last, etc. 
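
The sketch below contrasts the correct answer with two of the incorrect ones from the question: since indexes start at 0, the last valid positive index is `len(grades) - 1`, and `grades[len(grades)]` (here the same as `grades[4]`) is out of range.

```python
grades = [5, 3, 2, 5]

last_by_negative_index = grades[-1]
last_by_length = grades[len(grades) - 1]
print(last_by_negative_index, last_by_length)

try:
    grades[len(grades)]  # index 4 does not exist in a 4-element list
except IndexError as error:
    index_error_message = str(error)
    print(index_error_message)
```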

As a warm-up to one of the following questions, there is also a syntax called *slicing*, through which you can extract sublists. A couple of examples are shown below; details will be explained later in this notebook:

In [66]:
print(grades[:])

[5, 3, 2, 5]


In [67]:
print(grades[-4:])

[5, 3, 2, 5]


In [68]:
print(grades[-3:])

[3, 2, 5]


In [69]:
print(grades[-5:])

[5, 3, 2, 5]


In [70]:
print(grades[-200:120])

[5, 3, 2, 5]


Sequence operations:  
https://docs.python.org/3/library/stdtypes.html#common-sequence-operations

Information about lists:  
https://www.programiz.com/python-programming/list

### [Medium] Question 14: handling exceptions syntax

```
14. Which keyword is NOT an element of the exception handling block (`try …`)?
* except 
* finally 
* catch 
* else 
```

#### Answer

The correct answer is: `catch`.

In Python, the exception handling block begins with the keyword `try` followed by
the code executed within the block. Next comes the exception handling section (`except`,
which may occur several times) or the `finally` section, which is always executed,
regardless of whether an exception occurred. After the keyword `except` you can
specify the type of exception to be "caught" and handled in the block. If the raised
exception does not match the given type, the interpreter looks for the next `except`
section with a matching type. Not specifying any type catches all types of
exceptions. This approach is not recommended, because it hides unexpected errors and
limits the flexibility of the implementation. A good practice is to name the most
specific exception type you intend to handle. To handle several different types of
exceptions in different ways, prepare a separate `except` section for each. In the
exception handling block it is allowed to omit either the `except` or the `finally`
section, but at least one of them must occur (if both, then in the order: first
`except`, then `finally`). There may also be an optional `else` section, placed after
the `except` sections and before `finally`, which is executed only if no exception
was raised.


An example of an exception handling block might look like this:

In [71]:
data_file = open("message.txt")
try:
    for line in data_file:
        print(line)
except IOError:
    print("Something went wrong...")
else:
    print("No problem :)")
finally:        
    data_file.close()

OK Boomer
No problem :)


The above example is simplified for the purpose of presenting the possibilities of
the `try…` structure. This applies primarily to the error handling in the `except` block:
printing a message on the screen is not a good way to handle an exceptional case. If
all we need to do with a given exception is record that it occurred,
we should use e.g. the `logging` module and store more detailed data in the program
log. As for file handling itself, the preferred solution is to use the available
context manager (see Question 8: context managers and `with` keyword).
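
A minimal sketch of the approach suggested above, combining a context manager with the `logging` module, might look like this. The function name `read_lines`, the log messages and the file name are hypothetical; the example also assumes that no file with that name exists in the current directory.

```python
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

def read_lines(path):
    try:
        with open(path) as data_file:
            lines = [line.rstrip("\n") for line in data_file]
    except FileNotFoundError:
        # logger.exception records the message together with the traceback.
        logger.exception("Could not open %s", path)
        return []
    else:
        # Runs only when no exception was raised in the try block.
        logger.info("Read %d lines from %s", len(lines), path)
        return lines

# Assumed not to exist, so the except branch is taken.
missing = read_lines("no_such_file_for_this_example.txt")
print(missing)
```

Returning an empty list on failure is just one possible design; depending on the application, re-raising the exception after logging it may be more appropriate.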

Information about exceptions and their handling:  
https://docs.python.org/3.7/tutorial/errors.html

### [Medium] Question 15: class definition with inheritance

```
15. How to declare a class 'FullTimeContract', which inherits from the class 'Contract'? 
* class FullTimeContract(Contract): 
* class FullTimeContract inherits Contract: 
* class FullTimeContract implements Contract: 
* class FullTimeContract: Contract 
```

#### Answer

Correct answer is `class FullTimeContract(Contract):`.

Inheritance means that a type (the inheriting, child type) is a more specific version of a
general type (the base, parent type). In the example with the contract,
`FullTimeContract` is a particular kind of the general `Contract`. This allows objects of different
types inheriting from the same parent to be treated in the same way, while the
implementation of a called method is chosen at call time based on the actual class of the object.
This concept is called polymorphism.

In [72]:
class Contract:
    def calculate_value(self, salary_per_hour):
        return salary_per_hour
    
class FullTimeContract(Contract):
    def calculate_value(self, salary_per_hour):
        return 160 * salary_per_hour
    
class PartTimeContract(Contract):
    def calculate_value(self, salary_per_hour):
        return 80 * salary_per_hour

contracts = [FullTimeContract(), PartTimeContract()]
total_amount = 0
for contract in contracts:
    total_amount += contract.calculate_value(salary_per_hour=100)
print(total_amount)

24000


In the implementation of a child class, you can refer to the properties and methods
defined in the base class using the `super()` construct. The child class can also
extend the base class by adding new fields and methods, as well as overriding the
implementation of a method already defined in the base class. Classes are
associated with the object-oriented programming paradigm, which is a very broad
topic, but also an important element of programming skills.
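
A short sketch of how `super()` might be used. The `describe` method and the `hours_per_month` field are hypothetical names introduced only for this illustration:

```python
class Contract:
    def __init__(self, salary_per_hour):
        self.salary_per_hour = salary_per_hour

    def describe(self):
        return f"Contract paid {self.salary_per_hour} per hour"


class FullTimeContract(Contract):
    def __init__(self, salary_per_hour, hours_per_month=160):
        # Let the base class initialise the fields it owns.
        super().__init__(salary_per_hour)
        # New field added only by the child class.
        self.hours_per_month = hours_per_month

    def describe(self):
        # Extend, rather than replace, the base implementation.
        return f"{super().describe()}, {self.hours_per_month} hours per month"


print(FullTimeContract(100).describe())
# Contract paid 100 per hour, 160 hours per month
```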


Information about classes in Python:  
https://docs.python.org/3/tutorial/classes.html

### [Hard] Question 16: type hints

```
16. Declaration 'def get_distance() -> Optional[int]:' means that function 'get_distance' should: 
* Always return a value of type 'int' 
* Take the parameter 'out' of type 'int' 
* be invoked with any number of parameters of type 'int' 
* Always return a value of type 'int' or value 'None'
```

#### Answer

Correct answer is: `Always return a value of type int or a value of None`.


The notation used above is *type hints*. It allows you to declare the expected types of
function arguments, returned values and variables. Python is a dynamically typed
language. However, the growing support for type hints in newer versions of the
language makes it possible to use additional tools that make existing code easier to
read and reduce the number of errors.

Type hints are not enforced at runtime, so the above declaration won’t
prevent the function from being implemented in the following way, where we're returning `str` instead of `int`:

In [73]:
from typing import Optional
def get_distance() -> Optional[int]:
    return "This is wrong"

but it will be easier to detect this type of error:
* Development environments (IDEs like PyCharm) will flag this implementation as a potential error
* Information about the expected types allows for better code completion suggestions
* A so-called type checker (e.g. the mypy tool) will report an error when the code is checked (e.g. before merging changes into the shared repository), and not when it is executed in a test or production environment

The annotation `-> int` means that the given function should always return a value of type
`int`. Wrapping it in `Optional` also allows the value `None` to be returned.
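
For comparison, here is a sketch of a function whose implementation does match an `Optional[int]` annotation. The function `parse_distance` is a hypothetical example, not part of the test question:

```python
from typing import Optional, Union


def parse_distance(text: str) -> Optional[int]:
    """Return the distance as an int, or None if the text is not a whole number."""
    try:
        return int(text)
    except ValueError:
        return None


# Optional[int] is just a shorthand for "int or None".
print(Optional[int] == Union[int, None])  # True
print(parse_distance("42"))   # 42
print(parse_distance("far"))  # None
```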

Information about the type hints:  
https://docs.python.org/3/library/typing.html

Mypy tool:  
https://github.com/python/mypy

Carl Meyer about type hints:  
https://youtu.be/pMgmKJyWKn8


### [Hard] Question 17: dictionary keys requirements

```
17. What type can NOT be a key in a dictionary? 
* int 
* str 
* bool 
* list 
```

#### Answer

Correct answer is `list`.

The key in the dictionary can be any object that is "hashable". This means objects whose
hash value doesn’t change during their lifetime and which can be compared with other
objects; objects that compare equal must also have the same hash value. Thus we should see `TypeError` for the following statement, which tries to use `list` objects as keys:

In [74]:
my_dict = {[1,1] : 'a', [1,2] : 'b'}

TypeError: unhashable type: 'list'

However, we can use tuples as dictionary keys without problems, because they are immutable and (as long as all their elements are hashable) hashable:

In [75]:
my_dict = {(1,1) : 'a', (1,2) : 'b'}

Types `int`, `str` and `bool` are correct dictionary keys. The
`list`, like other built-in mutable types, isn’t hashable, and therefore can’t be used as a
key in the dictionary.

We can use our own classes as dictionary keys if we make them *hashable* by implementing the `__hash__` method (together with a consistent `__eq__`):

In [76]:
class example(object):
    def __init__(self, a):
        self.value = a
    def __eq__(self, rhs):
        return self.value == rhs.value
    def __hash__(self):
        return hash(self.value)

a = example(1)
d = {a: "first"}
a.data = 2  # adds an unrelated attribute; `value` (and therefore the hash) is unchanged
print(d[example(1)])

first
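
The requirement that the hash must not change can be illustrated with a small sketch. The class below is a hypothetical variant of the one above; it shows what may go wrong when the attribute used by `__hash__` is modified after the object has been used as a key:

```python
class Example:
    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return isinstance(other, Example) and self.value == other.value

    def __hash__(self):
        return hash(self.value)


key = Example(1)
d = {key: "first"}

# Changing the attribute that feeds __hash__ "loses" the entry:
key.value = 2
print(Example(1) in d)  # False: the stored key no longer equals Example(1)
print(Example(2) in d)  # False: the entry was stored under the old hash
print(len(d))           # 1: the entry still exists, but cannot be looked up
```

This is why objects used as keys should be treated as immutable, at least with respect to the attributes that take part in `__eq__` and `__hash__`.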


Information about dictionary and hashable:    
https://docs.python.org/3/library/stdtypes.html#typesmapping  
https://docs.python.org/3/glossary.html#term-hashable

### [Hard] Question 18: empty list as function argument default value

```
18. How to correctly use an empty list as a default argument for a function? 
* def calculate(numbers=[]): 
* def calculate(numbers=None): 
      if numbers is None: 
          numbers = []
* def calculate(): 
      numbers = [] 
* def calculate(numbers: []):
```

#### Answer

Correct answer is:
```python
def calculate(numbers=None):
    if numbers is None:
        numbers = []
```


This construction is due to the fact that the list is a mutable type and to the way
Python processes the function definition and default arguments. Default values are
evaluated and assigned to function arguments when the interpreter executes the
function definition (the `def` statement), not on each function call. So this usually
happens only once during the whole program execution. Giving an empty list directly
as the default argument will cause it to be one and the same list for each call.
Because this is a mutable type, all changes made on the first call will be visible on
the second. The following example illustrates this well:

In [77]:
def more_numbers(numbers=[]):
    print(numbers)
    numbers.append(1)

In [78]:
more_numbers()

[]


In [79]:
more_numbers()

[1]


In [80]:
more_numbers()

[1, 1]


In each subsequent call, the function uses the same list that has already been
modified by previous executions. Using `None` as the default
value and assigning the list to a variable only in the body of the function is the
accepted way to solve this problem. The same applies to all mutable types
used as default arguments of a function.
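
A sketch of the same function rewritten with the `None` idiom. It returns the list instead of printing it, which is a change made here only to keep the example easy to check:

```python
def more_numbers(numbers=None):
    # A fresh list is created on every call that does not pass one in.
    if numbers is None:
        numbers = []
    numbers.append(1)
    return numbers


print(more_numbers())     # [1]
print(more_numbers())     # [1]
print(more_numbers([5]))  # [5, 1]
```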


Default arguments:  
https://docs.python.org/3/tutorial/controlflow.html#default-argument-values

Using mutable types as the default arguments of the function:  
https://nikos7am.com/posts/mutable-default-arguments/


### [Hard] Question 19: function calling (keyword vs positional arguments)

```
19. Which of the function: 'def download(url, timeout=5):' invocation is NOT correct? 
* download("www.infoshareacademy.com") 
* download(url="www.infoshareacademy.com", 10) 
* download(timeout=10, url="www.infoshareacademy.com") 
* download(url="www.infoshareacademy.com")
```

#### Answer

Correct answer is:
```python
download(url="www.infoshareacademy.com", 10)
```

Such a call will cause an error (`SyntaxError`) resulting from passing a positional
argument after the named argument:

In [81]:
def download(url, timeout=5):
    print(url, timeout)
    
download(url="www.infoshareacademy.com", 10)

SyntaxError: positional argument follows keyword argument (<ipython-input-81-0c67f225e60d>, line 4)

Arguments passed to a function or method
can be positional arguments (without the argument name and the `=` character) or
named arguments (keyword).

For the function defined in this example, it’s possible to:
* pass both arguments as positional arguments

In [82]:
# The cell above fails as a whole with SyntaxError, so the function
# has to be defined again before it can be called.
def download(url, timeout=5):
    print(url, timeout)

download("www.infoshareacademy.com", 10)

www.infoshareacademy.com 10

* pass only the first argument as a positional argument (timeout will default)

In [83]:
download("www.infoshareacademy.com")

www.infoshareacademy.com 5

* pass only the first argument as a keyword argument (timeout will default)

In [84]:
download(url="www.infoshareacademy.com")

www.infoshareacademy.com 5

* pass both arguments as keyword arguments

In [85]:
download(url="www.infoshareacademy.com", timeout=10)

www.infoshareacademy.com 10

or

In [86]:
download(timeout=10, url="www.infoshareacademy.com")

www.infoshareacademy.com 10

The order of keyword arguments isn’t important (the interpreter knows which value
should be assigned to which argument). The order of positional arguments is
important because it determines which parameter each value is assigned to. It is not
allowed to pass a positional argument after keyword arguments, nor is it allowed to
omit an argument that has no default value.

The concept of keyword-only arguments, introduced in PEP 3102, allows you to force
some (or all) arguments to be passed as keyword arguments. This increases
readability in situations where the meaning of an argument isn’t obvious from the
function call alone. To apply this convention, put a bare asterisk in the list of
function parameters. All parameters declared after it must then be
passed as keyword arguments:

In [87]:
def something(could_be_positional, *, only_keyword):
    pass

The above function cannot be called with two positional arguments, unlike the original `download` function. Calling it like this results in a `TypeError`: 

In [88]:
something("first_arg", "second_arg")

TypeError: something() takes 1 positional argument but 2 were given

Whereas passing in `only_keyword` argument through keyword will work without any errors:

In [89]:
something("first_arg", only_keyword="second_arg")

Information about arguments of the function:  
https://docs.python.org/3/tutorial/controlflow.html#more-on-defining-functions

PEP 3102:  
https://www.python.org/dev/peps/pep-3102/

### [Hard] Question 20: list slicing

```
20. How to return a list, having elements indexed 1, 2 and 3 from the list: 'grades = [5, 3, 2, 5, 6]'? 
* grades[1:] 
* grades[1][2][3] 
* grades[1:2:3] 
* grades[1:4] 
```

#### Answer

Correct answer is:
```python
grades[1:4]
```

Let's see the result:

In [90]:
grades = [5, 3, 2, 5, 6]
grades[1:4]

[3, 2, 5]

By referencing the list (or other sequence) using indexes separated by a colon, you can
perform the slice operation. This operation allows you to receive:

* list of items from the beginning to the given index `[:index]` (excluding the element under index `index`):

In [91]:
grades[:2]

[5, 3]

* list of items from the given index to the end `[index:]` (including the element under index `index`):

In [92]:
grades[1:]

[3, 2, 5, 6]

* list of items between two indexes `[start:end]` (including the element under index `start`,
excluding the element under index `end`):

In [93]:
grades[1:3]

[3, 2]

* list of elements between two indexes, containing every n-th element `[start:end:n]` (including the element under index `start`, then the element at index `start + n` etc., excluding the element under index `end`):

In [94]:
grades[1:5:2]

[3, 5]

Slice operations are a convenient way to get a selected subset of elements from a
longer sequence. When using a slice, remember: 
* the first element of the list has the index `0`:

In [95]:
grades[0]

5

* the last element of the list assigned to the variable `grades` has index `-1` or
`len(grades) - 1`:

In [96]:
print(grades[-1])
print(grades[len(grades)-1])

6
6


* the use of complicated slices (e.g. using negative indexes and selecting every n-th
element) significantly reduces the clarity of the code. In such a case, it is
worth carrying out the planned operation in several steps

Interestingly, indexes used within slicing syntax do not have to be within the range of valid indexes. The examples below work without errors:

In [97]:
grades[-100:]

[5, 3, 2, 5, 6]

In [98]:
grades[:300]

[5, 3, 2, 5, 6]

In [99]:
grades[-150:300]

[5, 3, 2, 5, 6]
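
This tolerance applies only to slices. A short sketch contrasting it with ordinary indexing, which is strict about the valid range:

```python
grades = [5, 3, 2, 5, 6]

# Slicing clamps out-of-range bounds, so the result may simply be empty.
print(grades[10:])  # []

# Indexing a single element is strict and raises IndexError.
try:
    grades[10]
except IndexError as error:
    print(f"IndexError: {error}")  # IndexError: list index out of range
```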

Lists and list operations:  
https://docs.python.org/3/tutorial/introduction.html#lists  
https://docs.python.org/3/library/stdtypes.html#common-sequence-operations

## Task 1: functions, list comprehensions, string operations

### Task
Implement a function that takes one input parameter called `the_list`.

The function should return a new list that is based only on the last 5 elements of the input list.

Elements of the new list should be the squared values of the original elements, but only of those original elements that contain the digit `7`.

For example, for input list:

`[1, 1, 3, 4, 7, 7, 2]`

The return list should be:

`[49, 49]`

LEVEL UP: try to use as few lines of code as possible.


### Helpful: checking for character presence in a number

In [100]:
a = 777
a

777

Converting a number to string:

In [101]:
str(a)

'777'

In [102]:
str_a = str(a)

Checking whether there's digit `8` present in our number:

In [103]:
'8' in str_a

False

The same for digit `7`:

In [104]:
'7' in str_a

True

### Solution 1 (straightforward, not very pythonic)

In [105]:
the_list = [1, 1, 3, 4, 7, 7, 2]
def func_task1_sol1(L):
    # Creating empty list for storing results.
    result_list = []
    # We want to process only last 5 elements of the list, so slicing is done.
    newL = L[-5:]
    
    for elem in newL:
        # Checking whether there is digit '7' within number.
        if '7' in str(elem):
            # Calculating square of the element.
            elem_squared = elem**2
            # Using append method to add new element to existing list.
            result_list.append(elem_squared)

    # Returning the list with collected results.
    return result_list

func_task1_sol1(the_list)

[49, 49]

### Solution 2 (whole code in one line, pythonic way)

In [106]:
the_list = [1, 1, 3, 4, 7, 7, 2]

def func_task1_sol2(L):
    return [n**2 for n in L[-5:] if '7' in str(n)]

func_task1_sol2(the_list)

[49, 49]

## Task 2: using dictionaries, creating classes

### Task

Create a class called `Zoo`. It should store a count of animals per species/kind, e.g. the number of lions, crocodiles, etc.

You should be able to create an instance of the class by passing the counts as keyword parameters.

Create a method `update` that takes keyword parameters and adjusts the count for each keyword.

Example creation:

`z = Zoo(lion=3, panther=2)`

Example update:

`z.update(lion=-1, cheetah=2)`

Example initial animals count:

`initial_count = {'lion' : 2, 'tiger' : 4, 'parrot': 11, 'giraffe' : 2}`


### Solution 

In [107]:
initial_count = {'lion' : 2, 'tiger' : 4, 'parrot': 11, 'giraffe' : 2}

class Zoo(object):
    def __init__(self, **kwargs):
        self.count = kwargs
        
    def update(self, **kwargs):
        for k in kwargs:
            self.count[k] = self.count.get(k, 0) + kwargs[k]
            
z = Zoo(**initial_count)

z.update(lion=-1, cheetah=2)

print(z.count)

{'lion': 1, 'tiger': 4, 'parrot': 11, 'giraffe': 2, 'cheetah': 2}


## Task 3: `__repr__` method
### Task

Implement `__repr__` method for class `Zoo`.


### Helpful: example implementation of `__repr__`

In [108]:
class OriginalCar(object):
    def __init__(self, color, speed=0):
        self.color = color
        self.speed = speed
        
    def __repr__(self):
        return f"Car({self.color}, {self.speed})"
    
print(OriginalCar("blue"))

Car(blue, 0)


### Solution

In [109]:
class Zoo(object):
    def __init__(self, **kwargs):
        self.count = kwargs
        
    def update(self, **kwargs):
        for k in kwargs:
            self.count[k] = self.count.get(k, 0) + kwargs[k]
            
    def __repr__(self):
        return str(self.count)
    
    def __len__(self):
        return sum(self.count.values())    
    
print(Zoo(**initial_count))

{'lion': 2, 'tiger': 4, 'parrot': 11, 'giraffe': 2}


## Task 4

### Task
Implement `__len__` method and use it in `__repr__` method implementation.

`len` should return the total number of animals (of every kind) currently in the Zoo.

### Solution

In [110]:
class Zoo(object):
    def __init__(self, **kwargs):
        self.count = kwargs
        
    def update(self, **kwargs):
        for k in kwargs:
            self.count[k] = self.count.get(k, 0) + kwargs[k]
            
    def __repr__(self):
        output = f"Zoo(overall={len(self)}\n"
        output += "\n".join((f"{k}={v}" for k, v in self.count.items() if v > 0))
        output += "\n)"
        return output            
    
    def __len__(self):
        return sum(self.count.values())    
    
print(Zoo(**initial_count))

Zoo(overall=19
lion=2
tiger=4
parrot=11
giraffe=2
)


## Task 5

### Task
Sort list of cars by the ratio of `max_speed` to `weight` of car.

`Car` class basic implementation:

```python
class Car(object):
    def __init__(self, max_speed, weight):
        self.max_speed = max_speed
        self.weight = weight
    
    def __repr__(self):
        return f"Car(max_speed={self.max_speed}, weight={self.weight})"
```


Example list of cars to process:

`cars = [Car(202, 2250), Car(190, 2200), Car(180, 2100), Car(160, 1100)]`




### Helpful: how to sort list of objects?

The cells below assume that the `Car` class from the Solution section (the version with the `ratio` attribute and the `__lt__` method) has already been defined. Without it they fail with `NameError`.

In [111]:
cars = [Car(202, 2250), Car(190, 2200), Car(180, 2100), Car(160, 1100)]

In [112]:
# Sorts the list in place; the order is decided by `Car.__lt__`.
cars.sort()
from pprint import pprint
pprint(cars)

In [113]:
# Returns a new sorted list, again relying on `Car.__lt__`.
sorted(cars)

In [114]:
# The `key` function extracts the value used for comparison.
sorted(cars, key=lambda car: car.max_speed)

In [115]:
sorted(cars, key=lambda car: car.weight)

In [116]:
sorted(cars, key=lambda car: car.max_speed/car.weight)

In [117]:
sorted(cars, key=lambda car: car.ratio)

In [118]:
# `attrgetter('ratio')` is equivalent to `lambda car: car.ratio`.
from operator import itemgetter, attrgetter
sorted(cars, key=attrgetter('ratio'))

In [119]:
t = [(1,2,3),(1,77,9),(1,33,3)]
print(t)
sorted(t, key=itemgetter(2))

[(1, 2, 3), (1, 77, 9), (1, 33, 3)]


[(1, 2, 3), (1, 33, 3), (1, 77, 9)]

### Solution

In [120]:
class Car(object):
    def __init__(self, max_speed, weight):
        self.max_speed = max_speed
        self.weight = weight
        self.ratio = max_speed/weight
        
    def __lt__(self, rlt):
        return self.ratio < rlt.ratio
    
    def __repr__(self):
        return f"Car(max_speed={self.max_speed}, weight={self.weight}, ratio={self.ratio})"
    
    
cars = [Car(202, 2250), Car(190, 2200), Car(180, 2100), Car(160, 1100)]
sorted(cars, key=lambda car: car.ratio)

[Car(max_speed=180, weight=2100, ratio=0.08571428571428572),
 Car(max_speed=190, weight=2200, ratio=0.08636363636363636),
 Car(max_speed=202, weight=2250, ratio=0.08977777777777778),
 Car(max_speed=160, weight=1100, ratio=0.14545454545454545)]

## Task 6: method decorators

### Task

Implement a `log` decorator function that prints to the console (standard output) information that includes:

1. what is the name of the called function/method?
1. what are the positional arguments passed to the function/method?
1. what are the keyword arguments passed to the function/method?


### Helpful: examples of function decorators

Example code of a decorator with example usage. Observe that an exception is raised if the user is not an admin. This means that the decorator `check_is_admin` was called.

In [121]:
def check_is_admin(f):
    def wrapper(*args, **kwargs):
        if kwargs['username'] != 'admin':
            raise Exception("User not allowed.")
        return f(*args, **kwargs)    
        
    return wrapper

class Store(object):
    
    @check_is_admin
    def get_food(self, username, food):
        return {}
    
    @check_is_admin
    def put_food(self, username, food):
        return


# Everything is fine. User is admin. No exception will be raised.
Store().get_food(username="admin", food=[])

{}

In [122]:
# Line below will raise exception
Store().get_food(username="normaluser", food=[])

Exception: User not allowed.

### Solution

In [123]:
def log(f):
    def wrapper(*args, **kwargs):
        print(f"Called {f.__name__} {args} {kwargs}")
        return f(*args, **kwargs)    
        
    return wrapper

class Store(object):
    
    @log
    def get_food(self, username, food):
        return {}
    
    @log
    def put_food(self, username, food):
        return
    
    def __repr__(self):
        return f"Store()"


# The `log` decorator only prints information about each call; no exception is raised.
Store().get_food(username="admin", food=[])
Store().get_food(username="normaluser", food=[])

Called get_food (Store(),) {'username': 'admin', 'food': []}
Called get_food (Store(),) {'username': 'normaluser', 'food': []}


{}
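
One possible refinement, sketched below, is to wrap the inner function with `functools.wraps` from the standard library, so that the decorated function keeps its original name and docstring. The `add` function is a hypothetical example used only for this demonstration:

```python
import functools


def log(f):
    @functools.wraps(f)  # copies __name__, __doc__ etc. from f to wrapper
    def wrapper(*args, **kwargs):
        print(f"Called {f.__name__} {args} {kwargs}")
        return f(*args, **kwargs)

    return wrapper


@log
def add(a, b=0):
    """Return the sum of a and b."""
    return a + b


print(add(1, b=2))   # prints the log line, then 3
print(add.__name__)  # add (without functools.wraps it would be "wrapper")
```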

## Task 7: counting using dictionaries

### Task

Create a function that takes in a list of names (strings) and creates a dictionary where:

1. keys are lengths of the names
1. values are lists of all names that have the same length

Example. For following names list:

`['John', 'Johnny', 'Kate', 'Chris', 'Mike', 'Anna', 'Jon', 'Lex']`

the following dictionary should be returned:

`{ 3 : ['Jon', 'Lex'], 4 : ['John', 'Kate', 'Mike', 'Anna'], 5 : ['Chris'], 6 : ['Johnny']}`

**BONUS:** what is a simpler way to implement it?

### Solution

In [124]:
names_list = ['John', 'Johnny', 'Kate', 'Chris', 'Mike', 'Anna', 'Jon', 'Lex']

In [125]:
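from collections import defaultdict
from pprint import pprint

def cd(L):
    # Missing keys automatically start as empty lists.
    d = defaultdict(list)
    for n in L:
        key = len(n)
        d[key].append(n)
    return d
pprint(cd(names_list))

defaultdict(<class 'list'>,
            {3: ['Jon', 'Lex'],
             4: ['John', 'Kate', 'Mike', 'Anna'],
             5: ['Chris'],
             6: ['Johnny']})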
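
For comparison, the same grouping can be sketched with a plain dictionary and `dict.setdefault`, without any imports. The function name `group_by_length` is an assumption made for this illustration:

```python
def group_by_length(names):
    groups = {}
    for name in names:
        # setdefault returns the existing list for this length,
        # or stores and returns a new empty list if there is none yet.
        groups.setdefault(len(name), []).append(name)
    return groups


names_list = ['John', 'Johnny', 'Kate', 'Chris', 'Mike', 'Anna', 'Jon', 'Lex']
print(group_by_length(names_list))
```

Unlike `defaultdict`, a plain dictionary raises `KeyError` when a missing key is read later, which may be the preferable behaviour for the returned result.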

## Task 8: utility classes, functional programming, named tuples

### Task

Create a class called `Utils` that has 3 methods:

1. Method `contains_zero` which takes a list as input and returns `True` if the list contains a value that is considered zero (i.e. is falsy).

2. Method `first_letter_up` which takes a list of strings as input and returns a list of the same strings, but with the first letter of each word in upper case, e.g.

Input: `['anna kalinowska', 'robert rachwal']`

Result: `['Anna Kalinowska', 'Robert Rachwal']`

3. Method `name_the_tuple` which takes two parameters: a tuple as the first and a list of strings as the second. It should return a named tuple object with the content of the tuple and field names taken from the list of strings.


### Helpful: namedtuple usage

Example namedtuple definition and usage:

from collections import namedtuple
Person = namedtuple('Person', ['name', 'age', 'gender'])
p = Person("John", 33, "male")

### Solution



In [126]:
from collections import namedtuple
class Utils(object):
    
    def contains_zero(self, ll):
        return not all(ll)
    
    def first_letter_up(self, strlist):
        return [s.title() for s in strlist]
    
    def name_the_tuple(self, data, attr_names):
        tt = namedtuple('tt', attr_names)
        return tt(*data)

In [127]:
L1 = [1,1,1,1,""]
print(Utils().contains_zero(L1))
L2 = [1,1,1,1,"1"]
print(Utils().contains_zero(L2))

True
False


In [128]:
L3 = ['anna kalinowska', 'robert rachwal']
print(Utils().first_letter_up(L3))

['Anna Kalinowska', 'Robert Rachwal']


In [129]:
the_tuple = Utils().name_the_tuple(('Anna', 'Kalinowska'), ("name", "last_name"))
print(the_tuple)
print(the_tuple.name)
print(the_tuple.last_name)

tt(name='Anna', last_name='Kalinowska')
Anna
Kalinowska


## Task 9: sets and finding unique values

### Task

Create a function to find unique values in dictionary.

For input:

`{1: "a", 2: "b", 3:"a"}`

The result output should be:

`["a", "b"]`

### Solution



Most intuitive solution (step by step):

In [130]:
def unique_values(dd):
    unique_values_list = []
    
    for v in dd.values():
        if v not in unique_values_list:
            unique_values_list.append(v)
            
    return unique_values_list

dd = {1: "a", 2: "b", 3:"a"}
unique_values(dd)

['a', 'b']

Much shorter version using `set` as the core structure. Sets contain no duplicates. Whenever you have to find unique values, check whether sets might help you. Note that the result is a `set`, not a list, so the order of elements is not guaranteed.

In [131]:
def unique_values2(dd):
    return set(dd.values())

dd = {1: "a", 2: "b", 3:"a"}
unique_values2(dd)

{'a', 'b'}
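
If the result has to be a list that keeps the order of first occurrence, as in the expected output of the task, one possible sketch relies on dictionaries preserving insertion order (guaranteed since Python 3.7). The name `unique_values3` is introduced here only for illustration, and the values are assumed to be hashable:

```python
def unique_values3(dd):
    # dict.fromkeys keeps only the first occurrence of each value,
    # in the order in which the values appear.
    return list(dict.fromkeys(dd.values()))


dd = {1: "a", 2: "b", 3: "a"}
print(unique_values3(dd))  # ['a', 'b']
```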

### Helpful: sets usage

In [132]:
A = {'a', 'b'}

In [133]:
B = {'a', 'b', 'c'}

In [134]:
C = B - A

In [135]:
print(C)

{'c'}


## Task 10: sets used for input validation

### Task

Implement a function `has_invalid_fields` that takes a list of strings as input. It should return `True` if the list contains any field names other than:

`['name', 'id', 'age']` 

Return `False` otherwise, which means that the list contains only the strings mentioned above.

Example input:

`['name', 'name', 'id', 'age', 'name', 'id', 'dummy']`

Result: `True`
    
### Solution    

In [136]:
def has_invalid_fields(fields):
    return bool(set(fields) - set(['name', 'id', 'age']))

In [137]:
has_invalid_fields(['name', 'name', 'id', 'age', 'name', 'id', 'dummy'])

True

In [138]:
has_invalid_fields(['name', 'name', 'id', 'age', 'name', 'id'])

False

## Task 11: sanitizing text file

### Task

Get rid of all whitespace characters (spaces, tabs, newlines, etc.) in a file to reveal a message. Print it out as one line.

File to download (use 'Save Page As...' in Browser 'File' menu): https://raw.githubusercontent.com/infoshareacademy/py-ubs1/master/labs/files/msg.txt

### Helpful: opening a file and reading from it

In [139]:
with open('files/base.txt') as file_object:
    lines = file_object.readlines()
    print(lines)

['13.13121\n', '1.31331  \n', '13.21\n', '131.3131\n', '444.111\n', '12313.23']


### Helpful: getting rid of whitespaces

In [140]:
with open('files/base2.txt') as file_object:
    lines = file_object.readlines()
    print("Original lines:")
    print(lines)
    
    new_lines = [l.strip() for l in lines]
    print("Modified lines")
    print(new_lines)
    
    print("Modified lines joined into one string with multiple lines:")
    print("\n".join([l for l in new_lines if l]))

Original lines:
['    13.13121     \t\n', '\n', '\n', '\n', '\n', '      1.31331  \n', '    13.21\n', '\t\t\t131.3131\n', '\n', '      444.111\n', '\n', '   12313.23']
Modified lines
['13.13121', '', '', '', '', '1.31331', '13.21', '131.3131', '', '444.111', '', '12313.23']
Modified lines joined into one string with multiple lines:
13.13121
1.31331
13.21
131.3131
444.111
12313.23


### Solution

In [141]:
with open('files/msg.txt') as file_object:
    lines = file_object.readlines()    
    new_lines = [l.strip() for l in lines]
    
    msg = "".join([l for l in new_lines if l])
    print(msg)

MerryChristmas!!
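
The solution above removes whitespace only at the beginning and end of each line. If whitespace may also appear inside a line, a possible alternative is to split the whole text on any whitespace and join the pieces back together. A minimal sketch, where `remove_whitespace` is a hypothetical helper name:

```python
def remove_whitespace(text):
    # str.split() without arguments splits on any run of whitespace
    # (spaces, tabs, newlines) and drops empty pieces.
    return "".join(text.split())


print(remove_whitespace("  Me rry\n\n\tChrist mas!! \n"))  # MerryChristmas!!
```

With a file, the same helper could be applied to the result of `file_object.read()`.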


## Task 12

### Task
Count how many times each word occurs in the file. What are the 10 most popular words?

File to process: https://raw.githubusercontent.com/infoshareacademy/py-ubs1/master/labs/files/pg28198.txt

### Helpful: replacing characters in strings

In [142]:
print(msg)
print(msg.replace("!!", "."))
print(msg.replace("!!!", "."))
print(msg.replace("!", ""))
import string
print((msg+"     .").strip(string.whitespace + ".!"))

MerryChristmas!!
MerryChristmas.
MerryChristmas!!
MerryChristmas
MerryChristmas


### Solution

In [143]:
import string
from collections import defaultdict
from operator import itemgetter
with open("files/pg28198.txt") as f:
    lines = f.readlines()
    d = defaultdict(int)
    for line in lines:
        for c in "".join((string.punctuation,
                          string.digits,
                          '\ufeff')):
            line = line.replace(c, "")
            
    
        for word in line.split():
            word = word.strip().lower()
            d[word] += 1
            
    print(sorted(list(d.items()),
                 reverse=True,
                 key=itemgetter(1))[:10])   

[('the', 5021), ('and', 3489), ('to', 2236), ('of', 2084), ('a', 2034), ('in', 1478), ('he', 1288), ('was', 1271), ('it', 1179), ('i', 1049)]


### Solution: using specialized `Counter` collection

In [144]:
import string
from collections import Counter
with open("files/pg28198.txt") as f:
    lines = f.readlines()
    d = Counter()
    for line in lines:
        for c in "".join((string.punctuation,
                          string.digits,
                          '\ufeff')):
            line = line.replace(c, "")
            
    
        for word in line.split():
            word = word.strip().lower()
            d[word] += 1
            
    print(d.most_common(10))

[('the', 5021), ('and', 3489), ('to', 2236), ('of', 2084), ('a', 2034), ('in', 1478), ('he', 1288), ('was', 1271), ('it', 1179), ('i', 1049)]


## Task 13

### Task
Based on the same file as in the previous task. File to process: https://raw.githubusercontent.com/infoshareacademy/py-ubs1/master/labs/files/pg28198.txt

Get some additional info:

1. What's the number of unique words?
2. What's the most popular word that ends a sentence?
3. What's the longest sentence in text?
4. What's the most popular pair of words?

### Helpful: characters replacements with translation table

In [145]:
import string
print(string.punctuation)

!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~


In [146]:
line = "Hello! bunny@storez.com That's really $$$ and ####"

A translation table stores information about which characters should be substituted and with what. However, characters are identified by their ordinal value (a number which represents the character, instead of the character itself). Thus we must use the `ord` function to obtain this ordinal value, e.g.:

In [147]:
ord('!')

33

Creating translation table from a string with characters to be removed:

In [148]:
trans_table = dict.fromkeys(map(ord, string.punctuation), None)
print(trans_table)

{33: None, 34: None, 35: None, 36: None, 37: None, 38: None, 39: None, 40: None, 41: None, 42: None, 43: None, 44: None, 45: None, 46: None, 47: None, 58: None, 59: None, 60: None, 61: None, 62: None, 63: None, 64: None, 91: None, 92: None, 93: None, 94: None, 95: None, 96: None, 123: None, 124: None, 125: None, 126: None}


Using the translation table:

In [149]:
line = line.translate(trans_table)
print(line)

Hello bunnystorezcom Thats really  and 


### Helpful: using `re` module to replace characters

In [150]:
import re

line = "Hello! bunny@storez.com That's really $$$ and ####"
line = re.sub('[!@#\'$]', '', line)
print(line)

Hello bunnystorez.com Thats really  and 


### Solution

In [151]:
import string
import re
from collections import defaultdict
from operator import itemgetter
trans_table = dict.fromkeys(map(ord, string.punctuation+string.digits), None)

with open("files/pg28198.txt") as f:
    lines = f.read()
    
    sentences = re.split(r'[!?.]', lines)
    
    d = defaultdict(int)
    
    for sentence in sentences:
        sentence = sentence.translate(trans_table).strip()
        if sentence:
            word = sentence.split()[-1]
            d[word] += 1
    
    print(sorted(list(d.items()), key=itemgetter(1), reverse=True)[:10])

[('it', 146), ('him', 107), ('Mrs', 84), ('me', 81), ('her', 77), ('you', 65), ('Mr', 61), ('Scrooge', 54), ('again', 47), ('said', 38)]
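
The solution above covers only the second question. Questions 1 and 4 could be approached along the lines of the sketch below. The function `word_stats`, the regular expression used to extract words and the sample sentence are assumptions made for illustration; pairs are counted across sentence boundaries, which may or may not be what the task intends.

```python
import re
from collections import Counter


def word_stats(text):
    """Return (number of unique words, most common pair of adjacent words)."""
    # Lower-case the text and keep runs of letters and apostrophes as words.
    words = re.findall(r"[a-z']+", text.lower())
    # zip pairs every word with its successor: (w0, w1), (w1, w2), ...
    pairs = Counter(zip(words, words[1:]))
    return len(set(words)), pairs.most_common(1)


print(word_stats("The cat sat. The cat ran!"))
# (4, [(('the', 'cat'), 2)])
```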


## Task 14: transforming text file into CSV file

### Task

Convert file `the_list.txt` into CSV format and save it as a new file (e.g. `the_list.csv`). Figure out how many columns there are and what they are called.

File: https://raw.githubusercontent.com/infoshareacademy/py-ubs1/master/labs/files/the_list.txt

### Solution



In [152]:
with open("files/the_list.txt") as f, open("files/the_list.csv", "w") as f_out:
    lines = f.readlines()
    lines = [line.strip() for line in lines]
    
    
    rows = "\n".join([",".join(lines[i:i+4]) for i in range(0, len(lines), 4)])
    
    print(rows)
    
    f_out.write(rows)

User name,Email,Score,Group Name
John Kerry,john.kerry@santaswarehouse.com,3,Group 3
Clickety Click,ckekek.click@santaswarehouse.com,4,Group 3
Jack Jackly,jack.jackly@santaswarehouse.com,4,Group 3
Ducky Duck,ducky.duck@santaswarehouse.com,5,Group 3
Sporty Rusty,sporty.rusty@santaswarehouse.com,6,Group 3
Gigelly Goon,gigelly.goon@santaswarehouse.com,7,Group 3
Trouble Thrumpet,trouble.thrumpet@santaswarehouse.com,8,Group 3
Gal Gally,gal.gally@santaswarehouse.com,8,Group 3
Funny Fonny,funny.fonny@santaswarehouse.com,8,Group 3
Adam Abacus,adam.abacus@santaswarehouse.com,8,Group 3
Quicky Quiver,quicky.quiver@santaswarehouse.com,9,Group 3
Roman Romanov,roman.romanov@santaswarehouse.com,10,Group 3
Greedy Garlic,greedy.garlic@santaswarehouse.com,8,Group 4
Creative Carl,creative.karl@santaswarehouse.com,8,Group 4
Monty Mount,monty.mount@santaswarehouse.com,9,Group 4
General Glass,general.glass@santaswarehouse.com,9,Group 4
Lonely Lucker,lonely.lucker@santaswarehouse.com,11,Group 4
Triple Troubl
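
Joining values with `","` works for this file because none of the values contains a comma. As a hedged alternative, the sketch below does the same four-lines-per-row grouping with the standard `csv` module, which quotes such values automatically. The helper name `lines_to_csv`, its `n_columns` parameter and the short in-memory sample are assumptions made for illustration; they are not part of the original solution.

```python
import csv
import io


def lines_to_csv(lines, n_columns=4):
    """Group consecutive lines into rows of n_columns and return CSV text."""
    values = [line.strip() for line in lines]
    rows = [values[i:i + n_columns] for i in range(0, len(values), n_columns)]

    buffer = io.StringIO()
    # csv.writer quotes any field that contains the delimiter.
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


# Illustrative sample: the header and the first record from the output above.
sample = [
    "User name\n", "Email\n", "Score\n", "Group Name\n",
    "John Kerry\n", "john.kerry@santaswarehouse.com\n", "3\n", "Group 3\n",
]
csv_text = lines_to_csv(sample)
print(csv_text)
```

To write the result to disk, pass a file opened with `newline=""` to `csv.writer` instead of the `StringIO` buffer.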

## Task 15: data manipulation and analysis using *Pandas* library


CSV file to process (this is the same CSV as the one generated in the previous task): https://github.com/infoshareacademy/py-ubs1/blob/master/labs/files/the_list.csv
        
0. Read CSV file you've created in previous task.
1. Add column `magic` to `DataFrame` with random numbers from normal distribution.
2. Add `4` to every `Score` and multiply it by `5`. The result should be stored in the same column. Use transformation here.
3. Create a column `percent` that holds the `Score` value divided by `100`.
4. What's the average `Score` in `Group 4`?



**BONUS tasks:**

5. Add additional column called `AltGroup` with random values for each row. The values should be strings from the following list:

```
Runners
Cooks
Hikers
```

6. Get average, min and max values of `Score` and `Magic Score` columns for each combination of `Group Name` and `AltGroup` column.

Example result for one combination:
```
                      Score        Magic Score
Group 3    Cooks      8.000000     9.525938
```

7.  Change values in column `Group Name` from 

```
Group 1, Group 2, ...
```
to
```
G.01, G.02, ...
```

### Helpful: example data reading, processing and analysis using *NumPy* and *Pandas*

In [153]:
import numpy as np
import pandas as pd

In [154]:
df = pd.read_csv('files/the_list.csv')

In [155]:
df

Unnamed: 0,User name,Email,Score,Group Name
0,John Kerry,john.kerry@santaswarehouse.com,3,Group 3
1,Clickety Click,ckekek.click@santaswarehouse.com,4,Group 3
2,Jack Jackly,jack.jackly@santaswarehouse.com,4,Group 3
3,Ducky Duck,ducky.duck@santaswarehouse.com,5,Group 3
4,Sporty Rusty,sporty.rusty@santaswarehouse.com,6,Group 3
5,Gigelly Goon,gigelly.goon@santaswarehouse.com,7,Group 3
6,Trouble Thrumpet,trouble.thrumpet@santaswarehouse.com,8,Group 3
7,Gal Gally,gal.gally@santaswarehouse.com,8,Group 3
8,Funny Fonny,funny.fonny@santaswarehouse.com,8,Group 3
9,Adam Abacus,adam.abacus@santaswarehouse.com,8,Group 3


In [156]:
df.describe()

Unnamed: 0,Score
count,24.0
mean,8.708333
std,2.866283
min,3.0
25%,7.75
50%,8.5
75%,11.25
max,13.0


In [157]:
df.head()

Unnamed: 0,User name,Email,Score,Group Name
0,John Kerry,john.kerry@santaswarehouse.com,3,Group 3
1,Clickety Click,ckekek.click@santaswarehouse.com,4,Group 3
2,Jack Jackly,jack.jackly@santaswarehouse.com,4,Group 3
3,Ducky Duck,ducky.duck@santaswarehouse.com,5,Group 3
4,Sporty Rusty,sporty.rusty@santaswarehouse.com,6,Group 3


In [158]:
df[df['Score'] > 5]

Unnamed: 0,User name,Email,Score,Group Name
4,Sporty Rusty,sporty.rusty@santaswarehouse.com,6,Group 3
5,Gigelly Goon,gigelly.goon@santaswarehouse.com,7,Group 3
6,Trouble Thrumpet,trouble.thrumpet@santaswarehouse.com,8,Group 3
7,Gal Gally,gal.gally@santaswarehouse.com,8,Group 3
8,Funny Fonny,funny.fonny@santaswarehouse.com,8,Group 3
9,Adam Abacus,adam.abacus@santaswarehouse.com,8,Group 3
10,Quicky Quiver,quicky.quiver@santaswarehouse.com,9,Group 3
11,Roman Romanov,roman.romanov@santaswarehouse.com,10,Group 3
12,Greedy Garlic,greedy.garlic@santaswarehouse.com,8,Group 4
13,Creative Carl,creative.karl@santaswarehouse.com,8,Group 4


In [159]:
df_minscore_and_above = df[df['Score'] > 5]

In [160]:
df_minscore_and_above.head()

Unnamed: 0,User name,Email,Score,Group Name
4,Sporty Rusty,sporty.rusty@santaswarehouse.com,6,Group 3
5,Gigelly Goon,gigelly.goon@santaswarehouse.com,7,Group 3
6,Trouble Thrumpet,trouble.thrumpet@santaswarehouse.com,8,Group 3
7,Gal Gally,gal.gally@santaswarehouse.com,8,Group 3
8,Funny Fonny,funny.fonny@santaswarehouse.com,8,Group 3


In [161]:
df_minscore_and_above[df_minscore_and_above['Group Name'] == 'Group 3']

Unnamed: 0,User name,Email,Score,Group Name
4,Sporty Rusty,sporty.rusty@santaswarehouse.com,6,Group 3
5,Gigelly Goon,gigelly.goon@santaswarehouse.com,7,Group 3
6,Trouble Thrumpet,trouble.thrumpet@santaswarehouse.com,8,Group 3
7,Gal Gally,gal.gally@santaswarehouse.com,8,Group 3
8,Funny Fonny,funny.fonny@santaswarehouse.com,8,Group 3
9,Adam Abacus,adam.abacus@santaswarehouse.com,8,Group 3
10,Quicky Quiver,quicky.quiver@santaswarehouse.com,9,Group 3
11,Roman Romanov,roman.romanov@santaswarehouse.com,10,Group 3


In [162]:
df_group3_min_score = df_minscore_and_above[df_minscore_and_above['Group Name'] == 'Group 3']

In [163]:
np.mean(df_group3_min_score['Score'])

8.0

In [164]:
np.median(df_group3_min_score['Score'])

8.0
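
The two-step filtering above can also be written as a single expression by combining boolean masks with `&`. This is a minimal sketch on a small illustrative frame (`df_demo` is a hypothetical name and holds only a few of the rows shown above), so the numbers differ from the notebook output.

```python
import pandas as pd

# Small illustrative frame; a subset of the rows shown above.
df_demo = pd.DataFrame({
    "User name": ["John Kerry", "Sporty Rusty", "Gigelly Goon", "Greedy Garlic"],
    "Score": [3, 6, 7, 8],
    "Group Name": ["Group 3", "Group 3", "Group 3", "Group 4"],
})

# Each comparison yields a boolean Series. The parentheses are required
# because & binds more tightly than > and ==.
mask = (df_demo["Score"] > 5) & (df_demo["Group Name"] == "Group 3")
group3_above_5 = df_demo[mask]

print(group3_above_5["Score"].mean())    # 6.5
print(group3_above_5["Score"].median())  # 6.5
```

Use `|` for "or" and `~` for "not"; the Python keywords `and`, `or` and `not` do not work element-wise on a `Series`.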

In [165]:
grouped = df.groupby(df['Group Name'])

In [166]:
grouped.mean(numeric_only=True)

Group Name,Score
Group 3,6.666667
Group 4,10.75


In [167]:
df.shape

(24, 4)

In [168]:
np.random.randn(df.shape[0])

array([ 1.84300678, -0.54578569, -2.41403193, -0.68097633, -0.51053806,
       -0.0051895 ,  1.64759063,  1.17693418, -0.48818672,  0.63110899,
       -0.56914372,  0.02816768, -0.07344332,  0.30277263,  0.08475747,
       -0.38089672, -0.20441115,  0.24852721,  0.03992678,  2.21642557,
        1.35356008, -2.08877286, -1.36143839, -0.37996375])
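
`np.random.randn` returns different numbers on every run. If repeatable results are wanted, one option is NumPy's seeded `Generator`, sketched below. The seed value `42` and the variable names are arbitrary choices for illustration, not something the original notebook uses.

```python
import numpy as np

n_rows = 24  # same length as df.shape[0] above

# A Generator created from a fixed seed always yields the same sequence.
rng = np.random.default_rng(42)
magic = rng.standard_normal(n_rows)

# Re-creating the generator with the same seed reproduces the values.
magic_again = np.random.default_rng(42).standard_normal(n_rows)

print(magic.shape)
print(np.array_equal(magic, magic_again))
```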

In [169]:
new_df = pd.DataFrame({'magic_score' : np.random.randn(df.shape[0])})

In [170]:
new_df

Unnamed: 0,magic_score
0,0.330501
1,-0.638508
2,0.584839
3,0.483646
4,0.719996
5,1.200826
6,0.057846
7,0.594079
8,0.196016
9,-1.294138


In [171]:
df_copy = df.copy()

In [172]:
new_df.shape

(24, 1)

In [173]:
df_copy['Magic score'] = new_df['magic_score']

In [174]:
df_copy

Unnamed: 0,User name,Email,Score,Group Name,Magic score
0,John Kerry,john.kerry@santaswarehouse.com,3,Group 3,0.330501
1,Clickety Click,ckekek.click@santaswarehouse.com,4,Group 3,-0.638508
2,Jack Jackly,jack.jackly@santaswarehouse.com,4,Group 3,0.584839
3,Ducky Duck,ducky.duck@santaswarehouse.com,5,Group 3,0.483646
4,Sporty Rusty,sporty.rusty@santaswarehouse.com,6,Group 3,0.719996
5,Gigelly Goon,gigelly.goon@santaswarehouse.com,7,Group 3,1.200826
6,Trouble Thrumpet,trouble.thrumpet@santaswarehouse.com,8,Group 3,0.057846
7,Gal Gally,gal.gally@santaswarehouse.com,8,Group 3,0.594079
8,Funny Fonny,funny.fonny@santaswarehouse.com,8,Group 3,0.196016
9,Adam Abacus,adam.abacus@santaswarehouse.com,8,Group 3,-1.294138


In [175]:
df_copy['Magic score'] = df_copy['Magic score'].transform(lambda x: x + 10)

In [176]:
df_copy

Unnamed: 0,User name,Email,Score,Group Name,Magic score
0,John Kerry,john.kerry@santaswarehouse.com,3,Group 3,10.330501
1,Clickety Click,ckekek.click@santaswarehouse.com,4,Group 3,9.361492
2,Jack Jackly,jack.jackly@santaswarehouse.com,4,Group 3,10.584839
3,Ducky Duck,ducky.duck@santaswarehouse.com,5,Group 3,10.483646
4,Sporty Rusty,sporty.rusty@santaswarehouse.com,6,Group 3,10.719996
5,Gigelly Goon,gigelly.goon@santaswarehouse.com,7,Group 3,11.200826
6,Trouble Thrumpet,trouble.thrumpet@santaswarehouse.com,8,Group 3,10.057846
7,Gal Gally,gal.gally@santaswarehouse.com,8,Group 3,10.594079
8,Funny Fonny,funny.fonny@santaswarehouse.com,8,Group 3,10.196016
9,Adam Abacus,adam.abacus@santaswarehouse.com,8,Group 3,8.705862
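
For a simple element-wise operation such as adding a constant, `transform` with a lambda gives the same result as plain arithmetic on the column. The sketch below checks this on three of the values shown above; the variable names are illustrative.

```python
import pandas as pd

scores = pd.Series([0.330501, -0.638508, 0.584839], name="Magic score")

# transform applies the function and returns a Series of the same length.
via_transform = scores.transform(lambda x: x + 10)

# Plain arithmetic is vectorised and usually the more idiomatic choice.
via_arithmetic = scores + 10

print(via_transform.tolist() == via_arithmetic.tolist())
```

`transform` becomes more useful with `groupby`, where it returns a result aligned with the original rows.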


In [177]:
import pandas as pd
import numpy as np

In [178]:
data = {'state': ['Ohio', 'Ohio', 'Ohio', 'Nevada', 'Nevada', 'Nevada'],
        'year': [2000, 2001, 2002, 2001, 2002, 2003],
        'pop': [1.5, 1.7, 3.6, 2.4, 2.9, 3.2]}

In [179]:
frame = pd.DataFrame(data)

In [180]:
frame

Unnamed: 0,state,year,pop
0,Ohio,2000,1.5
1,Ohio,2001,1.7
2,Ohio,2002,3.6
3,Nevada,2001,2.4
4,Nevada,2002,2.9
5,Nevada,2003,3.2


In [181]:
frame.head()

Unnamed: 0,state,year,pop
0,Ohio,2000,1.5
1,Ohio,2001,1.7
2,Ohio,2002,3.6
3,Nevada,2001,2.4
4,Nevada,2002,2.9


In [182]:
pd.DataFrame(data, columns=['year', 'state', 'pop'])

Unnamed: 0,year,state,pop
0,2000,Ohio,1.5
1,2001,Ohio,1.7
2,2002,Ohio,3.6
3,2001,Nevada,2.4
4,2002,Nevada,2.9
5,2003,Nevada,3.2


In [183]:
frame2 = pd.DataFrame(data, columns=['year', 'state', 'pop', 'debt'],
                      index=['one', 'two', 'three', 'four',
                             'five', 'six'])

In [184]:
frame2

Unnamed: 0,year,state,pop,debt
one,2000,Ohio,1.5,
two,2001,Ohio,1.7,
three,2002,Ohio,3.6,
four,2001,Nevada,2.4,
five,2002,Nevada,2.9,
six,2003,Nevada,3.2,


In [185]:
frame2.columns

Index(['year', 'state', 'pop', 'debt'], dtype='object')

In [186]:
frame2['state']

one        Ohio
two        Ohio
three      Ohio
four     Nevada
five     Nevada
six      Nevada
Name: state, dtype: object

In [187]:
frame2.year

one      2000
two      2001
three    2002
four     2001
five     2002
six      2003
Name: year, dtype: int64
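
Attribute access such as `frame2.year` only works when the column name is a valid Python identifier and does not clash with an existing `DataFrame` attribute. The `pop` column from the example data is such a clash, as the sketch below shows on a shortened, illustrative frame (`frame_demo` is a hypothetical name).

```python
import pandas as pd

frame_demo = pd.DataFrame({
    "state": ["Ohio", "Nevada"],
    "year": [2000, 2001],
    "pop": [1.5, 2.4],
})

# 'year' is not a DataFrame attribute, so this returns the column.
print(frame_demo.year.tolist())

# 'pop' is a DataFrame method, so attribute access returns the method.
print(callable(frame_demo.pop))

# Bracket access always returns the column.
print(frame_demo["pop"].tolist())
```

Bracket access is also the only option for names containing spaces, such as `Group Name`.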

### Solution

In [188]:
import numpy as np
import pandas as pd

0. Read CSV file you've created in previous task.

In [189]:
df_sol15 = pd.read_csv('files/the_list.csv')

1. Add column `magic` to `DataFrame` with random numbers from normal distribution.

In [190]:
df_sol15['magic'] = np.random.randn(df_sol15.shape[0])
df_sol15

Unnamed: 0,User name,Email,Score,Group Name,magic
0,John Kerry,john.kerry@santaswarehouse.com,3,Group 3,0.08557
1,Clickety Click,ckekek.click@santaswarehouse.com,4,Group 3,1.608405
2,Jack Jackly,jack.jackly@santaswarehouse.com,4,Group 3,-0.295715
3,Ducky Duck,ducky.duck@santaswarehouse.com,5,Group 3,-0.369075
4,Sporty Rusty,sporty.rusty@santaswarehouse.com,6,Group 3,0.832785
5,Gigelly Goon,gigelly.goon@santaswarehouse.com,7,Group 3,-1.339696
6,Trouble Thrumpet,trouble.thrumpet@santaswarehouse.com,8,Group 3,-0.185401
7,Gal Gally,gal.gally@santaswarehouse.com,8,Group 3,1.668617
8,Funny Fonny,funny.fonny@santaswarehouse.com,8,Group 3,1.230496
9,Adam Abacus,adam.abacus@santaswarehouse.com,8,Group 3,2.337257


2. Add `4` to every `Score` and multiply it by `5`. The result should be stored in the same column. Use transformation here.

In [191]:
df_sol15['Score'] = df_sol15['Score'].transform(lambda x: (x + 4) * 5)
df_sol15

Unnamed: 0,User name,Email,Score,Group Name,magic
0,John Kerry,john.kerry@santaswarehouse.com,35,Group 3,0.08557
1,Clickety Click,ckekek.click@santaswarehouse.com,40,Group 3,1.608405
2,Jack Jackly,jack.jackly@santaswarehouse.com,40,Group 3,-0.295715
3,Ducky Duck,ducky.duck@santaswarehouse.com,45,Group 3,-0.369075
4,Sporty Rusty,sporty.rusty@santaswarehouse.com,50,Group 3,0.832785
5,Gigelly Goon,gigelly.goon@santaswarehouse.com,55,Group 3,-1.339696
6,Trouble Thrumpet,trouble.thrumpet@santaswarehouse.com,60,Group 3,-0.185401
7,Gal Gally,gal.gally@santaswarehouse.com,60,Group 3,1.668617
8,Funny Fonny,funny.fonny@santaswarehouse.com,60,Group 3,1.230496
9,Adam Abacus,adam.abacus@santaswarehouse.com,60,Group 3,2.337257


3. Create a column `percent` that holds the `Score` value divided by `100`.

In [192]:
df_sol15['percent'] = df_sol15['Score']/100
df_sol15

Unnamed: 0,User name,Email,Score,Group Name,magic,percent
0,John Kerry,john.kerry@santaswarehouse.com,35,Group 3,0.08557,0.35
1,Clickety Click,ckekek.click@santaswarehouse.com,40,Group 3,1.608405,0.4
2,Jack Jackly,jack.jackly@santaswarehouse.com,40,Group 3,-0.295715,0.4
3,Ducky Duck,ducky.duck@santaswarehouse.com,45,Group 3,-0.369075,0.45
4,Sporty Rusty,sporty.rusty@santaswarehouse.com,50,Group 3,0.832785,0.5
5,Gigelly Goon,gigelly.goon@santaswarehouse.com,55,Group 3,-1.339696,0.55
6,Trouble Thrumpet,trouble.thrumpet@santaswarehouse.com,60,Group 3,-0.185401,0.6
7,Gal Gally,gal.gally@santaswarehouse.com,60,Group 3,1.668617,0.6
8,Funny Fonny,funny.fonny@santaswarehouse.com,60,Group 3,1.230496,0.6
9,Adam Abacus,adam.abacus@santaswarehouse.com,60,Group 3,2.337257,0.6


4. What's the average `Score` in `Group 4`?

In [193]:
df_sol15.groupby(['Group Name']).mean(numeric_only=True)

Group Name,Score,magic,percent
Group 3,53.333333,0.556702,0.533333
Group 4,73.75,0.083748,0.7375
