This notebook provides a _very_ basic introduction to the following Python standard library modules:

- general: `string`
- file handling: `os`
- time: `time datetime`
- numerical: `math random`

Part 2.2 continues with

- file handling: `json pickle csv`
- developer tools: `logging`
- container datatypes: `collections`
- enumerators: `enum`
- regular expressions: `re`

## Part 2.1. Standard library

The task is the same: replace the `...` (Ellipsis) symbols with suitable pieces of code.

### `string`

In [None]:
import string

There is not much to say here: among other things, the `string` module contains a few useful constants, such as:

In [None]:
print('Lowercase letters:', repr(string.ascii_lowercase))
print('Uppercase letters:', repr(string.ascii_uppercase))
print('All letters:', repr(string.ascii_letters))
print('Punctuation:', repr(string.punctuation))

These are all constants of type `str`.

They can be used in different scenarios and are mainly there to free developers from typing them manually.

In [None]:
def create_letter_number_map() -> dict[str, int]:
    '''
    This function returns a dictionary which maps a letter to its index.
    Ex.: -> {'a': 0, 'b': 1, ..., 'z': 25}
    '''
    return dict(zip(string.ascii_lowercase, range(26)))

In [None]:
# test
res = create_letter_number_map()
for i, ch in enumerate(string.ascii_lowercase):
    assert res[ch] == i, f'failed {ch}; expected {i}, got {res[ch]}'

There wasn't much to see here, so as a bonus let's look at some things the `str` type can do.

_Note_: the `str` type is immutable; that is, its methods cannot change the value and return a new string instead. The `list` type is different: for example, `list.sort` modifies the list object in place.
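
A minimal sketch of that difference. The variable names are illustrative only and are not used elsewhere in the notebook:

```python
# str methods return a new string; the original object is left untouched
greeting = 'hello'
shouted = greeting.upper()
print(greeting, shouted)  # hello HELLO

# list.sort, by contrast, reorders the list in place and returns None
numbers = [3, 1, 2]
returned = numbers.sort()
print(numbers, returned)  # [1, 2, 3] None
```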

`str.join`

One can join an iterable containing strings with a delimiter.

In [None]:
a = ['a', 'b', 'c', 'd']
res1 = ''.join(a)
res2 = ','.join(a)
res3 = '; '.join(a)
res4 = 'HA'.join(a)
print(res1, res2, res3, res4, sep='\n')

Be careful: joining an iterable that contains non-`str` values raises a `TypeError`:

In [None]:
a = [1, 2, 3, 4]
res5 = ''.join(a)

In [None]:
def joining_non_strs(a: list[int]) -> str:
    '''
    Fix a problem described above. That is,
    Given a list of integers, concatenate them and return as a string.
    Note: use the `str.join` method.
    Ex.: [1,2,3] -> '123'; [0, 4] -> '04'
    '''
    return ''.join(map(str, a))

In [None]:
# test
assert joining_non_strs([1,2,3]) == '123'
assert joining_non_strs([0,4]) == '04'

`str.split`

does the reverse of `str.join`: when called on a string, it splits its contents into smaller strings on a specified separator. By default (`sep=None`), it splits on runs of whitespace and drops empty strings.

In [None]:
str1 = '1 2 3 4'
str2 = '1    2 3  4       5'
str3 = '1 2,4 5,6 4'
str4 = 'ahaahaahhaahahahhahaha'

print(str1.split()) # no separator given: splits on whitespace
print(str2.split()) # a run of whitespace counts as a single separator
print(str3.split()) # commas are not whitespace, so '2,4' and '5,6' stay intact
print(str3.split(sep=','))
print(str4.split('ha'))
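
Calling `split()` with no argument is not the same as calling `split(' ')`. The following sketch, using an illustrative string of our own, shows the difference:

```python
text = 'a  b\tc\n'

# No argument: split on runs of any whitespace and drop empty strings
default_parts = text.split()
# Explicit single space: every space is a boundary, so empty strings appear
space_parts = text.split(' ')

print(default_parts)  # ['a', 'b', 'c']
print(space_parts)    # ['a', '', 'b\tc\n']
```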

When the parts of a string are separated by exactly one space, `.join` and `.split` undo each other:

In [None]:
str5 = '1 2 3'
res1 = ' '.join(str5.split())
str6 = ['1', '2', '3']
res2 = ' '.join(str6).split()
print(str5 == res1 and str6 == res2)
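
The round trip above works because the parts are separated by single spaces. As a hedged illustration with a made-up input, repeated spaces are lost unless the separator is passed explicitly:

```python
messy = '1   2  3'

# Default split collapses runs of whitespace, so the original spacing is lost
round_trip = ' '.join(messy.split())
print(repr(round_trip))     # '1 2 3'
print(round_trip == messy)  # False

# With an explicit separator, split keeps the empty pieces and join restores them
restored = ' '.join(messy.split(' '))
print(restored == messy)    # True
```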

A slightly harder exercise: write a MATLAB-like string-to-matrix parser.

(https://www.mathworks.com/help/matlab/learn_matlab/matrices-and-arrays.html)

In [None]:
def create_matrix(s: str) -> list[list[int]]:
    '''
    Given a string of digits, semicolons and spaces, where
    each space separates elements of one row in a matrix and each semicolon separates different rows,
    return a resulting matrix.
    It is guaranteed that every row has the same number of elements.
    Ex.: '1 2;3 4' -> [[1, 2], [3, 4]]
    '''
    res = []
    for row_str in s.split(';'):
        res.append([*map(int, row_str.split())])
    return res

In [None]:
# test
assert create_matrix('1 2;3 4') == [[1, 2], [3, 4]]
assert create_matrix('1 3 5; 2 4 6; 7 8 10') == [[1, 3, 5], [2, 4, 6], [7, 8, 10]]
assert create_matrix('1;2;3') == [[1], [2], [3]]
assert create_matrix('1 2 3') == [[1, 2, 3]]

`str.upper str.lower str.title`

One can also make a string uppercase/lowercase as well as make it into a title:

In [None]:
print('hElLo, hoW aRe thIngS?'.upper())
print('hElLo, hoW aRe thIngS?'.lower())
print('hElLo, hoW aRe thIngS?'.title())

In [None]:
def swap_case(s: str) -> str:
    '''
    Given a string of English letters,
    return a new string where every uppercase letter becomes lowercase and vice versa.
    Ex.: 'aBCde' -> 'AbcDE'
    '''
    return s.swapcase()

In [None]:
# test
assert swap_case('aUhfEq') == 'AuHFeQ'
assert swap_case('i') == 'I'
assert swap_case('') == ''
assert swap_case(string.ascii_lowercase) == string.ascii_uppercase

`str.endswith str.startswith`

These speak for themselves:

In [None]:
print('hello'.endswith('llo'))
print('hello'.endswith('he'))
print('hello'.startswith('lo'))
print('hello'.startswith('hel'))

`str.find str.count`

`.find` returns the lowest index in the string where the substring is found (or `-1` if it is absent); `.count` returns the number of non-overlapping occurrences of the substring.

In [None]:
print('01123424325'.find('2'))
print('011234243225'.count('2'))
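
A few edge cases worth knowing, shown as a small sketch on the same kind of input:

```python
s = '01123424325'

first = s.find('2')       # lowest index at which '2' occurs
missing = s.find('9')     # -1 signals "not found" (no exception is raised)
later = s.find('2', 4)    # start searching at index 4
overlaps = 'aaaa'.count('aa')  # occurrences are counted without overlap

print(first, missing, later, overlaps)  # 3 -1 6 2
```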

`str.replace`

`.replace` returns a new string in which all occurrences of one substring are replaced by a new substring. It <b>does not</b> change the string itself.

In [None]:
s1 = 'Hello, Lola'
print(s1.replace('l', 'I')) # the 'L' didn't get changed because 'L' != 'l'

s2 = 'abcruhbceucbcqe'
print(s2.replace('bc', '_'))
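
`str.replace` also accepts an optional third argument that limits how many occurrences are replaced. A short sketch reusing the string from above:

```python
s = 'abcruhbceucbcqe'

# Replace only the first occurrence of 'bc'
first_only = s.replace('bc', '_', 1)
print(first_only)  # a_ruhbceucbcqe

# The original string is unchanged, since str is immutable
print(s)           # abcruhbceucbcqe
```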

Homework: look at `str.strip` and `str.format` on your own and then uncomment the next cell

In [None]:
# s3 = '   hi   \n     '
# print(f'before:[{s3}]')
# print(f'after:[{s3.strip()}]')

# s4 = 'Hello, my name is {}, I am {} years old'
# print(s4.format('Jaden', 28))
# print(s4.format('Lola', 19))

### `time`

In [None]:
import time

This module provides functionality for working with time, mostly using standard data types (such as `float` and `str`).

Let's look at some useful and widely-used functions.

`time`

returns the number of seconds since the epoch (that is, January 1, 1970, 00:00:00 UTC) as a floating-point number. This number is often called "Unix time".

`ctime`

converts Unix time to a human-readable string format.

In [None]:
current_unix_time = time.time()
print(f'as of now, {current_unix_time} seconds have passed since 01.01.1970 00:00:00 UTC')

In [None]:
current_time_string = time.ctime(current_unix_time)
print(f'now is {current_time_string}')

# let's subtract 200k seconds from now and see where it takes us:
two_hundr_k_secs_back_time_string = time.ctime(current_unix_time - 200_000)
print(f'200k seconds ago was {two_hundr_k_secs_back_time_string}')
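
To see the epoch itself, one can convert Unix time `0` to a UTC time structure. This is a small sketch; the format string is our own choice, and the printed value assumes a platform whose epoch is 1970 (Windows and most Unix systems):

```python
import time

# struct_time for Unix time 0, expressed in UTC rather than local time
epoch = time.gmtime(0)
epoch_string = time.strftime('%Y-%m-%d %H:%M:%S', epoch)
print(epoch_string)  # 1970-01-01 00:00:00
```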

`perf_counter`

If you want to measure the time performance of a piece of code, you should use the `perf_counter` function.
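
The usual pattern is to read the counter before and after the code of interest and subtract. A minimal sketch, where `timed` is a hypothetical helper of our own and not part of the `time` module:

```python
import time

def timed(func, *args):
    """Hypothetical helper: run func(*args) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args)
    elapsed = time.perf_counter() - start
    return result, elapsed

total, seconds = timed(sum, range(1000))
print(f'sum took {seconds:.6f} seconds; the result is {total}')
```

Only the difference between two `perf_counter` readings is meaningful; a single reading has no defined reference point.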

Let's create a huge list of integers and find its sum in two ways.

In [None]:
def create_list(size: int) -> list[int]:
    assert size > 0, 'Size must be a positive number'
    return list(range(size))

In [None]:
def sum_of_list_1(lst: list[int]) -> int:
    s = 0
    for el in lst:
        s += el
    return s

Come up with a better way:

In [None]:
def sum_of_list_2(lst: list[int]) -> int:
    '''
    Return a sum of all elements in a list.
    '''
    return sum(lst)

def sum_of_list_3(lst: list[int]) -> int:
    '''
    Knowing how the list is created (see `create_list` function), write a constant time algorithm
    to return a sum of its elements.
    '''
    return len(lst)*(len(lst)-1) // 2

In [None]:
huge_list = create_list(10_000_000) # you can change this number later to see how the time needed changes

Now, measure the time needed to run each of these functions (`delta_t` variables):

In [None]:
...
res1 = sum_of_list_1(huge_list)
delta_t_1 = ...
print(f'sum_of_list_1 took {delta_t_1} seconds; the result is {res1}')

In [None]:
...
res2 = sum_of_list_2(huge_list)
delta_t_2 = ...
print(f'sum_of_list_2 took {delta_t_2} seconds; the result is {res2}')

In [None]:
...
res3 = sum_of_list_3(huge_list)
delta_t_3 = ...
print(f'sum_of_list_3 took {delta_t_3} seconds; the result is {res3}')

In [None]:
# test
assert res1 == res2 == res3, 'The sums are not equal'
assert delta_t_1 > delta_t_2, 'The second function must be faster than the first one'
assert delta_t_2 > delta_t_3, 'The third function must be faster than the second one'

### `datetime`

This module is similar to `time`, but it works with its own data type: `datetime.datetime`.

In [None]:
import datetime

Note that the constructor arguments are (year, month, day, hour, minute, second, microsecond); the last four are optional and default to zero.

You can create a `datetime` object like this:

In [None]:
dt1 = datetime.datetime(2006, 1, 25, 3, 23, 34, 12430)
dt2 = datetime.datetime(2006, 1, 25)
print(f'these objects have a special {type(dt1)} type')

print(dt1, dt2, sep='\n')
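
Each component is available as an attribute of the object. A brief sketch using the same values as above (the names `dt` and `midnight` are illustrative):

```python
import datetime

dt = datetime.datetime(2006, 1, 25, 3, 23, 34, 12430)
print(dt.year, dt.month, dt.day)      # 2006 1 25
print(dt.hour, dt.minute, dt.second)  # 3 23 34
print(dt.microsecond)                 # 12430

# Omitted time parameters default to zero
midnight = datetime.datetime(2006, 1, 25)
print(midnight.hour, midnight.microsecond)  # 0 0
```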

If you want to get a datetime object of the current time, use the `.now` method:

In [None]:
dt_now = datetime.datetime.now()

print('now is:', dt_now)

<b>Why</b> is storing a point in time as an object useful?

<b>Because</b> you can do cool arithmetic operations with it:

In [None]:
delta = dt_now - dt1
print(type(delta))

print(f'between now and {dt1} there are {delta}')
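
A `timedelta` stores its value as days, seconds and microseconds. The sketch below uses two fixed dates of our own choosing so that the output is reproducible:

```python
import datetime

example_delta = (datetime.datetime(2006, 1, 26, 4, 0, 0)
                 - datetime.datetime(2006, 1, 25, 3, 30, 0))
print(example_delta)                  # 1 day, 0:30:00
print(example_delta.days)             # 1
print(example_delta.seconds)          # 1800 (what is left after the whole days)
print(example_delta.total_seconds())  # 88200.0
```

Note that `.seconds` is only the remainder within a day; use `.total_seconds()` for the full duration.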

Similarly to how it's done in the `time` module, you can convert things into Unix time and
create `datetime` objects from a Unix time number:

In [None]:
print(dt_now.timestamp())

dt_from_unix = datetime.datetime.fromtimestamp(1_600_000_000.0)
print(dt_from_unix)
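
Without further arguments, `fromtimestamp` interprets the number in the local time zone, so the printed value depends on where the code runs. A sketch of a time-zone-independent variant, passing the `tz` argument:

```python
import datetime

# tz=datetime.timezone.utc yields an aware datetime expressed in UTC
utc_dt = datetime.datetime.fromtimestamp(1_600_000_000, tz=datetime.timezone.utc)
print(utc_dt)              # 2020-09-13 12:26:40+00:00
print(utc_dt.timestamp())  # 1600000000.0
```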

Remember how we subtracted two `datetime` objects and got a `timedelta` object? The same works in reverse: we can add a `timedelta` to, or subtract it from, a `datetime` to get a new `datetime`:

In [None]:
six_weeks = datetime.timedelta(days=6*7)
print(six_weeks)

print('six weeks in the future:', dt_now + six_weeks)
print('six weeks in the past:', dt_now - six_weeks)

In [79]:
def datetime1() -> int:
    '''
    Returns an integer amount of whole seconds between 12 Jan 2013 12:00:00.0000 and 25 Jan 2006 03:30:26.0000
    '''
    delta = datetime.datetime(2013, 1, 12, 12) - datetime.datetime(2006, 1, 25, 3, 30, 26)
    return int(delta.total_seconds())

In [80]:
# test
res = datetime1()
assert isinstance(res, int), 'Not an integer'
assert sum(map(int, str(res))) == 37, 'Wrong answer'

In [98]:
def datetime2(dt: datetime.datetime) -> str:
    '''
    Return a string representation of a given datetime object in the following format:
        dd.mm.yyyy hh:mm:ss
    '''
    return dt.strftime("%d.%m.%Y %H:%M:%S")

In [99]:
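# test
import re, random
# Fixed inputs keep these checks independent of when and in which time zone the notebook runs
assert datetime2(datetime.datetime(2023, 5, 10, 9, 49, 46)) == '10.05.2023 09:49:46'
assert datetime2(datetime.datetime(2020, 9, 13, 14, 26, 40)) == '13.09.2020 14:26:40'
for _ in range(15):
    rnd_timestamp = random.randint(1_000_000, 1_800_000_000)
    result = datetime2(datetime.datetime.fromtimestamp(rnd_timestamp))
    # dots are escaped so that they match literally; fullmatch checks the whole string
    assert re.fullmatch(r'\d\d\.\d\d\.\d{4} \d\d:\d\d:\d\d', result), \
        f'{result} failed; expected format "dd.mm.yyyy hh:mm:ss"'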
    print('+', end='')


+++++++++++++++
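
The same format codes work in the opposite direction: `datetime.datetime.strptime` parses a string into a `datetime` object. A short sketch with an illustrative input string:

```python
import datetime

fmt = '%d.%m.%Y %H:%M:%S'
text = '25.01.2006 03:23:34'

# Parse the string back into a datetime object
parsed = datetime.datetime.strptime(text, fmt)
print(parsed)                        # 2006-01-25 03:23:34
print(parsed.strftime(fmt) == text)  # True
```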

### `os`

### `math`

In [None]:
temperature_anomalies = [
    -0.17, -0.09, -0.11, -0.18, -0.28, -0.33, -0.32, -0.37, -0.18,
    -0.11, -0.35, -0.23, -0.27, -0.31, -0.3 , -0.23, -0.11, -0.11,
    -0.28, -0.18, -0.08, -0.15, -0.28, -0.37, -0.47, -0.26, -0.22,
    -0.39, -0.43, -0.49, -0.43, -0.44, -0.36, -0.34, -0.15, -0.14,
    -0.36, -0.46, -0.29, -0.27, -0.27, -0.18, -0.28, -0.26, -0.27,
    -0.22, -0.1 , -0.21, -0.19, -0.36, -0.15, -0.09, -0.15, -0.28,
    -0.12, -0.19, -0.14, -0.02,  0.  , -0.02,  0.13,  0.19,  0.07,
    0.09,  0.2 ,  0.09, -0.07, -0.02, -0.1 , -0.11, -0.17, -0.07,
    0.01,  0.08, -0.13, -0.14, -0.19,  0.05,  0.06,  0.03, -0.03,
    0.06,  0.03,  0.05, -0.2 , -0.11, -0.06, -0.02, -0.08,  0.05,
    0.03, -0.08,  0.01,  0.16, -0.07, -0.01, -0.1 ,  0.18,  0.07,
    0.16,  0.26,  0.32,  0.14,  0.31,  0.16,  0.12,  0.18,  0.32,
    0.39,  0.27,  0.45,  0.4 ,  0.22,  0.23,  0.32,  0.45,  0.33,
    0.46,  0.61,  0.38,  0.39,  0.54,  0.63,  0.62,  0.53,  0.68,
    0.64,  0.66,  0.54,  0.66,  0.72,  0.61,  0.65,  0.68,  0.75,
    0.9 ,  1.02,  0.92,  0.85,  0.98,  1.02,  0.85,  0.89
]
years = list(range(1880, 2023))

### `random`