<a href="https://colab.research.google.com/github/ShaunakSen/problem-solving-with-code/blob/master/Improve_Python.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Advanced concepts to improve Python

> Credits: https://www.youtube.com/playlist?list=PLP8GkvaIxJP0VAXF3USi9U4JnpxUvQXHx

### Emulating switch/case Statements in Python with Dictionaries

Long if/else chains can become hard to read.

Python has first-class functions, which means functions are objects like any other: you can store them in lists or dicts, pass them as arguments, and return them from other functions.

In [None]:
def myfunc(a,b):
    return a+b
funcs = [myfunc]
funcs[0](2,3)

5

This feature will allow us to emulate a switch/case statement

We can construct a dict with keys as the conditions and values as the handler functions

```python
func_dict = {
    'cond_a': handle_a,
    'cond_b' : handle_b,
}

func_dict[cond]()
```
But there is no default condition here.

One solution is to use `.get(condition)`: if the key is not found, it returns `None` instead of raising a `KeyError`.

Note that with many conditions this can be more efficient than an if/else chain: the chain tests each condition in sequence, while the dict does a single hash lookup.
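A minimal runnable sketch of the idea above. The handler names (`handle_a`, `handle_b`, `handle_default`) and the `dispatch` helper are hypothetical, chosen only for illustration. Passing a fallback function as the second argument to `.get()` gives the lookup a default branch.

```python
# Hypothetical handlers; any callables would work here.
def handle_a():
    return 'handled a'

def handle_b():
    return 'handled b'

def handle_default():
    return 'no handler for this condition'

func_dict = {
    'cond_a': handle_a,
    'cond_b': handle_b,
}

def dispatch(cond):
    # .get() returns handle_default when cond is not a key,
    # so the trailing call never ends up as None()
    return func_dict.get(cond, handle_default)()

print(dispatch('cond_a'))
print(dispatch('unknown'))
```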

In [None]:
### standard code
def dispatch_if(operator, x, y):
    if operator == 'add':
        return x+y
    if operator == 'sub':
        return x-y
    if operator == 'mul':
        return x*y
    if operator == 'div':
        return x/y
    return None

### Lambda functions

A lambda function is a small anonymous function.

A lambda function can take any number of arguments, but can only have one expression.

`lambda arguments : expression`

In [None]:
x = lambda a : a+10
print (x(5))

15


In [None]:
x = lambda a, b : a * b
print(x(5, 6))

30


The power of lambda is better shown when you use it as an anonymous function inside another function.

Say you have a function definition that takes one argument, `n`, and returns a lambda that multiplies whatever number it receives by `n`.

You can use that function definition to make a function that always doubles the number you send in, a function that always triples it, or both in the same program:




In [None]:
def myfunc(n):
  return lambda a : a * n

mydoubler = myfunc(2)
mytripler = myfunc(3)

print(mydoubler(11))
print(mytripler(11))

22
33


### Back to the switch statement problem

In [None]:
### standard code
def dispatch_if(operator, x, y):
    if operator == 'add':
        return x+y
    if operator == 'sub':
        return x-y
    if operator == 'mul':
        return x*y
    if operator == 'div':
        return x/y
    return None

In [None]:
def dispatch_dict(operator, x, y):
    return {
        'add': lambda: x+y,
        'sub': lambda: x-y,
        'mul': lambda: x*y,
        'div': lambda: x/y
    }.get(operator, None)

dispatch_dict('add',4,5)()

9

`dict.get(key[, value])`

- value (optional) - Value to be returned if the key is not found. The default value is None.


The thing to understand here is that `dispatch_dict` returns a lambda function: `dispatch_dict('add',4,5)` gives back the function, and we have to call it with `()` to get the value.

`dispatch_dict('abc',4,5)()` fails because `.get()` returns `None`, so we end up evaluating `None()`, which raises `TypeError: 'NoneType' object is not callable`.

In [None]:
def dispatch_dict(operator, x, y):
    ### we return the function and call it as well
    return {
        'add': lambda: x+y,
        'sub': lambda: x-y,
        'mul': lambda: x*y,
        'div': lambda: x/y
    }.get(operator, lambda: None)()

dispatch_dict('add',4,5)

9

In [None]:
dispatch_dict('abc',4,5)
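One possible variation, not part of the original notes: `dispatch_dict` rebuilds its dictionary and four lambdas on every call. A sketch that builds the table once, using functions from the standard `operator` module, might look like the following. The names `OPERATIONS` and `dispatch_table` are hypothetical.

```python
import operator

# Built once at import time instead of on every call.
OPERATIONS = {
    'add': operator.add,
    'sub': operator.sub,
    'mul': operator.mul,
    'div': operator.truediv,
}

def dispatch_table(op_name, x, y):
    func = OPERATIONS.get(op_name)
    if func is None:
        return None  # same fallback as dispatch_dict above
    return func(x, y)

print(dispatch_table('mul', 4, 5))
print(dispatch_table('abc', 4, 5))
```

Because the handlers now take `x` and `y` as arguments instead of closing over them, the table no longer depends on any particular call.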

## Using `defaultdict` in python

> https://realpython.com/python-defaultdict/

---

### Handling Missing Keys in Dictionaries


A common issue that you can face when working with Python dictionaries is how to handle missing keys. If your code is heavily based on dictionaries, or if you’re creating dictionaries on the fly all the time, then you’ll soon notice that dealing with frequent KeyError exceptions can be quite annoying and can add extra complexity to your code. With Python dictionaries, you have at least four available ways to handle missing keys:

1. Use .setdefault()
2. Use .get()
3. Use the key in dict idiom
4. Use a try and except block


`setdefault(key[, default])`

> If key is in the dictionary, return its value. If not, insert key with a value of default and return default. default defaults to None.



In [2]:
a_dict = {}
a_dict['missing_key']

KeyError: 'missing_key'

In [3]:
a_dict.setdefault('missing_key', 'hello mini')

'hello mini'

In [4]:
a_dict['missing_key']

'hello mini'

In [5]:
a_dict.setdefault('missing_key_2', 'hello mini again')

'hello mini again'

In [6]:
a_dict['missing_key_2']

'hello mini again'

In [7]:
a_dict.setdefault('missing_key', 'hello shona')
a_dict['missing_key'] ### no change

'hello mini'

In the above code, you use .setdefault() to generate a default value for missing_key. Notice that your dictionary, a_dict, now has a new key called missing_key whose value is 'hello mini'. This key didn’t exist before you called .setdefault(). Finally, if you call .setdefault() on an existing key, as in the last cell with 'hello shona', then the call won’t have any effect on the dictionary. Your key will hold the original value instead of the new default value.



In [8]:
a_dict = {}
a_dict.get('missing_key', 'not found')

'not found'

In [9]:
a_dict

{}

Here, you use .get() to generate a default value for missing_key, but this time, your dictionary stays empty. This is because .get() returns the default value, but this value isn’t added to the underlying dictionary. For example, if you have a dictionary called D, then you can assume that .get() works something like this:

`D.get(key, default) -> D[key] if key in D, else default`

With this pseudo-code, you can understand how .get() works internally. __If the key exists, then .get() returns the value mapped to that key. Otherwise, the default value is returned. Your code never creates or assigns a value to key. If you omit default, it defaults to None.__
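The pseudo-code above can be written out as a small function. This is only a sketch of the observable behaviour, not how `dict.get()` is actually implemented; `get_like` is a hypothetical name.

```python
def get_like(d, key, default=None):
    """Hypothetical helper that mirrors what dict.get() does."""
    if key in d:
        return d[key]
    return default  # d itself is left untouched

a_dict = {'present': 1}
print(get_like(a_dict, 'present'))
print(get_like(a_dict, 'missing', 'not found'))
print(get_like(a_dict, 'missing'))
print(a_dict)  # still only contains 'present'
```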

You can also use conditional statements to handle missing keys in dictionaries. Take a look at the following example, which uses the key in dict idiom:

```python
>>> a_dict = {}
>>> if 'key' in a_dict:
...     # Do something with 'key'...
...     a_dict['key']
... else:
...     a_dict['key'] = 'default value'
...
>>> a_dict
{'key': 'default value'}
```

In this code, you use an if statement along with the in operator to check if key is present in a_dict. If so, then you can perform any action with key or with its value. Otherwise, you create the new key, key, and assign it a 'default value'. Note that the above code works similarly to .setdefault() but takes four lines of code, while .setdefault() would only take one line (in addition to being more readable).

You can also work around the KeyError by using a try and except block to handle the exception. Consider the following piece of code:

```python
>>> a_dict = {}
>>> try:
...     # Do something with 'key'...
...     a_dict['key']
... except KeyError:
...     a_dict['key'] = 'default value'
...
>>> a_dict
{'key': 'default value'}
```



### Understanding the Python defaultdict Type

The Python standard library provides collections, which is a module that implements specialized container types. One of those is the Python defaultdict type, which is an alternative to dict that’s specifically designed to help you out with missing keys. defaultdict is a Python type that inherits from dict:



In [10]:
from collections import defaultdict

In [11]:
print (issubclass(defaultdict, dict))

True


The above code shows that the Python defaultdict type is a subclass of dict. This means that defaultdict inherits most of the behavior of dict. So, you can say that defaultdict is much like an ordinary dictionary.

The main difference between defaultdict and dict is that when you try to access or modify a key that’s not present in the dictionary, a default value is automatically given to that key. In order to provide this functionality, the Python defaultdict type does two things:

1. It overrides `.__missing__()`.
2. It adds `.default_factory`, a writable instance variable that is set at the time of instantiation.


The instance variable .default_factory will hold the first argument passed into `defaultdict.__init__()`. This argument can take a valid Python callable or None. If a callable is provided, then it’ll automatically be called by defaultdict whenever you try to access or modify the value associated with a missing key.
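To make the two points above concrete, here is a simplified sketch of how a dict subclass could provide similar behaviour. `MyDefaultDict` is a hypothetical class written for illustration; it is not the real implementation of `collections.defaultdict`.

```python
class MyDefaultDict(dict):
    """Hypothetical, simplified stand-in for collections.defaultdict."""

    def __init__(self, default_factory=None, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.default_factory = default_factory

    def __missing__(self, key):
        # dict's subscript lookup calls this hook when key is absent
        if self.default_factory is None:
            raise KeyError(key)
        value = self.default_factory()
        self[key] = value  # store the default so later lookups find it
        return value

d = MyDefaultDict(list)
d['a'].append(1)
print(d)
```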

Take a look at how you can create and properly initialize a defaultdict:





In [19]:
def_dict = defaultdict(list)  # Pass list to .default_factory

In [20]:
def_dict['one'] = 1 # Add a key-value pair

In [21]:
def_dict['missing'] # Access a missing key returns an empty list

[]

This also creates an entry for `{'missing': []}` in the `def_dict`

In [22]:
def_dict

defaultdict(list, {'one': 1, 'missing': []})

In [23]:
def_dict['another_missing'].append(4)  # Modify a missing key

In [24]:
def_dict

defaultdict(list, {'one': 1, 'missing': [], 'another_missing': [4]})

Here, you pass list to .default_factory when you create the dictionary. Then, you use def_dict just like a regular dictionary. Note that when you try to access or modify the value mapped to a non-existent key, the dictionary assigns it the default value that results from calling list().
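One detail worth checking for yourself: only subscript access (`d[key]`) triggers `.default_factory`. A short sketch, using only standard `defaultdict` behaviour, shows that `in` and `.get()` leave the dictionary untouched, and that a `defaultdict` created without a factory still raises `KeyError`. The name `no_factory` is just for illustration.

```python
from collections import defaultdict

def_dict = defaultdict(list)

print('x' in def_dict)    # membership test does not insert the key
print(def_dict.get('x'))  # .get() bypasses default_factory
print(len(def_dict))      # still empty

def_dict['x']             # subscript access triggers default_factory
print(len(def_dict))      # now holds one key

no_factory = defaultdict()  # default_factory is None
try:
    no_factory['x']
except KeyError as err:
    print('KeyError:', err)
```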

### Using the Python defaultdict Type

__Grouping Items__

A typical use of the Python defaultdict type is to set .default_factory to list and then build a dictionary that maps keys to lists of values. With this defaultdict, if you try to get access to any missing key, then the dictionary runs the following steps:

1. Call list() to create a new empty list
2. Insert the empty list into the dictionary using the missing key as key
3. Return a reference to that list

This allows you to write code like this:



In [25]:
dd = defaultdict(list)
dd['key'].append(1)

In [26]:
dd

defaultdict(list, {'key': [1]})

In [28]:
dd['key'].append(2)

In [29]:
dd

defaultdict(list, {'key': [1, 2]})

In [30]:
dd['key'].append(3)
dd

defaultdict(list, {'key': [1, 2, 3]})

Here, you create a Python defaultdict called dd and pass list to .default_factory. Notice that even when key isn’t defined, you can append values to it without getting a KeyError. That’s because dd automatically calls .default_factory to generate a default value for the missing key.

You can use defaultdict along with list to group the items in a sequence or a collection. Suppose that you’ve retrieved the following data from your company’s database:

![Imgur](https://i.imgur.com/LmGgVWn.png)

With this data, you create an initial list of tuple objects like the following:



In [31]:
dep = [('Sales', 'John Doe'),
       ('Sales', 'Martin Smith'),
       ('Accounting', 'Jane Doe'),
       ('Marketing', 'Elizabeth Smith'),
       ('Marketing', 'Adam Doe')]

In [32]:
dep_dd = defaultdict(list)

for dep_, name_ in dep:
    dep_dd[dep_].append(name_)

dep_dd

defaultdict(list,
            {'Sales': ['John Doe', 'Martin Smith'],
             'Accounting': ['Jane Doe'],
             'Marketing': ['Elizabeth Smith', 'Adam Doe']})

> The same grouping can be written with a plain dict and .setdefault(). However, the defaultdict version is arguably more readable, and for large datasets it can also be faster, since .setdefault(key, []) has to build a new empty list on every call even when the key already exists. __So, if speed is a concern for you, then you should consider using a defaultdict instead of a standard dict.__
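For comparison, a sketch of the plain dict version of the grouping, using `.setdefault()` as described earlier in these notes. The variable name `dep_d` is hypothetical.

```python
dep = [('Sales', 'John Doe'),
       ('Sales', 'Martin Smith'),
       ('Accounting', 'Jane Doe'),
       ('Marketing', 'Elizabeth Smith'),
       ('Marketing', 'Adam Doe')]

dep_d = {}
for dep_, name_ in dep:
    # setdefault inserts an empty list the first time a department is seen
    dep_d.setdefault(dep_, []).append(name_)

print(dep_d)
```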



__Grouping Unique Items__:

Continue working with the data of departments and employees from the previous section. After some processing, you realize that a few employees have been duplicated in the database by mistake. You need to clean up the data and remove the duplicated employees from your dep_dd dictionary. To do this, you can use a set as the .default_factory and rewrite your code as follows:



In [33]:
dep = [('Sales', 'John Doe'),
       ('Sales', 'Martin Smith'),
       ('Accounting', 'Jane Doe'),
       ('Marketing', 'Elizabeth Smith'),
       ('Marketing', 'Elizabeth Smith'),
       ('Marketing', 'Adam Doe'),
       ('Marketing', 'Adam Doe'),
       ('Marketing', 'Adam Doe')]

In [35]:
dep_dd = defaultdict(set)

for dep_, name_ in dep:
    dep_dd[dep_].add(name_)

dep_dd

defaultdict(set,
            {'Sales': {'John Doe', 'Martin Smith'},
             'Accounting': {'Jane Doe'},
             'Marketing': {'Adam Doe', 'Elizabeth Smith'}})

__Counting Items__

If you set .default_factory to int, then your defaultdict will be useful for counting the items in a sequence or collection. When you call int() with no arguments, the function returns 0, which is the typical value you’d use to initialize a counter.

To continue with the example of the company database, suppose you want to build a dictionary that counts the number of employees per department. In this case, you can code something like this:




In [36]:
dep = [('Sales', 'John Doe'),
       ('Sales', 'Martin Smith'),
       ('Accounting', 'Jane Doe'),
       ('Marketing', 'Elizabeth Smith'),
       ('Marketing', 'Adam Doe')]

In [39]:
dd = defaultdict(int)

dd

defaultdict(int, {})

In [41]:
for dep_, name_ in dep:
    dd[dep_]+=1

In [42]:
dd

defaultdict(int, {'Sales': 2, 'Accounting': 1, 'Marketing': 2})

Here, you set .default_factory to int. When you call int() with no argument, the returned value is 0. You can use this default value to start counting the employees that work in each department. For this code to work correctly, you need a clean dataset. There must be no repeated data. Otherwise, you’ll need to filter out the repeated employees.
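If the data does contain repeated rows, one simple way to filter them out before counting is to pass the list through `set()`. This is a sketch on made-up data with one duplicated row, and it assumes duplicates are exact copies of the same tuple.

```python
from collections import defaultdict

# Hypothetical data with one duplicated row
dep = [('Sales', 'John Doe'),
       ('Sales', 'Martin Smith'),
       ('Sales', 'Martin Smith'),
       ('Accounting', 'Jane Doe')]

counts = defaultdict(int)
for dep_, name_ in set(dep):  # set() drops exact duplicate tuples
    counts[dep_] += 1

print(dict(counts))
```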

Another example of counting items is the mississippi example, where you count the number of times each letter in a word is repeated. Take a look at the following code:



In [43]:
s = 'mississippi'

dd = defaultdict(int)

for s_ in s:
    dd[s_] += 1

dd

defaultdict(int, {'m': 1, 'i': 4, 's': 4, 'p': 2})

As counting is a relatively common task in programming, the Python dictionary-like class collections.Counter is specially designed for counting items in a sequence. With Counter, you can write the mississippi example as follows:



In [45]:
from collections import Counter
counter = Counter(s)

counter

Counter({'m': 1, 'i': 4, 's': 4, 'p': 2})

In this case, Counter does all the work for you! You only need to pass in a sequence, and the dictionary will count its items, storing them as keys and the counts as values. Note that this example works because Python strings are also a sequence type.
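A short sketch of two other `Counter` features that are handy for this kind of task: `.most_common()` returns the items sorted by count, and looking up a missing key returns 0 without adding it to the counter.

```python
from collections import Counter

counter = Counter('mississippi')

print(counter.most_common(2))  # the two most frequent letters
print(counter['z'])            # missing keys count as zero
print('z' in counter)          # the lookup did not insert 'z'
```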

__Accumulating Values__

Sometimes you’ll need to calculate the total sum of the values in a sequence or collection. Let’s say you have the following data, taken from an Excel sheet, about the sales of your Python website:



In [46]:
incomes = [('Books', 1250.00),
           ('Books', 1300.00),
           ('Books', 1420.00),
           ('Tutorials', 560.00),
           ('Tutorials', 630.00),
           ('Tutorials', 750.00),
           ('Courses', 2500.00),
           ('Courses', 2430.00),
           ('Courses', 2750.00),]

With this data, you want to calculate the total income per product. To do that, you can use a Python defaultdict with float as .default_factory and then code something like this:



In [47]:
dd = defaultdict(float)
dd['test']

0.0

In [48]:
dd = defaultdict(float)
for product, income in incomes:
    dd[product] += income

dd

defaultdict(float, {'Books': 3970.0, 'Tutorials': 1940.0, 'Courses': 7680.0})
