# "fastcore: An Underrated Python Library"

> A unique python library that extends the python programming language and provides utilities that enhance productivity.
- author: "<a href='https://twitter.com/HamelHusain'>Hamel Husain</a>, <a href='https://twitter.com/jeremyphoward'>Jeremy Howard</a>"
- toc: false
- image: image/terminal.png
- comments: true
- categories: [fastcore, fastai]
- permalink: /fastcore/
- badges: true

![screenshot with code](fastcore_imgs/terminal.jpg "Credit: https://unsplash.com/photos/ieic5Tq8YMk")

## Background

I recently embarked on a journey to sharpen my python skills:  I wanted to learn advanced patterns, idioms, and techniques.  I started by reading a bunch of books on Advanced Python.  However, like anything else, the information didn’t seem to stick without having somewhere to apply it.  That’s when an opportunity to help document and write tests for the [python library fastcore](https://fastcore.fast.ai/) presented itself.  For the uninitiated, [fastcore](https://fastcore.fast.ai/) is a set of modules on top of which many [fast.ai](https://github.com/fastai) projects are built.  I talk more about this experience towards the end of this article.  However, I want to first tell you a bit about the library!

## Why fastcore is interesting

1. **Get exposed to ideas from other languages without leaving python:**  I’ve always heard that to become a better programmer, it is beneficial to learn other languages.  From a pragmatic point of view, I’ve found it difficult to learn other languages because I could never use them at work.  Fastcore extends python to include patterns found in languages as diverse as Julia, Ruby and Haskell.  Now that I understand these tools and have benefited from them, I am motivated to learn other languages.
2. **You get a new set of pragmatic tools**: fastcore includes utilities that will allow you to write more concise, expressive code, and perhaps solve new problems.
3. **Learn more about the python programming language:**  Because fastcore extends the python programming language, many advanced concepts are exposed during the process.  For the motivated, this is a great way to see how many of the internals of python work.  


In this blog post, I’m going to focus primarily on showing some highlights of my favorite tools as they relate to #2.  My goal is to pique your interest in this library, and hopefully motivate you to check out the documentation after you are done to learn more!

## A whirlwind tour through fastcore

Here are some things you can do with fastcore that immediately caught my attention.

In [1]:
#hide
from fastcore.foundation import *
from fastcore.utils import *
from functools import partial
import numpy as np
import inspect
from pdb import set_trace

### Making **kwargs transparent

Whenever I see a function that has the argument <strong>**kwargs</strong>, I cringe a little.  This is because it means the API is obfuscated and I have to read the source code to figure out what the valid parameters might be.  Consider the example below:

In [2]:
def baz(a, b=2, c=3): return a + b + c

def foo(c, a, **kwargs):
    return c + baz(a, **kwargs)

inspect.signature(foo)

<Signature (c, a, **kwargs)>

Without reading the source code, it might be hard for me to know that `foo` also accepts additional parameters `b` and `c`.  We can fix this with [`delegates`](https://fastcore.fast.ai/foundation.html#delegates):

In [3]:
def baz(a, b=2, c=3): return a + b + c

@delegates(baz) # this decorator will pass down keyword arguments from baz
def foo(c, a, **kwargs):
    return c + baz(a, **kwargs)

inspect.signature(foo)

<Signature (c, a, b=2)>

You can customize the behavior of this decorator.  For example, you can have your cake and eat it too by passing down your arguments and also keeping `**kwargs`:

In [4]:
@delegates(baz, keep=True)
def foo(c, a, **kwargs):
    return c + baz(a, **kwargs)

inspect.signature(foo)

<Signature (c, a, b=2, **kwargs)>

You can also exclude arguments.  For example, we exclude argument `d` from delegation:

In [5]:
def basefoo(c=2, d=3): 
    pass

@delegates(basefoo, but=['d']) # exclude `d`
def foo(a, b=1, **kwargs):
    pass

inspect.signature(foo)

<Signature (a, b=1, c=2)>

You can also delegate between classes:

In [6]:
class BaseFoo:
    def __init__(self, e, c=2): pass

@delegates() # since no argument was passed here, we delegate to the superclass
class Foo(BaseFoo):
    def __init__(self, a, b=1, **kwargs): super().__init__(**kwargs)
        
inspect.signature(Foo)

<Signature (a, b=1, c=2)>

For more information, read the [docs on delegates](https://fastcore.fast.ai/foundation.html#delegates).
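
To make the mechanism less magical, here is a minimal sketch of how a decorator could rewrite a signature using only the standard library's `inspect` module.  The name `delegates_sketch` is hypothetical and this is not fastcore's implementation: it ignores `keep`, `but`, and class delegation, and it marks the delegated parameters as keyword-only so that the rewritten signature is always valid.

```python
import inspect

def delegates_sketch(to):
    "Hypothetical, simplified stand-in for `delegates`: swap **kwargs for `to`'s defaulted parameters."
    def decorator(f):
        sig = inspect.signature(f)
        params = dict(sig.parameters)
        # Find the **kwargs parameter of `f`, if it has one
        kw_names = [n for n, p in params.items() if p.kind is inspect.Parameter.VAR_KEYWORD]
        if not kw_names:
            return f  # nothing to replace
        del params[kw_names[0]]
        # Add every parameter of `to` that has a default and is not already in `f`
        for name, p in inspect.signature(to).parameters.items():
            if p.default is not inspect.Parameter.empty and name not in params:
                params[name] = p.replace(kind=inspect.Parameter.KEYWORD_ONLY)
        # `inspect.signature` reports `__signature__` when it is set
        f.__signature__ = sig.replace(parameters=list(params.values()))
        return f
    return decorator

def baz(a, b=2, c=3): return a + b + c

@delegates_sketch(baz)
def foo(c, a, **kwargs):
    return c + baz(a, **kwargs)

print(inspect.signature(foo))  # (c, a, *, b=2)
```

Only the reported signature changes; `foo` still forwards whatever keyword arguments it receives to `baz`.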

### Avoid boilerplate when setting instance attributes

One thing I have always wondered is whether it is possible to avoid the lengthy boilerplate involved in setting attributes in `__init__`, which often means repeating many variable names (and is error-prone):

In [7]:
class Test:
    def __init__(self, first_variable, second_variable, third_variable):
        self.first_variable = first_variable
        self.second_variable = second_variable
        self.third_variable = third_variable

Ouch! That was painful.  Look at all the repeated variable names.  Do I really have to repeat myself like this when defining a class?  Not anymore!  Check out [`fastcore.utils.store_attr`](https://fastcore.fast.ai/utils.html#store_attr):

In [8]:
class Test:
    def __init__(self, first_variable, second_variable, third_variable):
        store_attr(self, 'first_variable, second_variable, third_variable')
        
t = Test(5,4,3)
assert t.second_variable == 4

That's much better.  However, I'm pretty lazy, so I can extend this a bit more if I really don't want to repeat myself, using this [wisdom I found on stack overflow](https://stackoverflow.com/questions/582056/getting-list-of-parameter-names-inside-python-function):

In [21]:
def save_attr(self):
    "lazier version of store_attr"
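    code = self.__init__.__code__
    # co_varnames also lists local variables, so stop at co_argcount to keep only the arguments
    store_attr(self, ','.join(code.co_varnames[1:code.co_argcount]))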

class Test:
    def __init__(self, first_variable, second_variable, third_variable):
        save_attr(self)

t = Test(1,2,4)
assert t.second_variable == 2

The point in this case is that fastcore can give you lots of cool ideas on how to extend python to get rid of anything that annoys you.  Inspecting the source code is also very helpful, but that was not required in this case.
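
If you are curious how a helper like this can see the arguments of `__init__` at all, one possible approach is to read the caller's local variables from its stack frame.  The sketch below uses only the standard library; `store_attr_sketch` is a hypothetical name, and this is an illustration of the idea under the assumption that it runs on CPython, not a copy of fastcore's code.

```python
import inspect

def store_attr_sketch(self, names):
    "Hypothetical helper: copy the caller's local variables listed in `names` onto `self`."
    # f_back is the frame of whoever called us, i.e. the `__init__` method
    caller_locals = inspect.currentframe().f_back.f_locals
    for name in names.split(','):
        name = name.strip()
        setattr(self, name, caller_locals[name])

class Point:
    def __init__(self, x, y, label='origin'):
        store_attr_sketch(self, 'x, y, label')

p = Point(1, 2)
print(p.x, p.y, p.label)  # 1 2 origin
```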

### Avoiding subclassing boilerplate

One thing I hate about python is the `super().__init__()` boilerplate associated with subclassing.  For example:

In [10]:
class ParentClass:
    def __init__(self): self.some_attr = 'hello'
        
class ChildClass(ParentClass):
    def __init__(self):
        super().__init__()

cc = ChildClass()
assert cc.some_attr == 'hello' # only accessible b/c you used super

We can avoid this boilerplate by using the metaclass [`PrePostInitMeta`](https://fastcore.fast.ai/foundation.html#PrePostInitMeta).  We define a new class called `NewParent` that is a wrapper around the `ParentClass`:

In [11]:
class NewParent(ParentClass, metaclass=PrePostInitMeta):
    def __pre_init__(self, *args, **kwargs): super().__init__()

class ChildClass(NewParent):
    def __init__(self): pass
    
sc = ChildClass()
assert sc.some_attr == 'hello' 

Learn more about how this works by [reading the docs](https://fastcore.fast.ai/foundation.html#PrePostInitMeta).
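
As a rough mental model, a metaclass can get this behavior by overriding `__call__`, which is what runs when you write `ChildClass()`.  The sketch below is a simplified illustration with a hypothetical name, `PrePostInitSketch`; it assumes `__new__` takes no extra arguments and is not fastcore's actual implementation.

```python
class PrePostInitSketch(type):
    "Hypothetical metaclass: call `__pre_init__` before and `__post_init__` after `__init__`, if defined."
    def __call__(cls, *args, **kwargs):
        obj = cls.__new__(cls)
        if isinstance(obj, cls):
            if hasattr(obj, '__pre_init__'): obj.__pre_init__(*args, **kwargs)
            obj.__init__(*args, **kwargs)
            if hasattr(obj, '__post_init__'): obj.__post_init__(*args, **kwargs)
        return obj

class Parent:
    def __init__(self): self.some_attr = 'hello'

class NewParent(Parent, metaclass=PrePostInitSketch):
    # Runs before every subclass `__init__`, so subclasses need not call super()
    def __pre_init__(self, *args, **kwargs): super().__init__()

class Child(NewParent):
    def __init__(self): pass

child = Child()
print(child.some_attr)  # hello
```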

### Type Dispatch

Type dispatch, or multiple dispatch, allows you to change the way a function behaves based upon the input types it receives.  This is a prominent feature in some programming languages like Julia.  For example, here is a conceptual example of how multiple dispatch works in Julia, where a different method is chosen depending on the input types of x and y:

```julia
collide_with(x::Asteroid, y::Asteroid) = ...  
# deal with asteroid hitting asteroid 

collide_with(x::Asteroid, y::Spaceship) = ...  
# deal with asteroid hitting spaceship 

collide_with(x::Spaceship, y::Asteroid) = ...  
# deal with spaceship hitting asteroid 

collide_with(x::Spaceship, y::Spaceship) = ...  
# deal with spaceship hitting spaceship 
```

Type dispatch can be especially useful in data science, where you might allow different input types (e.g. numpy arrays and pandas dataframes) to a function that processes data.  Type dispatch allows you to have a common API for functions that do similar tasks.

Unfortunately, Python does not support this out of the box.  Fortunately, there is the [`@typedispatch`](https://fastcore.fast.ai/dispatch.html#typedispatch-Decorator) decorator to the rescue.
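
To give a feel for the idea in Python, here is a toy dispatcher built only on the standard library that mirrors the Julia example above.  `DispatchSketch` is a hypothetical name and a deliberately naive design: it picks the first registered implementation whose annotations match, rather than the most specific one, and it is not how `@typedispatch` is implemented.

```python
import inspect

class DispatchSketch:
    "Hypothetical registry that picks an implementation from the types of the positional arguments."
    def __init__(self):
        self.impls = []

    def register(self, f):
        # Read the expected types from the function's annotations
        types = tuple(p.annotation for p in inspect.signature(f).parameters.values())
        self.impls.append((types, f))
        return self

    def __call__(self, *args):
        # First match wins, in registration order
        for types, f in self.impls:
            if len(types) == len(args) and all(isinstance(a, t) for a, t in zip(args, types)):
                return f(*args)
        raise TypeError(f"no implementation for {[type(a).__name__ for a in args]}")

class Asteroid: pass
class Spaceship: pass

collide_with = DispatchSketch()

@collide_with.register
def _(x: Asteroid, y: Asteroid): return "asteroid hits asteroid"

@collide_with.register
def _(x: Asteroid, y: Spaceship): return "asteroid hits spaceship"

@collide_with.register
def _(x: Spaceship, y: Asteroid): return "spaceship hits asteroid"

@collide_with.register
def _(x: Spaceship, y: Spaceship): return "spaceship hits spaceship"

print(collide_with(Asteroid(), Spaceship()))  # asteroid hits spaceship
```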

### An improvement upon functools.partial: partialler

`functools.partial` is a great utility that creates functions from other functions and lets you set default values.  Let's take, for example, this function that filters a list to only contain values >= `val`:

In [12]:
test_input = [1,2,3,4,5,6]
def f(arr, val): 
    "Filter a list to remove any values that are less than val."
    return [x for x in arr if x >= val]

f(test_input, 3)

[3, 4, 5, 6]

You can create a new function out of this function using `partial` that sets the default value to 5:

In [13]:
filter5 = partial(f, val=5)
filter5(test_input)

[5, 6]

One problem with `partial` is that it removes the original docstring and replaces it with a generic docstring:

In [14]:
filter5.__doc__, inspect.signature(filter5)

('partial(func, *args, **keywords) - new function with partial application\n    of the given arguments and keywords.\n',
 <Signature (arr, *, val=5)>)

`fastcore.partialler` fixes this, and makes sure the docstring is retained so that the new function's API is transparent to the end user:

In [15]:
filter5 = partialler(f, val=5)
filter5.__doc__, inspect.signature(filter5)

('Filter a list to remove any values that are less than val.',
 <Signature (arr, *, val=5)>)
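
The underlying trick is small enough to sketch with the standard library alone: `functools.partial` objects accept attribute assignment, so the original docstring can simply be copied over.  The helper name `partial_with_doc` below is hypothetical, and this is an illustration of the idea rather than fastcore's code.

```python
from functools import partial

def partial_with_doc(f, *args, **kwargs):
    "Hypothetical helper: like `functools.partial`, but carries over the original docstring."
    new_f = partial(f, *args, **kwargs)
    new_f.__doc__ = f.__doc__
    return new_f

def f(arr, val):
    "Filter a list to remove any values that are less than val."
    return [x for x in arr if x >= val]

filter5 = partial_with_doc(f, val=5)
print(filter5.__doc__)              # Filter a list to remove any values that are less than val.
print(filter5([1, 2, 3, 4, 5, 6]))  # [5, 6]
```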

### Composition of functions

A technique that is pervasive in functional programming languages is function composition, whereby you chain a bunch of functions together to achieve some kind of result.  This is especially useful when applying various data transformations.  Consider a toy example where I have three functions:  (1) removes elements of a list less than 5 (from the prior section), (2) adds 2 to each number, and (3) sums all the numbers:

In [16]:
def add(arr, val): return [x + val for x in arr]
def arrsum(arr): return sum(arr)

# See the previous section on partialler
add2 = partialler(add, val=2)

transform = compose(filter5, add2, arrsum)
transform([1,2,3,4,5,6])

15

But why is this useful?  You might be thinking, I can accomplish the same thing with:

```py
arrsum(add2(filter5([1,2,3,4,5,6])))
```
You are not wrong!  However, composition gives you a convenient interface in case you want to do something like the following:

In [17]:
def fit(x, transforms:list):
    "fit a model after performing transformations"
    x = compose(*transforms)(x)
    y = [np.mean(x)] * len(x) # it's a dumb model.  Don't judge me
    return y
    
fit(x=[1,2,3,4,5,6], transforms=[filter5, add2]) # filters out elements < 5, adds 2, then predicts the mean

[7.5, 7.5]
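
For intuition, left-to-right composition like this can be sketched in a few lines with `functools.reduce`.  The name `compose_sketch` is hypothetical, and this version only handles functions of a single argument; it is meant as an illustration, not as fastcore's `compose`.

```python
from functools import reduce

def compose_sketch(*funcs):
    "Hypothetical helper: return a function that applies `funcs` from left to right."
    def composed(x):
        return reduce(lambda acc, f: f(acc), funcs, x)
    return composed

def keep_at_least_5(arr): return [x for x in arr if x >= 5]
def add_2(arr): return [x + 2 for x in arr]

transform = compose_sketch(keep_at_least_5, add_2, sum)
print(transform([1, 2, 3, 4, 5, 6]))  # 15
```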

### A more useful <code>__repr__</code>

In python, `__repr__` helps you get information about an object for logging and debugging.  Below is what you get by default when you define a new class.  _Note: we are using `store_attr`, which was discussed earlier, to simplify our `__init__`._

In [29]:
class Test:
    store_attrs='a,b,c'
    def __init__(self, a, b=2, c=3): store_attr(self) # `store_attr` will use class attribute store_attrs if passed no arguments.
    
Test(1)

<__main__.Test at 0x7f8cbf62b610>

We can use `basic_repr` to quickly give us a more sensible default:

In [28]:
class Test:
    store_attrs='a,b,c'
    def __init__(self, a, b=2, c=3): store_attr(self) 
    __repr__ = basic_repr('a,b,c')
    
Test(2)

Test(a=2, b=2, c=3)

For more information, read [the docs](https://fastcore.fast.ai/utils.html#basic_repr).
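
A helper in this spirit is easy to sketch yourself: it is just a function that builds and returns a `__repr__` method.  The name `basic_repr_sketch` is hypothetical and the code below is an illustration using plain python, not fastcore's implementation.

```python
def basic_repr_sketch(names):
    "Hypothetical helper: build a `__repr__` that shows the comma-separated attributes in `names`."
    fields = [n.strip() for n in names.split(',')]
    def __repr__(self):
        vals = ', '.join(f'{n}={getattr(self, n)!r}' for n in fields)
        return f'{type(self).__name__}({vals})'
    return __repr__

class Test:
    def __init__(self, a, b=2, c=3): self.a, self.b, self.c = a, b, c
    __repr__ = basic_repr_sketch('a,b,c')

print(Test(2))  # Test(a=2, b=2, c=3)
```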

### Parallel Processing In Python

TODO parallel

### A Better pathlib

### A Drop In Replacement For List

## But Wait ... There's More!

TODO

## Further Reading

- Blog post on [delegation](https://www.fast.ai/2019/08/06/delegation/)
- Peer reviewed paper on fastai lib

## Aside: Thoughts about contributing to open source

![Screenshot of GitHub](fastcore_imgs/github.jpg "Credit: https://unsplash.com/photos/4J2OD3njVgI")

I spent about a month contributing to fastai, which mainly involved writing documentation for fastcore, but also setting up CI and performing DevOps tasks like creating Docker containers.  I was asked to share some thoughts on my experience, which follow.

Contributing to open source is a great way to sharpen your programming skills.  If your contributions are well received by maintainers, it can also become a great mentorship opportunity.  However, many people struggle with how to contribute to open-source projects.  One mistake that people often make is that they jump in and try to add new features (that they perhaps need) to a project.  Nadia Eghbal, the author of the recent book [Working in Public: The Making and Maintenance of Open Source Software](https://www.amazon.com/Working-Public-Making-Maintenance-Software/dp/0578675862), conducted an in-depth survey of open source.  One of her observations is relevant here:

> The distribution -- where one or few developers do most of the work followed by a long tail of casual contributors and many more passive users -- is now the norm, not the exception, in open source… One study found that in more than 85% of open source projects, less than 5% of the developers were responsible for over 95% of the code and social interactions.

Given this context, it is reasonable for maintainers to view many contributions, especially those that add new features, as more surface area that they alone must maintain.  No matter how well-intentioned you might be, maintainers expect that you will not stick around after your contribution is made, unless proven otherwise.  Therefore, I am a big believer that in order to add new functionality, you must first “make room” for it by reducing the existing maintenance burden on that project.  How can you reduce the maintenance burden on a project?  Here are some ways I have found to be very effective:

- Write tests.

- Write documentation.

- Help automate tasks such as CI workflows if they do not exist. 

- Fix bugs, but write tests and docs while doing so.

The above tasks tend to reduce the maintenance burden on projects, even after you leave.  Writing tests and documentation is an extremely high-value activity, as it forces you to internalize how a piece of software works so that you can explain it to users.  Most importantly, these tasks can help you read software with intention and deeply learn idioms, patterns, and tools that you can take elsewhere.  Finally, being a beginner can give you an advantage in writing clearer documentation that makes fewer assumptions about the knowledge of the reader.


## Shameless plug: fastpages

This blog post was written entirely in a Jupyter Notebook, which I just committed to GitHub and it automatically got converted to a blog post!  Sound interesting?  [Check out fastpages](https://github.com/fastai/fastpages).