# Break your API gently - or not at all

PyCon.DE 2019, Berlin

### Tim Hoffmann

@timhoffm

https://github.com/timhoffm/pyconde2019-api

<table style="border: 0;">
<tr style="background: #ffffff;"><td><img src="images/zeiss_logo.png" height=60 width=60 /></td><td>Carl Zeiss Semiconductor Manufacturing Technologies GmbH</td></tr>
<tr style="background: #ffffff;"><td><img src="images/matplotlib_logo.png" height=60 width=60 /></td><td>Matplotlib Core Developer</td></tr>
</table>


Code that is used by others defines an interface.

Changing that interface breaks other people's code.

# Preventive Measures

## Reduce the API footprint

- Mark functions, methods, attributes as private:
  ~~~
  _internal
  ~~~

- Use keyword-only arguments:
  ~~~
  def is_close_to_int(x, *, atol=1e-10):
  ~~~

- Don't write code that is not needed (YAGNI)

## Do it right from the beginning

***You know how you should write code.***

- Clean code
  
- Design patterns

- ...

- Naming!

### Choose meaningful and precise names

In [None]:
def use(arg):
    ...

In [None]:
def use(backend):
    ...

In [None]:
def set_backend(name):
    ...

### Each parameter should describe exactly one logical concept


In [None]:
plot_image(data, cmin=-2, cmax=2)

In [None]:
plot_image(data, clim=(-2, 2))

In [None]:
plot_image(data, clim='symmetric')
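One way a single `clim` parameter could cover all three call styles is sketched below. `resolve_clim` is a hypothetical helper that works on a flat list of numbers; it is not taken from any plotting library.

```python
def resolve_clim(data, clim=None):
    """Return (cmin, cmax) for a flat sequence of numbers.

    clim may be None (use the data range), a (cmin, cmax) pair, or the
    string 'symmetric' (range centred on zero).
    """
    if clim is None:
        return min(data), max(data)
    if isinstance(clim, str):
        if clim != "symmetric":
            raise ValueError(f"Unknown clim mode: {clim!r}")
        bound = max(abs(v) for v in data)
        return -bound, bound
    cmin, cmax = clim
    return cmin, cmax


data = [-1.0, 0.5, 3.0]
print(resolve_clim(data))
print(resolve_clim(data, clim=(-2, 2)))
print(resolve_clim(data, clim="symmetric"))
```

The colour limits stay one logical concept: new modes can be added later as further string values without touching the signature.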

### Each parameter should describe exactly one logical concept

![](images/legend_textcolor.png)

  ~~~
  legend(match_textcolor=True)
  
  legend(textcolor='match')
  
  legend(textcolor='artist')
  
  legend(textcolor='linecolor')
  ~~~

  https://github.com/matplotlib/matplotlib/issues/10720

## Types of API changes

![](images/break_types.png)

## Non-breaking changes: Extensions

- Add classes, methods, functions, attributes

- Append parameters with default values
  ~~~
  def func(a) --> def func(a, b=None)
  ~~~  
  
- Insert into keyword-only parameters  

- Reorder keyword-only parameters
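A quick way to convince yourself that an extension is non-breaking is to replay the old call patterns against the new signature. A minimal sketch, with `func_v1` and `func_v2` as made-up stand-ins for two releases of the same function:

```python
def func_v1(a):
    return a * 2


def func_v2(a, b=None):
    """Same as func_v1, plus an optional offset appended at the end."""
    result = a * 2
    if b is not None:
        result += b
    return result


# Every call that was valid for the old signature is still valid
# and returns the same result.
for call_args in [(1,), (2.5,)]:
    assert func_v1(*call_args) == func_v2(*call_args)
assert func_v1(a=1) == func_v2(a=1)

# The new parameter is purely opt-in.
print(func_v2(1, b=10))
```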

In [None]:
import warnings
import pandas
val = 1

### Keyword-only parameters

***Use whenever feasible***

- More readable code
  
- More freedom to change the API without breaking it

In [None]:
list.sort(cmp)

In [None]:
list.sort(cmp=None, *, key=None)

In [None]:
list.sort(*, key=None)

In [None]:
def is_close_to_int(x, atol=1e-10):
    pass

is_close_to_int(val, 1e-8)


# Positional callers exist, so rtol can only be appended at the end.
def is_close_to_int(x, atol=1e-10, rtol=1e-7):
    pass

In [None]:
def is_close_to_int(x, *, atol=1e-10):
    pass
    
is_close_to_int(val, atol=1e-8)


def is_close_to_int(x, *, rtol=1e-7, atol=1e-10):
    pass
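The sketch below shows what goes wrong if a parameter is inserted in front of an existing positional one. The tolerance formula `atol + rtol * |round(x)|` is an assumption made for illustration; the talk leaves the function bodies empty.

```python
def old(x, atol=1e-10):
    return abs(x - round(x)) <= atol


def new_positional(x, rtol=1e-7, atol=1e-10):
    """rtol was inserted before atol: positional callers now set rtol."""
    return abs(x - round(x)) <= atol + rtol * abs(round(x))


def new_keyword_only(x, *, rtol=1e-7, atol=1e-10):
    """Keyword-only: the order of rtol and atol is irrelevant to callers."""
    return abs(x - round(x)) <= atol + rtol * abs(round(x))


val = 0.05
print(old(val, 0.1))                    # second argument is atol
print(new_positional(val, 0.1))         # second argument is now rtol
print(new_keyword_only(val, atol=0.1))  # unaffected by the insertion
```

The positional call does not fail; it silently changes meaning, which is harder to notice than an exception.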

## Breaking API changes

### Deprecation

- Document in the release notes   

- Warn when deprecated API is used:

  ~~~
  warnings.warn('X is deprecated. Use Y instead.',
                DeprecationWarning)
  ~~~

Warning types:

| Type                        | What for | Target audience                          |
|:--------------------------- |:-------- |:---------------------------------------- |
| `DeprecationWarning`        | deprecated features | developers using your library            |
| `FutureWarning`             | deprecated features | end users                                |
| `PendingDeprecationWarning` | warnings about features that will be deprecated in the future | developers using your library particularly interested in forward-compatibility |


Further details:

- [Warning categories](https://docs.python.org/3/library/warnings.html#warning-categories)

- [PEP-0565](https://www.python.org/dev/peps/pep-0565/)
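Two details matter in practice: pass `stacklevel=2` so the warning points at the caller's line, and remember that the default filters only show a `DeprecationWarning` when it is triggered directly by code in `__main__` (PEP 565). A minimal sketch with a made-up `old_api()`:

```python
import warnings


def old_api():
    # stacklevel=2 attributes the warning to the caller of old_api(),
    # which is the line the user has to change.
    warnings.warn("old_api() is deprecated. Use new_api() instead.",
                  DeprecationWarning, stacklevel=2)


# Record warnings explicitly, e.g. in a test of the deprecation itself.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # do not let default filters hide it
    old_api()

for w in caught:
    print(w.category.__name__, "-", w.message)
```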

## Renaming a function

In [None]:
def func(arg):
    ...

In [None]:
def new_func(arg):
    ...
    
def func(arg):
    warnings.warn('func() is deprecated. Use new_func() instead.',
                  DeprecationWarning)
    return new_func(arg)

## Renaming a class

In [None]:
class A:
    ...

In [None]:
class B:
    ...

class A(B):
    def __init__(self, arg):
        warnings.warn("Class 'A' is deprecated. Use class 'B' instead.",
                      DeprecationWarning)
        super().__init__(arg)

## Renaming an attribute

In [None]:
class Circle:
    def __init__(self, size):
        self.size = size

In [None]:
class Circle:
    def __init__(self, size):
        self.radius = size
        
    @property
    def size(self):
        warnings.warn(
            "The attribute 'size' is deprecated. Use 'radius' instead.",
            DeprecationWarning)
        return self.radius
    
    @size.setter
    def size(self, val):
        warnings.warn(
            "The attribute 'size' is deprecated. Use 'radius' instead.",
            DeprecationWarning)
        self.radius = val
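When several attributes are renamed, the getter/setter pair can be generated. `deprecated_alias` below is a hypothetical helper written for this sketch, not an existing library function.

```python
import warnings


def deprecated_alias(old_name, new_name):
    """Return a property that forwards old_name to new_name with a warning."""
    message = (f"The attribute {old_name!r} is deprecated. "
               f"Use {new_name!r} instead.")

    def getter(self):
        warnings.warn(message, DeprecationWarning, stacklevel=2)
        return getattr(self, new_name)

    def setter(self, value):
        warnings.warn(message, DeprecationWarning, stacklevel=2)
        setattr(self, new_name, value)

    return property(getter, setter)


class Circle:
    size = deprecated_alias("size", "radius")

    def __init__(self, size):
        self.radius = size


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    c = Circle(2.0)
    c.size = 3.0             # old name still writes through
    print(c.radius, c.size)  # reading the old name warns as well

print(len(caught), "warnings recorded")
```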

## Renaming a global variable

Python >= 3.7: [PEP 562 -- Module `__getattr__` and `__dir__`](https://www.python.org/dev/peps/pep-0562/)

In [None]:
# lib.py

import warnings

_renamed_variables = {"var": "new_var"}

new_var = 'value'

def __getattr__(name):
    new_name = _renamed_variables.get(name)
    if new_name is not None:
        warnings.warn(f"{name} is deprecated. Use {new_name} instead.",
                      DeprecationWarning)
        return globals()[new_name]
    raise AttributeError(f"module {__name__} has no attribute {name}")

In [None]:
# main.py

import lib
lib.var  # Works, but emits the warning

Python < 3.7: https://pypi.org/project/mprop/ (a lot more magic)

## Renaming a parameter

***Watch out for all supported call patterns***

- positional parameters can also be passed by keyword
- parameters with a default can also be passed positionally

In [None]:
def func(a, b_old, c=None):
    print(b_old)

In [None]:
def func(a, b, c=None):
    print(b)

In [None]:
func(1, 2)
func(1, 2, c=3)


func(1, b_old=2)
func(1, 2, 3)
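A plain rename breaks exactly one of these call patterns: passing the old name by keyword. A small check (the return value is added here only to make the result observable):

```python
def func(a, b, c=None):  # b was called b_old in the previous release
    return b


print(func(1, 2))       # positional calls are unaffected by the rename
print(func(1, 2, 3))

try:
    func(1, b_old=2)    # keyword call with the old name
except TypeError as exc:
    print("TypeError:", exc)
```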

## Renaming a parameter

In [None]:
_Undef = object()

def func(a, b=_Undef, c=None, *, b_old=_Undef):
    if b_old is _Undef and b is _Undef:
        raise TypeError("func() missing required positional argument: 'b'")
    elif b_old is not _Undef:
        if b is not _Undef:
            raise TypeError(
                "Parameter b replaces b_old in func(). Please remove b_old.")
        else:
            warnings.warn(
                'Parameter b_old is deprecated. Please use b instead.',
                DeprecationWarning)
            b = b_old
    print(b)

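func(1, 2)           # new-style positional call
func(1, b_old=2)     # works, but emits a DeprecationWarning
func(1, b=2)         # new keyword
func(1, 2, b_old=2)  # raises TypeError: both b and b_old given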

## Change of return values

- **Control parameter**  
  Example: [numpy.polyfit](https://docs.scipy.org/doc/numpy/reference/generated/numpy.polyfit.html?highlight=fit#numpy.polyfit)
  
      numpy.polyfit(..., full=False, ...)
  
          full : bool, optional
              Switch determining nature of return value. When it is False
              (the default) just the coefficients are returned, when True
              diagnostic information from the singular value decomposition
              is also returned.


- **Complex return type**, if the function already returns one; new fields can then be added:  
  `dict`, `dataclass`, (`namedtuple`)
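Both options can be sketched in a few lines. `mean` and `FitResult` are made-up examples; only the `full=` idea is borrowed from `numpy.polyfit`.

```python
from dataclasses import dataclass


def mean(values, *, full=False):
    """Return the mean; with full=True also return the sample count."""
    result = sum(values) / len(values)
    if full:
        return result, len(values)
    return result


@dataclass
class FitResult:
    slope: float
    intercept: float
    # Added in a later release; the default keeps old constructor calls valid.
    residual: float = 0.0


print(mean([1, 2, 3]))
print(mean([1, 2, 3], full=True))

r = FitResult(slope=2.0, intercept=1.0)
print(r.slope, r.residual)
```

Callers of `mean()` that never pass `full` keep getting a plain number, and code that reads `r.slope` is unaffected by the extra field.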

## Change of behavior

No simple transition strategy. Often better to create a new function with a different name.

~~~
os.system()

subprocess.call(), subprocess.check_call(), subprocess.check_output()

subprocess.run()
~~~

Good transition documentation: [Replacing Older Functions with the subprocess Module](https://docs.python.org/3/library/subprocess.html#replacing-older-functions-with-the-subprocess-module)

# Fancy stuff

## Decorators for signature changes

Implementation details: `matplotlib.cbook.deprecation`

---

**Deprecate a function or method**

In [None]:
@deprecated("3.2", alternative="os.path.expanduser('~')")
def get_home():
    ...

**Rename a parameter**

In [None]:
@_rename_parameter("3.1", "arg", "backend")
def use(backend, warn=False, force=True):
    ...

**Delete a parameter**    

In [None]:
@_delete_parameter("3.2", "dryrun")
def print_jpg(self, filename_or_obj, *args, dryrun=False, ...):

**Change a parameter to keyword-only**

In [None]:
@_make_keyword_only("3.2", "minor")
def set_xticks(self, ticks, minor=False):
    ...
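To give an idea of what such a decorator does, here is a heavily simplified sketch of a parameter rename. `rename_parameter` is written for this example and is not Matplotlib's implementation: it has no version argument and only handles the keyword case.

```python
import functools
import warnings


def rename_parameter(old, new):
    """Decorator: accept keyword `old` as a deprecated alias for `new`."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if old in kwargs:
                if new in kwargs:
                    raise TypeError(
                        f"{func.__name__}() got both {old!r} and {new!r}")
                warnings.warn(
                    f"Parameter {old!r} of {func.__name__}() is deprecated. "
                    f"Use {new!r} instead.",
                    DeprecationWarning, stacklevel=2)
                kwargs[new] = kwargs.pop(old)
            return func(*args, **kwargs)
        return wrapper
    return decorator


@rename_parameter("arg", "backend")
def use(backend, warn=False, force=True):
    return backend


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    print(use("agg"))      # new-style call, no warning
    print(use(arg="agg"))  # old keyword still works, with a warning

print([str(w.message) for w in caught])
```

The function body only ever sees the new name, so the compatibility code can later be removed by deleting one line.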

## Advanced: Migrating a complex method to a set of 'namespaced' methods

Example: pandas [DataFrame.plot()](https://pandas.pydata.org/pandas-docs/version/0.23.4/generated/pandas.DataFrame.plot.html)

- One single function with too much functionality
  ~~~
  df.plot(kind='line', ...)
  df.plot(kind='bar', ...)
  ...
  ~~~

- Separate methods: too verbose
  ~~~
  df.plot_line()
  df.plot_bar()
  ...
  ~~~

- Solution: Namespace for grouping
  ~~~
  df.plot.line()
  df.plot.bar()
  ...
  ~~~

### Can we have both simultaneously?

~~~
df.plot(kind='line')
df.plot.line()
~~~

Yes: `df.plot` needs to be both a namespace and a callable:

In [None]:
class PlotAccessor:
    
    def __call__(self, *args, kind='line', **kwargs):
        if kind == 'line':
            return self.line(*args, **kwargs)
        
    def line(self, x=None, y=None, **kwargs):
        ...
        

class DataFrame:
    
    plot = PlotAccessor()
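In a real class the accessor also needs to know which object it belongs to. One possible sketch binds it through a property; the `DataFrame` here is a toy stand-in that only carries a name, and the methods return tuples instead of plotting. This is not pandas' implementation.

```python
class PlotAccessor:
    """Callable namespace bound to one DataFrame-like object."""

    def __init__(self, frame):
        self._frame = frame

    def __call__(self, *args, kind="line", **kwargs):
        # Dispatch the legacy `kind=` call style to the namespaced methods.
        kinds = {"line": self.line, "bar": self.bar}
        if kind not in kinds:
            raise ValueError(f"Unknown plot kind: {kind!r}")
        return kinds[kind](*args, **kwargs)

    def line(self, x=None, y=None, **kwargs):
        return ("line", self._frame.name, x, y)

    def bar(self, x=None, y=None, **kwargs):
        return ("bar", self._frame.name, x, y)


class DataFrame:
    def __init__(self, name):
        self.name = name

    @property
    def plot(self):
        # A fresh accessor per access keeps the sketch simple.
        return PlotAccessor(self)


df = DataFrame("sales")
print(df.plot(kind="bar", x="month"))  # old call style
print(df.plot.bar(x="month"))          # new namespaced style
```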

## Expert: Change internal behavior using placeholder objects

In [None]:
class Axis:
    def __init__(self):
        self.ticks = []
        self.reset_ticks()
        
    def reset_ticks(self):
        del self.ticks[:]
        self.ticks.extend([self._get_tick()])

- `Axis.ticks` is a plain list.
- It's populated with a default during `reset_ticks()`.
- `reset_ticks()` is called many times when creating a figure with many subplots (*slow!*).
- `ticks` and `reset_ticks` are public API.

How to speed things up without breaking the public API?
And without rewriting large parts of the figure creation logic?

**Solution:** Use a cheap placeholder object that is lazily replaced by the list only when it is accessed.

Perform an action when `axis.ticks` is queried.

In [None]:
class _LazyTickList:
    """A descriptor for lazy instantiation of tick lists."""

    def __get__(self, instance, cls):
        if instance is None:
            return self
        else:
            # Store the list on the instance; it then shadows the descriptor.
            instance.ticks = [instance._get_tick()]
            return instance.ticks


class Axis:
    # A descriptor only takes effect as a class attribute.
    ticks = _LazyTickList()

    def __init__(self):
        self.reset_ticks()

    def reset_ticks(self):
        # Remove the cached list so that the next access rebuilds it lazily.
        try:
            del self.ticks
        except AttributeError:
            pass

GitHub PR: [Alternate implementation of lazy ticks](https://github.com/matplotlib/matplotlib/pull/10302)

# Summary

- Prevent later changes
  
  - Limit the API footprint
  
  - Write good APIs from the beginning

- Break the API gently

  - Warn on future API changes
  
  - Provide a smooth transition path
