# Function arguments in Python

In this lecture, we will explore the five types of function arguments in Python, namely

1. Position-only arguments (Python 3.8 and later; not in Anaconda as of Feb. 2021).
    1. *Skipping for now*
2. Positional or keyword arguments.
3. Varargs (a variable number of extra arguments).
4. Keyword-only arguments.
5. Kwargs (a variable number of extra keyword arguments).

## Note on position-only arguments.

Position-only arguments were introduced in Python 3.8, but Anaconda hasn't been upgraded yet due to some incompatibility with certain packages.  We will be ignoring this type of argument for now.  Expect this lecture to be updated in the future.

In [2]:
import sys

In [3]:
sys.version

'3.7.6 (default, Jan  8 2020, 19:59:22) \n[GCC 7.3.0]'

## Positional or keyword arguments.

The most common type of argument can be passed either by position or by keyword.  Note that Python requires parameters with a default value to follow those without one.

In [4]:
def f(pos_or_kw1, pos_or_kw2, pos_or_kw3 = None):
    return f"a1 = {pos_or_kw1}, a2 = {pos_or_kw2}, a3 = {pos_or_kw3}"
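To see the ordering rule in action, here is a minimal sketch that compiles a deliberately invalid definition from a string, so the error can be caught instead of halting the notebook.  The names `bad_source` and `outcome` are made up for this illustration.

```python
# A parameter without a default may not follow one that has a default.
# The definition is kept in a string so the SyntaxError can be caught.
bad_source = "def bad(a=1, b):\n    return a, b\n"

try:
    compile(bad_source, "<example>", "exec")
    outcome = "compiled"
except SyntaxError:
    outcome = "SyntaxError"

print(outcome)
```

The exact wording of the error message differs between Python versions, so the sketch only reports the exception type.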

#### Non-default via position

In [5]:
f(1, 2)

'a1 = 1, a2 = 2, a3 = None'

#### Non-default via keyword

In [6]:
f(pos_or_kw1=1, pos_or_kw2=2)

'a1 = 1, a2 = 2, a3 = None'

#### Keyword arguments can be in any order

In [7]:
f(pos_or_kw2=1, pos_or_kw1=2)

'a1 = 2, a2 = 1, a3 = None'

#### Args without defaults are required

In [8]:
f(1)

TypeError: f() missing 1 required positional argument: 'pos_or_kw2'

#### Default via position

In [37]:
f(1, 2, 3)

'a1 = 1, a2 = 2, a3 = 3'

#### Default via keyword

In [38]:
f(pos_or_kw1=1, pos_or_kw2=2, pos_or_kw3=3)

'a1 = 1, a2 = 2, a3 = 3'

#### Again, keyword arguments can be in any order

In [39]:
f(pos_or_kw3=1, pos_or_kw2=2, pos_or_kw1=3)

'a1 = 3, a2 = 2, a3 = 1'
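Positional and keyword styles can also be mixed in a single call, provided the positional values come first and no parameter receives two values.  The sketch below redefines `f` so it runs on its own; `mixed` and `raised` are illustrative names.

```python
def f(pos_or_kw1, pos_or_kw2, pos_or_kw3=None):
    return f"a1 = {pos_or_kw1}, a2 = {pos_or_kw2}, a3 = {pos_or_kw3}"

# First value by position, the remaining two by keyword (in any order).
mixed = f(1, pos_or_kw3=3, pos_or_kw2=2)
print(mixed)

# Supplying pos_or_kw1 both by position and by keyword is an error.
try:
    f(1, 2, pos_or_kw1=3)
    raised = False
except TypeError:
    raised = True
print(raised)
```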

## Varargs

We can accept a variable number of additional arguments using the `*args` parameter.  The values of these additional arguments will be stored in a tuple named `args`.

Note that varargs must follow all position or keyword arguments.

In [40]:
def g(p_kw1, p_kw2, p_kw3 = None, *args):
    return f"a1 = {p_kw1}, a2 = {p_kw2}, a3 = {p_kw3}, star_args = {args}"

#### Positions for defaults are preserved.

Note that the third position provides a value to `p_kw3`, and the remaining arguments get passed to `args`.

In [41]:
g(1, 2, 3, 4, 5)

'a1 = 1, a2 = 2, a3 = 3, star_args = (4, 5)'

#### `args` is just another name

While it is conventional to use `args` in `*args`, the choice of name is up to the programmer.

In [42]:
def g2(*my_args):
    return my_args

In [43]:
g2(1, 2, "a", "b")

(1, 2, 'a', 'b')
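Two related behaviors are worth a quick sketch: the tuple is simply empty when no extra arguments are given, and the same `*` symbol can be used at the call site to unpack a sequence into positions.  The function `g` is repeated so the cell is self-contained; `no_extras`, `values`, and `unpacked` are illustrative names.

```python
def g(p_kw1, p_kw2, p_kw3=None, *args):
    return f"a1 = {p_kw1}, a2 = {p_kw2}, a3 = {p_kw3}, star_args = {args}"

# No extra positional arguments: args is the empty tuple.
no_extras = g(1, 2)
print(no_extras)

# Unpacking a list at the call site is the same as writing g(1, 2, 3, 4, 5).
values = [1, 2, 3, 4, 5]
unpacked = g(*values)
print(unpacked)
```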

## Keyword-only parameters

Any parameters defined after `*args` are *keyword-only* and can only be set through keyword assignment.

In [44]:
def h(p_kw1, p_kw2, p_kw3 = None, *args, kw_only = "> Silas"):
    return f"a1 = {p_kw1}, a2 = {p_kw2}, a3 = {p_kw3}, star_args = {args}, Iverson {kw_only}"

#### Using the default value

In [45]:
h(1, 2, 3, 4, 5)

'a1 = 1, a2 = 2, a3 = 3, star_args = (4, 5), Iverson > Silas'

#### Changing the default with keyword-assignment

In [46]:
h(1, 2, 3, 4, 5, kw_only = "!= Malone")

'a1 = 1, a2 = 2, a3 = 3, star_args = (4, 5), Iverson != Malone'

#### Defining keyword-only parameters without varargs.

If you would like to define keyword-only parameters **but no varargs**, insert a bare `*` after the last positional-or-keyword parameter.

In [47]:
def h2(p_kw1, p_kw2, p_kw3 = None, *, kw_only = "> Silas"):
    return f"a1 = {p_kw1}, a2 = {p_kw2}, a3 = {p_kw3}, Iverson {kw_only}"

#### Using both default values

In [48]:
h2(1, 2)

'a1 = 1, a2 = 2, a3 = None, Iverson > Silas'

#### Keeping the default value for `kw_only`

In [49]:
h2(1, 2, 3)

'a1 = 1, a2 = 2, a3 = 3, Iverson > Silas'

#### Can't access `kw_only` via a positional argument.

In [50]:
h2(1, 2, 3, 4)

TypeError: h2() takes from 2 to 3 positional arguments but 4 were given

#### Must use keyword assignment to change `kw_only`

In [31]:
h2(1, 2, 3, kw_only="< Hooks")

'a1 = 1, a2 = 2, a3 = 3, Iverson < Hooks'
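Keyword-only parameters do not need a default value.  As a small sketch, the hypothetical function `scale` below has a required keyword-only parameter `factor`, so every call must name it explicitly.

```python
def scale(x, *, factor):
    # factor has no default, so it is required AND must be passed by keyword.
    return x * factor

scaled = scale(2, factor=3)

# Passing factor by position fails, just like h2(1, 2, 3, 4) above.
try:
    scale(2, 3)
    positional_ok = True
except TypeError:
    positional_ok = False

print(scaled, positional_ok)
```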

## A variable number of additional keyword arguments

Finally, we can use `**kwargs` to gather any number of additional keyword arguments.  These will be stored in a `dict` with the keywords as keys and the argument values as values.  Again, the `kwargs` name is customary, but can be changed by the programmer.

In [32]:
def m(p1, p2, *, kw1 = None, **my_kwargs):
    return my_kwargs

In [33]:
m(1, 2, Iverson="Great", Bergen="likes R")

{'Iverson': 'Great', 'Bergen': 'likes R'}
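Note that only keywords that do not match a named parameter land in the dictionary.  The sketch below is a variation on `m` that also returns `kw1` to make this visible, and it shows the call-site counterpart of `**`, which unpacks a dictionary into keyword arguments.  The names `named`, `extras`, and `options` are illustrative.

```python
def m(p1, p2, *, kw1=None, **my_kwargs):
    return kw1, my_kwargs

# kw1 matches a named parameter, so it is NOT collected in my_kwargs.
named, extras = m(1, 2, kw1="named", Iverson="Great")
print(named, extras)

# A dict can be unpacked into keyword arguments with ** at the call site.
options = {"kw1": "from dict", "Bergen": "likes R"}
named2, extras2 = m(1, 2, **options)
print(named2, extras2)
```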

# Impact on `pipeable` functions

When creating `pipeable` functions, we need to make a couple of adjustments.

1. We **CANNOT** mix `*args` and curried functions, as the function will never complete.
2. Consider switching all parameters with default values to keyword-only parameters.
3. Use `**kwargs` as a way to allow for a variable number of inputs.

In [56]:
!pip install composable



In [57]:
import pandas as pd
from composable import pipeable

In [58]:
help(pd.Series.str.slice)

Help on function slice in module pandas.core.strings.accessor:

slice(self, start=None, stop=None, step=None)
    Slice substrings from each element in the Series or Index.
    
    Parameters
    ----------
    start : int, optional
        Start position for slice operation.
    stop : int, optional
        Stop position for slice operation.
    step : int, optional
        Step size for slice operation.
    
    Returns
    -------
    Series or Index of object
        Series or Index from sliced substring from original string object.
    
    See Also
    --------
    Series.str.slice_replace : Replace a slice with a string.
    Series.str.get : Return element at position.
        Equivalent to `Series.str.slice(start=i, stop=i+1)` with `i`
        being the position.
    
    Examples
    --------
    >>> s = pd.Series(["koala", "fox", "chameleon"])
    >>> s
    0        koala
    1          fox
    2    chameleon
    dtype: object
    
    >>> s.str.slice(start=1)
    0        o

In [59]:
@pipeable
def slice(col, start=None, stop=None, step=None):
    return col.str.slice( start = start, stop = stop, step = step)

In [60]:
from composable.strict import map
c = pd.Series(map(str, range(1000,1011)))
c

0     1000
1     1001
2     1002
3     1003
4     1004
5     1005
6     1006
7     1007
8     1008
9     1009
10    1010
dtype: object

In [61]:
c.str.slice(1, 3)

0     00
1     00
2     00
3     00
4     00
5     00
6     00
7     00
8     00
9     00
10    01
dtype: object

In [63]:
c >> slice(1, 3) # first argument passed to `col` by position.

AttributeError: 'int' object has no attribute 'str'

In [64]:
@pipeable
def slice(col, *, start=None, stop=None, step=None):
    return col.str.slice(start, stop, step)

In [65]:
c >> slice(start = 1, stop = 3) # second and third digits

0     00
1     00
2     00
3     00
4     00
5     00
6     00
7     00
8     00
9     00
10    01
dtype: object
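One way to build intuition for why the keyword-only version works is a toy curry helper.  The sketch below is **not** how `composable` is implemented; `simple_curry`, `take`, and `take_kw` are hypothetical names, and plain strings stand in for a pandas column.  The helper calls the function as soon as every parameter without a default is bound, which is the behavior the error above suggests.

```python
import inspect

def simple_curry(func):
    """Toy curry: call func once all parameters without defaults are bound."""
    sig = inspect.signature(func)

    def collect(*args, **kwargs):
        try:
            sig.bind(*args, **kwargs)
        except TypeError:
            # A required argument is still missing: wait for more input.
            def wait(*more, **more_kw):
                return collect(*args, *more, **{**kwargs, **more_kw})
            return wait
        return func(*args, **kwargs)

    return collect

@simple_curry
def take(col, start=None, stop=None):
    return col[start:stop]

@simple_curry
def take_kw(col, *, start=None, stop=None):
    return col[start:stop]

# take(1, 3) binds col=1 and start=3, then runs immediately and fails.
try:
    take(1, 3)
    failed = False
except TypeError:
    failed = True

# With keyword-only options, col is still missing, so the call waits for data.
waiting = take_kw(start=1, stop=3)
result = waiting("abcdef")
print(failed, result)
```

Because the options can no longer be given by position, the only positional slot left is `col`, which is exactly the slot the piped data should fill.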

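In [None]:
@pipeable
def replace(col, pat, repl, *, n = -1, case = None, flags = 0, regex = None):
    # Options with defaults are keyword-only; forward them to the pandas string accessor.
    return col.str.replace(pat, repl, n = n, case = case, flags = flags, regex = regex)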

In [66]:
help(c.str.replace)

Help on method replace in module pandas.core.strings.accessor:

replace(pat, repl, n=-1, case=None, flags=0, regex=None) method of pandas.core.strings.accessor.StringMethods instance
    Replace each occurrence of pattern/regex in the Series/Index.
    
    Equivalent to :meth:`str.replace` or :func:`re.sub`, depending on
    the regex value.
    
    Parameters
    ----------
    pat : str or compiled regex
        String can be a character sequence or regular expression.
    repl : str or callable
        Replacement string or a callable. The callable is passed the regex
        match object and must return a replacement string to be used.
        See :func:`re.sub`.
    n : int, default -1 (all)
        Number of replacements to make from start.
    case : bool, default None
        Determines if replace is case sensitive:
    
        - If True, case sensitive (the default if `pat` is a string)
        - Set to False for case insensitive
        - Cannot be set if `pat` is a compil

In [69]:
[m for m in dir(pd.DatetimeIndex) if not m.startswith('_')]

['T',
 'all',
 'any',
 'append',
 'argmax',
 'argmin',
 'argsort',
 'array',
 'asi8',
 'asof',
 'asof_locs',
 'astype',
 'ceil',
 'copy',
 'date',
 'day',
 'day_name',
 'day_of_week',
 'day_of_year',
 'dayofweek',
 'dayofyear',
 'days_in_month',
 'daysinmonth',
 'delete',
 'difference',
 'drop',
 'drop_duplicates',
 'droplevel',
 'dropna',
 'dtype',
 'duplicated',
 'empty',
 'equals',
 'factorize',
 'fillna',
 'floor',
 'format',
 'freq',
 'freqstr',
 'get_indexer',
 'get_indexer_for',
 'get_indexer_non_unique',
 'get_level_values',
 'get_loc',
 'get_slice_bound',
 'get_value',
 'groupby',
 'has_duplicates',
 'hasnans',
 'holds_integer',
 'hour',
 'identical',
 'indexer_at_time',
 'indexer_between_time',
 'inferred_freq',
 'inferred_type',
 'insert',
 'intersection',
 'is_',
 'is_all_dates',
 'is_boolean',
 'is_categorical',
 'is_floating',
 'is_integer',
 'is_interval',
 'is_leap_year',
 'is_mixed',
 'is_monotonic',
 'is_monotonic_decreasing',
 'is_monotonic_increasing',
 'is_month_en

In [70]:
help(pd.DatetimeIndex.is_leap_year)

Help on property:

    Boolean indicator if the date belongs to a leap year.
    
    A leap year is a year, which has 366 days (instead of 365) including
    29th of February as an intercalary day.
    Leap years are years which are multiples of four with the exception
    of years divisible by 100 but not by 400.
    
    Returns
    -------
    Series or ndarray
         Booleans indicating if dates belong to a leap year.
    
    Examples
    --------
    This method is available on Series with datetime values under
    the ``.dt`` accessor, and directly on DatetimeIndex.
    
    >>> idx = pd.date_range("2012-01-01", "2015-01-01", freq="Y")
    >>> idx
    DatetimeIndex(['2012-12-31', '2013-12-31', '2014-12-31'],
                  dtype='datetime64[ns]', freq='A-DEC')
    >>> idx.is_leap_year
    array([ True, False, False])
    
    >>> dates_series = pd.Series(idx)
    >>> dates_series
    0   2012-12-31
    1   2013-12-31
    2   2014-12-31
    dtype: datetime64[ns]
    >>> dat

In [71]:
help(pd.DatetimeIndex.is_interval)

Help on function is_interval in module pandas.core.indexes.base:

is_interval(self) -> bool
    Check if the Index holds Interval objects.
    
    Returns
    -------
    bool
        Whether or not the Index holds Interval objects.
    
    See Also
    --------
    IntervalIndex : Index for Interval objects.
    is_boolean : Check if the Index only consists of booleans.
    is_integer : Check if the Index only consists of integers.
    is_floating : Check if the Index is a floating type.
    is_numeric : Check if the Index only consists of numeric data.
    is_object : Check if the Index is of the object dtype.
    is_categorical : Check if the Index holds categorical data.
    is_mixed : Check if the Index holds data with mixed data types.
    
    Examples
    --------
    >>> idx = pd.Index([pd.Interval(left=0, right=5),
    ...                 pd.Interval(left=5, right=10)])
    >>> idx.is_interval()
    True
    
    >>> idx = pd.Index([1, 3, 5, 7])
    >>> idx.is_interval()
  

In [72]:
help(pd.DatetimeIndex.intersection)

Help on function intersection in module pandas.core.indexes.datetimelike:

intersection(self, other, sort=False)
    Specialized intersection for DatetimeIndex/TimedeltaIndex.
    
    May be much faster than Index.intersection
    
    Parameters
    ----------
    other : Same type as self or array-like
    sort : False or None, default False
        Sort the resulting index if possible.
    
        .. versionadded:: 0.24.0
    
        .. versionchanged:: 0.24.1
    
           Changed the default to ``False`` to match the behaviour
           from before 0.24.0.
    
        .. versionchanged:: 0.25.0
    
           The `sort` keyword is added
    
    Returns
    -------
    y : Index or same type as self



In [78]:
help(pd.DatetimeIndex.insert)



Help on function insert in module pandas.core.indexes.datetimelike:

insert(self, loc, item)
    Make new Index inserting new item at location. Follows
    Python list.append semantics for negative values.
    
    Parameters
    ----------
    loc : int
    item : object
    
    Returns
    -------
    new_index : Index
    
    Raises
    ------
    ValueError if the item is not valid for this dtype.



In [74]:
help(pd.DatetimeIndex.is_mixed)

Help on function is_mixed in module pandas.core.indexes.base:

is_mixed(self) -> bool
    Check if the Index holds data with mixed data types.
    
    Returns
    -------
    bool
        Whether or not the Index holds data with mixed data types.
    
    See Also
    --------
    is_boolean : Check if the Index only consists of booleans.
    is_integer : Check if the Index only consists of integers.
    is_floating : Check if the Index is a floating type.
    is_numeric : Check if the Index only consists of numeric data.
    is_object : Check if the Index is of the object dtype.
    is_categorical : Check if the Index holds categorical data.
    is_interval : Check if the Index holds Interval objects.
    
    Examples
    --------
    >>> idx = pd.Index(['a', np.nan, 'b'])
    >>> idx.is_mixed()
    True
    
    >>> idx = pd.Index([1.0, 2.0, 3.0, 5.0])
    >>> idx.is_mixed()
    False

