# Some tricks in Python

## Function arguments

Consider the signature of [scikit-learn's `cross_validate`](https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.cross_validate.html):

In [1]:
from sklearn.model_selection import cross_validate
from inspect import signature

# Use a separate name so the imported `signature` function is not shadowed
sig = signature(cross_validate)
print(sig)

(estimator, X, y=None, *, groups=None, scoring=None, cv=None, n_jobs=None, verbose=0, fit_params=None, params=None, pre_dispatch='2*n_jobs', return_train_score=False, return_estimator=False, return_indices=False, error_score=nan)


Notice the bare `*` in the signature. In a Python parameter list, a bare `*` separates the parameters that may be passed positionally (before the `*`) from keyword-only parameters (after the `*`). Positional-only parameters, when a function has them, are marked separately with a `/`.
- Parameters before the `*` (like `estimator`, `X`, `y`) can be passed either positionally or as keyword arguments.
- Parameters after the `*` (like `groups`, `scoring`, `cv`, etc.) must be passed as keyword arguments. Passing them positionally raises a `TypeError`.
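
The same behaviour can be reproduced with a small function of your own. The sketch below is only an illustration: `evaluate` and its parameters are hypothetical names, not part of scikit-learn.

```python
from inspect import signature


def evaluate(model, data, *, scoring=None, cv=5):
    """Hypothetical function: everything after the bare * is keyword-only."""
    return {"model": model, "data": data, "scoring": scoring, "cv": cv}


# Keyword-only parameters must be named explicitly.
result = evaluate("clf", [1, 2, 3], scoring="accuracy")
print(result["scoring"])

# Passing a third positional argument is rejected.
try:
    evaluate("clf", [1, 2, 3], "accuracy")
except TypeError as exc:
    print("TypeError:", exc)

# inspect reports the kind of each parameter.
kinds = {name: p.kind.name for name, p in signature(evaluate).parameters.items()}
print(kinds)
```

Library authors often use this to keep long argument lists readable and to stay free to reorder optional parameters later without breaking callers.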


## Public vs. Private Functions in Python

In Python, there’s no strict enforcement of "public" or "private" access like in some other languages (e.g., Java or C++). Instead, Python relies on **naming conventions** to indicate the intended visibility or usage of variables, functions, or methods:

1. **`func_public`** (no underscore):
   - This is considered a **public** function by convention.
   - It’s intended to be part of the public API of a module or class, meaning it’s safe and expected for external code to call it.

2. **`_func_private`** (single leading underscore):
   - This is considered a **private** or **protected** function by convention.
   - The single underscore signals to other developers that this function is intended for **internal use** within the module or class and should not be accessed directly from outside. However, Python does **not enforce this**—it’s still fully callable.

Key Points
- **Both are callable**: Both `func_public()` and `_func_private()` can be called from anywhere. The single underscore is mainly a hint to programmers; its only built-in effect is that `from module import *` skips underscore-prefixed names when the module does not define `__all__`.
- **Convention, not enforcement**: The underscore is part of Python’s “we’re all consenting adults” philosophy. It relies on developers respecting the convention rather than the language enforcing access control.

Example:

In [2]:
def func_public():
    print("This is public")

def _func_private():
    print("This is private")

# Both work fine
func_public()
_func_private()

This is public
This is private
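
One place where the leading underscore does have a practical effect is `from module import *`, which skips underscore-prefixed names when the module defines no `__all__`. The sketch below builds a throwaway in-memory module to show this; the module name `demo_module` is made up for the example.

```python
import sys
import types

# Source of a tiny module with one public and one "private" function.
source = '''
def func_public():
    return "This is public"

def _func_private():
    return "This is private"
'''

# Create the module in memory and register it so it can be imported.
demo_module = types.ModuleType("demo_module")
exec(source, demo_module.__dict__)
sys.modules["demo_module"] = demo_module

# Run a star-import inside a fresh namespace and inspect what arrived.
namespace = {}
exec("from demo_module import *", namespace)

print("func_public" in namespace)    # True
print("_func_private" in namespace)  # False

# Explicit access is still possible: the underscore is not a lock.
print(demo_module._func_private())
```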


## What is an Accessor in Python?

In Python, an **accessor** typically refers to a way to **access** (retrieve) an object's attribute or data, often implemented using a **property** or a method. More broadly, it’s a mechanism to get (and sometimes set) the value of an object’s internal state in a controlled way. The term is most commonly associated with the `@property` decorator, which allows you to define a method that behaves like an attribute.

Example of an Accessor with `@property`

In [3]:
class Person:
    def __init__(self, name):
        self._name = name

    @property
    def name(self):
        return self._name

person = Person("Alice")
print(person.name)

Alice
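
To illustrate the "sometimes set" part, a property can also define a setter that validates the new value. This is a minimal sketch; the class name `ValidatedPerson` and the validation rule are assumptions chosen for the example.

```python
class ValidatedPerson:
    def __init__(self, name):
        # Assigning to self.name goes through the setter below.
        self.name = name

    @property
    def name(self):
        return self._name

    @name.setter
    def name(self, value):
        # Reject anything that is not a non-empty string.
        if not isinstance(value, str) or not value.strip():
            raise ValueError("name must be a non-empty string")
        self._name = value.strip()


person = ValidatedPerson("Alice")
person.name = "  Bob "
print(person.name)  # Bob

try:
    person.name = ""
except ValueError as exc:
    print(exc)  # name must be a non-empty string
```

Callers keep using plain attribute syntax (`person.name = ...`), while the class keeps control over what ends up in `_name`.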


Pandas heavily relies on attributes (and properties) as part of its API to provide a convenient, intuitive, and efficient way to access data and metadata in its core objects like `Series`, `DataFrame`, and `Index`. While Pandas also uses methods extensively, attributes play a critical role in its design, often implemented as properties (using the `@property` decorator or similar mechanisms internally) to expose object state in a clean, attribute-like way.

Examples include:
1. `DataFrame.shape`, which returns a tuple of (rows, columns) accessed as `df.shape`, not `df.shape()`.
2. `Series.values`, which returns the data as a NumPy array (or an array-like object, depending on the dtype) via `s.values`.
3. `DataFrame.columns`, which gives the column labels as an `Index` object via `df.columns`.
4. `DataFrame.index`, which provides the row index as `df.index`.
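
A quick way to see these attributes in action, assuming pandas is installed (the small `df` below is made-up data):

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [4.0, 5.0, 6.0]})

print(df.shape)              # (3, 2)
print(list(df.columns))      # ['a', 'b']
print(list(df.index))        # [0, 1, 2]
print(type(df["a"].values))  # a NumPy ndarray for this numeric column

# shape is an attribute, not a method: calling it fails,
# because the returned tuple is not callable.
try:
    df.shape()
except TypeError as exc:
    print("TypeError:", exc)
```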

Pandas uses attributes for several reasons:
1. They offer intuitive and concise syntax—`df.shape` or `df.columns` feels natural, resembling direct access to a table’s properties, unlike a method like `df.get_shape()`.
2. They signal cheap access: metadata like `shape` or `index` is already stored or quickly derived, so reading it involves no expensive computation.
3. Properties enable encapsulation and flexibility: `df.columns` looks like a plain attribute, but assigning to it runs validation (for example, the new labels must match the number of columns), and such logic can change without altering the API.
4. This aligns with NumPy’s attribute-based design (e.g., `array.shape`), making Pandas familiar to NumPy users.
5. Attributes separate descriptive access to metadata (e.g., `shape`, `dtypes`) from methods that compute or transform (e.g., `df.drop()`, `df.groupby()`), which take arguments and typically return new objects.

Many of these "attributes" are properties: methods under the hood that are exposed as attributes. For example, `df.shape` is derived from the lengths of the index and the columns, yet it is accessed without parentheses. Pandas could have used methods like `df.get_shape()`, but attributes are cleaner, clarify intent, and suit data analysts’ exploratory workflows.
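
The idea can be sketched in plain Python. The `Table` class below is a toy stand-in written for this example, not how pandas is actually implemented.

```python
class Table:
    """Toy stand-in for a DataFrame, for illustration only."""

    def __init__(self, rows, columns):
        self._rows = [list(row) for row in rows]
        self._columns = list(columns)

    @property
    def shape(self):
        # Computed on each access from the stored data,
        # but read like a plain attribute.
        return (len(self._rows), len(self._columns))


table = Table(rows=[[1, 2], [3, 4], [5, 6]], columns=["a", "b"])
print(table.shape)  # (3, 2)

# Without a setter, the property is read-only.
try:
    table.shape = (1, 1)
except AttributeError:
    print("shape is read-only")
```

Because `shape` is a property, the class could later cache the value or change how it is computed, and callers writing `table.shape` would not notice.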

In summary, Pandas relies on attributes (often properties) for intuitive, efficient, and consistent access to metadata and structure, enhancing readability, aligning with NumPy, and separating descriptive access from transformative operations.