# Libraries

In [11]:
import pandas as pd
import numpy as np
import seaborn

# Make dataset

This calls the function `load_dataset()` from the `seaborn` namespace. The function creates the object, which I've assigned to the new variable `iris`.

In [9]:
iris = seaborn.load_dataset("iris")
iris.head()

|   | sepal_length | sepal_width | petal_length | petal_width | species |
|---|---|---|---|---|---|
| 0 | 5.1 | 3.5 | 1.4 | 0.2 | setosa |
| 1 | 4.9 | 3.0 | 1.4 | 0.2 | setosa |
| 2 | 4.7 | 3.2 | 1.3 | 0.2 | setosa |
| 3 | 4.6 | 3.1 | 1.5 | 0.2 | setosa |
| 4 | 5.0 | 3.6 | 1.4 | 0.2 | setosa |


The doco `help("seaborn.load_dataset")` doesn't seem to tell me what class of object is returned by `seaborn.load_dataset()`, so I can use `type()` below to check the class.

Isitha says he thinks `seaborn` calls `pandas`, so `pandas` does the initialisation and sets the class. `load_dataset()` most likely just downloads the csv and gets `pandas` to actually read it.

Based on this line in the doco, this appears to be the case?

`-data/
    kws : dict, optional
        Passed to pandas.read_csv`

In [12]:
type(iris)

pandas.core.frame.DataFrame

A `pandas` dataframe can also be instantiated with the ~method~ function below. `DataFrame()` comes from the namespace `pandas`. Strictly speaking, `DataFrame` is the class itself, and calling it like a function creates an instance, so it also makes an object of class `DataFrame`. This class can be found in library `pandas`, package `core`, module `frame` (a `frame.py` file); hence class `pandas.core.frame.DataFrame`.

The documentation is confusing though, because it says `pandas.DataFrame = class DataFrame(pandas.core.generic.NDFrame)`. This is because `pandas.DataFrame` inherits from `pandas.core.generic.NDFrame`. So all the methods and attributes in `NDFrame` are also present in `DataFrame`. Thanks Isitha!
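To check the inheritance for myself, here is a minimal sketch that lists the classes Python searches when it looks up an attribute on a `DataFrame`. The variable names `mro_names` and `inherits_ndframe` are my own, and the exact list of parent classes may differ between `pandas` versions.

```python
import pandas as pd

# The method resolution order (MRO) is the chain of classes Python
# searches, left to right, when looking up an attribute on a DataFrame.
mro_names = [cls.__name__ for cls in pd.DataFrame.__mro__]
print(mro_names)

# NDFrame appearing in the chain is what "inherits from" means here.
inherits_ndframe = "NDFrame" in mro_names
print(inherits_ndframe)
```

The first entry is `DataFrame` itself and the last is always `object`, the base of every Python class.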

In [20]:
df = pd.DataFrame(
     {
         'col1': [1, 2],
         'col2': [3, 4]
     }
)

print(type(df))
df.head()

<class 'pandas.core.frame.DataFrame'>


|   | col1 | col2 |
|---|---|---|
| 0 | 1 | 3 |
| 1 | 2 | 4 |


To create a numpy array object, I use the ~method~ function `array` from `numpy`, which returns an instance of class `ndarray` (the documentation states this under its Returns section).

In [35]:
arr = np.array(
    [[1, 2],
     [3, 4]]
)
print(type(arr))
arr

<class 'numpy.ndarray'>


array([[1, 2],
       [3, 4]])

# Understanding syntax

## Attributes

To see all the attributes for an object (e.g. a class), I can use `dir()`.

The following also work:

`dir(pd.core.frame.DataFrame)`

`dir(df)`

But this doesn't work (it raises a `NameError`) because I didn't import `pandas` under the name `pandas`; I imported it as `pd`.

`dir(pandas.core.frame.DataFrame)`

In [29]:
# Very long so not run
#dir(pd.DataFrame)
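Since the full `dir()` output is so long, a shorter sketch is to keep only the names that don't start with an underscore. It also shows what happens with the un-aliased name. The variables `public_names` and `alias_error` are just names I picked for illustration.

```python
import pandas as pd

# dir() returns a list of attribute names as strings; dropping the
# underscore-prefixed ones keeps the output readable.
public_names = [name for name in dir(pd.DataFrame) if not name.startswith("_")]
print(len(public_names))
print(public_names[:5])

# Only the alias `pd` was bound by the import, so the bare name
# `pandas` is undefined in this namespace.
try:
    dir(pandas.DataFrame)
    alias_error = None
except NameError as err:
    alias_error = str(err)
print(alias_error)
```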

I can read a given attribute of an object like so. Unlike a method call, there are no parentheses `()`. The `shape` attribute of `df` comes from class `DataFrame`, whereas the `shape` attribute of `arr` comes from class `ndarray`.

In [42]:
print(type(df))
df.shape

<class 'pandas.core.frame.DataFrame'>


(2, 2)

In [41]:
print(type(arr))
arr.shape

<class 'numpy.ndarray'>


(2, 2)

The syntax `df.shape` is possible because the class was instantiated when `df` was created. The syntax below perhaps shows this more clearly: the attribute is read straight off a freshly created object, following `object.attribute`.

In [44]:
# Testing without assigning it to a variable.
pd.DataFrame(
     {
         'col1': [1, 2],
         'col2': [3, 4]
     }
).shape

(2, 2)

According to the docs (https://docs.python.org/3/tutorial/classes.html) attributes follow the syntax `object.attribute`. This means for something like `module.function`, the function is just an attribute of the module object.

A namespace is a mapping from names to objects, e.g. the local names in a function invocation. Quote: "In a sense the set of attributes of an object also form a namespace."
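A small sketch of the "function is just an attribute of the module" idea, using the standard library `math` module (my choice, not something used elsewhere in these notes). The names `via_dot`, `via_getattr` and `namespace` are only for illustration.

```python
import math

# Dot syntax and getattr() both look the name up on the module object.
via_dot = math.sqrt
via_getattr = getattr(math, "sqrt")
print(via_dot is via_getattr)

# vars() exposes the module's namespace as a dict of name -> object.
namespace = vars(math)
print(namespace["sqrt"] is math.sqrt)

# Adding parentheses calls whatever object the attribute refers to.
print(math.sqrt(16))
```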

## Methods and Functions

Unlike the attribute, both methods and functions seem to be followed by parentheses e.g. `package.class.method()` or `package.function()`.

Methods appear to be functions - but they're tied to a class, hence `class.method()`.

Functions aren't tied to a class, so you can just call them `function()`.

### Functions

So from the above, we saw `pd.DataFrame()` being called like a function from the package `pandas` (strictly, it is a class being called). If I imported the name directly, e.g. `from pandas import DataFrame`, I could just use `DataFrame()` without needing to specify the namespace `pd`. But then I run the risk of clashing with method and function names from other packages, so keeping the `pd.` prefix helps with scoping.
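A quick sketch of importing the name directly, to check that both spellings point at the same object. `df_short` is a name I made up for this example.

```python
import pandas as pd
from pandas import DataFrame

# Both names refer to the very same class object.
print(DataFrame is pd.DataFrame)

# So the class can be called without the `pd.` prefix.
df_short = DataFrame({"col1": [1, 2], "col2": [3, 4]})
print(df_short.shape)
```

The trade-off is that a bare `DataFrame` now lives in my own namespace, where it could be overwritten by another import using the same name.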

Also `np.array()`:

`np` namespace for package `numpy`

`array()` function

The function creates an object of class `ndarray`.

### Methods

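Methods are bound to a class, so I need an object of that class for a method to work. E.g. `pd.drop()` won't work, because the method `drop()` is tied to class `DataFrame`, not to the `pandas` module. So I need to call it on a dataframe, e.g. `df.drop(columns='col2')`. (`pd.DataFrame().drop()` does find the method, but `drop()` still needs to be told what to drop.)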
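Here is a sketch of that difference, reusing the small two-column dataframe from earlier. The names `module_error` and `dropped` are mine.

```python
import pandas as pd

df = pd.DataFrame({"col1": [1, 2], "col2": [3, 4]})

# The pandas module has no attribute called `drop`.
try:
    pd.drop(columns="col2")
    module_error = None
except AttributeError as err:
    module_error = str(err)
print(module_error)

# Called on a DataFrame object, the method is found and runs.
dropped = df.drop(columns="col2")
print(list(dropped.columns))

# By default drop() returns a new object and leaves df untouched.
print(list(df.columns))
```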

Methods are different to functions because the object is implicitly passed into the method.
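To see the implicit passing, here is a minimal sketch with a made-up class. `Greeter` is hypothetical and not from any library; it only exists to show that `object.method()` and `Class.method(object)` are the same call.

```python
class Greeter:  # hypothetical class, for illustration only
    def __init__(self, name):
        self.name = name

    def greet(self):
        # `self` is the object the method was called on.
        return f"Hello, {self.name}"


g = Greeter("iris")

# Usual form: Python passes `g` in as `self` automatically.
bound_call = g.greet()

# Same call with the object passed explicitly.
explicit_call = Greeter.greet(g)

print(bound_call)
print(bound_call == explicit_call)
```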

Also, the method can operate on data contained in the object (an object is an instance of a class; the class is the definition and the object holds the data) [StackOverflow source](https://stackoverflow.com/questions/155609/whats-the-difference-between-a-method-and-a-function).

Does this mean functions *can't* change an object in place? I know a method can mutate an object in place.

#### Numpy method example

*Note: this is made more confusing because numpy also has a function called `all()`.*

So `np.ndarray` has the method `.all()`. This method belongs to class `ndarray`, so it gets called as `object.method()`, because the object is an instance of `pkg.class`. So really the method is something like `pkg.class.method()`, but it's not that simple: `np.ndarray().all()` doesn't work, because the `ndarray()` constructor requires a `shape` argument.

In [64]:
np.ndarray().all()

TypeError: ndarray() missing required argument 'shape' (pos 1)

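But as soon as I give the constructor a shape, the method can work. Note that `np.ndarray(1)` allocates an array of shape `(1,)` without initialising its contents, so the `True` below reflects whatever happened to be in memory rather than data I put in.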

In [66]:
np.ndarray(1).all()

True

So `np.all()` is also a function, not just a method. Here, I'm using the function on the object `np.ndarray(1)`.

In [68]:
np.all(np.ndarray(1))

True
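A more predictable version of the same comparison is to build the arrays with `np.array()` or `np.zeros()`, so the contents are known. This is just a sketch; `with_zero`, `zeros`, `method_result` and `function_result` are names I picked.

```python
import numpy as np

arr = np.array([[1, 2], [3, 4]])
with_zero = np.array([[1, 0], [3, 4]])

# Method form and function form give the same answer.
method_result = arr.all()
function_result = np.all(arr)
print(method_result, function_result)

# A single zero element makes all() False in both forms.
print(with_zero.all(), np.all(with_zero))

# np.zeros() initialises the data, so this result is well defined.
zeros = np.zeros(1)
print(zeros.all())
```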

# Changing objects in place

According to Isitha, both functions and methods can change an object in place, so in-place mutation isn't what separates a function from a method.
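A small sketch with a plain Python list to convince myself. The two functions `append_in_place` and `append_copy` are made up for this example.

```python
def append_in_place(values, item):
    # Mutates the list object that was passed in; returns nothing.
    values.append(item)


def append_copy(values, item):
    # Builds and returns a new list; the original is left alone.
    return values + [item]


numbers = [1, 2]

# A plain function changing its argument in place.
append_in_place(numbers, 3)
print(numbers)

# A plain function that does not change its argument.
longer = append_copy(numbers, 4)
print(numbers)
print(longer)

# A method changing the object in place.
numbers.append(5)
print(numbers)
```

Whether an object changes in place depends on what the code does to it, not on whether that code is written as a function or a method.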