# Everything (in Python) is an object

Topics:

- Objects
- Libraries
- Duck-typing

Time: 10 minutes

## Objects

By _object_ we mean a bundle of data.
Sometimes the bundle is simple (such as a floating point number, e.g. `0.2`),
and sometimes it's complicated, such as a dataframe.
All object-oriented programming languages (of which Python is one) treat objects as the primary thing of interest,
but Python takes it in an unusual direction.

Data inside an object is one of:

- Named data, called a _property_, accessed as `object.property`
- Numbered data, accessed via an _index_ as `object[index]`
- A function (often called a _method_ in the context of objects), accessed as `object.method`

(It turns out Python actually uses properties for everything, but these are the names common in computer science.)

In [15]:
# Make a complex number
# Python uses `j` for the imaginary unit, which is the engineering standard
complex_number_example = 12 + 6j

# Grab the imaginary part of the complex number
print( complex_number_example.imag )
#                             ^^^^ property
#       ^^^^^^^^^^^^^^^^^^^^^      object


list_example = ["a", 200, complex_number_example]

# Grab the value at index 1
print( list_example[1] )
#                   ^  index
#      ^^^^^^^^^^^^    object

string_example = "hello"

print( string_example.upper() )
#                     ^^^^^    method
#      ^^^^^^^^^^^^^^          object

6.0
200
HELLO
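
As a rough sketch of the remark that Python uses properties for everything: square-bracket indexing is itself routed through a special method, `__getitem__`, and dotted access can also be spelled with the built-in `getattr`. The variable names below are only illustrative.

```python
# Illustrative values, not part of the cells above
letters = ["a", "b", "c"]

# Square brackets are shorthand for the special method __getitem__
print(letters[1])
print(letters.__getitem__(1))

# Dotted access can also be written with the built-in getattr
number = 12 + 6j
print(number.imag)
print(getattr(number, "imag"))
```

You would normally write `letters[1]` and `number.imag`; the longer spellings are only here to show what happens underneath.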


### Subobjects

You can _nest_ objects: the value of a property on one object can itself be an object.

In [20]:
# `datetime` is a library that handles dates and times
# We'll cover why we treat libraries like objects in a moment
import datetime
date_and_time_then = datetime.datetime.now()
#                                     ^^^       method on the sub-object
#                            ^^^^^^^^           (class) sub-object on the object
#                   ^^^^^^^^                    (library) object

This function itself returned an object, now stored in `date_and_time_then`.
Note that the property values were set at the time we called the function that made the object:

In [21]:
date_and_time_then.microsecond

744933
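
To see that these values are fixed when the object is created, one option is to make a second object a little later and compare the two. This is a small sketch; the names `first_moment` and `second_moment` are made up for illustration.

```python
import datetime
import time

first_moment = datetime.datetime.now()
time.sleep(0.05)  # wait a little so the clock moves on
second_moment = datetime.datetime.now()

# Each object keeps the values it was created with
print(first_moment.microsecond)
print(second_moment.microsecond)

# Datetime objects can be compared: is the first one earlier?
print(first_moment < second_moment)
```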

How do we know which properties are available?

In [10]:
# Incomplete. Here to show tab completion
date_and_time_then.


We'll mention "special" properties later, which are ones with names that start and end with double-underscores. Jupyter hides these from you (because you almost certainly don't want to access them directly).
If you really want to see them, pass the object to the `dir` function and it will give you a list of all defined properties.

In [15]:
# It's a long output, and you generally don't need to do it
dir(date_and_time_then)

['__add__',
 '__class__',
 '__delattr__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getstate__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__le__',
 '__lt__',
 '__ne__',
 '__new__',
 '__radd__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__rsub__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__sub__',
 '__subclasshook__',
 'astimezone',
 'combine',
 'ctime',
 'date',
 'day',
 'dst',
 'fold',
 'fromisocalendar',
 'fromisoformat',
 'fromordinal',
 'fromtimestamp',
 'hour',
 'isocalendar',
 'isoformat',
 'isoweekday',
 'max',
 'microsecond',
 'min',
 'minute',
 'month',
 'now',
 'replace',
 'resolution',
 'second',
 'strftime',
 'strptime',
 'time',
 'timestamp',
 'timetuple',
 'timetz',
 'today',
 'toordinal',
 'tzinfo',
 'tzname',
 'utcfromtimestamp',
 'utcnow',
 'utcoffset',
 'utctimetuple',
 'weekday',
 'year']
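
If the full `dir` output is too noisy, a list comprehension can filter out the double-underscore names. This is a sketch of one way to do it; `moment` and `public_names` are just illustrative names.

```python
import datetime

moment = datetime.datetime.now()

# Keep only the names that don't start with a double underscore
public_names = [name for name in dir(moment) if not name.startswith("__")]

# Show the first few, rather than the whole list
print(public_names[:5])
```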

### How do we know what it does?

Use shift+tab to access the _docstring_ for an object.
This is a string provided by the original authors of the code where they write down what something should do or be.
It's set by the class that makes the object.
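
Outside Jupyter, the same docstring can be reached from code. A minimal sketch: the text is stored on the object under the special property `__doc__`, and the standard library's `inspect.getdoc` returns it with the indentation tidied up. The `describe` function is a made-up example.

```python
import inspect

def describe(thing):
    """Return a short description of thing."""
    return f"{thing!r} is a {type(thing).__name__}"

# The docstring is stored on the function object itself
print(describe.__doc__)

# inspect.getdoc returns the same text, cleaned of extra indentation
print(inspect.getdoc(describe))
```

In an interactive session, `help(describe)` prints the same text along with the function's signature.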

## Classes

A _class_ (in general computer science) is a common specification of the data (and methods) that must be present on an object to count as being a member of that class.
That object is then called an _instance_ of that class.



An example class in Python is `list`.
All lists in Python have a method that returns the number of entries in the list.
It's one of the "special" methods:

In [None]:
example_list = [1,2,3,4,5]
# Don't do it like this, see the next cell, this is just an example
example_list.__len__()

5

In [18]:
# Do it like this
len(example_list)

5

Finding the length of something is so common an operation that it has its own built-in function, `len`,
and it makes the code much easier to read when you use `len`.
Under the hood, all `len` is doing is calling the object's `__len__()` method.
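
One way to convince yourself of this is to write a class that defines `__len__` and then pass an instance to `len`. The `Playlist` class below is a made-up example, not something from a library.

```python
class Playlist:
    """A made-up class holding a list of song titles."""

    def __init__(self, songs):
        self.songs = songs

    def __len__(self):
        # len(playlist) ends up here
        return len(self.songs)


playlist = Playlist(["Intro", "Verse", "Outro"])
print(len(playlist))  # prints 3
```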

But other things also have lengths! For example, strings:

In [22]:
len("The length of a string is actually a really deep rabbit-hole.")

61

Even though lists and strings have some overlap, there are differences, e.g., strings have a `.upper()` method that puts them into upper-case.
This doesn't make sense for lists of arbitrary data.

A class (in Python) is a means of creating objects that conform to such a specification.

In [28]:
# numpy is "numeric python", and is conventionally abbreviated to `np`
import numpy as np

# Pass np.array a list of lists, which it interprets as an "array", which in this case is a 2x2 matrix
identity_matrix = np.array([[1,0], [0,1]])

# It even knows how to print it nicely, again because of a property created by the class
identity_matrix

array([[1, 0],
       [0, 1]])

In [35]:
# The original list `[[1,0], [0,1]]` doesn't have a `.trace()` method
identity_matrix.trace()

2
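
The same idea can be sketched without numpy. Below, a made-up `Point` class plays the role of the specification, and `type` and `isinstance` report which class made an object. None of these names come from the cells above.

```python
class Point:
    """A made-up class: every instance carries an x and a y."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def distance_from_origin(self):
        return (self.x ** 2 + self.y ** 2) ** 0.5


corner = Point(3, 4)

print(type(corner).__name__)          # prints Point
print(isinstance(corner, Point))      # prints True
print(isinstance([1, 2, 3], list))    # prints True
print(corner.distance_from_origin())  # prints 5.0
```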

### Contrast: Java

Java is another object-oriented language.
You can't just "define a function" in Java,
because all code must belong to some class.

Python "solves" this by making functions into a type of object.

In [32]:
# remember that string_example.upper is a method on string_example
# __doc__ stores the docstring of an object
print(string_example.upper.__doc__)

Return a copy of the string converted to uppercase.
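
Because functions are objects, they can be renamed, inspected, and stored like any other value. A short sketch, using a made-up function called `shout`:

```python
def shout(text):
    """Return text in upper-case with an exclamation mark."""
    return text.upper() + "!"

# A function can be given a second name, like any other object
also_shout = shout
print(also_shout("hello"))  # prints HELLO!

# Functions have properties of their own
print(shout.__name__)  # prints shout

# Functions can be stored inside other objects, such as lists
for function in [len, shout]:
    print(function("abc"))
```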


## Libraries

A _library_ is code designed to be re-used between programs.
Python distinguishes between:

- _modules_ (single python files) 
- _packages_ (folders of modules, laid out in a certain way)

Since that difference won't matter to us, we're going to just say "library" or "package".

### Importing

When you import a library two things happen:

- Python finds the library, and _runs the code inside it_
- Python builds a big object of things created by running that code, and gives that object the name of the library

In [37]:
# the `random` library generates random data for you.
import random

dir(random)

['BPF',
 'LOG4',
 'NV_MAGICCONST',
 'RECIP_BPF',
 'Random',
 'SG_MAGICCONST',
 'SystemRandom',
 'TWOPI',
 '_ONE',
 '_Sequence',
 '_Set',
 '__all__',
 '__builtins__',
 '__cached__',
 '__doc__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__spec__',
 '_accumulate',
 '_acos',
 '_bisect',
 '_ceil',
 '_cos',
 '_e',
 '_exp',
 '_floor',
 '_index',
 '_inst',
 '_isfinite',
 '_log',
 '_os',
 '_pi',
 '_random',
 '_repeat',
 '_sha512',
 '_sin',
 '_sqrt',
 '_test',
 '_test_generator',
 '_urandom',
 '_warn',
 'betavariate',
 'choice',
 'choices',
 'expovariate',
 'gammavariate',
 'gauss',
 'getrandbits',
 'getstate',
 'lognormvariate',
 'normalvariate',
 'paretovariate',
 'randbytes',
 'randint',
 'random',
 'randrange',
 'sample',
 'seed',
 'setstate',
 'shuffle',
 'triangular',
 'uniform',
 'vonmisesvariate',
 'weibullvariate']

It's considered bad manners to have code in your library that does anything apart from set up definitions for the user.
Running that code still takes time though,
and it's why some libraries take longer to import than others.

Aside: You can go and look at these Python files!
They're very software-engineering in style, but they're written in Python.

Some packages also include other software that can't be written in Python (such as fast, compiled subroutines for pandas), but the overwhelming majority of Python packages are written in Python.

After importing a library what you, the user, actually get is an object.
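
A quick way to check this claim is to treat the imported library like any other object. The sketch below uses only the standard library; `random_again` is an illustrative name.

```python
import random
import sys

# The imported library is an object whose class is `module`
print(type(random).__name__)  # prints module

# The things the library defined are properties on that object
print(callable(random.randint))  # prints True

# Python remembers libraries it has already imported,
# so a second import hands back the same object without re-running the code
import random as random_again
print(random_again is random)  # prints True
print(sys.modules["random"] is random)  # prints True
```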

### Installing

Some libraries come with Python, such as `random` and `datetime`.
These are considered so essential to using Python that they're now included as standard.

Anything not included needs to be downloaded,
and in some cases extra steps need to be run to make the library work.

`pip` (the Package Installer for Python) is the **official** tool for this.
It does the downloading _and_ the extra steps for you.
We'll touch on virtual environments at the end of the day,
which are a way to have isolated instances of Python+pip on your computer.
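
Before reaching for `pip`, it can help to check whether Python can already find a library. The sketch below is one possible way, using the standard library's `importlib.util.find_spec`; the helper name `is_installed` is made up for illustration.

```python
import importlib.util

def is_installed(library_name):
    """Return True if Python can find a top-level library with this name."""
    return importlib.util.find_spec(library_name) is not None

print(is_installed("random"))  # comes with Python
print(is_installed("numpy"))   # depends on what has been installed on this machine
```

If the second line prints `False`, that library would need to be installed (for example with `pip install numpy` in a terminal) before `import numpy` works.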

Pip downloads packages from a service called PyPI (the Python Package Index).
Anyone can upload packages (I've uploaded three).
Python has succeeded by being generally easy to learn, and easy to build packages for.

## Duck-typing

Python is, by design, permissive.

For example the following function describes itself as adding two numbers together.

In [7]:
def add(a,b):
    """Add two numbers together."""

    # The plus symbol is actually shorthand for a secret property on the object to the left of the symbol
    # Just like `len` actually ends up calling the `__len__` method on an object
    return a+b

That function is well-defined in Python: the cell ran and we got no errors.
We will continue to not get any errors for as long as `a` and `b` have a compatible definition of addition.
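
To make "a compatible definition of addition" concrete, here is a sketch that calls the special method `__add__` directly. You wouldn't normally write code this way; it only illustrates what the plus symbol does first.

```python
# The plus symbol calls the special method __add__ on the left-hand object
print(2 + 3)
print((2).__add__(3))

print([1, 2] + [3])
print([1, 2].__add__([3]))

# When the left-hand object doesn't know how to add the other one,
# its __add__ returns the special value NotImplemented
print((2).__add__("?"))  # prints NotImplemented
```

When that happens, Python gives the right-hand object a chance (via `__radd__`), and raises a `TypeError` only if that fails too.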

In [8]:
# Add two integers
add(2,3)

5

In [9]:
# Add an integer to a float
add(2,3.5)

5.5

In [11]:
# Add two lists
add([1,2],[3])

[1, 2, 3]

In [12]:
# Add two strings
add("Hello ", "world")

'Hello world'

In [13]:
# Should error
# Add an integer and a string
add(2, "?")

TypeError: unsupported operand type(s) for +: 'int' and 'str'

Python let us define a function that can fail,
and trusted that we wouldn't perform the actions that lead to failure.
This permissiveness makes Python code quick to write, and quick to fail in unexpected conditions.
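
If a failure like this is a real possibility, one common pattern is to catch the error where the function is called. A minimal sketch, repeating the `add` function so the example stands alone:

```python
def add(a, b):
    """Add two numbers together."""
    return a + b

try:
    result = add(2, "?")
except TypeError as error:
    # The failure only shows up when the line actually runs
    result = None
    print(f"Could not add: {error}")

print(result)  # prints None
```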

In other languages you would have to **specify up front** (when defining the function) what kinds of data it expects.
That's because many languages are _statically typed_, meaning that before any code is run that code is checked for things such as "are the arguments to this function of the expected type".

This lets the compiler check for common kinds of bugs.
It's really useful!
But it doesn't work with Python's paradigm.
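
Python does let you write down the expected types as optional _type hints_, but it does not enforce them when the code runs. The sketch below shows this with a hinted version of `add`; separate tools (mypy is one example) can read the hints and check them before the code is run.

```python
def add(a: int, b: int) -> int:
    """Add two integers together."""
    return a + b

# The hints are stored on the function...
print(add.__annotations__)

# ...but Python does not enforce them when the code runs
print(add("Hello ", "world"))  # prints Hello world
```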

Instead Python uses what is called _duck typing_:

"If it looks like a duck, swims like a duck, and quacks like a duck, then it probably is a duck."

This means that, although an object knows which class made it initially,
it doesn't know if it can be used a certain way until it tries.

Remember that our `add` function worked perfectly well when both arguments were `list`s.
All that mattered was that the code ran as expected.
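
The duck saying can be sketched almost literally. The classes `Duck` and `Robot` and the function `make_it_quack` are made up for this example; the point is that the function never checks which class its argument came from.

```python
class Duck:
    def quack(self):
        return "Quack!"

class Robot:
    def quack(self):
        return "Beep. Quack. Beep."

def make_it_quack(thing):
    # No check on the class: only the presence of .quack() matters
    return thing.quack()

print(make_it_quack(Duck()))
print(make_it_quack(Robot()))

try:
    make_it_quack("not a duck")
except AttributeError as error:
    # Strings have no .quack() method, so this fails at the point of use
    print(f"Failed at the point of use: {error}")
```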

## Recap

Objects are bundles of data. You access that data with `.` (for properties and methods) or square-brackets for indexed data.

Everything is an object, and you can nest objects. Even libraries (once imported) and functions are objects.

Classes are ways to make objects.

Python doesn't check types before running your code, instead using "duck typing"; if the object has the properties you need, then that's all that matters.