# Agenda

1. Intro to objects
    - What are objects?
    - In Python, everything is an object -- so what?
    - Creating your own data structures (in general)
    - Creating your own classes (i.e., new types of data, new object types)
    - What happens when you create a new object?
    - Complex objects and composition
    - Attributes in our objects
2. Methods
    - Using methods
    - Writing methods
    - The `self` parameter in methods
    - Other parameters and passing arguments to them
3. Tomorrow:
    - Class attributes
    - Inheritance
    - How objects really work behind the scenes
    - Dunder ("magic") methods that let us customize our objects

# What are objects?

Software has always been hard to write, and even harder to maintain. When people came up with higher-level languages, they thought that things would get easier.

In the 1970s, people were overwhelmed by software and its complexity -- especially when they wanted to maintain/debug/change software. They wanted a new way to write code that would make it easier to understand and maintain.

Alan Kay, working at Xerox PARC, designed a new programming language called Smalltalk. Smalltalk used "objects." What did that mean?

Kay talked about the human body and biological systems as being hugely complex and yet working well. This happens because the body is divided into many cells. Each cell has a different type -- nerve cells, skin cells, fat cells, muscle cells. Each type of cell sends and receives different types of messages.

Smalltalk was designed such that each type of data you wanted to work with had a "type" that determined what messages it would send and receive. The messages were known as "methods," but they were really glorified function calls. The idea was that you could know what types of messages were sent and received by a particular piece of data, based on its type.

The idea:

- Instead of cells, we have objects
- Instead of types of cells, we have classes
- Instead of messages, we have methods

These terms are hard for people, because they are different from the rest of programming. Nearly every programming language today uses objects in some way or another. 

Why do we need objects? To make our code easier to understand, read, and write. It's really a system for packaging our software, so that it's easier to do those things.

There are still object skeptics, who say that there really isn't any huge advantage in using them. But most programmers and most languages are pushing us to use objects, because they can make things easier to build and maintain.

For some people, objects are almost a religion. If you do things the object way, then you're right. And if you don't, then you're wrong. Python doesn't have this approach at all! Python encourages you to use objects, and all of the data is packaged as objects, and we'll use methods, as well -- but if you want to write non-object code, you completely can do that in Python.

# Everything is an object

We love to say this in Python. But what does it mean? Why do we care?

- We can apply the same rules to objects that we define and design as apply to the internal, builtin objects that Python defines. There aren't two sets of rules; the rules are consistent.
- If we want to improve/extend the system, there is a standard way to do that by writing new classes.



# Jargon/vocabulary

- `class` and `type` -- these words are interchangeable when we speak, but they have different roles in Python itself. The idea is that each class/type is a category of value. You've seen classes before -- `str` is a class, as is `int`. The big deal with object-oriented programming is that you can create new types/classes, and they will be no more and no less a part of Python than `str` and `int`.
    - `'abcd'` is a string, meaning that it's an *instance* of `str`.
    - Another way to say this is that `str` is the *class* of `'abcd'`
    - Another way to say this is that `str` is the *type* of `'abcd'`.
    - `str`, the type, can also be used to create new instances of `str`, i.e., new strings. So when you say `str(5)`, you're creating a new `str` instance based on the `int` object `5`.
- If we want to create a new object of type `X`, then we invoke `X()` and get a new value back of type `X`.
    - I can create a new `int` with `int()`
    - I can create a new `str` with `str()`
    - I can create a new `list` with `list()`
- `instance` is another word for "value," but it mentions the class. Every value in Python is an *instance* of some type. So `'abcd'` is an instance of `str`, and `5` is an instance of `int`. If you want to find out the type of a given object, you can invoke the `type` builtin function. It'll return the type of an object.
- `object` -- this word is used *FAR* too much in object-oriented programming. It means a value -- `5` and `'abcd'` are both objects. But every class is also an object, because everything in Python is an object. We also have a class called `object`, which is the "parent" of all classes out there.
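
Here is a small sketch of the vocabulary above, using only builtin classes. The variable names (`empty_int` and friends) are mine, chosen just for illustration.

```python
# Calling a class creates a new instance of that class.
empty_int = int()      # 0
empty_str = str()      # ''
empty_list = list()    # []

# Every value is an instance of some class (type).
print(type(empty_list))           # <class 'list'>
print(isinstance('abcd', str))    # True

# Classes are objects too, and every class has `object` as an ancestor.
print(type(str))                  # <class 'type'>
print(issubclass(str, object))    # True
print(issubclass(int, object))    # True
```

Outside of Jupyter, `print` shows a class as `<class 'list'>`; in a notebook cell, the bare expression `type(empty_list)` displays as just `list`.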

In [1]:
str(5)

'5'

In [2]:
type(5)

int

In [3]:
type('5')

str

# So what? Why do we care about an object's type/class?

It tells us everything we need to know about how that object will behave.

We can, based on the class, know:

- What type of data it stores
- What methods we can invoke
- What operators we can use
- What values we'll get back from each method and operator
- What inputs it can take, and what outputs it provides

If we see data of a type we know, then we can make lots of assumptions about it. If we see data of a type we don't know, then we should go study the documentation for it, so that we can know about the above.
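
As a rough illustration of how much the class determines, here is a sketch comparing `int` and `str`. The variable names are assumptions of mine.

```python
# The same operator behaves differently depending on the class of its operands.
int_sum = 2 + 3          # int addition
str_sum = '2' + '3'      # str concatenation
print(int_sum)           # 5
print(str_sum)           # 23

# The available methods are determined by the class, too.
print('abcd'.upper())             # ABCD
print(hasattr('abcd', 'upper'))   # True
print(hasattr(5, 'upper'))        # False

# Combining types in a way the classes don't support raises an error.
try:
    2 + '3'
except TypeError as e:
    print(f'TypeError: {e}')
```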

# Do we really need objects?

No! They are helpful, and they help us to organize our code, but computers and programs will still exist and work even if we never create any of our own classes.

There are still some languages that don't have any objects.

In [4]:
# let's create a data structure that keeps track of a person
# it'll hold first name, last name, and shoe size

p = ('Reuven', 'Lerner', 46)

In [5]:
# p is an object of type tuple

type(p)

tuple

Having the data in a tuple has a few issues:

1. I have to retrieve the fields via a numeric index
2. I have to think of it as a tuple, when it would be nice to think of it as a "person". Higher-level thinking ("abstraction") is a very important part of programming, and objects help with it.


In [6]:
p[0]

'Reuven'

In [7]:
p[1]

'Lerner'

In [8]:
p[2]

46

In [9]:
# if I want to print the first + last names, I can:

print(f'{p[0]} {p[1]}')

Reuven Lerner


In [12]:
# one solution to make it nicer is to write a function

def fullname(person_tuple):
    return f'{person_tuple[0]} {person_tuple[1]}'

In [13]:
fullname(p)

'Reuven Lerner'

In [14]:
# the function and the tuple were defined separately
# in the world of objects, we want them to be joined at the hip

# Python, functions, classes, and capitalization

Both functions and classes in Python are what we call "callables," because we can "invoke" them with `()`. For this reason, it's easy to get confused:

- `int` is a class
- `collections.Counter` is a class, too

This gets worse because modern Python conventions tell us to Capitalize or CamelCase our class names, but snake_case our function names. What gives with `int` and `str`, then?

Those core data structures were grandfathered in. They aren't the only exceptions (`collections.defaultdict` and `collections.deque` are lowercase, too), but most classes in Python adhere to the CamelCase rule.

There are people who call `int` a function, even though it isn't! There are others who say that we shouldn't dwell on whether it's a function or a class, and just call it a "callable" or a "builtin."
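
If you're ever unsure which one you're dealing with, you can ask Python. This is a minimal sketch; `greet` is a hypothetical function I made up for the comparison.

```python
import collections

def greet():
    return 'hello'

# Functions and classes can both be invoked with (), so both are callable.
print(callable(greet))                 # True
print(callable(int))                   # True
print(callable(collections.Counter))   # True
print(callable(5))                     # False

# Classes are instances of `type`; functions are not.
print(isinstance(int, type))                   # True
print(isinstance(collections.Counter, type))   # True
print(isinstance(greet, type))                 # False
print(isinstance(len, type))                   # False
```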



# What are operators?  

Symbols we can use in Python expressions. They are, behind the scenes, transformed into methods.

- `+` is the addition operator
- `-` is the subtraction operator
- `=` is the assignment operator (although, unlike the others here, it's part of a statement and isn't turned into a method call)
- `==` is the equality operator
- `[]` is the "retrieve an element" operator
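
To see the connection between operators and methods, you can invoke the underlying dunder methods yourself. This is a simplified sketch: Python's real dispatch has more rules (it looks the method up on the class, and may try the other operand), and you would never normally write code this way. Plain assignment with `=` is left out because there is no method behind it.

```python
# Each pair shows the operator, then the method call it roughly corresponds to.
print(2 + 3, (2).__add__(3))              # 5 5
print(10 - 4, (10).__sub__(4))            # 6 6
print('a' == 'a', 'a'.__eq__('a'))        # True True
print('abcd'[1], 'abcd'.__getitem__(1))   # b b
```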

# What's the problem with what we did above with `p`?

1. We lost the advantage of *abstraction*. We're thinking in terms of tuples, not in terms of people.
2. There's no guarantee that `fullname` will be given a tuple, or the right kind of tuple, as an argument. In object-oriented programming, we combine our data and our functionality to make this less likely.
3. It's annoying to think about index 0 and index 1 instead of "first name" and "last name."
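
To make the second problem concrete, here is a sketch in which `fullname` is handed the wrong kind of tuple. The `company` tuple is an illustrative value of mine.

```python
def fullname(person_tuple):
    return f'{person_tuple[0]} {person_tuple[1]}'

person = ('Reuven', 'Lerner', 46)
company = ('BigCo', 'https://bigco.com', 100_000)

print(fullname(person))    # Reuven Lerner
print(fullname(company))   # BigCo https://bigco.com
```

Python doesn't complain about the second call. It's a 3-element tuple, so the function happily returns a nonsense "name."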

If we use a class, we can solve *all* of these problems!

# Exercise: Non-object objects

1. Define two 3-element tuples, one for each of two companies. Each tuple should contain the company's name, its URL, and its number of employees.
2. Write a function that expects to get that tuple, and returns a string -- a link to the company's URL with the company name.
3. What happens if you run `type` on your tuple?

In [15]:
company1 = ('BigCo', 'https://bigco.com', 100_000)
company2 = ('SmallCo', 'https://smallco.com', 5)

In [16]:
type(company1)

tuple

In [17]:
type(company2)

tuple

In [20]:
def name_and_url(company_tuple):
    # the URL goes in the href attribute, and the company name is the link text
    return f'<a href="{company_tuple[1]}">{company_tuple[0]}</a>'

In [21]:
print(name_and_url(company1))

<a href="https://bigco.com">BigCo</a>


In [22]:
print(name_and_url(company2))

<a href="https://smallco.com">SmallCo</a>


# What's wrong with this?

1. We are working at a low level, not at a higher level of abstraction that would let us think about companies rather than tuples.
2. There is no formal connection between our tuples and our function.

# How can we rewrite this as a class?

- We'll need to define a new data structure, a new class/type, each of whose instances represents a company
- We'll then need to define methods (functions) and attach them to the class

This will solve all of our problems. We will have that higher level of abstraction *and* we'll be able to invoke a method directly on the data, rather than making the connection ourselves. We'll also have access to the values in a better, more readable way than tuple indexes.
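
As a preview of where this is heading, here is one way such a class might look. This is only a sketch under my own assumptions: the class name `Company`, the attribute names, and the method name are illustrative choices, and the pieces (`__init__`, `self`, methods) are explained step by step in what follows.

```python
class Company:
    def __init__(self, name, url, employees):
        # store each value under a readable name, instead of a tuple index
        self.name = name
        self.url = url
        self.employees = employees

    def name_and_url(self):
        # a method: a function attached to the class, invoked on an instance
        return f'<a href="{self.url}">{self.name}</a>'

c = Company('BigCo', 'https://bigco.com', 100_000)
print(c.name)             # BigCo
print(c.name_and_url())   # <a href="https://bigco.com">BigCo</a>
```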

In [25]:
# here is a class Person that does what I had before

class Person:     # this is where we tell Python -- I'm defining a new type of data

    # inside of my class, I have indentation, and here we define methods
    # the first method I'll define is called __init__ ("dunder init," meaning "double underscore before and after init")

    # the first parameter for a method must be "self"
    # when someone creates a new instance of Person, they must give us arguments for the 3 parameters after self
    def __init__(self, first_name, last_name, shoe_size):

        # take each parameter, and use its value to add a new attribute to our new object
        self.first_name = first_name
        self.last_name = last_name
        self.shoe_size = shoe_size

p = Person('Reuven', 'Lerner', 46)          # I've created a new Person object with the 3 values from before 

In [26]:
# I can now retrieve those values:

p.first_name

'Reuven'

In [27]:
p.last_name

'Lerner'

In [28]:
p.shoe_size

46

In [29]:
type(p)

__main__.Person

# We can say:

- I've created a new `Person` class, a new data type for representing information about a person
- Each instance of `Person` is a new person object, representing a different person with their own first/last names and shoe size
- If `p` is an instance of `Person`, then we can also say that `type(p)` is `Person`
- We can retrieve any attributes we've set on `p` with a `.` and the attribute name.
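
A short sketch of these points, repeating the `Person` class so that it runs on its own. The second person (`'Jane'`, `'Doe'`, `38`) is a made-up value.

```python
class Person:
    def __init__(self, first_name, last_name, shoe_size):
        self.first_name = first_name
        self.last_name = last_name
        self.shoe_size = shoe_size

p1 = Person('Reuven', 'Lerner', 46)
p2 = Person('Jane', 'Doe', 38)

# Both are instances of the same class...
print(type(p1) is Person)        # True
print(isinstance(p2, Person))    # True

# ...but each instance has its own attributes.
print(p1.first_name, p2.first_name)   # Reuven Jane

p2.shoe_size = 39                     # changing one doesn't affect the other
print(p1.shoe_size, p2.shoe_size)     # 46 39
```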

# Next up

1. What happens when we define a class?
2. What happens when we create a new instance of that class?
3. How can we translate from our non-object Python into our object-oriented Python?

In [31]:
class Person:
    def __init__(self, first_name, last_name, shoe_size):
        self.first_name = first_name     # take the value in the first_name parameter, and assign it to a new attribute on self
        self.last_name = last_name
        self.shoe_size = shoe_size

# What's going on?

1. The `class` keyword in Python tells the language that we want to define a new type of data, parallel to `str`, `list`, `dict`, and all of the others. Our class (data type) will be just as capable as those builtin data types.
2. When we define a class, we give it a name. Traditionally, that name is in `CamelCase`.
3. We need a colon at the end of the line, indicating that we're about to open an indented block. That block is known as the "class body." Most of the class body will consist of method definitions, all starting with `def`. We will see some other things we can put there tomorrow.
4. When we use `def` inside of a class body, we're defining a *method*, a function that is connected to our data type. Only objects of type `Person` will be able to use any method defined here.
5. `__init__` is a special method, a *magic* method, that is invoked by Python automatically when we create a new instance. We'll talk about that in a moment.
6. We define our `__init__` with four parameters: `self` (which is mandatory for any method), then three more parameters. `self` will be assigned the instance of `Person` on which we're running, and that'll happen automatically. The other three parameters get their values from the arguments we pass to `Person` when we invoke it.
7. The job of `__init__` is to assign attributes (i.e., names and values) to `self`, the new instance. When `__init__` is invoked, `self` is an object, but a "naked" object, without any attributes that make it special. Our job in `__init__` is to assign to `self`, one at a time, the attributes that'll make it truly an object of type `Person`.
8. `__init__` doesn't return any value. The assignments to `self` are enough.
9. Do all of my parameters need to be used in assigning to `self`? No, but that's common. Can I assign attributes to `self` that have nothing to do with the parameters? Yes, and that's fine.
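
Here is a sketch of points 6 and 9. The `visits` attribute is a hypothetical addition of mine, included only to show an attribute that doesn't come from a parameter.

```python
class Person:
    def __init__(self, first_name, last_name, shoe_size):
        self.first_name = first_name
        self.last_name = last_name
        self.shoe_size = shoe_size
        self.visits = 0        # hypothetical: not based on any parameter

# We pass three arguments; Python supplies the new instance as self.
p = Person('Reuven', 'Lerner', 46)
print(p.visits)    # 0

# Passing too few arguments means a parameter has no value, so we get an error.
missing_argument_failed = False
try:
    Person('Reuven', 'Lerner')
except TypeError as e:
    missing_argument_failed = True
    print(f'TypeError: {e}')
```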

In [32]:
p = Person('Reuven', 'Lerner', 46)

In [34]:
p.first_name      # I'm retrieving first_name from p!  Because it was assigned, I can retrieve it!

'Reuven'

# Variables vs. attributes

Python has *two* different storage mechanisms for values:

- Variables, which can be local (in a function) or global (available everywhere)
- Attributes, which are connected to a particular object. You can think of attributes as a private dict on each value in Python. 

How can you tell the difference? Because attributes are tied to a particular object, you need to say both the object and the attribute; you cannot just say the attribute. The way you do that is with a `.`.

When I ask for `p.first_name`, I'm asking `p` (the Person object) to give me the value of its `first_name` attribute. Fortunately, that was assigned back in `__init__`.

Asking for `p.first_name` means that Python turns to `p` and says: Do you have an attribute `first_name`? This has *NOTHING* to do with local or global variables. (`p` is a global variable, but `first_name` is an attribute on `p`.)

Objects in Python use attributes a *lot*. Every time you see a `.`, that means: We're asking an object for its attribute.

Other programming languages talk about "instance variables" and "class variables." Many instructors in Python use this terminology also. I prefer not to do so, because the moment you understand that these are all attributes, the language becomes more obviously consistent and easier to understand.
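
The "private dict" idea can be seen directly with the builtin `vars` function. A minimal sketch, repeating the `Person` class so that it runs on its own:

```python
class Person:
    def __init__(self, first_name, last_name, shoe_size):
        self.first_name = first_name
        self.last_name = last_name
        self.shoe_size = shoe_size

p = Person('Reuven', 'Lerner', 46)

# The attributes we assigned in __init__, shown as a dict
print(vars(p))        # {'first_name': 'Reuven', 'last_name': 'Lerner', 'shoe_size': 46}

# With the object and a dot, we get the attribute
print(p.first_name)   # Reuven

# Without the object, Python looks for a *variable* called first_name, and there isn't one
try:
    print(first_name)
except NameError as e:
    print(f'NameError: {e}')
```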

# Namespace

A namespace is a collection of variables. Typically, Python has a global namespace and a local namespace (when a function is running). There is also a "builtin" namespace for names like `str`, `int`, `dict`, etc.

It's true, though, that Python uses modules and classes as a form of namespace, also. However, those aren't for variables. They are, instead, for attributes. You can see that because if you use a class or a module, you'll be using lots of `.` characters between the module/class name and its attributes.
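
The different namespaces can be poked at from Python itself. This is a rough sketch, assuming it runs at the top level of a script or notebook; the names `x`, `y`, and `show_locals` are mine.

```python
import builtins
import math

x = 5                         # a global variable
print('x' in globals())       # True

def show_locals():
    y = 10                    # a local variable, only while the function runs
    return locals()

local_names = show_locals()
print(local_names)            # {'y': 10}

# Builtin names live in their own namespace
print(builtins.str is str)    # True

# A module's names are attributes, so we need a dot to reach them
print(hasattr(math, 'pi'))    # True
print(math.pi)
```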

In [35]:
type(p)

__main__.Person

# Variables vs. attributes

- If you ask Python for the value of a global variable, wherever you are, it'll work. If you define `x = 5` outside of a function and class definition, then you can retrieve it from anywhere you want as `x`.
- If you ask Python for the value of a local variable, so long as you're in the function where it was defined, you're fine. But other places in the program won't know what you're talking about; the variable doesn't exist there.

Attributes exist for as long as their object does, though -- and if you have access to the object the attribute is attached to, you can always retrieve it with `a.b` notation, where `a` is the object and `b` is the name of the attribute.
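
Putting the three side by side, as a sketch. `Thing`, `show_global`, `make_local`, and `read_attribute` are hypothetical names chosen for illustration.

```python
x = 5                      # global variable

def show_global():
    return x               # globals can be read from inside a function

def make_local():
    y = 10                 # local variable: exists only inside make_local
    return y

class Thing:
    def __init__(self, b):
        self.b = b         # attribute: travels with the object

def read_attribute(obj):
    return obj.b           # works anywhere we have the object

a = Thing(42)

print(show_global())       # 5
print(make_local())        # 10

try:
    print(y)               # y was local to make_local
except NameError:
    print('y is not defined out here')

print(a.b)                 # 42
print(read_attribute(a))   # 42
```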