# What *is* that "dot" thing all about?

Early in your Python coding journey, you’ll begin encountering bits of code that use a dot (`.`).

- [`requests.get`](https://requests.readthedocs.io/en/latest/user/quickstart/#make-a-request)
- [`"some string".upper()`](https://docs.python.org/3/library/stdtypes.html#str.upper)
- [`csv.DictReader`](https://docs.python.org/3/library/csv.html#csv.DictReader)
- [`import sys; print(sys.path)`](https://docs.python.org/3/library/sys.html#sys.path)
 
You may even have the misfortune of running into snippets such as the one below, common in data analysis code using [pandas](https://pandas.pydata.org/):

```
dataframe.groupby("some_field").size().rename("new_name").reset_index()
```

Such code snippets use a `.` to access [functions](art_of_functions.ipynb), [methods](https://docs.python.org/3/glossary.html#term-method), [classes](https://docs.python.org/3/tutorial/classes.html) and variables inside of Python [objects](https://docs.python.org/3/glossary.html#term-object) such as classes and [modules](https://docs.python.org/3/tutorial/modules.html).

It can all get quite confusing, especially if you're unfamiliar with terms such as _class_, _method_, _module_ and _object_. 

Below, we'll demonstrate various scenarios where you'll typically encounter the *dot notation*, as it's formally known, and demystify some of these terms and the role the `.` plays in various contexts. Along the way, you'll get a brief primer on some of the more advanced features of Python. You don't have to memorize these techniques or even use them in your own code, but understanding them will dramatically improve your ability to read other people's code and make use of third-party [libraries](python_libraries.ipynb).

## module.something

One of the first places you'll notice the dot notation is when importing and using Python [modules](https://docs.python.org/3/tutorial/modules.html).

Modules are simply text files ending with a `.py` extension where you can store variables, functions and other bits of code. Coders use modules to help organize their software, grouping related bits of _reusable_ functionality into one or more modules. They typically try to give their modules sensible names that convey their purpose.

For example, to download files from the Internet, you might install the `requests` library and use it as follows:

```python
import requests

requests.get('http://example.com')
```

In the above example, we imported a module called `requests`. And inside that module lives a function called `get`, which can be used to grab files from the web (in this case, the HTML source code of example.com).

### Built-in modules

Let's try working with a few modules. 

For example, the [sys](https://docs.python.org/3/library/sys.html) module provides access to information about the Python interpreter:

In [None]:
import sys

# List the OS the Python interpreter is running on
sys.platform


In [None]:
# List the directories where Python searches for modules it can import
sys.path

### Bring your own modules to the party

You can of course create your own modules to store useful code and access them using the dot notation. As an example, we created a module called [awesome.py](awesome.py).

Open the file and check out its contents: a variable, a function and a class.

Now try importing the module and accessing the variable (`NUMBER`) and the function (`hello`). 

> We'll ignore the class (`Bird`) for now. Don't worry, we'll get to that in a moment.

In [None]:
import awesome

In [None]:
awesome.NUMBER

In [None]:
awesome.hello()

Pretty simple, right? The key takeaway in the context of modules is that the `.` provides access to bits of code stored in importable `.py` files.
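To see that a module really is just a `.py` file, here is a minimal sketch that writes a tiny module to a temporary folder and then imports it. The module name `tiny_module` and its contents (`GREETING` and `shout`) are made up for illustration and are not part of `awesome.py`.

```python
import sys
import tempfile
from pathlib import Path

# Source code for a hypothetical module: one variable and one function
module_source = '''
GREETING = "hello"

def shout(text):
    return text.upper() + "!"
'''

# Save the code as tiny_module.py in a temporary folder
module_dir = tempfile.mkdtemp()
Path(module_dir, "tiny_module.py").write_text(module_source)

# Tell Python to also search that folder for modules (see sys.path above)
sys.path.insert(0, module_dir)

import tiny_module

print(tiny_module.GREETING)                     # the variable
print(tiny_module.shout(tiny_module.GREETING))  # the function
```

In everyday work you would simply save the file next to your script or notebook, as we did with `awesome.py`. The temporary folder is only there to keep this sketch self-contained.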

## The Hidden Life of Objects

Like many programming languages, Python supports a style of coding known as [object-oriented programming](https://en.wikipedia.org/wiki/Object-oriented_programming). Object-oriented coding, or OO, is a powerful paradigm that allows you to combine data and code into objects. 

The mechanism for this blending of data and code is a [class](https://docs.python.org/3/tutorial/classes.html). We typically create classes to model some entity and its related attributes and behaviors.

For example, we've included a fun little `Bird` class in the [awesome.py](awesome.py) module. To use this class, you simply call it using parentheses `()`, the same way that you would call a function.

In [None]:
# No need to import awesome again since we did so above
my_bird = awesome.Bird()

Above, we created what is known as an _instance_ of the `Bird` class. Think of the class as the mold from which many individual birds can be stamped.

Once "instantiated", you can access the data and code related to a bird using the `.` notation:

In [None]:
my_bird.name # print the default name for the bird

In [None]:
my_bird.fly()

In [None]:
my_bird.eat_worm()

If you examine the `Bird` class, you'll notice that it contains a number of code snippets that look suspiciously similar to functions. For example, here is the code for `fly`:

```python
def fly(self):
    print("I'm flying!! WEEE!!!")
```

## Always refer to your "self"

The `fly` method looks, feels and acts like a function because it basically is a function. However, when a function lives inside of a class, we call it a [method](https://docs.python.org/3/glossary.html#term-method).

The strange and often confusing nuance about methods is the use of the `self` argument. `self` is always required as the first argument on methods, and Python's OO system assumes it is present (so if you forget it, Python will raise an error when you call the method).

> You could in theory name `self` something else (e.g. `this`, `that` or `owl`). But it's a universal convention, and if you use a different name, be prepared for Pythonistas to come after you with pitchforks.

`self` is required as the first argument in a method because it gives Python a way to link methods with *a particular instance of the class*, and all the bits of data that are associated with that instance. In this way, when you call a method on an instance of a class, Python knows which bundle of data to operate on.

This point may not be obvious in the case of the `fly` method, which is quite simple. Below we'll look at a different method on the `Bird` class that helps clarify why we need the `self` argument.
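One way to see what `self` is doing: calling a method on an instance is the same as calling the function on the class and passing the instance in by hand. The sketch below uses a stripped-down stand-in called `DemoBird`. Both the class and its `describe` method are invented for this example and are not in `awesome.py`.

```python
class DemoBird:
    # Default name shared by every new bird
    name = "Robin"

    def describe(self):
        # "self" is whichever instance the method was called on
        return f"I am a bird named {self.name}"


demo_bird = DemoBird()

# The usual way: Python passes demo_bird in as "self" for us
via_instance = demo_bird.describe()

# The long way: call the function on the class and pass the instance ourselves
via_class = DemoBird.describe(demo_bird)

print(via_instance)
print(via_instance == via_class)
```

You would rarely write the long form in practice. It simply shows that `self` is an ordinary argument that Python fills in for you.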

## The widget factory

It's often helpful to think of classes as the molds in a widget factory, used to stamp out new widgets on the assembly line.

It can be hard to understand `self` in the abstract, until we see it being used. The `change_name` method on the class can help drive home the point. Here is the code for that method:

```python
def change_name(self, new_name):
    # Set the new name on the *instance* using "self"
    self.name = new_name
    print(f'My name is now {self.name}')
```

This method allows you to change the name of a bird by replacing the default name (`Robin`) that was created when we instantiated the class.

In [None]:
my_bird.change_name('Debbie')

You can now verify the name of your bird is different by directly accessing its `name` attribute:

In [None]:
my_bird.name

It's important to emphasize that we have *not* changed the `Bird` class itself. Instead, we simply updated one instance of the bird class stored in the variable `my_bird`. Let's create another bird to illustrate:

In [None]:
your_bird = awesome.Bird()
your_bird.name

Above, we see that `your_bird` has the default name of `Robin`. But we can of course change that:

In [None]:
your_bird.change_name('Suzie')

In [None]:
your_bird.name

This example is arguably more complex than needed, since it's also possible to directly change the value of an attribute without the use of a method. Updating an instance attribute works the same way as updating a variable. You simply assign a new value by using the `=` sign:

In [None]:
my_bird.name = 'Lenny'

In [None]:
my_bird.name

## Classes and methods in the wild

While the `Bird` example is contrived and a bit silly, it hopefully conveys the critical point that classes can be used to stamp out many instances. And once you understand these basics about classes, instances and methods, all sorts of dot-notation syntax starts to make sense.

For example, Python's [string data type](https://docs.python.org/3/library/stdtypes.html#text-sequence-type-str) has oodles of useful methods that can be called on instances of a string:

In [None]:
my_string = 'hello' # create a string instance
my_string.upper()   # make it loud and screamy

In [None]:
my_string.startswith("h") # check the first letter

In [None]:
my_string.endswith("l") # check the last letter

And the fun doesn't stop with strings. Lists and [dictionaries](python_dict_basics.ipynb) have their own unique methods as well:

In [None]:
numbers = [1,2,3]
person = {
    'name': 'Joe',
    'age': 30,
    'favorite color': 'mauve?'
}

In [None]:
numbers.append(4) # add a number to the end of the list
numbers

In [None]:
numbers.pop(0) # remove the number in the first position
numbers

In [None]:
person.keys() # list the keys in the dictionary

In [None]:
person.values() # list the values in the dictionary

And of course, libraries make extensive use of classes.

In [None]:
import csv

with open('files/data/animal_ratings.csv') as infile:
    # Create an instance of the DictReader class
    reader = csv.DictReader(infile)
    # Then loop through the rows and do stuff
    for row in reader:
        animal = row['animal'].title()
        rating = row['awesomeness']
        print(f"{animal} has an awesomeness rating of {rating}")
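If you don't have `animal_ratings.csv` handy, here is a minimal sketch of the same idea using text held in memory. The rows are made up. Only the column names `animal` and `awesomeness` come from the example above.

```python
import csv
import io

# Made-up rows standing in for the contents of animal_ratings.csv
sample_csv = "animal,awesomeness\nsea otter,10\nmosquito,1\n"

# io.StringIO lets us treat a string like an open file
sample_file = io.StringIO(sample_csv)

# Create an instance of the DictReader class
sample_reader = csv.DictReader(sample_file)
print(type(sample_reader))
print(sample_reader.fieldnames)   # a data attribute on the instance

# Each row is a dictionary, so dictionary and string methods apply
sample_rows = list(sample_reader)
for sample_row in sample_rows:
    print(sample_row['animal'].title(), sample_row['awesomeness'])
```

Notice how many dots are at work here: `csv.DictReader` reaches into a module, `sample_reader.fieldnames` reaches into an instance, and `.title()` calls a method on a string.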

## Method Chaining

Remember that _pandas_ snippet from way at the beginning of this tutorial? 

```
dataframe.groupby("some_field").size().rename("new_name").reset_index()
```

That style of syntax is known as _method chaining_. 

Now that you're armed with the knowledge that _methods_ are basically functions that live in classes (and instances of those classes), you can begin to make sense of the phrase "method chaining": It's a technique that allows you to consecutively call methods, one after another, _without having to store and operate on the return value of each step in the chain._
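You don't need pandas to try this out. As a small sketch, the built-in string methods can be chained too, because each one returns a new string. The sample text here is made up.

```python
raw = "  Some Field  "

# Step by step, storing each return value along the way
stripped = raw.strip()                     # remove surrounding whitespace
lowered = stripped.lower()                 # convert to lowercase
step_by_step = lowered.replace(" ", "_")   # swap spaces for underscores

# The same work as a single chain, with no intermediate variables
chained = raw.strip().lower().replace(" ", "_")

print(chained)
print(chained == step_by_step)
```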

There's one additional concept that's required to make sense of such code.

Similar to Python functions, methods can explicitly `return` a value.

Let's construct a new class to prove the point. Here we'll introduce a special method called [\_\_init\_\_](https://docs.python.org/3/tutorial/classes.html#class-objects) that you can use to add data attributes to an instance when you first create it. The syntax is a bit gnarly, but a simple demo should hopefully make its purpose clear.

In [None]:
class Number:
    
    def __init__(self, number):
        # Store the number in "value" when you create the instance
        self.value = number
        
    def add(self, other_number):
        # Add our original number (stored in self.value) to some other_number
        # and return the solution
        return self.value + other_number

**Important**: Note that above, our `add` method plucks its value from...well...the `value` attribute, which is stored when we create the instance. The `add` method then adds its own stored value to `other_number` and returns the solution.

Ok, let's create an instance of `Number`. Note that because our special `__init__` method requires a number argument, we must pass this argument when we create the instance.

In [None]:
num = Number(2) # The parens are reminiscent of a function call, right?
num.value

We can see that by passing `2` to the `Number` class when we create the instance, the number gets stored in the `value` attribute. Now let's do some addition.

In [None]:
num.add(3) # Add 3 to our number

Ok, now that you have a sense of `self` -- pun intended -- and the fact that methods can return values, you're ready for one last concept that will help you grok the "method chaining" syntax.


Let's say that we wanted a number class that (1) was able to update itself **_and_** (2) allowed you to perform multiple consecutive operations -- all without having to individually store and operate on the value in each step.

The first requirement is pretty straightforward. We could update the `add` method to simply store the new solution in the `value` attribute:


```python
    def add(self, other_number):
        # Replace the original value with the newly calculated value
        self.value = self.value + other_number
        # Return the updated value
        return self.value
```

Let's create a new class using this approach.

In [None]:
class FancyNumber:
    
    def __init__(self, number):
        self.value = number
        
    def add(self, other_number):
        # Replace the original value with the newly calculated value
        self.value = self.value + other_number
        # Return the updated value
        return self.value

In [None]:
num = FancyNumber(1)
num.add(1)

Let's confirm the underlying value of our number has changed from `1` to `2`.

In [None]:
num.value # should now be 2

We could continue updating the number by calling `num.add`:

In [None]:
num.add(1)
num.add(1)
num.value # should now be 4

But Pythonistas and coders in general are allergic to keystrokes and visual clutter. Wouldn't it be nice if we could simply reference `num` a single time, and then just call `add` repeatedly? Let's try it:

In [None]:
num = FancyNumber(0)
num.value

In [None]:
num.add(1).add(1)

Ruh roh! Python got angry at us!

Read the error message carefully. It states that the `int` object has no attribute called `add`. 

Now look back at our `FancyNumber` class. Notice the `add` method is returning an integer (i.e. the sum of the original value and some other number)?

We already know how classes can serve as containers for related _attributes_ in the form of data and methods. Unfortunately for us, Python's built-in integer data type does not have a method called `add`. Don't believe us? Check out [the docs](https://docs.python.org/3/library/stdtypes.html#numeric-types-int-float-complex).
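If you'd like to inspect the failure without a traceback, here is a minimal sketch that repeats the `FancyNumber` class from above, splits the chain in two and catches the error. The variable names `demo_num` and `first_result` are just for this example.

```python
class FancyNumber:

    def __init__(self, number):
        self.value = number

    def add(self, other_number):
        self.value = self.value + other_number
        return self.value  # a plain integer


demo_num = FancyNumber(0)
first_result = demo_num.add(1)

# The first link in the chain hands back an int, not a FancyNumber
print(type(first_result))
print(hasattr(first_result, 'add'))

try:
    # This is what the second .add(1) in the chain tries to do
    first_result.add(1)
except AttributeError as error:
    print(error)
```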

So how do we fix this situation? We clearly need to return something other than an integer in order to implement method chaining on our `FancyNumber` class.

Let's take stock of some key concepts:

1. Methods can return values
2. Python uses the `self` argument to reference specific instances of a class

So what if, instead of returning an integer, our `add` method simply returned `self` (i.e. the instance of the class)? Let's create one last version of the class and see if it works.



In [None]:
# A number class that supports method chaining
class FanciestNumber:
    
    def __init__(self, number):
        self.value = number
        
    def add(self, other_number):
        # Replace the original value with the newly calculated value
        self.value = self.value + other_number
        # Return the instance of the class (NOT its current value)
        return self

In [None]:
num = FanciestNumber(1)

In [None]:
num.add(1).add(1)
num.value # This should be 3

That did the trick!! We now have a number that updates itself and allows us to repeatedly call the `add` method.
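The same trick extends to any method that ends with `return self`. As a sketch, here is a hypothetical `ChainableNumber` class (not part of this tutorial's code) with an extra `multiply` method, so that a chain can mix operations.

```python
class ChainableNumber:

    def __init__(self, number):
        self.value = number

    def add(self, other_number):
        self.value = self.value + other_number
        # Hand back the instance so the chain can continue
        return self

    def multiply(self, other_number):
        self.value = self.value * other_number
        return self


result = ChainableNumber(1).add(2).multiply(3)

print(type(result))   # still a ChainableNumber, not an int
print(result.value)   # (1 + 2) * 3
```

The methods run left to right, so the order of the chain matters: `ChainableNumber(1).multiply(3).add(2)` would leave a different value.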

We should note you can use method chaining with different types of objects. For example, we could have done some basic method chaining using our very first implementation of `Number`:

In [None]:
num = Number(1)
num.add(1).bit_count()

Above, the `add` method is of course on our original `Number` class. But the `bit_count` method is a (not so frequently used) method on integers, available in Python 3.10 and later. The important point is that **when you encounter (or use) method chaining, it's critical that you remain aware of the return value at every point in the chain.**

If you're ever in doubt, you can rewrite the code to use individual steps. In fact, it can be helpful to apply this approach when first writing the code. Once you're confident the code works as expected, you can _then_ rewrite it into a more compact form using method chaining.

In [None]:
num = Number(2)
value = num.add(1)
value.bit_count()

# You can then rewrite the above as num.add(1).bit_count()

Folks who use Python libraries such as [pandas](https://pandas.pydata.org/docs/user_guide/index.html) -- e.g. its [DataFrame class](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html) -- use method chaining extensively as a way to help reduce clutter. 

Remember that gnarly one-liner?

```
dataframe.groupby("some_field").size().rename("new_name").reset_index()
```

You _could_ rewrite this snippet as a series of steps:

```
grouped = dataframe.groupby('some_field')
sized = grouped.size()
renamed = sized.rename("new_name")
df = renamed.reset_index()
df
```

But with method chaining, you can avoid having to create intermediate variables and just perform multiple operations on each version of the DataFrame instance (or whatever the return value is at a given point in the chain).

## Poking and prodding objects

Python provides a few tools to poke and prod your objects, which can be quite helpful when you're trying to unravel what's happening in a series of "chained" method calls.

In particular, the built-in [type function](https://docs.python.org/3/library/functions.html#type) will be a trusted friend.

Let's create a DataFrame in pandas to illustrate.

In [None]:
import pandas as pd
d = {
    'first': ['Joe', 'Jane', 'Jane'],
    'last': ['Smith', 'Smith', 'Doe']
}
df = pd.DataFrame(data=d)
df

Now let's say we wanted to count the frequency of first names.

In [None]:
df.groupby('first').size().rename('count_of_first_names').reset_index()

We can see the final result, but it might be hard to understand _why_ the above works. If we break the code up into separate steps and apply `type`, we can get a handle on how things work.

In [None]:
grouped = df.groupby('first')
type(grouped)

Ok, so we now know we have an instance of a class from the pandas library called `DataFrameGroupBy`.

At this point, we could further poke at this object using yet another built-in function called [dir](https://docs.python.org/3/library/functions.html#dir). This function is quite handy for listing the attributes (i.e. variables and methods) that are available in an object such as a class instance.

In [None]:
dir(grouped)

OUCH! Okay, so that is quite a long and likely confusing list. If you took time to look closely, you might notice the `groups` attribute. Let's try accessing it to see what it contains.

In [None]:
grouped.groups

Aha! We can see that our original data has now been grouped by the `first` name, and the data structure has stored references to the rows (or "index" in pandas lingo) where each name appears.

> NOTE: The `dir` function can be handy, but we also encourage you to first review the official documentation for a class or function once you've determined what it is using the `type` function. That's perhaps the more "normal" course of action.

Armed with these tools, we can rinse and repeat this process for each method call.

In [None]:
sized = grouped.size()
type(sized) # Now we have a pandas Series

In [None]:
renamed = sized.rename('count_of_first_names')
type(renamed) # still a Series...

In [None]:
new_df = renamed.reset_index()
type(new_df) # Now back to a DataFrame

You should now have a sense of how each step in the chain is working. 

And hopefully you appreciate that it's critical to know what data type you're operating on at each step in the chain, in order to know which methods or data attributes are available at a given step.

Deconstructing code in this way can help illuminate what these gnarly one-liners are actually doing.
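The same detective work applies to any chain, not just pandas. Here is a small sketch using only built-in types and made-up sample text, where the type changes partway through the chain.

```python
text = "  Jane,Joe,Jane  "

# The full chain: count how many times "jane" appears
count = text.strip().lower().split(",").count("jane")

# The same chain, one step at a time, checking the type as we go
step_one = text.strip()
step_two = step_one.lower()
step_three = step_two.split(",")

print(type(step_one))     # a string
print(type(step_two))     # still a string
print(type(step_three))   # now a list, so list methods such as count() apply
print(type(count))        # count() returns an integer

# dir() output is easier to scan if you skip names that start with an underscore
public_names = [name for name in dir(step_three) if not name.startswith("_")]
print(public_names)
```

Filtering out the underscore-prefixed names is only a convenience for reading. Those names are real attributes too, but they are mostly used by Python itself.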

As you gain comfort with various Python libraries and the language in general, we suspect you'll come to appreciate method chaining as a powerful technique that enables more compact and readable code.

But at the outset, it can be downright confusing. Hopefully you're now equipped with a few key concepts that can help you decipher this style of code when you encounter it in the wild.

## Why bother with classes and OO at all?

It's a fair question to ask why we need all the complexity that comes with classes and, more broadly, object-oriented programming. Can't we all just stop chaining methods and confusing people?

After all, it's perfectly possible to write valid and useful Python code without ever creating a class, much less chaining methods. But you'll notice that many libraries use classes, and a primary reason for their existence and widespread use is complexity. Specifically, once you gain some comfort with classes and OO, **you can use them to dramatically reduce the complexity of large code bases**.

As programs grow in size from a few lines in one script or Notebook to hundreds or thousands -- or hundreds of thousands -- of lines scattered across many modules, it can become extremely difficult to maintain and debug code. 

Classes provide a mechanism to model aspects of our code in sensible ways, so we can group together related bits of data and functionality (aka *methods*) and use them in larger programs.

For example, if you're building a system to gather and publish election night results, you might want to create classes for `Race` and `Candidate`. In such a system, you could store the votes each candidate received from a given precinct on separate instances of the `Candidate` class (one per candidate). Meanwhile, the `Race` class might have a `determine_winner` method that tallies the vote counts of each candidate and figures out the winner -- or if the race was a tie.
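To make that concrete, here is a rough sketch of what such classes could look like. Everything in it is hypothetical: the attribute names, the `add_votes` and `total_votes` methods, the candidate names, the precinct labels and the vote counts are all invented for illustration. A real results system would need far more than this.

```python
class Candidate:

    def __init__(self, name):
        self.name = name
        # Maps a precinct label to the votes received there
        self.votes_by_precinct = {}

    def add_votes(self, precinct, votes):
        self.votes_by_precinct[precinct] = votes
        # Returning self allows chained add_votes calls
        return self

    def total_votes(self):
        return sum(self.votes_by_precinct.values())


class Race:

    def __init__(self, office, candidates):
        self.office = office
        self.candidates = candidates

    def determine_winner(self):
        # Return the winning Candidate, or None if the top spot is tied
        top_total = max(candidate.total_votes() for candidate in self.candidates)
        leaders = [
            candidate for candidate in self.candidates
            if candidate.total_votes() == top_total
        ]
        if len(leaders) > 1:
            return None
        return leaders[0]


# Invented sample data
alice = Candidate("Alice").add_votes("Precinct 1", 120).add_votes("Precinct 2", 95)
bob = Candidate("Bob").add_votes("Precinct 1", 100).add_votes("Precinct 2", 110)

race = Race("Mayor", [alice, bob])
winner = race.determine_winner()
print(winner.name if winner else "It's a tie")
```

Each `Candidate` instance carries its own vote counts, while the `Race` instance only needs to ask each candidate for a total. That separation is the kind of organization classes make easy.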

Classes can be quite useful in such a system since they let you model real world entities and more easily associate useful data and functionality with each entity. Such an approach can dramatically improve your ability to make sense of complex systems. And of course, you can use this approach for more abstract domains such as reading and writing CSV files, interacting with an operating system, analyzing data, and so on. 

There are many more features of classes and OO in general -- we've barely scratched the surface here -- that make them useful and flexible tools for writing code, and we encourage you to learn more about them on your Python coding journey.

All that said, you may find classes to be overkill in your own daily work. But keep them in mind as a handy tool for managing complexity, especially as the number of lines of code increases.

And if nothing else, a basic understanding of classes and how they work will help you understand *how* to use dot-notation to access data and functionality in classes and modules.