## Attributes vs methods

or why it's sometimes `object.something` and sometimes `object.something()`

Let's see the error

In [2]:
import pandas as pd # import pandas - let's read some data as well.

ourData = pd.read_csv("./DTSPRcourse-AY-2023-24.git/data/gss_cat.csv")

Let's inspect the data, just so we are sure it's loaded:

In [3]:
ourData.describe()

| | year | age | tvhours |
|---|---|---|---|
| count | 21483.0 | 21407.0 | 11337.0 |
| mean | 2006.501978 | 47.180081 | 2.980771 |
| std | 4.451994 | 17.2875 | 2.587151 |
| min | 2000.0 | 18.0 | 0.0 |
| 25% | 2002.0 | 33.0 | 1.0 |
| 50% | 2006.0 | 46.0 | 2.0 |
| 75% | 2010.0 | 59.0 | 4.0 |
| max | 2014.0 | 89.0 | 24.0 |


OK - it gives me some details about some of the columns, but I feel like there are more - let's try another function

In [4]:
ourData.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 21483 entries, 0 to 21482
Data columns (total 8 columns):
 #   Column   Non-Null Count  Dtype  
---  ------   --------------  -----  
 0   year     21483 non-null  int64  
 1   marital  21483 non-null  object 
 2   age      21407 non-null  float64
 3   race     21483 non-null  object 
 4   rincome  21483 non-null  object 
 5   partyid  21483 non-null  object 
 6   relig    21483 non-null  object 
 7   tvhours  11337 non-null  float64
dtypes: float64(2), int64(1), object(5)
memory usage: 1.3+ MB


Ahh okay - there are more columns - they just weren't numeric, and by default `describe()` only summarizes the numeric ones.  

I wonder what the shape of my dataframe is - i.e. how many rows and columns:

In [5]:
ourData.shape()

TypeError: 'tuple' object is not callable

Oh oh - that's an error... but why??  It even tells me that a `tuple` (e.g. `(1, 2)`) is not callable.

what does that mean... hmm.. it sounds like it's trying to run a function. You call a function.. 
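The same message can be reproduced without pandas. A minimal sketch (the variable names `point` and `message` are made up for illustration): putting `()` after a plain tuple asks Python to call it like a function, which a tuple doesn't support.

```python
point = (1, 2)  # a plain tuple

try:
    point()  # the () asks Python to "call" the tuple like a function
except TypeError as error:
    message = str(error)  # keep the error text so we can look at it

print(message)
```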

Let's just try it without `()` and see what happens.

In [6]:
ourData.shape

(21483, 8)

Oh that gave me an outcome - and that matches (kinda) what I learned from `.info()` - see that the largest number in the non-null count is 21483 as well.  

But why is one of them working with `()` and the other one not? 

and what happens if I try `.info` without the parentheses? 

In [7]:
ourData.info

<bound method DataFrame.info of        year        marital   age   race         rincome             partyid  \
0      2000  Never married  26.0  White   $8000 to 9999        Ind,near rep   
1      2000       Divorced  48.0  White   $8000 to 9999  Not str republican   
2      2000        Widowed  67.0  White  Not applicable         Independent   
3      2000  Never married  39.0  White  Not applicable        Ind,near rep   
4      2000       Divorced  25.0  White  Not applicable    Not str democrat   
...     ...            ...   ...    ...             ...                 ...   
21478  2014        Widowed  89.0  White  Not applicable  Not str republican   
21479  2014       Divorced  56.0  White  $25000 or more         Independent   
21480  2014  Never married  24.0  White  $10000 - 14999        Ind,near dem   
21481  2014  Never married  27.0  White  $25000 or more    Not str democrat   
21482  2014        Widowed  71.0  White  $20000 - 24999        Ind,near rep   

                   

huh - that's strange.  It tells us that it's a bound method... and then spits out part of the data frame. Let's try to understand this a bit better. 
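One way to check which of the two you're dealing with is Python's built-in `callable()`. A small sketch using a complex number from plain Python instead of our data frame (the variable `z` is just an example): `z.real` is stored information, while `z.conjugate` is something you run.

```python
z = 3 + 4j  # a complex number, built into Python

print(callable(z.real))       # False: .real is a stored value, so no ()
print(callable(z.conjugate))  # True: .conjugate is something we can call

print(z.real)         # 3.0
print(z.conjugate())  # (3-4j)
```

With our data, `callable(ourData.shape)` gives `False` and `callable(ourData.info)` gives `True` in the same way.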


## Class - what gives us objects! 

A class is something that describes an object in the abstract, and each object belongs to a class. Here we think of a class called `Person`, so we can imagine that a person called Amari is an instance of `Person`.

In [8]:
class Person:
    # This happens every time we create an instance of a person
    def __init__(self, name, age, haircolor): 
        
        # these are attributes! 
        self.name = name # we take the name we get and store it under self.name
        self.age = age
        self.haircolor = haircolor
        # self tells us that they are stored as a part of the object - in our case a specific instance of person!
        
        
    # This is a method! 
    def greeting(self, helloTo):
        return(f"Hi {helloTo}! My name is {self.name}") # create a greeting - the person can say hi!


Okay, so this is a bit complex and we don't need to understand exactly what's going on! We are just looking under the hood for how Python deals with objects.  You can imagine when someone designed the `DataFrame` class in the `pandas` module they basically had to go through and do what we have done above for all the things a `DataFrame` can do! 

This is a little simplified, but in essence it's the same!

So what did we do? We created a `class` of objects called `Person`.  We can see that when we create a `Person` it requires `name`, `age`, and `haircolor`, much like passing arguments when we call a function.  These values are then stored under `self`. 

An `attribute` is a variable that is stored along with our object. Often we use it to describe the object in some way, just like we can imagine that `name`, `age`, and `haircolor` are attributes of (things that describe) a person.
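To peek at the attributes stored on one particular object, Python's built-in `vars()` returns them as a dictionary. A small sketch, using a cut-down copy of the `Person` class so it runs on its own (the instance name `amari` and its values are made up):

```python
class Person:
    def __init__(self, name, age, haircolor):
        self.name = name
        self.age = age
        self.haircolor = haircolor

amari = Person(name="Amari", age=28, haircolor="black")

# vars() shows every attribute stored on this specific instance
print(vars(amari))
# {'name': 'Amari', 'age': 28, 'haircolor': 'black'}
```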

But what is this `self`?  This is just Python's way of referring to the specific instance of an object. You can imagine that I create one `Person`: Esmeralda, and another `Person`: Amari.  But when I write the class I don't know the names yet, so I need some way of referring to any instance of a `class`, and that's what `self` is.  We'll see more details below.

The second thing I did was create a `method` called `greeting`. A `method` is really just a normal function, but one that belongs to a specific object.  In general, when we call a function, it just exists, and doesn't require us to have an object instance defined first.  Methods, on the other hand, are linked to an object.
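A quick sketch of the difference using a plain string (the variable `word` is just for illustration): `len` is a function that exists on its own, while `upper` is a method we reach through the object.

```python
word = "hello"

print(len(word))     # 5: a function, we hand the object to it
print(word.upper())  # HELLO: a method, we reach it through the object
```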

This `greeting` takes `self` (Python's way of knowing it's relating to the object itself) and `helloTo` (who are we greeting). This `method` returns a greeting. It says hello to the name we put into the `method` and introduces the `Person` with their name.  

In [9]:
# Let's try

esmeralda = Person(name = "Esmeralda", age = "31", haircolor = "brown")


Great - we now have an instance of a person! 

In [10]:
# let's see what happens if I try to get the attribute name or age:

print(esmeralda.name)
print(esmeralda.age)

Esmeralda
31


Cool, let's see what happens if we try to call it (i.e. use `()`):

In [11]:
print(esmeralda.name())

TypeError: 'str' object is not callable

That's the same error we got from calling `ourData.shape()`, except that the type is different: a `str` this time instead of a `tuple`! 

Let's try the method now:

In [12]:
esmeralda.greeting("Amari")

'Hi Amari! My name is Esmeralda'

Cool, it used the name we gave it and returned a greeting!

Now what happens if we try: `esmeralda.greeting`

In [13]:
esmeralda.greeting

<bound method Person.greeting of <__main__.Person object at 0x7f50bd1453d0>>

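It tells us that it's a bound method, just like when we looked at `ourData.info`.  The reason `ourData.info` printed so much more is that a bound method is displayed together with the object it belongs to: a `DataFrame` displays itself as a table of its data, while our `Person` only has Python's default display (`<__main__.Person object at ...>`).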
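When Python displays a bound method, it includes the display text of the object the method belongs to. A sketch with a hypothetical class `Pet` (not part of the example above) that defines its own display text through `__repr__`:

```python
class Pet:
    def __init__(self, name):
        self.name = name

    def __repr__(self):
        # controls how a Pet is displayed
        return f"Pet({self.name!r})"

    def speak(self):
        return f"{self.name} says hi"

rex = Pet("Rex")
print(rex.speak)  # <bound method Pet.speak of Pet('Rex')>
```

Without the `__repr__` method, the last part would fall back to the default `<__main__.Pet object at ...>` form.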

Okay so that's the difference between `attributes` and `methods`.  An `attribute` describes something about the object, or stores some information. A `method` is sorta like a function that is owned by the object and allows us to manipulate it, or do something new.  The `methods` might be used to change some of the attributes of the object. 

Let's create a new `method` that allows us to color the hair of the person:

In [14]:
class Person:
    # This happens every time we create an instance of a person
    def __init__(self, name, age, haircolor): 
        
        # these are attributes! 
        self.name = name # we take the name we get and store it under self.name
        self.age = age
        self.haircolor = haircolor
        # self tells us that they are stored as a part of the object - in our case a specific instance of person!
        
        
    # This is a method! 
    def greeting(self, helloTo):
        return(f"Hi {helloTo}! My name is {self.name}") # create a greeting - the person can say hi!
    
    def colorHair(self, newHairColor):
        self.haircolor = newHairColor # This method changes an attribute but doesn't have an output

In [15]:
# Let's try coloring the hair

esmeralda = Person(name = "Esmeralda", age = "31", haircolor = "brown")

# check the haircolor
print(esmeralda.haircolor)

brown


In [16]:
esmeralda.colorHair("purple") # notice no output

In [17]:
print(esmeralda.haircolor)

purple


We did it! 
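Because `colorHair` has no `return`, calling it hands back Python's `None`, which is why the cell showed no output. A sketch with a cut-down `Person` (only the hair-related parts; the variable `result` is just for illustration):

```python
class Person:
    def __init__(self, name, haircolor):
        self.name = name
        self.haircolor = haircolor

    def colorHair(self, newHairColor):
        self.haircolor = newHairColor  # changes the attribute, returns nothing

esmeralda = Person(name="Esmeralda", haircolor="brown")

result = esmeralda.colorHair("purple")
print(result)               # None
print(esmeralda.haircolor)  # purple
```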

So we can do this another way as well, and this sorta relates to the idea of `self`.

In our case running `esmeralda.colorHair("purple")` has the same effect as running `esmeralda.haircolor = "purple"`. Let's use this to turn it back to brown.

In [18]:
esmeralda.haircolor = "brown"
print(esmeralda.haircolor)

brown


This is because `self` is practically substituted with the name of the object. 

so in:

In [None]:
def colorHair(self, newHairColor):
    self.haircolor = newHairColor 

we imagine that we call the `colorHair` method on the object esmeralda, and in fact we can do that.

In [19]:
print(esmeralda.haircolor) # currently brown

# now let's explicitly call the colorHair method on esmeralda

Person.colorHair(esmeralda, "Purple")
print(esmeralda.haircolor)

brown
Purple


We use the `Person.colorHair` to show that we are using the `method` belonging to the class `Person`
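A sketch checking that the two spellings do the same thing, again with a cut-down `Person` so it runs on its own (the names `viaObject` and `viaClass` are made up). `__self__` is the attribute Python puts on a bound method to remember which object it is tied to.

```python
class Person:
    def __init__(self, name):
        self.name = name

    def greeting(self, helloTo):
        return f"Hi {helloTo}! My name is {self.name}"

esmeralda = Person(name="Esmeralda")

viaObject = esmeralda.greeting("Amari")         # Python fills in self for us
viaClass = Person.greeting(esmeralda, "Amari")  # we pass self ourselves

print(viaObject == viaClass)                     # True
print(esmeralda.greeting.__self__ is esmeralda)  # True
```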

## This might lead to some confusion with naming conventions! 

Because Python uses the period to reach attributes and methods, we can't name things with a period in them:

In [20]:
a.number = 1

NameError: name 'a' is not defined

But we can actually do that in R - and some people want that to be a common naming convention! So be aware: if you see `a.number` in R, this might just be someone storing a number! 
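In Python the usual way to separate words in a name is an underscore. The period only works when the thing before it is an object that can hold attributes. A sketch, using `SimpleNamespace` from the standard library as a ready-made empty object (the names `a_number` and `a` are just examples):

```python
from types import SimpleNamespace

a_number = 1  # underscore: a plain variable name

a = SimpleNamespace()  # an empty object that accepts new attributes
a.number = 1           # works now, because a exists: this sets an attribute on a

print(a_number)  # 1
print(a.number)  # 1
```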