# Python Intro Day 4
## Classes, Methods, I/O

In this class we start doing the "heavy lifting" and explore where the real power comes from: classes.

Of course, we can't do much computing without getting data into and out of our program, so we will also delve into basic I/O (input/output).

### Classes

A class is a set of data and functions which together do a specific task. In a class, the functions are generally called methods, the idea being that they are methods which work on the class's internal data.

The reason we make classes is to have modularity. We can then pass our class as an argument, just like a variable, value or function.

The most basic class just has to use the `class` keyword.

    class NAMEOFCLASS():
        # method, data definitions

After that, a definition of a method or data must be given.

In [3]:
class MyClass():
    def mymethod(self):
        print("my method was called in my class!")

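`self` is not actually a keyword but the conventional name for the first parameter of every method in a class. It refers to the object the method was called on and is how we use the data inside that object. Python passes it in automatically. We will see why it is needed later.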

OK, so how do we use a class? First we have to *instantiate* it, meaning, make a variable of that class. It looks very similar to using a function with a return value, but don't be fooled! 

Let's make an object of the MyClass class, put it in a variable, then call its method.

In [4]:
myclass = MyClass()
myclass.mymethod()

my method was called in my class!
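
A small sketch may help show what `self` refers to. The `Greeter` class and its `describe` method are hypothetical names used only for illustration. The point is that calling a method on an object is shorthand for calling the function on the class and passing the object in as `self`.

```python
class Greeter():
    def describe(self):
        # self is the object the method was called on
        return "called on a " + type(self).__name__

greeter = Greeter()

# Both calls run the same code; in the first form Python fills in self for us.
via_object = greeter.describe()
via_class = Greeter.describe(greeter)

print(via_object)
print(via_class)
```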


What if we need to initialise variables before we use them in a class? Say, we need some names or special values already defined inside the class.

There is a special initialisation method called `__init__` that is called when you first create an object, so we put all of our special data declarations or additional initialisation code there.

Then, once we have the data defined, we can use `self` to retrieve it.

Let's redefine the MyClass class to initialise some special internal variables.

In [7]:
class MyClass():
    def __init__(self):
        self.myvariable = "Nathan"
    def mymethod(self):
        print("my method was called in my class! thanks", self.myvariable)
        
myclass = MyClass()
myclass.mymethod()

my method was called in my class! thanks Nathan
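
`__init__` can also take arguments of its own, so that each object starts with different data. The sketch below assumes a hypothetical `Person` class with a `name` argument; neither is part of the lecture code. It also hints at why `self` is needed: each object keeps its own copy of the data.

```python
class Person():
    def __init__(self, name):
        # store the argument on this particular object
        self.name = name

    def greeting(self):
        return "thanks " + self.name

# the value in the parentheses is handed to __init__ as name
first = Person("Nathan")
second = Person("Ada")

print(first.greeting())
print(second.greeting())
```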


So now we know that we can create objects of our own simple types. What about the classes and objects that we already know about? Are there methods we can already use on strings and other built-in types (instead of making everything ourselves)?

Thankfully, the answer is yes. There are many methods available to us already. We have already seen some of them when using `dir()` to tell us about the internals of an object. Let's try this with a string.

In [3]:
myname = "Nathan"
dir(myname)

['__add__',
 '__class__',
 '__contains__',
 '__delattr__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getitem__',
 '__getnewargs__',
 '__gt__',
 '__hash__',
 '__init__',
 '__iter__',
 '__le__',
 '__len__',
 '__lt__',
 '__mod__',
 '__mul__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__rmod__',
 '__rmul__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 'capitalize',
 'casefold',
 'center',
 'count',
 'encode',
 'endswith',
 'expandtabs',
 'find',
 'format',
 'format_map',
 'index',
 'isalnum',
 'isalpha',
 'isdecimal',
 'isdigit',
 'isidentifier',
 'islower',
 'isnumeric',
 'isprintable',
 'isspace',
 'istitle',
 'isupper',
 'join',
 'ljust',
 'lower',
 'lstrip',
 'maketrans',
 'partition',
 'replace',
 'rfind',
 'rindex',
 'rjust',
 'rpartition',
 'rsplit',
 'rstrip',
 'split',
 'splitlines',
 'startswith',
 'strip',
 'swapcase',
 'title',
 'translate',
 'upper',
 'zfill']

There are many interesting methods that we can call on a string, such as `find`, `split`, `replace`, `lower`, `capitalize`, and many bool-like methods that start with `is`, such as `isalpha`, `isdigit` and `islower`.

In [12]:
print(myname.capitalize())
print(myname.lower())
print(myname.replace("h",""))
print(myname.islower())
print(myname.isdigit())
print(myname.isalpha()) 

Nathan
nathan
Natan
False
False
True
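
Two of the methods mentioned above, `find` and `split`, are not shown in that cell. A short sketch of how they might be used, with made-up example strings:

```python
myname = "Nathan"

# find returns the index where the match starts, or -1 if there is no match
position = myname.find("th")
missing = myname.find("z")

# split breaks a string into a list, here on a comma
parts = "one,two,three".split(",")

print(position)
print(missing)
print(parts)
```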


Lists have useful methods too, just as you would expect, such as sorting!

In [5]:
new_list = [55,7,230,90]
# ascending order sort
new_list.sort()
print(new_list)
# descending order sort
new_list.sort(reverse=True)
print(new_list)

[7, 55, 90, 230]
[230, 90, 55, 7]
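
Note that `sort()` changes the list itself and returns `None`. If the original order should be kept, the built-in `sorted()` function (not covered above) returns a new sorted list instead. A minimal sketch, using a hypothetical `numbers` list:

```python
numbers = [55, 7, 230, 90]

# sorted() leaves numbers alone and gives back a new list
ordered_copy = sorted(numbers)
print(numbers)
print(ordered_copy)

# sort() reorders numbers in place and returns None
returned = numbers.sort()
print(returned)
print(numbers)
```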


But why can we do this? Isn't `myname` a string? It isn't a class we created, so why does it have methods associated with it?

Well, actually, strings and lists are classes too (`str` and `list`), and ALL classes in Python derive from the original `object` class. So every value has methods that can be used on it.

What do I mean by derived? I mean class inheritance!

#### Inheritance

Inheritance is a special feature that can be used with classes wherein classes can be derived from other classes and inherit their *methods and data*.

We can think of inheritance as an `is-a` relationship, where a derived class is a type of the parent class. A taxonomy is this kind of classification, and we can use one as an example.

The syntax is almost exactly the same for defining a class that is inherited from another:

    class CLASSNAME(PARENTCLASSNAME):
        # method, data definition(s)

Let's do a basic example with animals. A dog and a cat are both animals, but they make different sounds. Let's model how they speak with classes and inheritance.

In [19]:
# parent class that animals derive from
class Animal():
    def __init__(self):
        self.sound = ""
    def speak(self):
        print(self.sound)

class Dog(Animal): # dog is a type of animal
    def __init__(self):
        self.sound = "bark bark!" # redefine a dog's sound

class Cat(Animal): # cat is a type of animal
    def __init__(self):
        self.sound = "meow meow..." # redefine a cat's sound

# make a dog and cat object
dog = Dog()
cat = Cat()

# make them both speak
dog.speak()
cat.speak()

bark bark!
meow meow...


As you can see, the `Cat` and `Dog` classes do not define the `speak` method themselves; they inherit it from the parent, in this case the `Animal` class.
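
One thing to watch for: when a child class defines its own `__init__`, the parent's `__init__` is not run automatically. If the parent sets up data that the child also needs, the child can call it through `super()`. A minimal sketch, assuming hypothetical `Pet` and `Hamster` classes and a made-up `legs` attribute:

```python
class Pet():
    def __init__(self):
        self.sound = ""
        self.legs = 4  # hypothetical extra data set up by the parent

    def speak(self):
        return self.sound

class Hamster(Pet):  # hamster is a type of pet
    def __init__(self):
        super().__init__()      # run the parent's initialisation first
        self.sound = "squeak!"  # then redefine the sound

hamster = Hamster()
print(hamster.speak())
print(hamster.legs)
print(isinstance(hamster, Pet))
```

Without the `super().__init__()` line, `hamster.legs` would raise an `AttributeError`, because `legs` is only ever set inside `Pet.__init__`.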

Python also supports multiple inheritance: the parentheses in a class declaration may list more than one parent class name, separated by commas. In this class we will only ever use a single parent.

#### Objects

Sometimes I use the term objects to speak about class instantiations. This is because there is a parent of all classes in Python called `object`, from which all classes are derived. This means we can think of all data as types of object classes and use their defined methods just like any other class.

In [17]:
help(object)

Help on class object in module builtins:

class object
 |  The most base type
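
One way to check this for ourselves is with the built-in `isinstance()` and `issubclass()` functions. The sketch below uses a hypothetical `values` list holding a few different kinds of data:

```python
values = [42, "Nathan", [1, 2], print]

for value in values:
    # every value, even a function, is an instance of object
    print(type(value).__name__, isinstance(value, object))

all_objects = all(isinstance(value, object) for value in values)

# built-in classes such as str and list derive from object too
str_is_object = issubclass(str, object)
list_is_object = issubclass(list, object)
print(all_objects, str_is_object, list_is_object)
```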



## Input and Output (I/O)

Input and output are essential in any programming language. Without being able to read from or write to multiple sources, a program could not do much useful work.

Output is generally to a file or to the screen. Input is generally from a file or the keyboard. Therefore we will explore file and keyboard I/O.

### File I/O

When it comes to files in Python, the main built-in function is called `open()`.

Let's `open()` a file so that we can write something to it.

In [4]:
text = "I think this is really important, and should be written into a file."
filename = "myfile.txt"
f = open(filename, 'w') # open file for writing 'w'
f.write(text) # write the text to a file
f.close() # close the file

As you can see, we gave the `open()` function two arguments: one is the name of the file and the other is the mode, which says how to use that file. The mode is optional and defaults to reading.

This is important because sometimes we only want to read a file; we don't always want to write. Let's look at the `help()` for `open()`.

In [5]:
help(open)

Help on built-in function open in module io:

open(file, mode='r', buffering=-1, encoding=None, errors=None, newline=None, closefd=True, opener=None)
    Open file and return a stream.  Raise IOError upon failure.
    
    file is either a text or byte string giving the name (and the path
    if the file isn't in the current working directory) of the file to
    be opened or an integer file descriptor of the file to be
    wrapped. (If a file descriptor is given, it is closed when the
    returned I/O object is closed, unless closefd is set to False.)
    
    mode is an optional string that specifies the mode in which the file
    is opened. It defaults to 'r' which means open for reading in text
    mode.  Other common values are 'w' for writing (truncating the file if
    it already exists), 'x' for creating and writing to a new file, and
    'a' for appending (which on some Unix systems, means that all writes
    append to the end of the file regardless of the current seek position

Wow, that's a lot of information, but for now let's just pay attention to the part of the help that describes `mode`. It tells us how to open files in different ways: for reading, writing, appending, etc. The difference between writing and appending is that writing will destroy the existing contents before writing, while appending will just add the writes to the end of the file.
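
The difference between `'w'` and `'a'` can be seen in a short experiment. This sketch works in a temporary folder so that no real file is overwritten; the file name `modes_demo.txt` and the variable names are made up for illustration.

```python
import os
import shutil
import tempfile

folder = tempfile.mkdtemp()  # temporary folder, removed at the end
path = os.path.join(folder, "modes_demo.txt")

f = open(path, 'w')   # create the file and write to it
f.write("first")
f.close()

f = open(path, 'a')   # append: the existing text is kept
f.write(" second")
f.close()

f = open(path, 'r')
after_append = f.read()   # read() returns the whole file as one string
f.close()

f = open(path, 'w')   # write: the existing text is destroyed
f.write("third")
f.close()

f = open(path, 'r')
after_write = f.read()
f.close()

print(after_append)
print(after_write)

shutil.rmtree(folder)  # tidy up the temporary folder
```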

So now we know that we can also read by using 'r' to specify read. Let's read all the data we just wrote to the file, then print it out.

In [6]:
f = open(filename, 'r')
for line in f:
    print(line)
f.close()

I think this is really important, and should be written into a file.


Wow, that was really easy. As you can see, the file object is iterable once opened and can be used in a loop, giving us one line at a time.

We can also read all the lines in a file at once, and use them later if we want.
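
For example, the file method `readlines()` returns every line as a list of strings that is still available after the file is closed. A minimal sketch, assuming a throwaway file called `read_demo.txt` in a temporary folder (both hypothetical):

```python
import os
import shutil
import tempfile

folder = tempfile.mkdtemp()  # temporary folder, removed at the end
path = os.path.join(folder, "read_demo.txt")

f = open(path, 'w')
f.write("first line\nsecond line\n")  # \n marks the end of each line
f.close()

f = open(path, 'r')
all_lines = f.readlines()  # one string per line, end of line character included
f.close()

# the list can still be used after the file has been closed
print(len(all_lines))
print(all_lines[1])

shutil.rmtree(folder)
```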

One final thing to note is the special keyword `with`.
In Python, `with` binds a variable for the duration of a block and makes sure the resource behind it is *cleaned up* when the block ends. This means the call we are making to `close()` won't be necessary.

Using the keyword `with` helps readability and is the preferred way to do file specific tasks in Python.

The basic syntax is:

    with FUNCTION as VARIABLE:
        # use VARIABLE in this scope, it will be cleaned up later
        
Essentially, these two lines do the same thing:

1. `f = open(filename)`
2. `with open(filename) as f:`

The benefit is that the latter will make the additional call to `close()` for us when the block ends, even if an error occurs inside it.
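
We can check that the file really is closed for us by looking at the file object's `closed` attribute. A minimal sketch, again using a hypothetical file in a temporary folder:

```python
import os
import shutil
import tempfile

folder = tempfile.mkdtemp()  # temporary folder, removed at the end
path = os.path.join(folder, "with_demo.txt")

with open(path, 'w') as f:
    f.write("written inside the with block")
    closed_inside = f.closed  # the file is still open here

closed_after = f.closed  # the file has been closed, without calling close()

print(closed_inside)
print(closed_after)

shutil.rmtree(folder)
```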

So now let's use `with` to read the file again, but this time we will split each line into its words and print them.

In [23]:
with open(filename) as f:
    for line in f:
        words = line.split() # splits on whitespace
        print(words, type(words))

# after the loop, words still holds the list from the last line read
for word in words:
    print(word)

['Hello,', 'Here', 'is', 'a', 'line', 'for', 'our', 'file.'] <class 'list'>
['Here', 'is', 'another', 'one.'] <class 'list'>
['This', 'is', 'the', 'last', 'one.'] <class 'list'>
This
is
the
last
one.


##### A note on line delimiting, or end of line characters

One additional thing to point out is that text files use *line delimiters*: special characters we can't see, which start each subsequent piece of text on a new line to make it more readable. Otherwise we would have to *scroll right* forever to read an article. Therefore every text file uses a special end of line sequence of characters to tell the computer where a line ends.

There are two main special characters, newline (line feed) and carriage return, which we represent as `\n` and `\r` respectively.

* In Unix-like systems (Mac OS X and GNU/Linux), the end of line sequence is a newline, `\n`.
* In Windows systems, the end of line sequence is a carriage return followed by a newline, `\r\n`.

Although we are actually typing two characters, backslash and `n` or `r`, Python interprets each pair as a single special character.
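
A quick way to convince ourselves is to ask for the length of these strings. The variable names below are made up for illustration:

```python
newline = "\n"
windows_eol = "\r\n"

# each escape sequence counts as one character
newline_length = len(newline)
windows_length = len(windows_eol)
print(newline_length)
print(windows_length)

# repr() shows the escape sequences instead of acting on them
text = "one\ntwo"
print(repr(text))

# splitlines() breaks the text at the end of line characters
two_lines = text.splitlines()
print(two_lines)
```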

Let's write a few lines to a file, and see what we mean.

In [18]:
# define lines to write into a file
lines = ["Hello, Here is a line for our file.", "Here is another one.", "This is the last one."]
with open(filename,'w') as f: # clobber our file, overwrite
    f.writelines(lines)       # special function to writelines

Let's take a moment to look at our file. Does anything seem strange?

I would definitely say so! It seems that our file does not have line delimiters and all the text runs together, because `writelines()` writes the strings exactly as given. Let's add the delimiters in.

In [19]:
with open(filename,'w') as f: # clobber our file, overwrite
    for line in lines:
        f.write(line + "\n")

That is much better. Lastly, when reading a text file, we may want to remove the line delimiters, which can be done easily by a call to `strip()`. It removes leading and trailing whitespace, which includes tabs, spaces, newlines, and carriage returns `[ \t\r\n]`.
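
Because `strip()` removes whitespace from both ends, it also removes indentation. When only the end of line characters should go, `rstrip()` with an argument is one alternative. A small sketch with a made-up `raw_line`:

```python
raw_line = "  indented text\t\r\n"

# strip() removes whitespace from both ends of the string
stripped = raw_line.strip()

# rstrip("\r\n") removes only end of line characters from the right-hand end
right_only = raw_line.rstrip("\r\n")

print(repr(stripped))
print(repr(right_only))
```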

Let's read and print the words again.

In [22]:
with open(filename) as f:
    for line in f:
        words = line.strip().split()
        print(words, type(words))

['Hello,', 'Here', 'is', 'a', 'line', 'for', 'our', 'file.'] <class 'list'>
['Here', 'is', 'another', 'one.'] <class 'list'>
['This', 'is', 'the', 'last', 'one.'] <class 'list'>


## Lecture assignment

* create WordStats class
  * read filename
  * split lines into words
  * count words
  * print counts