# An introduction to Python

I'll go through a few important Python concepts before getting into statistical learning. I'll show you how to import libraries, along with some neat things you can do with strings, lists, dictionaries, and files.

1. importing libraries
2. manipulating strings
3. using lists
4. using dictionaries
5. opening files

## Importing libraries

To import a Python library/package means to include either Python code or some pre-compiled programs into your own Python script. The important thing to keep track of here is the *namespace*.

As an example of a namespace, suppose you've written a Python function called `fit()` to fit a model to some data. Suppose you also want to import a library that already has a function called `fit()`. How will you be able to call each function separately? That's the main concern when dealing with namespaces.
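Here is a minimal sketch of that situation. It uses the standard-library `statistics` module instead of a modelling library, and a hypothetical local function `mean()` stands in for `fit()`; both names were picked only for illustration.

```python
import statistics

# Hypothetical local function that shares its name with a library function
def mean(values):
    """Return the midpoint between the smallest and largest value."""
    return (min(values) + max(values)) / 2

data = [1.0, 2.0, 3.0, 10.0]

# The bare name refers to the function defined in this script
local_result = mean(data)

# The prefixed name refers to the function inside the statistics module
library_result = statistics.mean(data)

print(local_result)
print(library_result)
```

Because the library function is only reachable through the `statistics.` prefix, the two functions never collide.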

### Method 1: import

Here is a way of importing a library that keeps it completely separate.

In [1]:
import sklearn.linear_model

This imports all of scikit-learn's `linear_model` library and lets you use its models in your program by prefixing them with `sklearn.linear_model`.

In [2]:
lm = sklearn.linear_model.LinearRegression()
lm

LinearRegression(copy_X=True, fit_intercept=True, n_jobs=1, normalize=False)

### Method 2: import as

You can also import a library under a name of your choosing.

In [3]:
import sklearn.linear_model as hotdog

This will let you call the linear models library using only the name you chose, here `hotdog`.

In [4]:
lm = hotdog.LinearRegression()
lm

LinearRegression(copy_X=True, fit_intercept=True, n_jobs=1, normalize=False)

### Method 3: import selectively

You can also import only some parts of a library, to keep things clean.

In [5]:
from sklearn.linear_model import LinearRegression

Now the `LinearRegression` class will be available for you to use on its own.

In [6]:
lm = LinearRegression()
lm

LinearRegression(copy_X=True, fit_intercept=True, n_jobs=1, normalize=False)

### What you should do

Look at some code examples online and see how people usually import a library. It's better to use the common method so that people reading your code see familiar things. For example, the library `numpy` is almost always imported as

In [7]:
# The official way
import numpy as np

## Strings

In Python a [string](https://docs.python.org/3/library/stdtypes.html#text-sequence-type-str) is text. This will be the format used for things such as column names, dictionary keys (more later), and text data.

A very useful built-in function in Python is the [`print()`](https://docs.python.org/3/library/functions.html#print) function!

In [8]:
print("Hello world!")

Hello world!


Use the print function often to display data to your Jupyter session or your Python console.

Here are a few neat things you can do with strings.

In [9]:
print("Strings can be concatenated by " + "adding them together.")

print("A string can also be " + (5 * "multiplied "))

print("The str() function can turn numbers into a string: " + str(100))

Strings can be concatenated by adding them together.
A string can also be multiplied multiplied multiplied multiplied multiplied 
The str() function can turn numbers into a string: 100


Another useful set of string methods is [`split()`](https://docs.python.org/3/library/stdtypes.html#str.split), `join()`, and `replace()` (these last two are documented in [string methods](https://docs.python.org/3/library/stdtypes.html#string-methods)).

In [10]:
print("The split function will separate delimited substrings.".split(" "))

print(" ".join(["The", "join", "method", "does", "the", "opposite."]))

print("The replace method does a find and replace.".replace("e", "i"))

['The', 'split', 'function', 'will', 'separate', 'delimited', 'substrings.']
The join method does the opposite.
Thi riplaci mithod dois a find and riplaci.
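These methods combine naturally. As a small sketch, here is one way to turn a column name containing spaces into a single token; the column name `raw_name` is made up for the example.

```python
# Hypothetical column name with spaces in it
raw_name = "sepal length (cm)"

# split() breaks the string into a list of words
words = raw_name.split(" ")
print(words)

# join() glues the words back together with a different separator
snake = "_".join(words)
print(snake)

# For a simple one-to-one substitution, replace() gives the same result
print(raw_name.replace(" ", "_") == snake)
```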


The most important string method you will come across is [`.format()`](https://docs.python.org/3/library/string.html#formatstrings). If you find the Python documentation obtuse, there is also [this webpage](https://pyformat.info/) devoted to `.format()`.

In [11]:
# The first mode of .format() is positional
print("Johnny ate {0} cakes while Timmy ate {1}.".format(5, 15.1))

Johnny ate 5 cakes while Timmy ate 15.1.


The `.format()` method can also receive specific formatting options. For example, `{:5d}` will reserve 5 characters to print an integer. `{:<10}` will left-align a string within a 10-character space. Finally, `{:07.2f}` will reserve 7 characters to hold a float with 2 decimal places, padding it on the left with zeros.

In [12]:
# The second mode is more specific
print("Johnny ate {:5d} cakes while {:<10} ate {:07.2f}.".format(5, "Cthulhu", 151.2))

# A more realistic example
print("Epoch {:5d} complete, cost {:10.6f}, accuracy {:5.2f}%".format(11, 0.0231532, 97.231))

Johnny ate     5 cakes while Cthulhu    ate 0151.20.
Epoch    11 complete, cost   0.023153, accuracy 97.23%
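Since Python 3.6, the same format specifications can also be written inside f-strings, which some people find easier to read. The sketch below repeats the epoch example both ways; the variable names `epoch`, `cost`, and `accuracy` are only illustrative.

```python
epoch = 11
cost = 0.0231532
accuracy = 97.231

# The format specs written with .format()
with_format = "Epoch {:5d} complete, cost {:10.6f}, accuracy {:5.2f}%".format(epoch, cost, accuracy)

# The equivalent f-string puts the variable name before the colon
with_fstring = f"Epoch {epoch:5d} complete, cost {cost:10.6f}, accuracy {accuracy:5.2f}%"

print(with_fstring)

# Both approaches produce exactly the same text
print(with_format == with_fstring)
```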


## Using lists

Lists in Python are really just flexible data containers. You can store any kind of data in them, and you can manipulate them any way you want. Lists are also convenient to iterate over in for-loops.

You can find the documentation on lists [here](https://docs.python.org/3/tutorial/datastructures.html).

In [13]:
# A list is declared with square brackets
mylist = ["a", "b", "c", 1, 2, 3]
print(mylist)

# They can also be created by splitting strings
my_delimited_list = "a, b, c, 1, 2, 3".split(", ")
print(my_delimited_list)

['a', 'b', 'c', 1, 2, 3]
['a', 'b', 'c', '1', '2', '3']


You can also perform some operations on lists. Here are some useful ones.

In [14]:
mylist = ["a", "b", "c", 1, 2, 3]
my_delimited_list = "a, b, c, 1, 2, 3".split(", ")

# The append method adds a single item to the end of a list in place (it returns None, not the new list)
mylist.append("a")
print(mylist)
mylist.append((1, 2, 3))
print(mylist)
mylist.append(my_delimited_list)
print(mylist)

# You can also use the addition operator
print(my_delimited_list + ["apple", "orange", "banana"])

['a', 'b', 'c', 1, 2, 3, 'a']
['a', 'b', 'c', 1, 2, 3, 'a', (1, 2, 3)]
['a', 'b', 'c', 1, 2, 3, 'a', (1, 2, 3), ['a', 'b', 'c', '1', '2', '3']]
['a', 'b', 'c', '1', '2', '3', 'apple', 'orange', 'banana']
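Notice in the output above that appending a list nests it as a single element, whereas the addition operator merges the elements. The list method `extend()` also merges, but in place. A short sketch of the difference, using made-up lists `letters` and `numbers`:

```python
letters = ["a", "b", "c"]
numbers = [1, 2, 3]

# append() adds its argument as one element, so the result is nested
nested = list(letters)  # copy, so the original list stays untouched
nested.append(numbers)
print(nested)

# extend() adds each element of its argument one by one, in place
flat = list(letters)
flat.extend(numbers)
print(flat)

# The + operator gives the same flat result but builds a new list
print(letters + numbers == flat)
```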


A very nice thing about lists is that they're iterable. This means that they can be fed to a for-loop and processed one-by-one.

In [15]:
mylist = ["1", "2", "3", 1, 2, 3]

# A simple for loop
for i in mylist:
    print(i)

# Or play around a bit
for i in mylist:
    if isinstance(i, str):
        print("This is a string. {0}".format(i))
    else:
        print("This isn't a string. {0}".format(i))

1
2
3
1
2
3
This is a string. 1
This is a string. 2
This is a string. 3
This isn't a string. 1
This isn't a string. 2
This isn't a string. 3


A more advanced technique is called "list comprehension", which works like Python's [`map()`](https://docs.python.org/3/library/functions.html#map) function or the [`apply()`](http://stat.ethz.ch/R-manual/R-devel/library/base/html/apply.html) function in R.

In [16]:
# The list comprehension will apply a function on all the elements in mylist
mylist = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
my_other_list = [str(x) for x in mylist]
print(my_other_list)

# The list comprehension can also take conditions
my_shorter_list = [str(x) for x in mylist if x > 5]
print(my_shorter_list)

['1', '2', '3', '4', '5', '6', '7', '8', '9', '10']
['6', '7', '8', '9', '10']


A list comprehension can be useful in some situations, but a for-loop can do the same in a few more lines.
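To make the comparison concrete, here is a sketch of the conditional example written three ways: as a list comprehension, as a for-loop, and with the built-in `map()` and `filter()` functions. The variable names are only illustrative.

```python
mylist = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# List comprehension with a condition
from_comprehension = [str(x) for x in mylist if x > 5]

# The same result with a for-loop and append()
from_loop = []
for x in mylist:
    if x > 5:
        from_loop.append(str(x))

# And with the built-in map() and filter() functions
from_map = list(map(str, filter(lambda x: x > 5, mylist)))

print(from_loop)
print(from_comprehension == from_loop == from_map)
```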