# Introduction to Python, Numpy, Pandas, and Matplotlib


This lab walks through Python equivalents of the `Introduction to R` lab from [ISL](http://faculty.marshall.usc.edu/gareth-james/ISL/).

This lab **will not** be an exhaustive tutorial of Python, Numpy, Pandas, or Matplotlib -- I myself still have plenty to learn about each :)

### Basic Commands

To create a list of numbers, we can use the syntax `[1, 2, 3, ...]` and optionally save the list to a variable.

In [6]:
myList = [1,3,2,5]
myList

[1, 3, 2, 5]

* Note that in a Jupyter notebook, the value of the last expression in a code cell is displayed automatically.

You can use Python's built-in `help()` function for help, or consult the Python documentation if you need more guidance.

In [7]:
help(len)

Help on built-in function len in module builtins:

len(obj, /)
    Return the number of items in a container.



Where R supports element-by-element vector addition out of the box, Python's `+` operator on two lists concatenates them.

In [8]:
[1,3,5] + [2,4,6]

[1, 3, 5, 2, 4, 6]

If we instead want `[a,b,c] + [d,e,f]` to return `[a+d, b+e, c+f]`, we can use numpy arrays.

In [9]:
# import the numpy library, and alias it as np
import numpy as np

# instantiate a numpy array by wrapping a list in np.array()
first = np.array([1,3,5])
second = np.array([2,4,6])

firstPlusSecond = first + second
firstPlusSecond

array([ 3,  7, 11])
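For comparison, the same element-wise sum can be written without numpy. The following is a minimal sketch using only built-ins; the variable names are illustrative and not part of the original lab.

```python
# Element-wise addition of two plain lists using zip and a list comprehension
first_list = [1, 3, 5]
second_list = [2, 4, 6]

# zip pairs up items by position; each pair is then summed
elementwise_sum = [a + b for a, b in zip(first_list, second_list)]
print(elementwise_sum)  # [3, 7, 11]
```

Unlike numpy, `zip` silently stops at the end of the shorter list, so mismatched lengths do not raise an error.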

You can check the length of a python list using `len()`. For numpy arrays, you can use `len()` or, alternatively, the `.shape` attribute (it is an attribute rather than a method, so it takes no parentheses).

In [18]:
print(len([1,2,3]))
print(np.array([4,2,4,2,4]).shape) 

3
(5,)
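The difference between `len()` and `.shape` becomes visible on a two-dimensional array. Here is a small sketch; the array `grid` is made up for illustration.

```python
import numpy as np

# A 2-D array with 2 rows and 3 columns
grid = np.array([[1, 2, 3],
                 [4, 5, 6]])

print(len(grid))    # 2: length of the first dimension (rows)
print(grid.shape)   # (2, 3): size along every dimension
print(grid.size)    # 6: total number of elements
```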


Generally, numpy arrays need to be the same length to be added together.

In [19]:
np.array([1,3,5]) + np.array([2,4])

ValueError: operands could not be broadcast together with shapes (3,) (2,) 

This isn't the whole story: numpy will try to make sense of the operations you give it through a mechanism called broadcasting, which is what the error message above refers to.
But in general, you'll want to add together arrays with matching shapes.
If you want to see a result I found interesting, try uncommenting the last line of the following code cell.

In [17]:
first = np.array([1,1,1]) # sort of like the matrix row [1,1,1]
second = np.array([[2], [3], [4]]) # this is more like a column
print('first:')
print(first)
print(first.shape)

print('-----------')
print('second:')
print(second)
print(second.shape)
#first + second

first:
[1 1 1]
(3,)
-----------
second:
[[2]
 [3]
 [4]]
(3, 1)
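When shapes differ, numpy's broadcasting tries to stretch dimensions of size 1 so the operands match. The sketch below shows two cases: a scalar added to an array, and the row-plus-column sum from the cell above (so it reveals the result of the commented-out line). The names `row` and `column` are illustrative.

```python
import numpy as np

row = np.array([1, 1, 1])           # shape (3,)
column = np.array([[2], [3], [4]])  # shape (3, 1)

# A scalar is broadcast across every element
print(row + 10)     # [11 11 11]

# Shapes (3,) and (3, 1) broadcast to a (3, 3) result
total = row + column
print(total.shape)  # (3, 3)
print(total)
# [[3 3 3]
#  [4 4 4]
#  [5 5 5]]
```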


In R, the `ls()` function lists all the objects saved in a session.
Python has similar built-ins in `dir()`, `locals()`, and `globals()`.

However, these also list names defined by the interpreter and the notebook itself (such as `__builtins__`, `In`, and `Out`), not just user-defined ones.

Because this is a Jupyter notebook, we can use the `%who` magic to see only user-defined objects.

In [25]:
%who

first	 firstPlusSecond	 myList	 np	 second	 
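Outside of a notebook, where `%who` is unavailable, a rough approximation is to filter the dictionary returned by `globals()`. This is only a sketch: the helper `user_defined_names` and its `ignore` set are hypothetical, and the filtering rules are an assumption rather than a description of how `%who` works.

```python
def user_defined_names(namespace, ignore=("In", "Out", "exit", "quit", "get_ipython")):
    """Return sorted names from a namespace dict, skipping underscore-prefixed
    names and a small (assumed) set of names that IPython injects."""
    return sorted(
        name for name in namespace
        if not name.startswith("_") and name not in ignore
    )

# Example with an explicit dictionary standing in for globals()
example_namespace = {"__name__": "__main__", "_": None, "myList": [1, 3, 2, 5], "first": 1}
print(user_defined_names(example_namespace))  # ['first', 'myList']
```

In a script, you could call `user_defined_names(globals())` instead; the result would then also include the helper itself and any imported modules.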


I don't know if it's all that [pythonic](https://docs.python-guide.org/writing/style/) to "undeclare" your variables, but you can certainly do so using `del`.
I haven't found a built-in Python equivalent to R's removal of all user-defined objects, although in a notebook IPython's `%reset` magic clears the user namespace. (The mathematician in me is always happy to say "a solution exists!" and walk away.)

In [27]:
del myList
myList

NameError: name 'myList' is not defined
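`del` also accepts several names at once, and the resulting `NameError` can be caught to confirm that a name is gone. Here is a minimal sketch with throwaway variables (`temp_a` and `temp_b` are illustrative):

```python
temp_a = 10
temp_b = [1, 2, 3]

# Remove both names in a single statement
del temp_a, temp_b

# Accessing a deleted name raises NameError, which we can catch
try:
    temp_a
    removed = False
except NameError:
    removed = True

print(removed)  # True
```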