# Day 1 - Introduction to the Structure and use of Python

---

---

## References

- [Official Python Wiki](https://wiki.python.org/moin/) : `wiki.python.org/moin/`
- [Official Python Tutorial](https://docs.python.org/3/tutorial/) : `docs.python.org/3/tutorial/`


- [Google's Python Course](https://developers.google.com/edu/python/) : `developers.google.com/edu/python/`
- [WikiBooks Python Programming](https://en.wikibooks.org/wiki/Python_Programming) : `wikibooks.org/wiki/Python_Programming`
- [Beginning Python, Advanced Python, and Python Exercises](http://www.davekuhlman.org/python_book_01.html) : `davekuhlman.org/python_book_01.html`


- [The Hitchhiker’s Guide to Python](http://python-guide-pt-br.readthedocs.io/en/latest/) : `python-guide-pt-br.readthedocs.io/en/latest/`

<a id='toc'></a>
## Table of Contents

* [Big Picture](#picture)
    * [What is Python?](#whatis)
    * [Features of Python](#features)
    * [Some Package Examples](#examples)  
        - [Arrays and plotting with `numpy` and `matplotlib`](#ex_plot)  
        - [Internet scraping with `requests` and `beautifulsoup`](#ex_internet)
    * [Python Ideology](#ideology)
    
    
* [Python Core](#core)
    * [Syntax](#syntax)
    * [Elements](#elements)
    * [Control-Flow Statements](#controlflow)
    * [Data Types](#types)
    * [Strings](#strings)

---

---

## Examples: python usage examples

....

<a id='picture'></a>  [Top](#toc)
## Big Picture

**Computer Program**: *set of instructions that tell a computer (processor) what actions to take.*

**Programming Language**: *framework for describing those instructions in a cohesive and systematic way.*

The language (specifically the **"interpreter"** or the **"compiler"** of the language) is responsible for converting the descriptions written in the language into explicit instructions that the "hardware" (computer processor) can understand.

e.g. `print(a + b)`

In [None]:
# a = 7
# b = 5
# print(a + b)

In [None]:
# print(hex(id(a)))
# print(hex(id(b)))
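
To get a rough feel for this translation step, the standard-library `dis` module can display the lower-level instructions that the interpreter generates for a line like `print(a + b)`. This is only a minimal sketch: the function name `add_and_print` is made up for illustration, and what is shown is bytecode for the Python virtual machine, not the processor's native machine code.

```python
import dis

def add_and_print(a, b):
    # the same statement as above, wrapped in a (hypothetical) function
    print(a + b)

# Print the bytecode instructions that the interpreter executes for this function
dis.dis(add_and_print)

# The instruction names can also be collected into a list
op_names = [instr.opname for instr in dis.get_instructions(add_and_print)]
print(op_names)
```

The exact instruction names differ between Python versions, so the printed listing may not match from one installation to another.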

Code Structure | Language Samples
- | -
<img src="figs/small.png" style="width: 500px;"/>  | ![Image](figs/codes.png)

<a id='whatis'></a>  [Top](#toc)

### What is Python?

- **`python` is written in `C`** (*specifically the standard implementation, "CPython"; python is thus a "higher" level language*)
    - *fewer lines of code.*
    
    
- Written for **convenience and complexity** instead of brute-force "power"
    - *lots of new, heavy-lifting code is still written in `C` or `Fortran`.*
    - *very complicated processes can be quickly written (or tested) in python.*
    
    
- **"Interpreted"** language, instead of "Compiled" (like `C`)
    - *recall hello-world in `C` vs. `python`.*
    
    
- **Open Source** (unlike other high-level languages like IDL, MatLab, Mathematica)
    - *Tremendous amounts of community development, contributions and public packages.*

<a id='features'></a>  [Top](#toc)

### Features / Nature of Python

- **"Dynamic Typing"**   

    ```fortran
    ! fortran
    CHARACTER(LEN=6) :: name = "apples"
    INTEGER :: name = 4
    ```
    
    ```python
    # python
    name = "apples"
    name = 4
    ```
    
    - In `python`, *Objects* have types, *variables* do not.



- Automatic **"Memory Management"** *(and protection)*  

    ```C
    // c
    int *arr = (int *) malloc(NUM*sizeof(int));  
    int dat = arr[NUM+20];  // out of bounds, but no error is raised
    ```
    
    ```python
    # python (method 1), e.g. with NUM = 3
    arr = [0 for ii in range(NUM)]
    dat = arr[NUM+2]
    > IndexError: list index out of range
    
    # python (method 2)
    arr = np.ndarray(NUM)
    dat = arr[NUM+2]
    > IndexError: index 5 is out of bounds for axis 0 with size 3
    ```

    - *NOTE: this can have downsides!*
    
    
- Strongly **object oriented** (but can be easily used for "functional programming")


- **Interactive** (e.g. this Notebook)


- Very **large standard library**


- **countless external libraries**, like:

Name | Description | Reference
---- | ----------- | ---------
`numpy` | Numerical Python, with powerful array objects | http://www.numpy.org/
`scipy` | Scientific Python, lots of numerical methods, databases, etc | https://www.scipy.org/
`astropy` | Astronomy Python, astronomy/astrophysics specific libraries | http://www.astropy.org/
`SQLAlchemy` | SQL Alchemy, SQL database interface package | http://www.sqlalchemy.org/
`cython` | Cython, python-C interface pseudo-language | http://cython.org/
`jython` | Jython, python implementation for the Java platform (JVM) | http://www.jython.org/
`matplotlib` | MatPlotLib, standard plotting package | http://matplotlib.org/
`requests` | Requests, HTTP library | https://pypi.python.org/pypi/requests
`django` | Django, web-app framework | https://www.djangoproject.com/
`pyglet` | Pyglet, multimedia & game library | https://bitbucket.org/pyglet/pyglet/wiki/Home
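
The dynamic-typing and memory-protection features listed above can be checked directly. The following is a small sketch using only built-in types; the value `NUM = 3` is an arbitrary choice for illustration.

```python
# Dynamic typing: the same name can refer to objects of different types
name = "apples"
print(name, type(name))
name = 4
print(name, type(name))

# Memory protection: out-of-range access raises an error instead of
# silently reading unrelated memory
NUM = 3   # arbitrary size chosen for this sketch
arr = [0 for ii in range(NUM)]
try:
    dat = arr[NUM + 2]
except IndexError as err:
    message = str(err)
    print("IndexError:", message)
```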

<a id='examples'></a> [Top](#toc)

## Package Examples using `ipython`

<a id='ex_plot'></a> [Top](#toc)

### Arrays and plotting with `numpy` and `matplotlib`

In [None]:
# Import the modules that we want
#    `matplotlib.pyplot` provides easy access to a bunch of cool methods and objects
import matplotlib.pyplot as plt
import numpy as np

# Choose the number of points we're going to create
#    I'm using all caps for this variable name to show that it behaves as a "constant"
NUM = 1000

# Create 'figure' and 'axes' objects for plotting
fig, ax = plt.subplots(figsize=[10, 5])
# Create `NUM` points evenly spaced between [0.0, 1.0]
x0 = np.linspace(0.0, 1.0, NUM)
# calculate y = e^(-2*x^2)
y0 = np.exp(-2 * (x0**2))
# Plot a line of y vs. x, specify the line-weight ("lw") and label
ax.plot(x0, y0, 'r-', lw=3.0, label='Line')

# Set the title of the axes, with a particular fontsize
ax.set_title('Banneker-Aztlan', fontsize=20)

# Now draw random x values between [0.0, 1.0] (instead of evenly spaced)
x1 = np.random.uniform(0.0, 1.0, NUM)
# Define y in the same way, but now add some random noise
y1 = np.exp(-x1*x1*2.0) + np.random.normal(0.0, 0.1, NUM)
# Create a scatter plot of y vs. x
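ax.scatter(x1, y1, alpha=0.5, label='Points')
# Add a legend using the labels given above
ax.legend()
# Save the figure to a file (before showing it, so it is not blank outside of notebooks)
fig.savefig('ba_test.pdf')
# Show the figure
plt.show()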

<a id='ex_internet'></a> [Top](#toc)

### Internet scraping with `requests` and `beautifulsoup`

In [None]:
import requests
from bs4 import BeautifulSoup

URL = "https://arxiv.org/abs/1702.02180"
session = requests.Session()
response = session.get(URL, timeout=30)
soup = BeautifulSoup(response.text, 'html5lib')
print(soup)

In [None]:
meta = soup.find_all("meta")

for mm in meta:
    # Not every <meta> tag has a 'name' attribute, so use `.get()` to avoid a KeyError
    name = mm.attrs.get('name')
    if name == 'citation_title':
        print(mm['content'] + "\n")
        
    if name == 'citation_author':
        print(mm['content'])

<a id="ideology"></a> [Top](#toc)

### Python Ideology (the Zen of Python)

In [None]:
# import this

- Beautiful is better than ugly
- Explicit is better than implicit
- Simple is better than complex
- Complex is better than complicated
- Readability counts

---

---

<a id="core"></a> [Top](#toc)

## Python Core

<a id="syntax"></a> [Top](#toc)

### Syntax

- **White Space**
    - **Indentation** determines the nesting of code.  Indents follow colons ('`:`') which set-off a new nesting level.
    
    ```python
    for ii in [0, 1, 2, 3]:
        print(ii)
    ```
    
    - **blank lines** have no effect.
    
    This is exactly the same as the above:
    ```python
    for ii in [0, 1, 2, 3]:

        print(ii)
    ```

    - **spaces** between different objects have no effect
    
    This is exactly the same as the above (*except ugly*):
    ```python
    for ii   in [0,   1, 2,3]:
        print(    ii    )
    ```
    
    - **line breaks** terminate each unit of code.  (Several statements can be put on one line by separating them with semi-colons ['`;`'], but don't.)
    
    ```python
    for ii in [0, 1, 2, 3]:
        print(ii, end='...'); print("... The number is ", ii)
    ```
    
        - But line breaks within parentheses or brackets don't matter (though it is good to follow proper indentation), e.g.
        
    ```python
    some_list = [0, 1, 2, 3,
                 4, 5, 6, 7]
    ```

- **Variable Names** (modified from: http://www.davekuhlman.org/python_book_01.html#names-and-tokens)

    - Allowed characters: a-z A-Z 0-9 underscore, and must begin with a letter or underscore.    
    - Names and identifiers are case sensitive.    
    - Identifiers can be of unlimited length.  

    - Naming conventions (Not rigid, but):  

        - Modules and packages: *all lower case.*
            (e.g. '`import numpy as np`')    
        - Globals and constants: *Upper case.*   
            (e.g. '`SPEED_LIGHT = 3e10`') 
        - Classes: *Bumpy caps with initial upper.*    
            (e.g. '`PrintFormatter`' or '`Print_Formatter`')
        - Methods and functions: *All lower case with words separated by underscores.*  
            (e.g. '`calculate_comoving_distance`')
        - Local variables: *Lower case (with underscore between words).*  
            (e.g. '`dist_com`')


- **Comments**
    - "Inline": start with a pound (`'#'`)
    - "Block": start and end with triple quotes (`'''`, or, `"""`)
    
    ```python
    # Define a function of `xx`, with an optional parameter `aa`
    def my_func(xx, aa=2.0):
        """This is a callable 'function', like the one we plotted before.
        
        Functions often (and should usually) have block comments directly below
        their definition which describe what they do, and what their inputs and
        outputs are.
        
        Arguments
        ---------
        xx : (N,) array of float
        aa : float
        
        Returns
        -------
        yy : (N,) array of float
        
        """
        # construct the argument to the exponent function
        arg = - aa * (xx**2)
        # Take 'e' to the power of the argument
        yy = np.exp(arg)
        # return the resulting values
        return yy
    
    ```
    
    - The above type of block comment, when it directly follows a function or class definition, is called a 'docstring'.  It can be accessed using the built-in '`help()`' function, or in `ipython` with a trailing '`?`'.  
    e.g. '`np.arange?`' or '`help(np.arange)`'
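
As a quick check of how docstrings are stored and retrieved, here is a minimal sketch. It uses a simplified, scalar-only version of `my_func` built on the standard-library `math` module rather than `numpy`; that substitution is an assumption made only to keep the example self-contained.

```python
import math

def my_func(xx, aa=2.0):
    """Return exp(-aa * xx**2) for a single float `xx`."""
    return math.exp(-aa * (xx**2))

# The docstring is stored on the function object itself
print(my_func.__doc__)

# `help()` prints the signature together with the docstring
help(my_func)
```

The trailing '`?`' form only works inside `ipython` or a notebook, which is why it is not used in this sketch.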

<a id='elements'></a> [Top](#toc)

### Language Elements

- **Literals**: generally strings and numbers, the things that go on the right-hand-side of an assignment, e.g.

    ```python
    "Hello"
    14
    13.6
    4.213e6
    ```


- **Objects** / **Classes** / **Types**: structures which contain/store information (like "literals", or "variables"), and provide certain functions for interacting with that information.

    - There are simple, builtin types like `integers`,
    ```python
    a = int(2)
    b = int('3')
    print(a, b)
    > 2 3
    ```
    
    - More complex types like `lists`,
    ```python
    a = [1, 5, 3, 4.5, 'apple']
    ```
    
    - And custom types like `classes`,
    ```python
    class MyContainer:
        box = "of apples"
        jar = 5
        
    cont = MyContainer()
    print(cont.box)
    > of apples
    print(cont.jar)
    > 5
    ```
    
    

- **Variables**: names that refer to an object.  The object a variable refers to may or may not change, and can be anything, for example a literal value or a function.

    ```python
    a = np.arange(5)
    print(a)
    > [0 1 2 3 4]

    my_arange = np.arange
    print(my_arange(5))
    > [0 1 2 3 4]
    ```
    
    
- **Functions**: are "callable" procedures which can have input and output parameters.

    ```python
    def my_func(xx, aa=2.0):
        # see the full definition (with docstring) above
        return np.exp(-aa * (xx**2))
    ```

- **Statements**

    - **Assignment**: set a variable to a value, e.g.

    ```python
    a = 3.15e7
    ```

    - **Import**: load a "module" from somewhere on the same system (and on the system path), e.g.

    ```python
    import numpy as np
    ```
    
    - **Control-flow**: statements like '`if`', '`elif`' & '`else`', or loops like '`for`' and '`while`', control the flow of code in response to particular conditions.  We'll come back to these in a minute.

    - **Delete**: "unbind" or free up a variable (memory handling is non-trivial, and mostly hidden from the user), e.g.
    ```python
    large_var = np.zeros((100, 100, 100, 100))
    # ...
    del large_var
    ```

    - **Other**: there are also additional statements which deal with particular situations, e.g.  
    `yield`, `pass`, `continue`, `break`, `assert`, `raise`...  

- **Operators**

    - Operators in python are **functions** which take as arguments the objects on each side of the operator, and return some value.  Operators are special because they have particular symbols to trigger/handle those functions.
    - These can be simple mathematical operators like, (addition) '`a = 2+2`', (exponentiation), '`b = 10**2`', etc
    - These can also be boolean/logical operators like,
    
    ```python
    a = True
    b = False
    
    print(a or b)
    > True
    c = a and b
    print(c)
    > False
    ```
    
    - Access operator '`[]`', (used on different types of "collections" like lists, arrays, dictionaries)
    
    ```python
    a = np.arange(5)
    print(a[0])
    > 0
    print(a[-1])
    > 4
    ```
    
    - Also function call: '`()`', dictionary display: '`{key: value}`', etc
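
The statement that operators are functions can be made concrete with the standard-library `operator` module and the special ("dunder") methods that the symbols map onto. This is a short sketch; the variable names are arbitrary.

```python
import operator

a, b = 2, 3

# `a + b` is handled by the `__add__` method of the left-hand object
print(a + b, a.__add__(b), operator.add(a, b))

# The access operator `[]` maps onto `__getitem__`
vals = [10, 20, 30]
print(vals[-1], vals.__getitem__(-1), operator.getitem(vals, -1))

# Exponentiation maps onto `__pow__`
print(10**2, operator.pow(10, 2))
```

Note that '`and`' and '`or`' are keywords that short-circuit, so unlike the symbols above they have no corresponding method.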

<a id='controlflow'></a> [Top](#toc)

### More on **Control-Flow Statements** (conditionals and loops)

Conditionals

In [None]:
# Let's choose a random number using `numpy`
# var = np.random.randint(0, 10)
# print("var = ", var)
# if var < 5:
#     print("Lower half: ", var)
# else:
#     print("Upper half: ", var)

Loops

In [None]:
# for loop over list

# for loop over array `np.random.randint`

In [None]:
# Get random integers between [0, 10] until we get 5 above 8.
# Count the total number of iterations
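
One possible way to fill in the loop exercises above is sketched below. It uses the standard-library `random` module instead of `np.random.randint` (a substitution made here to keep the example dependency-free), and the list contents and the seed are arbitrary choices.

```python
import random

random.seed(1234)  # fixed seed so the run is repeatable (an arbitrary choice)

# for-loop over a list
for fruit in ["apple", "banana", "cherry"]:
    print(fruit)

# while-loop: draw integers in [0, 10] until 5 of them are above 8,
# counting the total number of iterations
num_high = 0
num_iter = 0
while num_high < 5:
    val = random.randint(0, 10)   # inclusive on both ends
    num_iter += 1
    if val > 8:
        num_high += 1

print("Took", num_iter, "iterations to get", num_high, "values above 8")
```

Unlike `random.randint(0, 10)`, the call `np.random.randint(0, 10)` excludes the upper bound, so the `numpy` equivalent would be `np.random.randint(0, 11)`.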

The `for`-loop is the workhorse of the control-flow statements.  Frequently, we want to iterate over something in a `for`-loop, and also track the number of times we've looped.  For example:

In [None]:
num = 0
var = np.random.randint(0, 100, 5)
# ...

This can be done automatically using '`enumerate()`':

In [None]:
# ...
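
A sketch of both approaches, a manual counter and '`enumerate()`', is given below. The list `values` is a fixed stand-in for the random integers drawn in the cell above.

```python
values = [12, 47, 3, 88, 51]   # stand-in values (the cell above draws random ones)

# Manual counter: increment `num` by hand on every pass
num = 0
manual = []
for val in values:
    manual.append((num, val))
    num += 1

# The same with `enumerate()`, which yields (index, item) pairs
automatic = []
for ii, val in enumerate(values):
    print(ii, val)
    automatic.append((ii, val))
```

Both loops produce the same (index, value) pairs; the '`enumerate()`' version simply has less bookkeeping to get wrong.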

We also often want to loop over multiple iterables at the same time... use '`zip()`'

In [None]:
var_one = np.random.randint(-10, 0, 10)
var_two = np.random.randint(0, 10, 10)
# ...
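
A minimal sketch of looping over two iterables with '`zip()`' follows; the short lists are fixed stand-ins for the random arrays above.

```python
var_one = [-3, -7, -1]   # stand-in values
var_two = [4, 0, 9]

# `zip()` pairs up the first items, then the second items, and so on
sums = []
for one, two in zip(var_one, var_two):
    print(one, two, one + two)
    sums.append(one + two)

# If the iterables have different lengths, `zip()` stops at the shortest one
pairs = list(zip([1, 2, 3], "ab"))
print(pairs)
```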

<a id='types'></a> [Top](#toc)

### (Built-in) Data Types

### Numeric

- **Integers**

In [None]:
a = 3
print(a, type(a))
b = int('27')
print(b, type(b))

print(a+b, type(a+b))

- **Floats**

In [None]:
a = 3.14159
print(a, type(a))
b = float('2.71828')
print(b, type(b))

print(a+b, type(a+b))

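- Watch out for implicit conversion in python-3!  For example, dividing two integers with '`/`' returns a float in `python3.5` (`3 / 2` gives `1.5`), while in `python2.7` it returns a truncated integer (`3 / 2` gives `1`).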
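
The difference between python versions mentioned above shows up most visibly in division. The sketch below shows the python-3 behavior; the variable names are arbitrary.

```python
a = 7
b = 2

true_div = a / b     # "true" division: always returns a float in python3
floor_div = a // b   # floor division: returns an int for int operands
print(true_div, type(true_div))
print(floor_div, type(floor_div))

# Mixing an int and a float converts the result to a float
mixed = a + 0.5
print(mixed, type(mixed))
```

Using '`//`' whenever an integer result is intended makes the code behave the same way in both versions.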

<a id='strings'></a> [Top](#toc)

### Strings

- Collections of characters (any unicode characters allowed in `python3`)

In [None]:
a = "Hello"
b = "Banneker-Aztlan Institute"
c = a + " " + b
print(c, type(c))

In [None]:
d = int(42)
print("int: ", d, type(d))

d = str(d)
print("str: ", d, type(d))

### String Formatting

We're going to focus on "new-style" string formatting using the `str.format()` method.

In [None]:
yr = 31557600.0   # 31,557,600.0

str_yr = "{}".format(yr)
print(str_yr)

"Old style" (or c-style) string formatting uses the '`%`' character; but is very similar in the simple use cases.

In [None]:
str_yr_int = "%d" % yr
print(str_yr_int)

In [None]:
str_yr_flt = "%f" % yr
print(str_yr_flt)

In [None]:
hr = 3600
day = "monday"

str_hr = "{}".format(hr)
print(str_hr)

str_day = "{}".format(day)
print(str_day)

In [None]:
print("{:f}".format(np.pi))
print("{:.2f}".format(np.pi))

### Basic String Operations

In [None]:
test = "Complex is better than complicated"
# Find the length of a string (this also works with lists and other collections)
print("Length: ")
print("First 10 characters: ")
print("First half: ")
# Convert to a list

# Print each word with a for loop

# Make a new string replacing each space with a comma (do this two ways)
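
One possible solution to the exercise above is sketched below. It reads "convert to a list" as splitting into words; `list(test)` would instead give a list of single characters.

```python
test = "Complex is better than complicated"

# Length and slicing work the same way as for lists
length = len(test)
first_ten = test[:10]
first_half = test[:len(test) // 2]
print("Length: ", length)
print("First 10 characters: ", first_ten)
print("First half: ", first_half)

# Convert to a list of words by splitting on whitespace
words = test.split()

# Print each word with a for loop
for word in words:
    print(word)

# Replace each space with a comma, two ways
commas_one = test.replace(" ", ",")
commas_two = ",".join(words)
print(commas_one)
```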