<a href="https://colab.research.google.com/github/4dsolutions/clarusway_data_analysis/blob/main/python_warm_up/warmup_python_intro.ipynb"><img align="left" src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open in Colab" title="Open and Execute in Google Colaboratory"></a><br/>
[![nbviewer](https://raw.githubusercontent.com/jupyter/design/master/logos/Badges/nbviewer_badge.svg)](https://nbviewer.org/github/4dsolutions/clarusway_data_analysis/blob/main/python_warm_up/warmup_python_intro.ipynb)


# Python's Many Libraries and Namespaces

<a data-flickr-embed="true" href="https://www.flickr.com/photos/kirbyurner/52563704012/in/album-72177720296706479/" title="LMS Dashboard"><img src="https://live.staticflickr.com/65535/52563704012_71ef4beb8a_b.jpg" width="1024" height="354" alt="LMS Dashboard"></a><script async src="//embedr.flickr.com/assets/client-code.js" charset="utf-8"></script>

Python Warm-up Notebooks:

*  [Introduction to Python](warmup_python_intro.ipynb)  \(you are here)
*  [3rd Party Libraries](warmup_3rd_party_datascience.ipynb)
*  [Object Types](warmup_data_structures.ipynb)
*  [Object Oriented Paradigm](warmup_object_oriented.ipynb)
*  [Calling Callables and Type Checking](warmup_callables.ipynb)
*  [Class and Static Methods, Properties](warmup_object_oriented2.ipynb)
*  [SQLite3 and Context Managers](warmup_object_sql.ipynb)
*  [Iterators and Generators](warmup_generators.ipynb) 

Like most computer languages, Python comes with a library, where what we call "books" (in a library) are called "modules" and "packages".  

The library that comes with Python when you get it from [Python's home](https://python.org) is called [the Standard Library](https://docs.python.org/3/library/index.html).  Any complete Python distribution should include it.

Beyond the Standard Library are all the packages and modules you write yourself and/or install.  If they come from outside the Standard Library, we call them "3rd party".  

Python's 3rd party libraries have helped make Python very useful in many walks of life.  We will spend most of our time, as data analysts, using 3rd party packages, such as `numpy` and `pandas`.

## Levels of Python

Python may be usefully presented with five levels, or call them dimensions:

* Level 0: core syntax with keywords & punctuation, indentation (import, if...)
* Level 1: a large set of built-ins (e.g. print)
* Level 2: special names with the double underscores (e.g. `__add__`)
* Level 3: Standard Library (e.g. math)
* Level 4: 3rd Party Ecosystem (e.g. numpy, pandas, matplotlib)

We take a multi-level approach right from the beginning, spiralling back to add more details with each pass.

Let's start at Level 0 by using Python as a calculator.  

One of the classic tutorials, originally by Guido himself, Python's inventor, is called *Using Python as a Calculator*.  [Check it out!](https://docs.python.org/3.9/tutorial/introduction.html#using-python-as-a-calculator)

In [1]:
2 + 2

4

In [2]:
2 * 2

4

In [3]:
2 ** 2

4

We're also involving Level 2 although the special names often lie just below the surface.  Here's another way to write the same operations as above:

In [4]:
2 .__add__(2)  # + triggers __add__

4

In [5]:
2 .__mul__(2)  # * triggers __mul__

4

In [6]:
2 .__pow__(2)  # ** triggers __pow__

4

NOTE:  Every integer, such as 2, contains innate knowledge, which we may access using the "dot operator" (some call it "an accessor").  Ordinary operators such as +, * and ** actually map to, and trigger, corresponding special methods with funny looking `__rib__`-like names.
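
The same mapping works in the other direction: a type that defines these special methods gets the ordinary operators for free. Here is a minimal sketch, assuming a made-up class named `Pair` (hypothetical, for illustration only); the only parts that come from Python itself are the special method names.

```python
class Pair:
    """A hypothetical two-number container, for illustration only."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __add__(self, other):    # triggered by: pair + other
        return Pair(self.x + other.x, self.y + other.y)

    def __mul__(self, factor):   # triggered by: pair * factor
        return Pair(self.x * factor, self.y * factor)

    def __repr__(self):          # triggered whenever Python needs to display a Pair
        return f"Pair({self.x}, {self.y})"


p = Pair(1, 2) + Pair(3, 4)   # Python calls Pair.__add__ behind the scenes
q = Pair(1, 2) * 3            # Python calls Pair.__mul__
print(p, q)
```

Classes get a proper introduction in a later notebook, so treat this as a preview rather than something to master now.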

Try stuff yourself:

Python right out of the box knows a lot of math.  But it doesn't know trigonometry or how to take logarithms, things an ordinary scientific calculator would know how to do.  

In order to access these additional mathematical capabilities, you might import ```math``` from the Standard Library.  

Like this:

In [7]:
import math

Where is this math module I just imported?  We can check:

In [8]:
math.__file__

'/Users/kirbyurner/opt/anaconda3/lib/python3.9/lib-dynload/math.cpython-39-darwin.so'

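NOTE:  Your location will be different, and on Windows the extension for compiled modules is `.pyd` rather than `.so`.  On some builds (commonly on Windows) `math` is compiled into the interpreter itself, in which case it has no `__file__` attribute at all and the cell above raises an `AttributeError`.  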

The math module is compiled C code.  What we call CPython, which is the reference Python, the one most people use, is itself written in the C language.  

A lot of its modules and packages are written in Python (look for the `.py` and `.pyc` extensions), but some, like `math`, are written directly in C.
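
If you're curious how a given module is provided on your own machine, here is a small sketch that checks two things: whether the module is compiled into the interpreter itself (listed in `sys.builtin_module_names`) and whether it has a file on disk. The variable names `location` and `built_in` are just illustrative, and the results will differ from one platform to the next.

```python
import math
import string
import sys

for module in (math, string):
    # Modules compiled into the interpreter itself have no __file__ attribute,
    # so ask for it politely with a default of None
    location = getattr(module, "__file__", None)
    built_in = module.__name__ in sys.builtin_module_names
    print(module.__name__, "| built into interpreter:", built_in, "| file:", location)
```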

In [9]:
print(dir(math))

['__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__', 'acos', 'acosh', 'asin', 'asinh', 'atan', 'atan2', 'atanh', 'ceil', 'comb', 'copysign', 'cos', 'cosh', 'degrees', 'dist', 'e', 'erf', 'erfc', 'exp', 'expm1', 'fabs', 'factorial', 'floor', 'fmod', 'frexp', 'fsum', 'gamma', 'gcd', 'hypot', 'inf', 'isclose', 'isfinite', 'isinf', 'isnan', 'isqrt', 'lcm', 'ldexp', 'lgamma', 'log', 'log10', 'log1p', 'log2', 'modf', 'nan', 'nextafter', 'perm', 'pi', 'pow', 'prod', 'radians', 'remainder', 'sin', 'sinh', 'sqrt', 'tan', 'tanh', 'tau', 'trunc', 'ulp']


The above call to `print` is dumping out a list of all the names the `math` namespace contains.  

The idea of a namespace is not unique to Python.  Think of how, in everyday life, you come across many jargons, many shoptalks.  Think of namespaces as shoptalks. A shoptalk comes with every profession, every hobby, every sport.
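
Just as the same word can mean different things in different shoptalks, the same name can live in two namespaces and refer to two different things. A small sketch using `pow`, which exists both as a built-in and inside `math`:

```python
import math

a = pow(2, 3)        # the built-in pow: integers in, integer out
b = math.pow(2, 3)   # math's pow: always hands back a float
print(a, b)          # 8 8.0
print(pow is math.pow)   # False: same spelling, different objects
```

The `math.` prefix is what tells Python which shoptalk you're speaking.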

If you uncomment the code below (remove the hash mark), you will get a helpful summary of what the math module (in the Standard Library) contains.

In [10]:
# help(math)

In [11]:
math.log10(100)

2.0

Don't worry if you've forgotten trig.  Let practice with Python be a way to help you remember what you might have learned in high school.  Pretend you're in high school again, learning math.

In [12]:
round(math.cos(math.radians(90)))  # round is native

0
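
Why wrap the cosine in `round`? Because `math` works with floating point numbers, and 90 degrees converted to radians is only approximately π/2. The sketch below shows the raw value and one common way to compare floats, `math.isclose`; the tolerance `1e-12` is an arbitrary choice for illustration.

```python
import math

raw = math.cos(math.radians(90))
print(raw)                                     # a tiny float, not exactly 0
print(math.isclose(raw, 0.0, abs_tol=1e-12))   # True
print(math.sin(math.radians(30)))              # very close to 0.5
```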

In [13]:
math.pi  # a few important constants are built in

3.141592653589793

In [14]:
math.e

2.718281828459045

NOTE:  Python is also comfortable with complex numbers and has a Standard Library module `cmath` for working with them.  We don't need to bother with complex numbers much in data science, at least not in an introductory setting.  But it's good to know Python can work with them too.

In [15]:
z = 1 + 3j

In [16]:
z.real

1.0

In [17]:
z.imag

3.0

In [18]:
abs(z)  # length

3.1622776601683795

In [19]:
math.hypot(1, 3)  # same answer

3.1622776601683795

In [20]:
z.conjugate()     # notice how the dot is used a lot -- characteristic of object oriented languages

(1-3j)
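
The cells above only used attributes of the complex number itself. As a brief sketch of what the `cmath` module adds, here are a square root of a negative number and a round trip through polar coordinates (the names `r`, `phi` and `back` are just illustrative):

```python
import cmath

z = 1 + 3j
print(cmath.sqrt(-1))            # 1j, where math.sqrt(-1) would raise an error

r, phi = cmath.polar(z)          # length and angle (in radians)
back = cmath.rect(r, phi)        # convert back to rectangular form
print(cmath.isclose(back, z))    # True
```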

Let's import and look inside another namespace:

In [21]:
import string
print(dir(string))

['Formatter', 'Template', '_ChainMap', '__all__', '__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__', '_re', '_sentinel_dict', '_string', 'ascii_letters', 'ascii_lowercase', 'ascii_uppercase', 'capwords', 'digits', 'hexdigits', 'octdigits', 'printable', 'punctuation', 'whitespace']


In [22]:
string.__file__  # this one is written in Python, not C

'/Users/kirbyurner/opt/anaconda3/lib/python3.9/string.py'

Aha! These names look less "mathy" and more "stringy" i.e. they have to do with character strings.  Let's try some of these out...

In [23]:
string.ascii_letters

'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'

In [24]:
string.punctuation

'!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~'

In [25]:
string.digits

'0123456789'

Notice the single quote marks delimiting the characters in between.  You may use the single quote, double quote, triple single quote, or triple double quote.  Let's practice:

In [26]:
a = "assign this string to name a"  # notice double quotes
b = '...and this string to name b'  # notice single quotes
c = """triple quotes let
me include line breaks
in my string"""
d = '''either triple single,
or triple double'''

In [27]:
for the_string in a, b, c, d:  # <-- we call this a for loop
    print(the_string)

assign this string to name a
...and this string to name b
triple quotes let
me include line breaks
in my string
either triple single,
or triple double
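
Why so many quoting styles? Mostly convenience: the quote marks are not part of the string, so you can pick whichever style spares you from escaping. A short sketch (the variable names are made up for illustration):

```python
same = "data" == 'data'    # the quote style is not part of the string
apostrophe = "it's easy to embed an apostrophe in double quotes"
quoted = 'she said "hello" and left'
two_lines = """line one
line two"""

print(same)              # True
print(apostrophe)
print(quoted)
print(repr(two_lines))   # 'line one\nline two'
```

The last line uses `repr` to reveal the newline character that the triple quotes captured.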


NOTE:  There are other ways to import.  You might want to bring some of the names from math into the "top level" namespace, so that no prefix is necessary.  Instead of `math.sqrt` (for square root), you'll be able to write `sqrt` directly.

In [28]:
math.sqrt(100)

10.0

In [29]:
from math import sqrt

In [30]:
sqrt(49)

7.0
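
Two more variations on the import theme, sketched below: the `as` keyword lets you rename what you import, and a single `from` line may pull in several names. The aliases `m` and `root2` are arbitrary choices for illustration.

```python
import math as m                 # alias the whole module
from math import sqrt as root2   # alias a single name
from math import pi, degrees     # several names in one line

print(m.sqrt(100))     # 10.0
print(root2(49))       # 7.0
print(degrees(pi))     # 180.0
```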

Since taking the 2nd root is the same as raising to the 1/2 power, Python is actually able to do roots without importing anything.  But `sqrt` is a convenience.  Think "calculator keys" -- that's what `math` provides.  And that's just for starters.

In [31]:
pow(49, 1/2) # 2nd root

7.0

In [32]:
pow(27, 1/3)  # 3rd root

3.0
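
A word of caution: fractional powers go through floating point, and 1/3 has no exact binary representation, so not every root comes out as cleanly as the two above. A small sketch of what can happen and two ways to cope (rounding to 10 places is an arbitrary choice):

```python
import math

cube_root = pow(64, 1/3)
print(cube_root)                      # may land a hair away from 4
print(math.isclose(cube_root, 4.0))   # True
print(round(cube_root, 10))           # 4.0
print(math.isqrt(49))                 # 7, an exact integer square root
```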

## Summary

Python, when you simply boot into it, already provides a rich namespace.  We call these names the "built-ins" i.e. what Python knows right out of the starting gate.

For example, it would make no sense to ```import math``` if Python did not already know what it means to ```import```.  

Let's dump out everything Python "knows" (what names it recognizes) right when you start it up:

In [33]:
print(dir(__builtins__))



You could call this "a list of names you never need to import because Python already knows them" i.e. "they're built in".

You may be able to guess what a few of these do, as many are familiar words.  A lot of this will seem mysterious, and that's OK.  We do not learn Python in a day.  Some of these names are more for "internal use" by Python itself.

What's important to remember at this juncture is you're looking at another namespace.
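
The `__builtins__` name is an implementation detail of CPython. The documented way to reach the same namespace is to import the `builtins` module, as in this sketch, which also hides the underscore names that are mostly for internal use (the list name `names` is just illustrative):

```python
import builtins

# keep only the names that don't start with an underscore
names = [name for name in dir(builtins) if not name.startswith("_")]

print(len(names))   # the count varies by Python version
print("print" in names, "round" in names, "sqrt" in names)   # True True False
```

Notice `sqrt` is not in there, which is why we had to import it from `math`.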

Finally, let's look at what's even more core to Python (actually `import` is here):  its keywords.  The language is anchored by these keywords, none of which may be used as names for other purposes.  You may not know how to use all of these, and yet still be getting everything you need out of Python, for now.  

For example, the keywords `async` and `await` are recent additions and have to do with what we call asynchronous programming.  Many veteran Python programmers have yet to need this, whereas others use these keywords daily.

Whatever source code editor (VS Code, PyCharm, Spyder... Emacs, Vim) you're using to write Python source code (`.py` files) likely knows the difference between keywords and built-ins, and color codes accordingly.  

Take your time memorizing these, i.e. don't make it a top priority.  As long as you're comfortable accessing the documentation, you'll be increasingly productive without keeping it all in your head.  Learn what you need when you need it.

In [34]:
import keyword  # Python's keywords
print(keyword.kwlist)

['False', 'None', 'True', '__peg_parser__', 'and', 'as', 'assert', 'async', 'await', 'break', 'class', 'continue', 'def', 'del', 'elif', 'else', 'except', 'finally', 'for', 'from', 'global', 'if', 'import', 'in', 'is', 'lambda', 'nonlocal', 'not', 'or', 'pass', 'raise', 'return', 'try', 'while', 'with', 'yield']
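
Your editor is not the only one that can tell keywords from built-ins; you can ask Python directly. The sketch below uses `keyword.iskeyword` and then hands two tiny assignments to `compile`, which parses the text without running it. The dictionary name `results` and the file label `<example>` are made up for illustration.

```python
import keyword

print(keyword.iskeyword("import"))   # True
print(keyword.iskeyword("print"))    # False: print is a built-in, not a keyword

results = {}
for source in ("import = 1", "print = 1"):
    try:
        compile(source, "<example>", "exec")   # parse only, nothing is executed
        results[source] = "accepted"
    except SyntaxError:
        results[source] = "SyntaxError"

print(results)
```

A keyword cannot be reassigned, whereas a built-in name can be (though shadowing `print` is rarely a good idea).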


In [35]:
import sys  # but then let's check the version of Python we're using
print(sys.version)

3.9.15 (main, Nov  4 2022, 11:11:31) 
[Clang 12.0.0 ]


Later, we will see that "namespace" and "Python dictionary" go together.  In the meantime...

In [36]:
import this  # check out the last line

The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
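
As a small preview of how "namespace" and "Python dictionary" go together, here is a sketch using the built-in `vars`, which returns a module's namespace as a dictionary (the variable name `namespace` is just illustrative):

```python
import math

namespace = vars(math)                # the math namespace, as a dictionary
print(type(namespace).__name__)       # dict
print(namespace["pi"] is math.pi)     # True: one object, two ways to reach it
```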


In [37]:
# import antigravity  # uncomment, and do this for fun