In [1]:
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"

Essential Python Libraries:
- NumPy: [NumPy](https://numpy.org/), short for *Numerical Python*, has long been a cornerstone of *numerical computing* in Python.
- pandas: The name [pandas](https://pandas.pydata.org/) is derived from *panel data*. It provides *high-level data structures and functions* designed to make working with *structured or tabular data* intuitive and flexible.
- matplotlib: [matplotlib](https://matplotlib.org/) is the most popular Python library for producing *plots and other two-dimensional data visualizations*. It is designed for creating plots suitable for *publication*.
- IPython and Jupyter: The [IPython project](https://ipython.org/) aims to provide a better interactive Python interpreter. IPython is designed for both *interactive computing and software development work*. The [Jupyter project](https://jupyter.org/) is a broader initiative to design language-agnostic *interactive computing tools*. The IPython web notebook became the Jupyter notebook, with support for over *40* programming languages.
- SciPy: [SciPy](https://scipy.org/) is a collection of packages addressing a number of foundational problems in *scientific computing*.
- scikit-learn: [scikit-learn](https://scikit-learn.org/) has become the premier general-purpose *machine learning toolkit* for Python programmers.
- statsmodels: [statsmodels](https://statsmodels.org/) is a *statistical analysis* package.
- Other packages: TensorFlow and PyTorch have become popular for *machine learning* and AI work.

## IPython Basics

In [2]:
print('Hello, world!')

Hello, world!


Nearly all of the commands and tools in this chapter can be used in the IPython shell.

In [3]:
%pwd

'/Users/wenyunxin/Documents/github/computer-science-practicing/math'

In [4]:
path = '/Users/wenyunxin/documents/github/computer-science-practicing/first 20 hours.txt'
with open(path, 'r', encoding='utf-8') as f:
    for line in f:    # iterating consumes the file object; it stays open but ends at end-of-file
        print(line)   # each line keeps its trailing newline, so the output is double-spaced
    lines = f.readlines()    # returns an empty list because nothing is left to read

lines

Transcriber: Gustavo Rocha

Reviewer: Marssi Draw



Hi everyone.



Two year ago, my life changed forever.



My wife Kelsey and I



welcomed our daughter Lela

into the world.



Now, becoming a parent

is an amazing experience.



Your whole world changes over night.



And all of your priorities

change immediately.



So fast that it makes it really difficult

to process sometimes.



Now, you also have to learn

a tremendous amount about being a parent



like, for example,

how to dress your child.



(Laughter)



This was new to me.



This is an actual outfit, 

I thought this was a good idea.



And even Lela knows

that it's not a good idea. (Laughter)



So there is so much to learn and

so much craziness all at once.



And to add to the craziness, 

Kelsey and I both work from home,



we're entrepreneurs,

we run our own businesses.



So, Kelsey develops courses

online for yoga teachers.



I'm an author.



And so, I'm working from home,

Kelsey's working from home.

[]
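The empty list above comes from reading a file object that has already been consumed. The following is a minimal sketch of that behavior and of one way around it, rewinding with `f.seek(0)`. The temporary file and the names `sample_path`, `printed`, `exhausted`, and `reread` are assumptions made only to keep the example self-contained.

```python
import os
import tempfile

# Hypothetical sample file, created only so the sketch is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False,
                                 encoding="utf-8") as tmp:
    tmp.write("first line\nsecond line\n")
    sample_path = tmp.name

with open(sample_path, "r", encoding="utf-8") as f:
    # Iterating over the file object consumes it up to end-of-file.
    printed = [line.rstrip("\n") for line in f]
    # The file is still open, but nothing is left to read.
    exhausted = f.readlines()
    # Rewind to the beginning to read the contents again.
    f.seek(0)
    reread = f.readlines()

os.remove(sample_path)

print(printed)
print(exhausted)
print(reread)
```

If the lines are needed more than once, it is usually simpler to read them into a list a single time and reuse that list.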

In [5]:
# tab completion
an_apple = 27
an_example = 42
an_apple

b = [1, 2, 3]
b.reverse()
b

b?

27

[3, 2, 1]

Type:        list
String form: [3, 2, 1]
Length:      3
Docstring:
Built-in mutable sequence.

If no argument is given, the constructor creates a new empty list.
The argument must be an iterable if specified.

In [6]:
# introspection
print?

Signature: print(*args, sep=' ', end='\n', file=None, flush=False)
Docstring:
Prints the values to a stream, or to sys.stdout by default.

sep
  string inserted between values, default a space.
end
  string appended after the last value, default a newline.
file
  a file-like object (stream); defaults to the current sys.stdout.
flush
  whether to forcibly flush the stream.
Type:      builtin_function_or_method

In [7]:
# ? combined with the * wildcard lists all names matching the pattern
import numpy as np
np.*load*?

np.__loader__
np.load
np.loadtxt
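The `?` syntax works only in IPython and Jupyter. As a rough approximation in plain Python, the standard library's `inspect` and `fnmatch` modules can retrieve similar information. The function `add_numbers` below is a hypothetical example, and `json` stands in for NumPy so that the sketch needs no third-party packages.

```python
import fnmatch
import inspect
import json


def add_numbers(a, b=1):
    """Return the sum of a and b."""
    return a + b


# Roughly what `add_numbers?` displays: the signature and the docstring.
signature = str(inspect.signature(add_numbers))
docstring = inspect.getdoc(add_numbers)

# Roughly what `json.*load*?` displays: attribute names matching a wildcard.
matches = fnmatch.filter(dir(json), "*load*")

print(signature)
print(docstring)
print(matches)
```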

## Python Language Basics

Using *four spaces* as the default indentation is recommended in Python.

Every number, string, data structure, function, class, module, and so on that exists in the Python interpreter is an *object*. Each object has an associated *type* (e.g. integer, string, or function) and internal data.

Almost every object in Python has attached functions, known as *methods*, that have access to the object's internal contents. They are usually called with the syntax:

```Python
obj.some_method(x, y, z)
```

Understanding *the semantics of references* in Python, and when, how, and why data is copied, is especially critical when you are working with large datasets.

In [12]:
a = [1, 2, 3]
id(a)
b = a
id(b)
id(b) == id(a)
b is a
b == a

c = 3
d = 3.0
id(c)
id(d)
type(c)
type(d)
id(c) == id(d)
c is d    # False: they reference different objects, i.e. different memory addresses
c == d    # True: the values of c and d are equal. When an integer is compared with a float,
          # the two are compared by numeric value, so 3 and 3.0 are equal

4717162560

4717162560

True

True

True

4439211160

4686802672

int

float

False

False

True

Assignment is also referred to as *binding*, as we are binding a name to an object. *Variable names* that have been assigned may occasionally be referred to as *bound variables*.

Python is a *strongly typed* language, which means that every object has a specific type (or *class*), and implicit conversions occur only in certain permitted circumstances, such as:
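Because assignment binds a name rather than copying data, a change made through one name is visible through every other name bound to the same object. The sketch below contrasts an alias with a shallow and a deep copy using the standard `copy` module; the variable names are chosen for illustration only.

```python
import copy

original = [1, 2, [3, 4]]
alias = original                   # a second name for the same list object
shallow = copy.copy(original)      # new outer list, shared inner list
deep = copy.deepcopy(original)     # fully independent copy

original.append(5)                 # changes the outer list
original[2].append(99)             # changes the shared inner list

print(alias)      # [1, 2, [3, 4, 99], 5]
print(shallow)    # [1, 2, [3, 4, 99]]
print(deep)       # [1, 2, [3, 4]]
```

With large datasets, the alias costs nothing, while `deepcopy` duplicates everything, so it helps to know which one an operation performs.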

In [14]:
# 3 will be implicitly converted to a float for the addition
3 + 3.0

6.0
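For contrast, here is a small sketch of a conversion that Python does not perform implicitly: adding a string to an integer raises `TypeError`, and the conversion has to be written out explicitly. The variable names are illustrative.

```python
# Mixing str and int is not a permitted implicit conversion.
try:
    result = "5" + 5
    raised = False
except TypeError as exc:
    raised = True
    print("TypeError:", exc)

# An explicit conversion makes the intent unambiguous.
explicit_sum = int("5") + 5
explicit_text = "5" + str(5)

# int and float do mix: the result is a float.
mixed = 3 + 3.0

print(explicit_sum, explicit_text, mixed)
```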

`isinstance` can accept *a tuple of types* if you want to check that an object's type is among those present in the tuple.

In [23]:
a = 3
b = 4.5

# Both return True: isinstance succeeds if the object matches any type in the tuple
isinstance(a, (int, float))
isinstance(b, (int, float))

issubclass(int, float)
issubclass(float, int)
issubclass(bool, int)

True

True

False

False

True
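The results above show that `int` and `float` are unrelated classes, while `bool` is a subclass of `int`. One consequence is that `isinstance(True, (int, float))` is also `True`. The helper `is_real_number` below is a hypothetical sketch of how a tuple check might exclude booleans when that matters.

```python
def is_real_number(value):
    """Hypothetical helper: True for int or float values, but not for bool."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)


# A tuple check behaves like an "or" over the listed types.
tuple_check = isinstance(4.5, (int, float))
or_check = isinstance(4.5, int) or isinstance(4.5, float)

# bool inherits from int, so True passes a plain int check.
bool_is_int = isinstance(True, int)

print(tuple_check, or_check, bool_is_int)
print(is_real_number(3), is_real_number(4.5), is_real_number(True))
```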

In [28]:
# Attributes and methods
f = "foo"
fs = getattr(f, "split")
fs
fs()

<function str.split(sep=None, maxsplit=-1)>

['foo']
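Looking up attributes by name is useful when code should work with any object that supports a given behavior. The sketch below uses the built-ins `getattr` (with a default) and `hasattr`, plus a hypothetical helper `is_iterable` that tests for iteration support by calling `iter`.

```python
def is_iterable(obj):
    """Hypothetical helper: True if iter() accepts the object."""
    try:
        iter(obj)
        return True
    except TypeError:
        return False


text = "foo bar"

# getattr returns the bound method, which can then be called.
upper_text = getattr(text, "upper")()

# A third argument supplies a default instead of raising AttributeError.
missing = getattr(text, "not_a_method", None)

# hasattr answers the same question as a boolean.
has_split = hasattr(text, "split")

print(upper_text, missing, has_split)
print(is_iterable(text), is_iterable([1, 2, 3]), is_iterable(5))
```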

In [8]:
import spacy
text = ("When Sebastian Thrun started working on self-driving cars at "
        "Google in 2007, few people outside of the company took him "
        "seriously. “I can tell you very senior CEOs of major American "
        "car companies would shake my hand and turn away because I wasn’t "
        "worth talking to,” said Thrun, in an interview with Recode earlier "
        "this week.")

# load the English model
nlp = spacy.load("en_core_web_sm")

# process a text
doc = nlp(text)

# Analyze syntax
print("Noun phrases:", [chunk.text for chunk in doc.noun_chunks])
print("Verbs:", [token.lemma_ for token in doc if token.pos_=="VERB"])

# Find named entities, phrases and concepts
for entity in doc.ents:
    print(entity.text, entity.label_)

Noun phrases: ['Sebastian Thrun', 'self-driving cars', 'Google', 'few people', 'the company', 'him', 'I', 'you', 'very senior CEOs', 'major American car companies', 'my hand', 'I', 'Thrun', 'an interview', 'Recode']
Verbs: ['start', 'work', 'drive', 'take', 'tell', 'shake', 'turn', 'talk', 'say']
Sebastian Thrun PERSON
Google ORG
2007 DATE
American NORP
Thrun PERSON
Recode ORG
earlier this week DATE
