# Cultural and Literary Text Mining
## Pre-Lab Session: Python Preliminaries

**Author**: Kent K. Chang

Part of the [CALTMIG](https://caltmig.kentchang.com/) cult.

This notebook introduces some jargon so you can get started with Python. Most of these terms will be covered in greater detail in later sessions; for now, a rough idea of what each one means is enough.

## Editorial conventions

* **"Note" sections**: You may skip them during your first read.
* **inline boldfaced words**: also jargon
* **`typewriter` `font`**: code or terminal commands

* * *

## Some terms

For more accurate definitions, please see the [Python 3 Glossary](https://docs.python.org/3/glossary.html).

### Jupyter Notebook

Currently on your screen is a Jupyter Notebook. You can think of this as a magical, Pythonized version of Microsoft Word. In it you can write an article and also put blocks of code. A code block looks like this:

In [1]:
```python
print('Hello from the other side . . . of Jupyter')
```
```
Hello from the other side . . . of Jupyter
```


The magical bit is: you can run the code block above by selecting it and hitting the play button on the toolbar.

Each Jupyter Notebook is composed of **cells**. A cell holds either prose (formatted using Markdown, a markup language) or code.

#### Note
```
In [1]:
```
`In` stands for input, and the number in brackets shows the order in which the cell was run. The cell's output is printed immediately below it.

* * *

### Interpreter

Indeed, Jupyter is not your run-of-the-mill coding environment. The fact that you can write a single **statement**, execute it, and see the output on the fly reflects that Python is an **interpreted** language.

That means that when you hit "play", that `print()` statement goes to the Python **interpreter**, which actually runs the code and returns the result.
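
To make this concrete, here is a small sketch of a cell with several statements. The interpreter works through them one at a time, top to bottom, so each line can use what the lines before it set up. The variable names (`greeting`, `name`, `message`) are made up for illustration.

```python
# Each line is one statement; the interpreter runs them in order.
greeting = 'Hello'
name = 'Jupyter'

# This statement can use the two variables defined above.
message = greeting + ', ' + name
print(message)
```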

Of course, as you progress and your code grows longer, you might want to set up your own Python environment and run all the lines in a file altogether. That is certainly possible and will be covered in future sessions.

#### Note
The other kind of programming language is a **compiled language** like C++. You can't run C++ the Python way; you have to write your code in one or more source files and compile them into an executable file before you can run anything.

* * *

### Anaconda

If you're curiously keen and want to set up your own environment, Anaconda is usually the go-to option. You can think of it as an all-in-one kit for Python programming. 

Technically, you can download the Python interpreter, install it, set up the **package** manager called `pip` and install packages you want individually. Then you pick your favorite text editor and start coding. Anaconda does all those things for you; it lets you start coding with little fuss.
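
Whichever way you install Python, it helps to know which interpreter you are actually running. The following is a minimal sketch that uses only the standard `sys` module; what it prints depends entirely on your machine.

```python
import sys

# Path of the Python interpreter that is running this code
print(sys.executable)

# Major and minor version numbers of that interpreter
major = sys.version_info.major
minor = sys.version_info.minor
print(major, minor)
```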

* * *

### Function 

The intuition is: a function is a black box. You throw things in, it does something with them, and often it returns something. A Python function is usually defined to achieve some *function*.

Technically, it would be easier to think of a function this way, if you're ready for some cringe attack:

$$ y = f(x) = x + 1 $$

Your math (or maths) teacher must have told you $f$ here is a function that takes $x$ as input. In Python a **function definition** for that math expression looks like this:

```python
def f(x):
    y = x + 1
    return y
```
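
As a quick check, here is the same definition followed by a few calls. The inputs are arbitrary values chosen for illustration.

```python
def f(x):
    y = x + 1
    return y

# Calling the function with 5 gives back 5 + 1
result = f(5)
print(result)     # 6

# The output of one call can be the input of another
nested = f(f(1))
print(nested)     # 3
```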

#### Notes

Here are a few more related terms. Feel free to skip the following section during the first read.

* **parameter and argument**: In the above definition, `x` is a parameter: a name that stands in for whatever gets passed into the function. When you **invoke** (or call) the function somewhere in your code, you write `f(5)`; `5` here is the argument, the value actually passed into the function at runtime.
* **return value**: What a function eventually spits out (`y` in this case). A function doesn't have to return anything explicitly, in which case Python returns `None`. It can also return several values at once, which Python packs into a tuple.
* **method**: For now, just think of a method as another name for a function (strictly, one that belongs to an object).
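
The terms above can be sketched in a few lines. The function names `greet` and `min_and_max` are hypothetical, invented only for this illustration.

```python
def greet(name):
    # No return statement here, so calling greet() gives back None
    print('Hello, ' + name)

def min_and_max(numbers):
    # Two values after return are packed into a single tuple
    return min(numbers), max(numbers)

# 'Ada' is the argument; name is the parameter inside greet()
nothing = greet('Ada')

# The returned tuple can be unpacked into two variables
lowest, highest = min_and_max([3, 1, 4])

# upper() is a method: a function that belongs to the string 'hello'
shout = 'hello'.upper()

print(nothing, lowest, highest, shout)
```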

* * *

### Module

Formally defined as "an organizational unit of Python code," a module, for now, is simply a file of Python code (usually ending in `.py`) that you can import and reuse.
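
For instance, `math` is a module that comes with Python. The short sketch below imports it and uses two of the names it provides; the number 16 is an arbitrary input.

```python
import math

# Names inside a module are reached with a dot after the module name
root = math.sqrt(16)
print(root)       # 4.0

# A module can hold constants as well as functions
print(math.pi)
```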

* * *

### Package and library

Technically, a package is a collection of modules, which in turn hold statements and function definitions. You can think of a package as a useful toolbox, and a library as many toolboxes put together to make a series of tasks easier.

People love Python because you can get and use a ton of nice toolboxes easily for free. Python already comes with a **standard library** which gives you some basic functionality. But if you want to do fancy things, you need to **import** toolboxes into your code.
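
Tools from the standard library are brought in with `import` too. As a small sketch, the snippet below counts words with `Counter` from the standard `collections` module; the sample sentence is made up for illustration.

```python
from collections import Counter

# A made-up sentence, split into words on whitespace
text = 'the cat and the hat and the bat'
words = text.split()

# Counter tallies how many times each word appears
counts = Counter(words)
top_two = counts.most_common(2)
print(top_two)    # [('the', 3), ('and', 2)]
```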

Our first example asks you to download a text file from a remote server. The `requests` library lets us do that. It is a third-party package rather than part of the standard library. As you will see later, the first lines of our code include:

```python
import requests
```

We will introduce a few libraries:

* **requests**: sends HTTP requests, which lets us download remote files
* **NLTK**: the essential toolkit for natural language processing; enables you to do things like tokenization
* **numpy**: allows you to easily create and work with multidimensional arrays, such as vectors and matrices
* **pandas**: like the Python version of Microsoft Excel; lets you manipulate data frames, or worksheets
* **matplotlib**: offers plotting functions
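
If you are working in your own environment, you may want to know which of these libraries are already installed. The following is a minimal sketch using the standard `importlib` module. It assumes the usual import names (`requests`, `nltk`, `numpy`, `pandas`, `matplotlib`), and the result depends on your setup.

```python
import importlib.util

# Import names of the libraries listed above (assumed, not checked online)
names = ['requests', 'nltk', 'numpy', 'pandas', 'matplotlib']

# find_spec() returns None when a top-level module cannot be found
installed = {}
for name in names:
    installed[name] = importlib.util.find_spec(name) is not None

for name, found in installed.items():
    print(name, 'installed' if found else 'missing')
```

Anything reported as missing can be installed with your package manager before the first session.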