# A 9-hour Python tutorial focusing on data processing

![Creative Commons License](https://i.creativecommons.org/l/by/4.0/88x31.png)  
This work by Jephian Lin is licensed under a [Creative Commons Attribution 4.0 International License](http://creativecommons.org/licenses/by/4.0/).

## 1. Basic concepts and where to ask for help

### What is Python?
* Python is a programming language designed for **readability**
* Python has many *convenient* features, which come at the cost of being **relatively slow**
* Python is popular because of its **wide variety of packages**

### Python? Anaconda?
* Python was created by **Guido van Rossum** in 1991
* Guido likes the British comedy group **Monty Python**, so he named the language Python
* In terms of biology, Python and Anaconda are both snakes  [(One is longer, the other is heavier)](http://www.differencebetween.net/science/nature/difference-between-python-and-anaconda/)
* In terms of computer science, **Anaconda is a Python distribution** containing many Python packages.
* Both [Python](https://www.python.org/) and [Anaconda](https://www.anaconda.com/) can be downloaded from the web and installed on your own machine.

### Cloud services
* Free cloud services are getting popular:
* [Kaggle](https://www.kaggle.com/), [Colaboratory](https://colab.research.google.com/), [CoCalc](https://cocalc.com), etc.
* Uh... Kaggle started independently and was then bought by Google, while Colaboratory is Google's own product...
* CoCalc is aimed at mathematical computation, especially computer algebra systems, and it uses Google servers...
* For basic computation, you really don't need to install Python on your own machine, provided that you have internet access...

### Script?
* running Python from the command line (default option)
* fast, offline, easy to interact with other applications
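
As a minimal sketch of what a script looks like, assume the file below is saved as `hello.py` (a hypothetical filename). It could then be run from a terminal with `python hello.py`.

```python
# hello.py: a hypothetical filename used only for illustration

def main():
    # Everything the script does goes here
    print('Hello from a script')

# This block runs when the file is executed as a script,
# but not when it is imported from another file
if __name__ == '__main__':
    main()
```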

### Notebook?
* running Python in a browser (e.g., **Jupyter**)
* cross-platform, used by most cloud services, rich text format

### Your best friends
* `shift+enter`: evaluate a cell
* `tab`: autocomplete or show the possible completions
* _object_.: press `tab` to see functions under _object_
* _func_?: evaluate to read the documentation of _func_
* _func_??: evaluate to read the source code of _func_
* Google: the answers are likely available online

Press `shift+enter` to evaluate the cell below.

In [1]:
1 + 1

2

Move your text cursor to the end of `ran` and press `tab`.  
Jupyter will autocomplete `ran` to be `range`.

In [None]:
ran

After you tell Python what `a` is, type `a.` and press `tab` to see related functions.

In [2]:
a = 'Hello'
type(a)

str

In [None]:
### <tab> would not work 
### if you did not run 
### a = 'Hello'
### first.
a.

For example,  
you may try `a.upper()`  
and guess the meaning of the function.

In [3]:
a.upper()

'HELLO'

Different objects have different functions associated with them.  
For example, 
```Python
a = 'Hello'
a.upper()
```
will return `'HELLO'`, but 
```Python
a = 1 
a.upper()
```
will raise an `AttributeError`.
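
If you would like to see the error without stopping the rest of a cell, one option is to catch it.  
The sketch below is only an illustration; `try_upper` is a made-up helper name, not something built into Python.

```python
def try_upper(value):
    """Return value.upper() if possible, otherwise None."""
    try:
        return value.upper()
    except AttributeError as err:
        # Integers have no upper() function, so we end up here
        print('AttributeError:', err)
        return None

print(try_upper('Hello'))   # a string has upper()
print(try_upper(1))         # an integer does not
```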

In [4]:
a = 1
type(a)

int

In [None]:
a.upper()

In [None]:
### With a = 1, 
### press <tab> to see functions related to an integer

a.
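
`tab` completion only works in interactive tools such as Jupyter.  
In a plain script, the built-in `dir` function gives a similar list; the following is a small sketch (the variable names are only for illustration).

```python
a = 'Hello'
b = 1

# dir(obj) lists the names available under obj;
# names starting with an underscore are mostly for internal use
str_functions = [name for name in dir(a) if not name.startswith('_')]
int_functions = [name for name in dir(b) if not name.startswith('_')]

print('upper' in str_functions)   # True: strings have upper
print('upper' in int_functions)   # False: integers do not
```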

Evaluate 
```Python
a = 'Hello'
a.upper?
```
to read the documentation of the function `upper`.  
(And you may press `Esc` to close the documentation.)

In [None]:
a = 'Hello'
a.upper?

The function `upper` belongs to strings, so `upper?` on its own won't work.

In [None]:
upper?

To become an expert, you will need to read other people's code and see how they solve problems.  
Use ``??`` to check the source code if available.

In [None]:
import random

random.randint??

In [None]:
### evaluate this cell several times to get different numbers

random.randint(1,5)
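
`?` and `??` are features of IPython and Jupyter rather than of Python itself.  
Outside a notebook, a rough equivalent (a sketch using the standard `inspect` module) is:

```python
import inspect
import random

# Similar to str.upper? : the documentation string
doc = inspect.getdoc(str.upper)
print(doc)

# Similar to random.randint?? : the source code,
# available because random is written in Python
source = inspect.getsource(random.randint)
print(source)
```

The built-in `help(str.upper)` prints a similar documentation page.  
Functions implemented in C, such as `str.upper`, have no Python source to show.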

Finally, Google is always ready to help.  
For example, Google "how to swap two variables in python".

In [None]:
a = 1
b = 2
a,b = b,a
print(a,b)

### Assign and print
In Python, a single `=` assigns a value.  
For example, `a = 'Hello'` assigns the string `'Hello'` to the variable `a`.  
Here `a` is called a **variable** and `'Hello'` is the **value** of the variable.
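
A common confusion is between `=` and `==`.  
As a quick sketch: a single `=` assigns, while a double `==` compares and gives back `True` or `False`.

```python
a = 'Hello'          # assign: a now refers to the string 'Hello'
print(a == 'Hello')  # compare: True
print(a == 'hello')  # compare: False, since upper and lower case differ

a = 123              # assigning again replaces the old value
print(a == 'Hello')  # False
```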

To see the value of a variable, use `print`.

In [5]:
a = 'Hello'
print(a)

Hello


In [6]:
a = 123
b = 'Hello'
c = 'Everybody'
print(a, b, c)

123 Hello Everybody


In [7]:
print(a, b, c, sep='! ')

123! Hello! Everybody


#### Exercise
Notice that the exclamation mark `!` only appears between the variables.  
This is expected: if you read the documentation of `print`, you will see 
    
    sep:   string inserted between values, default a space.

Read the documentation (by evaluating `print?`) carefully and find a way to output `123! Hello! Everybody!`.

In [9]:
a = 123
b = 'Hello'
c = 'Everybody'
### your answer here


#### Exercise

Suppose someone wrote the following:
```Python
a = 'I come from taiwan'
```
This is annoying since it should be `Taiwan`, not `taiwan`.  

Type `a.` and press `tab` to see all related functions.  
Find a function under `a` that allows you to replace `taiwan` by `Taiwan`.

In [None]:
a = 'I come from taiwan'
### your answer here


#### Exercise 

When you collect data, if you did not carefully tell the participants  
how to fill in the form, then you will get all kinds of answers.  
Suppose you are setting up a time for a meeting and you get the  
following answers from three different people.
```Python
a = 'Monday Wednesday Friday'
b = 'Monday, Tuesday, Thursday'
c = 'Monday;Friday'
```
Extract the days for each person with the `split` function.

In [10]:
a = 'Monday Wednesday Friday'
b = 'Monday, Tuesday, Thursday'
c = 'Monday;Friday'
print(a.split()) ### this is correct
print(b.split()) ### how to remove the comma?
print(c.split()) ### how to remove the semicolon?

['Monday', 'Wednesday', 'Friday']
['Monday,', 'Tuesday,', 'Thursday']
['Monday;Friday']


#### Exercise
Reading source code builds up your knowledge of programming.

Evaluate the cell below and read the source code.  
The `random` package uses the value of pi.  
Find out how to get the value of pi.

In [None]:
import random

random??

In [None]:
### from m??? import ?? as _pi
### your answer here

print(_pi)

**Finding the possible solutions is an essential part of programming,  
and it is a skill that will benefit you in the long run.**

Seriously, I think no one learns programming _only_ from school.

### Online resources for Python
1. [SoloLearn](https://www.sololearn.com/) is a platform where you can learn to code for free.  It teaches various programming languages, and mobile apps are also available!
2. [Kaggle Learn](https://www.kaggle.com/learn/overview) allows you to learn and run Python in the cloud.
3. [Python for Everybody](https://www.py4e.com/book) is a free, open-source book with free course videos that cover the details of Python.
4. [Coursera](https://www.coursera.org/) offers lots of (kind of) _free_ courses.


### Jupyter shortcuts
* Press `Esc` to enter the **Command Mode**
* Press `Enter` to enter the **Edit Mode**
* In Command Mode, press `A` (`B`) to insert a cell above (below)
* In Command Mode, press `DD` to delete the cell
* In Command Mode, press `H` to read all shortcuts

### Python installation
If you are a Linux user, you can do 
```bash
sudo apt install python3
```
in Ubuntu or 
```bash
sudo pacman -S python
```
in Arch Linux to install Python easily.  

If you are using Windows or Mac, then you will have to download the installation package from the [Python website](https://www.python.org/).

### Python package installation

Warning: Installing packages through Jupyter is not recommended.  
This part is only to illustrate the installation process.

Code in this section is unlikely to work due to  
the settings on different machines,  
lack of internet or  
lack of permission.

### Technicalities
The exclamation mark `!` allows you to run a command in your shell.  
`cat` is a program that prints the content of a file.  
`/etc/os-release` stores the OS information of the machine.  

Alternatively, you can do `lsb_release`.

Note: These commands are mainly for Linux.

In [None]:
!cat /etc/os-release
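
On machines without `/etc/os-release` (for example, Windows or Mac), the standard `platform` module offers a rough substitute that does not depend on the shell.  
A minimal sketch:

```python
import platform

os_name = platform.system()                  # e.g. 'Linux', 'Windows', or 'Darwin' (Mac)
os_detail = platform.platform()              # a longer description of the system
python_version = platform.python_version()   # the version of Python in use

print(os_name)
print(os_detail)
print(python_version)
```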

### Install with `pip`
You may find packages on [the Python Package Index](https://pypi.org/), also known as PyPI.  

In general, you have to find the package's official website  
and follow the instructions to install it.  
Take NumPy as an example: find its [installation guide](https://www.scipy.org/install.html) and follow the instructions therein.  

However, `pip` provides an easy way to install packages available on PyPI.  
For NumPy, you can do 
```bash
pip install numpy
```
and it will download and install the package.  

Note:  You can do `pip uninstall numpy` to uninstall.

In [None]:
!pip install numpy

In [None]:
!pip install funniesttest
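
To check whether a package is already installed before calling `pip`, one option is the standard `importlib.metadata` module (available in Python 3.8 and later).  
This is a sketch; `installed_version` is a made-up helper name.

```python
from importlib import metadata

def installed_version(package_name):
    """Return the installed version of a package,
    or None if it is not installed in the current environment."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None

print(installed_version('numpy'))         # a version string, or None
print(installed_version('funniesttest'))  # None unless you installed it
```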

### Virtual environment
Chances are that the cells above won't work well.  
This is actually good!  
Python has way too many packages available and some can conflict with each other.  

If a package is not so fundamental that everyone needs it,  
then don't install it globally.  
Creating a **virtual environment** is a better approach.

In [None]:
!virtualenv my_project ### create a virtual environment called my_project
!source my_project/bin/activate && pip install funniesttest ### go to the virtual environment and install

This avoids the permission issue  
but still needs internet access to download the package.
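
The cell above relies on the `virtualenv` command, which may not be installed.  
As an alternative sketch (not the approach used above), Python's standard `venv` module can create a similar environment.  
The folder name `my_project_venv` is made up for illustration, and `with_pip=False` is chosen here only to keep the example fast; a real environment would normally want `pip` inside.

```python
import os
import tempfile
import venv

# Create the environment inside a temporary folder so nothing is left behind
with tempfile.TemporaryDirectory() as tmp:
    env_dir = os.path.join(tmp, 'my_project_venv')
    venv.create(env_dir, with_pip=False)

    # Every virtual environment has a pyvenv.cfg file at its top level
    created = os.path.exists(os.path.join(env_dir, 'pyvenv.cfg'))
    print(created)
```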

### Offline installation
Suppose you plan to install the package `funniesttest`  
and you already have the package file obtained from [here](https://pypi.org/project/funniesttest/) on PyPI. 

Do the following steps (in the virtual environment if necessary)  
1. Unpack the package by `gzip` or `tar` if necessary.
```bash
gzip -d filename.tar.gz
tar -xvf filename.tar
```
2. Go to the folder.
```bash
cd foldername
```
3. Install with `pip`.  Note that the dot at the end is not a period; it stands for the current folder.
```bash
pip install .
```

If the package comes with a file `filename.whl`, then skip the steps above and do
```bash
pip install filename.whl
```
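
The unpacking in step 1 can also be done from Python with the standard `tarfile` module, which handles the `gzip` and `tar` steps together.  
The sketch below first builds a tiny archive so that it can run anywhere; `demo-1.0` and the file names are made up and stand in for a real package file.

```python
import os
import tarfile
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Build a small stand-in for a downloaded package file
    src_dir = os.path.join(tmp, 'demo-1.0')
    os.mkdir(src_dir)
    with open(os.path.join(src_dir, 'README.txt'), 'w') as f:
        f.write('hello')
    archive = os.path.join(tmp, 'demo-1.0.tar.gz')
    with tarfile.open(archive, 'w:gz') as tar:
        tar.add(src_dir, arcname='demo-1.0')

    # Unpack it: roughly what gzip -d followed by tar -xvf does
    dest = os.path.join(tmp, 'unpacked')
    os.mkdir(dest)
    # Newer Python versions accept filter='data' for safer extraction
    extra = {'filter': 'data'} if hasattr(tarfile, 'data_filter') else {}
    with tarfile.open(archive, 'r:gz') as tar:
        tar.extractall(dest, **extra)

    unpacked_files = sorted(os.listdir(os.path.join(dest, 'demo-1.0')))
    print(unpacked_files)   # ['README.txt']
```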

In [None]:
### create a virtual environment called my_project
!virtualenv my_project 

In [None]:
### unpack the package
!cp funniesttest-1.0.tar.gz my_project
!cd my_project && gzip -dk funniesttest-1.0.tar.gz && tar -xvf funniesttest-1.0.tar
### show where we are and list what's in the folder
!pwd
!ls

In [None]:
### activate my_project
### go to the folder
### then install
!source my_project/bin/activate && cd my_project/funniesttest-1.0/ && pip install .

In [None]:
### reset
### run this cell only when 
### you want to wipe out the virtual environment
!rm -rf my_project

### Conclusion
To get some experience with Python, use cloud services.  
For machines that you do not own, ask IT for help.  
For your own machine, it is nice to get your hands dirty and go through the installation yourself.