# Lesson 0: Introduction

##### Navigation Links

- [Workshop description](https://jenfly.github.io/eoas-python/)
- [Computer setup instructions](https://jenfly.github.io/eoas-python/SETUP)
- Lessons:
  - **0 - Introduction**
  - [1 - Python Basics](1-python-basics.ipynb)
  - [2 - Reading & Summarizing CSV Data](2-reading-summarizing-csv-data.ipynb)
  - [3 - Basic Calculations & Plots](3-basic-calculations-plots.ipynb)
  - [4 - Sorting, Filtering & Aggregation](4-sorting-filtering-aggregation.ipynb)
  - [5 - Indexing & Subsets](5-indexing-subsets.ipynb)
  - [6 - Visualization & Geographic Maps](6-visualization-geographic-maps.ipynb)
- [Pandas cheatsheet](pandas-cheatsheet.ipynb)
- [Solutions to exercises](solutions/solutions-1.ipynb)
- [Additional resources](resources.ipynb)
- [Post-workshop survey](http://bit.ly/eoas-python-survey)

## Welcome to the EOAS Python Workshop!

1. View slides in your browser: [bit.ly/eoas-python-slides](http://bit.ly/eoas-python-slides)

2. Download files: [bit.ly/eoas-python-download](http://bit.ly/eoas-python-download)
  - Unzip / extract the *eoas-python-master* folder and note where it is located (e.g. Downloads, Desktop, etc.). If desired, move it to another location on your hard drive.
  
3. Open JupyterLab &mdash; 3 options, depending on your setup:
  1. Open Anaconda Navigator and click "Launch" underneath the "lab" icon, or
  2. Open Terminal (Mac/Linux) or Anaconda Prompt (Windows) and run `jupyter lab` at the command line, or
  3. Go to [ubc.syzygy.ca](https://ubc.syzygy.ca/), click on "Sign-In" in the top right corner, and sign in with your CWL credentials.

# Agenda

### Today

- Lesson 0: Introduction (45 min)
  - Overview of scientific computing in Python
  - JupyterLab and Jupyter notebooks
- Lesson 1: Python Basics (1 hr 45 min)
- Lesson 2: Reading & Summarizing CSV Data (30 min)

### Tomorrow

- Lesson 3: Basic Calculations & Plots (45 min)
- Lesson 4: Sorting, Filtering & Aggregation (45 min)
- Lesson 5: Indexing & Subsets (45 min)
- Lesson 6: Visualization & Geographic Maps (45 min)

# Why Python for science?

- Free + open source
  - Transparency + reproducibility
  - Equal access
  - Sharing + collaboration

> *Fernando Perez - Project Jupyter: From Interactive Python to Open Science* [(video)](https://youtu.be/xuNj5paMuow) [(slides)](https://conferences.oreilly.com/jupyter/jup-ny-2017/public/schedule/detail/62419)

- Powerhouse for scientific computing + data analysis
  - High-level, user friendly interface
  - Efficient computation with Fortran, C/C++ etc. behind the scenes

- General purpose programming language

![python_uses](img/python_uses.png)

From: [Python Developers Survey 2017](https://www.jetbrains.com/research/python-developers-survey-2017)

- Python as "glue"

> *Jake VanderPlas - The Unexpected Effectiveness of Python in Science* [(video)](https://www.youtube.com/watch?v=ZyjCqQEUa8o) [(slides)](https://speakerdeck.com/jakevdp/the-unexpected-effectiveness-of-python-in-science)

Example: download and merge tens/hundreds of [Environment Canada weather data files](http://climate.weather.gc.ca/climate_data/daily_data_e.html?hlyRange=2013-06-11%7C2018-10-02&dlyRange=2013-06-13%7C2018-10-02&mlyRange=%7C&StationID=51442&Prov=BC&urlExtension=_e.html&searchType=stnName&optLimit=yearRange&StartYear=1840&EndYear=2018&selRowPerPage=100&Line=39&searchMethod=contains&Month=10&Day=2&txtStationName=vancouver&timeframe=2&Year=2018)

> If you'd like to learn more about downloading and merging data with Python, you can work through the following bonus exercises, which progress through the data wrangling techniques needed for the Environment Canada weather data.
>
>- 1.1(d)
>- 1.3(d) 
>- 2(e), (f) 
>- 3(h)
>- 4(g)
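
As a rough illustration of the "merge many files" idea above, here is a minimal sketch using only Python's built-in `csv` module. The two in-memory strings stand in for downloaded files, and the column names (`date`, `max_temp_c`) and values are made up for illustration; they are not the real Environment Canada format. In the workshop itself, this kind of task is handled with Pandas.

```python
import csv
import io

# Hypothetical stand-ins for two downloaded yearly files
# (real files would be read from disk instead of from strings)
file_2017 = "date,max_temp_c\n2017-01-01,4.5\n2017-01-02,6.0\n"
file_2018 = "date,max_temp_c\n2018-01-01,7.1\n2018-01-02,5.3\n"

# Read each "file" and collect all rows into a single list
merged_rows = []
for text in [file_2017, file_2018]:
    reader = csv.DictReader(io.StringIO(text))
    merged_rows.extend(reader)

print(len(merged_rows))        # 4 rows in total
print(merged_rows[0]["date"])  # first row comes from the 2017 file
```

The same loop works unchanged for tens or hundreds of files, which is where automating the task starts to pay off.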

# Jupyter

- **JupyterLab**: Development environment for scientific computing + data analysis
- Within JupyterLab, we'll be using Jupyter **notebooks**:
  - Code, plots, formatted text, equations, etc. in a single document
  - Uses an IPython kernel to run Python code (IPython = "Interactive Python")
  - Also supports R, Julia, Perl, and over 100 other languages (and counting!)

## Example Notebook

[Demo](example-notebook.ipynb)

- Notebooks are great for exploration and for documenting your workflow
- Many options for sharing notebooks in human readable format:
  - Share online with [nbviewer.jupyter.org](http://nbviewer.jupyter.org/)
  - If you use GitHub, any notebooks you upload are automatically rendered on the site
  - Convert to HTML, PDF, etc. with [nbconvert](https://nbconvert.readthedocs.io/en/latest/)

# Other Workflows & Environments

For repetitive tasks, you can re-use code by creating a **library** or automate a workflow with a Python **script**.
- Code is saved in `.py` text files
- There are many, many options for development environments:
  - Command line + text editor (e.g. Atom, Sublime, Emacs, etc.)
  - Integrated development environment (e.g. PyCharm, Spyder, Visual Studio Code, etc.)
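
As a minimal sketch of the library/script idea above, the contents of a hypothetical file `temperature.py` might look like the following. The file name and the function `celsius_to_fahrenheit` are illustrative assumptions, not part of the workshop materials.

```python
# temperature.py (hypothetical file name)

def celsius_to_fahrenheit(temp_c):
    """Convert a temperature from degrees Celsius to degrees Fahrenheit."""
    return temp_c * 9 / 5 + 32


# This section runs only when the file is executed as a script,
# not when it is imported as a library from another file or notebook
if __name__ == "__main__":
    for temp_c in [0, 25, 100]:
        print(temp_c, "C =", celsius_to_fahrenheit(temp_c), "F")
```

Running `python temperature.py` at the command line executes the script section, while `import temperature` in another file or notebook makes the function available without running it.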

  
In this workshop, we will focus only on Jupyter notebooks.
- For a great example of a typical workflow of interactive exploration in Jupyter $\rightarrow$ automation with libraries/scripts, check out Jake VanderPlas' blog post [Reproducible Data Analysis in Jupyter](https://jakevdp.github.io/blog/2017/03/03/reproducible-data-analysis-in-jupyter/).

# Getting Started with JupyterLab

Let's take a tour of the main features of JupyterLab and create our first Jupyter notebook!

If you're running JupyterLab locally on your computer:
- Navigate to your `eoas-python-master` folder

If you're working online at [ubc.syzygy.ca](https://ubc.syzygy.ca/):
- To switch from the older Jupyter notebook app to JupyterLab, edit the URL to replace `/tree` with `/lab`
- For example, if your CWL username is `bsmith00`, change the URL from https://ubc.syzygy.ca/jupyter/user/bsmith00/tree to https://ubc.syzygy.ca/jupyter/user/bsmith00/lab
    

[This quick-start tutorial](http://bit.ly/jupyter-quickstart) may be a helpful reference as you navigate around JupyterLab.

### Jupyter Tour

- Files Sidebar - show/hide
- Demo CSV viewer, text editor
- Create a new notebook
  - Note the .ipynb extension (comes from "IPython notebook", the previous name before it was changed to Jupyter to reflect multi-language support)
  - Rename the notebook to "workshop.ipynb"
- Notebooks auto-save periodically, or you can manually save
- Next time you open JupyterLab, you can open your "workshop.ipynb" notebook by double-clicking it in the Files Sidebar

# Working with Notebooks

A notebook consists of a series of "cells":
- **Code cells**: execute snippets of code and display the output
- **Markdown cells**: formatted text, equations, images, and more

By default, a new cell is always a code cell.

## Code Cells

### Python as a Calculator

- We can use mathematical operators such as `+`, `-`, `*`, `/`, `**`
- To run a cell, press `Shift-Enter` or press the Run button on the Notebook Toolbar

In [1]:
2 + 2

4

In [2]:
3 / 4

0.75

In [3]:
5 * 6

30

In [4]:
7 ** 2

49

Combining mathematical operators:

In [5]:
5 + 30 / 3

15.0

In [6]:
(5 + 30) / 3

11.666666666666666
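
The two results above differ because of operator precedence: `**` is evaluated first, then `*` and `/`, then `+` and `-`, unless parentheses say otherwise. The short sketch below spells this out with a couple of extra expressions (the variable names are just for illustration). It also shows why `5 + 30 / 3` displays `15.0` rather than `15`: in Python 3, `/` always produces a floating-point number.

```python
# Without parentheses: 4 ** 2 = 16, then 3 * 16 = 48, then 2 + 48
without_parens = 2 + 3 * 4 ** 2
print(without_parens)  # 50

# Parentheses override the default order: (5 * 4) ** 2
with_parens = ((2 + 3) * 4) ** 2
print(with_parens)  # 400

# Division with / gives a float even when the result is a whole number
quotient = 30 / 3
print(quotient)        # 10.0
print(type(quotient))  # <class 'float'>
```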

## Markdown Cells

In Markdown cells, you can write plain text or add formatting and other elements with [Markdown](https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet). These include headers, **bold text**, *italic text*, hyperlinks, equations $A=\pi r^2$, inline code `print('Hello world!')`, bulleted lists, and more.

- To create a Markdown cell, select an empty cell and change the cell type from "Code" to "Markdown" in the dropdown menu of the Notebook Toolbar
- To run a Markdown cell, press `Shift-Enter` or the Run button from the Notebook Toolbar
- To edit a Markdown cell, you need to double-click inside it

# Other Notebook Basics

- Organizing cells &mdash; insert, delete, cut/copy/paste, move up/down, split, merge
- Closing vs. shutting down a notebook &mdash; kernel process in background
- Re-opening a notebook after shutdown
  - All the code output from the previous kernel session is preserved, but variables are not kept in memory; re-run the cells to restore them
- Clear output of all cells or selected cell(s)
- Running all cells or selected cell(s)
- Restarting and interrupting the kernel

# Python Scientific Ecosystem

A **library** is a collection of pre-written code. It can consist of:
- A single file with Python code (a *module*), or
- A collection of multiple files bundled together (a *package*)

Some libraries come built in with core Python, but most are developed and maintained by external "3rd party" development teams.
- Python core + 3rd party libraries = **ecosystem** 
- To install and manage 3rd party libraries, you need to use a package manager such as `conda` (which comes with Anaconda/Miniconda)
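
To make this concrete, here is a small sketch that uses two modules built into core Python, `math` and `statistics`. The temperature values are made up for illustration. Third-party libraries are imported in the same way once they have been installed, for example `import pandas as pd`.

```python
# `math` ships with core Python, so no installation is needed
import math

print(math.sqrt(49))  # 7.0

# You can also import a single name from a module
from statistics import mean

temps = [4.5, 6.0, 7.5]  # hypothetical temperature values
mean_temp = mean(temps)
print(mean_temp)  # 6.0
```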

Over-simplified view of the ecosystem we'll be using in this workshop:

![ecosystem](img/ecosystem.png)

Some of the main libraries in the Python scientific ecosystem:

![ecosystem_big](img/ecosystem_big.png)

From [The Unexpected Effectiveness of Python in Science](https://speakerdeck.com/jakevdp/the-unexpected-effectiveness-of-python-in-science) (Jake VanderPlas)

In this workshop, we'll be focusing on Pandas, with a brief introduction to Matplotlib and Cartopy.

Other useful libraries for scientific computing include:

- [NumPy + SciPy](https://www.scipy.org/getting-started.html) - Numeric and scientific tools including: 
  - Linear algebra
  - Statistics and random numbers
  - Numerical integration
  - Differential equations
  - Interpolation, optimization and curve fitting
  - Special functions
  - Fast Fourier transforms
  - Signal processing
- [Matplotlib](https://matplotlib.org/) for visualization
- [Seaborn](https://seaborn.pydata.org/) for statistical data visualization
- [Statsmodels](https://www.statsmodels.org/stable/index.html) for statistical analysis and modelling
- [SymPy](https://docs.sympy.org/latest/tutorial/intro.html) for symbolic computation

See the [additional resources page](resources.ipynb) for links to tutorials and examples for these libraries.

---

Go to: [next lesson](1-python-basics.ipynb)